2020-08-06 §
14:47 <fdans> deploying refinery [analytics]
08:07 <elukey> roll restart druid-brokers (on both clusters) to pick up new changes for monitoring [analytics]
2020-08-05 §
13:04 <elukey> restart yarn resource managers on an-master100[12] to pick up new Yarn settings - https://gerrit.wikimedia.org/r/c/operations/puppet/+/618529 [analytics]
13:03 <elukey> set yarn_scheduler_minimum_allocation_mb = 1 (was zero) in Hadoop to work around a Flink 1.1 issue (namely it doesn't work if the value is <= 0) [analytics]
09:32 <elukey> set ticket max renewable lifetime to 7d on all kerberos clients (was zero, the default) [analytics]
2020-08-04 §
08:30 <elukey> resume druid-related oozie coordinator jobs via Hue (after druid upgrade) [analytics]
08:28 <elukey> started netflow kafka supervisor on Druid Analytics (after upgrade) [analytics]
08:19 <elukey> restore systemd timers for druid jobs on an-launcher1002 (after druid upgrade) [analytics]
07:33 <elukey> stop systemd timers related to druid on an-launcher1002 [analytics]
07:29 <elukey> stop kafka supervisor for netflow on Druid Analytics (prep step for druid upgrade) [analytics]
07:00 <elukey> suspend all druid-related coordinators in Hue as prep step for upgrade [analytics]
2020-08-03 §
09:53 <elukey> move all druid-related systemd timers to spark client mode - T254493 [analytics]
08:07 <elukey> roll restart aqs on aqs* to pick up new druid settings [analytics]
2020-08-01 §
13:22 <joal> Rerun cassandra-monthly-wf-local_group_default_T_unique_devices-2020-7 to load missing data (email with bug description sent to list) [analytics]
2020-07-31 §
14:46 <mforns> restarted webrequest oozie bundle [analytics]
14:46 <mforns> restarted mediawiki history reduced oozie job [analytics]
09:00 <elukey> SET GLOBAL expire_logs_days=14; on matomo1002's mysql [analytics]
09:00 <elukey> SET GLOBAL expire_logs_days=14; on an-coord1001's mysql [analytics]
06:32 <elukey> roll restart of druid brokers on druid100[4-8] to pick up new changes [analytics]
2020-07-30 §
19:14 <mforns> finished refinery deploy (for v0.0.132) [analytics]
18:48 <mforns> starting refinery deploy (for v0.0.132) [analytics]
18:27 <mforns> deployed refinery-source v0.0.132 [analytics]
2020-07-29 §
14:37 <mforns> quick deployment of pageview white-list [analytics]
2020-07-28 §
17:52 <ottomata> stopped writing eventlogging data log files on eventlog1002 and stopped syncing them to stat100[67] - T259030 [analytics]
14:29 <elukey> stop client-side-events-log.service on eventlog1002 to keep /srv from filling up [analytics]
09:48 <elukey> re-enable eventlogging file consumers on eventlog1002 [analytics]
09:10 <elukey> temporarily stop eventlogging file consumers on eventlog1002 to copy some data over to stat1005 (/srv partition full) [analytics]
08:03 <elukey> Superset migrated to CAS [analytics]
06:42 <elukey> re-run webrequest-load hour 2020-7-28-3 [analytics]
2020-07-27 §
17:15 <elukey> restart eventlogging on eventlog1002 to update the event whitelist (exclude MobileWebUIClickTracking) [analytics]
08:19 <elukey> reset-failed the monitor_refine_failures for eventlogging on an-launcher1002 [analytics]
06:44 <elukey> truncate big log file on an-launcher1002 that is filling up the /srv partition [analytics]
2020-07-22 §
15:05 <joal> manually drop /user/analytics/.Trash/200714000000/wmf/data/wmf/pageview/actor to free some space [analytics]
15:03 <joal> Manually drop /wmf/data/wmf/mediawiki/wikitext/history/snapshot=2020-03 to free some space [analytics]
15:01 <elukey> hdfs dfs -rm -r -skipTrash /var/log/hadoop-yarn/apps/analytics-privatedata/logs [analytics]
14:49 <elukey> hdfs dfs -rm -r -skipTrash /var/log/hadoop-yarn/apps/analytics/logs/* [analytics]
08:09 <elukey> turnilo.wikimedia.org migrated to CAS [analytics]
2020-07-21 §
18:30 <mforns> finished re-deploying refinery to unbreak unique devices per domain monthly [analytics]
18:05 <mforns> re-deploying refinery to unbreak unique devices per domain monthly [analytics]
17:34 <mforns> restarted unique_devices-per_domain-daily-coord [analytics]
15:09 <elukey> yarn.wikimedia.org migrated earlier on to CAS auth [analytics]
14:58 <ottomata> Refine - reverted change to not merge hive schema + event schema before reading - T255818 [analytics]
13:36 <ottomata> Refine no longer merges with Hive table schema when reading (except for refine_eventlogging_analytics job) - T255818 [analytics]
2020-07-20 §
19:56 <joal> kill-restart cassandra unique-devices loading daily and monthly after deploy (2020-07-20 and 2020-07-01) [analytics]
19:55 <joal> kill-restart mediawiki-history-denormalize after deploy (2020-07-01) [analytics]
19:55 <joal> kill-restart webrequest after deploy (2020-07-20T18:00) [analytics]
19:19 <mforns> finished refinery deployment (for v0.0.131) [analytics]
19:02 <mforns> starting refinery deployment (for v0.0.131) [analytics]
19:02 <mforns> deployed refinery-source v0.0.131 [analytics]
18:16 <joal> Rerun cassandra-daily-coord-local_group_default_T_unique_devices from 2020-07-15 to 2020-07-19 (both included) [analytics]