2017-02-09 §
10:05 <elukey> re-enabled oozie bundles after maintenance [analytics]
10:04 <elukey> performed master failover from an1001 to an1002 (and vice-versa) for java upgrades [analytics]
10:04 <elukey> restarted oozie, hive-server and metastore for java upgrades [analytics]
09:49 <elukey> suspended oozie bundles temporarily to allow graceful restarts [analytics]
2017-02-08 §
18:05 <ottomata> restarting pivot [analytics]
17:52 <ottomata> restarting pivot [analytics]
15:35 <elukey> restarted all the failed oozie cassandra load jobs [analytics]
2017-02-07 §
20:24 <joal> Resubmit cassandra-coord-pageview-per-project-hourly for 2017-02-07T18:00 [analytics]
14:36 <elukey> restarted webrequest-load-wf-text-2017-2-7-13 [analytics]
2017-02-04 §
13:18 <joal> Restarted mediacounts-archive job for day 2017-02-03 (had failed) [analytics]
2017-02-02 §
12:07 <joal> Restarted daily and monthly pageview druid loading jobs [analytics]
12:03 <joal> Deployed refinery to correct bug introduced in https://gerrit.wikimedia.org/r/#/c/335067/ [analytics]
10:13 <joal> Killed-Restarted last access uniques monthly jobs to pick up new config - 0097552-161121120201437-oozie-oozi-C [analytics]
2017-02-01 §
19:01 <joal> Killed-Restarted Mobile apps Uniques monthly jobs to pick up new config - 0096638-161121120201437-oozie-oozi-C [analytics]
18:47 <joal> Deploy refinery for uniques monthly patches [analytics]
17:27 <joal> Restarting 2 webrequest-load text jobs that failed during NM restart (2017-02-01T11:00 and T13:00) [analytics]
13:12 <elukey> restarted pageview-druid-monthly-coord and pageview-druid-daily-coord oozie coordinators after deployment [analytics]
12:17 <elukey> deployed Refinery via scap and then executed the hdfs copies on stat1002 [analytics]
2017-01-31 §
16:11 <elukey> started Cassandra nodetool cleanup for aqs1007-a [analytics]
16:04 <elukey> started Cassandra nodetool cleanup for aqs1004-b [analytics]
08:31 <elukey> started Cassandra nodetool cleanup for aqs1004-a [analytics]
2017-01-26 §
19:20 <joal> Restart webrequest-load-coord-text 2017-01-26T15:00 after cluster shake [analytics]
19:18 <elukey> restored an1001 as RM and HDFS master [analytics]
2017-01-24 §
21:30 <ottomata> restarted hadoop-mapreduce-historyserver on analytics1001. it died due to OOM [analytics]
2017-01-22 §
13:27 <joal> Rerun pageview-druid-daily-wf-2017-1-20 trying to see if it fixes automagically [analytics]
2017-01-19 §
15:51 <joal> Launched 0080172-161121120201437-oozie-oozi-B to recover from missing webrequest-load 2017-01-18 19:00 with a correct setup this time [analytics]
15:39 <joal> Launched 0080149-161121120201437-oozie-oozi-B to recover from missing webrequest-load 2017-01-18 19:00 [analytics]
2017-01-17 §
11:16 <joal> Remove mediawiki-history-beta datasource from druid [analytics]
09:51 <elukey> restarted mediacounts-archive-wf-2017-01-16 [analytics]
2017-01-11 §
19:23 <joal> Start mediawiki history reconstruction job on newly sqooped data [analytics]
18:25 <joal> Replace /wmf/data/raw/mediawiki/tables/ with newly sqooped data [analytics]
2017-01-10 §
15:30 <joal> Restart 0024519-160420145651441-oozie-oozi-C for day 2017-01-09 to see if it fails again [analytics]
2017-01-06 §
20:35 <joal> Launched 0063574-161121120201437-oozie-oozi-C to cover for upload-2017-01-06-[16-17] [analytics]
19:04 <elukey> started 0063446-161121120201437-oozie-oozi-C to re-run upload-2017-1-6-17 [analytics]
2016-12-22 §
15:28 <elukey> changed firewall rules to allow only $ANALYTICS_NETWORKS (rather than the broader $INTERNAL) for the Yarn UI http service (an1001) and the hive metastore (an1003) [analytics]
2016-12-19 §
21:27 <nuria> deployed analytics refinery, restarted webrequest load and pageview_hourly jobs [analytics]
20:11 <nuria> deployed analytics/refinery to cluster (2nd try) [analytics]
2016-12-13 §
11:12 <elukey> deleted /srv/stat1001 on stat1004 [analytics]
2016-12-09 §
14:32 <joal> restarted eventlogging mysql consumer after DB restart [analytics]
13:57 <joal> Stopped EventLogging Mysql consumer for database restart [analytics]
2016-12-08 §
18:37 <ottomata> preferred-replica-election on analytics kafka cluster to bring 1012 back as leader for its partitions [analytics]
18:15 <ottomata> restarting broker on kafka1012 to repro T152674 [analytics]
2016-12-07 §
21:59 <ottomata> restarting eventlogging again to pick up puppet changes to use kafka-confluent writer [analytics]
19:39 <ottomata> restarting analytics eventlogging to test out confluent kafka producer for processors [analytics]
2016-12-05 §
11:02 <joal> Killing wikidata-articleplaceholder_metrics job and restarting it starting Nov. 1st for code update [analytics]
10:43 <joal> Deploy refinery onto hdfs [analytics]
10:35 <joal> deploying refinery [analytics]
2016-12-02 §
09:43 <joal> Restarted yesterday's failed oozie webrequest-load jobs (upload, text, misc; hours 21, 22, 23) [analytics]
2016-12-01 §
20:27 <ottomata> bouncing kafka broker on kafka1018 to test config changes to eventlogging analytics kafka clients [analytics]
20:25 <ottomata> restarting eventlogging analytics processes again to pick up api_version change for consumers too [analytics]