651-700 of 953 results (14ms)
2016-05-23 §
14:57 <elukey> suspended all the oozie bundles as prep step for https://gerrit.wikimedia.org/r/#/c/290252 (yes I know super paranoid mode on) [analytics]
06:42 <elukey> Removed Kafka temp. override for webrequest_upload retention.ms after freeing some disk space. [analytics]
06:37 <elukey> Set kafka retention.ms=172800000 for the topic webrequest_upload to free some disk space on kafka1022 [analytics]
2016-05-20 §
12:50 <elukey> aqs100[123] restarted for openjdk upgrades [analytics]
08:53 <elukey> cassandra upgraded to 2.1.13 on aqs1003 [analytics]
08:30 <elukey> aqs1002 migrated to cassandra 2.1.13 [analytics]
2016-05-19 §
12:44 <elukey> restarted oozie on analytics1003 for security upgrades [analytics]
12:32 <elukey> suspended all the oozie bundles as prep step for oozie's restart (security upgrades) [analytics]
12:28 <elukey> restarted hue on analytics1027 for security upgrades [analytics]
2016-05-17 §
20:55 <ottomata> restarted eventlogging after kafka1013 flapped [analytics]
15:54 <joal> Deploying onto new aqs [analytics]
15:45 <joal> Deploying aqs (deploy code only) [analytics]
2016-05-16 §
17:40 <joal> Deploying refinery on hdfs [analytics]
17:34 <joal> deploying refinery from tin [analytics]
2016-05-12 §
14:36 <joal> Start cassandra unique dives loading oozie job backfilling from 2016-05-05 included onward [analytics]
10:20 <elukey> restarted oozie, hive-* daemons on analytics1003 for java upgrades [analytics]
2016-05-11 §
13:31 <elukey> camus + puppet disabled on analytics1027 [analytics]
10:54 <elukey> eventbus restarted on kafka100[12] for security upgrades [analytics]
2016-05-10 §
19:11 <ottomata> reenabling camus and puppet on analytics1027 [analytics]
18:05 <ottomata> restarting eventlogging after broker stop [analytics]
2016-05-09 §
17:37 <elukey> camus+puppet re-enabled on analytics1027 [analytics]
17:31 <elukey> analytics1002 Yarn+HDFS masters failed over to analytics1001 for Java upgrades (restored original state) [analytics]
17:22 <elukey> analytics1001 Yarn+HDFS masters failed over to analytics1002 for Java upgrades [analytics]
15:51 <elukey> disabled camus+puppet on analytics1027 as prep step for maintenance on the cluster. [analytics]
14:24 <elukey> restarting hadoop java daemons for Java upgrades on analytics104X and analytics105X hosts [analytics]
13:12 <elukey> restarting hadoop java daemons for Java upgrades on analytics102X and analytics 103X hosts [analytics]
10:22 <elukey> restarted eventlogging on eventlogging1001 for security upgrades [analytics]
2016-05-06 §
11:48 <elukey> restarting aqs on aqs100[123] for security upgrades. [analytics]
2016-05-05 §
19:29 <nuria_> deployed latest master candidate to vital-signs on labs [analytics]
14:55 <joal> restarted pageview job [analytics]
14:47 <joal> deploying refinery on hdfs [analytics]
14:44 <joal> deploying refinery from tin [analytics]
11:33 <joal> restarting every analytics oozie job [analytics]
10:28 <joal> Pause hadoop job to restart in clean mode [analytics]
09:51 <joal> deploy refinery onto hdfs [analytics]
09:48 <joal> deploying refinery from tin [analytics]
2016-05-04 §
17:16 <elukey> migrating the hue database [analytics]
16:43 <elukey> stopping hue on analytics1027 to dump its databse [analytics]
2016-05-03 §
17:06 <joal> Manually add jam.wikipedia to the accepted project list for pageview [analytics]
2016-05-02 §
21:59 <mforns> deployed vital-signs [analytics]
21:59 <mforns> deployed edit-analysis [analytics]
21:35 <mforns> deployed language-reportcard [analytics]
21:18 <mforns> deployed browser-reports to unbreak production [analytics]
18:30 <joal> manually touch _SUCCESS file in hdfs://analytics-hadoop/wmf/data/raw/webrequest/webrequest_text/hourly/2016/05/02/14/ to launch refine process despites load job failure [analytics]
17:38 <elukey> removed out of service banner from dashiki dashboards [analytics]
17:33 <elukey> reverted Varnish config to return 503s for datasets and stats [analytics]
12:14 <elukey> deployed Varnish change to force HTTP 503 for datasets.wikimedia.org, stats.wikimedia.org, metrics.wikimedia.org as prep-step for OS reimage. [analytics]
12:05 <elukey> enabled maintenance banner to dashiki based dashboards via https://meta.wikimedia.org/wiki/Dashiki:OutOfService [analytics]
11:21 <elukey> deployed last version of Event Logging. Service also restarted. [analytics]
2016-04-30 §
13:42 <elukey> disabled puppet on analytics1047 and scheduled downtime for the host, IO errors in the dmesg for /dev/sdd. Stopped also Hadoop daemons to remove it from the cluster temporarily (not sure how to do it properly, will write docs). [analytics]