2021-02-18 §
17:27 <elukey> an-coord1002 back in service with raid1 configured [analytics]
15:48 <elukey> stop hive/mysql on an-coord1002 as precautionary step to rebuild the md array [analytics]
13:10 <elukey> failover analytics-hive to an-coord1001 after maintenance (DNS change) [analytics]
11:32 <elukey> restart hive daemons on an-coord1001 to pick up new parquet settings [analytics]
10:07 <elukey> hive failover to an-coord1002 to apply new hive settings to an-coord1001 [analytics]
10:00 <elukey> restart hive daemons on an-coord1002 (standby coord) to pick up new default parquet file format change [analytics]
09:46 <elukey> upgrade presto to 0.246-wmf on an-coord1001, an-presto*, stat100x [analytics]
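The an-coord1002 md array rebuild above is standard mdadm work; a minimal sketch of the kind of commands involved (array and device names are illustrative, not the actual an-coord1002 layout):
    # check current array/resync state
    cat /proc/mdstat
    # fail and remove the bad member, then add the replacement to the RAID1 array
    sudo mdadm --manage /dev/md0 --fail /dev/sdb1
    sudo mdadm --manage /dev/md0 --remove /dev/sdb1
    sudo mdadm --manage /dev/md0 --add /dev/sdb1
    # watch the resync finish before putting hive/mysql back in service
    watch -n5 cat /proc/mdstat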
2021-02-17 §
17:44 <razzi> rebalance kafka partitions for webrequest_upload partition 0 [analytics]
16:14 <razzi> rebalance kafka partitions for eqiad.mediawiki.api-request [analytics]
07:04 <elukey> reboot stat1004/stat1006/stat1007 for kernel upgrades [analytics]
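The kafka partition rebalances above (and in the entries further down) are partition reassignments; a sketch with the stock Apache Kafka tool, where the ZooKeeper string, broker ids and plan files are purely illustrative (the actual runs go through WMF tooling):
    # generate a candidate plan for the topic(s) listed in topics.json
    kafka-reassign-partitions.sh --zookeeper zk1001:2181/kafka \
        --topics-to-move-json-file topics.json \
        --broker-list "1001,1002,1003" --generate
    # apply the saved plan, then poll until the reassignment completes
    kafka-reassign-partitions.sh --zookeeper zk1001:2181/kafka \
        --reassignment-json-file reassignment.json --execute
    kafka-reassign-partitions.sh --zookeeper zk1001:2181/kafka \
        --reassignment-json-file reassignment.json --verify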
2021-02-16 §
22:31 <razzi> rebalance kafka partitions for codfw.mediawiki.api-request [analytics]
17:44 <razzi> rebalance kafka partitions for netflow [analytics]
17:42 <razzi> rebalance kafka partitions for atskafka_test_webrequest_text [analytics]
07:32 <elukey> restart hadoop daemons on an-worker1099 after reconfiguring a new disk [analytics]
06:58 <elukey> restart hdfs/yarn daemons on an-worker1097 to exclude a failed disk [analytics]
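Excluding a failed disk (an-worker1097 above) comes down to dropping its mount point from the datanode/nodemanager directory lists and restarting the daemons; a rough sketch, assuming Bigtop-style config paths and service names (in practice this is puppet-managed):
    # in hdfs-site.xml remove the failed mount from dfs.datanode.data.dir,
    # and in yarn-site.xml from yarn.nodemanager.local-dirs, then:
    sudo systemctl restart hadoop-hdfs-datanode hadoop-yarn-nodemanager
    # confirm the datanode re-registered with one fewer volume
    sudo -u hdfs hdfs dfsadmin -report | grep -A5 an-worker1097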
2021-02-15 §
20:38 <mforns> running hdfs fsck to troubleshoot corrupt blocks [analytics]
17:28 <elukey> restart hdfs namenodes on the main cluster to pick up new racking changes (worker nodes from the backup cluster) [analytics]
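For the corrupt-block troubleshooting and the racking change above, the standard HDFS commands look roughly like this (the path passed to fsck is illustrative):
    # list files that currently have corrupt or missing blocks
    sudo -u hdfs hdfs fsck / -list-corruptfileblocks
    # drill into a suspect path with block and replica-location detail
    sudo -u hdfs hdfs fsck /wmf/data -files -blocks -locations | grep -i corrupt
    # after the namenode restarts, verify the new rack assignments
    sudo -u hdfs hdfs dfsadmin -printTopology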
2021-02-14 §
09:38 <joal> Restart and backfill mediacount and mediarequest, and backfill mediarequest-AQS and mediacount archive [analytics]
09:38 <joal> deploy refinery onto hdfs [analytics]
09:14 <joal> Deploy hotfix for mediarequest and mediacount [analytics]
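Backfills like the mediacount/mediarequest one above are done by rerunning oozie coordinator actions over the affected period; a generic sketch where the oozie URL, coordinator id and dates are placeholders, not the real job:
    # rerun a coordinator over a date range...
    oozie job -oozie http://localhost:11000/oozie \
        -rerun 0012345-210201000000000-oozie-oozi-C \
        -date 2021-01-01T00:00Z::2021-02-13T00:00Z
    # ...or by explicit action numbers
    oozie job -rerun 0012345-210201000000000-oozie-oozi-C -action 1-13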
2021-02-12 §
19:19 <milimetric> deployed refinery with query syntax fix for the last broken cassandra job and an updated EL whitelist [analytics]
18:34 <razzi> rebalance kafka partitions for atskafka_test_webrequest_text [analytics]
18:31 <razzi> rebalance kafka partitions for __consumer_offsets [analytics]
17:48 <joal> Rerun wikidata-articleplaceholder_metrics-wf-2021-2-10 [analytics]
17:47 <joal> Rerun wikidata-specialentitydata_metrics-wf-2021-2-10 [analytics]
17:43 <joal> Rerun wikidata-json_entity-weekly-wf-2021-02-01 [analytics]
17:08 <elukey> reboot presto workers for kernel upgrade [analytics]
16:32 <mforns> finished deployment of analytics-refinery [analytics]
15:26 <mforns> started deployment of analytics-refinery [analytics]
15:16 <elukey> roll restart druid broker on druid-public to pick up new settings [analytics]
07:54 <elukey> roll restart of druid brokers on druid-public - locked after scheduled datasource deletion [analytics]
07:46 <elukey> force a manual run of refinery-druid-drop-public-snapshots on an-launcher1002 (3d before its natural start) - controlled execution to see how druid + 3xdataset replication reacts [analytics]
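The druid-public broker roll restarts above are one-host-at-a-time service restarts (driven by automation in practice); the manual equivalent is roughly this, with host names and the broker port being illustrative:
    # restart brokers one at a time, waiting for each to serve again
    for host in druid1004 druid1005 druid1006; do
        ssh "$host" 'sudo systemctl restart druid-broker'
        # the broker /status endpoint answers once it is back up
        until curl -sf "http://${host}:8082/status" >/dev/null; do sleep 5; done
    done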
2021-02-11 §
14:26 <joal> Restart oozie API job after spark sharelib fix (start: 2021-02-10T18:00) [analytics]
14:20 <joal> Rerun failed clickstream instance 2021-01 after sharelib fix [analytics]
14:16 <joal> Restart oozie after having fixed the spark-2.4.4 sharelib [analytics]
14:12 <joal> Fix oozie sharelib for spark-2.4.4 by copying oozie-sharelib-spark-4.3.0.jar into the spark folder [analytics]
02:19 <milimetric> deployed again to fix old spelling error :) referererererer [analytics]
00:05 <milimetric> deployed refinery and synced to hdfs, restarting cassandra jobs gently [analytics]
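The spark sharelib fix above amounts to putting the missing jar into the active spark sharelib directory on HDFS and making oozie reload it; a sketch, with the lib_* timestamp directory being illustrative:
    # copy the jar into the current spark sharelib on HDFS
    sudo -u oozie hdfs dfs -put oozie-sharelib-spark-4.3.0.jar \
        /user/oozie/share/lib/lib_20210101000000/spark/
    # reload the sharelib (a full oozie restart, as done here, also works)
    oozie admin -sharelibupdate
    oozie admin -shareliblist spark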
2021-02-10 §
21:46 <razzi> rebalance kafka partitions for eqiad.mediawiki.cirrussearch-request [analytics]
21:10 <razzi> rebalance kafka partitions for codfw.mediawiki.cirrussearch-request [analytics]
19:11 <elukey> drop /user/oozie/share + chmod o+rx -R /user/oozie/share + restart oozie [analytics]
17:56 <razzi> rebalance kafka partitions for eventlogging-client-side [analytics]
01:07 <milimetric> deployed refinery with some fixes after BigTop upgrade, will restart three coordinators right now [analytics]
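The /user/oozie/share fix above looks roughly like the commands below; the recreate step, namenode URI and local sharelib tarball path are assumptions about a Bigtop-style layout, not the exact procedure used:
    # drop and recreate the sharelib, then open read/execute to others
    sudo -u oozie hdfs dfs -rm -r -skipTrash /user/oozie/share
    sudo -u oozie oozie-setup sharelib create -fs hdfs://analytics-hadoop \
        -locallib /usr/lib/oozie/oozie-sharelib.tar.gz
    sudo -u hdfs hdfs dfs -chmod -R o+rx /user/oozie/share
    sudo systemctl restart oozie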
2021-02-09 §
22:04 <razzi> rebalance kafka partitions for eqiad.resource-purge [analytics]
20:51 <joal> Rerun webrequest-load-coord-[text|upload] for 2021-02-09T07:00 after data was imported by camus [analytics]
20:50 <razzi> rebalance kafka partitions for codfw.resource-purge [analytics]
20:31 <joal> Rerun webrequest-load-coord-[text|upload] for 2021-02-09T06:00 after data was imported by camus [analytics]
16:30 <elukey> restart datanode on an-worker1100 [analytics]
16:14 <ottomata> restart datanode on analytics1059 with 16g heap [analytics]
16:08 <ottomata> restart datanode on an-worker1080 with 16g heap [analytics]
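The 16g-heap datanode restarts above correspond to raising the DataNode JVM heap in hadoop-env.sh and restarting the service; a sketch, assuming Bigtop-style file locations and service names (the heap setting is puppet-managed in practice):
    # /etc/hadoop/conf/hadoop-env.sh (illustrative)
    export HADOOP_DATANODE_OPTS="-Xms16g -Xmx16g ${HADOOP_DATANODE_OPTS}"
    # restart the datanode and confirm the JVM picked up the new -Xmx
    sudo systemctl restart hadoop-hdfs-datanode
    sudo -u hdfs jps -v | grep -i datanode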