2025-06-04
23:55 <brett@cumin2002> END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir [production]
22:45 <brett@cumin2002> START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir [production]
22:30 <robh@cumin2002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp7016.magru.wmnet [production]
22:27 <vriley@cumin1002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1185.eqiad.wmnet with OS bullseye [production]
22:20 <robh@cumin2002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp7014.magru.wmnet [production]
22:18 <damilare> SmashPig upgraded from d08693e5 to 3222a1f3 [production]
22:16 <ladsgroup@deploy1003> Finished scap sync-world: Backport for [[gerrit:1153725|Bump cache key version in EventStore (T396075)]] (duration: 13m 54s) [production]
22:12 <robh@cumin2002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7016.magru.wmnet [production]
22:12 <robh@cumin2002> END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cp7015.magru.wmnet [production]
22:12 <robh@cumin2002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7015.magru.wmnet [production]
22:11 <robh@cumin2002> END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cp7015.magru.wmnet [production]
22:11 <brett> sudo -i cumin 'A:ncredir' 'depool && apt-get update && apt-get upgrade -y && pool' -b1 -s10 [production]
22:09 <ladsgroup@deploy1003> ladsgroup: Continuing with sync [production]
22:04 <ladsgroup@deploy1003> ladsgroup: Backport for [[gerrit:1153725|Bump cache key version in EventStore (T396075)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. [production]
22:02 <ladsgroup@deploy1003> Started scap sync-world: Backport for [[gerrit:1153725|Bump cache key version in EventStore (T396075)]] [production]
22:02 <robh@cumin2002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7015.magru.wmnet [production]
22:02 <robh@cumin2002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7014.magru.wmnet [production]
22:02 <robh@cumin2002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp7013.magru.wmnet [production]
21:58 <robh@cumin2002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp7012.magru.wmnet [production]
21:43 <robh@cumin2002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7013.magru.wmnet [production]
21:42 <robh@cumin2002> END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cp7011.magru.wmnet [production]
21:40 <robh@cumin2002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7012.magru.wmnet [production]
21:40 <robh@cumin2002> END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cp7010.magru.wmnet [production]
21:39 <robh@cumin2002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7010.magru.wmnet [production]
21:35 <robh@cumin2002> END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cp7010.magru.wmnet [production]
21:29 <ryankemper@cumin2002> START - Cookbook sre.wdqs.data-reload reloading scholarly_articles on wdqs1023.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/scholarly/20250526/ using stat1011.eqiad.wmnet) [production]
21:25 <robh@cumin2002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7011.magru.wmnet [production]
21:25 <robh@cumin2002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7010.magru.wmnet [production]
21:24 <ryankemper@cumin2002> START - Cookbook sre.wdqs.data-reload reloading wikidata_main on wdqs1022.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/main/20250526/ using stat1009.eqiad.wmnet) [production]
21:22 <robh@cumin2002> END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cp7009.magru.wmnet [production]
21:14 <robh@cumin2002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp7008.magru.wmnet [production]
21:07 <vriley@cumin1002> START - Cookbook sre.hosts.reimage for host an-worker1185.eqiad.wmnet with OS bullseye [production]
21:06 <vriley@cumin1002> END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host an-worker1186.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED [production]
21:05 <robh@cumin2002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7009.magru.wmnet [production]
21:05 <robh@cumin2002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7008.magru.wmnet [production]
21:04 <cjming> end of UTC late backport window [production]
21:04 <robh@cumin2002> END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cp7007.magru.wmnet [production]
21:02 <cjming@deploy1003> Finished scap sync-world: Backport for [[gerrit:1153689|SUL3: Retry local login on failure due to invalid/expired login token (T390784)]], [[gerrit:1153690|SUL3: Retry local login on failure… (follow-ups) (T390784)]], [[gerrit:1153691|SUL3: Retry local login on failure due to invalid/expired login token (T390784)]], [[gerrit:1153692|SUL3: Retry local login on failure… (follow-ups) (T390784)]] (d [production]
21:01 <jforrester@deploy1003> helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply [production]
20:55 <cjming@deploy1003> matmarex, cjming: Continuing with sync [production]
20:55 <vriley@cumin1002> START - Cookbook sre.hosts.provision for host an-worker1186.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED [production]
20:54 <cjming@deploy1003> matmarex, cjming: Backport for [[gerrit:1153689|SUL3: Retry local login on failure due to invalid/expired login token (T390784)]], [[gerrit:1153690|SUL3: Retry local login on failure… (follow-ups) (T390784)]], [[gerrit:1153691|SUL3: Retry local login on failure due to invalid/expired login token (T390784)]], [[gerrit:1153692|SUL3: Retry local login on failure… (follow-ups) (T390784)]] synced to [production]
20:51 <cjming@deploy1003> Started scap sync-world: Backport for [[gerrit:1153689|SUL3: Retry local login on failure due to invalid/expired login token (T390784)]], [[gerrit:1153690|SUL3: Retry local login on failure… (follow-ups) (T390784)]], [[gerrit:1153691|SUL3: Retry local login on failure due to invalid/expired login token (T390784)]], [[gerrit:1153692|SUL3: Retry local login on failure… (follow-ups) (T390784)]] [production]
20:51 <jforrester@deploy1003> helmfile [codfw] START helmfile.d/services/wikifunctions: apply [production]
20:50 <robh@cumin2002> END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cp7006.magru.wmnet [production]
20:46 <robh@cumin2002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7007.magru.wmnet [production]
20:44 <robh@cumin2002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp7005.magru.wmnet [production]
20:40 <robh@cumin2002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7006.magru.wmnet [production]
20:38 <cjming@deploy1003> Finished scap sync-world: Backport for [[gerrit:1153686|Treat File::getShortDesc() as possibly unsafe HTML (T395834)]], [[gerrit:1153687|Treat File::getShortDesc() as possibly unsafe HTML (T395834)]] (duration: 15m 37s) [production]
20:37 <robh@cumin2002> END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cp7004.magru.wmnet [production]