2026-03-10
10:20 <ayounsi@cumin1003> START - Cookbook sre.hosts.reimage for host ms-fe2024.codfw.wmnet with OS bullseye [production]
10:17 <gkyziridis@deploy2002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' . [production]
09:32 <ayounsi@cumin1003> END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device cr2-eqdfw [production]
09:31 <ayounsi@cumin1003> START - Cookbook sre.network.tls for network device cr2-eqdfw [production]
09:22 <derick@deploy2002> mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=loginwiki --logwiki=metawiki TMPRI1975 FondueFanatic # T419499 [production]
09:00 <arnaudb@dns1005> END - running authdns-update [production]
09:00 <godog> restore all host interfaces - T417393 [production]
08:58 <arnaudb@dns1005> START - running authdns-update [production]
08:30 <godog> disabled interface for cloudcephmon1004 - T417393 [production]
08:22 <godog> disabled interfaces for cloudcephosd1021 cloudcephosd1042 cloudcephosd1043 cloudcephosd1018 cloudcephosd1022 - T417393 [production]
08:18 <godog> disabled interfaces for cloudcephosd1016 cloudcephosd1017 cloudcephosd1016 cloudcephosd1018 cloudcephosd1017 cloudcephosd1035 - T417393 [production]
08:05 <godog> start disabling cloudcephosd interfaces - T417393 [production]
07:49 <godog> prep cloudsw reboot tests 'ceph osd set noout' - T417393 [production]
07:41 <filippo@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 19 hosts with reason: switch down tests [production]
06:14 <bking@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs2009.codfw.wmnet with OS bookworm [production]
04:09 <pt1979@cumin2002> END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device asw1-23-ulsfo [production]
04:08 <pt1979@cumin2002> START - Cookbook sre.network.tls for network device asw1-23-ulsfo [production]
04:01 <mwpresync@deploy2002> Pruned MediaWiki: 1.46.0-wmf.16 (duration: 01m 48s) [production]
02:09 <mwpresync@deploy2002> Finished scap build-images: Publishing wmf/next image (duration: 08m 10s) [production]
02:00 <mwpresync@deploy2002> Started scap build-images: Publishing wmf/next image [production]
01:37 <ryankemper> [WDQS] T410573 repooled wdqs1011.eqiad.wmnet - erroneously depooled since `2025-11-19` by failed `sre.wdqs.reboot` cookbook [production]
00:42 <vriley@cumin1003> START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie [production]
00:39 <vriley@cumin1003> END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED [production]
00:29 <vriley@cumin1003> START - Cookbook sre.hosts.provision for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED [production]
2026-03-09
22:51 <rzl> root@apt1002:~# reprepro --noskipold --restrict vopsbot update bookworm-wikimedia [production]
22:34 <bking@cumin2002> END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM dse-k8s-ctrl1001.eqiad.wmnet [production]
22:32 <bking@cumin2002> START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1001.eqiad.wmnet [production]
22:30 <bking@cumin2002> END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM dse-k8s-ctrl1002.eqiad.wmnet [production]
22:29 <bking@cumin2002> START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1002.eqiad.wmnet [production]
22:28 <bking@cumin2002> END (FAIL) - Cookbook sre.ganeti.reboot-vm (exit_code=99) for VM dse-k8s-ctrl1002.eqiad.wmnet [production]
22:28 <bking@cumin2002> START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1002.eqiad.wmnet [production]
22:28 <bking@cumin2002> END (FAIL) - Cookbook sre.ganeti.reboot-vm (exit_code=99) for VM dse-k8s-ctrl1002.eqiad.wmnet [production]
22:28 <bking@cumin2002> START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1002.eqiad.wmnet [production]
22:03 <andrew@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw2004-dev.codfw.wmnet with OS trixie [production]
22:02 <alexsanford> Redeployed security fix for T419186 [production]
21:44 <andrew@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage [production]
21:40 <andrew@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage [production]
21:37 <cdobbins@puppetserver1001> conftool action : set/pooled=yes; selector: name=cp7002.magru.wmnet [production]
21:34 <cdobbins@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7002.magru.wmnet with OS trixie [production]
21:29 <alexsanford> Deployed security fix for T419186 [production]
21:22 <andrew@cumin2002> START - Cookbook sre.hosts.reimage for host cloudgw2004-dev.codfw.wmnet with OS trixie [production]
21:21 <andrew@cumin2002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudgw2004-dev.codfw.wmnet with OS trixie [production]
21:17 <dani@deploy2002> Finished scap sync-world: Backport for [[gerrit:1249370|Pre-deploy participant recruitment survey on ptwiki and trwiki (T419275)]] (duration: 08m 15s) [production]
21:13 <dani@deploy2002> dani: Continuing with sync [production]
21:11 <dani@deploy2002> dani: Backport for [[gerrit:1249370|Pre-deploy participant recruitment survey on ptwiki and trwiki (T419275)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. [production]
21:09 <dani@deploy2002> Started scap sync-world: Backport for [[gerrit:1249370|Pre-deploy participant recruitment survey on ptwiki and trwiki (T419275)]] [production]
21:08 <andrew@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage [production]
21:05 <cdobbins@cumin2002> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cp7002.magru.wmnet with reason: host reimage [production]
21:02 <cdobbins@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on cp7002.magru.wmnet with reason: host reimage [production]
21:01 <andrew@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage [production]