2026-03-30 §
11:52 <jmm@cumin2002> START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet [production]
11:51 <godog> bounce neutron-l3-agent on cloudnet1005 - T421054 [production]
11:06 <btullis@deploy1003> helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'. [production]
11:05 <btullis@deploy1003> helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'. [production]
11:05 <btullis@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
11:04 <btullis@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
10:37 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast4006.wikimedia.org with OS bookworm [production]
10:15 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast4006.wikimedia.org with reason: host reimage [production]
10:09 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on bast4006.wikimedia.org with reason: host reimage [production]
09:46 <jmm@cumin2002> START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS bookworm [production]
09:42 <jmm@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host bast4006.wikimedia.org with OS trixie [production]
09:19 <ayounsi@cumin1003> END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 42 [production]
09:17 <ayounsi@cumin1003> START - Cookbook sre.network.peering with action 'email' for AS: 42 [production]
09:15 <ayounsi@cumin1003> END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 12200 [production]
09:14 <ayounsi@cumin1003> START - Cookbook sre.network.peering with action 'email' for AS: 12200 [production]
09:11 <tappof> prometheus[12]008: reboot (T419960) [production]
09:10 <tappof> prometheus[12]006: reboot (T419960) [production]
08:56 <jmm@cumin2002> START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS trixie [production]
08:52 <XioNoX> push pfw policy - T421556 [production]
08:51 <tappof> prometheus[12]007: reboot (T419960) [production]
08:38 <javiermonton@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply [production]
08:38 <javiermonton@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply [production]
08:37 <tappof> prometheus[12]005: reboot (T419960) [production]
08:34 <jmm@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host bast4006.wikimedia.org with OS trixie [production]
08:17 <javiermonton@deploy1003> Finished scap sync-world: Backport for [[gerrit:1261377|stream: mediawiki.page_html_content_change (T421341)]] (duration: 35m 10s) [production]
08:03 <javiermonton@deploy1003> javiermonton: Continuing with sync [production]
08:00 <javiermonton@deploy1003> javiermonton: Backport for [[gerrit:1261377|stream: mediawiki.page_html_content_change (T421341)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. [production]
07:54 <godog> deploy rabbitmq changes to allow cli communication - T420923 [production]
07:48 <jmm@cumin2002> START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS trixie [production]
07:48 <jmm@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host bast4006.wikimedia.org with OS trixie [production]
07:48 <jmm@cumin2002> START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS trixie [production]
07:45 <jmm@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host bast4006.wikimedia.org with OS trixie [production]
07:45 <jmm@cumin2002> START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS trixie [production]
07:42 <javiermonton@deploy1003> Started scap sync-world: Backport for [[gerrit:1261377|stream: mediawiki.page_html_content_change (T421341)]] [production]
07:38 <jmm@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host bast4006.wikimedia.org with OS trixie [production]
07:24 <tappof> prometheus7002: switch to nftables and reboot (T419960) [production]
07:18 <tappof> prometheus6002: switch to nftables and reboot (T419960) [production]
07:11 <tappof> prometheus5002: switch to nftables and reboot (T419960) [production]
07:08 <tappof> prometheus4003: reboot (T419960) [production]
07:05 <jmm@cumin2002> START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS trixie [production]
07:04 <tappof> prometheus3004: switch to nftables and reboot (T419960) [production]
05:16 <marostegui@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb1022.eqiad.wmnet with reason: Downgrade clouddb1022 to 10.11.13 [production]
05:14 <marostegui@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Downgrade clouddb1022 to 10.11.13 [production]
02:07 <mwpresync@deploy1003> Finished scap build-images: Publishing wmf/next image (duration: 06m 50s) [production]
02:00 <mwpresync@deploy1003> Started scap build-images: Publishing wmf/next image [production]
2026-03-29 §
02:07 <mwpresync@deploy1003> Finished scap build-images: Publishing wmf/next image (duration: 06m 13s) [production]
02:00 <mwpresync@deploy1003> Started scap build-images: Publishing wmf/next image [production]
2026-03-28 §
14:48 <dzahn@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on zuul2002.codfw.wmnet with reason: T421398 [production]
14:48 <dzahn@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: T421398 [production]
14:16 <mutante> releases1003 - re-enabled puppet which was disabled due to T418109 but should not have been disabled during switch of the deployment server; leading to T421532 [production]