production SAL

2026-03-30 §
11:52	<jmm@cumin2002>	START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet	[production]
11:51	<godog>	bounce neutron-l3-agent on cloudnet1005 - T421054	[production]
11:06	<btullis@deploy1003>	helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.	[production]
11:05	<btullis@deploy1003>	helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.	[production]
11:05	<btullis@deploy1003>	helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.	[production]
11:04	<btullis@deploy1003>	helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.	[production]
10:37	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast4006.wikimedia.org with OS bookworm	[production]
10:15	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast4006.wikimedia.org with reason: host reimage	[production]
10:09	<jmm@cumin2002>	START - Cookbook sre.hosts.downtime for 2:00:00 on bast4006.wikimedia.org with reason: host reimage	[production]
09:46	<jmm@cumin2002>	START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS bookworm	[production]
09:42	<jmm@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host bast4006.wikimedia.org with OS trixie	[production]
09:19	<ayounsi@cumin1003>	END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 42	[production]
09:17	<ayounsi@cumin1003>	START - Cookbook sre.network.peering with action 'email' for AS: 42	[production]
09:15	<ayounsi@cumin1003>	END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 12200	[production]
09:14	<ayounsi@cumin1003>	START - Cookbook sre.network.peering with action 'email' for AS: 12200	[production]
09:11	<tappof>	prometheus[12]008: reboot (T419960)	[production]
09:10	<tappof>	prometheus[12]006: reboot (T419960)	[production]
08:56	<jmm@cumin2002>	START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS trixie	[production]
08:52	<XioNoX>	push pfw policy - T421556	[production]
08:51	<tappof>	prometheus[12]007: reboot (T419960)	[production]
08:38	<javiermonton@deploy1003>	helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply	[production]
08:38	<javiermonton@deploy1003>	helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply	[production]
08:37	<tappof>	prometheus[12]005: reboot (T419960)	[production]
08:34	<jmm@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host bast4006.wikimedia.org with OS trixie	[production]
08:17	<javiermonton@deploy1003>	Finished scap sync-world: Backport for [[gerrit:1261377|stream: mediawiki.page_html_content_change (T421341)]] (duration: 35m 10s)	[production]
08:03	<javiermonton@deploy1003>	javiermonton: Continuing with sync	[production]
08:00	<javiermonton@deploy1003>	javiermonton: Backport for [[gerrit:1261377|stream: mediawiki.page_html_content_change (T421341)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.	[production]
07:54	<godog>	deploy rabbitmq changes to allow cli communication - T420923	[production]
07:48	<jmm@cumin2002>	START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS trixie	[production]
07:48	<jmm@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host bast4006.wikimedia.org with OS trixie	[production]
07:48	<jmm@cumin2002>	START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS trixie	[production]
07:45	<jmm@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host bast4006.wikimedia.org with OS trixie	[production]
07:45	<jmm@cumin2002>	START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS trixie	[production]
07:42	<javiermonton@deploy1003>	Started scap sync-world: Backport for [[gerrit:1261377|stream: mediawiki.page_html_content_change (T421341)]]	[production]
07:38	<jmm@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host bast4006.wikimedia.org with OS trixie	[production]
07:24	<tappof>	prometheus7002: switch to nftables and reboot (T419960)	[production]
07:18	<tappof>	prometheus6002: switch to nftables and reboot (T419960)	[production]
07:11	<tappof>	prometheus5002: switch to nftables and reboot (T419960)	[production]
07:08	<tappof>	prometheus4003: reboot (T419960)	[production]
07:05	<jmm@cumin2002>	START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS trixie	[production]
07:04	<tappof>	prometheus3004: switch to nftables and reboot (T419960)	[production]
05:16	<marostegui@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb1022.eqiad.wmnet with reason: Downgrade clouddb1022 to 10.11.13	[production]
05:14	<marostegui@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Downgrade clouddb1022 to 10.11.13	[production]
02:07	<mwpresync@deploy1003>	Finished scap build-images: Publishing wmf/next image (duration: 06m 50s)	[production]
02:00	<mwpresync@deploy1003>	Started scap build-images: Publishing wmf/next image	[production]
2026-03-29 §
02:07	<mwpresync@deploy1003>	Finished scap build-images: Publishing wmf/next image (duration: 06m 13s)	[production]
02:00	<mwpresync@deploy1003>	Started scap build-images: Publishing wmf/next image	[production]
2026-03-28 §
14:48	<dzahn@cumin2002>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on zuul2002.codfw.wmnet with reason: T421398	[production]
14:48	<dzahn@cumin2002>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: T421398	[production]
14:16	<mutante>	releases1003 - re-enabled puppet which was disabled due to T418109 but should not have been disabled during switch of the deployment server; leading to T421532	[production]