production SAL

2025-05-20
17:11	<cmooney@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 11 hosts with reason: replace cr2-codfw switch control boards and install new line card	[production]
17:08	<dzahn@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2001.codfw.wmnet with reason: host reimage	[production]
16:53	<dzahn@cumin1002>	START - Cookbook sre.hosts.reimage for host zuul2001.codfw.wmnet with OS bullseye	[production]
16:46	<eevans@cumin1002>	START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-eqiad: Upgrading to Java 11.0.27 - eevans@cumin1002	[production]
16:43	<eevans@cumin1002>	END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-codfw: Upgrading to Java 11.0.27 - eevans@cumin1002	[production]
16:06	<klausman@cumin1002>	END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1002	[production]
15:58	<vgutierrez@puppetserver1001>	conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet	[production]
15:57	<vgutierrez>	depooling cp4037 before enabling edge uniques - T391411	[production]
15:54	<vgutierrez@puppetserver1001>	conftool action : set/pooled=yes; selector: name=cp4038.ulsfo.wmnet	[production]
15:52	<cgoubert@deploy1003>	helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply	[production]
15:51	<cgoubert@deploy1003>	helmfile [eqiad] START helmfile.d/services/mw-cron: apply	[production]
15:49	<vgutierrez@puppetserver1001>	conftool action : set/pooled=no; selector: name=cp4038.ulsfo.wmnet	[production]
15:49	<klausman@cumin1002>	START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1002	[production]
15:21	<jmm@cumin2002>	END (PASS) - Cookbook sre.o11y.roll-restart-reboot-logstash-collectors (exit_code=0) rolling restart_daemons on A:logstash-collector	[production]
15:15	<eevans@cumin1002>	START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-codfw: Upgrading to Java 11.0.27 - eevans@cumin1002	[production]
15:13	<jmm@cumin2002>	START - Cookbook sre.o11y.roll-restart-reboot-logstash-collectors rolling restart_daemons on A:logstash-collector	[production]
15:09	<klausman@cumin1002>	END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1002	[production]
15:05	<btullis@deploy1003>	helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply	[production]
15:05	<btullis@deploy1003>	helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply	[production]
14:59	<btullis@deploy1003>	helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply	[production]
14:58	<brouberol@cumin2002>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 21 days, 0:00:00 on kafka-jumbo[1016-1018].eqiad.wmnet with reason: Parted config is broken causing the hosts to have no data disk	[production]
14:56	<btullis@deploy1003>	helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply	[production]
14:54	<bking@cumin2002>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 65 hosts with reason: eqiad is depooled, noisy alerts	[production]
14:53	<topranks>	shutting down control board 1 on cr2-codfw (T393552)	[production]
14:52	<topranks>	shutting down backup RE1 on cr2-codfw (T393552)	[production]
14:51	<klausman@cumin1002>	START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1002	[production]
14:50	<bking@cumin2002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1087.eqiad.wmnet with OS bullseye	[production]
14:48	<moritzm>	installing expat security updates	[production]
14:39	<topranks>	switching active routing-engine to RE0 on cr2-codfw (this will cause protocol adjacencies to flap) (T364092)	[production]
14:25	<bking@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1087.eqiad.wmnet with reason: host reimage	[production]
14:21	<bking@cumin2002>	START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch1087.eqiad.wmnet with reason: host reimage	[production]
14:21	<bking@cumin2002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1083.eqiad.wmnet with OS bullseye	[production]
14:18	<bking@cumin2002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1082.eqiad.wmnet with OS bullseye	[production]
14:06	<bking@cumin2002>	END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch1087	[production]
14:06	<bking@cumin2002>	START - Cookbook sre.hosts.move-vlan for host cirrussearch1087	[production]
14:06	<bking@cumin2002>	START - Cookbook sre.hosts.reimage for host cirrussearch1087.eqiad.wmnet with OS bullseye	[production]
14:04	<bking@cumin2002>	END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from elastic1087 to cirrussearch1087	[production]
14:03	<topranks>	switching active routing-engine to RE1 on cr2-codfw (this will cause protocol adjacencies to flap) (T364092)	[production]
14:03	<bking@cumin2002>	END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cirrussearch1087	[production]
14:02	<bking@cumin2002>	START - Cookbook sre.network.configure-switch-interfaces for host cirrussearch1087	[production]
14:02	<bking@cumin2002>	END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cirrussearch1087 on all recursors	[production]
14:02	<bking@cumin2002>	START - Cookbook sre.dns.wipe-cache cirrussearch1087 on all recursors	[production]
14:02	<bking@cumin2002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
14:02	<bking@cumin2002>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming elastic1087 to cirrussearch1087 - bking@cumin2002"	[production]
14:01	<phuedx@deploy1003>	Finished scap sync-world: Backport for [[gerrit:1147796|ext-EventStreamConfig: Update product_metrics.web_base stream (T394457)]] (duration: 14m 14s)	[production]
14:01	<jynus@cumin1002>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2200.codfw.wmnet,db1216.eqiad.wmnet with reason: Move s8 to s3	[production]
13:59	<bking@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1083.eqiad.wmnet with reason: host reimage	[production]
13:58	<bking@cumin2002>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming elastic1087 to cirrussearch1087 - bking@cumin2002"	[production]
13:56	<topranks>	rebooting backup routing-engine RE1 on cr2-codfw to install JunOS upgrade (T364092)	[production]
13:55	<bking@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1082.eqiad.wmnet with reason: host reimage	[production]