2025-05-20 §
17:31 <vgutierrez@puppetserver1001> conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet [production]
17:31 <vgutierrez> repool cp4037 with edge uniques enabled, stats available on https://grafana.wikimedia.org/goto/fYSIMlaHR?orgId=1 - T391411 [production]
17:29 <dzahn@cumin1002> START - Cookbook sre.dns.netbox [production]
17:29 <dzahn@cumin1002> START - Cookbook sre.ganeti.makevm for new host zuul1001.eqiad.wmnet [production]
17:27 <dzahn@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2001.codfw.wmnet with OS bullseye [production]
17:21 <topranks> enable FPC 0 (10x100G) card in cr2-codfw (T393552) [production]
17:18 <swfrench@deploy1003> helmfile [codfw] DONE helmfile.d/services/mw-debug: apply [production]
17:18 <swfrench@deploy1003> helmfile [codfw] START helmfile.d/services/mw-debug: apply [production]
17:17 <swfrench@deploy1003> helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply [production]
17:16 <swfrench@deploy1003> helmfile [eqiad] START helmfile.d/services/mw-debug: apply [production]
17:13 <swfrench@deploy1003> Stopping before sync operations [production]
17:13 <swfrench@deploy1003> Started scap sync-world: Non-deploy scap run to switch mw-debug/pinkunicorn to PHP 8.1 - T391057 [production]
17:11 <dzahn@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2001.codfw.wmnet with reason: host reimage [production]
17:11 <cmooney@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 11 hosts with reason: replace cr2-codfw switch control boards and install new line card [production]
17:08 <dzahn@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2001.codfw.wmnet with reason: host reimage [production]
16:53 <dzahn@cumin1002> START - Cookbook sre.hosts.reimage for host zuul2001.codfw.wmnet with OS bullseye [production]
16:46 <eevans@cumin1002> START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-eqiad: Upgrading to Java 11.0.27 - eevans@cumin1002 [production]
16:43 <eevans@cumin1002> END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-codfw: Upgrading to Java 11.0.27 - eevans@cumin1002 [production]
16:06 <klausman@cumin1002> END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1002 [production]
15:58 <vgutierrez@puppetserver1001> conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet [production]
15:57 <vgutierrez> depooling cp4037 before enabling edge uniques - T391411 [production]
15:54 <vgutierrez@puppetserver1001> conftool action : set/pooled=yes; selector: name=cp4038.ulsfo.wmnet [production]
15:52 <cgoubert@deploy1003> helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply [production]
15:51 <cgoubert@deploy1003> helmfile [eqiad] START helmfile.d/services/mw-cron: apply [production]
15:49 <vgutierrez@puppetserver1001> conftool action : set/pooled=no; selector: name=cp4038.ulsfo.wmnet [production]
15:49 <klausman@cumin1002> START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1002 [production]
15:21 <jmm@cumin2002> END (PASS) - Cookbook sre.o11y.roll-restart-reboot-logstash-collectors (exit_code=0) rolling restart_daemons on A:logstash-collector [production]
15:15 <eevans@cumin1002> START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-codfw: Upgrading to Java 11.0.27 - eevans@cumin1002 [production]
15:13 <jmm@cumin2002> START - Cookbook sre.o11y.roll-restart-reboot-logstash-collectors rolling restart_daemons on A:logstash-collector [production]
15:09 <klausman@cumin1002> END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1002 [production]
15:05 <btullis@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply [production]
15:05 <btullis@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply [production]
14:59 <btullis@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply [production]
14:58 <brouberol@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 21 days, 0:00:00 on kafka-jumbo[1016-1018].eqiad.wmnet with reason: Parted config is broken causing the hosts to have no data disk [production]
14:56 <btullis@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply [production]
14:54 <bking@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 65 hosts with reason: eqiad is depooled, noisy alerts [production]
14:53 <topranks> shutting down control board 1 on cr2-codfw (T393552) [production]
14:52 <topranks> shutting down backup RE1 on cr2-codfw (T393552) [production]
14:51 <klausman@cumin1002> START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1002 [production]
14:50 <bking@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1087.eqiad.wmnet with OS bullseye [production]
14:48 <moritzm> installing expat security updates [production]
14:39 <topranks> switching active routing-engine to RE0 on cr2-codfw (this will cause protocol adjacencies to flap) (T364092) [production]
14:25 <bking@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1087.eqiad.wmnet with reason: host reimage [production]
14:21 <bking@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch1087.eqiad.wmnet with reason: host reimage [production]
14:21 <bking@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1083.eqiad.wmnet with OS bullseye [production]
14:18 <bking@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1082.eqiad.wmnet with OS bullseye [production]
14:06 <bking@cumin2002> END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch1087 [production]
14:06 <bking@cumin2002> START - Cookbook sre.hosts.move-vlan for host cirrussearch1087 [production]
14:06 <bking@cumin2002> START - Cookbook sre.hosts.reimage for host cirrussearch1087.eqiad.wmnet with OS bullseye [production]
14:04 <bking@cumin2002> END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from elastic1087 to cirrussearch1087 [production]