2025-03-18
17:09 <elukey@cumin2002> START - Cookbook sre.hosts.provision for host ganeti2048.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART [production]
17:05 <topranks> move traffic off cr1-drms to allow for pic reset / port reconfiguration T389071 [production]
17:04 <root@deploy2002> helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/jaeger: apply [production]
17:00 <elukey@cumin1002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host puppetserver2004.codfw.wmnet with OS bookworm [production]
16:51 <fabfur@cumin1002> conftool action : set/pooled=yes; selector: name=cp4038.ulsfo.wmnet [production]
16:39 <brett@cumin2002> START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade/restart of Apache Traffic Server on P{cp70[02-16].magru.wmnet} and A:cp for 9.2.9-1wm1 [production]
16:38 <elukey> restart pybal on lvs1020 and lvs1019 to pick up kartotherian svc changes [production]
16:38 <fabfur> repooling cp4038 (T388147) [production]
16:38 <Ammar> T389226 Ran mwscript-k8s --comment="T389226" -f -- extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=mediawikiwiki --logwiki=metawiki 'Schwarze Feder' 'AndreasKemper' [production]
16:37 <Ammar> T389226 Ran mwscript-k8s --comment="T389226" -f -- extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=bnwikivoyage --logwiki=metawiki 'Arafatuniofdhaka' 'আরাফাত হোসেন ভূঁইয়া' [production]
16:37 <fabfur> enabled puppet on A:cp (T388147) [production]
16:35 <brett@cumin2002> END (FAIL) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=1) Rolling upgrade/restart of Apache Traffic Server on A:magru and A:cp for 9.2.9-1wm1 [production]
16:34 <brett@cumin2002> START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade/restart of Apache Traffic Server on A:magru and A:cp for 9.2.9-1wm1 [production]
16:33 <elukey> restart pybal on lvs2013 (kartotherian's svc change) [production]
16:33 <root@deploy2002> helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/jaeger: apply [production]
16:33 <root@deploy2002> helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/jaeger: apply [production]
16:29 <elukey> restart pybal on lvs2014 [production]
16:29 <fabfur@cumin1002> conftool action : set/pooled=no; selector: name=cp4038.ulsfo.wmnet [production]
16:28 <fabfur> enabled puppet and depooled cp4038 [production]
16:27 <elukey> disable puppet on lvs low traffic hosts in eqiad/codfw to restart pybal (kartotherian svc change) [production]
16:25 <fabfur> disabled puppet on A:cp for T388147 [production]
16:24 <root@deploy2002> helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/jaeger: apply [production]
16:21 <root@deploy2002> helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/jaeger: apply [production]
16:17 <elukey> removed kartotherian-related confd error files from config-master2001 - related to a maintenance issue [production]
16:13 <brennen@deploy2002> Finished deploy [phabricator/deployment@8884125]: deploy phab1004 for T389220 (duration: 00m 52s) [production]
16:12 <brennen@deploy2002> Started deploy [phabricator/deployment@8884125]: deploy phab1004 for T389220 [production]
16:12 <brennen@deploy2002> Finished deploy [phabricator/deployment@8884125]: deploy phab2002 for T389220 (duration: 00m 29s) [production]
16:11 <brennen@deploy2002> Started deploy [phabricator/deployment@8884125]: deploy phab2002 for T389220 [production]
16:10 <elukey@puppetserver1001> conftool action : set/pooled=yes:weight=10; selector: name=wikikube-worker2.*,dc=codfw,cluster=maps,service=kartotherian-k8s-ssl [production]
16:10 <arnaudb@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab1004.eqiad.wmnet with reason: bugfix [production]
16:10 <elukey@puppetserver1001> conftool action : set/pooled=yes:weight=10; selector: name=wikikube-worker1.*,dc=eqiad,cluster=maps,service=kartotherian-k8s-ssl [production]
16:10 <arnaudb@cumin1002> DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1:00:00 on phab1004.eqiad.wmnet with reason: debugging T389079 [production]
16:09 <arnaudb@cumin1002> DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 1:00:00 on phabricator.wikimedia.org with reason: bug fix [production]
16:09 <arnaudb@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab1004.eqiad.wmnet with reason: bug fix [production]
16:06 <root@deploy2002> helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/jaeger: apply [production]
16:02 <root@deploy2002> helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/jaeger: apply [production]
16:01 <root@deploy2002> helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/jaeger: apply [production]
15:56 <claime> Silenced PHPFPMTooBusy for release=canary for 6d - T389224 [production]
15:51 <elukey@cumin1002> END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2047.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART [production]
15:51 <root@deploy2002> helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/jaeger: apply [production]
15:49 <elukey@cumin1002> START - Cookbook sre.hosts.reimage for host puppetserver2004.codfw.wmnet with OS bookworm [production]
15:46 <elukey@cumin1002> START - Cookbook sre.hosts.provision for host ganeti2047.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART [production]
15:34 <hnowlan@cumin2002> END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) depool all services in codfw: Datacenter Switchover - T385155 [production]
15:23 <arturo> hard-reboot tools-prometheus-6, not responding to ssh [tools]
15:18 <sergi0> run CommunityUpdates config schema migration `foreachwikiindblist growthexperiments extensions/CommunityConfiguration/maintenance/migrateConfig.php CommunityUpdates` (T387737) [releng]
15:07 <arnaudb@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on vrts1003.eqiad.wmnet with reason: debugging T389079 [production]
15:05 <hnowlan@cumin2002> START - Cookbook sre.discovery.datacenter depool all services in codfw: Datacenter Switchover - T385155 [production]
15:01 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply [production]
15:00 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply [production]
14:56 <inflatador> bking@logstash1033 running puppet agent to confirm that CR 1128880 didn't cause problems T386868 [production]