2025-11-21
15:40 <cmooney@cumin1003> END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host relforge1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART [production]
15:40 <cmooney@cumin1003> START - Cookbook sre.hosts.provision for host relforge1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART [production]
15:37 <cmooney@cumin1003> END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host relforge1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART [production]
15:32 <cmooney@cumin1003> START - Cookbook sre.hosts.provision for host relforge1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART [production]
14:48 <sukhe> homer "asw*drmrs*" commit "bring up hcaptcha-proxy600[12]": T409780 [production]
14:48 <bking@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1009.eqiad.wmnet with OS bookworm [production]
14:42 <ladsgroup@cumin1003> dbctl commit (dc=all): 'Depooling db1159 (T410589)', diff saved to https://phabricator.wikimedia.org/P85441 and previous config saved to /var/cache/conftool/dbconfig/20251121-144238-ladsgroup.json [production]
14:42 <ladsgroup@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1159.eqiad.wmnet with reason: Maintenance [production]
14:35 <bking@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS bookworm [production]
14:30 <bking@cumin2002> START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS bookworm [production]
14:27 <bking@cumin2002> START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS bookworm [production]
14:26 <jmm@cumin2002> DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging West1 out of all services on: 2410 hosts [production]
14:25 <sukhe> homer "cr*eqsin*" commit "bring up hcaptcha-proxy500[12]": T409780 [production]
14:25 <bking@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1009.eqiad.wmnet with OS bookworm [production]
14:25 <ladsgroup@cumin1003> dbctl commit (dc=all): 'Repool pc8 (T405942)', diff saved to https://phabricator.wikimedia.org/P85440 and previous config saved to /var/cache/conftool/dbconfig/20251121-142500-ladsgroup.json [production]
14:24 <bking@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS bookworm [production]
14:23 <sukhe> homer "cr*ulsfo*" commit "bring up hcaptcha-proxy400[12]": T409780 [production]
14:21 <ladsgroup@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1018.eqiad.wmnet with reason: Maint [production]
14:21 <ladsgroup@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2018.codfw.wmnet with reason: Maint [production]
14:21 <ladsgroup@cumin1003> dbctl commit (dc=all): 'Depool pc8 (T405942)', diff saved to https://phabricator.wikimedia.org/P85439 and previous config saved to /var/cache/conftool/dbconfig/20251121-142059-ladsgroup.json [production]
14:18 <sukhe> homer "cr*codfw*" commit "bring up hcaptcha-proxy200[12]": T409780 [production]
14:17 <ladsgroup@cumin1003> dbctl commit (dc=all): 'Repool pc7 (T405942)', diff saved to https://phabricator.wikimedia.org/P85438 and previous config saved to /var/cache/conftool/dbconfig/20251121-141747-ladsgroup.json [production]
14:14 <ladsgroup@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on pc2017.codfw.wmnet with reason: Maint [production]
14:14 <ladsgroup@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on pc1017.eqiad.wmnet with reason: Maint [production]
14:13 <sukhe> homer "cr*eqiad*" commit "bring up hcaptcha-proxy100[12]": T409780 [production]
14:13 <ladsgroup@cumin1003> dbctl commit (dc=all): 'Depool pc7 (T405942)', diff saved to https://phabricator.wikimedia.org/P85437 and previous config saved to /var/cache/conftool/dbconfig/20251121-141345-ladsgroup.json [production]
14:09 <ladsgroup@cumin1003> dbctl commit (dc=all): 'Repool pc6 (T405942)', diff saved to https://phabricator.wikimedia.org/P85436 and previous config saved to /var/cache/conftool/dbconfig/20251121-140903-ladsgroup.json [production]
14:05 <ladsgroup@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on pc2016.codfw.wmnet with reason: Maint [production]
14:05 <ladsgroup@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on pc1016.eqiad.wmnet with reason: Maint [production]
14:03 <ladsgroup@cumin1003> dbctl commit (dc=all): 'Depool pc6 (T405942)', diff saved to https://phabricator.wikimedia.org/P85435 and previous config saved to /var/cache/conftool/dbconfig/20251121-140327-ladsgroup.json [production]
13:52 <ayounsi@cumin1003> END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox [production]
13:52 <ayounsi@cumin1003> START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox [production]
13:24 <mwpresync@deploy2002> Started scap build-images: Publishing wmf/next image [production]
13:19 <btullis@cumin1003> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1007.eqiad.wmnet [production]
13:12 <btullis@cumin1003> START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1007.eqiad.wmnet [production]
12:26 <cmooney@cumin1003> END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Homer release v0.11.0 minor update - cmooney@cumin1003 [production]
12:24 <cmooney@cumin1003> START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Homer release v0.11.0 minor update - cmooney@cumin1003 [production]
10:42 <ayounsi@cumin1003> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1005.eqiad.wmnet [production]
10:19 <jnuche@deploy2002> Finished deploy [releng/jenkins-deploy@a809ec3] (releasing): T410680 (duration: 02m 13s) [production]
10:16 <jnuche@deploy2002> Started deploy [releng/jenkins-deploy@a809ec3] (releasing): T410680 [production]
09:45 <dpogorzelski@deploy2002> helmfile [staging] DONE helmfile.d/services/changeprop: sync [production]
09:45 <dpogorzelski@deploy2002> helmfile [staging] START helmfile.d/services/changeprop: sync [production]
09:37 <bwojtowicz@deploy2002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . [production]
09:34 <bwojtowicz@deploy2002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' . [production]
09:16 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cumin1002.eqiad.wmnet [production]
09:16 <jmm@cumin2002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
09:16 <jmm@cumin2002> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cumin1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" [production]
09:13 <jmm@cumin2002> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cumin1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" [production]
08:46 <jnuche@deploy2002> Finished deploy [releng/jenkins-deploy@f3216ec] (releasing): testing issue with instance (duration: 01m 48s) [production]
08:44 <jnuche@deploy2002> Started deploy [releng/jenkins-deploy@f3216ec] (releasing): testing issue with instance [production]