2025-05-19
23:01 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1081.eqiad.wmnet with OS bullseye |
[production] |
22:58 |
<andrew@cloudcumin1001> |
END (FAIL) - Cookbook wmcs.openstack.cloudvirt.drain (exit_code=99) on host 'cloudvirt1039.eqiad.wmnet' (T394727) |
[admin] |
22:47 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.openstack.cloudvirt.drain on host 'cloudvirt1039.eqiad.wmnet' (T394727) |
[admin] |
22:45 |
<jhancock@cumin2002> |
START - Cookbook sre.hosts.reimage for host thanos-be2007.codfw.wmnet with OS bullseye |
[production] |
22:43 |
<andrew@cloudcumin1001> |
END (ERROR) - Cookbook wmcs.openstack.restart_openstack (exit_code=97) on deployment eqiad1 for service: project,nova |
[admin] |
22:43 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for service: project,nova |
[admin] |
22:35 |
<andrew@cloudcumin1001> |
END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment eqiad1 for service: project,nova |
[admin] |
22:30 |
<andrew@cumin1002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1072.eqiad.wmnet with OS bookworm |
[production] |
22:29 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for service: project,nova |
[admin] |
22:29 |
<andrew@cloudcumin1001> |
END (PASS) - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary (exit_code=0) on eqiad1, with recreate False, for hosts list: ['cloudvirt1072'] |
[cloudvirt-canary] |
22:29 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1081.eqiad.wmnet with reason: host reimage |
[production] |
22:28 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary on eqiad1, with recreate False, for hosts list: ['cloudvirt1072'] |
[cloudvirt-canary] |
22:28 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1080.eqiad.wmnet with OS bullseye |
[production] |
22:27 |
<andrew@cloudcumin1001> |
END (FAIL) - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary (exit_code=99) on eqiad1, with recreate False, for hosts list: ['cloudvirt1072'] |
[cloudvirt-canary] |
22:27 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary on eqiad1, with recreate False, for hosts list: ['cloudvirt1072'] |
[cloudvirt-canary] |
22:26 |
<andrew@cloudcumin1001> |
END (ERROR) - Cookbook wmcs.openstack.restart_openstack (exit_code=97) on deployment eqiad1 for service: project,nova |
[admin] |
22:26 |
<bking@cumin2002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on relforge[1003-1004,1008-1010].eqiad.wmnet with reason: decom in progress |
[production] |
22:26 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for service: project,nova |
[admin] |
22:25 |
<bking@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch1081.eqiad.wmnet with reason: host reimage |
[production] |
22:25 |
<andrew@cloudcumin1001> |
END (FAIL) - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary (exit_code=99) on eqiad1, with recreate False, for hosts list: ['cloudvirt1072'] |
[cloudvirt-canary] |
22:24 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary on eqiad1, with recreate False, for hosts list: ['cloudvirt1072'] |
[cloudvirt-canary] |
22:23 |
<andrew@cloudcumin1001> |
END (FAIL) - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary (exit_code=99) on eqiad1, with recreate False, for hosts list: ['cloudvirt1072'] |
[cloudvirt-canary] |
22:22 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary on eqiad1, with recreate False, for hosts list: ['cloudvirt1072'] |
[cloudvirt-canary] |
22:21 |
<andrew@cloudcumin1001> |
END (FAIL) - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary (exit_code=99) on eqiad1, with recreate False, for hosts list: ['cloudvirt1072'] |
[cloudvirt-canary] |
22:20 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary on eqiad1, with recreate False, for hosts list: ['cloudvirt1072'] |
[cloudvirt-canary] |
22:11 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch1081 |
[production] |
22:11 |
<bking@cumin2002> |
START - Cookbook sre.hosts.move-vlan for host cirrussearch1081 |
[production] |
22:11 |
<bking@cumin2002> |
START - Cookbook sre.hosts.reimage for host cirrussearch1081.eqiad.wmnet with OS bullseye |
[production] |
22:08 |
<ryankemper@cumin2002> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts relforge[1003-1004].eqiad.wmnet |
[production] |
22:06 |
<andrew@cloudcumin1001> |
END (FAIL) - Cookbook wmcs.openstack.cloudvirt.drain (exit_code=99) on host 'cloudvirt1039.eqiad.wmnet' (T394727) |
[admin] |
22:04 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Remove db1255 and db2241 from s8 (T351820)', diff saved to https://phabricator.wikimedia.org/P76317 and previous config saved to /var/cache/conftool/dbconfig/20250519-220432-ladsgroup.json |
[production] |
22:03 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1080.eqiad.wmnet with reason: host reimage |
[production] |
22:02 |
<brett@cumin2002> |
END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on A:cp-text_eqiad - <bound method SREBatchRunnerBase._reason of <cookbooks.sre.cdn.roll-upgrade-varnish.RollUpgradeVarnishRunner object at 0x7f1ff9224b80>> |
[production] |
22:02 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Remove db1209 and db2195 from x3 (T351820)', diff saved to https://phabricator.wikimedia.org/P76316 and previous config saved to /var/cache/conftool/dbconfig/20250519-220201-ladsgroup.json |
[production] |
22:01 |
<ryankemper@cumin2002> |
START - Cookbook sre.hosts.decommission for hosts relforge[1003-1004].eqiad.wmnet |
[production] |
22:01 |
<swfrench@deploy1003> |
helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply |
[production] |
22:00 |
<bking@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch1080.eqiad.wmnet with reason: host reimage |
[production] |
22:00 |
<swfrench@deploy1003> |
helmfile [codfw] START helmfile.d/services/shellbox-video: apply |
[production] |
21:58 |
<brett@cumin2002> |
END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on A:cp-upload_eqiad - <bound method SREBatchRunnerBase._reason of <cookbooks.sre.cdn.roll-upgrade-varnish.RollUpgradeVarnishRunner object at 0x7f9083afdf10>> |
[production] |
21:58 |
<andrew@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1072.eqiad.wmnet with reason: host reimage |
[production] |
21:56 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.openstack.cloudvirt.drain on host 'cloudvirt1039.eqiad.wmnet' (T394727) |
[admin] |
21:54 |
<andrew@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1072.eqiad.wmnet with reason: host reimage |
[production] |
21:53 |
<wmbot~anticomposite@tools-bastion-13> |
./SULWatcher/manage.sh restart # pymysql.err.InterfaceError: (0, '') when restarting from IRC |
[tools.stewardbots] |
21:48 |
<jhancock@cumin2002> |
START - Cookbook sre.hosts.reimage for host thanos-be2006.codfw.wmnet with OS bullseye |
[production] |
21:47 |
<andrew@cloudcumin1001> |
END (FAIL) - Cookbook wmcs.openstack.cloudvirt.drain (exit_code=99) on host 'cloudvirt1037.eqiad.wmnet' (T394727) |
[admin] |
21:45 |
<jhancock@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm |
[production] |
21:44 |
<mutante> |
soft rebooting instance codesearch9, web UI was down and could not get shell |
[codesearch] |
21:43 |
<swfrench@deploy1003> |
helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply |
[production] |
21:43 |
<swfrench@deploy1003> |
helmfile [eqiad] START helmfile.d/services/shellbox-video: apply |
[production] |