2025-07-10
§
|
14:12 |
<jhathaway@cumin2002> |
START - Cookbook sre.hosts.reimage for host sretest2001.codfw.wmnet with OS bookworm |
[production] |
14:12 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2211 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P78883 and previous config saved to /var/cache/conftool/dbconfig/20250710-141202-root.json |
[production] |
14:04 |
<andrew@cumin1003> |
END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1008.eqiad.wmnet'] |
[production] |
14:03 |
<elukey@deploy1003> |
helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
14:03 |
<elukey@deploy1003> |
helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'. |
[production] |
14:03 |
<elukey@deploy1003> |
helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'sync'. |
[production] |
14:02 |
<elukey@deploy1003> |
helmfile [aux-k8s-eqiad] START helmfile.d/admin 'sync'. |
[production] |
14:01 |
<volans@cumin2002> |
DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
14:00 |
<volans@cumin2002> |
DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1002.eqiad.wmnet |
[production] |
13:58 |
<volans@cumin2002> |
DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
13:57 |
<vgutierrez> |
restarting varnish and ATS in cp5017 |
[production] |
13:56 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2211 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P78882 and previous config saved to /var/cache/conftool/dbconfig/20250710-135656-root.json |
[production] |
13:52 |
<hashar> |
UTC afternoon backport window completed |
[production] |
13:51 |
<hashar@deploy1003> |
Finished scap sync-world: Backport for [[gerrit:1167856|fix(StructuredTask): wrong order in resolving a deferred]] (duration: 11m 10s) |
[production] |
13:51 |
<volans@cumin2002> |
DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1002.eqiad.wmnet |
[production] |
13:49 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance |
[production] |
13:49 |
<andrew@cumin1003> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1008.eqiad.wmnet'] |
[production] |
13:48 |
<volans@cumin2002> |
DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet |
[production] |
13:47 |
<klausman@cumin1002> |
conftool action : set/pooled=true; selector: dnsdisc=inference,name=codfw |
[production] |
13:46 |
<volans> |
upgrade spicerack on cumin2002 to 11.3.0 |
[production] |
13:46 |
<elukey@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2006.codfw.wmnet with OS trixie |
[production] |
13:46 |
<hashar@deploy1003> |
migr, hashar: Continuing with sync |
[production] |
13:42 |
<hashar@deploy1003> |
migr, hashar: Backport for [[gerrit:1167856|fix(StructuredTask): wrong order in resolving a deferred]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. |
[production] |
13:41 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2211 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P78881 and previous config saved to /var/cache/conftool/dbconfig/20250710-134150-root.json |
[production] |
13:40 |
<hashar@deploy1003> |
Started scap sync-world: Backport for [[gerrit:1167856|fix(StructuredTask): wrong order in resolving a deferred]] |
[production] |
13:34 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depool db2211 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P78880 and previous config saved to /var/cache/conftool/dbconfig/20250710-133418-marostegui.json |
[production] |
13:34 |
<root@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2211.codfw.wmnet with reason: Maintenance |
[production] |
13:30 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2171 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P78879 and previous config saved to /var/cache/conftool/dbconfig/20250710-133047-root.json |
[production] |
13:15 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2171 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P78878 and previous config saved to /var/cache/conftool/dbconfig/20250710-131541-root.json |
[production] |
13:08 |
<wmbot~dcaro@hephaestus> |
END (FAIL) - Cookbook wmcs.openstack.cloudvirt.vm_console (exit_code=99) |
[project-proxy] |
13:08 |
<wmbot~dcaro@hephaestus> |
START - Cookbook wmcs.openstack.cloudvirt.vm_console |
[project-proxy] |
13:08 |
<klausman@deploy1003> |
helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . |
[production] |
13:06 |
<moritzm> |
installing ICU security updates |
[production] |
13:00 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2171 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P78876 and previous config saved to /var/cache/conftool/dbconfig/20250710-130036-root.json |
[production] |
12:59 |
<klausman@deploy1003> |
helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . |
[production] |
12:52 |
<klausman@deploy1003> |
helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' . |
[production] |
12:47 |
<andrew@cloudcumin1001> |
END (FAIL) - Cookbook wmcs.ceph.osd.reactivate (exit_code=99) |
[admin] |
12:45 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2171 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P78875 and previous config saved to /var/cache/conftool/dbconfig/20250710-124530-root.json |
[production] |
12:42 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.ceph.osd.reactivate |
[admin] |
12:40 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1200 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P78874 and previous config saved to /var/cache/conftool/dbconfig/20250710-124051-root.json |
[production] |
12:40 |
<slyngshede@cumin1003> |
END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 2.8.15 upgrade (T398720) |
[production] |
12:38 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depool db2171 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P78873 and previous config saved to /var/cache/conftool/dbconfig/20250710-123809-marostegui.json |
[production] |
12:38 |
<root@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2171.codfw.wmnet with reason: Maintenance |
[production] |
12:37 |
<andrew@cloudcumin1001> |
END (FAIL) - Cookbook wmcs.ceph.osd.reactivate (exit_code=99) |
[admin] |
12:35 |
<slyngshede@cumin1003> |
END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp - 2.8.15 upgrade (T398720) |
[production] |
12:34 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.ceph.osd.reactivate |
[admin] |
12:32 |
<andrew@cumin1003> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1007.eqiad.wmnet with OS bookworm |
[production] |
12:25 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1200 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P78872 and previous config saved to /var/cache/conftool/dbconfig/20250710-122545-root.json |
[production] |
12:25 |
<aikochou@deploy1003> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' . |
[production] |
12:17 |
<fceratto@cumin1002> |
END (ERROR) - Cookbook sre.mysql.sanitize-wiki (exit_code=97) Managing sanitization for wikis mediawikiwiki, testwiki in section s3 |
[production] |