2025-04-14
|
13:57 |
<vriley@cumin1002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1178.eqiad.wmnet with OS bullseye |
[production] |
13:56 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P74960 and previous config saved to /var/cache/conftool/dbconfig/20250414-135640-fceratto.json |
[production] |
13:47 |
<arnaudb@cumin1002> |
END (ERROR) - Cookbook sre.gerrit.failover (exit_code=97) from gerrit1003.wikimedia.org to gerrit2003.wikimedia.org |
[production] |
13:47 |
<arnaudb@cumin1002> |
START - Cookbook sre.gerrit.failover from gerrit1003.wikimedia.org to gerrit2003.wikimedia.org |
[production] |
13:41 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P74956 and previous config saved to /var/cache/conftool/dbconfig/20250414-134132-fceratto.json |
[production] |
13:41 |
<TheresNoTime> |
UTC afternoon backport window done |
[production] |
13:40 |
<samtar@deploy1003> |
Finished scap sync-world: Backport for [[gerrit:1135850|Enable SUL3 on most remaining beta cluster wikis]], [[gerrit:1136104|punjabiwikimedia, maiwikimedia: fix tagline (T348611)]] (duration: 12m 00s) |
[production] |
13:38 |
<sukhe> |
reprepro -C component/nginx-ech include bookworm-wikimedia nginx_1.22.1-9+deb12u1+ech2_amd64.changes: T205378 |
[production] |
13:33 |
<samtar@deploy1003> |
matmarex, anzx, samtar: Continuing with sync |
[production] |
13:33 |
<samtar@deploy1003> |
matmarex, anzx, samtar: Backport for [[gerrit:1135850|Enable SUL3 on most remaining beta cluster wikis]], [[gerrit:1136104|punjabiwikimedia, maiwikimedia: fix tagline (T348611)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
13:30 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch2104 |
[production] |
13:30 |
<bking@cumin2002> |
START - Cookbook sre.hosts.move-vlan for host cirrussearch2104 |
[production] |
13:30 |
<bking@cumin2002> |
START - Cookbook sre.hosts.reimage for host cirrussearch2104.codfw.wmnet with OS bullseye |
[production] |
13:28 |
<samtar@deploy1003> |
Started scap sync-world: Backport for [[gerrit:1135850|Enable SUL3 on most remaining beta cluster wikis]], [[gerrit:1136104|punjabiwikimedia, maiwikimedia: fix tagline (T348611)]] |
[production] |
13:28 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from cirrussearch2014 to cirrussearch2104 |
[production] |
13:27 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cirrussearch2104 |
[production] |
13:27 |
<bking@cumin2002> |
START - Cookbook sre.network.configure-switch-interfaces for host cirrussearch2104 |
[production] |
13:27 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
13:27 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming cirrussearch2014 to cirrussearch2104 - bking@cumin2002" |
[production] |
13:26 |
<samtar@deploy1003> |
Finished scap sync-world: Backport for [[gerrit:1135993|CentralAuthTokenManager: Log failures for write operations (T390784)]] (duration: 11m 39s) |
[production] |
13:26 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1198 (T391056)', diff saved to https://phabricator.wikimedia.org/P74955 and previous config saved to /var/cache/conftool/dbconfig/20250414-132625-fceratto.json |
[production] |
13:23 |
<oblivian@cumin2002> |
END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Compatibility with conftool 5.1.0 (take 2) - oblivian@cumin2002" |
[production] |
13:23 |
<oblivian@cumin2002> |
END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Compatibility with conftool 5.1.0 (take 2) - oblivian@cumin2002 |
[production] |
13:22 |
<oblivian@cumin2002> |
START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Compatibility with conftool 5.1.0 (take 2) - oblivian@cumin2002 |
[production] |
13:22 |
<oblivian@cumin2002> |
START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Compatibility with conftool 5.1.0 (take 2) - oblivian@cumin2002" |
[production] |
13:22 |
<bking@cumin2002> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming cirrussearch2014 to cirrussearch2104 - bking@cumin2002" |
[production] |
13:22 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Depooling db1198 (T391056)', diff saved to https://phabricator.wikimedia.org/P74954 and previous config saved to /var/cache/conftool/dbconfig/20250414-132232-fceratto.json |
[production] |
13:22 |
<fceratto@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1198.eqiad.wmnet with reason: Maintenance |
[production] |
13:22 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1189 (T391056)', diff saved to https://phabricator.wikimedia.org/P74953 and previous config saved to /var/cache/conftool/dbconfig/20250414-132210-fceratto.json |
[production] |
13:22 |
<oblivian@cumin2002> |
END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Compatibility with conftool 5.1.0 - oblivian@cumin2002" |
[production] |
13:22 |
<oblivian@cumin2002> |
END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Compatibility with conftool 5.1.0 - oblivian@cumin2002 |
[production] |
13:21 |
<oblivian@cumin2002> |
START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Compatibility with conftool 5.1.0 - oblivian@cumin2002 |
[production] |
13:21 |
<oblivian@cumin2002> |
START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Compatibility with conftool 5.1.0 - oblivian@cumin2002" |
[production] |
13:19 |
<samtar@deploy1003> |
samtar, matmarex: Continuing with sync |
[production] |
13:19 |
<samtar@deploy1003> |
samtar, matmarex: Backport for [[gerrit:1135993|CentralAuthTokenManager: Log failures for write operations (T390784)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
13:18 |
<bking@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
13:18 |
<bking@cumin2002> |
START - Cookbook sre.hosts.rename from cirrussearch2014 to cirrussearch2104 |
[production] |
13:17 |
<vgutierrez> |
rolling upgrade to varnish 7.1.1-1.1~bpo11+wmf3 in magru - T391334 |
[production] |
13:17 |
<vgutierrez@cumin1002> |
START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-upload_magru |
[production] |
13:16 |
<vgutierrez@cumin1002> |
START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-text_magru |
[production] |
13:15 |
<_joe_> |
installed updates to conftool on cumin hosts |
[production] |
13:14 |
<samtar@deploy1003> |
Started scap sync-world: Backport for [[gerrit:1135993|CentralAuthTokenManager: Log failures for write operations (T390784)]] |
[production] |
13:13 |
<elukey@deploy1003> |
Finished deploy [docker-pkg/deploy@a555b7b]: Upgrade to 4.0.4 (duration: 00m 38s) |
[production] |
13:13 |
<elukey@deploy1003> |
Started deploy [docker-pkg/deploy@a555b7b]: Upgrade to 4.0.4 |
[production] |
13:13 |
<godog> |
remove old LVs from prometheus[12]00[56] - T383232 |
[production] |
13:07 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P74952 and previous config saved to /var/cache/conftool/dbconfig/20250414-130703-fceratto.json |
[production] |
13:02 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repool pc5 T391454', diff saved to https://phabricator.wikimedia.org/P74951 and previous config saved to /var/cache/conftool/dbconfig/20250414-130222-marostegui.json |
[production] |
13:00 |
<moritzm> |
remove ganeti01.svc.eqiad.wmnet cert (replaced by cfssl cert) T357750 |
[production] |
12:56 |
<vgutierrez@cumin1002> |
START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-text_ulsfo and not P{cp4037.ulsfo.wmnet} and A:cp |
[production] |
12:56 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc2015.codfw.wmnet,pc1015.eqiad.wmnet with reason: Maintenance |
[production] |