2025-06-03
|
15:51 |
<jiji@deploy1003> |
helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. |
[production] |
15:50 |
<jiji@deploy1003> |
helmfile [staging-codfw] START helmfile.d/admin 'apply'. |
[production] |
15:18 |
<andrew@cumin1002> |
START - Cookbook sre.hosts.reimage for host cloudcontrol2010-dev.codfw.wmnet with OS bullseye |
[production] |
15:18 |
<moritzm> |
installing gcc-12 bugfix updates from Bookworm point releases (includes various run time libraries) |
[production] |
15:17 |
<andrew@cumin1002> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol2010-dev.codfw.wmnet with OS bullseye |
[production] |
15:16 |
<fceratto@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2198.codfw.wmnet with reason: Maintenance |
[production] |
15:15 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2195 (T395241)', diff saved to https://phabricator.wikimedia.org/P76966 and previous config saved to /var/cache/conftool/dbconfig/20250603-151552-fceratto.json |
[production] |
15:10 |
<jmm@cumin1003> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add prometheus7002 - jmm@cumin1003" |
[production] |
15:10 |
<jmm@cumin1003> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add prometheus7002 - jmm@cumin1003" |
[production] |
15:10 |
<andrew@cumin1002> |
START - Cookbook sre.hosts.reimage for host cloudcontrol2010-dev.codfw.wmnet with OS bullseye |
[production] |
15:06 |
<hashar> |
Restarted Gerrit due to issue with replication config | T395887 |
[production] |
15:00 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P76964 and previous config saved to /var/cache/conftool/dbconfig/20250603-150045-fceratto.json |
[production] |
14:58 |
<jmm@cumin1003> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add prometheus7002 - jmm@cumin1003" |
[production] |
14:58 |
<jmm@cumin1003> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add prometheus7002 - jmm@cumin1003" |
[production] |
14:54 |
<jmm@cumin1003> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add prometheus7002 - jmm@cumin1003" |
[production] |
14:54 |
<jmm@cumin1003> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add prometheus7002 - jmm@cumin1003" |
[production] |
14:50 |
<fnegri@cumin1002> |
conftool action : set/pooled=no; selector: name=clouddb1019.eqiad.wmnet,service=s4 |
[production] |
14:46 |
<jmm@cumin1003> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host prometheus7002.magru.wmnet with OS bookworm |
[production] |
14:45 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P76963 and previous config saved to /var/cache/conftool/dbconfig/20250603-144538-fceratto.json |
[production] |
14:30 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2195 (T395241)', diff saved to https://phabricator.wikimedia.org/P76962 and previous config saved to /var/cache/conftool/dbconfig/20250603-143031-fceratto.json |
[production] |
14:27 |
<jmm@cumin1003> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on prometheus7002.magru.wmnet with reason: host reimage |
[production] |
14:26 |
<jmm@cumin1003> |
DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: elastic1063.eqiad.wmnet |
[production] |
14:23 |
<jmm@cumin1003> |
START - Cookbook sre.hosts.downtime for 2:00:00 on prometheus7002.magru.wmnet with reason: host reimage |
[production] |
14:23 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Depooling db2195 (T395241)', diff saved to https://phabricator.wikimedia.org/P76961 and previous config saved to /var/cache/conftool/dbconfig/20250603-142314-fceratto.json |
[production] |
14:23 |
<fceratto@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2195.codfw.wmnet with reason: Maintenance |
[production] |
14:22 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2181 (T395241)', diff saved to https://phabricator.wikimedia.org/P76960 and previous config saved to /var/cache/conftool/dbconfig/20250603-142248-fceratto.json |
[production] |
14:19 |
<jmm@cumin1003> |
DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: elastic1103.eqiad.wmnet |
[production] |
14:07 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P76959 and previous config saved to /var/cache/conftool/dbconfig/20250603-140740-fceratto.json |
[production] |
14:01 |
<Amir1> |
dropping term store tables from s8 (T351820) |
[production] |
14:01 |
<Amir1> |
dropping term store tables from s8 (T351802) |
[production] |
13:57 |
<jmm@cumin1003> |
START - Cookbook sre.hosts.reimage for host prometheus7002.magru.wmnet with OS bookworm |
[production] |
13:52 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P76957 and previous config saved to /var/cache/conftool/dbconfig/20250603-135233-fceratto.json |
[production] |
13:49 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'es1039 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P76956 and previous config saved to /var/cache/conftool/dbconfig/20250603-134935-root.json |
[production] |
13:44 |
<phuedx> |
Disabled the SDS 2.4.11 Synthetic A/A Test in xLab |
[production] |
13:42 |
<dbrant@deploy1003> |
helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply |
[production] |
13:40 |
<jclark@cumin1002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1048.eqiad.wmnet with OS bullseye |
[production] |
13:38 |
<dbrant@deploy1003> |
helmfile [codfw] START helmfile.d/services/wikifeeds: apply |
[production] |
13:37 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2181 (T395241)', diff saved to https://phabricator.wikimedia.org/P76954 and previous config saved to /var/cache/conftool/dbconfig/20250603-133725-fceratto.json |
[production] |
13:34 |
<dbrant@deploy1003> |
helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply |
[production] |
13:34 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'es1039 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P76953 and previous config saved to /var/cache/conftool/dbconfig/20250603-133429-root.json |
[production] |
13:33 |
<dbrant@deploy1003> |
helmfile [eqiad] START helmfile.d/services/wikifeeds: apply |
[production] |
13:32 |
<bwojtowicz@deploy1003> |
helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' . |
[production] |
13:32 |
<bwojtowicz@deploy1003> |
helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' . |
[production] |
13:31 |
<bwojtowicz@deploy1003> |
helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' . |
[production] |
13:28 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Depooling db2181 (T395241)', diff saved to https://phabricator.wikimedia.org/P76952 and previous config saved to /var/cache/conftool/dbconfig/20250603-132802-fceratto.json |
[production] |
13:27 |
<fceratto@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2181.codfw.wmnet with reason: Maintenance |
[production] |
13:27 |
<bwojtowicz@deploy1003> |
helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' . |
[production] |
13:27 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2167 (T395241)', diff saved to https://phabricator.wikimedia.org/P76951 and previous config saved to /var/cache/conftool/dbconfig/20250603-132735-fceratto.json |
[production] |
13:26 |
<bwojtowicz@deploy1003> |
helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' . |
[production] |
13:22 |
<bwojtowicz@deploy1003> |
helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' . |
[production] |