|
2025-10-29
§
|
| 14:39 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P84350 and previous config saved to /var/cache/conftool/dbconfig/20251029-143918-marostegui.json |
[production] |
| 14:37 |
<elukey@cumin2002> |
START - Cookbook sre.hosts.powercycle for host ml-serve2001 |
[production] |
| 14:35 |
<raymond-ndibe@cloudcumin1001> |
END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component maintain-harbor |
[tools] |
| 14:31 |
<raymond-ndibe@cloudcumin1001> |
START - Cookbook wmcs.toolforge.component.deploy for component maintain-harbor |
[tools] |
| 14:31 |
<raymond-ndibe@cloudcumin1001> |
END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component maintain-harbor |
[toolsbeta] |
| 14:28 |
<raymond-ndibe@cloudcumin1001> |
START - Cookbook wmcs.toolforge.component.deploy for component maintain-harbor |
[toolsbeta] |
| 14:24 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P84349 and previous config saved to /var/cache/conftool/dbconfig/20251029-142410-marostegui.json |
[production] |
| 14:09 |
<bking@deploy2002> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply |
[production] |
| 14:09 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db2151 (T407997)', diff saved to https://phabricator.wikimedia.org/P84348 and previous config saved to /var/cache/conftool/dbconfig/20251029-140902-marostegui.json |
[production] |
| 14:09 |
<bking@deploy2002> |
helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply |
[production] |
| 14:06 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Depooling db2151 (T407997)', diff saved to https://phabricator.wikimedia.org/P84347 and previous config saved to /var/cache/conftool/dbconfig/20251029-140652-marostegui.json |
[production] |
| 14:06 |
<fceratto@cumin1003> |
dbctl commit (dc=all): 'Depool es2027 T408406', diff saved to https://phabricator.wikimedia.org/P84346 and previous config saved to /var/cache/conftool/dbconfig/20251029-140641-fceratto.json |
[production] |
| 14:02 |
<marostegui@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2151.codfw.wmnet with reason: Maintenance |
[production] |
| 13:56 |
<stevemunene@cumin1003> |
END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid public cluster: Roll restart of Druid jvm daemons. |
[production] |
| 13:43 |
<gehel> |
deploying envoy 1.32.12-1 + restart on W[CD]QS nodes - T404867 |
[production] |
| 13:40 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy3002.esams.wmnet with OS trixie |
[production] |
| 13:35 |
<taavi@cloudcumin1001> |
END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component calico (T408669) |
[toolsbeta] |
| 13:31 |
<kharlan@deploy2002> |
Finished scap sync-world: Backport for [[gerrit:1199762|product_metrics/suggested_investigations_interaction: add performer_groups (T404177)]] (duration: 14m 48s) |
[production] |
| 13:31 |
<moritzm> |
upgrade Envoy on debmonitor* T405808 |
[production] |
| 13:31 |
<cgoubert@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/api-gateway: apply |
[production] |
| 13:30 |
<cgoubert@deploy2002> |
helmfile [codfw] START helmfile.d/services/api-gateway: apply |
[production] |
| 13:30 |
<cgoubert@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply |
[production] |
| 13:29 |
<cgoubert@deploy2002> |
helmfile [eqiad] START helmfile.d/services/api-gateway: apply |
[production] |
| 13:29 |
<cgoubert@deploy2002> |
helmfile [staging] DONE helmfile.d/services/api-gateway: apply |
[production] |
| 13:28 |
<taavi@cloudcumin1001> |
START - Cookbook wmcs.toolforge.component.deploy for component calico (T408669) |
[toolsbeta] |
| 13:28 |
<cgoubert@deploy2002> |
helmfile [staging] START helmfile.d/services/api-gateway: apply |
[production] |
| 13:27 |
<kharlan@deploy2002> |
kharlan: Continuing with sync |
[production] |
| 13:26 |
<cgoubert@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply |
[production] |
| 13:24 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy3002.esams.wmnet with reason: host reimage |
[production] |
| 13:23 |
<cgoubert@deploy2002> |
helmfile [codfw] START helmfile.d/services/rest-gateway: apply |
[production] |
| 13:23 |
<cgoubert@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply |
[production] |
| 13:22 |
<cgoubert@deploy2002> |
helmfile [eqiad] START helmfile.d/services/rest-gateway: apply |
[production] |
| 13:19 |
<kharlan@deploy2002> |
kharlan: Backport for [[gerrit:1199762|product_metrics/suggested_investigations_interaction: add performer_groups (T404177)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. |
[production] |
| 13:18 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy3002.esams.wmnet with reason: host reimage |
[production] |
| 13:17 |
<kharlan@deploy2002> |
Started scap sync-world: Backport for [[gerrit:1199762|product_metrics/suggested_investigations_interaction: add performer_groups (T404177)]] |
[production] |
| 13:14 |
<cgoubert@deploy2002> |
helmfile [staging] DONE helmfile.d/services/rest-gateway: apply |
[production] |
| 13:14 |
<cgoubert@deploy2002> |
helmfile [staging] START helmfile.d/services/rest-gateway: apply |
[production] |
| 13:07 |
<cgoubert@deploy2002> |
helmfile [staging] DONE helmfile.d/services/rest-gateway: apply |
[production] |
| 13:07 |
<cgoubert@deploy2002> |
helmfile [staging] START helmfile.d/services/rest-gateway: apply |
[production] |
| 13:05 |
<stevemunene@cumin1003> |
START - Cookbook sre.druid.roll-restart-workers for Druid public cluster: Roll restart of Druid jvm daemons. |
[production] |
| 13:04 |
<klausman@deploy2002> |
helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'. |
[production] |
| 13:04 |
<klausman@deploy2002> |
helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'. |
[production] |
| 13:03 |
<cgoubert@deploy2002> |
helmfile [staging] DONE helmfile.d/services/rest-gateway: apply |
[production] |
| 13:03 |
<stevemunene@cumin1003> |
END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid analytics cluster: Roll restart of Druid jvm daemons. |
[production] |
| 13:03 |
<cgoubert@deploy2002> |
helmfile [staging] START helmfile.d/services/rest-gateway: apply |
[production] |
| 12:55 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reimage for host tcp-proxy3002.esams.wmnet with OS trixie |
[production] |
| 12:50 |
<cgoubert@deploy2002> |
helmfile [staging] DONE helmfile.d/services/rest-gateway: apply |
[production] |
| 12:50 |
<cgoubert@deploy2002> |
helmfile [staging] START helmfile.d/services/rest-gateway: apply |
[production] |
| 12:48 |
<taavi@cloudcumin1001> |
END (PASS) - Cookbook wmcs.toolforge.k8s.calico.copy_images_to_registry (exit_code=0) for Calico v3.29.6 |
[tools] |
| 12:48 |
<taavi@cloudcumin1001> |
Updating container image docker-registry.svc.toolforge.org/calico/typha:v3.29.6 |
[tools] |