2026-06-02
07:14 <marostegui> Install mariadb 10.11.17 on db2186 T427345 [production]
07:12 <fceratto@cumin1003> END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Depool for rack maintenance [production]
07:12 <marostegui@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2186.codfw.wmnet with reason: upgrade [production]
07:12 <fceratto@cumin1003> START - Cookbook sre.mysql.depool depool db2241: Depool for rack maintenance [production]
07:04 <marostegui@cumin1003> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2053.codfw.wmnet with reason: host reimage [production]
06:59 <marostegui@cumin1003> START - Cookbook sre.hosts.downtime for 2:00:00 on es2053.codfw.wmnet with reason: host reimage [production]
06:55 <fceratto@cumin1003> dbctl commit (dc=all): 'Depooling db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93478 and previous config saved to /var/cache/conftool/dbconfig/20260602-065533-fceratto.json [production]
06:55 <marostegui@cumin1003> START - Cookbook sre.mysql.pool pool db1181: Migration of db1181.eqiad.wmnet completed [production]
06:55 <fceratto@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance [production]
06:46 <marostegui@cumin1003> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1181.eqiad.wmnet with OS trixie [production]
06:43 <marostegui@cumin1003> START - Cookbook sre.hosts.reimage for host es2053.codfw.wmnet with OS trixie [production]
06:42 <marostegui@cumin1003> END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2053: Upgrading es2053.codfw.wmnet [production]
06:41 <marostegui@cumin1003> START - Cookbook sre.mysql.depool depool es2053: Upgrading es2053.codfw.wmnet [production]
06:41 <marostegui@cumin1003> START - Cookbook sre.mysql.major-upgrade [production]
06:37 <jmm@cumin2002> START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet [production]
06:37 <jmm@cumin2002> END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet [production]
06:36 <jmm@cumin2002> START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet [production]
06:36 <marostegui@cumin1003> END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1052: repool after upgrade [production]
06:29 <marostegui@cumin1003> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage [production]
06:24 <marostegui@cumin1003> START - Cookbook sre.hosts.downtime for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage [production]
06:22 <jelto@deploy1003> helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply [production]
06:21 <jelto@deploy1003> helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply [production]
06:16 <jelto@deploy1003> helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply [production]
06:15 <jelto@deploy1003> helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply [production]
06:08 <marostegui@cumin1003> START - Cookbook sre.hosts.reimage for host db1181.eqiad.wmnet with OS trixie [production]
06:05 <marostegui@cumin1003> END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1181: Upgrading db1181.eqiad.wmnet [production]
06:05 <marostegui@cumin1003> START - Cookbook sre.mysql.depool depool db1181: Upgrading db1181.eqiad.wmnet [production]
06:04 <marostegui@cumin1003> START - Cookbook sre.mysql.major-upgrade [production]
06:02 <marostegui@dns1004> END - running authdns-update [production]
06:01 <marostegui@cumin1003> dbctl commit (dc=all): 'Depool db1181 T426088', diff saved to https://phabricator.wikimedia.org/P93473 and previous config saved to /var/cache/conftool/dbconfig/20260602-060157-marostegui.json [production]
06:01 <marostegui@dns1004> START - running authdns-update [production]
06:00 <marostegui@cumin1003> dbctl commit (dc=all): 'Promote db1236 to s7 primary and set section read-write T426088', diff saved to https://phabricator.wikimedia.org/P93472 and previous config saved to /var/cache/conftool/dbconfig/20260602-060041-marostegui.json [production]
06:00 <marostegui@cumin1003> dbctl commit (dc=all): 'Set s7 eqiad as read-only for maintenance - T426088', diff saved to https://phabricator.wikimedia.org/P93471 and previous config saved to /var/cache/conftool/dbconfig/20260602-060018-marostegui.json [production]
06:00 <marostegui> Starting s7 eqiad failover from db1181 to db1236 - T426088 [production]
05:51 <marostegui@cumin1003> dbctl commit (dc=all): 'Set db1236 with weight 0 T426088', diff saved to https://phabricator.wikimedia.org/P93470 and previous config saved to /var/cache/conftool/dbconfig/20260602-055153-marostegui.json [production]
05:51 <marostegui@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s7 T426088 [production]
05:50 <marostegui@cumin1003> START - Cookbook sre.mysql.pool pool es1052: repool after upgrade [production]
05:50 <marostegui@cumin1003> END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99) [production]
05:47 <trueg@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply [production]
05:46 <trueg@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply [production]
05:45 <marostegui@cumin1003> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1052.eqiad.wmnet with OS trixie [production]
05:36 <trueg@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply [production]
05:33 <trueg@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply [production]
05:30 <trueg@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply [production]
05:29 <trueg@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply [production]
05:29 <marostegui@cumin1003> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1052.eqiad.wmnet with reason: host reimage [production]
05:28 <trueg@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply [production]
05:26 <trueg@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply [production]
05:25 <trueg@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply [production]
05:22 <marostegui@cumin1003> START - Cookbook sre.hosts.downtime for 2:00:00 on es1052.eqiad.wmnet with reason: host reimage [production]