|
2026-05-21
§
|
| 12:57 |
<dpogorzelski@deploy1003> |
helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
| 12:56 |
<cwilliams@cumin1003> |
dbctl commit (dc=all): 'Set db2162 with weight 0 T426936', diff saved to https://phabricator.wikimedia.org/P92771 and previous config saved to /var/cache/conftool/dbconfig/20260521-125645-cwilliams.json |
[production] |
| 12:56 |
<cwilliams@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 T426936 |
[production] |
| 12:56 |
<dpogorzelski@deploy1003> |
helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. |
[production] |
| 12:55 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet |
[production] |
| 12:54 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet |
[production] |
| 12:54 |
<slyngshede@cumin1003> |
START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp3074.esams.wmnet} and A:cp |
[production] |
| 12:54 |
<klausman@cumin1003> |
END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1002.eqiad.wmnet |
[production] |
| 12:54 |
<slyngshede@cumin1003> |
END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp600[7-8].drmrs.wmnet} and A:cp |
[production] |
| 12:54 |
<slyngshede@cumin1003> |
cookbooks.sre.cdn.roll-reboot finished rebooting cp6008.drmrs.wmnet |
[production] |
| 12:53 |
<dpogorzelski@deploy1003> |
helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
| 12:52 |
<brouberol@cumin1003> |
START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage |
[production] |
| 12:51 |
<dpogorzelski@deploy1003> |
helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. |
[production] |
| 12:49 |
<klausman@cumin1003> |
START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1002.eqiad.wmnet |
[production] |
| 12:49 |
<klausman@cumin1003> |
START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad |
[production] |
| 12:48 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet |
[production] |
| 12:48 |
<slyngshede@cumin1003> |
END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp3066.esams.wmnet} and A:cp |
[production] |
| 12:48 |
<slyngshede@cumin1003> |
cookbooks.sre.cdn.roll-reboot finished rebooting cp3066.esams.wmnet |
[production] |
| 12:47 |
<dpogorzelski@deploy1003> |
helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
| 12:47 |
<fceratto@cumin1003> |
dbctl commit (dc=all): 'Depooling es1040 (T426633)', diff saved to https://phabricator.wikimedia.org/P92770 and previous config saved to /var/cache/conftool/dbconfig/20260521-124707-fceratto.json |
[production] |
| 12:47 |
<fceratto@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance |
[production] |
| 12:46 |
<fceratto@cumin1003> |
START - Cookbook sre.mysql.pool pool es1039: Repooling |
[production] |
| 12:46 |
<dpogorzelski@deploy1003> |
helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. |
[production] |
| 12:45 |
<jmm@cumin2002> |
START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet |
[production] |
| 12:45 |
<dpogorzelski@deploy1003> |
helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
| 12:44 |
<dpogorzelski@deploy1003> |
helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. |
[production] |
| 12:43 |
<dpogorzelski@deploy1003> |
helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
| 12:43 |
<kharlan@deploy1003> |
Finished scap sync-world: Backport for [[gerrit:1290727|hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] (duration: 07m 54s) |
[production] |
| 12:42 |
<dpogorzelski@deploy1003> |
helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. |
[production] |
| 12:40 |
<fceratto@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance es1039 (T426633)', diff saved to https://phabricator.wikimedia.org/P92768 and previous config saved to /var/cache/conftool/dbconfig/20260521-124014-fceratto.json |
[production] |
| 12:39 |
<kharlan@deploy1003> |
kharlan: Continuing with deployment |
[production] |
| 12:38 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet |
[production] |
| 12:37 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet |
[production] |
| 12:37 |
<brouberol@cumin1003> |
START - Cookbook sre.hosts.reimage for host kafka-jumbo1018.eqiad.wmnet with OS trixie |
[production] |
| 12:37 |
<kharlan@deploy1003> |
kharlan: Backport for [[gerrit:1290727|hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. |
[production] |
| 12:36 |
<dpogorzelski@deploy1003> |
helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
| 12:36 |
<slyngshede@cumin1003> |
START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp3066.esams.wmnet} and A:cp |
[production] |
| 12:35 |
<dpogorzelski@deploy1003> |
helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. |
[production] |
| 12:35 |
<kharlan@deploy1003> |
Started scap sync-world: Backport for [[gerrit:1290727|hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] |
[production] |
| 12:35 |
<dpogorzelski@deploy1003> |
helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
| 12:34 |
<brouberol@cumin1003> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1017.eqiad.wmnet with OS trixie |
[production] |
| 12:34 |
<kart_> |
Updated cxserver to 2026-05-20-034002-production (T388690, T404295, T391703, T426605) |
[production] |
| 12:34 |
<dpogorzelski@deploy1003> |
helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. |
[production] |
| 12:34 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet |
[production] |
| 12:32 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet |
[production] |
| 12:30 |
<kartik@deploy1003> |
helmfile [eqiad] DONE helmfile.d/services/cxserver: apply |
[production] |
| 12:30 |
<kartik@deploy1003> |
helmfile [eqiad] START helmfile.d/services/cxserver: apply |
[production] |
| 12:30 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet |
[production] |
| 12:29 |
<dpogorzelski@deploy1003> |
helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
| 12:29 |
<fceratto@cumin1003> |
dbctl commit (dc=all): 'Depooling es1039 (T426633)', diff saved to https://phabricator.wikimedia.org/P92767 and previous config saved to /var/cache/conftool/dbconfig/20260521-122905-fceratto.json |
[production] |