| 2025-10-10
      
    
  | 10:17 | <marostegui@cumin1003> | DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1248.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 09:34 | <btullis@deploy2002> | helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'. | [production] | 
            
  | 09:34 | <btullis@deploy2002> | helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'. | [production] | 
            
  | 09:20 | <btullis@deploy2002> | helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. | [production] | 
            
  | 09:19 | <btullis@deploy2002> | helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. | [production] | 
            
  | 06:24 | <marostegui@cumin1003> | dbctl commit (dc=all): 'db1249 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P83737 and previous config saved to /var/cache/conftool/dbconfig/20251010-062406-root.json | [production] | 
            
  | 06:11 | <marostegui@cumin1003> | END (PASS) - Cookbook sre.mysql.clone_es (exit_code=0) of es1029.eqiad.wmnet onto es1052.eqiad.wmnet | [production] | 
            
  | 06:10 | <marostegui@cumin1003> | END (PASS) - Cookbook sre.mysql.pool (exit_code=0) es1029 gradually with 4 steps - Pool es1029.eqiad.wmnet in after cloning | [production] | 
            
  | 06:10 | <marostegui@cumin1003> | END (PASS) - Cookbook sre.mysql.clone_es (exit_code=0) of es1034.eqiad.wmnet onto es1057.eqiad.wmnet | [production] | 
            
  | 06:10 | <marostegui@cumin1003> | END (PASS) - Cookbook sre.mysql.pool (exit_code=0) es1034 gradually with 4 steps - Pool es1034.eqiad.wmnet in after cloning | [production] | 
            
  | 06:09 | <marostegui@cumin1003> | dbctl commit (dc=all): 'db1249 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P83734 and previous config saved to /var/cache/conftool/dbconfig/20251010-060900-root.json | [production] | 
            
  | 05:53 | <marostegui@cumin1003> | dbctl commit (dc=all): 'db1249 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P83731 and previous config saved to /var/cache/conftool/dbconfig/20251010-055354-root.json | [production] | 
            
  | 05:38 | <marostegui@cumin1003> | dbctl commit (dc=all): 'db1249 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P83728 and previous config saved to /var/cache/conftool/dbconfig/20251010-053848-root.json | [production] | 
            
  | 05:30 | <marostegui@cumin1003> | dbctl commit (dc=all): 'Depool db1249 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P83727 and previous config saved to /var/cache/conftool/dbconfig/20251010-053040-marostegui.json | [production] | 
            
  | 05:30 | <marostegui@cumin1003> | DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1249.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 05:25 | <marostegui@cumin1003> | START - Cookbook sre.mysql.pool es1029 gradually with 4 steps - Pool es1029.eqiad.wmnet in after cloning | [production] | 
            
  | 05:25 | <marostegui@cumin1003> | START - Cookbook sre.mysql.pool es1034 gradually with 4 steps - Pool es1034.eqiad.wmnet in after cloning | [production] | 
            
  | 01:14 | <mwpresync@deploy2002> | Finished scap build-images: Publishing wmf/next image (duration: 13m 32s) | [production] | 
            
  | 01:00 | <mwpresync@deploy2002> | Started scap build-images: Publishing wmf/next image | [production] | 
            
  
    | 2025-10-09
      
    
  | 23:10 | <ryankemper@cumin2002> | conftool action : set/pooled=yes:weight=10; selector: name=wdqs2017.* | [production] | 
            
  | 22:11 | <inflatador> | bking@wdqs10(18|19|20) systemctl start load-categories-daily.service T405978 | [production] | 
            
  | 22:05 | <bking@cumin2002> | END (PASS) - Cookbook sre.wdqs.categories-reload (exit_code=0) reloading categories to wdqs1019.eqiad.wmnet | [production] | 
            
  | 22:04 | <bking@cumin2002> | END (PASS) - Cookbook sre.wdqs.categories-reload (exit_code=0) reloading categories to wdqs1020.eqiad.wmnet | [production] | 
            
  | 22:04 | <jdlrobson@deploy2002> | Finished scap sync-world: Backport for [[gerrit:1195049|Enable instrumentation of watchstar and other links that stopPropagation (T406390)]] (duration: 41m 38s) | [production] | 
            
  | 22:00 | <bking@cumin2002> | END (PASS) - Cookbook sre.wdqs.categories-reload (exit_code=0) reloading categories to wdqs1018.eqiad.wmnet | [production] | 
            
  | 21:51 | <dwisehaupt> | started staging db restore in root screen session on frdb1006. restoring from db backups on 20251008 | [production] | 
            
  | 21:51 | <jdlrobson@deploy2002> | jdlrobson: Continuing with sync | [production] | 
            
  | 21:47 | <jdlrobson@deploy2002> | jdlrobson: Backport for [[gerrit:1195049|Enable instrumentation of watchstar and other links that stopPropagation (T406390)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. | [production] | 
            
  | 21:25 | <TimStarling> | on db2202 cleaned up the tables I created for T400696 | [production] | 
            
  | 21:22 | <jdlrobson@deploy2002> | Started scap sync-world: Backport for [[gerrit:1195049|Enable instrumentation of watchstar and other links that stopPropagation (T406390)]] | [production] | 
            
  | 21:20 | <wfan> | payments-wiki upgraded from 028a0225 to d903982c | [production] | 
            
  | 20:57 | <reedy@deploy2002> | Finished scap sync-world: Backport for [[gerrit:1193928|Enable New UI and Multiple Module support for OATHAuth in Wikimedia production (T399644)]] (duration: 20m 04s) | [production] | 
            
  | 20:53 | <reedy@deploy2002> | reedy, sbassett: Continuing with sync | [production] | 
            
  | 20:46 | <Daimona> | Run createAndPromote as in P83722#336349 (~100x, in series) to restore event-organizer membership # T401445 | [production] | 
            
  | 20:42 | <bking@cumin2002> | END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker2003.codfw.wmnet with OS bookworm | [production] | 
            
  | 20:42 | <reedy@deploy2002> | reedy, sbassett: Backport for [[gerrit:1193928|Enable New UI and Multiple Module support for OATHAuth in Wikimedia production (T399644)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. | [production] | 
            
  | 20:37 | <reedy@deploy2002> | Started scap sync-world: Backport for [[gerrit:1193928|Enable New UI and Multiple Module support for OATHAuth in Wikimedia production (T399644)]] | [production] | 
            
  | 20:32 | <mutante> | logmsgbot do you still log - test log T284123 | [production] | 
            
  | 20:29 | <mutante> | re-enabled QoS on gerrit servers - with previously stable config - T406774  gerrit:1194811 | [production] | 
            
  | 20:28 | <reedy@deploy2002> | Finished scap sync-world: Backport for [[gerrit:1194962|OATHAuth Recovery Code code improvement (T406501)]] (duration: 10m 19s) | [production] | 
            
  | 20:25 | <mutante> | re-enabling QoS on gerrit servers - with previously stable config - T406774 | [production] | 
            
  | 20:24 | <reedy@deploy2002> | sbassett, reedy: Continuing with sync | [production] | 
            
  | 20:24 | <bking@cumin2002> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker2003.codfw.wmnet with reason: host reimage | [production] | 
            
  | 20:22 | <reedy@deploy2002> | sbassett, reedy: Backport for [[gerrit:1194962|OATHAuth Recovery Code code improvement (T406501)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. | [production] | 
            
  | 20:19 | <bking@cumin2002> | START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker2003.codfw.wmnet with reason: host reimage | [production] | 
            
  | 20:18 | <reedy@deploy2002> | Started scap sync-world: Backport for [[gerrit:1194962|OATHAuth Recovery Code code improvement (T406501)]] | [production] | 
            
  | 20:17 | <reedy@deploy2002> | Finished scap sync-world: Backport for [[gerrit:1194978|Update interwiki cache]], [[gerrit:1194981|Revert "Delete the event-organizer user group on medium and small wikis" (T401445)]], [[gerrit:1194986|Assign campaignevents-generate-invitation-lists right explicitly (T401445)]] (duration: 10m 46s) | [production] | 
            
  | 20:13 | <reedy@deploy2002> | daimona, reedy: Continuing with sync | [production] | 
            
  | 20:11 | <reedy@deploy2002> | daimona, reedy: Backport for [[gerrit:1194978|Update interwiki cache]], [[gerrit:1194981|Revert "Delete the event-organizer user group on medium and small wikis" (T401445)]], [[gerrit:1194986|Assign campaignevents-generate-invitation-lists right explicitly (T401445)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. | [production] | 
            
  | 20:06 | <reedy@deploy2002> | Started scap sync-world: Backport for [[gerrit:1194978|Update interwiki cache]], [[gerrit:1194981|Revert "Delete the event-organizer user group on medium and small wikis" (T401445)]], [[gerrit:1194986|Assign campaignevents-generate-invitation-lists right explicitly (T401445)]] | [production] |