| 2025-09-23 |
  | 14:12 | <andrew@cloudcumin1001> | START - Cookbook wmcs.ceph.osd.reactivate | [admin] | 
            
  | 14:11 | <andrew@cloudcumin1001> | END (FAIL) - Cookbook wmcs.ceph.osd.reactivate (exit_code=99) | [admin] | 
            
  | 14:10 | <andrew@cloudcumin1001> | START - Cookbook wmcs.ceph.osd.reactivate | [admin] | 
            
  | 14:09 | <lucaswerkmeister-wmde@deploy1003> | phuedx, lucaswerkmeister-wmde: Backport for [[gerrit:1190667|lib: Update lib/metrics-platform to f1a18553 (T385180)]], [[gerrit:1190679|lib: Update metrics-platform to fc7678c10a1f (T401380)]], [[gerrit:1190647|ext.xLab: Add mw.xLab.getInstrument() (T401380 T404851)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. | [production] | 
            
  | 14:08 | <btullis@cumin1003> | START - Cookbook sre.dns.netbox | [production] | 
            
  | 14:08 | <btullis@cumin1003> | START - Cookbook sre.ganeti.makevm for new host an-launcher1003.eqiad.wmnet | [production] | 
            
  | 14:08 | <andrew@cumin2002> | END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd1025.eqiad.wmnet with OS bookworm | [production] | 
            
  | 14:06 | <jmm@cumin2002> | END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7004.magru.wmnet | [production] | 
            
  | 14:06 | <jmm@cumin2002> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7004.magru.wmnet | [production] | 
            
  | 14:03 | <lucaswerkmeister-wmde@deploy1003> | Started scap sync-world: Backport for [[gerrit:1190667|lib: Update lib/metrics-platform to f1a18553 (T385180)]], [[gerrit:1190679|lib: Update metrics-platform to fc7678c10a1f (T401380)]], [[gerrit:1190647|ext.xLab: Add mw.xLab.getInstrument() (T401380 T404851)]] | [production] | 
            
  | 13:58 | <jmm@cumin2002> | START - Cookbook sre.hosts.reboot-single for host ganeti7004.magru.wmnet | [production] | 
            
  | 13:57 | <andrew@cumin2002> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1041.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 13:52 | <mvernon@cumin2002> | DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ms-be[1086-1088].eqiad.wmnet with reason: awaiting controller swap | [production] | 
            
  | 13:52 | <jmm@cumin2002> | START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7004.magru.wmnet | [production] | 
            
  | 13:52 | <andrew@cumin2002> | START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1041.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 13:45 | <jmm@cumin2002> | END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7003.magru.wmnet | [production] | 
            
  | 13:45 | <jmm@cumin2002> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7003.magru.wmnet | [production] | 
            
  | 13:42 | <lucaswerkmeister-wmde@deploy1003> | Finished scap sync-world: Backport for [[gerrit:1190573|[arbcom_plwiki] Add an icon (T391009)]], [[gerrit:1190650|CheckUser/UserInfoCard: Phase 1 enable by default on pilot wikis (T405342)]] (duration: 13m 16s) | [production] | 
            
  | 13:41 | <kart_> | Updated Recommendation API to 2025-09-23-124706-production (T405004, T404976) | [production] | 
            
  | 13:37 | <jmm@cumin2002> | START - Cookbook sre.hosts.reboot-single for host ganeti7003.magru.wmnet | [production] | 
            
  | 13:37 | <andrew@cumin2002> | START - Cookbook sre.hosts.reimage for host cloudcephosd1025.eqiad.wmnet with OS bookworm | [production] | 
            
  | 13:36 | <lucaswerkmeister-wmde@deploy1003> | lucaswerkmeister-wmde, superpes, kharlan: Continuing with sync | [production] | 
            
  | 13:34 | <lucaswerkmeister-wmde@deploy1003> | lucaswerkmeister-wmde, superpes, kharlan: Backport for [[gerrit:1190573|[arbcom_plwiki] Add an icon (T391009)]], [[gerrit:1190650|CheckUser/UserInfoCard: Phase 1 enable by default on pilot wikis (T405342)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. | [production] | 
            
  | 13:34 | <jmm@cumin2002> | START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7003.magru.wmnet | [production] | 
            
  | 13:31 | <jmm@cumin2002> | END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7001.magru.wmnet | [production] | 
            
  | 13:31 | <jmm@cumin2002> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7001.magru.wmnet | [production] | 
            
  | 13:31 | <kartik@deploy1003> | helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . | [production] | 
            
  | 13:29 | <andrew@cumin2002> | START - Cookbook sre.hosts.reimage for host cloudcephosd1041.eqiad.wmnet with OS bookworm | [production] | 
            
  | 13:28 | <lucaswerkmeister-wmde@deploy1003> | Started scap sync-world: Backport for [[gerrit:1190573|[arbcom_plwiki] Add an icon (T391009)]], [[gerrit:1190650|CheckUser/UserInfoCard: Phase 1 enable by default on pilot wikis (T405342)]] | [production] | 
            
  | 13:25 | <sukhe@cumin1003> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host durum4001.ulsfo.wmnet | [production] | 
            
  | 13:24 | <kartik@deploy1003> | helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . | [production] | 
            
  | 13:24 | <tchanders@deploy1003> | Finished scap sync-world: Backport for [[gerrit:1189861|Increase the number of shards used for temp user name generation (T404131)]], [[gerrit:1190188|Enable temporary accounts on itwiki (T405195)]] (duration: 14m 19s) | [production] | 
            
  | 13:23 | <jmm@cumin2002> | START - Cookbook sre.hosts.reboot-single for host ganeti7001.magru.wmnet | [production] | 
            
  | 13:21 | <sukhe@cumin1003> | START - Cookbook sre.hosts.reboot-single for host durum4001.ulsfo.wmnet | [production] | 
            
  | 13:20 | <sukhe@cumin1003> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host durum2001.codfw.wmnet | [production] | 
            
  | 13:19 | <jmm@cumin2002> | START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7001.magru.wmnet | [production] | 
            
  | 13:19 | <tchanders@deploy1003> | tchanders: Continuing with sync | [production] | 
            
  | 13:17 | <kartik@deploy1003> | helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . | [production] | 
            
  | 13:16 | <sukhe@cumin1003> | START - Cookbook sre.hosts.reboot-single for host durum2001.codfw.wmnet | [production] | 
            
  | 13:16 | <tchanders@deploy1003> | tchanders: Backport for [[gerrit:1189861|Increase the number of shards used for temp user name generation (T404131)]], [[gerrit:1190188|Enable temporary accounts on itwiki (T405195)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. | [production] | 
            
  | 13:13 | <sukhe@cumin1003> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host durum1002.eqiad.wmnet | [production] | 
            
  | 13:11 | <jclark@cumin1002> | START - Cookbook sre.hosts.reimage for host cloudcephosd1025.eqiad.wmnet with OS bookworm | [production] | 
            
  | 13:09 | <tchanders@deploy1003> | Started scap sync-world: Backport for [[gerrit:1189861|Increase the number of shards used for temp user name generation (T404131)]], [[gerrit:1190188|Enable temporary accounts on itwiki (T405195)]] | [production] | 
            
  | 13:09 | <sukhe@cumin1003> | START - Cookbook sre.hosts.reboot-single for host durum1002.eqiad.wmnet | [production] | 
            
  | 12:57 | <jmm@cumin2002> | END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet | [production] | 
            
  | 12:57 | <jmm@cumin2002> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet | [production] | 
            
  | 12:55 | <claime> | Enabling puppet on all cp nodes - T400131 | [production] | 
            
  | 12:50 | <dbrant@deploy1003> | helmfile [codfw] DONE helmfile.d/services/mobileapps: apply | [production] | 
            
  | 12:50 | <dbrant@deploy1003> | helmfile [codfw] START helmfile.d/services/mobileapps: apply | [production] | 
            
  | 12:49 | <dbrant@deploy1003> | helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply | [production] |