| 
      
        2025-09-17
      
      §
     | 
  
    
  | 09:22 | 
  <elukey@cumin1003> | 
  END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART | 
  [production] | 
            
  | 09:20 | 
  <elukey@cumin1003> | 
  START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-codfw | 
  [production] | 
            
  | 09:19 | 
  <hnowlan@deploy1003> | 
  helmfile [staging] DONE helmfile.d/services/rest-gateway: apply | 
  [production] | 
            
  | 09:19 | 
  <hnowlan@deploy1003> | 
  helmfile [staging] START helmfile.d/services/rest-gateway: apply | 
  [production] | 
            
  | 09:18 | 
  <elukey@cumin1003> | 
  START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART | 
  [production] | 
            
  | 09:17 | 
  <elukey@cumin1003> | 
  START - Cookbook sre.hosts.provision for host cp2055.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED | 
  [production] | 
            
  | 09:17 | 
  <jmm@cumin2002> | 
  START - Cookbook sre.hosts.reimage for host maps2014.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 09:14 | 
  <elukey@cumin1003> | 
  END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp2054.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED | 
  [production] | 
            
  | 09:11 | 
  <ladsgroup@cumin1003> | 
  dbctl commit (dc=all): 'Bump weight of db1206 in general group (T403966)', diff saved to https://phabricator.wikimedia.org/P83384 and previous config saved to /var/cache/conftool/dbconfig/20250917-091137-ladsgroup.json | 
  [production] | 
            
  | 09:06 | 
  <elukey@cumin1003> | 
  END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART | 
  [production] | 
            
  | 09:02 | 
  <elukey@cumin1003> | 
  START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART | 
  [production] | 
            
  | 08:53 | 
  <elukey@cumin1003> | 
  START - Cookbook sre.hosts.provision for host cp2054.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED | 
  [production] | 
            
  | 08:52 | 
  <jmm@cumin2002> | 
  END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host maps2013.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 08:50 | 
  <elukey@cumin1003> | 
  END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp2053.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED | 
  [production] | 
            
  | 08:44 | 
  <brouberol@deploy1003> | 
  helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. | 
  [production] | 
            
  | 08:43 | 
  <brouberol@deploy1003> | 
  helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. | 
  [production] | 
            
  | 08:42 | 
  <moritzm> | 
  upgrading Envoy on deployment hosts T403663 | 
  [production] | 
            
  | 08:36 | 
  <kharlan@deploy1003> | 
  Finished scap sync-world: Backport for [[gerrit:1189108|hCaptcha: Track events via Prometheus (T402767)]], [[gerrit:1189107|hCaptcha: Track events via Prometheus (T402767)]], [[gerrit:1189106|hCaptcha: Remove non-existent message]], [[gerrit:1189105|hCaptcha: Remove non-existent message]] (duration: 13m 41s) | 
  [production] | 
            
  | 08:33 | 
  <fabfur> | 
  restart pybal on lvs1019/lvs2013/lvs2014 to clear out alert | 
  [production] | 
            
  | 08:33 | 
  <jmm@cumin2002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on maps2013.codfw.wmnet with reason: host reimage | 
  [production] | 
            
  | 08:31 | 
  <kharlan@deploy1003> | 
  kharlan: Continuing with sync | 
  [production] | 
            
  | 08:30 | 
  <elukey@cumin1003> | 
  START - Cookbook sre.hosts.provision for host cp2053.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED | 
  [production] | 
            
  | 08:30 | 
  <elukey@cumin1003> | 
  END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cp2052.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED | 
  [production] | 
            
  | 08:28 | 
  <jmm@cumin2002> | 
  START - Cookbook sre.hosts.downtime for 2:00:00 on maps2013.codfw.wmnet with reason: host reimage | 
  [production] | 
            
  | 08:28 | 
  <kharlan@deploy1003> | 
  kharlan: Backport for [[gerrit:1189108|hCaptcha: Track events via Prometheus (T402767)]], [[gerrit:1189107|hCaptcha: Track events via Prometheus (T402767)]], [[gerrit:1189106|hCaptcha: Remove non-existent message]], [[gerrit:1189105|hCaptcha: Remove non-existent message]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. | 
  [production] | 
            
  | 08:27 | 
  <moritzm> | 
  upgrading Envoy on IDM hosts T403663 | 
  [production] | 
            
  | 08:22 | 
  <kharlan@deploy1003> | 
  Started scap sync-world: Backport for [[gerrit:1189108|hCaptcha: Track events via Prometheus (T402767)]], [[gerrit:1189107|hCaptcha: Track events via Prometheus (T402767)]], [[gerrit:1189106|hCaptcha: Remove non-existent message]], [[gerrit:1189105|hCaptcha: Remove non-existent message]] | 
  [production] | 
            
  | 08:15 | 
  <elukey@cumin1003> | 
  START - Cookbook sre.hosts.provision for host cp2052.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED | 
  [production] | 
            
  | 08:13 | 
  <elukey@cumin1003> | 
  END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cp2051.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED | 
  [production] | 
            
  | 08:10 | 
  <jmm@cumin2002> | 
  START - Cookbook sre.hosts.reimage for host maps2013.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 08:08 | 
  <jmm@cumin2002> | 
  END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host maps2012.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 07:53 | 
  <elukey@cumin1003> | 
  START - Cookbook sre.hosts.provision for host cp2051.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED | 
  [production] | 
            
  | 07:52 | 
  <elukey@cumin1003> | 
  END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cp2050.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED | 
  [production] | 
            
  | 07:49 | 
  <jmm@cumin2002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on maps2012.codfw.wmnet with reason: host reimage | 
  [production] | 
            
  | 07:43 | 
  <jmm@cumin2002> | 
  START - Cookbook sre.hosts.downtime for 2:00:00 on maps2012.codfw.wmnet with reason: host reimage | 
  [production] | 
            
  | 07:32 | 
  <elukey@cumin1003> | 
  START - Cookbook sre.hosts.provision for host cp2050.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED | 
  [production] | 
            
  | 07:30 | 
  <kharlan@deploy1003> | 
  Finished scap sync-world: Backport for [[gerrit:1189038|hCaptcha: Enable on phase 1 wikis (T402366)]] (duration: 21m 08s) | 
  [production] | 
            
  | 07:25 | 
  <kharlan@deploy1003> | 
  kharlan: Continuing with sync | 
  [production] | 
            
  | 07:24 | 
  <jmm@cumin2002> | 
  START - Cookbook sre.hosts.reimage for host maps2012.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 07:15 | 
  <kharlan@deploy1003> | 
  kharlan: Backport for [[gerrit:1189038|hCaptcha: Enable on phase 1 wikis (T402366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. | 
  [production] | 
            
  | 07:13 | 
  <jmm@cumin2002> | 
  END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host maps2011.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 07:09 | 
  <kharlan@deploy1003> | 
  Started scap sync-world: Backport for [[gerrit:1189038|hCaptcha: Enable on phase 1 wikis (T402366)]] | 
  [production] | 
            
  | 07:02 | 
  <moritzm> | 
  upgrading Envoy on debmonitor T403663 | 
  [production] | 
            
  | 06:54 | 
  <jmm@cumin2002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on maps2011.codfw.wmnet with reason: host reimage | 
  [production] | 
            
  | 06:50 | 
  <jmm@cumin2002> | 
  START - Cookbook sre.hosts.downtime for 2:00:00 on maps2011.codfw.wmnet with reason: host reimage | 
  [production] | 
            
  | 06:47 | 
  <jmm@cumin2002> | 
  DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on maps[2012-2014].codfw.wmnet with reason: in setup | 
  [production] | 
            
  | 06:30 | 
  <jmm@cumin2002> | 
  START - Cookbook sre.hosts.reimage for host maps2011.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 01:12 | 
  <mwpresync@deploy1003> | 
  Finished scap build-images: Publishing wmf/next image (duration: 11m 35s) | 
  [production] | 
            
  | 01:00 | 
  <mwpresync@deploy1003> | 
  Started scap build-images: Publishing wmf/next image | 
  [production] | 
            
  | 00:38 | 
  <rzl@deploy1003> | 
  helmfile [codfw] DONE helmfile.d/services/recommendation-api: apply | 
  [production] |