| 
      
        2024-09-26
      
     | 
  
    
  | 08:51 | 
  <akosiaris@deploy1003> | 
  helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'. | 
  [production] | 
            
  | 08:50 | 
  <akosiaris@deploy1003> | 
  helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'. | 
  [production] | 
            
  | 08:50 | 
  <akosiaris@deploy1003> | 
  helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. | 
  [production] | 
            
  | 08:50 | 
  <akosiaris@deploy1003> | 
  helmfile [staging-codfw] START helmfile.d/admin 'apply'. | 
  [production] | 
            
  | 08:50 | 
  <akosiaris@deploy1003> | 
  helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'. | 
  [production] | 
            
  | 08:50 | 
  <akosiaris@deploy1003> | 
  helmfile [staging-eqiad] START helmfile.d/admin 'apply'. | 
  [production] | 
            
  | 08:49 | 
  <elukey@cumin1002> | 
  START - Cookbook sre.hosts.reimage for host registry2004.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 08:48 | 
  <akosiaris> | 
  deploy calico on all clusters to pick up the configuration for the lsw1-{e,f}{6,7} leaf switches. It is a noop in some clusters, but the config should be in-sync anyways | 
  [production] | 
            
  | 08:47 | 
  <akosiaris@deploy1003> | 
  helmfile [codfw] DONE helmfile.d/admin 'apply'. | 
  [production] | 
            
  | 08:47 | 
  <akosiaris@deploy1003> | 
  helmfile [codfw] START helmfile.d/admin 'apply'. | 
  [production] | 
            
  | 08:46 | 
  <akosiaris@deploy1003> | 
  helmfile [eqiad] DONE helmfile.d/admin 'apply'. | 
  [production] | 
            
  | 08:46 | 
  <akosiaris@deploy1003> | 
  helmfile [eqiad] START helmfile.d/admin 'apply'. | 
  [production] | 
            
  | 08:44 | 
  <dcaro@cumin1002> | 
  START - Cookbook sre.hosts.reboot-single for host cloudcephosd1037.eqiad.wmnet | 
  [production] | 
            
  | 08:44 | 
  <dcaro@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephosd1036.eqiad.wmnet | 
  [production] | 
            
  | 08:38 | 
  <jmm@cumin2002> | 
  END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host cloudcephosd1031.eqiad.wmnet | 
  [production] | 
            
  | 08:35 | 
  <dcaro@cumin1002> | 
  START - Cookbook sre.hosts.reboot-single for host cloudcephosd1036.eqiad.wmnet | 
  [production] | 
            
  | 08:27 | 
  <dcaro@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephosd1035.eqiad.wmnet | 
  [production] | 
            
  | 08:25 | 
  <jmm@cumin2002> | 
  START - Cookbook sre.puppet.migrate-host for host cloudcephosd1031.eqiad.wmnet | 
  [production] | 
            
  | 08:20 | 
  <jmm@cumin2002> | 
  END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host cloudcephosd1032.eqiad.wmnet | 
  [production] | 
            
  | 08:20 | 
  <dcaro@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1040.eqiad.wmnet with reason: host reimage | 
  [production] | 
            
  | 08:19 | 
  <dcaro@cumin1002> | 
  START - Cookbook sre.hosts.reboot-single for host cloudcephosd1035.eqiad.wmnet | 
  [production] | 
            
  | 08:16 | 
  <dcaro@cumin1002> | 
  START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1040.eqiad.wmnet with reason: host reimage | 
  [production] | 
            
  | 08:10 | 
  <elukey@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host registry1004.eqiad.wmnet with OS bookworm | 
  [production] | 
            
  | 08:08 | 
  <jmm@cumin2002> | 
  START - Cookbook sre.puppet.migrate-host for host cloudcephosd1032.eqiad.wmnet | 
  [production] | 
            
  | 08:06 | 
  <jmm@cumin2002> | 
  END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host cloudcephosd1033.eqiad.wmnet | 
  [production] | 
            
  | 07:58 | 
  <dcaro@cumin1002> | 
  START - Cookbook sre.hosts.reimage for host cloudcephosd1040.eqiad.wmnet with OS bullseye | 
  [production] | 
            
  | 07:54 | 
  <jmm@cumin2002> | 
  START - Cookbook sre.puppet.migrate-host for host cloudcephosd1033.eqiad.wmnet | 
  [production] | 
            
  | 07:50 | 
  <jmm@cumin2002> | 
  END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host cloudcephosd1034.eqiad.wmnet | 
  [production] | 
            
  | 07:47 | 
  <jmm@cumin2002> | 
  START - Cookbook sre.puppet.migrate-host for host cloudcephosd1034.eqiad.wmnet | 
  [production] | 
            
  | 07:43 | 
  <jelto@cumin1002> | 
  END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version | 
  [production] | 
            
  | 07:41 | 
  <dcaro@cumin1002> | 
  END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Run failed when reimaging cloudcephosd1039 and asked to run manually - dcaro@cumin1002 - T372814" | 
  [production] | 
            
  | 07:40 | 
  <dcaro@cumin1002> | 
  START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Run failed when reimaging cloudcephosd1039 and asked to run manually - dcaro@cumin1002 - T372814" | 
  [production] | 
            
  | 07:37 | 
  <kartik@deploy1003> | 
  Finished scap sync-world: Backport for [[gerrit:1075522|Revert^2 "Enable message group subscription feature for Test Wikipedia" (T372386)]] (duration: 13m 37s) | 
  [production] | 
            
  | 07:37 | 
  <elukey@cumin1002> | 
  END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on registry1004.eqiad.wmnet with reason: host reimage | 
  [production] | 
            
  | 07:36 | 
  <elukey@cumin1002> | 
  START - Cookbook sre.hosts.downtime for 2:00:00 on registry1004.eqiad.wmnet with reason: host reimage | 
  [production] | 
            
  | 07:35 | 
  <jelto@cumin1002> | 
  START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab Replica to new version | 
  [production] | 
            
  | 07:35 | 
  <jmm@cumin2002> | 
  END (FAIL) - Cookbook sre.puppet.migrate-host (exit_code=99) for host cloudcephosd1034.eqiad.wmnet | 
  [production] | 
            
  | 07:34 | 
  <jelto@cumin1002> | 
  END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version | 
  [production] | 
            
  | 07:32 | 
  <kartik@deploy1003> | 
  kartik, abi: Continuing with sync | 
  [production] | 
            
  | 07:26 | 
  <jelto@cumin1002> | 
  START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab Replica to new version | 
  [production] | 
            
  | 07:26 | 
  <kartik@deploy1003> | 
  kartik, abi: Backport for [[gerrit:1075522|Revert^2 "Enable message group subscription feature for Test Wikipedia" (T372386)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) | 
  [production] | 
            
  | 07:24 | 
  <kartik@deploy1003> | 
  Started scap sync-world: Backport for [[gerrit:1075522|Revert^2 "Enable message group subscription feature for Test Wikipedia" (T372386)]] | 
  [production] | 
            
  | 07:20 | 
  <elukey@cumin1002> | 
  START - Cookbook sre.hosts.reimage for host registry1004.eqiad.wmnet with OS bookworm | 
  [production] | 
            
  | 07:19 | 
  <jmm@cumin2002> | 
  START - Cookbook sre.puppet.migrate-host for host cloudcephosd1034.eqiad.wmnet | 
  [production] | 
            
  | 07:18 | 
  <kartik@deploy1003> | 
  Finished scap sync-world: Backport for [[gerrit:1075521|Translate: Add VirtualDomainsMapping (T372287)]] (duration: 13m 25s) | 
  [production] | 
            
  | 07:13 | 
  <kartik@deploy1003> | 
  abi, kartik: Continuing with sync | 
  [production] | 
            
  | 07:07 | 
  <kartik@deploy1003> | 
  abi, kartik: Backport for [[gerrit:1075521|Translate: Add VirtualDomainsMapping (T372287)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) | 
  [production] | 
            
  | 07:04 | 
  <kartik@deploy1003> | 
  Started scap sync-world: Backport for [[gerrit:1075521|Translate: Add VirtualDomainsMapping (T372287)]] | 
  [production] | 
            
  | 07:00 | 
  <elukey> | 
  apt-get clean on grafana2001 to free some space in the root partition | 
  [production] | 
            
  | 00:41 | 
  <sukhe@cumin1002> | 
  END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool site eqiad [reason: repooling due to repeated port utilization alerts, T370962] | 
  [production] |