| 
      
        2024-11-21
      
     | 
  
    
  | 16:05 | 
  <dancy@deploy2002> | 
  Started scap sync-world: testing | 
  [production] | 
            
  | 16:04 | 
  <cgoubert@cumin1002> | 
  START - Cookbook sre.hosts.reimage for host wikikube-worker2140.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 16:03 | 
  <cgoubert@cumin1002> | 
  END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2140.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 16:00 | 
  <dancy@deploy2002> | 
  Installing scap version "4.127.0" for 209 hosts | 
  [production] | 
            
  | 15:39 | 
  <kartik@deploy2002> | 
  Finished scap sync-world: Backport for [[gerrit:1093927|Fix layout broken by display:flex on HorizontalLayout (T380471)]], [[gerrit:1093928|Revert "ExperimentUserDefaultsManager: use read latest when retrieving central id"]] (duration: 15m 51s) | 
  [production] | 
            
  | 15:34 | 
  <gmodena@deploy2002> | 
  Finished deploy [analytics/refinery@358ccf5] (hadoop-test): Ad-hoc deployment TEST [analytics/refinery@358ccf55] (duration: 03m 30s) | 
  [production] | 
            
  | 15:33 | 
  <kartik@deploy2002> | 
  abi, sgimeno, kartik: Continuing with sync | 
  [production] | 
            
  | 15:31 | 
  <gmodena@deploy2002> | 
  Started deploy [analytics/refinery@358ccf5] (hadoop-test): Ad-hoc deployment TEST [analytics/refinery@358ccf55] | 
  [production] | 
            
  | 15:29 | 
  <gmodena@deploy2002> | 
  Finished deploy [analytics/refinery@358ccf5] (thin): Ad-hoc deployment THIN [analytics/refinery@358ccf55] (duration: 05m 16s) | 
  [production] | 
            
  | 15:29 | 
  <ihurbain@deploy2002> | 
  helmfile [eqiad] DONE helmfile.d/services/push-notifications: apply | 
  [production] | 
            
  | 15:29 | 
  <kartik@deploy2002> | 
  abi, sgimeno, kartik: Backport for [[gerrit:1093927|Fix layout broken by display:flex on HorizontalLayout (T380471)]], [[gerrit:1093928|Revert "ExperimentUserDefaultsManager: use read latest when retrieving central id"]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) | 
  [production] | 
            
  | 15:28 | 
  <ihurbain@deploy2002> | 
  helmfile [eqiad] START helmfile.d/services/push-notifications: apply | 
  [production] | 
            
  | 15:28 | 
  <ihurbain@deploy2002> | 
  helmfile [codfw] DONE helmfile.d/services/push-notifications: apply | 
  [production] | 
            
  | 15:27 | 
  <ihurbain@deploy2002> | 
  helmfile [codfw] START helmfile.d/services/push-notifications: apply | 
  [production] | 
            
  | 15:26 | 
  <ebernhardson@deploy2002> | 
  Finished deploy [airflow-dags/search@6183645]: increase driver memory for mjolnir feature selection (duration: 00m 31s) | 
  [production] | 
            
  | 15:26 | 
  <sukhe@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: rebooting | 
  [production] | 
            
  | 15:25 | 
  <sukhe@cumin1002> | 
  START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2013.codfw.wmnet with reason: rebooting | 
  [production] | 
            
  | 15:25 | 
  <ebernhardson@deploy2002> | 
  Started deploy [airflow-dags/search@6183645]: increase driver memory for mjolnir feature selection | 
  [production] | 
            
  | 15:24 | 
  <sukhe> | 
  stop pybal on lvs2013 to confirm changes in CR 1091243 | 
  [production] | 
            
  | 15:24 | 
  <gmodena@deploy2002> | 
  Started deploy [analytics/refinery@358ccf5] (thin): Ad-hoc deployment THIN [analytics/refinery@358ccf55] | 
  [production] | 
            
  | 15:24 | 
  <kartik@deploy2002> | 
  Started scap sync-world: Backport for [[gerrit:1093927|Fix layout broken by display:flex on HorizontalLayout (T380471)]], [[gerrit:1093928|Revert "ExperimentUserDefaultsManager: use read latest when retrieving central id"]] | 
  [production] | 
            
  | 15:23 | 
  <cgoubert@cumin1002> | 
  START - Cookbook sre.hosts.reimage for host wikikube-worker2140.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 15:23 | 
  <cgoubert@cumin1002> | 
  END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2140.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 15:16 | 
  <cgoubert@cumin1002> | 
  START - Cookbook sre.hosts.reimage for host wikikube-worker2140.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 15:15 | 
  <cgoubert@cumin1002> | 
  END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2140.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 15:11 | 
  <eevans@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2021.codfw.wmnet with reason: Decommissioning — T380236 | 
  [production] | 
            
  | 15:10 | 
  <eevans@cumin1002> | 
  START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2021.codfw.wmnet with reason: Decommissioning — T380236 | 
  [production] | 
            
  | 15:06 | 
  <gmodena@deploy2002> | 
  Finished deploy [analytics/refinery@358ccf5]: Ad-hoc deployment [analytics/refinery@358ccf55] (duration: 11m 44s) | 
  [production] | 
            
  | 14:56 | 
  <cgoubert@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2169.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 14:55 | 
  <cgoubert@cumin1002> | 
  START - Cookbook sre.hosts.reimage for host wikikube-worker2140.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 14:54 | 
  <gmodena@deploy2002> | 
  Started deploy [analytics/refinery@358ccf5]: Ad-hoc deployment [analytics/refinery@358ccf55] | 
  [production] | 
            
  | 14:53 | 
  <cgoubert@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2168.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 14:51 | 
  <cgoubert@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2170.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 14:50 | 
  <sergi0> | 
  UTC afternoon deploys done | 
  [production] | 
            
  | 14:49 | 
  <cgoubert@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2167.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 14:48 | 
  <sgimeno@deploy2002> | 
  Sync cancelled. | 
  [production] | 
            
  | 14:47 | 
  <cgoubert@cumin1002> | 
  END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2140.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 14:47 | 
  <cgoubert@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2166.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 14:43 | 
  <jynus@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on kafka-main1001.eqiad.wmnet with reason: Per claime's recommendation | 
  [production] | 
            
  | 14:43 | 
  <jynus@cumin1002> | 
  START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on kafka-main1001.eqiad.wmnet with reason: Per claime's recommendation | 
  [production] | 
            
  | 14:43 | 
  <cgoubert@cumin1002> | 
  START - Cookbook sre.hosts.reimage for host wikikube-worker2157.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 14:41 | 
  <sgimeno@deploy2002> | 
  sgimeno: Backport for [[gerrit:1093889|ExperimentUserDefaultsManager: use read latest when retrieving central id (T379682)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) | 
  [production] | 
            
  | 14:39 | 
  <cgoubert@cumin1002> | 
  START - Cookbook sre.hosts.reimage for host wikikube-worker2140.codfw.wmnet with OS bookworm | 
  [production] | 
            
  | 14:36 | 
  <cgoubert@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2169.codfw.wmnet with reason: host reimage | 
  [production] | 
            
  | 14:35 | 
  <sgimeno@deploy2002> | 
  Started scap sync-world: Backport for [[gerrit:1093889|ExperimentUserDefaultsManager: use read latest when retrieving central id (T379682)]] | 
  [production] | 
            
  | 14:33 | 
  <cgoubert@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2168.codfw.wmnet with reason: host reimage | 
  [production] | 
            
  | 14:31 | 
  <cgoubert@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2170.codfw.wmnet with reason: host reimage | 
  [production] | 
            
  | 14:28 | 
  <cgoubert@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2167.codfw.wmnet with reason: host reimage | 
  [production] | 
            
  | 14:25 | 
  <ihurbain@deploy2002> | 
  helmfile [staging] DONE helmfile.d/services/push-notifications: apply | 
  [production] | 
            
  | 14:25 | 
  <cgoubert@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2166.codfw.wmnet with reason: host reimage | 
  [production] |