| 2024-09-18 |
  
    
  | 10:25 | 
  <stevemunene@cumin1002> | 
  END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1176.eqiad.wmnet with OS bullseye | 
  [production] | 
            
  | 10:20 | 
  <elukey> | 
  restart poolcounterd on poolcounter2003 (not serving any traffic atm, tried to clear old/stale conns) | 
  [production] | 
            
  | 10:14 | 
  <elukey@deploy1003> | 
  Finished scap sync-world: Backport for [[gerrit:1073427|Swap poolcounter2004 with poolcounter2006 (T332015)]] (duration: 07m 08s) | 
  [production] | 
            
  | 10:09 | 
  <elukey@deploy1003> | 
  elukey: Continuing with sync | 
  [production] | 
            
  | 10:09 | 
  <elukey@deploy1003> | 
  elukey: Backport for [[gerrit:1073427|Swap poolcounter2004 with poolcounter2006 (T332015)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) | 
  [production] | 
            
  | 10:07 | 
  <elukey@deploy1003> | 
  Started scap sync-world: Backport for [[gerrit:1073427|Swap poolcounter2004 with poolcounter2006 (T332015)]] | 
  [production] | 
            
  | 09:26 | 
  <tappof@cumin2002> | 
  END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host centrallog2002.codfw.wmnet | 
  [production] | 
            
  | 09:11 | 
  <stevemunene@cumin1002> | 
  START - Cookbook sre.hosts.reimage for host an-worker1176.eqiad.wmnet with OS bullseye | 
  [production] | 
            
  | 09:11 | 
  <tappof@cumin2002> | 
  START - Cookbook sre.hosts.reboot-single for host centrallog2002.codfw.wmnet | 
  [production] | 
            
  | 09:01 | 
  <moritzm> | 
  drain ganeti2026 T373104 | 
  [production] | 
            
  | 08:41 | 
  <tappof> | 
  centrallog2002 upgrade to bookworm in progress https://phabricator.wikimedia.org/T353912 | 
  [production] | 
            
  | 08:32 | 
  <elukey> | 
  install openjdk-17-jdk on puppetserver1002 to get some useful tools like jmap - T373527 | 
  [production] | 
            
  | 08:30 | 
  <jnuche@deploy1003> | 
  rebuilt and synchronized wikiversions files: group0 to 1.43.0-wmf.23  refs T373642 | 
  [production] | 
            
  | 08:25 | 
  <brouberol@deploy1003> | 
  helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. | 
  [production] | 
            
  | 08:25 | 
  <brouberol@deploy1003> | 
  helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. | 
  [production] | 
            
  | 08:21 | 
  <jmm@cumin2002> | 
  END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2017.codfw.wmnet | 
  [production] | 
            
  | 08:16 | 
  <jmm@cumin2002> | 
  START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2017.codfw.wmnet | 
  [production] | 
            
  | 08:15 | 
  <jnuche@deploy1003> | 
  rebuilt and synchronized wikiversions files: group1 to 1.43.0-wmf.23  refs T373642 | 
  [production] | 
            
  | 07:45 | 
  <volans@cumin1002> | 
  END (PASS) - Cookbook sre.dns.netbox (exit_code=0) | 
  [production] | 
            
  | 07:45 | 
  <volans@cumin1002> | 
  END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Fixed asset tag for db1179 - volans@cumin1002" | 
  [production] | 
            
  | 07:43 | 
  <volans@cumin1002> | 
  START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Fixed asset tag for db1179 - volans@cumin1002" | 
  [production] | 
            
  | 07:33 | 
  <volans@cumin1002> | 
  START - Cookbook sre.dns.netbox | 
  [production] | 
            
  | 06:39 | 
  <moritzm> | 
  installing curl security updates | 
  [production] | 
            
  | 06:05 | 
  <arnaudb@cumin1002> | 
  dbctl commit (dc=all): 'T374807', diff saved to https://phabricator.wikimedia.org/P69250 and previous config saved to /var/cache/conftool/dbconfig/20240918-060549-arnaudb.json | 
  [production] | 
            
  | 06:03 | 
  <arnaudb@cumin1002> | 
  dbctl commit (dc=all): 'Promote db2220 to s7 primary T374807', diff saved to https://phabricator.wikimedia.org/P69249 and previous config saved to /var/cache/conftool/dbconfig/20240918-060332-arnaudb.json | 
  [production] | 
            
  | 06:02 | 
  <arnaudb> | 
  Starting s7 codfw failover from db2218 to db2220 - T374807 | 
  [production] | 
            
  | 05:49 | 
  <arnaudb@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s7 T374807 | 
  [production] | 
            
  | 05:49 | 
  <arnaudb@cumin1002> | 
  dbctl commit (dc=all): 'Remove db2220 from API/vslow/dump T374807', diff saved to https://phabricator.wikimedia.org/P69248 and previous config saved to /var/cache/conftool/dbconfig/20240918-054921-arnaudb.json | 
  [production] | 
            
  | 05:49 | 
  <arnaudb@cumin1002> | 
  dbctl commit (dc=all): 'Set db2220 with weight 0 T374807', diff saved to https://phabricator.wikimedia.org/P69247 and previous config saved to /var/cache/conftool/dbconfig/20240918-054909-arnaudb.json | 
  [production] | 
            
  | 05:48 | 
  <arnaudb@cumin1002> | 
  START - Cookbook sre.hosts.downtime for 1:00:00 on 27 hosts with reason: Primary switchover s7 T374807 | 
  [production] | 
            
  | 05:47 | 
  <arnaudb@cumin1002> | 
  dbctl commit (dc=all): 'T374804', diff saved to https://phabricator.wikimedia.org/P69246 and previous config saved to /var/cache/conftool/dbconfig/20240918-054729-arnaudb.json | 
  [production] | 
            
  | 05:45 | 
  <arnaudb@cumin1002> | 
  dbctl commit (dc=all): 'Promote db2179 to s4 primary T374804', diff saved to https://phabricator.wikimedia.org/P69245 and previous config saved to /var/cache/conftool/dbconfig/20240918-054515-arnaudb.json | 
  [production] | 
            
  | 05:43 | 
  <arnaudb> | 
  Starting s4 codfw failover from db2140 to db2179 - T374804 | 
  [production] | 
            
  | 05:38 | 
  <arnaudb@cumin1002> | 
  dbctl commit (dc=all): 'Remove db2179 from API/vslow/dump T374804', diff saved to https://phabricator.wikimedia.org/P69244 and previous config saved to /var/cache/conftool/dbconfig/20240918-053807-arnaudb.json | 
  [production] | 
            
  | 05:37 | 
  <arnaudb@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 32 hosts with reason: Primary switchover s4 T374804 | 
  [production] | 
            
  | 05:36 | 
  <arnaudb@cumin1002> | 
  dbctl commit (dc=all): 'Set db2179 with weight 0 T374804', diff saved to https://phabricator.wikimedia.org/P69243 and previous config saved to /var/cache/conftool/dbconfig/20240918-053633-arnaudb.json | 
  [production] | 
            
  | 05:36 | 
  <arnaudb@cumin1002> | 
  START - Cookbook sre.hosts.downtime for 1:00:00 on 32 hosts with reason: Primary switchover s4 T374804 | 
  [production] | 
            
  | 05:33 | 
  <arnaudb@cumin1002> | 
  dbctl commit (dc=all): 'T375047', diff saved to https://phabricator.wikimedia.org/P69242 and previous config saved to /var/cache/conftool/dbconfig/20240918-053357-arnaudb.json | 
  [production] | 
            
  | 05:31 | 
  <arnaudb@cumin1002> | 
  dbctl commit (dc=all): 'Promote db2214 to s6 primary T375047', diff saved to https://phabricator.wikimedia.org/P69241 and previous config saved to /var/cache/conftool/dbconfig/20240918-053115-arnaudb.json | 
  [production] | 
            
  | 05:30 | 
  <arnaudb> | 
  Starting s6 codfw failover from db2129 to db2214 - T375047 | 
  [production] | 
            
  | 05:24 | 
  <arnaudb@cumin1002> | 
  dbctl commit (dc=all): 'Set db2214 with weight 0 T375047', diff saved to https://phabricator.wikimedia.org/P69240 and previous config saved to /var/cache/conftool/dbconfig/20240918-052446-arnaudb.json | 
  [production] | 
            
  | 05:24 | 
  <arnaudb@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s6 T375047 | 
  [production] | 
            
  | 05:23 | 
  <arnaudb@cumin1002> | 
  START - Cookbook sre.hosts.downtime for 1:00:00 on 25 hosts with reason: Primary switchover s6 T375047 | 
  [production] | 
            
  | 00:11 | 
  <dzahn@cumin2002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on gitlab2002.wikimedia.org with reason: version upgrade | 
  [production] | 
            
  | 00:11 | 
  <dzahn@cumin2002> | 
  START - Cookbook sre.hosts.downtime for 0:30:00 on gitlab2002.wikimedia.org with reason: version upgrade | 
  [production] |