| 2024-10-09 |
    
  | 15:09 | <mutante> | people.wikimedia.org - rebooting backends | [production] | 
            
  | 15:09 | <jmm@cumin2002> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint2001.codfw.wmnet | [production] | 
            
  | 15:07 | <sukhe@cumin1002> | END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dns1006.wikimedia.org | [production] | 
            
  | 15:07 | <sukhe@cumin1002> | START - Cookbook sre.hosts.remove-downtime for dns1006.wikimedia.org | [production] | 
            
  | 15:06 | <sukhe@cumin1002> | cookbooks.sre.dns.roll-reboot finished rebooting dns1006.wikimedia.org | [production] | 
            
  | 15:05 | <jmm@cumin2002> | START - Cookbook sre.hosts.reboot-single for host ldap-maint2001.codfw.wmnet | [production] | 
            
  | 15:04 | <jmm@cumin2002> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host crm2001.codfw.wmnet | [production] | 
            
  | 15:04 | <pt1979@cumin2002> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr2-eqsin with reason: router replacement | [production] | 
            
  | 15:03 | <pt1979@cumin2002> | START - Cookbook sre.hosts.downtime for 2:00:00 on cr2-eqsin with reason: router replacement | [production] | 
            
  | 15:03 | <pt1979@cumin2002> | END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cr2-eqsin with reason: router replacement | [production] | 
            
  | 15:02 | <pt1979@cumin2002> | START - Cookbook sre.hosts.downtime for 2:00:00 on cr2-eqsin with reason: router replacement | [production] | 
            
  | 15:01 | <jhancock@cumin2002> | END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudlb2004-dev.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED | [production] | 
            
  | 15:00 | <jmm@cumin2002> | START - Cookbook sre.hosts.reboot-single for host crm2001.codfw.wmnet | [production] | 
            
  | 14:59 | <brouberol@cumin1002> | END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling restart_daemons on P{cephosd1001*} and (A:cephosd) | [production] | 
            
  | 14:58 | <brouberol@cumin1002> | START - Cookbook sre.ceph.roll-restart-reboot-server rolling restart_daemons on P{cephosd1001*} and (A:cephosd) | [production] | 
            
  | 14:54 | <jmm@cumin2002> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint1001.eqiad.wmnet | [production] | 
            
  | 14:53 | <jynus@cumin1002> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on backup[2010-2011].codfw.wmnet with reason: T376800 | [production] | 
            
  | 14:52 | <jynus@cumin1002> | START - Cookbook sre.hosts.downtime for 1:00:00 on backup[2010-2011].codfw.wmnet with reason: T376800 | [production] | 
            
  | 14:51 | <jayme@deploy1003> | helmfile [eqiad] DONE helmfile.d/admin 'apply'. | [production] | 
            
  | 14:51 | <jclark@cumin1002> | START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002" | [production] | 
            
  | 14:51 | <jayme@deploy1003> | helmfile [eqiad] START helmfile.d/admin 'apply'. | [production] | 
            
  | 14:50 | <jmm@cumin2002> | START - Cookbook sre.hosts.reboot-single for host ldap-maint1001.eqiad.wmnet | [production] | 
            
  | 14:50 | <jayme@deploy1003> | helmfile [eqiad] DONE helmfile.d/admin 'apply'. | [production] | 
            
  | 14:50 | <sukhe@cumin1002> | cookbooks.sre.dns.roll-reboot begin reboot of dns1006.wikimedia.org | [production] | 
            
  | 14:47 | <brouberol@cumin1002> | END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling restart_daemons on P{cephosd1001*} and (A:cephosd) | [production] | 
            
  | 14:47 | <brouberol@cumin1002> | START - Cookbook sre.ceph.roll-restart-reboot-server rolling restart_daemons on P{cephosd1001*} and (A:cephosd) | [production] | 
            
  | 14:47 | <elukey@cumin2002> | END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host backup1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART | [production] | 
            
  | 14:47 | <elukey@cumin2002> | START - Cookbook sre.hosts.provision for host backup1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART | [production] | 
            
  | 14:45 | <jhancock@cumin2002> | START - Cookbook sre.hosts.provision for host cloudlb2004-dev.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED | [production] | 
            
  | 14:45 | <jayme@deploy1003> | helmfile [codfw] DONE helmfile.d/admin 'apply'. | [production] | 
            
  | 14:44 | <jayme@deploy1003> | helmfile [codfw] START helmfile.d/admin 'apply'. | [production] | 
            
  | 14:44 | <sukhe@cumin1002> | END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling reboot on A:durum | [production] | 
            
  | 14:44 | <elukey@cumin2002> | END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host backup1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART | [production] | 
            
  | 14:44 | <elukey@cumin2002> | START - Cookbook sre.hosts.provision for host backup1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART | [production] | 
            
  | 14:44 | <elukey@cumin2002> | END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host backup1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART | [production] | 
            
  | 14:44 | <elukey@cumin2002> | START - Cookbook sre.hosts.provision for host backup1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART | [production] | 
            
  | 14:43 | <jhancock@cumin2002> | END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host cloudlb2004-dev | [production] | 
            
  | 14:43 | <jhancock@cumin2002> | START - Cookbook sre.network.configure-switch-interfaces for host cloudlb2004-dev | [production] | 
            
  | 14:39 | <jmm@cumin2002> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet | [production] | 
            
  | 14:36 | <jmm@cumin2002> | START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet | [production] | 
            
  | 14:35 | <sukhe@cumin1002> | cookbooks.sre.dns.roll-reboot finished rebooting dns1005.wikimedia.org | [production] | 
            
  | 14:34 | <jmm@cumin2002> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet | [production] | 
            
  | 14:32 | <jhancock@cumin2002> | END (PASS) - Cookbook sre.dns.netbox (exit_code=0) | [production] | 
            
  | 14:31 | <elukey@deploy2002> | helmfile [staging] DONE helmfile.d/services/sessionstore: sync | [production] | 
            
  | 14:31 | <elukey@deploy2002> | helmfile [staging] START helmfile.d/services/sessionstore: sync | [production] | 
            
  | 14:30 | <jmm@cumin2002> | START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet | [production] | 
            
  | 14:30 | <jhancock@cumin2002> | START - Cookbook sre.dns.netbox | [production] | 
            
  | 14:29 | <ladsgroup@cumin1002> | START - Cookbook sre.mysql.clone of db1198.eqiad.wmnet onto db1157.eqiad.wmnet | [production] | 
            
  | 14:28 | <ladsgroup@cumin1002> | dbctl commit (dc=all): 'Depooling db1209 (T367856)', diff saved to https://phabricator.wikimedia.org/P69522 and previous config saved to /var/cache/conftool/dbconfig/20241009-142848-ladsgroup.json | [production] | 
            
  | 14:28 | <ladsgroup@cumin1002> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance | [production] |