| 
      
        2024-09-23
      
     | 
  
    
  | 21:11 | 
  <jclark@cumin1002> | 
  START - Cookbook sre.hosts.reimage for host ganeti1041.eqiad.wmnet with OS bookworm | 
  [production] | 
            
  | 21:10 | 
  <toyofuku@deploy1003> | 
  Started scap sync-world: Backport for [[gerrit:1074545|Do not apply table styling rules to Main page (T375245)]] | 
  [production] | 
            
  | 21:01 | 
  <jclark@cumin1002> | 
  START - Cookbook sre.hosts.reimage for host ganeti1040.eqiad.wmnet with OS bookworm | 
  [production] | 
            
  | 20:56 | 
  <btullis@cumin1002> | 
  START - Cookbook sre.hosts.reimage for host dse-k8s-worker1001.eqiad.wmnet with OS bookworm | 
  [production] | 
            
  | 20:56 | 
  <jclark@cumin1002> | 
  START - Cookbook sre.hosts.reimage for host ganeti1039.eqiad.wmnet with OS bookworm | 
  [production] | 
            
  | 20:56 | 
  <btullis@cumin1002> | 
  END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1001.eqiad.wmnet with OS bookworm | 
  [production] | 
            
  | 20:40 | 
  <toyofuku@deploy1003> | 
  Finished scap sync-world: Backport for [[gerrit:1072600|Remove ProofreadPage dark mode namespaces exception]], [[gerrit:1074490|Promote dark mode for anons on tier 1 wikis (T374679)]] (duration: 27m 41s) | 
  [production] | 
            
  | 20:39 | 
  <btullis@cumin1002> | 
  START - Cookbook sre.hosts.reimage for host dse-k8s-worker1001.eqiad.wmnet with OS bookworm | 
  [production] | 
            
  | 20:25 | 
  <toyofuku@deploy1003> | 
  jdlrobson, toyofuku, ebrahim: Continuing with sync | 
  [production] | 
            
  | 20:23 | 
  <toyofuku@deploy1003> | 
  jdlrobson, toyofuku, ebrahim: Backport for [[gerrit:1072600|Remove ProofreadPage dark mode namespaces exception]], [[gerrit:1074490|Promote dark mode for anons on tier 1 wikis (T374679)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) | 
  [production] | 
            
  | 20:12 | 
  <toyofuku@deploy1003> | 
  Started scap sync-world: Backport for [[gerrit:1072600|Remove ProofreadPage dark mode namespaces exception]], [[gerrit:1074490|Promote dark mode for anons on tier 1 wikis (T374679)]] | 
  [production] | 
            
  | 18:47 | 
  <jgleeson> | 
  SmashPig upgraded from 02ba8a7e to 697344d7 | 
  [production] | 
            
  | 17:06 | 
  <ebernhardson@deploy1003> | 
  helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply | 
  [production] | 
            
  | 17:05 | 
  <ebernhardson@deploy1003> | 
  helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply | 
  [production] | 
            
  | 16:53 | 
  <jgleeson> | 
  SmashPig upgraded from ac85ad1d to 02ba8a7e | 
  [production] | 
            
  | 16:49 | 
  <ebernhardson@deploy1003> | 
  helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply | 
  [production] | 
            
  | 16:49 | 
  <ebernhardson@deploy1003> | 
  helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply | 
  [production] | 
            
  | 16:43 | 
  <ebernhardson@deploy1003> | 
  helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply | 
  [production] | 
            
  | 16:43 | 
  <ebernhardson@deploy1003> | 
  helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply | 
  [production] | 
            
  | 16:43 | 
  <pt1979@cumin1002> | 
  START - Cookbook sre.hosts.dhcp for host db1246.eqiad.wmnet | 
  [production] | 
            
  | 16:38 | 
  <ryankemper@cumin2002> | 
  END (PASS) - Cookbook sre.wdqs.restart (exit_code=0) | 
  [production] | 
            
  | 16:38 | 
  <pt1979@cumin1002> | 
  END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1246.eqiad.wmnet with OS bookworm | 
  [production] | 
            
  | 16:08 | 
  <elukey@puppetserver1001> | 
  conftool action : set/pooled=no; selector: name=registry1003.eqiad.wmnet,service=docker-registry,dc=eqiad | 
  [production] | 
            
  | 16:08 | 
  <elukey@puppetserver1001> | 
  conftool action : set/weight=10; selector: name=registry1005.eqiad.wmnet,service=docker-registry,dc=eqiad | 
  [production] | 
            
  | 16:08 | 
  <elukey@puppetserver1001> | 
  conftool action : set/pooled=yes; selector: name=registry1005.eqiad.wmnet,service=docker-registry,dc=eqiad | 
  [production] | 
            
  | 16:05 | 
  <dcausse@deploy1003> | 
  helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply | 
  [production] | 
            
  | 16:05 | 
  <dcausse@deploy1003> | 
  helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply | 
  [production] | 
            
  | 16:04 | 
  <elukey@puppetserver1001> | 
  conftool action : set/pooled=true,weight=10; selector: name=registry1005.eqiad.wmnet,service=docker-registry,dc=eqiad | 
  [production] | 
            
  | 16:03 | 
  <dcausse@deploy1003> | 
  helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply | 
  [production] | 
            
  | 16:03 | 
  <dcausse@deploy1003> | 
  helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply | 
  [production] | 
            
  | 15:59 | 
  <dcausse@deploy1003> | 
  helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply | 
  [production] | 
            
  | 15:59 | 
  <dcausse@deploy1003> | 
  helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply | 
  [production] | 
            
  | 15:46 | 
  <pt1979@cumin1002> | 
  START - Cookbook sre.hosts.reimage for host db1246.eqiad.wmnet with OS bookworm | 
  [production] | 
            
  | 15:35 | 
  <ryankemper@cumin2002> | 
  START - Cookbook sre.wdqs.restart | 
  [production] | 
            
  | 15:20 | 
  <stevemunene@deploy1003> | 
  Finished deploy [wdqs/wdqs@316bf7f]: allow 3 new endpoints T364233 T368085 T374195 (duration: 05m 51s) | 
  [production] | 
            
  | 15:14 | 
  <stevemunene@deploy1003> | 
  Started deploy [wdqs/wdqs@316bf7f]: allow 3 new endpoints T364233 T368085 T374195 | 
  [production] | 
            
  | 15:03 | 
  <volans@cumin1002> | 
  dbctl commit (dc=all): 'emergency failover pc3 to pc1015', diff saved to https://phabricator.wikimedia.org/P69396 and previous config saved to /var/cache/conftool/dbconfig/20240923-150320-volans.json | 
  [production] | 
            
  | 14:51 | 
  <moritzm> | 
  powercycle pc1013 (DIMM error in DIMM_A9) | 
  [production] | 
            
  | 14:50 | 
  <elukey@puppetserver1001> | 
  conftool action : set/pooled=true,weight=10; selector: name=registry1005.eqiad.wmnet,service=docker-registry,dc=eqiad | 
  [production] | 
            
  | 14:49 | 
  <elukey@puppetserver1001> | 
  conftool action : set/pooled=true; selector: name=registry1005.eqiad.wmnet | 
  [production] | 
            
  | 14:49 | 
  <elukey@puppetserver1001> | 
  conftool action : set/weight=10; selector: name=registry1005.eqiad.wmnet | 
  [production] | 
            
  | 14:48 | 
  <elukey@puppetserver1001> | 
  conftool action : set/pooled=true; selector: name=registry1005.eqiad.wmnet | 
  [production] | 
            
  | 14:38 | 
  <stevemunene@cumin1002> | 
  START - Cookbook sre.hosts.reimage for host an-worker1176.eqiad.wmnet with OS bullseye | 
  [production] | 
            
  | 14:31 | 
  <elukey@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host registry1005.eqiad.wmnet with OS bookworm | 
  [production] | 
            
  | 14:30 | 
  <sukhe> | 
  sudo cumin 'O:wikidough' 'run-puppet-agent' | 
  [production] | 
            
  | 14:30 | 
  <jynus> | 
  restarting and moving replication source of pc1015 T375382 | 
  [production] | 
            
  | 14:16 | 
  <ayounsi@cumin1002> | 
  END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device cr2-codfw | 
  [production] | 
            
  | 14:16 | 
  <ayounsi@cumin1002> | 
  START - Cookbook sre.network.tls for network device cr2-codfw | 
  [production] | 
            
  | 14:16 | 
  <elukey@cumin1002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on registry1005.eqiad.wmnet with reason: host reimage | 
  [production] | 
            
  | 14:15 | 
  <ayounsi@cumin1002> | 
  END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device cr1-codfw | 
  [production] |