2021-08-11

19:43 <btullis> btullis@druid1003:~$ sudo systemctl stop druid-overlord && sudo systemctl disable druid-overlord [analytics]
19:41 <btullis> btullis@druid1003:~$ sudo systemctl stop druid-historical && sudo systemctl disable druid-historical [analytics]
19:40 <btullis> btullis@druid1003:~$ sudo systemctl stop druid-coordinator && sudo systemctl disable druid-coordinator [analytics]
19:37 <btullis> btullis@druid1003:~$ sudo systemctl stop druid-broker && sudo systemctl disable druid-broker [analytics]
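A quick way to confirm the four Druid units above are both stopped and disabled; a minimal sketch using the unit names as logged:

    # Confirm each unit is no longer running and will not start at boot
    for unit in druid-broker druid-coordinator druid-historical druid-overlord; do
        systemctl is-active "$unit"    # expect: inactive
        systemctl is-enabled "$unit"   # expect: disabled
    done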
            
19:30 <btullis> btullis@druid1003:~$ curl -X POST http://druid1003.eqiad.wmnet:8091/druid/worker/v1/disable [analytics]
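The POST above tells the MiddleManager on druid1003 to stop accepting new indexing tasks so it can drain before the services are stopped. A sketch of checking the worker's state with the corresponding read-only endpoints, assuming the same host and port as in the log entry:

    # Should report the worker as disabled after the POST above
    curl -s http://druid1003.eqiad.wmnet:8091/druid/worker/v1/enabled
    # Tasks still assigned to this worker; wait for these to drain before stopping services
    curl -s http://druid1003.eqiad.wmnet:8091/druid/worker/v1/tasks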
            
12:13 <btullis> migration of zookeeper from druid1002 to an-druid1002 complete, with quorum and two synced followers. Re-enabling puppet on all druid nodes. [analytics]
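Quorum and follower state can be double-checked with ZooKeeper's four-letter-word commands; a sketch, assuming the ensemble listens on the default client port 2181 and that the stat command is allowed:

    # Prints "Mode: leader" or "Mode: follower" for the queried ensemble member
    echo stat | nc an-druid1002.eqiad.wmnet 2181 | grep Mode
    # Repeat against the other ensemble members (hostnames not listed in this entry)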
            
09:48 <btullis> suspended the following oozie jobs in hue: webrequest-druid-hourly-coord, pageview-druid-hourly-coord, edit-hourly-druid-coord [analytics]
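The same jobs can be suspended from the Oozie CLI instead of Hue; a sketch with a placeholder coordinator ID, assuming OOZIE_URL points at the Oozie server (otherwise pass -oozie <url>):

    # Find the coordinator ID for one of the jobs named above, then suspend it
    oozie jobs -jobtype coordinator -filter status=RUNNING | grep webrequest-druid-hourly-coord
    oozie job -suspend <coordinator-id>   # placeholder ID; resume later with: oozie job -resume <coordinator-id>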
            
09:45 <btullis> btullis@an-launcher1002:~$ sudo systemctl disable eventlogging_to_druid_editattemptstep_hourly.timer eventlogging_to_druid_navigationtiming_hourly.timer eventlogging_to_druid_netflow_hourly.timer eventlogging_to_druid_prefupdate_hourly.timer [analytics]
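systemctl disable on its own only prevents the timers from starting at boot; an already-active timer keeps firing until it is stopped. A sketch of also stopping them and confirming nothing remains scheduled:

    # Stop the active timers as well, then list anything still matching the pattern
    sudo systemctl stop eventlogging_to_druid_editattemptstep_hourly.timer eventlogging_to_druid_navigationtiming_hourly.timer eventlogging_to_druid_netflow_hourly.timer eventlogging_to_druid_prefupdate_hourly.timer
    systemctl list-timers 'eventlogging_to_druid_*'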
            
09:21 <elukey> ran "sudo find /var/log/airflow -type f -mtime +15 -delete" on an-airflow1001 to free space (root partition almost full) [analytics]
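For a destructive find ... -delete like the one above, it is worth previewing the matches and confirming the space is reclaimed; a minimal sketch:

    # Preview which files (untouched for more than 15 days) would be removed
    sudo find /var/log/airflow -type f -mtime +15 -print | head
    # Check root partition usage before and after the deletion
    df -h /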
            
  
2021-07-20

20:30 <joal> rerun webrequest timed-out instances [analytics]
18:58 <mforns> starting refinery deployment [analytics]
18:40 <razzi> razzi@an-launcher1002:~$ sudo puppet agent --enable [analytics]
18:39 <razzi> razzi@an-master1001:/var/log/hadoop-hdfs$ sudo -u yarn kerberos-run-command yarn yarn rmadmin -refreshQueues [analytics]
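yarn rmadmin -refreshQueues reloads the scheduler's queue configuration without restarting the ResourceManager. A sketch of checking a queue's state afterwards; the queue name "default" is only a placeholder:

    # Prints state, capacity and usage for the named queue
    sudo -u yarn kerberos-run-command yarn yarn queue -status default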
            
18:37 <razzi> razzi@an-master1002:~$ sudo -i puppet agent --enable [analytics]
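Once the agent is re-enabled, a run can be triggered immediately rather than waiting for the next scheduled interval; a sketch:

    # One-off foreground Puppet run to apply any pending changes
    sudo -i puppet agent --test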
            
18:34 <razzi> razzi@an-master1002:~$ sudo -u yarn kerberos-run-command yarn yarn rmadmin -refreshQueues [analytics]
18:32 <razzi> razzi@an-master1002:~$ sudo systemctl start hadoop-yarn-resourcemanager.service [analytics]
18:31 <razzi> razzi@an-master1002:~$ sudo systemctl stop hadoop-yarn-resourcemanager.service [analytics]
18:22 <razzi> sudo -u hdfs /usr/bin/hdfs haadmin -failover an-master1002-eqiad-wmnet an-master1001-eqiad-wmnet [analytics]
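Given the argument order above, the failover moves the active NameNode role from an-master1002 to an-master1001. A sketch of confirming which side is active afterwards, using the same service IDs:

    # Each command prints "active" or "standby" for the given NameNode
    sudo -u hdfs /usr/bin/hdfs haadmin -getServiceState an-master1001-eqiad-wmnet
    sudo -u hdfs /usr/bin/hdfs haadmin -getServiceState an-master1002-eqiad-wmnet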
            
18:21 <razzi> re-enable yarn queues by merging puppet patch https://gerrit.wikimedia.org/r/c/operations/puppet/+/705732 [analytics]
17:27 <razzi> razzi@cumin1001:~$ sudo -i wmf-auto-reimage-host -p T278423 an-master1001.eqiad.wmnet [analytics]
17:17 <razzi> stop all hadoop processes on an-master1001 [analytics]
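The entry above does not name the individual units. A sketch assuming the master daemons that appear elsewhere in this log, plus the NameNode unit, whose name hadoop-hdfs-namenode is an assumption here:

    # Assumed unit list for the an-master1001 master daemons; check the host's actual units first
    sudo systemctl stop hadoop-yarn-resourcemanager hadoop-mapreduce-historyserver hadoop-hdfs-zkfc hadoop-hdfs-namenode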
            
16:52 <razzi> starting hadoop processes on an-master1001 since they didn't fail over cleanly [analytics]
            
16:31 <razzi> sudo bash gid_script.bash on an-master1001 [analytics]
            
16:29 <razzi> razzi@alert1001:~$ sudo icinga-downtime -h an-master1001 -d 7200 -r "an-master1001 debian upgrade" [analytics]
16:25 <razzi> razzi@an-master1001:~$ sudo systemctl stop hadoop-mapreduce-historyserver [analytics]
16:25 <razzi> sudo systemctl stop hadoop-hdfs-zkfc.service on an-master1001 again [analytics]
16:25 <razzi> sudo systemctl stop hadoop-yarn-resourcemanager on an-master1001 again [analytics]