| 
      
        2021-03-05
      
     | 
  
    
  | 15:39 | 
  <razzi> | 
  rebalance kafka partitions for webrequest_upload partition 10 | 
  [analytics] | 
            
  | 15:07 | 
  <elukey> | 
  drain + reimage analytics1073 and an-worker1086 to Debian Buster | 
  [analytics] | 
            
  | 13:36 | 
  <elukey> | 
  roll restart HDFS Namenodes for the Hadoop cluster to pick up new Xmx settings (https://gerrit.wikimedia.org/r/c/operations/puppet/+/668659) | 
  [analytics] | 
            
  | 10:20 | 
  <elukey> | 
  force run of refinery-druid-drop-public-snapshots to check Druid public's performance | 
  [analytics] | 
            
  | 10:06 | 
  <elukey> | 
  failover HDFS Namenode from 1002 to 1001 (high GC pauses triggered the HDFS zkfc daemon on 1001 and the failover to 1002) | 
  [analytics] | 
            
  | 08:32 | 
  <elukey> | 
  drain + reimage an-worker107[8,9] to Debian Buster (one Journal node included) | 
  [analytics] | 
            
  | 07:22 | 
  <elukey> | 
  drain + reimage analytics107[0-1] to Debian Buster | 
  [analytics] | 
            
  | 07:13 | 
  <elukey> | 
  add analytics1066 back with /dev/sdb removed | 
  [analytics] | 
            
  | 07:01 | 
  <elukey> | 
  stop hadoop daemons on analytics1066 - disk errors on /dev/sdb after reimage | 
  [analytics] | 
            
  
    | 
      
        2021-03-04
      
     | 
  
    
  | 21:19 | 
  <razzi> | 
  rebalance kafka partitions for webrequest_upload partition 9 | 
  [analytics] | 
            
  | 16:27 | 
  <elukey> | 
  drain + reimage analytics106[8,9] to Debian Buster (one is a journalnode) | 
  [analytics] | 
            
  | 15:12 | 
  <elukey> | 
  drain + reimage analytics106[6,7] to Debian Buster | 
  [analytics] | 
            
  | 14:21 | 
  <elukey> | 
  drain + reimage analytics1065 to Debian Buster | 
  [analytics] | 
            
  | 13:32 | 
  <elukey> | 
  drain + reimage analytics10[63,64] to Debian Buster | 
  [analytics] | 
            
  | 12:48 | 
  <elukey> | 
  drain + reimage analytics10[61,62] to Debian Buster | 
  [analytics] | 
            
  | 10:40 | 
  <elukey> | 
  drain + reimage analytics1059/1060 to Debian Buster | 
  [analytics] | 
            
  | 09:32 | 
  <elukey> | 
  reboot an-worker[1097-1101] (GPU workers) to pick up the new kernel (5.10) | 
  [analytics] | 
            
  | 09:02 | 
  <elukey> | 
  kill/start mediawiki-geoeditors-monthly to apply backtick change (hive script) | 
  [analytics] | 
            
  | 08:48 | 
  <elukey> | 
  deploy refinery to hdfs | 
  [analytics] | 
            
  | 08:34 | 
  <elukey> | 
  deploy refinery to fix https://gerrit.wikimedia.org/r/c/analytics/refinery/+/668111 | 
  [analytics] | 
            
  | 07:38 | 
  <elukey> | 
  reboot an-worker1096 to pick up 5.10 kernel | 
  [analytics] | 
            
  
    | 
      
        2021-03-02
      
     | 
  
    
  | 23:15 | 
  <mforns> | 
  finished deployment of refinery to hdfs | 
  [analytics] | 
            
  | 21:59 | 
  <mforns> | 
  starting refinery deployment using scap | 
  [analytics] | 
            
  | 21:48 | 
  <mforns> | 
  deployed refinery-source v0.1.2 | 
  [analytics] | 
            
  | 17:26 | 
  <razzi> | 
  rebalance kafka partitions for webrequest_upload partition 7 | 
  [analytics] | 
            
  | 13:42 | 
  <elukey> | 
  Add an-worker11[19,20-28,30,31] to Analytics Hadoop | 
  [analytics] | 
            
  | 10:21 | 
  <elukey> | 
  roll restart druid historicals on druid public to pick up new cache settings (enable segment caching) | 
  [analytics] | 
            
  | 10:14 | 
  <elukey> | 
  roll restart druid brokers on druid public to pick up new cache settings (no segment caching, only query caching) | 
  [analytics] | 
            
  | 08:01 | 
  <elukey> | 
  manual start of performance-asotranking on stat1007 (requested by Gilles) - T276121 | 
  [analytics] | 
            
  
    | 
      
        2021-03-01
      
     | 
  
    
  | 21:24 | 
  <razzi> | 
  rebalance kafka partitions for webrequest_upload partition 6 | 
  [analytics] | 
            
  | 18:14 | 
  <razzi> | 
  restart timer that wasn't running on an-worker1101: sudo systemctl restart prometheus-debian-version-textfile.timer | 
  [analytics] | 
            
  | 17:40 | 
  <elukey> | 
  reimage an-worker1098 (GPU worker node) to Buster | 
  [analytics] | 
            
  | 14:48 | 
  <elukey> | 
  reimage an-worker1097 (GPU node) to Debian Buster | 
  [analytics] | 
            
  | 11:55 | 
  <elukey> | 
  roll restart druid broker on druid-analytics (again) to enable query cache settings (missing config due to typo) | 
  [analytics] | 
            
  | 11:34 | 
  <elukey> | 
  roll restart historical daemons (again) on druid-analytics to remove stale config and enable (finally) segment caching. | 
  [analytics] | 
            
  | 11:02 | 
  <elukey> | 
  roll restart druid-broker and druid-historical daemons on druid-analytics to pick up new cache settings (disable segment caching on broker and enable it on historicals) | 
  [analytics] | 
            
  | 09:11 | 
  <elukey> | 
  restart hadoop daemons on an-worker1112 to pick up the new disk | 
  [analytics] | 
            
  | 09:11 | 
  <elukey> | 
  remount /dev/sdl on an-worker1112 (wasn't able to make it fail) | 
  [analytics] |