| 
      
        2020-10-05
      
     | 
  
    
  | 19:14 | 
  <mforns> | 
  restarted oozie coord unique_devices-per_domain-monthly after deployment | 
  [analytics] | 
            
  | 19:05 | 
  <mforns> | 
  finished deploying refinery to unblock deletion of raw mediawiki_job and raw netflow data | 
  [analytics] | 
            
  | 18:45 | 
  <mforns> | 
  deploying refinery to unblock deletion of raw mediawiki_job and raw netflow data | 
  [analytics] | 
            
  | 18:20 | 
  <elukey> | 
  manual creation of /opt/rocm -> /opt/rocm-3.3.0 on stat1008 to avoid failures in finding the lib dir | 
  [analytics] | 
            
  | 17:11 | 
  <elukey> | 
  bootstrap an-worker[1115-1117] as hadoop workers | 
  [analytics] | 
            
  | 14:52 | 
  <milimetric> | 
  disabling drop-el-unsanitized-events timer until https://gerrit.wikimedia.org/r/c/analytics/refinery/+/631804/ is deployed | 
  [analytics] | 
            
  | 14:41 | 
  <elukey> | 
  shutdown stat1005 and stat1008 for ram expansion (1005 again) | 
  [analytics] | 
            
  | 14:25 | 
  <elukey> | 
  shutdown an-master1001 for ram expansion | 
  [analytics] | 
            
  | 13:54 | 
  <elukey> | 
  shutdown stat1005 for ram upgrade | 
  [analytics] | 
            
  | 13:31 | 
  <elukey> | 
  shutdown an-master1002 for ram expansion (64 -> 128G) | 
  [analytics] | 
            
  | 12:35 | 
  <elukey> | 
  execute "PURGE BINARY LOGS BEFORE '2020-09-28 00:00:00';" on an-coord1001's mysql to free space - T264081 | 
  [analytics] | 
            
  | 10:31 | 
  <elukey> | 
  bootstrap an-worker111[0,2] as hadoop workers | 
  [analytics] | 
            
  | 06:33 | 
  <elukey> | 
  reboot stat1005 to resolve weird GPU state (scheduled last week) | 
  [analytics] | 
            
  
    | 
      
        2020-10-01
      
     | 
  
    
  | 19:07 | 
  <fdans> | 
  deploying wikistats | 
  [analytics] | 
            
  | 19:06 | 
  <fdans> | 
  restarted banner_activity-druid-daily-coord from Sep 26 | 
  [analytics] | 
            
  | 18:59 | 
  <fdans> | 
  restarting mediawiki-history-load-coord | 
  [analytics] | 
            
  | 18:57 | 
  <fdans> | 
  creating hive table wmf_raw.mediawiki_page_props | 
  [analytics] | 
            
  | 18:56 | 
  <fdans> | 
  creating hive table wmf_raw.mediawiki_user_properties | 
  [analytics] | 
            
  | 17:40 | 
  <elukey> | 
  remove + re-create /srv/deployment/analytics/refinery* on stat100[46] (perm issues after reimage) | 
  [analytics] | 
            
  | 17:32 | 
  <elukey> | 
  remove + re-create /srv/deployment/analytics/refinery on stat1007 (perm issues after reimage) | 
  [analytics] | 
            
  | 17:18 | 
  <fdans> | 
  deploying refinery | 
  [analytics] | 
            
  | 14:51 | 
  <elukey> | 
  bootstrap an-worker109[8-9] as hadoop workers (with GPU) | 
  [analytics] | 
            
  | 13:35 | 
  <elukey> | 
  bootstrap an-worker1097 (GPU node) as hadoop worker | 
  [analytics] | 
            
  | 13:15 | 
  <elukey> | 
  restart performance-asoranking on stat1007 | 
  [analytics] | 
            
  | 13:15 | 
  <elukey> | 
  execute "sudo chown analytics-privatedata:analytics-privatedata-users /srv/published-datasets/performance/autonomoussystems/*" on stat1007 to fix a perm issue after reimage | 
  [analytics] | 
            
  | 10:30 | 
  <elukey> | 
  add an-worker1103 to the hadoop cluster | 
  [analytics] | 
            
  | 07:15 | 
  <elukey> | 
  restart hdfs namenodes on an-master100[1,2] to pick up new hadoop workers settings | 
  [analytics] |