| 2024-01-15 |
    
  | 17:02 | <btullis> | roll-restarting public druid cluster | [analytics] | 
            
  | 17:01 | <btullis> | roll-restarting analytics druid cluster | [analytics] | 
            
  | 16:55 | <joal> | Clearing failed analytics airflow tasks after fix | [analytics] | 
            
  | 16:47 | <btullis> | restarted the hive-server2 and hive-metastore services on an-coord100[3-4] which had been accidentally omitted earlier for T332573 | [analytics] | 
            
  | 12:00 | <btullis> | removing all downtime for hadoop-all for T332573 | [analytics] | 
            
  | 11:57 | <btullis> | un-pausing all previously paused DAGs on all airflow instances for T332573 | [analytics] | 
            
  | 11:55 | <btullis> | re-enabling gobblin jobs | [analytics] | 
            
  | 11:38 | <brouberol> | redeploying the Spark History Server to pick up the new HDFS namenodes - T332573 | [analytics] | 
            
  | 11:29 | <btullis> | puppet runs cleanly on an-master1003 and it is the active namenode - running puppet on an-master1004. | [analytics] | 
            
  | 11:20 | <btullis> | running puppet on an-master1003 to set it to active for T332573 | [analytics] | 
            
  | 11:16 | <btullis> | running puppet on journal nodes first for T332573 | [analytics] | 
            
  | 11:03 | <btullis> | stopping all hadoop services | [analytics] | 
            
  | 10:59 | <btullis> | disabling puppet on all hadoop nodes | [analytics] | 
            
  | 10:54 | <btullis> | putting HDFS into safe mode for T332573 | [analytics] | 
            
  
| 2024-01-09 |
    
  | 21:28 | <aqu> | airflow-dags/analytics(_test) are both deployed | [analytics] | 
            
  | 21:18 | <aqu> | analytics/refinery not deployed fully on test cluster. Ticket for the bug here: https://phabricator.wikimedia.org/T354703 | [analytics] | 
            
  | 21:07 | <aqu> | Deployed refinery using scap, then deployed onto hdfs | [analytics] | 
            
  | 20:48 | <aqu> | about to deploy analytics/refinery - weekly train | [analytics] | 
            
  | 12:57 | <stevemunene> | roll restart analytics hadoop masters to pick up new net_topology script and new JRE T254480 | [analytics] | 
            
  | 11:48 | <stevemunene> | roll restarting hadoop test masters to pick up new net_topology script and new JRE | [analytics] | 
            
  | 11:36 | <stevemunene> | disable puppet on hadoop masters both test and production to test/implement new net_topology script | [analytics] | 
            
  | 10:39 | <btullis> | roll-restarting kafka-jumbo to pick up new JRE | [analytics] |