| 
      
        2023-03-07
      
     | 
  
    
  | 16:55 | 
  <xcollazo> | 
  deployed image-suggestions hotfix to platform_eng Airflow instance. See https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/262. | 
  [analytics] | 
            
  | 15:23 | 
  <btullis> | 
  re-enabling ingestion via gobblin. | 
  [analytics] | 
            
  | 14:59 | 
  <nfraison> | 
  force startup of nodemanager on analytics_cluster | 
  [analytics] | 
            
  | 14:58 | 
  <btullis> | 
  pooled druid1004 | 
  [analytics] | 
            
  | 14:57 | 
  <btullis> | 
  pooling aqs1010 and aqs1016 | 
  [analytics] | 
            
  | 14:56 | 
  <btullis> | 
  pooling datahubsearch1001 | 
  [analytics] | 
            
  | 14:53 | 
  <btullis> | 
  leaving safe mode on hdfs | 
  [analytics] | 
            
  | 13:59 | 
  <btullis> | 
  disabled puppet temporarily on an-master100[1-2] to avoid an automatic restart of yarn | 
  [analytics] | 
            
  | 13:57 | 
  <btullis> | 
  stopped `hadoop-yarn-resourcemanager.service` on both an-master100[1-2] | 
  [analytics] | 
            
  | 13:54 | 
  <btullis> | 
  entering safe mode with `sudo -u hdfs kerberos-run-command hdfs hadoop dfsadmin -safemode enter` on an-master1002 | 
  [analytics] | 
            
  | 12:57 | 
  <btullis> | 
  depooled druid1004 for T329073 | 
  [analytics] | 
            
  | 12:56 | 
  <btullis> | 
  depooled datahubsearch1001 for T329073 | 
  [analytics] | 
            
  | 12:51 | 
  <btullis> | 
  disabled gobblin timers on an-launcher1002 | 
  [analytics] | 
            
  | 12:46 | 
  <btullis> | 
  depooling aqs1016 for T329073 | 
  [analytics] | 
            
  | 12:45 | 
  <btullis> | 
  depooling aqs1010 for T329073 | 
  [analytics] | 
            
  | 08:00 | 
  <nfraison> | 
  Reimage an-conf1003 to upgrade to bullseye T329362 | 
  [analytics] | 
            
  
    | 
      
        2023-03-01
      
     | 
  
    
  | 22:45 | 
  <mforns> | 
  re-deployed airflow analytics with some forgotten changes | 
  [analytics] | 
            
  | 22:42 | 
  <mforns> | 
  deployed Airflow analytics | 
  [analytics] | 
            
  | 22:30 | 
  <mforns> | 
  finished refinery deployment, although didn't manage to run refinery-deploy-to-hdfs without warnings... | 
  [analytics] | 
            
  | 21:48 | 
  <mforns> | 
  kill edit-hourly-coord in Hue to migrate it to Airflow | 
  [analytics] | 
            
  | 21:26 | 
  <mforns> | 
  starting refinery deploy | 
  [analytics] | 
            
  | 19:38 | 
  <SandraEbele> | 
  rerunning webrequest load text for 2023-03-01-08 hour. | 
  [analytics] | 
            
  | 18:54 | 
  <joal> | 
  Create empty partitions in event.mediawiki_page_move table for codfw datacenter from beginning of week (2023-02-27T00 -> 2023-02-28T13) | 
  [analytics] | 
            
  | 10:25 | 
  <nfraison> | 
  rebooting an-worker1132, which is slower than the other nodes (potential issue with RAID card/disks) | 
  [analytics] | 
            
  | 07:59 | 
  <nfraison> | 
  restarted hiveserver2 in analytics-test to take into account the -XX:MaxMetaspaceSize=512m JVM parameter | 
  [analytics] |