          
  
| 2021-11-18 |
| 07:32 | <elukey> | restart prometheus-druid-exporter on Druid Public to see metrics difference | [analytics] |
            
  
| 2021-11-17 |
| 16:01 | <btullis> | roll-restarting kafka-test brokers | [analytics] |
| 12:12 | <btullis> | roll-restarting the presto analytics workers | [analytics] |
| 11:44 | <btullis> | btullis@archiva1002:~$ sudo systemctl restart archiva.service | [analytics] |
| 07:29 | <elukey> | `apt-get clean` on an-tool1005 to free space in the root partition | [analytics] |
| 07:28 | <elukey> | `sudo pkill -U jmixter` on stat100[5,8] to allow puppet to run and remove the offboarded user | [analytics] |
            
  
| 2021-11-16 |
| 19:40 | <joal> | Deploying refinery to HDFS | [analytics] |
| 19:15 | <joal> | Deploying refinery with scap | [analytics] |
| 18:23 | <joal> | Releasing refinery-source v0.1.21 | [analytics] |
| 11:32 | <btullis> | btullis@cumin1001:~$ sudo cookbook sre.druid.roll-restart-workers public | [analytics] |
| 10:20 | <btullis> | roll-restarting hadoop masters | [analytics] |
            
  
| 2021-11-15 |
| 16:37 | <joal> | Rerun failed mediawiki-wikitext-history-wf-2021-10 | [analytics] |
            
  
| 2021-11-11 |
| 06:56 | <elukey> | `systemctl start prometheus-mysqld-exporter@analytics_meta` on db1108 | [analytics] |
            
  
| 2021-11-10 |
| 18:20 | <btullis> | btullis@an-launcher1002:~$ sudo systemctl reset-failed monitor_refine_event_sanitized_analytics_delayed.service | [analytics] |
| 10:19 | <btullis> | btullis@an-launcher1002:~$ sudo systemctl reset-failed monitor_refine_event_sanitized_analytics_delayed | [analytics] |
            
  
| 2021-11-09 |
| 16:52 | <razzi> | restart presto server on an-coord1001 to apply change for T292087 | [analytics] |
| 16:30 | <razzi> | set superset presto version to 0.246 in ui | [analytics] |
| 16:30 | <razzi> | set superset presto timeout to 170s: {"connect_args":{"session_props":{"query_max_run_time":"170s"}}} for T294771 | [analytics] |
| 12:23 | <btullis> | btullis@an-launcher1002:~$ sudo systemctl reset-failed monitor_refine_event_sanitized_analytics_delayed | [analytics] |
| 07:23 | <elukey> | `apt-get clean` on stat1006 to free some space (root partition full) | [analytics] |
            
  
| 2021-11-08 |
| 19:51 | <ottomata> | an-coord1002: drop user 'admin'@'localhost'; start slave; to fix broken replication - T284150 | [analytics] |
| 19:44 | <razzi> | create admin user on an-coord1001 for T284150 | [analytics] |
| 18:07 | <razzi> | run `create user 'admin'@'localhost' identified by <password>; grant all privileges on *.* to admin;` to allow milimetric to access mysql on an-coord1002 for T284150 | [analytics] |
            
  
| 2021-11-04 |
| 16:39 | <razzi> | add "can sql json on superset" permission to Alpha role on superset.wikimedia.org | [analytics] |
| 16:14 | <razzi> | drop and restore superset_staging database to test permissions as they are in production | [analytics] |
            
  
| 2021-11-03 |
| 17:07 | <razzi> | razzi@an-tool1010:~$ sudo systemctl stop superset | [analytics] |
| 16:57 | <razzi> | dump mysql in preparation for superset upgrade | [analytics] |
| 02:23 | <milimetric> | deployed refinery with regular train | [analytics] |
            
  
| 2021-10-29 |
| 23:04 | <btullis> | deleted all remaining old cassandra snapshots on aqs100x servers. | [analytics] |
| 22:58 | <btullis> | deleted old snapshots from aqs1006 and aqs1009 | [analytics] |
| 17:45 | <razzi> | set presto_analytics_hive extra parameter engine_params.connect_args.session_props.query_max_run_time to 55s on superset.wikimedia.org | [analytics] |
| 10:39 | <elukey> | roll restart of kafka-test to pick up new truststore (root PKI added) | [analytics] |
            
  
| 2021-10-28 |
| 19:13 | <ottomata> | re-enable hdfs-cleaner for /wmf/gobblin | [analytics] |
            
  
| 2021-10-26 |
| 09:01 | <btullis> | reverted hive services back to an-coord1001. | [analytics] |
            
  
| 2021-10-25 |
| 16:03 | <btullis> | btullis@an-coord1001:~$ sudo systemctl restart hive-server2 hive-metastore | [analytics] |
| 13:02 | <btullis> | btullis@an-coord1002:~$ sudo systemctl restart hive-server2 hive-metastore | [analytics] |
| 12:51 | <btullis> | btullis@aqs1007:~$ sudo nodetool-a clearsnapshot | [analytics] |
            
  
| 2021-10-21 |
| 14:05 | <ottomata> | rerun refine_eventlogging_analytics refine_eventlogging_legacy and refine_event with -ignore-done-flag=true --since=2021-10-21T01:00:00 --until=2021-10-21T04:00:00 for backfill of missing data after gobblin problems | [analytics] |
| 13:39 | <btullis> | btullis@an-launcher1002:~$ sudo systemctl restart gobblin-event_default | [analytics] |
| 10:35 | <joal> | Re-refine netflow data after gobblin pulled data fix | [analytics] |
| 08:41 | <joal> | Rerun webrequest-load jobs for hour 2021-10-21T02:00 | [analytics] |
            
  
| 2021-10-20 |
| 18:11 | <razzi> | Deployed refinery using scap, then deployed onto hdfs | [analytics] |
| 16:36 | <razzi> | deploy refinery change for https://phabricator.wikimedia.org/T287084 | [analytics] |
| 07:15 | <joal> | rerun webrequest-load-wf-upload-2021-10-20-1 after node issue | [analytics] |
| 06:27 | <elukey> | reboot analytics1066 - OS showing CPU soft lockups, tons of defunct processes (including node manager) and high CPU usage | [analytics] |
            
  
| 2021-10-19 |
| 07:14 | <joal> | Rerun cassandra-daily-wf-local_group_default_T_mediarequest_top_files-2021-10-17 | [analytics] |
            
  
| 2021-10-18 |
| 19:29 | <joal> | Rerun cassandra-daily-wf-local_group_default_T_top_pageviews-2021-10-17 | [analytics] |
| 18:36 | <joal> | Rerun cassandra-daily-wf-local_group_default_T_unique_devices-2021-10-17 | [analytics] |
| 16:22 | <joal> | rerun cassandra-daily-wf-local_group_default_T_top_percountry-2021-10-17 | [analytics] |
| 16:16 | <joal> | Rerun cassandra-daily-wf-local_group_default_T_mediarequest_per_referer-2021-10-17 | [analytics] |