| 2016-01-27 |
  | 23:07 | <YuviPanda> | removed all members of templatetiger, added self instead, removed active shell sessions | [tools] | 
            
  | 20:24 | <chasemp> | master stop, truncate accounting log to accounting.01272016, master start | [tools] | 
            
  | 19:34 | <chasemp> | master start grid master | [tools] | 
            
  | 19:23 | <chasemp> | stopped master | [tools] | 
            
  | 19:11 | <YuviPanda> | depooled tools-webgrid-1405 to prep for restart, lots of stuck processes | [tools] | 
            
  | 18:29 | <valhallasw`cloud> | job 2551539 is ifttt, which is also running as 2700629. Killing 2551539. | [tools] | 
            
  | 18:26 | <valhallasw`cloud> | messages repeatedly reports "01/27/2016 18:26:17|worker|tools-grid-master|E|execd@tools-webgrid-generic-1405.tools.eqiad.wmflabs reports running job (2551539.1/master) in queue "webgrid-generic@tools-webgrid-generic-1405.tools.eqiad.wmflabs" that was not supposed to be there - killing". SSH'ing there to investigate | [tools] | 
            
  | 18:24 | <valhallasw`cloud> | 'sleep' test job also seems to work without issues | [tools] | 
            
  | 18:23 | <valhallasw`cloud> | no errors in log file, qstat works | [tools] | 
            
  | 18:23 | <chasemp> | master sge restarted post dump and restart for jobs db | [tools] | 
            
  | 18:22 | <valhallasw`cloud> | messages file reports 'Wed Jan 27 18:21:39 UTC 2016 db_load_sge_maint_pre_jobs_dump_01272016' | [tools] | 
            
  | 18:20 | <chasemp> | master db_load -f /root/sge_maint_pre_jobs_dump_01272016 sge_job | [tools] | 
            
  | 18:19 | <valhallasw`cloud> | dumped jobs database to /root/sge_maint_pre_jobs_dump_01272016, 4.6M | [tools] | 
            
  | 18:17 | <valhallasw`cloud> | SGE Configuration successfully saved to /root/sge_maint_01272016 directory. | [tools] | 
            
  | 18:14 | <chasemp> | grid master stopped | [tools] | 
            
  
    | 2016-01-21 |
  | 22:24 | <YuviPanda> | deleted tools-redis-01 and -02 (are on 1001 and 1002 now) | [tools] | 
            
  | 21:13 | <YuviPanda> | repooled exec nodes on labvirt1010 | [tools] | 
            
  | 21:08 | <YuviPanda> | gridengine-master started, verified shadow hasn't started | [tools] | 
            
  | 21:00 | <YuviPanda> | stop gridengine master | [tools] | 
            
  | 20:51 | <YuviPanda> | correction: the last message should read "repooled exec nodes on labvirt1007" | [tools] | 
            
  | 20:51 | <YuviPanda> | repooled exec nodes on labvirt1006 | [tools] | 
            
  | 20:39 | <YuviPanda> | failover tools-static to tools-web-static-01 | [tools] | 
            
  | 20:38 | <YuviPanda> | failover tools-checker to tools-checker-01 | [tools] | 
            
  | 20:32 | <YuviPanda> | depooled exec nodes on 1007 | [tools] | 
            
  | 20:32 | <YuviPanda> | repooled exec nodes on 1006 | [tools] | 
            
  | 20:14 | <YuviPanda> | depooled all exec nodes in labvirt1006 | [tools] | 
            
  | 20:11 | <YuviPanda> | repooled exec nodes on 1005 | [tools] | 
            
  | 19:53 | <YuviPanda> | depooled exec nodes on labvirt1005 | [tools] | 
            
  | 19:49 | <YuviPanda> | repooled exec nodes from labvirt1004 | [tools] | 
            
  | 19:48 | <YuviPanda> | failed over proxy to tools-proxy-01 again | [tools] | 
            
  | 19:31 | <YuviPanda> | depooled exec nodes from labvirt1004 | [tools] | 
            
  | 19:29 | <YuviPanda> | repooled exec nodes from labvirt1003 | [tools] | 
            
  | 19:13 | <YuviPanda> | depooled instances on labvirt1003 | [tools] | 
            
  | 19:06 | <YuviPanda> | re-enabled queues on exec nodes that were on labvirt1002 | [tools] | 
            
  | 19:02 | <YuviPanda> | failed over tools proxy to tools-proxy-02 | [tools] | 
            
  | 18:46 | <YuviPanda> | drained and disabled queues on all nodes on labvirt1002 | [tools] | 
            
  | 18:38 | <YuviPanda> | restarted all restartable jobs in instances on labvirt1001 and deleted all non-restartable ghost jobs. these were already dead | [tools] |