| 
      
        2014-09-04
      
      §
     | 
  
    
  | 16:06 | 
  <bd808> | 
  Manually cleaned bogus LocalRenameUserJob jobs from redis | 
  [releng] | 
            
  | 13:54 | 
  <_joe_> | 
  stopped puppet on all the appservers except mw03, testing an apache change | 
  [releng] | 
            
  | 05:28 | 
  <legoktm> | 
  stopping jobrunner on deployment-jobrunner01 | 
  [releng] | 
            
  | 05:22 | 
  <legoktm> | 
  restarted jobrunner on deployment-jobrunner01 | 
  [releng] | 
            
  | 05:14 | 
  <bd808> | 
  Bad jobs in job queue filled up /var on jobrunner01 and killed jobrunner script. Leaving down for now until I find out how to delete the bad jobs. | 
  [releng] | 
            
  | 01:41 | 
  <bd808> | 
  Killed old jobs-loop.sh processes on deployment-jobrunner01 | 
  [releng] | 
            
  | 01:24 | 
  <bd808> | 
  Many jobrunner errors like "wikiversions-labs.cdb has no version entry for `amwiki`" with various wiki names | 
  [releng] | 
            
  | 01:23 | 
  <bd808|AWAY> | 
  Started jobrunner service manually on jobrunner01. | 
  [releng] | 
            
  | 00:44 | 
  <bd808> | 
  Puppet run on deployment-jobrunner01 failing with what seem to be dns issues (getaddrinfo: Name or service not known when Trebuchet is running) | 
  [releng] | 
            
  | 00:35 | 
  <bd808> | 
  Puppet run on deployment-jobrunner01 failing with what seem to be dns issues (getaddrinfo: Name or service not known) | 
  [releng] | 
            
  
    | 
      
        2014-08-27
      
      §
     | 
  
    
  | 23:03 | 
  <hashar> | 
  Blacklisting the security audit IP again on deployment-cache bits01, mobile03 and text02 | 
  [releng] | 
            
  | 22:53 | 
  <hashar> | 
  removed the blackhole ip route from deployment-cache-text02 and deployment-cache-mobile03 | 
  [releng] | 
            
  | 22:48 | 
  <hashar> | 
  the IP belongs to a known security audit. See Chris Steipp. | 
  [releng] | 
            
  | 22:46 | 
  <hashar> | 
  blackholed an IP address on deployment-cache-text02 and deployment-cache-mobile03; it was causing hundreds of requests per second and overloaded the beta cluster. Use route -n to find the IP | 
  [releng] | 
            
  | 22:37 | 
  <hashar> | 
  restarting udp2log-mw on deployment-bastion. It has been crashing repeatedly since fairly recently | 
  [releng] | 
            
  | 22:26 | 
  <bd808> | 
  when restarting varnish on deployment-cache-text02, don't forget that there are 2 varnish services (varnish and varnish-frontend) | 
  [releng] | 
            
  | 22:19 | 
  <bd808> | 
  restarted varnish (again) on deployment-cache-text02 | 
  [releng] | 
            
  | 22:10 | 
  <bd808> | 
  restarted varnish on deployment-cache-text02 | 
  [releng] | 
            
  | 16:22 | 
  <bd808> | 
  killing `apt-get update` process running on deployment-bastion since Jun 13 | 
  [releng] | 
            
  | 14:59 | 
  <bd808> | 
  Resolved puppet git merge conflict on deployment-salt | 
  [releng] | 
            
  | 14:49 | 
  <bd808> | 
  Moved hhvm core dumps to /data/project/hhvm-cores | 
  [releng] | 
            
  | 14:42 | 
  <bd808> | 
  Root drive full on deployment-mediawiki02; hhvm core files are the culprit | 
  [releng] |