| 2015-04-03
      
      | 
    
  | 03:58 | <Krinkle> | Jobs were throwing NOT_RECOGNISED.  Relaunched Gearman. Jobs are now happy again. | [releng] | 
            
  | 03:51 | <Krinkle> | Jenkins is unable to re-establish Gearman connection. Have to force restart Jenkins master. | [releng] | 
            
  | 03:44 | <greg-g> | *unable | [releng] | 
            
  | 03:44 | <Krinkle> | References to past hour of builds have been restored. But Jenkins is still enable to make new references properly. New builds are 404'ing the same way. | [releng] | 
            
  | 03:42 | <Krinkle> | Reloading Jenkins config repaired the broken references. Build urls are now resolving again. | [releng] | 
            
  | 03:26 | <Krinkle> | Reloading Jenkins configuration from disk to mitigate | [releng] | 
            
  | 03:18 | <Krinkle> | The failure started at 03:03 exactly. The newer build metadata exists at /var/lib/jenkins/jobs/:jobname/builds/:nr, but the jobs/*/last*Build symlinks are no longer updated. | [releng] | 
            
  | 02:47 | <Krinkle> | Reloading Zuul to deploy https://gerrit.wikimedia.org/r/201644 | [releng] | 
            
  | 00:31 | <greg-g> | rm'd .gitignore in /srv/mediawiki-staging/php-master/skins due to https://gerrit.wikimedia.org/r/#/c/200307/ clashing with a local untracked version | [releng] | 
            
  
    | 2015-04-02
      
      | 
    
  | 22:56 | <Krinkle> | New integration-slave-precise-101x are unfinished and must remain depooled. See T94916. | [releng] | 
            
  | 22:53 | <Krinkle> | Most puppet failures blocking T94916 may be caused by the fact that integration-puppetmaster was inadvertently changed to Trusty; the puppetmaster version in Trusty is not yet supported by ops | [releng] | 
            
  | 21:41 | <Krinkle> | It seems integration-slave-jessie-1001 has role::ci::slave::labs::common instead of role::ci::slave::labs. Intentional? | [releng] | 
            
  | 21:25 | <Krinkle> | Re-creating integration-dev-slave-precise in preparation of re-creating precise slaves | [releng] | 
            
  | 14:51 | <hashar> | applying role::ci::slave::labs::common on integration-slave-jessie-1001 | [releng] | 
            
  | 14:49 | <hashar> | integration: nice thing, newly created instances are automatically made to point to integration-puppetmaster via hiera! Just have to sign the certificate on the master using: puppet ca list ; puppet ca sign i-000xxxx.eqiad.wmflabs | [releng] | 
            
  | 14:42 | <hashar> | Created [[Nova_Resource:I-00000a3b.eqiad.wmflabs|integration-slave-jessie-1001]] to try out CI slave on Jessie ([[T94836]]) | [releng] | 
            
  | 14:11 | <hashar> | reduced integration-slave1004 executors from 6 to 5 to make it on par with the other precise slaves | [releng] | 
            
  | 14:10 | <hashar> | integration-slave100[1-4] are now using Zuul provided by a Debian package as of https://gerrit.wikimedia.org/r/#/c/195272/ PS 16 | [releng] | 
            
  | 14:04 | <hashar> | uninstall the pip installed zuul version from Precise labs slaves by doing:  pip uninstall zuul && rm /usr/local/bin/zuul* . Switching them all to a Debian package | [releng] | 
            
  | 13:45 | <hashar> | pooling back integration-slave1001 and 1002 which are using zuul-cloner provided by a debian package | [releng] | 
            
  | 13:35 | <hashar> | reloading Jenkins configuration files from disk to make it aware of a change manually applied to most jobs' config.xml files for https://gerrit.wikimedia.org/r/#/c/201451/ | [releng] | 
            
  | 13:01 | <Krinkle> | Reloading Zuul to deploy https://gerrit.wikimedia.org/r/201458 | [releng] | 
            
  | 12:19 | <hashar> | preventing jobs from running on integration-slave1001 by replacing its label with 'DoNotLabelThisSlaveHashar'. Going to install Zuul debian package on it | [releng] | 
            
  | 09:37 | <hashar> | rebooting integration-zuul-server; homedir seems to be stalled/missing | [releng] | 
            
  | 08:12 | <hashar> | upgrading packages on integration-dev | [releng] | 
            
  | 05:14 | <greg-g> | and right when I log'd that, things seem to be recovering | [releng] | 
            
  | 05:12 | <greg-g> | the shinken alerts about beta cluster issues are due to wmflabs having issues. | [releng] | 
            
  
    | 2015-04-01
      
      | 
    
  | 07:17 | <Krinkle> | Creating integration-slave1410 as test. Will re-create our pool later today. | [releng] | 
            
  | 06:26 | <Krinkle> | Apply puppetmaster::autosigner to integration-puppetmaster | [releng] | 
            
  | 05:51 | <legoktm> | deleting non-existent job workspaces from integration slaves | [releng] | 
            
  | 05:42 | <Krinkle> | Free up space on integration-slave1001-1004 by removing obsolete phplint and qunit workspaces | [releng] | 
            
  | 02:05 | <Krinkle> | Restarting Jenkins again.. | [releng] | 
            
  | 01:35 | <legoktm> | started zuul on gallium | [releng] | 
            
  | 01:00 | <Krinkle> | Restarting Jenkins | [releng] | 
            
  | 01:00 | <Krinkle> | Jenkins is unable to start Gearman connection (HTTP 503) | [releng] | 
            
  | 01:00 | <Krinkle> | Force restarted Zuul, didn't help | [releng] | 
            
  | 00:55 | <Krinkle> | Jenkins stuck. Builds are queued in Zuul but nothing is sent to Jenkins. | [releng] | 
            
  
    | 2015-03-30
      
      | 
    
  | 22:58 | <legoktm> | 1001-1003 were depooled, restarted and repooled. 1004 is depooled and restarted | [releng] | 
            
  | 22:40 | <legoktm> | rebooting precise jenkins slaves | [releng] | 
            
  | 22:33 | <Josve05a> | manually start mysql on db1 and db2 | [releng] | 
            
  | 21:57 | <YuviPanda> | reboot all instances from virt1000 | [releng] | 
            
  | 21:40 | <greg-g> | Beta Cluster is down due to WMF Labs issues, being taken care of now (by Coren and Yuvi) | [releng] | 
            
  | 19:53 | <legoktm> | deleted core dumps from integration-slave1001 | [releng] | 
            
  | 19:11 | <legoktm> | deploying https://gerrit.wikimedia.org/r/200646 | [releng] | 
            
  | 16:29 | <jzerebecki> | another damaged git repo: integration-slave1001:~$ sudo -u jenkins-deploy rm -rf /mnt/jenkins-workspace/workspace/mwext-Wikibase-qunit/src/vendor/ | [releng] |