| 
      
        2015-04-03
      
     | 
  
    
  | 10:21 | 
  <hashar> | 
  downgrading integration-puppetmaster from Trusty to Precise https://phabricator.wikimedia.org/T94927 | 
  [releng] | 
            
  | 05:42 | 
  <legoktm> | 
  deploying https://gerrit.wikimedia.org/r/200744 | 
  [releng] | 
            
  | 03:58 | 
  <Krinkle> | 
  Jobs were throwing NOT_RECOGNISED.  Relaunched Gearman. Jobs are now happy again. | 
  [releng] | 
            
  | 03:51 | 
  <Krinkle> | 
  Jenkins is unable to re-establish Gearman connection. Have to force restart Jenkins master. | 
  [releng] | 
            
  | 03:44 | 
  <greg-g> | 
  *unable | 
  [releng] | 
            
  | 03:44 | 
  <Krinkle> | 
  References to past hour of builds have been restored. But Jenkins is still enable to make new references properly. New builds are 404'ing the same way. | 
  [releng] | 
            
  | 03:42 | 
  <Krinkle> | 
  Reloading Jenkins config repaired the broken references. Build URLs are now resolving again. | 
  [releng] | 
            
  | 03:26 | 
  <Krinkle> | 
  Reloading Jenkins configuration from disk to mitigate  | 
  [releng] | 
            
  | 03:18 | 
  <Krinkle> | 
  The failure started at 03:03 exactly. The newer build metadata exists at /var/lib/jenkins/jobs/:jobname/builds/:nr, but the jobs/*/last*Build symlinks are no longer updated. | 
  [releng] | 
            
  | 02:47 | 
  <Krinkle> | 
  Reloading Zuul to deploy https://gerrit.wikimedia.org/r/201644 | 
  [releng] | 
            
  | 00:31 | 
  <greg-g> | 
  rm'd .gitignore in /srv/mediawiki-staging/php-master/skins due to https://gerrit.wikimedia.org/r/#/c/200307/ clashing with a local untracked version | 
  [releng] | 
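
The 03:18 entry above notes that build metadata kept appearing under jobs/:jobname/builds/:nr while the jobs/*/last*Build symlinks stopped being updated. A minimal sketch of how that symptom could be detected is shown here. It is only a heuristic and not what was run at the time: it assumes build directories are named by build number and that each last*Build symlink points at one of them. The function names and the `max_lag` parameter are hypothetical.

```python
import os
from pathlib import Path


def newest_build_number(job_dir):
    """Return the highest numeric entry under job_dir/builds, or None."""
    builds = job_dir / "builds"
    if not builds.is_dir():
        return None
    numbers = [int(p.name) for p in builds.iterdir() if p.name.isdigit()]
    return max(numbers, default=None)


def symlink_build_numbers(job_dir):
    """Map each last*Build symlink in job_dir to the build number it targets."""
    result = {}
    for link in job_dir.glob("last*Build"):
        if not link.is_symlink():
            continue
        # Assumption: the link target ends in the build number, e.g. builds/42.
        target = os.path.basename(os.readlink(link))
        result[link.name] = int(target) if target.isdigit() else None
    return result


def stale_jobs(jobs_root, max_lag=0):
    """List (job, newest build, newest linked build) where the links lag behind.

    A job is reported when its newest build number exceeds the highest build
    number referenced by any last*Build symlink by more than max_lag.
    """
    stale = []
    for job_dir in sorted(p for p in Path(jobs_root).iterdir() if p.is_dir()):
        newest = newest_build_number(job_dir)
        targets = [n for n in symlink_build_numbers(job_dir).values()
                   if n is not None]
        if newest is None or not targets:
            continue
        newest_linked = max(targets)
        if newest - newest_linked > max_lag:
            stale.append((job_dir.name, newest, newest_linked))
    return stale
```

Calling `stale_jobs(Path("/var/lib/jenkins/jobs"))` would list jobs whose symlinks trail the newest build. A build that is still running can legitimately be ahead of the symlinks, so a small `max_lag` may be needed to avoid false positives.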
            
  
    | 
      
        2015-04-02
      
     | 
  
    
  | 22:56 | 
  <Krinkle> | 
  New integration-slave-precise-101x are unfinished and must remain depooled. See T94916. | 
  [releng] | 
            
  | 22:53 | 
  <Krinkle> | 
  Most puppet failures blocking T94916 may be caused by the fact that integration-puppetmaster was inadvertently changed to Trusty; the Trusty version of puppetmaster is not yet supported by ops | 
  [releng] | 
            
  | 21:41 | 
  <Krinkle> | 
  It seems integration-slave-jessie-1001 has role::ci::slave::labs::common instead of role::ci::slave::labs. Intentional? | 
  [releng] | 
            
  | 21:25 | 
  <Krinkle> | 
  Re-creating integration-dev-slave-precise in preparation of re-creating precise slaves | 
  [releng] | 
            
  | 14:51 | 
  <hashar> | 
  applying role::ci::slave::labs::common on integration-slave-jessie-1001 | 
  [releng] | 
            
  | 14:49 | 
  <hashar> | 
  integration: nice thing, newly created instances are automatically made to point to integration-puppetmaster via hiera! Just have to sign the certificate on the master using: puppet ca list ; puppet ca sign i-000xxxx.eqiad.wmflabs  | 
  [releng] | 
            
  | 14:42 | 
  <hashar> | 
  Created [[Nova_Resource:I-00000a3b.eqiad.wmflabs|integration-slave-jessie-1001]] to try out CI slave on Jessie ([[T94836]]) | 
  [releng] | 
            
  | 14:11 | 
  <hashar> | 
  reduced integration-slave1004 executors from 6 to 5 to make it on par with the other precise slaves | 
  [releng] | 
            
  | 14:10 | 
  <hashar> | 
  integration-slave100[1-4] are now using Zuul provided by a Debian package as of https://gerrit.wikimedia.org/r/#/c/195272/ PS 16 | 
  [releng] | 
            
  | 14:04 | 
  <hashar> | 
  uninstalling the pip-installed Zuul version from Precise labs slaves by doing:  pip uninstall zuul && rm /usr/local/bin/zuul* . Switching them all to a Debian package | 
  [releng] | 
            
  | 13:45 | 
  <hashar> | 
  pooling back integration-slave1001 and 1002, which are using zuul-cloner provided by a Debian package | 
  [releng] | 
            
  | 13:35 | 
  <hashar> | 
  reloading Jenkins configuration files from disk to make it aware of a change manually applied to most jobs' config.xml files for https://gerrit.wikimedia.org/r/#/c/201451/ | 
  [releng] | 
            
  | 13:01 | 
  <Krinkle> | 
  Reloading Zuul to deploy https://gerrit.wikimedia.org/r/201458  | 
  [releng] | 
            
  | 12:19 | 
  <hashar> | 
  preventing jobs from running on integration-slave1001 by replacing its label with 'DoNotLabelThisSlaveHashar'. Going to install Zuul Debian package on it | 
  [releng] | 
            
  | 09:37 | 
  <hashar> | 
  rebooting integration-zuul-server: homedir seems to be stalled/missing | 
  [releng] | 
            
  | 08:12 | 
  <hashar> | 
  upgrading packages on integration-dev | 
  [releng] | 
            
  | 05:14 | 
  <greg-g> | 
  and right when I log'd that, things seem to be recovering | 
  [releng] | 
            
  | 05:12 | 
  <greg-g> | 
  the shinken alerts about beta cluster issues are due to wmflabs having issues. | 
  [releng] | 
            
  
    | 
      
        2015-04-01
      
     | 
  
    
  | 07:17 | 
  <Krinkle> | 
  Creating integration-slave1410 as test. Will re-create our pool later today. | 
  [releng] | 
            
  | 06:26 | 
  <Krinkle> | 
  Apply puppetmaster::autosigner to integration-puppetmaster | 
  [releng] | 
            
  | 05:51 | 
  <legoktm> | 
  deleting non-existent job workspaces from integration slaves | 
  [releng] | 
            
  | 05:42 | 
  <Krinkle> | 
  Free up space on integration-slave1001-1004 by removing obsolete phplint and qunit workspaces | 
  [releng] | 
            
  | 02:05 | 
  <Krinkle> | 
  Restarting Jenkins again.. | 
  [releng] | 
            
  | 01:35 | 
  <legoktm> | 
  started zuul on gallium | 
  [releng] | 
            
  | 01:00 | 
  <Krinkle> | 
  Restarting Jenkins | 
  [releng] | 
            
  | 01:00 | 
  <Krinkle> | 
  Jenkins is unable to start Gearman connection (HTTP 503);  | 
  [releng] | 
            
  | 01:00 | 
  <Krinkle> | 
  Force restarted Zuul, didn't help | 
  [releng] | 
            
  | 00:55 | 
  <Krinkle> | 
  Jenkins stuck. Builds are queued in Zuul but nothing is sent to Jenkins. | 
  [releng] |
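
The 05:42 and 05:51 entries above describe freeing disk space by removing workspaces of jobs that no longer exist. A minimal sketch of how such leftover directories might be identified follows. It only lists candidates and deletes nothing. It assumes one workspace directory per job name, with Jenkins-style `@` suffixes (such as `@2` for concurrent builds) belonging to the job named before the `@`. The function name and both parameters are hypothetical, not taken from the log.

```python
from pathlib import Path


def orphaned_workspaces(workspace_root, known_jobs):
    """Return workspace directories whose job name is not in known_jobs.

    workspace_root: directory holding one sub-directory per job workspace.
    known_jobs: collection of job names that still exist on the master.
    """
    known = set(known_jobs)
    orphans = []
    for path in Path(workspace_root).iterdir():
        if not path.is_dir():
            continue
        # Assumption: "name@2" or "name@tmp" belongs to the job called "name".
        job_name = path.name.split("@", 1)[0]
        if job_name not in known:
            orphans.append(path)
    return sorted(orphans)
```

The returned list can be reviewed by hand before anything is removed, which keeps the destructive step separate from the detection step.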