| 
      
        2017-07-20
      
     | 
  
    
  | 16:42 | 
  <hashar> | 
  How to fix ssh access on beta cluster instances: https://phabricator.wikimedia.org/T171174#3456966 | 
  [releng] | 
            
  | 15:30 | 
  <hashar> | 
  deployment-prep: removing project-wide puppet classes from https://horizon.wikimedia.org/project/puppet/ . All are role::eventlogging::analytics::* | 
  [releng] | 
            
  | 15:08 | 
  <hashar> | 
  Removed profile::recommendation_api from deployment-sca01 to try to fix ssh access for mobrovac - T171173, T171174 | 
  [releng] | 
            
  | 14:57 | 
  <zeljkof> | 
  reloading Zuul to deploy 80b9d85 | 
  [releng] | 
            
  | 14:31 | 
  <hashar> | 
  deployment-prep: manually cleaned out the puppet master configuration. It was all screwed up. Notably, I removed bits about the puppetdb | 
  [releng] | 
            
  | 10:20 | 
  <zeljkof> | 
  Reloading Zuul to deploy 80b9d855443a2f572d877b280783110684344c5d | 
  [releng] | 
            
  | 09:17 | 
  <hashar> | 
  Spawning and pooling integration-slave-docker-1003 as a replacement for integration-slave-docker-1000 (broken) - T150502 | 
  [releng] | 
            
  | 09:03 | 
  <hashar> | 
  Restoring castor by updating all jobs to point to castor02 ( https://gerrit.wikimedia.org/r/366524 ). Starts with a cold cache :( - T171148 | 
  [releng] | 
            
  | 08:53 | 
  <hashar> | 
  Created castor02.integration.eqiad.wmflabs with puppet role role::ci::castor::server and adding it to Jenkins. Will then update the Jenkins jobs to point to it - T171148 | 
  [releng] | 
            
  | 08:00 | 
  <hashar> | 
  Disabled castor entirely via https://gerrit.wikimedia.org/r/366520 . The instance is broken - T171148 | 
  [releng] | 
            
  | 07:55 | 
  <hashar> | 
  Refreshing all Jenkins jobs defined in JJB in order to then disable castor entirely for T171148 | 
  [releng] | 
            
  | 07:09 | 
  <_joe_> | 
  rebooting castor, jobs are failing, and no one seems able to log in | 
  [releng] | 
            
  | 07:05 | 
  <_joe_> | 
  adding myself to projectadmins for integration, trying to troubleshoot castor | 
  [releng] | 
            
  | 01:38 | 
  <thcipriani> | 
  scap on beta was failing because, during the LDAP downtime, puppet created a shadow mwdeploy user; fixed using vipw and vigr | 
  [releng] |