| 
      
        2014-09-15
      
     | 
  
    
  | 21:41 | 
  <andrewbogott> | 
  migrating deployment-sentry2 to virt1002  | 
  [releng] | 
            
  | 21:40 | 
  <cscott> | 
  *skipped* deploy of OCG, due to deployment-salt issues | 
  [releng] | 
            
  | 21:36 | 
  <bd808> | 
  Trying to fix salt with `salt '*' service.restart salt-minion` | 
  [releng] | 
            
  | 21:32 | 
  <bd808> | 
  only hosts responding to salt in beta are deployment-mathoid, deployment-pdf01 and deployment-stream | 
  [releng] | 
            
  | 21:29 | 
  <bd808> | 
  salt calls failing in beta with errors like "This master address: 'salt' was previously resolvable but now fails to resolve!" | 
  [releng] | 
            
  | 21:19 | 
  <bd808> | 
  Added Matanya to under_NDA sudoers group (bug 70864) | 
  [releng] | 
            
  | 20:18 | 
  <hashar> | 
  restarted salt-master  | 
  [releng] | 
            
  | 19:50 | 
  <hashar> | 
  Killed a bunch of `python /usr/local/sbin/grain-ensure contains ...` and `/usr/bin/python /usr/bin/salt-call --out=json grains.append deployment_target scap` commands on deployment-bastion | 
  [releng] | 
            
  | 18:57 | 
  <hashar> | 
  scap breakage due to ferm is logged as https://bugzilla.wikimedia.org/show_bug.cgi?id=70858 | 
  [releng] | 
            
  | 18:48 | 
  <hashar> | 
  https://gerrit.wikimedia.org/r/#/c/160485/ tweaked a default ferm configuration file, which caused puppet to reload ferm. The resulting rules block ssh from other hosts, thus breaking rsync \\O/ | 
  [releng] | 
            
  | 18:37 | 
  <hashar> | 
  beta-scap-eqiad job has been broken since ~17:20 UTC https://integration.wikimedia.org/ci/job/beta-scap-eqiad/21680/console  || rsync: failed to connect to deployment-bastion.eqiad.wmflabs (10.68.16.58): Connection timed out (110) | 
  [releng] | 
            
  
    | 
      
        2014-09-11
      
     | 
  
    
  | 20:59 | 
  <spagewmf> | 
  https://integration.wikimedia.org/ci/ is down with 503 errors | 
  [releng] | 
            
  | 16:31 | 
  <YuviPanda> | 
  Deleted deployment-graphite instance | 
  [releng] | 
            
  | 16:13 | 
  <bd808> | 
  Now that scap is pointed to labmon1001.eqiad.wmnet the deployment-graphite.eqiad.wmflabs host can probably be deleted; it never really worked anyway | 
  [releng] | 
            
  | 16:12 | 
  <bd808> | 
  Updated scap to include I0f7f5cae72a87f68d861340d11632fb429c557b9 | 
  [releng] | 
            
  | 15:09 | 
  <bd808> | 
  Updated hhvm-luasandbox to latest version on mediawiki03 and verified that mediawiki0[12] were already updated | 
  [releng] | 
            
  | 15:01 | 
  <bd808> | 
  Fixed incorrect $::deployment_server_override var on deployment-videoscaler01; deployment-bastion.eqiad.wmflabs is correct and deployment-salt.eqiad.wmflabs is not | 
  [releng] | 
            
  | 10:05 | 
  <ori> | 
  deployment-prep upgraded luasandbox and hhvm across the cluster | 
  [releng] | 
            
  | 08:41 | 
  <spagewmf> | 
  deployment-mediawiki01/02 are not getting latest code  | 
  [releng] | 
            
  | 05:10 | 
  <bd808> | 
  Reverted cherry-pick of I621d14e4b75a8415b16077fb27ca956c4de4c4c3 in scap; not the actual problem | 
  [releng] | 
            
  | 05:02 | 
  <bd808> | 
  Cherry-picked I621d14e4b75a8415b16077fb27ca956c4de4c4c3 to scap to try to fix the l10n update issue | 
  [releng] | 
            
  | 02:29 | 
  <mutante> | 
  raised instance quota by 1 to 42 | 
  [releng] | 
            
  
    | 
      
        2014-09-10
      
     | 
  
    
  | 19:38 | 
  <bd808> | 
  Fixed beta-recompile-math-texvc-eqiad job on deployment-bastion | 
  [releng] | 
            
  | 19:38 | 
  <bd808> | 
  Made /usr/local/apache/common-local a symlink to /srv/mediawiki on deployment-bastion | 
  [releng] | 
            
  | 19:37 | 
  <bd808> | 
  Deleted old /srv/common-local on deployment-videoscaler01 | 
  [releng] | 
            
  | 19:32 | 
  <bd808> | 
  Killed jobs-loop.sh tasks on deployment-jobrunner01 | 
  [releng] | 
            
  | 19:30 | 
  <bd808> | 
  Removed old mw-job-runner cron job on deployment-jobrunner01 | 
  [releng] | 
            
  | 19:19 | 
  <bd808> | 
  Deleted /var/log/account/pacct* and /var/log/atop.log.* on deployment-jobrunner01 to make some temporary room in /var | 
  [releng] | 
            
  | 19:14 | 
  <bd808> | 
  Deleted /var/log/mediawiki/jobrunner.log and restarted jobrunner on deployment-jobrunner01 | 
  [releng] | 
            
  | 19:11 | 
  <bd808> | 
  /var full on deployment-jobrunner01 | 
  [releng] | 
            
  | 19:05 | 
  <bd808> | 
  Deleted /srv/common-local on deployment-jobrunner01 | 
  [releng] | 
            
  | 19:04 | 
  <bd808> | 
  Changed /usr/local/apache/common-local symlink to point to /srv/mediawiki on deployment-jobrunner01 | 
  [releng] | 
            
  | 19:03 | 
  <bd808> | 
  w00t!!! scap job is green again -- https://integration.wikimedia.org/ci/job/beta-scap-eqiad/20965/ | 
  [releng] | 
            
  | 19:00 | 
  <bd808> | 
  sync-common finished on deployment-jobrunner01; trying Jenkins scap job again | 
  [releng] | 
            
  | 18:53 | 
  <bd808> | 
  Removed symlink and made /srv/mediawiki a proper directory on deployment-jobrunner01; running sync-common to populate. | 
  [releng] | 
            
  | 18:45 | 
  <bd808> | 
  Made /srv/mediawiki a symlink to /srv/common-local on deployment-jobrunner01 | 
  [releng] | 
            
  | 10:20 | 
  <jeremyb> | 
  deployment-bastion /var at 97%, freed up ~500MB. apt-get clean && rm -rv /var/log/account/pacct* | 
  [releng] | 
            
  | 10:17 | 
  <jeremyb> | 
  deployment-bastion good puppet run | 
  [releng] | 
            
  | 10:16 | 
  <jeremyb> | 
  deployment-salt had an oom-kill recently, and some box (maybe master, maybe client?) had a disk fill up | 
  [releng] | 
            
  | 10:15 | 
  <jeremyb> | 
  deployment-mediawiki0[12] both had good puppet runs | 
  [releng] | 
            
  | 10:15 | 
  <jeremyb> | 
  deployment-salt started puppetmaster && puppet run | 
  [releng] | 
            
  | 10:14 | 
  <jeremyb> | 
  deployment-bastion killed puppet lock | 
  [releng] | 
            
  | 08:14 | 
  <Krinkle> | 
  bits.beta.wmflabs.org is down with 503 Service Unavailable (http://bits.beta.wmflabs.org/en.wikipedia.beta.wmflabs.org/load.php) | 
  [releng] | 
            
  | 03:04 | 
  <bd808> | 
  Ori made puppet changes that moved the MediaWiki install dir to /srv/mediawiki (https://gerrit.wikimedia.org/r/#/c/159431/). I didn't see that in SAL so I'm adding it here. | 
  [releng] |