| 2019-02-16 |
    
  | 05:00 | <zhuyifei1999_> | fixed by restarting flannel. another puppet run simply started kubelet | [tools] | 
            
  | 04:58 | <zhuyifei1999_> | puppet logs: https://phabricator.wikimedia.org/P8097. Docker is failing with 'Failed to load environment files: No such file or directory' | [tools] | 
            
  | 04:52 | <zhuyifei1999_> | copied the resolv.conf from tools-k8s-master-01, removing secondary DNS to make sure puppet fixes that, and starting puppet | [tools] | 
            
  | 04:48 | <zhuyifei1999_> | that host's resolv.conf is badly broken https://phabricator.wikimedia.org/P8096. The last Puppet run was at Thu Feb 14 15:21:09 UTC 2019 (2247 minutes ago) | [tools] | 
            
  | 04:44 | <zhuyifei1999_> | puppet is also failing badly here: 'Error: Could not request certificate: getaddrinfo: Name or service not known' | [tools] | 
            
  | 04:43 | <zhuyifei1999_> | this one has logs full of 'Can't contact LDAP server' | [tools] | 
            
  | 04:41 | <zhuyifei1999_> | nslcd also broken on tools-worker-1005 | [tools] | 
            
  | 04:34 | <zhuyifei1999_> | uncordon tools-worker-1014.tools.eqiad.wmflabs | [tools] | 
            
  | 04:33 | <zhuyifei1999_> | the issue was, /var/run/nslcd/socket was somehow a directory, AFAICT | [tools] | 
            
  | 04:31 | <zhuyifei1999_> | then started nslcd via systemctl and `id zhuyifei1999` returns correct results | [tools] | 
            
  | 04:30 | <zhuyifei1999_> | `nslcd -nd` complains about 'nslcd: bind() to /var/run/nslcd/socket failed: Address already in use'. SIGTERMed a background nslcd, `rmdir /var/run/nslcd/socket`, and `nslcd -nd` seemingly starts to work | [tools] | 
            
  | 04:23 | <zhuyifei1999_> | drained tools-worker-1014.tools.eqiad.wmflabs | [tools] | 
            
  | 04:16 | <zhuyifei1999_> | logs: https://phabricator.wikimedia.org/P8095 | [tools] | 
            
  | 04:14 | <zhuyifei1999_> | restarting nslcd on tools-worker-1014 in an attempt to fix that, service failed to start, looking into logs | [tools] | 
            
  | 04:12 | <zhuyifei1999_> | restarting nscd on tools-worker-1014 in an attempt to fix seemingly-not-attached-to-LDAP | [tools] | 
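
The 04:30 and 04:33 entries above trace the nslcd failure to /var/run/nslcd/socket being a directory rather than a Unix socket. A minimal sketch of how that condition could be checked before restarting the service follows. The helper name `describe_path` is hypothetical and not part of any tool mentioned in this log; the path is the one quoted in the entries.

```python
import os
import stat


def describe_path(path):
    """Return a short label for the kind of filesystem object at `path`.

    Hypothetical helper for illustration only. It uses lstat so that a
    symlink is reported as "other" instead of being followed.
    """
    try:
        mode = os.lstat(path).st_mode
    except FileNotFoundError:
        return "missing"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISDIR(mode):
        return "directory"
    return "other"


if __name__ == "__main__":
    # Path taken from the log entries; a healthy nslcd is assumed to
    # leave a Unix socket here, so "directory" would match the failure
    # described at 04:33.
    print(describe_path("/var/run/nslcd/socket"))
```

If this reports "directory", the log's remedy was to stop the stray nslcd process, `rmdir` the path, and start nslcd again.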
            
  
    | 2019-02-14 |
    
  | 21:57 | <bd808> | Deleted old tools-proxy-02 instance | [tools] | 
            
  | 21:57 | <bd808> | Deleted old tools-proxy-01 instance | [tools] | 
            
  | 21:56 | <bd808> | Deleted old tools-package-builder-01 instance | [tools] | 
            
  | 20:57 | <andrewbogott> | rebooting tools-worker-1005 | [tools] | 
            
  | 20:34 | <andrewbogott> | moving tools-exec-1409, tools-exec-1410, tools-exec-1414, tools-exec-1419 | [tools] | 
            
  | 19:55 | <andrewbogott> | moving tools-webgrid-generic-1401 and tools-webgrid-lighttpd-1419 | [tools] | 
            
  | 19:33 | <andrewbogott> | moving tools-checker-01 to labvirt1003 | [tools] | 
            
  | 19:25 | <andrewbogott> | moving tools-elastic-02 to labvirt1003 | [tools] | 
            
  | 19:11 | <andrewbogott> | moving tools-k8s-etcd-01 to labvirt1002 | [tools] | 
            
  | 18:37 | <andrewbogott> | moving tools-exec-1418, tools-exec-1424 to labvirt1003 | [tools] | 
            
  | 18:34 | <andrewbogott> | moving tools-webgrid-lighttpd-1404, tools-webgrid-lighttpd-1406, tools-webgrid-lighttpd-1410 to labvirt1002 | [tools] | 
            
  | 17:35 | <arturo> | T215154 tools-sgebastion-07 now running systemd 239 and starts enforcing user limits | [tools] | 
            
  | 15:33 | <andrewbogott> | moving tools-worker-1002, 1003, 1005, 1006, 1007, 1010, 1013, 1014 to different labvirts in order to move labvirt1012 to eqiad1-r | [tools] | 
            
  
    | 2019-02-11 |
    
  | 22:57 | <bd808> | Shut off tools-webgrid-lighttpd-14{01,13,24,26,27,28} via Horizon UI | [tools] | 
            
  | 22:34 | <bd808> | Decommissioned tools-webgrid-lighttpd-14{01,13,24,26,27,28} | [tools] | 
            
  | 22:23 | <bd808> | sudo exec-manage depool tools-webgrid-lighttpd-1401.tools.eqiad.wmflabs | [tools] | 
            
  | 22:21 | <bd808> | sudo exec-manage depool tools-webgrid-lighttpd-1413.tools.eqiad.wmflabs | [tools] | 
            
  | 22:18 | <bd808> | sudo exec-manage depool tools-webgrid-lighttpd-1428.tools.eqiad.wmflabs | [tools] | 
            
  | 22:07 | <bd808> | sudo exec-manage depool tools-webgrid-lighttpd-1427.tools.eqiad.wmflabs | [tools] | 
            
  | 22:06 | <bd808> | sudo exec-manage depool tools-webgrid-lighttpd-1424.tools.eqiad.wmflabs | [tools] | 
            
  | 22:05 | <bd808> | sudo exec-manage depool tools-webgrid-lighttpd-1426.tools.eqiad.wmflabs | [tools] | 
            
  | 20:06 | <bstorm_> | Ran apt-get clean on tools-sgebastion-07 since it was running out of disk (and lots of it was the apt cache) | [tools] | 
            
  | 19:08 | <bd808> | Upgraded tools-manifest on tools-cron-01 to v0.19 (T107878) | [tools] | 
            
  | 18:57 | <bd808> | Upgraded tools-manifest on tools-sgecron-01 to v0.19 (T107878) | [tools] | 
            
  | 18:57 | <bd808> | Built tools-manifest_0.19_all.deb and published to aptly repos (T107878) | [tools] | 
            
  | 18:26 | <bd808> | Upgraded tools-manifest on tools-sgecron-01 to v0.18 (T107878) | [tools] | 
            
  | 18:25 | <bd808> | Built tools-manifest_0.18_all.deb and published to aptly repos (T107878) | [tools] | 
            
  | 18:12 | <bd808> | Upgraded tools-manifest on tools-sgecron-01 to v0.17 (T107878) | [tools] |