| 
      
        2019-02-20
      
     | 
  
    
  | 23:30 | 
  <zhuyifei1999_> | 
  begin rebuilding all docker images T178601 T193646 T215683 | 
  [tools] | 
            
  | 23:25 | 
  <zhuyifei1999_> | 
  upgraded toollabs-webservice on tools-bastion-02 to 0.44 (newly-built version) | 
  [tools] | 
            
  | 23:19 | 
  <zhuyifei1999_> | 
  this was built for stretch. hopefully it works for all distros | 
  [tools] | 
            
  | 23:17 | 
  <zhuyifei1999_> | 
  begin building new tools-webservice package T178601 T193646 T215683 | 
  [tools] | 
            
  | 21:57 | 
  <andrewbogott> | 
  moving tools-static-13 to a new virt host | 
  [tools] | 
            
  | 21:34 | 
  <andrewbogott> | 
  moving the tools-static IP from tools-static-13 to tools-static-12 | 
  [tools] | 
            
  | 19:17 | 
  <andrewbogott> | 
  moving tools-bastion-02 to labvirt1004 | 
  [tools] | 
            
  | 16:56 | 
  <andrewbogott> | 
  moving tools-paws-worker-1003 | 
  [tools] | 
            
  | 15:53 | 
  <andrewbogott> | 
  moving tools-worker-1017, tools-worker-1027, tools-worker-1028 | 
  [tools] | 
            
  | 15:03 | 
  <andrewbogott> | 
  moving tools-exec-1413 and tools-exec-1442 | 
  [tools] | 
            
  
    | 
      
        2019-02-16
      
     | 
  
    
  | 05:00 | 
  <zhuyifei1999_> | 
  fixed by restarting flannel. another puppet run simply started kubelet | 
  [tools] | 
            
  | 04:58 | 
  <zhuyifei1999_> | 
  puppet logs: https://phabricator.wikimedia.org/P8097. Docker is failing with 'Failed to load environment files: No such file or directory' | 
  [tools] | 
            
  | 04:52 | 
  <zhuyifei1999_> | 
  copied the resolv.conf from tools-k8s-master-01, removing secondary DNS to make sure puppet fixes that, and starting puppet | 
  [tools] | 
            
  | 04:48 | 
  <zhuyifei1999_> | 
  that host's resolv.conf is badly broken https://phabricator.wikimedia.org/P8096. The last Puppet run was at Thu Feb 14 15:21:09 UTC 2019 (2247 minutes ago) | 
  [tools] | 
            
  | 04:44 | 
  <zhuyifei1999_> | 
  puppet is also failing badly here: 'Error: Could not request certificate: getaddrinfo: Name or service not known' | 
  [tools] | 
            
  | 04:43 | 
  <zhuyifei1999_> | 
  this one has logs full of 'Can't contact LDAP server' | 
  [tools] | 
            
  | 04:41 | 
  <zhuyifei1999_> | 
  nslcd also broken on tools-worker-1005 | 
  [tools] | 
            
  | 04:34 | 
  <zhuyifei1999_> | 
  uncordon tools-worker-1014.tools.eqiad.wmflabs | 
  [tools] | 
            
  | 04:33 | 
  <zhuyifei1999_> | 
  the issue was, /var/run/nslcd/socket was somehow a directory, AFAICT | 
  [tools] | 
            
  | 04:31 | 
  <zhuyifei1999_> | 
  then started nslcd via systemctl and `id zhuyifei1999` returns correct results | 
  [tools] | 
            
  | 04:30 | 
  <zhuyifei1999_> | 
  `nslcd -nd` complains about 'nslcd: bind() to /var/run/nslcd/socket failed: Address already in use'. SIGTERMed a background nslcd, `rmdir /var/run/nslcd/socket`, and `nslcd -nd` seemingly starts to work | 
  [tools] | 
            
  | 04:23 | 
  <zhuyifei1999_> | 
  drained tools-worker-1014.tools.eqiad.wmflabs | 
  [tools] | 
            
  | 04:16 | 
  <zhuyifei1999_> | 
  logs: https://phabricator.wikimedia.org/P8095 | 
  [tools] | 
            
  | 04:14 | 
  <zhuyifei1999_> | 
  restarting nslcd on tools-worker-1014 in an attempt to fix that, service failed to start, looking into logs | 
  [tools] | 
            
  | 04:12 | 
  <zhuyifei1999_> | 
  restarting nscd on tools-worker-1014 in an attempt to fix seemingly-not-attached-to-LDAP | 
  [tools] | 
            
  
    | 
      
        2019-02-14
      
     | 
  
    
  | 21:57 | 
  <bd808> | 
  Deleted old tools-proxy-02 instance | 
  [tools] | 
            
  | 21:57 | 
  <bd808> | 
  Deleted old tools-proxy-01 instance | 
  [tools] | 
            
  | 21:56 | 
  <bd808> | 
  Deleted old tools-package-builder-01 instance | 
  [tools] | 
            
  | 20:57 | 
  <andrewbogott> | 
  rebooting tools-worker-1005 | 
  [tools] | 
            
  | 20:34 | 
  <andrewbogott> | 
  moving tools-exec-1409, tools-exec-1410, tools-exec-1414, tools-exec-1419 | 
  [tools] | 
            
  | 19:55 | 
  <andrewbogott> | 
  moving tools-webgrid-generic-1401 and tools-webgrid-lighttpd-1419 | 
  [tools] | 
            
  | 19:33 | 
  <andrewbogott> | 
  moving tools-checker-01 to labvirt1003 | 
  [tools] | 
            
  | 19:25 | 
  <andrewbogott> | 
  moving tools-elastic-02 to labvirt1003 | 
  [tools] | 
            
  | 19:11 | 
  <andrewbogott> | 
  moving tools-k8s-etcd-01 to labvirt1002 | 
  [tools] | 
            
  | 18:37 | 
  <andrewbogott> | 
  moving tools-exec-1418, tools-exec-1424 to labvirt1003 | 
  [tools] | 
            
  | 18:34 | 
  <andrewbogott> | 
  moving tools-webgrid-lighttpd-1404, tools-webgrid-lighttpd-1406, tools-webgrid-lighttpd-1410 to labvirt1002 | 
  [tools] | 
            
  | 17:35 | 
  <arturo> | 
  T215154 tools-sgebastion-07 now running systemd 239 and starts enforcing user limits | 
  [tools] |