| 
      
        2021-03-18
      
     | 
  
    
  | 19:24 | 
  <bstorm> | 
  set profile::toolforge::infrastructure across the entire project with login_server set on the bastion and exec node-related prefixes | 
  [tools] | 
            
  | 16:21 | 
  <andrewbogott> | 
  enabling puppet tools-wide | 
  [tools] | 
            
  | 16:20 | 
  <andrewbogott> | 
  disabling puppet tools-wide to test https://gerrit.wikimedia.org/r/c/operations/puppet/+/672456 | 
  [tools] | 
            
  | 16:19 | 
  <bstorm> | 
  added profile::toolforge::infrastructure class to puppetmaster T277756 | 
  [tools] | 
            
  | 04:12 | 
  <bstorm> | 
  rebooted tools-sgeexec-0935.tools.eqiad.wmflabs because it forgot how to LDAP... likely root cause of the issues tonight | 
  [tools] | 
            
  | 03:59 | 
  <bstorm> | 
  rebooting grid master. sorry for the cron spam | 
  [tools] | 
            
  | 03:49 | 
  <bstorm> | 
  restarting sssd on tools-sgegrid-master | 
  [tools] | 
            
  | 03:37 | 
  <bstorm> | 
  deleted a massive number of stuck jobs that misfired from the cron server | 
  [tools] | 
            
  | 03:35 | 
  <bstorm> | 
  rebooting tools-sgecron-01 to try to clear up the ldap-related errors coming out of it | 
  [tools] | 
            
  | 01:46 | 
  <bstorm> | 
  killed the toolschecker cron job, which had an LDAP error, and ran it again by hand | 
  [tools] |