2016-07-13 §
11:19 <yuvipanda> drained tools-worker-1004 - high ksoftirqd usage even with no load [tools]
11:13 <yuvipanda> depool tools-worker-1014 - unusable, totally in iowait [tools]
11:13 <yuvipanda> reboot tools-worker-1004, was unresponsive [tools]
2016-07-12 §
18:07 <yuvipanda> reboot tools-worker-1012, it seems to have failed LDAP connectivity :| [tools]
2016-07-08 §
12:38 <yuvipanda> starting up tools-web-static-02 again [tools]
2016-07-07 §
12:45 <yuvipanda> start deployment of k8s 1.3.0wmf4 for T139259 [tools]
2016-07-06 §
13:09 <yuvipanda> associated a floating IP with tools-k8s-master-01 for T139461 [tools]
11:47 <yuvipanda> moved tools-checker-0[12] to use tools-puppetmaster-01 as puppetmaster so they get appropriate CA for use when talking to kubernetes API [tools]
2016-07-04 §
11:13 <yuvipanda> delete tools-prometheus-01 to free up resources on labvirt1010 [tools]
11:11 <yuvipanda> actually deleted instance tools-cron-02 to free up resources on labvirt1010 - was large and not currently used, and failover process takes a while anyway, so we can recreate if needed [tools]
11:10 <yuvipanda> stopped instance tools-cron-02 to free up some resources on labvirt1010 [tools]
2016-07-03 §
17:09 <yuvipanda> run qstat -u '*' | grep 'dr ' | awk '{ print $1;}' | xargs -L1 qdel -f to clean out jobs stuck in dr state [tools]
16:58 <yuvipanda> migrate tools-web-static-02 to labvirt1011 to provide more breathing room [tools]
16:56 <yuvipanda> delete temp-test-trusty-package to provide more breathing room on labvirt1010 [tools]
13:49 <yuvipanda> reboot tools-exec-1219 [tools]
13:37 <yuvipanda> migrating tools-exec-1216 to labvirt1011 [tools]
13:07 <yuvipanda> delete tools-bastion-01 which was shut down anyway [tools]
13:04 <yuvipanda> attempt to reboot tools-exec-1212 [tools]
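Note on the 17:09 entry above: the pipeline picks job IDs out of `qstat` output with `grep 'dr '`, which matches that string anywhere on the line. A slightly stricter variant is sketched below. It assumes the default grid engine `qstat` column layout, with the job state in column 5, and the helper name `stuck_job_ids` is hypothetical, not something from the log. It only prints IDs, so the list can be reviewed before anything is deleted.

```shell
# Read `qstat -u '*'` output on stdin and print the IDs of jobs whose
# state is exactly "dr" (a running job that is marked for deletion).
# Assumption: default qstat layout, job ID in column 1 and state in column 5.
stuck_job_ids() {
  awk '$5 == "dr" { print $1 }'
}

# Dry run, lists the job IDs that would be affected:
#   qstat -u '*' | stuck_job_ids
# Force-delete them, one qdel call per job as in the logged command:
#   qstat -u '*' | stuck_job_ids | xargs -L1 qdel -f
```

Matching on the state column avoids picking up lines where "dr " happens to appear in a job or user name, at the cost of depending on the column order.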
2016-06-28 §
15:25 <bd808> Signed client cert for tools-worker-1019.tools.eqiad.wmflabs on tools-puppetmaster-01.tools.eqiad.wmflabs [tools]
2016-06-21 §
16:49 <bd808> Updated jobutils to v1.14 for T138178 [tools]
2016-06-17 §
06:17 <yuvipanda> forced deletion of 7033590 for dykbot for shubinator [tools]
2016-06-09 §
00:19 <mutante> is a valid project [tools]
2016-06-08 §
20:31 <yuvipanda> start tools-bastion-03, which was stuck in 'stopped' state [tools]
20:31 <yuvipanda> reboot tools-bastion-03 [tools]
2016-05-31 §
17:35 <valhallasw`cloud> re-enabled queues on tools-exec-1407, tools-exec-1216, tools-exec-1219 [tools]
13:13 <chasemp> reboot of tools-exec-1203 (see T136495); all jobs seem gone now [tools]
2016-05-30 §
13:06 <valhallasw`cloud> rebooting tools-exec-1221 [tools]
11:53 <godog> cherry-pick https://gerrit.wikimedia.org/r/#/c/280652 https://gerrit.wikimedia.org/r/#/c/290479 https://gerrit.wikimedia.org/r/#/c/291710/ on tools-puppetmaster-01 [tools]
2016-05-29 §
18:58 <YuviPanda> deleted tools-k8s-bastion-01 for T136496 [tools]
14:29 <valhallasw`cloud> chowned /data/project/xtools-mab-dev to root and back to stop rogue process that was writing to the directory. I'm still not sure where that process was running, but at least this seems to have solved the issue [tools]
2016-05-28 §
21:51 <valhallasw`cloud> rebooted tools-webgrid-lighttpd-1408, tools-pastion-01, tools-exec-1205 [tools]
21:21 <valhallasw`cloud> rebooting tools-exec-1204 (T136495) [tools]
2016-05-27 §
14:45 <YuviPanda> start moving tools-bastion-03 to use tools-puppetmaster-01 as puppetmaster [tools]
2016-05-25 §
20:15 <YuviPanda> deleted tools-bastion-mtemp per chasemp [tools]
19:43 <YuviPanda> delete devpi instance, not currently in use [tools]
19:39 <YuviPanda> run sudo dpkg --configure -a on tools-worker-1007 to get it unstuck [tools]
19:19 <YuviPanda> deleted tools-docker-builder-01 and -02, hosed hosts that are unused [tools]
17:18 <YuviPanda> fixed hhvm upgrade on tools-cron-01 [tools]
07:19 <YuviPanda> hard reboot tools-services-01, was completely stuck on /public/dumps [tools]
06:06 <bd808> Restarting all webservice jobs [tools]
05:33 <andrewbogott> rebooting tools-proxy-02 [tools]
2016-05-23 §
19:36 <YuviPanda> switched tools-checker to tools-checker-03 [tools]
13:28 <chasemp> 'apt-get install hhvm -y --force-yes' across trusty hosts to handle hhvm downgrade [tools]
2016-05-20 §
23:39 <bd808> Forced puppet run on bastion-02 & bastion-05 to apply fix for T135861 [tools]
19:44 <chasemp> tools-exec-1406 having issues rebooting [tools]
2016-05-19 §
21:07 <bd808> deployed jobutils 1.13 on bastions; now with '-l release=...' validation! [tools]
15:43 <YuviPanda> rebooting all tools worker instances [tools]
13:12 <chasemp> reboot tools-exec-1220, stuck in a state of unresponsiveness [tools]
2016-05-13 §
00:40 <YuviPanda> cleared all queues that were in error state [tools]
2016-05-12 §
22:59 <YuviPanda> restart tools-worker-1004 to attempt bringing it back up [tools]