2020-09-09 §
18:01 <andrewbogott> restarting ceph-mgr@cloudcephmon1003 in hopes that the slow ops reported are phantoms (https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/EOWNO3MDYRUZKAK6RMQBQ5WBPQNLHOPV/) [admin]
17:40 <andrewbogott> giving ceph pg autoscale another chance: ceph osd pool set eqiad1-compute pg_autoscale_mode on [admin]
00:05 <bd808> Running wmcs-novastats-dnsleaks (T262359) [admin]
2020-09-08 §
21:48 <bd808> Renamed FQDN prefixes to wikimedia.cloud scheme in cloudinfra-db01's labspuppet db (T260614) [admin]
14:29 <andrewbogott> restarting nova-compute on all cloudvirts (everyone is upset from the reset switch failure) [admin]
14:18 <arturo> restarting nova-fullstack service in cloudcontrol1003 [admin]
14:17 <andrewbogott> stopping apache2 on labweb1001 to make sure the Horizon outage is total [admin]
2020-09-03 §
09:31 <arturo> icinga downtime cloud* servers for 30 mins (T261866) [admin]
2020-09-02 §
08:46 <arturo> [codfw1dev] reimaging spare server labtestvirt2003 as debian buster (T261724) [admin]
2020-09-01 §
18:18 <andrewbogott> adding drives on cloudcephosd100[3-5] to ceph osd pool [admin]
13:40 <andrewbogott> adding drives on cloudcephosd101[0-2] to ceph osd pool [admin]
13:34 <andrewbogott> adding drives on cloudcephosd100[1-3] to ceph osd pool [admin]
11:27 <arturo> [codfw1dev] rebooting again cloudnet2002-dev after some network tests, to reset initial state (T261724) [admin]
11:09 <arturo> [codfw1dev] rebooting cloudnet2002-dev after some network tests, to reset initial state (T261724) [admin]
10:49 <arturo> disable puppet in cloudnet servers to merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/623569/ [admin]
2020-08-31 §
23:26 <bd808> Removed stale lockfile at cloud-puppetmaster-03.cloudinfra.eqiad.wmflabs:/var/lib/puppet/volatile/GeoIP/.geoipupdate.lock [admin]
11:20 <arturo> [codfw1dev] livehacking https://gerrit.wikimedia.org/r/c/operations/puppet/+/615161 in the puppetmasters for tests before merging [admin]
2020-08-28 §
20:12 <bd808> Running `wmcs-novastats-dnsleaks --delete` from cloudcontrol1003 [admin]
2020-08-26 §
17:12 <bstorm> Running 'ionice -c 3 nice -19 find /srv/tools -type f -size +100M -printf "%k KB %p\n" > tools_large_files_20200826.txt' on labstore1004 T261336 [admin]
2020-08-21 §
21:34 <andrewbogott> restarting nova-compute on cloudvirt1033; it seems stuck [admin]
2020-08-19 §
14:21 <andrewbogott> rebooting cloudweb2001-dev, labweb1001, labweb1002 to address mediawiki-induced memleak [admin]
2020-08-06 §
21:02 <andrewbogott> removing cloudvirt1004/1006 from nova's list of hypervisors; rebuilding them to use as backup test hosts [admin]
20:06 <bstorm> manually stopped the RAID check on cloudcontrol1003 T259760 [admin]
2020-08-04 §
18:54 <bstorm> restarting mariadb on cloudcontrol1004 to set up parallel replication [admin]
2020-08-03 §
17:02 <bstorm> increased db connection limit to 800 across galera cluster because we were clearly hovering at limit [admin]
2020-07-31 §
19:28 <bd808> wmcs-novastats-dnsleaks --delete (lots of leaked fullstack-monitoring records to clean up) [admin]
2020-07-27 §
22:17 <andrewbogott> ceph osd pool set compute pg_num 2048 [admin]
22:14 <andrewbogott> ceph osd pool set compute pg_autoscale_mode off [admin]
2020-07-24 §
19:15 <andrewbogott> ceph mgr module enable pg_autoscaler [admin]
19:15 <andrewbogott> ceph osd pool set compute pg_autoscale_mode on [admin]
2020-07-22 §
08:55 <jbond42> [codfw1dev] upgrading hiera to version5 [admin]
08:48 <arturo> [codfw1dev] add jbond as user in the bastion-codfw1dev and cloudinfra-codfw1dev projects [admin]
08:45 <arturo> [codfw1dev] enabled account creation in labtestwiki briefly for jbond42 to create an account [admin]
2020-07-16 §
10:48 <arturo> merging change to neutron dmz_cidr https://gerrit.wikimedia.org/r/c/operations/puppet/+/613123 (T257534) [admin]
2020-07-15 §
23:15 <bd808> Removed Merlijn van Deen from toollabs-trusted Gerrit group (T255697) [admin]
11:48 <arturo> [codfw1dev] created DNS records (A and PTR) for bastion.bastioninfra-codfw1dev.codfw1dev.wmcloud.org <-> 185.15.57.2 [admin]
11:41 <arturo> [codfw1dev] add myself as projectadmin to the `bastioninfra-codfw1dev` project [admin]
11:39 <arturo> [codfw1dev] created DNS zone `bastioninfra-codfw1dev.codfw1dev.wmcloud.org.` in the cloudinfra-codfw1dev project and then transferred ownership to the bastioninfra-codfw1dev project [admin]
2020-07-14 §
15:19 <arturo> briefly set root@cloudnet1003:~ # sysctl net.ipv4.conf.all.accept_local=1 (in neutron qrouter netns) (T257534) [admin]
10:43 <arturo> icinga downtime cloudnet* hosts for 30 mins to introduce new check https://gerrit.wikimedia.org/r/c/operations/puppet/+/612390 (T257552) [admin]
04:01 <andrewbogott> added a wildcard *.wmflabs.org domain pointing at the domain proxy in project-proxy [admin]
04:00 <andrewbogott> shortened the ttl on .wmflabs.org. to 300 [admin]
2020-07-13 §
16:17 <arturo> icinga downtime cloudcontrol[1003-1005].wikimedia.org for 1h for galera database movements [admin]
2020-07-12 §
17:39 <andrewbogott> switched eqiad1 keystone from m5 to cloudcontrol galera [admin]
2020-07-10 §
20:26 <andrewbogott> disabling nova api to move database to galera [admin]
2020-07-09 §
11:23 <arturo> [codfw1dev] rebooting cloudnet2003-dev again for testing sysctl/puppet behavior (T257552) [admin]
11:11 <arturo> [codfw1dev] rebooting cloudnet2003-dev for testing sysctl/puppet behavior (T257552) [admin]
09:16 <arturo> manually increasing sysctl value of net.nf_conntrack_max in cloudnet servers (T257552) [admin]
2020-07-06 §
15:16 <arturo> installing 'aptitude' in all cloudvirts [admin]
2020-07-03 §
12:51 <arturo> [codfw1dev] galera cluster should be up and running, openstack happy (T256283) [admin]