2020-05-12 §
16:45 <andrewbogott> restarting neutron-l3-agent on cloudnet1004 so it knows about all three cloudcontrols. Leaving cloudnet1003 since restarting it there will cause network interruptions [admin]
14:06 <arturo> icinga downtime everything for 2h for Debian Buster migration in some cloud components [admin]
2020-05-09 §
16:53 <andrewbogott> rebuilding cloudcontrol2001-dev and 2003-dev with buster for T252121 [admin]
2020-05-08 §
19:02 <bstorm_> moving tools-k8s-haproxy-2 from cloudvirt1021 to cloudvirt1017 to improve spread [admin]
2020-05-05 §
13:58 <andrewbogott> rebuilding cloudcontrol2004-dev to test new puppet changes [admin]
2020-05-04 §
09:04 <arturo> [codfw1dev] manually modify iptables ruleset to only allow SSH from WMF bastions on cloudservices2003-dev and cloudcontrol2004-dev (T251604) [admin]
2020-04-21 §
22:12 <andrewbogott> moving cloudvirt1004 out of the 'standard' aggregate and into the 'maintenance' aggregate [admin]
16:01 <jeh> restart cloudceph mon and osd services for openssl upgrades [admin]
2020-04-15 §
18:44 <jeh> create indexes and views for grwikimedia T245912 [admin]
2020-04-13 §
15:07 <jeh> restart memcached on labwebs to increase cache size T145703 [admin]
2020-04-09 §
19:57 <andrewbogott> upgrading eqiad1 designate to rocky [admin]
16:52 <andrewbogott> cleaned up a bunch of leaked .eqiad.wmflabs dns records [admin]
2020-04-08 §
19:20 <andrewbogott> rotated password and api token for pdns servers on cloudservices1003 and cloudservices1004 [admin]
14:54 <arturo> `root@cloudcontrol1003:~# cp /etc/inputrc .inputrc` to solve some bash shortcut weirdness [admin]
2020-04-07 §
20:57 <andrewbogott> service sssd stop; rm -rf /var/lib/sss/db*; service sssd start on tools-sgebastion-08 [admin]
2020-04-06 §
22:39 <andrewbogott> deleting bogus groups cn=b'project-bastion',ou=groups,dc=wikimedia,dc=org and cn=b'project-tools',ou=groups,dc=wikimedia,dc=org from ldap [admin]
17:42 <arturo> [codfw1dev] transferred DNS zone 57.15.185.in-addr.arpa. to the cloudinfra-codfw1dev project (T247972) [admin]
17:39 <arturo> [codfw1dev] `openstack zone create --email root@wmflabs.org --type PRIMARY --ttl 3600 --description "floating IPs subnet" 57.15.185.in-addr.arpa.` (T247972) [admin]
16:23 <arturo> restarting apache2 in cloudcontrol1003/1004 to pick up latest wmfkeystonehooks changes T249494 [admin]
2020-04-02 §
20:59 <jeh> codfw1dev clear VM error states and start bastions, puppet master and database [admin]
20:53 <jeh> codfw1dev clear VM error states and start bastions, puppet master and database [admin]
2020-04-01 §
16:27 <arturo> [codfw1dev] enable puppet across the fleet to clean up vxlan changes (T248881) [admin]
2020-03-31 §
12:35 <arturo> [codfw1dev] restarting VMs: designaterockytest14, bastion-codfw1dev-0[1,2] (T248881) [admin]
12:34 <arturo> [codfw1dev] installing neutron-openvswitch-agent on cloudvirt2001-dev (T248881) [admin]
12:25 <arturo> [codfw1dev] installing neutron-openvswitch-agent on cloudnet200[2,3]-dev (T248881) [admin]
11:45 <arturo> [codfw1dev] rebooting cloudvirt2003-dev to pick up latest kernel update. Otherwise modprobe is confused trying to load modules and openvswitch won't start (T248881) [admin]
10:40 <arturo> [codfw1dev] installing neutron-openvswitch-agent on cloudvirt2003-dev (T248881) [admin]
10:09 <arturo> [codfw1dev] reboot cloudnet2003-dev into linux 4.9 (was using 4.14 from a testing operation on 2020-03-10) [admin]
2020-03-30 §
23:42 <bstorm_> deleted "Kubernetes Cluster" and "Kubernetes Performance" dashboards T246689 [admin]
16:44 <arturo> [codfw1dev] installing package neutron-openvswitch-agent in cloudvirt2002-dev (T248881) [admin]
16:42 <andrewbogott> restarting l3 agents on cloudnets in codfw1dev after applying https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/584188/ [admin]
2020-03-27 §
21:28 <bd808> Created huggle.wmcloud.org Designate zone and allocated it to the huggle project [admin]
19:51 <jeh> start haproxy on cloudcontrol2003-dev.wikimedia.org [admin]
2020-03-26 §
15:01 <arturo> icinga downtime cloudvirt* cloudcontrol* cloudnet* lab* cloudstore* [admin]
15:01 <andrewbogott> beginning openstack upgrade window for T242766 [admin]
12:32 <arturo> [codfw1dev] downgraded systemd, libsystemd0, udev and friends to the non-backports versions (T247013) [admin]
2020-03-25 §
19:29 <andrewbogott> dumping a bunch of VMs on cloudvirt1015 to see if it still crashes [admin]
17:56 <jeh> add labweb1002 back into the pool - completed horizon testing T240852 [admin]
17:09 <jeh> depool labweb1002 for horizon testing T240852 [admin]
2020-03-24 §
19:41 <jeh> switch cloudvirt1016 from maintenance to standard host aggregate T243327 [admin]
15:31 <andrewbogott> restarting nova-conductor and nova-api on cloudcontrol1003 and cloudcontrol1004 [admin]
2020-03-23 §
21:41 <jeh> restart neutron-l3-agent on cloudnet100[3,4] to pick up policy.yaml changes [admin]
13:28 <jeh> disable puppet on labweb100[1,2] to enable horizon event traces T240852 [admin]
10:26 <arturo> restarting apache in both labweb1001/labweb1002 upon reports of returning 500s [admin]
2020-03-21 §
14:23 <andrewbogott> restarting apache2 on labweb1001 and 1002 [admin]
2020-03-18 §
19:17 <andrewbogott> deleted a bunch of records from the pdns database on cloudservices1003/1004 which had a record name but the content (where an IP address should be) was NULL, e.g. m.wikidata.beta.wmflabs.org. [admin]
10:55 <arturo> [codfw1dev] deleting BGP agent, undoing changes we did for T245606 [admin]
2020-03-14 §
17:40 <jeh> restart maintain-dbusers on labstore1004 T247654 [admin]
2020-03-13 §
12:39 <arturo> [codfw1dev] reintroduce address scopes for another round of testing T244851 [admin]
12:17 <arturo> [codfw1dev] enabling puppet in cloudnet200x-dev servers after merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/579259 (T247505) [admin]