2021-05-03 §
16:29 <wm-bot> Safe rebooting 'cloudvirt1023.eqiad.wmnet'. (T280641) - cookbook ran by dcaro@vulcanus [admin]
15:41 <wm-bot> Safe rebooting 'cloudvirt1023.eqiad.wmnet'. (T280641) - cookbook ran by dcaro@vulcanus [admin]
15:41 <wm-bot> Safe reboot of 'cloudvirt1022.eqiad.wmnet' finished successfully. (T280641) - cookbook ran by dcaro@vulcanus [admin]
15:13 <wm-bot> Safe rebooting 'cloudvirt1022.eqiad.wmnet'. (T280641) - cookbook ran by dcaro@vulcanus [admin]
10:31 <wm-bot> Safe rebooting 'cloudvirt1021.eqiad.wmnet'. (T280641 - cookbook ran by dcaro@vulcanus) [admin]
10:23 <wm-bot> (from a cookbook) [admin]
09:12 <dcaro> draining and rebooting cloudvirt1021 (T280641) [admin]
08:26 <dcaro> draining and rebooting cloudvirt1018 (T280641) [admin]
2021-04-30 §
11:16 <dcaro> draining and rebooting cloudvirt1017, last one today (T280641) [admin]
10:37 <dcaro> draining cloudvirt1016 for reboot (T280641) [admin]
09:47 <dcaro> draining cloudvirt1013 for reboot (T280641) [admin]
2021-04-29 §
15:11 <dcaro> hard rebooting cloudmetrics1002, got hung again (T275605) [admin]
07:53 <dcaro> Upgrading ceph libraries on cloudcontrol1005 to octopus (T274566) [admin]
07:51 <dcaro> Upgrading ceph libraries on cloudcontrol1003 to octopus (T274566) [admin]
07:50 <dcaro> Upgrading ceph libraries on cloudcontrol1004 to octopus (T274566) [admin]
2021-04-28 §
21:11 <andrewbogott> cleaning up more references to deleted hypervisors with delete from services where topic='compute' and version != 53; [admin]
20:48 <andrewbogott> cleaning up references to deleted hypervisors with mysql:root@localhost [nova_eqiad1]> delete from compute_nodes where hypervisor_version != '5002000'; [admin]
19:40 <andrewbogott> putting cloudvirt1040 into the maintenance aggregate pending more info about T281399 [admin]
18:11 <andrewbogott> adding cloudvirt1040, 1041 and 1042 to the 'ceph' host aggregate -- T275081 [admin]
11:06 <dcaro> All ceph server side upgraded to Octopus! \o/ (T280641) [admin]
10:57 <dcaro> Got a PG getting stuck on 'remapping' after the OSD came up, had to unset the norebalance and then set it again to get it unstuck (T280641) [admin]
10:34 <dcaro> Slow/blocked ops from cloudcephmon03, "osd_failure(failed timeout osd.32..." (cloudcephosd1005), unset the cluster noout/norebalance and they went away in a few secs, setting it again and continuing... (T280641) [admin]
09:03 <dcaro> Waiting for slow heartbeats from osd.58(cloudcephosd1002) to recover... (T280641) [admin]
08:59 <dcaro> During the upgrade, started getting warning 'slow osd heartbeats in the back', meaning that pings between osds are really slow (up to 190s), all from osd.58, currently on cloudcephosd1002 (T280641) [admin]
08:58 <dcaro> During the upgrade, started getting warning 'slow osd heartbeats in the back', meaning that pings between osds are really slow (up to 190s), all from osd.58 (T280641) [admin]
08:58 <dcaro> During the upgrade, started getting warning 'slow osd heartbeats in the back', meaning that pings between osds are really slow (up to 190s) (T280641) [admin]
08:21 <dcaro> Upgrading all the ceph osds on eqiad (T280641) [admin]
08:21 <dcaro> The clock skew seems intermittent; there's another task to follow it, T275860 (T280641) [admin]
08:18 <dcaro> All eqiad ceph mons and mgrs upgraded (T280641) [admin]
08:18 <dcaro> During the upgrade, ceph detected a clock skew on cloudcephmon1002, cloudcephmon1001, they are back (T280641) [admin]
08:15 <dcaro> During the upgrade, ceph detected a clock skew on cloudcephmon1002, it went away, I'm guessing systemd-timesyncd fixed it (T280641) [admin]
08:14 <dcaro> During the upgrade, ceph detected a clock skew on cloudcephmon1002, looking (T280641) [admin]
07:58 <dcaro> Upgrading ceph services on eqiad, starting with mons/managers (T280641) [admin]
2021-04-27 §
14:10 <dcaro> codfw.openstack upgraded ceph libraries to 15.2.11 (T280641) [admin]
13:07 <dcaro> codfw.openstack cloudvirt2002-dev done, taking cloudvirt2003-dev out to upgrade ceph libraries (T280641) [admin]
13:00 <dcaro> codfw.openstack cloudvirt2001-dev back online, taking cloudvirt2002-dev out to upgrade ceph libraries (T280641) [admin]
10:51 <dcaro> ceph.eqiad: cinder pool got its pg_num increased to 1024, re-shuffle started (T273783) [admin]
10:48 <dcaro> ceph.eqiad: Tweaked the target_size_ratio of all the pools, enabling autoscaler (it will increase cinder pool only) (T273783) [admin]
09:14 <dcaro> manually force stopping the server puppetmaster-01 to unblock migration (in codfw1) [admin]
09:14 <dcaro> manually force stopping the server puppetmaster-01 to unblock migration [admin]
08:59 <dcaro> manually force stopping the server exploding-head on codfw, to try cold migration [admin]
08:47 <dcaro> restarting nova-compute on cloudvirt2001-dev after upgrading ceph libraries to 15.2.11 [admin]
2021-04-26 §
20:56 <andrewbogott> deleting spurious 'codfw1dev' and 'codw1dev-4' regions in the dallas deployment; regions without endpoints break a bunch of things [admin]
09:45 <dcaro> draining cloudvirt2001-dev with the new cookbooks (T280641) [admin]
2021-04-23 §
13:49 <dcaro> testing the drain_cloudvirt cookbook on codfw1 openstack cluster, draining cloudvirt2001 (T280641) [admin]
11:12 <dcaro> testing the drain_cloudvirt cookbook on codfw1 openstack cluster (T280641) [admin]
09:32 <dcaro> finished upgrade of ceph cluster on codfw1 using exclusively cookbooks (T280641) [admin]
09:17 <dcaro> testing the upgrade_osds cookbook on codfw1 ceph cluster (T280641) [admin]
08:17 <dcaro> testing the upgrade_mons cookbook on codfw1 ceph cluster (T280641) [admin]
2021-04-21 §
17:59 <dcaro> all monitors upgraded on codfw1 with one cookbook `cookbook --verbose -c ~/.config/spicerack/cookbook.yaml wmcs.ceph.upgrade_mons --monitor-node-fqdn cloudcephmon2002-dev.codfw.wmnet` (T280641) [admin]