2025-01-05
18:58 <lucaswerkmeister> remove /tmp/framer.txt on tools-bastion-13 (I notified the owner privately), and replace it with a root-owned file to prevent iTerm from leaking logs into it (https://iterm2.com/downloads/stable/iTerm2-3_5_11.changelog) on tools-sgebastion-10, tools-bastion-12 and tools-bastion-13 [tools]
2025-01-04
21:34 <andrew@cloudcumin1001> END (PASS) - Cookbook wmcs.vps.refresh_puppet_certs (exit_code=0) on traffic-puppetserver-bookworm.traffic.eqiad1.wikimedia.cloud [traffic]
21:31 <andrew@cloudcumin1001> START - Cookbook wmcs.vps.refresh_puppet_certs on traffic-puppetserver-bookworm.traffic.eqiad1.wikimedia.cloud [traffic]
20:54 <andrew@cloudcumin1001> END (PASS) - Cookbook wmcs.vps.refresh_puppet_certs (exit_code=0) on traffic-puppetserver-bookworm.traffic.eqiad1.wikimedia.cloud [traffic]
20:50 <andrew@cloudcumin1001> START - Cookbook wmcs.vps.refresh_puppet_certs on traffic-puppetserver-bookworm.traffic.eqiad1.wikimedia.cloud [traffic]
20:02 <dzahn@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on doc1003.eqiad.wmnet with reason: dz [production]
20:01 <dzahn@cumin2002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on doc1003.eqiad.wmnet with reason: dz [production]
19:33 <andrew@cloudcumin1001> END (PASS) - Cookbook wmcs.vps.refresh_puppet_certs (exit_code=0) on traffic-puppetserver-bookworm.traffic.eqiad1.wikimedia.cloud [traffic]
19:29 <andrew@cloudcumin1001> START - Cookbook wmcs.vps.refresh_puppet_certs on traffic-puppetserver-bookworm.traffic.eqiad1.wikimedia.cloud [traffic]
18:30 <andrew@cloudcumin1001> END (PASS) - Cookbook wmcs.vps.refresh_puppet_certs (exit_code=0) on traffic-puppetserver-bookworm.traffic.eqiad1.wikimedia.cloud [traffic]
18:27 <andrew@cloudcumin1001> START - Cookbook wmcs.vps.refresh_puppet_certs on traffic-puppetserver-bookworm.traffic.eqiad1.wikimedia.cloud [traffic]
18:22 <andrew@cloudcumin1001> END (PASS) - Cookbook wmcs.vps.refresh_puppet_certs (exit_code=0) on traffic-puppetserver-bookworm.traffic.eqiad1.wikimedia.cloud [traffic]
18:18 <andrew@cloudcumin1001> START - Cookbook wmcs.vps.refresh_puppet_certs on traffic-puppetserver-bookworm.traffic.eqiad1.wikimedia.cloud [traffic]
18:02 <andrew@cloudcumin1001> END (PASS) - Cookbook wmcs.vps.refresh_puppet_certs (exit_code=0) on traffic-puppetserver-bookworm.traffic.eqiad1.wikimedia.cloud [traffic]
17:59 <andrew@cloudcumin1001> START - Cookbook wmcs.vps.refresh_puppet_certs on traffic-puppetserver-bookworm.traffic.eqiad1.wikimedia.cloud [traffic]
15:45 <wmbot~sakretsu@tools-bastion-13> implemented health checks for draftbot [tools.itwiki]
2025-01-03
21:46 <bd808@cloudcumin1001> END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-69 [tools]
21:41 <bd808@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-69 [tools]
21:40 <bd808@cloudcumin1001> END (FAIL) - Cookbook wmcs.toolforge.k8s.worker.drain (exit_code=99) for node tools-k8s-worker-nfs-69 [tools]
21:35 <bd808@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.worker.drain for node tools-k8s-worker-nfs-69 [tools]
21:01 <bd808> `sudo service maintain-dbusers restart` on cloudcontrol1005. Report of missing replica.my.cnf and journalctl output empty due to log rotation. (T382962) [admin]
19:09 <aokoth@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on doc2002.codfw.wmnet with reason: Disk Change [production]
19:09 <aokoth@cumin1002> START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on doc2002.codfw.wmnet with reason: Disk Change [production]
18:47 <aokoth@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doc2002.codfw.wmnet with reason: Disk Change [production]
18:46 <aokoth@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on doc2002.codfw.wmnet with reason: Disk Change [production]
17:15 <thcipriani@deploy2002> Finished deploy [gerrit/gerrit@44854d4]: remove developer satisfaction survey banner (gerrit1003.wikimedia.org only) (duration: 00m 10s) [production]
17:15 <thcipriani@deploy2002> Started deploy [gerrit/gerrit@44854d4]: remove developer satisfaction survey banner (gerrit1003.wikimedia.org only) [production]
17:06 <thcipriani@deploy2002> Finished deploy [gerrit/gerrit@44854d4]: remove developer satisfaction survey banner (gerrit2002.wikimedia.org only) (duration: 00m 08s) [production]
17:06 <thcipriani@deploy2002> Started deploy [gerrit/gerrit@44854d4]: remove developer satisfaction survey banner (gerrit2002.wikimedia.org only) [production]
17:00 <isaranto@deploy2002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . [production]
16:55 <thcipriani@deploy2002> Finished deploy [gerrit/gerrit@44854d4]: remove developer satisfaction survey banner (gerrit2003.wikimedia.org only) (duration: 00m 08s) [production]
16:55 <thcipriani@deploy2002> Started deploy [gerrit/gerrit@44854d4]: remove developer satisfaction survey banner (gerrit2003.wikimedia.org only) [production]
12:10 <moritzm> renewed internal Ganeti certs in eqsin (would have expired in two days) T382873 [production]
12:01 <marostegui@cumin1002> dbctl commit (dc=all): 'Depool es2022 T381848', diff saved to https://phabricator.wikimedia.org/P71781 and previous config saved to /var/cache/conftool/dbconfig/20250103-120132-marostegui.json [production]
11:12 <marostegui@cumin1002> dbctl commit (dc=all): 'Depool es2020 T382945', diff saved to https://phabricator.wikimedia.org/P71780 and previous config saved to /var/cache/conftool/dbconfig/20250103-111255-marostegui.json [production]
11:05 <marostegui> Switchover es4 codfw master to es2043 dbmaint T381848 [production]
11:04 <marostegui@cumin1002> dbctl commit (dc=all): 'Switchover es4 codfw master', diff saved to https://phabricator.wikimedia.org/P71779 and previous config saved to /var/cache/conftool/dbconfig/20250103-110440-marostegui.json [production]
10:35 <marostegui@cumin1002> dbctl commit (dc=all): 'Depool es2021 T381848', diff saved to https://phabricator.wikimedia.org/P71778 and previous config saved to /var/cache/conftool/dbconfig/20250103-103513-marostegui.json [production]
10:06 <marostegui@cumin1002> dbctl commit (dc=all): 'db2236 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71777 and previous config saved to /var/cache/conftool/dbconfig/20250103-100603-root.json [production]
09:50 <marostegui@cumin1002> dbctl commit (dc=all): 'db2236 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71776 and previous config saved to /var/cache/conftool/dbconfig/20250103-095057-root.json [production]
09:35 <marostegui@cumin1002> dbctl commit (dc=all): 'db2236 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71774 and previous config saved to /var/cache/conftool/dbconfig/20250103-093552-root.json [production]
09:20 <marostegui@cumin1002> dbctl commit (dc=all): 'db2236 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71773 and previous config saved to /var/cache/conftool/dbconfig/20250103-092046-root.json [production]
09:05 <marostegui@cumin1002> dbctl commit (dc=all): 'db2236 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P71772 and previous config saved to /var/cache/conftool/dbconfig/20250103-090541-root.json [production]
09:03 <marostegui> Upgrade db2236 to 10.11.10 s4 codfw dbmaint T378940 [production]
09:02 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2236.codfw.wmnet with reason: upgrade [production]
09:02 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 1:00:00 on db2236.codfw.wmnet with reason: upgrade [production]
09:02 <marostegui@cumin1002> dbctl commit (dc=all): 'Depool db2236 to upgrade to 10.11.10 T378940', diff saved to https://phabricator.wikimedia.org/P71771 and previous config saved to /var/cache/conftool/dbconfig/20250103-090215-marostegui.json [production]
08:35 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2115.codfw.wmnet [production]
08:35 <marostegui@cumin1002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
08:35 <marostegui@cumin1002> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2115.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1002" [production]