2025-01-06 §
19:32 <kamila@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1257.eqiad.wmnet with reason: host reimage [production]
19:32 <bd808> Added `postgresql::postgis::postgresql_postgis_package: postgresql-15-postgis-3` to deployment-maps Prefix Puppet to work around default parameter problem (T361381) [releng]
19:31 <bd808> Issued new Puppet cert for deployment-maps-master02.deployment-prep.eqiad1.wikimedia.cloud (T361381) [releng]
19:27 <bd808> Added `postgresql::postgis::postgresql_postgis_package: ignored` to deployment-maps Prefix Puppet to work around default parameter problem (T361381) [releng]
19:15 <brennen> Updating development images on contint primary for https://gitlab.wikimedia.org/repos/releng/dev-images/-/merge_requests/71 (T382709) [releng]
19:11 <kamila@cumin1002> START - Cookbook sre.hosts.reimage for host wikikube-worker1257.eqiad.wmnet with OS bookworm [production]
19:11 <bd808> Added placeholders for `graphite_host` and `statsd` to deployment-webperf Prefix Puppet [releng]
19:09 <kamila@cumin1002> START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P{wikikube-worker[1257-1263].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad) [production]
18:53 <bd808> Fixed missing profile::swift::global_account_keys::{codfw, eqiad} placeholders breaking deployment-ms-* puppet runs [releng]
18:47 <jayme@cumin1002> END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1244.eqiad.wmnet [production]
18:47 <jayme@cumin1002> START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1244.eqiad.wmnet [production]
18:38 <bd808> Fixed incorrect deployment-restbase prefix puppet setting that was causing puppet run failures [releng]
18:32 <ChrisDobbins901_> cdobbins@dns1004 running authdns-update for CR 1097521 [production]
18:31 <ChrisDobbins901_> cdobbins@cumin1002 running authdns-update for CR 1097521 [production]
18:19 <bd808> Issued a new Puppet client cert for traindev01.deployment-prep.eqiad1.wikimedia.cloud [releng]
17:59 <jayme@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1244.eqiad.wmnet with OS bookworm [production]
17:40 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1193 (T370903)', diff saved to https://phabricator.wikimedia.org/P71818 and previous config saved to /var/cache/conftool/dbconfig/20250106-174024-ladsgroup.json [production]
17:39 <jayme@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1244.eqiad.wmnet with reason: host reimage [production]
17:37 <jayme@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1244.eqiad.wmnet with reason: host reimage [production]
17:25 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P71817 and previous config saved to /var/cache/conftool/dbconfig/20250106-172517-ladsgroup.json [production]
17:16 <jayme@cumin1002> START - Cookbook sre.hosts.reimage for host wikikube-worker1244.eqiad.wmnet with OS bookworm [production]
17:15 <jayme@cumin1002> END (PASS) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=0) rolling reimage on P{wikikube-worker[1250-1252].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad) [production]
17:15 <jayme@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1252.eqiad.wmnet with OS bookworm [production]
17:15 <jayme@cumin1002> END (FAIL) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=1) rolling reimage on P{wikikube-worker[1240-1244].eqiad.wmnet} and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad) [production]
17:15 <jayme@cumin1002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-worker1244.eqiad.wmnet with OS bookworm [production]
17:10 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P71816 and previous config saved to /var/cache/conftool/dbconfig/20250106-171010-ladsgroup.json [production]
17:00 <jdrewniak@deploy2002> Synchronized portals: Wikimedia Portals Update: [[gerrit:1108451| Bumping portals to master (T128546)]] (duration: 02m 43s) [production]
16:58 <jdrewniak@deploy2002> Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:1108451| Bumping portals to master (T128546)]] (duration: 12m 29s) [production]
16:56 <jayme@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1252.eqiad.wmnet with reason: host reimage [production]
16:55 <dani@deploy2002> helmfile [codfw] DONE helmfile.d/services/miscweb: apply [production]
16:55 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1193 (T370903)', diff saved to https://phabricator.wikimedia.org/P71815 and previous config saved to /var/cache/conftool/dbconfig/20250106-165503-ladsgroup.json [production]
16:54 <dani@deploy2002> helmfile [codfw] START helmfile.d/services/miscweb: apply [production]
16:54 <dani@deploy2002> helmfile [eqiad] DONE helmfile.d/services/miscweb: apply [production]
16:54 <dani@deploy2002> helmfile [eqiad] START helmfile.d/services/miscweb: apply [production]
16:54 <dani@deploy2002> helmfile [staging] DONE helmfile.d/services/miscweb: apply [production]
16:54 <raymond-ndibe@cloudcumin1001> END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component maintain-harbor [tools]
16:53 <dani@deploy2002> helmfile [staging] START helmfile.d/services/miscweb: apply [production]
16:52 <jayme@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1252.eqiad.wmnet with reason: host reimage [production]
16:46 <raymond-ndibe@cloudcumin1001> START - Cookbook wmcs.toolforge.component.deploy for component maintain-harbor [tools]
16:42 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Depooling db1193 (T370903)', diff saved to https://phabricator.wikimedia.org/P71813 and previous config saved to /var/cache/conftool/dbconfig/20250106-164215-ladsgroup.json [production]
16:42 <ladsgroup@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1193.eqiad.wmnet with reason: Maintenance [production]
16:42 <ladsgroup@cumin1002> START - Cookbook sre.hosts.downtime for 8:00:00 on db1193.eqiad.wmnet with reason: Maintenance [production]
16:39 <btullis@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
16:39 <btullis@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
16:38 <btullis@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
16:37 <btullis@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
16:37 <raymond-ndibe@cloudcumin1001> END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component maintain-harbor [toolsbeta]
16:35 <raymond-ndibe@cloudcumin1001> START - Cookbook wmcs.toolforge.component.deploy for component maintain-harbor [toolsbeta]
16:28 <raymond-ndibe@cloudcumin1001> END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component maintain-harbor [toolsbeta]
16:23 <raymond-ndibe@cloudcumin1001> START - Cookbook wmcs.toolforge.component.deploy for component maintain-harbor [toolsbeta]