2025-05-05
10:35 <elukey@puppetserver1001> conftool action : set/pooled=true; selector: dnsdisc=inference,name=codfw [production]
10:32 <jelto@cumin1002> START - Cookbook sre.hosts.reboot-single for host vrts2002.codfw.wmnet [production]
10:32 <jelto@cumin1002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts2002.codfw.wmnet [production]
10:24 <tappof> rebooting prometheus1007 into linux-image-6.1.0-33-amd64 [production]
10:19 <andrew@cloudcumin1001> START - Cookbook wmcs.ceph.osd.depool_and_destroy [admin]
10:17 <jelto@cumin1002> START - Cookbook sre.hosts.reboot-single for host vrts2002.codfw.wmnet [production]
09:58 <elukey@deploy1003> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' . [production]
09:45 <hashar> Cleared /srv/docker/overlay2 on contint2002 [releng]
09:41 <hashar> Cleared /srv/docker/overlay2 on contint1002 (it had a bunch of old layers from April/May 2024) [releng]
09:39 <elukey@deploy1003> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. [production]
09:39 <elukey@deploy1003> helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. [production]
09:38 <elukey> depool inference/codfw from DNS discovery to safely apply new pod/container security settings - T369493 [production]
09:30 <dreamyjazz@deploy1003> Finished scap sync-world: Backport for [[gerrit:1141844|[plwiki] Add 'abusefilter-view-private' to sysop (T393353)]] (duration: 13m 04s) [production]
09:23 <dreamyjazz@deploy1003> dreamyjazz, msz2001: Continuing with sync [production]
09:21 <dreamyjazz@deploy1003> dreamyjazz, msz2001: Backport for [[gerrit:1141844|[plwiki] Add 'abusefilter-view-private' to sysop (T393353)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
09:17 <dreamyjazz@deploy1003> Started scap sync-world: Backport for [[gerrit:1141844|[plwiki] Add 'abusefilter-view-private' to sysop (T393353)]] [production]
09:03 <godog> powercycle vrts1003 + vrts2002 - soft lockup T393357 [production]
08:56 <godog> powercycle centrallog2002 - cannot log in on ssh or console [production]
08:40 <ryankemper@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs2015.codfw.wmnet with OS bullseye [production]
08:32 <tappof> rebooting prometheus2007 - no ssh, com2 via racadm hangs [production]
08:32 <godog> powercycle centrallog1002 - cannot log in on ssh or console [production]
08:21 <ryankemper@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2015.codfw.wmnet with reason: host reimage [production]
08:19 <andrew@cloudcumin1001> END (FAIL) - Cookbook wmcs.ceph.osd.depool_and_destroy (exit_code=99) [admin]
08:17 <ryankemper@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2015.codfw.wmnet with reason: host reimage [production]
08:17 <tappof> powercycle prometheus2008 - no ssh, mgmt console showing systemd units being deactivated, no root login [production]
08:15 <elukey> powercycle prometheus2005 - no ssh, mgmt console showing systemd units being deactivated, no root login [production]
08:11 <elukey> powercycle prometheus1008 - no ssh, mgmt console showing cpu soft lockup continuously [production]
08:05 <jgiannelos@deploy1003> helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply [production]
08:05 <jgiannelos@deploy1003> helmfile [eqiad] START helmfile.d/services/mobileapps: apply [production]
08:02 <tappof> rebooting prometheus1005 prometheus1006 and prometheus2006 [production]
08:00 <ryankemper@cumin2002> END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2015 [production]
08:00 <ryankemper@cumin2002> END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs2015 [production]
08:00 <ryankemper@cumin2002> START - Cookbook sre.network.configure-switch-interfaces for host wdqs2015 [production]
08:00 <ryankemper@cumin2002> END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs2015.codfw.wmnet 209.48.192.10.in-addr.arpa 9.0.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors [production]
08:00 <ryankemper@cumin2002> START - Cookbook sre.dns.wipe-cache wdqs2015.codfw.wmnet 209.48.192.10.in-addr.arpa 9.0.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors [production]
08:00 <ryankemper@cumin2002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
08:00 <ryankemper@cumin2002> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs2015 - ryankemper@cumin2002" [production]
08:00 <ryankemper@cumin2002> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs2015 - ryankemper@cumin2002" [production]
07:59 <jgiannelos@deploy1003> helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply [production]
07:59 <jgiannelos@deploy1003> helmfile [eqiad] START helmfile.d/services/mobileapps: apply [production]
07:59 <jgiannelos@deploy1003> helmfile [codfw] DONE helmfile.d/services/mobileapps: apply [production]
07:58 <jgiannelos@deploy1003> helmfile [codfw] START helmfile.d/services/mobileapps: apply [production]
07:54 <Dreamy_Jazz> UTC morning backport window finished [production]
07:54 <dreamyjazz@deploy1003> Finished scap sync-world: Backport for [[gerrit:1141573|nnwiki: enable wgCiteResponsiveReferences (T393299)]], [[gerrit:1141582|ruwikibooks: enable VisualEditorAvailableNamespaces for Рецепт (recipe) namespace (T392803)]], [[gerrit:1141089|Add checkuserwiki favicon (T393246)]], [[gerrit:1141574|nupwiki: add timezone (T390711)]] (duration: 14m 11s) [production]
07:47 <dreamyjazz@deploy1003> dreamyjazz, bunnypranav, anzx: Continuing with sync [production]
07:44 <dreamyjazz@deploy1003> dreamyjazz, bunnypranav, anzx: Backport for [[gerrit:1141573|nnwiki: enable wgCiteResponsiveReferences (T393299)]], [[gerrit:1141582|ruwikibooks: enable VisualEditorAvailableNamespaces for Рецепт (recipe) namespace (T392803)]], [[gerrit:1141089|Add checkuserwiki favicon (T393246)]], [[gerrit:1141574|nupwiki: add timezone (T390711)]] synced to the testservers (https://wikitech.wikimedia.org [production]
07:40 <dreamyjazz@deploy1003> Started scap sync-world: Backport for [[gerrit:1141573|nnwiki: enable wgCiteResponsiveReferences (T393299)]], [[gerrit:1141582|ruwikibooks: enable VisualEditorAvailableNamespaces for Рецепт (recipe) namespace (T392803)]], [[gerrit:1141089|Add checkuserwiki favicon (T393246)]], [[gerrit:1141574|nupwiki: add timezone (T390711)]] [production]
07:31 <kartik@deploy1003> Finished scap sync-world: Backport for [[gerrit:1140703|Mobile frequent languages entrypoint: Add dependency to sitemapper (T393144 T386223)]] (duration: 17m 27s) [production]
07:25 <kartik@deploy1003> abi, kartik: Continuing with sync [production]
07:21 <ryankemper@cumin2002> START - Cookbook sre.dns.netbox [production]