2025-04-09
19:35 <dancy@deploy1003> Installing scap version "4.153.0" for 2 host(s) [production]
19:24 <fab@deploy1003> Finished deploy [airflow-dags/research@ea5f3de]: (no justification provided) (duration: 00m 41s) [production]
19:24 <fab@deploy1003> Started deploy [airflow-dags/research@ea5f3de]: (no justification provided) [production]
19:14 <dzahn@cumin1002> START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: security release [production]
18:48 <dzahn@cumin1002> END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: security release [production]
18:40 <dzahn@cumin1002> START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: security release [production]
18:20 <brennen@deploy1003> rebuilt and synchronized wikiversions files: group1 to 1.44.0-wmf.24 refs T386219 [production]
18:06 <brennen> 1.44.0-wmf.24 train status (T386219): logs quiet, no current blockers, moving to group1 [production]
18:04 <bking@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 9 hosts with reason: adding net-new role [production]
17:50 <swfrench@deploy1003> Finished scap sync-world: Test scap run after switching to PHP 8.1 container image for maintenance scripts - T390225 (duration: 03m 10s) [production]
17:47 <swfrench@deploy1003> Started scap sync-world: Test scap run after switching to PHP 8.1 container image for maintenance scripts - T390225 [production]
17:46 <swfrench@deploy1003> Stopping before sync operations [production]
17:45 <dzahn@cumin1002> END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: security release [production]
17:45 <swfrench@deploy1003> Started scap sync-world: Test stop-before-sync scap run after switching to PHP 8.1 container image for maintenance scripts - T390225 [production]
17:38 <dzahn@cumin1002> START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: security release [production]
17:34 <bking@cumin2002> END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REIMAGE (3 nodes at a time) for ElasticSearch cluster search_codfw: reimage row A - bking@cumin2002 - T388610 [production]
17:33 <bking@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.REIMAGE (3 nodes at a time) for ElasticSearch cluster search_codfw: reimage row A - bking@cumin2002 - T388610 [production]
17:21 <ladsgroup@deploy1003> Finished scap sync-world: Backport for [[gerrit:1135454|Increase max db connection count before circuit breaking (T390510)]] (duration: 16m 47s) [production]
17:19 <mutante> apt1002 - updating thirdparty/gitlab-bullseye gitlab-ce package version [production]
17:12 <ladsgroup@deploy1003> ladsgroup: Continuing with sync [production]
17:11 <ladsgroup@deploy1003> ladsgroup: Backport for [[gerrit:1135454|Increase max db connection count before circuit breaking (T390510)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
17:04 <bking@cumin2002> END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REIMAGE (3 nodes at a time) for ElasticSearch cluster search_codfw: reimage row A - bking@cumin2002 - T388610 [production]
17:04 <sukhe> forcing rechecks for pc1011 and db1151 [production]
17:04 <ladsgroup@deploy1003> Started scap sync-world: Backport for [[gerrit:1135454|Increase max db connection count before circuit breaking (T390510)]] [production]
17:04 <bking@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.REIMAGE (3 nodes at a time) for ElasticSearch cluster search_codfw: reimage row A - bking@cumin2002 - T388610 [production]
17:01 <bking@cumin2002> END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REIMAGE (3 nodes at a time) for ElasticSearch cluster search_codfw: reimage row A - bking@cumin2002 - T388610 [production]
17:01 <bking@cumin2002> END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cirrussearch2109.codfw.wmnet on all recursors [production]
17:01 <bking@cumin2002> START - Cookbook sre.dns.wipe-cache cirrussearch2109.codfw.wmnet on all recursors [production]
17:01 <bking@cumin2002> END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cirrussearch2085.codfw.wmnet on all recursors [production]
17:01 <bking@cumin2002> START - Cookbook sre.dns.wipe-cache cirrussearch2085.codfw.wmnet on all recursors [production]
16:59 <sukhe> [END] sudo cumin -b11 "O:mariadb::core" "run-puppet-agent" [production]
16:46 <sukhe> sudo cumin -b11 "O:mariadb::core" "run-puppet-agent" [production]
16:44 <sukhe> forcing puppet run on db2229 [production]
16:38 <sukhe> merging above change: CR 1135471 [production]
16:17 <bking@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch2088.codfw.wmnet with reason: host reimage [production]
16:10 <elukey@cumin1002> START - Cookbook sre.hosts.provision for host wikikube-worker2331.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART [production]
16:01 <bking@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch2068.codfw.wmnet with reason: host reimage [production]
15:57 <elukey@cumin1002> END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2331.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART [production]
15:55 <bking@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch2068.codfw.wmnet with reason: host reimage [production]
15:50 <bking@cumin2002> END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch2088 [production]
15:50 <bking@cumin2002> START - Cookbook sre.hosts.move-vlan for host cirrussearch2088 [production]
15:50 <bking@cumin2002> START - Cookbook sre.hosts.reimage for host cirrussearch2088.codfw.wmnet with OS bullseye [production]
15:48 <bking@cumin2002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cirrussearch2088.codfw.wmnet'] [production]
15:46 <elukey@cumin1002> START - Cookbook sre.hosts.provision for host wikikube-worker2331.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART [production]
15:43 <elukey@cumin1002> END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2331.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART [production]
15:38 <sukhe> reprepro -C component/nginx-ech include bookworm-wikimedia openssl_3.4.1-1+ech1_amd64.changes: T205378 [production]
15:38 <bking@cumin2002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cirrussearch2088.codfw.wmnet'] [production]
15:35 <jforrester@deploy1003> Started scap sync-world: Backport for [[gerrit:1135084|Move to new async Parsoid fragment provision (T373253 T388546)]], [[gerrit:1135410|Switch out various old PHP aliases to the current class names]], [[gerrit:1126659|Add wikifunctionsclient dblist for production wikis that allow embedding Wikifunctions calls]] [production]
15:32 <elukey@cumin1002> START - Cookbook sre.hosts.provision for host wikikube-worker2331.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART [production]
15:26 <bking@cumin2002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cirrussearch2088.codfw.wmnet'] [production]