production SAL

2025-04-03
13:25	<cgoubert@deploy1003>	helmfile [eqiad] START helmfile.d/services/mw-cron: apply	[production]
13:20	<taavi@deploy1003>	ihurbain, taavi: Continuing with sync	[production]
13:17	<taavi@deploy1003>	ihurbain, taavi: Backport for [[gerrit:1133113|Enable Parsoid Read Views on 13 wiktionaries (T390680)]], [[gerrit:1133141|Enable Parsoid Read Views to incubator and dagwiki mobile frontend (T380768 T381002)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
13:07	<brouberol@deploy1003>	helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply	[production]
13:07	<jmm@cumin2002>	END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling restart_daemons on A:thanos-fe-eqiad	[production]
13:07	<taavi@deploy1003>	Started scap sync-world: Backport for [[gerrit:1133113|Enable Parsoid Read Views on 13 wiktionaries (T390680)]], [[gerrit:1133141|Enable Parsoid Read Views to incubator and dagwiki mobile frontend (T380768 T381002)]]	[production]
13:07	<brouberol@deploy1003>	helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply	[production]
13:06	<brouberol@deploy1003>	helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply	[production]
13:06	<brouberol@deploy1003>	helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply	[production]
13:05	<jmm@cumin2002>	START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling restart_daemons on A:thanos-fe-eqiad	[production]
13:04	<jmm@cumin2002>	END (FAIL) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=1) rolling restart_daemons on A:thanos-fe	[production]
13:02	<jmm@cumin2002>	START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling restart_daemons on A:thanos-fe	[production]
12:56	<moritzm>	prune now obsolete nginx packages from testreduce1002 T329529	[production]
12:55	<godog>	move k8s instances from prometheus1006 to prometheus1008 - T383232	[production]
12:55	<jmm@cumin2002>	END (PASS) - Cookbook sre.wdqs.restart-nginx-envoy (exit_code=0) rolling restart_daemons on A:wcqs-public	[production]
12:54	<klausman@deploy1003>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.	[production]
12:53	<jmm@cumin2002>	START - Cookbook sre.wdqs.restart-nginx-envoy rolling restart_daemons on A:wcqs-public	[production]
12:53	<klausman@deploy1003>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.	[production]
12:48	<klausman@deploy1003>	helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.	[production]
12:47	<klausman@deploy1003>	helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.	[production]
12:42	<jmm@cumin2002>	END (PASS) - Cookbook sre.wdqs.restart-nginx-envoy (exit_code=0) rolling restart_daemons on A:wdqs-all	[production]
12:28	<jmm@cumin2002>	START - Cookbook sre.wdqs.restart-nginx-envoy rolling restart_daemons on A:wdqs-all	[production]
12:25	<klausman@deploy1003>	helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.	[production]
12:24	<klausman@deploy1003>	helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.	[production]
12:22	<jmm@cumin2002>	END (PASS) - Cookbook sre.wdqs.restart-nginx-envoy (exit_code=0) rolling restart_daemons on A:wdqs-test	[production]
12:21	<jmm@cumin2002>	START - Cookbook sre.wdqs.restart-nginx-envoy rolling restart_daemons on A:wdqs-test	[production]
12:16	<moritzm>	installing libxslt security updates	[production]
11:58	<moritzm>	installing Intel microcode security updates	[production]
11:56	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1001.eqiad.wmnet	[production]
11:50	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host sretest1001.eqiad.wmnet	[production]
11:46	<moritzm>	installing Django security updates on Bullseye	[production]
11:37	<moritzm>	installing Python 3.9 security updates	[production]
11:33	<topranks>	reboot cr2-eqord to complete JunOS upgrade T364092	[production]
11:31	<topranks>	disable EBGP sessions to internet peers on cr2-eqord to prep for JunOS upgrade T364092	[production]
11:30	<cmooney@cumin1002>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cr2-codfw,cr2-eqiad,cr2-eqord,cr2-eqord IPv6,cr3-ulsfo with reason: Upgrade cr2-eqord JunOS	[production]
11:07	<moritzm>	installing nodejs security updates	[production]
11:06	<topranks>	pre-pend as paths announced to codfw/eqiad from eqord to prep for JunOS upgrade T364092	[production]
11:02	<ladsgroup@deploy1003>	Finished scap sync-world: Backport for [[gerrit:1133862|Bump thumbnail steps to 65% (T360589)]] (duration: 16m 34s)	[production]
10:55	<ladsgroup@deploy1003>	ladsgroup: Continuing with sync	[production]
10:54	<mvernon@cumin2002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host apus-fe2003.codfw.wmnet with OS bookworm	[production]
10:54	<mvernon@cumin2002>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"	[production]
10:53	<ladsgroup@deploy1003>	ladsgroup: Backport for [[gerrit:1133862|Bump thumbnail steps to 65% (T360589)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
10:51	<mvernon@cumin2002>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"	[production]
10:50	<topranks>	drain transport circuits to eqord (Chicago network pop) to prep for Junos upgrade cr2-eqord T364092	[production]
10:48	<moritzm>	remove nodejs from aqs* hosts, no longer used/needed and spares us needless security rollouts T350143	[production]
10:46	<ladsgroup@deploy1003>	Started scap sync-world: Backport for [[gerrit:1133862|Bump thumbnail steps to 65% (T360589)]]	[production]
10:32	<mvernon@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apus-fe2003.codfw.wmnet with reason: host reimage	[production]
10:27	<mvernon@cumin2002>	START - Cookbook sre.hosts.downtime for 2:00:00 on apus-fe2003.codfw.wmnet with reason: host reimage	[production]
10:22	<akosiaris@deploy1003>	helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.	[production]
10:22	<akosiaris@deploy1003>	helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.	[production]