production SAL

2024-11-05
10:46	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance es1032', diff saved to https://phabricator.wikimedia.org/P70916 and previous config saved to /var/cache/conftool/dbconfig/20241105-104608-ladsgroup.json	[production]
10:44	<jnuche@deploy2002>	install-world aborted: (no justification provided) (duration: 03m 09s)	[production]
10:41	<jnuche@deploy2002>	Installing scap version "4.121.0" for 209 hosts	[production]
10:41	<brouberol@deploy2002>	helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.	[production]
10:40	<brouberol@deploy2002>	helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.	[production]
10:31	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance es1032', diff saved to https://phabricator.wikimedia.org/P70915 and previous config saved to /var/cache/conftool/dbconfig/20241105-103101-ladsgroup.json	[production]
10:15	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance es1032 (T376905)', diff saved to https://phabricator.wikimedia.org/P70914 and previous config saved to /var/cache/conftool/dbconfig/20241105-101553-ladsgroup.json	[production]
10:11	<elukey>	set proxy timeouts of docker registry's nginx instances from 300s to 180s - T378618	[production]
10:09	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Depooling es1032 (T376905)', diff saved to https://phabricator.wikimedia.org/P70913 and previous config saved to /var/cache/conftool/dbconfig/20241105-100953-ladsgroup.json	[production]
10:09	<ladsgroup@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1032.eqiad.wmnet with reason: Maintenance	[production]
10:09	<ladsgroup@cumin1002>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es1032.eqiad.wmnet with reason: Maintenance	[production]
10:07	<vgutierrez@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1013.eqiad.wmnet with OS bookworm	[production]
10:00	<ladsgroup@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance	[production]
10:00	<ladsgroup@cumin1002>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance	[production]
09:49	<vgutierrez@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1013.eqiad.wmnet with reason: host reimage	[production]
09:45	<vgutierrez@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1013.eqiad.wmnet with reason: host reimage	[production]
09:33	<vgutierrez@cumin1002>	START - Cookbook sre.hosts.reimage for host lvs1013.eqiad.wmnet with OS bookworm	[production]
09:31	<arnaudb@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10 days, 0:00:00 on pc1013.eqiad.wmnet with reason: T373037, host is not pooled	[production]
09:31	<arnaudb@cumin1002>	START - Cookbook sre.hosts.downtime for 10 days, 0:00:00 on pc1013.eqiad.wmnet with reason: T373037, host is not pooled	[production]
09:22	<jnuche@deploy2002>	Started scap sync-world: testwikis to 1.44.0-wmf.2 refs T375661	[production]
09:21	<_joe_>	restarted rsyslog on deploy2002 T379044	[production]
08:57	<tchanders@deploy2002>	Started scap sync-world: Backport for [[gerrit:1087373|Revert "temp accounts: Enable temp account creation on second-round pilots"]]	[production]
08:24	<vgutierrez>	uploaded ipip-multiqueue-optimizer 0.3+deb12u1 to apt.wm.o (bookworm)	[production]
08:10	<tchanders@deploy2002>	Started scap sync-world: Backport for [[gerrit:1087195|temp accounts: Enable temp account creation on second-round pilots (T378336)]]	[production]
08:06	<ayounsi@cumin1002>	END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 2828	[production]
08:03	<ayounsi@cumin1002>	START - Cookbook sre.network.peering with action 'email' for AS: 2828	[production]
08:03	<ayounsi@cumin1002>	END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 14593	[production]
07:55	<ayounsi@cumin1002>	START - Cookbook sre.network.peering with action 'configure' for AS: 14593	[production]
07:39	<ayounsi@cumin1002>	END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 11414	[production]
07:39	<ayounsi@cumin1002>	START - Cookbook sre.network.peering with action 'email' for AS: 11414	[production]
05:10	<mwpresync@deploy2002>	Pruned MediaWiki: 1.43.0-wmf.27 (duration: 10m 37s)	[production]
04:03	<mwpresync@deploy2002>	Started scap sync-world: testwikis to 1.44.0-wmf.2 refs T375661	[production]
00:10	<jhancock@cumin2002>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc-gp2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED	[production]
00:10	<rzl@deploy2002>	Finished scap sync-world: 1085506 (duration: 02m 50s)	[production]
00:08	<rzl@deploy2002>	Started scap sync-world: 1085506	[production]
00:04	<jhancock@cumin2002>	START - Cookbook sre.hosts.provision for host mc-gp2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED	[production]
2024-11-04
23:56	<jhancock@cumin2002>	END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host mc-gp2006	[production]
23:56	<jhancock@cumin2002>	START - Cookbook sre.network.configure-switch-interfaces for host mc-gp2006	[production]
23:56	<jhancock@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc-gp2006.codfw.wmnet with OS bookworm	[production]
23:18	<jhancock@cumin2002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp2005.codfw.wmnet with OS bookworm	[production]
23:18	<jhancock@cumin2002>	END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"	[production]
23:18	<jhancock@cumin2002>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"	[production]
23:17	<jhancock@cumin2002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp2004.codfw.wmnet with OS bookworm	[production]
23:17	<jhancock@cumin2002>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"	[production]
23:15	<jhancock@cumin2002>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"	[production]
22:59	<jhancock@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp2005.codfw.wmnet with reason: host reimage	[production]
22:56	<jhancock@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp2004.codfw.wmnet with reason: host reimage	[production]
22:53	<jhancock@cumin2002>	START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp2005.codfw.wmnet with reason: host reimage	[production]
22:53	<jhancock@cumin2002>	START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp2004.codfw.wmnet with reason: host reimage	[production]
22:35	<jhancock@cumin2002>	START - Cookbook sre.hosts.reimage for host mc-gp2006.codfw.wmnet with OS bookworm	[production]