2025-09-15
09:52 <elukey@deploy1003> helmfile [codfw] START helmfile.d/services/linkrecommendation: sync [production]
09:47 <btullis@cumin1003> END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-jumbo-eqiad [production]
09:41 <elukey@deploy1003> helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync [production]
09:40 <elukey@deploy1003> helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync [production]
09:37 <elukey@deploy1003> helmfile [codfw] DONE helmfile.d/services/changeprop: sync [production]
09:37 <elukey@deploy1003> helmfile [codfw] START helmfile.d/services/changeprop: sync [production]
09:34 <elukey@deploy1003> helmfile [staging] DONE helmfile.d/services/wikifeeds: sync [production]
09:34 <elukey@deploy1003> helmfile [staging] START helmfile.d/services/wikifeeds: sync [production]
09:34 <elukey@deploy1003> helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: sync [production]
09:33 <elukey@deploy1003> helmfile [staging] START helmfile.d/services/changeprop-jobqueue: sync [production]
09:33 <ladsgroup@deploy1003> Started scap sync-world: Backport: [[gerrit:1188285|Reduce db lock timeout in LinksUpdate and CategoryMembershipChangeJob]] (T366938) [production]
09:33 <elukey@deploy1003> helmfile [staging] DONE helmfile.d/services/changeprop: sync [production]
09:33 <elukey@deploy1003> helmfile [staging] START helmfile.d/services/changeprop: sync [production]
09:32 <elukey@deploy1003> helmfile [staging] DONE helmfile.d/services/mobileapps: sync [production]
09:31 <elukey@deploy1003> helmfile [staging] START helmfile.d/services/mobileapps: sync [production]
09:30 <elukey@deploy1003> helmfile [staging] DONE helmfile.d/services/termbox: sync [production]
09:30 <stevemunene@cumin1003> END (PASS) - Cookbook sre.k8s.wipe-cluster (exit_code=0) Wipe the K8s cluster dse-codfw: Cleanup the dse-k8s-codfw cluster [production]
09:30 <effie> stopping puppet on A:lvs-low-traffic-eqiad and A:lvs-low-traffic-codfw [production]
09:30 <elukey@deploy1003> helmfile [staging] START helmfile.d/services/termbox: sync [production]
09:28 <stevemunene@deploy1003> helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'sync'. [production]
09:25 <stevemunene@deploy1003> helmfile [dse-k8s-codfw] START helmfile.d/admin 'sync'. [production]
09:24 <hashar> integration: added back to Jenkins integration-agent-docker-1043 [releng]
09:20 <elukey@deploy1003> helmfile [staging] DONE helmfile.d/services/linkrecommendation: sync [production]
09:20 <elukey@deploy1003> helmfile [staging] START helmfile.d/services/linkrecommendation: sync [production]
09:20 <hashar> integration: stopped and started integration-agent-docker-1043, which has been disconnected from Jenkins since April 2nd [releng]
09:20 <stevemunene@deploy1003> helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'. [production]
09:19 <stevemunene@deploy1003> helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'. [production]
09:13 <stevemunene@cumin1003> START - Cookbook sre.k8s.wipe-cluster Wipe the K8s cluster dse-codfw: Cleanup the dse-k8s-codfw cluster [production]
08:54 <btullis@cumin1003> START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-jumbo-eqiad [production]
08:15 <stevemunene@deploy1003> helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'. [production]
08:14 <stevemunene@deploy1003> helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'. [production]
08:07 <stevemunene@deploy1003> helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'. [production]
08:04 <stevemunene@deploy1003> helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'. [production]
07:44 <slyngshede@cumin1003> DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Jly out of all services on: 2418 hosts [production]
07:39 <jmm@cumin2002> DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Jly out of all services on: 2418 hosts [production]
07:20 <volans> restarted the gerrit job, was not reporting updates since last friday [tools.wikibugs]
07:18 <jmm@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on maps1011.eqiad.wmnet with reason: in setup [production]
07:17 <jmm@cumin2002> DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging jwheeler out of all services on: 2418 hosts [production]
06:56 <moritzm> reindex gis database on maps1011 following initial OSM import T381565 [production]
06:55 <jynus> restarted atftpd on install1004 [production]
2025-09-13
02:25 <dani@deploy1003> helmfile [codfw] DONE helmfile.d/services/miscweb: apply [production]
02:25 <dani@deploy1003> helmfile [codfw] START helmfile.d/services/miscweb: apply [production]
02:25 <dani@deploy1003> helmfile [eqiad] DONE helmfile.d/services/miscweb: apply [production]
02:25 <dani@deploy1003> helmfile [eqiad] START helmfile.d/services/miscweb: apply [production]
02:25 <dani@deploy1003> helmfile [staging] DONE helmfile.d/services/miscweb: apply [production]
02:24 <dani@deploy1003> helmfile [staging] START helmfile.d/services/miscweb: apply [production]
2025-09-12
21:25 <dzahn@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: in setup [production]
21:25 <dzahn@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on zuul2002.codfw.wmnet with reason: in setup [production]
20:20 <ladsgroup@cumin1003> END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) db1160* gradually with 4 steps - Work done [production]
20:05 <ladsgroup@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance [production]