2022-05-12 §
14:49 <razzi> undo the 2 previous confctl changes to repool dbproxy1019 to wikireplicas-b only [analytics]
14:35 <razzi> razzi@cumin1001:~$ sudo confctl select service=wikireplicas-a,name=dbproxy1019.eqiad.wmnet set/pooled=yes # for T298940 [analytics]
2022-05-11 §
18:20 <razzi> disregard the above log; wrote out the command but then saw there was a warning for cr2-eqiad [analytics]
18:15 <razzi> razzi@lvs1019:~$ systemctl stop pybal.service to apply change https://gerrit.wikimedia.org/r/c/operations/puppet/+/779915 [analytics]
18:06 <razzi> razzi@lvs1020:~$ systemctl stop pybal.service to apply change https://gerrit.wikimedia.org/r/c/operations/puppet/+/779915 [analytics]
13:29 <mforns> restarted oozie jobs after deployment: mediarequest_top_files, pageview_top_articles, unique_devices_per_domain_monthly, unique_devices_per_project_family_monthly [analytics]
2022-05-10 §
20:32 <mforns> finished refinery deploy (regular weekly train) [analytics]
19:34 <mforns> starting refinery deploy (regular weekly train) [analytics]
2022-05-09 §
15:06 <SandraEbele> killed 'apis-coord' oozie job and started corresponding airflow job 'apis_metrics_to_graphite' [analytics]
2022-05-06 §
09:11 <joal> kill cassandra-monthly-wf-local_group_default_T_mediarequest_top_files-2022-4 again [analytics]
08:44 <joal> Rerun cassandra-monthly-wf-local_group_default_T_mediarequest_top_files-2022-4 with SRE watching network [analytics]
08:29 <joal> kill cassandra-monthly-wf-local_group_default_T_mediarequest_top_files-2022-4 as it was probably saturating network [analytics]
2022-05-05 §
18:53 <btullis> restarting airflow-scheduler@platform_eng.service on an-airflow1003 [analytics]
18:53 <btullis> restarted airflow-scheduler@research.service on an-airflow1002 [analytics]
18:49 <btullis> restarting airflow-scheduler@analytics service on an-launcher1002 [analytics]
12:26 <aqu> Regular analytics weekly train [analytics/refinery@cc4b2bd] [analytics]
09:53 <btullis> roll-restarting hadoop masters to pick up new heap size [analytics]
09:16 <btullis> re-enabling gobblin jobs now [analytics]
09:15 <btullis> restarting failed eventlogging_to_druid_ services on an-launcher1002 [analytics]
09:00 <btullis> restarting an-coord1001 [analytics]
08:53 <btullis> stopping oozie on an-coord1001 [analytics]
2022-05-04 §
08:47 <btullis> rebooting an-coord1002 to pick up new kernel [analytics]
2022-05-03 §
18:24 <razzi> remove /etc/apache2/sites-available/50-superset-wikimedia-org.conf from an-tool1005 (superset staging) since it was removed from puppet but has no ensure: absent [analytics]
2022-04-27 §
19:37 <ottomata> restarting airflow services on all airflow instances after installing updated airflow debian package [analytics]
2022-04-26 §
19:02 <aqu> About to deploy analytics/refinery: Weekly deployment train + Artifacts to 0.1.27 [analytics]
12:02 <joal> Rerun cassandra-daily-wf-local_group_default_T_mediarequest_per_file-2022-4-23 [analytics]
2022-04-25 §
20:09 <ottomata> dropping event.ios_notification_interaction hive table and data for backwards incompatible schema change in T290920 [analytics]
11:51 <btullis> failing back hdfs active role to an-master1001 [analytics]
11:49 <btullis> restarted hadoop-yarn-resourcemanager on an-master1002 to force the active role back to an-master1001 [analytics]
11:01 <btullis> rebooting an-master1001 [analytics]
10:25 <btullis> restarting the `check_webrequest_partitions` service on an-launcher1002 [analytics]
09:39 <btullis> failover to an-master1002 successful at 3rd attempt [analytics]
09:30 <btullis> 2nd attempt to switch HDFS services to an-master1002 [analytics]
09:13 <btullis> switching HDFS services to an-master1002 [analytics]
08:53 <btullis> rebooting an-master1002 - T304938 [analytics]
2022-04-23 §
09:38 <elukey> `apt-get clean` on an-airflow1001 to free some space [analytics]
2022-04-21 §
22:26 <mforns> killed browser_general oozie job and started corresponding airflow job [analytics]
2022-04-13 §
16:40 <razzi> reboot an-launcher1002 for security updates [analytics]
2022-04-12 §
22:12 <milimetric> deployed and synced refinery-source 0.1.26 to hdfs [analytics]
2022-04-11 §
12:35 <aqu> About to deploy analytics/refinery "Migrate mediarequest hourly from Oozie to Airflow" (replace previous msg) [analytics]
12:35 <aqu> About to deploy refinery/source "Migrate mediarequest hourly from Oozie to Airflow" [analytics]
2022-04-06 §
20:53 <razzi> roll restart aqs to deploy new mediawiki history snapshot [analytics]
15:51 <mforns> deployed airflow to analytics (big refactor) [analytics]
15:23 <mforns> deployed Airflow to analytics_test (big refactor) [analytics]
09:18 <btullis> restarted eventlogging_to_druid_netflow_hourly on an-launcher1002 [analytics]
2022-04-05 §
20:41 <razzi> deploying refinery for https://gerrit.wikimedia.org/r/c/analytics/refinery/+/776269/ [analytics]
15:54 <razzi> razzi@cumin1001:~$ sudo cookbook sre.hosts.reimage --os bullseye -t T299481 dbstore1005 [analytics]
15:10 <razzi> razzi@cumin1001:~$ sudo cookbook sre.hosts.reimage --os bullseye -t T299481 dbstore1003 [analytics]
15:02 <razzi> set dbstore1003.eqiad.wmnet to downtime for upgrade T299481 [analytics]
15:01 <razzi> set dbstore1003.eqiad.wmnet to downtime for upgrade [analytics]