2022-09-13 §
15:46 <btullis> restarting eventlogging_to_druid_editattemptstep_hourly.service on an-launcher1002 [analytics]
15:44 <btullis> cancel that last message. Upgrading hadoop packages on an-launcher instead. They were inadvertently omitted last time. [analytics]
15:39 <btullis> Going to downgrade hadoop on all hadoop-worker nodes to 2.10.1 [analytics]
15:21 <btullis> failed over hive to an-coord1002 via DNS https://gerrit.wikimedia.org/r/c/operations/dns/+/831906 [analytics]
15:20 <btullis> restarted yarn service on an-master1002 to make the active host an-master1001 again. [analytics]
15:11 <btullis> restart hive-server2 and hive-metastore service on an-coord1002 to pick up new version of hadoop [analytics]
14:55 <btullis> rolling out updated hadoop packages to analytics-airflow (cumin alias) hosts [analytics]
14:42 <btullis> sudo systemctl restart analytics-reportupdater-logs-rsync.service on an-launcher1002 [analytics]
13:21 <joal> Manual launch of refinery-drop-mediawiki-snapshots with new tables in patch https://gerrit.wikimedia.org/r/831866 [analytics]
10:51 <btullis> attempting failback operation on hadoop namenodes [analytics]
09:42 <btullis> roll-restarting the hadoop masters via the cookbook [analytics]
2022-09-12 §
08:37 <btullis> cold-reset BMC device on analytics1073 [analytics]
2022-09-08 §
17:32 <joal> make ops reboot stat1008 [analytics]
2022-09-07 §
13:36 <joal> rerun failed airflow tasks [analytics]
2022-09-06 §
22:18 <milimetric> restarted webrequest druid daily and hourly jobs [analytics]
22:18 <milimetric> restarted referrer daily coordinator [analytics]
22:18 <milimetric> restarted webrequest load bundle [analytics]
21:57 <milimetric> finished cleaning up bad state and re-deploying refinery [analytics]
21:45 <milimetric> cleared logs earlier than September 1st from an-launcher1002:/srv/airflow-analytics/logs/scheduler [analytics]
18:49 <milimetric> finished refinery-source 0.2.6 deploy, waiting 5 minutes and starting refinery deploy [analytics]
18:28 <milimetric> weekly deployment train starting [analytics]
09:55 <btullis> merged and deployed https://gerrit.wikimedia.org/r/c/operations/puppet/+/821695 [analytics]
2022-09-04 §
12:49 <elukey> pkill remaining processes of user effeietsanders on stat1008 to unblock puppet [analytics]
2022-09-02 §
08:25 <joal> Restart mediawiki_history_denormalize job manually [analytics]
2022-08-30 §
17:49 <joal> Deploying refinery onto HDFS [analytics]
17:11 <joal> deploy refinery using scap [analytics]
17:11 <joal> release refinery-source v0.2.5 to archiva [analytics]
2022-08-29 §
16:44 <mforns> killed mediawiki-history-dumps oozie after migration to airflow [analytics]
08:04 <joal> Rerun refine_eventlogging_legacy failed hours [analytics]
07:54 <joal> rerun pageview-hourly-wf-2022-8-28-15 oozie workflow [analytics]
2022-08-22 §
16:25 <btullis> btullis@an-airflow1004:~$ sudo systemctl reset-failed ifup@ens13.service [analytics]
2022-08-19 §
08:45 <btullis> restarted archiva to pick up new JRE [analytics]
2022-08-18 §
19:57 <ottomata> apply yarn production queue changes to allow analytics-research and analytics-platform-eng users to submit jobs to production queue - T312858 [analytics]
14:04 <btullis> re-running refine_eventlogging_legacy for helppanel [analytics]
09:51 <btullis> restarted monitor-refine-event on an-launcher1002 [analytics]
2022-08-17 §
13:19 <mforns> deployed airflow for https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/117 [analytics]
2022-08-16 §
18:49 <ottomata> complete refinery deploy that was unfinished from last week. an-launcher1002 and hdfs already have this version (6e47e0e712528c8816b7fd7456b8745e4dbc5c72) deployed. [analytics]
16:02 <btullis> deploying airflow-dags [analytics]
2022-08-15 §
19:26 <ottomata> test [analytics]
2022-08-10 §
18:04 <ottomata> Deployed refinery using scap, then deployed onto hdfs [analytics]
17:03 <ottomata> stopping puppet and drop data timers on an-launcher1002 and an-test-coord1001 to deploy drop script changes - T270433 [analytics]
13:42 <btullis> failed hive back to an-coord1001 via DNS change. [analytics]
11:47 <btullis> btullis@an-coord1001:~$ sudo systemctl restart hive-server2.service hive-metastore.service [analytics]
2022-08-08 §
11:43 <btullis> rebooting an-worker1102 due to kernel soft lockups [analytics]
2022-08-05 §
16:05 <milimetric> force scap deploying refinery [analytics]
16:01 <ottomata> removing airflow logs older than 7 days on an-launcher1002 [analytics]
2022-08-04 §
18:31 <ottomata> dropping mediawiki_web_ui_interactions hive tables and data - T314151 [analytics]
18:19 <milimetric> scap deploying refinery host by host after Ben cleaned up the repos with "git checkout master" [analytics]
18:11 <btullis> btullis@deploy1002:/srv/deployment/analytics/refinery$ scap deploy -l stat1008.eqiad.wmnet "Regular analytics weekly train [analytics/refinery@$(git rev-parse --short HEAD)]" [analytics]
18:05 <btullis> we are re-deploying refinery to an-launcher1002 with the command above [analytics]