2019-12-16 §
10:04 <joal> Killing user-app eating all cluster (application_1573208467349_190044) [analytics]
09:05 <joal> Rerun webrequest-load-wf-text-2019-12-14-18 with updated error-checking parameters (all false positives) [analytics]
08:49 <elukey> re-run webrequest-load 2019-12-14-13 and 2019-12-15-12 with higher mapreduce limits (modified version of refinery on hdfs /user/elukey with https://gerrit.wikimedia.org/r/#/c/analytics/refinery/+/557794/) [analytics]
07:22 <elukey> stop camus timers as prep step for maintenance (if we'll do it) [analytics]
2019-12-13 §
07:42 <elukey> execute reset-failed for monitor_refine_mediawiki_job_events on an-coord1001 [analytics]
2019-12-12 §
18:46 <elukey> rsync timers deployed on labstore100[6,7] [analytics]
15:23 <elukey> execute systemctl reset-failed monitor_refine_mediawiki_job_events after Andrew's comment on alerts@ [analytics]
12:59 <elukey> roll restart hadoop workers to pick up the new settings (removed prefer ipv4 false after T240255) [analytics]
12:40 <elukey> enable timers on an-coord1001 after maintenance [analytics]
12:39 <elukey> restart hive and oozie on an-coord1001 to pick up ipv6 settings [analytics]
11:14 <elukey> stop timers on an-coord1001 as prep step for hive/oozie restart [analytics]
2019-12-11 §
07:07 <elukey> kill/re-run pageview 2019-12-10-17, stuck in whitelist check for hours (https://hue.wikimedia.org/jobbrowser/jobs/job_1573208467349_171800 for more info) [analytics]
2019-12-10 §
14:34 <elukey> shutdown of stat1004 to check if it can hold a GPU [analytics]
14:08 <jbond42> rolling restart of varnishkafka-webrequest and varnishkafka-eventlogging [analytics]
2019-12-05 §
09:35 <elukey> enable timers on an-coord1001 after maintenance [analytics]
09:34 <elukey> stop oozie/hive-*; restart mariadb; restart oozie/hive-* on an-coord1001 to pick up explicit_defaults_for_timestamp - T236180 [analytics]
09:06 <elukey> temporarily stop timers on an-coord1001 to ease the restart of mariadb on an-coord1001 [analytics]
2019-12-04 §
20:57 <milimetric> finished refinery-deploy-to-hdfs from stat1004 but something's broken on stat1007 in the /srv/deployment/analytics/refinery repo [analytics]
20:08 <milimetric> deployed refinery source [analytics]
11:36 <elukey> restart mariadb on analytics1030 (hadoop test coordinator) to test explicit_defaults_for_timestamp - T236180 [analytics]
2019-12-03 §
09:48 <joal> Kill and restart mediawiki-history-load-coord after sqoop re-import of missing tables [analytics]
2019-12-02 §
20:27 <joal> restart cassandra bundle [analytics]
20:17 <joal> Deploying refinery to hdfs - Last for today! [analytics]
20:00 <joal> Deploy refinery using scap to fix today's deploy (last) [analytics]
19:20 <joal> Manually kill cassandra-coord-mediarequest-per-referer-hourly from bundle as it shouldn't exist [analytics]
19:07 <joal> restart cassandra bundle after redeployed patch [analytics]
18:40 <joal> Deploy refinery onto hdfs [analytics]
18:39 <joal> Deploy refinery using scap for fixes [analytics]
16:43 <joal> Restarting cassandra bundle after deploy [analytics]
11:40 <joal> Restart mediawiki-geoeditors-monthly-coord [analytics]
11:39 <joal> Drop wmf.geoeditors_daily table and create wmf.editors_daily, moving underlying data and recreating partitions [analytics]
11:35 <joal> Kill mediawiki-geoeditors-monthly-coord before updating the job [analytics]
10:30 <joal> Manually sqoop tables not yet done because of late deploy (content_models, content, slots, slot_roles, wbt_entity_usage) [analytics]
10:21 <joal> Create new tables for newly sqooped data in hive wmf_raw database [analytics]
09:43 <joal> Deploying refinery onto HDFS [analytics]
09:22 <joal> Deploy refinery using scap [analytics]
2019-11-27 §
09:16 <elukey> apply systemd user limits to stat1005 [analytics]
07:10 <elukey> apply systemd user limits to stat1006,stat1007 and notebook100* [analytics]
2019-11-26 §
17:19 <elukey> add systemd user limits to stat1004 [analytics]
2019-11-25 §
13:27 <elukey> set global read_only=1 on db1108's log database [analytics]
2019-11-21 §
20:07 <mforns> deploying refinery to add pageview whitelist changes and stop alerts [analytics]
15:50 <mforns> deployed refinery (with v0.0.107) [analytics]
15:10 <mforns> deployed refinery-source v0.0.107 [analytics]
06:59 <elukey> restart hdfs-cleaner on an-coord1001 [analytics]
2019-11-19 §
19:00 <elukey> regenerate TLS cert for yarn.wikimedia.org (containing SANs for all analytics UIs) to add datasets.w.o SAN (site was failing due to ATS not being able to contact thorium) [analytics]
13:54 <joal> Deleting 600 more log-folders from analytics user (cassandra backfilling logs) -- T238648 [analytics]
13:46 <joal> Deleting old parquet wikitext data (new data is stored in Avro) -- T238648 [analytics]
13:46 <joal> Deleting 100 heavier log-folders from analytics user (cassandra backfilling logs) -- T238648 [analytics]
07:51 <elukey> restart hdfs-cleaner on an-coord1001 [analytics]
2019-11-18 §
20:03 <joal> Rerun failed mediawiki_wikitext_history oozie job (2019-10) [analytics]