analytics SAL

2020-01-07 §
15:16	<elukey>	re-enable timers on an-coord1001 after hive restart	[analytics]
15:03	<elukey>	restart hive (server+metastore) on an-coord1001 to apply delegation token settings	[analytics]
14:36	<elukey>	stop timers on an-coord1001 as prep step to restart hive	[analytics]
14:05	<elukey>	apply max cpu cores usage (via systemd cgroups) on stat/notebook	[analytics]
07:59	<elukey>	restart hue (again) with correct principal settings	[analytics]
07:42	<elukey>	restart Hue after applying a new kerberos setting (hue_principal, was not specified before)	[analytics]
2020-01-06 §
16:45	<joal>	Manually sqoop missing tables (content,content_models,slot_roles,slots,wbc_entity_usage)	[analytics]
2020-01-02 §
18:32	<elukey>	restart hue with new hive query limits	[analytics]
2019-12-31 §
16:25	<elukey>	re-run webrequest_load 2019/12/30-15 due to (hopefully) temp hive/kerb issues	[analytics]
2019-12-30 §
13:48	<joal>	rerun webrequest-load-wf-text-2019-12-30-8 with updated error-thresholds	[analytics]
2019-12-23 §
15:01	<fdans>	deploying refinery	[analytics]
08:55	<joal>	Manually killing application_1576512674871_6292 as it's failing	[analytics]
07:54	<elukey>	re-run 2019-12-22-17 again with customized max heap settings (/user/elukey/refinery dir on hdfs)	[analytics]
07:53	<elukey>	re-run 2019-12-22-14 again with customized max heap settings (/user/elukey/refinery dir on hdfs)	[analytics]
00:44	<elukey>	re-run 2019-12-22-17 again with customized max heap settings (/user/elukey/refinery dir on hdfs)	[analytics]
2019-12-22 §
18:31	<elukey>	re-run 2019-12-22-14 again with customized max heap settings (/user/elukey/refinery dir on hdfs)	[analytics]
17:15	<elukey>	re-run webrequest_load 2019-12-22-14	[analytics]
2019-12-21 §
15:54	<elukey>	re-run webrequest_load 2019-12-21T14 (failed due to mapper OOMs)	[analytics]
2019-12-19 §
18:42	<mforns>	deployed refinery (corresponding to source v0.0.109)	[analytics]
17:53	<mforns>	deployed refinery-source v0.0.109	[analytics]
2019-12-17 §
08:22	<elukey>	re-launch netflow realtime supervisor in Druid Analytics	[analytics]
08:10	<joal>	Kill-restart cassandra-daily-coord-local_group_default_T_mediarequest_per_file to fix the refinery-hive-jar path issue	[analytics]
2019-12-16 §
20:21	<joal>	Rerun webrequest-load-wf-text-2019-12-16-13 without dataloss-error threshold after having checked for dataloss (real dataloss, 10^-4 percent)	[analytics]
15:41	<mforns>	finished deploying analytics refinery for kerberos migration	[analytics]
15:20	<mforns>	deploying analytics refinery for kerberos migration	[analytics]
12:56	<joal>	Kill all oozie jobs after having dumped their statuses	[analytics]
12:26	<joal>	Reference for killed backfilling mediarequest-per-file job: https://hue.wikimedia.org/oozie/list_oozie_coordinator/0003296-191212123816836-oozie-oozi-C/	[analytics]
12:23	<joal>	Kill backfilling job for mediarequest-per-file with 2017-07-0[2345] days not done	[analytics]
12:22	<joal>	Rerun cassandra-daily-wf-local_group_default_T_pageviews_per_article_flat-2019-12-15	[analytics]
12:17	<elukey>	kill netflow realtime druid supervisor as prep step for kerberos	[analytics]
11:14	<joal>	Clean spark-shell drivers on cluster before kerberos	[analytics]
10:46	<elukey>	stop airflow-* on an-airflow1001	[analytics]
10:41	<elukey>	stop jupyterhub on notebook100[3,4] as prep step for kerberos	[analytics]
10:38	<elukey>	kill Nuria's spark shell application masters in Yarn	[analytics]
10:17	<elukey>	stop hadoop-related timers on stat1007	[analytics]
10:04	<joal>	Killing user-app eating all cluster (application_1573208467349_190044)	[analytics]
09:05	<joal>	Rerun webrequest-load-wf-text-2019-12-14-18 with updated error-checking parameters (all false positive)	[analytics]
08:49	<elukey>	re-run webrequest-load 2019-12-14-13 and 2019-12-15-12 with higher mapreduce limits (modified version of refinery on hdfs /user/elukey with https://gerrit.wikimedia.org/r/#/c/analytics/refinery/+/557794/)	[analytics]
07:22	<elukey>	stop camus timers as prep step for maintenance (if we'll do it)	[analytics]
2019-12-13 §
07:42	<elukey>	execute reset-failed for monitor_refine_mediawiki_job_events on an-coord1001	[analytics]
2019-12-12 §
18:46	<elukey>	rsync timers deployed on labstore100[6,7]	[analytics]
15:23	<elukey>	execute systemctl reset-failed monitor_refine_mediawiki_job_events after Andrew's comment on alerts@	[analytics]
12:59	<elukey>	roll restart hadoop workers to pick up the new settings (removed prefer ipv4 false after T240255)	[analytics]
12:40	<elukey>	enable timers on an-coord1001 after maintenance	[analytics]
12:39	<elukey>	restart hive and oozie on an-coord1001 to pick up ipv6 settings	[analytics]
11:14	<elukey>	stop timers on an-coord1001 as prep step for hive/oozie restart	[analytics]
2019-12-11 §
07:07	<elukey>	kill/re-run pageview 2019-12-10-17, stuck in whitelist check for hours (https://hue.wikimedia.org/jobbrowser/jobs/job_1573208467349_171800 for more info)	[analytics]
2019-12-10 §
14:34	<elukey>	shutdown of stat1004 to check if it can hold a GPU	[analytics]
14:08	<jbond42>	rolling restart of varnishkafka-webrequest and varnishkafka-eventlogging	[analytics]