analytics SAL

2023-11-17 §
14:58	<mforns>	marked several failed tasks of datahub_ingestion DAG in Airflow, because the issues were fixed, added notes to the DAG itself	[analytics]
12:55	<joal>	Rerun Airflow metadata_ingest_daily datahub job	[analytics]
2023-11-16 §
14:45	<btullis>	rolling out 974993: Add spark.sql.warehouse.dir to spark3 defaults | https://gerrit.wikimedia.org/r/c/operations/puppet/+/974993 for T349523	[analytics]
13:22	<sergi0>	stat1008: Add `sowiki`, `stwiki`, `tgwiki` and `ugwiki` to `/srv/published/datasets/one-off/research-mwaddlink/wikis.txt` (T340944)	[analytics]
2023-11-15 §
20:44	<xcollazo>	Ran 'sudo -u analytics hdfs dfs -rm -r -skipTrash /user/hive/warehouse/wmf_dumps.db/wikitext_raw_rc1' to delete HDFS data of old release candidate table	[analytics]
20:43	<xcollazo>	Ran 'sudo -u analytics hdfs dfs -rm -r -skipTrash /wmf/data/wmf_dumps/wikitext_raw_rc0' to delete HDFS data of old release candidate table	[analytics]
20:42	<xcollazo>	Ran 'DROP TABLE wmf_dumps.wikitext_raw_rc0' and 'DROP TABLE wmf_dumps.wikitext_raw_rc1' to delete older release candidate tables.	[analytics]
14:51	<ottomata>	deployed refine using refinery-job 0.2.26 JsonSchemaConverter from wikimedia-event-utilities - https://phabricator.wikimedia.org/T321854	[analytics]
14:33	<joal>	Deploy refinery onto HDFS (unique-devices hotfix)	[analytics]
13:44	<joal>	Deploying refinery for unique-devices hotfix	[analytics]
11:22	<btullis>	exiting safe mode	[analytics]
11:06	<btullis>	merged all config files changes replacing an-coord1001 with an-mariadb1001	[analytics]
11:04	<btullis>	position confirmed, resetting all slaves on an-mariadb1001 for T284150	[analytics]
11:02	<btullis>	set an-coord1001 mysql to read_only	[analytics]
11:01	<btullis>	entering HDFS safe mode	[analytics]
11:01	<btullis>	proceeding with the implementation plan here: https://phabricator.wikimedia.org/T284150#9330525	[analytics]
10:43	<btullis>	temporarily disabled production jobs that write to HDFS	[analytics]
2023-11-14 §
20:35	<sfaci>	recreated unique_devices iceberg tables	[analytics]
20:35	<sfaci>	restarted Druid supervisors	[analytics]
19:55	<sfaci>	Deployed refinery using scap, then deployed onto hdfs	[analytics]
19:24	<sfaci>	Deploying refinery using scap	[analytics]
14:50	<btullis>	roll-restarting the presto cluster to pick up new puppet 7 CA settings	[analytics]
14:28	<btullis>	performing a rolling restart of the mariadb services on dbstore100[3,5,7] post this patch: https://gerrit.wikimedia.org/r/c/operations/puppet/+/968668	[analytics]
11:03	<stevemunene>	depool druid100[4-6] set pooled=inactive	[analytics]
2023-11-13 §
23:34	<btullis>	rebooting clouddb1021 to pick up new kernel and puppet 7 CA.	[analytics]
21:28	<btullis>	deploying updated datahub containers for T348647	[analytics]
21:27	<btullis>	reloading haproxy on dbproxy1018 post maintenance	[analytics]
17:07	<ottomata>	deploying refinery with refinery source 0.2.25 jars and using 0.2.25 for refine job - T321854	[analytics]
13:57	<btullis>	reloaded haproxy on dbproxy1018 to depool the analytics wikireplicas cluster	[analytics]
12:31	<btullis>	repooled clouddb10[13-16] post maintenance.	[analytics]
11:08	<btullis>	rebooting clouddb1013 to pick up new kernel and SSL CA settings	[analytics]
10:49	<btullis>	systemctl reload haproxy on dbproxy1019 to depool the web wikireplica cluster	[analytics]
2023-11-09 §
14:43	<btullis>	pooled druid10[09-11] in the druid-public cluster.	[analytics]
12:29	<btullis>	Proceeding to roll-restart yarn nodemanagers with `sudo cumin A:hadoop-worker -b 1 -s 30 'systemctl restart hadoop-yarn-nodemanager.service'` for T344910	[analytics]
11:47	<btullis>	restarting yarn-nodemanager service on an-worker1100.eqiad.wmnet as a canary for T344910	[analytics]
11:14	<btullis>	deploying multiple spark shufflers to production for T344910	[analytics]
09:53	<btullis>	executed `helmfile -e eqiad --state-values-set roll_restart=1 sync` to roll-restart datahub in eqiad	[analytics]
09:43	<btullis>	executed `helmfile -e codfw --state-values-set roll_restart=1 sync` to roll-restart datahub in codfw	[analytics]
2023-11-08 §
15:52	<stevemunene>	Add analytics-wmde service user to the Yarn production queue T340648	[analytics]
13:55	<btullis>	beginning rolling restart of all hadoop workers in production, to pick up new puppet 7 CA settings.	[analytics]
10:33	<btullis>	restarting hadoop-hdfs-datanode.service and hadoop-yarn-nodemanager.service on an-worker1111 to pick up puppet7 changes.	[analytics]
10:27	<brouberol>	running scap deploy for airflow-dags/analytics	[analytics]
2023-11-07 §
20:48	<xcollazo>	Ran 'kerberos-run-command hdfs hdfs dfs -chmod -R g+w /wmf/data/wmf_dumps/wikitext_raw_rc2' to ease experimentation on this release candidate table.	[analytics]
15:52	<btullis>	restart airflow-scheduler and airflow-webserver services on an-test-client1002	[analytics]
15:50	<btullis>	restart mariadb service on an-test-coord1001	[analytics]
15:50	<btullis>	restart mariadb service on an-test-coord100	[analytics]
15:49	<btullis>	restart presto-server service on an-test-coord1001 and an-test-presto1001 to pick up new puppet 7 CA settings	[analytics]
15:48	<btullis>	restart hive-server2 and hive-metastore services on an-test-coord1001 to pick up new puppet 7 CA settings.	[analytics]
15:35	<btullis>	roll-restarting hadoop workers in test, to test new puppet 7 CA settings.	[analytics]
14:52	<btullis>	roll-restarting hadoop masters on the test cluster, after upgrading to puppet 7	[analytics]