2016-10-20
|
17:06 |
<elukey> |
created 0000294-161020124223818-oozie-oozi-C to re-run webrequest-load-check_sequence_statistics-wf-upload-2016-10-20-13 (oozie errors) |
[analytics] |
16:17 |
<ottomata> |
restarting eventlogging after rebooting kafka brokers |
[analytics] |
15:43 |
<mforns> |
restarted EventLogging after throughput drop |
[analytics] |
14:20 |
<ottomata> |
starting rolling restart of analytics-eqiad kafka brokers to apply kernel update |
[analytics] |
13:13 |
<elukey> |
re-enabling oozie and camus after cluster reboot |
[analytics] |
10:04 |
<elukey> |
stopped camus on an1027 and all the oozie bundles as prep step for the reboots of analytics* |
[analytics] |
07:49 |
<elukey> |
created 0038058-160922102909979-oozie-oozi-C to re-run webrequest-load-check_sequence_statistics-wf-upload-2016-10-20-5 (oozie errors) |
[analytics] |
07:48 |
<elukey> |
created 0038054-160922102909979-oozie-oozi-C to re-run webrequest-load-check_sequence_statistics-wf-upload-2016-10-20-3 (oozie errors) |
[analytics] |
02:14 |
<andrewbogott> |
restarted druid101 because it was in some kind of kernel lockup |
[analytics] |
2016-10-17
|
16:57 |
<elukey> |
started the oozie coordinator 0034720-160922102909979-oozie-oozi-C to re-execute webrequest-load-wf-upload-2016-10-17-14 |
[analytics] |
16:42 |
<ottomata> |
restarting hadoop nodemanagers 1 at a time |
[analytics] |
15:32 |
<ottomata> |
rebooting analytics1030 |
[analytics] |
08:24 |
<elukey> |
upgraded nodejs on aqs100[56] (already done on aqs1004) |
[analytics] |
07:35 |
<elukey> |
created oozie coordinator 0034240-160922102909979-oozie-oozi-C to restart webrequest-load-check_sequence_statistics-wf-upload-2016-10-17-6 |
[analytics] |
06:18 |
<elukey> |
created oozie coordinator 0034161-160922102909979-oozie-oozi-C to restart webrequest-load-check_sequence_statistics-wf-upload-2016-10-17-4 |
[analytics] |
06:17 |
<elukey> |
created oozie coordinator 0034153-160922102909979-oozie-oozi-C to restart webrequest-load-check_sequence_statistics-wf-upload-2016-10-17-3 |
[analytics] |
06:16 |
<elukey> |
created oozie coordinator 0034149-160922102909979-oozie-oozi-C to restart webrequest-load-check_sequence_statistics-wf-upload-2016-10-17-2 |
[analytics] |
06:15 |
<elukey> |
created oozie coordinator 0034143-160922102909979-oozie-oozi-C to restart webrequest-load-check_sequence_statistics-wf-upload-2016-10-17-1 |
[analytics] |
2016-09-22
|
15:30 |
<elukey> |
analytics1001 is back as Yarn/HDFS master |
[analytics] |
13:16 |
<elukey> |
previous comment was meant to be read as "set a permanent read only = false" |
[analytics] |
13:16 |
<elukey> |
set read_only = false (on startup) for analytics1003's mariadb instance |
[analytics] |
13:12 |
<elukey> |
restarted oozie jobs for 2016-9-22-6 |
[analytics] |
12:50 |
<elukey> |
varnishkafka 1.0.12 installed in cache:upload ulsfo and eqiad |
[analytics] |
11:04 |
<elukey> |
re-enabling oozie and camus after cluster reboots |
[analytics] |
10:57 |
<elukey> |
rebooted analytics1001 |
[analytics] |
10:55 |
<elukey> |
Failover from analytics1001 to analytics1002 as prep step for 1001's reboot |
[analytics] |
10:28 |
<elukey> |
setting global read_only = 0 on the analytics1003 mariadb instance |
[analytics] |
10:04 |
<elukey> |
rebooted analytics1003 (oozie, hive-metastore and hive-server2 daemons affected) |
[analytics] |
09:51 |
<elukey> |
executed aptitude remove apache2 on analytics1027 (we use nginx in front of hue; apache takes port 8888 from hue, which then does not start) |
[analytics] |
09:49 |
<elukey> |
suspended all oozie bundles as prep step to reboot analytics1003 |
[analytics] |
09:39 |
<elukey> |
rebooted analytics1027 |
[analytics] |