| 
      
        2021-03-10
      
     | 
  
    
  | 17:09 | 
  <Majavah> | 
  set beta cluster mediawiki as read write on mw config (T276968) | 
  [releng] | 
            
  | 17:03 | 
  <Majavah> | 
  make deployment-db06 read-write T276968 | 
  [releng] | 
            
  | 16:50 | 
  <Majavah> | 
  `reset slave;` on new master deployment-db06 T276968 | 
  [releng] | 
            
  | 16:49 | 
  <Majavah> | 
  add deployment-db07 as a replica of db06 for T276968 | 
  [releng] | 
            
  | 16:45 | 
  <Urbanecm> | 
  root@deployment-db07:/opt/wmf-mariadb104/bin# ./mysql_upgrade -h 127.0.0.1 # T276968 | 
  [releng] | 
            
  | 16:12 | 
  <Majavah> | 
  deployment-db08 CHANGE MASTER to MASTER_USER='repl', MASTER_PASSWORD='redacted', MASTER_PORT=3306, MASTER_HOST='deployment-db06.deployment-prep.eqiad1.wikimedia.cloud', MASTER_LOG_FILE='deployment-db06-bin.000059', MASTER_LOG_POS=522469730; (T276968) | 
  [releng] | 
            
  | 16:06 | 
  <Urbanecm> | 
  start root@deployment-db07:/srv/sqldata.db06# rsync --progress -r deployment-db06:/srv/sqldata/ . (T276968) | 
  [releng] | 
            
  | 15:57 | 
  <Majavah> | 
  set deployment-db06 as readonly from mysql side T276968 | 
  [releng] | 
            
  | 15:54 | 
  <Urbanecm> | 
  Start `root@deployment-db08:/opt/wmf-mariadb104/bin# ./mysql_upgrade -h 127.0.0.1` (T276968) | 
  [releng] | 
            
  | 15:54 | 
  <Urbanecm> | 
  Start mariadb on db08 (T276968) | 
  [releng] | 
            
  | 15:22 | 
  <Urbanecm> | 
  rsync deployment-db06:/srv/sqldata to deployment-db08:/srv/sqldata in a tmux session on deployment-db08 (T276968) | 
  [releng] | 
            
  | 14:52 | 
  <Majavah> | 
  delete deployment-db08 /srv/sqldata to attempt procedure in https://phabricator.wikimedia.org/T276968#6900199 | 
  [releng] | 
            
  | 10:16 | 
  <arturo> | 
  briefly stopping deployment-puppetdb03 to disable VMX CPU flag | 
  [releng] | 
            
  | 00:28 | 
  <marxarelli> | 
  mariadb successfully started on db07 following transfer/extraction using mariabackup and following mysql_upgrade (T276968) | 
  [releng] | 
            
  | 00:10 | 
  <marxarelli> | 
  restore of db06 failed yet again. trying mariabackup db06 -> db07 instead of mysqldump (after fixing docs/usage of the former) (T276968) | 
  [releng] | 
            
  
    | 
      
        2021-03-09
      
     | 
  
    
  | 21:54 | 
  <marxarelli> | 
  restoring from db06 dump on db07 and db08 following `DROP VIEW IF EXISTS user` workaround (T276968) | 
  [releng] | 
            
  | 20:53 | 
  <marxarelli> | 
  restore on db07 failed. appears to be a bug w/ mariadb/mysqldump 10.4 compat https://jira.mariadb.org/browse/MDEV-22127 (T276968) | 
  [releng] | 
            
  | 20:39 | 
  <marxarelli> | 
  doing `--skip-grant-tables` on deployment-db08 and creating a new root@127.0.0.1 user (T276968) | 
  [releng] | 
            
  | 20:33 | 
  <Majavah> | 
  install mariadb on deployment-db08 T276968 | 
  [releng] | 
            
  | 19:59 | 
  <marxarelli> | 
  creating new instance deployment-db08 to use as new beta replica db (T276968) | 
  [releng] | 
            
  | 19:56 | 
  <marxarelli> | 
  deleting deployment-db05 to free up quota for new replica (T276968) | 
  [releng] | 
            
  | 19:50 | 
  <marxarelli> | 
  restoring database dump on deployment-db07 (T276968) | 
  [releng] | 
            
  | 18:49 | 
  <marxarelli> | 
  restarting db dump on db06 `mysqldump -h 127.0.0.1 --events --routines --triggers --all-databases -f --single-transaction` (T276968) | 
  [releng] | 
            
  | 18:38 | 
  <Majavah> | 
  installing mariadb 10.4 via role::mariadb::beta to db07 T276968 | 
  [releng] | 
            
  | 18:25 | 
  <marxarelli> | 
  "View 'labswiki.tag_summary' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them when using LOCK TABLES" during mysqldump on db06 (T276968) | 
  [releng] | 
            
  | 18:21 | 
  <Majavah> | 
  create deployment-db07 as g2.cores8.ram16.disk160 Buster T276968 | 
  [releng] | 
            
  | 18:20 | 
  <marxarelli> | 
  disabled puppet on deployment-db06 and started mysqldump (T276968) | 
  [releng] | 
            
  | 18:09 | 
  <Majavah> | 
  set deployment-db05 to read-only to avoid issues with T276968 | 
  [releng] | 
            
  | 18:04 | 
  <marxarelli> | 
  deleting shut down memc* deployment-prep instances to free up quota for replacement db instances (T276968) | 
  [releng] | 
            
  | 17:25 | 
  <marxarelli> | 
  seeing "[ 2886.337845] EXT4-fs error (device vda3): ext4_validate_block_bitmap:" for deployment-db05 | 
  [releng] | 
            
  | 17:22 | 
  <marxarelli> | 
  restarting deployment-db05 via horizon | 
  [releng] | 
            
  | 17:22 | 
  <marxarelli> | 
  deployment-db05 seems to be acting up (intermittent connection failures) which is causing issues with beta-update-databases-eqiad, which is (possibly) causing post-merge jobs to pile up | 
  [releng] | 
            
  | 16:47 | 
  <marxarelli> | 
  still seeing "JobOffer[deployment-deploy01 #3] rejected beta-scap-eqiad: Waiting for next available executor on ‘deployment-deploy01’" despite available executors | 
  [releng] | 
            
  | 16:26 | 
  <marxarelli> | 
  builds once again being scheduled on deployment-deploy01 | 
  [releng] | 
            
  | 16:24 | 
  <marxarelli> | 
  cycling gearman plugin on integration.wikimedia.org | 
  [releng] | 
            
  | 16:16 | 
  <marxarelli> | 
  taking deployment-deploy01 agent offline to mitigate stuck post-merge jobs | 
  [releng] | 
            
  | 13:32 | 
  <arturo> | 
  hard-reboot deployment-db05 because of issues related to T276922 | 
  [releng] | 
            
  | 12:34 | 
  <arturo> | 
  briefly rebooting VM deployment-db05: we need to reboot its hypervisor cloudvirt1038 and the VM failed to migrate to another one | 
  [releng] | 
            
  
    | 
      
        2021-03-07
      
     | 
  
    
  | 17:46 | 
  <James_F> | 
  Deleting deployment-snapshot01, shut off since 2020-10-03. | 
  [releng] | 
            
  | 17:43 | 
  <James_F> | 
  Deleting deployment-cumin02, shut off since 2020-10-16. | 
  [releng] | 
            
  | 17:18 | 
  <Majavah> | 
  shutdown deployment-memc[04-05] T276707 | 
  [releng] | 
            
  | 16:51 | 
  <Majavah> | 
  cherry pick 669436 and 669436 to deployment-puppetmaster04 T276707 | 
  [releng] | 
            
  | 15:52 | 
  <Majavah> | 
  redis::shards change shard01 from deployment-memc04 to deployment-memc08, shard02 from deployment-memc05 to deployment-memc10 T276707 | 
  [releng] | 
            
  | 15:44 | 
  <Majavah> | 
  create deployment-memc10 on Buster T276707, beta cluster is almost at full quota but that will improve once the old shut-down Jessie instances are deleted | 
  [releng] | 
            
  | 15:28 | 
  <Majavah> | 
  remove shard04 (deployment-memc07) from redis::shards, switch shard03 from deployment-memc06 to deployment-memc09; [06-07] are both already shut down and 09 is a new Buster machine being set up to replace it, T276707 T250585 | 
  [releng] | 
            
  | 13:14 | 
  <Majavah> | 
  create deployment-memc09 on Buster T276707 | 
  [releng] |
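
Note on the 2021-03-10 16:12 entry: the `CHANGE MASTER` statement logged there could be assembled from its parameters roughly as follows. This is only a sketch; the helper name `build_change_master` and its argument names are hypothetical and are not part of any tooling mentioned in this log. The password is left as the redacted placeholder.

```python
def build_change_master(host, log_file, log_pos,
                        user="repl", password="redacted", port=3306):
    """Return a MariaDB CHANGE MASTER statement for a new replica.

    Hypothetical helper for illustration only; the values are taken
    from the log entry, the function itself is an assumption.
    """
    if not isinstance(log_pos, int) or log_pos < 0:
        raise ValueError("log_pos must be a non-negative integer")
    if not isinstance(port, int):
        raise ValueError("port must be an integer")

    def quote(value):
        # Escape backslashes and single quotes for a SQL string literal.
        text = str(value).replace("\\", "\\\\").replace("'", "\\'")
        return "'" + text + "'"

    return (
        "CHANGE MASTER TO "
        f"MASTER_USER={quote(user)}, "
        f"MASTER_PASSWORD={quote(password)}, "
        f"MASTER_PORT={port}, "
        f"MASTER_HOST={quote(host)}, "
        f"MASTER_LOG_FILE={quote(log_file)}, "
        f"MASTER_LOG_POS={log_pos};"
    )


# Values as logged for deployment-db08 replicating from deployment-db06.
statement = build_change_master(
    host="deployment-db06.deployment-prep.eqiad1.wikimedia.cloud",
    log_file="deployment-db06-bin.000059",
    log_pos=522469730,
)
print(statement)
```

The binlog file and position have to be the coordinates at which the copied data was taken, which is presumably why db06 was set read-only (15:57) before the rsync runs.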