gh-ost swap table is hanging cluster #974

@shiwangini93

The gh-ost table swap (cut-over) caused write-set replication issues.

We run bi-directional (master-master) replication for MySQL. I ran the following statement to alter a table:

    /usr/bin/gh-ost --alter="add column status int null;" --database=test --table=noise \
        --user=test_user --password=xxxxxxxx --chunk-size=2000 \
        --max-load=Threads_connected=200 --exact-rowcount --verbose --execute

It copied the data to the ghost table successfully. However, at cut-over time it got stuck at this point:

    2021-05-24 07:15:34 INFO Found atomic RENAME to be blocking, as expected. Double checking the lock is still in place (though I don't strictly have to)
    2021-05-24 07:15:34 INFO Checking session lock: gh-ost.1680148.lock
    2021-05-24 07:15:34 INFO Connection holding lock on original table still exists
    2021-05-24 07:15:34 INFO Will now proceed to drop magic table and unlock tables
    2021-05-24 07:15:34 INFO Dropping magic cut-over table

When I checked for locks at the database level, I did not find any. However, when I ran show processlist; I saw that replication was stuck:

    | 1680148 | shiwangini | 127.0.0.1:58328 | shiwangini_test | Query | 10613 | checking permissions | drop /* gh-ost */ table if exists `shiwangini_test`.`_agents_del` | 0 | 0 |
    | 1680150 | shiwangini | 127.0.0.1:58332 | shiwangini_test | Query | 10614 | wsrep: initiating pre-commit for write set (80151983) | insert /* gh-ost */ into `shiwangini_test`.`_agents_ghc` (id, hint, value) values (NULLIF | 0 | 0 |
    | 1680151 | shiwangini | 127.0.0.1:58334 | shiwangini_test | Query | 10614 | Waiting for table metadata lock | rename /* gh-ost */ table `shiwangini_test`.`agents` to `shiwangini_test`.`_agents_del`, `shiwangini | 0 | 0 |
    | 1680152 | shiwangini | 127.0.0.1:58336 | shiwangini_test | Query | 10613 | wsrep: initiating pre-commit for write set (80151991) | insert /* gh-ost */ into `shiwangini_test`.`_agents_ghc` (id, hint, value) values (NULLIF | 0 | 0 |
    | 1680153 | shiwangini | 127.0.0.1:58338 | shiwangini_test | Query | 10085 | wsrep: waiting to replay write set (-1) | insert /* gh-ost */ into `shiwangini_test`.`_agents_ghc` (id, hint, value) values (NULLIF | 0 | 0 |
    | 1680156 | shiwangini | 127.0.0.1:58340 | shiwangini_test | Query | 10610 | wsrep: initiating pre-commit for write set (80152004) | insert /* gh-ost */ into `shiwangini_test`.`_agents_ghc` (id, hint, value) values (NULLIF | 0 | 0 |
    | 1680158 | shiwangini | 127.0.0.1:58342 | shiwangini_test | Query | 10605 | wsrep: initiating pre-commit for write set (80152009) | insert /* gh-ost */ into `shiwangini_test`.`_agents_ghc` (id, hint, value) values (NULLIF | 0 | 0 |
    | 1680159 | shiwangini | 127.0.0.1:58344 | shiwangini_test | Query | 10600 | wsrep: initiating pre-commit for write set (80152010) | insert /* gh-ost */ into `shiwangini_test`.`_agents_ghc` (id, hint, value) values (NULLIF | 0 | 0 |
    | 1680161 | shiwangini | 127.0.0.1:58348 | shiwangini_test | Query | 10595 | wsrep: initiating pre-commit for write set (80152011) | insert /* gh-ost */ into `shiwangini_test`.`_agents_ghc` (id, hint, value) values (NULLIF | 0 | 0 |
    | 1680162 | shiwangini | 127.0.0.1:58350 | shiwangini_test | Query | 10590 | wsrep: initiating pre-commit for write set (80152012) | insert /* gh-ost */ into `shiwangini_test`.`_agents_ghc` (id, hint, value) values (NULLIF | 0 | 0 |
    | 1680165 | shiwangini | 127.0.0.1:58352 | shiwangini_test | Query | 10585 | wsrep: initiating pre-commit for write set (80152013) | insert /* gh-ost */ into `shiwangini_test`.`_agents_ghc` (id, hint, value) values (NULLIF | 0 | 0 |
    | 1680166 | shiwangini | 127.0.0.1:58354 | shiwangini_test | Query | 10580 | wsrep: initiating pre-commit for write set (80152014) | insert /* gh-ost */ into `shiwangini_test`.`_agents_ghc` (id, hint, value) values (NULLIF | 0 | 0 |
    | 1680167 | shiwangini | 127.0.0.1:58358 | shiwangini_test | Query | 10575 | wsrep: initiating pre-commit for write set (80152015) | insert /* gh-ost */ into `shiwangini_test`.`_agents_ghc` (id, hint, value) values (NULLIF | 0 | 0 |
    | 1680168 | shiwangini | 127.0.0.1:58360 | shiwangini_test | Query | 10570 | wsrep: initiating pre-commit for write set (80152016) | insert /* gh-ost */ into `shiwangini_test`.`_agents_ghc` (id, hint, value) values (NULLIF | 0 | 0 |
    | 1680170 | shiwangini | 127.0.0.1:58362 | shiwangini_test | Query | 10565 | wsrep: initiating pre-commit for write set (80152017) | insert /* gh-ost */ into `shiwangini_test`.`_agents_ghc` (id, hint, value) values (NULLIF | 0 | 0 |
    | 1680171 | shiwangini | 127.0.0.1:58364 | shiwangini_test | Query | 10560 | wsrep: initiating pre-commit for write set (80152018) | insert /* gh-ost */ into `shiwangini_test`.`_agents_ghc` (id, hint, value) values (NULLIF | 0 | 0 |
    | 1680174 | shiwangini | 127.0.0.1:58368 | shiwangini_test | Query | 10535 | wsrep: initiating pre-commit for write set (80152019) | insert /* gh-ost */ into `shiwangini_test`.`_agents_ghc` (id, hint, value) values (NULLIF | 0 | 0 |

I cancelled the gh-ost execution, but even after cancelling it the cluster kept hanging. In the end, we had to restart the whole node.
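
For anyone trying to triage a similar hang, here is a rough sketch (not part of gh-ost) that groups pasted processlist rows by their State column, so it is easier to see how many sessions are stuck in each wsrep or metadata-lock state. The helper names `parse_processlist` and `summarize_states` are made up for this example, and the column order is assumed from the rows quoted above. Rows must be on a single line each, and rows whose Info text contains a `|` will be skipped.

```python
import re
from collections import Counter

# Column order assumed from the rows quoted in this issue.
COLUMNS = ["Id", "User", "Host", "db", "Command", "Time",
           "State", "Info", "Rows_sent", "Rows_examined"]


def parse_processlist(text):
    """Turn pipe-delimited processlist rows into dicts, skipping anything else."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # border lines, blank lines, prose
        cells = [c.strip() for c in line.strip("|").split("|")]
        if len(cells) != len(COLUMNS) or not cells[0].isdigit():
            continue  # header row, or a row with an unexpected number of cells
        rows.append(dict(zip(COLUMNS, cells)))
    return rows


def summarize_states(rows):
    """Count sessions per State, ignoring a trailing write-set number like (80151983)."""
    counts = Counter()
    for row in rows:
        state = re.sub(r"\s*\(-?\d+\)$", "", row["State"])
        counts[state] += 1
    return counts


# Three rows taken from the output above, one per line.
SAMPLE = """
| 1680148 | shiwangini | 127.0.0.1:58328 | shiwangini_test | Query | 10613 | checking permissions | drop /* gh-ost */ table if exists `shiwangini_test`.`_agents_del` | 0 | 0 |
| 1680150 | shiwangini | 127.0.0.1:58332 | shiwangini_test | Query | 10614 | wsrep: initiating pre-commit for write set (80151983) | insert /* gh-ost */ into `shiwangini_test`.`_agents_ghc` (id, hint, value) values (NULLIF | 0 | 0 |
| 1680151 | shiwangini | 127.0.0.1:58334 | shiwangini_test | Query | 10614 | Waiting for table metadata lock | rename /* gh-ost */ table `shiwangini_test`.`agents` to `shiwangini_test`.`_agents_del`, `shiwangini | 0 | 0 |
"""

rows = parse_processlist(SAMPLE)
for state, count in summarize_states(rows).most_common():
    print(f"{count:3d}  {state}")
```

Stripping the write-set number is a deliberate choice here: it lets all the "initiating pre-commit" sessions collapse into one bucket instead of one bucket per write set.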
