Changing the script so reconnections to the master also work following a failover #2

laudares · 2021-05-24T22:46:40Z

This changes the script so that a connection directed to the master reconnects upon failure (due to failover/switchover). The problem, as I wrote in a comment in the code, is that 'a recently promoted/demoted server may still be transitioning and initially reply with the previous role', and that holds in both directions. For example, note how it switches to "read mode" just after failing over:

$ ./HAtester.py 5000
 Working with:   MASTER - 192.168.1.13
     Inserted: 2021-05-24 22:09:43.613525

 Working with:   MASTER - 192.168.1.13
     Inserted: 2021-05-24 22:09:44.624923
(...)
Trying to connect
Unable to connect to database :
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
(...)
Trying to connect
 Working with:    REPLICA - 192.168.1.13
     Retrieved: 2021-05-24 22:10:01.804913

 Working with:    REPLICA - 192.168.1.13
     Retrieved: 2021-05-24 22:10:01.804913

 Working with:    REPLICA - 192.168.1.13
     Retrieved: 2021-05-24 22:10:01.804913

Trying to connect
 Working with:   MASTER - 192.168.1.11
     Inserted: 2021-05-24 22:10:10.888396

It is more critical when the script is doing reads and the replica it is connected to gets promoted to master, because the script then does a write (!):

$ ./HAtester.py 5001
 Working with:    REPLICA - 192.168.1.11
     Retrieved: 2021-05-24 22:20:50.672920

 Working with:    REPLICA - 192.168.1.11
     Retrieved: 2021-05-24 22:20:50.672920
(...)
Working with:    REPLICA - 192.168.1.11
     Retrieved: 2021-05-24 22:20:50.672920

 Working with:   MASTER - 192.168.1.11
Trying to connect
 Working with:    REPLICA - 192.168.1.12
     Retrieved: 2021-05-24 22:20:50.672920

 Working with:    REPLICA - 192.168.1.12
     Retrieved: 2021-05-24 22:20:50.672920

Trying to connect
 Working with:   MASTER - 192.168.1.11
     Inserted: 2021-05-24 22:21:53.403129

 Working with:   MASTER - 192.168.1.11
     Inserted: 2021-05-24 22:21:54.414794

 Working with:   MASTER - 192.168.1.11
     Inserted: 2021-05-24 22:21:55.426355

 Working with:   MASTER - 192.168.1.11
     Inserted: 2021-05-24 22:21:56.437823

 Working with:   MASTER - 192.168.1.11
     Inserted: 2021-05-24 22:21:57.447859

 Working with:   MASTER - 192.168.1.11
     Inserted: 2021-05-24 22:21:58.458605

 Working with:   MASTER - 192.168.1.11
     Inserted: 2021-05-24 22:21:59.468775

Trying to connect
 Working with:    REPLICA - 192.168.1.12
     Retrieved: 2021-05-24 22:21:59.468775
(...)

I'm sending the pull request, but I'm not 100% confident this is the right approach.
Alternatively, we could add a "sleep" of a few seconds to avoid this, unless there is a different/transparent way to prevent writes on a read-only connection (to port 5001).
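
One possible alternative to a fixed "sleep", sketched below under some assumptions: after each (re)connection, ask the server for its current role and only keep the connection if the role matches what that port is supposed to serve, otherwise close it and retry after a short delay. This is not what the pull request implements. The function name `connect_with_expected_role` and its parameters are hypothetical, and the sketch assumes the backend is PostgreSQL (so that `pg_is_in_recovery()` is available) and that `connect` returns a DB-API style connection.

```python
import time


def connect_with_expected_role(connect, expect_replica, retries=5, delay=2.0,
                               sleep=time.sleep):
    """Return a connection whose role matches the expectation, or raise.

    connect        -- zero-argument callable returning a DB-API connection
                      (hypothetical; e.g. a wrapper around the driver's connect)
    expect_replica -- True for the read-only port, False for the write port
    retries        -- how many connection attempts to make
    delay          -- seconds to wait between attempts
    sleep          -- injectable so the wait can be skipped in tests
    """
    for _ in range(retries):
        conn = connect()
        cur = conn.cursor()
        # pg_is_in_recovery() is true on a standby and false on a primary.
        cur.execute("SELECT pg_is_in_recovery()")
        in_recovery = cur.fetchone()[0]
        cur.close()
        if in_recovery == expect_replica:
            return conn
        # The server may still be transitioning and reporting its previous
        # role: drop this connection and try again after a pause.
        conn.close()
        sleep(delay)
    raise RuntimeError("server role did not match after %d attempts" % retries)
```

For the read-only port this could be called as `connect_with_expected_role(make_conn, expect_replica=True)`, where `make_conn` is a hypothetical helper that opens a connection to port 5001. Note that this only validates the role at connection time; a replica promoted while the connection is already open would still need a check before each write.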
