What on Earth is a Split-Brain Scenario in a MySQL Database Cluster?

Overview

The Skinny

In this blog post we will define what a split-brain scenario means in a MySQL database cluster, and then explore how a Tungsten MySQL database cluster reacts to a split-brain situation.

Agenda

What's Here?

  • Define the term "split-brain"
  • Briefly explore how the Tungsten Manager works to monitor the cluster health and prevent data corruption in the event of a network partition
  • Also explore how the Tungsten Connector works to route writes
  • Describe how a Tungsten MySQL database cluster reacts to a split-brain situation
  • Illustrate various testing and recovery procedures

Split-Brain: Definition and Impact

Sounds scary, and it is!

A split-brain occurs when a MySQL database cluster that normally has a single write master ends up with two writable masters.

This means that some writes which should go to the "real" master are sent to a different node which was promoted to write master by mistake.

Once that happens, some writes exist on one master and not the other, creating two broken masters. Merging the two data sets is impossible, leading to a full restore, which is clearly NOT desirable.

In short, a split-brain scenario is to be strongly avoided.

A situation like this is most often encountered when there is a network partition of some sort, especially with the nodes spread over multiple availability zones in a single region of a cloud deployment.

Such a partition could leave every node isolated, without a clear majority within the voting quorum.

A poorly-designed cluster could elect more than one master under these conditions, leading to the split-brain scenario.
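To make the "clear majority" idea concrete, here is a minimal sketch of a generic strict-majority check. It is only an illustration of the voting rule described above, not the Tungsten Manager's actual code; the function name and the node views are assumptions made up for this example.

```python
def has_quorum(reachable_members: int, cluster_size: int) -> bool:
    """Return True when a node can see a strict majority of the cluster.

    reachable_members counts the node itself plus every peer it can reach.
    """
    return reachable_members > cluster_size // 2


# Hypothetical 3-node cluster where every node is cut off from the others:
# each node sees only itself, so nobody holds a majority.
isolated_views = {"db1": 1, "db2": 1, "db3": 1}
isolated_quorum = {
    node: has_quorum(seen, cluster_size=3) for node, seen in isolated_views.items()
}

# A 2/1 split: db1 and db2 still see each other, db3 is alone.
split_views = {"db1": 2, "db2": 2, "db3": 1}
split_quorum = {
    node: has_quorum(seen, cluster_size=3) for node, seen in split_views.items()
}

print(isolated_quorum)
print(split_quorum)
```

A cluster that lets a node promote itself without passing a check like this is the kind of design that can end up with two masters.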

Tungsten Manager: A Primer

A Very Brief Summary

The Tungsten Manager health-checks the cluster nodes and the MySQL databases.

The Manager is responsible for initiating various failure states and helping to automate recovery efforts.

Each Manager communicates with the others via a Java JGroups group chat.

Additionally, the Connectors get status information from a chosen Manager.

Tungsten Connector: A Primer

Another Brief Summary

The Tungsten Connector is an intelligent MySQL database proxy located between the clients and the database servers, providing a single connection point, while routing queries to the database servers.

Simply put, the Connector is responsible for routing MySQL queries to the correct node in the cluster.

In the event of a failure, the Tungsten Connector can automatically route queries away from the failed server and towards servers that are still operating.

When the cluster Managers detect a failed master (for example, because the MySQL server port is no longer reachable), the Connectors are signaled and client traffic is re-routed to the newly-elected master node.

Each Connector makes a TCP connection to any available Manager, then all command-and-control traffic uses that channel. The Manager never initiates a connection to the Connector.

When there is a state change (e.g. shun, welcome, or failover), the Manager will communicate it to the Connector over the existing channel.

The Connector will re-establish a channel to an available Manager if the Manager it is connected to is stopped or lost.
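The channel behavior described above can be pictured with a toy model. This is a sketch under stated assumptions: the class, its methods, and the "first reachable Manager" selection rule are all hypothetical and are not part of the Tungsten Connector's real interface.

```python
from typing import Dict, Optional


class ConnectorModel:
    """Toy model of a Connector's control channel; all names are hypothetical."""

    def __init__(self, managers: Dict[str, bool]) -> None:
        # Maps a Manager host name to whether that Manager is reachable.
        self.managers = managers
        self.current: Optional[str] = None

    def connect(self) -> Optional[str]:
        """Open a channel to the first reachable Manager, if there is one."""
        self.current = next(
            (host for host, up in self.managers.items() if up), None
        )
        return self.current

    def on_manager_lost(self, host: str) -> Optional[str]:
        """Record a lost Manager and re-establish the channel if it was ours."""
        self.managers[host] = False
        if self.current == host:
            return self.connect()
        return self.current


connector = ConnectorModel({"db1": True, "db2": True, "db3": True})
first = connector.connect()                 # the Connector dials out
second = connector.on_manager_lost(first)   # its Manager went away, so it reconnects
print(first, second)
```

The point of the model is the direction of the connection: the Connector always initiates, and it only reconnects when the Manager it was using is lost.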

For more detailed information about how the Tungsten Connector works, please read our blog post, "Experience the Power of the Tungsten Connector".

Failsafe-Shun: Safety by Design

Protect the data first and foremost!

Since a network partition would potentially result in all nodes being isolated without a clear majority within the voting quorum, the default action of a Tungsten Cluster is to SHUN all of the nodes.

Shunning ALL of the nodes means that no client traffic is processed by any node: both reads and writes are blocked.

When this happens, it is up to a human administrator to select the proper master and recover the cluster.

The main thing that prevents split-brain in our clustering is that the Connector is always in one of two situations:

  1. it is connected to a Manager that is a member of a quorum, or
  2. it is connected to a Manager that has all resources shunned

In the first case, the Connector is guaranteed to see a single master. In the second case, it cannot connect to anything until the Manager it is connected to is in a quorum.
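The two cases can be sketched as a single routing decision. This is an illustrative model only: the function, the `manager_in_quorum` flag, and the "role:state" string format are assumptions invented for the example, not the Connector's real data structures.

```python
from typing import Dict, Optional


def writable_target(
    manager_in_quorum: bool, datasources: Dict[str, str]
) -> Optional[str]:
    """Return the datasource that may receive writes, or None if blocked.

    datasources maps a host name to a "role:state" string such as
    "master:ONLINE" or "slave:SHUNNED" (a format made up for this sketch).
    """
    if not manager_in_quorum:
        # Case 2: the Manager has shunned everything, so nothing is routable.
        return None
    masters = [host for host, s in datasources.items() if s == "master:ONLINE"]
    # Case 1: a quorum exposes exactly one online master; refuse anything else.
    return masters[0] if len(masters) == 1 else None


healthy = {"db1": "master:ONLINE", "db2": "slave:ONLINE", "db3": "slave:ONLINE"}
shunned = {"db1": "master:SHUNNED", "db2": "slave:SHUNNED", "db3": "slave:SHUNNED"}

print(writable_target(True, healthy))
print(writable_target(False, shunned))
```

Either way, the Connector never has two candidate masters to choose between, which is what rules out split-brain writes.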

Example Failure Testing Procedures

Use this failure test procedure ONLY in a dev/test environment.
Use this procedure AT YOUR OWN RISK!

A failsafe-shun scenario can be forced.

Given a 3-node cluster east, with master db1 and slaves db2/db3, simply stop the manager process on both slaves and wait about 60 seconds:

shell> manager stop
Stopping Tungsten Manager Service...
Stopped Tungsten Manager Service.

The Manager on the master node db1 will restart itself after an appropriate timeout and the entire cluster will then be in FAILSAFE-SHUN status.

Once you have verified that the cluster is in FAILSAFE-SHUN status, start the Managers on both slaves before proceeding with recovery:

shell> manager start
Starting Tungsten Manager Service...
Waiting for Tungsten Manager Service..........
running: PID:21911

Example Recovery Procedures

First, examine the state of the dataservice and choose which datasource is the most up to date or canonical. In the following example (taken from a service named alpha with hosts host1 through host3), each datasource has the same sequence number, so any datasource could potentially be used as the master:

[LOGICAL] /alpha > ls
 
COORDINATOR[host1:AUTOMATIC:ONLINE]
 
ROUTERS:
+----------------------------------------------------------------------------+
|connector@host1[18450](ONLINE, created=0, active=0)                         |
|connector@host2[8877](ONLINE, created=0, active=0)                          |
|connector@host3[8895](ONLINE, created=0, active=0)                          |
+----------------------------------------------------------------------------+
 
DATASOURCES:
+----------------------------------------------------------------------------+
|host1(master:SHUNNED(FAILSAFE AFTER Shunned by fail-safe procedure),        |
|progress=17, THL latency=0.565)                                             |
|STATUS [OK] [2013/11/04 04:39:28 PM GMT]                                    |
+----------------------------------------------------------------------------+
|  MANAGER(state=ONLINE)                                                     |
|  REPLICATOR(role=master, state=ONLINE)                                     |
|  DATASERVER(state=ONLINE)                                                  |
|  CONNECTIONS(created=0, active=0)                                          |
+----------------------------------------------------------------------------+
 
+----------------------------------------------------------------------------+
|host2(slave:SHUNNED(FAILSAFE AFTER Shunned by fail-safe procedure),         |
|progress=17, latency=1.003)                                                 |
|STATUS [OK] [2013/11/04 04:39:51 PM GMT]                                    |
+----------------------------------------------------------------------------+
|  MANAGER(state=ONLINE)                                                     |
|  REPLICATOR(role=slave, master=host1, state=ONLINE)                        |
|  DATASERVER(state=ONLINE)                                                  |
|  CONNECTIONS(created=0, active=0)                                          |
+----------------------------------------------------------------------------+
 
+----------------------------------------------------------------------------+
|host3(slave:SHUNNED(FAILSAFE AFTER Shunned by fail-safe procedure),         |
|progress=17, latency=1.273)                                                 |
|STATUS [OK] [2013/10/26 06:30:26 PM BST]                                    |
+----------------------------------------------------------------------------+
|  MANAGER(state=ONLINE)                                                     |
|  REPLICATOR(role=slave, master=host1, state=ONLINE)                        |
|  DATASERVER(state=ONLINE)                                                  |
|  CONNECTIONS(created=0, active=0)                                          |
+----------------------------------------------------------------------------+
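When there are many datasources, comparing the progress values by eye gets tedious. The following is a rough sketch that pulls the `progress=` sequence number for each datasource out of saved `ls` output and lists the hosts with the highest value. The parsing rules are assumptions based only on the output format shown above, and the helper names are hypothetical; treat the result as a hint, not as a substitute for an administrator's judgment.

```python
import re
from typing import Dict, List, Optional

# Matches datasource header lines such as "|host1(master:SHUNNED(...".
HOST_RE = re.compile(r"^\|([\w.-]+)\((?:master|slave):")
PROGRESS_RE = re.compile(r"progress=(-?\d+)")


def progress_by_host(ls_output: str) -> Dict[str, int]:
    """Map each datasource to the first progress value found after its header."""
    progress: Dict[str, int] = {}
    current: Optional[str] = None
    for raw in ls_output.splitlines():
        line = raw.strip()
        host_match = HOST_RE.match(line)
        if host_match:
            current = host_match.group(1)
        seq_match = PROGRESS_RE.search(line)
        if seq_match and current is not None and current not in progress:
            progress[current] = int(seq_match.group(1))
    return progress


def most_advanced(progress: Dict[str, int]) -> List[str]:
    """Return every host that shares the highest sequence number."""
    if not progress:
        return []
    top = max(progress.values())
    return [host for host, seq in progress.items() if seq == top]


sample = """
|host1(master:SHUNNED(FAILSAFE AFTER Shunned by fail-safe procedure),        |
|progress=17, THL latency=0.565)                                             |
|host2(slave:SHUNNED(FAILSAFE AFTER Shunned by fail-safe procedure),         |
|progress=17, latency=1.003)                                                 |
|host3(slave:SHUNNED(FAILSAFE AFTER Shunned by fail-safe procedure),         |
|progress=17, latency=1.273)                                                 |
"""

print(most_advanced(progress_by_host(sample)))
```

For the sample above, all three hosts tie, which matches the observation that any of them could be used as the master.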

Recover Master Using

Once you have selected the correct host, use cctrl to run the "recover master using" command, specifying the full service name and hostname of the chosen datasource:

[LOGICAL] /alpha > recover master using alpha/host1
 
This command is generally meant to help in the recovery of a data service
that has data sources shunned due to a fail-safe shutdown of the service or
under other circumstances where you wish to force a specific data source to become
the primary. Be forewarned that if you do not exercise care when using this command
you may lose data permanently or otherwise make your data service unusable.
Do you want to continue? (y/n)> y
DATA SERVICE 'alpha' DOES NOT HAVE AN ACTIVE PRIMARY. CAN PROCEED WITH 'RECOVER USING'
VERIFYING THAT WE CAN CONNECT TO DATA SERVER 'host1'
DATA SERVER 'host1' IS NOW AVAILABLE FOR CONNECTIONS
DataSource 'host1' is now OFFLINE
DATASOURCE 'host1@alpha' IS NOW A MASTER
FOUND PHYSICAL DATASOURCE TO RECOVER: 'host2@alpha'
VERIFYING THAT WE CAN CONNECT TO DATA SERVER 'host2'
DATA SERVER 'host2' IS NOW AVAILABLE FOR CONNECTIONS
RECOVERING 'host2@alpha' TO A SLAVE USING 'host1@alpha' AS THE MASTER
DataSource 'host2' is now OFFLINE
RECOVERY OF DATA SERVICE 'alpha' SUCCEEDED
FOUND PHYSICAL DATASOURCE TO RECOVER: 'host3@alpha'
VERIFYING THAT WE CAN CONNECT TO DATA SERVER 'host3'
DATA SERVER 'host3' IS NOW AVAILABLE FOR CONNECTIONS
RECOVERING 'host3@alpha' TO A SLAVE USING 'host1@alpha' AS THE MASTER
DataSource 'host3' is now OFFLINE
RECOVERY OF DATA SERVICE 'alpha' SUCCEEDED
RECOVERED 2 DATA SOURCES IN SERVICE 'alpha'

You will be prompted to confirm that you wish to use the selected host as the new master. cctrl then proceeds to set the new master and recover the remaining slaves.

If this operation fails, you can try the more manual process described next.

Welcome and Recover

A simple welcome attempt will fail:

[LOGICAL] /east > datasource db1 welcome
 
WARNING: This is an expert-level command:
Incorrect use may cause data corruption
or make the cluster unavailable.
 
Do you want to continue? (y/n)> y
ONLINE PRIMARY DATA SOURCE WAS NOT FOUND IN DATA SERVICE 'east'

To use the welcome command, the force mode must be enabled first:

[LOGICAL] /east > set force true
FORCE: true
 
[LOGICAL] /east > datasource db1 welcome
 
WARNING: This is an expert-level command:
Incorrect use may cause data corruption
or make the cluster unavailable.
 
Do you want to continue? (y/n)> y
DataSource 'db1' is now OFFLINE

Note that the command reports the datasource as OFFLINE. In AUTOMATIC mode, the datasource will be set to ONLINE before you have time to run the ls command and look at the cluster state:

[LOGICAL] /east > ls
 
COORDINATOR[db1:AUTOMATIC:ONLINE]
 
ROUTERS:
+---------------------------------------------------------------------------------+
|connector@db1[9970](ONLINE, created=2, active=0)                                 |
|connector@db2[10076](ONLINE, created=0, active=0)                                |
|connector@db3[11373](ONLINE, created=2, active=0)                                |
+---------------------------------------------------------------------------------+
 
DATASOURCES:
+---------------------------------------------------------------------------------+
|db1(master:ONLINE, progress=4, THL latency=0.667)                                |
|STATUS [OK] [2019/06/12 04:00:21 PM UTC]                                         |
+---------------------------------------------------------------------------------+
|  MANAGER(state=ONLINE)                                                          |
|  REPLICATOR(role=master, state=ONLINE)                                          |
|  DATASERVER(state=ONLINE)                                                       |
|  CONNECTIONS(created=1, active=0)                                               |
+---------------------------------------------------------------------------------+
+---------------------------------------------------------------------------------+
|db2(slave:SHUNNED(FAILSAFE_SHUN), progress=4, latency=0.773)                     |
|STATUS [OK] [2019/06/12 03:47:42 PM UTC]                                         |
+---------------------------------------------------------------------------------+
|  MANAGER(state=ONLINE)                                                          |
|  REPLICATOR(role=slave, master=db1, state=ONLINE)                               |
|  DATASERVER(state=ONLINE)                                                       |
|  CONNECTIONS(created=1, active=0)                                               |
+---------------------------------------------------------------------------------+
+---------------------------------------------------------------------------------+
|db3(slave:SHUNNED(FAILSAFE_SHUN), progress=4, latency=0.738)                     |
|STATUS [OK] [2019/06/12 03:47:32 PM UTC]                                         |
+---------------------------------------------------------------------------------+
|  MANAGER(state=ONLINE)                                                          |
|  REPLICATOR(role=slave, master=db1, state=ONLINE)                               |
|  DATASERVER(state=ONLINE)                                                       |
|  CONNECTIONS(created=2, active=0)                                               |
+---------------------------------------------------------------------------------+

Finally, recover the remaining slaves:

[LOGICAL] /east > recover
 
SET POLICY: AUTOMATIC => MAINTENANCE
FOUND PHYSICAL DATASOURCE TO RECOVER: 'db3@east'
VERIFYING THAT WE CAN CONNECT TO DATA SERVER 'db3'
Verified that DB server notification 'db3' is in state 'ONLINE'
DATA SERVER 'db3' IS NOW AVAILABLE FOR CONNECTIONS
The active primary in data service 'east' is 'db1@east'
RECOVERING 'db3@east' TO A SLAVE USING 'db1@east' AS THE MASTER
RECOVERY OF DATA SERVICE 'east' SUCCEEDED
FOUND PHYSICAL DATASOURCE TO RECOVER: 'db2@east'
VERIFYING THAT WE CAN CONNECT TO DATA SERVER 'db2'
Verified that DB server notification 'db2' is in state 'ONLINE'
DATA SERVER 'db2' IS NOW AVAILABLE FOR CONNECTIONS
The active primary in data service 'east' is 'db1@east'
RECOVERING 'db2@east' TO A SLAVE USING 'db1@east' AS THE MASTER
RECOVERY OF DATA SERVICE 'east' SUCCEEDED
REVERT POLICY: MAINTENANCE => AUTOMATIC
RECOVERED 2 DATA SOURCES IN SERVICE 'east'
 
[LOGICAL] /east > ls
 
COORDINATOR[db1:AUTOMATIC:ONLINE]
 
ROUTERS:
+---------------------------------------------------------------------------------+
|connector@db1[9970](ONLINE, created=2, active=0)                                 |
|connector@db2[10076](ONLINE, created=0, active=0)                                |
|connector@db3[11373](ONLINE, created=2, active=0)                                |
+---------------------------------------------------------------------------------+
 
DATASOURCES:
+---------------------------------------------------------------------------------+
|db1(master:ONLINE, progress=4, THL latency=0.667)                                |
|STATUS [OK] [2019/06/12 04:00:21 PM UTC]                                         |
+---------------------------------------------------------------------------------+
|  MANAGER(state=ONLINE)                                                          |
|  REPLICATOR(role=master, state=ONLINE)                                          |
|  DATASERVER(state=ONLINE)                                                       |
|  CONNECTIONS(created=1, active=0)                                               |
+---------------------------------------------------------------------------------+
+---------------------------------------------------------------------------------+
|db2(slave:ONLINE, progress=4, latency=0.000)                                     |
|STATUS [OK] [2019/06/12 04:06:36 PM UTC]                                         |
+---------------------------------------------------------------------------------+
|  MANAGER(state=ONLINE)                                                          |
|  REPLICATOR(role=slave, master=db1, state=ONLINE)                               |
|  DATASERVER(state=ONLINE)                                                       |
|  CONNECTIONS(created=1, active=0)                                               |
+---------------------------------------------------------------------------------+
+---------------------------------------------------------------------------------+
|db3(slave:ONLINE, progress=4, latency=0.000)                                     |
|STATUS [OK] [2019/06/12 04:06:27 PM UTC]                                         |
+---------------------------------------------------------------------------------+
|  MANAGER(state=ONLINE)                                                          |
|  REPLICATOR(role=slave, master=db1, state=ONLINE)                               |
|  DATASERVER(state=ONLINE)                                                       |
|  CONNECTIONS(created=2, active=0)                                               |
+---------------------------------------------------------------------------------+
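A final check that nothing is still shunned can also be scripted against saved `ls` output. This is a minimal sketch that assumes the datasource header format shown in the listings above; the helper names are hypothetical and the regular expression has only been matched against those sample lines.

```python
import re
from typing import Dict, List

# Captures host, role, and state from lines like "|db2(slave:SHUNNED(...".
STATE_RE = re.compile(r"^\|([\w.-]+)\((master|slave):(\w+)")


def datasource_states(ls_output: str) -> Dict[str, str]:
    """Map each datasource name to its reported state (ONLINE, SHUNNED, ...)."""
    states: Dict[str, str] = {}
    for raw in ls_output.splitlines():
        match = STATE_RE.match(raw.strip())
        if match:
            host, _role, state = match.groups()
            states[host] = state
    return states


def not_online(ls_output: str) -> List[str]:
    """List the datasources whose state is anything other than ONLINE."""
    return [h for h, s in datasource_states(ls_output).items() if s != "ONLINE"]


before_recover = """
|db1(master:ONLINE, progress=4, THL latency=0.667)                                |
|db2(slave:SHUNNED(FAILSAFE_SHUN), progress=4, latency=0.773)                     |
|db3(slave:SHUNNED(FAILSAFE_SHUN), progress=4, latency=0.738)                     |
"""

after_recover = """
|db1(master:ONLINE, progress=4, THL latency=0.667)                                |
|db2(slave:ONLINE, progress=4, latency=0.000)                                     |
|db3(slave:ONLINE, progress=4, latency=0.000)                                     |
"""

print(not_online(before_recover))
print(not_online(after_recover))
```

An empty list after the recover step corresponds to the fully ONLINE cluster shown in the last listing.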

Summary

The Wrap-Up

In this blog post we defined what a split-brain scenario means, and explored how a Tungsten MySQL database cluster reacts to a split-brain situation.

To learn about Continuent solutions in general, check out https://www.continuent.com/solutions

The Library

Please read the docs!

For more information about Tungsten Cluster recovery procedures, please visit https://docs.continuent.com/tungsten-clustering-6.0/operations-recovery-master.html

Tungsten Clustering is the most flexible, performant global database layer available today - use it underlying your SaaS offering as a strong base upon which to grow your worldwide business!

For more information, please visit https://www.continuent.com/solutions

Want to learn more or run a POC? Contact us.

About the Author

Eric M. Stone
COO

Eric is a veteran of fast-paced, large-scale enterprise environments with 35 years of Information Technology experience. With a focus on HA/DR, from building data centers and trading floors to world-wide deployments, Eric has architected, coded, deployed and administered systems for a wide variety of disparate customers, from Fortune 500 financial institutions to SMBs.
