No-Downtime Cluster Software Upgrades for MySQL / MariaDB / Percona Server

One important way to protect your MySQL / MariaDB / Percona Server data is to keep your Tungsten Clustering software up-to-date.

A standard cluster deployment uses three nodes, which allows for no-downtime upgrades along with the ability to have a fully available cluster during maintenance.

Note that with only two database cluster nodes, taking the lone slave down for service leaves zero failover candidates available, which creates a window of vulnerability.

The Best Practices: Staging

Performing a No-Downtime Upgrade for a Staging Deployment

When upgrading a Staging-style deployment, all nodes are upgraded at once in parallel via the `tools/tpm update` command run from inside the staging directory on the staging host.

No Master switch happens, and all layers are restarted to use the new code. This could introduce an outage for the applications depending on the age and feature-set of the old version. For that reason the `--no-connectors` option is used to prevent the restart of the Connector processes until you are ready to do so.

By default, an update/upgrade restarts all services, including the Connector. Adding `--no-connectors` leaves the running Connectors untouched. For example:

shell> tools/tpm update --no-connectors

If this option is used, the connectors must be manually updated to the new version after being drained from your load balancer pool or during a quieter period of traffic. This can be achieved by running a promote on each Connector node:

shell> tpm promote-connector

This will result in a short period of downtime (a couple of seconds) on the single host concerned, while the other connectors in your deployment keep running. During the promotion, the Connector is restarted using the updated software and/or configuration.
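The one-host-at-a-time promotion described above can be sketched as a small dry-run script. It only prints a plan and executes nothing. The host names `conn1` to `conn3` are hypothetical, the drain and restore lines are placeholders for whatever your load balancer provides, and running `tpm` over `ssh` assumes it is on the remote user's PATH. Only `tpm promote-connector` itself comes from the text.

```shell
#!/bin/sh
# Dry-run sketch: print the per-connector promotion plan, one host at a time.
# Host names are hypothetical; drain/restore lines are reminders, not real commands.
plan_connector_promotion() {
    for host in "$@"; do
        echo "# $host: drain from load balancer pool (site-specific)"
        echo "ssh $host tpm promote-connector"
        echo "# $host: return to load balancer pool (site-specific)"
    done
}

plan_connector_promotion conn1 conn2 conn3
```

Printing the plan first makes it easy to review the order before touching a live Connector.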

Click here to read more about "Upgrading using the Staging Method" in our online docs...

The Best Practices: INI

Performing a No-Downtime Upgrade for an INI-based Deployment

In many ways, upgrading an INI-based deployment is similar to a Staging upgrade, except that the `tools/tpm update` command is executed individually on all cluster and database nodes from the locally-extracted staging directory.

Use of the `--no-connectors` option is the same.

The biggest difference is that each node is upgraded separately. This makes it possible to upgrade all the slaves first, then perform a switch, then upgrade the final node.

To Switch or Not to Switch, THAT is the Question

We recommend only using the No-Switch method of INI upgrades. Performing a switch in the middle of an upgrade can lead to various possible mismatches on multiple layers.

Click here to read more about "Upgrading when using INI-based configuration".

We have documented both approaches for those customers who feel they must perform a switch in the middle of an upgrade.

No Switch

To use the No-Switch method of upgrading (docs here):

  1. Place the cluster into maintenance mode.
  2. Upgrade the slaves in the dataservice. Be sure to shun and welcome each slave.
  3. Upgrade the master node. (Important: Replication traffic to the slaves will be delayed while the replicator restarts. The delays will increase if there are a large number of stored events in the THL. Old THL may be removed to decrease the delay. Do NOT delete THL that has not been received on all slave nodes or events will be lost.)
  4. Upgrade the connectors in the dataservice one-by-one. (Important: Application traffic to the nodes will be disconnected when each connector restarts.)
  5. Place the cluster into automatic mode.
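The No-Switch order above can be sketched as a dry-run plan that prints each step without running it. The host names are hypothetical. The `cctrl>` lines (`set policy maintenance`, `datasource <host> shun`, `datasource <host> welcome`, `set policy automatic`) are written from memory of the cctrl command set and should be checked against the documentation for your release. The sketch also assumes `--no-connectors` is used, so the connectors are promoted last.

```shell
#!/bin/sh
# Dry-run sketch of the No-Switch order. Nothing is executed; the plan is printed.
# Assumptions: hypothetical host names; cctrl command strings to be verified
# against the docs for your version.
plan_no_switch() {
    master=$1; slaves=$2; connectors=$3
    echo "cctrl> set policy maintenance"
    for s in $slaves; do
        echo "cctrl> datasource $s shun"
        echo "$s shell> tools/tpm update --no-connectors"
        echo "cctrl> datasource $s welcome"
    done
    # The master is upgraded in place, with no switch.
    echo "$master shell> tools/tpm update --no-connectors"
    for c in $connectors; do
        echo "$c shell> tpm promote-connector"
    done
    echo "cctrl> set policy automatic"
}

plan_no_switch db1 "db2 db3" "conn1 conn2"
```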

Switch (Not Recommended)

To use the Switch method of upgrading (docs here):

  1. Upgrade the slaves in the dataservice. Be sure to shun and welcome each slave.
  2. Switch the current master to one of the upgraded slaves. (Important: Application and replication traffic will be delayed while the switch occurs.)
  3. Upgrade the original master node which is now a slave. Be sure to shun and welcome it.
  4. Upgrade the connectors in the dataservice one-by-one. (Important: Application traffic to the nodes will be disconnected when the connector restarts.)
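For comparison, the Switch order can be sketched the same way, again as a dry run that only prints the steps. The host names are hypothetical, and the `cctrl>` lines, including `switch to <host>`, are recalled from the cctrl command set and should be confirmed against the documentation before use.

```shell
#!/bin/sh
# Dry-run sketch of the Switch order (not the recommended method).
# Assumptions: hypothetical host names; cctrl command strings to be verified.
plan_switch() {
    old_master=$1; new_master=$2; slaves=$3; connectors=$4
    for s in $slaves; do
        echo "cctrl> datasource $s shun"
        echo "$s shell> tools/tpm update --no-connectors"
        echo "cctrl> datasource $s welcome"
    done
    # Promote an already-upgraded slave, demoting the old master to a slave.
    echo "cctrl> switch to $new_master"
    echo "cctrl> datasource $old_master shun"
    echo "$old_master shell> tools/tpm update --no-connectors"
    echo "cctrl> datasource $old_master welcome"
    for c in $connectors; do
        echo "$c shell> tpm promote-connector"
    done
}

plan_switch db1 db2 "db2 db3" "conn1"
```

The only structural difference from the No-Switch plan is the `switch to` step between the slave upgrades and the upgrade of the original master.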

Tungsten Clustering is the most flexible, performant global database layer available today. Use it to underpin your SaaS offering as a strong base upon which to grow your worldwide business!

For more information, please visit https://www.continuent.com/solutions

Want to learn more or run a POC? Contact us.

About the Author

Eric M. Stone
COO

Eric is a veteran of fast-paced, large-scale enterprise environments with 35 years of Information Technology experience. With a focus on HA/DR, from building data centers and trading floors to world-wide deployments, Eric has architected, coded, deployed and administered systems for a wide variety of disparate customers, from Fortune 500 financial institutions to SMBs.
