How to Prevent AWS Outages from Bringing Your Database Down

Modern-Day Mission-Critical Apps in the Cloud

Business-critical and mission-critical applications are under growing pressure: they serve significantly more users, often distributed worldwide, and must do so with near-immediate response times and no downtime.

Enterprises designing these apps may move to the cloud to reduce cost or management overhead, or for other reasons. However, after another holiday season rife with AWS outages, clouds no longer seem so nebulous: enterprises must look at which physical cloud data centers are running their resources and services (e.g., EC2) and treat each of them as a single point of failure in their architecture.

For applications that must be always-on, having database servers hosted in one cloud data center is simply not enough.

From Disaster Recovery (DR) to Continuous Operations

Besides Multi-Region or Hybrid-Cloud clustering, Multi-Cloud is another strategy to protect your business from downtime. Some say it's the best way. However, Multi-Cloud MySQL clustering is not possible with AWS Aurora or any other DBaaS.

The way to withstand site-wide outages is to design redundancy across regions, data centers, or cloud providers, a clustering benefit we refer to here at Continuent as “Disaster Recovery” (DR).

However, DR is somewhat of a misnomer, and a distinction must be made. The MariaDB Enterprise DR page describes traditional enterprise DR as:

“...strategies for restoring availability and data in the event of an unexpected outage or data loss/corruption – everything from multiple data centers to backup and restore tools to flashback and system versioning.”

Most traditional DR strategies are corrective measures taken after a disruption in your database service, not preventive ones.

Enter “modern-day DR”: continuous operations! In the event of an AWS data center outage, a different cloud or region is immediately available to serve the application, so there’s no downtime. When the original database servers are up and running again, they may re-enter the cluster and resume operations as if nothing happened. The result is fault tolerance against site-wide outages, in other words, continuous operations.
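The failover-and-rejoin behavior described above can be pictured with a small toy model. This is only an illustrative sketch, not Tungsten Clustering's API or implementation: the `Site` and `Cluster` classes, the site names, and the "first healthy site in priority order" routing rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Site:
    """A hypothetical database site (a region, data center, or cloud)."""
    name: str
    healthy: bool = True


@dataclass
class Cluster:
    """Toy multi-site cluster; sites are listed in priority order (assumed policy)."""
    sites: List[Site] = field(default_factory=list)

    def active_site(self) -> Site:
        # Route traffic to the first healthy site in priority order.
        for site in self.sites:
            if site.healthy:
                return site
        raise RuntimeError("no healthy site available")

    def mark_down(self, name: str) -> None:
        # Simulate a site-wide outage.
        self._find(name).healthy = False

    def rejoin(self, name: str) -> None:
        # Simulate the recovered site re-entering the cluster.
        self._find(name).healthy = True

    def _find(self, name: str) -> Site:
        for site in self.sites:
            if site.name == name:
                return site
        raise KeyError(name)


# Hypothetical site names, one per cloud provider.
cluster = Cluster([Site("aws-us-east"), Site("gcp-us-central")])
before_outage = cluster.active_site().name

cluster.mark_down("aws-us-east")          # AWS data center outage
during_outage = cluster.active_site().name

cluster.rejoin("aws-us-east")             # original servers are back
after_rejoin = cluster.active_site().name

print(before_outage, during_outage, after_rejoin)
```

In this sketch the application keeps being served from the second site during the outage, and the recovered site resumes its role once it rejoins. Whether a recovered site takes traffic back immediately or stays in a standby role is a policy choice that a real clustering product would make configurable; the priority-order rule here is just the simplest option to show.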

“An ounce of prevention is worth a pound of cure” — Benjamin Franklin

Clustering to prevent site-wide outages for database servers is not simple. Not only does the MySQL database often handle huge fluctuations in users, but the data is also constantly changing, as the database serves up Reads and Writes 24/7. It’s a highly dynamic system and requires a very mature, robust and tested solution.

Tungsten Clustering frees you of the risks of cloud-vendor lock-in and comes with 24/7/365 mission-critical support you can trust; the Support team is highly experienced and genuinely cares. The clustering stack consists of components that work together seamlessly to ensure availability, performance, and data integrity, as well as simple management and monitoring, in a fault-tolerant, lightweight, scalable, and cost-effective system. That’s why customers such as Adobe, Garmin, Marketo, Riot Games and VMware rely on Tungsten to safeguard their most critical data.

Feel free to reach out to learn more!

About the Author

Sara Captain
Director of Product Marketing

Sara has worn various hats at Continuent since 2014. Listening to Continuent customers over the years, Sara fell in love with the Continuent Tungsten suite of products. She started learning Linux and MySQL administration with the support of Continuent's amazing team so that she could help keep customers happy. Prior to Continuent, she worked in consulting with a focus on leveraging data.
