In this first in a series of blog posts on real-time data replication from MySQL, MariaDB or Percona Server to high-performance data analytics engines (and how to do so easily and cost-effectively … but more on that further below), we’re focussing on Vertica replication.
Vertica is a popular columnar relational DBMS, which is designed to handle modern analytic workloads, enabling fast query performance; it is currently ranked 19th on db-engines’ database management systems popularity rankings (Relational DBMS).
First up, let’s take a quick look at what real-time analytics is all about and what some common replication challenges are.
Real-Time Analytics & Data Replication
Real-time analytics allow companies to react rapidly to changing business conditions as for example:
- Online ad services that process click-through data to maximize ad impressions
- Retailers that analyze sales patterns to identify micro-trends and move inventory to meet them
The common theme is speed: moving lots of information without delay from operational systems to fast data warehouses that can feed reports back to users as quickly as possible.
There are many challenges in building any system that must scale to high numbers of transactions; and replicating from MySQL to Vertica has its own challenges when not prepared for with care or when using tools that aren’t optimal for the job.
There are typically a couple of options for moving data between MySQL and Vertica:
- Work with an ETL tool to post batches extracted from MySQL to Vertica
- Write custom-made scripts to scrape data out of the binlog, process them with a fast scripting language like Perl, and load the result into Vertica
The reality is that there is no simple way to provide anything like real-time reports to users when large volumes of data are involved as, in our experience, ETL and custom-made solutions tend to struggle with real-time transfer as well as the extra load they impose on already busy servers.
This is where Tungsten Replicator comes in, and in particular, the Tungsten Replicator AMI on the Amazon Marketplace, which is the focus of this post.
Tungsten Replicator AMI for Vertica
Tungsten Replicator is the most advanced heterogeneous replication solution for MySQL, MariaDB & Percona Server, including Amazon RDS MySQL and Amazon Aurora.
It provides high-performance and improved replication functionality over the native MySQL replication solution and into a range of target databases, such as Vertica of course… as well as MySQL (all versions), Oracle, PostgreSQL, AWS RedShift, ClickHouse, Hadoop, MongoDB & Kafka.
It is available as an AMI and can be accessed via AWS.
How Vertica Replication Works
Option 1: Local Install
Extractor reads directly from the logs. This is the default.
Option 2: Remote
Extractor gets log data via MySQL Replication protocols (which requires the DBMS service to be online) This is how we handle Amazon Aurora extraction tasks.
How the Tungsten Replicator Can Be Used
- Always-on, fast and reliable applications that are supported and protected by a multi-site active/active Tungsten Clustering topology for MySQL; and a platform that is powered by real-time data replication from MySQL to Vertica using the Tungsten Replicator.
- When scalability and security are needed with demanding query performance in an environment of OLTP production MySQL, MariaDB or Percona Server databases that are being synchronized with Vertica clusters.
- Let us know about your own replication needs or use case!
How to Get Started With the Tungsten Replicator AMI
Getting started with the 14-Day free trial
Users can try one instance of each type of the AMI for free for 14 days to get them started (AWS infrastructure charges still apply). Please note that free trials will automatically convert to a paid hourly subscription upon expiration.
Our colleague Chris Parker recorded this handy walk-through on how to get started with the Tungsten Replicator AMI if you need some tips & tricks.
Replicate from AWS Aurora, AWS RDS MySQL, MySQL, MariaDB & Percona Server from as little as $2.067/hour
With Tungsten Replicator (AMI) on AWS, users can replicate GB's of data from as little as $40c/hour:
- Go to the AWS Marketplace, and search for Tungsten, or click here.
- Choose and Subscribe to the Tungsten Replicator for MySQL Source Extraction.
- Choose and Subscribe to the target Tungsten Replicator AMI of your choice.
- Pay by the hour.
- When launched, the host will have all the prerequisites in place and a simple "wizard" runs on first launch and asks the user questions about the source and/or target and then configures it all for them.
Note that the following extraction and applier support is also provided
- MySQL (all versions, on-premises and in the cloud)
- MariaDB (all versions, on-premises and in the cloud)
- Percona Server (all versions, on-premises and in the cloud)
- AWS Aurora
- AWS RDS MySQL
- Azure MySQL
- Google Cloud SQL
Available Replication Target Databases
Do give Tungsten Replicator AMI a try and let us know how you get on or if you have any questions by commenting below!