In this blog post we explore various options for tuning the MySQL server for better Replicator applier performance on the replica nodes.
A Tungsten Cluster relies upon the Tungsten Replicator to move events from the primary node to the replicas. Once the event has been transferred to the replica as THL on disk, the applier on the replica will then attempt to write it to the database. Tungsten Replicator can only apply events as fast as MySQL allows. If the MySQL server is somehow slow or blocking, then the Replicator will be as well.
A properly tuned database server, combined with infrastructure and SysAdmin best practices, will go quite a long way towards high-performance apply operations on replica nodes.
Recently, a customer asked us:
During one of our load tests, we had a peak of 60k writes/min, averaging around 40k w/m. During this period, we saw an applied latency of 100-160 seconds. Is it possible to improve the replication latency on the replica nodes?
Where To Look and What For?
When performance tuning, we normally examine the configuration and logs for the following layers:
- Tungsten Replicator
- MySQL server
- Hardware & OS
What are Performance Tuning Best Practices?
Before we dive into the various layers, let's review the performance tuning basics that I have learned over the years:
- Performance tuning is a slow, iterative process.
- Change one thing at a time and test carefully.
- Document your changes and test results.
- Go for the low-hanging fruit early on. Sometimes the smallest changes have the largest results, e.g. adding a proper index on a table.
- As soon as you remove one bottleneck, the next one will appear. Iterate!
- Divide and Conquer - pick a spot and ensure everything is working properly on both sides. This often helps in isolating issues.
- Start at the top and work your way down the layers. Test, test, test!
- Start at the bottom and work your way up the layers. Test, test, test!
- When in doubt, ask, then document.
- Read the documentation. Then read it again.
- Ensure consistency amongst all nodes at every layer.
- Most of all, do not assume.
The Short Version
We always start off by checking the replicator applier stages, of which there are three.
On a replica node, if the appliedLatency of the remote-to-thl stage is low, but the appliedLatency of the thl-to-q and q-to-dbms stages is high, then the issue is almost always with the MySQL server somehow.
For example, on a healthy cluster:
On a Primary Node:
shell> trepctl status -name tasks | egrep 'applied|stage'
appliedLastEventId : mysql-bin.000046:0000000065129638;-1
appliedLastSeqno : 2656221
appliedLatency : 0.279
stage : binlog-to-q
appliedLastEventId : mysql-bin.000046:0000000065129638;-1
appliedLastSeqno : 2656221
appliedLatency : 0.279
stage : q-to-thl
On a Replica Node:
shell> trepctl status -name tasks | egrep 'applied|stage'
appliedLastEventId : mysql-bin.000046:0000000065077978;-1
appliedLastSeqno : 2656191
appliedLatency : 0.345
stage : remote-to-thl
appliedLastEventId : mysql-bin.000046:0000000065077978;-1
appliedLastSeqno : 2656191
appliedLatency : 0.371
stage : thl-to-q
appliedLastEventId : mysql-bin.000046:0000000065077978;-1
appliedLastSeqno : 2656191
appliedLatency : 0.374
stage : q-to-dbms
For more information about stages, please visit the following blog post: Mastering Tungsten Replicator Series: Understanding Pipelines and Stages
Quick Tech "Tool Tip"
Here is a handy command to describe the summary progress of each stage on that node, refreshing every two seconds:
shell> trepctl perf -r 2
What a Lovely Tune
- The first thing to do is enable the slow query log, to confirm that no slow statements are blocking faster queries behind them. Since the applier is single-threaded, even one repeated slow query can create a huge bottleneck.
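As a sketch, the slow query log can be enabled at runtime without a restart (the file path and one-second threshold below are illustrative; persist the same settings in my.cnf to survive restarts):

```sql
-- Enable the slow query log dynamically (no restart required)
SET GLOBAL slow_query_log = 1;
SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';  -- illustrative path
SET GLOBAL long_query_time = 1;  -- log statements taking longer than 1 second
-- Optionally also capture statements that scan without using an index
SET GLOBAL log_queries_not_using_indexes = 1;
```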
- Running SHOW FULL PROCESSLIST on the replicas during the latency spike helps identify what may be locking or blocking.
- Use the MySQL EXPLAIN command to better understand why queries are slow.
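For example, against a hypothetical orders table, EXPLAIN shows at a glance whether an index is being used:

```sql
-- Example: check whether a query can use an index (table name is illustrative)
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
-- A "type" of ALL with a large "rows" estimate means a full table scan;
-- "ref" or "range" with a named "key" means an index is being used.
```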
- Make sure there are no redundant indexes.
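One way to automate this check is Percona Toolkit's pt-duplicate-key-checker, which reports exact and prefix duplicates of existing indexes (host and credentials below are placeholders):

```
shell> pt-duplicate-key-checker --host=localhost --user=root --ask-pass
```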
- Does every table have a Primary Key?
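This query against information_schema lists base tables that lack a primary key, so you can answer that question across the whole server:

```sql
-- Find tables that lack a PRIMARY KEY constraint
SELECT t.table_schema, t.table_name
FROM information_schema.tables t
LEFT JOIN information_schema.table_constraints c
  ON  c.table_schema = t.table_schema
  AND c.table_name   = t.table_name
  AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN ('mysql','information_schema','performance_schema','sys')
  AND c.constraint_name IS NULL;
```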
- What type of replication is enabled in MySQL - STATEMENT, MIXED or ROW? For active/passive clusters we recommend MIXED, and for active/active clusters use ROW to ensure data integrity across the active primaries. Note that ROW will increase latency due to the sheer volume of data transmitted as compared to STATEMENT or MIXED.
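As a config sketch, the format is set in my.cnf and should be kept consistent across all nodes:

```
# my.cnf -- keep this value identical cluster-wide
[mysqld]
binlog_format = ROW      # active/active clusters
# binlog_format = MIXED  # active/passive clusters
```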
- Are the tables InnoDB? If so, is there enough RAM for the working set to stay in the buffer pool, syncing to disk in the background? This helps avoid the serial physical disk I/O that creates bottlenecks, even when the I/O channel itself is not saturated.
- Check for the use of Unique indexes, which disable the InnoDB change buffering performance boost. InnoDB has change buffering (previously called the insert buffer), which is a feature to delay building secondary indexes that are not unique, and merge writes. It can boost insert performance by quite a lot, and it’s enabled by default.
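Both sides of this can be checked from SQL: whether change buffering is enabled, and which unique secondary indexes bypass it:

```sql
-- Confirm change buffering is enabled (the default value is "all")
SHOW VARIABLES LIKE 'innodb_change_buffering';

-- List unique secondary indexes, which cannot use the change buffer
SELECT DISTINCT table_schema, table_name, index_name
FROM information_schema.statistics
WHERE non_unique = 0
  AND index_name <> 'PRIMARY'
  AND table_schema NOT IN ('mysql','information_schema','performance_schema','sys');
```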
- Finally, how large are the affected tables and schemas in terms of row size and byte size? Can the tables be split? Can the databases be split?
Key MySQL my.cnf Configuration Parameters to Check:
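As a starting point, these are the parameters we most often review. The values below are illustrative only, not recommendations - size them against your RAM, workload and storage:

```
[mysqld]
innodb_buffer_pool_size        = 24G       # typically 60-75% of RAM on a dedicated host
innodb_log_file_size           = 1G        # larger redo logs absorb write bursts
innodb_flush_log_at_trx_commit = 1         # durability vs. throughput trade-off
sync_binlog                    = 1         # as above
innodb_io_capacity             = 1000      # match to what the storage can sustain
innodb_flush_method            = O_DIRECT  # avoid double-buffering through the OS cache
```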
For this specific customer, we identified four (4) "hot" tables in the slow query logs, each with 2 million or more rows. Tables of that size force the database into I/O-intensive operations.
Once the issue was clearly localized to the four tables, the scope narrowed somewhat, implying an indexing, query or disk i/o-based bottleneck.
Disk Inside and Out
- Disk I/O is normally the slowest part of any compute workflow because disk is orders of magnitude slower than physical memory. No matter how fast the underlying SSD is, there are physical limitations to overcome, especially with SAN and Cloud-based storage.
- For example, when using AWS EC2, consider using enhanced EBS I/O with provisioned IOPS if you are not already doing so, which would provide faster performance at an increased cost.
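An existing gp2 volume can be converted in place with the AWS CLI; the volume ID and IOPS figure below are placeholders:

```
shell> aws ec2 modify-volume --volume-id vol-0123456789abcdef0 \
         --volume-type io1 --iops 10000
```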
- Sharing one volume across different purposes will always degrade performance because of disk contention. For example, ensure separate volumes for (at least) root, the MySQL data directory, the MySQL binary logs and the THL files.
- Then stripe each MySQL filesystem over multiple volumes using some form of RAID (e.g. RAID-0), so that multiple I/O channels are in use at the same time for the same filesystem.
- This layered disk architecture provides for a large number of parallel disk I/O channels, giving a much higher throughput at a much lower latency.
- What is the filesystem type (i.e. xfs, ext4, etc...)? Heavyweight journaling adds write overhead, so choose a filesystem whose journaling is lightweight. We suggest xfs, which journals metadata only.
- Mount the filesystem using performance-oriented options such as noatime, so that reads do not trigger access-time metadata writes.
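For example, a dedicated xfs data volume might be mounted via an /etc/fstab entry like this (device name and mount point are illustrative); the noatime and nodiratime options prevent every read from generating an access-time write:

```
# /etc/fstab -- dedicated MySQL data volume
/dev/nvme1n1  /var/lib/mysql  xfs  noatime,nodiratime  0 0
```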
The Network Really IS the Computer
- Ensure sufficient bandwidth, and a very low error rate.
- Confirm the various hops have sufficient CPU, RAM and bandwidth.
- Is the firewall able to keep up?
SysAdmins to the Rescue, As Usual
- Is there enough CPU?
- Is there enough RAM?
- Check nproc and ofiles limits
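The current limits can be checked from the shell, and raised persistently in limits.conf (the values below are illustrative):

```
shell> ulimit -u    # max user processes (nproc)
shell> ulimit -n    # max open files (ofiles)

# /etc/security/limits.conf -- example entries for the mysql user
mysql   soft   nofile   65535
mysql   hard   nofile   65535
mysql   soft   nproc    32768
mysql   hard   nproc    32768
```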
Advanced Tungsten Replicator
The Parallel (Apply) Universe
After making sure that all reasonable efforts have been made to properly evaluate the above solutions, there is an advanced feature available to try: Parallel Apply.
Tungsten Replicator uses a single-threaded applier by default, so it tops out at roughly 10,000 updates per second, depending on the round trip time, and so on. Since the MySQL server is multi-threaded, the primary can commit writes faster than a single-threaded applier can replay them.
To increase that limit, Tungsten Replicator offers Parallel Apply, which employs multiple replication apply threads, one per shard.
By default, we shard by database, with one shard created per database schema. If there is a cross-database query, all other threads block until that one completes, slowing performance down dramatically.
This means Parallel Apply is best suited to environments where writes are equally busy across every database. Having many databases, but only one or two hot ones, defeats the design and purpose of Parallel Apply.
Again, any cross-shard query will force the Replicator back into single-threaded mode to ensure data integrity, with the result of having no performance gain, or even degradation as now the Replicator has to keep switching modes.
Tungsten Replicator can also shard by table, with the same caveats, but this time cross-TABLE queries will block other threads, making this somewhat less useful than desired based on most query designs.
Important questions before implementing Parallel Apply:
- Do you have just one database schema or multiple?
- If single, do the bulk of your queries cross tables or use single tables?
- If multiple databases, are they de-coupled or do you do a bunch of cross-database queries?
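As a sketch, Parallel Apply is configured through tpm; the service name (alpha) and channel count below are illustrative, and the full procedure is in the Continuent documentation:

```
shell> tpm update alpha --svc-parallelization-type=disk --channels=10
```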
What WAS the Solution After All?
For this customer, the following changes improved performance to acceptable levels:
- The THL was being written to the same location as the MySQL binary logs, creating contention. Moving the THL directory to a different location (/logs) improved performance.
- Because the nodes are AWS EC2 instances, converting volumes from GP2 to Provisioned IOPS allowed the replica to keep up with replication much better.
- Several MySQL configuration values were modified, which also decreased the applied latency on the replica nodes:
innodb_io_capacity=1000 (interestingly, lowering this from 10k to 1k led to a significant improvement in latency, despite the EBS volume being provisioned for 10k IOPS)
innodb_io_capacity_max=2000
innodb_adaptive_hash_index=0
innodb_lru_scan_depth=2048
performance_schema=0
Bonus Performance Tuning Tip
Our upcoming version 7.0.0 will allow you to adjust the innodb_flush_log_at_trx_commit value of the replica nodes, potentially resulting in up to 75% gains. By setting the MySQL variable innodb_flush_log_at_trx_commit=2, MySQL does not need to flush to disk upon each write, instead passing that responsibility to the operating system sync operation. There is a small window for data loss in the event of a host crash, but if you deem that risk acceptable on a replica, you can get a huge boost in performance. This feature in version 7.0.0 will do this for you automatically if enabled, and ensure that the value is set to 1 on the primary node.
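Until then, the variable can be changed by hand, and it is dynamic, so no restart is needed:

```sql
-- On a replica only, and only if a small crash data-loss window is acceptable:
SET GLOBAL innodb_flush_log_at_trx_commit = 2;

-- Verify; the primary should remain at the default of 1:
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
```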
Please read the docs!
For more information about how to use the Tungsten Replicator, please see Mastering Tungsten Replicator Series: Command Line Superpowers
For more in-depth discussion about MySQL database tuning, here are some excellent blog posts by Morgan Tocker:
For more technical information about Tungsten clusters, please visit https://docs.continuent.com
Clearly, there are many, many things to think about when it comes to MySQL performance tuning - this blog post barely scratches the surface of the subject. Remember, performance tuning is all about iterative effort over time!
Tungsten Clustering is the most flexible, performant global database layer available today – use it underlying your SaaS offering as a strong base upon which to grow your worldwide business!
For more information, please visit https://www.continuent.com/solutions
Want to learn more or run a POC? Contact us.