Advanced Cluster Tuning: Using Parallel Apply on a Single Node

Recently a Customer Asked

“Are there any issues with running a specific node with parallel apply enabled, and the rest of the nodes using a single stream (parallel disabled)? Just curious because that would be easiest for A/B testing of the replication effects.”

In this blog post we explore the implications and best practices for using Parallel Apply on a single node in the cluster.

Firstly, the answer is yes, you may certainly enable Parallel Apply on a single node within a Tungsten Cluster.

The caveat is that a node with Parallel Apply enabled cannot be a provisioning source for a node without Parallel Apply enabled unless the replicator on the source node is offline with just a single row in the trep_commit_seqno table.

Pros and Cons

PROS

  • Enables A/B testing of
    • Parallel Apply-enabled versus Parallel Apply-disabled nodes
    • Different channel counts on Parallel Apply-enabled nodes

CONS

  • Complex configurations always invite problems; a key rule of clustering is that all nodes should be as close to the same as possible
  • A node with Parallel Apply enabled cannot be a provisioning source for a node without Parallel Apply enabled unless the replicator on the source (Parallel-enabled) node is offline with just a single row in the trep_commit_seqno table. 
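
This caveat boils down to a simple decision: the Parallel Apply-enabled source must be offline with exactly one row left in trep_commit_seqno. A minimal sketch of that decision as a hypothetical shell helper (the helper name and its inputs are illustrative; the two values would come from trepctl status and from counting the rows in trep_commit_seqno):

```shell
# Hypothetical helper: decide whether a Parallel Apply-enabled node is a safe
# provisioning source for a Parallel Apply-disabled target.
safe_to_provision() {
  local state="$1"  # replicator state, e.g. ONLINE or OFFLINE:NORMAL
  local rows="$2"   # row count in the trep_commit_seqno table
  if [ "$state" = "OFFLINE:NORMAL" ] && [ "$rows" -eq 1 ]; then
    echo "safe"
  else
    echo "unsafe"
  fi
}

safe_to_provision "ONLINE" 10         # unsafe: online with 10 channel rows
safe_to_provision "OFFLINE:NORMAL" 1  # safe: offline with a single row
```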

Enabling Parallel Apply

To enable Parallel Apply on the cluster node of your choice, the key is to take the Replicator offline gracefully, then, under [defaults], set svc-parallelization-type to disk and the number of channels to 10:

shell$ cctrl
cctrl> set policy maintenance
cctrl> exit
shell$ trepctl offline
~or~
shell$ trepctl offline-deferred -at-seqno {seqno}
shell$ vi /etc/tungsten/tungsten.ini
[defaults]
…
svc-parallelization-type=disk
channels=10
shell$ tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-6.1.15-8
shell$ echo The staging DIRECTORY is `tpm query staging | cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-6.1.15-8
shell$ cd {STAGING_DIRECTORY}
shell$ ./tools/tpm update
shell$ cctrl
cctrl> set policy automatic
cctrl> exit

Next, confirm that the replicator is online and using multiple channels:

shell$ trepctl status | grep channels
channels               : 10

You may also check the active channels:

shell$ trepctl status -name channel-assignments 
Processing status command (channel-assignments)...
NAME      VALUE
----      -----
channel : 1
shard_id: test_db7_demo
NAME      VALUE
----      -----
channel : 0
shard_id: test_db8_demo
NAME      VALUE
----      -----
channel : 2
shard_id: test_db9_demo
Finished status command (channel-assignments)...

Note that while there are 10 allocated channels in the configuration, only three are in use at the moment. This is because I have three load streams, one per node, with one database per node. Since the natural sharding for Parallel Apply is per database, we get three active channels, one per active load stream.
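
Counting the active channels from the channel-assignments output can be scripted. The sketch below uses a heredoc with the sample output above as a stand-in for the live command; in practice you would pipe trepctl status -name channel-assignments through the same filter:

```shell
# Count the distinct channels currently assigned to shards. The heredoc
# stands in for live `trepctl status -name channel-assignments` output.
active_channels=$(awk '/^channel/ {print $3}' <<'EOF' | sort -u | wc -l
channel : 1
shard_id: test_db7_demo
channel : 0
shard_id: test_db8_demo
channel : 2
shard_id: test_db9_demo
EOF
)
echo "active channels: $active_channels"
```

With the sample above this reports three active channels, matching the three load streams.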

A more advanced version of the trepctl status command uses egrep instead of grep to match multiple fields at once:

shell$ trepctl status | egrep 'state|applied|serviceName|role|channels'
appliedLastEventId     : mysql-bin.001414:0000000079461201;-1
appliedLastSeqno       : 179999
appliedLatency         : 0.633
channels               : 10
role                   : slave
serviceName            : north
state                  : ONLINE
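
For use in scripts, a single value such as the channel count can be extracted as a bare number with awk. The heredoc below stands in for live trepctl status output:

```shell
# Pull just the channel count out of `trepctl status` output. The field
# separator matches the colon-and-whitespace padding in the status listing.
channels=$(awk -F'[: ]+' '/^channels/ {print $2}' <<'EOF'
appliedLastSeqno       : 179999
channels               : 10
state                  : ONLINE
EOF
)
echo "$channels"
```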

You may also query the database when the replicator is ONLINE to see the extra rows in the trep_commit_seqno table:

mysql> select * from tungsten_north.trep_commit_seqno;
+---------+--------+--------+-----------+-------------------------+--------------+--------------------------------------+-----------------+---------------------+---------------+---------------------+
| task_id | seqno  | fragno | last_frag | source_id               | epoch_number | eventid                              | applied_latency | update_timestamp    | shard_id      | extract_timestamp   |
+---------+--------+--------+-----------+-------------------------+--------------+--------------------------------------+-----------------+---------------------+---------------+---------------------+
|       0 | 809427 |      0 | 1         | db9-demo.continuent.com |            0 | mysql-bin.001424:0000000072689396;-1 |               0 | 2021-11-29 15:20:35 | test_db9_demo | 2021-11-29 15:20:34 |
|       1 | 809426 |      0 | 1         | db9-demo.continuent.com |            0 | mysql-bin.001424:0000000072687739;-1 |               0 | 2021-11-29 15:20:35 | test_db8_demo | 2021-11-29 15:20:34 |
|       2 | 809428 |      0 | 1         | db9-demo.continuent.com |            0 | mysql-bin.001424:0000000072691053;-1 |               0 | 2021-11-29 15:20:35 | test_db7_demo | 2021-11-29 15:20:34 |
|       3 | 799999 |      0 | 1         | db9-demo.continuent.com |            0 | mysql-bin.001424:0000000057067200;-1 |               0 | 2021-11-29 15:19:43 | test_db8_demo | 2021-11-29 15:19:43 |
|       4 | 799999 |      0 | 1         | db9-demo.continuent.com |            0 | mysql-bin.001424:0000000057067200;-1 |               0 | 2021-11-29 15:19:43 | test_db8_demo | 2021-11-29 15:19:43 |
|       5 | 799999 |      0 | 1         | db9-demo.continuent.com |            0 | mysql-bin.001424:0000000057067200;-1 |               0 | 2021-11-29 15:19:43 | test_db8_demo | 2021-11-29 15:19:43 |
|       6 | 799999 |      0 | 1         | db9-demo.continuent.com |            0 | mysql-bin.001424:0000000057067200;-1 |               0 | 2021-11-29 15:19:43 | test_db8_demo | 2021-11-29 15:19:43 |
|       7 | 799999 |      0 | 1         | db9-demo.continuent.com |            0 | mysql-bin.001424:0000000057067200;-1 |               0 | 2021-11-29 15:19:43 | test_db8_demo | 2021-11-29 15:19:43 |
|       8 | 799999 |      0 | 1         | db9-demo.continuent.com |            0 | mysql-bin.001424:0000000057067200;-1 |               0 | 2021-11-29 15:19:43 | test_db8_demo | 2021-11-29 15:19:43 |
|       9 | 799999 |      0 | 1         | db9-demo.continuent.com |            0 | mysql-bin.001424:0000000057067200;-1 |               0 | 2021-11-29 15:19:43 | test_db8_demo | 2021-11-29 15:19:43 |
+---------+--------+--------+-----------+-------------------------+--------------+--------------------------------------+-----------------+---------------------+---------------+---------------------+
10 rows in set (0.01 sec)

You may also query the database to see the single row when the replicator is OFFLINE:

shell$ trepctl status | egrep 'state|applied|serviceName|role|channels'
appliedLastEventId     : NONE
appliedLastSeqno       : -1
appliedLatency         : -1.0
channels               : -1
role                   : slave
serviceName            : north
state                  : OFFLINE:NORMAL
shell$ tpm mysql
mysql> select * from tungsten_north.trep_commit_seqno;
+---------+--------+--------+-----------+-------------------------+--------------+--------------------------------------+-----------------+---------------------+---------------+---------------------+
| task_id | seqno  | fragno | last_frag | source_id               | epoch_number | eventid                              | applied_latency | update_timestamp    | shard_id      | extract_timestamp   |
+---------+--------+--------+-----------+-------------------------+--------------+--------------------------------------+-----------------+---------------------+---------------+---------------------+
|       0 | 909363 |      0 | 1         | db9-demo.continuent.com |            0 | mysql-bin.001426:0000000028566800;-1 |               0 | 2021-11-29 15:29:30 | test_db7_demo | 2021-11-29 15:29:29 |
+---------+--------+--------+-----------+-------------------------+--------------+--------------------------------------+-----------------+---------------------+---------------+---------------------+
1 row in set (0.01 sec)

For more information, please visit the documentation: https://docs.continuent.com/tungsten-clustering-6.1/deployment-parallel-updateconfig.html#deployment-parallel-updateconfig-enable

Provisioning Caveat When Using Parallel Apply

If you try to provision a target with Parallel Apply disabled from a source with Parallel Apply enabled, you will get the following error message in trepctl status on the target node after you try to take the service online:

pendingExceptionMessage: Unable to prepare plugin: class name=com.continuent.tungsten.replicator.datasource.DataSourceService message=[Rows in trep_commit_seqno are inconsistent with channel count: channels=1 rows=10]

To solve this issue, take the replicator on the source node offline, then try again.
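
The error above is at heart a mismatch check: while the replicator is ONLINE, the row count in trep_commit_seqno should equal the configured channel count, and it should collapse to a single row once the replicator is taken offline cleanly. A hypothetical shell helper mirroring that check (the function name is illustrative; the inputs would come from trepctl status and a COUNT(*) query against trep_commit_seqno):

```shell
# Hypothetical consistency check echoing the replicator's own error format.
check_seqno_rows() {
  local channels="$1"  # configured channel count
  local rows="$2"      # rows currently in trep_commit_seqno
  if [ "$rows" -eq "$channels" ]; then
    echo "consistent"
  else
    echo "inconsistent: channels=$channels rows=$rows"
  fi
}

check_seqno_rows 10 10  # consistent
check_seqno_rows 1 10   # the mismatch from the error message above
```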

Disabling Parallel Apply

Should you wish to turn off Parallel Apply, the key is to take the Replicator offline gracefully, then change svc-parallelization-type to none and the number of channels to 1:

shell$ cctrl
cctrl> set policy maintenance
cctrl> exit
shell$ trepctl offline
~or~
shell$ trepctl offline-deferred -at-seqno {seqno}
shell$ vi /etc/tungsten/tungsten.ini
[defaults]
…
svc-parallelization-type=none
channels=1
shell$ tpm query staging
tungsten@db1:/opt/continuent/software/tungsten-clustering-6.1.15-8
shell$ echo The staging DIRECTORY is `tpm query staging | cut -d: -f2`
The staging DIRECTORY is /opt/continuent/software/tungsten-clustering-6.1.15-8
shell$ cd {STAGING_DIRECTORY}
shell$ ./tools/tpm update
shell$ cctrl
cctrl> set policy automatic
cctrl> exit

shell$ trepctl status | grep channels
channels               : 1

For more information, please visit the documentation: https://docs.continuent.com/tungsten-clustering-6.1/deployment-parallel-updateconfig.html#deployment-parallel-updateconfig-disable

Future Directions

On our roadmap are improvements to the provisioning tool, `tprovision`, which will check the Parallel Apply status and ensure that the provisioning source and target have matching configurations.

Wrap-Up

In this blog post we explored the implications and best practices for using Parallel Apply on a single node in the cluster.

About the Author

Eric M. Stone
COO and VP of Product Management

Eric is a veteran of fast-paced, large-scale enterprise environments with 35 years of Information Technology experience. With a focus on HA/DR, from building data centers and trading floors to worldwide deployments, Eric has architected, coded, deployed, and administered systems for a wide variety of disparate customers, from Fortune 500 financial institutions to SMBs.
