The Challenge
Performing DDL changes in a database is common practice for any DBA or Application Developer, but what often creates a head-scratching moment is how to handle those DDL changes that will take a long time to execute and/or cause table locking. Not everyone has the luxury of being able to stop running production applications for however long the change takes.
This was the challenge that came up recently with one of our clustering customers. They have a huge database, upwards of 700GB in size, with tables containing many millions of rows. They needed to rename some columns in these large tables but couldn't have any downtime, and they are also running a Multi-Site/Active-Active topology, which added an extra layer of complexity.
Typically, with additive DDL changes, such as adding a new table, or with renames on small tables where the operation is instantaneous, you simply make the change on the Primary node and allow it to flow through replication. For large changes that lock tables or may take hours to complete, it's not so simple.
Since we are in a clustering environment with the ability to isolate nodes, coupled with the Tungsten Replicator filters, we were able to provide a solution that allowed the customer to apply the changes first on the replica nodes, then promote one of the updated replicas, and finally make the changes on the old primary, all while keeping replication flowing, with only a brief pause when the new node was promoted.
In a previous blog post, we walked through the process of removing columns using the dropcolumn filter. In this blog post, we explore how to handle column renames.
Let’s step through this process and see how easy such challenging maintenance operations can be when you are using Tungsten Clustering to manage your MySQL databases…
Prerequisites
There is only one small prerequisite for performing this seamless maintenance operation: your MySQL binlog_format must be set to ROW.
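You can check the current value, and change it if required, from the MySQL prompt. A minimal sketch; remember that SET GLOBAL only affects new sessions:

mysql> SELECT @@global.binlog_format;
mysql> SET GLOBAL binlog_format = 'ROW';

To make the change permanent, also set binlog_format = ROW under the [mysqld] section of your my.cnf.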
Prep the Filters
To achieve this column rename task, we need to enable a few filters in the Replicators. Let's just review what these filters do and why we need them.
The colnames Filter
The colnames filter is enabled on the primary node and allows the extractor to include column names within the THL metadata. This isn't enabled by default (unless you are running a heterogeneous replication pipeline!) because it increases the size of the THL on disk. Expect to see the THL grow while the filter is active; however, once the column rename process has completed, you can revert this!
This is what a typical THL entry will look like without the colnames filter applied:
SEQ# = 220 / FRAG# = 0 (last frag)
- FILE = thl.data.0000000001
- TIME = 2025-11-12 10:52:06.0
...
- ACTION = INSERT
- SCHEMA = hr
- TABLE = employees
- ROW# = 0
- COL(1:) (size = 3) = 207
- COL(2:) (size = 4) = Bart
- COL(3:) (size = 7) = Simpson
- COL(4:) (size = 24) = bart@someemaildomain.com
- COL(5:) (size = 10) = 1354698745
- COL(6:) (size = 10) = 2025-11-12
- COL(7:) (size = 7) = AD_PRES
- COL(8:) (size = 8) = 15000.00
- COL(9:) (size = 0) = NULL
- COL(10:) (size = 0) = NULL
- COL(11:) (size = 2) = 90
After applying the colnames filter, you will start to see additional information in the output, for example:
SEQ# = 221 / FRAG# = 0 (last frag)
- FILE = thl.data.0000000001
- TIME = 2025-11-12 10:53:06.0
...
- ACTION = INSERT
- SCHEMA = hr
- TABLE = employees
- ROW# = 0
- COL(1: employee_id) (size = 3) = 208
- COL(2: first_name) (size = 4) = Lisa
- COL(3: last_name) (size = 7) = Simpson
- COL(4: email) (size = 24) = lisa@someemaildomain.com
- COL(5: phone_number) (size = 10) = 1354698547
- COL(6: hire_date) (size = 10) = 2025-11-12
- COL(7: job_id) (size = 7) = AD_PRES
- COL(8: salary) (size = 8) = 19000.00
- COL(9: commission_pct) (size = 0) = NULL
- COL(10: manager_id) (size = 0) = NULL
- COL(11: department_id) (size = 2) = 90
Note that when you enable the colnames filter, the changes only apply to THL generated AFTER enabling it. Existing THL on disk is not updated, so if you have a very busy system, you will need to monitor the THL on the replica nodes and only start the DDL changes once you see THL arriving on your nodes that contains the column names.
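One quick way to confirm when named-column THL has reached a node is to inspect the most recent event with the thl utility (a sketch; add the -service option if you run more than one service):

shell> thl list -last

Once the COL entries include names, for example COL(4: email) rather than COL(4:), that node is ready.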
The rename Filter
This filter is key to making the whole solution work. The rename filter allows you to remap schemas, tables, and columns between the extractor and the applier. Once enabled, the filter reads a CSV file that contains the transformation mapping you wish to implement. The filter is enabled on the appliers and scans the THL prior to applying it to the target database, making the necessary adjustments in-flight. In our example, we are simply going to remap column names, which is why we also have to enable the colnames filter.
The format of the CSV file is as follows:
origSchema,origTable,origColumn,newSchema,newTable,newColumn
There are two wildcard symbols you can use: in the origXxxx fields, the * wildcard means match all; in the newXxxx fields, the - symbol means no change from the original. Here are a few examples:
This example remaps the table userlog to new_userlog in every schema, with no column changes:
*,userlog,*,-,new_userlog,-
This example will remap every table from the hr schema to the resources schema:
hr,*,*,resources,-,-
This example, which is the one we will use, remaps a column name in a specific schema and table:
hr,employees,email,-,-,emailaddress
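Since the filter expects exactly six comma-separated fields per row, a malformed entry is an easy mistake to make. A quick, optional sanity check (a sketch using standard awk) that flags any row with the wrong field count:

shell> awk -F',' 'NF != 6 {print "line " NR ": " $0}' /opt/continuent/share/rename.csv

No output means every row is well formed.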
Worked Example
Now that we understand the two filters, let’s step through the process of making the changes.
In this example, we are going to rename the email column in the hr.employees table using the following DDL:
ALTER TABLE hr.employees CHANGE email emailaddress VARCHAR(25);
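One caution with CHANGE: the full column definition must be restated, so it is worth capturing the current definition first to be sure the rename does not also alter the type, length, or nullability:

mysql> SHOW CREATE TABLE hr.employees\G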
Step 1: Enable the Filters
This needs to be applied to every node across the full topology.
Within the /etc/tungsten/tungsten.ini configuration, you need to add the following within the [defaults] stanza:
svc-extractor-filters=colnames
svc-applier-filters=rename
property=replicator.filter.rename.definitionsFile=/opt/continuent/share/rename.csv
Next, create the rename.csv file as an empty file:
shell> touch /opt/continuent/share/rename.csv
Next, we apply the changes. This involves updating the software on all nodes. No outage will occur; however, the Replicator will be restarted, so replication will pause momentarily:
shell> tpm update
If you are also running the Multi-Site/Active-Active topology with a separate cross-site Replicator, these will also need updating:
shell> mm_tpm update
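Once the updates have completed, it is worth confirming that each Replicator has come back online before continuing. A quick check, run on each node:

shell> trepctl services

Every service listed should show a state of ONLINE.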
Finally, put the cluster into maintenance mode, making it ready for the next step. If you are running a multi-site setup, ensure you do this across all clusters.
shell> cctrl
cctrl> set policy maintenance
Step 2: Make the DDL Changes
Now work through the following steps on each replica node, one node at a time.
1. Isolate the replica within the cluster and take the Replicator offline:

shell> cctrl
cctrl> datasource <hostnamehere> shun
cctrl> replicator <hostnamehere> offline

2. If running a Multi-Site/Active-Active cluster, also stop the independent cross-site Replicators:

shell> mm_trepctl offline

3. Make the DDL change within the database:

mysql> ALTER TABLE hr.employees CHANGE email emailaddress VARCHAR(25);

4. Add the mapping to the rename.csv file:

hr,employees,email,-,-,emailaddress

5. Recover the node within the cluster:

shell> cctrl
cctrl> datasource <hostnamehere> welcome
cctrl> datasource <hostnamehere> online
cctrl> replicator <hostnamehere> online

6. Restart the cross-site Replicators, if applicable:

shell> mm_trepctl online

7. Repeat these steps on all remaining Replica nodes.
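Before moving on to the next node, it is worth a quick sanity check that the rename took effect and that replication is healthy. A minimal sketch, using the example table:

mysql> SHOW COLUMNS FROM hr.employees LIKE 'emailaddress';
shell> trepctl status | grep state

The SHOW COLUMNS query should return the renamed column, and the Replicator state should report ONLINE.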
Step 3: Promote a New Primary
As it stands right now, you have made the DDL change on all the Replicas, and the rename filter is remapping any occurrence of the email column to the new emailaddress column. When we perform a switch, and before we make the DDL change on the old Primary, we need a reverse version of the rename mapping: the old Primary will now be receiving changes that reference the new column name, and these need to be remapped back to the old column name until its own DDL has been applied. These steps need to happen on the Primary node(s):
1. Add the reverse mapping to the rename.csv:

hr,employees,emailaddress,-,-,email

2. Issue a switch:

shell> cctrl
cctrl> switch
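Once the switch completes, you can confirm the roles from within cctrl; the ls command lists each datasource and its current role (exact output varies by version):

shell> cctrl
cctrl> ls

Check that the promoted node is now shown in the primary (master) role.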
Step 4: Update the Old Primaries
Once the switch has completed, you can repeat the steps within “Step 2” to apply the DDL to the old primary, which is now a Replica, with one slight difference: at item 4, instead of adding the mapping to the rename filter, you need to empty the rename.csv file. It is important that the file exists, otherwise the Replicator will fail, so just remove all content and save it as an empty file.
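A simple way to empty the file while keeping it in place (the path matches the one configured in Step 1):

shell> cat /dev/null > /opt/continuent/share/rename.csv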
Finally, place the cluster back into automatic mode:
shell> cctrl
cctrl> set policy automatic
Step 5: Optional Cleanup
Now that the change has been made across all the nodes, you can remove the filters. To do this, you would simply revert the changes made in “Step 1” by removing the entries from the tungsten.ini files and re-running the tpm update commands.
Alternatively, if you know that you will have more changes to make in the future, you can leave the filters in place; however, ensure you empty out the rename.csv file to avoid any future confusion!