Achieving software upgrades without downtime is a challenge that few software products can handle, or even try to. Tungsten Clustering is one that can!
Since version 4.0, Tungsten Clustering has supported close-to-zero-downtime upgrades (a few milliseconds) in its tpm deployment tool, thanks to two Tungsten Connector improvements: “graceful stop” and start/retry.
The connector graceful-stop command gives ongoing connections a configurable timeout window in which to finish their sessions before the connector is fully stopped.
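To make the idea concrete, here is a minimal Java sketch of a graceful stop with a timeout. It is an illustration only, not Tungsten Connector's actual code: the class GracefulStop, the method gracefulStop and the use of an ExecutorService to model client sessions are all assumptions made for this example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration: sessions are modeled as tasks on an executor.
public class GracefulStop {

    /**
     * Stops accepting new sessions, waits up to timeoutMillis for the running
     * ones to finish, then interrupts whatever is left.
     * Returns true if every session finished inside the timeout window.
     */
    static boolean gracefulStop(ExecutorService sessions, long timeoutMillis)
            throws InterruptedException {
        sessions.shutdown(); // refuse new sessions, let running ones continue
        boolean drained = sessions.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        if (!drained) {
            sessions.shutdownNow(); // timeout elapsed: interrupt remaining sessions
        }
        return drained;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService sessions = Executors.newCachedThreadPool();
        // One simulated client session that needs 300 ms to finish.
        sessions.submit(() -> { Thread.sleep(300); return null; });
        System.out.println("drained in time: " + gracefulStop(sessions, 2000));
    }
}
```

The timeout is what keeps a graceful stop from turning into an endless one: a session that never ends is cut off once the window closes.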
With the start/retry feature, a new connector instance can be launched while another one is still bound to the server socket. It will wait for the socket to become available by retrying binding every 200ms by default (which is tunable), drastically reducing the window for application connection failures.
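The retry loop can be sketched in a few lines of Java. This is a simplified illustration under stated assumptions, not the connector's implementation: the class RetryBind, the method bindWithRetry and the maxAttempts parameter are hypothetical names chosen for this example, and only the 200ms interval comes from the description above.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical illustration of a start/retry bind loop.
public class RetryBind {

    /**
     * Tries to bind to the given port, waiting retryMillis between attempts,
     * until the bind succeeds or maxAttempts attempts have failed.
     */
    static ServerSocket bindWithRetry(int port, long retryMillis, int maxAttempts)
            throws IOException, InterruptedException {
        BindException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            ServerSocket socket = new ServerSocket(); // created unbound
            try {
                socket.bind(new InetSocketAddress(port));
                return socket; // the previous owner has released the port
            } catch (BindException e) {
                socket.close();
                last = e; // port still in use by the old instance
            }
            if (attempt < maxAttempts) {
                Thread.sleep(retryMillis);
            }
        }
        throw last != null ? last : new BindException("no bind attempt was made");
    }

    public static void main(String[] args) throws Exception {
        // Simulate the old instance: it owns the port and releases it after 500 ms.
        ServerSocket oldInstance = new ServerSocket(0);
        int port = oldInstance.getLocalPort();
        new Thread(() -> {
            try {
                Thread.sleep(500);
                oldInstance.close();
            } catch (Exception ignored) {
                // best effort in a demo
            }
        }).start();

        // The new instance retries every 200 ms until the port is free.
        try (ServerSocket newInstance = bindWithRetry(port, 200, 50)) {
            System.out.println("bound to port " + newInstance.getLocalPort());
        }
    }
}
```

With this approach, the window during which no process is listening is bounded by the retry interval, which is why a shorter interval means fewer failed connection attempts.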
Want to try it out? “tpm upgrade” already includes this feature!
On the way to full zero-downtime upgrades
An interesting improvement in the Linux 3.9 kernel offers a way for two programs (or two instances of a single program) to listen on the same TCP port: the SO_REUSEPORT socket option.
Yes, this cool new feature puts the holy grail of true zero-downtime upgrades within reach… but it is only available in recent distributions (CentOS/RHEL 7+, Debian 9, etc.) and in Java 9+.
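For the curious, this is roughly what SO_REUSEPORT looks like from Java 9+, using the standard StandardSocketOptions.SO_REUSEPORT option. It is a minimal sketch, not Tungsten Connector code: the class ReusePortDemo and its helper methods are hypothetical, and on a platform without SO_REUSEPORT support the setOption call throws UnsupportedOperationException.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.StandardSocketOptions;
import java.nio.channels.ServerSocketChannel;

// Hypothetical illustration: two listeners sharing one TCP port.
public class ReusePortDemo {

    /** Opens a listening channel on the port with SO_REUSEPORT enabled before binding. */
    static ServerSocketChannel listen(int port) throws IOException {
        ServerSocketChannel channel = ServerSocketChannel.open();
        try {
            channel.setOption(StandardSocketOptions.SO_REUSEPORT, true);
            channel.bind(new InetSocketAddress(port));
            return channel;
        } catch (IOException | RuntimeException e) {
            channel.close(); // do not leak the channel if the option or bind fails
            throw e;
        }
    }

    /** Returns the local TCP port a bound channel is listening on. */
    static int portOf(ServerSocketChannel channel) throws IOException {
        return ((InetSocketAddress) channel.getLocalAddress()).getPort();
    }

    public static void main(String[] args) throws IOException {
        // The "old" instance picks a free port; the "new" one binds to the same port.
        try (ServerSocketChannel oldInstance = listen(0);
             ServerSocketChannel newInstance = listen(portOf(oldInstance))) {
            System.out.println("two listeners on port " + portOf(newInstance));
        }
    }
}
```

Note that every listener has to set the option before binding; a socket bound without it still blocks any later bind on the same port.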
For the many production environments running older versions, Tungsten Connector already offers close-to-zero-downtime upgrades!