Redis HA with Redis Sentinel and VIP

For an actual project we decided to use Redis for some reasons. As there is availability a critical part, we discovered that Redis Sentinel can monitor Redis and handle an automatic master failover to a available slave.

Setting up the Redis replication was straight forward and even setting up Sentinel. Please keep in mind, if you configure Redis to require an authentication password, you even need to provide that for the replication process (masterauth) and for the Sentinel connection (auth-pass).

The more interesting part is, how to migrate over the clients to the new master in case of a failover process. While Redis Sentinel could also be used as configuration provider, we decided not to use this feature, as the application needs to request the actual master node from Redis Sentinel much often, which will maybe a performance impact.
The first idea was to use some kind of VRRP, implemented into keepalived or something like this. The problem with such a solution is, you need to notify the VRRP process when a redis failover is in progress.
Well, Redis Sentinel has a configuration option called 'sentinel client-reconfig-script':

# When the master changed because of a failover a script can be called in
# order to perform application-specific tasks to notify the clients that the
# configuration has changed and the master is at a different address.
# 
# The following arguments are passed to the script:
#
# <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>
#
# <state> is currently always "failover"
# <role> is either "leader" or "observer"
# 
# The arguments from-ip, from-port, to-ip, to-port are used to communicate
# the old address of the master and the new address of the elected slave
# (now a master).
#
# This script should be resistant to multiple invocations.

This looks pretty good and as there is provided a <role>, I thought it would be a good idea to just call a script which evaluates this value and based on it's return, it adds the VIP to the local network interface, when we get 'leader' and removes it when we get 'observer'. It turned out that this was not working as <role> didn't returned 'leader' when the local redis instance got master and 'observer' when got slave in any case. This was pretty annoying and I was short before giving up.
Fortunately I stumpled upon a (maybe) chinese post about Redis Sentinal, where was done the same like I did. On the second look I recognized that the decision was made on ${6} which is <to-ip>, nothing more then the new IP of the Redis master instance. So I rewrote my tiny shell script and after some other pitfalls this strategy worked out well.

Some notes about convergence. Actually it takes round about 6-7 seconds to have the VIP migrated over to the new node after Redis Sentinel notifies a broken master. This is not the best performance, but as we expect this happen not so often, we need to design the application using our Redis setup to handle this (hopefully) rare scenario.

Show Comments