High Availability Software-Based Load Balancer for CockroachDB
Recently, I was building an environment for a series of tests that used CockroachDB as the underlying data store. This happens all the time, but in this instance I also needed to build a high availability (HA) load balancer (LB) configuration, so that the environment would continue to function even if one of the software-based load balancers failed. In this post, I will explain how to build redundant load balancers using keepalived and haproxy.
To start out, I built the DB cluster on a number of nodes spread across multiple availability zones and multiple geographic regions, following the basic instructions from the CockroachDB documentation (https://www.cockroachlabs.com/docs/v21.1/install-cockroachdb-linux). One of the huge benefits of CRDB is that it provides redundancy and distributed execution of queries across the DB cluster. What it does not include is any kind of load balancing, as that is beyond its purpose. If we were to connect to any node in our cluster at this point, we could view the DB Console or issue SQL queries. But if the node our client is connected to were to experience an infrastructure failure, our workload would stop running.
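For completeness, starting the cluster boils down to running something like the commands below on each node and then initializing once. The addresses and locality values here are placeholders for my lab; the linked docs cover the full set of flags.

./cockroach start --insecure \
  --advertise-addr=10.13.1.11 \
  --join=10.13.1.11,10.13.2.11,10.13.3.11 \
  --locality=region=us-east-1,zone=us-east-1a \
  --background

# run once, against any node, to bootstrap the cluster
./cockroach init --insecure --host=10.13.1.11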
CockroachDB will generate a haproxy configuration for us if we ask it, using the command:

./cockroach gen haproxy --insecure --host $IP_OF_A_NODE_IN_THE_CLUSTER
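For reference, the generated file looks roughly like the following. The three server addresses below are placeholders for my lab nodes; the generated file will list whatever nodes are in your cluster, and the timeout values shown are just the generated defaults.

global
    maxconn 4096

defaults
    mode                tcp
    # Timeout values should be tuned for your specific use.
    timeout connect     10s
    timeout client      1m
    timeout server      1m
    # TCP keep-alive on the client side. The server already enables them.
    option              clitcpka

listen psql
    bind :26257
    mode tcp
    balance roundrobin
    option httpchk GET /health?ready=1
    server cockroach1 10.13.1.11:26257 check port 8080
    server cockroach2 10.13.2.11:26257 check port 8080
    server cockroach3 10.13.3.11:26257 check port 8080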
This will create an example haproxy.cfg file with a decent set of defaults for our CRDB cluster. The file can be transferred to another node that will run haproxy -f haproxy.cfg, and that node will act as a basic software load balancer in front of our CRDB cluster, distributing connections across all the nodes. This is the start of our HA software-based LB configuration. This scenario is much better: if one of the CRDB nodes has an infrastructure failure, our workload will continue to run. But if the haproxy node were to fail, we would be dead in the water again.
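To verify the load balancer is working, point a SQL client at the haproxy node rather than at a CRDB node directly. Here $IP_OF_HAPROXY_NODE is a placeholder for the haproxy node's address, and I am assuming the default SQL port from the generated config:

./cockroach sql --insecure --host=$IP_OF_HAPROXY_NODE --port=26257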
If we copy this haproxy configuration to a second node running haproxy, we now have a warm spare. This offers a very minimal amount of redundancy, but bringing the second haproxy node into service would take manual effort.
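That manual effort would look something like the sketch below, assuming the spare is reachable as lb2 and that clients are then repointed at it by hand:

# copy the generated config to the spare and start haproxy as a daemon
scp haproxy.cfg user@lb2:~/
ssh user@lb2 'haproxy -f ~/haproxy.cfg -D'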
But what if there were a way for the second haproxy node to automatically take over LB duties if the first one failed? If we incorporate keepalived into our design, it can make this happen for us. In short, keepalived creates a virtual IP (VIP) that you use to access the LB. Keepalived watches the LB node that is currently offering the VIP, and if that node stops responding, it recreates the VIP on the second LB node. This gives us a redundant LB configuration within our overall system. Of course, separate network paths, power, and storage will also play into the overall resiliency of the system.
A basic keepalived.conf file looks like the one below. From one LB node to the next, the unicast_src_ip and unicast_peer values are flipped. In this example, the VIP being presented is 10.13.1.40. The config below is for my second LB node, which has an IP of 10.13.1.39; the first LB node has an IP of 10.13.1.38. The configuration for the first LB node would be identical except with 10.13.1.39 and 10.13.1.38 swapped.
global_defs {
    notification_email {
    }
    router_id LVS_DEVEL
    vrrp_skip_check_adv_addr
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}

vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
}

vrrp_instance haproxy-vip {
    state BACKUP
    priority 100
    interface eth0                 # Network card
    virtual_router_id 60
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    unicast_src_ip 10.13.1.39      # The IP address of this machine
    unicast_peer {
        10.13.1.38                 # The IP address of peer machines
    }
    virtual_ipaddress {
        10.13.1.40/32              # The VIP address
    }
    track_script {
        chk_haproxy
    }
}
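With the configuration files in place on both LB nodes, keepalived and haproxy just need to be installed and started. A rough outline on a Debian/Ubuntu-style node is below; package names, paths, and the install command may differ on other distributions.

sudo apt-get install haproxy keepalived
sudo cp haproxy.cfg /etc/haproxy/haproxy.cfg
sudo cp keepalived.conf /etc/keepalived/keepalived.conf
sudo systemctl enable --now haproxy keepalived

# confirm which node currently holds the VIP
ip addr show eth0 | grep 10.13.1.40

# clients connect through the VIP from now on
./cockroach sql --insecure --host=10.13.1.40 --port=26257

To test the failover, stop haproxy on the node currently holding the VIP (for example with systemctl stop haproxy). The chk_haproxy script will start failing on that node, and keepalived will move 10.13.1.40 over to the peer within a few seconds, so the workload keeps running through the same address.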