
A DNS load balanced HA cluster with Bind9 and BalanceNG

Abstract

This example shows how to set up a dual-node, load-balanced and highly available DNS cluster with Bind9 and BalanceNG in a few easy steps. This is exactly the way we operate our own primary DNS server in our DMZ.

Step 1: Preparing the Nodes

We picked two 1U boxes with two onboard Intel-based 1 GBit interfaces, which is just fine for this setup. We decided not to use a “rotating” hard disk and instead installed a 2 GByte Transcend flash disk module, which plugs directly into the IDE interface connector on the motherboard. The benefits of a solid-state flash disk are substantially better MTBF figures and reduced power consumption.

The OS in use for this setup is Ubuntu Server LTS.

Step 2: Physical Network Configuration

Both nodes are connected to the 100 MBit core switch in the DMZ. The eth0 interface is configured as usual, and the eth1 interface is used exclusively by BalanceNG in DSR (Direct Server Return) mode.

The network setup looks like this:

Dual Node Bind9/BalanceNG cluster in DSR mode

Step 3: Linux Network Configuration

We decided to call one node ns0a and the other ns0b, since there is no dedicated VRRP mastership (the node that boots first becomes master first).

The DNS cluster virtual IP address will be 10.235.210.1, which requires a loopback alias to be established. Node ns0a gets the Linux OS address 10.235.210.43 and node ns0b the address 10.235.210.44 (both used for normal Linux operation e.g. SSH login).

/etc/network/interfaces of ns0a looks like this:

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0 lo:0

iface eth0 inet static
 address 10.235.210.43
 netmask 255.255.255.0
 gateway 10.235.210.254

iface lo:0 inet static
 address 10.235.210.1
 netmask 255.255.255.255

and /etc/network/interfaces of ns0b looks like this:

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0 lo:0

iface eth0 inet static
 address 10.235.210.44
 netmask 255.255.255.0
 gateway 10.235.210.254

iface lo:0 inet static
 address 10.235.210.1
 netmask 255.255.255.255
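Once the interfaces are up, it can be worth confirming on each node that the virtual address really sits on the loopback device. The following is only a minimal sketch and not part of the original setup: the helper name vip_on_loopback is made up for illustration, and it assumes the iproute2 "ip" tool is installed.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article).
# Reads the output of `ip -o -4 addr show dev lo` on stdin and reports
# whether the given address is configured on the loopback device.
vip_on_loopback() {
    vip="$1"
    # -F: fixed string, so the dots in the address are not regex wildcards.
    # The trailing "/" keeps 10.235.210.1 from matching 10.235.210.10.
    if grep -qF "inet $vip/"; then
        echo "present"
    else
        echo "absent"
        return 1
    fi
}
```

On a node it would be fed like this: ip -o -4 addr show dev lo | vip_on_loopback 10.235.210.1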

The following lines need to be appended to /etc/sysctl.conf on both nodes in order to prevent “ARP flux” problems, as described in our FAQ:

net.ipv4.conf.all.arp_ignore=1
net.ipv4.conf.all.arp_announce=2
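A quick way to make sure neither line was forgotten on one of the nodes is a small check like the one below. It is a sketch under assumptions: the function name check_arp_sysctl is hypothetical, and it expects the settings to be written exactly as shown above (no spaces around the equals sign).

```shell
#!/bin/sh
# Hypothetical helper (not from the original article).
# Reports whether a sysctl.conf-style file contains both ARP settings
# from this step, written exactly as in the listing above.
check_arp_sysctl() {
    conf="$1"
    for setting in \
        "net.ipv4.conf.all.arp_ignore=1" \
        "net.ipv4.conf.all.arp_announce=2"
    do
        # -x: match the whole line, -F: treat the setting as a fixed string
        if ! grep -qxF "$setting" "$conf"; then
            echo "missing: $setting"
            return 1
        fi
    done
    echo "ok"
}
```

It would be called as check_arp_sysctl /etc/sysctl.conf. After appending the lines, running sysctl -p as root loads them without a reboot.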

Step 4: Installation of required Packages

The following Ubuntu/Debian packages need to be installed on both nodes as the next step:

  • bind9
  • mon
  • libnet-dns-perl
  • BalanceNG (.deb package from the download page)

Step 5: Licensing

The nodeid of the BalanceNG host can be retrieved as follows:

# bng -N 
11:22:33:44:55:66

The license is activated by the “license” configuration command which we insert into the file /etc/bng.global. This makes the license active for all instances of BalanceNG on the particular node (or Linux box).

After key generation, we insert this line into /etc/bng.global on ns0a:

license NS0ATEST 17d17854ad3d234d1e8603629f0ae5ae

and on ns0b this line:

license NS0BTEST 124f4ebf63e3f5dab69fb813ef3d0216

Licensing can then be verified as follows:

# bng control
BalanceNG: connected to PID 14598
bng# show license
 status: valid full license
 serial: NS0ATEST 
 nodeid: 11:22:33:44:55:66
 type "show version" for version and Copyright information
bng#

Step 6: Bind9 Preparation

This step is done as usual. We just have to make sure that Bind9 is listening on the following addresses:

  • The loopback address 127.0.0.1,
  • the virtual loopback alias address 10.235.210.1,
  • and on the eth0 native address (10.235.210.43 on ns0a and 10.235.210.44 on ns0b).

We simply used the following line in the options section of /etc/bind/named.conf to achieve this:

listen-on {any;};

After configuration we established an SSH key trust relationship between the nodes and wrote a small script that keeps the zone files in /etc/bind on both nodes in sync.
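The sync script itself is not shown here. As an illustration only, such a script might look like the sketch below. Everything in it is an assumption: it presumes rsync is installed on both nodes, that the SSH trust is set up for root, and that "rndc reload" is enough to make the peer pick up changed zones. The names run and sync_zones are made up.

```shell
#!/bin/sh
# Hypothetical sketch (the original sync script is not published).
# With DRY_RUN=1 the commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Copy /etc/bind to the peer node and ask its Bind9 to reload.
sync_zones() {
    peer="$1"
    run rsync -az /etc/bind/ "root@$peer:/etc/bind/" &&
    run ssh "root@$peer" rndc reload
}
```

On ns0a this would be invoked as sync_zones ns0b. Setting DRY_RUN=1 first prints the two commands so they can be reviewed before anything is copied.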

The remaining part of the Bind9 configuration is “as usual” and not in the scope of this example. The “BIND 9.5 Administrator Reference Manual” is a very helpful resource for this task.

Step 7: BalanceNG agent configuration and script

For this setup we decided to use the health check script capabilities of the BalanceNG agent “bngagent”. For this purpose, the file /etc/rc.local on both nodes looks like this:

#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.

bngagent -c"/etc/bind/agentcheck.sh" 439
exit 0

The script /etc/bind/agentcheck.sh looks like this (using the dns.monitor of the “mon” monitoring daemon):

#!/bin/sh
/usr/lib/mon/mon.d/dns.monitor -zone balanceng.net -master 10.235.210.1 10.235.210.1
if [ "$?" = "0" ]
then
 echo 1
else
 echo 0
fi

In this case the dns.monitor checks the presence of Bind9 by contacting the DNS service on the loopback alias address (which is identical to the BalanceNG virtual server address of “server 1”).
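The pattern in agentcheck.sh (run a check, print 1 on success and 0 on failure) can be factored into a small wrapper, which makes it easy to try the logic by hand. This is a sketch, not the script we run: the function name report_status is made up, and discarding the monitor's own output is an assumption about what is convenient, not a documented bngagent requirement.

```shell
#!/bin/sh
# Hypothetical wrapper (not from the original article).
# Runs the given command and prints 1 if it exits with status 0,
# otherwise 0. The command's own output is discarded.
report_status() {
    if "$@" >/dev/null 2>&1; then
        echo 1
    else
        echo 0
    fi
}
```

With this helper the check from above would read: report_status /usr/lib/mon/mon.d/dns.monitor -zone balanceng.net -master 10.235.210.1 10.235.210.1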

Step 8: The BalanceNG Configuration Files

BalanceNG configuration for NS0A

 // configuration taken Thu May 1 20:47:29 2008
 // BalanceNG 2.084 (created 2008/05/01)
 hostname NS0A
 set localdsr 1
 interface eth1
 vrrp {
   vrid 7
   priority 200
   network 1
 }
 network 1 {
   name "local network"
   addr 10.235.210.0
   mask 255.255.255.0
   real 10.235.210.45
   interface eth1
 }
 register network 1
 enable network 1
 server 1 {
   ipaddr 10.235.210.1
   port 53
   method session
   targets 1,2
 }
 register server 1
 enable server 1
 target 1 {
   ipaddr 10.235.210.43
   port 53
   ping 5,12
   agent 439,5,13
   tcpopen 53,5,12
   dsr enable
 }
 target 2 {
   ipaddr 10.235.210.44
   port 53
   ping 5,12
   agent 439,5,13
   tcpopen 53,5,12
   dsr enable
 }
 register targets 1,2
 enable targets 1,2
 // end of configuration

BalanceNG configuration for NS0B

 // configuration taken Thu May 1 20:49:00 2008
 // BalanceNG 2.084 (created 2008/05/01)
 hostname NS0B
 set localdsr 1
 interface eth1
 vrrp {
   vrid 7
   priority 200
   network 1
 }
 network 1 {
   name "local network"
   addr 10.235.210.0
   mask 255.255.255.0
   real 10.235.210.46
   interface eth1
 }
 register network 1
 enable network 1
 server 1 {
   ipaddr 10.235.210.1
   port 53
   method session
   targets 1,2
 }
 register server 1
 enable server 1
 target 1 {
   ipaddr 10.235.210.43
   port 53
   ping 5,12
   agent 439,5,13
   tcpopen 53,5,12
   dsr enable
 }
 target 2 {
   ipaddr 10.235.210.44
   port 53
   ping 5,12
   agent 439,5,13
   tcpopen 53,5,12
   dsr enable
 }
 register targets 1,2
 enable targets 1,2
 // end of configuration
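In the two listings above only the hostname and the "real" address of network 1 differ between the nodes (apart from the timestamp comment). A small comparison like the following can help catch accidental drift between the two files. It is a sketch only: the function name bng_conf_diff is made up, and the file names in the usage line are hypothetical saved copies of the two configurations.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article).
# Prints the lines that differ between two BalanceNG configuration
# files, ignoring "//" comment lines such as the timestamp header.
bng_conf_diff() {
    left=$(mktemp)
    right=$(mktemp)
    grep -v '^ *//' "$1" > "$left"
    grep -v '^ *//' "$2" > "$right"
    # Keep only the changed lines ("<" from the first file, ">" from the second)
    diff "$left" "$right" | grep '^[<>]'
    rm -f "$left" "$right"
}
```

Called as bng_conf_diff ns0a.conf ns0b.conf, it should list only the hostname and real lines for the configurations shown here.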

Step 9: Testing

At the very end you should be able to see BalanceNG sessions being created and name resolution should work like a charm. A typical CLI dialog could look like this (on the current VRRP master NS0A):

root@ns0a:~# bng control
BalanceNG: connected to PID 10668
NS0A# show vrrp
 state MASTER
 vrid 7
 priority 200
 ipaddr0 82.135.110.1
NS0A# show sessions
 71 sessions
 hash ip-address port srv tgt age stout
 -------- --------------- ----- --- --- ---- -------
 5604999 10.85.134.135 any 1 1 18 600
 5604997 10.85.134.133 any 1 2 18 600
 80901 10.1.60.5 any 1 1 20 600
 15546371 10.237.56.3 any 1 2 21 600
 13763506 10.210.3.178 any 1 2 44 600
 1679371 10.25.160.11 any 1 2 46 600
 15369391 10.234.132.175 any 1 2 58 600
 5915664 10.90.68.16 any 1 1 102 600
 6888621 10.105.28.173 any 1 1 107 600
 15571300 10.237.153.100 any 1 2 109 600
 2328074 10.35.134.10 any 1 1 113 600
 ... remaining sessions not shown
NS0A#

On the current backup (NS0B in this example) you can verify that the sessions are being “learned” correctly:

root@ns0b:~# bng control
BalanceNG: connected to PID 10359
NS0B# show vrrp
 state BACKUP
 vrid 7
 priority 200
 ipaddr0 82.135.110.1
NS0B# show sessions
 71 sessions
 hash ip-address port srv tgt age stout
 -------- --------------- ----- --- --- ---- -------
 5604999 10.85.134.135 any 1 1 18 600
 5604997 10.85.134.133 any 1 2 18 600
 80901 10.1.60.5 any 1 1 20 600
 15546371 10.237.56.3 any 1 2 21 600
 13763506 10.210.3.178 any 1 2 44 600
 1679371 10.25.160.11 any 1 2 46 600
 15369391 10.234.132.175 any 1 2 58 600
 5915664 10.90.68.16 any 1 1 102 600
 6888621 10.105.28.173 any 1 1 107 600
 15571300 10.237.153.100 any 1 2 109 600
 2328074 10.35.134.10 any 1 1 113 600
 ... remaining sessions not shown
NS0B#
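Name resolution itself can be checked from a client in the same network by querying the virtual address and, for comparison, both native node addresses. The helper below is a sketch under assumptions: it presumes the "dig" utility is installed, the function name query_cluster is made up, and the DIG override exists only so the commands can be printed instead of executed.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article).
# Queries the SOA record of a zone on each given address.
# Setting DIG=echo prints the dig arguments instead of running dig.
query_cluster() {
    zone="$1"
    shift
    for addr in "$@"; do
        ${DIG:-dig} "@$addr" "$zone" SOA +short +time=2 +tries=1
    done
}
```

For this cluster the call would be query_cluster balanceng.net 10.235.210.1 10.235.210.43 10.235.210.44. Each address should return the same SOA record if the zone files are in sync.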