Aug 27 2014

Cheap Microsoft Exchange Cluster with LVS Load Balancer

Published at 3:56 pm under Linux




Microsoft Exchange traffic can be load-balanced with Microsoft NLB, but that requires setting up 4 Exchange servers because NLB isn’t compatible with (Exchange) clustering on the same server. Most Microsoft people do not recommend it.
Among NLB’s downsides, Exchange servers are monitored with a basic ping, meaning a server is considered active even if its Exchange services have been turned off.
 
Hardware-based load balancers are advised in most setups but are pretty expensive. Since I am going to virtualize the Exchange servers on Hyper-V, a software-based load balancer makes sense.
This is what led me to Linux Virtual Server (LVS): no extra cost and few resources.
 
Die-hard Linux and Microsoft advocates may not be happy with it, but Red Hat Linux is fully supported on Hyper-V and has been optimized for it with a few enhancements.
 

Network

Each Exchange virtual machine will be hosted on a different physical server, along with a Linux VM, for redundancy. The active load balancer (Master LB1) will distribute packets to the “real” servers through a private network. The private network needs to be connected to a physical port so that VMs on the 2 machines can see each other.

 
Create the 2 networks and 4 VMs in Hyper-V, giving 1 NIC to each Exchange server and 2 NICs to each LVS server. Assign IPs according to your network topology.
 

LVS Setup

First edit /etc/sysctl.conf to allow ip forwarding:

# Controls IP packet forwarding
net.ipv4.ip_forward = 1

Then reboot or run
sysctl -p /etc/sysctl.conf
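
To confirm the setting took effect, you can read the live value the kernel exposes under /proc. This is a minimal sketch: the function name forwarding_enabled is made up for illustration, and its optional file argument is only there to make it easy to try against a test file.

```shell
# Succeeds (exit status 0) when the given file contains "1".
# Defaults to the kernel's live IPv4 forwarding flag.
forwarding_enabled() {
    flag_file="${1:-/proc/sys/net/ipv4/ip_forward}"
    [ "$(cat "$flag_file" 2>/dev/null)" = "1" ]
}

if forwarding_enabled; then
    echo "IP forwarding is on"
else
    echo "IP forwarding is off"
fi
```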
 
As a general guide, you can follow the Red Hat LVS manual.
Set a password by running piranha-passwd on the Linux server that is going to be the master, then start the pulse and piranha-gui services.

/etc/init.d/pulse start
/etc/init.d/piranha-gui start

You should be able to reach the configuration page on port 3636 with a web browser, i.e. http://IP:3636. Log in with piranha as the username and the password you set above.
 
Once logged in to the Piranha configuration page, fill in the “Global Settings” and “Redundancy” pages. It is pretty straightforward: the virtual gateway for the real servers goes on sub-interface eth1:1. I ticked neither “Monitor link” nor “Use sync daemon” since they wouldn’t work on my setup.

 

 
This is where we get to the point. We could create a virtual server for each port to monitor. I decided instead to mark all packets with a single firewall mark, so only one service needs monitoring and there is no need to write different monitoring scripts. You can still create one virtual server per port if you’d like to.

 
I monitor port 80 since Exchange listens on HTTP.
Set a number in the firewall mark field to group all the ports to be redirected: I’ve set 80, but it could be anything else, it’s only a tag. Just remember to set the same number in iptables below.
I chose to set the IP address on eth0:3; I’ll explain why later on.
Add the IPs of the 2 real servers and leave the default monitoring script in the last tab.
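
Piranha manages the kernel's IPVS table for you, so you would not normally run ipvsadm by hand here. Still, it can help to see roughly what a firewall-mark virtual server with NAT-ed real servers looks like in ipvsadm terms. The sketch below only prints the commands. The function name, the wlc scheduler and the 10.0.1.2 address for Exchange2 are assumptions for illustration, not values taken from my setup.

```shell
# Print the ipvsadm commands for one firewall-mark virtual service using
# NAT (-m). Arguments: mark, scheduler, then one or more real server IPs.
print_fwmark_service() {
    mark="$1"; scheduler="$2"; shift 2
    echo "ipvsadm -A -f $mark -s $scheduler"
    for real_server in "$@"; do
        echo "ipvsadm -a -f $mark -r $real_server -m -w 1"
    done
}

# 10.0.1.2 is an assumed address for Exchange2.
print_fwmark_service 80 wlc 10.0.1.1 10.0.1.2
```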
 

Iptables

We’ll now set up iptables to mark packets with number 80. The filter table (1st part below) filters packets and the mangle table (2nd part) marks them.
Here’s what I have in /etc/sysconfig/iptables:

# Completed on Fri Mar 21 15:42:56 2014
# Generated by iptables-save v1.4.7 on Fri Mar 21 15:42:56 2014
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [353:70931]
:OUTPUT ACCEPT [908:61941]
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 25 -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 80 -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 443 -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 3636 -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 2525 -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 587 -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 465 -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 717 -j ACCEPT
-A INPUT -d 192.168.1.36/32 -j ACCEPT
-A INPUT -d 192.168.1.37/32 -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -i eth0 -o eth1:1 -m state --state RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i eth1:1 -o eth0 -j ACCEPT
COMMIT
# Completed on Fri Mar 21 15:42:56 2014
# Generated by iptables-save v1.4.7 on Fri Mar 21 15:42:56 2014
*mangle
:PREROUTING ACCEPT [87155:9160406]
:INPUT ACCEPT [82925:8666308]
:FORWARD ACCEPT [3859:469324]
:OUTPUT ACCEPT [64873:3999893]
:POSTROUTING ACCEPT [68732:4469217]
-A PREROUTING -d 192.168.1.35/32 -p tcp -m tcp --dport 80 -j MARK --set-xmark 0x50/0xffffffff
-A PREROUTING -d 192.168.1.35/32 -p tcp -m tcp --dport 443 -j MARK --set-xmark 0x50/0xffffffff
-A PREROUTING -d 192.168.1.35/32 -p tcp -m tcp --dport 25 -j MARK --set-xmark 0x50/0xffffffff
-A PREROUTING -d 192.168.1.35/32 -p tcp -m tcp --dport 2525 -j MARK --set-xmark 0x50/0xffffffff
-A PREROUTING -d 192.168.1.35/32 -p tcp -m tcp --dport 587 -j MARK --set-xmark 0x50/0xffffffff
-A PREROUTING -d 192.168.1.35/32 -p tcp -m tcp --dport 465 -j MARK --set-xmark 0x50/0xffffffff
-A PREROUTING -d 192.168.1.35/32 -p tcp -m tcp --dport 717 -j MARK --set-xmark 0x50/0xffffffff
COMMIT
# Completed on Fri Mar 21 15:42:56 2014

Note that iptables displays mark numbers in hexadecimal, i.e. 80 becomes 0x50.
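
If you have more ports to add, typing each mangle rule by hand gets tedious. Here is a small sketch that prints the hexadecimal form of a mark and generates one PREROUTING rule per port. The two function names are made up for illustration. The rules use the decimal --set-mark form, which iptables accepts and then shows back as --set-xmark in hexadecimal.

```shell
# Print the hexadecimal form iptables-save shows for a decimal mark.
mark_to_hex() {
    printf '0x%x\n' "$1"
}

# Print one mangle PREROUTING rule per port for a given VIP and mark.
# Arguments: virtual IP, decimal mark, then the TCP ports.
print_mark_rules() {
    vip="$1"; mark="$2"; shift 2
    for port in "$@"; do
        echo "-A PREROUTING -d $vip/32 -p tcp -m tcp --dport $port -j MARK --set-mark $mark"
    done
}

mark_to_hex 80
print_mark_rules 192.168.1.35 80 80 443 25 2525 587 465 717
```

The printed lines go in the *mangle section of /etc/sysconfig/iptables, before its COMMIT.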
The main configuration is done. What remains is the 2nd load balancer, remote access to the real servers, and NAT. Let’s go through them.
 

Real Servers Management

The real servers Exchange1 & 2 can be managed from Hyper-V, but this is far from handy.
In order to manage the 2 servers, I created 2 extra virtual servers on LVS, each of which always redirects its IP to a single real server.

I marked packets with numbers 1 & 2, for Exchange1 and Exchange2 respectively, by adding the following lines to the mangle table in iptables.

-A PREROUTING -d 192.168.1.36/32 -j MARK --set-xmark 0x1/0xffffffff
-A PREROUTING -d 192.168.1.37/32 -j MARK --set-xmark 0x2/0xffffffff

 
Restart pulse and iptables services and you should get the following interfaces:

[root@lb1 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:15:5D:0F:29:0E
          inet addr:192.168.1.33  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::215:5dff:fe0f:290e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:481004253 errors:0 dropped:73168 overruns:0 frame:0
          TX packets:305264787 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:380465051896 (354.3 GiB)  TX bytes:204324705710 (190.2 GiB)
          Interrupt:9 Base address:0x6000

eth0:1    Link encap:Ethernet  HWaddr 00:15:5D:0F:29:0E
          inet addr:192.168.1.36  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:9 Base address:0x6000

eth0:2    Link encap:Ethernet  HWaddr 00:15:5D:0F:29:0E
          inet addr:192.168.1.37  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:9 Base address:0x6000

eth0:3    Link encap:Ethernet  HWaddr 00:15:5D:0F:29:0E
          inet addr:192.168.1.35  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:9 Base address:0x6000

eth1      Link encap:Ethernet  HWaddr 00:15:5D:0F:29:0F
          inet addr:10.0.1.33  Bcast:10.0.1.255  Mask:255.255.255.0
          inet6 addr: fe80::215:5dff:fe0f:290f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:320686576 errors:0 dropped:0 overruns:0 frame:0
          TX packets:425280520 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:209496440714 (195.1 GiB)  TX bytes:375879165808 (350.0 GiB)

eth1:1    Link encap:Ethernet  HWaddr 00:15:5D:0F:29:0F
          inet addr:10.0.1.254  Bcast:10.0.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

 

You can now reach Exchange1 & 2 on IPs 192.168.1.36 & .37.
 

Let Real Servers Connect to the Outside

A major problem is that the real servers cannot connect to the outside world since they’re on a private network. This can be solved with a NAT rule on the LVS server. Add this to iptables:

# Generated by iptables-save v1.4.7 on Fri Mar 21 15:42:56 2014
*nat
:PREROUTING ACCEPT [433:57174]
:POSTROUTING ACCEPT [46:2760]
:OUTPUT ACCEPT [46:2760]
-A POSTROUTING -o eth0 -j MASQUERADE
COMMIT

 
You can find out more about IP masquerading in this HOWTO.
Restart the pulse service once you’re happy with the LVS configuration, and copy it over to the backup server along with the iptables configuration file.
You can use scp for this purpose:
scp /etc/sysconfig/ha/lvs.cf lb2:/etc/sysconfig/ha/lvs.cf
scp /etc/sysconfig/iptables lb2:/etc/sysconfig/iptables
This assumes lb2 is registered in your DNS.
Do this every time you make a change to either LVS or iptables.
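
Since the copy has to be repeated after every change, it is easy to wrap in a small function. This is only a sketch: the name sync_lb_config and the DRY_RUN variable are my own inventions. It defaults to printing the scp commands, so nothing is copied until you set DRY_RUN=0.

```shell
# Copy the LVS and iptables configuration to the backup load balancer.
# With DRY_RUN=1 (the default here) the scp commands are only printed.
sync_lb_config() {
    backup_host="${1:-lb2}"
    for conf_file in /etc/sysconfig/ha/lvs.cf /etc/sysconfig/iptables; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "scp $conf_file $backup_host:$conf_file"
        else
            scp "$conf_file" "$backup_host:$conf_file" || return 1
        fi
    done
}

sync_lb_config lb2
```

Run it as DRY_RUN=0 sync_lb_config lb2 once the printed commands look right.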
 

RDP Monitoring

Instead of monitoring port 80 on Exchange1 and Exchange2, it is best to check RDP.
Download check_x224 (written for Nagios), to /usr/local/bin for instance, and make it executable:

# cd /usr/local/bin/
# chmod +x check_x224

 
Edit the script and remove the following block:
if elapsed > critical_sec:
    print('x224 CRITICAL: RDP connection setup time (%f) was longer than (%d) seconds' % (elapsed,critical_sec))
    sys.exit(2)
if elapsed > warning_sec:
    print('x224 WARNING: RDP connection setup time (%f) was longer than (%d) seconds' % (elapsed,warning_sec))
    sys.exit(1)

 
Then replace all instances of
print('x224 OK. Connection setup time: %f sec.|time=%fs;%d;%d;0' % (elapsed,elapsed,warning_sec,critical_sec))
with
print('OK')
 
Finally, enter the following in the “Sending Program” field of the Piranha virtual server’s monitoring script tab:
/usr/local/bin/check_x224 -H 10.0.1.1
This is for Exchange1; replace the IP for Exchange2.

This will return OK if RDP is up on the Exchange server.
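
For services other than RDP, or to sanity-check a port by hand, a plain TCP connect test can serve as a rough stand-in. Note that this is a different and weaker technique than check_x224: it only verifies that the port accepts connections and does not speak the RDP protocol. The function name tcp_check and the 3-second timeout are assumptions for illustration. It relies on bash's /dev/tcp feature and the coreutils timeout command.

```shell
# Print "OK" when a TCP connection to host:port succeeds within 3 seconds,
# "FAIL" otherwise. The port defaults to 3389 (RDP).
tcp_check() {
    host="$1"; port="${2:-3389}"
    if timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
        echo "OK"
    else
        echo "FAIL"
    fi
}

# Example: tcp_check 10.0.1.1 3389
```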
 

Persistence and Failover

You have to set a value in the virtual server’s persistence setting (600 seconds here) so that LVS keeps redirecting a client’s connections to the same server for that period. If you don’t, successive connections from a single client would be load-balanced across multiple servers and sessions would break.
If a real server goes down, persistence keeps sessions bound to the crashed server, making the service unavailable to those clients.
To avoid this behaviour, add the following lines to /etc/sysctl.conf so they persist across reboots, and run sysctl -p to apply them immediately:

net.ipv4.vs.expire_quiescent_template = 1
net.ipv4.vs.expire_nodest_conn = 1

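The first line expires persistence entries pointing to a real server that has become unavailable (quiesced, i.e. set to weight 0), so its clients are sent to another server.
The second expires existing connections when their real server is removed from the LVS table.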
All useful information is available on http://www.austintek.com.
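
To check that both settings are actually live, you can read them back from /proc. A minimal sketch: the function name show_vs_expiry is made up, and the optional directory argument is only there so it can be pointed at a test directory.

```shell
# Print each LVS expiry setting and its current value. "unset" means the
# key is absent, typically because the ip_vs module is not loaded.
show_vs_expiry() {
    vs_dir="${1:-/proc/sys/net/ipv4/vs}"
    for key in expire_quiescent_template expire_nodest_conn; do
        if [ -r "$vs_dir/$key" ]; then
            echo "$key = $(cat "$vs_dir/$key")"
        else
            echo "$key = unset"
        fi
    done
}

show_vs_expiry
```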
 
If sysctl -p returns the following errors at boot on your Red Hat distribution:

error: "net.ipv4.vs.expire_quiescent_template" is an unknown key
error: "net.ipv4.vs.expire_nodest_conn" is an unknown key

You need to load the ip_vs module at boot time. Create the following executable file:

echo "modprobe ip_vs" >/etc/sysconfig/modules/lvs.modules
chmod +x /etc/sysconfig/modules/lvs.modules

 
It is worth checking ipvsadm’s manual as well. It lets you list live connections, for instance:
# ipvsadm -Lcn
 

DNS registration in AD

One last thing: Exchange servers will register in AD with their private IPs.
Edit DNS records on your Domain controller with external IPs so name resolution returns the correct values.
Make sure to disable the DNS auto registration in the IPv4 settings of your network cards if you don’t want your DNS settings to be overwritten.
It is also best to add your Exchange servers’ internal IPs to their hosts files so they talk directly to each other.
 

Update

If you upgrade an Exchange server, you’re better off disabling it first so no one makes a new connection while services go up and down.
Run

# ipvsadm -l

to check the current status and

ipvsadm -e -f 80 -r exchange_server_ip:80 -w 0 -m

to set the weight to 0. Use -w 1 to set it back to 1.
Current connections won’t move to the other server until services go down, but no one new will connect until the weight is back to 1.
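
The drain and restore steps can be wrapped in one helper so the same command is used both ways. This is a sketch under assumptions: the name set_exchange_weight and the DRY_RUN variable are mine, and 10.0.1.1 stands in for the real server IP. By default it only prints the ipvsadm command.

```shell
# Print (DRY_RUN=1, the default here) or run the ipvsadm command that sets
# the weight of one real server behind firewall mark 80.
set_exchange_weight() {
    server_ip="$1"; weight="$2"
    cmd="ipvsadm -e -f 80 -r $server_ip:80 -w $weight -m"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"
    else
        $cmd
    fi
}

set_exchange_weight 10.0.1.1 0   # drain Exchange1 before the upgrade
set_exchange_weight 10.0.1.1 1   # put it back in rotation afterwards
```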



2 Responses to “Cheap Microsoft Exchange Cluster with LVS Load Balancer”

  1. Mallikarjuna on 17 Sep 2014 at 12:51 pm

    I uses check_x224 on loadbalancer server but i got this error : localhost nanny[8500]: Ran the external sending program to (192.168.xxx.xxx:8080) but didn’t get anything back

    Due to this error when application server(192.168.xxx.xxx:8080) is down loadbalancer is not recognising.
    Please help me ……….

  2. dave on 04 Oct 2014 at 9:08 am

    check_x224 runs the RDP protocol on port 3389.
    You need the proper check script for port 8080 (HTTP I suppose)
