Failover cluster with 16 ip's - subnet question



  • Hi, I've built a PfSense 2.0.2 failover cluster in a test setting and it works great. I've used this howto and I have two identical machines, each with a quad port Intel card (PfSense treats it as four separate interfaces, em0 through em3). Again, it works great.

    Now my production router (also PfSense 2.0.2) gets a 16 ip address block from the isp's glass router. The isp's router has a lan port which connects to my PfSense router via a regular ethernet cable. Nothing fancy there.

    I have a couple of servers - web servers, mail server, etc. - which are available from the internet using their own ip addresses, namely the addresses from the 16 ip address block. This also works great. It looks like this:


    As you can see, the isp needs a 100Mb full duplex connection. I made sure the switch between the routers' wan ports and the isp's router is a 100Mb full duplex switch.


    (The VHID Group numbers are for easy identification. I'm not sure if they should really be in the same group but it works like this so I never changed it.)

    Now comes the problem.

    I needed to move the new cluster to the production environment. So instead of the temporary private ip range on the test routers' wan interface, I used two of our public ip addresses from our 16 address block that were not yet in use.

    I set x.x.x.50/32 as the virtual wan ip and x.x.x.63/28 and x.x.x.62/28 as the wan addresses. Obviously, this is wrong. No internet connection was available.
    So I tried setting x.x.x.50/28 as a virtual wan ip and x.x.x.63/32 and x.x.x.62/32 as real wan addresses but PfSense wouldn't allow that because x.x.x.50/28 is not in the x.x.x.63/32 range.
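The rejection described above can be reproduced with a small sketch using Python's standard `ipaddress` module. It is only an illustration of the subnet arithmetic, not of how PfSense itself validates the form. The prefix `203.0.113.` (a documentation range) is an assumption standing in for the undisclosed `x.x.x.`, and the helper name `vip_in_interface_subnet` is made up for this example.

```python
import ipaddress

# Assumption: 203.0.113. (a documentation range) stands in for the
# real "x.x.x." prefix, which the thread does not disclose.
PREFIX = "203.0.113."


def vip_in_interface_subnet(vip_host: int, iface_host: int, iface_prefix_len: int) -> bool:
    """Return True if the VIP lies inside the subnet configured on the interface."""
    iface = ipaddress.ip_interface(f"{PREFIX}{iface_host}/{iface_prefix_len}")
    vip = ipaddress.ip_address(f"{PREFIX}{vip_host}")
    # iface.network is the subnet implied by the interface address and mask.
    return vip in iface.network


# A /32 on the real WAN interface describes a one-address network,
# so the .50 VIP cannot belong to it.
print(vip_in_interface_subnet(50, 62, 32))  # False

# With /28 on the interface, .50 and .62 share the .48/28 block.
print(vip_in_interface_subnet(50, 62, 28))  # True
```

A /32 mask leaves no room for any second address, which is consistent with the complaint that x.x.x.50 is "not in the range" of a /32 interface address.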

    My questions are:

    • Must I use public ip addresses as the real wan interface ip address? Can't I use a private range address like 10.2.0.1/24?

    • I guess my isp's router (whose ip address is x.x.x.49, which is my PfSense router's gateway) wants x.x.x.50/28 as a lan connection. Do you suppose this is correct?

    • How would you suggest I set both routers' wan ip address and their virtual addresses?

    I'm confused by the line "This must be the network's subnet mask. It does not specify a CIDR" in Firewall: Virtual IP Address: Edit.

    Thanks for reading and any answers and pointers are greatly appreciated :)







  • The real WAN IPs must be live IPs, for now. For CARP, you must match the CIDR of your range, so your VIPs are x.x.x.50/28 and not /32. On the WAN side, I would use x.61/28 and x.62/28 for the real WAN IPs and x.50/28 for the VIP. Then you also have to set up the same on the LAN side. If you use DHCP, it hands out the LAN VIP as the gateway.
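As a rough check of the addressing plan suggested above, the following sketch classifies each address within the /28 block. It assumes the 16-address block is the one containing the gateway x.x.x.49, that is x.x.x.48/28, and uses the documentation range `203.0.113.48/28` in place of the undisclosed real prefix. The `describe` helper is hypothetical and exists only for this example.

```python
import ipaddress

# Assumption: 203.0.113.48/28 stands in for the real x.x.x.48/28 block.
BLOCK = ipaddress.ip_network("203.0.113.48/28")


def describe(host: int) -> str:
    """Classify the address 203.0.113.<host> relative to the /28 block."""
    addr = ipaddress.ip_address(f"203.0.113.{host}")
    if addr not in BLOCK:
        return "outside block"
    if addr == BLOCK.network_address:
        return "network address"
    if addr == BLOCK.broadcast_address:
        return "broadcast address"
    return "usable host"


plan = [
    ("ISP gateway", 49),
    ("CARP VIP", 50),
    ("real WAN, router A", 61),
    ("real WAN, router B", 62),
    ("address tried earlier", 63),
]

for label, host in plan:
    print(f"{label:22} .{host}: {describe(host)}")
```

Under that assumption, .49, .50, .61 and .62 are all usable hosts in the same subnet, while .63 is the block's broadcast address. That may be one reason to prefer .61 and .62 over .62 and .63 for the real WAN addresses.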



  • Great, thanks for your reply, Podilarius! Should all virtual ips (that is, the servers behind the router) have /28 instead of /32?

    Lan is working as it should. The routers are 10.0.0.5 and 10.0.0.6 and their shared virtual lan ip is 10.0.0.4, which dhcp sends as the default gateway.

    /Edit
    Ok I tried the settings you suggested. It's not working. I set all vip subnet masks to /28. Lan is working just fine and the CARP settings are not complaining but I get no internet connection when I switch cables. Sending a ping from the routers to an external ip results in 100% packet loss. Switching the cables back: 0% loss.

    I tested my cables and tried others. Suggestions are welcome.

    /Edit 2
    Tried again and this time it does work \o/ Only it is very unstable. I rechecked all my wiring, replaced all relevant switches but no luck. I think I'll just clone my production router and forget about CARP.



  • I would check to make sure that the settings sync properly. How did you set up the pfsync interface? This is usually where most people run into configuration problems, as they use the LAN instead of a dedicated interface. If you are going to use LAN, it is better to set up a VLAN specifically for pfsync. Or at least change it from multicast by putting in an IP address.



  • I'm using a dedicated network interface for syncing and a dedicated subnet which is not the LAN subnet. Sync is working perfectly: a bunch of firewall rules entered on one machine, for example, shows up on the other machine as soon as I hit F5. The machines are connected via one cable; no switches or anything in between. The machines are identical and physically next to each other. The log shows no syncing errors.



  • If you are set up correctly, then perhaps it is on the other side. Meaning that when you switch over, the router in front doesn't "recognize" it. As a test, when you fail over, unplug the WAN cable from the backup and then plug it back in. This should reset the arp table on the upstream device. This might not be the problem.
    What version are you running? Are there any errors in the logs? How are you failing over? What is the CARP status on each machine when you do fail them over?



  • I reset the router in front, cloned the mac address of the old PfSense router and used the same ports on all switches to rule out arp table problems.

    What exactly do you mean by 'when you fail over'? I tested with just new router A, just new router B and then both routers online. No CARP errors whatsoever are logged.

    It almost seems like a mechanical or wiring problem but I used several new cables (CAT 6 SFTP) and tested everything very thoroughly using cable testers and also computer to computer 'live' testing (very rarely a cable tests ok on a cable tester but is just on the edge of failing, and you only notice it when copying files). I also used various new and tested switches.

    Anyway I've given up. I replaced the old router with one of the new routers and decided not to create a cluster. Thanks for your help and suggestions, Podilarius. I will try it again some day when I have more time.



  • What I mean is that when the master is online, CARP status will be MASTER, while on the second system it will be BACKUP. When the primary is offline, then the secondary should change to MASTER. Are you using Realtek NICs? If so, you might have a problem there.
    Time constraints are what they are; perhaps next time you will have more time. I would set up the current active one with CARP VIPs so that later on you can just roll it into a cluster.



  • Yes, MASTER/BACKUP status changes are logged (and mailed to me \o/). I am using these network cards: Intel Pro/1000 PT Quad Port LP Server Adapter. PfSense finds them as em0, em1, em2 and em3.

    Thanks again :)

