Fixed: Failover lbpool which failed after hardware upgrade
yoberi last edited by
I'm running pfSense 1.2.3-RELEASE with a dual-WAN setup that uses a failover gateway load balancer pool. Prior to a hardware upgrade on the server (I added a dual Ethernet NIC), my load balancing configuration worked fine: it would favor ISP1 on WAN, and only when the monitored IP address (a DNS server on ISP1, with a static route forcing it over WAN) was unreachable would the system use ISP2 on WAN2 (aka opt-wan, which is opt2 on my system).
After adding the dual Ethernet NIC, I was able to remap the NIC devices (a mixture of bge and em devices) to the standard labels used by pfSense (lan, wan, opt1 through opt4) and re-import the XML configuration.
All services worked well except the failover to ISP2 on opt-wan when ISP1 on WAN was down. After much after-hours testing and playing with settings, I got the failover gateway to fail over properly. It required backing up the pfSense XML configuration, hacking it slightly, and reloading the modified version. This configuration fails:
<load_balancer><lbpool><type>gateway</type> <behaviour>failover</behaviour> <monitorip>220.127.116.11</monitorip> <name>WanFailToWan2</name> <desc>Prioritize WAN but failover to WAN2</desc> <port><servers>wan|18.104.22.168</servers> <servers>opt2|22.214.171.124</servers></port></lbpool> …</load_balancer>
and this works:
<load_balancer><lbpool><type>gateway</type> <behaviour>failover</behaviour> <monitorip/> <name>WanFailToWan2</name> <desc>Prioritize WAN but failover to WAN2</desc> <port><servers>wan|126.96.36.199</servers> <servers>opt2|188.8.131.52</servers></port></lbpool> ...</load_balancer>
Notice the difference in <monitorip>. I discovered that the earlier XML files from before the hardware upgrade contained an empty "<monitorip/>", whereas the later XML versions with the failing configuration contained "<monitorip>184.108.40.206</monitorip>". I could not find a setting in the GUI that allows clearing or setting "<monitorip>", and there was no indication that it was set in the XML. I'm not sure exactly what sequence of changes (hardware or software) led to this apparently invalid state, beyond what I've loosely described as my hardware upgrade.
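For anyone who would rather script the edit than do it by hand, here is a minimal sketch in Python of the same change. It is not a pfSense tool; the function name clear_failover_monitorip is made up for illustration, and it assumes the tag names shown in the snippets above (lbpool, type, behaviour, monitorip). It only touches gateway pools whose behaviour is failover and leaves every other pool alone.

```python
import xml.etree.ElementTree as ET


def clear_failover_monitorip(xml_text: str) -> str:
    """Return the config XML with <monitorip> emptied in gateway failover pools.

    Hypothetical helper: tag names are taken from the config snippets in this post.
    """
    root = ET.fromstring(xml_text)
    for pool in root.iter("lbpool"):
        is_gateway = pool.findtext("type") == "gateway"
        is_failover = pool.findtext("behaviour") == "failover"
        monitor = pool.find("monitorip")
        if is_gateway and is_failover and monitor is not None:
            # An element with no text is written out as an empty <monitorip />
            monitor.text = None
    return ET.tostring(root, encoding="unicode")
```

Run it on a copy of the backup and diff the result against the original before restoring it. ElementTree drops the XML declaration when serializing to a string and does not preserve comments or CDATA sections, so check that nothing else changed.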
I hope this is useful to others who find themselves in the same situation.