SIP trunk failover/failback issues on multi-WAN
I currently have multi-WAN configured and it's working well. I've configured three gateway groups: one balances between the links using weighting, another uses link1 unless it's down and fails over to link2, and the last uses link2 unless it's down and pushes traffic to link1.
I've tested failover with an OpenVPN tunnel on UDP and it's flawless. When one of the links drops, OpenVPN moves to the slower link. When the connection comes back up, I have to manually kill the state table entry for the OpenVPN client router (currently behind pfSense), and the UDP traffic then seamlessly starts going over the faster connection without any disruption to the tunnel.
SIP also causes a problem in this configuration. We've set our PBX up with a similar gateway group. When the preferred gateway goes down, traffic transitions to the other gateway while it's down, and things work as expected with our provider. However, when the connection comes back up, things break. The registration channel to the provider stays active in the state table on the secondary connection, but new RTP streams leave using the DSL connection (which is exactly what the firewall has been told to do). This causes SIP to break, and incoming calls no longer work (outgoing calls are fine) until the states relating to the PBX are manually killed.
Is there any way I can get pfSense to kill the states relating to these 2 hosts when it sees the preferred gateway change to "UP"? I have also noticed that the email notification system only sends a notification when the system removes a link from a routing group, but doesn't send another when the link comes back online. Is there a way to enable that as well?
Fixed this by using some hackery, but it gets the job done. I had to bring the interface down manually, so I couldn't test whether a gateway change triggered by an apinger failure also causes the shellcmd to run. I'll post back with an update when the connection fails organically again and share the results.
Used this article as a reference http://forum.pfsense.org/index.php/topic,7808.msg46725.html#msg46725
Made the following changes on the firewall and rebooted it (not sure if the reboot is necessary).
Add to /conf/config.xml under <system>:
Contents of /usr/local/bin/kill_pbx_states.sh (chmod 755):
#!/bin/sh
sleep 5
/sbin/pfctl -k 192.168.0.220
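The original question mentioned two hosts, while the script above only covers one. A minimal sketch of a variant that loops over a list of hosts and clears states in both directions might look like the following. The names PBX_HOSTS and DRY_RUN are made up for this example and are not pfSense settings. By default it only prints the commands it would run, so it can be checked safely before being pointed at a live firewall.

```shell
#!/bin/sh
# Sketch only: PBX_HOSTS and DRY_RUN are hypothetical names for this example.
# Space-separated list of internal hosts whose states should be cleared.
PBX_HOSTS="${PBX_HOSTS:-192.168.0.220}"
# 1 = print the pfctl commands without running them; 0 = actually run them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

kill_host_states() {
    # States originating from the host.
    run /sbin/pfctl -k "$1"
    # States from any source address to the host.
    run /sbin/pfctl -k 0.0.0.0/0 -k "$1"
}

# Give the gateway a moment to settle, as in the script above (skipped in dry-run).
[ "$DRY_RUN" = "1" ] || sleep 5

for host in $PBX_HOSTS; do
    kill_host_states "$host"
done
```

Running it as `PBX_HOSTS="192.168.0.220 192.168.0.221" DRY_RUN=0 /usr/local/bin/kill_pbx_states.sh` would clear states for both hosts; the second address is only an illustration.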
I have tested this script and wanted to post my results.
In short, it works and seems to work well for failing back to a primary connection.
For me, I have sites that use a broadband circuit as primary internet and a T1 as secondary. If the broadband goes down, the traffic is shifted to the T1, but when it comes back online, traffic is required to go back over the broadband.
This creates an issue with any connection established over the WAN2/T1 interface on pfSense, such as SIP, as has already been pointed out.
So in my test build, I connected my firewall to a switch port and set up IPs on a VLAN interface. I used a monitor IP other than my gateway and shut down routing on the switch to simulate a network failure that didn't include the gateway going down. It cleared states from WAN2 every time I brought the primary WAN1 interface back up.
The ability to clear states when a gateway returns to up status is important for those who need to keep traffic on a primary circuit as long as it's up, so hopefully this is something that can get built into future updates. Right now I think states are only cleared when the gateway goes down. In the meantime, this script is important for anyone who requires that functionality.
I am going to test this script in combination with OpenVPN and tunnel failover. I've seen multiple posts on OpenVPN tunnel failover, but they had the same issue of traffic not returning to the primary link. One workaround is to create a different tunnel connection for each interface and connected site, and use OSPF to select the best route. With this script clearing states when the gateways come back up, I think basic tunnel failover should be able to return to the primary interface, which would ultimately let me reduce the number of tunnels I need to build to get a full mesh VPN network with failover.
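Until something like that is built in, the "only act when the gateway comes back up" part can be sketched as a small edge-triggered check run periodically, for example from cron. This is an illustration rather than how pfSense itself does it. Every name here (MONITOR_IP, WAN1_ADDR, STATE_FILE, KILL_SCRIPT) is hypothetical, the addresses are documentation placeholders, and the probe assumes FreeBSD's ping and that the monitor IP is only reachable through WAN1.

```shell
#!/bin/sh
# Sketch only: all variable names and addresses below are assumptions.
MONITOR_IP="${MONITOR_IP:-192.0.2.1}"        # IP used to judge WAN1 health
WAN1_ADDR="${WAN1_ADDR:-198.51.100.2}"       # firewall's own address on WAN1
STATE_FILE="${STATE_FILE:-/tmp/wan1_last_status}"
KILL_SCRIPT="${KILL_SCRIPT:-/usr/local/bin/kill_pbx_states.sh}"

probe_gateway() {
    # FreeBSD ping: -S sets the source address, -t is an overall timeout in seconds.
    if ping -c 1 -t 2 -S "$WAN1_ADDR" "$MONITOR_IP" >/dev/null 2>&1; then
        echo up
    else
        echo down
    fi
}

came_back_up() {
    # True only for a down -> up transition.
    [ "$1" = "down" ] && [ "$2" = "up" ]
}

check_once() {
    current="$1"
    # Status recorded by the previous run; "unknown" on the very first run.
    previous="$(cat "$STATE_FILE" 2>/dev/null || echo unknown)"
    echo "$current" > "$STATE_FILE"
    if came_back_up "$previous" "$current"; then
        /bin/sh "$KILL_SCRIPT"
    fi
}

# Intended use from a periodic job:
#   check_once "$(probe_gateway)"
```

Because the previous status is kept in a file, the kill script runs once per recovery and not on every check while the link stays up.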
This idea is great. I have the same problem here with a PBX server and a connection between two PBXs. I had to reset the states manually, and that was trash. I did what you described and will see if it works. Thanks!
Hi, I solved it like this:
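#!/bin/sh
# Kill UDP SIP states after a new WAN IP
# ASTERISKIP, SIPPROVIDER, WAN1IP and WAN2IP are placeholders for the real addresses
echo "Killing States from ASTERISK pbx to SIPPROVIDER" | logger
# Kill all states originating from the FreePBX host
/sbin/pfctl -k ASTERISKIP
# Kill states from the first address to the second
/sbin/pfctl -k ASTERISKIP -k SIPPROVIDER
/sbin/pfctl -k WAN1IP -k SIPPROVIDER
/sbin/pfctl -k WAN2IP -k SIPPROVIDER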
chmod 755 /usr/local/bin/reset_voip_states.sh
Edit config file /conf/config.xml
Works like a charm.