Solved - Dual WAN failover gateway group with ipsec connection to azure
I am running pfSense 2.2.3 with two WAN interfaces and one LAN interface. I have set up a gateway group and tested failover between the two WAN connections. The IPsec VPN to Azure is set up and works when WAN 1 is selected as the interface. When I change the interface on the IPsec phase 1 to the gateway group, the VPN will not connect. I have the identifier set to my IP address. As soon as I change the interface back to the WAN interface, it connects within a few seconds. I am not clear whether the IP address will still work as the identifier when the interface is set to the gateway group. I am hoping that it will send the current WAN's IP address as the identifier when the gateway group is selected.
I am hoping someone can point me in the right direction or at least confirm that this should work and I can continue troubleshooting.
I have a PowerShell script that will run on a local server, detect when the public IP changes as a result of the failover, and update the gateway address on Azure.
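For readers who want to picture the detection step, here is a minimal sketch in Python rather than PowerShell. It is not the script mentioned above. The names `check_public_ip`, `fetch_ip`, and `on_change` are hypothetical, and the Azure update itself is left as a callback because the thread does not show how it is done.

```python
import ipaddress
from typing import Callable, Optional


def check_public_ip(
    fetch_ip: Callable[[], str],
    last_ip: Optional[str],
    on_change: Callable[[str], None],
) -> str:
    """Fetch the current public IP and report it if it differs from last_ip.

    fetch_ip  -- hypothetical callable returning the public IP as text
                 (for example, the body of an HTTP "what is my IP" reply).
    on_change -- hypothetical callable that would update the gateway
                 address on the Azure side.
    """
    current = fetch_ip().strip()
    # Raises ValueError if the reply is not a valid IP address, so a
    # garbage response is never pushed to the remote gateway.
    ipaddress.ip_address(current)
    # A first run (last_ip is None) also counts as a change.
    if current != last_ip:
        on_change(current)
    return current
```

A scheduler or loop would call this every minute or so, passing the previous return value back in as `last_ip`.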
I also thought about setting up a second vpn tunnel with wan 2 as the interface and seeing if that will connect once the ip address on the vpn gateway is updated on azure to the second WAN's address.
This is my first time setting up an IPsec VPN, so I apologize if I left out any relevant information.
It will fill in the appropriate interface IP or VIP from the gateway group for "My IP address" in that case. You can check leftid in /var/etc/ipsec/ipsec.conf if you want to confirm, but that definitely works both for just the interface address, and where you have a VIP specified.
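If you would rather compare the value programmatically than read the file by eye, the check could be sketched like this in Python. It assumes the usual ipsec.conf layout of unindented section headers followed by indented `key = value` lines. The function name `read_leftids`, the conn name `con1`, and the addresses in the sample are illustrative assumptions, not taken from a real config.

```python
import re
from typing import Dict, Optional


def read_leftids(conf_text: str) -> Dict[str, str]:
    """Map each conn section name to its leftid value."""
    leftids: Dict[str, str] = {}
    current_conn: Optional[str] = None
    for line in conf_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if not line[0].isspace():
            # Unindented line: a section header such as "conn con1".
            header = re.match(r"conn\s+(\S+)", stripped)
            current_conn = header.group(1) if header else None
            continue
        setting = re.match(r"leftid\s*=\s*(.+)", stripped)
        if setting and current_conn is not None:
            leftids[current_conn] = setting.group(1).strip()
    return leftids


# Made-up sample text in the assumed layout.
SAMPLE = """\
config setup
    uniqueids = yes

conn con1
    left = 203.0.113.10
    leftid = 203.0.113.10
    right = 198.51.100.7
"""
```

Calling `read_leftids` on the text of `/var/etc/ipsec/ipsec.conf` and comparing the result with the active WAN address shows whether the gateway group filled in the expected identifier.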
Thank you for the response cmb. I will verify leftid in that file today when I get back to the office.
I verified that the leftid is correct in /var/etc/ipsec/ipsec.conf when using the gateway group. When the phase 1 interface is set to the gateway group, the tunnel shows connected for phase 1 and a phase 2 entry is shown on the IPsec status page. However, the phase 2 entry shows 0 bytes of traffic and I cannot pass any traffic between the local and remote networks. If I change the phase 1 interface back to WAN1 and apply the changes, everything starts working within a few seconds.
I have also unplugged the WAN 2 connection so that the gateway group only uses the WAN1 connection (the one that works when selected as the interface in the phase 1 entry).
If anyone has any suggestions or can point me in the right direction I would greatly appreciate it.
![using WAN.PNG](/public/imported_attachments/1/using WAN.PNG)
![using gateway.PNG](/public/imported_attachments/1/using gateway.PNG)
Since the device is currently being staged and is not in production yet, I upgraded to the version below. I was able to reproduce the same behavior with the new version, so I am guessing it is something that I have not configured properly. As soon as the IPsec phase 1 interface is changed from the gateway group back to WAN1, the tunnel passes traffic as expected within a few seconds.
I am also attaching screenshots of the gateways and gateway group configuration.
built on Thu Jul 23 18:30:13 CDT 2015
Any help is greatly appreciated.
I enabled logging, increased the logging level on IPsec, and restarted, hoping to capture some additional information for troubleshooting. After I did that, I left it for a while, came back, and traffic was flowing properly between the local and remote networks. I started a constant ping between one of the Azure devices and an on-premise server to see if the traffic drops after a while.
Will the phase 2 disconnect if there is not any traffic between the networks? In my initial testing, I tested by accessing the web interface of pfSense from a remote server on the Azure network, and that was the only traffic between the two networks. Now that the tunnel is working, I have joined the Azure machine to the on-premise domain, so there should be a constant flow of traffic over the tunnel.
I do not want to mark the thread resolved quite yet, because I am still not sure exactly why the traffic did not work initially. I will continue to monitor the traffic today, restart the device a few times, and make sure the tunnel comes back up.
If there isn't any traffic attempting to traverse the VPN, it won't come up. With something joined to the domain, there will be plenty of chatty stuff going on there for AD to keep it live all the time.
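For a test setup without a domain member to generate that traffic, something as small as the following Python sketch could be run from a machine on the local network to send periodic traffic toward the remote side. The function name `poke` is hypothetical, and the host and port you pass are whatever is reachable on the remote network. The idea is that even a connection attempt that fails still sends packets toward the remote subnet, which should count as traffic trying to traverse the VPN.

```python
import socket


def poke(host: str, port: int, timeout: float = 3.0) -> bool:
    """Attempt one TCP connection; return True if it was accepted."""
    try:
        # create_connection opens the socket and connects in one step;
        # the with-block closes it again right away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `poke` in a loop with a `time.sleep` of a minute or so between attempts is one way to use it while testing.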
There should be no functional difference between the group and the interface. The only thing that is set differently in the IPsec config in that regard is the ID, which you confirmed worked correctly. There could be some other difference there, but I can't think of any that could cause the symptoms you describe. If it's still a problem, I'd be willing to take a look at it. I've set up a number of Azure VPNs for customers and never ran into any issues with them.
Thank you for the reply and the offer to look at the configuration. Everything appears to be working properly now. It seems like I was not giving the tunnel enough time to come up or I was not passing enough traffic across to bring up the tunnel. I just got back from the office where I was able to recreate the configuration on the production equipment and everything works as expected. If I unplug one of the wan interfaces, the gateway group fails over, the vpn ip on the azure gateway gets updated, and the tunnel comes back up on the other wan after manually disconnecting the ipsec connection.
I am very excited that everything is working and I learned quite a bit from the experience!