Firewall - GRE/IPSEC
I am running two copies of pfSense 2.2.4. Each instance has 9 IPSEC tunnels set up to various servers. The tunnels are in transport mode, with only GRE being encrypted between the end points (the remote end points are all VyOS, with pfSense set to not initiate the connections). IPSEC seems to be working fine; my tunnels are all established.
My issue is that I cannot successfully pass traffic over the tunnels. My two pfSense servers are called gateway1 and gateway2, and they are set up to sync everything with each other. I set up everything on the two servers and tested: all the tunnels were working fine, I could ping the remote end point, and BGP was connecting. After confirming it was all working, I restarted gateway2 to make sure everything works after a reboot. I now cannot pass any traffic over the tunnels on gateway2. (I am aware of issues with GRE tunnels on reboot; I have added shell commands to set the GRE tunnels to up, which works.)
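The exact boot-time commands are not shown in this post. As a rough sketch of what such a command could look like, assuming hypothetical interface names gre0 and gre1 and a made-up helper called bring_gre_up (DRY_RUN defaults to 1 so it only prints what it would run):

```shell
#!/bin/sh
# Sketch only: mark the given GRE interfaces as up after boot.
# Interface names are passed as arguments; gre0/gre1 below are placeholders.
# With DRY_RUN=1 (the default here) the commands are printed, not executed.

bring_gre_up() {
    for ifname in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ifconfig $ifname up"
        else
            ifconfig "$ifname" up
        fi
    done
}

# Example invocation with placeholder names.
bring_gre_up gre0 gre1
```

Setting DRY_RUN=0 makes the helper call ifconfig for real. This only sets the interface flag; it does not touch the firewall or the state table.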
Now for my firewall configuration, I have done the following:
IPSEC tab - Allowed all IPv4 (all protocols, sloppy state)
Created interface group called GRE, added all GRE interfaces to this group (on both servers). GRE tab - Allowed all IPv4 (all protocols, sloppy state)
When I ping the remote end of the GRE tunnel I get no response, and a packet capture on the remote end shows nothing being received. When I try to telnet to port 179 (BGP) on the remote end of the tunnel I get this:
[2.2.4-RELEASE][root@gateway2]/root: telnet 10.63.56.41 179
Trying 10.63.56.41...
telnet: connect to address 10.63.56.41: Interrupted system call
telnet: Unable to connect to remote host
That indicates it is most likely a firewall issue on the pfSense server. So as a test I disabled the packet filter entirely (pfctl -d), and it started working:
[2.2.4-RELEASE][root@gateway2]/root: ping 10.63.56.41
PING 10.63.56.41 (10.63.56.41): 56 data bytes
64 bytes from 10.63.56.41: icmp_seq=0 ttl=64 time=186.846 ms
64 bytes from 10.63.56.41: icmp_seq=1 ttl=64 time=178.177 ms
^C
--- 10.63.56.41 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 178.177/182.512/186.846/4.334 ms
[2.2.4-RELEASE][root@gateway2]/root: telnet 10.63.56.41 179
Trying 10.63.56.41...
Connected to 10.63.56.41.
Escape character is '^]'.
Connection closed by foreign host.
When I check the firewall logs, I can see my TCP connection getting blocked. The log says the outbound interface is the GRE tunnel and the rule matched is "@6(1000000104) block drop out log inet all label "Default deny rule IPv4"".
I then added a firewall rule on the specific GRE interface to allow all IPv4, all protocols, sloppy state, but got the same result. I then added a floating rule assigned to that interface with sloppy state, but still got the same result.
As far as I can see it shouldn't be blocked by the firewall, but I just cannot get this to work. I even tried adding a floating rule to allow all traffic on all interfaces with sloppy state, but it still gets blocked. I have also tried enabling first match, with no change.
Does anyone have any other ideas for what else I can try? It seems like it always hits the default deny rule even though the traffic has already been allowed.
Also a bit more detail, this is what it looks like in the firewall log:
10.63.56.102 = pfSense server
10.63.56.101 = remote end of the tunnel
This is happening with the allow all firewall rules on multiple interfaces (IPSEC/GRE Tunnel/Floating).
I have just done some more testing and I think I have found the answer.
These servers used to have old GRE tunnels that have since been removed. The new GRE tunnels use the same interface names as before (e.g. gre1). When I create some dummy GRE interfaces so that the new tunnels get assigned new interface names, they work fine. I will reinstall pfSense as I think that will fix the issue, but I cannot explain why doing it this way works.
Scrap what I said above, I have now found the actual reason but I am not sure how to resolve this properly. It explains why this happens after reboot too.
When the server boots, the GRE tunnels are brought up. A state entry is added for the GRE tunnel going out the WAN interface. The IPSEC tunnel is established afterwards, but because of that existing state the GRE traffic still goes out over the WAN interface normally, not through the IPSEC tunnel. If I kill the state entry for the GRE tunnel, it fixes it.
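For anyone wanting to reproduce the check, here is a minimal sketch of finding and killing that state from the shell. It assumes the protocol is the second column of `pfctl -ss` output, and it uses placeholder documentation addresses (192.0.2.10 as the local WAN address, 198.51.100.20 as the remote end); gre_states and kill_states_between are made-up helper names. DRY_RUN defaults to 1 so the kill command is only printed.

```shell
#!/bin/sh
# Sketch only: locate GRE states and kill the states between two hosts.

# Filter state-table lines on stdin, keeping those whose protocol
# column (assumed to be field 2 of `pfctl -ss` output) is "gre".
gre_states() {
    awk '$2 == "gre"'
}

# Kill all states from SRC to DST using two -k options.
# Note: this removes every state between the two hosts, not only GRE.
kill_states_between() {
    src="$1"
    dst="$2"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "pfctl -k $src -k $dst"
    else
        pfctl -k "$src" -k "$dst"
    fi
}

# On the firewall itself, the stale entry could be listed with:
#   pfctl -ss | gre_states
# Placeholder addresses; substitute the real WAN and remote IPs.
kill_states_between 192.0.2.10 198.51.100.20
```

Because `pfctl -k` with two hosts is not protocol-specific, any ESP or IKE states between the same pair would be dropped as well and would need to be re-created by new traffic.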
Does anyone know how I can work around this issue?
So I have solved half the issue.
8 of the IPSEC tunnels are to a VyOS server. I have set VyOS to respond only, so pfSense initiates the connection. This is working well: the GRE tunnels come up and pass traffic fine after rebooting and in other failure cases.
The last remaining issue is that one of the tunnels is pfSense <-> pfSense. Both ends are set to respond/initiate. After a reboot the GRE tunnel works about half the time (depending on whether IPSEC responded or initiated); when it doesn't work, the state entry has to be killed.
If I set one end to respond only, then when that end reboots the GRE tunnel doesn't work because of the state entry. Any workaround for this would be much appreciated.