TCP Issues on Remote End of HA pfSense IPsec Tunnel
-
@stephenw10 @Derelict I hope you are doing well! I'm having the following issue with my HA firewall setup:
When I fail over the CARP VIPs to the secondary firewall, I can still ping across data centers (all internal networks behind NAT, over the IPsec VPN tunnel) because I have static routes in place, and ssh works fine. However, all the hosts behind my Zabbix proxy, which talks to the main Zabbix server in the other data center (across the IPsec VPN tunnel), are reported by the main Zabbix server as unreachable. I tested Zabbix ports 10050 and 10051 and they are open in both directions. The main issue is that the Zabbix proxy in DC A cannot properly communicate with the Zabbix server in DC B. I tested, and DC B (10.2.x.x) can also ping the firewall CARP VIP (10.8.0.1) and both firewalls. All traffic is allowed on my LAN interface and IPsec interface. The Zabbix proxy in DC A has its default gateway pointing at the 10.8.0.1 VIP, and a traceroute to the Zabbix server in DC B (10.2.x.x) also looks fine (and vice versa). It seems that only some TCP packets are not getting through, since ssh and the Zabbix ports work just fine between the Zabbix proxy and the Zabbix server.
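For anyone repeating the port test described above, here is a minimal Python sketch of one way to do it. It is not the method the poster used (the post does not say which tool was run), and the function names `tcp_port_open` and `check_ports` are made up for illustration. Only the port numbers 10050 and 10051 come from the post.

```python
import socket

# Zabbix ports named in the post above.
ZABBIX_PORTS = (10050, 10051)


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port is established within timeout."""
    try:
        # create_connection performs the full SYN / SYN-ACK / ACK handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def check_ports(host: str, ports=ZABBIX_PORTS, timeout: float = 3.0) -> dict:
    """Map each port to True/False depending on whether the connection succeeded."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}
```

Calling `check_ports` with the address of the Zabbix server from the proxy, and with the address of the proxy from the server, covers both directions. Note that a successful connect only shows that the handshake completed at that moment; it says nothing about whether later packets on a longer-lived connection get through.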
I've dug into tcpdumps between 10.2.x.x and 10.8.x.x (both Zabbix servers). Packets with the TCP FIN and SYN flags seem to be dropped in both directions while failed over to the secondary pfSense. I just see ACK flags going back and forth while on the secondary pfSense firewall (CARP in maintenance mode on the primary). I tried enabling a few settings ("Bypass firewall rules for traffic on the same interface", Firewall Optimization Options set to "conservative", etc.), but still no luck. As soon as the primary takes back the CARP VIPs, all traffic goes back to normal. Any idea why only the secondary node is not passing all TCP traffic across the IPsec tunnel (ssh and other ports work just fine)? I have checked, and sync is working with all boxes checked. I also have outbound NAT configured so that I can ping the secondary firewall over the IPsec tunnel, so that part is fine.
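To make the "only ACKs, no SYN or FIN" observation easier to compare between the primary and the secondary, the flags in a capture can be counted per direction. The following is a rough Python sketch, assuming the capture was saved as text from `tcpdump -n` for IPv4 TCP traffic, so that each line looks like `... IP a.b.c.d.port > e.f.g.h.port: Flags [S], ...`. The names `tally_flags` and `strip_port` are illustrative only.

```python
import re
from collections import Counter

# Matches the address pair and the flag field of a tcpdump text line.
LINE_RE = re.compile(r"IP6? (\S+) > (\S+): Flags \[([^\]]+)\]")

# tcpdump's one-character flag abbreviations ('.' means ACK).
FLAG_NAMES = {"S": "SYN", "F": "FIN", "R": "RST", "P": "PSH", ".": "ACK"}


def strip_port(endpoint: str) -> str:
    """Drop the trailing '.port' from an IPv4 'addr.port' endpoint."""
    return endpoint.rsplit(".", 1)[0]


def tally_flags(lines) -> Counter:
    """Count TCP flags per direction, keyed by ('src > dst', flag name)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # not a TCP line in the expected format
        src, dst, flags = match.groups()
        direction = f"{strip_port(src)} > {strip_port(dst)}"
        for ch in flags:
            name = FLAG_NAMES.get(ch)
            if name:
                counts[(direction, name)] += 1
    return counts
```

Running `tally_flags(open("capture.txt"))` on one capture taken while the primary is MASTER and one taken while the secondary is MASTER (the file name is a placeholder) gives two tables that can be compared side by side.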
Attached is my network design to help with what I'm trying to explain.
Thanks!
-Rich
-
@rivest1000 Failing over IPsec is going to break all states and they will need to be reestablished.
-
@derelict Right, I do notice a short break once the secondary takes over the IPsec tunnel. After that I see the states re-establish, and ping and ssh start to work, but then packets with the TCP FIN and SYN flags start to fail in both directions.
-
@rivest1000 Everything that needs to have the CARP VIP as the default gateway has the CARP VIP as the default gateway?
-
@rivest1000 How are you failing over?
-
@derelict Yes, everything in DC A has a default gateway of the CARP VIP 10.8.0.1. I'm failing over manually to test, using "Enter Persistent CARP Maintenance Mode" on the primary.
-
@rivest1000 One thing to note is that the secondary is using the exact same IPs and configs as the primary for the IPsec tunnels, including the Phase 2 Local and Remote subnet IPs for the routed IPsec between DCs. Is it OK to use these same IPs to establish IPsec Phase 2 on the secondary (it's synced over by the HA setup)?
-
@rivest1000 That should be fine. Sounds like you need to simultaneously capture an interesting connection on all three inside interfaces and see what there is to see. Sorry but it's something unique to your environment based on what I have so far. Are the missing FIN/SYN packets being sent to the primary while the secondary is MASTER?
You're POSITIVE the zabbix hosts have the correct default gateways for the necessary traffic?
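One way to double-check the default gateway on each Zabbix host is sketched below in Python. It assumes the hosts run Linux (the thread does not say), where the kernel IPv4 routing table is exposed in `/proc/net/route` with addresses as little-endian hex. The function names are illustrative; only the expected gateway 10.8.0.1 comes from the thread.

```python
import socket
import struct

RTF_GATEWAY = 0x2  # Linux route flag: the route uses a gateway


def hex_to_ipv4(hex_le: str) -> str:
    """Convert a little-endian hex address from /proc/net/route to dotted quad."""
    return socket.inet_ntoa(struct.pack("<L", int(hex_le, 16)))


def default_gateways(route_text: str) -> list:
    """Return (interface, gateway) pairs for every IPv4 default route in the text."""
    gateways = []
    for line in route_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 8:
            continue
        iface, dest, gateway, flags, mask = (
            fields[0], fields[1], fields[2], fields[3], fields[7]
        )
        is_default = dest == "00000000" and mask == "00000000"
        if is_default and int(flags, 16) & RTF_GATEWAY:
            gateways.append((iface, hex_to_ipv4(gateway)))
    return gateways
```

On a host, `default_gateways(open("/proc/net/route").read())` should list 10.8.0.1 if the CARP VIP is the default gateway. This only inspects default routes; a more specific static route toward 10.2.x.x pointing at a firewall's own address rather than the VIP would not show up here and would need to be checked separately.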