Issues after config restore
-
Hi, my apologies if this isn't the correct location for this one. We had the PSU fail in our pfSense box (Netgate 7100). We have installed a Netgate 8200 MAX, and I've restored the config from the 7100 onto it.
We are getting some pretty weird results, and I was wondering if anyone can give me an idea of how to debug them.
We reassigned the interfaces, as the new hardware has different ports, and this went smoothly. We brought up the WAN and basic operation was restored really quickly. Incoming traffic, however, wasn't hitting the NAT rules. We found that recreating a NAT rule on its own didn't help, but recreating the NAT rule AND the IP Alias did work, which was our first clue that perhaps the config restore hadn't quite hit the mark. We deleted an alias and re-added it exactly as it was, and that sparked the NAT rule back to life. All good after that, though.
The issue that still remains is that we had two BGP-routed VPNs on the device. Both P1 and P2 of these came back up no problem, but traffic wasn't flowing. I've spent today debugging this, and the symptoms of the two are quite different.
One is to AWS, and in this case, it worked intermittently - would function for a short period (less than 30 seconds) and then drop for a period (sometimes a second, sometimes a minute or so). While it was up, I was able to ping normally. Throughout, I could ping the BGP servers in AWS (even when other routing through the tunnel was down).
The other tunnel is to Azure, and in this case, packet capturing on the IPsec interface shows traffic hitting pfSense (e.g. the BGP traffic from Azure), but it's getting closed with CLOSED:SYN_SENT in the states list.
I am very confused, and am a little out of ideas about where to debug further. My intention is to try recreating the IPsec config, but I'm a little unsure, as the IPsec config seems to be fine; it seems to be something more fundamental that is not working.
Has anyone experienced anything like this before, or could perhaps suggest some diagnostic paths? I am more interested in resolving the Azure CLOSED:SYN_SENT issue. I mention both really to highlight that there appears to be something fundamental not quite right. I can't explain why two very similar setups behave so differently, except perhaps that under the surface there is some interface confusion from the restore?
Anyway, thank you for reading my wall of pain text, if you have any ideas I would be very grateful.
-
I assume you rebooted after restoring the config? Or since?
What pfSense version was the 7100 running? What is the 8200 running?
Steve
-
Thanks, yes I have - config was restored then the device was cold booted in the rack.
The config was backed up from version 22.9 on the 7100. It was restored onto the 8200 before any updates were installed, and the 8200 is now at 24.03.
I'm not sure exactly what version the 8200 shipped with, my apologies.
-
Ah, OK. So you are probably running into an issue with the changed default state policy in 24.03 (it is now interface-bound rather than floating) affecting VTI tunnels. See:
https://docs.netgate.com/pfsense/en/latest/releases/24-03.html#general
You can set the policy back to floating globally or (preferably) set it per rule on the VTI pass rules allowing the BGP traffic.
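For illustration only, here is roughly what those two options look like in raw pf syntax. This is a sketch, not something to paste in: pfSense generates its ruleset from the GUI settings, and the interface name `ipsec1` and the match on TCP port 179 (BGP) are assumptions standing in for your actual VTI interface and pass rule. The two options are alternatives; you would use one or the other.

```
# Option 1 (global): make every state floating again, as before 24.03.
set state-policy floating

# Option 2 (per rule, preferred): leave the global default alone and
# float only the states created by the rule passing BGP on the VTI.
# "ipsec1" is a placeholder for the assigned VTI interface.
pass in quick on ipsec1 inet proto tcp from any to any port 179 keep state (floating)
```

In the GUI, as far as I know, the global setting is the Firewall State Policy option under System > Advanced > Firewall & NAT, and the per-rule equivalent is the State Policy field in the rule's advanced options.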
Steve
-
Steve, I owe you a beer.
Thank you very much, that has made my afternoon.
-
No worries, glad it helped.