SG-3100 no routing/NAT after reboot
-
I had cause to reboot our SG-3100 this evening and now there’s no routing/NAT. I can’t be sure exactly what the problem is, but traffic no longer passes from the LAN to either the WAN or the secondary LAN. No config was changed. Everything looks fine.
From the LAN I can ping the WAN IP on pfSense but nothing else. To me that suggests a NAT failure? I can see nothing in the logs to suggest a problem. Outbound NAT is automatic, no manual rules.
LAN firewall rule is allow any any.
Packet capture shows a ping arrive on the LAN interface, nothing leaves on the WAN.
At this stage I’d just factory reset the thing and reconfigure, it isn’t a complex config. However, I’m 280 miles away and rebuilding this remotely simply isn’t an option.
The other funky thing is that the web configurator doesn’t come up after a reboot; I have to SSH in and restart it.
Running 21.05-release
HELP….
-
Having done a bit more digging... Here's where I've got to.
I can SSH into a switch on the LAN and I'm attempting to ping a WISP CPE gateway on the main WAN subnet. I can ping the WAN address of pfSense but I cannot ping the CPE gateway from the switch (I can ping this from pfSense itself).
I have disabled NAT entirely - no difference.
I have disabled the firewall and NAT in advanced - this causes me to lose connectivity to pfSense and I'm not sure why, as it has a public IP address on the secondary WAN link.... However, it also still isn't possible to ping the CPE gateway from the LAN. It's looking to me like pfSense simply isn't routing. Any suggestions as to further diagnostics I can try.... or indeed anything?
-
Spotted an error in my testing.... the CPE gateway doesn't have a route to the LAN subnet so disabling the firewall and attempting to ping that wouldn't have worked.
I've done more packet capture with a less tired brain. HTTPS request from a client on the LAN. I can see that arrive at the LAN interface. I see it leave the WAN and a response return. The response never makes it back out on the LAN.
State table shows Established:Syn_Sent
Packet capture shows endless SYNs outbound and a SYN-ACK returning that pfSense appears to just block.
Why is it doing this?
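To keep the capture results straight, here is a minimal sketch of the elimination I'm doing by hand. The hop names and the idea of recording sightings as a set are purely my own illustration, not anything pfSense produces.

```python
# Hypothetical helper: hop names are made up for illustration only.
# They follow the round trip of one flow through the firewall.
PATH = ["lan_in", "wan_out", "wan_in", "lan_out"]

def first_missing_hop(seen):
    """Return the first hop on the round trip where no packet was captured."""
    for hop in PATH:
        if hop not in seen:
            return hop
    return None  # packet was seen at every capture point

# First test: ping seen arriving on the LAN, nothing leaving the WAN.
print(first_missing_hop({"lan_in"}))
# Second test: SYN leaves the WAN, SYN-ACK comes back, nothing returns on the LAN.
print(first_missing_hop({"lan_in", "wan_out", "wan_in"}))
```

The two tests point at different hops, which is why the second capture changed my view of where the traffic is being lost.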
-
Well... after a very late night and many hours of bafflement I came across this: https://forum.netgate.com/topic/160969/upgrade-to-21-02-release-borked-on-sg-3100
It suggests re-running the initial setup wizard if nothing works for no adequately explained reason, and that has fixed it.
How have we reached the stage where an appliance can get into this state after a reboot on 21.05? That's exceptionally poor.
So thanks @rsherwood_va I never would have thought of that.
I should stress that I updated some time ago. What took a working system to a broken system was a restart, nothing else.
-
That sounds like it may have lost its default route somehow or ended up with the wrong default route. If you have multiple gateways defined and the default is still set as auto, it can choose another one in the event the WAN gateway goes down.
Is that possible? In that situation I would still expect to be able to ping the WAN side gateway from something in the LAN subnet, as long as the gateway is in the WAN subnet.
Also a pcap would not show traffic leaving the WAN for some external destination.
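As a rough illustration of that reasoning, here is a simplified routing lookup in Python. It is only a sketch: the addresses are made-up documentation ranges (RFC 5737), the function name is hypothetical, and a real routing table does longest-prefix matching rather than this two-step check.

```python
import ipaddress

def next_hop(dest, connected, default_gw=None):
    """Simplified lookup: directly connected subnets first, then the default route.

    connected: mapping of interface name -> directly attached subnet (CIDR).
    default_gw: default gateway IP, or None if the default route is missing.
    """
    addr = ipaddress.ip_address(dest)
    for ifname, cidr in connected.items():
        if addr in ipaddress.ip_network(cidr):
            return f"on-link via {ifname}"
    if default_gw is not None:
        return f"via default gateway {default_gw}"
    return "no route"

# Assumed example addressing, not taken from the thread.
connected = {"wan": "192.0.2.0/24", "lan": "198.51.100.0/24"}

print(next_hop("192.0.2.1", connected))     # WAN-side gateway, no default route
print(next_hop("203.0.113.10", connected))  # external host, no default route
print(next_hop("203.0.113.10", connected, default_gw="192.0.2.1"))
```

With the default route missing, the WAN-side gateway is still reachable because it sits in a connected subnet, while an external destination has nowhere to go and so would never appear in a capture on the WAN.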
Steve
-
@stephenw10 Thanks Steve. The default route for the system is set as the WISP, not automatic. The LAN firewall rules specify gateway groups too.
I did try changing LAN rules to specify different gateways but I didn't try changing the system default gateway.
Nothing I did worked until re-running the initial setup wizard.
-
Hmm. Re-running the Setup Wizard would re-apply the interface settings on WAN and LAN. Something there must have been lost somehow. Losing the default route when the gateway is set as auto is probably most common but I have sometimes seen other things remove the default route. Hard to say without data from the time.
Steve