2.4.4 Gateway Issue
-
I have been using pfSense for the last couple of point releases and everything has been working well. I applied the last two point releases via the web UI. Today I updated to 2.4.4 stable and the internet connection broke, which locked me out of my home network. When I arrived home I found that the internal LANs and internal routing were OK. I reloaded a recent config backup, which forced a reboot, but nothing changed: the WAN interface was up but there was still no internet connectivity.
After some research I noticed that netstat -r was not reporting a default route to the internet, which was consistent with the "no internet connection" reports elsewhere in the UI. I added a default route with route add default <IP> and now everything is working OK. I am answering in this thread because I think there may be a problem with losing the default route after an update to the 2.4.4 stable release.
I don't have the full problem or the solution yet, and I have not been using pfSense or FreeBSD for more than a few weeks (I'm a Linux guy), but issues definitely happened with the latest update at my location. When I chose to update to 2.4.4 I was remote and the update had been out for about 5 hours. I am now home from work and able to investigate further.
My configuration is one WAN and 5 VLANs on a single link to a smart switch, plus the typical services running on pfSense. I had added Snort within the 24 hours prior to the update, but I do not think that was involved.
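For anyone who wants to script the same check, here is a minimal sketch, assuming the FreeBSD-style netstat output shown further down, where a default route appears as a line whose first column is the word default. The function name has_default_route is my own and is not part of pfSense.

```shell
#!/bin/sh
# Sketch only: reads routing-table text on stdin and succeeds (exit 0)
# when some line has "default" in the Destination column.
# has_default_route is a hypothetical helper name, not a pfSense command.
has_default_route() {
    awk '$1 == "default" { found = 1 } END { exit !found }'
}

# Possible use on the firewall itself (-n skips name lookups,
# -f inet limits the listing to IPv4):
#   netstat -rn -f inet | has_default_route || echo "no IPv4 default route"
```

Feeding it the table from my box before the manual route add would make it fail, since no line starts with default.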
The WAN interface was collecting the default route from my ISP correctly via DHCP prior to the update.
[2.4.4-RELEASE][root@pfSense.localdomain]/root: netstat -r
Routing tables

Internet:
Destination Gateway Flags Netif Expire
10.70.1.0/24 link#10 U em0.11
pfSense link#10 UHS lo0
10.70.2.0/24 link#11 U em0.12
10.70.2.254 link#11 UHS lo0
10.70.3.0/24 link#12 U em0.13
10.70.3.254 link#12 UHS lo0
10.70.4.0/24 link#13 U em0.14
10.70.4.254 link#13 UHS lo0
10.70.5.0/24 link#14 U em0.15
10.70.5.254 link#14 UHS lo0
<isp>/20 link#5 U re0
<gw ip> link#5 UHS lo0
<isp ip> <mac> UHS re0
<isp ip> <mac> UHS re0
localhost link#7 UH lo0

Internet6:
Destination Gateway Flags Netif Expire
localhost link#7 UH lo0
fe80::%em0/64 link#1 U em0
fe80::226:55ff:fed link#1 UHS lo0
fe80::%re0/64 link#5 U re0
fe80::28c:faff:fed link#5 UHS lo0
fe80::%lo0/64 link#7 U lo0
fe80::1%lo0 link#7 UHS lo0
fe80::%em0.11/64 link#10 U em0.11
fe80::226:55ff:fed link#10 UHS lo0
fe80::%em0.12/64 link#11 U em0.12
fe80::226:55ff:fed link#11 UHS lo0
fe80::%em0.13/64 link#12 U em0.13
fe80::226:55ff:fed link#12 UHS lo0
fe80::%em0.14/64 link#13 U em0.14
fe80::226:55ff:fed link#13 UHS lo0
fe80::%em0.15/64 link#14 U em0.15
fe80::226:55ff:fed link#14 UHS lo0
[2.4.4-RELEASE][root@pfSense.localdomain]/root: route add default <ip>
add net default: gateway <ip>
[2.4.4-RELEASE][root@pfSense.localdomain]/root: ping google.com
PING google.com (172.217.25.142): 56 data bytes
64 bytes from 172.217.25.142: icmp_seq=0 ttl=51 time=29.942 ms
64 bytes from 172.217.25.142: icmp_seq=1 ttl=51 time=31.108 ms
^C
--- google.com ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 29.942/30.525/31.108/0.583 ms
-
It seems my system->routing->gateways config ended up as:
This was configured as my GW originally, while I had the pfSense box on my LAN doing an initial configuration. However, I later removed the LAN interface and configured only VLAN interfaces. It seems the LAN gateway remained present but disabled, and after the update to 2.4.4 this gateway can be used ahead of the real WAN GW.
I removed this GW completely and the defaults still did not work. I then set "Default GW IPv4" to "GW_WAN" and everything works again. I understand that what I ended up with is the correct configuration, but I think a bug was introduced in 2.4.4 that allows a disabled GW to be used as the default, knocking the server off the net. That is quite bad for people doing remote upgrades without another way in to the LAN.
Probably the removal of this workaround is what causes this in 2.4.4. Some code might need to be added to check that gateways are not disabled if they are to be used as the default GW.
https://github.com/pfsense/pfsense/pull/3781/files#diff-1332c372788c9e1a8c6c9bae9ebb55a5L2006
src/etc/inc/interfaces.inc
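To illustrate the kind of check I mean, here is a rough sketch in shell. It is not pfSense's code (that lives in PHP in the file above) and it does not read config.xml. The input format of one gateway per line as name, interface and state is made up for the example, and so are the helper name pick_default_gateway and the GW_LAN entry.

```shell
#!/bin/sh
# Sketch only, under an assumed input format (NOT pfSense's config.xml):
#   <name> <interface> <enabled|disabled>
# Prints the first gateway whose state is "enabled" and skips disabled
# ones; exits non-zero when no enabled gateway is listed.
# pick_default_gateway is a hypothetical helper name.
pick_default_gateway() {
    awk '$3 == "enabled" { print $1; found = 1; exit } END { exit !found }'
}

# Example: a leftover disabled LAN gateway listed ahead of the WAN one.
printf 'GW_LAN em0 disabled\nGW_WAN re0 enabled\n' | pick_default_gateway
# prints GW_WAN
```

The point is only that the disabled entry is filtered out before anything is chosen as the default, rather than after.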
-
Yup, it's broken. I have pfSense running as a firewall, with a layer 3 switch doing inter-VLAN routing; there is a transit subnet between the two. It has worked perfectly for a year: pfSense has the switch entered as a gateway, with a static route pointed towards it for all the associated VLANs. The switch gateway has NEVER been marked as default, and is only referenced in a couple of static routes.
Worked perfectly without a blip all year until 2.4.4. Among other problems, at 7am yesterday it suddenly decided to use my L3 switch as the default route, which you can imagine is a problem given that the switch has pfSense set as its default route so the VLANs can get out to the internet. So begins a giant routing loop. Zero indication of why pfSense suddenly decided to do this. My guess is that my DOCSIS WAN default gateway went down, and the new gateway detection/switching code decided to use the next gateway as the default. Rebooted pfSense and it went back to working exactly as it should.
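A quick way to notice this kind of flip would be to watch which interface the default route leaves through. Below is a small sketch, assuming the usual FreeBSD `route -n get default` output with an "interface:" line. The helper name default_route_interface is made up, and re0 in the usage comment is just the WAN NIC from the first post, so substitute your own.

```shell
#!/bin/sh
# Sketch only: reads `route -n get default` output on stdin and prints
# the interface the default route uses (the value after "interface:").
# default_route_interface is a hypothetical helper name.
default_route_interface() {
    awk '$1 == "interface:" { print $2; exit }'
}

# Possible use on the firewall; re0 is an assumed WAN interface name:
#   [ "$(route -n get default | default_route_interface)" = "re0" ] \
#       || echo "default route is not on the WAN interface"
```

Run from cron, something like this would at least flag the moment the default route moves to the transit interface.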
2.4.4 has been nothing but problems
-
Yup, it is/was: https://redmine.pfsense.org/issues/8910
Plus there are some other tickets covering related things.
It is fixed now and will be in 2.4.4p1, or it is already in the current 2.4.5 snapshots if you're able to test.
Steve