2.4.4 Gateway Issue



  • I have been using pfSense for the last couple of point releases, and everything has been working well. I updated through the last two point releases via the web UI. Today I updated to 2.4.4 stable and the internet connection broke, which locked me out of my home network. When I arrived home I found that the internal LANs and internal routing were OK. I reloaded a recent config backup, which forced a reboot, but nothing changed: the WAN interface was still up but there was no internet connectivity.

    After some research I noticed that netstat -r was not reporting a default route to the internet, which was consistent with the "no internet connection" reports elsewhere in the UI. I added a default route with route add default <IP> and now everything is working OK. I am posting in this thread because I think there may be a problem with losing the default route after an update to the 2.4.4 stable release.

    I don't have the full problem or the solution yet, and I have not been using pfSense or FreeBSD for more than a few weeks (I'm a Linux guy), but issues definitely happened with the latest update at my location. When I chose to update to 2.4.4 I was remote, and the update had been out for about 5 hours. I am now home from work and able to investigate further.

    My configuration is one WAN and 5 VLANs on a single link to a smart switch, plus the typical services running on pfSense. I had added Snort within the 24 hours prior to the update, but I do not think that was involved.

    The WAN interface was collecting the default route from my ISP correctly via DHCP prior to the update.

    [2.4.4-RELEASE][root@pfSense.localdomain]/root: netstat -r
    Routing tables

    Internet:
    Destination       Gateway     Flags  Netif   Expire
    10.70.1.0/24      link#10     U      em0.11
    pfSense           link#10     UHS    lo0
    10.70.2.0/24      link#11     U      em0.12
    10.70.2.254       link#11     UHS    lo0
    10.70.3.0/24      link#12     U      em0.13
    10.70.3.254       link#12     UHS    lo0
    10.70.4.0/24      link#13     U      em0.14
    10.70.4.254       link#13     UHS    lo0
    10.70.5.0/24      link#14     U      em0.15
    10.70.5.254       link#14     UHS    lo0
    <isp>/20          link#5      U      re0
    <gw ip>           link#5      UHS    lo0
    <isp ip>          <mac>       UHS    re0
    <isp ip>          <mac>       UHS    re0
    localhost         link#7      UH     lo0

    Internet6:
    Destination         Gateway     Flags  Netif   Expire
    localhost           link#7      UH     lo0
    fe80::%em0/64       link#1      U      em0
    fe80::226:55ff:fed  link#1      UHS    lo0
    fe80::%re0/64       link#5      U      re0
    fe80::28c:faff:fed  link#5      UHS    lo0
    fe80::%lo0/64       link#7      U      lo0
    fe80::1%lo0         link#7      UHS    lo0
    fe80::%em0.11/64    link#10     U      em0.11
    fe80::226:55ff:fed  link#10     UHS    lo0
    fe80::%em0.12/64    link#11     U      em0.12
    fe80::226:55ff:fed  link#11     UHS    lo0
    fe80::%em0.13/64    link#12     U      em0.13
    fe80::226:55ff:fed  link#12     UHS    lo0
    fe80::%em0.14/64    link#13     U      em0.14
    fe80::226:55ff:fed  link#13     UHS    lo0
    fe80::%em0.15/64    link#14     U      em0.15
    fe80::226:55ff:fed  link#14     UHS    lo0

    [2.4.4-RELEASE][root@pfSense.localdomain]/root: route add default <ip>
    add net default: gateway <ip>
    [2.4.4-RELEASE][root@pfSense.localdomain]/root: ping google.com
    PING google.com (172.217.25.142): 56 data bytes
    64 bytes from 172.217.25.142: icmp_seq=0 ttl=51 time=29.942 ms
    64 bytes from 172.217.25.142: icmp_seq=1 ttl=51 time=31.108 ms
    ^C
    --- google.com ping statistics ---
    2 packets transmitted, 2 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 29.942/30.525/31.108/0.583 ms
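
    For anyone who wants to check for the same symptom before hand-adding a route, here is a rough sketch of the test I did by eye above. It only looks for a line whose first column is "default" in routing-table text on stdin. The function name has_default_route is made up for this example, and it assumes the input comes from netstat -rn -f inet so that only the IPv4 table is inspected.

```shell
#!/bin/sh
# Sketch: report whether routing-table text on stdin has an IPv4 default route.
# Assumes the input is the output of `netstat -rn -f inet` on FreeBSD/pfSense.
# The function name has_default_route is hypothetical (not a pfSense tool).
has_default_route() {
    # A default route appears as a line whose first column is "default".
    awk '$1 == "default" { found = 1 } END { exit found ? 0 : 1 }'
}

# On the firewall itself (not executed here):
#   netstat -rn -f inet | has_default_route || echo "no IPv4 default route"
```

    If the check fails, route add default <IP> as shown above gets traffic flowing again, but it does not change the default gateway that pfSense has selected in its own configuration.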



  • It seems my system->routing->gateways config ended up as:

    [Screenshot: unused gw as default.png]

    This was configured as my GW originally, while I had the pfSense box on my LAN doing the initial configuration. However, I later removed the LAN interface and configured only VLAN interfaces. It seems the LAN gateway remained present but disabled, and after the update to 2.4.4 this gateway can be used ahead of the real WAN GW.

    I removed this GW completely and the defaults still did not work. I then set "Default GW IPv4" to "GW_WAN" and everything works again. I understand that what I ended up with is the correct configuration, but I think a bug introduced in 2.4.4 allows a disabled GW to be used as the default, knocking the server off the net. That is quite bad for people doing remote upgrades without another way in to the LAN.

    Probably the removal of this workaround is what causes this in 2.4.4. Some code might need to be added to check that a gateway is not disabled before it is used as the default GW.

    https://github.com/pfsense/pfsense/pull/3781/files#diff-1332c372788c9e1a8c6c9bae9ebb55a5L2006

    src/etc/inc/interfaces.inc

    [Screenshot: Image4.png]
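
    To make the suggested check concrete, here is a toy sketch of "skip disabled gateways when choosing the default". It is not pfSense's code (that is PHP in src/etc/inc). The one-line-per-gateway input format "name address state", the function name pick_default_gw, and the GW_LAN name and addresses in the usage comment are all invented for illustration.

```shell
#!/bin/sh
# Sketch: pick the first gateway that is not disabled.
# Input on stdin: one gateway per line as "name address state", where state
# is "enabled" or "disabled". This format and the name pick_default_gw are
# hypothetical; pfSense keeps its gateways in its own config, not like this.
pick_default_gw() {
    # Print the first candidate whose state is not "disabled", then stop.
    # Exit status is non-zero when every gateway is disabled.
    awk '$3 != "disabled" { print $1; found = 1; exit }
         END { exit found ? 0 : 1 }'
}

# Example (made-up data):
#   printf 'GW_LAN 192.168.1.1 disabled\nGW_WAN 203.0.113.1 enabled\n' | pick_default_gw
```

    With a leftover disabled LAN gateway listed first, as in my config, a check like this would pass over it and select GW_WAN.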



  • Yup, it's broken. I have pfSense running as a firewall, with a layer 3 switch doing inter-VLAN routing; there is a transit subnet between the two. It has worked perfectly for a year: pfSense has the switch entered as a gateway, with a static route pointed towards it for all the associated VLANs. The switch gateway has NEVER been marked as default and is only referenced in a couple of static routes.

    It worked perfectly without a blip all year until 2.4.4. Among other problems, at 7am yesterday it suddenly decided to use my L3 switch as a default route, which you can imagine is a problem given that the switch has pfSense set as its default route so the VLANs can get out to the internet. So begins a giant routing loop. There was zero indication of why pfSense suddenly decided to do this. My guess is that my DOCSIS WAN default gateway went down, and the new gateway detection/switching code decided to use the next gateway as the default. I rebooted pfSense and it went back to working exactly as it should.

    2.4.4 has been nothing but problems
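
    A quick way to spot this kind of silent default-route switch is to look at which interface the default route currently leaves through. The sketch below pulls the "interface:" field out of route -n get default output on stdin. The function name default_route_iface is made up, and re0 in the usage comment is only a stand-in for whatever your WAN interface is called.

```shell
#!/bin/sh
# Sketch: extract the outgoing interface from `route -n get default` output
# supplied on stdin (FreeBSD prints a line such as "  interface: re0").
# The function name default_route_iface is hypothetical.
default_route_iface() {
    awk '$1 == "interface:" { print $2; exit }'
}

# On the firewall itself (not executed here; re0 is a placeholder WAN name):
#   [ "$(route -n get default | default_route_iface)" = "re0" ] \
#       || echo "default route is not on the WAN interface"
```

    This only reports the symptom; it says nothing about why the default moved.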


  • Netgate Administrator

    Yup, it is/was: https://redmine.pfsense.org/issues/8910

    Plus there are some other tickets covering related things.

    It is fixed now. The fix will be in 2.4.4p1, and it is already in current 2.4.5 snapshots if you're able to test.

    Steve