VPN - Routing Issue - Only Linux Hosts



  • Hello All,

    I have a very strange issue happening on our local work network.  We have our local network at our office and then an IPSec tunnel that connects us to our main data center where most of our gear is kept.

    Our local network routes all the internet traffic out of our office and should route only the VPN traffic through the VPN.  When using Windows machines and also my iPhone, this works fine.  But, a local Linux laptop will not "find" anything on the other side of the tunnel and neither will any Android devices.

    Our local network is:
    172.26.0.1 - 172.26.255.254

    The network on the other side of the tunnel is:
    172.25.0.1 - 172.25.255.254

    I can use/ping/traceroute any of the 172.25.x.x hosts from my Windows laptop without issue.

    But, when trying to do the same from the Linux laptop, it fails each time.

    I had thought my colleague's Android issues were just some oddity on his devices, but now that I see the same thing play out on a fresh Linux Mint install, it has to be some sort of routing issue. I am not sure how to troubleshoot or solve it.

    Can anyone offer some help on this?

    Thanks much!



  • Is it possible the problem devices have a local firewall in place, blocking the external subnet?



  • Thank you for the reply.  I was thinking the same thing so I've been doing some sort of "throw a dart" troubleshooting to try and figure it out.

    I did check and it does not seem there is any firewall active on the linux machine.

    Here are some tests I did:

    Just as a refresher…the Linux laptop is on 172.26.x.x

    My windows laptop can connect to everything on the VPN subnet without issue.

    Windows Laptop on same subnet --> PING --> Linux Laptop = OK
    Linux Laptop --> PING --> Windows Laptop on same subnet = OK

    This seems to indicate that ping itself is OK to/from the linux machine.

    I then used a RDP session from my laptop to a server on the VPN subnet of 172.25.x.x.

    Server --> PING --> Linux Laptop = NOT OK

    Here is where it gets sort of interesting...

    If I try to PING a device on the other subnet from the Linux laptop, I will get 1 good reply and then that's it.  And I only get that 1 reply the first time I try to ping the device.  After that, I get no replies at all until I reboot.  After a reboot, I can again get 1 response and then that's it.  Similarly, if I try to access a web page on the VPN subnet, the browser will seem to indicate that a connection was made because it will say "waiting for 172.25.10.231" but then nothing happens.  If I ping 172.25.10.231 and get the 1 reply and then try to hit the web page, it immediately fails to load the page in the browser.

    Also, I tried using NMAP on the laptop and it seemed to locate devices properly on the VPN subnet.  But, after trying to ping a device, NMAP suddenly can't see any of the ones I've pinged any longer.

    It would seem logical that the issue lies on the Linux machine, but the fact that Android devices are similarly unable to talk with anything on the other end of the VPN makes me think there is some setting on pfSense that may overcome whatever the issue is.  After all, the Android device should just be able to be on the LAN here and "see" the other stuff on the VPN without issue.

    Any added thoughts from the crowd?

    Thanks again.



  • Is ufw installed or running?



  • I put the gufw package on there to check that and it was off.  I activated it and then told it to allow all and still had the same results unfortunately.



  • Yes, this is a dumb question, but…did you remember to disable dead peer detection?



  • Dead Peer Detection is active on the IPSEC setup.

    Would that cause an issue somehow?

    Keep in mind all Windows hosts on my end of the tunnel can access everything on the other side without issue, if that matters.

    Thank you for your reply.  Hoping to get this figured out.


  • Netgate Administrator

    Something that can impact Linux (including Android) but not Windows is partial IPv6 connectivity. Linux can attempt to use IPv6 if it appears to be available even if no external route is possible.

    Steve



  • I do not seem to have IPv6 activated anyplace but can you tell me where I should look, just so I can confirm?  Or, is there some option I need to select to handle IPv6 requests?

    Thanks!



  • @DungaBee:

    Dead Peer Detection is active on the IPSEC setup.

    Would that cause an issue somehow?

    Keep in mind all Windows hosts on my end of the tunnel can access everything on the other side without issue, if that matters.

    Sorry, my fault - I somehow assumed that the Linux machines used one IPsec tunnel and the Windows boxes another one. Had I read your initial post correctly, I would have noted that all machines use the same tunnel.

    DPD can, in some cases, cause the tunnel to disconnect for no apparent reason. Obviously, with the tunnel completely going down, all machines would be affected.

    What does

    sudo route -n
    netstat
    ip route list
    

    show on a Linux machine? (Those are three separate commands.)

    The Windows version is

    route print
    


  • Thank you for the follow up.  Here is the info.  I omitted all the misc connection info from netstat as I assumed that was not relevant.

    Windows Machine

    route print
    
    IPv4 Route Table
    ===========================================================================
    Active Routes:
    Network Destination        Netmask          Gateway       Interface  Metric
              0.0.0.0          0.0.0.0    172.26.10.254     172.26.10.50     20
            127.0.0.0        255.0.0.0         On-link         127.0.0.1    306
            127.0.0.1  255.255.255.255         On-link         127.0.0.1    306
      127.255.255.255  255.255.255.255         On-link         127.0.0.1    306
           172.26.0.0      255.255.0.0         On-link      172.26.10.50    276
         172.26.10.50  255.255.255.255         On-link      172.26.10.50    276
       172.26.255.255  255.255.255.255         On-link      172.26.10.50    276
            224.0.0.0        240.0.0.0         On-link         127.0.0.1    306
            224.0.0.0        240.0.0.0         On-link      192.168.56.1    276
            224.0.0.0        240.0.0.0         On-link      172.26.10.50    276
      255.255.255.255  255.255.255.255         On-link         127.0.0.1    306
      255.255.255.255  255.255.255.255         On-link      192.168.56.1    276
      255.255.255.255  255.255.255.255         On-link      172.26.10.50    276
    ===========================================================================
    Persistent Routes:
      None
    

    Linux Machine

    sudo route -n
    Kernel IP routing table
    Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
    0.0.0.0         172.26.10.254   0.0.0.0         UG    0      0        0 wlan0
    172.26.0.0      0.0.0.0         255.255.0.0     U     9      0        0 wlan0
    
    netstat
    Active Internet connections (w/o servers)
    Proto Recv-Q Send-Q Local Address           Foreign Address         State      
    tcp6       0      0 ip6-localhost:45710     ip6-localhost:ipp       ESTABLISHED
    tcp6       0      0 ip6-localhost:ipp       ip6-localhost:45710     ESTABLISHED
    tcp6       1      0 ip6-localhost:45708     ip6-localhost:ipp       CLOSE_WAIT 
    
    ip route list
    default via 172.26.10.254 dev wlan0  proto static 
    172.26.0.0/16 dev wlan0  proto kernel  scope link  src 172.26.10.152  metric 9 
    

  • Netgate Administrator

    I doubt this is applicable here but just in case. In this thread, for example, the issue turned out to be an interface that had its IPv6 type set to 'track interface' instead of 'none'. I guess you could check the VPN interface for something similar.

    Steve



  • Unfortunately that did not help.  My IPv6 configuration was already set to "None".  I changed it and then changed it back, but no luck.


  • LAYER 8 Netgate

    Pinging IPv4 addresses directly shouldn't involve IPv6 at all.

    Are both sides pfSense?

    What version?

    What's on the IPsec tab of the firewall rules at both ends?



  • Only my side is pfSense.  The other side is a Cisco ASA.

    My end is 2.1.5.

    I do not know much about the ASA other than I told the corporate firewall guys that I didn't want one  :)

    To me, it seems the issue has to be on my end because the windows hosts (and my iPhone) operate just fine through the tunnel.

    Also, just to mention it again, the FIRST time I ping a host on the other end of the tunnel from the Linux laptop, I get ONE reply back and then all others fail.

    All following communications to that same host on the other side fail.  If I try another host on the other end of the tunnel from the Linux machine, I will again get a reply on the FIRST ping.  All other pings fail and all other attempts to communicate with that host fail, until I reboot the linux machine.

    Thanks again for your help in figuring out this mystery.



  • While reading another thread, I noticed a suggestion to use packet capture.  I had forgotten about that being in pfSense so I did that today.

    I pinged a host and captured the following.  You can see that one good ping reply followed by nothing.  But, I am not sure how to really interpret these results so I am hoping someone on here can help in that regard.

    Thank you again.

    12:34:15.423806 IP 172.26.10.153 > 172.25.10.11: ICMP echo request, id 3515, seq 1, length 64
    12:34:15.424004 IP 172.26.10.254 > 172.26.10.153: ICMP redirect 172.25.10.11 to host 172.25.10.11, length 36
    12:34:15.448867 IP 172.25.10.11 > 172.26.10.153: ICMP echo reply, id 3515, seq 1, length 64
    12:34:16.425303 ARP, Request who-has 172.25.10.11 tell 172.26.10.153, length 46
    12:34:17.424494 ARP, Request who-has 172.25.10.11 tell 172.26.10.153, length 46
    12:34:18.424525 ARP, Request who-has 172.25.10.11 tell 172.26.10.153, length 46
    12:34:19.424416 ARP, Request who-has 172.25.10.11 tell 172.26.10.153, length 46
    12:34:20.424455 ARP, Request who-has 172.25.10.11 tell 172.26.10.153, length 46
    12:34:20.432494 ARP, Request who-has 172.26.10.254 tell 172.26.10.153, length 46
    12:34:20.432512 ARP, Reply 172.26.10.254 is-at 00:10:18:03:75:7f, length 28
    12:34:21.424495 ARP, Request who-has 172.25.10.11 tell 172.26.10.153, length 46
    12:34:22.424698 ARP, Request who-has 172.25.10.11 tell 172.26.10.153, length 46
    12:34:23.424586 ARP, Request who-has 172.25.10.11 tell 172.26.10.153, length 46
    12:34:24.424355 ARP, Request who-has 172.25.10.11 tell 172.26.10.153, length 46
    

  • LAYER 8 Global Moderator

    Why would you arp for something that is not on your network?

    12:34:16.425303 ARP, Request who-has 172.25.10.11 tell 172.26.10.153, length 46
    12:34:17.424494 ARP, Request who-has 172.25.10.11 tell 172.26.10.153, length 46
    12:34:18.424525 ARP, Request who-has 172.25.10.11 tell 172.26.10.153, length 46

    You're ARPing for 25.10.11 from 26.10.253.

    It looks like 10.253 redirected your ICMP request, and it sent you back a reply, but clearly this seems to be a different network because you're not getting an ARP reply back.
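
    As a rough illustration of the point (a sketch only, not anything pfSense or Linux actually runs): a host should ARP for a destination only when it is inside its own subnet, and otherwise ARP for its gateway. The subnet and gateway below are taken from the routing table posted earlier; `arp_target` is a made-up helper name.

```python
import ipaddress

# Values from the Linux laptop's routing table earlier in the thread.
LOCAL_NET = ipaddress.ip_network("172.26.0.0/16")
DEFAULT_GW = ipaddress.ip_address("172.26.10.254")


def arp_target(dest: str):
    """Return the address a host would normally ARP for when sending to dest."""
    addr = ipaddress.ip_address(dest)
    # On-link destinations are resolved directly; anything else is handed
    # to the default gateway, so the gateway is what gets ARPed for.
    return addr if addr in LOCAL_NET else DEFAULT_GW


if __name__ == "__main__":
    print(arp_target("172.26.10.50"))  # host on the local LAN
    print(arp_target("172.25.10.11"))  # host across the tunnel
```

    With the posted routing table, a request for 172.25.10.11 should resolve to the gateway, so seeing ARP requests for 172.25.10.11 itself means something has overridden the normal decision.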



  • 172.26.10.253 is my pfSense firewall.

    172.26.10.153 is the linux machine that gets 1 ping reply and then none after that.

    172.26.0.0/16 is my local LAN

    172.25.0.0/16 is the other side of the tunnel.

    I know that didn't exactly solve the issue, but does that help in your figuring out why traffic is not being routed?

    Thank you.



  • Wait a minute…..

    172.26.10.253 is my wireless router.

    .254 is pfSense.

    It would seem that the wireless router (being used as just an access point) is somehow trying to do more than just drop the wireless clients on to the LAN.

    Could it be trying to find the route itself for some reason?


  • LAYER 8 Netgate

    Unplug it, get everything else working, then add it back properly configured.  I'm starting to smell a duplicate IP address somewhere.


  • Netgate Administrator

    Some of this traffic is going over wifi?
    That packet capture was on the pfSense LAN interface I assume?
    Are you using static IPs or DHCP? Check the DHCP leases are coming from pfSense if you are.

    .253 is not actually shown. I think that's just a misread of .153. Your wifi access point does not appear to be involved at all.

    Try running a similar packet capture while pinging from a Windows client for comparison.

    Steve


  • Netgate Administrator

    What's that ICMP redirect doing?
    It appears, to my untrained eyes, to be pfSense(172.26.10.254) telling your client(172.26.10.153) that to reach the remote host(172.25.10.11) there's a better router going directly via 172.25.10.11.  :-\
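
    One possible reading of the capture, sketched as a toy model: if the client accepts that redirect, it installs 172.25.10.11 as its own next hop, and from then on it has to ARP for an address that is not on its segment. The class and method names are invented for illustration, and the assumption that the client accepts the redirect without an on-link check is exactly that, an assumption.

```python
import ipaddress

LOCAL_NET = ipaddress.ip_network("172.26.0.0/16")


class HostRoutes:
    """Toy model of a client's next-hop decisions (not a real IP stack)."""

    def __init__(self, default_gw: str):
        self.default_gw = ipaddress.ip_address(default_gw)
        self.redirects = {}  # destination -> gateway learned via ICMP redirect

    def accept_redirect(self, dest: str, new_gw: str) -> None:
        # Assumption: the redirect is installed without checking that the
        # new gateway is reachable on the local segment.
        self.redirects[ipaddress.ip_address(dest)] = ipaddress.ip_address(new_gw)

    def next_hop(self, dest: str):
        addr = ipaddress.ip_address(dest)
        if addr in self.redirects:
            return self.redirects[addr]
        return addr if addr in LOCAL_NET else self.default_gw

    def arp_can_resolve(self, dest: str) -> bool:
        # ARP only gets an answer for next hops on the local segment.
        return self.next_hop(dest) in LOCAL_NET


if __name__ == "__main__":
    client = HostRoutes("172.26.10.254")
    print(client.arp_can_resolve("172.25.10.11"))   # first ping, via the gateway
    client.accept_redirect("172.25.10.11", "172.25.10.11")
    print(client.arp_can_resolve("172.25.10.11"))   # after the redirect
```

    That would match one good reply followed by unanswered ARP requests, and a client that ignores the redirect would keep working.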



  • Here is a ping from my laptop (172.26.10.50) to a host across the VPN (172.25.10.11)

    DHCP is in use, but I am certain only pfSense is giving out addresses.  I reviewed the wireless router setup numerous times and it looks good in that regard:

    Good Ping from Windows

    14:41:21.359361 IP 172.26.10.50 > 172.25.10.11: ICMP echo request, id 1, seq 417, length 40
    14:41:21.359526 IP 172.26.10.254 > 172.26.10.50: ICMP redirect 172.25.10.11 to host 172.25.10.11, length 36
    14:41:21.384430 IP 172.25.10.11 > 172.26.10.50: ICMP echo reply, id 1, seq 417, length 40
    14:41:22.359116 IP 172.26.10.50 > 172.25.10.11: ICMP echo request, id 1, seq 418, length 40
    14:41:22.359274 IP 172.26.10.254 > 172.26.10.50: ICMP redirect 172.25.10.11 to host 172.25.10.11, length 36
    14:41:22.383116 IP 172.25.10.11 > 172.26.10.50: ICMP echo reply, id 1, seq 418, length 40
    14:41:23.364131 IP 172.26.10.50 > 172.25.10.11: ICMP echo request, id 1, seq 419, length 40
    14:41:23.364276 IP 172.26.10.254 > 172.26.10.50: ICMP redirect 172.25.10.11 to host 172.25.10.11, length 36
    14:41:23.388422 IP 172.25.10.11 > 172.26.10.50: ICMP echo reply, id 1, seq 419, length 40
    

    Failed Ping to same host from Linux machine (172.26.10.153)

    14:43:50.070739 IP 172.26.10.153 > 172.25.10.11: ICMP echo request, id 2305, seq 1, length 64
    14:43:50.070924 IP 172.26.10.254 > 172.26.10.153: ICMP redirect 172.25.10.11 to host 172.25.10.11, length 36
    14:43:50.099853 IP 172.25.10.11 > 172.26.10.153: ICMP echo reply, id 2305, seq 1, length 64
    14:43:51.072299 ARP, Request who-has 172.25.10.11 tell 172.26.10.153, length 46
    14:43:52.070287 ARP, Request who-has 172.25.10.11 tell 172.26.10.153, length 46
    14:43:53.070345 ARP, Request who-has 172.25.10.11 tell 172.26.10.153, length 46
    14:43:54.088953 ARP, Request who-has 172.25.10.11 tell 172.26.10.153, length 46
    14:43:55.086226 ARP, Request who-has 172.25.10.11 tell 172.26.10.153, length 46
    14:43:56.086409 ARP, Request who-has 172.25.10.11 tell 172.26.10.153, length 46
    
    

  • LAYER 8 Global Moderator

    And again you're ARPing for an IP that is NOT on your network!!!

    14:43:51.072299 ARP, Request who-has 172.25.10.11 tell 172.26.10.153, length 46

    You get a redirect from 10.254 ???  Who is that?  You said your pfSense is .253
    14:43:50.070924 IP 172.26.10.254 > 172.26.10.153: ICMP redirect 172.25.10.11 to host 172.25.10.11, length 36

    And now your client at 26.10.153 is ARPing for that IP vs sending it out to its gateway.  No shit it's never going to get an answer to that.
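
    For anyone who wants to spot this pattern in a longer capture, here is a minimal sketch that scans tcpdump-style text (the format shown in the pfSense packet capture above) and lists ARP requests for addresses outside the local subnet. The function name and the hard-coded 172.26.0.0/16 subnet are assumptions for this thread's network.

```python
import ipaddress
import re

LOCAL_NET = ipaddress.ip_network("172.26.0.0/16")
# Matches lines such as:
#   ARP, Request who-has 172.25.10.11 tell 172.26.10.153, length 46
ARP_REQUEST = re.compile(r"ARP, Request who-has (\S+) tell (\S+),")


def off_link_arp_targets(capture: str) -> set:
    """Return the ARP targets in the capture that are not on the local subnet."""
    targets = set()
    for line in capture.splitlines():
        match = ARP_REQUEST.search(line)
        if match and ipaddress.ip_address(match.group(1)) not in LOCAL_NET:
            targets.add(match.group(1))
    return targets


if __name__ == "__main__":
    sample = (
        "14:43:51.072299 ARP, Request who-has 172.25.10.11 tell 172.26.10.153, length 46\n"
        "12:34:20.432494 ARP, Request who-has 172.26.10.254 tell 172.26.10.153, length 46\n"
    )
    print(off_link_arp_targets(sample))
```

    Any address it reports is one the client is trying to reach directly instead of through its gateway.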



  • 172.26.10.254 is pfSense.

    I misspoke when I said it was .253 earlier, my fault.

    So, to be clear.

    | pfSense | 172.26.10.254 |
    | Windows Machine | 172.26.10.50 |
    | Linux Machine | 172.26.10.153 |
    | Host on other end of Tunnel | 172.25.10.11 |

    So, the initial redirect by pfSense seems to be correct, but then what would trigger the ARPing?

    I am not even sure the function of that, so I am pretty lost  :)

    Thanks again for your help!


  • LAYER 8 Global Moderator

    Why is it doing a redirect? A redirect can normally happen when there is a better route.

    "The interface on which the packet comes into the router is the same interface on which the packet gets routed out."
    "The subnet or network of the source IP address is on the same subnet or network of the next-hop IP address of the routed packet."

    This is when Cisco routers would send a redirect.
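
    Those two quoted conditions can be written down as a small check. This is only a sketch of the rule as quoted, not Cisco's or pfSense's actual code; the function name, the interface labels, and the next-hop address in the example are hypothetical.

```python
import ipaddress


def should_send_redirect(in_iface: str, out_iface: str,
                         src: str, next_hop: str, iface_net: str) -> bool:
    """Apply the two quoted conditions for sending an ICMP redirect."""
    # Condition 1: the packet leaves on the interface it arrived on.
    same_interface = in_iface == out_iface
    # Condition 2: the source and the next hop share that interface's subnet.
    net = ipaddress.ip_network(iface_net)
    same_subnet = (ipaddress.ip_address(src) in net
                   and ipaddress.ip_address(next_hop) in net)
    return same_interface and same_subnet


if __name__ == "__main__":
    # Hypothetical: traffic from the Linux laptop routed back out the LAN
    # toward a next hop that also sits on the LAN subnet.
    print(should_send_redirect("lan", "lan",
                               "172.26.10.153", "172.26.10.1", "172.26.0.0/16"))
    # Traffic that leaves through a different interface gets no redirect.
    print(should_send_redirect("lan", "ipsec",
                               "172.26.10.153", "172.25.10.11", "172.26.0.0/16"))
```

    So a redirect here suggests the router believes the traffic for the remote network should go back out the LAN interface.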

    Do you have some issue with the masks on your interfaces?  How exactly do you have this site-to-site set up? Are you not using a transit network?

    I ping a vpn client from a box on my lan and this is what a capture looks like on the pfsense lan

    15:17:15.135118 IP 192.168.1.100 > 10.0.200.6: ICMP echo request, id 1, seq 1, length 40
    15:17:15.333586 IP 10.0.200.6 > 192.168.1.100: ICMP echo reply, id 1, seq 1, length 40
    15:17:16.142803 IP 192.168.1.100 > 10.0.200.6: ICMP echo request, id 1, seq 2, length 40
    15:17:16.320914 IP 10.0.200.6 > 192.168.1.100: ICMP echo reply, id 1, seq 2, length 40

    You don't know what an ARP is?

    You could turn off redirects, I would think: set net.inet.ip.redirect to 0.

    What does the traceroute look like?


  • Netgate Administrator

    I suspect pfSense is sending the redirect all the time but Windows and iOS are ignoring it.
    Disabling redirects in pfSense should at least prove this but why is it sending them at all? I assume it must be some misconfiguration in the VPN setup.

    Steve



  • Turning OFF redirects in the "System Tunables" worked!!

    net.inet.ip.redirect set to 0

    But, do you think there is a setup issue in the VPN that is really the culprit?

    I'd like to fix the root cause and learn from this, if possible.

    Thanks again and let me know what you think.


  • LAYER 8 Global Moderator

    We don't have anything of worth to work with here, other than saying he has a vpn connection to this other network.  We don't have routing table off the pfsense box, etc.

    Makes no sense that pfsense would send a redirect when it should be routing the traffic down the tunnel.  Is the mask wrong on the network in pfsense?  And it thinks that network is local?

    Really needs some more details on how pfsense vpn is setup, off what interface?  Routing table off pfsense would help for sure.


  • LAYER 8 Netgate

    As would a diagram properly documented with network and interface addresses.



  • Thanks guys.

    I'm heading out of office for the day but will post a diagram with details tomorrow and you can tell me what else to add to help figure it out.

    Hopefully as I document it, perhaps something will jump out.

    For now at least it works, even if I've just sort of put a band-aid on it.

    Thanks again and talk to you tomorrow!


  • Netgate Administrator

    Yes, seeing how your vpn interface is configured should be revealing.
    One thing that seems like it can cause this is having both subnets on the same interface. I'm struggling to see how that might apply here though.

    Steve



  • From a quick scan of this thread, I would guess that the netmask/CIDR on pfSense has been set (accidentally) to cover both the 25 and 26 networks - 172.26.10.254/14 (or a shorter prefix) would cover all that and cause pfSense to think that 172.25.n.n is on its LAN and thus send a redirect message back to the client.
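
    A quick way to check which prefix lengths on the pfSense LAN address would make 172.25.10.11 look on-link, using Python's standard `ipaddress` module. The helper name `covers_remote` is made up; the addresses are the ones from this thread.

```python
import ipaddress

PFSENSE_LAN_IP = "172.26.10.254"
REMOTE_HOST = ipaddress.ip_address("172.25.10.11")


def covers_remote(prefix_len: int) -> bool:
    """True if the LAN network at this prefix length contains the remote host."""
    lan = ipaddress.ip_interface(f"{PFSENSE_LAN_IP}/{prefix_len}").network
    return REMOTE_HOST in lan


if __name__ == "__main__":
    for prefix_len in (16, 15, 14, 13):
        lan = ipaddress.ip_interface(f"{PFSENSE_LAN_IP}/{prefix_len}").network
        print(prefix_len, lan, covers_remote(prefix_len))
```

    Comparing the result against the mask actually configured on the LAN interface would confirm or rule out this guess.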


  • LAYER 8 Global Moderator

    ^ Agreed, I never understand how people come in here asking without some diagram. I cannot believe a company that has multiple locations and a site-to-site VPN does not have a network drawing??



  • You create docs/schematics?
    Some of us seem to have the luxury of colleagues and spare time ….

    I only know people who get abused by their employer to do a 5-man job on their own    :D


  • Netgate Administrator

    Yes, unfortunately it doesn't surprise me at all. And in fact I'd go further and say that very often network issues can be caused by an existing network diagram that's out of date or just plain wrong. I have always found it prudent to assume nothing. Perhaps just my own experience.  ::)

    Steve



  • Thanks again everyone that's helped.  Comments about lack of documentation duly noted as well.  I am guessing there is a more elaborate network diagram with the main office guys that sort of support the network, but it is likely not fully up to date as well.  We're not a very large company so we do not have a fully dedicated group or person that supports the network.  If we did, they might have tried to force a Cisco ASA on me some time ago.  The fact I can more or less support what I've got has helped me and pfSense is really the reason I can support it, because it's straightforward to use.

    The person that originally reconfigured our company network decided to set up the main office and my office with very large LAN subnets for some reason.  So, you will see in the image that the main office is 172.25.0.0 - 172.25.255.255 and my office is 172.26.0.0 - 172.26.255.255.  We likely could/should have been all on 172.25 with the next octet being assigned to each office and the last being left for all the hosts within the office.  But, no matter, that is how it is set up.

    When I first set up this remote office, we had no VPN connectivity at all.  I think I started with some Linux firewall distribution and then later used m0n0wall and that led me to pfSense.  I think it's been here since one of the very, very early releases.  All that being said, I'm a middling sort of network person so mistakes in the setup would not exactly be surprising.  Part of what is awesome about pfSense is the traffic shaping which has been huge for me because I use hosted VoIP for my office phone system.

    I've attached a very basic image that describes some of what I've mentioned along with the relevant pfSense screens (parts of them anyway), so you can see the setup.  I'm guessing one of you experts will notice something right away, which is appreciated.

    Thank you again for your help on this.

    ![pfSense VPN Info.JPG](/public/imported_attachments/1/pfSense VPN Info.JPG)


  • LAYER 8 Global Moderator

    This does not look right - see attached.

    You have the gateway set up for the remote 25 network as your LAN interface 26.10.254 on pfSense???

    Where are your phase 1 details from when you set up the tunnel?  Your WAN interface is normally your endpoint for the tunnel.




  • When you mentioned routing, I started looking around and found a specific routing entry for the VPN.  I am not sure why it's there, but it is.  I did some reading on VPN setup in pfSense and it seems that the routing over the tunnel takes care of itself so no specific routing entry like this should be needed.

    So, I removed this, re-enabled the redirect setting in system tunables back to the default and rebooted the linux machine (to be safe).  I can still ping IPs on the other end of the tunnel, so that's great!

    That introduced a new issue with DNS resolution over the VPN for our domain.  I figured that out with some searching but will post the details here so it might help another person later.

    Basically in the DNS forwarder where you can specify a domain override, I had to also specify the LAN IP of pfSense (172.26.10.254 in my case) as the "Source IP" on the domain override configuration.  Once I did that, lookups for our domain worked perfectly again.

    So, at the end of the day, the issue was the static route that I added and then the IP on the DNS domain override.  I assume I did the route entry to try and "tell" pfSense to send traffic for the remote VPN someplace.  And oddly, it worked until now.

    But, it now seems that all is well and I've only got the configuration in place that is needed.

    Thanks again everyone!




  • Basically in the DNS forwarder where you can specify a domain override, I had to also specify the LAN IP of pfSense (172.26.10.254 in my case) as the "Source IP" on the domain override configuration.

    You usually have to do that when the DNS server that services the domain in question is over a VPN, because otherwise the source IP of the request (from the pfSense, across the VPN to the DNS server) will be some IP address of a VPN tunnel endpoint, or some internal tunnel address. The remote DNS server typically won't have a route back to that and so the reply to those DNS queries would never make it back.
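
    To show what "source IP" means in practice, here is a small sketch that sends a DNS query from a chosen local address by binding the socket before sending. It is not how the pfSense DNS forwarder is implemented; `build_query` and `query_from` are invented names, and the server address and domain you pass in would be your own.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for the A record of name."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def query_from(source_ip: str, server_ip: str, name: str,
               timeout: float = 2.0) -> bytes:
    """Send the query with source_ip as the packet's source address."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        # Binding picks the source address; the server replies to it, so it
        # must be an address the far side can route back to.
        sock.bind((source_ip, 0))
        sock.sendto(build_query(name), (server_ip, 53))
        reply, _ = sock.recvfrom(512)
        return reply
```

    Binding to the LAN address plays the same role as the "Source IP" field on the domain override: the remote DNS server sees a 172.26.x.x source that it already has a route back to through the tunnel.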

