Static Routing and ICMP Redirects Frequency == Lost Connections?



  • Dear Boardmembers,

    Since migrating from a m0n0wall setup to the latest pfSense stable (2.0.1), I've run into a problem. Routing to another gateway on the LAN for specific subnets works, but sometimes the connections drop (timeout). So far I've been able to reproduce the symptoms on Windows, but never on Linux.

    The LAN network looks like this:

    
    Network: 10.16.0.0/24
    Gateway: 10.16.0.100
    
    

    We've got another subnet, 10.17.0.0/24, which is accessible via another gateway on the LAN. This gateway has the IP 10.16.0.253 and has been configured on the System -> Routing -> Gateways tab.

    On pfSense (10.16.0.100) there is a static route which looks like this:

    10.17.0.0/24 	ROUTERNAME - 10.16.0.253 	LAN 
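
To make the setup concrete, here is a minimal Python sketch of the route lookup described above. The addresses come from the post; the function names (`next_hop`, `redirect_expected`) and the simplified redirect condition are assumptions for illustration, not pfSense's actual implementation.

```python
import ipaddress

# Addresses taken from the post. The lookup logic is an illustrative assumption.
LAN = ipaddress.ip_network("10.16.0.0/24")
STATIC_ROUTES = {
    ipaddress.ip_network("10.17.0.0/24"): ipaddress.ip_address("10.16.0.253"),
}


def next_hop(destination):
    """Return the static-route gateway for destination, or None if no route matches."""
    dest = ipaddress.ip_address(destination)
    matches = [net for net in STATIC_ROUTES if dest in net]
    if not matches:
        return None
    # Longest-prefix match: the most specific network wins.
    best = max(matches, key=lambda net: net.prefixlen)
    return STATIC_ROUTES[best]


def redirect_expected(source, destination):
    """Simplified check: a redirect is plausible when the sender and the
    next hop both sit on the LAN the packet arrived on."""
    hop = next_hop(destination)
    return hop is not None and ipaddress.ip_address(source) in LAN and hop in LAN


print(next_hop("10.17.0.69"))
print(redirect_expected("10.16.0.150", "10.17.0.69"))
```

In this model a packet from 10.16.0.150 to 10.17.0.69 matches the static route, and because both the client and 10.16.0.253 are on the LAN, the router would be expected to answer with an ICMP redirect.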
    

    With the help of tcpdump and Wireshark I've discovered that the redirection to the second gateway works thanks to the ICMP Redirect packets. This was to be expected. However, it seems pfSense sends these packets at intervals of ten minutes. See this tcpdump excerpt. (10.16.0.150 is my test station, on which I ran the tcpdump.)

    11:03:34.855040 IP 10.16.0.100 > 10.16.0.150: ICMP redirect 10.17.0.69 to host 10.16.0.253, length 36
    11:13:35.696014 IP 10.16.0.100 > 10.16.0.150: ICMP redirect 10.17.0.69 to host 10.16.0.253, length 36
    11:23:35.498285 IP 10.16.0.100 > 10.16.0.150: ICMP redirect 10.17.0.69 to host 10.16.0.253, length 36
    11:33:35.334120 IP 10.16.0.100 > 10.16.0.150: ICMP redirect 10.17.0.69 to host 10.16.0.253, length 36
    11:43:35.125026 IP 10.16.0.100 > 10.16.0.150: ICMP redirect 10.17.0.69 to host 10.16.0.253, length 36
    11:53:34.987470 IP 10.16.0.100 > 10.16.0.150: ICMP redirect 10.17.0.69 to host 10.16.0.253, length 36
    12:03:34.853496 IP 10.16.0.100 > 10.16.0.150: ICMP redirect 10.17.0.69 to host 10.16.0.253, length 36
    12:13:35.681633 IP 10.16.0.100 > 10.16.0.150: ICMP redirect 10.17.0.69 to host 10.16.0.253, length 36
    12:23:35.517551 IP 10.16.0.100 > 10.16.0.150: ICMP redirect 10.17.0.69 to host 10.16.0.253, length 36
    12:33:35.349875 IP 10.16.0.100 > 10.16.0.150: ICMP redirect 10.17.0.69 to host 10.16.0.253, length 36
    12:43:35.183940 IP 10.16.0.100 > 10.16.0.150: ICMP redirect 10.17.0.69 to host 10.16.0.253, length 36
    12:53:34.995803 IP 10.16.0.100 > 10.16.0.150: ICMP redirect 10.17.0.69 to host 10.16.0.253, length 36
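
The spacing between the redirects in the excerpt above can be checked with a short Python sketch. It only uses the first four capture lines from the post; the function name `redirect_intervals` is hypothetical, and the parser assumes the default tcpdump timestamp format shown above with no midnight rollover.

```python
from datetime import datetime

# First four lines of the capture quoted in the post.
CAPTURE = """\
11:03:34.855040 IP 10.16.0.100 > 10.16.0.150: ICMP redirect 10.17.0.69 to host 10.16.0.253, length 36
11:13:35.696014 IP 10.16.0.100 > 10.16.0.150: ICMP redirect 10.17.0.69 to host 10.16.0.253, length 36
11:23:35.498285 IP 10.16.0.100 > 10.16.0.150: ICMP redirect 10.17.0.69 to host 10.16.0.253, length 36
11:33:35.334120 IP 10.16.0.100 > 10.16.0.150: ICMP redirect 10.17.0.69 to host 10.16.0.253, length 36
"""


def redirect_intervals(capture):
    """Return the gaps in seconds between consecutive ICMP redirect lines."""
    stamps = [
        datetime.strptime(line.split()[0], "%H:%M:%S.%f")
        for line in capture.splitlines()
        if "ICMP redirect" in line
    ]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


for gap in redirect_intervals(CAPTURE):
    print(f"{gap:.3f} s")
```

Each gap comes out within about a second of 600 seconds, which matches the ten-minute interval described in the post.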
    
    

    It seems Windows has a 10-minute timeout for routes added to the system by an ICMP Redirect, which I feel is a tiny bit too close to pfSense's 10-minute interval.

    Windows NT Specifics:

    A host route learned by means of an ICMP Redirect will be added to the
    route table for 10 minutes, after which time it is removed and must be
    relearned through another ICMP Redirect.

    http://support.microsoft.com/kb/195686

    m0n0wall (or rather the FreeBSD version used in m0n0wall) seems to have sent the ICMP Redirect more often than every 10 minutes, so there were no lost connections (timeouts) caused by the client not finding the correct route.

    My question, though: is there ANY way to increase the frequency with which pfSense (or FreeBSD, for that matter) sends its ICMP Redirect messages? (System tunables or something?)
    I cannot possibly change every Windows machine on the network.
    (And yes, I know that relying on ICMP Redirects is not best practice and is unsafe networking, but unfortunately I'm stuck with it for the moment.)

    Anyone any ideas? Any guru there who can baffle me (not that hard I imagine) with some BSD magic?   ;D


  • Rebel Alliance Developer Netgate

    It's probably not the redirects. You probably need to go to System > Advanced, open the Firewall/NAT tab, and check "Bypass firewall rules for traffic on the same interface".



  • Thanks for pointing this out. This is indeed the most common issue with routing on the same interface.

    But sadly, no. The "Bypass firewall rules for traffic on the same interface" option was already active, both on the old m0n0wall and on the new pfSense.
    I also tried switching it off.
    How does enabling or disabling this option create disconnects (due to timeouts)?

    The sessions (RDP/VNC/Telnet, anything) start fine, but then disconnect (time out) after about 10 minutes. And this is not an idle timer of the TCP session or anything like that, because these are actively used sessions. The problem is reproducible.

    PS. Where I say disconnects, I actually mean timeouts. I'll edit my posts. (edited)


  • Rebel Alliance Developer Netgate

    Well, with asymmetric routing, it's usually one side or the other that drops the state, since it stops seeing traffic for it once the client picks up the route. Once the client forgets the route, it tries to send a packet back through the firewall. Since that packet isn't creating a new connection and there is no state for it, it gets dropped unless the bypass rules are in place.

    It happens all the time, exactly as you describe, when that option isn't enabled.

    Check the firewall log to see if the traffic is blocked when you see a timeout.
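
The sequence described above can be illustrated with a toy Python model. This is only a sketch under stated assumptions, not pf's actual state machine: the class `StatefulFirewall`, the 30-second state timeout, and the way the bypass option is modelled (pass everything) are all hypothetical simplifications.

```python
class StatefulFirewall:
    """Toy stateful filter: passes connection-opening packets and packets
    that match a live state; drops mid-stream packets without a state."""

    def __init__(self, state_timeout, bypass_same_interface=False):
        self.state_timeout = state_timeout          # hypothetical idle timeout, seconds
        self.bypass_same_interface = bypass_same_interface
        self.last_seen = {}                         # connection -> time of last packet

    def handle(self, now, conn, syn):
        """Return True if the packet is passed, False if it is dropped."""
        if self.bypass_same_interface:
            return True                             # simplified model of the bypass option
        last = self.last_seen.get(conn)
        has_state = last is not None and now - last <= self.state_timeout
        if syn or has_state:
            self.last_seen[conn] = now              # create or refresh the state
            return True
        self.last_seen.pop(conn, None)              # stale state is discarded
        return False


def simulate(bypass):
    fw = StatefulFirewall(state_timeout=30, bypass_same_interface=bypass)
    conn = ("10.16.0.150", "10.17.0.69")
    opened = fw.handle(0, conn, syn=True)           # first packet goes via the firewall
    # The client then follows the redirect and talks to 10.16.0.253 directly,
    # so the firewall sees nothing until the learned route expires.
    resumed = fw.handle(600, conn, syn=False)       # mid-stream packet after 10 minutes
    return [opened, resumed]


print("without bypass:", simulate(False))
print("with bypass:   ", simulate(True))
```

In this model the packet sent after the redirect route expires is dropped when the bypass is off and passed when it is on, which is the kind of blocked entry to look for in the firewall log at the moment of a timeout.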



  • I'm encountering the same issue. Have you been able to resolve it on your end?



  • I was having this problem, and realized that somehow the gateway I created to use for the static route was applied to the WAN interface instead of the LAN interface.  As soon as I corrected that the problem completely went away.

    It was weird because the traffic would still pass, but the connection would reset every few minutes.  The firewall log showed that packets would get blocked every once in a while.  I just wanted to note this in case it helps someone else.

