Default deny rule blocking some IPSEC traffic



  • Hi everybody,

    I have multiple sites connected over IPSEC in a hub-and-spoke configuration.
    All sites use pfSense, and the whole setup is working fine apart from the issue below.

    On the hub firewall I can see traffic between a server in the hub site and 2 servers in remote sites being dropped by the LAN's "Default deny rule IPv4".

    Hub site server 192.168.126.10/24 <--> Remote server A 192.168.180.10/24
    Hub site server 192.168.126.10/24 <--> Remote server B 172.17.97.10/24

    0_1537212754203_Firewall_log_01.JPG

    LAN rule is "allow any".

    0_1537213140468_LAN_rule.JPG

    IPSEC rule is "allow LAN".

    0_1537214121496_IPSEC_rule.JPG

    The issue looks like asymmetric routing, but it is not: the hub firewall does have 2 WAN connections, but all traffic to/from remote sites goes through IPSEC. (I tried "Bypass firewall rules for traffic on the same interface" without success.)

    What's going on?

    The funny thing is that other machines can connect to each other. Only the three machines above cannot. And of course, by Murphy's Law, these are the ones which must communicate...


  • Rebel Alliance Developer Netgate

    Are the packets delayed in some way? Drops like that can happen not only from asymmetric routing but also if the traffic somehow is considered "out of state", which could be packets after a state was removed, packets that missed their TCP window, etc.



  • AFAIK no.
    Site to site latency is stable around 35 ms.
    CPU usage is usually below 10%.

    0_1537216455213_CPU.JPG


  • Rebel Alliance Developer Netgate

    That's just one possible way those types of blocks can happen. The "pass all" rules only pass new TCP connections (TCP flags S/SYN set, A/ACK not set), so if anything happens to disrupt things then you see packets like this dropped by the default rules. A few examples include:

    • Asymmetric routing -- packets enter and exit via different paths, or pfSense somehow only sees half of the flow
    • States dropped/killed, for example from gateway failover triggering a clearing of states
    • The traffic did not match an existing state and failed to create a new state (TCP:S or UDP, tends to happen when a broken client reuses source ports incorrectly)
    • Server sent a packet after the client had closed the connection. Variation of the state being removed already. Happens a lot with clients/servers that try to hold open or reuse connections in various ways, primarily with web servers.
    • States were removed because the table was nearly full and it had to make room for new connections -- happens more often with clients that spam servers with tons of connections and fail to close them promptly

    If it isn't asymmetric routing, then odds are it's that last one. It looks like these are client LDAP connections and all of them are close in port range from the same client. Probably a high volume auth environment.

    Try changing Firewall Optimization to Aggressive under System > Advanced, Firewall & NAT tab.

    That said, in these cases the log messages are typically harmless because by the time the connections are reaped it's done using them anyhow.
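
    The point above about pass rules only matching new TCP connections can be sketched with a toy model. This is a simplified illustration, not how pf is actually implemented: `ToyFirewall`, `pkt`, and the source port 50000 are hypothetical names and values chosen for the example, and real pf tracks far more (sequence windows, timeouts, per-interface state).

    ```python
    from dataclasses import dataclass, field


    @dataclass(frozen=True)
    class Packet:
        src: str
        sport: int
        dst: str
        dport: int
        flags: frozenset  # single-letter TCP flags, e.g. {"S"} or {"F", "A"}


    def pkt(src, sport, dst, dport, flags):
        """Build a packet; flags is a string such as "S", "A" or "FA"."""
        return Packet(src, sport, dst, dport, frozenset(flags))


    @dataclass
    class ToyFirewall:
        """Toy stateful filter with a single "pass any" rule (assumption)."""
        states: set = field(default_factory=set)

        @staticmethod
        def _key(p):
            # Direction-independent key so replies match the same state.
            return frozenset({(p.src, p.sport), (p.dst, p.dport)})

        def filter(self, p):
            key = self._key(p)
            if key in self.states:
                return "pass (existing state)"
            # The pass rule only creates a state for SYN without ACK.
            if "S" in p.flags and "A" not in p.flags:
                self.states.add(key)
                return "pass (new state)"
            return "block (default deny)"

        def expire(self, p):
            # Simulates the state being reaped, killed, or timed out.
            self.states.discard(self._key(p))


    fw = ToyFirewall()
    syn = pkt("192.168.180.10", 50000, "192.168.126.10", 389, "S")
    ack = pkt("192.168.180.10", 50000, "192.168.126.10", 389, "A")
    fin_ack = pkt("192.168.126.10", 389, "192.168.180.10", 50000, "FA")

    results = [fw.filter(syn), fw.filter(ack)]
    fw.expire(syn)                      # state removed before the close
    results.append(fw.filter(fin_ack))  # late FIN+ACK has no state to match
    for r in results:
        print(r)
    ```

    In this model the late FIN+ACK is blocked even though the only rule is "pass any", because it neither matches a state nor qualifies to create one.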



  • Jiimp, thank you for your help.

    Just to let you know, the problem was not related to pfSense but to a domain controller failing to respond to Active Directory Replication.



  • Well, problem is not actually solved.

    The hub firewall still randomly drops packets "by default rule" on IPSEC.
    Here you can see 2 different domain controllers (192.168.126.10 and 192.168.58.200) on remote sites trying to connect to a domain controller (192.168.126.10) in hub site.

    0_1540300953332_532e7fcb-2c1c-4bb1-9730-7fd9fc927221-image.png

    Despite the drops, AD replication works on all but one remote site, probably thanks to many retries. The pfSense on that remote site also experiences drops "by default rule" on IPSEC, and drops in both directions are probably too much to handle.

    The only "strange" thing both firewalls have in common is that the WAN interface is disabled. WAN connectivity is provided over one of the OPT ports. The default WAN port is no longer used and is disabled.

    Is this setup supported?


  • Rebel Alliance Developer Netgate

    Those are not new connections trying to reach that target.

    They are TCP:FA packets (FIN+ACK) meaning a connection was successful but is now being closed. The fact that it is logged probably means these final packets were sent after the state was removed by pf. Not sure why they held that packet back, but it was probably some kind of wonky attempt at connection re-use.
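
    For reading flag labels like the one above in the firewall log, a small decoder can help. This is only a sketch: `decode_flags` and `describe` are hypothetical helpers, the letters follow pf's single-letter flag notation, and the descriptions are rough rules of thumb rather than a definitive classification.

    ```python
    # pf's single-letter TCP flag notation.
    FLAG_NAMES = {
        "F": "FIN", "S": "SYN", "R": "RST", "P": "PSH",
        "A": "ACK", "U": "URG", "E": "ECE", "W": "CWR",
    }


    def decode_flags(label):
        """Turn a log label such as "TCP:FA" (or just "FA") into flag names."""
        letters = label.split(":", 1)[-1]
        return [FLAG_NAMES[c] for c in letters]


    def describe(label):
        """Rough interpretation of what a blocked packet most likely was."""
        names = set(decode_flags(label))
        if names == {"SYN"}:
            return "new connection attempt"
        if names == {"SYN", "ACK"}:
            return "handshake reply"
        if "RST" in names:
            return "connection reset"
        if "FIN" in names:
            return "connection teardown"
        return "packet belonging to an already established connection"


    for label in ("TCP:S", "TCP:SA", "TCP:FA", "TCP:A"):
        print(label, decode_flags(label), "->", describe(label))
    ```

    Only the first case (SYN alone) can create a new state under a pass rule; the others need a state that already exists.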



  • Thank you jiimp,

    but FA packets are not the only ones being dropped.

    0_1540304031470_a1c972ea-6f92-469d-9105-a960faa56df4-image.png

    These should be the last step in the 3-way handshake, shouldn't they?


  • Rebel Alliance Developer Netgate

    A simple ACK could be from any number of things. Though normally you wouldn't see those blocked unless the state was dropped, or unless there was some kind of asymmetric routing happening, where pfSense only sees half the connection packets. But in those cases you would normally have problems getting traffic to pass at all.

    It still seems like it's trying to reuse the connection in a non-standard way and the states were gone by the time those packets went through. Tough to tell for sure, though.



    The state count is low (below 2000 now) and asymmetric routing should not happen on IPSEC...

    Actually, I have connection issues now. I.e., I cannot connect from 192.168.180.10 to 192.168.126.10 (or vice versa) with remote desktop, but I can connect to both when coming from another IPSEC tunnel.
    And maybe tomorrow they will start to work again, and in a few days they will not connect anymore…

    Is this something you can look at under a Support Subscription?


  • Rebel Alliance Developer Netgate

    And there is no other VPN or path that packets could be taking between those subnets?

    The inconsistency is definitely odd, but the tunnel could be having an issue when rekeying or similar. That wouldn't usually show up in firewall logs, though.

    Check that none of your IPsec P1 entries have "Disable rekey" set. Stop and start the IPsec service (don't use the restart button, that does something different). Then see if it comes back to life.

    That is definitely something the support crew could help you track down.