State Sync and connection loss



  • Hi Everyone,

    We are building a new firewall cluster using pfSense. For now, we are testing the state sync feature and failover. What we notice is:

    • States are synchronized between firewalls

    • When the CARP VIP moves to the slave, client connections are lost

    To validate the failover procedure, we tried the following:

    • Reboot the master firewall

    • Temporarily disable CARP on the master

    • Put CARP in maintenance mode on the master

    Each time, the VIPs move correctly but the connections are dropped. I verified this with an ISO download (wget) and with an SSH connection between two zones.

    Any ideas on this issue?

    Thank you in advance



  • Have you reconfigured the outbound NAT to use the WAN CARP VIP instead of the WAN IP?



  • Thank you for the answer.
    I have not, and it's true that I did not think about that. I'm in "Hybrid Outbound NAT rule generation" mode. Do I have to use manual mode?
    However, I think that could only be part of the problem: I have a "multi-LAN" setup, and my SSH connection between two LANs (no WAN involved) is also getting dropped. Do you have an idea?



  • If you switch to Hybrid mode, the automatically generated rules will still persist, as far as I know. But anyway, if you stay in hybrid mode, manual mappings should take precedence over the automatic rules.

    But this setting doesn't affect the internal connections between two LANs where no NAT is in use, of course.



  • @viragomann:

    Thanks for the answer, I will look at this. But first I think we need to solve the LAN-to-LAN issue, which might be half of the LAN-to-WAN issue (the other half being the NAT mapping, as you said).

    Does anybody have an idea?

    Thanks!



  • I tested again with a proper outbound NAT rule using the WAN VIP (verified with tcpdump).

    Same issue when downloading an ISO over HTTP: "Connection reset by peer"

    So the problem is somewhere else.



  • OK, I think I've found the issue: the hardware is different on the two firewalls.

    The problem is that state sync seems to be based on the interface's physical name rather than its logical name. For example:

    Firewall1:
    SRV -> bge0
    LAN -> em1

    Firewall2:
    SRV -> em2
    LAN -> em1

    With this setup it does not work!

    I've also tested with VLAN interfaces, and there it seems to be OK.

    Is this normal behavior, or do we need to file a bug / enhancement request?

    //EDIT: Possible solution found here: https://forum.pfsense.org/index.php?topic=85245.0 -> The workaround is to create a fake lagg interface
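
    For anyone who wants to spot this before a failover test, here is a rough Python sketch that compares the interface assignments of two nodes and lists the logical interfaces whose physical names differ. The dictionaries are typed in by hand from the example above, and `find_mismatched_interfaces` is a made-up helper for illustration, not something pfSense provides.

```python
def find_mismatched_interfaces(primary, secondary):
    """Return {logical_name: (primary_nic, secondary_nic)} where the NICs differ."""
    mismatches = {}
    # Walk every logical interface known to either node.
    for logical in sorted(set(primary) | set(secondary)):
        nic_a = primary.get(logical)    # None if not assigned on this node
        nic_b = secondary.get(logical)
        if nic_a != nic_b:
            mismatches[logical] = (nic_a, nic_b)
    return mismatches


# Assignments copied from the example in this post.
firewall1 = {"SRV": "bge0", "LAN": "em1"}
firewall2 = {"SRV": "em2", "LAN": "em1"}

for logical, (nic_a, nic_b) in find_mismatched_interfaces(firewall1, firewall2).items():
    print(f"{logical}: {nic_a} on Firewall1 vs {nic_b} on Firewall2")
```

    Any interface it reports would be a candidate for the lagg workaround mentioned in the linked thread.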



  • Yes, the states are bound to the hardware interface name. As I remember, this behaviour was different in the past and changed with FreeBSD 10.1 and pfSense 2.2. Assigning a LAGG interface is a recommended workaround:
    https://doc.pfsense.org/index.php/Redundant_Firewalls_Upgrade_Guide#pfSense_2.2.x_and_pfsync

    But I would think this should only be an issue during a failover, because the synced states are not valid on the other pfSense.
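
    To illustrate why the synced states stop matching, here is a toy Python model. It assumes only what is described in this thread, namely that a state is matched together with the physical interface name. `StateKey`, `has_state` and the addresses are invented for the illustration and are not pf's real data structures.

```python
from typing import NamedTuple


class StateKey(NamedTuple):
    """Toy state entry: the interface name is part of the lookup key."""
    ifname: str
    proto: str
    src: str
    dst: str


def has_state(table, ifname, proto, src, dst):
    """True if a packet seen on `ifname` matches an existing state."""
    return StateKey(ifname, proto, src, dst) in table


# Hypothetical SSH flow; the addresses are made up.
flow = ("tcp", "10.0.1.10:40000", "10.0.2.20:22")

# State created on Firewall1 (SRV = bge0) and synced to Firewall2 as-is.
synced_states = {StateKey("bge0", *flow)}

# After failover the same flow arrives on Firewall2, where SRV = em2.
print(has_state(synced_states, "bge0", *flow))  # lookup as Firewall1 would do it
print(has_state(synced_states, "em2", *flow))   # lookup as Firewall2 would do it

# With the same lagg name on both nodes, the key is identical on each side.
lagg_states = {StateKey("lagg0", *flow)}
print(has_state(lagg_states, "lagg0", *flow))
```

    In this model the second lookup fails while the first and third succeed, which would fit the connections being reset only after the VIP moves.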