Netgate Discussion Forum

Port forwarded NAT TCP state disappearing during failover (SOLVED)

HA/CARP/VIPs
    adam65535
    Posted Apr 18, 2012, 3:42 PM (last edited Apr 18, 2012, 6:51 PM)

    I set up a new cluster install with the bare minimum of rules to test pfSense clustering for future deployments. During the test I connect from a client on the WAN to an external WAN CARP VIP that has a port forward to an internal LAN host. If I can get this figured out, I will be deploying this setup soon and will be purchasing support, btw.

    When I pull the WAN interface cable on the main firewall, interestingly I still have communication for about 24 seconds, but then the NAT state of the connection gets removed from the secondary for some reason, causing the connection to freeze. I looked at the main firewall, and the NAT state had been removed from the primary as well.

    The state table on the secondary firewall before pulling the WAN cable on the main firewall is below (stripped of other states). State is being synced to the secondary properly. Looks good.

    y.y.1.50 is the internal SSH server (port 22) on the LAN interface.
    x.x.136.204 is the CARP VIP on the WAN side of the firewall.
    x.x.136.73 is the client on the WAN making the connection.

    tcp y.y.1.50:22 <- x.x.136.204:22 <- x.x.136.73:51127 ESTABLISHED:ESTABLISHED
    tcp x.x.136.73:51127 -> y.y.1.50:22 ESTABLISHED:ESTABLISHED

    State table on the secondary firewall (and the primary too) about 24 seconds after pulling the cable on the primary:
    tcp x.x.136.73:51127 -> y.y.1.50:22 ESTABLISHED:ESTABLISHED

    The connection freezes about 24 seconds after I disconnect the WAN cable on the primary. The firewall shows that it is now blocking traffic from the client because the state for that connection is no longer there.
    The NAT state disappeared, but the other half of that connection is still there.
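A check like the one described above can be scripted. The following is a minimal sketch, assuming the state lines have exactly the format quoted in this post (protocol, addresses separated by arrows, then the state pair). The names `parse_state` and `diagnose` are hypothetical helpers for illustration, not part of pfSense, and the sample addresses are the masked ones from the post.

```python
from typing import List, NamedTuple, Tuple


class State(NamedTuple):
    proto: str
    endpoints: Tuple[str, ...]  # addresses in the order they are printed
    status: str


def parse_state(line: str) -> State:
    """Parse one state line in the format quoted in the post."""
    tokens = line.split()
    proto, status = tokens[0], tokens[-1]
    # Tokens between protocol and status alternate: address, arrow, address, ...
    endpoints = tuple(tokens[1:-1][0::2])
    return State(proto, endpoints, status)


def diagnose(lines: List[str], client: str, server: str) -> str:
    """Report which half of a port-forwarded connection is present."""
    states = [parse_state(line) for line in lines if line.strip()]
    # Translated entry: internal host <- VIP <- client (three addresses).
    forwarded = any(
        len(s.endpoints) == 3
        and s.endpoints[0] == server
        and s.endpoints[2] == client
        for s in states
    )
    # Untranslated entry: client -> internal host (two addresses).
    plain = any(s.endpoints == (client, server) for s in states)
    if forwarded and plain:
        return "ok"
    if plain:
        return "port-forward state missing"
    if forwarded:
        return "plain state missing"
    return "no state"


# Sample data modeled on the two snapshots in the post.
before = [
    "tcp y.y.1.50:22 <- x.x.136.204:22 <- x.x.136.73:51127 ESTABLISHED:ESTABLISHED",
    "tcp x.x.136.73:51127 -> y.y.1.50:22 ESTABLISHED:ESTABLISHED",
]
after = [
    "tcp x.x.136.73:51127 -> y.y.1.50:22 ESTABLISHED:ESTABLISHED",
]

print(diagnose(before, "x.x.136.73:51127", "y.y.1.50:22"))
print(diagnose(after, "x.x.136.73:51127", "y.y.1.50:22"))
```

Running the same check against snapshots taken every few seconds on both nodes would show when the translated entry vanishes relative to the cable pull.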

    If I establish a new connection through the secondary it connects just fine.
    When I plug the cable back into the main firewall, the NAT state does not disappear on the new connection, so the failover back to the primary works without dropping the port-forwarded connection from the WAN. This main-to-secondary and secondary-to-main failover behavior is repeatable every time.

    I have a two-node firewall cluster with a main and a backup on 2 Dell PowerEdge 1950 servers. VIPs are on the WAN and LAN, with a dedicated sync interface. Sync is working from primary to secondary (rules, VIPs, state, etc.). State sync is also working from secondary to primary. I am actually using 2 virtual IPs on the WAN. One is the cluster IP of the firewall itself (not used for port forwarding). The second (x.x.136.204) is what I am using to port forward traffic to an internal private IP. Each VIP has a different VHID group.

    Can anyone think of a condition that would cause the primary to remove the port-forwarded NAT state and replicate that to the secondary, or cause both of them to remove it during a main-to-backup failover?

    Installed using pfSense-2.0.1-RELEASE-amd64.iso.gz, btw, using the SMP kernel on SCSI RAID mirrored hard drives.

      adam65535
      Posted Apr 18, 2012, 6:53 PM (last edited Apr 18, 2012, 6:56 PM)

      Problem solved. After finding release notes that mention a gateway monitoring option to disable state clearing, I found the option below.

      System -> Advanced -> Miscellaneous (the bottom option):

      Gateway Monitoring
      States

      • By default the monitoring process will flush states for a gateway that goes down. This option overrides that behavior by not clearing states for existing connections.

      The default behavior (flushing states when a gateway goes down) is definitely not something you want for a cluster HA solution. I don't see anything stopping deployment now, with some more testing.
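To make the effect of that option concrete, here is a toy model in Python. It is only a sketch of the behavior the option text describes, not pfSense's actual implementation. In particular, tagging each state with an interface and flushing by interface is an assumption made for illustration (it happens to match what the state tables earlier in the thread showed), and `on_gateway_down` and `keep_states` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class State:
    interface: str   # assumption: each state belongs to one interface
    description: str


def on_gateway_down(states: List[State], gateway_interface: str,
                    keep_states: bool = False) -> List[State]:
    """Return the states that survive a gateway-down event.

    keep_states=False models the default (flush states for the failed
    gateway); keep_states=True models the override described in the
    option text (existing connections keep their states).
    """
    if keep_states:
        return list(states)
    return [s for s in states if s.interface != gateway_interface]


table = [
    State("wan", "port-forward state: client -> VIP -> SSH server"),
    State("lan", "plain state: client -> SSH server"),
]

default_result = on_gateway_down(table, "wan")
override_result = on_gateway_down(table, "wan", keep_states=True)
print(len(default_result), len(override_result))
```

In this model the default leaves only the LAN-side entry, which is the half-open situation seen after the cable pull, while the override keeps both entries so the synced connection can continue on the other node.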
