Issues using NAT source-hash on a /29 (2.2.6)

fredronn

I'm developing a NAT setup based on pfSense for a larger wireless installation, and I'm currently rolling out a limited test environment for technical users. The install is 2.2.6-RELEASE-pfSense.

The inside network is a 10.X.0.0/16 network, where I've set up 10.X.0.1 and 10.X.0.2 as CARP addresses on both servers, with each server being the master of one of them. Each server runs a DHCP server that handles a /18 and sets the default route to the 10.X.0.Y IP it is master of. This is a rough and easy way of making sure each server handles about 50% of the traffic. Lease times are one day, so if a server goes down, the other will take over its 10.X.0.Y IP and handle all traffic. Clients are not expected to stay connected for longer than a day.

On the outside I've dedicated 8 IPs, a /29, as the outside translation addresses, and I've tried using source-hash to ensure that each inside client always translates to the same outside address. However, it's not working properly: clients are NATed to an outside address in a round-robin fashion, even with the source-hash directive. A few web apps that our clients use depend on the origin address staying the same during the session, so round-robin NAT is not ideal.

The 8 addresses are shared among the servers using CARP, where each server is the master of 4 of them. During normal operation each server has a NAT rule that translates to its /30 (4 addresses). I've also recently tried having the same NAT rule for the whole /29 on both servers.
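To make the address split concrete, here is a minimal Python sketch of how a /29 divides into the two /30 halves described above. The pool 198.51.100.8/29 is a placeholder from the documentation range standing in for A.B.C.D/29, and the server names are hypothetical.

```python
import ipaddress

# Placeholder pool (documentation range); stands in for A.B.C.D/29.
pool = ipaddress.ip_network("198.51.100.8/29")

# Split the /29 into its two /30 halves, one per server.
halves = list(pool.subnets(prefixlen_diff=1))

# Hypothetical server names, used only for illustration.
for server, half in zip(("server1", "server2"), halves):
    addresses = [str(a) for a in half]
    print(server, half, addresses)
```

Each half contains 4 addresses, all of which are usable as NAT translation targets because they are not assigned as a subnet on an interface.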

I also have a dedicated pfsync interface (lagg0 2x1G LACP).

I'm not sure why source-hash doesn't seem to work properly, so I'm throwing this out to the community to see if anyone else has run into any gotchas with source-hash. Maybe I'm missing something obvious.

Here's the generated NAT rule and user rule (from /tmp/rules.debug). There are very few firewall/NAT rules.


# Outbound NAT rules (manual) 
nat on $NAT_OUT from 10.X.0.0/16 to any -> A.B.C.D/29 port 1024:65535 source-hash
[...snip...]
pass in log quick on $NAT_IN inet from 10.X.0.0/16 to any tracker 1457101961 keep state label "USER_RULE: Default allow LAN IPv4 to any rule"

fredronn

After digging through this over the last few days, I have come to the conclusion that the source-hash pool option needs its optional key to be set in order to provide consistent hashing. Unfortunately this isn't available in the pfSense UI. However, you can specify a custom value by changing config.xml:


<poolopts>source-hash 0x2fc76c65e927fcf98f56743d776747cc</poolopts>

Unless a key is specified, this value is randomly generated every time pf is reloaded, so if you need consistent hashing you have to provide it. For our setup it is absolutely crucial that both servers use the same key.
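To illustrate why the key matters, here is a minimal Python sketch of keyed source hashing. It is not pf's actual hash algorithm; it uses BLAKE2b from the standard library purely to show the effect. The function name pick_outside_address, the pool 198.51.100.8/29 (a documentation-range placeholder for A.B.C.D/29), and the client addresses are assumptions made for the example.

```python
import hashlib
import ipaddress
import os

def pick_outside_address(client_ip: str, pool_cidr: str, key: bytes) -> str:
    """Map a client to one pool address with a keyed hash.

    Illustrative only: pf uses its own hash function, not BLAKE2b.
    """
    pool = list(ipaddress.ip_network(pool_cidr))
    packed = ipaddress.ip_address(client_ip).packed
    digest = hashlib.blake2b(packed, key=key, digest_size=8).digest()
    return str(pool[int.from_bytes(digest, "big") % len(pool)])

POOL = "198.51.100.8/29"  # placeholder for A.B.C.D/29

# The same key on both servers: every client maps to the same address on each.
shared_key = bytes.fromhex("2fc76c65e927fcf98f56743d776747cc")
client = "10.0.0.57"  # hypothetical inside client
on_server1 = pick_outside_address(client, POOL, shared_key)
on_server2 = pick_outside_address(client, POOL, shared_key)
print(on_server1 == on_server2)

# A fresh random key per reload (or per server): mappings are no longer stable.
key_a, key_b = os.urandom(16), os.urandom(16)
clients = [f"10.0.0.{i}" for i in range(1, 255)]
moved = sum(
    pick_outside_address(c, POOL, key_a) != pick_outside_address(c, POOL, key_b)
    for c in clients
)
print(f"{moved} of {len(clients)} clients changed outside address")
```

With a fixed key the mapping depends only on the client address, so it survives reloads and is identical on both servers. With a new random key, most clients land on a different outside address, which from the outside can look much like round-robin.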

I will also add that, in order not to be limited by the maximum number of VHIDs, we opted to configure only one CARP address per server on the outside. We then split the NAT CIDR range on the outside router with static routes to each CARP VIP, and those routes are redistributed into our infrastructure using OSPF.

I have submitted a pull request to the pfSense GitHub repository with some web UI changes: https://github.com/pfsense/pfsense/pull/2743