UDP NAT problem: random NAT bug?

it_wyp

Hi,

First of all, thanks to the pfSense team for offering such a good product free of charge.
I have been using pfSense in a professional context (CARP, multiple VLANs, multiple WANs…) for two years now and I'm very satisfied (I used to work with quite expensive products that had more bugs!). But this time I have an issue I can't resolve, so here I am.
My pfSense version is 2.0.2, running on two dedicated servers.

I use 1:1 NAT for a very specific TCP and UDP stream (an L2 VPN). This stream must use a dedicated WAN, which differs from my default route.

Once the VPN is established between my server and my client, there is a continuous flow of UDP packets between them (the L2 packets).

Everything works perfectly for some time (the duration seems random), but after a while my UDP stream "loses" its NAT and switches to my regular WAN (I assume because it's my default gateway).

A tcpdump shows non-NATed packets (LAN address) on my regular WAN interface. (I'm using manual outbound NAT, but this should not be a problem because I'm using 1:1 NAT for this case.)
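
For anyone who wants to spot the same symptom in a capture, here is a minimal Python sketch that filters `tcpdump -n` style text lines and keeps those whose source address is in an RFC 1918 range. The helper names (`is_rfc1918`, `leaked_sources`) and the sample lines are hypothetical, and the addresses are documentation placeholders, not the ones from this setup. It assumes IPv4 and the usual one-line-per-packet output.

```python
import ipaddress
import re

# RFC 1918 private ranges; a source address from these on a WAN
# interface suggests the packet left without being NATed.
RFC1918 = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

# Matches "IP <src>[.<port>] > " as printed by `tcpdump -n` for IPv4.
LINE_RE = re.compile(r"\bIP (\d+\.\d+\.\d+\.\d+)(?:\.\d+)? > ")


def is_rfc1918(addr):
    """Return True if addr (dotted-quad string) is an RFC 1918 address."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918)


def leaked_sources(lines):
    """Return the capture lines whose source address is private."""
    leaked = []
    for line in lines:
        match = LINE_RE.search(line)
        if match and is_rfc1918(match.group(1)):
            leaked.append(line)
    return leaked


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from a real capture.
    sample = [
        "12:00:00.000001 IP 192.168.10.5.5555 > 198.51.100.7.5555: UDP, length 120",
        "12:00:00.000002 IP 203.0.113.9.5555 > 198.51.100.7.5555: UDP, length 120",
    ]
    for line in leaked_sources(sample):
        print(line)
```

The ranges are listed explicitly instead of relying on `ipaddress`'s `is_private`, because that attribute also covers other reserved blocks, not just LAN-style addresses.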

If I restart my VPN server, or just kill the related UDP entry in the pfSense state table, my VPN works again for some time.
I assume that restarting pfSense does the trick too, but that's not a real solution, is it? ;)

Has anyone experienced this kind of UDP NAT problem and solved it cleanly?

Thanks for your help.

user183

A tcpdump shows non-NATed packets (LAN address) on my regular WAN interface

This sounds very similar to my problem!
The LAN address also occasionally appears on the WAN interface when I watch it with tcpdump.

Please see my post earlier this week.
http://forum.pfsense.org/index.php/topic,62519.0.html

it_wyp

Thanks for responding.

I'm not sure, but I think this is a different problem: you say you can't reproduce your bug on your secondary firewall, while mine is reproducible on the secondary. My TCP sessions don't seem to have problems either.

It's hard to say whether this is relevant. Maybe the "NAT bug" is just the default behavior when pfSense doesn't know what to do, so it may not be directly related to the root cause.

user183

I can now reproduce it on the secondary.
It happens very rarely on the secondary, but it still happens.

So, now we have two users with the same NAT problem.
You are also using pfSense 2.0.2 like I am.
Also, I have been using pfSense in production since 2010.

jimp

Sounds like this:
http://redmine.pfsense.org/issues/958

it_wyp

Just upgraded to 2.0.3: same problem.

Indeed, it seems related to this old issue, but I'm not using floating rules.

PS: jimp, thanks for writing the pfSense guide, excellent book ;)

jimp

Interface group rules could also cause that. Or if your WAN or WAN2 don't have a gateway selected. Or if you've somehow otherwise disabled reply-to.

it_wyp

Interface group rules could also cause that.
I don't use them either.

If your WAN or WAN2 don't have a gateway selected.
A gateway is selected on both WANs.

If you've somehow otherwise disabled reply-to.
I don't see how to do that. Could you please explain how to check that it is enabled?

However, the old bug seems to be reproducible every time (as far as I understand, the SYN-ACK always goes out on the wrong interface).
In my case, it actually works for some time before giving weird results.
I just need to kill the states to temporarily fix the problem.

That's why I don't think something is "disabled"; it really looks like a bug.

BTW, thanks for taking the time to help me.

user183

"reply-to" is in the System -> Advanced -> Firewall and NAT menu

it_wyp

Thanks.

The box isn't checked, so I assume it's not disabled.

(FYI, I have tried setting the Firewall Optimization Options to conservative, but got the same results.)

dhatz

@it_wyp:

Just upgraded to 2.0.3 : same problem.

Could you check with pfSense 2.1?

Btw Firewall Optimization Options => conservative only increases the state timeouts for TCP & UDP. It would be handy if you wanted to keep a UDP NAT state alive across long gaps between "ping" packets. You can check your system's values with pfctl -st
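
To make it easier to compare the UDP timeouts before and after changing the optimization mode, here is a rough Python sketch that pulls them out of text in the shape `pfctl -st` prints. The sample values and the helper names (`parse_timeouts`, `udp_timeouts`) are assumptions for illustration only; paste the real output from your own firewall in place of `SAMPLE`.

```python
import re

# Illustrative text in the shape of `pfctl -st` output. The values are
# assumptions, not taken from the firewalls discussed in this thread.
SAMPLE = """\
tcp.first                   120s
tcp.established           86400s
udp.first                    60s
udp.single                   30s
udp.multiple                 60s
interval                     10s
"""

# A timeout line looks like "<proto>.<name>   <seconds>s".
TIMEOUT_RE = re.compile(r"^\s*([a-z]+\.[a-z_.]+)\s+(\d+)s\s*$")


def parse_timeouts(text):
    """Return {name: seconds} for every dotted timeout line in text."""
    timeouts = {}
    for line in text.splitlines():
        match = TIMEOUT_RE.match(line)
        if match:
            timeouts[match.group(1)] = int(match.group(2))
    return timeouts


def udp_timeouts(text):
    """Keep only the udp.* entries, which govern UDP state lifetime."""
    return {
        name: seconds
        for name, seconds in parse_timeouts(text).items()
        if name.startswith("udp.")
    }


if __name__ == "__main__":
    for name, seconds in sorted(udp_timeouts(SAMPLE).items()):
        print(f"{name}: {seconds}s")
```

If the gap between packets of the stream can exceed the relevant udp.* value, the state would expire and the next packet would be evaluated as a new connection.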

jimp

We'll need to see the full /tmp/rules.debug to tell much more.

it_wyp

Could you check with pfsense 2.1

I'm sorry, but my firewalls are in a production environment; I can't use beta versions, as any development bug would have a major impact.
If this is the only way to investigate, I will have to build a test case in a lab, but I don't know when.

Btw Firewall Optimization Options => conservative only increases the state timeouts for TCP & UDP.
I suspect a malfunction in the way UDP sessions are handled.
As you certainly know, UDP isn't really stateful, so pfSense has to work with "imperfect" sessions.
I was assuming that pfSense (after some time) considered my UDP stream a new one and treated it differently (in this case, without NAT and on the wrong interface).
As TCP has no problem, I thought this was a plausible explanation; that's why I tried the conservative mode. It seems I was wrong.

With pfctl -st, I will check whether my "random problem" becomes more reproducible, thanks!

I will send you /tmp/rules.debug as soon as possible (would a PM be OK?)

it_wyp

I've just checked the file content. I'm sorry, but /tmp/rules.debug contains way too much private data; I'm sure you will understand that I can't send it to anyone without a serious NDA.

So that you can investigate properly, I will try to reproduce my problem in a lab. I'll come back to this topic as soon as possible.

Sorry for the delay.