Test Request: UPnP Fix for Multiple Consoles playing the same game / static port outbound NAT
-
Thanks to everyone in this thread. With the help of these posts I was able to get CoD WWII at least working with 2 PCs, using the non-uPNP method @hansaya posted (I have 2 open NATs, and they can play together on a custom game, with other friends joining from the internet). Sadly that technique doesn't work for Black OPS III. I'm on Pfsense 2.5.2, miniupnpd 2.2.1.
Using just uPNP with no manual outbound nats/port forwards works on only one PC (Open NAT, but the second cannot establish any connection at all).
So I am available to help test fixes as they become available as well. Looks like the thread ended with the last dev snapshot still not working. Is it still being looked at by the devs?
-
I spent more time testing this weekend, and narrowed things down a bit for the second PC in a call of duty WWII setup, where UDP 3074 seems to be the port of contention. (Reminder, first PC connects fine, gets Open NAT inside the game, second PC cannot connect at all).
I started by pcapping all the upnp control traffic. It starts off by asking for 3074, which is already taken by the first PC. The server responds correctly with an error code 718, and then the DemonwarePortMapping service on the PC asks for a new random port, usually in the 31xx range. The upnp server responds successfully, adds the mapping and I can see it in all 3 places: the pfsense gui, the output of the pfanchordrill, and on the windows PC in the settings of the DemonwarePortMapping entries on the upnp router (in the network section of file explorer). This cycle repeats forever at this point, and the client will not connect. Each time it asks for another port, and obtains it successfully. The list of ports in the uPNP GUI fills up indefinitely.
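The negotiation described above can be sketched as a toy model. This is only an illustration, not miniupnpd or the game client: `ToyIgd` and `request_mapping` are hypothetical names, and the 3100-3199 retry range is an assumption based on the "31xx" ports seen in the capture. The only detail taken from the UPnP IGD specification is that error 718 means ConflictInMappingEntry.

```python
import random

CONFLICT_IN_MAPPING_ENTRY = 718  # UPnP IGD error code seen in the capture


class ToyIgd:
    """Hypothetical stand-in for the router's UPnP port-mapping table."""

    def __init__(self):
        # (protocol, external_port) -> (internal_ip, internal_port)
        self.mappings = {}

    def add_port_mapping(self, proto, ext_port, int_ip, int_port):
        """Return 0 on success, 718 if another host already owns the port."""
        key = (proto, ext_port)
        owner = self.mappings.get(key)
        if owner is not None and owner[0] != int_ip:
            return CONFLICT_IN_MAPPING_ENTRY
        self.mappings[key] = (int_ip, int_port)
        return 0


def request_mapping(igd, int_ip, int_port=3074, rng=None, max_tries=10):
    """Ask for the preferred port first, then retry with random 31xx ports."""
    rng = rng or random.Random()
    ext_port = int_port
    for _ in range(max_tries):
        if igd.add_port_mapping("udp", ext_port, int_ip, int_port) == 0:
            return ext_port
        ext_port = rng.randint(3100, 3199)  # assumed retry range
    return None


igd = ToyIgd()
first = request_mapping(igd, "192.168.7.2")   # gets 3074
second = request_mapping(igd, "192.168.7.3")  # 718 on 3074, then a 31xx port
print(first, second, sorted(igd.mappings))
```

In this model the second PC does obtain a mapping, which matches what the capture shows. The endless retries in the real setup therefore point at something after the mapping step, not at the UPnP exchange itself.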
So then I opened 2 ssh sessions to the firewall, did a pcap on the LAN and WAN at the same time, then merged them in wireshark. This allows me to see the packet as it left the host, and the packet as it leaves the wan interface, with timecodes to correlate everything.
Interestingly enough, despite the correct NAT/RDR pair in place, absolutely no packets leave on the wan. They come in from the client with src 3074/dest 3074, but never go out on the internet with the NATed src port. In fact nothing comes out. I have no firewall blocks at all. That explains why WWII just keeps asking for new ones as each attempt times out.
So what would cause the firewall not to NAT and transmit outbound on the WAN, despite the correct NAT/RDR pairing in place?
Is there a debug log to check that might show why it isn't NATing and forwarding? There's nothing in routing.log, where miniupnpd writes its entries.
Hoping this sheds some light on the true nature of the issue.
-
Did a bit more troubleshooting, this time using pflog which is a bit easier to use in real time. X.X.X.X is my public ip masked out. 192.168.7.2 is the 1st PC that gets Open NAT. 192.168.7.3 is the second PC that will not connect at all. igb0 is my WAN interface.
Here are the nat/rdr pairs in place according to the pfanchordrill:
nat log quick on igb0 inet proto udp from 192.168.7.2 port = 3074 to any keep state label "DemonwarePortMapping" rtable 0 -> X.X.X.X port 3074
nat log quick on igb0 inet proto udp from 192.168.7.3 port = 3074 to any keep state label "DemonwarePortMapping" rtable 0 -> X.X.X.X port 3182
rdr pass log quick on igb0 inet proto udp from any to any port = 3074 keep state label "DemonwarePortMapping" rtable 0 -> 192.168.7.2 port 3074
rdr pass log quick on igb0 inet proto udp from any to any port = 3182 keep state label "DemonwarePortMapping" rtable 0 -> 192.168.7.3 port 3074
Looks perfect so far.
Here's what pflog shows happening with the UDP packets for the first PC, which gets Open NAT in game:
18:20:06.661452 rule 732/0(match): pass in on lan: 192.168.7.2.3074 > 185.34.107.128.3074: UDP, length 3
18:20:06.661471 rule 116/0(match): pass out on igb0: X.X.X.X.3074 > 185.34.107.128.3074: UDP, length 3
18:20:06.885506 rule 0/0(match): rdr in on igb0: 185.34.107.129.3075 > 192.168.7.2.3074: UDP, length 15
Again, the above is exactly what it should be.
Now PC 2 tries to join the game, and here's the issue:
18:31:45.791654 rule 732/0(match): pass in on lan: 192.168.7.3.3074 > 185.34.107.128.3074: UDP, length 3
18:31:45.791668 rule 115/0(match): pass out on igb0: 192.168.7.3.3074 > 185.34.107.128.3074: UDP, length 3
Note that the NAT failed to assign the Firewall interface IP to the Source IP. It straight up routed the packet as-is (with an RFC 1918 private IP). It also did not replace the source port with 3182 as it should have. (EDIT)
Of course the upstream router will get that packet and toss it. The pfsense will never actually route this packet as it is in the wrong subnet for that interface. That would be why in my previous tests, I didn't see them in the packet capture. (END EDIT)
I am puzzled as to how that could happen. The second PC is falling right through the NAT rules, despite a manual catch all outbound NAT I have defined. That rule obviously works or I would have no functional internet traffic at all.
I have to say, this looks like a PF issue to me, not miniupnpd.
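For anyone comparing their own pflog output, the check done by eye above can be roughly automated. This is a minimal sketch that assumes log lines in the same tcpdump format as the ones quoted in this post and a WAN that normally carries a public source address. The function name `untranslated_packets` is made up for illustration. Note that Python's `is_private` covers more than the RFC 1918 ranges, and lines with a masked `X.X.X.X` address are simply skipped.

```python
import ipaddress
import re

# Matches lines such as:
#   ... pass out on igb0: 192.168.7.3.3074 > 185.34.107.128.3074: UDP, length 3
LINE_RE = re.compile(
    r"pass out on (?P<iface>\S+): "
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+):"
)


def untranslated_packets(lines, wan_iface="igb0"):
    """Return (src, sport, dst, dport) for packets leaving the WAN with a private source."""
    hits = []
    for line in lines:
        m = LINE_RE.search(line)
        if not m or m["iface"] != wan_iface:
            continue  # not an outbound WAN line, or the address is masked
        if ipaddress.ip_address(m["src"]).is_private:
            hits.append((m["src"], int(m["sport"]), m["dst"], int(m["dport"])))
    return hits


sample = [
    "18:20:06.661471 rule 116/0(match): pass out on igb0: X.X.X.X.3074 > 185.34.107.128.3074: UDP, length 3",
    "18:31:45.791668 rule 115/0(match): pass out on igb0: 192.168.7.3.3074 > 185.34.107.128.3074: UDP, length 3",
]
for hit in untranslated_packets(sample):
    print("left WAN without NAT:", hit)
```

If the WAN interface itself has a private address (double NAT), every outbound packet would be flagged, so the check is only meaningful on a WAN with a public address.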
If there is anyone out there still listening, it would be great if you could try this and see what you get.
To use pflog, you can type this on an ssh shell:
service pflog onestart
tcpdump -eni pflog0 | grep 3074
Likewise, to stop pflog use:
service pflog onestop
-
@encrypt1d Great work. We are listening, hopefully the PFSENSE team is listening and can review your work.
-
That in itself isn't a pf issue really. It's what happens when there is a port conflict.
There can only be one NAT state for <wan addr>:<source port> -> <destination addr>:<destination port>. If there is already a state for that and something else comes along and tries to start another connection that would end up using the same external source port and destination, pf can't create a state for the second one because it already exists, but the firewall rules have passed the traffic, so it leaves without NAT.
What that does tell me is that in your case they are both matching some other outbound NAT rule that is set to use static port, not the rules you printed above, but it does give me an idea.
Try applying this change, then either reboot or do a filter reload then reset states:
diff --git a/src/etc/inc/filter.inc b/src/etc/inc/filter.inc
index d36d6df2e2..5a7c21bc2a 100644
--- a/src/etc/inc/filter.inc
+++ b/src/etc/inc/filter.inc
@@ -2091,6 +2091,8 @@ function filter_nat_rules_generate() {
 $natrules = "no nat proto carp\n";
 $natrules .= "no rdr proto carp\n";
+$natrules .= "binat-anchor \"miniupnpd\"\n";
+$natrules .= "nat-anchor \"miniupnpd\"\n";
 $natrules .= "nat-anchor \"natearly/*\"\n";
 $natrules .= "nat-anchor \"natrules/*\"\n\n";
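To illustrate the reasoning behind this change, here is a toy model of the two behaviours involved. It is a sketch and not pf itself. It assumes that the first matching NAT rule decides the translation and that only one state may exist per <wan addr>:<source port> -> <destination addr>:<destination port>. `ToyPfNat`, the rule tuples and the WAN address 198.51.100.7 are all made up for the example.

```python
class ToyPfNat:
    """Toy model: first matching NAT rule wins, one state per external 4-tuple."""

    def __init__(self, wan_addr, rules):
        self.wan_addr = wan_addr
        # Ordered rules: (lan_addr or None, lan_port or None, ext_port or None).
        # None matches anything; ext_port None means "static port" (keep source port).
        self.rules = rules
        self.states = {}

    def _external_port(self, lan_addr, lan_port):
        for rule_addr, rule_port, ext_port in self.rules:
            if rule_addr in (None, lan_addr) and rule_port in (None, lan_port):
                return lan_port if ext_port is None else ext_port
        return None

    def send(self, lan_addr, lan_port, dst_addr, dst_port):
        """Return the (source addr, source port) the packet leaves the WAN with."""
        ext_port = self._external_port(lan_addr, lan_port)
        if ext_port is None:
            return (lan_addr, lan_port)  # no NAT rule matched
        key = (self.wan_addr, ext_port, dst_addr, dst_port)
        owner = self.states.setdefault(key, (lan_addr, lan_port))
        if owner != (lan_addr, lan_port):
            return (lan_addr, lan_port)  # state conflict: leaves without NAT
        return (self.wan_addr, ext_port)


STATIC_PORT_CATCH_ALL = (None, None, None)
UPNP_PC1 = ("192.168.7.2", 3074, 3074)
UPNP_PC2 = ("192.168.7.3", 3074, 3182)
WAN = "198.51.100.7"  # hypothetical public address
SERVER = ("185.34.107.128", 3074)


def second_pc_source(rules):
    """Send from PC 1, then PC 2, and report how PC 2's packet leaves the WAN."""
    nat = ToyPfNat(WAN, rules)
    nat.send("192.168.7.2", 3074, *SERVER)
    return nat.send("192.168.7.3", 3074, *SERVER)


print(second_pc_source([STATIC_PORT_CATCH_ALL]))
print(second_pc_source([UPNP_PC1, UPNP_PC2, STATIC_PORT_CATCH_ALL]))
```

With only the static-port rule in play, the second PC collides with the first PC's state and its packet leaves untranslated, as in the pflog output earlier in the thread. With the UPnP rules consulted first, it is translated to port 3182.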
-
@jimp
Very promising! I just tested, and I think that change will work for everyone, except me. I can see the packets getting translated now. So that might be the fix for everyone else, but sadly I cannot completely verify.
I hadn't brought this up yet to avoid mixing too many issues. I am unfortunately behind an ISP that gives me a 10.x.x.x address on my firewall. I have read all about the changes made a while back in miniupnpd that broke double NAT. My ISP gives me a dedicated Public IP, and forwards everything fully with no translations other than the public IP to my 10.x WAN IP. (paid extra for that).
What I see now is it using the public IP instead of my WAN interface IP. I have the ext_ip=x.x.x.x line in my miniupnpd config file, without which it will never even add nat/rdr pairs. I tried using STUN, but then miniupnpd says via a debug message that it is impossible to forward under that config.
So I think I have had 2 issues all along. The first being those missing lines from the filter.inc file, and now I've moved to the 10.x address on my WAN interface.
Thought I was almost there. Is there a way to get miniupnp to use 10.x addresses for the NAT rules? From what I've read, it seems like no, so I might just be SOL. If there is an easy answer to that, I'd appreciate it. If I need to open another discussion over at miniupnp I'll follow that.
Their default config has an option to revert the IGD behaviour:
https://github.com/miniupnp/miniupnp/blob/master/miniupnpd/miniupnpd.conf
force_igd_desc_v1=yes
Alas my version says this option is unrecognized. Not sure if it would even help.
I encourage everyone else behind a 'normal' ISP to give @jimp's change a try.
vi /etc/inc/filter.inc
*Make the changes
/etc/rc.filter_configure
Then from the GUI, reset the states under the diagnostics menu.
-
You can install the System Patches package and then create an entry for the diff I posted to apply the fix without manually editing files.
That said, being behind double NAT will be a problem for anyone in the same situation. Do they maintain static port on your outbound NAT even? Seems like most CGNAT type places won't do that. Some time ago, miniupnpd decided that it wouldn't run with private addresses. I'm still not sure why they won't give us a way to override that, as it also makes testing it internally more difficult.
-
I am trying my luck on a pull request that a user was working on that was aimed at fixing this double NAT issue, I think.
https://github.com/miniupnp/miniupnp/pull/565
-
@encrypt1d said in Test Request: UPnP Fix for Multiple Consoles playing the same game / static port outbound NAT:
I am trying my luck on a pull request that a user was working on that was aimed at fixing this double NAT issue, I think.
https://github.com/miniupnp/miniupnp/pull/565
Hopefully they do pull in something like that but I can't believe they won't just add a knob to disable all those checks. There are plenty of people with private addresses on WAN that know full well what they're doing with upstream NAT and it would work if they'd stop forcing the current behavior.
-
@jimp
I couldn't agree more. Options options options.
Let the network admin configure it the way they want.
We'll see what comes of it.
-
I've also added my findings to a note on Redmine and set up a merge request to get the change into development snapshots. It's too late for 22.01/2.6.0 but once we're done with the release work, we'll get this into dev snapshots for easier testing.
-
@jimp Thank you!! Fixed for me on my plain old ipv4 cable internet with PFSense.
I was able to disable every single nat/rule I had set up for one particular game, and both PC and PS5 displayed "open NAT", automatically. Reverted the patch, reloaded filters, reset firewall state, and went back to "strict NAT", applied patch again, reloaded, reset, back to "open NAT". I've been so accustomed to doing this, that I wonder how many additional rules I could potentially delete for which the software supports UPnP!
It will be so nice to delete the many things I have in there, making it easier to see what I have set up, not to mention no more frustrations from my nephews ("how long is this going to take??") every time I have to reconfigure something so we can play games together online.
Thank you @encrypt1d and @jimp for getting this resolved!
-
@jon8rfc For me the old days of simple NAT rules were easier.
I have found this era of supposedly automatic port forwarding problematic.
Not to mention that when I looked up the documentation, the thing apparently needs more than 10 ports open to make it all work on modern consoles. I don't know how it all got so messy.
I also had to disable upnp a few weeks back when I was getting broadcast storms from the internet taking down my lan entering via upnp.
-
@jon8rfc I definitely would not thank @jimp .
If anything; he just admitted it's not a UPNP issue and it's been an issue under their control the whole time for years on end without them doing anything.
(Even people have been complaining for years and he even asked me go to the UPNP dev years ago claiming it was a upnp issue; for apparently; no reason.)
Happy you guys finally solved something a standard router out of the box could do since its inception.
Most of us have moved on by now.
-
@firetop said in Test Request: UPnP Fix for Multiple Consoles playing the same game / static port outbound NAT:
(Even people have been complaining for years and he even asked me go to the UPNP dev years ago claiming it was a upnp issue; for apparently; no reason.)
miniupnpd could not add the NAT rules needed in pf until recently. That only changed in a release in the last year or so. We couldn't do anything until that was done. After that was in place, it wasn't until a day or two ago that someone with an appropriate setup and diagnostic skills was able to get us the information we needed to know what else might be wrong.
-
@jimp said in Test Request: UPnP Fix for Multiple Consoles playing the same game / static port outbound NAT:
it wasn't until a day or two ago that someone with an appropriate setup and diagnostic skills was able to get us the information we needed to know what else might be wrong.
This must include yourselves then? I'll agree I'll thank @encrypt1d; but lets face it as I already said most of us have moved on to working solutions already without this being fixed.
The fact is this reeks of "idk what to do to fix it; so I won't do anything" from the PFSense team; and that's just the worst kind of thinking to have for a project like this. It's really not that hard to setup a few consoles or PCs behind a PFsense firewall to test this "appropriate setup". So the only thing left is from what you said is that the PFSense team didn't have the diagnostic skills and left it for years on end. (Sad)
Even if UPNP did need a change; the fact that you asked a community member to go do a report for the UPNP dev instead of you or your team when you had an open issue for 2 years prior with no action; is still a sad fact to this day; I hope this helps the PFsense team reflect on how they deal with issues like this in the future.
-
@firetop said in Test Request: UPnP Fix for Multiple Consoles playing the same game / static port outbound NAT:
This must include yourselves then? I'll agree I'll thank @encrypt1d; but lets face it as I already said most of us have moved on to working solutions already without this being fixed.
Reproducing and confirming things required two identical consoles (that support UPnP) running the exact same game (that also needs UPnP). None of the developers or TAC crew here have a setup like that.
As for how many people may have given up on pfSense in that time, it's unlikely to have been as many as you suggest.
The fact is this reeks of "idk what to do to fix it; so I won't do anything" from the PFSense team; and that's just the worst kind of thinking to have for a project like this. It's really not that hard to setup a few consoles or PCs behind a PFsense firewall to test this "appropriate setup". So the only thing left is from what you said is that the PFSense team didn't have the diagnostic skills and left it for years on end. (Sad)
It's more difficult than you think (and expensive) to have multiple identical consoles and games with active online support.
It wasn't "we don't know what to do" but "we need someone with access to an appropriate setup to get us more information". I mentioned the potential need for this change even before the miniupnpd NAT code was ready but again, without an appropriate setup we couldn't diagnose it internally.
Even if UPNP did need a change; the fact that you asked a community member to go do a report for the UPNP dev instead of you or your team when you had an open issue for 2 years prior with no action; is still a sad fact to this day; I hope this helps the PFsense team reflect on how they deal with issues like this in the future.
If Netgate asks them to do it, it has the appearance of one company wanting them to do work for free. If users request changes, they can see that it was a request from the wider community and would have more benefit than only helping Netgate/pfSense.
But no matter how we arrived here, what appears to be a solid fix is pending and what we need now is more testing. I'm going to make a fresh thread once the fix is merged in snapshots so people don't need to go through years of history and discussion to find it.
-
@jimp You can justify this how you'd like; I still think someone should be held accountable for how long this took and so far it seems to be the pfsense team. (The UPNP dev made his change within what? 2 days of his report?) But this has been opened on the PFsense side since 2017... Took 3 years before it even made its way to UPNP through the community and not through yourselves, and then another year before you guys thought to bring it back up. Sure I'll admit you mentioned the potential need for this change; but why did it take a year for you to re-address it? Just because you didn't have 2 PCs or 2 Consoles with the same game? It's things like this I think the team may need to let sink in. There is more that could have been done to speed this up for sure.
-
If you want to blame someone, you're welcome to blame me, but that accomplishes exactly nothing and serves no purpose.
I've already explained multiple times why it wasn't possible or feasible for us to reproduce it here and we needed feedback from others, which was the original purpose of this thread, before it was derailed repeatedly by people complaining rather than helping or staying out of the way so people could offer productive feedback.
Rarely does this kind of situation happen because in most cases we can either reproduce a problem in lab conditions or at least see where in the code it might occur, but not here. We can't just make changes arbitrarily hoping it might help, we need some evidence that the change will correct the problem and not have a negative impact. Which we now have.
If you have moved on, then move on. There is no need to keep clogging the thread. I'm going to lock the thread and make another comment below with the patch and links to the fix. There will be a new thread up once there are images to test.
-