Tuning OpenVPN over 4G to reduce fragmentation and retransmission

circle

Hi

I have built a temporary access solution for a site our company is moving to. It allows access to the main application back at the old office, but performance is poor. This appears to be due to MTU, fragmentation and retransmission issues.

The setup looks like this:

New LAN -- pfSense/LAN/OpenVPN Client/WAN -- 4G Router -- Internet -- Leased Line -- Firewall -- Port NAT -- pfSense/WAN/OpenVPN Server/LAN -- Local App Server (HTTP).

From the new site LAN we are tunnelling traffic into the OpenVPN and then exposing it on the LAN side of the old network.

In principle this works and we are able to access the web application at the far site, but as soon as the data sizes grow, performance becomes very poor. In normal speed tests the 4G link achieves 20-30 Mbps download and about 10-15 Mbps upload, which is plenty for this scenario.

Running Wireshark at the server end shows lots of fragmentation and retransmission as soon as requests grow beyond a few KB. Despite reading lots of articles about tuning the MTU, none of the changes seem to make a notable difference.

Different articles suggest the MTU should be altered in a number of places, either in the interface settings or in the custom options for OpenVPN. Where should we be making the changes?

Does anyone have experience of running a pfSense client over 4G and can offer some suggested parameter tweaks? We are using EE 4G in the UK and I am starting to suspect they may be shaping the traffic.

Thanks in advance

Gary

JKnott

@circle

I don't know that MTU is the issue. IP has long worked between networks with different MTUs (I used to run 4K MTU over token ring, when I was at IBM). Also, these days Path MTU Discovery is often used to adjust packet size, and even when it isn't, fragmentation still works (IPv4 only). In that instance the packet is reassembled at the destination, as though nothing had happened. Fragmentation due to different MTUs shouldn't cause problems. TCP retransmission may indicate lost packets, for whatever reason, and occasional retransmission indicates flow control is doing its job.
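To illustrate the fragmentation and reassembly described above, here is a minimal Python sketch of how an IPv4 packet is split when it meets a smaller MTU. The function names and the 1500/1400 byte figures are illustrative assumptions, not values measured on this setup, and the sketch assumes a plain 20-byte IPv4 header with no options.

```python
def fragment_ipv4(total_length, mtu, header_len=20):
    """Split an IPv4 packet of total_length bytes into fragments that fit mtu.

    Returns a list of (offset_in_8_byte_units, payload_bytes, more_fragments).
    header_len=20 assumes an IPv4 header without options.
    """
    payload = total_length - header_len
    # Every fragment except the last must carry a multiple of 8 payload bytes,
    # because the fragment offset field counts in 8-byte units.
    max_payload = (mtu - header_len) // 8 * 8
    if max_payload <= 0:
        raise ValueError("MTU too small to carry any payload")
    fragments = []
    offset = 0
    while offset < payload:
        size = min(max_payload, payload - offset)
        more = offset + size < payload
        fragments.append((offset // 8, size, more))
        offset += size
    return fragments


def reassembled_length(fragments, header_len=20):
    """Total length of the packet the destination rebuilds from the fragments."""
    return header_len + sum(size for _, size, _ in fragments)


# Hypothetical example: a 1500-byte packet crossing a link with a 1400-byte MTU.
frags = fragment_ipv4(1500, 1400)
print(frags)
print(reassembled_length(frags))
```

The destination sees the original 1500 bytes again after reassembly, which is why fragmentation alone is normally invisible to the application. It only works if the packet is allowed to be fragmented, that is, if the Don't Fragment bit is not set.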

circle

Hi @jknott

I would tend to agree. I have never seen fit to tune MTU too much and have left it to do as it wishes. But having researched a bit, there are many suggestions that the VPN element is extremely MTU sensitive and that this will result in fragmentation and very slow performance.

I'm currently building a secondary client setup to see if taking the 4G part out of the equation changes things.

Regards

Gary

JKnott

@circle

First off, a VPN is just another link with a slightly smaller MTU. Since it runs on UDP, every packet stands alone. If there is some issue with lost packets, that is dealt with at a higher level. Also, with Ethernet, each packet is protected by a CRC so that any corrupted packet is discarded. I assume something similar protects 4G, though I can't say for certain as I don't work with that. Regardless, whatever happens to the actual VPN is irrelevant to the payload, which wouldn't even know about the VPN. All it sees is potentially lost packets that would be handled as usual. As for EE, I have no experience with them. However, with my carrier, I get 15GB of data per month and anything beyond that is throttled. When that happens, it's no different than experiencing congestion on Ethernet.
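The "slightly smaller MTU" point can be put into numbers with a rough Python sketch. The 50-byte VPN overhead and the 1500-byte link MTU below are placeholders for illustration only; the real OpenVPN overhead depends on cipher, authentication and other options, and the actual MTU of the 4G path is not known from this thread.

```python
IPV4_HEADER = 20  # bytes, no IP options
UDP_HEADER = 8    # bytes
TCP_HEADER = 20   # bytes, no TCP options


def tunnel_payload_mtu(link_mtu, vpn_overhead):
    """Largest inner packet that fits in one outer UDP datagram unfragmented."""
    return link_mtu - IPV4_HEADER - UDP_HEADER - vpn_overhead


def inner_tcp_mss(inner_mtu):
    """Largest TCP segment payload that fits in an inner packet of inner_mtu."""
    return inner_mtu - IPV4_HEADER - TCP_HEADER


# Hypothetical figures: 1500-byte outer link, 50 bytes of VPN overhead.
inner_mtu = tunnel_payload_mtu(1500, 50)
print(inner_mtu)                 # inner packets larger than this need fragmenting
print(inner_tcp_mss(inner_mtu))  # TCP segments at or below this avoid it
```

If any hop on the outer path has an MTU lower than the assumed link MTU, the same arithmetic applies with that lower figure.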

I have often connected to my home network, which is protected by pfSense, though the remote end is not.

Does the fragmentation affect the VPN packets? Or payload?

circle

Hi @jknott

After a long weekend I was able to pin down the true nature of the issue and find a solution via a workaround. As this is a temporary site-to-site VPN requirement, the workaround will probably stay.

As I investigated the problem in more detail, I replaced some parts of the solution to see how this affected the fragmentation requests, and was able to localise the issue to the pfSense LAN interface at the primary site.

This removed any concerns about the VPN, transmission over the 4G network, etc.

In summary, I could see a significant number of ICMP Destination Unreachable (Fragmentation Required) messages sourced from the LAN interface toward a webserver that was sending back a larger multi-packet response to a request. For smaller responses, where limited fragmentation was required, the connection worked, but as response data grew the frequency of Fragmentation Required messages grew until no data was being transmitted at all.
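For anyone wanting to check such messages in a capture, the following Python sketch decodes the 8-byte ICMP header of a Destination Unreachable (type 3) message with code 4, the "fragmentation needed and DF set" case, and returns the next-hop MTU field defined in RFC 1191. The function name and the sample bytes are made up for illustration (the sample checksum is left at zero and is not valid); this is not output from the capture described above.

```python
import struct

ICMP_DEST_UNREACHABLE = 3  # ICMP type
ICMP_FRAG_NEEDED = 4       # ICMP code: fragmentation needed and DF set


def parse_frag_needed(icmp_bytes):
    """Return the next-hop MTU from a 'fragmentation needed' message.

    Returns None if the message has a different type or code.
    """
    if len(icmp_bytes) < 8:
        raise ValueError("ICMP header needs at least 8 bytes")
    # Layout: type (1), code (1), checksum (2), unused (2), next-hop MTU (2).
    icmp_type, code, _checksum, _unused, next_hop_mtu = struct.unpack(
        "!BBHHH", icmp_bytes[:8]
    )
    if icmp_type != ICMP_DEST_UNREACHABLE or code != ICMP_FRAG_NEEDED:
        return None
    return next_hop_mtu


# Hypothetical sample header advertising a 1400-byte next-hop MTU.
sample = struct.pack("!BBHHH", 3, 4, 0, 0, 1400)
print(parse_frag_needed(sample))
```

A next-hop MTU of 0 would suggest the sender predates RFC 1191 and did not fill the field in. The sender only learns the path limit if these messages actually reach it and it acts on them.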

Despite changing MTU sizes, there was little influence I could exert over this pattern, and I was starting to become convinced that the issue lay between pfSense/FreeBSD and the Hyper-V network cards in the Hyper-V host. I decided I would not solve this in a short period of time and needed to look at alternative options.

Taking a different tack, I built 2 Debian-based Untangle NGFW hosts in exactly the same way as the pfSense hosts, configured OpenVPN and the site-to-site link, and bingo, everything worked as I had expected. This supported my incompatibility line of thought, so at some point I plan to recreate the setup in our lab to see if I can reproduce the fault. For now the staff are working and they are happy.

Gary

Pippin

Ticking "IP Do-Not-Fragment compatibility" might do the trick.

circle

Hi @pippin

I will give that a try and see if it helps.

Thanks

Gary