Bandwidth problems between sites
-
@bp81 said in Bandwidth problems between sites:
Know of any sources of guidance on how to do this for VMware ESXi and Synology DSM?
No, unfortunately not - haven’t tried it on those systems.
Another possible solution is to insert a WAN accelerator device/software stack at both ends. An accelerator proxies traffic between sites and fools the operating systems at both ends with immediate TCP ACKs and whatnot. They also compress traffic and filter unneeded packets from the WAN link. A WAN accelerator can be ENORMOUSLY effective and let you use the link almost as if it were a LAN.
-
Yup, those can make a huge difference on high latency links. Pretty much required on Sat links for example.
However 25Mbps still 'feels' low to me for 40ms. I can hit my WAN limit here ~70Mbps when downloading from Austin and that's ~120ms.
Here's a quick test:
[22.05-RELEASE][admin@6100-2.stevew.lan]/root: fetch -o /dev/null https://atxfiles.netgate.com/mirror/downloads/pfSense-CE-2.6.0-RELEASE-amd64.iso.gz
/dev/null 416 MB 5085 kBps 01m24s
[22.05-RELEASE][admin@6100-2.stevew.lan]/root: ping -c 2 atxfiles.netgate.com
PING files.atx.netgate.com (208.123.73.81): 56 data bytes
64 bytes from 208.123.73.81: icmp_seq=0 ttl=50 time=110.073 ms
64 bytes from 208.123.73.81: icmp_seq=1 ttl=50 time=109.770 ms
--- files.atx.netgate.com ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 109.770/109.921/110.073/0.151 ms
Not as good as I have seen, 40Mbps over 110ms, but still. It does take a while for the window to scale up though. You need to test for at least a minute to actually see the maximum.
Steve
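To put numbers on the window scaling point, here is a minimal sketch of the bandwidth-delay arithmetic. It assumes a single TCP stream can have at most one window in flight per round trip; the 65535-byte figure is the classic unscaled TCP window and is an illustrative assumption, not something measured in this thread. The function names are made up for the example.

```python
def window_limited_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on single-stream TCP throughput: one window per round trip."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6


def window_needed_bytes(target_mbps: float, rtt_ms: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to sustain target_mbps."""
    return target_mbps * 1e6 / 8 * (rtt_ms / 1000)


def kbytes_per_s_to_mbps(kbytes_per_s: float) -> float:
    """Convert fetch's kBps figure (kilobytes per second) to megabits per second."""
    return kbytes_per_s * 8 / 1000


if __name__ == "__main__":
    # The fetch result quoted above: 5085 kBps at roughly 110 ms.
    print(f"fetch rate: {kbytes_per_s_to_mbps(5085):.1f} Mbps")
    # Window that has to be open to sustain 40 Mbps at 110 ms.
    print(f"window needed: {window_needed_bytes(40, 110):.0f} bytes")
    # Ceiling for an unscaled 65535-byte window at 40 ms (assumed value).
    print(f"64 KiB window at 40 ms: {window_limited_mbps(65535, 40):.1f} Mbps")
```

The takeaway is only that the achievable rate scales with window size divided by round-trip time, so a short test can end before the window has grown enough to fill the link.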
-
@keyser said in Bandwidth problems between sites:
@bp81 said in Bandwidth problems between sites:
Know of any sources of guidance on how to do this for VMware ESXi and Synology DSM?
No, unfortunately not - haven’t tried it on those systems.
Another possible solution is to insert a WAN accelerator device/software stack at both ends. An accelerator proxies traffic between sites and fools the operating systems at both ends with immediate TCP ACKs and whatnot. They also compress traffic and filter unneeded packets from the WAN link. A WAN accelerator can be ENORMOUSLY effective and let you use the link almost as if it were a LAN.
Are there any pfSense packages that do this, or do I need to start looking for an appliance?
-
There are not. And, as far as I know, there are no FreeBSD ports either. Someone would probably have tested it by now otherwise. I believe there are some FreeBSD-based solutions that do this, but I don't think any of them are freely available. I could be wrong...
-
@stephenw10 said in Bandwidth problems between sites:
There are not. And, as far as I know, there are no FreeBSD ports either. Someone would probably have tested it by now otherwise. I believe there are some FreeBSD-based solutions that do this, but I don't think any of them are freely available. I could be wrong...
Is there anything available that wouldn't require me to sell my children?
-
Did some more testing this morning and I think there are at least two, maybe three things going on here. I changed the iperf settings to a longer test time and a longer interval (60 second test, 30 second interval. Seems like this would simulate large file transfers pretty well). I also tested against another branch office (let's call it Branch Office 2). Branch Office 2 also has 100 Mbps fiber.
Branch Office 2: if I run iperf outside the VPN tunnel, I can get about 36 Mbps. Latency between HQ and Branch Office 2 is 47 ms. Branch Office 1: I was able to get around 40 Mbps using the longer test interval with iperf and not pushing traffic through the tunnel.
In both branch offices, pushing traffic through the VPN tunnel with the longer interval is still garbage. 6-8 Mbps at best.
So I do tend to think there is a VPN problem after all, but I don't think it's the only problem. I expect the latency issue is also causing problems but, at the end of the day, VPN performance seems to be the biggest immediate problem.
I know that OpenVPN is not renowned for its speed. I am in a position to use something else if necessary, and I can tweak OpenVPN if needed. Would limiting the packet size inside the tunnel help (figuring there could be some fragmentation issues)?
-
Ha, that I don't know. I've looked into this a few times and never found anything practical that we could test. You really would think there would be an open source implementation of this somewhere....
-
@stephenw10 said in Bandwidth problems between sites:
Ha, that I don't know. I've looked into this a few times and never found anything practical that we could test. You really would think there would be an open source implementation of this somewhere....
I found OpenNOP that seems to be an open source solution for this. It's Linux based, I'll give it a test drive and see how it goes.
But from my earlier results this morning, I don't think this is the only problem. I do think there's a problem with my VPN tunnels in addition to latency issues.
-
OK, got some more weirdness with additional testing. Rather than testing iperf running on each firewall, I went back to testing from a workstation at HQ to a server at Branch Office. I am testing exclusively traffic passing through the VPN tunnel.
I ran iperf and set my TCP window to the max, about 3 megabytes. In the first couple of seconds I hit some decent bandwidth numbers, around 60 Mbps, but it quickly recedes to around 10 Mbps and settles there.
I reran iperf with the large TCP window and set the packet size to 576, just in case I was tripping across a fragmentation issue. Same behavior.
I did UDP with iperf and set my bandwidth to 80 Mbps, and got huge packet loss on that. Even at a miserable 10 Mbps the loss was still around 1.5%.
It feels like the routers can't keep up with traffic crossing the VPN tunnel. The CPUs are not overworked on the appliances: CPU usage on either end spiked to around 10%. It feels more like there is a buffer somewhere filling up and it can't keep up, but for the life of me I can't see that on my routers. Is there some kind of bandwidth limit that OpenVPN applies by default that I'm not aware of?
-
@bp81 The jumping up and down in throughput is a textbook example of how TCP behaves while it is trying to scale up its window for faster transfers and then out-of-order packets arrive, or a packet is lost. Whenever that happens, TCP cuts its congestion window, classically to half.
So if you are seeing lost packets or out-of-order packets, you will have this elevator up and down on actual bandwidth/throughput.
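For a feel of how much loss hurts, a rough sketch using the well-known Mathis et al. approximation for loss-limited TCP throughput, rate ≈ (MSS / RTT) · C / √p with C ≈ √(3/2). The inputs are assumptions picked to resemble this thread (1460-byte MSS, 40 ms RTT, 1.5% loss); modern stacks with SACK and CUBIC will not follow it exactly, so treat the result as an order-of-magnitude estimate only.

```python
import math


def mathis_mbps(mss_bytes: int, rtt_ms: float, loss_rate: float) -> float:
    """Approximate steady-state TCP throughput under random loss (Mathis model)."""
    if not 0 < loss_rate < 1:
        raise ValueError("loss_rate must be between 0 and 1 (exclusive)")
    c = math.sqrt(3 / 2)  # constant for periodic loss with no delayed ACKs
    rtt_s = rtt_ms / 1000
    bits_per_second = (mss_bytes * 8 / rtt_s) * c / math.sqrt(loss_rate)
    return bits_per_second / 1e6


if __name__ == "__main__":
    # Assumed values: 1460-byte MSS, 40 ms RTT, and a range of loss rates.
    for loss in (0.0001, 0.001, 0.015):
        print(f"loss {loss:.2%}: ~{mathis_mbps(1460, 40, loss):.1f} Mbps")
```

Even a fraction of a percent of loss at this latency pulls a single stream well below a 100 Mbps line rate, which fits the sawtooth described above.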
-
@keyser said in Bandwidth problems between sites:
@bp81 The jumping up and down in throughput is a textbook example of how TCP behaves while it is trying to scale up its window for faster transfers and then out-of-order packets arrive, or a packet is lost. Whenever that happens, TCP cuts its congestion window, classically to half.
So if you are seeing lost packets or out-of-order packets, you will have this elevator up and down on actual bandwidth/throughput.
It's a far more acute problem when traffic passes through the VPN tunnel than when going router to router outside the tunnel, though. I think it's a problem in both places, but it absolutely kills VPN performance.
-
@bp81 said in Bandwidth problems between sites:
It's a far more acute problem when traffic passes through the VPN tunnel than when going router to router outside the tunnel, though. I think it's a problem in both places, but it absolutely kills VPN performance.
Your next test should be testing the TCP throughput from your workstation with Wireshark installed and running. Wireshark's decoder should fairly easily point out whether packets are lost or received out of order. It will also clearly show you whether retransmits are occurring.
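If a saved capture is easier to work with than the live GUI, something like the sketch below can count the relevant events with tshark, Wireshark's command-line companion. It assumes tshark is installed and on the PATH, and the capture file name `vpn-test.pcap` is a placeholder. The display filters are standard Wireshark TCP analysis flags; the helper names are made up for the example.

```python
import shutil
import subprocess

# Wireshark TCP analysis display filters for the symptoms discussed above.
FILTERS = {
    "retransmissions": "tcp.analysis.retransmission",
    "out_of_order": "tcp.analysis.out_of_order",
    "lost_segments": "tcp.analysis.lost_segment",
}


def tshark_count_cmd(pcap_path: str, display_filter: str) -> list[str]:
    """Build a tshark command that prints one frame number per matching packet."""
    return ["tshark", "-r", pcap_path, "-Y", display_filter,
            "-T", "fields", "-e", "frame.number"]


def count_matches(pcap_path: str, display_filter: str) -> int:
    """Run tshark and count how many packets match the display filter."""
    result = subprocess.run(tshark_count_cmd(pcap_path, display_filter),
                            capture_output=True, text=True, check=True)
    return len(result.stdout.splitlines())


if __name__ == "__main__":
    pcap = "vpn-test.pcap"  # placeholder capture file name
    if shutil.which("tshark") is None:
        print("tshark not found on PATH")
    else:
        for label, flt in FILTERS.items():
            print(f"{label}: {count_matches(pcap, flt)}")
```

Comparing the counts for a capture taken through the tunnel against one taken outside it should show whether the tunnel is adding loss or reordering.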
-
What sort of VPN are you using there? How is it configured?
-
@keyser said in Bandwidth problems between sites:
@bp81 said in Bandwidth problems between sites:
It's a far more acute problem when traffic passes through the VPN tunnel than when going router to router outside the tunnel, though. I think it's a problem in both places, but it absolutely kills VPN performance.
Your next test should be testing the TCP throughput from your workstation with Wireshark installed and running. Wireshark's decoder should fairly easily point out whether packets are lost or received out of order. It will also clearly show you whether retransmits are occurring.
I’ll do Wireshark next week, but I did a quick and dirty test that confirmed some of my hypotheses.
I replaced the OpenVPN tunnel with IPsec. A default iperf test through the tunnel went from 8 Mbps to 40 Mbps. I extended the TCP window to 3 MB and I hit max bandwidth (100 Mbps) and sat on it for the entirety of the test.
I think this demonstrates pretty conclusively that OpenVPN is slow (known issue) and that the latency is also an issue.
I’m probably going to replace my site-to-site links with IPsec. I’d rather not, as administration of OpenVPN is simpler, but I have a good reason to do it. I’m not going to switch client-to-site links to IPsec. OpenVPN just has too much going for it in that role, and the bandwidth isn’t an issue for our limited client-to-site use.
-
@bp81 Very interesting observation about the VPN type.
And pretty interesting that you can get the throughput up so evenly, considering you had some issues outside the VPN with iperf as well. Anyhow, I never quite understood why so many love OpenVPN for end-user VPN.
I think the OpenVPN client is a pita when it comes to user interface, maintenance and deployment. I much, much prefer the simplicity of the operating system's built-in VPN client (IPsec). The UI is very simple and integrated in the OS. Setup is either a simple manual guide or a simple script/configuration file that needs to be deployed.
And all modern operating systems work beautifully with the Mobile IPsec VPN in pfSense.
-
@keyser said in Bandwidth problems between sites:
@bp81 Very interesting observation about the VPN type.
And pretty interesting that you can get the throughput up so evenly, considering you had some issues outside the VPN with iperf as well. Anyhow, I never quite understood why so many love OpenVPN for end-user VPN.
I think the OpenVPN client is a pita when it comes to user interface, maintenance and deployment. I much, much prefer the simplicity of the operating system's built-in VPN client (IPsec). The UI is very simple and integrated in the OS. Setup is either a simple manual guide or a simple script/configuration file that needs to be deployed.
And all modern operating systems work beautifully with the Mobile IPsec VPN in pfSense.
Our experience with the Windows VPN client has been lackluster. We are also under cybersecurity and compliance obligations to implement "paranoid levels of security", let us say. Authentication for end users is via AD / LDAP authentication AND client certificate. That last bit does not work nicely with IPsec as implemented in Windows; it will authenticate one and only one thing (be that certificate or credentials). It does OK if it's certificate only, but the requirements are for two steps of authentication, and that's not negotiable at any level (eventually I will add the Azure AD extensions to our NPS servers and use Azure's authenticator app pop-up as our second factor, as we do with other systems. This might be friendlier to IPsec).
On the flipside, with OpenVPN, all I need to do is issue a client certificate, generate a profile, and deploy that profile via PDQ deploy or simple network file copy. I can deploy to any end user at any time without even touching their workstation as long as they're inside a corporate network.
-
@bp81 said in Bandwidth problems between sites:
@keyser said in Bandwidth problems between sites:
@bp81 Very interesting observation about the VPN type.
And pretty interesting that you can get the throughput up so evenly, considering you had some issues outside the VPN with iperf as well. Anyhow, I never quite understood why so many love OpenVPN for end-user VPN.
I think the OpenVPN client is a pita when it comes to user interface, maintenance and deployment. I much, much prefer the simplicity of the operating system's built-in VPN client (IPsec). The UI is very simple and integrated in the OS. Setup is either a simple manual guide or a simple script/configuration file that needs to be deployed.
And all modern operating systems work beautifully with the Mobile IPsec VPN in pfSense.
Our experience with the Windows VPN client has been lackluster. We are also under cybersecurity and compliance obligations to implement "paranoid levels of security", let us say. Authentication for end users is via AD / LDAP authentication AND client certificate. That last bit does not work nicely with IPsec as implemented in Windows; it will authenticate one and only one thing (be that certificate or credentials). It does OK if it's certificate only, but the requirements are for two steps of authentication, and that's not negotiable at any level (eventually I will add the Azure AD extensions to our NPS servers and use Azure's authenticator app pop-up as our second factor, as we do with other systems. This might be friendlier to IPsec).
On the flipside, with OpenVPN, all I need to do is issue a client certificate, generate a profile, and deploy that profile via PDQ deploy or simple network file copy. I can deploy to any end user at any time without even touching their workstation as long as they're inside a corporate network.
It’s true that the Windows IPsec client is less than happy about anything but simple authentication (user or certificate).
But all my clients are using Azure/Office365 anyway, so they all use two-factor auth on VPN with the Azure plugin on the authenticating RADIUS server. This does require the clients to have the Microsoft Authenticator app on a smart device, but it works beautifully :-)
-
@bp81 And then you only need an AD GPO that sets up the VPN on all required clients. It is fully automatic and never requires user intervention or manual procedures.
-
Update
I replaced all my OpenVPN tunnels with IPsec tunnels initially. This proved to be a minor disaster, as I started having bizarre problems with employees in remote offices being unable to access our database server over the VPN. Granted, the connection has a lot of hops (it goes: Branch Office -> HQ -> Datacenter hosting database server; the connection path passes through two separate tunnels), but it always worked when Branch Office -> HQ was an OpenVPN tunnel, if slowly (FWIW, the tunnel from HQ -> Datacenter has always been an IPsec tunnel).
Replacing the Branch Office -> HQ tunnel with an IPsec tunnel created intermittent connection issues that did not show up in simple ping tests. I tried larger packets in my ping tests and found that a standard-size packet, 1500 bytes, would not go through the tunnel. I did NOT specify that the packet could not be fragmented.
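For reference, the header arithmetic behind ping-size tests can be sketched as follows. This assumes IPv4 with no IP options and standard ICMP echo; the 1400-byte tunnel MTU is a purely hypothetical value for illustration, since the real figure depends on the IPsec settings and was not determined in this thread.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header
TCP_HEADER = 20   # bytes, TCP header without options


def ping_packet_size(payload_bytes: int) -> int:
    """Total IPv4 packet size produced by a ping with the given payload size."""
    return payload_bytes + IPV4_HEADER + ICMP_HEADER


def max_ping_payload(path_mtu: int) -> int:
    """Largest ping payload that fits in one unfragmented packet."""
    return path_mtu - IPV4_HEADER - ICMP_HEADER


def tcp_mss_for_mtu(path_mtu: int) -> int:
    """Largest TCP segment payload that fits in one unfragmented packet."""
    return path_mtu - IPV4_HEADER - TCP_HEADER


if __name__ == "__main__":
    # A 1500-byte ping payload already exceeds a 1500-byte Ethernet MTU.
    print(ping_packet_size(1500))   # 1528
    print(max_ping_payload(1500))   # 1472
    # Hypothetical effective MTU inside a tunnel (assumed, not measured).
    tunnel_mtu = 1400
    print(max_ping_payload(tunnel_mtu), tcp_mss_for_mtu(tunnel_mtu))
```

If the 1500 bytes was the ping payload size, the packet had to be fragmented even before entering the tunnel, so the failure points at fragments being dropped somewhere along the path rather than at packet size alone.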
Once again I am reminded why I despise IPsec and why I got rid of it years ago in the first place. Trying it again was an obvious mistake I should not have made. I am ultimately uninterested in dealing with the literal army of little gotchas that comes with IPsec, and with the fact that pfSense apparently doesn't expose in the GUI the arcane settings I would need to find in order to redress the issue I was having.
I never did figure this one out, and I don't have time to leave my employees twisting in the wind with a semi-functional tunnel while I tinker with a thousand different arcane configuration settings to try to get it to act right.
The IPsec tunnels have been taken down and replaced with OpenVPN tunnels again. We will simply have to suffer the poor data transfer rates associated with OpenVPN in order to have something that actually functions.
I may look at issuing third-party HTTPS certificates to our NAS boxes and letting them communicate outside a VPN tunnel altogether to redress the data transfer issues those units are having.