Slow copy over site-to-site VPN, but only if copy initiated on the server side
-
Hello, I have spent a couple of sleepless nights trying to resolve this issue, so now I am asking for your help.
I have an OpenVPN server with three site-to-site clients connected.
Networks / Sites / Tunnels
Server (A): 192.168.9.x / OpenVPN tunnel: 10.10.10.0/24
Site B (B): 192.168.4.x / OpenVPN client
Site C (C): 192.168.7.x / OpenVPN client
Site D (D): 192.168.8.x / OpenVPN client
Server and Site B are running on pfSense with 1GB/1GB down/up speed on both sites.
All sites can connect, communicate, and access each other's resources; there is no issue on that side of things.
The issue I can't figure out is the following, and it concerns the Server and Site (B). The other sites have no router hardware fast enough to cross-check.
Initiating the transfer from Site (B), I can copy FROM a file server on Server (A) at roughly 20MB/sec, which is great.
Initiating the transfer from Site (B), I can copy TO a file server on Server (A) at roughly 20MB/sec, which is great.
Initiating the transfer from Server (A), I can copy FROM a file server on Site (B) at roughly 20MB/sec, which is great.
...but initiating the transfer from Server (A), I can copy TO a file server on Site (B) at only roughly 8MB/sec.
This is less than half the speed of the same transfer, to the same file server with the same file, when it is initiated on Site (B).
How can this be?
Traceroute looks fine and is the same in both directions:
From Site (B) to the Server:
traceroute to 192.168.9.3 (192.168.9.3), 30 hops max, 60 byte packets
 1  192.168.4.254 (192.168.4.254)  0.211 ms  0.187 ms  0.183 ms
 2  10.10.10.1 (10.10.10.1)  22.747 ms  22.824 ms  22.825 ms
 3  192.168.9.3 (192.168.9.3)  22.922 ms  22.928 ms  22.920 ms
From Server to Site (B):
# traceroute 192.168.4.4
traceroute to 192.168.4.4 (192.168.4.4), 30 hops max, 60 byte packets
 1  pfsense.localdomain (192.168.9.254)  0.181 ms  0.156 ms  0.208 ms
 2  10.10.10.50 (10.10.10.50)  22.040 ms  22.688 ms  22.687 ms
 3  192.168.4.4 (192.168.4.4)  22.735 ms  22.736 ms  22.787 ms
The issue is 100% reproducible and not related to busy links.
The only clue I have is that it might be MTU related, as the server WAN is a PPPoE link with MTU 1492, while Site (B) is on MTU 1500.
OpenVPN was set up with defaults and no MTU tweaks. I also tried "mssfix 1300", but it made no difference.
However, if this were an MTU issue, then the same copy wouldn't be fast when initiated on Site (B) either.
One more thing, in case it is important.
The speeds above are with send/receive buffers set to 2MB.
With default send and receive buffers, the 20MB/sec goes down to 17MB/sec, while the 8MB/sec drops below 3MB/sec.
The whole thing doesn't make any sense to me. Do you have any idea how this can be?
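As a rough sanity check on the buffer effect, the numbers above can be put through the bandwidth-delay product: the amount of data that has to be in flight to sustain a given rate at a given round-trip time. This is only a sketch. It assumes the roughly 22.7 ms seen in the traceroutes is representative of the real round-trip time, and the function names are made up for illustration.

```python
# Bandwidth-delay product: data that must be in flight to keep a link busy.
# Assumption: RTT is ~22.7 ms, read off the traceroute output in this post.
RTT_SECONDS = 0.0227


def window_needed(throughput_mb_per_s: float, rtt_s: float = RTT_SECONDS) -> float:
    """Bytes that must be in flight to sustain the given throughput (MB/s)."""
    return throughput_mb_per_s * 1_000_000 * rtt_s


def throughput_ceiling(window_bytes: float, rtt_s: float = RTT_SECONDS) -> float:
    """Upper bound on throughput (MB/s) for a given number of bytes in flight."""
    return window_bytes / rtt_s / 1_000_000


if __name__ == "__main__":
    # The three rates reported in this post.
    for rate in (20, 8, 3):
        kib = window_needed(rate) / 1024
        print(f"{rate:>2} MB/s needs about {kib:.0f} KiB in flight")
```

If the slow direction turns out to match a small effective window, that would point at buffering or window size on one side rather than at raw bandwidth, but that is a guess to be checked, not a diagnosis.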
-
Still trying to figure this out. Does no one have a clue or hint on this?
-
I'd also investigate an MTU mismatch. Here's my (potentially flawed) logic:
The server on Site A has a larger MTU than the server on Site B. (I assume you copy server to server.)
Initiating the transfer from Site (B), I can copy FROM a file server on Server (A) at roughly 20MB/sec, which is great.
I assume the server on Site B requests a small packet size (maybe via Path MTU Discovery).
Initiating the transfer from Site (B), I can copy TO a file server on Server (A) at roughly 20MB/sec, which is great.
The server on Site B sends data packets that are smaller than the maximum size Server A accepts.
Initiating the transfer from Server (A), I can copy FROM a file server on Site (B) at roughly 20MB/sec, which is great.
The server on Site B will only send small packets (or packets that are smaller than what Server A can receive).
...but initiating the transfer from Server (A), I can copy TO a file server on Site (B) at only roughly 8MB/sec.
Server A doesn't know that Server B can only receive small packets. The firewall (VPN endpoint) on Site B now has to do extra work breaking up large packets into smaller ones, which Server B can accept.
So my guess would be fragmentation etc...
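To make the fragmentation guess concrete, here is a small sketch of how an IPv4 packet splits when it is larger than the path MTU. It uses the 1500 and 1492 values from the original post and deliberately ignores OpenVPN's own encapsulation overhead, so the real numbers inside the tunnel will differ. The function name is just for illustration.

```python
IP_HEADER = 20  # bytes, IPv4 header without options


def fragment_sizes(packet_len: int, path_mtu: int) -> list[int]:
    """Return the on-wire size of each IPv4 fragment for one packet."""
    if packet_len <= path_mtu:
        return [packet_len]  # fits, no fragmentation needed
    # Every fragment except the last carries a payload that is a multiple of 8 bytes.
    per_fragment = (path_mtu - IP_HEADER) // 8 * 8
    if per_fragment <= 0:
        raise ValueError("path MTU too small to carry any payload")
    payload = packet_len - IP_HEADER
    sizes = []
    while payload > 0:
        chunk = min(per_fragment, payload)
        sizes.append(chunk + IP_HEADER)  # each fragment gets its own IP header
        payload -= chunk
    return sizes


if __name__ == "__main__":
    print(fragment_sizes(1500, 1492))  # [1492, 28]
    print(fragment_sizes(1400, 1492))  # [1400]
```

A full-size 1500-byte packet becomes two packets, one of them tiny, which doubles the packet count for the device doing the splitting. A packet that already fits passes through untouched, which is why lowering the MTU or MSS on the sending side is worth a try.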
MTU can be set on host interfaces, too. You could try reducing the MTU size on Server A's network interface.
Also have a look at the pfSense option (Remove DF bit).
https://www.reddit.com/r/sysadmin/comments/2mt3jc/reducing_mtu_value_to_fix_slow_cifssmb_over_vpn/
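Before changing any MTU, it may help to probe which packet sizes actually get through unfragmented in each direction. The sketch below only does the header arithmetic for such a probe; the candidate MTU values are examples taken from this thread, the helper names are invented, and OpenVPN overhead is again not accounted for.

```python
IPV4_HEADER = 20  # bytes, no options
ICMP_HEADER = 8   # bytes, echo request header
TCP_HEADER = 20   # bytes, no options


def ping_payload_for_mtu(mtu: int) -> int:
    """ICMP payload size that produces an IPv4 packet of exactly `mtu` bytes."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def tcp_mss_for_mtu(mtu: int) -> int:
    """Largest TCP segment payload that fits in one IPv4 packet of `mtu` bytes."""
    return mtu - IPV4_HEADER - TCP_HEADER


if __name__ == "__main__":
    # Candidate MTUs: Ethernet default, the PPPoE WAN, and two lower test values.
    for mtu in (1500, 1492, 1400, 1300):
        print(f"MTU {mtu}: ping payload {ping_payload_for_mtu(mtu)}, "
              f"TCP MSS {tcp_mss_for_mtu(mtu)}")
    # On Linux the payload can be tried with the DF bit set, for example:
    #   ping -M do -s 1464 192.168.4.4
```

The largest payload that still gets a reply with DF set, plus 28, is the usable path MTU for that direction. If the value differs between A to B and B to A, that would support the mismatch theory.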