Problem with OpenVPN only for remote users with Google Fiber


  • For the past 3 days I have been having a problem with my travelling OpenVPN clients, which connect using OpenVPN in tun mode. I have a fleet of identical laptops that use OpenVPN to get back in for remote work. All are working except the ones on Google Fiber. All have identical configurations except for the user account.

    The symptoms are as follows. The users using Google Fiber see intermittent disconnects, freezes, and sometimes complete inability to connect at all. Even though they are all connecting to the same OpenVPN instance, the logs show differences for my 2 users that have Google Fiber. If one of the Google Fiber users takes their laptop to another connection or uses a wifi hotspot on their phone it works just fine.

    Here is a server-side log sequence from a non-Google Fiber user:

    Jan 7 08:49:14 	openvpn 		user 'nongooglefiberuser' authenticated
    Jan 7 08:49:14 	openvpn 	24165 	nongooglefiberuser/<IP ADDRESS>:1194 peer info: IV_GUI_VER=OpenVPN_GUI_11
    Jan 7 08:49:14 	openvpn 	24165 	nongooglefiberuser/<IP ADDRESS>:1194 peer info: IV_TCPNL=1
    Jan 7 08:49:14 	openvpn 	24165 	nongooglefiberuser/<IP ADDRESS>:1194 peer info: IV_COMP_STUBv2=1
    Jan 7 08:49:14 	openvpn 	24165 	nongooglefiberuser/<IP ADDRESS>:1194 peer info: IV_COMP_STUB=1
    Jan 7 08:49:14 	openvpn 	24165 	nongooglefiberuser/<IP ADDRESS>:1194 peer info: IV_LZO=1
    Jan 7 08:49:14 	openvpn 	24165 	nongooglefiberuser/<IP ADDRESS>:1194 peer info: IV_LZ4v2=1
    Jan 7 08:49:14 	openvpn 	24165 	nongooglefiberuser/<IP ADDRESS>:1194 peer info: IV_LZ4=1
    Jan 7 08:49:14 	openvpn 	24165 	nongooglefiberuser/<IP ADDRESS>:1194 peer info: IV_PROTO=2
    Jan 7 08:49:14 	openvpn 	24165 	nongooglefiberuser/<IP ADDRESS>:1194 peer info: IV_PLAT=win
    Jan 7 08:49:14 	openvpn 	24165 	nongooglefiberuser/<IP ADDRESS>:1194 peer info: IV_VER=2.4.8 
    

    Here is a log sequence from a Google Fiber user:

    Jan 7 09:08:05 	openvpn 	24165 	googlefiberuser/<IP ADDRESS>:1194 PUSH: client wants to negotiate cipher (NCP), but server has already generated data channel keys, re-sending previously negotiated cipher 'AES-256-GCM'
    Jan 7 09:08:04 	openvpn 		user 'googlefiberuser' authenticated
    Jan 7 09:08:04 	openvpn 	24165 	googlefiberuser/<IP ADDRESS>:1194 peer info: IV_GUI_VER=OpenVPN_GUI_11
    Jan 7 09:08:04 	openvpn 	24165 	googlefiberuser/<IP ADDRESS>:1194 peer info: IV_TCPNL=1
    Jan 7 09:08:04 	openvpn 	24165 	googlefiberuser/<IP ADDRESS>:1194 peer info: IV_COMP_STUBv2=1
    Jan 7 09:08:04 	openvpn 	24165 	googlefiberuser/<IP ADDRESS>:1194 peer info: IV_COMP_STUB=1
    Jan 7 09:08:04 	openvpn 	24165 	googlefiberuser/<IP ADDRESS>:1194 peer info: IV_LZO=1
    Jan 7 09:08:04 	openvpn 	24165 	googlefiberuser/<IP ADDRESS>:1194 peer info: IV_LZ4v2=1
    Jan 7 09:08:04 	openvpn 	24165 	googlefiberuser/<IP ADDRESS>:1194 peer info: IV_LZ4=1
    Jan 7 09:08:04 	openvpn 	24165 	googlefiberuser/<IP ADDRESS>:1194 peer info: IV_NCP=2
    Jan 7 09:08:04 	openvpn 	24165 	googlefiberuser/<IP ADDRESS>:1194 peer info: IV_PROTO=2
    Jan 7 09:08:04 	openvpn 	24165 	googlefiberuser/<IP ADDRESS>:1194 peer info: IV_PLAT=win
    Jan 7 09:08:04 	openvpn 	24165 	googlefiberuser/<IP ADDRESS>:1194 peer info: IV_VER=2.4.8 
    

    Even though they are using identical configs and connecting to the same server instance, something about Google Fiber is changing the connection and triggering IV_NCP=2 and tons of renegotiations somehow.

    Here is the config used on all clients. The only difference is the user account and the corresponding certs. All other aspects of my remote users are identical (laptop model, OS, OpenVPN version, everything).

    dev tun
    persist-tun
    persist-key
    cipher AES-256-CBC
    auth SHA256
    tls-client
    client
    resolv-retry infinite
    remote <server IP> 1194 udp
    verify-x509-name "pfsense" name
    auth-user-pass
    pkcs12 pfSense-hq-udp-1194-user1.p12
    tls-auth pfSense-hq-udp-1194-user1-tls.key 1
    comp-lzo adaptive
    

  • Try to add

    mssfix 1400
    

    to the client config.
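
    As a rough sketch of where that would go, here is an excerpt of the client config posted above with the directive added. The value 1400 is only the starting guess from this post, not something measured on the Google Fiber path:

    ```
    # Client config excerpt (all other directives unchanged)
    remote <server IP> 1194 udp
    # Ask TCP sessions inside the tunnel to keep their segments small enough
    # that the encapsulated UDP packet stays at or below 1400 bytes.
    # OpenVPN's default is 1450; 1400 is a guess, lower it further if needed.
    mssfix 1400
    ```

    Note that mssfix only affects TCP traffic carried inside the tunnel, so it helps only if the drops come from oversized packets on the path.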


  • @pippin No change. It took less than 2 minutes to drop again.

  • A peering problem between Google Fiber and the ISP you use for the OpenVPN server?

    -Rico


  • IV_NCP=2 only indicates the client supports the ciphers pushed by the server.
    This looks more like some network issue, since

    the previous 3 days .....
    The users using Google Fiber see intermittent disconnects, freezes, and sometimes complete inability to connect at all.

    You could try a second server instance, basically a copy of the UDP one but changed to TCP.
    Then let those two users connect to it and see if it works.
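
    On the client side the change would be small. A minimal sketch, assuming the second instance reuses the same certificates and TLS key, and assuming it listens on TCP port 1194 (the port is a placeholder, use whatever the new instance is bound to):

    ```
    # Client config excerpt for the test against the second (TCP) instance.
    # Port 1194 is an assumption; everything else stays as in the UDP config.
    remote <server IP> 1194 tcp
    ```

    Only the two Google Fiber laptops would need this copy of the config; everyone else can stay on the UDP instance.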


  • @rico Can you explain what this means and how I would resolve it?


  • @pippin Creating a new server instance and setting it to TCP is not helping either. Still dropping connections. At night under less load, the Google Fiber connections seem more stable but still drop. During the day it is completely unusable for my two users with Google Fiber.

    Also, I understand about the IV_NCP=2 being what it is, but why is it DIFFERENT for the Google Fiber user logs, when up until last night, before I made the new server instance, everything about the configs and laptops was identical? What about the CONNECTION could be triggering that to show up in the logs when it doesn't for non-Google Fiber users?


  • @rtkluttz said in Problem with OpenVPN only for remote users with Google Fiber:

    I understand about the IV_NCP=2 being what it is, but why is it DIFFERENT for the Google Fiber user logs

    I would think ...
    The IV_NCP=2 for the 'googlefiberuser' is because the client reconnects within the keepalive timeout.
    When this happens the server doesn't know the client disconnected and uses the keys that were already generated.
    From the client point of view, it just wants to negotiate a cipher but it's unaware of the server not knowing it disconnected.

    PUSH: client wants to negotiate cipher (NCP), but server has already generated data channel keys, re-sending previously negotiated cipher 'AES-256-GCM'

    That message ^^^ makes me think that.
    So I doubt it has anything to do with the problem.
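
    For reference, the timeout in question comes from the server's keepalive setting. A sketch, assuming the commonly used values of 10 and 60 (the actual setting on this server was not posted):

    ```
    # Server side (values assumed, not taken from this thread)
    # Ping the peer every 10 s; treat it as down after 60 s of silence.
    # In server mode OpenVPN doubles the restart timeout on the server side
    # (120 s here) and pushes ping 10 / ping-restart 60 to the clients.
    keepalive 10 60
    ```

    With values like these, a client that drops and comes back inside that window would find the server still holding the old session, which would fit the PUSH message quoted above.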


  • Additional logs. Hopefully this will help someone help me. Disconnects, failed auths with cached credentials that succeed before and after, and bad source address errors all showing up.

    I have log level 4 output to post, but the forum won't let me post it; Akismet is flagging me as a spammer.


  • Try posting the log as a text file.

    sanitized-log.txt