OpenVPN through PFSense just stopped working - Help!



  • Hello everyone,

    I am writing from a medium-sized startup where I administer the networks. I understand the basics of network connectivity, although I am not an IT professional. We have had PFSense installed and running perfectly as our boundary firewall for several years, using OpenVPN to establish remote connections. Unfortunately, over the last weekend something changed somewhere and now almost nobody can connect any more. I say almost nobody because one colleague can - and she's out of the country on a business trip in China! Seeing how China often blocks VPN connections, this irony is driving me nuts.

    Clients give the classic "can't reach the server" output in the logs:

    Tue Jul 23 09:40:47 2019 OpenVPN 2.4.6 x86_64-w64-mingw32 [SSL (OpenSSL)] [LZO] [LZ4] [PKCS11] [AEAD] built on Apr 26 2018
    Tue Jul 23 09:40:47 2019 Windows version 6.2 (Windows 8 or greater) 64bit
    Tue Jul 23 09:40:47 2019 library versions: OpenSSL 1.1.0h  27 Mar 2018, LZO 2.10
    Tue Jul 23 09:40:47 2019 WARNING: --ns-cert-type is DEPRECATED.  Use --remote-cert-tls instead.
    Tue Jul 23 09:40:47 2019 TCP/UDP: Preserving recently used remote address: [AF_INET]xx.yy.zz.abc:pppp
    Tue Jul 23 09:40:47 2019 UDP link local (bound): [AF_INET][undef]:0
    Tue Jul 23 09:40:47 2019 UDP link remote: [AF_INET]xx.yy.zz.abc:pppp
    Tue Jul 23 09:41:48 2019 TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity)
    Tue Jul 23 09:41:48 2019 TLS Error: TLS handshake failed
    Tue Jul 23 09:41:48 2019 SIGUSR1[soft,tls-error] received, process restarting
    

    The server logs for OpenVPN show nothing, meaning that the client did not reach OpenVPN. The server is up, responds to pings, the IP address is correct, and all other external connectivity and functionality of the firewall seems to be fine. The gateway is up, nothing has changed in the firewall rules or NAT for many months. We do updates manually on the server, and all clients (Windows, Mac and Linux) have the same problem.

    Can anyone help me debug this?


  • LAYER 8

    Did you check the firewall logs? Do you have any additional packages that could block incoming connections from external IPs? Did you try "Packet Capture" under "Diagnostics" to see whether any incoming requests arrive?



  • Thank you for the suggestions - I ran the packet capture on the external gateway and see a small number of packets arriving. Running the capture on the OpenVPN interfaces shows nothing is arriving there. I imagine this means it is not our network provider or anything external - the problem is somewhere inside PFSense?

    To check the other packages installed what sort of thing am I looking for? Only Snort, ntopng, and openvpn-client-export are installed and Snort's blocking is disabled.


  • LAYER 8

    If Snort is not blocking, OK.
    Please copy here what you have in the packet capture.
    Do you have anything logged in Status / System Logs / OpenVPN?



  • The logs in Status / System Logs / OpenVPN are interesting. They show that some time between 08:00 and 10:00 yesterday morning something changed and that from that point on only our one colleague can connect over the VPN. She still can.

    This is starting to look like a certificate / Authentication / Handshaking error - I understand that if there is anything misconfigured with the certificates then the OpenVPN server will play dead as a security measure. I have exported new configuration details from our PFSense and now get lots of TLS errors in the server logs:

    Jul 23 11:52:20	openvpn	22773	123.456.789.001:58706 Fatal TLS error (check_tls_errors_co), restarting
    Jul 23 11:52:20	openvpn	22773	123.456.789.001:58706 TLS Error: TLS handshake failed
    Jul 23 11:52:20	openvpn	22773	123.456.789.001:58706 TLS Error: TLS object -> incoming plaintext read error
    Jul 23 11:52:20	openvpn	22773	123.456.789.001:58706 TLS_ERROR: BIO read tls_read_plaintext error
    Jul 23 11:52:20	openvpn	22773	123.456.789.001:58706 OpenSSL: error:14089086:SSL routines:ssl3_get_client_certificate:certificate verify failed
    Jul 23 11:52:20	openvpn	22773	123.456.789.001:58706 VERIFY ERROR: depth=0, error=unsupported certificate purpose: C=DE, ST=County, L=City, O=Company, emailAddress=my.email@company.com, CN=Company VPN Server
    Jul 23 11:52:19	openvpn	22773	TCP connection established with [AF_INET]123.456.789.001:58706
    Jul 23 11:52:14	openvpn	22773	123.456.789.001:63576 Fatal TLS error (check_tls_errors_co), restarting
    Jul 23 11:52:14	openvpn	22773	123.456.789.001:63576 TLS Error: TLS handshake failed
    Jul 23 11:52:14	openvpn	22773	123.456.789.001:63576 TLS Error: TLS object -> incoming plaintext read error
    Jul 23 11:52:14	openvpn	22773	123.456.789.001:63576 TLS_ERROR: BIO read tls_read_plaintext error
    Jul 23 11:52:14	openvpn	22773	123.456.789.001:63576 OpenSSL: error:14089086:SSL routines:ssl3_get_client_certificate:certificate verify failed
    Jul 23 11:52:14	openvpn	22773	123.456.789.001:63576 VERIFY ERROR: depth=0, error=unsupported certificate purpose: C=DE, ST=County, L=City, O=Company, emailAddress=my.email@company.com, CN=Company VPN Server
    Jul 23 11:52:13	openvpn	22773	TCP connection established with [AF_INET]123.456.789.001:63576
    Jul 23 11:52:08	openvpn	22773	123.456.789.001:44894 Fatal TLS error (check_tls_errors_co), restarting
    Jul 23 11:52:08	openvpn	22773	123.456.789.001:44894 TLS Error: TLS handshake failed
    Jul 23 11:52:08	openvpn	22773	123.456.789.001:44894 TLS Error: TLS object -> incoming plaintext read error
    

    I will dig a bit deeper and see what pops out...
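
The server log above repeats the same `VERIFY ERROR` for every attempt, so it can help to collapse the noise into one line per distinct failure. The following is a minimal Python sketch, not part of the original thread; the function name `summarize_verify_errors` and the sample text are assumptions made for illustration. One possible reading of the result is that the rejected certificate carries the CN "Company VPN Server" and is rejected at depth 0 during client certificate verification, which may mean the exported client configuration presents a certificate issued for server use. That interpretation is a guess and should be checked against the certificate manager.

```python
import re
from collections import Counter

# Matches the VERIFY ERROR part of an OpenVPN server log line, e.g.
#   ... VERIFY ERROR: depth=0, error=<reason>: <certificate subject>
VERIFY_RE = re.compile(
    r"VERIFY ERROR: depth=(?P<depth>\d+), error=(?P<reason>[^:]+): (?P<subject>.+)$"
)

def summarize_verify_errors(log_text):
    """Count VERIFY ERROR lines by (depth, reason, certificate CN)."""
    counts = Counter()
    for line in log_text.splitlines():
        m = VERIFY_RE.search(line)
        if not m:
            continue
        cn = re.search(r"CN=([^,]+)", m.group("subject"))
        key = (
            int(m.group("depth")),
            m.group("reason").strip(),
            cn.group(1).strip() if cn else None,
        )
        counts[key] += 1
    return counts

# Two lines copied from the log excerpt in this thread (addresses are the
# poster's placeholders, whitespace normalized).
sample = """\
Jul 23 11:52:20 openvpn 22773 123.456.789.001:58706 VERIFY ERROR: depth=0, error=unsupported certificate purpose: C=DE, ST=County, L=City, O=Company, emailAddress=my.email@company.com, CN=Company VPN Server
Jul 23 11:52:20 openvpn 22773 123.456.789.001:58706 TLS Error: TLS handshake failed
Jul 23 11:52:14 openvpn 22773 123.456.789.001:63576 VERIFY ERROR: depth=0, error=unsupported certificate purpose: C=DE, ST=County, L=City, O=Company, emailAddress=my.email@company.com, CN=Company VPN Server
"""

for (depth, reason, cn), n in summarize_verify_errors(sample).items():
    print(f"{n}x depth={depth} reason={reason!r} CN={cn!r}")
```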


  • Netgate Administrator

    @charry2014 said in OpenVPN through PFSense just stopped working - Help!:

    The server logs for OpenVPN show nothing,

    What changed between then and now, where you are seeing loads of errors in the logs?

    The client certs could have expired. Is the config of the colleague in China much newer?

    Steve


  • LAYER 8 Rebel Alliance

    Check Diagnostics > Backup & Restore > Config History to confirm nothing changed.

    -Rico



  • Thank you for your help everyone. The Backup and Restore / Config History logs show there were no suspicious updates to the server - so everything seems to be going as planned there.

    For better or worse, our company has three OpenVPN servers running. These are mapped through different ports to separate tunnel networks to restrict certain classes of users (usually external contractors) from some of the servers we run. If I try to establish a connection through one of these on port 10694, I see the TLS errors I quoted in the post above. Attempting the same thing on the admin server on port 10691 results in no OpenVPN server activity, but I do see packets in the packet capture diagnostic.

    What I conclude so far is that our external connectivity is fine but something inside our OpenVPN configuration automagically went foobar all by itself.



  • The packet capture from the admin VPN connection attempts (where nothing appears in the OpenVPN logs) is:

    11:13:18.516683 IP 192.168.10.18.33835 > 12.123.213.210.10691: UDP, length 54
    11:13:20.552002 IP 192.168.10.18.33835 > 12.123.213.210.10691: UDP, length 54
    11:13:24.622042 IP 192.168.10.18.33835 > 12.123.213.210.10691: UDP, length 54
    11:13:32.149070 IP 192.168.10.18.33835 > 12.123.213.210.10691: UDP, length 54
    11:13:48.803252 IP 192.168.10.18.33835 > 12.123.213.210.10691: UDP, length 54
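
A capture like the one above can be checked mechanically for two things: whether anything ever travels back from the server port, and how the client spaces its retries. The following Python sketch is an illustration only; `analyze_capture` is a hypothetical helper, and it assumes IPv4 tcpdump-style lines in exactly the format shown in this post.

```python
import re
from datetime import datetime

# tcpdump-style line: "11:13:18.516683 IP src.sport > dst.dport: UDP, length 54"
PKT_RE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d+) IP "
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+): UDP"
)

def analyze_capture(text, server_port):
    """Count packets to and from server_port and the gaps between inbound packets."""
    inbound, outbound = [], []
    for line in text.splitlines():
        m = PKT_RE.match(line.strip())
        if not m:
            continue
        ts = datetime.strptime(m.group("ts"), "%H:%M:%S.%f")
        if int(m.group("dport")) == server_port:
            inbound.append(ts)    # client -> server
        elif int(m.group("sport")) == server_port:
            outbound.append(ts)   # server -> client (a reply)
    gaps = [round((b - a).total_seconds(), 1) for a, b in zip(inbound, inbound[1:])]
    return {"inbound": len(inbound), "replies": len(outbound), "gaps": gaps}

# The five lines from the capture in this post.
capture = """\
11:13:18.516683 IP 192.168.10.18.33835 > 12.123.213.210.10691: UDP, length 54
11:13:20.552002 IP 192.168.10.18.33835 > 12.123.213.210.10691: UDP, length 54
11:13:24.622042 IP 192.168.10.18.33835 > 12.123.213.210.10691: UDP, length 54
11:13:32.149070 IP 192.168.10.18.33835 > 12.123.213.210.10691: UDP, length 54
11:13:48.803252 IP 192.168.10.18.33835 > 12.123.213.210.10691: UDP, length 54
"""

print(analyze_capture(capture, 10691))
```

On this sample the sketch reports five inbound packets, no replies, and gaps that roughly double each time, which is consistent with a client retrying an unanswered handshake rather than with an established session.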
    

  • Netgate Administrator

    The three OpenVPN servers are running on pfSense? If they are external then check packet captures to them.

    If that traffic hits pfSense but seemingly does not reach the OpenVPN server, check the firewall logs for blocked traffic. Check the state table for any port 10691 states. If there are none, something is blocking it.

    Are you sure Snort is not blocking it? Nothing in the Snort blocked hosts table?

    Steve
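
The state table check suggested above can also be done on a saved text dump of the states, for example one copied from Diagnostics > States or from `pfctl -ss` output. The following Python sketch is an assumption-laden illustration: `states_for_port` is a hypothetical helper, the sample lines are invented to resemble a state listing rather than taken from the poster's firewall, and only IPv4 `address:port` tokens are handled.

```python
def states_for_port(state_dump, port):
    """Return the lines of a state-table dump that mention address:port."""
    suffix = f":{port}"
    hits = []
    for line in state_dump.splitlines():
        # Strip parentheses so NAT-translated addresses such as
        # "(10.0.0.5:10691)" are matched as well.
        tokens = [t.strip("()") for t in line.split()]
        if any(t.endswith(suffix) for t in tokens):
            hits.append(line.strip())
    return hits

# Hypothetical sample; the addresses and states are invented for illustration.
sample_states = """\
all udp 12.123.213.210:10694 <- 198.51.100.7:58706       MULTIPLE:MULTIPLE
all tcp 172.16.0.1:443 <- 172.16.0.50:51515       ESTABLISHED:ESTABLISHED
"""

for p in (10694, 10691):
    found = states_for_port(sample_states, p)
    print(p, "->", found if found else "no states (traffic may be blocked before a state is created)")
```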



  • Thank you for your replies. All OpenVPN servers are running in PFSense. I have tried a number of configuration changes and nothing has made a difference. I will schedule a network outage for lunchtime today and try to reboot the whole thing in desperation.

    I conducted a review of how our PFSense is running and there are a few weird things. First, there is the OpenVPN problem described in this thread. Second, the sync to the failover box has been throwing hundreds of errors since around the time the VPN stopped working:

    A communications error occurred while attempting to call XMLRPC method host_firmware_version: Unable to connect to tls://172.16.0.20:8304. Error: Operation timed out	@ 2019-05-24 00:06:39
    

    Finally, one of our subnets is inaccessible even though I can see no changes in any configuration. I am still debugging exactly what the issue is here, but no clients on the subnet are issued an IP address. Again the timing is interesting: the subnet became unavailable some hours after the above two problems occurred, perhaps when the DHCP leases expired.

    All told, I am now completely puzzled about what happened.
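
The XMLRPC sync error quoted in this post is a plain timeout to `172.16.0.20:8304`, so one quick check is whether that TCP port is reachable at all from a host on the same network as the primary firewall. The following Python sketch is a generic reachability test, not a pfSense tool; `tcp_port_open` is a hypothetical helper name, and a successful connect only shows that something is listening, not that the sync itself works.

```python
import socket

def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False

# Example call, using the address and port from the error message above:
#   tcp_port_open("172.16.0.20", 8304)
```

If the call returns False from a host that should be able to reach the failover box, the sync link or the peer itself would be worth looking at alongside the VPN and DHCP problems, since all three started around the same time.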

