Site-to-site dropout every minute



  • I have a two-site setup with pfSense providing internet routing at each site. I've had a site-to-site OpenVPN connection running for about 3 months, enabling a cross-site domain. Lately I've started experiencing dropouts. This looks to be a fairly common problem where the client restarts every minute. Typically it seems to be due to a firewall or config issue, but in my case the VPN tunnel runs without issue outside of business hours. At about 8.30am when people arrive, the tunnel drops out and never recovers until about 5pm, and then never fails again during the evening.

    Here's the log:

    Jan 24 09:26:41 openvpn[33892]: UDPv4 link remote: [AF_INET]203.45.101.120:1194
    Jan 24 09:26:41 openvpn[33892]: UDPv4 link local (bound): [AF_INET]192.168.1.100
    Jan 24 09:26:41 openvpn[33892]: Preserving previous TUN/TAP instance: ovpnc1
    Jan 24 09:26:41 openvpn[33892]: Re-using pre-shared static key
    Jan 24 09:26:41 openvpn[33892]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
    Jan 24 09:26:39 openvpn[33892]: SIGUSR1[soft,ping-restart] received, process restarting
    Jan 24 09:26:39 openvpn[33892]: Inactivity timeout (--ping-restart), restarting
    Jan 24 09:25:39 openvpn[33892]: UDPv4 link remote: [AF_INET]203.45.101.120:1194
    Jan 24 09:25:39 openvpn[33892]: UDPv4 link local (bound): [AF_INET]192.168.1.100
    Jan 24 09:25:39 openvpn[33892]: Preserving previous TUN/TAP instance: ovpnc1
    Jan 24 09:25:39 openvpn[33892]: Re-using pre-shared static key
    Jan 24 09:25:39 openvpn[33892]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
    Jan 24 09:25:37 openvpn[33892]: SIGUSR1[soft,ping-restart] received, process restarting
    Jan 24 09:25:37 openvpn[33892]: Inactivity timeout (--ping-restart), restarting
    Jan 24 09:24:37 openvpn[33892]: UDPv4 link remote: [AF_INET]203.45.101.120:1194
    Jan 24 09:24:37 openvpn[33892]: UDPv4 link local (bound): [AF_INET]192.168.1.100
    Jan 24 09:24:37 openvpn[33892]: Preserving previous TUN/TAP instance: ovpnc1
    Jan 24 09:24:37 openvpn[33892]: Re-using pre-shared static key
    Jan 24 09:24:37 openvpn[33892]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
    Jan 24 09:24:35 openvpn[33892]: SIGUSR1[soft,ping-restart] received, process restarting
    Jan 24 09:24:35 openvpn[33892]: Inactivity timeout (--ping-restart), restarting

    So, as others have reported, the connection restarts at intervals of exactly 1 minute and 2 seconds. The log repeats all day. Some time in the evening the VPN recovers and remains connected until the following day.
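    For anyone wanting to check the interval against their own log, here is a minimal Python sketch. It only looks at the "Inactivity timeout" lines quoted above; the function name `restart_intervals` and the year 2013 are assumptions made for illustration, since the syslog timestamps carry no year.

```python
from datetime import datetime

# Three of the "Inactivity timeout" lines from the log above (newest first).
LOG = """\
Jan 24 09:26:39 openvpn[33892]: Inactivity timeout (--ping-restart), restarting
Jan 24 09:25:37 openvpn[33892]: Inactivity timeout (--ping-restart), restarting
Jan 24 09:24:35 openvpn[33892]: Inactivity timeout (--ping-restart), restarting
"""

def restart_intervals(log_text, year=2013):
    """Return the gaps, in seconds, between consecutive ping-restart events.

    `year` is an assumption: the syslog-style timestamps do not include one.
    """
    stamps = []
    for raw in log_text.splitlines():
        line = raw.strip()
        if "ping-restart), restarting" in line:
            # The first 15 characters are the timestamp, e.g. "Jan 24 09:26:39".
            stamps.append(
                datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
            )
    stamps.sort()  # the pfSense log view lists the newest entries first
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]

print(restart_intervals(LOG))  # [62, 62]
```

    In the log itself the link comes back 2 seconds after each restart and the next timeout fires 60 seconds after that, which would be consistent with a 60-second --ping-restart timeout plus a 2-second restart. The configured value isn't shown in this thread, so treat that as a reading of the log rather than a confirmed setting.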

    We have only a single ADSL2 connection at each site and admittedly it is becoming a bottleneck with the increasing activity. However, we don't experience internet dropouts. For instance, from the server site I can remotely connect to the client site through LogMeIn and watch the client's pfSense GUI to check the log files. Being able to do this shows that internet connectivity at both the server and the client is sound. I also have no other issues at either site with our web & ftp servers and other applications.

    So my questions are: why does the VPN drop out? And why would it be exactly 62 seconds, always? If it is due to a timeout, can I increase it? If there is packet loss, can I adjust the retries? Would it help to run OpenVPN separately from pfSense? Or do I need a dedicated internet connection just for the VPN? Is this even possible?

    Thanks for your help.



  • It does look like the server end is not responding. I guess the server end IP address is static 24 hours/day? Does the server end pfSense have anything in Firewall:Schedules that is implemented during business hours? Anything that might decide to block incoming traffic on port 1194 at those times?



  • Hi, both sites have static IPs. Today I was able to get partial access for a few minutes. Looking at the logs, the dropouts were every minute, but in some cases the VPN connection remained up for about 3 or 4 minutes max. The time is now 4.45pm and the tunnel has been up for the last 10 minutes without a dropout.

    The issue is clearly linked to internet activity. As people start to leave, the connection becomes more reliable. However, I've had a separate LogMeIn connection running between the sites, sharing the same internet connection, and that connection hasn't dropped out all day. It seems OpenVPN is far more sensitive to timeouts etc.



  • I also have slow links (256kbps - remote areas in Nepal) and am running pfSense OpenVPN site-to-site. I have sometimes noticed that the server end shows up (green on the OpenVPN Status widget) but the client end shows down (red). That state should only last for a minute or 2, until the server times out waiting for a response from the client and resets itself, but it seems to be able to get "stuck" - it can endure for a long time (i.e. I gave up waiting and used Services:Status to restart the server and client ends). I haven't seen it recently though (I run 2.1-BETA1 built on Sat Jan 19 20:44:40 EST 2013 at the moment) and there was an update of the underlying OpenVPN software not long ago, in the BETA version.
    I also increased the timeout, in the Advanced box:

    keepalive 10 180
    

    But I can't confirm that this made any useful difference. (Hmmm - I should remove those and see if anything cares)
    I wonder if you see one end of the link shown as up, and the other end down?
    Sorry to not be much direct help :(
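    As a reference for the `keepalive 10 180` line above: per the OpenVPN manual, `keepalive` is a helper directive that expands into `ping` and `ping-restart`. The sketch below just spells out that expansion in Python; the function name `expand_keepalive` is made up for illustration and is not part of OpenVPN or pfSense.

```python
def expand_keepalive(interval, timeout, server_mode=False):
    """Expand 'keepalive <interval> <timeout>' into the directives it implies.

    Follows the expansion described in the OpenVPN manual. With 'mode server'
    the local restart timeout is doubled and the original values are pushed
    to clients; otherwise the two values map directly onto ping/ping-restart.
    """
    if server_mode:
        return [
            f"ping {interval}",
            f"ping-restart {timeout * 2}",
            f'push "ping {interval}"',
            f'push "ping-restart {timeout}"',
        ]
    return [f"ping {interval}", f"ping-restart {timeout}"]

# A pre-shared static key tunnel, as in the log above, is not 'mode server'.
for directive in expand_keepalive(10, 180):
    print(directive)
```

    So with `keepalive 10 180` on a shared-key tunnel, each end sends a ping every 10 seconds when otherwise idle and only restarts after 180 seconds without receiving anything from the peer. Whether a longer timeout actually helps on a saturated link is not something this thread establishes.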



  • Hi Phil,
    Thanks for the info. Indeed, the server end never goes down. There's not a single error in the log. The only entry is when the client re-establishes a connection:

    openvpn[20305]: Peer Connection Initiated with [AF_INET]99.99.99.99:1233

    which is every minute or so.

    And like clockwork, from about 4.45pm yesterday to about 8.15am there were no dropouts, and now since 8.15am, it is dropping out again every minute or so….. :(



  • OK, might have found the problem. Someone was running uTorrent and seeding some files. This was maxing out the upload on our ADSL connection. The whole net connection is now more responsive since I removed the app. The person of course denies all knowledge - "wasn't me, don't know nothing about it….."



  • You can tell him that if he tries something similar again, he may find himself looking for a new job.



  • Well, again the site connection has dropped out every minute, all day. I've checked the squid logs and also bandwidthD and there's nothing to suggest excessive traffic on either end. However, the dropouts continue. So I established a private VPN so I could look at the two pfSense appliances side-by-side. The server end never reports a drop-out; however, the reconnection from the client end (every minute) is noted in the OpenVPN server logs. On the client side there is a repetition of the "Inactivity timeout (--ping-restart), restarting" log entry.

    So I tried something. I disabled the client VPN, then disabled the server VPN, waited a minute, re-enabled the server VPN, then re-enabled the client VPN. And now the connection has worked for the past hour without a drop-out.

    So it seems to me the problem is not necessarily due to constant drop-outs, but instead that once an issue occurs, the reconnect doesn't work properly and the client side attempts unsuccessfully to reconnect every minute, but without a full reset, the connection might not be made again all day.

    Frustrating….  :(

