Stale WG session?
-
@cmcdonald I've tried to do some additional troubleshooting at times when the WG session has gone stale. When this happens, the Android client shows repeated log messages stating that the "Handshake did not complete after 5 seconds, retrying". If I do nothing, the handshake typically completes eventually, after about 5 minutes.
In my case, the issue only seems to occur (at least I've only noticed it) when the phone is connected to my IOT WiFi network that is behind the firewall. When looking at the state of the WG port at the time the handshake issue occurs, I see the following:
IOT udp <WAN_IP>:51420 -> <LAN_IP>:39844 MULTIPLE:SINGLE 33 / 270 4 KiB / 34 KiB
If I kill this state, the next handshake will succeed and the state then changes to:
IOT udp <WAN_IP>:51420 -> <LAN_IP>:39844 MULTIPLE:MULTIPLE 246 / 212 47 KiB / 46 KiB
I'm not sure if any of this helps shed any light on the issue and I'm no expert, but I wonder if there is perhaps an underlying issue in NAT reflection for the WAN address?
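For anyone who wants to watch for this condition, here is a minimal Python sketch that parses a state line in the format pasted above and flags any entry that is not MULTIPLE on both sides. The regular expression and the names `PfState`, `parse_state_line` and `looks_stale` are my own assumptions based only on the two lines shown here; other pfSense versions may print the state table differently.

```python
import re
from typing import NamedTuple, Optional


class PfState(NamedTuple):
    interface: str
    protocol: str
    source: str
    destination: str
    state: str


# Hypothetical pattern derived from the two state lines quoted in this post.
STATE_LINE = re.compile(
    r"^(?P<interface>\S+)\s+(?P<protocol>\S+)\s+(?P<source>\S+)\s+->\s+"
    r"(?P<destination>\S+)\s+(?P<state>[A-Z_]+:[A-Z_]+)\b"
)


def parse_state_line(line: str) -> Optional[PfState]:
    """Return the parsed state entry, or None if the line does not match."""
    match = STATE_LINE.match(line.strip())
    if match is None:
        return None
    return PfState(**match.groupdict())


def looks_stale(state: PfState) -> bool:
    """Flag any entry that is not MULTIPLE:MULTIPLE, as seen during the failures."""
    return state.state != "MULTIPLE:MULTIPLE"
```

Feeding it the first line above would flag the entry, while the second line (after killing the state) would not be flagged.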
-
@hvbakel I switched to using a split DNS setup with a host override for the dynamic DNS name to point to the internal firewall address rather than the WAN address. Cautiously optimistic that this may have resolved the handshake issues I was seeing when connected to the internal network, as I've not encountered any since switching. I will keep monitoring.
-
@hvbakel I cheered too soon, I'm afraid: the split-DNS setup does not solve the periodic handshake failures and stale sessions either. The issue also persists after upgrading to the recently released 2.6/22.01 version of pfSense.
-
I believe I finally got to the root of the issue on my end. As background, my goal was to have my phone remain connected to the home network through WireGuard when leaving the house. The official iPhone WireGuard client has the option to conditionally connect on network changes, but this feature is not available in the Android client. While it is possible to use e.g. Tasker to control WireGuard tunnels, this is more convoluted and requires location access to read out the WiFi SSID. Therefore I was looking for a way to have an always-on WG connection whether the phone is connected to the home IOT WiFi or to an external network. The WAN interface has a hostname registered through Dynamic DNS, and my initial attempt was to use NAT reflection to maintain the WG connection when switching between external networks and the internal IOT WiFi. Unfortunately, it seems that NAT reflection for WG is rather unstable and will periodically lose the ability to complete handshakes when on the internal WiFi network, requiring a manual off/on toggle of the WG connection to get things working again.
The alternative solution I tried next was to turn off NAT reflection altogether and use split DNS instead. While this works in the sense that it maintains a stable connection on the internal network without handshaking issues, it leads to a new problem because once the WG connection has been established it expects the host IP address to remain the same. Therefore, the connection is lost when moving between internal/external networks because the split DNS will change the IP address.
My solution to this issue was to switch from the official WG Android client to VPN Client Pro. This VPN client has two options to force a WG reconnection and re-resolution of the host DNS name when switching networks and before re-establishing handshakes. This, in combination with the split DNS solution, has finally resulted in stable WG connections on internal and external networks. VPN Client Pro also has extensive options to conditionally activate the WG tunnel when connected to or disconnected from certain networks, though this again requires location access and location to be always on. One downside of VPN Client Pro is that it requires a subscription, but in my view it's a small cost compared to the benefits it brings.
In summary, at least in my case, the stale connection issues were limited to connections on the home network and related to instability of NAT reflection with WG tunnels. It does not seem to be an issue with WG itself. A split DNS setup with re-resolving of hostnames allows for seamless transitions between networks with an always-on VPN connection inside or outside the firewall.
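To illustrate the re-resolution idea, here is a rough Python sketch of the check a client could run after a network change: resolve the endpoint hostname again and reconnect only if the address the tunnel is currently using is no longer among the results. This is not how VPN Client Pro is implemented (I don't know its internals); the function names, the hostname `vpn.example.org` and the addresses in the usage note are placeholders I made up.

```python
import socket
from typing import Callable, Set


def resolve_ipv4(hostname: str) -> Set[str]:
    """Resolve a hostname to its current set of IPv4 addresses."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_DGRAM
    )
    return {info[4][0] for info in infos}


def needs_reconnect(
    hostname: str,
    connected_ip: str,
    resolve: Callable[[str], Set[str]] = resolve_ipv4,
) -> bool:
    """Return True if the endpoint hostname no longer resolves to connected_ip.

    With split DNS the same name resolves to the internal firewall address
    on the home network and to the WAN address elsewhere, so a change of
    network should show up here as a changed result.
    """
    try:
        current = resolve(hostname)
    except OSError:
        # Resolution failed (e.g. no network yet); keep the existing endpoint.
        return False
    return connected_ip not in current
```

For example, if the tunnel was established against a WAN address and `needs_reconnect("vpn.example.org", "203.0.113.10")` is then called on the home WiFi, the host override would make it return True, which is the cue to tear down and re-establish the tunnel.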
-
@hvbakel I think you hit the spot! Thanks.
-
After @hvbakel's analysis, I wonder how to apply it on iPhone...
-
I was experiencing this issue on my iPhone as well. Setting a keep-alive interval of 25 seconds in the peer configuration in pfSense did the trick. It's been perfect for the past 48 hours. No more stale sessions, no more toggling the WireGuard connection on and off.
Hope this helps someone.
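For reference, in a plain WireGuard config file the equivalent setting is the `PersistentKeepalive` key in the `[Peer]` section. I'm assuming the keep-alive field in the pfSense peer settings maps to that key. Below is a small Python sketch that renders such a peer section; `render_peer` and the placeholder key and address are hypothetical.

```python
def render_peer(public_key: str, allowed_ips: str, keepalive_seconds: int = 25) -> str:
    """Render a WireGuard [Peer] section as text.

    A keepalive of 0 leaves the setting out, which means no keepalive
    packets are sent.
    """
    if not 0 <= keepalive_seconds <= 65535:
        raise ValueError("keepalive must be between 0 and 65535 seconds")
    lines = [
        "[Peer]",
        f"PublicKey = {public_key}",
        f"AllowedIPs = {allowed_ips}",
    ]
    if keepalive_seconds > 0:
        lines.append(f"PersistentKeepalive = {keepalive_seconds}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder values; substitute the real peer key and tunnel address.
    print(render_peer("<PEER_PUBLIC_KEY>", "10.0.0.2/32"))
```

The 25-second default here is just the value that worked for me; a different interval may suit other networks better.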
-
@johnnytheguy much appreciated!
-
I see no difference
-
@chudak Don't know what to tell you. For me, it's been fixed for three days now. Maybe 25 seconds is the wrong value for your network? Also, sometimes different problems can have the same symptoms.