25.07.r.20250709.2036 First Boot WireGuard Service not running
-
@stephenw10 I just upgraded to 25.07.r.20250715.1733, and it happened again.
The problematic tunnel is the same one, too.
Another reboot fixes it.
Side note: I had uninstalled Nexus before, but it was re-added when upgrading to 25.07.r.20250715.1733.
-
Yes, Nexus is a default package in Plus, it should always be installed.
-
And it happened again. Tomorrow I will recreate this WG-interface just to make sure.
<opt1>
  <descr><![CDATA[VPNcWgNtcpDirect]]></descr>
  <if>tun_wg7</if>
  <enable></enable>
  <spoofmac></spoofmac>
  <mtu>1420</mtu>
  <mss>1420</mss>
  <ipaddr>10.3.9.26</ipaddr>
  <subnet>29</subnet>
  <gateway>VPNcWgNtcpDirectGW</gateway>
</opt1>
-
I recreated the interface and also moved it away from being opt1. Today I wanted to try the if_pppoe kernel module. After the mandatory reboot, not only was the WireGuard Service down again, also none of the tunnels were up... I switched back to the old module after having no success with WireGuard after another reboot. But even then it took two further reboots to get WireGuard working again. Right now it works with the new module according to the web UI, but I am really concerned about what will happen at the next reboot.
-
Hmm, so the service was down and none of the tunnels were up, even after rebooting several times?
Nothing logged at boot or in the system log? No errors shown?
-
@stephenw10 Nothing at boot and nothing that jumps out at me, but I am not well versed in the logs in general.
Some stuff:
Jul 20 17:30:38 vnstatd 44706 Interface "hn2.110" disabled.
Jul 20 17:30:38 vnstatd 44706 Interface "hn2.111" disabled.
Jul 20 17:30:38 vnstatd 44706 Interface "hn2.185" disabled.
Jul 20 17:30:38 vnstatd 44706 Interface "hn2.35" disabled.
Jul 20 17:30:38 vnstatd 44706 Interface "tun_wg7" disabled.
These interfaces don't exist anymore, still they are in the logs, why.
Other stuff I picked...
Jul 20 17:30:38 vnstatd 50564 Error: pidfile "/var/run/vnstat/vnstat.pid" lock failed (Resource temporarily unavailable), exiting.
Jul 20 17:30:27 kernel wg5: changing name to 'tun_wg0'
Jul 20 17:30:27 kernel tun_wg6: link state changed to UP
Jul 20 17:30:27 kernel wg4: changing name to 'tun_wg6'
Jul 20 17:30:27 kernel tun_wg5: link state changed to UP
Jul 20 17:29:48 php-fpm 38554 /diag_reboot.php: The command '/usr/local/etc/rc.d/wireguardd stop' returned exit code '1', the output was 'umount: /var/unbound/dev: not a file system root directory'
Interestingly, the same problem occurred at the next reboot. While I was going through the logs, I restarted WG in Service Status and it came up, so I am happy about that.
-
@Bob-Dig said in 25.07.r.20250709.2036 First Boot WireGuard Service not running:
not only was the WireGuard Service down again, also none of the tunnels were up...
Correction: None of the gateways corresponding to the tunnels were up. Before, only one gateway wasn't up, now no gateway was up. Have to check with the tunnels next time, if they are partially up or not.
-
@stephenw10 The problem is persistent. On every boot the WireGuard service is disabled and all corresponding gateways are disabled too.
All the WireGuard tunnels are up. If I enable the gateways by hand and then restart WireGuard, it is running fine. At least this is a solution that works.
-
@Bob-Dig said in 25.07.r.20250709.2036 First Boot WireGuard Service not running:
These interfaces don't exist anymore, still they are in the logs, why.
They probably still exist in the configuration file for one of the traffic monitoring packages, traffic totals maybe? Resaving that with existing interfaces should remove those lines but I doubt they are causing this.
That error stopping wireguard looks to have come from the reboot script. I assume that was after you manually rebooted but before the actual reboot?
@Bob-Dig said in 25.07.r.20250709.2036 First Boot WireGuard Service not running:
If I enable the gateways by hand and then restart WireGuard, it is running fine. At least this is a solution that works.
The WireGuard tunnel gateways? Or the WAN gateways?
I wouldn't expect the WG gateways to be available if the wireguard service is stopped. Conversely I expect them to become available when it starts and I assume that isn't happening if you have to manually start them.
-
@stephenw10 said in 25.07.r.20250709.2036 First Boot WireGuard Service not running:
I wouldn't expect the WG gateways to be available if the wireguard service is stopped. Conversely I expect them to become available when it starts and I assume that isn't happening if you have to manually start them.
Yep. Although before, on stable (and without recreating the first tunnel), it was sometimes the case that the Service was stopped but almost all tunnels and their gateways were up.
-
@stephenw10 said in 25.07.r.20250709.2036 First Boot WireGuard Service not running:
That error stopping wireguard looks to have come from the reboot script. I assume that was after you manually rebooted but before the actual reboot?
Probably, because when I "halted" the system and later rebooted, it wasn't there at all.
-
@stephenw10 said in 25.07.r.20250709.2036 First Boot WireGuard Service not running:
No errors shown?
Ok, I used the Log Filter function for the first time in my life.
Jul 21 09:37:46 vnstatd 49296 Error: pidfile "/var/run/vnstat/vnstat.pid" lock failed (Resource temporarily unavailable), exiting.
Jul 21 09:37:26 php-cgi 709 rc.bootup: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1753083446] unbound[58124:0] error: bind: address already in use [1753083446] unbound[58124:0] fatal error: could not open ports'
Jul 21 09:37:16 kernel module_register_init: MOD_LOAD (iwi_monitor_fw, 0xffffffff8077ff30, 0) error 1
Jul 21 09:37:16 kernel module_register_init: MOD_LOAD (iwi_ibss_fw, 0xffffffff8077fe80, 0) error 1
Jul 21 09:37:16 kernel module_register_init: MOD_LOAD (iwi_bss_fw, 0xffffffff8077fdd0, 0) error 1
Jul 21 09:37:16 kernel module_register_init: MOD_LOAD (ipw_monitor_fw, 0xffffffff80760760, 0) error 1
Jul 21 09:37:16 kernel module_register_init: MOD_LOAD (ipw_ibss_fw, 0xffffffff807606b0, 0) error 1
Jul 21 09:37:16 kernel module_register_init: MOD_LOAD (ipw_bss_fw, 0xffffffff80760600, 0) error 1
Jul 21 09:34:32 nginx 2025/07/21 09:34:32 [error] 73048#100275: send() failed (54: Connection reset by peer) while logging to syslog, server: unix:/var/run/log

Jul 21 09:53:20 php_wg 12725 /usr/local/pkg/wireguard/includes/wg_service.inc: The command '/usr/local/bin/dpinger -S -r 0 -i VPNcWgNtcpGW -B 10.3.9.26 -p /var/run/dpinger_VPNcWgNtcpGW~10.3.9.26~10.3.9.25.pid -u /var/run/dpinger_VPNcWgNtcpGW~10.3.9.26~10.3.9.25.sock -C "/etc/rc.gateway_alarm" -d 1 -s 500 -l 2000 -t 60000 -A 1000 -D 500 -L 20 10.3.9.25 >/dev/null' returned exit code '1', the output was ''
Jul 21 09:53:20 php_wg 12725 /usr/local/pkg/wireguard/includes/wg_service.inc: The command '/usr/local/bin/dpinger -S -r 0 -i VPNcOciOPNGW -B 10.3.9.13 -p /var/run/dpinger_VPNcOciOPNGW~10.3.9.13~10.3.9.14.pid -u /var/run/dpinger_VPNcOciOPNGW~10.3.9.13~10.3.9.14.sock -C "/etc/rc.gateway_alarm" -d 1 -s 500 -l 2000 -t 60000 -A 1000 -D 500 -L 20 10.3.9.14 >/dev/null' returned exit code '1', the output was ''
Jul 21 09:53:20 php_wg 12725 /usr/local/pkg/wireguard/includes/wg_service.inc: The command '/usr/local/bin/dpinger -S -r 0 -i VPNcOciPFGW -B 10.3.9.9 -p /var/run/dpinger_VPNcOciPFGW~10.3.9.9~10.3.9.10.pid -u /var/run/dpinger_VPNcOciPFGW~10.3.9.9~10.3.9.10.sock -C "/etc/rc.gateway_alarm" -d 1 -s 500 -l 2000 -t 60000 -A 1000 -D 500 -L 20 10.3.9.10 >/dev/null' returned exit code '1', the output was ''
Jul 21 09:48:24 php-cgi 709 rc.bootup: The command '/usr/local/sbin/strongswanrc stop' returned exit code '1', the output was 'strongswan not running? (check /var/run/daemon-charon.pid).'
Jul 21 09:48:17 php-fpm 589 /rc.newwanip: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1753084097] unbound[46889:0] error: bind: address already in use [1753084097] unbound[46889:0] fatal error: could not open ports'
-
Hmm, most of those errors are common and harmless.
But those dpinger failures are interesting. Do those IPs exist in the routing table?
When you say you have to enable the gateways by hand what exactly are you doing?
-
@stephenw10 said in 25.07.r.20250709.2036 First Boot WireGuard Service not running:
But those dpinger failures are interesting. Do those IPs exist in the routing table?
Not exactly sure what you mean.
10.3.9.0/29     link#16  U    1   1420   tun_wg3
10.3.9.1        link#2   UHS  6   16384  lo0
10.3.9.8/30     link#17  U    7   1420   tun_wg4
10.3.9.9        link#2   UHS  17  16384  lo0
10.3.9.12/30    link#18  U    23  1420   tun_wg5
10.3.9.13       link#2   UHS  27  16384  lo0
10.3.9.24/29    link#20  U    31  1420   tun_wg0
10.3.9.26       link#2   UHS  55  16384  lo0
10.3.12.0/24    link#19  U    60  1280   tun_wg6
10.3.12.1       link#2   UHS  41  16384  lo0
10.3.13.0/24    link#27  U    5   1500   hn0.2
10.3.13.1       link#2   UHS  26  16384  lo0
10.3.178.0/24   link#8   U    37  1500   hn3
10.3.178.2      link#2   UHS  57  16384  lo0
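One way to answer the routing-table question is to check whether each dpinger monitor IP falls inside one of the connected tunnel subnets. This is only a minimal Python sketch: the subnets are copied from the output above, the monitor IPs from the failing dpinger commands in the log, and the helper name `find_route` is made up for illustration.

```python
import ipaddress

# Connected tunnel subnets copied from the routing table output in this thread.
ROUTES = {
    "10.3.9.0/29": "tun_wg3",
    "10.3.9.8/30": "tun_wg4",
    "10.3.9.12/30": "tun_wg5",
    "10.3.9.24/29": "tun_wg0",
    "10.3.12.0/24": "tun_wg6",
}

# Monitor IPs taken from the three failing dpinger commands in the log.
MONITOR_IPS = ["10.3.9.25", "10.3.9.14", "10.3.9.10"]


def find_route(ip, routes):
    """Return (subnet, interface) of the most specific matching route, or None."""
    addr = ipaddress.ip_address(ip)
    best = None
    for cidr, iface in routes.items():
        net = ipaddress.ip_network(cidr)
        # Keep the match with the longest prefix, like a routing lookup would.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, iface)
    return None if best is None else (str(best[0]), best[1])


for ip in MONITOR_IPS:
    print(ip, "->", find_route(ip, ROUTES))
```

With the table as posted, all three monitor IPs land in a tunnel subnet, so a missing route at that moment would not explain the dpinger failures. It says nothing about the state of the table at the instant dpinger was started.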
@stephenw10 said in 25.07.r.20250709.2036 First Boot WireGuard Service not running:
When you say you have to enable the gateways by hand what exactly are you doing?
Mostly this (clicking it):
But this is only an example, because every WireGuard instance running through OpenWRT VMs is doing fine; only WireGuard running on pfSense itself causes problems. Right now, when I went there, I saw that one gateway was down again and WG was stopped. This is new behavior to me. Interestingly, it was again the tunnel I had recreated (the last Gateway in the picture).
The other end is a WireGuard installation done by me inside of Debian (Proxmox VE). I disabled and re-enabled the gateway (instead of just enabling it, because it was already enabled by me after the boot, where it was disabled like all the others) and I restarted WireGuard, up again and running. This is odd and problematic. "Fixing" it after a reboot is one thing but having it going down while in use is another...
-
@stephenw10 Side note: I noticed that Lawrence Systems does it differently, making the gateway address the same as the interface address, but I don't think this is mandatory, and the Configuration Recipe doesn't show that either.
-
Yup, I meant that if the tunnel subnets did not exist, dpinger would be expected to fail. They are there in that output but is that after rebooting when WG fails to start?
It feels like it could be a timing/ordering issue where dpinger tries to start before the WG config has been applied and the routes have been created. But if so, that should be evident in the log order.
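To check the log order without reading it by eye, something like the following Python sketch could compare timestamps. It rests on several assumptions: the kernel "changing name to 'tun_wgN'" messages are used as a stand-in for "the WG interfaces exist", the year is not in the log and is passed in, month names are parsed with an English locale, and the function names are hypothetical. The two sample lines are shortened from entries posted in this thread (the `...` marks the shortening) and come from different boots, so the result only demonstrates the method.

```python
import re
from datetime import datetime

# Leading "Mon DD HH:MM:SS" timestamp followed by the rest of the message.
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(.*)$")
RENAME_RE = re.compile(r"wg\d+: changing name to 'tun_wg\d+'")


def parse(line, year=2025):
    """Split a syslog line into (timestamp, message); the year is an assumption."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    ts = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
    return ts, m.group(2)


def dpinger_before_wg(lines, year=2025):
    """True if a dpinger failure is logged before the last WG interface rename.

    Returns None when the excerpt lacks either kind of event.
    """
    events = [e for e in (parse(l, year) for l in lines) if e]
    failures = [ts for ts, msg in events
                if "dpinger" in msg and "exit code '1'" in msg]
    renames = [ts for ts, msg in events if RENAME_RE.search(msg)]
    if not failures or not renames:
        return None
    return min(failures) < max(renames)


# Shortened sample lines from this thread, taken from different boots.
sample = [
    "Jul 21 09:53:20 php_wg 12725 /usr/local/pkg/wireguard/includes/wg_service.inc: "
    "The command '/usr/local/bin/dpinger ...' returned exit code '1', the output was ''",
    "Jul 20 17:30:27 kernel wg5: changing name to 'tun_wg0'",
]
print(dpinger_before_wg(sample))
```

Timestamps only have one-second resolution, so events logged within the same second cannot be ordered this way; in that case the original line order in the raw log file is the better guide.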
-
@stephenw10 said in 25.07.r.20250709.2036 First Boot WireGuard Service not running:
They are there in that output but is that after rebooting when WG fails to start?
Here it is right after a reboot, with WG down again.
10.3.9.0/29     link#16  U    7   1420   tun_wg3
10.3.9.1        link#2   UHS  8   16384  lo0
10.3.9.8/30     link#17  U    30  1420   tun_wg4
10.3.9.9        link#2   UHS  34  16384  lo0
10.3.9.12/30    link#18  U    53  1420   tun_wg5
10.3.9.13       link#2   UHS  54  16384  lo0
10.3.9.24/29    link#20  U    58  1420   tun_wg0
10.3.9.26       link#2   UHS  60  16384  lo0
10.3.12.0/24    link#19  U    55  1280   tun_wg6
10.3.12.1       link#2   UHS  56  16384  lo0
10.3.13.0/24    link#27  U    20  1500   hn0.2
10.3.13.1       link#2   UHS  25  16384  lo0
10.3.178.0/24   link#8   U    9   1500   hn3
10.3.178.2      link#2   UHS  10  16384  lo0
With every reboot I will get a new IPv6 and often IPv4 thanks to PPPoE and my ISP...
-
Hmm, so can you see in the logs that dpinger is failing to start during that reboot? And can you see if that's before WG tries to start?
-
@stephenw10 Hope this helps...