Netgate Discussion Forum
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Search
    • Register
    • Login

    "tun_wg0: loop detected" following WireGuard gateway 100% loss and WAN interface rc.newwanip

    Scheduled Pinned Locked Moved General pfSense Questions
    29 Posts 3 Posters 4.1k Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • M
      Mixka @stephenw10
      last edited by Mixka

      @stephenw10

      Thanks. Do not remember to touch on that settings. I did change "Log Rotation Size" recently. I had same as this one https://forum.netgate.com/topic/169923/tons-sshguard-log-entries-and-its-not-enabled

      That anyway did not fix/change my original problem.

      1 Reply Last reply Reply Quote 0
      • stephenw10S
        stephenw10 Netgate Administrator
        last edited by

        Yup, that wouldn't be related to the loop issue, just confusing the logs.

        This entry implies there is no longer a subnet on igb0 for the gateway yto be added in:

        Aug 4 09:42:43 	kernel 		arpresolve: can't allocate llinfo for *SNIP* on igb0
        

        I assume igb0 is the WAN?

        If you disable 'State Killing on Gateway Failure' does that stop the loop errors?

        Steve

        M 2 Replies Last reply Reply Quote 0
        • M
          Mixka @stephenw10
          last edited by

          @stephenw10

          Actually those "arpresolve" messages have been non-existing for couple of days. Trying now anyway to disable "State Killing on Gateway Failure" option. Let see how it works. Thanks!

          1 Reply Last reply Reply Quote 0
          • M
            Mixka @stephenw10
            last edited by Mixka

            @stephenw10

            Loop messages still continues...

            Log (removed some lines and code tags, because it was getting marked as spam):


            Aug 8 02:47:23 check_reload_status 487 Reloading filter
            Aug 8 02:47:23 check_reload_status 487 Restarting OpenVPN tunnels/interfaces
            Aug 8 02:47:23 check_reload_status 487 Restarting IPsec tunnels
            Aug 8 02:47:23 check_reload_status 487 updating dyndns WG_VPN_GW
            Aug 8 02:47:23 rc.gateway_alarm 25152 >>> Gateway alarm: WG_VPN_GW (Addr:1.1.1.1 Alarm:0 RTT:1.316ms RTTsd:.151ms Loss:12%)
            Aug 8 02:45:36 kernel tun_wg0: loop detected
            Aug 8 02:45:36 kernel tun_wg0: loop detected
            Aug 8 02:45:34 check_reload_status 487 Reloading filter
            Aug 8 02:45:34 check_reload_status 487 Restarting OpenVPN tunnels/interfaces
            Aug 8 02:45:34 check_reload_status 487 Restarting IPsec tunnels
            Aug 8 02:45:34 check_reload_status 487 updating dyndns WG_VPN_GW
            Aug 8 02:45:34 rc.gateway_alarm 68371 >>> Gateway alarm: WG_VPN_GW (Addr:1.1.1.1 Alarm:1 RTT:1.597ms RTTsd:1.754ms Loss:21%)
            Aug 8 02:45:34 kernel tun_wg0: loop detected
            Aug 8 02:45:34 kernel tun_wg0: loop detected
            Aug 8 02:45:12 check_reload_status 487 Reloading filter
            Aug 8 02:45:12 php-fpm 448 /rc.newwanip: rc.newwanip: on (IP address: SNIP) (interface: WAN1[wan]) (real interface: igb0).
            Aug 8 02:45:12 php-fpm 448 /rc.newwanip: rc.newwanip: Info: starting on igb0.
            Aug 8 02:45:12 kernel tun_wg0: loop detected
            Aug 8 02:45:11 check_reload_status 487 rc.newwanip starting igb0
            Aug 8 02:45:11 kernel tun_wg0: loop detected
            Aug 8 02:45:11 kernel tun_wg0: loop detected
            Aug 8 02:37:26 check_reload_status 487 Reloading filter

            J 1 Reply Last reply Reply Quote 0
            • J
              Jarhead @Mixka
              last edited by

              @mixka You're static route is using your WAN DHCP gateway, should be the WG gateway.

              M 1 Reply Last reply Reply Quote 0
              • M
                Mixka @Jarhead
                last edited by

                @jarhead

                No. WG gateway is default gateway and it's need route to out via WAN gateway.

                J 1 Reply Last reply Reply Quote 0
                • J
                  Jarhead @Mixka
                  last edited by

                  @mixka Right, it will. But the peer is on the other end of the tunnel, correct? It should use the WG gateway, then you have the WG gateway set as the upstream gateway in the interface or maybe I'm missing something. You have a lot redacted that doesn't need to be.

                  M 1 Reply Last reply Reply Quote 0
                  • M
                    Mixka @Jarhead
                    last edited by Mixka

                    @jarhead

                    I'm new with WG (this abbreviation always got me, because WG is WatchGuard for me 😄) so this setup might be wrong in many places. I applied this guide for my use https://youtu.be/wYe7FzZ_0X8 (not using Mullvad).

                    What might be right way to use WG gateway as default gateway? Peer endpoint is in public domain and interface+gateway private.

                    J 1 Reply Last reply Reply Quote 0
                    • stephenw10S
                      stephenw10 Netgate Administrator
                      last edited by

                      Yes, the static route for subnets at the remote end of the tunnel must be the Wireguard gateway.

                      The public peer end point IP would not normally need a static route. The only reason you might need that is if you change the system default route to be the Wireguard gateway. I would not recommend doing that though.

                      1 Reply Last reply Reply Quote 0
                      • J
                        Jarhead @Mixka
                        last edited by

                        @mixka Use this:
                        https://youtu.be/2oe7rTMFmqc

                        M 1 Reply Last reply Reply Quote 1
                        • M
                          Mixka @Jarhead
                          last edited by

                          @jarhead

                          Golden video as Christians videos always are. But I can't see relations to my setup there (other than that WG is used to make tunnel). Video anyway explains routes and NAT's nicely.

                          @stephenw10

                          So, setup is somewhat right, but not recommended. I see, but goal is to use WG gateway as default route, so recommended or not this is the way.

                          Anything else to test out?

                          I also found when started to look other logs, that ntpd have log lines just after first "loop detected" message is timestamped. Is it root cause or just following system melt down?

                          Log:

                          Aug 8 20:16:14 	check_reload_status 	487 	Reloading filter 
                          Aug 8 20:14:26 	kernel 		tun_wg0: loop detected 
                          Aug 8 20:14:05 	ntpd 	1691 	Listen normally on 25 igb0 80.221.147.161:123
                          Aug 8 20:14:02 	ntpd 	1691 	Deleting interface #24 igb0, 80.221.147.161#123, interface stats: received=0, sent=0, dropped=0, active_time=14804 secs
                          Aug 8 20:14:01 	kernel 		tun_wg0: loop detected 
                          
                          Aug 8 16:09:28 	check_reload_status 	487 	Reloading filter
                          Aug 8 16:07:42 	kernel 		tun_wg0: loop detected 
                          Aug 8 16:07:18 	ntpd 	1691 	Listen normally on 24 igb0 80.221.147.161:123
                          Aug 8 16:07:16 	ntpd 	1691 	Deleting interface #23 igb0, 80.221.147.161#123, interface stats: received=0, sent=0, dropped=0, active_time=14756 secs
                          Aug 8 16:07:15 	kernel 		tun_wg0: loop detected
                          
                          Aug 8 12:03:30 	check_reload_status 	487 	Reloading filter
                          Aug 8 12:01:44 	kernel 		tun_wg0: loop detected
                          Aug 8 12:01:20 	ntpd 	1691 	Listen normally on 23 igb0 80.221.147.161:123
                          Aug 8 12:01:18 	ntpd 	1691 	Deleting interface #22 igb0, 80.221.147.161#123, interface stats: received=0, sent=0, dropped=0, active_time=1782 secs 
                          Aug 8 12:01:17 	kernel 		tun_wg0: loop detected
                          
                          J 1 Reply Last reply Reply Quote 0
                          • stephenw10S
                            stephenw10 Netgate Administrator
                            last edited by

                            Mmm, it's unclear where the loop is being created there but I'd have to guess it's because the system default route is switched back to the WAN when the WG gateway goes down and you have something else routing traffic back.
                            Do you have the default gateway set specifically to the WG gateway or to a group?

                            M 1 Reply Last reply Reply Quote 0
                            • J
                              Jarhead @Mixka
                              last edited by

                              @mixka Did you fix the WG config? That's what the video was for.
                              From what I can see, you need to remove the upstream gateway on the interface and correct the static route.

                              1 Reply Last reply Reply Quote 0
                              • M
                                Mixka @stephenw10
                                last edited by

                                @stephenw10

                                Specifically to WG gateway.

                                @Jarhead

                                I found out that video really helpful, but not on my case. I do not want to remove upstream gateway, because traffic out of tunnel needs to be NAT'd. Or does NAT rules still operate even upstream gateway is disabled? What does it help in that case? That video did not mention. And what comes to static route, I think there is no things to correct. In video WAN gateway was used and I need to use WG as my gateway.

                                1 Reply Last reply Reply Quote 0
                                • stephenw10S
                                  stephenw10 Netgate Administrator
                                  last edited by

                                  If you have a gateway on the interface automatic outbound NAT will be applied but you can still add manual rules to an interface that has no gateway.
                                  You need a gateway there to use as the default route or to add policy routing for but it doesn't have to be on the interface itself. However I wouldn't expect it to make any difference.

                                  Steve

                                  1 Reply Last reply Reply Quote 0
                                  • M
                                    Mixka
                                    last edited by Mixka

                                    It's been long days trying to solve this. I even deleted all firewall rules and redo only basic setup.

                                    I have for example tried these:

                                    • Disable NTP: No affect
                                    • Disable WG interface upstream gateway: Interface errors out statistics goes down by ~90%
                                    • Disable Negate rules: No affect
                                    • Disable gateways monitoring: After that it did not recover from loop
                                    • Started firewall rules from scrach with basic setup: Maybe loop occurances are lower, or not...
                                    • Put "supersede interface-mtu 0" to WAN interface DHCP option modifiers: Definately made loop occurance lower, or maybe just coincidence

                                    Now I did get 15 hours before problem occured after reboot.

                                    Should I try to play with WAN interface DHCP protocol timing or other options? Connection is optical fiber 100/100M.

                                    I also found this Reddit post https://reddit.com/r/PFSENSE/comments/ln14gu/arpresolve_cant_allocate_llinfo_ip_crippling_wan
                                    Should I try dumb switch before WAN interface and what difference it might doing? I do not think this is tagged traffic anyway.

                                    I just did speedtest with different services and my connection speed is low like 8/30M. I have run speedtest often and haven't got problems before. I think only thing what I have changed after last time is that DHCP option modifier.

                                    During speedtest:
                                    WG-problem8.png

                                    I have had traffic shaper, but disabling it won't help any. I do not know if this is related in anyway, but is this right way to reduce buffer bloating? It is (was) working really well for me. https://forum.netgate.com/topic/112527/playing-with-fq_codel-in-2-4/775

                                    And I have this spam for 1000 lines per minute:
                                    (51820 is WG peer endpoint port and 43279 is WG tunnel listening port)

                                    Aug 14 14:06:00 	WAN1 	Default deny rule IPv4 (1000000103) 	*WG_Endpoint_IP*:51820		*WAN_Interface_IP*:43279		UDP
                                    Aug 14 14:06:00 	WAN1 	Default deny rule IPv4 (1000000103) 	*WG_Endpoint_IP*:51820		*WAN_Interface_IP*:43279		UDP
                                    Aug 14 14:06:00 	WAN1 	Default deny rule IPv4 (1000000103) 	*WG_Endpoint_IP*:51820		*WAN_Interface_IP*:43279		UDP
                                    

                                    Should I allow this traffic or is there misconfiguration?

                                    I have now used way over 50 hours to trying to solve this. I am even thinking to clear install whole box. I do not know what to do. I have tried everything and problems just pile up.

                                    Loop occurance:

                                    Aug 14 10:18:34 	check_reload_status 	484 	Reloading filter
                                    Aug 14 10:18:34 	check_reload_status 	484 	Restarting OpenVPN tunnels/interfaces
                                    Aug 14 10:18:34 	check_reload_status 	484 	Restarting IPsec tunnels
                                    Aug 14 10:18:34 	check_reload_status 	484 	updating dyndns *WG_GW*
                                    Aug 14 10:18:34 	rc.gateway_alarm 	17746 	>>> Gateway alarm: *WG_GW* (Addr:8.8.8.8 Alarm:0 RTT:7.390ms RTTsd:36.380ms Loss:11%)
                                    Aug 14 10:18:25 	php 	14097 	[pfBlockerNG] filterlog daemon started
                                    Aug 14 10:18:25 	tail_pfb 	13938 	[pfBlockerNG] Firewall Filter Service started
                                    Aug 14 10:18:25 	lighttpd_pfb 	12699 	[pfBlockerNG] DNSBL Webserver started
                                    Aug 14 10:18:25 	php_pfb 	10645 	[pfBlockerNG] filterlog daemon stopped
                                    Aug 14 10:18:25 	lighttpd_pfb 	10375 	[pfBlockerNG] DNSBL Webserver stopped
                                    Aug 14 10:18:25 	tail_pfb 	10461 	[pfBlockerNG] Firewall Filter Service stopped
                                    Aug 14 10:18:24 	php-fpm 	30665 	/rc.start_packages: Configuration Change: (system): [pfSense-pkg-WireGuard] Applied package default settings as necessary.
                                    Aug 14 10:18:24 	php-fpm 	30665 	/rc.start_packages: Configuration Change: (system): [pfSense-pkg-WireGuard] Installed Unbound ACL group (WireGuard).
                                    Aug 14 10:18:24 	php-fpm 	30665 	/rc.start_packages: Configuration Change: (system): [pfSense-pkg-WireGuard] De-installed Unbound ACL group (WireGuard).
                                    Aug 14 10:18:24 	php-fpm 	30665 	/rc.start_packages: Configuration Change: (system): [pfSense-pkg-WireGuard] Installed interface group (WireGuard).
                                    Aug 14 10:18:24 	php-fpm 	30665 	/rc.start_packages: Configuration Change: (system): [pfSense-pkg-WireGuard] De-installed interface group (WireGuard).
                                    Aug 14 10:18:24 	check_reload_status 	484 	Syncing firewall
                                    Aug 14 10:18:24 	php-fpm 	30665 	/rc.start_packages: Configuration Change: (system): [pfSense-pkg-WireGuard] Installed earlyshellcmd(s).
                                    Aug 14 10:18:24 	check_reload_status 	484 	Syncing firewall
                                    Aug 14 10:18:24 	php-fpm 	30665 	/rc.start_packages: Configuration Change: (system): [pfSense-pkg-WireGuard] De-installed earlyshellcmd(s).
                                    Aug 14 10:18:24 	php 	83866 	haproxy: started new pid:87304
                                    Aug 14 10:18:24 	php 	83866 	haproxy: reload old pid:42610
                                    Aug 14 10:18:23 	php-fpm 	30665 	/rc.start_packages: Restarting/Starting all packages.
                                    Aug 14 10:18:22 	check_reload_status 	484 	Starting packages
                                    Aug 14 10:18:22 	php-fpm 	10714 	/rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 0.0.0.0 -> *WAN_Interface_IP* - Restarting packages.
                                    Aug 14 10:18:20 	php-fpm 	10714 	/rc.newwanip: Creating rrd update script
                                    Aug 14 10:18:20 	php-fpm 	10714 	/rc.newwanip: Resyncing OpenVPN instances for interface WAN1.
                                    Aug 14 10:18:15 	php-fpm 	10714 	/rc.newwanip: IP Address has changed, killing all states (ip_change_kill_states is set).
                                    Aug 14 10:18:15 	check_reload_status 	484 	Reloading filter
                                    Aug 14 10:18:15 	check_reload_status 	484 	Restarting OpenVPN tunnels/interfaces
                                    Aug 14 10:18:15 	check_reload_status 	484 	Restarting IPsec tunnels
                                    Aug 14 10:18:15 	check_reload_status 	484 	updating dyndns *WG_GW*
                                    Aug 14 10:18:15 	rc.gateway_alarm 	70659 	>>> Gateway alarm: *WG_GW* (Addr:8.8.8.8 Alarm:1 RTT:0.000ms RTTsd:0.000ms Loss:100%)
                                    Aug 14 10:18:13 	php-fpm 	10714 	/rc.newwanip: Removing static route for monitor 8.8.8.8 and adding a new route through *WG_GW_IP*
                                    Aug 14 10:18:13 	php-fpm 	10714 	/rc.newwanip: Removing static route for monitor 8.8.4.4 and adding a new route through *WAN_GW_IP*
                                    Aug 14 10:18:11 	php-fpm 	10714 	/rc.newwanip: rc.newwanip: on (IP address: *WAN_Interface_IP*) (interface: WAN1[wan]) (real interface: igb0).
                                    Aug 14 10:18:11 	php-fpm 	10714 	/rc.newwanip: rc.newwanip: Info: starting on igb0.
                                    Aug 14 10:18:10 	check_reload_status 	484 	rc.newwanip starting igb0
                                    Aug 14 10:18:10 	kernel 		tun_wg0: loop detected
                                    Aug 14 10:18:10 	kernel 		tun_wg0: loop detected
                                    Aug 14 10:18:10 	kernel 		tun_wg0: loop detected
                                    Aug 14 10:18:10 	kernel 		arpresolve: can't allocate llinfo for *WAN_GW_IP* on igb0
                                    Aug 14 10:18:09 	kernel 		tun_wg0: loop detected
                                    Aug 14 10:18:09 	kernel 		tun_wg0: loop detected
                                    Aug 14 10:18:09 	kernel 		arpresolve: can't allocate llinfo for *WAN_GW_IP* on igb0
                                    Aug 14 10:18:09 	kernel 		tun_wg0: loop detected
                                    Aug 14 10:18:09 	kernel 		tun_wg0: loop detected
                                    Aug 14 10:18:09 	kernel 		tun_wg0: loop detected
                                    Aug 14 10:18:09 	kernel 		tun_wg0: loop detected
                                    Aug 14 10:18:09 	kernel 		arpresolve: can't allocate llinfo for *WAN_GW_IP* on igb0
                                    Aug 14 10:18:08 	kernel 		tun_wg0: loop detected
                                    Aug 14 10:18:08 	kernel 		arpresolve: can't allocate llinfo for *WAN_GW_IP* on igb0
                                    
                                    Aug 14 10:18:34 	dpinger 	65033 	*WG_GW* 8.8.8.8: Clear latency 7390us stddev 36380us loss 11%
                                    Aug 14 10:18:15 	dpinger 	65033 	*WG_GW* 8.8.8.8: Alarm latency 0us stddev 0us loss 100%
                                    Aug 14 10:18:13 	dpinger 	65033 	send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 8.8.8.8 bind_addr *WG_Interface_IP* identifier "*WG_GW*"
                                    Aug 14 10:18:13 	dpinger 	64735 	send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr 8.8.4.4 bind_addr *WAN_Interface_IP* identifier "WAN_DHCP"
                                    
                                    stephenw10S 1 Reply Last reply Reply Quote 0
                                    • stephenw10S
                                      stephenw10 Netgate Administrator @Mixka
                                      last edited by

                                      @mixka said in "tun_wg0: loop detected" following WireGuard gateway 100% loss and WAN interface rc.newwanip:

                                      Should I try dumb switch before WAN interface and what difference it might doing?

                                      That's unlikely to help here because, AFAIK, you are not seeing igb0 actually lose link which is what that would prevent.
                                      It would still hit whatever is causing it to see a new WAN IP triggering the reload of the interface and everything on it.

                                      Reviewing the thread though, this seems odd:
                                      @mixka said in "tun_wg0: loop detected" following WireGuard gateway 100% loss and WAN interface rc.newwanip:

                                      I pin down problem that they all was behind same dummy switch. I restarted that switch and everything worked like 2 minutes and then goes down. I removed that switch and everything back to normal.

                                      Could it be that switch has some loop detection and was shutting down one or more ports to prevent it?
                                      If so that implies whatever is creating the loop ends up sending traffic to the LAN side incorrectly.

                                      Try checking the routing tables when it's working and compare that to the failed state.

                                      Steve

                                      M 3 Replies Last reply Reply Quote 0
                                      • M
                                        Mixka @stephenw10
                                        last edited by Mixka

                                        This post is deleted!
                                        1 Reply Last reply Reply Quote 0
                                        • M
                                          Mixka
                                          last edited by

                                          This post is deleted!
                                          1 Reply Last reply Reply Quote 0
                                          • M
                                            Mixka @stephenw10
                                            last edited by

                                            This post is deleted!
                                            1 Reply Last reply Reply Quote 0
                                            • First post
                                              Last post
                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.