Netgate Discussion Forum

    if_pppoe with frequent connection losses due to ISP connection making firewall unstable

    Development
    27 Posts 3 Posters 754 Views
    • stephenw10 Netgate Administrator

      Hmm, nothing there is obviously a problem on its own, but there's a lot going on. So much is happening that it's still restarting services minutes after the issue has cleared.

      I think I would at least try disabling the gateway monitoring action on anything that doesn't need it. Most of those tunnels and VPNs likely don't need it, for example. Currently each of them is triggering a restart of all services.

      • Laxarus @stephenw10

        @stephenw10 🤔
        So basically, the frequent connection losses and restores create a race condition while services are being restarted?

        • stephenw10 Netgate Administrator

          Potentially.

          Try doing one manual reconnect and see what is logged until it's stable again. How long does it spend reloading everything?

          • Laxarus @stephenw10

            @stephenw10 I cannot reliably do this test right now. It is losing the connection almost every 20 seconds. Was the new version tested for such extreme cases?

            • stephenw10 Netgate Administrator

              It should be fine. Or at least no worse than mpd5/netgraph. Bouncing the WAN every 20s is going to be pretty disruptive even on a default install. With all the services and additional sub-interfaces you have, it's going to be a lot for the firewall to do. If it takes longer than 20s to reload all the tunnels and services, then it could be in a continuous churn with high load.
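
To make the churn argument concrete, here is a minimal sketch of the arithmetic, not a model of what pfSense actually does internally. It assumes every WAN flap triggers one full reload that restarts from scratch; the function names and the example numbers are hypothetical.

```python
def settles(flap_interval_s: float, reload_time_s: float) -> bool:
    """Return True if a full reload can finish before the next WAN flap."""
    return reload_time_s < flap_interval_s


def busy_fraction(flap_interval_s: float, reload_time_s: float) -> float:
    """Fraction of wall-clock time spent reloading, capped at 1.0.

    Assumption: each flap restarts the reload from the beginning, so a
    reload that takes longer than the flap interval never completes.
    """
    return min(1.0, reload_time_s / flap_interval_s)


if __name__ == "__main__":
    # Hypothetical numbers: a flap every 20 s, reloads of 5 s and 30 s.
    for reload_time in (5.0, 30.0):
        print(reload_time, settles(20.0, reload_time), busy_fraction(20.0, reload_time))
```

With a 20 s flap interval, a 5 s reload leaves the box idle most of the time, while a 30 s reload never finishes before the next flap arrives, which is the continuous churn described above.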

              • Laxarus @stephenw10

                @stephenw10 I have disabled most of the services and some gateway monitors but this still did not help.

                It appears that, after getting this screen, a simple CTRL+C brings the firewall back online, which indicates the system is getting stuck on some process.

                [Attached screenshot: ekran-görüntüsü-2025-06-18-205629.png]

                Unless I catch this immediately, I don't think it is possible to trace the logs.

                Is there a way to increase the log size or log rotation so that the next time this happens we can trace it? At this point, I am pissed at my ISP :D.

                • Laxarus

                  @stephenw10
                  And there are some interesting new log entries from unbound:

                  Jul 15 14:23:57 FIREWALL php-fpm[92187]: /rc.newwanip: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1752578637] unbound[61748:0] error: bind: address already in use [1752578637] unbound[61748:0] fatal error: could not open ports'
                  

                  2.zip

                  • stephenw10 Netgate Administrator

                    How long do your logs last? They should just rotate and be stored unless you're running RAM disks. You can only display up to 2000 lines in the webgui, but if you look in /var/log directly there should be far more.
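
As a rough illustration of looking in /var/log directly, the sketch below searches the live log and its rotated copies for a string. It is only an assumption that rotated files sit next to the live one with names like system.log.0 and possibly a .bz2, .gz or .xz suffix; the helper name `search_logs` is made up for this example.

```python
import bz2
import gzip
import lzma
from pathlib import Path

# Map a compressed-file suffix to the matching opener; plain files use open().
OPENERS = {".bz2": bz2.open, ".gz": gzip.open, ".xz": lzma.open}


def search_logs(log_dir, stem, needle):
    """Return (file name, line) pairs for every line containing `needle`.

    Scans every file in `log_dir` whose name starts with `stem`,
    for example system.log, system.log.0, system.log.1.bz2, ...
    """
    matches = []
    for path in sorted(Path(log_dir).glob(stem + "*")):
        opener = OPENERS.get(path.suffix, open)
        with opener(path, "rt", encoding="utf-8", errors="replace") as fh:
            for line in fh:
                if needle in line:
                    matches.append((path.name, line.rstrip("\n")))
    return matches


if __name__ == "__main__":
    # Example: every rc.newwanip entry still on disk.
    for name, line in search_logs("/var/log", "system.log", "rc.newwanip"):
        print(name, line)
```

Copying /var/log off the box first and running this elsewhere works just as well, since it only reads files.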

                    If you hit that at the console again, try pressing Ctrl+T before Ctrl+C. That should show which process is running and/or stuck.

                    • stephenw10 Netgate Administrator

                      That unbound log is quite common when it restarts. It can try to start the new process before the previous one has stopped. It shouldn't cause a problem.
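
The "address already in use" error itself is easy to reproduce outside unbound. The following is a small sketch using plain sockets on loopback, not anything unbound-specific: a second socket tries to bind the same UDP address and port while the first one still holds it, which is the same situation as a new process starting before the old one has exited. The function name is hypothetical.

```python
import errno
import socket


def bind_twice_errno():
    """Bind two UDP sockets to the same loopback port; return the errno
    raised by the second bind, or None if it unexpectedly succeeds."""
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        first.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
        port = first.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))  # still held by `first`
        except OSError as exc:
            return exc.errno
        return None
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    print(bind_twice_errno() == errno.EADDRINUSE)
```

Once the first socket is closed, the same bind succeeds, which matches the point that the message is usually harmless if the old process goes away shortly afterwards.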

                      • Laxarus @stephenw10

                        @stephenw10
                        Unfortunately, it is not enough. Within a couple of hours it fills the logs all the way up to system.log.6.

                        See my first post about it.

                        • stephenw10 Netgate Administrator

                          You can set the size it rotates at and the number of files to retain in the log settings at Status > Logs > Settings. As long as you have the space you should be able to increase it.
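
For sizing those settings, a back-of-the-envelope sketch may help. The numbers below are assumptions for illustration only: seven files of 512000 bytes each filling in about two hours, which you should replace with the values from your own log settings and observed fill rate. The function name is hypothetical.

```python
def files_needed(bytes_per_hour: int, hours_to_keep: int, rotate_size_bytes: int) -> int:
    """Number of rotated files required to retain `hours_to_keep` hours of
    logs at the observed write rate (integer ceiling division)."""
    total_bytes = bytes_per_hour * hours_to_keep
    return -(-total_bytes // rotate_size_bytes)


if __name__ == "__main__":
    # Assumed observation: 7 files x 512000 bytes filled in roughly 2 hours.
    rate = 7 * 512_000 // 2            # bytes written per hour
    print(files_needed(rate, 48, 512_000))      # files to cover 48 h at the same size
    print(files_needed(rate, 48, 10_240_000))   # fewer files if each is ~10 MB
```

Raising the rotation size is usually the simpler knob than keeping a very large number of small files, provided the disk has the space.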

                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.