Netgate Discussion Forum

    Major issue with QUAGGA-OSPF and VLANs (pfsense 2.3.0)

    Routing and Multi WAN
    81 Posts 23 Posters 35.9k Views
      reqlez

      yes but … who is going to be developing the fork lol

      @Spydre13:

      @reqlez:

      @Spydre13

      Let's get like a fund raiser going, and collect like $10,000 and offer 10K to whoever can fix OSPF bug and integrate quagga VTY support into pfsense lol  ( oh and integrate TCP/DNS instead of just ping support for gateway monitoring because every ISP now drops ICMP on high usage and gateway monitoring sucks without DNS / TCP ports support )  …. i'm willing to pitch in $1000 ... if all 3 conditions are met lol  who else wants to donate here for a good cause ???

      First you need to find someone willing to fix the problem, otherwise the money doesn't help.  I've already pointed out where the bug is (fairly confident anyways), and could fix it just by reverting the change they made.  However, there's no guarantee that they will accept the fix.  I can't get a response on why that change was made, or what the intention was.  If they're not going to be responsive it seems like pfSense should either revert to the older version or use a fork that corrects this issue.

        reqlez

        Hi guys… so ... maybe we should try changing the script and remove -9 like Martin suggested, I think he might not be too keen to respond until that is tried since he specifically asked to try that.  Is it possible that while that piece of code was removed, another one was added to do the same function for cleanup of routes or similar ?

          heper

          @reqlez

          you can find/remove the -9 in

          
          /usr/local/pkg/quagga_ospfd.inc, lines 306-325
          
          

          after clicking 'save' in the webgui the rc-file will be updated

          
          /usr/local/etc/rc.d/quagga.sh
          
          

          i don't have a test environment but i've done this on my home box. adjusting above should be fairly safe in a non-production environment.

          i also fail to see how this will solve the issue; but it might be a hackish workaround (as jimp already mentioned)

            Spydre13

            @reqlez:

            Hi guys… so ... maybe we should try changing the script and remove -9 like Martin suggested, I think he might not be too keen to respond until that is tried since he specifically asked to try that.  Is it possible that while that piece of code was removed, another one was added to do the same function for cleanup of routes or similar ?

            I would say go ahead and try this and see what happens.  However, even if it cleans up the routes without using -9, that won't be ideal for two reasons:

            1. Do you really want it to remove all OSPF routes from your firewall for a few seconds, maybe even longer?  It takes some time for OSPF to start back up, establish neighbors, etc.
            2. What if Quagga (either zebra or ospfd) crashes at some point?  You would need to restart your firewall; just starting Quagga won't work because it didn't shut down cleanly and remove the OSPF routes.
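
The difference between the two kill signals can be sketched with a stand-in process. This is only an illustration, not Quagga's code: the child script below is a hypothetical daemon whose SIGTERM handler writes a marker file in place of withdrawing routes. SIGKILL (`-9`) cannot be caught, so the handler never runs, which is also what an unclean crash looks like from the kernel's side.

```python
import os
import signal
import subprocess
import sys
import tempfile

# Hypothetical stand-in for a routing daemon (not Quagga itself).
CHILD = r"""
import signal, sys, time
marker = sys.argv[1]

def on_term(signum, frame):
    # Stand-in for "withdraw my routes from the kernel" on a clean shutdown.
    with open(marker, "w") as f:
        f.write("routes withdrawn")
    sys.exit(0)

signal.signal(signal.SIGTERM, on_term)
print("ready", flush=True)
while True:
    time.sleep(0.1)
"""


def stop_daemon(sig):
    """Start the stand-in daemon, stop it with `sig`, and report whether cleanup ran."""
    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "cleanup.marker")
        proc = subprocess.Popen(
            [sys.executable, "-c", CHILD, marker],
            stdout=subprocess.PIPE,
            text=True,
        )
        proc.stdout.readline()  # wait until the SIGTERM handler is installed
        proc.send_signal(sig)
        proc.wait(timeout=10)
        proc.stdout.close()
        return os.path.exists(marker)


if __name__ == "__main__":
    print("SIGTERM ran cleanup:", stop_daemon(signal.SIGTERM))
    print("SIGKILL ran cleanup:", stop_daemon(signal.SIGKILL))
```

With a plain kill the cleanup runs and the routes disappear until the daemon relearns them; with `-9` nothing is cleaned up and the old entries stay in the kernel table.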

            Quagga really should be able to detect routes that it put into the kernel.  Before v1.0 it did this, and it still actually does detect the routes it put there, it just doesn't remove them from the zebra RIB like it used to.

              Tex

              Hello there,

              I'm wondering if anything has changed in the last month concerning the Quagga OSPF/kernel route problem. It seems I'm still stuck with the kernel routes being written and the OSPF routes not being used …

              Thx

                winmasta

                Still not solved?

                  reqlez

                  @winmasta:

                  Still not solved?

                  I don't know man … I almost want to switch to watchguard for my multi site OSPF deployments now.

                  Knowing that support for a major component like a routing package is non-existent (not pfSense's fault; it seems quagga doesn't want to acknowledge the problem) is worrisome.  I don't even have any more hours to invest in troubleshooting this as I have to catch up with projects.

                    Andres11

                    I'd like to confirm that removing the -9 has resolved the OSPF learned routes getting stuck as Kernel routes. I have been attempting a seamless voice failover setup using 2 openvpn tunnels and was running OSPF on those interfaces. This had been the only issue preventing this from working. After several tests in my lab everything appears to be working without issue.

                      heper

                      @reqlez could you report to your contacts at quagga that removing -9 works around the issue?

                      We need a permanent fix, that also works with -9

                        doktornotor Banned

                        Did https://github.com/pfsense/FreeBSD-ports/pull/265 to hack around this stupidity, since this has been going on for way too long… Obviously not a real solution, as noted here and here.

                          Spydre13

                          @doktornotor:

                          Did https://github.com/pfsense/FreeBSD-ports/pull/265 to hack around this stupidity, since this has been going on for way too long… Obviously not a real solution, as noted here and here.

                          So just to clarify, if you kill quagga without -9 it will remove the routes from the kernel until it starts back up and re-learns the routes, correct?  So it basically creates a brief outage, which is not great either.

                          It would be nice to hear from someone at pfSense about what our options are to get a long-term solution.  From my understanding there are two options:

                          1. Prevent quagga from restarting, by using VTY for configuration changes instead of generating new configuration files and restarting.
                          2. Add the code back to quagga (zebra) that was removed that filters out the kernel routes put there by itself.

                          I think #2 would be easiest, but I'm not sure if the quagga community will be open to that, as I can't find out why the code was removed/commented out to begin with.

                            doktornotor Banned

                            @Spydre13:

                            @doktornotor:

                            Did https://github.com/pfsense/FreeBSD-ports/pull/265 to hack around this stupidity, since this has been going on for way too long… Obviously not a real solution, as noted here and here.

                            So just to clarify, if you kill quagga without -9 it will remove the routes from the kernel until it starts back up and re-learns the routes, correct?  So it basically creates a brief outage, which is not great either.

                            I'd figure that dealing with a ~1-2 second outage would be a whole lot better than having bogus "kernel" routes picked up by zebra and breaking routing. You are of course welcome to provide a better solution. So far, for ~1 year, no one has provided any better ideas for the upstream regression.

                            Also, this thread is not about "pfSense should not restart routing packages". I'd guess that the summary provided by jimp is pretty accurate:

                            Preventing it from restarting is a hackish workaround no matter what signal is used. It will get restarted at some point and failing to recover gracefully is a regression in quagga's behavior in 1.x.

                            Restarting the package is required at minimum on upgrades, not avoidable.

                              jimp Rebel Alliance Developer Netgate

                              @Spydre13:

                              2. Add the code back to quagga (zebra) that was removed that filters out the kernel routes put there by itself.

                              I think #2 would be easiest, but I'm not sure if the quagga community will be open to that, as I can't find out why the code was removed/commented out to begin with.

                              That is what needs to happen. Quagga needs to recognize its own routes by the flags in the routing table. There's no reason they should have removed that code that I can see.
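
A minimal sketch of that idea, assuming the daemon tags the routes it installs with the `RTF_PROTO1` flag (the flag named elsewhere in this thread). The `KernelRoute` type and `import_kernel_routes` function are hypothetical names for illustration and are not zebra's actual code; the flag values are the ones I recall from FreeBSD's `net/route.h`, so treat them as assumptions too.

```python
from dataclasses import dataclass

# Flag values as recalled from FreeBSD's net/route.h (assumption).
RTF_UP = 0x1
RTF_GATEWAY = 0x2
RTF_PROTO1 = 0x8000  # "protocol specific routing flag #1"


@dataclass(frozen=True)
class KernelRoute:
    prefix: str
    gateway: str
    flags: int


def import_kernel_routes(table, skip_self_installed=True):
    """Return the routes a freshly started daemon should treat as foreign 'kernel' routes.

    With skip_self_installed=True, leftovers tagged RTF_PROTO1 from a previous
    run are ignored instead of being imported as kernel routes.
    """
    return [
        r for r in table
        if not (skip_self_installed and r.flags & RTF_PROTO1)
    ]


# Illustrative table: one static route and one stale OSPF route from an earlier run.
table = [
    KernelRoute("0.0.0.0/0", "192.0.2.1", RTF_UP | RTF_GATEWAY),
    KernelRoute("10.10.0.0/24", "10.99.0.2", RTF_UP | RTF_GATEWAY | RTF_PROTO1),
]

print([r.prefix for r in import_kernel_routes(table)])
print([r.prefix for r in import_kernel_routes(table, skip_self_installed=False)])
```

Without the filter, the stale entry comes back as a "kernel" route, which is the behavior reported in this thread for Quagga 1.x.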

                              Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

                              Need help fast? Netgate Global Support!

                              Do not Chat/PM for help!

                                Spydre13

                                @doktornotor:

                                I'd figure that dealing with a ~1-2 second outage would be a whole lot better than having bogus "kernel" routes picked up by zebra and breaking routing. You are of course welcome to provide a better solution. So far, for ~1 year, no one has provided any better ideas for the upstream regression.

                                I'm not knocking your efforts, I think in most cases (including mine) your pull request would be better than the current situation.  However, I'm not sure it would be better in all cases.

                                @doktornotor:

                                Also, this thread is not about "pfSense should not restart routing packages". I'd guess that the summary provided by jimp is pretty accurate:

                                Preventing it from restarting is a hackish workaround no matter what signal is used. It will get restarted at some point and failing to recover gracefully is a regression in quagga's behavior in 1.x.

                                Restarting the package is required at minimum on upgrades, not avoidable.

                                I agree, however I'm pretty sure the quagga community disagrees, at least with the first sentence.  According to the quagga community the proper way to handle configuration changes (from what I can tell) is to use the VTY or VTYSH to make changes like you would with a router, not by re-writing the configuration files, killing with -9, and restarting the daemons.  For reference I was discussing this on their list here: https://lists.quagga.net/pipermail/quagga-users/2016-November/014557.html, and then here (one guy was replying off-list and I tried to add it back to the list, but nobody else chimed in): https://lists.quagga.net/pipermail/quagga-users/2016-November/014571.html.  If you're doing an upgrade you can kill it without -9 and it will recover fine, and in that case an outage of a few seconds isn't a big deal.
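
For illustration, a runtime change through vtysh might look roughly like the sketch below. It only builds the command line and does not run it. `vtysh -c` is the standard way to pass commands non-interactively; the OSPF network and area are made-up values, `vtysh_argv` is a hypothetical helper, and whether the pfSense package ships a usable vtysh is a separate question raised earlier in the thread.

```python
import shlex


def vtysh_argv(commands):
    """Build a vtysh invocation that applies `commands` to the running daemons."""
    argv = ["vtysh"]
    for command in commands:
        argv += ["-c", command]
    return argv


# Hypothetical change: announce one more network in area 0 without restarting ospfd.
commands = [
    "configure terminal",
    "router ospf",
    "network 10.0.0.0/24 area 0.0.0.0",
]
argv = vtysh_argv(commands)
print(shlex.join(argv))
```

The point of this approach is that the daemons keep running, so neighbors stay up and no routes are withdrawn or left behind.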

                                @jimp:

                                @Spydre13:

                                2. Add the code back to quagga (zebra) that was removed that filters out the kernel routes put there by itself.

                                I think #2 would be easiest, but I'm not sure if the quagga community will be open to that, as I can't find out why the code was removed/commented out to begin with.

                                That is what needs to happen. Quagga needs to recognize its own routes by the flags in the routing table. There's no reason they should have removed that code that I can see.

                                I can try to discuss with them again (there has been turmoil on the quagga lists lately), and even submit a pull request reverting the changes.  If they refuse to allow that code back in, what is the plan going forward for OSPF support in pfSense?

                                  jimp Rebel Alliance Developer Netgate

                                  If they won't fix it I'm not sure what the best path is. Maybe adding a port for the old version, or adding that code back in as a patch on the port.

                                  If FreeBSD's route command would let us flush based on -proto1/RTF_PROTO1 then we could clear out its old routes before restarting, but that also seems harsh.
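
Since `route` has no such flush option, one rough way to approximate it would be to pick the candidates out of `netstat -rn` output, where `RTF_PROTO1` shows up as a `1` in the Flags column. The sketch below only selects the routes and deletes nothing; the sample output is made up for illustration and `proto1_routes` is a hypothetical helper.

```python
# Made-up sample in the shape of FreeBSD `netstat -rn` output.
SAMPLE = """\
Destination        Gateway            Flags     Netif Expire
default            192.0.2.1          UGS       em0
10.10.0.0/24       10.99.0.2          UG1       ovpnc1
10.20.0.0/24       10.99.0.6          UG1       ovpnc2
127.0.0.1          link#3             UH        lo0
"""


def proto1_routes(netstat_output):
    """Return (destination, gateway) pairs whose Flags column contains '1'."""
    routes = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip blank lines, section titles and the column header.
        if len(fields) < 3 or fields[0] == "Destination":
            continue
        destination, gateway, flags = fields[:3]
        if "1" in flags:
            routes.append((destination, gateway))
    return routes


for destination, gateway in proto1_routes(SAMPLE):
    print(destination, "via", gateway)
```

Each selected pair could then be removed with `route delete` before the daemon starts, with the same brief outage that a clean shutdown would cause.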


                                    Jackish

                                    Hi,

                                    Does anyone know if this issue has been fixed? I just noticed that Quagga 0.6.17 is available in the package manager. I will try it myself, obviously, but I'm wondering if anyone can confirm.

                                    Thanks.

                                    EDIT: Can confirm that 0.6.17 solves the issue.

                                      doktornotor Banned

                                      The killall -9 is gone, yes. https://github.com/pfsense/FreeBSD-ports/pull/265

                                        Soyokaze

                                        Can anyone else confirm?


                                          Slicster

                                          Hi All,
                                          I'm having the same issue but when I tried to revert using the following command:

                                          pkg add -f http://pkg.freebsd.org/freebsd:10:x86:64/release_3/All/quagga-0.99.24.1_2.txz
                                          

                                          After that, the OSPF and zebra services no longer started.

                                          If I ran the following command via SSH, I received this error:

                                          Exec format error
                                          

                                          Anyone have an idea of what I may be doing wrong or perhaps a configuration incompatibility that I must remove?  I tried uninstalling the packages, rebooting then reinstalling but didn't help.  I tried removing all the interfaces from the configuration but services still didn't start.

                                          This is a MAJOR issue for us because we rely on OSPF for redundancy, at the moment, without it working, if a link goes down, we have to manually reboot the pfSense units so that the new routes are written.

                                          I've attached my ospfd.conf and zebra.conf files with some of the IPs and passwords changed.

                                          ospfd.conf.txt
                                          zebra.conf.txt

                                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.