Netgate Discussion Forum

    Major issue with QUAGGA-OSPF and VLANs (pfsense 2.3.0)

    Routing and Multi WAN
    81 Posts 23 Posters 34.1k Views
      heper

      I started looking at this more closely.

      I'm facing the same (or a similar) issue on a multilink OpenVPN site-to-site setup (192.168.99.1 and 192.168.88.1).
      While both VPNs are online:

      
      O   10.0.0.0/24 [110/120] via 192.168.99.1, ovpnc1, 00:07:54
      K>* 10.0.0.0/24 via 192.168.99.1, ovpnc1
      O   10.10.10.0/24 [110/110] via 192.168.99.1, ovpnc1, 00:07:54
      K>* 10.10.10.0/24 via 192.168.99.1, ovpnc1
      O   10.10.44.0/24 [110/110] via 192.168.99.1, ovpnc1, 00:07:54
      K>* 10.10.44.0/24 via 192.168.99.1, ovpnc1
      O   10.10.100.0/24 [110/110] via 192.168.99.1, ovpnc1, 00:07:54
      K>* 10.10.100.0/24 via 192.168.99.1, ovpnc1
      O   10.20.10.0/24 [110/10] is directly connected, em2_vlan10, 00:12:26
      C>* 10.20.10.0/24 is directly connected, em2_vlan10
      O   10.20.100.0/24 [110/10] is directly connected, em2, 00:12:27
      C>* 10.20.100.0/24 is directly connected, em2
      O   10.30.10.0/24 [110/1010] via 192.168.223.2, ovpns3, 00:12:21
      K>* 10.30.10.0/24 via 192.168.223.2, ovpns3
      C>* 127.0.0.0/8 is directly connected, lo0
      
      

      While one VPN is down:

      
      O   10.0.0.0/24 [110/520] via 192.168.88.1, ovpnc4, 00:00:05
      K>* 10.0.0.0/24 via 192.168.99.1, ovpnc1
      O   10.10.10.0/24 [110/510] via 192.168.88.1, ovpnc4, 00:00:05
      K>* 10.10.10.0/24 via 192.168.99.1, ovpnc1
      O   10.10.44.0/24 [110/510] via 192.168.88.1, ovpnc4, 00:00:05
      K>* 10.10.44.0/24 via 192.168.99.1, ovpnc1
      O   10.10.100.0/24 [110/510] via 192.168.88.1, ovpnc4, 00:00:05
      K>* 10.10.100.0/24 via 192.168.99.1, ovpnc1
      O   10.20.10.0/24 [110/10] is directly connected, em2_vlan10, 00:29:30
      C>* 10.20.10.0/24 is directly connected, em2_vlan10
      O   10.20.100.0/24 [110/10] is directly connected, em2, 00:29:31
      C>* 10.20.100.0/24 is directly connected, em2
      O   10.30.10.0/24 [110/1010] via 192.168.223.2, ovpns3, 00:29:25
      K>* 10.30.10.0/24 via 192.168.223.2, ovpns3
      C>* 127.0.0.0/8 is directly connected, lo0
      
      

      Quagga is showing and USING selected kernel routes even though no static routes are set for the subnets 10.0.0.0/24, 10.10.10.0/24 and 10.10.44.0/24.
      When I take down the link, Quagga changes its OSPF route correctly, but the "old" kernel route stays in place and remains selected. This causes routing to fail.
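
A minimal sketch of how the symptom described above could be spotted automatically, assuming the route listing has been saved as plain text. The function name `find_stale_kernel_routes`, the regex, and the `SAMPLE` variable are illustrative and not part of pfSense or Quagga; the parser only handles "via" lines in the format shown in this post.

```python
import re

# Matches lines such as:
#   O   10.0.0.0/24 [110/520] via 192.168.88.1, ovpnc4, 00:00:05
#   K>* 10.0.0.0/24 via 192.168.99.1, ovpnc1
# "is directly connected" lines have no "via" and are skipped on purpose.
ROUTE_RE = re.compile(
    r"^\s*(?P<code>[A-Z])(?P<flags>[>* ]*)\s*"
    r"(?P<prefix>\d+\.\d+\.\d+\.\d+/\d+)"
    r"(?:\s+\[\d+/\d+\])?\s+via\s+(?P<nexthop>[\d.]+),\s*(?P<ifname>\w+)"
)


def find_stale_kernel_routes(route_text):
    """Return {prefix: (kernel_hop, ospf_hop)} where the selected K route
    points somewhere other than the OSPF route for the same prefix."""
    ospf = {}    # prefix -> (nexthop, ifname) from O lines
    kernel = {}  # prefix -> (nexthop, ifname) from selected K lines
    for line in route_text.splitlines():
        m = ROUTE_RE.match(line)
        if not m:
            continue
        hop = (m.group("nexthop"), m.group("ifname"))
        if m.group("code") == "O":
            ospf[m.group("prefix")] = hop
        elif m.group("code") == "K" and ">" in m.group("flags"):
            kernel[m.group("prefix")] = hop
    return {
        prefix: (kernel[prefix], ospf[prefix])
        for prefix in kernel
        if prefix in ospf and kernel[prefix] != ospf[prefix]
    }


# A few lines copied from the "one VPN is down" listing above.
SAMPLE = """\
O   10.0.0.0/24 [110/520] via 192.168.88.1, ovpnc4, 00:00:05
K>* 10.0.0.0/24 via 192.168.99.1, ovpnc1
O   10.20.10.0/24 [110/10] is directly connected, em2_vlan10, 00:29:30
C>* 10.20.10.0/24 is directly connected, em2_vlan10
O   10.30.10.0/24 [110/1010] via 192.168.223.2, ovpns3, 00:29:25
K>* 10.30.10.0/24 via 192.168.223.2, ovpns3
"""

for prefix, (k_hop, o_hop) in find_stale_kernel_routes(SAMPLE).items():
    print(prefix, "kernel:", k_hop, "ospf:", o_hop)
```

Routes where the K and O lines agree (such as 10.30.10.0/24 here) are not reported, so only the prefixes that would actually misroute show up.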

      Not sure if this is a Quagga issue or a FreeBSD issue.
      It might be related to:
      https://forum.pfsense.org/index.php?topic=110245.0

      Hopefully @jimp will pick up this post; AFAIK he's one of the few people who might know the root cause of this.
      In the meantime I created a bug report here: https://redmine.pfsense.org/issues/6305

      As requested, adding config files:
      ospfd_client_side

      
      # This file was created by the pfSense package manager.  Do not edit!
      
      password ******
      interface ovpnc4
        ip ospf cost 500
      interface ovpnc1
        ip ospf cost 100
      interface ovpns3
        ip ospf cost 1000
      
      router ospf
        ospf router-id 10.20.10.1
        network 192.168.88.0/30 area 0.0.0.1
        network 192.168.99.0/30 area 0.0.0.1
        network 192.168.223.0/30 area 0.0.0.1
        network 192.168.77.0/26 area 0.0.0.1
        network 192.168.99.2/32 area 0.0.0.1
        network 192.168.223.1/32 area 0.0.0.1
        network 192.168.88.2/32 area 0.0.0.1
        network 192.168.100.1/32 area 0.0.0.1
        network 192.168.226.1/28 area 0.0.0.1
        network 10.20.10.0/24 area 0.0.0.1
        network 192.168.2.0/24 area 0.0.0.1
        network 10.20.100.0/24 area 0.0.0.1
        network 172.20.20.0/24 area 0.0.0.1
        network 192.168.66.0/24 area 0.0.0.1
      
      

      zebra_client_side

      
      # This file was created by the pfSense package manager.  Do not edit!
      
      password ******
      ip prefix-list ACCEPTFILTER deny 192.168.77.0/26
      ip prefix-list ACCEPTFILTER deny 192.168.99.2/32
      ip prefix-list ACCEPTFILTER deny 192.168.223.1/32
      ip prefix-list ACCEPTFILTER deny 192.168.88.2/32
      ip prefix-list ACCEPTFILTER deny 192.168.100.1/32
      ip prefix-list ACCEPTFILTER deny 192.168.226.1/28
      ip prefix-list ACCEPTFILTER permit any
      route-map ACCEPTFILTER permit 10
      match ip address prefix-list ACCEPTFILTER
      ip protocol ospf route-map ACCEPTFILTER
      
      

      ospf_server_side

      
      # This file was created by the pfSense package manager.  Do not edit!
      
      password ********
      interface ovpns7
        ip ospf cost 500
      interface ovpns2
        ip ospf cost 1000
      interface ovpns5
      interface ovpns1
        ip ospf cost 1000
        ip ospf authentication-key *******
      
      router ospf
        ospf router-id 10.10.10.1
        network 192.168.88.0/30 area 0.0.0.1
        network 192.168.99.0/30 area 0.0.0.1
        network 192.168.222.0/30 area 0.0.0.1
        network 192.168.224.0/30 area 0.0.0.1
        area 0.0.0.0 authentication
        network 192.168.88.1/32 area 0.0.0.1
        network 192.168.99.1/32 area 0.0.0.1
        network 192.168.222.1/32 area 0.0.0.1
        network 192.168.224.1/32 area 0.0.0.1
        network 192.168.100.2/32 area 0.0.0.1
        network 10.10.10.0/24 area 0.0.0.1
        network 10.10.100.0/24 area 0.0.0.1
        network 192.168.77.0/24 area 0.0.0.1
        network 192.168.1.0/24 area 0.0.0.1
        network 10.10.44.0/24 area 0.0.0.1
      
      

      zebra_server_side

      
      # This file was created by the pfSense package manager.  Do not edit!
      
      password ******
      ip prefix-list ACCEPTFILTER deny 192.168.88.1/32
      ip prefix-list ACCEPTFILTER deny 192.168.99.1/32
      ip prefix-list ACCEPTFILTER deny 192.168.222.1/32
      ip prefix-list ACCEPTFILTER deny 192.168.224.1/32
      ip prefix-list ACCEPTFILTER deny 192.168.100.2/32
      ip prefix-list ACCEPTFILTER permit any
      route-map ACCEPTFILTER permit 10
      match ip address prefix-list ACCEPTFILTER
      ip protocol ospf route-map ACCEPTFILTER
      
      
        shaoranrch

        Hi,

        I see. At least it's not an isolated issue. Hopefully they'll check this and give us a solution.

        Thanks.

          kennylam

          The same problem applies to my pair of pfSense 2.3 boxes too. The path cost was calculated properly, but the kernel route simply took the highest priority.

          Take my case as example:

          O  192.168.101.0/24 [110/15] via 172.16.53.254, em1_vlan999, 00:00:12
          K>* 192.168.101.0/24 via 192.168.168.1, em5

          Although em1_vlan999 is a direct link with a lower cost (5) and em5 leads to a remote site with a higher cost (200), em5 was still selected. The cost settings on all sites are equal.
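
One plausible reading of this output, sketched below as a simplified model rather than Quagga's actual code: candidate routes for a prefix are compared by administrative distance first, and the OSPF cost only matters between OSPF routes. The `[110/15]` above shows the OSPF route has distance 110; the kernel route's distance of 0 is an assumption based on zebra's usual defaults. The `Candidate` class and `select_route` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    source: str    # "kernel" or "ospf"
    nexthop: str
    ifname: str
    distance: int  # administrative distance
    metric: int    # protocol metric (OSPF cost)


def select_route(candidates):
    """Pick the candidate with the lowest (distance, metric) pair."""
    return min(candidates, key=lambda c: (c.distance, c.metric))


candidates = [
    Candidate("ospf", "172.16.53.254", "em1_vlan999", distance=110, metric=15),
    # Distance 0 for the kernel route is an assumption, see the text above.
    Candidate("kernel", "192.168.168.1", "em5", distance=0, metric=0),
]

best = select_route(candidates)
print(best.source, best.ifname)
```

Under that assumption a leftover kernel route wins no matter how low the OSPF cost is, which would match what the listing shows.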

          My setup relied on OpenVPN too, and worked fine on pfSense 2.2.3-2.2.6, until I upgraded all routers to pfSense 2.3.

          pfSense 2.3_1 with Quagga_OSPF 0.6.13

            shaoranrch

            I believe this is a major issue and should be given top priority. We're talking about routing and deployments where redundancy is a must, so this is just unacceptable. Maybe the devs could tell us when we can expect this to be solved.

              heper

              While this is a major issue for you, me and probably some others, chances are that more urgent matters exist.
              If you can provide more detailed debugging info, it will help find the root cause and get a solution faster.

              I'm just a user of OSPF and don't have the knowledge to find out why it is behaving like it is. AFAIK there have been few changes to the pfSense package (except the conversion of the GUI).

              ----------------------------------------
              I've just tried going back to an earlier version of Quagga on a test system. It appears to solve the 'kernel route' issue, but my test setup is too limited to test this fully. If I have spare time next week I'll run some further tests.
              If your test environment is better (or you wish to risk this on a production environment), run the command below from a shell.
              For 32-bit:

              
              pkg add -f http://pkg.freebsd.org/freebsd:10:x86:32/release_3/All/quagga-0.99.24.1_2.txz
              
              

              For 64-bit:

              
              pkg add -f  http://pkg.freebsd.org/freebsd:10:x86:64/release_3/All/quagga-0.99.24.1_2.txz
              
              

              USE WITH CAUTION / THIS MAY HAVE UNWANTED CONSEQUENCES

                heper

                Just tried it on one of my production systems. Downgrading seems to have solved the routing issues I had with the dual-OpenVPN failover.
                I'll update the Redmine ticket accordingly.

                If @shaoranrch and @kennylam could confirm that downgrading helps, then we are getting somewhere  :)

                  kennylam

                  That worked for me too. OSPF routes on VLAN/OpenVPN are now selected as the primary route, according to the defined costs.

                    reqlez

                    Great... I have the same issue, and of course I only found this post after beating my head against the wall for 2 hours. K and O routes on the same interface are showing up, the K route obviously doesn't get updated, and my traffic doesn't fail over.

                    I don't have any VLANs... maybe rename the topic to "Major issue with QUAGGA-OSPF".

                      heper

                      So reverting to the older version works for you?

                        reqlez

                        By the way, I confirmed that installing an older version as per the instructions above fixed the problem.

                        What I still hate is that when a VPN connection reconnects (even one with lower priority), the OSPF package gets restarted, the routing table gets cleared, and traffic drops for a few seconds. This is an old limitation that still has not been fixed :(

                          reqlez

                          I also found another difference on the version of the OSPF package that works (downgraded).

                          router ospf
                            ospf router-id 192.168.2.254
                            passive-interface re1
                            network 192.168.2.0/24 area 0.0.0.0
                            network 192.168.101.0/24 area 0.0.0.0
                            network 192.168.102.0/24 area 0.0.0.0
                            network 192.168.103.0/24 area 0.0.0.0
                            network 192.168.104.0/24 area 0.0.0.0

                          On the version that works, there is only ONE entry per subnet here. On the NEW version that doesn't work, there are two entries per subnet, so it looks like this:

                          router ospf
                            ospf router-id 192.168.2.254
                            passive-interface re1
                            network 192.168.2.0/24 area 0.0.0.0
                            network 192.168.101.0/24 area 0.0.0.0
                            network 192.168.102.0/24 area 0.0.0.0
                            network 192.168.103.0/24 area 0.0.0.0
                            network 192.168.101.0/24 area 0.0.0.0
                            network 192.168.104.0/24 area 0.0.0.0
                            network 192.168.102.0/24 area 0.0.0.0
                            network 192.168.103.0/24 area 0.0.0.0
                            network 192.168.104.0/24 area 0.0.0.0

                          (NOT EXACT, but you get the idea: two entries per subnet in ospfd.conf.)
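
A small sketch for checking a generated ospfd.conf for the duplication described above. It assumes the file uses `network <prefix> area <area>` lines as in the configs posted in this thread; the function name `duplicate_networks` and the sample text are made up for illustration.

```python
from collections import Counter


def duplicate_networks(conf_text):
    """Return {(prefix, area): count} for network statements listed more than once."""
    seen = Counter()
    for line in conf_text.splitlines():
        parts = line.split()
        # Expected shape: network <prefix> area <area>
        if len(parts) == 4 and parts[0] == "network" and parts[2] == "area":
            seen[(parts[1], parts[3])] += 1
    return {key: count for key, count in seen.items() if count > 1}


# Hypothetical config text modelled on the "new version" example above.
sample_conf = """\
router ospf
  ospf router-id 192.168.2.254
  passive-interface re1
  network 192.168.2.0/24 area 0.0.0.0
  network 192.168.101.0/24 area 0.0.0.0
  network 192.168.102.0/24 area 0.0.0.0
  network 192.168.101.0/24 area 0.0.0.0
  network 192.168.102.0/24 area 0.0.0.0
"""

for (prefix, area), count in duplicate_networks(sample_conf).items():
    print(f"{prefix} area {area} appears {count} times")
```

Whether the duplicated statements are actually related to the kernel route problem is not established in this thread; the check only makes the difference between the two package versions easy to see.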

                            r.vanmoerkerk

                            Hi All,

                            First of all, thanks for this post. We had a lot of major issues in our network after the pfSense 2.3 update. Thanks to this post we could fix the issue and, after many hours of troubleshooting, found out what had happened.

                            We found a bug report about this that is already a month old (https://redmine.pfsense.org/issues/6305) and opened a new one in our own name under our support subscription. I will post an update if we get one.

                            Downgrading the package fixed the issue for us.

                            We also cannot redistribute the default route 0.0.0.0/0 to our LAN using zebra.conf. We have a support request open for that question as well, hopefully leading to a fix or update.

                              heper

                              Thanks. I bumped the Redmine ticket yesterday.

                              Hopefully it'll get fixed soon.

                                miguelgoncalves

                                Hi!

                                I am also seeing this behaviour… Asked about it here: https://forum.pfsense.org/index.php?topic=112698.0

                                Attached are configuration files. These were created manually because I had to include some commands to stop Quagga inserting routes to the OpenVPN addresses into the kernel. It worked before. A recent upgrade stopped the failover from working.

                                I really hope this is solved quickly.

                                Cheers,
                                Miguel

                                dc_ospfd.txt
                                dc_zebra.txt
                                hq_ospfd.txt
                                hq_zebra.txt

                                  r.vanmoerkerk

                                  No luck with support: they don't give any feedback or acknowledge the issue. The test mentioned in the Redmine ticket is not a fair test. In our case the bug makes the router advertise the whole network to itself, so other BGP/OSPF instances in our network are overwritten with this new data and locations become unreachable. It has nothing to do with WAN failover in our case.

                                  It could be that the extra kernel routes are the issue, but if that is the case, then please try to fix that. Everything worked well with the previous configs, and this issue appeared after the upgrade. After reverting the Quagga package it is fixed, so don't blame me for thinking that it is related to the Quagga package.

                                  How can we get some more action on this from the pfSense side? Issues like this are why our management is losing faith in the solution. We have support but no response about this issue ("not our package"). It is part of the pfSense firewall product suite.

                                    heper

                                    I think it would be ideal if one of the core devs reverted the pfSense package to Quagga 0.99.x.x for now (there wasn't anything wrong with it).

                                    Then the core devs would have more time to find a way to replicate the issue and report it upstream.

                                    I believe this issue might affect a lot of Quagga users, but not all of them have noticed it. In some cases you only notice it when an interface goes down.

                                      echu2016

                                      Hello all!
                                      I've been stuck with this annoying bug for three long days!
                                      Once I realized it wasn't only me, I quickly solved it by reverting the package to an older version as proposed earlier in this thread (thanks!).

                                      For me it is quite easy to reproduce.
                                      Let's start by assuming we have a running, configured instance on a pfSense box.
                                      Our daemon now learns a brand new route, for instance 10.1.1.0/24.
                                      Example output:

                                      Quagga Zebra Routes:

                                      Codes: K - kernel route, C - connected, S - static, R - RIP,
                                            O - OSPF, I - IS-IS, B - BGP, P - PIM, A - Babel,
                                            > - selected route, * - FIB route

                                      O>* 10.1.1.0/24 [110/12] via 192.168.123.13, em5, 00:00:31

                                      We now have a next hop change (because of a link down situation).

                                      That line would now be seen (in my case) like this:

                                      O>* 10.1.1.0/24 [110/12] via 192.168.19.25, em5, 00:00:02

                                      (sorry for the uppercase)
                                      UP TO HERE, OK!

                                      BUT, let's go back to the original route:

                                      O>* 10.1.1.0/24 [110/12] via 192.168.123.13, em5, 00:00:31

                                      IF at this step zebra is reloaded or restarted for any reason, from then on we will see this:

                                      O> 10.1.1.0/24 [110/12] via 192.168.123.13, em5, 00:00:31
                                      K>* 10.1.1.0/24 [110/12] via 192.168.123.13, em5, 00:00:31

                                      What happens in a failover scenario? Well... this:

                                      O> 10.1.1.0/24 [110/12] via 192.168.19.25, em5, 00:01:20
                                      K>* 10.1.1.0/24 [110/12] via 192.168.123.13, em5, 00:00:05

                                      The K line shows the problem: the kernel route is wrong!
                                      As far as I have read, there is a zebra daemon option, "--keep_kernel" (-k), that tells zebra to preserve previously installed routes when it starts up.

                                      If my explanation is right, then there is one simple way to reproduce it:

                                      1- Make OSPF learn a new route.
                                      2- Go to Services and restart both the Quagga OSPFd and Quagga Zebra daemons.
                                      3- Try to alter the paths and see that the line beginning with K won't change any more!
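
Step 3 can be checked by comparing two saved copies of the route listing, one taken before and one after the path change. A rough sketch, assuming the line format shown in this post; `next_hops` and `kernel_routes_stuck` are hypothetical helper names, not part of Quagga or pfSense.

```python
import re

# Picks the route code, the prefix and the next hop out of lines such as:
#   O> 10.1.1.0/24 [110/12] via 192.168.19.25, em5, 00:01:20
#   K>* 10.1.1.0/24 [110/12] via 192.168.123.13, em5, 00:00:05
LINE_RE = re.compile(
    r"^\s*(?P<code>[KO])[>* ]*\s*(?P<prefix>[\d.]+/\d+).*?\bvia\s+(?P<nexthop>[\d.]+)"
)


def next_hops(route_text, code):
    """Map prefix -> next hop for lines starting with the given route code."""
    hops = {}
    for line in route_text.splitlines():
        m = LINE_RE.match(line)
        if m and m.group("code") == code:
            hops[m.group("prefix")] = m.group("nexthop")
    return hops


def kernel_routes_stuck(before, after):
    """Prefixes whose OSPF next hop changed while the kernel next hop did not."""
    o_before, o_after = next_hops(before, "O"), next_hops(after, "O")
    k_before, k_after = next_hops(before, "K"), next_hops(after, "K")
    return sorted(
        prefix
        for prefix in k_after
        if prefix in k_before and prefix in o_before and prefix in o_after
        and o_before[prefix] != o_after[prefix]
        and k_before[prefix] == k_after[prefix]
    )


# Listings taken from the example in this post (after the zebra restart).
before = """\
O> 10.1.1.0/24 [110/12] via 192.168.123.13, em5, 00:00:31
K>* 10.1.1.0/24 [110/12] via 192.168.123.13, em5, 00:00:31
"""
after = """\
O> 10.1.1.0/24 [110/12] via 192.168.19.25, em5, 00:01:20
K>* 10.1.1.0/24 [110/12] via 192.168.123.13, em5, 00:00:05
"""

print(kernel_routes_stuck(before, after))
```

An empty result after a failover would suggest the kernel route followed the OSPF change; any prefix listed is one where the K line stayed behind.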

                                      Hope I helped!

                                      Thanks!!!

                                        georgeman

                                        Bump!!

                                        Any updates on this? Unfortunately I don't have an appropriate lab to test what echu2016 posted above, and my production systems are currently running the previous version of the Quagga package as suggested.
                                        But what he posted makes perfect sense, and it should be pretty simple to reproduce and track down.

                                        If it ain't broke, you haven't tampered enough with it

                                          jimp Rebel Alliance Developer Netgate

                                          After restarting services and yanking (virtual) cables I did manage to make it break, once.

                                          If it is related to restarting zebra, this patch might help:

                                          http://files.atx.pfsense.org/jimp/patches/skip_restart_for_routing_packages-2.3.1.patch

                                          Ultimately someone that can reproduce this reliably needs to report this directly to quagga since it appears to be a problematic change introduced in their 1.0.x code base.


                                            Trey

                                            Hi,

                                            This problem is a real show stopper. Does nobody have a config that we can supply to the Quagga team so they can fix the problem? This problem really sucks, as it only shows itself from time to time...

                                            Dear pfSense team, what about a paid bug fix? What would it cost?!

                                            regards

                                            trey

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.