Netgate Discussion Forum

    pfSense 22.05 breaks VLANS, restoring pfSense 22.01 fixes the issue

    L2/Switching/VLANs
    247 Posts 7 Posters 85.5k Views
    • johnpozJ
      johnpoz LAYER 8 Global Moderator @NRgia
      last edited by

      @nrgia said in pfSense 22.05 breaks VLANS, restoring pfSense 22.01 fixes the issue:

      then what is vlan 0

      That is a special-use VLAN, and it is not commonly used. It's normally used to set priority when the actual VLAN is not known, or when whatever is sending can't set the priority on the actual VLAN.

      When a switch sees a frame tagged with VLAN 0, it should apply the priority carried in that tag and treat the frame as belonging to the default/native VLAN for that port, i.e. the PVID, which is normally 1 on any switch unless the operator has changed it.
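To make the VLAN 0 description concrete, here is a minimal Python sketch of how the 802.1Q tag splits into priority (PCP) and VLAN ID (VID), and how a VID of 0 falls back to the port's PVID. The function names and the default PVID of 1 are assumptions for illustration only; this is not taken from any switch or pfSense code.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def parse_vlan_tag(frame: bytes):
    """Return (pcp, dei, vid) for an 802.1Q-tagged frame, or None if untagged."""
    if len(frame) < 18:
        return None
    # Bytes 0-11 are the destination and source MACs; the tag follows them.
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q:
        return None
    pcp = tci >> 13           # 3-bit priority code point
    dei = (tci >> 12) & 0x1   # drop eligible indicator
    vid = tci & 0x0FFF        # 12-bit VLAN ID
    return pcp, dei, vid


def effective_vlan(frame: bytes, pvid: int = 1) -> int:
    """VLAN a port would classify the frame into (hypothetical helper).

    Untagged and priority-tagged (VID 0) frames both land in the PVID.
    """
    tag = parse_vlan_tag(frame)
    if tag is None or tag[2] == 0:
        return pvid
    return tag[2]
```

A frame whose tag control field is 0xA000 would parse as priority 5 with VID 0, so it carries a priority but no VLAN membership of its own.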

      Here is how uncommon it is in the normal enterprise: in 30-some years working in the biz, I have never had a need or want to set that on any sort of switch or router, and I have worked with lots and lots of them over the years. Nor have I ever seen it in the field in any pcaps, or in any pcaps sent to me from multiple customers and locations, and I work for a major player and have gotten pcaps for things they want help on from really all over the globe. VLAN 0 has never been part of any discussion or troubleshooting I have ever been involved in. Now, is it possible it was there and the pcaps sent to me didn't have it? OK, sure. But I have to say it's not a very commonly used thing, in my personal and professional opinion.

      That you're seeing them is odd for sure.

      An intelligent man is sometimes forced to be drunk to spend time with his fools
      If you get confused: Listen to the Music Play
      Please don't Chat/PM me for help, unless mod related
      SG-4860 24.11 | Lab VMs 2.8, 24.11

      • N
        NRgia @johnpoz
        last edited by NRgia

        Pfff... the deeper we go, the weirder it becomes.
        The point is, if it were a package like Suricata, I could live without it, but this is something that prevents me from using pfSense. Sure, 22.01 is fine for a few months, but after that...

        I mean, you guys are way more experienced than me, and you have run out of ideas...
        Can I at least ask both of you, @johnpoz and @stephenw10, to drop me a message if you stumble upon the same issue in other users' posts and find a solution?
        @stephenw10, if you find out that something changed upstream in FreeBSD, can you please let me know so I can test again?

        Thank you guys for your time; I won't take up any more of it on this. If you have any news or ideas, please let me know.
        Thank you again.

        • stephenw10S
          stephenw10 Netgate Administrator @NRgia
          last edited by

          @nrgia said in pfSense 22.05 breaks VLANS, restoring pfSense 22.01 fixes the issue:

          So due to the fact that for the same port I have tagged and untagged, maybe the tags for the VLANS get stripped ? And only the traffic to VLAN 0(1) remains?

          I was suggesting that what might have happened to get that client a DHCP lease on LAN is that, while you were making changes to the QoS settings, at some point the switch removed the VLAN tags from the port long enough for the DHCP sequence to complete.
          Whereas if you had your LAN assigned as ix2.10, for example, untagged traffic arriving at pfSense would simply be dropped.

          That's a completely separate issue to the inexplicable VLAN 0 tags we see in the pcaps though.
          And I agree VLAN 0 (priority) tagged traffic is rare. I've only seen it in DHCP traffic arriving from an ISP.
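As a toy model of the tagged/untagged point above (not the actual FreeBSD or pfSense code), the sketch below picks which interface would receive a frame based on its VLAN ID and on which interfaces are assigned. The function name, the `assigned` set, and the choice to treat VID 0 like untagged traffic are all assumptions for illustration; how a given release really handles VID 0 is exactly what this thread is trying to establish.

```python
def receiving_interface(vid, parent, assigned):
    """Return the interface name that would receive a frame, or None if dropped.

    vid      -- None for an untagged frame, otherwise the 12-bit VLAN ID
    parent   -- name of the physical NIC, e.g. "ix2"
    assigned -- set of interface names that are assigned in the config
    """
    if vid is None or vid == 0:
        # Assumption: priority-tagged (VID 0) frames are treated as untagged.
        name = parent
    else:
        name = f"{parent}.{vid}"
    return name if name in assigned else None


# LAN assigned as ix2.10 only: untagged traffic has nowhere to go.
print(receiving_interface(None, "ix2", {"ix2.10"}))        # None (dropped)
# LAN on the parent with VLAN 20 alongside it: untagged traffic reaches LAN.
print(receiving_interface(None, "ix2", {"ix2", "ix2.20"}))  # ix2
```

Under this model, a client on a VLAN could only obtain a lease on LAN if its frames reached pfSense without their VLAN tag.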

          Steve

          • stephenw10S
            stephenw10 Netgate Administrator
            last edited by

            @johnpoz were you able to get a mirror port working that could see all the tagged traffic on your Netgear switch?

            @NRgia Do you have any other interfaces you can use on that box? Or could you add any?

            Steve

            • johnpozJ
              johnpoz LAYER 8 Global Moderator @stephenw10
              last edited by

              @stephenw10 said in pfSense 22.05 breaks VLANS, restoring pfSense 22.01 fixes the issue:

              get a mirror port working that could see all the tagged traffic on your Netgear switch?

              No, I haven't; I got tied up with real work stuff. And I have to look into being able to sniff VLAN traffic on a Windows machine beforehand anyway. Or it might be easier to just mirror the traffic to a spare interface on pfSense ;) That will be much easier, I think.

              • N
                NRgia @stephenw10
                last edited by NRgia

                @stephenw10 said in pfSense 22.05 breaks VLANS, restoring pfSense 22.01 fixes the issue:

                @NRgia Do you have any other interfaces you can use on that box? Or could you add any?

                Steve

                Yep, I have 4; only 2 are used.
                It is this board:
                https://www.supermicro.com/en/products/motherboard/A2SDi-4C-HLN4F

                I tried to find something that resembles Netgate hardware and isn't a no-name board, but that is another discussion.

                • N
                  NRgia @stephenw10
                  last edited by

                  @stephenw10 said in pfSense 22.05 breaks VLANS, restoring pfSense 22.01 fixes the issue:

                  @nrgia said in pfSense 22.05 breaks VLANS, restoring pfSense 22.01 fixes the issue:

                  So due to the fact that for the same port I have tagged and untagged, maybe the tags for the VLANS get stripped ? And only the traffic to VLAN 0(1) remains?

                  I was suggesting that what might have happened to get that client a DHCP lease on LAN is that, while you were making changes to the QoS settings, at some point the switch removed the VLAN tags from the port long enough for the DHCP sequence to complete.
                  Whereas if you had your LAN assigned as ix2.10, for example, untagged traffic arriving at pfSense would simply be dropped.

                  That's a completely separate issue to the inexplicable VLAN 0 tags we see in the pcaps though.
                  And I agree VLAN 0 (priority) tagged traffic is rare. I've only seen it in DHCP traffic arriving from an ISP.

                  Steve

                  Would it help to know that I saw the word (Incomplete) in the DHCP leases table? I only saw it during the last tests.

                  • stephenw10S
                    stephenw10 Netgate Administrator
                    last edited by

                    Mmm, so only the 4 ix NICs on board. Is it in a case where you can use the PCIe slot to add another type of NIC?
                    That would be an easy way to prove out the driver/hardware vs config/network.

                    • N
                      NRgia @stephenw10
                      last edited by NRgia

                      @stephenw10 said in pfSense 22.05 breaks VLANS, restoring pfSense 22.01 fixes the issue:

                      Mmm, so only the 4 ix NICs on board. Is it in a case where you can use the PCIe slot to add another type of NIC?
                      That would be an easy way to prove out the driver/hardware vs config/network.

                      I thought of that; unfortunately I don't have any low-profile NICs to test with, so I would need to buy one.
                      The chassis looks like this:

                      https://www.supermicro.com/en/products/system/Mini-ITX/SYS-E300-9A-4C.cfm

                      In the PCIe slot I have a riser card, into which a low-profile NIC must be inserted.

                      I would also like it to be the driver, but you also tested with a board that uses the ix driver. What is your NIC model? Mine is X553. Maybe the model matters too?

                      What if we do a little hack: build an if_ix.ko on pfSense 22.01, then install pfSense 22.05 and load the .ko from 22.01. Do you think that's a good test?

                      • stephenw10S
                        stephenw10 Netgate Administrator
                        last edited by

                        Yeah it seems unlikely to be the driver because that's the same SoC with the same NICs we use in the 6100 and 7100. And the 7100 uses VLANs on it by default.

                        Kernel modules from 22.01 will not load in the 22.05 kernel. At least most won't, especially something like that. You would have to build the 22.01 driver against 22.05, and it probably won't build without some work. You might be able to just revert the patch we think went in and created the issue in your network, and compile that. However, if that does work, it only proves the VLAN 0 handling was bad.

                        We need to see pcap on a mirror port showing all the traffic on the wire going into the port with working two way connectivity.
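Once a mirror-port capture exists, a quick tally of VLAN IDs makes stray VLAN 0 tags easy to spot. The following is a rough, standard-library-only Python sketch that assumes a classic (non-pcapng) Ethernet capture with microsecond timestamps; the function name `count_vlans` is made up for this example, and the code only looks at the outermost tag.

```python
import struct
from collections import Counter


def count_vlans(pcap_bytes: bytes) -> Counter:
    """Count frames per outer VLAN ID in a classic pcap file (Ethernet only)."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"   # file written by a little-endian host
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"   # file written by a big-endian host
    else:
        raise ValueError("not a classic microsecond pcap file")
    linktype = struct.unpack(endian + "I", pcap_bytes[20:24])[0]
    if linktype != 1:
        raise ValueError("only Ethernet (linktype 1) captures are handled")

    counts = Counter()
    offset = 24  # skip the 24-byte global header
    while offset + 16 <= len(pcap_bytes):
        # Per-record header: ts_sec, ts_usec, captured length, original length
        _sec, _usec, incl_len, _orig = struct.unpack(
            endian + "IIII", pcap_bytes[offset:offset + 16])
        offset += 16
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len
        if len(frame) >= 16 and frame[12:14] == b"\x81\x00":
            tci = struct.unpack("!H", frame[14:16])[0]
            counts[tci & 0x0FFF] += 1   # key is the 12-bit VLAN ID
        else:
            counts["untagged"] += 1
    return counts
```

Usage would be along the lines of `count_vlans(open("mirror.pcap", "rb").read())`, where `mirror.pcap` is a placeholder file name; a non-zero count under key `0` would mean priority-tagged frames really are on the wire.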

                        It's almost impossible to believe the driver could add those tags to incoming packets. Incorrectly removing them in 22.01 is far more likely.

                        Steve

                        • johnpozJ
                          johnpoz LAYER 8 Global Moderator @stephenw10
                          last edited by johnpoz

                          @stephenw10 I just don't see how the driver would add them either; that just makes no sense. And if that were the case, as you mention, you would have it on a bunch of devices sold by Netgate. It's also hard to believe he is the only one running this specific MB etc. But the driver would be the same. So why would it add something on his, but not on all the others?

                          So what would be different on his hardware such that something is wrong with the driver adding VLAN 0, but nowhere else? I would think lots of people are running 22.05 with ix interfaces, so why is the board not aflame with people saying their VLANs are not working?

                          Something pretty unique in this setup is causing it. We're just missing what that something is in trying to solve the puzzle.

                          btw: I will fire up the mirror port on my Netgear tmrw. Unless something breaks or catches fire, I have a pretty empty cal tmrw with real-life work ;)

                          • stephenw10S
                            stephenw10 Netgate Administrator
                            last edited by

                            Yup, I agree. It seems far more likely we are just failing to capture the offending traffic at this point.

                            Except it appeared to do the same thing with the AP connected directly to ix2. 😕
                            So now I'm starting to doubt things!

                            • N
                              NRgia @stephenw10
                              last edited by

                              @stephenw10 said in pfSense 22.05 breaks VLANS, restoring pfSense 22.01 fixes the issue:

                              it seems unlikely to be the driver because that's the same SoC with the same NICs we use in the 6100 and 7100. And the 7100 uses VLANs on it by default.

                              I don't know whether you have the management LAN on the native LAN or on a VLAN, like VLAN 10 or something. Maybe you could try to replicate my setup, although you don't have the same switch. Or maybe I'm asking too much; it was just an idea.

                              I mean, is there any mistake in this picture for LAN?
                              https://imgur.com/a/WIZZ6rB

                              If I can do something else, let me know

                              • stephenw10S
                                stephenw10 Netgate Administrator
                                last edited by

                                That should work fine. You do have spare NICs so you could try other setups like putting VLANs 20 and 30 on ix1 and running a second link to the switch. But it should work as you have it now in 22.05.

                                • N
                                  NRgia @stephenw10
                                  last edited by NRgia

                                  @stephenw10 said in pfSense 22.05 breaks VLANS, restoring pfSense 22.01 fixes the issue:

                                  That should work fine. You do have spare NICs so you could try other setups like putting VLANs 20 and 30 on ix1 and running a second link to the switch. But it should work as you have it now in 22.05.

                                  I think I can repurpose another mini PC that uses the igb and em drivers (for a few hours), but I will need some time to save my work. Just so I understand what it will prove:

                                  1. If it works on another machine with another type of NIC, then it's not the switch.
                                  2. If it doesn't, it means it's the switch?

                                  I'm trying to double-check this with you to make sure it will not be in vain, because what I see here https://github.com/pfsense/FreeBSD-src/commit/9c762cc125c0c2dae9fbf49cc526bb97c14b54a4 is that the fix is also for the igb and em drivers.
                                  • stephenw10S
                                    stephenw10 Netgate Administrator
                                    last edited by

                                    Yes if the switch is really tagging the traffic with VLAN0 it should fail using any NIC/driver in 22.05.

                                    Though as I understand it we also saw that when the switch was removed entirely and the AP was connected directly to pfSense. Which is hard to explain.

                                    • N
                                      NRgia @stephenw10
                                      last edited by

                                      @stephenw10 said in pfSense 22.05 breaks VLANS, restoring pfSense 22.01 fixes the issue:

                                      Yes if the switch is really tagging the traffic with VLAN0 it should fail using any NIC/driver in 22.05.

                                      Though as I understand it we also saw that when the switch was removed entirely and the AP was connected directly to pfSense. Which is hard to explain.

                                      Yes, that's correct, that was one of the tests you asked me to do: the AP was connected directly to pfSense. That's why I don't know what to say about whether it's the switch or not. If I understood correctly, @johnpoz has almost the same switch as mine and also the same AP (Flex HD), and he did not encounter any issues. I don't know if he had a board that used the ix driver. Other than that, he has 2 of my possibly offending devices.

                                      • johnpozJ
                                        johnpoz LAYER 8 Global Moderator @NRgia
                                        last edited by

                                        @nrgia yeah, I had no issues with VLANs, and a native VLAN as well, both with port QoS and 802.1p.

                                        The only difference is, yeah, my switch is a bit different, the baby sister to your switch. And mine is downstream of another switch that could be stripping the VLAN 0??

                                        But if you said you plugged the AP directly into the interface on pfSense, that kind of rules out the switch.

                                        This is an odd one for sure. Tmrw I could plug my AP directly into a port on pfSense; it's just not on an ix interface, it's an igb interface.

                                        • stephenw10S
                                          stephenw10 Netgate Administrator
                                          last edited by

                                          Mmm, because we also saw the VLAN 0 tags when we removed the access point and just used the switch.
                                          That implies either that it's neither the AP nor the switch, or that both devices are doing it, which seems unlikely!

                                          • N
                                            NRgia
                                            last edited by NRgia

                                            @stephenw10 said in pfSense 22.05 breaks VLANS, restoring pfSense 22.01 fixes the issue:

                                            Mmm, because we also saw the VLAN 0 tags when we removed the access point and just used the switch.
                                            That implies either that it's neither the AP nor the switch, or that both devices are doing it, which seems unlikely!

                                            I've got news, @stephenw10 and @johnpoz.
                                            I installed pfSense 22.05 on another mini PC that has 2 NICs, which use the igb and em drivers.

                                            This is some info from ifconfig:

                                            [22.05-RELEASE][root@Entaro.Blueshift]/root: ifconfig
                                            igb0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
                                                    description: LAN
                                                    options=8100b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER>
                                                    ether 80:ee:73:bb:0e:55
                                                    inet6 fe80::82ee:73ff:febb:e55%igb0 prefixlen 64 scopeid 0x1
                                                    inet 172.18.0.12 netmask 0xfffe0000 broadcast 172.19.255.255
                                                    media: Ethernet autoselect (1000baseT <full-duplex>)
                                                    status: active
                                                    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
                                            em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
                                                    description: WAN
                                                    options=812098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWFILTER>
                                                    ether 80:ee:73:bb:0e:54
                                                    inet6 fe80::82ee:73ff:febb:e54%em0 prefixlen 64 scopeid 0x2
                                                    inet ************* netmask 0xfffff800 broadcast ***************
                                                    media: Ethernet autoselect (1000baseT <full-duplex>)
                                                    status: active
                                                    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
                                            enc0: flags=0<> metric 0 mtu 1536
                                                    groups: enc
                                                    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
                                            lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
                                                    options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
                                                    inet6 ::1 prefixlen 128
                                                    inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
                                                    inet 127.0.0.1 netmask 0xff000000
                                                    inet 10.10.10.21 netmask 0xffffffff
                                                    groups: lo
                                                    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
                                            pfsync0: flags=0<> metric 0 mtu 1500
                                                    groups: pfsync
                                            pflog0: flags=100<PROMISC> metric 0 mtu 33160
                                                    groups: pflog
                                            igb0.20: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
                                                    description: IoT
                                                    ether 80:ee:73:bb:0e:55
                                                    inet6 fe80::82ee:73ff:febb:e55%igb0.20 prefixlen 64 scopeid 0x9
                                                    inet 192.168.10.1 netmask 0xffffffc0 broadcast 192.168.10.63
                                                    groups: vlan
                                                    vlan: 20 vlanpcp: 0 parent interface: igb0
                                                    media: Ethernet autoselect (1000baseT <full-duplex>)
                                                    status: active
                                                    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
                                            igb0.30: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
                                                    description: GuestNetwork
                                                    ether 80:ee:73:bb:0e:55
                                                    inet6 fe80::82ee:73ff:febb:e55%igb0.30 prefixlen 64 scopeid 0xa
                                                    inet 192.168.20.1 netmask 0xffffffc0 broadcast 192.168.20.63
                                                    groups: vlan
                                                    vlan: 30 vlanpcp: 0 parent interface: igb0
                                                    media: Ethernet autoselect (1000baseT <full-duplex>)
                                                    status: active
                                                    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
                                            ovpns1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
                                                    options=80000<LINKSTATE>
                                                    inet6 fe80::82ee:73ff:febb:e55%ovpns1 prefixlen 64 scopeid 0xb
                                                    inet 10.0.8.1 --> 10.0.8.2 netmask 0xffffff00
                                                    groups: tun openvpn
                                                    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
                                                    Opened by PID 59956
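For anyone comparing setups, the ifconfig output above can be reduced to a short summary of which VLAN sits on which parent and which subnet each interface serves. This is a rough Python sketch that assumes the FreeBSD-style layout shown above (hex netmasks, a `vlan: N vlanpcp: P parent interface: X` line); the function name `summarize` and the result layout are made up for this example.

```python
import ipaddress
import re

IFACE_RE = re.compile(r"^(\S+): flags=")
INET_RE = re.compile(r"^inet (\d+\.\d+\.\d+\.\d+) netmask (0x[0-9a-f]{8})")
VLAN_RE = re.compile(r"^vlan: (\d+) vlanpcp: (\d+) parent interface: (\S+)")


def summarize(ifconfig_text: str) -> dict:
    """Map interface name -> {"networks": [...], optional "vlan"/"pcp"/"parent"}."""
    result = {}
    current = None
    for raw in ifconfig_text.splitlines():
        line = raw.strip()  # tolerate any leading indentation
        m = IFACE_RE.match(line)
        if m:
            current = result.setdefault(m.group(1), {"networks": []})
            continue
        if current is None:
            continue
        m = INET_RE.match(line)
        if m:
            # Hex netmask -> prefix length by counting the set bits
            prefix = bin(int(m.group(2), 16)).count("1")
            iface = ipaddress.ip_interface(f"{m.group(1)}/{prefix}")
            current["networks"].append(str(iface.network))
            continue
        m = VLAN_RE.match(line)
        if m:
            current["vlan"] = int(m.group(1))
            current["pcp"] = int(m.group(2))
            current["parent"] = m.group(3)
    return result
```

Redacted addresses and point-to-point lines (such as the `ovpns1` entry) do not match the `inet` pattern and are simply skipped. For the output above, `0xfffe0000` works out to a /15 and `0xffffffc0` to a /26.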
                                            
                                            

                                            Now a tcpdump from pfSense igb0 - the new LAN interface with NATIVE, VLAN20 and VLAN30
                                            pfsense_lan_tcpdump_new_machine.txt

                                            Everything works, every WLAN, every VLAN, any host. No problem whatsoever.

                                            Conclusions please guys

                                            Tell me @stephenw10 if you need a tcpdump via the monitor port also.

                                            Thank you

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.