Incoming traffic on vlan not recognized



  • Hi,

    I'm building a new firewall based on pfsense 2.1 to have IPv6 support.
    Now it turns out that incoming traffic with a vlan tag is not recognized.

    I'm using a jetway NF99FL-525 atom based chassis with a 3 port intel expansion board.

    em0 is set up as the WAN connection and carries two VLANs:
    em0_vlan4 = IPtv (vlan is bridged to em4 which is the local IPtv interface, DHCP for local IP address so IGMP proxy can be used to have local lan join the IPtv multicast streams)
    em0_vlan6 = Internet (actually this is PPPoE over vlan 6)

    On my current 2.0.1 firewall, based on the same hardware, this works flawlessly.
    On pfSense 2.1 I can see the outgoing traffic (PPPoE discovery initiation frames, PADI), and I also see the incoming PPPoE discovery offer (PADO).
    I captured this traffic both with port mirroring on a switch (MRV OS906) and with tcpdump -ei em0.

    When running tcpdump -ei em0_vlan6 I only see the initiation frames and not the offer from the provider.
    This also happens with the DHCP request on the IPtv vlan.

    To double-check that incoming traffic on the VLAN is not handled, I gave the VLAN interface an IP address and started a ping.
    The result is that I can see the ARP request going out of the firewall with the right VLAN tag.
    I can also see the incoming ARP reply with the tag on the em0 interface, but not on the em0_vlan6 interface.

    Does anyone have an idea what might be wrong?

    @ndre



  • Do you have both boxes up at the same time (2.0.1 and 2.1)? If you don't, are you swapping them out port for port? I also assume that you have set up your rules correctly.



  • No, I did not run both boxes at the same time.
    The new one is still in the test phase, so I only connected the WAN interface and a laptop on the LAN interface.

    The rules do not apply yet. This is still layer 2 traffic (MAC layer).
    It happens during the startup phase, when the firewall tries to get an IP address.

    @ndre



  • I'd bet that this again has to do with the vlan-pcp settings, as discussed in http://forum.pfsense.org/index.php/topic,52721.0.html

    In your case, since you really have three ports on the board, I guess you may be able to set the vlan pcp manually from the console.
    For some IPTV-Providers the CoS tags have to be set for some reason.

    Something like
    "ifconfig em0_vlan4 vlanpcp 4"
    will probably do the trick.

    Instead of "4" you may try other values, see here:
    http://en.wikipedia.org/wiki/IEEE_P802.1p
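
    For quick reference while trying values, here is a minimal Python sketch that lists the eight PCP values with their usual IEEE 802.1Q traffic-type names and prints the matching command line. The helper name vlanpcp_command is made up for this illustration, and the vlanpcp keyword is simply the one quoted above; whether a given build accepts it is an assumption.

```python
# IEEE 802.1Q priority code point (PCP) values and their traffic-type names.
# Note that 0 (Best Effort) is the default and 1 (Background) ranks below it.
PCP_NAMES = {
    0: "Best Effort",
    1: "Background",
    2: "Excellent Effort",
    3: "Critical Applications",
    4: "Video",
    5: "Voice",
    6: "Internetwork Control",
    7: "Network Control",
}


def vlanpcp_command(interface: str, pcp: int) -> str:
    """Build the ifconfig command line quoted above (hypothetical helper)."""
    if pcp not in PCP_NAMES:
        raise ValueError("PCP must be in the range 0..7")
    return f"ifconfig {interface} vlanpcp {pcp}"


if __name__ == "__main__":
    # Print one candidate command per PCP value for em0_vlan4.
    for pcp, name in PCP_NAMES.items():
        print(f"{vlanpcp_command('em0_vlan4', pcp)}   # {name}")
```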

    In that case, I think pfSense should change the way it handles PCP on incoming traffic. As it stands it looks like a bug, not a feature, if traffic with diverging tags is dropped.

    Regards
    Epek



  • epek, this might be true, but he is talking about pings as well. Does that affect pings too, or just the IPtv port traffic?

    avink, if I read between the lines, it looks like the ports you are plugging into may not be set up for the correct VLAN access. Have you set up the switch port that the pfSense WAN uses just like the existing firewall's switch port? If the port is not tagged for VLAN 1, 4, and 6, then you might send a tagged packet, but you probably won't get one back.
    I might be wrong though. Is your provider tagging the vlan, or do you have them coming into an untagged port on vlan 6? Do they allow more than one PPPoE connection to be made?
    If you have this working in 2.0.1 and the switch ports are set up correctly, it should work the same with the same settings.



  • Ping may work if proxy_arp is on for some reason.
    In my case ping does not work either, but I have proxy_arp off.

    If avink had wrongly configured vlans, ping would not work either, except when a device that does not filter the packet is (IP-)accessible in both subnets. IPTV setups, by contrast, are often bridged or bridged + proxied, while PPPoE is proxied through the same layer as the vlans, isn't it?

    What's your provider anyway, avink?

    I was also told that mixing lan and vlan is not recommended.



  • I'm using the Dutch provider XS4all.

    There is no mixing of lan and vlan. The physical interface em0 is not used for traffic. It's not even configured as a firewall interface.
    On em0 there are two vlans: em0_vlan4 (obtaining an IP address by DHCP and bridged to em4 as mentioned before) and em0_vlan6 (obtaining its IP address by PPPoE).

    There is no misconfiguration of vlans.

    [2.1-BETA0][admin@firewall]/root(9): ifconfig -a
    em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=5219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO>
            ether 00:30:18:a2:bd:13
            inet6 fe80::230:18ff:fea2:bd13%em0 prefixlen 64 scopeid 0x6
            nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
    em0_vlan4: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=103<RXCSUM,TXCSUM,TSO4>
            ether 00:30:18:a2:bd:13
            inet6 fe80::76f0:6dff:fe80:9448%em0_vlan4 prefixlen 64 scopeid 0x13
            nd6 options=1<PERFORMNUD>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
            vlan: 4 vlanpcp: 0 parent interface: em0
    em0_vlan6: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=103<RXCSUM,TXCSUM,TSO4>
            ether 00:30:18:a2:bd:13
            inet6 fe80::76f0:6dff:fe80:9448%em0_vlan6 prefixlen 64 scopeid 0x14
            nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
            vlan: 6 vlanpcp: 0 parent interface: em0
    bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
            ether 02:fe:4a:c8:9c:00
            id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
            maxage 20 holdcnt 6 proto rstp maxaddr 100 timeout 1200
            root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
            member: em0_vlan4 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                    ifmaxaddr 0 port 19 priority 128 path cost 20000
            member: em4 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                    ifmaxaddr 0 port 14 priority 128 path cost 2000000

    If there were a vlan misconfiguration, one would expect no traffic on the vlan at all.
    But I can see outgoing traffic on the vlan.
    Incoming traffic is seen on the master interface em0 but not on the vlan interface.

    Traffic seen on the em0 interface, with vlan tag:
    [2.1-BETA0][admin@firewall]/root(12): tcpdump -ei em0
    tcpdump: WARNING: em0: no IPv4 address assigned
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on em0, link-type EN10MB (Ethernet), capture size 96 bytes
    18:35:21.906368 00:30:18:a2:bd:13 (oui Unknown) > Broadcast, ethertype 802.1Q (0x8100), length 40: vlan 6, p 0, ethertype PPPoE D, PPPoE PADI [Host-Uniq 0x40196E0500FFFFFF] [Service-Name]
    18:35:27.908087 00:30:18:a2:bd:13 (oui Unknown) > Broadcast, ethertype 802.1Q (0x8100), length 40: vlan 6, p 0, ethertype PPPoE D, PPPoE PADI [Host-Uniq 0x00E6542000FFFFFF] [Service-Name]
    18:35:29.907267 00:30:18:a2:bd:13 (oui Unknown) > Broadcast, ethertype 802.1Q (0x8100), length 40: vlan 6, p 0, ethertype PPPoE D, PPPoE PADI [Host-Uniq 0x00E6542000FFFFFF] [Service-Name]
    ^C3 packets captured
    3 packets received by filter
    0 packets dropped by kernel

    Traffic seen on the em0_vlan6 vlan interface. Because tcpdump captures on the virtual interface here, the frames are shown without the VLAN tag:
    [2.1-BETA0][admin@firewall]/root(14): tcpdump -ei em0_vlan6
    tcpdump: WARNING: em0_vlan6: no IPv4 address assigned
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on em0_vlan6, link-type EN10MB (Ethernet), capture size 96 bytes
    18:39:01.960611 00:30:18:a2:bd:13 (oui Unknown) > Broadcast, ethertype PPPoE D (0x8863), length 36: PPPoE PADI [Host-Uniq 0x007C2A2000FFFFFF] [Service-Name]
    18:39:05.963381 00:30:18:a2:bd:13 (oui Unknown) > Broadcast, ethertype PPPoE D (0x8863), length 36: PPPoE PADI [Host-Uniq 0x00E43B0500FFFFFF] [Service-Name]
    18:39:07.962558 00:30:18:a2:bd:13 (oui Unknown) > Broadcast, ethertype PPPoE D (0x8863), length 36: PPPoE PADI [Host-Uniq 0x00E43B0500FFFFFF] [Service-Name]
    18:39:11.962525 00:30:18:a2:bd:13 (oui Unknown) > Broadcast, ethertype PPPoE D (0x8863), length 36: PPPoE PADI [Host-Uniq 0x00E43B0500FFFFFF] [Service-Name]
    ^C
    4 packets captured
    4 packets received by filter
    0 packets dropped by kernel

    The answer from the provider (PADO) is seen on the master interface, but NOT on the vlan interface, so the firewall keeps sending requests:
    [2.1-BETA0][admin@firewall]/root(15): tcpdump -ei em0
    tcpdump: WARNING: em0: no IPv4 address assigned
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on em0, link-type EN10MB (Ethernet), capture size 96 bytes
    18:43:55.033543 00:30:18:a2:bd:13 (oui Unknown) > Broadcast, ethertype 802.1Q (0x8100), length 40: vlan 6, p 0, ethertype PPPoE D, PPPoE PADI [Host-Uniq 0x80053C0500FFFFFF] [Service-Name]
    18:43:59.033491 00:30:18:a2:bd:13 (oui Unknown) > Broadcast, ethertype 802.1Q (0x8100), length 40: vlan 6, p 0, ethertype PPPoE D, PPPoE PADI [Host-Uniq 0x80053C0500FFFFFF] [Service-Name]
    18:43:59.047620 00:90:1a:a4:60:4d (oui Unknown) > 00:30:18:a2:bd:13 (oui Unknown), ethertype 802.1Q (0x8100), length 75: vlan 6, p 7, ethertype PPPoE D, PPPoE PADO [AC-Name "dr7.d12"] [Host-Uniq 0x80053C0500FFFFFF] [Service-Name] [AC-Cookie 0x8CE02000C47364613408C38906D53904] [EOL]
    18:44:06.039529 00:30:18:a2:bd:13 (oui Unknown) > Broadcast, ethertype 802.1Q (0x8100), length 40: vlan 6, p 0, ethertype PPPoE D, PPPoE PADI [Host-Uniq 0x0046B72000FFFFFF] [Service-Name]
    18:44:08.039526 00:30:18:a2:bd:13 (oui Unknown) > Broadcast, ethertype 802.1Q (0x8100), length 40: vlan 6, p 0, ethertype PPPoE D, PPPoE PADI [Host-Uniq 0x0046B72000FFFFFF] [Service-Name]

    This is only layer 2 traffic; IP (L3) is not involved.
    When I start the same machine with pfSense 2.0.1, everything works fine.
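
    One detail worth pulling out of the em0 captures is the priority field that tcpdump -e prints next to the VLAN ID ("vlan 6, p 0" on the outgoing PADI, "vlan 6, p 7" on the incoming PADO). The Python sketch below extracts both numbers from such lines so the two directions can be compared. The function name vlan_and_pcp is invented for this example, and the sample lines are shortened copies of the output above.

```python
import re

# Matches the "vlan <id>, p <pcp>" part that tcpdump -e prints for 802.1Q frames.
VLAN_RE = re.compile(r"vlan (\d+), p (\d+)")


def vlan_and_pcp(line: str):
    """Return (vlan_id, pcp) from a tcpdump -e line, or None if no tag is shown."""
    match = VLAN_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Shortened copies of two lines from the em0 capture in this post.
padi = ("18:43:59.033491 00:30:18:a2:bd:13 (oui Unknown) > Broadcast, "
        "ethertype 802.1Q (0x8100), length 40: vlan 6, p 0, "
        "ethertype PPPoE D, PPPoE PADI")
pado = ("18:43:59.047620 00:90:1a:a4:60:4d (oui Unknown) > 00:30:18:a2:bd:13 "
        "(oui Unknown), ethertype 802.1Q (0x8100), length 75: vlan 6, p 7, "
        "ethertype PPPoE D, PPPoE PADO")

if __name__ == "__main__":
    print("PADI:", vlan_and_pcp(padi))
    print("PADO:", vlan_and_pcp(pado))
```

    Both frames are on VLAN 6; they differ only in the priority value.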

    @ndre



  • Did you upgrade your 2.0.1 config or did you start from scratch?



  • I created the lan and wan interfaces including vlans during install.
    Then I imported the 2.0.1 config.

    After I found out things didn't work as they should, I deleted the WAN interface and recreated it, but this did not solve the problem.

    Starting from scratch is an awful lot of work.
    But if it has to be done… Still, I'm not convinced it's a configuration issue.

    @ndre



  • Another update.

    Because I wanted to rule out any hardware issues, I tried another system (a Jetway NF92 board), also with an Intel 3-port Ethernet expansion board.
    The on-board interface has a Broadcom chipset.

    On this system I used re0 with vlans as the WAN interface.
    Again, this system shows exactly the same symptoms: incoming traffic is seen on the main interface but not on the vlan interface.

    Here I also created a firewall rule allowing any traffic, so I could test with ICMP (ping) as well.

    As a final test I did the same with the 2.0.1 software, and everything works like a charm.

    In the end my only conclusion can be that the problem is in the vlan code of the 2.1 software.
    The build I tested with was from Sat Aug 25 13:20:25 EDT 2012.

    I hope this will be fixed soon.

    @andre



  • After reading all that I've done and again carefully reading epek's answer I must admit he was right.

    Setting the PCP bits to 6 (Internetwork Control), I see ARP replies.
    But 'normal' traffic has a PCP of 0. The reply should have PCP 0 also.
    This makes it more or less unworkable: all network control traffic should have PCP 6 and regular traffic should have PCP 0.

    I tested this by starting a ping.
    Since the firewall doesn't know the MAC address of the switch on the other end, it sends an ARP request with PCP 6.
    The switch replies with an ARP reply having PCP 6, but since the firewall has PCP 0 on the vlan, the reply is dropped.
    During the ping I changed the vlan PCP to 6. With that the ARP reply is accepted and the ping starts running, with PCP 6.

    Then I stopped the ping and started an SSH session from the firewall console to the switch.
    This doesn't work, because the SSH session is initiated with PCP 0 from the firewall while the vlan PCP is 6, as I had just configured it for the ARP.
    After setting PCP to 0 on the vlan, SSH works, but only until the ARP cache times out. Then the ARP requests start again, demanding PCP 6.
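
    The sequence above can be replayed with a toy model in Python. This is only a sketch of the behaviour as observed from the outside, not the actual pfSense or FreeBSD code: it assumes an incoming frame is accepted only when its PCP equals the vlanpcp configured on the vlan interface, and that the switch answers SSH with PCP 0. The function names are made up for the illustration.

```python
def strict_pcp_accepts(vlan_pcp: int, frame_pcp: int) -> bool:
    """Observed behaviour (assumption): the frame PCP must equal the vlan's vlanpcp."""
    return frame_pcp == vlan_pcp


def vid_only_accepts(vlan_pcp: int, frame_pcp: int) -> bool:
    """Expected behaviour: once the VLAN ID matches, the PCP does not affect acceptance."""
    return True


# (description, vlanpcp configured on the firewall, PCP of the incoming frame)
STEPS = [
    ("ARP reply while vlanpcp is 0", 0, 6),
    ("ARP reply after setting vlanpcp 6", 6, 6),
    ("SSH reply while vlanpcp is still 6", 6, 0),  # assumes the switch replies with PCP 0
    ("SSH reply after setting vlanpcp 0", 0, 0),
]

if __name__ == "__main__":
    for label, vlan_pcp, frame_pcp in STEPS:
        observed = "accepted" if strict_pcp_accepts(vlan_pcp, frame_pcp) else "dropped"
        expected = "accepted" if vid_only_accepts(vlan_pcp, frame_pcp) else "dropped"
        print(f"{label}: observed {observed}, expected {expected}")
```

    With a single vlanpcp value per interface, no setting lets both the PCP 6 ARP replies and the PCP 0 data traffic through under the strict rule.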

    Currently I have a workaround: a switch with an ACL that resets PCP to 0 for all traffic.

    This is an unworkable situation.

    Sometimes your own knowledge gets in your way.
    Being a network engineer for a carrier Ethernet vendor, I expected the regular behavior: the PCP bits have no bearing on whether traffic is accepted.
    I must agree with epek that the current implementation is a bug, or at least a misinterpretation of the .1q standard.

    The PCP bits tell the switch that the traffic has a certain level of priority. Traffic with a high PCP value should take precedence over traffic with lower values.
    Depending on the PCP bits, traffic can be directed to a specific queue.
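
    To make the relation between the VLAN ID and the PCP bits concrete: both live in the same 16-bit Tag Control Information field of the 802.1Q header (3 bits PCP, 1 bit DEI, 12 bits VLAN ID). The Python sketch below decodes that field from a raw Ethernet frame. The function name parse_dot1q and the hand-built sample frame are illustrative only; the MAC addresses are borrowed from the captures earlier in this thread.

```python
import struct


def parse_dot1q(frame: bytes):
    """Return (pcp, dei, vid, inner_ethertype) for an 802.1Q-tagged frame, else None."""
    if len(frame) < 18:
        return None
    # Bytes 12..17: TPID, Tag Control Information, encapsulated EtherType.
    tpid, tci, inner_ethertype = struct.unpack("!HHH", frame[12:18])
    if tpid != 0x8100:
        return None
    pcp = tci >> 13          # top 3 bits: priority code point
    dei = (tci >> 12) & 0x1  # next bit: drop eligible indicator
    vid = tci & 0x0FFF       # low 12 bits: VLAN ID
    return pcp, dei, vid, inner_ethertype


# Hand-built header: VLAN 6, PCP 7, carrying PPPoE discovery (0x8863).
sample_frame = (
    bytes.fromhex("003018a2bd13")    # destination MAC
    + bytes.fromhex("00901aa4604d")  # source MAC
    + struct.pack("!HHH", 0x8100, (7 << 13) | 6, 0x8863)
)

if __name__ == "__main__":
    print(parse_dot1q(sample_frame))
```

    Selecting the vlan interface only needs the vid value; pcp is meant as a hint for queueing.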

    @ndre



  • @avink:

    After reading all that I've done and again carefully reading epek's answer I must admit he was right.

    I have to admit, I would have preferred not to be right.
    Your observations support my assumption that something in these PCP patches has gone awfully wrong.
    It somehow reminds me of the ECN problems of the late 90s and early 2000s.

    @avink:

    Setting the PCP bits to 6 (Internetwork Control), I see ARP replies.
    But 'normal' traffic has a PCP of 0. The reply should have PCP 0 also.

    Some providers - as far as I have read about it - have deliberately chosen to set priority tags for their IPTV services.
    As long as this is also separated by vlans, it should not matter. Just set the vlanpcp for just the vlan interface in question.
    Other traffic should arrive on the other vlan, and may stay pcp-tagged zero.

    In my scenario, I would have to do some bridging between an untagged port on openwrt and the vlan on wan.  :-/

    @avink:

    Currently I have a workaround: a switch with an ACL that resets PCP to 0 for all traffic.

    I tried this too, but was unsuccessful. - Cisco SLM200-8T

    @avink:

    This is an unworkable situation.

    I absolutely agree.

    @avink:

    Sometimes your own knowledge gets in your way.
    Being a network engineer for a carrier Ethernet vendor, I expected the regular behavior: the PCP bits have no bearing on whether traffic is accepted.
    I must agree with epek that the current implementation is a bug, or at least a misinterpretation of the .1q standard.

    But not only here… See OpenWrt: why do untagged ports send PCP 1 instead of 0?
    I fear that these problems will arise as soon as more OSs start supporting PCP instead of ignoring it.

    @avink:

    The PCP bits tell the switch that the traffic has a certain level of priority. Traffic with a high PCP value should take precedence over traffic with lower values.
    Depending on the PCP bits, traffic can be directed to a specific queue.

    While tags and meaning differ in case of '0' and '1' in respect of 802.1q/p …
    pfSense has rewrite functionality built into the web interface, but it doesn't work. (The values are always identical for in and out after the settings have been saved.) I guess that a newly introduced default pf rule is the culprit, not the patch itself. Being a newbie to pfSense and FreeBSD, I have not yet figured it out.

    Epek

    P.S. Thanks Andre for filing this bug report: http://redmine.pfsense.org/issues/2613



  • I have not received an answer of any kind from a developer yet.

    Bump



  • It's very quiet indeed.
    Still waiting for a solution.



  • Have you tried latest 2.1-BETA build?

    The ticket has been marked as having been fixed.



  • When I checked tonight (CET) there wasn't a new build yet.
    Let me check.

    It still says I'm on the latest version:

    Version 2.1-BETA0 (amd64)
    built on Mon Aug 27 14:57:37 EDT 2012
    FreeBSD 8.3-RELEASE-p4

    You are on the latest version.



  • Should be a new version out there for you.



  • No, unfortunately still no new version.

    2.1-BETA0 (amd64)
    built on Mon Aug 27 14:57:37 EDT 2012
    FreeBSD 8.3-RELEASE-p4

    You are on the latest version.
    This was on Friday 08:00 CET

    Still waiting….:(



  • The 32-bit nanobsd has had regular builds the last few days:

    Current version: 2.1-BETA0
      NanoBSD Size : 2g
           Built On: Thu Aug 30 02:36:04 EDT 2012
        New version: Thu Aug 30 06:26:55 EDT 2012
    

    So maybe something is going wrong building 64-bit?



  • Must be the embedded, full is there.

    2.1-BETA0 (amd64) 
    built on Thu Aug 30 06:54:02 EDT 2012 
    FreeBSD 8.3-RELEASE-p4
    


  • Then something must be wrong on my side.
    I still get that I'm on the latest version…. let's do the update by hand.

    Updated without problems.
    This afternoon I will test the PCP issue.



  • I can confirm that the issue has been fixed.
    I will do more elaborate testing this weekend with different PCP's and PCP combinations.
    For now it looks promising.

    @ndre



  • Confirmed. '2.1-BETA0 (amd64) built on Fri Aug 31 11:22:13 EDT 2012' seems to work.
    Thanks to everyone involved!
    What exactly went wrong?

    Update: the web interface for special rules for 802.1p still does not save different values. So shifting pcp on incoming/outgoing packets is still unsupported (through the gui).



  • Have you tried turning off VLAN_HWTAGGING ???

    I have an Atom based machine with 2 Intel NICs and vlans don't work until I turn this feature off.

    I had to put this in a cronjob:

    ifconfig em0 | grep -q VLAN_HWTAG && ifconfig em0 -vlanhwtag  
    ifconfig em1 | grep -q VLAN_HWTAG && ifconfig em1 -vlanhwtag  
    

    Worth a try….
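
    The two cron lines test for the VLAN_HWTAG substring in the ifconfig output before switching the feature off. As a rough Python sketch of the same check: the helper names below are hypothetical, and the sample text is abbreviated from the ifconfig output posted earlier in this thread.

```python
import re


def interface_options(ifconfig_text: str) -> set:
    """Return the upper-cased names listed in the options=...<...> part of ifconfig output."""
    match = re.search(r"options=[0-9a-fA-F]+\s*<([^>]*)>", ifconfig_text)
    if match is None:
        return set()
    return {name.strip().upper() for name in match.group(1).split(",") if name.strip()}


def hwtagging_enabled(ifconfig_text: str) -> bool:
    """True when VLAN_HWTAGGING is listed among the interface options."""
    return "VLAN_HWTAGGING" in interface_options(ifconfig_text)


def disable_command(interface: str) -> str:
    """Command line used in the cron job above to turn hardware tagging off."""
    return f"ifconfig {interface} -vlanhwtag"


# Abbreviated sample based on the em0 output posted earlier in the thread.
sample = (
    "em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500\n"
    "\toptions=5219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,"
    "WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO>\n"
)

if __name__ == "__main__":
    if hwtagging_enabled(sample):
        print(disable_command("em0"))
```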



  • @frater:

    Have you tried turning off VLAN_HWTAGGING ???
    I have an Atom based machine with 2 Intel NICs and vlans don't work until I turn this feature off.

    I had to put this in a cronjob:

    ifconfig em0 | grep -q VLAN_HWTAG && ifconfig em0 -vlanhwtag  
    ifconfig em1 | grep -q VLAN_HWTAG && ifconfig em1 -vlanhwtag  
    

    This is an issue that should be investigated …

    What is the exact model of your Intel NIC (output of dmesg) and your mainboard ?



  • @dhatz:

    This is an issue that should be investigated …

    What is the exact model of your Intel NIC (output of dmesg) and your mainboard ?

    I brought it up before ( http://forum.pfsense.org/index.php/topic,52224.0.html ) and made a bug report asking for an option to turn off hardware vlan tagging.

    http://redmine.pfsense.org/issues/2577

    I would welcome some follow-up, but don't want to hijack this thread.



  • Well, this is the dmesg output for my NICs:

    em0: <Intel(R) PRO/1000 Network Connection 7.3.2> port 0xcc00-0xcc1f mem 0xfe7e0000-0xfe7fffff,0xfe7dc000-0xfe7dffff irq 18 at device 0.0 on pci3
    em0: Using MSIX interrupts with 3 vectors
    em0: [ITHREAD]
    em0: [ITHREAD]
    em0: [ITHREAD]
    pcib4: <ACPI PCI-PCI bridge> irq 19 at device 28.3 on pci0
    pci4: <ACPI PCI bus> on pcib4
    em1: <Intel(R) PRO/1000 Network Connection 7.3.2> port 0xdc00-0xdc1f mem 0xfe8e0000-0xfe8fffff,0xfe8dc000-0xfe8dffff irq 19 at device 0.0 on pci4
    em1: Using MSIX interrupts with 3 vectors
    em1: [ITHREAD]
    em1: [ITHREAD]
    em1: [ITHREAD]
    em2: <Intel(R) PRO/1000 Legacy Network Connection 1.0.4> port 0xec00-0xec3f mem 0xfebe0000-0xfebfffff,0xfebc0000-0xfebdffff irq 18 at device 4.0 on pci6
    em2: [FILTER]
    em3: <Intel(R) PRO/1000 Legacy Network Connection 1.0.4> port 0xe880-0xe8bf mem 0xfeb80000-0xfeb9ffff,0xfeb60000-0xfeb7ffff irq 19 at device 6.0 on pci6
    em3: [FILTER]
    em4: <Intel(R) PRO/1000 Legacy Network Connection 1.0.4> port 0xe800-0xe83f mem 0xfeb20000-0xfeb3ffff,0xfeb00000-0xfeb1ffff irq 16 at device 7.0 on pci6

    em0 and em1 are the on-board nics. em2..5 are the nics on the expansion board.

    VLAN_HWTAG is off by default. I checked again to be sure.

    @ndre



  • @avink:

    VLAN_HWTAG is off by default. I checked again to be sure.

    @ndre

    In this thread you gave the output of your ifconfig and it clearly shows you have VLAN_HWTAGGING enabled.
    How did you disable it, then?
    There's no option inside pfsense to do this.
    That's why I made the feature request which was rejected.

    
    [2.1-BETA0][admin@firewall]/root(9): ifconfig -a
    em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=5219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO>
            ether 00:30:18:a2:bd:13
            inet6 fe80::230:18ff:fea2:bd13%em0 prefixlen 64 scopeid 0x6
            nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
    em0_vlan4: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
            options=103<RXCSUM,TXCSUM,TSO4>
            ether 00:30:18:a2:bd:13
            inet6 fe80::76f0:6dff:fe80:9448%em0_vlan4 prefixlen 64 scopeid 0x13
            nd6 options=1<PERFORMNUD>
            media: Ethernet autoselect (1000baseT <full-duplex>)
            status: active
            vlan: 4 vlanpcp: 0 parent interface: em0
    
