Netgate Discussion Forum

    Is anyone else seeing lots of Oerrs on a PPPoE ISP connection running over VLAN 911?

    General pfSense Questions
    • stephenw10 Netgate Administrator

      Hmm, could be a clue there. 🤔

      • ChrisJenk @stephenw10

        @stephenw10 I've been doing more experiments and have some more info on all of this.

        1. With the new driver it seems that heavy traffic triggers an increase in the number of reported errors. For example, an HTTP or iperf3 speed test that pushes the link close to the limit (though the speeds are still very good).

        2. I'm now running with the old driver (as an experiment for comparison purposes). I saw a small number of errors on the VLAN only just after the router had restarted and some small increase in those over time, and no errors at all on the pppoe0 interface, even after several hours of running including many speed tests.

        After boot

        Name       Mtu Network                  Address                                   Ipkts Ierrs Idrop    Opkts Oerrs  Coll
        igc0      1500 <Link#1>                 XX:XX:77:7f:c9:d6                       4804079     0     0  5717254     0     0
        ...
        igc0.911  1500 <Link#15>                XX:XX:77:7f:c9:d6                       4804079     0     0  5717254   820     0
        igc0.911     - fe80::%igc0.911/64       fe80::XXXX:77ff:fe7f:c9d6%igc0.911            0     -     -        0     -     -
        ...
        pppoe0    1492 <Link#17>                pppoe0                                  4803848     0     0  5717807     0     0
        pppoe0       - XXX.69.48.XXX/32         abcdef.com                                15982     -     -        7     -     -
        pppoe0       - fe80::%pppoe0/64         fe80::XXXX:77ff:fe7f:c9d6%pppoe0           1386     -     -     1391     -     -
        pppoe0       - abcdef.com               abcdef.com                                 1232     -     -     3075     -     -
        pppoe0       - 2XX2:XXX:62fb::123/128   2XX2:XXX:62fb::123                          501     -     -        1     -     -
        pppoe0       - fe80::%pppoe0/64         fe80::XXXX:77ff:fe7f:c9d9%pppoe0              0     -     -        0     -     -
        pppoe0       - 2XX2:XXX:feed:62fb::/64  2XX2:XXX:feed:62fb:92ec:77ff:fe7f:c9d6     1848     -     -        0     -     -
        ...
        

        An hour, and several speed tests, later

        Name       Mtu Network                  Address                                    Ipkts Ierrs Idrop     Opkts Oerrs  Coll
        igc0      1500 <Link#1>                 XX:XX:77:7f:c9:d6                       15059231     0     0  18533104     0     0
        ...
        igc0.911  1500 <Link#15>                XX:XX:77:7f:c9:d6                       15059231     0     0  18533104   820     0
        igc0.911     - fe80::%igc0.911/64       fe80::XXXX:77ff:fe7f:c9d6%igc0.911             0     -     -         0     -     -
        ...
        pppoe0    1492 <Link#17>                pppoe0                                  15058839     0     0  18533476     0     0
        pppoe0       - XXX.69.48.XXX/32         abcdef.com                                111491     -     -        28     -     -
        pppoe0       - fe80::%pppoe0/64         fe80::XXXX:77ff:fe7f:c9d6%pppoe0            8527     -     -      8535     -     -
        pppoe0       - abcdef.com               abcdef.com                                  4958     -     -     12415     -     -
        pppoe0       - 2XX2:XXX:62fb::123/128   2XX2:XXX:62fb::123                          3912     -     -         1     -     -
        pppoe0       - fe80::%pppoe0/64         fe80::XXXX:77ff:fe7f:c9d9%pppoe0               0     -     -         0     -     -
        pppoe0       - 2XX2:XXX:feed:62fb::/64  2XX2:XXX:feed:62fb:92ec:77ff:fe7f:c9d6      8376     -     -         0     -     -
        ...
        

        Several hours later

        Name       Mtu Network                  Address                                    Ipkts Ierrs Idrop     Opkts Oerrs  Coll
        igc0      1500 <Link#1>                 XX:XX:77:7f:c9:d6                       57841186     0     0  68454730     0     0
        igc0         - fe80::%igc0/64           fe80::XXXX:77ff:fe7f:c9d6%igc0                 0     -     -         1     -     -
        ...
        igc0.911  1500 <Link#15>                XX:XX:77:7f:c9:d6                       57841186     0     0  68454730  1590     0
        igc0.911     - fe80::%igc0.911/64       fe80::XXXX:77ff:fe7f:c9d6%igc0.911             0     -     -         0     -     -
        ...
        pppoe0    1492 <Link#17>                pppoe0                                  57840234     0     0  68454319     0     0
        pppoe0       - XXX.69.48.XXX/32         abcdef.com                                513438     -     -       250     -     -
        pppoe0       - fe80::%pppoe0/64         fe80::XXXX:77ff:fe7f:c9d6%pppoe0           39368     -     -     39387     -     -
        pppoe0       - abcdef.com               abcdef.com                                 21290     -     -     62223     -     -
        pppoe0       - 2XX2:XXX:62fb::123/128   2XX2:XXX:62fb::123                         18804     -     -         1     -     -
        pppoe0       - fe80::%pppoe0/64         fe80::XXXX:77ff:fe7f:c9d9%pppoe0               0     -     -         0     -     -
        pppoe0       - 2XX2:XXX:feed:62fb::/64  2XX2:XXX:feed:62fb:92ec:77ff:fe7f:c9d6     40841     -     -         0     -     -
        ...
        

        mac_stats for igc.0 show no errors of any kind. I'd be interested to know where the (very few) errors counted against the VLAN are coming from.

        It seems to me that maybe the new driver is not quite ready for prime time.
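For anyone wanting to track this over time, the comparison between two of the snapshots above can be done mechanically. This is only a minimal sketch: it assumes the text comes from `netstat -i` in the 10-column layout shown, the helper names `parse_link_rows` and `oerrs_delta` are made up for illustration, and the sample rows are copied from the "After boot" and "Several hours later" output with whitespace condensed.

```python
# Minimal sketch: compare Oerrs between two `netstat -i` snapshots.
# parse_link_rows and oerrs_delta are hypothetical helper names.

def parse_link_rows(text):
    """Return {interface: {"opkts": int, "oerrs": int}} for <Link#N> rows only."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Link-level rows have 10 columns and a <Link#N> network field.
        # Per-address rows (with "-" in the Mtu/Ierrs/Oerrs columns) are skipped.
        if len(fields) == 10 and fields[2].startswith("<Link#"):
            rows[fields[0]] = {"opkts": int(fields[7]), "oerrs": int(fields[8])}
    return rows


def oerrs_delta(before, after):
    """Increase in Oerrs for every interface present in both snapshots."""
    return {name: after[name]["oerrs"] - before[name]["oerrs"]
            for name in after if name in before}


after_boot = """\
Name      Mtu  Network    Address            Ipkts    Ierrs Idrop Opkts    Oerrs Coll
igc0      1500 <Link#1>   XX:XX:77:7f:c9:d6  4804079  0     0     5717254  0     0
igc0.911  1500 <Link#15>  XX:XX:77:7f:c9:d6  4804079  0     0     5717254  820   0
pppoe0    1492 <Link#17>  pppoe0             4803848  0     0     5717807  0     0
"""

hours_later = """\
Name      Mtu  Network    Address            Ipkts     Ierrs Idrop Opkts     Oerrs Coll
igc0      1500 <Link#1>   XX:XX:77:7f:c9:d6  57841186  0     0     68454730  0     0
igc0.911  1500 <Link#15>  XX:XX:77:7f:c9:d6  57841186  0     0     68454730  1590  0
pppoe0    1492 <Link#17>  pppoe0             57840234  0     0     68454319  0     0
"""

delta = oerrs_delta(parse_link_rows(after_boot), parse_link_rows(hours_later))
print(delta)  # {'igc0': 0, 'igc0.911': 770, 'pppoe0': 0}
```

With the posted numbers, only the VLAN interface moves (820 to 1590), which matches the "errors on the VLAN only" observation.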

        • stephenw10 Netgate Administrator

          Hmm, seeing a few errors when the interface comes up is not that unusual. Errors on the VLAN only is odd though. Especially as they are increasing after boot.

          Do you have hardware VLAN tagging enabled on igc0?
          If it is enabled, VLAN_HWTAGGING appears on the options line, not just on the capabilities line, like:

          [25.07.1-RELEASE][admin@6100.stevew.lan]/root: ifconfig -vm igc0
          igc0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1300
          	options=48020b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_MAGIC,HWSTATS,MEXTPG>
          	capabilities=4f43fbb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWTSO,NETMAP,RXCSUM_IPV6,TXCSUM_IPV6,HWSTATS,MEXTPG>
          

          Try disabling that and see if the errors stop:
          ifconfig igc0 -vlanhwtag

          • ChrisJenk @stephenw10

            @stephenw10 Yes it is enabled:

            igc0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
                    options=4e427bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,HWSTATS,MEXTPG>
                    capabilities=4f43fbb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWTSO,NETMAP,RXCSUM_IPV6,TXCSUM_IPV6,HWSTATS,MEXTPG>
                    ether XX:XX:77:7f:c9:d6
                    inet6 fe80::XXXX:77ff:fe7f:c9d6%igc0 prefixlen 64 scopeid 0x1
                    media: Ethernet autoselect (2500Base-T <full-duplex>)
                    status: active
                    supported media:
                            media autoselect
                            media 2500Base-T
                            media 1000baseT
                            media 1000baseT mediaopt full-duplex
                            media 100baseTX mediaopt full-duplex
                            media 100baseTX
                            media 10baseT/UTP mediaopt full-duplex
                            media 10baseT/UTP
                    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
                    drivername: igc0
            

            I've turned it off now:

            igc0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
            	options=4e427ab<RXCSUM,TXCSUM,VLAN_MTU,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,HWSTATS,MEXTPG>
            	capabilities=4f43fbb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWTSO,NETMAP,RXCSUM_IPV6,TXCSUM_IPV6,HWSTATS,MEXTPG>
            	ether XX:XX:77:7f:c9:d6
            	inet6 fe80::XXXX:77ff:fe7f:c9d6%igc0 prefixlen 64 scopeid 0x1
            	media: Ethernet autoselect (2500Base-T <full-duplex>)
            	status: active
            	supported media:
            		media autoselect
            		media 2500Base-T
            		media 1000baseT
            		media 1000baseT mediaopt full-duplex
            		media 100baseTX mediaopt full-duplex
            		media 100baseTX
            		media 10baseT/UTP mediaopt full-duplex
            		media 10baseT/UTP
            	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
            	drivername: igc0
            

            Is that likely to have any detrimental effect?
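The difference between the two outputs above is easy to miss by eye, since VLAN_HWTAGGING always stays in `capabilities=` and only disappears from `options=`. A minimal sketch of checking it programmatically follows. It assumes `ifconfig` output in the format shown, `enabled_options` is a hypothetical helper name, and the sample strings are trimmed copies of the two `options=` lines posted above.

```python
import re

# Minimal sketch: report which flags are enabled (options=), ignoring capabilities=.
# enabled_options is a hypothetical helper name.

def enabled_options(ifconfig_text):
    """Return the set of flags on the interface's options= line."""
    for line in ifconfig_text.splitlines():
        # Anchored at the start of the line, so "nd6 options=..." is not matched.
        match = re.match(r"\s*options=[0-9a-f]+<([^>]*)>", line)
        if match:
            return set(match.group(1).split(","))
    return set()


before = """\
igc0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
        options=4e427bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,HWSTATS,MEXTPG>
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
"""

after = """\
igc0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 1500
        options=4e427ab<RXCSUM,TXCSUM,VLAN_MTU,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,HWSTATS,MEXTPG>
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
"""

print("VLAN_HWTAGGING" in enabled_options(before))  # True
print("VLAN_HWTAGGING" in enabled_options(after))   # False
print(enabled_options(before) - enabled_options(after))  # {'VLAN_HWTAGGING'}
```

Note that VLAN_HWCSUM and VLAN_HWTSO remain enabled in both outputs; only the tagging offload was switched off.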

            • stephenw10 Netgate Administrator

              Potentially it might make the connection fractionally slower but I'd be surprised if you're able to detect it!

              • ChrisJenk @stephenw10

                @stephenw10 I'll post an update after it has been running like this for several hours (this is still with the old PPPoE driver). If this eliminates the errors then I could also try it with the new driver to see if it has any effect on that.

                • ChrisJenk @ChrisJenk

                  @stephenw10 Looking good so far; since turning off hardware VLAN tagging no further errors have accrued on the vlan interface, and still zero on the base interface and the pppoe interface, this being with the old PPPoE driver.

                  If the situation remains the same by the morning (UK time), is it worth me trying with the new PPPoE driver to see if this also eliminates the errors that it was reporting? I've set up a 'scriptcmd' to disable the hardware tagging on every reboot in case I forget.

                  • stephenw10 Netgate Administrator

                    Yes, definitely try the if_pppoe driver if you can. That would be odd if the pppoe packets are somehow triggering some issue there but at least consistent. And it does nicely tie in with the 'vlan not parent' behaviour. You might not be seeing it as much in the old driver simply because it's not pushing the NIC as hard.

                    • ChrisJenk @stephenw10

                      @stephenw10 So over a 12 hour period with the old driver and vlanhwtag turned off there were a total of 19 Oerrs reported against the VLAN device, with zero on the base interface and zero on the PPPoE interface. It's unclear why there were any errors on the VLAN device but at least the rate of increase is minuscule now.

                      This morning I switched back to the new PPPoE driver and ensured that vlanhwtag was still turned off (it was). After the reboot there were 357 Oerrs showing on the VLAN interface and 514 on the PPPoE interface - a difference of 157 (previously with the new driver the PPPoE error count was always 5 less than the VLAN error count).

                      After 30 minutes the Oerr count has increased to 4333 on the VLAN device and 4490 on the PPPoE device (still a difference of 157). So turning off hardware VLAN tagging hasn't resolved the fundamental issue when using the new driver. I'll leave it to run in this configuration for the rest of today at least.
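The two readings above can be worked through explicitly. This is just a sketch of the arithmetic: `summarize` is a hypothetical helper, and treating the first reading as time zero is an assumption, since the post only says "after the reboot".

```python
# Minimal sketch: check the VLAN/PPPoE offset and the growth rate of Oerrs.
# summarize is a hypothetical helper; t=0 for the first sample is an assumption.

def summarize(samples):
    """samples: list of (minutes, vlan_oerrs, pppoe_oerrs), oldest first."""
    offsets = [pppoe - vlan for _, vlan, pppoe in samples]
    (t0, vlan0, _), (t1, vlan1, _) = samples[0], samples[-1]
    rate_per_minute = (vlan1 - vlan0) / (t1 - t0)
    return offsets, rate_per_minute


samples = [(0, 357, 514), (30, 4333, 4490)]
offsets, rate = summarize(samples)
print(offsets)         # [157, 157]
print(round(rate, 1))  # 132.5
```

The constant offset suggests both counters are being incremented by the same events after boot, at roughly 130 errors per minute on this link during that half hour.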

                      I'm not sure where this leaves me; there is clearly some kind of issue with the new PPPoE driver. Maybe it is just mis-reporting errors (though why that also percolates down to the VLAN interface is a bit concerning) or maybe there truly is some kind of problem. If so it seems likely that it is a driver issue rather than an actual interface/cable/ONT issue.

                      So, should I stick with the new driver permanently or switch back to the old driver until these issues get resolved? Is there anything more we can do to diagnose this so that someone can fix the issue?

                      • stephenw10 Netgate Administrator

                        Well if it's not hurting the speeds and the total throughput is still higher I'd stick with the new driver. That will at least show any other issues you might have with your particular ISP for example.

                        Also it will be much easier to run tests against when we come up with some!

                        • ChrisJenk @stephenw10

                          @stephenw10 Yeah, that was kind of my thinking too. I'll stick with it for now despite the (seemingly) high error rate (0.1% if it could be believed).

                          Do let me know if anyone has any ideas for trying to home in on the cause of these 'errors'.
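For reference, an error rate like the one quoted is simply Oerrs divided by Opkts. The sketch below uses a hypothetical helper, `oerr_percent`, and applies it to the only Opkts/Oerrs pair posted in this thread (the old-driver VLAN counters from the "Several hours later" snapshot), since no Opkts figure was given for the new driver.

```python
# Minimal sketch: output errors as a percentage of output packets.
# oerr_percent is a hypothetical helper name.

def oerr_percent(oerrs, opkts):
    """Return Oerrs as a percentage of Opkts (0.0 when no packets were sent)."""
    if opkts == 0:
        return 0.0
    return 100.0 * oerrs / opkts


# Old driver, igc0.911, "Several hours later": 1590 Oerrs in 68454730 Opkts.
print(round(oerr_percent(1590, 68454730), 4))  # 0.0023
```

So the old-driver figure works out to roughly 0.002%, well below the 0.1% estimated for the new driver.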

                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.