Is anyone else seeing lots of Oerrs on a PPPoE ISP connection running over VLAN 911?
-
@stephenw10 I'll post an update after it has been running like this for several hours (this is still with the old PPPoE driver). If this eliminates the errors then I could also try it with the new driver to see if it has any effect on that.
-
@stephenw10 Looking good so far; since turning off hardware VLAN tagging no further errors have accrued on the VLAN interface, and there are still zero on the base interface and the PPPoE interface. This is with the old PPPoE driver.
If the situation remains the same by the morning (UK time), is it worth me trying with the new PPPoE driver to see if this also eliminates the errors that it was reporting? I've set up a 'scriptcmd' to disable the hardware tagging on every reboot in case I forget.
-
Yes, definitely try the if_pppoe driver if you can. That would be odd if the pppoe packets are somehow triggering some issue there but at least consistent. And it does nicely tie in with the 'vlan not parent' behaviour. You might not be seeing it as much in the old driver simply because it's not pushing the NIC as hard.
-
@stephenw10 So over a 12 hour period with the old driver and hwvlantag turned off there were a total of 19 Oerrs reported against the VLAN device, with zero on the base interface and zero on the PPPoE interface. It's unclear why there were any errors on the VLAN device but at least the rate of increase is minuscule now.
This morning I switched back to the new PPPoE driver and ensured that hwvlantag was still turned off (it was). After the reboot there were 357 Oerrs showing on the VLAN interface and 514 on the PPPoE interface - a difference of 157 (previously with the new driver the PPPoE error count was always 5 less than the VLAN error count).
After 30 minutes the Oerr count has increased to 4333 on the VLAN device and 4490 on the PPPoE device (still a difference of 157). So turning off hardware VLAN tagging hasn't resolved the fundamental issue when using the new driver. I'll leave it to run in this configuration for the rest of today at least.
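For anyone following along, here is the arithmetic behind those figures as a small Python sketch. It only uses the counter values quoted in this thread; the variable and function names are mine, and the "30 minutes" and "12 hours" intervals are approximate, so treat the rates as rough.

```python
# Counter readings quoted above (new if_pppoe driver, hardware VLAN tagging off).
after_reboot = {"vlan": 357, "pppoe": 514}
after_30_min = {"vlan": 4333, "pppoe": 4490}
INTERVAL_MIN = 30  # approximate interval between the two readings


def deltas(start, end):
    """Per-interface increase in Oerrs between two readings."""
    return {name: end[name] - start[name] for name in start}


growth = deltas(after_reboot, after_30_min)
offset_start = after_reboot["pppoe"] - after_reboot["vlan"]
offset_end = after_30_min["pppoe"] - after_30_min["vlan"]
new_driver_per_min = growth["vlan"] / INTERVAL_MIN

# Old driver, same hardware tagging setting: 19 Oerrs on the VLAN device in ~12 hours.
old_driver_per_min = 19 / (12 * 60)

print(growth)                         # {'vlan': 3976, 'pppoe': 3976}
print(offset_start, offset_end)       # 157 157
print(round(new_driver_per_min, 1))   # 132.5
print(round(old_driver_per_min, 3))   # 0.026
```

Both counters grew by exactly the same amount, which suggests the two interfaces are counting the same events rather than failing independently.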
I'm not sure where this leaves me; there is clearly some kind of issue with the new PPPoE driver. Maybe it is just mis-reporting errors (though why that also percolates down to the VLAN interface is a bit concerning) or maybe there truly is some kind of problem. If so, it seems likely that it is a driver issue rather than an actual interface/cable/ONT issue.
So, should I stick with the new driver permanently or switch back to the old driver until these issues get resolved? Is there anything more we can do to diagnose this so that someone can fix the issue?
-
Well if it's not hurting the speeds and the total throughput is still higher I'd stick with the new driver. That will at least show any other issues you might have with your particular ISP for example.
Also it will be much easier to run tests against when we come up with some!
-
@stephenw10 Yeah, that was kind of my thinking too. I'll stick with it for now despite the (seemingly) high error rate (0.1% if it could be believed).
Do let me know if anyone has any ideas for homing in on the cause of these 'errors'.
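In case it helps anyone checking their own figures, a rough Python sketch of how an error rate like that 0.1% can be worked out from one link-level row of `netstat -ibn`. It assumes the usual FreeBSD column order, where the last four fields are Opkts, Oerrs, Obytes and Coll; `oerr_rate` is a name I made up, and the sample row is invented for illustration, not copied from my system.

```python
def oerr_rate(netstat_line):
    """Return (name, opkts, oerrs, percent) for one <Link#N> row of `netstat -ibn`.

    Assumes the row ends with: Opkts Oerrs Obytes Coll. Fields are read from
    the right so that a blank Address column does not shift the indexes.
    """
    fields = netstat_line.split()
    name = fields[0]
    opkts = int(fields[-4])
    oerrs = int(fields[-3])
    percent = 100.0 * oerrs / opkts if opkts else 0.0
    return name, opkts, oerrs, percent


# Invented example row (hypothetical values, chosen to give a round number).
sample = "pppoe0 1492 <Link#9> pppoe0 1000000 0 0 900000000 500000 500 40000000 0"
name, opkts, oerrs, percent = oerr_rate(sample)
print(f"{name}: {oerrs} Oerrs in {opkts} packets = {percent:.2f}%")
```

Taking two readings some time apart and comparing the differences in Oerrs and Opkts, rather than the totals since boot, should give a better picture of the current rate.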