[BUG?] New PPPoE module (if_pppoe) causes high "Errors Out" on WAN (Vivo Fibra)
-
@stephenw10 Yes, those errors are being produced while the PPPoE connection is up and running.
-
OK try:
dtrace -n 'fbt::if_sendq_enqueue:return / arg1 != 0 / { stack(); printf("=> %d", arg1); }'
If that shows ENOBUFS (55) then it looks like it's exhausting the available buffers on your system for some reason. We can try just increasing it to allow for larger bursts.
-
@stephenw10 here is the output:
[2.8.0-RELEASE][admin@pfSense.localdomain]/root: dtrace -n 'fbt::if_sendq_enqueue:return / arg1 != 0 / { stack(); printf("=> %d", arg1); }'
dtrace: description 'fbt::if_sendq_enqueue:return ' matched 1 probe
CPU     ID                    FUNCTION:NAME
  1  65524          if_sendq_enqueue:return
              if_pppoe.ko`sppp_output+0x1d0
              kernel`pf_route+0x8e5
              kernel`pf_test+0x1093
              kernel`pf_check_out+0x2e
              kernel`pfil_mbuf_fwd+0x38
              kernel`ip_tryforward+0x288
              kernel`ip_input+0x34a
              kernel`swi_net+0x128
              kernel`ithread_loop+0x239
              kernel`fork_exit+0x7b
              kernel`0xffffffff812c1bae
=> 55
(The identical stack trace, ending in "=> 55", repeats for every further probe firing.)
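When the trace runs for a while, it can be easier to tally the return codes than to read every stack. The following is a minimal sketch, assuming the dtrace output has been saved as text; the function names and the small errno table are illustrative, with the numeric values taken from FreeBSD's sys/errno.h (they differ on other operating systems).

```python
import re
from collections import Counter

# Small subset of FreeBSD errno values (sys/errno.h); extend as needed.
FREEBSD_ERRNO = {
    12: "ENOMEM",
    40: "EMSGSIZE",
    50: "ENETDOWN",
    55: "ENOBUFS",
    64: "EHOSTDOWN",
    65: "EHOSTUNREACH",
}

def tally_return_codes(dtrace_output: str) -> Counter:
    """Count the '=> N' markers written by the printf() in the probe."""
    codes = (int(m) for m in re.findall(r"=>\s*(-?\d+)", dtrace_output))
    return Counter(codes)

def describe(counts: Counter) -> list[str]:
    """Render one 'code (NAME): count' line per return code, most frequent first."""
    return [
        f"{code} ({FREEBSD_ERRNO.get(code, 'unknown')}): {n}"
        for code, n in counts.most_common()
    ]

if __name__ == "__main__":
    # Hypothetical shortened sample; paste the real output here instead.
    sample = "kernel`fork_exit+0x7b => 55 kernel`fork_exit+0x7b => 55"
    for line in describe(tally_return_codes(sample)):
        print(line)
```

If every line comes back as 55 (ENOBUFS), as in the output above, the enqueue is being refused for lack of queue space rather than failing for some other reason.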
-
Ok great so it is exhausting the interface buffer.
Check the current buffer size:
sysctl net.link.ifqmaxlen
Try increasing it by 50%. It's a loader value so create the file /boot/loader.conf.local and add the line:
net.link.ifqmaxlen=192
Then reboot and see if that makes any difference.
-
@stephenw10 the current size is 128 - I will apply the change and reboot - might be tomorrow now. Why would this only be affecting the new if_pppoe code - how does it differ from the old code? Is it really an error if the buffer is full - doesn't it just wait until it can write the packet?
-
The if_pppoe code is very different so there could be any number of reasons.
Another interesting question might be why it's only affecting a small number of users. Something unique about your connection maybe.
-
@stephenw10 I increased net.link.ifqmaxlen to 192 and rebooted - checked this was set correctly but still getting WAN out errors - it doesn't seem to have made any difference.
-
Hmm, actually at the same rate? Not reduced at all?
-
@stephenw10 if anything higher:
In/out packets 946815/304707 (1.15 GiB/161.23 MiB)
In/out packets (pass) 946815/304707 (1.15 GiB/161.23 MiB)
In/out packets (block) 770/0 (69 KiB/0 B)
In/out errors 0/3250
Collisions 0
-
OK try setting that to something much larger like 1024. If it still throws errors we know it's not simply bursting up to the buffer.
You could also try checking the error type again with:
dtrace -n 'fbt::if_sendq_enqueue:return / arg1 != 0 / { stack(); printf("=> %d", arg1); }'
Make sure it's still throwing error 55 with the buffer at 192 packets.
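The reasoning behind trying a much larger value can be illustrated with a toy model. This is only a sketch of a generic tail-drop queue, not the if_pppoe or ALTQ code, and all traffic numbers are invented: a deeper queue absorbs short bursts, but if packets arrive faster than they drain for a sustained period, any finite queue eventually fills and keeps dropping.

```python
def simulate(arrivals, drain_per_tick, max_len):
    """Toy tail-drop FIFO. Returns the number of packets dropped.

    arrivals       -- packets offered in each tick (hypothetical traffic)
    drain_per_tick -- packets the link can send per tick
    max_len        -- queue limit in packets
    """
    queued = 0
    drops = 0
    for offered in arrivals:
        free = max_len - queued
        accepted = min(offered, free)
        drops += offered - accepted      # a full queue rejects, it does not wait
        queued += accepted
        queued = max(0, queued - drain_per_tick)
    return drops

# Bursty load: 150 packets at once, then 9 idle ticks (average 15 per tick).
bursty = ([150] + [0] * 9) * 10
# Sustained overload: 25 packets every tick against a drain of 20.
sustained = [25] * 1000

if __name__ == "__main__":
    for limit in (128, 192, 1024):
        print(
            f"limit={limit:5d}  "
            f"bursty drops={simulate(bursty, 20, limit):4d}  "
            f"sustained drops={simulate(sustained, 20, limit):5d}"
        )
```

In this model the bursty drops disappear once the limit exceeds the burst size, while the sustained case keeps dropping at every limit. Errors that persist at 1024 would therefore point away from simple bursting.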
-
@stephenw10 still throwing error 55 with buffer set to 192 - will try 1024 later.
-
@stephenw10 I increased the setting to 1024, rebooted and still getting loads of errors with code 55.
In/out packets 382831/240827 (445.27 MiB/265.41 MiB)
In/out packets (pass) 382831/240827 (445.27 MiB/265.41 MiB)
In/out packets (block) 1255/0 (168 KiB/0 B)
In/out errors 0/5242
Collisions 0
-
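To compare the two counter snapshots posted so far, it helps to express the errors as a share of outbound packets. This is a rough sketch only: it assumes the "out" packet counter does not already include the failed packets, and the two snapshots cover different uptimes and traffic, so the result is indicative at best.

```python
def out_error_pct(errors_out: int, packets_out: int) -> float:
    """Outbound errors as a percentage of outbound packets."""
    return 100.0 * errors_out / packets_out

# Figures copied from the interface statistics posted in this thread.
after_192 = out_error_pct(3250, 304707)    # ifqmaxlen=192
after_1024 = out_error_pct(5242, 240827)   # ifqmaxlen=1024

if __name__ == "__main__":
    print(f"ifqmaxlen=192:  {after_192:.2f}% of out packets errored")
    print(f"ifqmaxlen=1024: {after_1024:.2f}% of out packets errored")
```

On these numbers the error share is roughly 1% and 2% respectively, so raising the queue length has not reduced it.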
Ok, we'll dig deeper.
-
Would you have any VIPs on the WAN interface, by any chance?
-
There is a patch for that but the symptoms there are generally that the link continually reconnects.
But, separately, @brookheather are you using ALTQ traffic shaping on the PPPoE interface? It could be exhausting the queue on that.
-
@stephenw10 Yes I am using the CODELQ traffic shaper on both LAN and WAN. WAN is set to 73Mb/s (for 500/75 connection).
-
Aha, and what is the queue length set to on WAN? Try increasing it and see if that reduces the output errors.
-
@stephenw10 do you mean the Queue Limit on the WAN shaper? I don't have anything set for this.
-
Yes. It's usually set to 50 packets by default. You can check that in Status > Queues.
Try setting it to 100 and see if that changes anything.
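For a sense of scale, the queue limit can be converted into the time it takes to drain a full queue at the shaped rate. This is a back-of-the-envelope sketch: the 73 Mb/s rate and the 50 and 100 packet limits come from this thread, while the 1500-byte packet size is an assumption (real traffic is mixed, and the PPPoE MTU is typically a little smaller). CoDel also makes drop decisions based on how long packets have waited, so this is only the raw buffer depth, not a model of the scheduler.

```python
def queue_drain_ms(limit_packets: int, rate_mbps: float, packet_bytes: int = 1500) -> float:
    """Milliseconds needed to send a full queue at the given rate.

    packet_bytes defaults to 1500, which is an assumed full-size packet.
    """
    bits = limit_packets * packet_bytes * 8
    return bits / (rate_mbps * 1_000_000) * 1000

if __name__ == "__main__":
    for limit in (50, 100):
        print(f"{limit} packets at 73 Mb/s: {queue_drain_ms(limit, 73):.1f} ms")
```

Under these assumptions a 50-packet queue holds roughly 8 ms of full-size upload traffic and a 100-packet queue roughly 16 ms, so a brief upload burst can fill either one.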
-
@stephenw10 So I just deleted my WAN shaper (but retained the LAN shaper) and re-enabled if_pppoe - after a reboot I ran a number of speed tests and there were no WAN out errors so it seems to be an issue with having a WAN shaper in conjunction with if_pppoe enabled. I would prefer to have a WAN shaper enabled as it can help when I WFH and max out the 75Mb/s upload capacity during a grid compile. I will leave the config as is for the moment and see how I get on without the WAN shaper.