Performance regression 2.7.2 to 2.8
-
@stephenw10 Thank you!
Looks like it affects both.
before:
scrub on $WAN inet all max-mss 1452 fragment reassemble
scrub on $WAN_STF inet6 all max-mss 1432 fragment reassemble
after setting MSS:
scrub on $WAN inet all max-mss 1432 fragment reassemble
scrub on $WAN_STF inet6 all max-mss 1412 fragment reassemble
What file creates the scrubs that are applied?
-
Oh, OK. That looks correct then. Except it's not including the 6RD overhead. What we want to see there is:
scrub on $WAN inet all max-mss 1452 fragment reassemble
scrub on $WAN_STF inet6 all max-mss 1412 fragment reassemble
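For anyone following along, the arithmetic behind those two values can be sketched roughly as below. This assumes a PPPoE link MTU of 1492, which is not stated in the thread but is consistent with the 1452 figure; the function and constant names are made up for illustration and are not anything from pfSense.

```python
# Sketch of the MSS arithmetic, assuming a PPPoE MTU of 1492 (an assumption,
# not a value quoted in this thread). Names here are hypothetical.
IPV4_HEADER = 20  # bytes, IPv4 header without options
IPV6_HEADER = 40  # bytes, fixed IPv6 header
TCP_HEADER = 20   # bytes, TCP header without options


def max_mss(link_mtu: int, ip_header: int, tunnel_overhead: int = 0) -> int:
    """Largest TCP payload that fits in one packet on the link."""
    return link_mtu - tunnel_overhead - ip_header - TCP_HEADER


PPPOE_MTU = 1492  # assumed link MTU

# Plain IPv4 over PPPoE.
print(max_mss(PPPOE_MTU, IPV4_HEADER))
# IPv6 over 6RD: the outer IPv4 header is the tunnel overhead.
print(max_mss(PPPOE_MTU, IPV6_HEADER, tunnel_overhead=IPV4_HEADER))
# IPv6 with the 6RD overhead left out, matching the 1432 seen earlier.
print(max_mss(PPPOE_MTU, IPV6_HEADER))
```

Under that assumption the three results are 1452, 1412 and 1432, which is why 1432 on $WAN_STF points to the 6RD overhead being missed.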
Yeah OK let me dig into this. It's likely a simple patch....
-
Can you test a patch?
That should allow if_pppoe to work as expected. mpd5/netgraph is a different matter!
-
@stephenw10 Thanks for the reply!
That worked instantly.
scrub on $WAN inet all max-mss 1452 fragment reassemble
scrub on $WAN_STF inet6 all max-mss 1412 fragment reassemble
-
Nice
And traffic passing as expected?
I imagine one of our devs might have a better patch than that but it proves the issue.
-
@stephenw10
99% of FIN_WAIT_2 are gone.
WAN side seems OK.
The LAN side inconsistently gets NO_TRAFFIC:NO_TRAFFIC with 64:ff9b::7f00:1.
-
Hmm, that could be nothing if you're not seeing connection issues at the clients.
You could try setting a slightly lower MSS value and see if it changes anything.
-
ping6 -s56 64:ff9b::7f00:1
ping6 -s32 64:ff9b::7f00:1
Sometimes it works, sometimes it does not.
-
Hmm, just with different ping sizes?
MSS has no effect on pings, only TCP. So nothing should have changed there.
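As a rough sanity check on the sizes involved, the sketch below adds the fixed IPv6 header (40 bytes) and the ICMPv6 echo header (8 bytes) to the two `-s` payload sizes used above. The helper name is hypothetical, and link-layer framing is ignored.

```python
# Sketch: IPv6 packet size produced by ping6 -s<payload>.
# Helper name is hypothetical; link-layer framing is not counted.
IPV6_HEADER = 40        # bytes, fixed IPv6 header
ICMPV6_ECHO_HEADER = 8  # bytes, ICMPv6 echo request header
IPV6_MIN_MTU = 1280     # minimum MTU every IPv6 link must support


def ping6_packet_size(payload: int) -> int:
    """Total IPv6 packet size for an echo request with the given payload."""
    return IPV6_HEADER + ICMPV6_ECHO_HEADER + payload


for payload in (56, 32):
    size = ping6_packet_size(payload)
    print(f"ping6 -s{payload}: {size} bytes, fits in {IPV6_MIN_MTU}: {size <= IPV6_MIN_MTU}")
```

Both packets come out far below any MTU in play here, so fragmentation or MSS clamping would not be expected to explain the intermittent failures.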
-
@stephenw10
ping6 from the LAN side to pfSense itself and LAN to LAN.
I have only tested with small packets; however, so far size does not matter. Both fail sometimes:
ping6 -s56 64:ff9b::7f00:1
ping6 -s32 64:ff9b::7f00:1
Even the default address of 64:ff9b::c0a8:101 sometimes fails.
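For readers unfamiliar with these addresses: 64:ff9b::/96 is the well-known NAT64 prefix, and the IPv4 address sits in the low 32 bits. A minimal sketch of decoding them, using only Python's standard `ipaddress` module (the helper name is made up for illustration):

```python
import ipaddress

# Well-known NAT64 prefix (RFC 6052); the IPv4 address is in the last 32 bits.
NAT64_PREFIX = ipaddress.ip_network("64:ff9b::/96")


def embedded_ipv4(addr: str) -> ipaddress.IPv4Address:
    """Return the IPv4 address embedded in a 64:ff9b::/96 address."""
    v6 = ipaddress.IPv6Address(addr)
    if v6 not in NAT64_PREFIX:
        raise ValueError(f"{addr} is not inside {NAT64_PREFIX}")
    return ipaddress.IPv4Address(int(v6) & 0xFFFFFFFF)


for addr in ("64:ff9b::7f00:1", "64:ff9b::c0a8:101"):
    print(addr, "->", embedded_ipv4(addr))
```

So 64:ff9b::7f00:1 corresponds to 127.0.0.1 and 64:ff9b::c0a8:101 to 192.168.1.1.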
What I do not understand is why it comes and goes.
Setting the LAN side MTU/MSS to 1.4k, 1.5k or 9k changes nothing.
-
@fathead said in Performance regression 2.7.2 to 2.8:
ping6 from the LAN side to pfSense itself and LAN to LAN.
Hmm, well that would have nothing to do with the pppoe change on WAN. Something local blocking traffic?
-
@stephenw10 The only package installed is System_Patches, for that one patch, and all LAN firewall rules are pass except port 53.
-
And you only see this for ping6? Internal IPv4 traffic is unaffected?
-
I was having this issue as well and can confirm the diff has solved my 6rd issues in 2.8.0.
If you'd like, I have a pcap from when I was having issues that I can provide.
-
@stephenw10 Unable to reproduce with v4, and the link-local address, fe80::1:1, is always reachable.
It also affects Virtual IPs.
Is it expected behavior that CPU usage is high when a VIP, 10.0.0.1/32, is on WAN?
10.0.0.1/32 has been reassigned to LAN and the CPU can idle.
As supplementary information: if a ping6 64:ff9b::a00:3 is started, it fails. If pfSense is then restarted while the ping6 is left running, the ping6 succeeds for about 10 minutes once pfSense boots up.
Restarted pfSense 3 times testing VIPs. ping6 64:ff9b::7f00:1 can work if it is just one ping; if two or more LAN IPs ping6 at the same time, it does not work. This all may or may not be correct behavior if pfSense is seeing all pings from the same IP.
ping6: Warning: time of day goes back (-16520us), taking countermeasures
64 bytes from 64:ff9b::7f00:1: icmp_seq=269 ttl=64 time=0.000 ms (DUP!)
ping6: Warning: time of day goes back (-16536us), taking countermeasures
64 bytes from 64:ff9b::7f00:1: icmp_seq=270 ttl=64 time=0.000 ms (DUP!)
ping6: Warning: time of day goes back (-16529us), taking countermeasures
64 bytes from 64:ff9b::7f00:1: icmp_seq=271 ttl=64 time=0.000 ms (DUP!)
64 bytes from 64:ff9b::7f00:1: icmp_seq=770 ttl=64 time=0.207 ms
ping6: Warning: time of day goes back (-16541us), taking countermeasures
64 bytes from 64:ff9b::7f00:1: icmp_seq=272 ttl=64 time=0.000 ms (DUP!)
ping6: Warning: time of day goes back (-16471us), taking countermeasures
64 bytes from 64:ff9b::7f00:1: icmp_seq=273 ttl=64 time=0.000 ms (DUP!)
ping6: Warning: time of day goes back (-16543us), taking countermeasures
64 bytes from 64:ff9b::7f00:1: icmp_seq=274 ttl=64 time=0.000 ms (DUP!)
ping6: Warning: time of day goes back (-16523us), taking countermeasures
-
@fathead said in Performance regression 2.7.2 to 2.8:
Is it expected behavior that CPU usage is high when a VIP, 10.0.0.1/32, is on WAN?
No, not just that. It might if it's having to deal with a lot of traffic to that VIP that otherwise gets blocked.
Can I assume that applying that patch has not changed this new problem? Just that it too is new in 2.8?
-
@stephenw10
With or without patch mr1226.diff
No traffic on any VIPs and cpu is high.
kea-dhcp6 php-fpm.
kea-dhcp6 uses about 3% when the 10.0.0.1/32 IP Alias is set on WAN; with it set on LAN, kea-dhcp6 uses 0.00%.
-
"Block private networks and loopback addresses" is enabled on WAN; turning that off gives the same high CPU.
-
Oh this is on the PPPoE WAN?
That's a known issue: https://redmine.pfsense.org/issues/16235
Try the patch referenced there.
-
@stephenw10 Yes the PPPoE WAN.
Is this the patch?
This patch does fix the high CPU; however, when a VIP is set on WAN, it breaks the whole NAT 64:ff9b::/96 address space. Or is a reload/restart needed?