Performance regression 2.7.2 to 2.8
-
@stephenw10
MSS is set to 1472 on the only WAN interface; the only other interface is the LAN.
It was set on interfaces.php?if=wan. How does this not affect IPv4?
ifconfig shows wan_stf with mtu 1472.
How can I check the MSS? Are the pf scrub rules in a file, or only in running memory?
Also, does a doc exist on docs.netgate.com about ICMPv6 Packet Too Big being allowed by a default pass rule? Is creating a pass rule for this redundant? The kind of rule I mean is sketched below.
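(My guess at the pf syntax; icmp6-type 2 is Packet Too Big, and the $WAN macro is illustrative:)
pass in quick on $WAN inet6 proto ipv6-icmp icmp6-type 2 keep state
-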
Ah, OK, the 6RD tunnel is not exposed directly... Hmm. I don't use 6RD.
So, yes, applying that on the WAN will affect IPv4 traffic. And, more importantly, it may not apply to traffic inside the 6RD tunnel. Setting it on the LAN would, though.
The actual value required may vary but in the one other case I've seen it was 1472.
We are digging into this....
-
@fathead said in Performance regression 2.7.2 to 2.8:
How can I check the MSS? Are the pf scrub rules in a file, or only in running memory?
Look at the ruleset file: /tmp/rules.debug
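To compare against what pf actually has loaded, pfctl should show the scrub rules too (the grep patterns are just examples):
grep max-mss /tmp/rules.debug
pfctl -sr | grep scrub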
-
@stephenw10 Thank you!
Looks like it affects both.
before:
scrub on $WAN inet all max-mss 1452 fragment reassemble
scrub on $WAN_STF inet6 all max-mss 1432 fragment reassemble
after setting MSS:
scrub on $WAN inet all max-mss 1432 fragment reassemble
scrub on $WAN_STF inet6 all max-mss 1412 fragment reassemble
What file creates the scrub rules that are applied? -
Oh, OK. That looks correct then. Except it's not including the 6RD overhead. What we want to see there is:
scrub on $WAN inet all max-mss 1452 fragment reassemble
scrub on $WAN_STF inet6 all max-mss 1412 fragment reassemble
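(For reference, those values follow from the header overhead, assuming the 1492 byte PPPoE MTU that the 1452 figure implies:
1492 - 20 (IPv4 header) - 20 (TCP header) = 1452 for IPv4 on $WAN
1492 - 20 (6RD IPv4 encapsulation) - 40 (IPv6 header) - 20 (TCP header) = 1412 for IPv6 on $WAN_STF)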
Yeah OK let me dig into this. It's likely a simple patch....
-
Can you test a patch?
That should allow if_pppoe to work as expected. mpd5/netgraph is a different matter!
-
@stephenw10 Thanks for the reply!
That worked instantly.
scrub on $WAN inet all max-mss 1452 fragment reassemble
scrub on $WAN_STF inet6 all max-mss 1412 fragment reassemble
-
Nice
And traffic passing as expected?
I imagine one of our devs might have a better patch than that, but it proves the issue.
-
@stephenw10
99% of the FIN_WAIT_2 states are gone.
The WAN side seems OK.
The LAN side inconsistently gets NO_TRAFFIC:NO_TRAFFIC states with 64:ff9b::7f00:1 -
Hmm, that could be nothing if you're not seeing connection issues at the clients.
You could try setting a slightly lower MSS value and see if it changes anything.
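You can watch those states from a shell with pfctl, e.g.:
pfctl -ss | grep FIN_WAIT_2
pfctl -ss | grep NO_TRAFFIC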
-
ping6 -s56 64:ff9b::7f00:1
ping6 -s32 64:ff9b::7f00:1
Sometimes they work, sometimes they do not. -
Hmm, just with different ping sizes?
MSS has no effect on pings, only TCP. So nothing should have changed there.
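You can see where the clamped MSS actually lands by watching TCP SYN options with tcpdump. (The interface name igb1 here is just a placeholder, and the IPv6 filter assumes no extension headers, so the TCP flags byte sits at fixed offset 53:)
tcpdump -ni igb1 -v 'tcp[tcpflags] & tcp-syn != 0'
tcpdump -ni igb1 -v 'ip6 proto 6 and ip6[53] & 0x02 != 0'
ICMP echoes carry no MSS option, so the clamp can't change them.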
-
@stephenw10
ping6 from the LAN side to pfSense itself, and LAN to LAN.
I have only tested with small packets, but so far size does not matter. Both fail sometimes:
ping6 -s56 64:ff9b::7f00:1
ping6 -s32 64:ff9b::7f00:1
Even the default address of 64:ff9b::c0a8:101 sometimes fails.
What I do not understand is why it comes and goes.
Setting the LAN-side MTU/MSS to 1.4k, 1.5k, or 9k changes nothing. -
@fathead said in Performance regression 2.7.2 to 2.8:
ping6 from the LAN side to pfSense itself, and LAN to LAN.
Hmm, well that would have nothing to do with the PPPoE change on WAN. Something local blocking traffic?
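If pf is dropping it, you should see it on the log interface, e.g.:
tcpdump -ni pflog0 icmp6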
-
@stephenw10 The only package installed is System_Patches, for that one patch, and all LAN firewall rules are pass except port 53.
-
And you only see this for ping6? Internal IPv4 traffic is unaffected?
-
I was having this issue as well and can confirm the diff solved my 6RD issues in 2.8.0.
If you'd like, I can provide a pcap from when I was having the issues.
-
@stephenw10 Unable to reproduce with IPv4, and the link-local address fe80::1:1 is always reachable.
It also affects Virtual IPs.
Is it expected behavior that CPU usage is high when a VIP 10.0.0.1/32 is on WAN?
10.0.0.1/32 has been reassigned to the LAN and the CPU can idle.
For supplementary information: if a ping6 64:ff9b::a00:3 is started, it will fail. If pfSense is restarted while that ping6 keeps running undisturbed, the ping6 succeeds for about 10 minutes after pfSense boots up.
Restarted pfSense 3 times testing VIPs. ping6 64:ff9b::7f00:1 can work if it is just one ping; if two or more LAN IPs ping6 at the same time, it does not work. This all may or may not be correct behavior if pfSense is seeing all pings from the same IP.
ping6: Warning: time of day goes back (-16520us), taking countermeasures
64 bytes from 64:ff9b::7f00:1: icmp_seq=269 ttl=64 time=0.000 ms (DUP!)
ping6: Warning: time of day goes back (-16536us), taking countermeasures
64 bytes from 64:ff9b::7f00:1: icmp_seq=270 ttl=64 time=0.000 ms (DUP!)
ping6: Warning: time of day goes back (-16529us), taking countermeasures
64 bytes from 64:ff9b::7f00:1: icmp_seq=271 ttl=64 time=0.000 ms (DUP!)
64 bytes from 64:ff9b::7f00:1: icmp_seq=770 ttl=64 time=0.207 ms
ping6: Warning: time of day goes back (-16541us), taking countermeasures
64 bytes from 64:ff9b::7f00:1: icmp_seq=272 ttl=64 time=0.000 ms (DUP!)
ping6: Warning: time of day goes back (-16471us), taking countermeasures
64 bytes from 64:ff9b::7f00:1: icmp_seq=273 ttl=64 time=0.000 ms (DUP!)
ping6: Warning: time of day goes back (-16543us), taking countermeasures
64 bytes from 64:ff9b::7f00:1: icmp_seq=274 ttl=64 time=0.000 ms (DUP!)
ping6: Warning: time of day goes back (-16523us), taking countermeasures
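((DUP!) means ping6 received more than one reply for the same sequence number, so something may be answering twice for that address; if it helps, I can dump the IPv6 neighbor cache with:)
ndp -an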
-
@fathead said in Performance regression 2.7.2 to 2.8:
Is it expected behavior that CPU usage is high when a VIP 10.0.0.1/32 is on WAN?
No, not from that alone. It might be if it's having to deal with a lot of traffic to that VIP that would otherwise get blocked.
Can I assume that applying the patch has not changed this new problem? Just that it, too, is new in 2.8?