ICMP spikes after 23.05 upgrade
-
@johnpoz said in ICMP spikes after 23.05 upgrade:
@stephenw10 is he doing shaping? Traffic on the link could for sure cause fluctuations that is for sure..
Hi, if that question was for me - no, it is not related to shaping. It seems like some sort of driver issue on the XG7100. It probably started with 23.01. It does not happen on an APU or on a build based on an Intel NUC (Realtek NIC).
-
@GeorgeCZ58
You will always get a small change (less than 1ms in your case) in ping time across the internet, post-upgrade. This is normal and to be expected. This has nothing to do with the pfSense update itself; it is just due to the reconnection to your ISP triggered by the reboot. The reconnection is invariably a different pathway than before (BGP, hops, load balancing, contention, IPv6 vs 4 etc). The physical routing differences in distance/time will always change the inherent latency.
Your example is particularly mild, with just a ~1ms change on the new connection path. This is excellent but it is not always the case.
My link should, theoretically, run a ping at 5.8ms to the first external hop. Wonders being what they are, on rare occasions it does exactly that. A more typical figure is around 7ms:
However, due to major fibre infrastructure changes going on, a reconnect can select a new pathway as slow as 25ms. In practice I just reconnect again until I can get a first external hop down in single digits (as that helps with precision clock sync with some external equipment).
The graph below shows an increase in ping on reconnection from 7.2ms to 9.7ms, as a working example. It also covers another significant point - where to test to.
If you are looking at your ISP connection or pfSense you should only be looking at your first external hop - in the cases above and below you should only look at or only ping to Hop 2. There is nothing you can do to influence anything beyond that point:
Your connection-to-connection variance looks remarkably good, and any reboot or reconnection will inevitably change the pathway and therefore the ping.
In sum, everything is just fine for you (but less so for me).
-
@RobbieTT but this is an issue that occurs only after the upgrade to 23.05. It is not related only to the WAN side, but also to the LAN side (routing between VLANs). I tried rebooting the Netgates with no resolution. And for me it is not fine; I want to have the same results as in 22.05.
-
As I understand it, the concern here is not the ~1ms bump but the 20-30ms spikes shown by Zabbix?
-
@GeorgeCZ58 said in ICMP spikes after 23.05 upgrade:
@RobbieTT but this is an issue that occurs only after the upgrade to 23.05. It is not related only to the WAN side, but also to the LAN side (routing between VLANs).
You posted about your pings changing to both 1.1.1.1 and 8.8.8.8 - these are WAN addresses. My reply was to the WAN issue you raised:
Also it is visible on pfSense monitoring; here is a graph of ping to 8.8.8.8 before and after the update.
Any idea how to improve pings to get the same pings as on 22.05?
Your graph shows a sub-1ms increase to 8.8.8.8 (a WAN target) post-upgrade & post-reconnection. That change shown is tiny; you actually had slightly greater variance before the upgrade.
The line you point at is the standard deviation, so it will be influenced by any spikes in the ICMP ping - it's just maths. ICMP pings have the lowest priority across any network, so spikes are expected. If any router along the pathway is busy with packets the humble ICMP ping is going to be the first to get dropped.
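To illustrate the maths, here is a minimal Python sketch. The round-trip times are made-up sample values, not readings from the graphs in this thread; the point is only that a single spike barely moves the average but inflates the standard deviation by a large factor.

```python
from statistics import mean, pstdev

# Hypothetical round-trip times in ms (illustrative values only).
steady = [7.0, 7.1, 6.9, 7.0, 7.2, 7.0, 6.8, 7.0, 7.1, 6.9]

# The same series with the last sample replaced by one 27 ms spike.
spiky = steady[:-1] + [27.0]

for label, rtts in (("steady", steady), ("one spike", spiky)):
    # pstdev is the population standard deviation of the samples.
    print(f"{label}: mean={mean(rtts):.2f} ms, stddev={pstdev(rtts):.2f} ms")
```

One outlier in ten samples is enough to dominate the standard deviation line, even though nine of the ten pings are unchanged.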
In no way can ICMP be used to represent or infer what happens to regular (non-ICMP) traffic. Indeed, many sites do not allow ICMP traffic at all or limit it to the lowest ping size possible.
-
@stephenw10 exactly, the spikes. They weren't here before; they were at least 2x lower. As I wrote before - I don't feel that network issues are growing out of it yet. But it is completely strange that it happens. I prefer to have everything like it was before.
-
For information - I upgraded to 23.05.1; unfortunately this did not resolve the issue with spikes. It looks entirely like a problem with some NIC drivers in 23.x in the case of the XG7100. Is there somebody with an XG7100 who upgraded and can check whether something changed for them too?
-
Seems like no development in this topic. I was chatting with stephenw10 - they are aware of this issue. I just applied the latest system patches; it didn't bring a fix. Without a fix for this I am not sure whether I should upgrade a bunch of our XG7100s at remote sites.
-
There have been some commits to ixgbe since 23.05.1. 23.09-dev should become public soon and you will be able to test that.
Nothing in the commit logs looks to address this specifically but it still may have been affected by one of them.
Steve
-
@stephenw10 hello, I am just waiting for 23.09 to become official. Haven't you had time to test in your lab environment?
-
@GeorgeCZ58
The dev loads have been very stable so it is probably worth trying 23.09 on one of your units to see if your issue has changed.
-
Just tested 23.09 and it didn't resolve the issue. In the chat stephenw10 wrote this: "open bug report on our internal system (#10923) which we are using to discuss this."
So the Netgate team knows about the issue now and I hope it will be resolved asap.
-
Yup we are looking at this. Testing time is limited but I'm confident we will see this when we can test it.
-
@stephenw10 but is it only me who is complaining about this issue with XG7100 units? I need to get some result, as I need to upgrade our units. Can I pay to speed up resolving this issue?
-
@GeorgeCZ58 said in ICMP spikes after 23.05 upgrade:
@stephenw10 but it is only me
Perhaps the use of the Zabbix package is a discriminator here. It's not something I have used and probably the majority of pfSense users do not use it either.
If you could demonstrate the issue without an additional package it may be that others can replicate it. It would also rule out Zabbix itself.
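As a rough idea of what a package-free test could look like, here is a small Python sketch, assuming Python 3.9+ on any host behind the firewall. The function names and the 5x-median threshold are my own arbitrary choices for illustration, not anything from pfSense or Zabbix. It parses the `time=... ms` fields that the standard `ping` command prints and lists the samples that stand out.

```python
import re
import subprocess
from statistics import median


def parse_rtts(ping_output: str) -> list[float]:
    """Extract round-trip times in ms from the 'time=... ms' fields."""
    return [float(t) for t in re.findall(r"time=([\d.]+)\s*ms", ping_output)]


def find_spikes(rtts: list[float], factor: float = 5.0) -> list[float]:
    """Return samples more than `factor` times the median (arbitrary threshold)."""
    if not rtts:
        return []
    baseline = median(rtts)
    return [r for r in rtts if r > factor * baseline]


def run_ping(target: str, count: int = 60) -> str:
    """Run the system ping (-c sets the packet count) and return its text output."""
    result = subprocess.run(
        ["ping", "-c", str(count), target],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


# Example use (sends real pings, so it is left commented out):
# print(find_spikes(parse_rtts(run_ping("8.8.8.8"))))
```

Pointing it at a LAN-side address on another VLAN as well as a WAN target would show whether the spikes appear on both sides without Zabbix in the picture.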
-
No, there are quite a few users seeing this. I suspect there are more still who are hitting it but just not noticing because it doesn't affect most use cases significantly. I was only able to see it here by looking for it specifically.
-
It only affects the 7100 as far as I know at this point. Which points pretty squarely at the switch driver IMO.
-
@stephenw10 said in ICMP spikes after 23.05 upgrade:
It only affects the 7100 as far as I know at this point. Which points pretty squarely at the switch driver IMO.
And this driver was changed in the 23.x release, yeah? Is there a possibility to replace the driver manually?
-
It was not significantly changed. Additionally I have tested with the switch driver disabled entirely and it still shows the issue on the SFP ports so it's more subtle than just a bad driver update.