Netgate Discussion Forum

    ICMP spikes after 23.05 upgrade

    Firewalling
    45 Posts 5 Posters 6.2k Views
    • stephenw10 Netgate Administrator @johnpoz

      @johnpoz said in ICMP spikes after 23.05 upgrade:

      So you're in the same DC as Google?

      That's the std deviation. The actual ping times went from ~19ms to 20ms. The increase is 1ms at most from what I can see.

      • johnpoz LAYER 8 Global Moderator @stephenw10

        @stephenw10 so maybe the increase in packet size could account for that? But my point still stands: there is really nothing that the client (pfSense) could do to increase the time something takes to respond, other than sending a larger packet.

        As to the increase in the std dev, that is really a measure of the jitter, or how much the numbers fluctuate. Again, you're pinging across the public internet, something that takes 19ms RTT. There is going to be fluctuation; how could the client sending the pings have anything to do with that?

        You can see that even in local pings that are normally under 1ms:

        From 192.168.9.253: bytes=60 seq=0004 TTL=64 ID=9e18 time=0.495ms
        
        Packets: sent=4, rcvd=4, error=0, lost=0 (0.0% loss) in 1.504176 sec
        RTTs in ms: min/avg/max/dev: 0.402 / 0.456 / 0.498 / 0.041
        

        Now increase that size.

        From 192.168.9.253: bytes=540 seq=0004 TTL=64 ID=bd0e time=0.491ms
        
        Packets: sent=4, rcvd=4, error=0, lost=0 (0.0% loss) in 1.508622 sec
        RTTs in ms: min/avg/max/dev: 0.357 / 0.456 / 0.530 / 0.064
        

        Notice the std dev increased even though my avg is the same. Sending the 4 pings also took about 4 ms more in total.
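To make the comparison above concrete, here is a minimal sketch of how a min/avg/max/dev summary can come out with the same average but a larger deviation. The RTT values are hypothetical (they are not the samples behind the two outputs above), and the use of the population standard deviation is an assumption; the ping tool may compute its "dev" figure differently.

```python
import statistics

def summarize_rtts(rtts_ms):
    """Return (min, avg, max, dev) for a list of RTTs in milliseconds.

    Assumption: "dev" is taken to be the population standard deviation.
    """
    return (
        min(rtts_ms),
        statistics.fmean(rtts_ms),
        max(rtts_ms),
        statistics.pstdev(rtts_ms),
    )

# Hypothetical samples for illustration only.
small_packets = [0.40, 0.45, 0.47, 0.50]
large_packets = [0.36, 0.45, 0.48, 0.53]

for label, samples in (("small", small_packets), ("large", large_packets)):
    lo, avg, hi, dev = summarize_rtts(samples)
    print(f"{label}: min/avg/max/dev: {lo:.3f} / {avg:.3f} / {hi:.3f} / {dev:.3f}")
```

Both lists average the same, but the second one is spread further around that average, so its deviation is larger. With only four samples, a single slightly slower or faster reply is enough to move the figure.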

        An intelligent man is sometimes forced to be drunk to spend time with his fools
        If you get confused: Listen to the Music Play
        Please don't Chat/PM me for help, unless mod related
        SG-4860 24.11 | Lab VMs 2.8, 24.11

        • GeorgeCZ58 @stephenw10

          @stephenw10 yes, you are right. Small detail: I forgot that I am pinging 1.1.1.1, not the Google one, but I think it doesn't matter. The problem is that "something" has changed. The question is whether it is related only to the XG7100 on 23.05, or whether it is general behavior on 23.05 that pings have lower priority than on 22.05.

          One XG7100 has the original expansion card and one is without it (the basic version). So far I think all services are working properly; I think (and hope) that only ICMP is affected.

          • stephenw10 Netgate Administrator

            A ~1ms change like that shouldn't really affect anything. It's nothing like the sort of spikes you were seeing in Zabbix initially.

            • johnpoz LAYER 8 Global Moderator @stephenw10

              @stephenw10 what I am not understanding is how anything in pfSense could have to do with it pinging something across the public internet. It makes no sense: other than the size of the ping, pfSense has no control over how fast something goes across the public internet or how long the thing you're pinging takes to respond.

              pfSense knows when it put something on the wire and when the reply returned. It has no control over how long that something might take to respond, or over any possible delays across the multiple hops to get from A to B and back again.

              Nor does it have any control over what the jitter or std dev in a sample of pings might be. The only thing I can think of that could affect the std dev and is under pfSense's control when monitoring some remote IP is the sample size it uses to determine that std dev.
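A rough sketch of the sample-size point: the same single spike produces a different std dev depending on how many samples are in the window it is computed over. The series, the window sizes and the helper `rolling_pstdev` are all hypothetical; this is not how pfSense's gateway monitoring is known to compute its figure, only an illustration of the arithmetic.

```python
import statistics

def rolling_pstdev(rtts_ms, window):
    """Population std dev over each full sliding window of `window` samples."""
    return [
        statistics.pstdev(rtts_ms[i:i + window])
        for i in range(len(rtts_ms) - window + 1)
    ]

# Hypothetical series: a flat 19 ms RTT with one 40 ms spike in the middle.
series = [19.0] * 200
series[100] = 40.0

narrow = rolling_pstdev(series, 10)
wide = rolling_pstdev(series, 50)

for name, devs in (("window=10", narrow), ("window=50", wide)):
    affected = sum(d > 0 for d in devs)
    print(f"{name}: peak dev {max(devs):.2f} ms, windows affected: {affected}")
```

A short window shows a taller but briefer bump in the std dev; a long window dilutes the spike but keeps it visible for longer. Either way the spike itself is identical, which is why the sample size is the one knob the measuring side has.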

              • stephenw10 Netgate Administrator

                If it's doing any sort of shaping it could.

                We saw issues like that when reloading the ruleset was triggering a huge CPU load previously. But not continually.

                • johnpoz LAYER 8 Global Moderator @stephenw10

                  @stephenw10 is he doing shaping? Traffic on the link could certainly cause fluctuations.

                  • GeorgeCZ58 @johnpoz

                    @johnpoz said in ICMP spikes after 23.05 upgrade:

                    @stephenw10 is he doing shaping? Traffic on the link could certainly cause fluctuations.

                    Hi, if this question was for me: no, it is not related to shaping. It seems like some sort of driver issue on the XG7100. It probably started in 23.01. It does not happen on an APU or on a build from an Intel NUC (Realtek NIC).

                    • RobbieTT @GeorgeCZ58

                      @GeorgeCZ58
                      You will always get a small change (less than 1ms in your case) in ping time across the internet, post-upgrade. This is normal and to be expected.

                      This has nothing to do with the pfSense update itself; it is just due to the reconnection to your ISP triggered by the reboot. The reconnection is invariably a different pathway than before (BGP, hops, load balancing, contention, IPv6 vs 4 etc). The physical routing differences in distance/time will always change the inherent latency.

                      Your example is particularly mild, with just a ~1ms change on the new connection path. This is excellent but it is not always the case.

                      My link should, theoretically, run a ping at 5.8ms to the first external hop. Wonders being what they are, on rare occasions it does exactly that. A more typical figure is around 7ms:

                      Redacted Ping Plot 6.6 First Hop.png

                      However, due to major fibre infrastructure changes going on, a reconnect can select a new pathway as slow as 25ms. In practice I just reconnect again until I can get a first external hop down in single digits (as that helps with precision clock sync with some external equipment).

                      The graph below shows an increase in ping on reconnection from 7.2ms to 9.7ms, as a working example. It also covers another significant point - where to test to.

                      If you are looking at your ISP connection or pfSense you should only be looking at your first external hop - in the cases above and below you should only look at or only ping to Hop 2. There is nothing you can do to influence anything beyond that point:

                      Redacted Ping Plot 7.2 to 9.7 First Hop.png

                      Your connection to connection variance looks remarkably good and any reboot or reconnection will inevitably change the pathway and therefore the ping.

                      In sum, everything is just fine for you (but less so for me).

                      ☕️

                      • GeorgeCZ58 @RobbieTT

                        @RobbieTT but this is an issue that occurs only after the upgrade to 23.05. It is not related only to the WAN side, but also to the LAN side (routing between VLANs). I tried rebooting the Netgates with no resolution. And for me it is not fine; I want to have the same results as in 22.05.

                        • stephenw10 Netgate Administrator

                          As I understand it the concern here is not the ~1ms bump but the 20-30ms spikes shown by Zabbix?

                          • RobbieTT @GeorgeCZ58

                            @GeorgeCZ58 said in ICMP spikes after 23.05 upgrade:

                            @RobbieTT but this is an issue that occurs only after the upgrade to 23.05. It is not related only to the WAN side, but also to the LAN side (routing between VLANs).

                            @GeorgeCZ58

                            You posted about your pings changing to both 1.1.1.1 and 8.8.8.8 - these are WAN addresses. My reply was to the WAN issue you raised:

                            Also it is visible in pfSense monitoring; here is a graph of ping to 8.8.8.8 before and after the update. icmp_WAN.JPG

                            Any idea how to improve pings to get the same pings as on 22.05?

                            Your graph shows a sub-1ms increase to 8.8.8.8 (a WAN target) post-upgrade & post-reconnection. That change shown is tiny; you actually had slightly greater variance before the upgrade.

                            The line you point at is the standard deviation, so it will be influenced by any spikes in the ICMP ping; it's just maths. ICMP pings are commonly given the lowest priority on a network, so spikes are expected. If any router along the pathway is busy with packets, the humble ICMP ping is likely to be among the first to get dropped.
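The "just maths" part can be sketched as follows. The RTT values are hypothetical (roughly 19 ms with one slow reply), not data from the graph, and `describe` is only an illustrative helper using the population standard deviation as an assumed definition of the plotted line.

```python
import statistics

def describe(rtts_ms):
    """Mean, median and population std dev of a list of RTTs in ms."""
    return {
        "mean": statistics.fmean(rtts_ms),
        "median": statistics.median(rtts_ms),
        "stdev": statistics.pstdev(rtts_ms),
    }

# Hypothetical samples: ten quiet replies, then the same run with one spike.
baseline = [19.0, 19.2, 18.9, 19.1, 19.0, 19.3, 18.8, 19.1, 19.0, 19.2]
spiky = baseline[:-1] + [45.0]

for name, samples in (("baseline", baseline), ("one spike", spiky)):
    d = describe(samples)
    print(f"{name}: mean {d['mean']:.2f} ms, median {d['median']:.2f} ms, "
          f"stdev {d['stdev']:.2f} ms")
```

One slow reply out of ten leaves the median where it was and nudges the mean, but it multiplies the standard deviation many times over. A std dev line is therefore the most spike-sensitive thing on the graph.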

                            In no way can ICMP be used to represent or infer what happens to regular (non-ICMP) traffic. Indeed, many sites do not allow ICMP traffic at all or limit it to the lowest ping size possible.

                            ☕️

                            • GeorgeCZ58 @stephenw10

                              @stephenw10 exactly, the spikes. These weren't here before; they were at least 2x lower. As I wrote before, I don't feel that it is causing network issues yet. But it is completely strange that it happens. I would prefer to have everything as it was before.

                              • GeorgeCZ58

                                For information: I upgraded to 23.05.1; unfortunately this did not resolve the issue with spikes. It seems entirely like a problem with some NIC drivers in the 23.x releases in the case of the XG7100. Is there somebody with an XG7100 who upgraded and can check whether something changed for them too?

                                • GeorgeCZ58

                                  It seems there is no development in this topic. I was chatting with stephenw10; they are aware of this issue. I just applied the latest system patches, and they didn't bring a fix. Without this being fixed I am not sure whether I should upgrade a bunch of our XG7100s at remote sites.

                                  • stephenw10 Netgate Administrator

                                    There have been some commits to ixgbe since 23.05.1. 23.09-dev should become public soon and you will be able to test that.
                                    Nothing in the commit logs looks to address this specifically but it still may have been affected by one of them.

                                    Steve

                                    • GeorgeCZ58 @stephenw10

                                      @stephenw10 hello, I am just waiting for 23.09 to become official. You haven't had time to test in your lab environment?

                                      • RobbieTT @GeorgeCZ58

                                        @GeorgeCZ58
                                        The dev loads have been very stable so it is probably worth trying 23.09 on one of your units to see if your issue has changed.

                                        ☕️

                                        • GeorgeCZ58

                                          Just tested 23.09 and it didn't resolve the issue. In the chat stephenw10 wrote this: open bug report on our internal system (#10923) which we are using to discuss this.

                                          So the Netgate team knows about the issue now and I hope it will be resolved ASAP.

                                          • stephenw10 Netgate Administrator

                                            Yup we are looking at this. Testing time is limited but I'm confident we will see this when we can test it.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.