High ping and packet loss in local network


  • LAYER 8 Global Moderator

    And what MACs are you seeing for those IPs in arp -a on 2.5 when pinging 2.1?

    I will do some testing at work from a high-end Cisco switch connected to another high-end Cisco switch. Sub-0.2 ms just seems like one screaming LAN, or you're just pinging yourself.

    And you're at sub-0.2 ms, then out of the blue 20 ms, then the next ping is back to sub-0.2 ms. That just doesn't seem right.


  • Netgate Administrator

    Those ping times look OK to me, aside from the sudden jump to 20ms.
    My test pfSense box is set up behind my home pfSense box, connected directly by a 0.5m cat5e cable. The 'normal' ping response is <0.2ms. See attached RRD graph.
    It is also attached to some bridged ports on the pfSense box, so I would expect that to add some time.

    Though your ping times are still lower than mine and you are using a switch.

    In your diagram above you seem to have two boxes labelled 192.168.1.254. Typo? Just indicating the subnet? Me not understanding your diagram?

    Steve




  • @johnpoz:

    And what MACs are you seeing for those IPs in arp -a on 2.5 when pinging 2.1?

    I will do some testing at work from a high-end Cisco switch connected to another high-end Cisco switch. Sub-0.2 ms just seems like one screaming LAN, or you're just pinging yourself.

    And you're at sub-0.2 ms, then out of the blue 20 ms, then the next ping is back to sub-0.2 ms. That just doesn't seem right.

    For short cable runs on gigabit, that is rather normal.  I usually run ping tests after setting up structured cabling (lost packets are more indicative of issues than throughput tests, which vary widely).  Sub-0.3 ms is very normal even for longer runs (>50m).  The varying factor is usually the load on the end point systems.
    e.g.
    This is a ping test for a Gigabit connected access point that is currently actively streaming HD videos over WLAN:

    --- 192.168.0.11 ping statistics ---
    5 packets transmitted, 5 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.463/4.956/9.644/3.571 ms
    

    This is for a 10/100-connected access point that is currently inactive (no clients connected):

    --- 192.168.0.10 ping statistics ---
    5 packets transmitted, 5 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.257/0.275/0.327/0.026 ms
    

    And this is a ping test for my Gigabit connected PC that's hardly doing much other than streaming a youtube video or two:

    --- 192.168.0.2 ping statistics ---
    5 packets transmitted, 5 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.247/0.262/0.312/0.025 ms
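
    To compare summaries like the three above without eyeballing them, here is a minimal Python sketch. It assumes the BSD-style summary line shown in this thread; the names RttSummary, parse_rtt_summary and spread are made up for illustration and are not part of any tool mentioned here.

```python
import re
from typing import NamedTuple


class RttSummary(NamedTuple):
    """Round-trip times in milliseconds from a BSD-style ping summary."""
    min_ms: float
    avg_ms: float
    max_ms: float
    stddev_ms: float


# Matches e.g. "round-trip min/avg/max/stddev = 0.463/4.956/9.644/3.571 ms"
_SUMMARY_RE = re.compile(
    r"round-trip min/avg/max/stddev = "
    r"([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_rtt_summary(text: str) -> RttSummary:
    """Extract min/avg/max/stddev from ping output; raise if absent."""
    match = _SUMMARY_RE.search(text)
    if match is None:
        raise ValueError("no round-trip summary line found")
    return RttSummary(*(float(group) for group in match.groups()))


def spread(summary: RttSummary) -> float:
    """Ratio of worst to best round trip; close to 1.0 means a steady link."""
    return summary.max_ms / summary.min_ms


# The busy Gigabit AP and the idle 10/100 AP from the post above.
busy_ap = parse_rtt_summary(
    "round-trip min/avg/max/stddev = 0.463/4.956/9.644/3.571 ms")
idle_ap = parse_rtt_summary(
    "round-trip min/avg/max/stddev = 0.257/0.275/0.327/0.026 ms")

print(f"busy AP spread: {spread(busy_ap):.1f}x")
print(f"idle AP spread: {spread(idle_ap):.1f}x")
```

    The max/min ratio is only a rough indicator chosen for this sketch; stddev from the same line works just as well for spotting a loaded end point.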
    

  • Netgate Administrator

    Forgot to say that my boxes above are both using fxp 10/100 NICs.  ;)

    Steve



  • It's really normal to see <0.5ms pings on a local network run (even through dumb switches).  My point being that the latencies will vary widely based on the end point devices and their load.  A short spike may just indicate that the end device is under load at the time.

    In this case, the devices the OP is pinging may simply be under load at the time (since they are technically routers doing their job).



  • @dreamslacker:

    My point being that the latencies will vary widely based on the end point devices and their load.  A short spike may just indicate that the end device is under load at the time.
    In this case, the devices the OP is pinging may simply be under load at the time (since they are technically routers doing their job).

    That might seem correct, but if it were, I would have to be saturating 100 Mbit or the end point devices would have to be at something like 100% CPU usage. It also can't be the case because when I see this latency from pfSense and ping the same router from another box at the SAME TIME, I get a fast and stable ping time.

    @stephenw10:

    In your diagram above you seem to have two boxes labelled 192.168.1.254. Typo? Just indicating the subnet? Me not understanding your diagram?

    Sorry, my error: em1 is 192.168.1.130/24, not 192.168.1.254.



  • @bullet92:

    That might seem correct, but if it were, I would have to be saturating 100 Mbit or the end point devices would have to be at something like 100% CPU usage. It also can't be the case because when I see this latency from pfSense and ping the same router from another box at the SAME TIME, I get a fast and stable ping time.

    Is this machine on the same network as pfSense?

    Presumably, pfSense is pinging the said router on its 'LAN' or 'DMZ' interface.  Is the machine you're using also attached to the same interface or a different interface?

    BTW, do you have traffic shaping or QOS enabled on pfSense or the target router(s)?


  • LAYER 8 Global Moderator

    "It's really normal to see <0.5ms pings on a local network run"

    Agreed. 0.4 to 0.5 ms is very common; I see it all the time on the LAN and expect it. It's just 0.1 to 0.2 ms that I don't see. I wouldn't call our switches overworked or anything, but the only time I recall seeing such low numbers is when pinging local.

    When I'm at work tomorrow I'm going to ping around the datacenter to see what kind of low times I can find ;)



  • @dreamslacker:

    Is this machine on the same network as pfSense?
    Presumably, pfSense is pinging the said router on its 'LAN' or 'DMZ' interface.  Is the machine you're using also attached to the same interface or a different interface?
    BTW, do you have traffic shaping or QOS enabled on pfSense or the target router(s)?

    Yes, I've put this machine on the same switch (so the same router interface) as pfSense. Yes, traffic shaping is currently enabled on pfSense and disabled on the target routers.



  • @bullet92:

    @dreamslacker:

    Is this machine on the same network as pfSense?
    Presumably, pfSense is pinging the said router on its 'LAN' or 'DMZ' interface.  Is the machine you're using also attached to the same interface or a different interface?
    BTW, do you have traffic shaping or QOS enabled on pfSense or the target router(s)?

    Yes, I've put this machine on the same switch (so the same router interface) as pfSense. Yes, traffic shaping is currently enabled on pfSense and disabled on the target routers.

    Try prioritizing icmp using the floating rules and see what you get. In practical terms, it does little but if it works then you know what to do to get the best effect on your setup.



  • @johnpoz: Don't bother to do so on my account. Lol. I do see 0.2 ms round trips on wired connections now and then during line testing, but this is with fully idle systems, i.e. an idling system pinging a smart switch on the other end without any other connected devices.
    Nevertheless, a lightly loaded system should still give 0.4-0.5 ms pings.


  • LAYER 8 Global Moderator

    I'm talking about pinging, say, the console or mgmt switches from another switch. Neither would be very active; idling and sucking juice is about all they would be doing. That's in a DC where the run might be 100 ft, or quite possibly they are next to each other in the rack, but again most likely with patch panels connecting them.

    I wouldn't be doing it for anyone other than myself. I don't recall ever seeing those kinds of speeds in any network I have worked on in 20+ years.  But then again maybe I just never pinged anything directly connected ;)  Quite possible.



  • I only do that to quickly test structured runs really.  It's not indicative of real world practical applications, but any major issues or interference generally show.



  • @dreamslacker:

    Try prioritizing icmp using the floating rules and see what you get. In practical terms, it does little but if it works then you know what to do to get the best effect on your setup.

    @192.168.2.1-QOS_ENABLED:

    TEST1
    ping -c50 192.168.2.1
    --- 192.168.2.1 ping statistics ---
    50 packets transmitted, 50 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.117/1.591/14.669/3.278 ms
    TEST2
    --- 192.168.2.1 ping statistics ---
    50 packets transmitted, 50 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.132/1.868/19.644/3.629 ms

    @192.168.1.254-QOS_ENABLED:

    TEST1
    ping -c50 192.168.1.254
    --- 192.168.1.254 ping statistics ---
    50 packets transmitted, 50 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.640/2.117/10.636/2.417 ms
    TEST2
    --- 192.168.1.254 ping statistics ---
    50 packets transmitted, 50 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.609/1.666/12.005/1.936 ms

    @192.168.2.1-QOS_DISABLED:

    TEST1
    --- 192.168.2.1 ping statistics ---
    50 packets transmitted, 48 packets received, 4.0% packet loss
    round-trip min/avg/max/stddev = 0.105/4.136/46.755/8.910 ms

    TEST2
    --- 192.168.2.1 ping statistics ---
    50 packets transmitted, 50 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.116/2.888/20.880/5.121 ms

    @192.168.1.254-QOS_DISABLED:

    TEST1
    ping -c50 192.168.1.254
    --- 192.168.1.254 ping statistics ---
    50 packets transmitted, 50 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.614/3.516/78.450/10.933 ms
    TEST2
    --- 192.168.1.254 ping statistics ---
    50 packets transmitted, 50 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.646/2.797/30.649/4.756 ms

    @johnpoz:

    And what macs are you seeing on those IPs from arp -a on 2.5 pinging 2.1

    arp -a
    ? (192.168.2.1) at 00:0d:61:79:54:d8 on em1_vlan13 expires in 1175 seconds [vlan]
    ? (192.168.2.5) at 00:26:55:e3:3f:67 on em1_vlan13 permanent [vlan]

    Yesterday I also updated to snapshot 2.1.1, but no change.
    I think the prioritization gives some benefit, but the problem is not the ping itself (obviously) but the internet traffic that goes through these links, which ends up slower. So if QoS is working, is there a bottleneck somewhere in my box? That seems strange: the load is low and so is the number of connections.
    Moreover, even if QoS reduces the problem, it is still present. It is inconceivable to go from 0.117 ms latency to 14.669 ms. :/

    P.S. At this time every day (12.00am) ping and packet loss get worse:

    PING 192.168.2.1 (192.168.2.1): 56 data bytes
    64 bytes from 192.168.2.1: icmp_seq=0 ttl=64 time=49.547 ms
    64 bytes from 192.168.2.1: icmp_seq=1 ttl=64 time=54.244 ms
    64 bytes from 192.168.2.1: icmp_seq=2 ttl=64 time=51.494 ms
    64 bytes from 192.168.2.1: icmp_seq=3 ttl=64 time=27.504 ms
    64 bytes from 192.168.2.1: icmp_seq=4 ttl=64 time=119.251 ms
    64 bytes from 192.168.2.1: icmp_seq=5 ttl=64 time=43.744 ms
    64 bytes from 192.168.2.1: icmp_seq=7 ttl=64 time=8.241 ms
    64 bytes from 192.168.2.1: icmp_seq=8 ttl=64 time=0.118 ms
    64 bytes from 192.168.2.1: icmp_seq=9 ttl=64 time=90.099 ms
    
    --- 192.168.2.1 ping statistics ---
    10 packets transmitted, 9 packets received, 10.0% packet loss
    round-trip min/avg/max/stddev = 0.118/49.360/119.251/35.273 ms
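
    The run above lost one packet, and the gap in icmp_seq shows which one. A small Python sketch for picking out the missing sequence numbers from output like this follows. It assumes icmp_seq starts at 0, as in the output above, and the function names are hypothetical helpers, not part of ping.

```python
import re
from typing import List


def missing_sequences(ping_output: str, sent: int) -> List[int]:
    """Return the icmp_seq numbers (starting at 0) that never got a reply."""
    seen = {int(seq) for seq in re.findall(r"icmp_seq=(\d+)", ping_output)}
    return [seq for seq in range(sent) if seq not in seen]


def loss_percent(ping_output: str, sent: int) -> float:
    """Packet loss as a percentage of the packets sent."""
    return 100.0 * len(missing_sequences(ping_output, sent)) / sent


# Sequence numbers and times of the replies in the run above.
replies = [(0, 49.547), (1, 54.244), (2, 51.494), (3, 27.504), (4, 119.251),
           (5, 43.744), (7, 8.241), (8, 0.118), (9, 90.099)]
sample = "\n".join(
    f"64 bytes from 192.168.2.1: icmp_seq={seq} ttl=64 time={ms:.3f} ms"
    for seq, ms in replies
)

print("missing:", missing_sequences(sample, sent=10))
print("loss: {:.1f}%".format(loss_percent(sample, sent=10)))
```

    Over a longer run, knowing which sequence numbers dropped makes it easier to see whether losses cluster together or are spread out.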
    

    top

    last pid: 55487;  load averages:  0.00,  0.03,  0.01    up 0+19:20:19  12:23:35
    39 processes:  1 running, 38 sleeping
    CPU:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
    Mem: 63M Active, 213M Inact, 143M Wired, 1052K Cache, 112M Buf, 1570M Free
    

    Swap:

    vmstat -i

    interrupt                          total       rate
    irq1: atkbd0                           3          0
    irq14: ata0                           57          0
    irq19: uhci1+                     135782          1
    cpu0: timer                     27847076        399
    irq256: em0                      4073157         58
    irq257: em1                     44002413        631
    irq258: em2                       818545         11
    irq259: em3                     40608091        583
    cpu3: timer                     27847053        399
    cpu2: timer                     27847052        399
    cpu1: timer                     27847052        399
    Total                          201026281       2886
    

    /0  /1  /2  /3  /4  /5  /6  /7  /8  /9  /10
    Load Average

    Interface          Traffic              Peak                Total
    em1_vlan13  in    216.004 KB/s        233.244 KB/s            2.379 GB
                out    10.831 KB/s         11.537 KB/s            1.078 GB

    em0_vlan10  in      0.146 KB/s          2.091 KB/s          175.861 MB
                out     1.430 KB/s         13.097 KB/s            3.474 GB

    lo0         in      0.000 KB/s          0.000 KB/s          957.725 KB
                out     0.000 KB/s          0.000 KB/s          957.725 KB

    em3         in    126.926 KB/s        165.866 KB/s            2.608 GB
                out   316.054 KB/s        316.054 KB/s          702.344 MB

    em2         in      0.600 KB/s          0.600 KB/s           70.714 MB
                out     0.744 KB/s          0.744 KB/s          286.328 MB

    em1         in    316.593 KB/s        316.593 KB/s            2.063 GB
                out   123.062 KB/s        129.154 KB/s            2.080 GB

    em0         in      1.521 KB/s          4.879 KB/s          376.255 MB

    netstat -i -b -n -I em1_vlan13

    Name               Mtu Network       Address              Ipkts Ierrs Idrop     Ibytes    Opkts Oerrs     Obytes  Coll
    em1_vlan13        1496 <link#11>     00:26:55:e3:3f:67 14875307     0     0 2561511047  8441171 72643 1157981020     0
    em1_vlan13        1496 fe80::226:55f fe80::226:55ff:fe        0     -     -          0        2     -        152     -
    em1_vlan13        1496 192.168.2.0/2 192.168.2.5           7945     -     -     508860       20     -       1680     -
    

    netstat -i -b -n -I em1

    Name               Mtu Network       Address              Ipkts Ierrs Idrop     Ibytes    Opkts Oerrs     Obytes  Coll
    em1               1500 <link#2>      00:26:55:e3:3f:67 32198966     0     0 2232366523 19473909     0 2238003408     0
    em1               1500 fe80::226:55f fe80::226:55ff:fe        0     -     -          0        1     -         96     -
    em1               1500 192.168.1.0/2 192.168.1.130        25921     -     -    3280518       10     -        840     -
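
    The link-level rows of the two netstat outputs above show a nonzero Oerrs count on em1_vlan13 and zero on em1. To put that count in proportion, here is a rough Python sketch that expresses Oerrs relative to Opkts. The helper name error_rate is hypothetical, the ratio is only an approximation for comparison, and the thread does not establish what causes these errors.

```python
def error_rate(errors: int, packets: int) -> float:
    """Errors as a fraction of packets; 0.0 when no packets were counted."""
    return errors / packets if packets else 0.0


# (Opkts, Oerrs) taken from the link-level rows of the netstat output above.
counters = {
    "em1_vlan13": (8441171, 72643),
    "em1": (19473909, 0),
}

for name, (opkts, oerrs) in counters.items():
    print(f"{name}: {error_rate(oerrs, opkts):.2%} output errors")
```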
    


  • @bullet: I meant prioritizing the ping packets rather than to disable the qos as a whole. That is, for the purpose of testing, set floating rules on each interface with direction out, protocol icmp, source address of the interface and place in the highest priority queue.

    Edit:  By any chance, do you have Upperlimit, or Limiter, or PowerD with P4TCC, or any combination of those enabled?



  • @dreamslacker
    yes, I understand that, in fact that's what I've done!

    I set up the traffic shaper with the wizard and it automatically set an Upperlimit.

    So now I tried removing the shaper entirely and this is what I get:

    --- 192.168.2.1 ping statistics ---
    50 packets transmitted, 50 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.100/0.174/0.251/0.039 ms

    SOLVED!!!!  ;D ;D ;D ;D

    P.S. So now I only need to configure the traffic shaping correctly.


  • LAYER 8 Global Moderator

    Glad you got your issue solved. I want to do some more testing in my own DC, trying to find these sub-0.2 ms response times. Yesterday I pinged from one Cisco switch to another. The IPs were SVIs on a VLAN, but with the switches not doing anything and directly connected I was seeing roughly 0.4 ms, which is a bit of a difference from 0.1 ms ;)

    I will be keeping an eye out. I just personally don't recall seeing such low response times, even on a local LAN with directly connected equipment, unless you were pinging loopback, a local IP, etc.

    But I will be paying more attention in the future on the hunt ;)



  • @bullet92:

    @dreamslacker
    yes, I understand that, in fact that's what I've done!

    I set up the traffic shaper with the wizard and it automatically set an Upperlimit.

    So now I tried removing the shaper entirely and this is what I get:

    --- 192.168.2.1 ping statistics ---
    50 packets transmitted, 50 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 0.100/0.174/0.251/0.039 ms

    SOLVED!!!!  ;D ;D ;D ;D

    P.S. So now I only need to configure the traffic shaping correctly.

    I dislike the wizard, it never gives an optimal solution (nor should it since every use-case is different).

    Nevertheless, you can run the wizard but remove all upperlimits in all queues once it's done.  I've found that it increases latencies across the board (usually when it's active on the queue and occasionally at low loads) for some unknown reason.

    I've settled on manually setting up my own queues and settings damn nearly since day 1 (pfSense 1.2rc2).


  • Netgate Administrator

    Good call.  :)

    Time for a low ping contest Johnpoz?

    Steve



  • @dreamslacker:

    I dislike the wizard, it never gives an optimal solution (nor should it since every use-case is different).
    Nevertheless, you can run the wizard but remove all upperlimits in all queues once it's done.  I've found that it increases latencies across the board (usually when it's active on the queue and occasionally at low loads) for some unknown reason.
    I've settled on manually setting up my own queues and settings damn nearly since day 1 (pfSense 1.2rc2).

    I will try your advice, thanks ;)

    Glad you got your issue solved. I want to do some more testing in my own DC, trying to find these sub-0.2 ms response times. Yesterday I pinged from one Cisco switch to another. The IPs were SVIs on a VLAN, but with the switches not doing anything and directly connected I was seeing roughly 0.4 ms, which is a bit of a difference from 0.1 ms ;)
    I will be keeping an eye out. I just personally don't recall seeing such low response times, even on a local LAN with directly connected equipment, unless you were pinging loopback, a local IP, etc.
    But I will be paying more attention in the future on the hunt ;)

    Thx :) I advise you to try pinging a zeroshell box! Just out of curiosity :D

