High ping and packet loss in local network
-
So are physical wires and devices involved anywhere in this setup, or is this all virtual networks and VMs?
I used a VM (VirtualBox on Windows) only to test ping from another Linux box (I hate Windows ping); my pfSense setup runs on real hardware!
64 bytes from 192.168.2.1: icmp_seq=0 ttl=64 time=0.203 ms
64 bytes from 192.168.2.1: icmp_seq=1 ttl=64 time=0.149 ms
64 bytes from 192.168.2.1: icmp_seq=2 ttl=64 time=0.142 ms
.15 to .2 ms is freakishly FAST.. And then it bounces to 20 ms??
64 bytes from 192.168.2.1: icmp_seq=3 ttl=64 time=20.249 ms
You finally hit the problem!! ;D How is it possible?! :o
I don't know why it is so fast, but this is a production server with good cable (Cat 5e/6) and a decent (but not excellent) switch.
OK, low 3's, but sub .2? I don't think I have ever seen such speeds.
The 2.1 box is a ZeroShell router; before I put it in, I didn't have pings this low.
You sure you're not pinging your own IP address again? ;)
I'm sure because of this:
PING 192.168.2.1 (192.168.2.1) from 192.168.2.5: 56 data bytes
64 bytes from 192.168.2.1: icmp_seq=0 ttl=64 time=0.203 ms
64 bytes from 192.168.2.1: icmp_seq=1 ttl=64 time=0.149 ms
and:
Interface configuration:
WAN (wan) -> em1 -> v4: 192.168.1.130/24
DMZ (lan) -> em0 -> v4: 172.16.30.5/24
WAN2 (opt1) -> em1_vlan13 -> v4: 192.168.2.5/24
WIBRI (opt2) -> em3 -> v4: 192.168.168.5/24
SEDE (opt3) -> em0_vlan10 -> v4: 192.168.132.5/24
WAN3 (opt4) -> em2 -> v4: 217.xx.xx.30/27
So here is a question for you: are you having actual operational issues, with actual applications suffering from packet loss, or are you just seeing weird stuff when you're pinging?
I started monitoring my ping because of issues with VoIP, which is more susceptible to packet loss and high latency. I get a lot of dropped calls or very poor call quality.
Then I PHYSICALLY switched WAN3 (where the VoIP goes out) from em0 (em0 and em1 are on the same physical card, a dual-port NIC) to em2 (the motherboard's NIC), AND I moved the other traffic that exited via WAN3 to WAN1, SO this network now works well. At this point I thought I had a broken or improperly working NIC, so I tried another hardware configuration, but the issue persisted. Then, on my production server, I tried putting all the WANs on the NIC that I was sure worked well: em3. I created and configured 2 WANs on 2 VLANs plus 1 WAN without a VLAN: DISASTER :o ALL the WANs had the same problem, even worse. I also tried changing the switch!
With my current configuration I have no problems with VoIP, but internet browsing is worse and slower because of this lag/packet loss.
PS. thanks for your interest :D
-
And what MACs are you seeing for those IPs from arp -a on 2.5 when pinging 2.1?
I will do some testing at work from a high-end Cisco switch connected to another high-end Cisco switch.. It's just that sub .2 ms seems like one screaming LAN, or you're just pinging yourself..
And your sub .2 and then, out of the blue, 20 ms, then the next ping back to sub .2: that just doesn't seem right.
-
Those ping times look OK to me, aside from the sudden jump to 20ms.
My test pfSense box is set up behind my home pfSense box, connected directly by a 0.5 m Cat5e cable. The 'normal' ping response is <0.2 ms. See attached RRD graph.
That is attached to some bridged ports on the pfSense box as well, so I would expect that to add some time.
Though your ping times are still lower than mine and you are using a switch.
In your diagram above you seem to have two boxes labelled 192.168.1.254. Typo? Just indicating the subnet? Me not understanding your diagram?
Steve
-
And what MACs are you seeing for those IPs from arp -a on 2.5 when pinging 2.1?
I will do some testing at work from a high-end Cisco switch connected to another high-end Cisco switch.. It's just that sub .2 ms seems like one screaming LAN, or you're just pinging yourself..
And your sub .2 and then, out of the blue, 20 ms, then the next ping back to sub .2: that just doesn't seem right.
For short cable runs on gigabit, that is rather normal. I usually run ping tests after setting up structured cabling (lost packets are more indicative of issues than throughput tests, which vary widely). Sub .3 ms is very normal even for longer runs (>50 m). The varying factor is usually the load on the end point systems.
e.g.
This is a ping test for a Gigabit connected access point that is currently actively streaming HD videos over WLAN:
--- 192.168.0.11 ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.463/4.956/9.644/3.571 ms
This is for a 10/100 connected access point that is currently inactive (no clients connected):
--- 192.168.0.10 ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.257/0.275/0.327/0.026 ms
And this is a ping test for my Gigabit connected PC that's hardly doing much other than streaming a YouTube video or two:
--- 192.168.0.2 ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.247/0.262/0.312/0.025 ms
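For anyone who wants to compare summaries like the ones above without eyeballing them, here is a minimal Python sketch. It only parses the BSD-style `round-trip min/avg/max/stddev` line; the `looks_jittery` helper and its 5x threshold are assumptions made up for illustration, not an established rule.

```python
import re

# Matches the BSD-style summary line that ping prints at the end of a run.
SUMMARY_RE = re.compile(
    r"round-trip min/avg/max/stddev = "
    r"([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_summary(line):
    """Return (min, avg, max, stddev) in milliseconds, or None if no match."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return tuple(float(value) for value in match.groups())


def looks_jittery(stats, ratio=5.0):
    """Hypothetical heuristic: the max RTT exceeds `ratio` times the min RTT."""
    rtt_min, _avg, rtt_max, _stddev = stats
    return rtt_max > ratio * rtt_min


# Summary lines copied from the busy and the idle access point above.
busy_ap = parse_summary("round-trip min/avg/max/stddev = 0.463/4.956/9.644/3.571 ms")
idle_ap = parse_summary("round-trip min/avg/max/stddev = 0.257/0.275/0.327/0.026 ms")

print(looks_jittery(busy_ap))  # True
print(looks_jittery(idle_ap))  # False
```

The ratio is only a rough way to separate "loaded end point" from "idle end point"; pick whatever threshold suits your own network.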
-
Forgot to say that my boxes above are both using fxp 10/100 NICs. ;)
Steve
-
It's really normal to see <0.5ms pings on a local network run (even through dumb switches). My point being that the latencies will vary largely based on the end point devices and their load. A short spike may just indicate that the end device is under load at the time.
In this case, the devices the OP is pinging may simply be under load at the time (since they are technically routers doing their job).
-
My point being that the latencies will vary largely based on the end point devices and their load. A short spike may just indicate that the end device is under load at the time.
In this case, the devices the OP is pinging may simply be under load at the time (since they are technically routers doing their job).
That might seem right, but if it were, I would have to be saturating 100 Mbit, or the end point devices would have to be at 100% CPU usage or something like that. It is also impossible because when I see this latency from pfSense, if I ping the same router from another box at the SAME TIME, I get a fast and stable ping time.
In your diagram above you seem to have two boxes labelled 192.168.1.254. Typo? Just indicating the subnet? Me not understanding your diagram?
Sorry, my error: em1 is 192.168.1.130/24, not 192.168.1.254.
-
That might seem right, but if it were, I would have to be saturating 100 Mbit, or the end point devices would have to be at 100% CPU usage or something like that. It is also impossible because when I see this latency from pfSense, if I ping the same router from another box at the SAME TIME, I get a fast and stable ping time.
Is this machine on the same network as pfSense?
Presumably, pfSense is pinging the said router on its 'LAN' or 'DMZ' interface. Is the machine you're using also attached to the same interface or a different interface?
BTW, do you have traffic shaping or QOS enabled on pfSense or the target router(s)?
-
"It's really normal to see <0.5ms pings on a local network run"
Agreed.. .4 to .5 is very common; I see this all the time on the LAN and expect it.. It's just .1 to .2 that I don't see. I wouldn't call our switches overworked or anything, but the only time I recall seeing such low numbers is when pinging locally..
When I'm at work tomorrow I'm going to ping around the datacenter to see what kind of low times I can find ;)
-
Is this machine on the same network as pfSense?
Presumably, pfSense is pinging the said router on its 'LAN' or 'DMZ' interface. Is the machine you're using also attached to the same interface or a different interface?
BTW, do you have traffic shaping or QoS enabled on pfSense or the target router(s)?
Yes, I've put this machine on the same switch (so the same router interface) as pfSense. Yes, traffic shaping is currently enabled on pfSense and disabled on the target routers.
-
Is this machine on the same network as pfSense?
Presumably, pfSense is pinging the said router on its 'LAN' or 'DMZ' interface. Is the machine you're using also attached to the same interface or a different interface?
BTW, do you have traffic shaping or QoS enabled on pfSense or the target router(s)?
Yes, I've put this machine on the same switch (so the same router interface) as pfSense. Yes, traffic shaping is currently enabled on pfSense and disabled on the target routers.
Try prioritizing icmp using the floating rules and see what you get. In practical terms, it does little but if it works then you know what to do to get the best effect on your setup.
-
@johnpoz: Don't bother doing so on my account. Lol. I do see 0.2 ms round trip on wired connections now and then during line testing, but this is with fully idle systems, i.e. an idling system pinging a smart switch on the other end without any other connected devices.
Nevertheless, a lightly loaded system should still give 0.4-0.5 ms pings.
-
I'm talking about pinging, say, the console or mgmt switches from another switch. Neither would be very active; idling and sucking juice is about all they would be doing.. Now this is in a DC where the run might be 100 ft, or quite possibly they are next to each other in the rack, but again most likely with patch panels connecting them.
I wouldn't be doing it for anyone other than myself. I don't recall ever seeing those kinds of speeds in any network I have worked on in 20+ years.. But then again maybe I just never pinged anything directly connected ;) Quite possible.
-
I only do that to quickly test structured runs, really. It's not indicative of real-world practical applications, but any major issues or interference generally show up.
-
Try prioritizing icmp using the floating rules and see what you get. In practical terms, it does little but if it works then you know what to do to get the best effect on your setup.
@192.168.2.1-QOS_ENABLED:
TEST1
ping -c50 192.168.2.1
--- 192.168.2.1 ping statistics ---
50 packets transmitted, 50 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.117/1.591/14.669/3.278 ms
TEST2
--- 192.168.2.1 ping statistics ---
50 packets transmitted, 50 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.132/1.868/19.644/3.629 ms
@192.168.1.254-QOS_ENABLED:
TEST1
ping -c50 192.168.1.254
--- 192.168.1.254 ping statistics ---
50 packets transmitted, 50 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.640/2.117/10.636/2.417 ms
TEST2
--- 192.168.1.254 ping statistics ---
50 packets transmitted, 50 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.609/1.666/12.005/1.936 ms
@192.168.2.1-QOS_DISABLED:
TEST1
--- 192.168.2.1 ping statistics ---
50 packets transmitted, 48 packets received, 4.0% packet loss
round-trip min/avg/max/stddev = 0.105/4.136/46.755/8.910 ms
TEST2
--- 192.168.2.1 ping statistics ---
50 packets transmitted, 50 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.116/2.888/20.880/5.121 ms
@192.168.1.254-QOS_DISABLED:
TEST1
ping -c50 192.168.1.254
--- 192.168.1.254 ping statistics ---
50 packets transmitted, 50 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.614/3.516/78.450/10.933 ms
TEST2
--- 192.168.1.254 ping statistics ---
50 packets transmitted, 50 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.646/2.797/30.649/4.756 ms
And what macs are you seeing on those IPs from arp -a on 2.5 pinging 2.1
arp -a
? (192.168.2.1) at 00:0d:61:79:54:d8 on em1_vlan13 expires in 1175 seconds [vlan]
? (192.168.2.5) at 00:26:55:e3:3f:67 on em1_vlan13 permanent [vlan]
Yesterday I also updated to the 2.1.1 snapshot, but nothing changed.
I think the prioritization gives some benefit, but the problem is not the ping (obviously); it is the internet traffic going through these links, which ends up slower.. So if QoS is working, is there a bottleneck somewhere in my box? That seems strange: the load is low and so is the number of connections..
Moreover, even if QoS reduces the problem, it is still present; it is inconceivable to go from 0.117 ms latency to 14.669 ms. :/
P.S. At this time every day (around 12:00 noon) ping and packet loss get worse:
PING 192.168.2.1 (192.168.2.1): 56 data bytes
64 bytes from 192.168.2.1: icmp_seq=0 ttl=64 time=49.547 ms
64 bytes from 192.168.2.1: icmp_seq=1 ttl=64 time=54.244 ms
64 bytes from 192.168.2.1: icmp_seq=2 ttl=64 time=51.494 ms
64 bytes from 192.168.2.1: icmp_seq=3 ttl=64 time=27.504 ms
64 bytes from 192.168.2.1: icmp_seq=4 ttl=64 time=119.251 ms
64 bytes from 192.168.2.1: icmp_seq=5 ttl=64 time=43.744 ms
64 bytes from 192.168.2.1: icmp_seq=7 ttl=64 time=8.241 ms
64 bytes from 192.168.2.1: icmp_seq=8 ttl=64 time=0.118 ms
64 bytes from 192.168.2.1: icmp_seq=9 ttl=64 time=90.099 ms
--- 192.168.2.1 ping statistics ---
10 packets transmitted, 9 packets received, 10.0% packet loss
round-trip min/avg/max/stddev = 0.118/49.360/119.251/35.273 ms
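As a rough illustration of how the lost packet in the ping run above can be pinned down, the following Python sketch scans the per-reply lines for gaps in `icmp_seq`. The function name `analyse_replies` is made up for this example, and it assumes the BSD-style reply format shown in this thread.

```python
import re

# One match per reply line: captures the sequence number and the RTT in ms.
REPLY_RE = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms")


def analyse_replies(output, sent):
    """Return (missing sequence numbers, RTTs in ms) for one ping run."""
    replies = {int(seq): float(rtt) for seq, rtt in REPLY_RE.findall(output)}
    missing = [seq for seq in range(sent) if seq not in replies]
    rtts = [replies[seq] for seq in sorted(replies)]
    return missing, rtts


# Reply lines copied from the ping run above (10 packets sent).
sample = """\
64 bytes from 192.168.2.1: icmp_seq=0 ttl=64 time=49.547 ms
64 bytes from 192.168.2.1: icmp_seq=1 ttl=64 time=54.244 ms
64 bytes from 192.168.2.1: icmp_seq=2 ttl=64 time=51.494 ms
64 bytes from 192.168.2.1: icmp_seq=3 ttl=64 time=27.504 ms
64 bytes from 192.168.2.1: icmp_seq=4 ttl=64 time=119.251 ms
64 bytes from 192.168.2.1: icmp_seq=5 ttl=64 time=43.744 ms
64 bytes from 192.168.2.1: icmp_seq=7 ttl=64 time=8.241 ms
64 bytes from 192.168.2.1: icmp_seq=8 ttl=64 time=0.118 ms
64 bytes from 192.168.2.1: icmp_seq=9 ttl=64 time=90.099 ms
"""

missing, rtts = analyse_replies(sample, sent=10)
print(missing)                                  # [6]
print(f"{100 * len(missing) / 10:.1f}% loss")   # 10.0% loss
print(max(rtts))                                # 119.251
```

Knowing which sequence numbers went missing makes it easier to line the loss up against other logs taken at the same moment.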
top
last pid: 55487;  load averages:  0.00,  0.03,  0.01  up 0+19:20:19  12:23:35
39 processes:  1 running, 38 sleeping
CPU:  0.0% user,  0.0% nice,  0.4% system,  0.0% interrupt, 99.6% idle
Mem: 63M Active, 213M Inact, 143M Wired, 1052K Cache, 112M Buf, 1570M Free
Swap:
vmstat -i
interrupt              total   rate
irq1: atkbd0               3      0
irq14: ata0               57      0
irq19: uhci1+         135782      1
cpu0: timer         27847076    399
irq256: em0          4073157     58
irq257: em1         44002413    631
irq258: em2           818545     11
irq259: em3         40608091    583
cpu3: timer         27847053    399
cpu2: timer         27847052    399
cpu1: timer         27847052    399
Total              201026281   2886
             /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
Load Average
Interface         Traffic          Peak             Total
em1_vlan13  in    216.004 KB/s     233.244 KB/s     2.379 GB
            out    10.831 KB/s      11.537 KB/s     1.078 GB
em0_vlan10  in      0.146 KB/s       2.091 KB/s   175.861 MB
            out     1.430 KB/s      13.097 KB/s     3.474 GB
lo0         in      0.000 KB/s       0.000 KB/s   957.725 KB
            out     0.000 KB/s       0.000 KB/s   957.725 KB
em3         in    126.926 KB/s     165.866 KB/s     2.608 GB
            out   316.054 KB/s     316.054 KB/s   702.344 MB
em2         in      0.600 KB/s       0.600 KB/s    70.714 MB
            out     0.744 KB/s       0.744 KB/s   286.328 MB
em1         in    316.593 KB/s     316.593 KB/s     2.063 GB
            out   123.062 KB/s     129.154 KB/s     2.080 GB
em0         in      1.521 KB/s       4.879 KB/s   376.255 MB
netstat -i -b -n -I em1_vlan13
Name        Mtu   Network        Address            Ipkts     Ierrs  Idrop  Ibytes      Opkts    Oerrs  Obytes      Coll
em1_vlan13  1496  <link#11>      00:26:55:e3:3f:67  14875307  0      0      2561511047  8441171  72643  1157981020  0
em1_vlan13  1496  fe80::226:55f  fe80::226:55ff:fe  0         -      -      0           2        -      152         -
em1_vlan13  1496  192.168.2.0/2  192.168.2.5        7945      -      -      508860      20       -      1680        -
netstat -i -b -n -I em1
Name  Mtu   Network        Address            Ipkts     Ierrs  Idrop  Ibytes      Opkts     Oerrs  Obytes      Coll
em1   1500  <link#2>       00:26:55:e3:3f:67  32198966  0      0      2232366523  19473909  0      2238003408  0
em1   1500  fe80::226:55f  fe80::226:55ff:fe  0         -      -      0           1         -      96          -
em1   1500  192.168.1.0/2  192.168.1.130      25921     -      -      3280518     10        -      840         -
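One detail in the two netstat outputs above is the Oerrs column: it is non-zero on em1_vlan13 and zero on em1. The short Python sketch below just turns those counters into a percentage. The `error_rate` helper is a name invented for this example, and the numbers say nothing by themselves about what caused the errors.

```python
def error_rate(packets, errors):
    """Percentage of packets that the interface counted as errors."""
    if packets == 0:
        return 0.0
    return 100.0 * errors / packets


# Opkts and Oerrs copied from the link-level rows of the netstat output above.
vlan_rate = error_rate(packets=8441171, errors=72643)   # em1_vlan13
parent_rate = error_rate(packets=19473909, errors=0)    # em1

print(f"em1_vlan13: {vlan_rate:.2f}% output errors")  # em1_vlan13: 0.86% output errors
print(f"em1: {parent_rate:.2f}% output errors")       # em1: 0.00% output errors
```

Re-running netstat after a configuration change and comparing the two counters again would show whether the error count is still climbing.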
-
@bullet: I meant prioritizing the ping packets rather than to disable the qos as a whole. That is, for the purpose of testing, set floating rules on each interface with direction out, protocol icmp, source address of the interface and place in the highest priority queue.
Edit: By any chance, do you have Upperlimit, or Limiter, or PowerD with P4TCC, or any combination of those enabled?
-
@dreamslacker
Yes, I understand that; in fact that is what I did!
I set up the traffic shaper with the wizard, and it automatically set an Upperlimit.
So now I tried removing the shaper entirely, and this is what I get:
--- 192.168.2.1 ping statistics ---
50 packets transmitted, 50 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.100/0.174/0.251/0.039 ms
SOLVED!!!! ;D ;D ;D ;D
PS: So now I only need to configure the traffic shaping correctly.
-
Glad you got your issue solved.. I want to do some more testing in my own DC to try to find these sub .2 ms response times.. So yesterday I pinged from one Cisco switch to another.. The IPs were SVIs on a VLAN, but with the switches not doing anything and directly connected, I was seeing roughly .4 ms.. Which is a bit of a difference from .1 ms ;)
I will be keeping an eye out.. I just personally don't recall seeing such low response times, even on a local LAN with directly connected equipment, unless you were pinging loopback, a local IP, etc.
But I will be paying more attention in the future on the hunt ;)
-
@dreamslacker
Yes, I understand that; in fact that is what I did!
I set up the traffic shaper with the wizard, and it automatically set an Upperlimit.
So now I tried removing the shaper entirely, and this is what I get:
--- 192.168.2.1 ping statistics ---
50 packets transmitted, 50 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.100/0.174/0.251/0.039 ms
SOLVED!!!! ;D ;D ;D ;D
PS: So now I only need to configure the traffic shaping correctly.
I dislike the wizard, it never gives an optimal solution (nor should it since every use-case is different).
Nevertheless, you can run the wizard but remove all upperlimits in all queues once it's done. I've found that an upperlimit increases latencies across the board (usually when it's active on the queue, and occasionally at low loads) for some unknown reason.
I've settled on manually setting up my own queues and settings damn nearly since day 1 (pfSense 1.2rc2).
-
Good call. :)
Time for a low ping contest Johnpoz?
Steve