RRD Quality Delay values wrong after updating to 2.2

kejianshi

5ms seems good. See that on comcast and TWC all the time. For years.

You sure it's not just working correctly now where it wasn't before?

neik

My problem went the other way: from around 5ms to 35ms. Capturing the pings shows that there is not actually anything wrong:

08:36:46.496455 ICMP echo request, id 3767, seq 53505, length 44
08:36:46.500664 ICMP echo reply, id 3767, seq 53505, length 44
08:36:47.496452 ICMP echo request, id 3767, seq 53761, length 44
08:36:47.501089 ICMP echo reply, id 3767, seq 53761, length 44
08:36:48.497399 ICMP echo request, id 3767, seq 54017, length 44
08:36:48.501712 ICMP echo reply, id 3767, seq 54017, length 44
08:36:49.498143 ICMP echo request, id 3767, seq 54273, length 44
08:36:49.502348 ICMP echo reply, id 3767, seq 54273, length 44
08:36:50.498388 ICMP echo request, id 3767, seq 54529, length 44
08:36:50.502767 ICMP echo reply, id 3767, seq 54529, length 44
08:36:51.499141 ICMP echo request, id 3767, seq 54785, length 44
08:36:51.503391 ICMP echo reply, id 3767, seq 54785, length 44
08:36:52.499137 ICMP echo request, id 3767, seq 55041, length 44
08:36:52.503104 ICMP echo reply, id 3767, seq 55041, length 44
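As a rough check of the capture above, the round-trip time can be worked out by pairing each echo request with the reply that carries the same id and seq. This is only a minimal Python sketch that assumes the tcpdump line format shown above; the name `rtts_from_capture` is made up for illustration, and only the first two request/reply pairs are pasted in.

```python
import re
from datetime import datetime

# Matches lines like:
# 08:36:46.496455 ICMP echo request, id 3767, seq 53505, length 44
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}\.\d{6}) ICMP echo (request|reply), id (\d+), seq (\d+)"
)

def rtts_from_capture(text):
    """Return {seq: rtt_ms} for every request/reply pair found in text."""
    sent = {}
    rtts = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts = datetime.strptime(m.group(1), "%H:%M:%S.%f")
        key = (m.group(3), m.group(4))  # (id, seq) identifies one probe
        if m.group(2) == "request":
            sent[key] = ts
        elif key in sent:
            delta = ts - sent.pop(key)
            rtts[int(m.group(4))] = delta.total_seconds() * 1000.0
    return rtts

capture = """\
08:36:46.496455 ICMP echo request, id 3767, seq 53505, length 44
08:36:46.500664 ICMP echo reply, id 3767, seq 53505, length 44
08:36:47.496452 ICMP echo request, id 3767, seq 53761, length 44
08:36:47.501089 ICMP echo reply, id 3767, seq 53761, length 44
"""

for seq, rtt in rtts_from_capture(capture).items():
    print(f"seq {seq}: {rtt:.3f} ms")
```

For these two pairs the sketch gives roughly 4.2ms and 4.6ms, which matches the "around 5ms" seen before the upgrade rather than the 35ms now being graphed.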

Here is the apinger log. I guess the 60ms value is raising the average?

Jan 30 08:46:36 apinger: Polling, timeout: 0.692s
Jan 30 08:46:36 apinger: (avg. loss: 0.0%)
Jan 30 08:46:36 apinger: (avg: 5.866ms)
Jan 30 08:46:36 apinger: #1055 from WAN_PPPOE(62.3.84.23) delay: 4.280ms/6.347ms/58.662ms received = 1054
Jan 30 08:46:36 apinger: Polling, timeout: 0.697s
Jan 30 08:46:36 apinger: Recently lost packets: 0
Jan 30 08:46:36 apinger: Sending ping #1055 to WAN_PPPOE (62.3.84.23)
Jan 30 08:46:35 apinger: Polling, timeout: 0.993s
Jan 30 08:46:35 apinger: (avg. loss: 0.0%)
Jan 30 08:46:35 apinger: (avg: 6.073ms)
Jan 30 08:46:35 apinger: #1054 from WAN_PPPOE(62.3.84.23) delay: 4.648ms/4.398ms/60.729ms received = 1053
Jan 30 08:46:35 apinger: Polling, timeout: 0.998s
Jan 30 08:46:35 apinger: Recently lost packets: 0
Jan 30 08:46:35 apinger: Sending ping #1054 to WAN_PPPOE (62.3.84.23)
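To compare the "delay:" figures with the "(avg: ...)" figures, the two kinds of line can be pulled out of the log and paired up. The sketch below is Python and assumes only the debug log format pasted above, with the newest lines first; `parse_apinger` is a made-up name, and the meaning of the three "delay:" fields is not stated anywhere in the log itself.

```python
import re

# "#1055 from WAN_PPPOE(62.3.84.23) delay: 4.280ms/6.347ms/58.662ms received = 1054"
DELAY_RE = re.compile(r"#(\d+) from \S+ delay: ([\d.]+)ms/([\d.]+)ms/([\d.]+)ms")
# "(avg: 5.866ms)"  (does not match "(avg. loss: 0.0%)")
AVG_RE = re.compile(r"\(avg: ([\d.]+)ms\)")

def parse_apinger(text):
    """Pair each 'delay:' line with the '(avg: ...)' line logged just above it."""
    samples = []
    last_avg = None
    for line in text.splitlines():
        m = AVG_RE.search(line)
        if m:
            last_avg = float(m.group(1))
            continue
        m = DELAY_RE.search(line)
        if m and last_avg is not None:
            fields = tuple(float(m.group(i)) for i in (2, 3, 4))
            samples.append({"ping": int(m.group(1)), "delay": fields, "avg": last_avg})
            last_avg = None
    return samples

log = """\
Jan 30 08:46:36 apinger: (avg. loss: 0.0%)
Jan 30 08:46:36 apinger: (avg: 5.866ms)
Jan 30 08:46:36 apinger: #1055 from WAN_PPPOE(62.3.84.23) delay: 4.280ms/6.347ms/58.662ms received = 1054
Jan 30 08:46:35 apinger: (avg. loss: 0.0%)
Jan 30 08:46:35 apinger: (avg: 6.073ms)
Jan 30 08:46:35 apinger: #1054 from WAN_PPPOE(62.3.84.23) delay: 4.648ms/4.398ms/60.729ms received = 1053
"""

for s in parse_apinger(log):
    ratio = s["delay"][2] / s["avg"]
    print(f"ping #{s['ping']}: delay fields {s['delay']}, avg {s['avg']}, third/avg = {ratio:.2f}")
```

In both samples the third "delay:" figure is almost exactly ten times the avg value. That would fit the third field being a running sum over ten samples rather than a single 60ms ping, but this is a guess from two log lines only.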

![zen latency increase.png](/public/imported_attachments/1/zen latency increase.png)

wbond

In my case the ping command shows around 22ms, whereas the RTT in the gateway widget at the same time is showing 0.7ms. I think it was correctly displayed in 2.1.5 but 2.2 is too low.

kejianshi

Sorry man - That graph looks like 4 or 5 ms to me, which seems good but not too good to be true.

(I am sipping wine as I critique the post though - It's a good strong Cabernet, so it may be influencing my judgement)

wbond

I've never logged the apinger data before, but it looks like the lines that list "delay:" show the average/min/max, and those look correct. I'm guessing that the RTT in the gateway widget comes from the "avg:" lines, which don't seem to correlate. This is a home connection via DSL, 20 down/0.8 up. For some reason I think that DSL tends to have higher latencies than cable.

wbond

I just noticed this post.

https://forum.pfsense.org/index.php?topic=87835.0

Restarting the apinger service seems to have fixed this issue on my system.

deltix

I'm seeing the same thing. Fresh installation, ping is 5-6ms, apinger is showing 0.5ms. After restarting apinger it looks fine.

Chucko

Thanks to wbond for letting me know about this thread. I've noticed the same thing; apinger delays are much less than a manual ping or traceroute will show; and restarting apinger corrects the problem.

wbond

It looks like in my case restarting apinger only worked for a couple of hours, and now it's back to underreporting the RTT.

(attachment: untitled.png)

Harvy66

wbond, RRD is showing about 5ms, it's supposed to be about 25ms? Are you sure you're pinging the same IP address?

wbond

Harvy66, yes, same IP address. It usually measures in the mid-20s using ping, and it was consistently in that range in 2.1.5.

I just pinged from pfsense and results were:

PING 67.41.239.70 (67.41.239.70): 56 data bytes
64 bytes from 67.41.239.70: icmp_seq=0 ttl=64 time=22.187 ms
64 bytes from 67.41.239.70: icmp_seq=1 ttl=64 time=22.040 ms
64 bytes from 67.41.239.70: icmp_seq=2 ttl=64 time=23.700 ms
64 bytes from 67.41.239.70: icmp_seq=3 ttl=64 time=21.708 ms
64 bytes from 67.41.239.70: icmp_seq=4 ttl=64 time=25.904 ms
64 bytes from 67.41.239.70: icmp_seq=5 ttl=64 time=39.864 ms
64 bytes from 67.41.239.70: icmp_seq=6 ttl=64 time=46.233 ms
64 bytes from 67.41.239.70: icmp_seq=7 ttl=64 time=22.777 ms
64 bytes from 67.41.239.70: icmp_seq=8 ttl=64 time=21.835 ms
64 bytes from 67.41.239.70: icmp_seq=9 ttl=64 time=21.580 ms

--- 67.41.239.70 ping statistics ---
10 packets transmitted, 10 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 21.580/26.783/46.233/8.347 ms

The dashboard is showing 2.6ms at the same time.
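To put a number on the gap, the ping average can be recomputed from the individual replies and set against the dashboard figure. This is a minimal Python sketch that only parses the `time=` values from the ping output above; `ping_times` is a made-up helper name and the 2.6ms value is simply the number read off the dashboard.

```python
import re

def ping_times(text):
    """Extract the 'time=... ms' values from ping output, in milliseconds."""
    return [float(t) for t in re.findall(r"time=([\d.]+) ms", text)]

ping_output = """\
64 bytes from 67.41.239.70: icmp_seq=0 ttl=64 time=22.187 ms
64 bytes from 67.41.239.70: icmp_seq=1 ttl=64 time=22.040 ms
64 bytes from 67.41.239.70: icmp_seq=2 ttl=64 time=23.700 ms
64 bytes from 67.41.239.70: icmp_seq=3 ttl=64 time=21.708 ms
64 bytes from 67.41.239.70: icmp_seq=4 ttl=64 time=25.904 ms
64 bytes from 67.41.239.70: icmp_seq=5 ttl=64 time=39.864 ms
64 bytes from 67.41.239.70: icmp_seq=6 ttl=64 time=46.233 ms
64 bytes from 67.41.239.70: icmp_seq=7 ttl=64 time=22.777 ms
64 bytes from 67.41.239.70: icmp_seq=8 ttl=64 time=21.835 ms
64 bytes from 67.41.239.70: icmp_seq=9 ttl=64 time=21.580 ms
"""

times = ping_times(ping_output)
avg = sum(times) / len(times)
dashboard_ms = 2.6  # value read off the dashboard, not parsed from any log

print(f"min/avg/max = {min(times):.3f}/{avg:.3f}/{max(times):.3f} ms")
print(f"ping average is {avg / dashboard_ms:.1f}x the dashboard value")
```

The recomputed min/avg/max agree with the summary line ping printed, so the manual measurement is self-consistent and the discrepancy is on the dashboard side.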

keylevel

Sorry - didn't get any notifications that anyone had replied to this.

Anyway - restarting apinger has put the stats back to normal. I'll keep an eye on it and see how long it looks sensible ;-)

wbond

Restarting apinger only seemed to fix it for a couple of hours for me. I ended up changing the probe interval on the gateway to 2 seconds and changed the Down value to 20 and the RTT has been running at normal levels for 2 days now. No idea why this would make any difference, but it seems to.

justincase

Just to let you know, I tried a workaround that restarts apinger when the monitoring alarm goes off, and it has seemed to do the trick for 2 days now:
https://forum.pfsense.org/index.php?topic=87835.msg487013#msg487013

keylevel

@cmb:

Could you enable gateway monitoring debug logging under System>Advanced, Misc, check "Enable gateway monitoring debug logging", and post your apinger logs?

I think I may have a clue here…

I restarted apinger and cleared the logs. The RRD RTT values were then OK for some time, but the reported values "went silly" when the log showed:

Feb 15 22:01:35 apinger: alarm canceled: GW_WAN(###) *** delay ***
Feb 15 22:01:22 apinger: ALARM: GW_WAN(###) *** delay ***
Feb 15 19:27:16 apinger: alarm canceled: GW_WAN(###) *** delay ***

The graphs changed at the same time as the first "alarm canceled" event, so it looks like it's related to the alarm clearing rather than triggering.

epimeteo

I have the same problem: a VSAT WAN (which should always be above 500ms) and a WiMAX link (which should always be above 120ms) are showing 20/10/0ms after the 2.2 upgrade (well, a fresh installation).