Dpinger - Probe & Alert Interval Relationship

NOYB

https://pfsense.home/system_gateways_edit.php

Advanced - Additional information

"The alert interval must be greater than or equal to the probe interval. There is no point checking for alerts more often than probes are done."

Perhaps, if the two are coordinated events. But if they are not …

Example:

Probe interval 10 seconds
Alert interval 10 seconds

t0 minus 1 alert check
t0 probe fail
t9 alert check
t10 probe
t19 alert check

Wouldn't it be better to receive the alert sooner than the probe interval?
Maybe like this.

Probe interval 10 seconds
Alert interval 5 seconds

t0 minus 1 alert check
t0 probe fail
t4 alert check
t9 alert check
t10 probe
t14 alert check
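
The two timelines above can be checked with a few lines of Python. This is only a sketch of the reasoning in this post, assuming the alert check is an independent timer that first fires at t = -1 as in the examples. It says nothing about how dpinger actually schedules things, and `first_check_after` is a made-up helper name.

```python
def first_check_after(event_time, first_check, alert_interval):
    """Return the time of the first alert check at or after event_time."""
    t = first_check
    while t < event_time:
        t += alert_interval
    return t

fail_time = 0  # the probe fails at t0

# Seconds between the failed probe and the next alert check.
delay_10 = first_check_after(fail_time, -1, 10) - fail_time  # alert interval 10 s
delay_5 = first_check_after(fail_time, -1, 5) - fail_time    # alert interval 5 s

print(delay_10, delay_5)
```

With the uncoordinated timers assumed here, the shorter alert interval notices the failure at t4 instead of t9.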

dennypage

The information section could possibly offer some additional guidance. While it's generally true that there is no point in checking for alerts more often than echo requests are sent, it's actually a bit more complicated than that.

Dpinger works quite differently than apinger or smokeping. Rather than a "probe" that fires off and processes a handful of echo request/replies all at once, dpinger maintains a rolling array of echo requests spaced on the send interval. In other words, instead of waking up every second and sending 4 echo requests at once, dpinger sends an echo request every 250 milliseconds. When dpinger receives an echo reply, the time difference between the request packet and reply packet (latency) is recorded. But there is nothing that records a reply/request as permanently lost.

When the alert check is made, or a report is generated, dpinger goes through the array and examines each echo request. If a reply has been received, it is used as part of the overall latency calculation. If a reply has not yet been received, the amount of time since the request was sent is compared against the loss interval. If it is greater than the loss interval, the request/reply is counted as lost in the current report. But this is not a permanent decision. In subsequent reports, if the missing reply has since been received, its latency will be used instead of it being counted as lost.
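
To illustrate the pass over the array described above, here is a rough Python sketch. It is not the dpinger source (dpinger is written in C); the `Slot` type, the `report` function and all field names are invented for illustration, and computing loss as lost / (replied + lost) is an assumption on my part.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Slot:
    sent_at: float                   # when the echo request was sent (ms)
    latency: Optional[float] = None  # filled in when the reply arrives (ms)

def report(slots: List[Slot], now: float, loss_interval: float) -> Tuple[float, float]:
    """Return (average latency in ms, loss in percent) for the array as it is right now."""
    latencies = []
    lost = 0
    for s in slots:
        if s.latency is not None:
            latencies.append(s.latency)       # reply received: counts toward latency
        elif now - s.sent_at > loss_interval:
            lost += 1                         # overdue: lost in *this* report only
        # otherwise the request is still pending and is ignored for now
    counted = len(latencies) + lost
    avg = sum(latencies) / len(latencies) if latencies else 0.0
    loss = 100.0 * lost / counted if counted else 0.0
    return avg, loss

slots = [Slot(0, 20.0), Slot(250), Slot(500, 30.0), Slot(2000)]
avg1, loss1 = report(slots, now=2100, loss_interval=1250)

# A very late reply for the second request shows up before the next report.
slots[1].latency = 1900.0
avg2, loss2 = report(slots, now=2350, loss_interval=1250)
```

In the first report the second request is counted as lost and the fourth is still pending. In the second report the late reply is used for latency and the loss goes back to zero, which is the "not a permanent decision" behaviour.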

It's important to keep in mind that latency and loss are reported as averages across the entire set. The default time period for dpinger is 30 seconds, with an echo request being sent every 250 milliseconds. This means that the latency and loss will be reported as averages across 115-120 samples. The alert check runs every second by default. So each time, the 4 oldest entries in the set have been replaced by the 4 newest ones.
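
The arithmetic behind those numbers, as a small Python sketch. The time period, send interval and alert interval are the defaults quoted above. The 1250 ms loss interval is an assumption that is consistent with the 115-120 range; the variable names are only illustrative.

```python
time_period_ms = 30_000     # default time period quoted above
send_interval_ms = 250      # default send interval quoted above
alert_interval_ms = 1_000   # alert check runs every second by default
loss_interval_ms = 1_250    # ASSUMPTION: not stated in this post

# Total echo requests held in the rolling array.
slots_in_array = time_period_ms // send_interval_ms

# Requests younger than the loss interval may still be waiting for a reply,
# so only the older ones are guaranteed to count as either replied or lost.
settled_samples = (time_period_ms - loss_interval_ms) // send_interval_ms

# Entries replaced between two consecutive alert checks.
replaced_per_check = alert_interval_ms // send_interval_ms

print(settled_samples, slots_in_array, replaced_per_check)
```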

Btw, if you want to have accurate loss reporting, it is important that the number of samples be sufficient. If you want to achieve 1% loss resolution, you need more than 100 samples in the set. The actual calculation for loss resolution is:

100 * send_interval / (time_period - loss_interval)

With the default settings, dpinger reports loss with a resolution of 0.87%.
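
The formula above as a Python sketch, for anyone who wants to try other values. The function name is made up, and the 1250 ms loss interval is an assumption (it is the value that gives the 0.87% figure with a 250 ms send interval and a 30 second time period).

```python
def loss_resolution(send_interval_ms, time_period_ms, loss_interval_ms):
    """Smallest step in reported loss, in percent, per the formula above."""
    return 100 * send_interval_ms / (time_period_ms - loss_interval_ms)

# Defaults from this thread; loss interval of 1250 ms is ASSUMED.
default_res = loss_resolution(250, 30_000, 1_250)

# Doubling the send interval halves the sample count and doubles the step size.
slower_res = loss_resolution(500, 30_000, 1_250)

print(round(default_res, 2), round(slower_res, 2))
```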

I would generally recommend staying with the default values unless you are monitoring high latency links.

phil.davis

After reviewing the dpinger source code and doing some associated changes/fixes to the pfSense GUI front-end, I looked at the previous text explanation of the parameter relationships, added some validation checks in the GUI for stuff that I thought "must be so in any use case I can think of", and then wrote that text explanation based on what I observed.

If anyone has later thoughts about extra use cases that should be allowed or better wording of the explanation… feel free to make suggestions and PRs...