Clarification on gateway thresholds - apinger

luckman212

Down == above the defined thresholds you have on the gateway for what should be considered down.

Chris, can I ask for a little clarification on this?

Basically, with the default values for apinger (200/500 ms latency, 10-20% packet loss, 1 s probe interval, and 10 s down as shown above), is it correct to interpret this as follows:

A probe is sent to the gateway every 1 second
In a sliding window of the last 10 probes, if they all come back with >500ms of latency, a latency alarm condition will be triggered
The latency alarm will continue until latency returns to below 200ms for 10 seconds
Once all of the last 10 probes in the sliding window have a latency below 200ms, the alarm will be canceled
In a sliding window of the last 10 probes, if 20% of these 10 probes are lost (2 or more in this example) a packet loss alarm condition will be triggered
The packet loss alarm will continue until at least 9 out of every 10 probes in the sliding window are returned successfully
Once fewer than 10% of the probes in the last 10-second window (is that 0 or 1?) have been lost, the alarm will be canceled
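
To make the interpretation above concrete, here is a rough Python sketch of how I picture it. This is only a model of my assumptions, not apinger's actual code: the class name `WindowAlarm`, its parameters, and the choice to clear the loss alarm at "10% or less" are all made up for illustration.

```python
from collections import deque


class WindowAlarm:
    """Hypothetical model of the sliding-window reading described above."""

    def __init__(self, window=10, lat_low=200, lat_high=500,
                 loss_low=10, loss_high=20):
        # Each entry is a latency in ms, or None for a lost probe.
        self.probes = deque(maxlen=window)
        self.lat_low, self.lat_high = lat_low, lat_high
        self.loss_low, self.loss_high = loss_low, loss_high
        self.latency_alarm = False
        self.loss_alarm = False

    def add(self, latency_ms):
        """Record one probe result and re-evaluate both alarms."""
        self.probes.append(latency_ms)
        if len(self.probes) < self.probes.maxlen:
            return  # assumption: nothing is evaluated until the window is full

        answered = [p for p in self.probes if p is not None]
        lost = len(self.probes) - len(answered)
        loss_pct = 100 * lost / len(self.probes)

        # Latency: raise when every answered probe is above the high mark,
        # clear when every answered probe is below the low mark.
        if answered and all(p > self.lat_high for p in answered):
            self.latency_alarm = True
        elif answered and all(p < self.lat_low for p in answered):
            self.latency_alarm = False

        # Loss: raise at or above the high mark. Clearing at "<=" the low
        # mark means 1 lost probe out of 10 is enough to clear; a strict "<"
        # would require 0 lost. Which one applies is exactly my question.
        if loss_pct >= self.loss_high:
            self.loss_alarm = True
        elif loss_pct <= self.loss_low:
            self.loss_alarm = False
```

With the defaults, ten probes at 600 ms raise the latency alarm, and it only clears once all ten probes in the window are under 200 ms.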

It does not make sense to have a packet loss threshold of e.g. 5%-10% if your "Down" number is 10, because if even 1 packet is lost, you are already at 10% (high water mark). Essentially the formula for the minimum level of granularity you can have between the low- and high-water mark would be 100/Down. So if you set Down to 20 then you could use a 5% granularity e.g. 5% to 10%. If you set it to 100 then you could use a 1% granularity e.g. 7% to 9%. The downside to this is you wait longer for alarms to fire.
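
The arithmetic behind that can be checked with a few lines of Python. Again this is just my reasoning written out, assuming the loss percentage is computed over a window of `down` probes; the function names are hypothetical.

```python
def loss_granularity(down):
    """Smallest loss step (in percent) that a window of `down` probes can show."""
    return 100 / down


def lost_probes_needed(threshold_pct, down):
    """Fewest lost probes in the window that reach `threshold_pct` loss."""
    # Integer ceiling of threshold_pct * down / 100.
    return -(-threshold_pct * down // 100)


def thresholds_distinct(low_pct, high_pct, down):
    """True if the low and high marks map to different lost-probe counts."""
    return lost_probes_needed(low_pct, down) != lost_probes_needed(high_pct, down)
```

For example, `thresholds_distinct(5, 10, 10)` is false because a single lost probe already reaches both marks, while `thresholds_distinct(5, 10, 20)` and `thresholds_distinct(7, 9, 100)` are true.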

Also, for the "Down" value, where it says "The number of seconds of failed probes…", is that really what it means, or does it simply mean "The number of failed probes…"?

Is that all correct?

luckman212

Sorry to bump the thread but - can anyone confirm if my assumptions above are correct?

PF64

I'd like to know as well.

I get quite a few:

send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr xxxx…..

luckman212

This is a pretty old thread. I would start a new one if you have questions, because I would assume all past assumptions are out the window now that apinger has been replaced by dpinger.

hda

@PF64:

I get quite a few:
send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 10….

Does the triggering coincide with your ISP lease renewal? That's what I see at my place (via PPPoE). Dpinger 2.3.