Best practices for apinger, gateway monitoring / DNS
I have a question about how you all set up gateway monitoring + DNS on multi-WAN failover setups. I'm having a hard time finding settings that work well in most situations. The thing is, many ISPs deliver service via some CPE device where the device itself is the on-premises "gateway", so the default gateway monitoring is essentially useless: you're monitoring the local equipment and not the upstream circuit. So e.g. you have a T-1 line and it goes down, but the ISP's Cisco router sitting in the IT closet is still responding to pings, so pfSense never fails over.
To work around this, I have tried using DNS servers as the monitoring IP (override) e.g. Google DNS 188.8.131.52 or OpenDNS 184.108.40.206 but this has a side effect that if that DNS server has issues, it might cause the WAN to go down or flap needlessly. The other problem is, if you want to also use 220.127.116.11 as the system DNS server, it doesn't work right since entering that as a gateway monitor IP creates a static route binding traffic to just one of the WAN uplinks.
In my head I think the best solution would be to allow multiple IPs (e.g. a set of 3) for gateway monitoring. Ping all of them and only mark a WAN circuit "down" if they ALL fail. This would solve the single point of failure situation above. I am not sure there's a good "solution" here with the current tools we have. Just curious what others are doing, in case I'm missing an obvious solution.
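To make the "only down if they ALL fail" idea concrete, here is a minimal Python sketch. It is not how apinger or pfSense works; `ping_once` and `gateway_is_down` are hypothetical names, and it assumes a Unix-style `ping` binary that accepts `-c 1`.

```python
import subprocess
from typing import Callable, Iterable


def ping_once(host: str, timeout_s: float = 2.0) -> bool:
    """Return True if one ICMP echo to host gets a reply (assumes a Unix-style ping)."""
    try:
        # -c 1 sends a single probe; the subprocess timeout bounds the wait,
        # because ping's own timeout flag differs between platforms.
        result = subprocess.run(
            ["ping", "-c", "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def gateway_is_down(
    monitor_ips: Iterable[str],
    probe: Callable[[str], bool] = ping_once,
) -> bool:
    """Mark the WAN down only when every monitor IP fails its probe."""
    targets = list(monitor_ips)
    if not targets:
        raise ValueError("at least one monitor IP is required")
    # any() stops at the first target that answers, so one healthy
    # target is enough to keep the gateway marked up.
    return not any(probe(ip) for ip in targets)
```

Called as `gateway_is_down(["192.0.2.1", "192.0.2.2", "192.0.2.3"])` (documentation placeholder addresses), it returns True only if all three fail. On a real multi-WAN box each probe would also have to be bound to the WAN being tested, which this sketch does not do.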
See this feature request: https://redmine.pfsense.org/issues/1189
IMO your analysis is correct - just needs to be implemented in code one day.
This is a good idea even when pfSense has a public IP on it and a single WAN connection. It is quite possible for the ISP gateway to stop responding to pings, or to become so sluggish in answering them that they exceed the timeout, while traffic through the ISP gateway still works and the connection is therefore still up. If you monitor an IP past your ISP gateway, you would not go down, since that IP still answers.
It is also quite possible that the upstream IP you picked, be it a DNS server or not, is itself down or no longer answering pings. Having multiple IPs to monitor would make false "down" events less likely.
That feature request is like 4 years old ;) So is this something slated for 2.2? or 2.2.1 or 2.3?
It won't happen in 2.2. But after 2.2 is released I am happy to work on an implementation, because I would really like such a feature, the way internet and ISPs routing and… goes up, down and all around here in Nepal.
Phil, that would be an incredible enhancement - I really hope you'll find some time to work on this!
Phil, now that 2.2.x has been out and starting to settle down a bit, do you still think this is something that you would be open to working on? I want to start a bounty to help you with the effort.
I had a look at the feature request again and added another comment to see if enhancing this using PHP scripts would be acceptable. Given the current issues with apinger, and possibilities of a replacement utility being written from scratch, there is not much point modifying the existing apinger code to do the multiple monitor IPs thing.
I don't care about the bounty - if something comes of this then just buy some gifts from the gift catalog in my signature :)
Thanks - I read your comment and it certainly makes sense. I didn't know a replacement or rewrite of apinger was even being considered, but that is good news. I would certainly be willing to help fund that effort if others are involved. In the meantime, if there is a kludge you can hack together using PHP + the existing binaries, then I would of course be grateful for such a thing.
for a new and reliable Apinger with 2 or more checked IPs
I also am willing to fund this.
Blast from the past! Digging up this old thread.
Now that 2.3 is around the corner and apinger is being replaced with dpinger it would be awesome if the multiple-target monitoring method could be implemented for multi-wan. To state again: having a single host's uptime as the deciding factor determining whether a gateway gets marked down is not ideal — 3-5 monitor IPs would be more robust.
Actually, screw pings. ISPs often mess with ICMP packets, so there should be an option to do a DNS check instead of pings! Very much needed.
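For what a DNS check instead of a ping could look like, here is a rough Python sketch using only the standard library. It is not an existing pfSense or dpinger option; `build_dns_query`, `is_valid_reply` and `dns_check` are hypothetical names. It sends one A-record query over UDP and treats any response with a matching ID as "up".

```python
import secrets
import socket
import struct


def build_dns_query(name: str, query_id: int) -> bytes:
    """Build a minimal DNS query for an A record (recursion desired)."""
    # Header: ID, flags (RD bit set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME is a sequence of length-prefixed labels ending with a zero byte
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN)
    return header + qname + struct.pack("!HH", 1, 1)


def is_valid_reply(reply: bytes, query_id: int) -> bool:
    """Accept any DNS response (QR bit set) whose ID matches the query."""
    if len(reply) < 12:
        return False
    reply_id, flags = struct.unpack("!HH", reply[:4])
    return reply_id == query_id and bool(flags & 0x8000)


def dns_check(server: str, name: str = "example.com", timeout_s: float = 2.0) -> bool:
    """Return True if server answers a DNS query within the timeout."""
    query_id = secrets.randbelow(0x10000)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout_s)
        try:
            sock.sendto(build_dns_query(name, query_id), (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # includes timeouts and unreachable-network errors
            return False
    return is_valid_reply(reply, query_id)
```

This only proves the server answered something. It does not check the response code, and like a ping it says nothing about which WAN the query left through unless the socket is bound to that interface's address.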
I put this same question to myself and after some time I decided to use 18.104.22.168.
I never had 22.214.171.124 going down (never noticed that to be precise).
Getting ICMP replies with Google's level of reliability is, I think, close to the best you can achieve. How can that not be enough?
I agree with whoever said that another protocol should be available as a fallback; ICMP alone is not enough to say the internet is reachable.
On that note, I'd like to add one thing: I monitor my VPN by pinging a DNS server inside it, and it reports under 2 ms, which seems impossible. How could this happen?
Any second thoughts about adding this? Sending the fail signal only when multiple points are down could be useful, but we really want it for historical quality reporting too: packet loss, latency… Really like the new easy-to-read std. deviation in 2.3. Nice.
We always have multiple points of reference: in the EU ISP network, our data center, our ISP and all possible interconnects between. Having this historical info for EU endpoints would be very helpful in a lot of ways.
Anyway, the first step toward multiple-failure confirmation is to have multiple monitoring!
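As a rough illustration of the per-target quality numbers mentioned above (packet loss, latency, standard deviation), here is a small Python sketch. `summarize_probes` is a hypothetical helper, not part of pfSense or dpinger, and it assumes each probe is recorded as a round-trip time in milliseconds or `None` when the probe was lost.

```python
import statistics
from typing import Optional, Sequence


def summarize_probes(rtts_ms: Sequence[Optional[float]]) -> dict:
    """Summarize one monitor target's probes; None marks a lost probe."""
    if not rtts_ms:
        raise ValueError("no probes recorded")
    replies = [r for r in rtts_ms if r is not None]
    lost = len(rtts_ms) - len(replies)
    return {
        "loss_pct": 100.0 * lost / len(rtts_ms),
        # Latency figures are undefined when every probe was lost
        "avg_ms": statistics.fmean(replies) if replies else None,
        "stddev_ms": statistics.pstdev(replies) if replies else None,
    }
```

Keeping one such summary per monitor IP per interval would give the historical view for each reference point, separately from whatever rule decides that the gateway is down.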