Netgate Discussion Forum

Apinger only working on wan 8/6/13 64bit snapshot

Scheduled Pinned Locked Moved 2.1 Snapshot Feedback and Problems - RETIRED
54 Posts 14 Posters 19.3k Views
  • D
    doktornotor Banned
    last edited by Aug 7, 2013, 6:08 PM

    @phil.davis:

    I had upgraded 3 systems to Aug 6 snapshot - the 2 with multi-gateways had these symptoms. The 1 with only 1 gateway did not have a problem. Not a big enough sample size to decide if multiple gateways being monitored is the real trigger for the "feature". Will report tomorrow if I see the latency numbers go silly again.

    Well, if IPv6 tunnel counts as multigateway, then count me in.

    • J
      jimp Rebel Alliance Developer Netgate
      last edited by Aug 8, 2013, 12:36 PM

      It seems to be that the longer the delay to the gateway, the more likely there is to be a problem compounded over time.

      I set one of my gateways in a VM last night to 8.8.8.8 and within an hour it was into the thousands of ms in delays when in reality it was ~50ms.

      There is also still an issue with changing monitor IPs requiring a manual restart of apinger.

      Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

      Need help fast? Netgate Global Support!

      Do not Chat/PM for help!

      • D
        doktornotor Banned
        last edited by Aug 8, 2013, 12:54 PM

        @jimp:

        It seems to be that the longer the delay to the gateway, the more likely there is to be a problem compounded over time.
        I set one of my gateways in a VM last night to 8.8.8.8 and within an hour it was into the thousands of ms in delays when in reality it was ~50ms.

        Pretty much same here. If I really use the real GW, it does not happen. However monitoring the real GW is rather useless for me, I need to monitor real internet connectivity, not a device a couple of meters away from the firewall.

        • T
          traxanos
          last edited by Aug 8, 2013, 1:43 PM

          We have the same problem. Pinging the default gateway is normal (1.3 ms). When we ping the next-hop gateway, we get a ping of over 600 ms, while a ping from the terminal shows 1.4 ms. After a restart of apinger the values are correct again.

          • T
            traxanos
            last edited by Aug 9, 2013, 9:43 AM

            Hi, we have now used the gateway IP for monitoring and we have the problem on the backup firewall, too. The ms values stack up over time: it starts at 1 ms and after some hours apinger reports over 2000 ms. After restarting apinger, it starts at 1 ms again and the value climbs over time.

            • G
              ggzengel
              last edited by Aug 9, 2013, 11:08 PM

              My (calculated) ping times are growing, too.
              My RRD graphs from the last 6 months now show as less than 1 pixel.

              • D
                DrCain
                last edited by Aug 12, 2013, 12:05 AM

                Same problem here.

                Is it possible to cut out the affected part of the RRD graph?

                • M
                  mastahfr
                  last edited by Aug 12, 2013, 12:29 AM

                  I have a strong feeling my problem is related: https://redmine.pfsense.org/issues/3138
                  Multi-WAN has been FUBAR since I switched from RC0 to RC1 a couple of days ago.

                  I can also confirm that the ping latency increases along a linear curve.
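One way to check for the linear growth described above is to fit a straight line to the reported latency over time. The following is only a rough sketch: `latency_slope` is a hypothetical helper, and the `(seconds, milliseconds)` sample format is an assumption, not something apinger or pfSense exports in this form.

```python
def latency_slope(samples):
    """Return the least-squares slope (ms per second) of latency over time.

    `samples` is a list of (timestamp_seconds, latency_ms) pairs. This input
    format is assumed for illustration only.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_l = sum(l for _, l in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("timestamps must not all be equal")
    cov = sum((t - mean_t) * (l - mean_l) for t, l in samples)
    return cov / var_t


# Made-up readings: latency climbing by about 600 ms per hour.
growing = [(0, 1.0), (3600, 601.0), (7200, 1201.0)]
# Made-up readings: a healthy gateway at a steady ~50 ms.
steady = [(0, 50.0), (60, 50.0), (120, 50.0)]

slope_growing = latency_slope(growing)
slope_steady = latency_slope(steady)
```

A slope that stays clearly above zero for hours while a manual ping from the shell remains flat would match the symptom reported in this thread.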

                  • N
                    NOYB
                    last edited by Aug 12, 2013, 1:22 AM

                    @DrCain:

                    Is it possible to cut out the affected part of the RRD graph?

                    You could export the RRD to XML, edit the XML to reset the values of the affected part of the graph, then import the XML back to RRD.

                    Export / Import RRD Database
                    /usr/local/bin/rrdtool dump rrddatabase xmldumpfile
                    /usr/local/bin/rrdtool restore -f xmldumpfile rrddatabase
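The edit step between the dump and the restore can be scripted. Below is a minimal sketch, assuming the XML dump stores each data point as a `<v>…</v>` element inside `<row>` elements. `clip_spikes` and the threshold are hypothetical names chosen for illustration. The function replaces every value above the threshold with `NaN`, so the bogus readings show up as gaps instead of spikes. It applies the same threshold to every data source in the file, and the threshold must be in the same unit the RRD stores, so check a few rows of the dump by hand first.

```python
import re

# Matches one data point in a dumped RRD row, e.g. <v>2.3e+00</v>.
_VALUE = re.compile(r"<v>\s*([^<\s]+)\s*</v>")


def clip_spikes(xml_text, max_value):
    """Return xml_text with every <v> value above max_value set to NaN."""

    def fix(match):
        try:
            value = float(match.group(1))
        except ValueError:
            # Leave anything that is not a number untouched.
            return match.group(0)
        if value > max_value:
            return "<v>NaN</v>"
        return match.group(0)

    return _VALUE.sub(fix, xml_text)


# Tiny made-up fragment in the shape of a dump row.
sample = "<row><v>1.5e-03</v><v>2.4e+00</v><v>NaN</v></row>"
cleaned = clip_spikes(sample, max_value=1.0)
```

To use it on a real dump, read the file produced by `rrdtool dump`, pass the text through `clip_spikes`, write the result to a new file, and hand that file to `rrdtool restore -f`. Keep a copy of the original RRD before restoring.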

                    • P
                      phil.davis
                      last edited by Aug 14, 2013, 10:39 AM

                      I had upgraded a multi-WAN site from 6 Aug 16:41:59 EDT 2013 to the latest snapshot yesterday (so I guess it would have been about a 12 Aug snapshot).
                      The 6 Aug snapshot was the one when apinger was added to the Services Status list, and apinger started counting up big numbers in the latency field. I was hoping that the later snap would fix everything.
                      The site was remote from me, and reported "no/intermittent internet". It did seem that OpenVPN links to it were coming and going. I couldn't get on to it long enough to see anything real. From the descriptions, it was probably constantly failing over from 1 gateway to the other and back, and/or thinking that both gateways were down…
                      I got them to switch slices and reboot, so it is back on 6 Aug snapshot. When I logged in just now the latency figures on the OPT1 gateway were showing silly high numbers. I have disabled gateway monitoring on both gateways, and things have stabilised. For the moment, there will be no auto-failover at this site.
                      Unfortunately I can't give any better information, and for obvious reasons I don't want to roll forward at this site just now!
                      How are the apinger changes going? Do others have multi-WAN test systems that can be used as guinea-pigs?

                      As the Greek philosopher Isosceles used to say, "There are 3 sides to every triangle."
                      If I helped you, then help someone else - buy someone a gift from the INF catalog http://secure.inf.org/gifts/usd/

                      • J
                        jimp Rebel Alliance Developer Netgate
                        last edited by Aug 14, 2013, 12:03 PM

                        I have four gateways on three interfaces on a test VM and it was OK there, but they aren't "real" WANs.

                        Can you give any more information about your exact gateway config there?

                        • P
                          phil.davis
                          last edited by Aug 14, 2013, 12:36 PM

                          WAN - DHCP, attached to a WiMax device that has its own private IP and NATs out to internet. (Gets an address 10.1.1.x from the WiMax DHCP server)
                          OPT1 - static private IP to a TP-Link ADSL router, which again NATs out to the real internet.

                          WANGW - Monitor IP 8.8.8.8 - latency thresholds 4000 to 5000ms - packet loss thresholds 40 to 50% - probe interval 2 sec - down 30 sec.

                          OPT1GW - Monitor IP 8.8.4.4 - latency thresholds 4000 to 5000ms - packet loss thresholds 40 to 50% - probe interval 2 sec - down 30 sec.

                          These connections have reasonably high latency normally, and when saturating the links with downloads the latency would normally go high, hence the wacky high gateway monitoring parameters to prevent gateways from being declared down when they are in fact "working".
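To make the effect of those parameters concrete, here is a simplified model of how a pair of low/high thresholds could classify a gateway. It is an illustration only, not apinger's actual logic: `gateway_status` and `probes_before_down` are hypothetical helpers, the use of `>=` at the boundaries is an assumption, and the defaults are simply the values quoted in this post.

```python
def gateway_status(latency_ms, loss_pct,
                   latency_low=4000, latency_high=5000,
                   loss_low=40, loss_high=50):
    """Classify a gateway from its measured latency and packet loss.

    Assumed model: crossing a low threshold raises a warning, crossing a
    high threshold marks the gateway down.
    """
    if latency_ms >= latency_high or loss_pct >= loss_high:
        return "down"
    if latency_ms >= latency_low or loss_pct >= loss_low:
        return "warning"
    return "online"


def probes_before_down(probe_interval_s=2, down_s=30):
    """Number of probe intervals that fit into the down window."""
    return down_s // probe_interval_s


saturated_link = gateway_status(latency_ms=2300, loss_pct=5)
inflated_reading = gateway_status(latency_ms=5200, loss_pct=0)
window = probes_before_down()
```

With these settings a saturated link at 2300 ms still counts as online, which is the intent of the high thresholds. A latency figure that keeps growing without bound, as reported in this thread, will eventually cross 5000 ms regardless of how generous the thresholds are.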

                          Unfortunately I can't tell the exact symptoms, since it was a phone call and instructions about how to go back. The CF card multi-slice thing is very useful. As per previous post, I do know that links were coming and going, as I observed OpenVPN site-to-site links establishing for a minute or so, then dropping out.

                          I am at another site with multi-WAN at the moment. If I can gain a little confidence that apinger in the latest build is working OK and seems to be controlling failover OK, then I can upgrade here this evening and will be around to monitor it the next few days. This site is on a 31 Jul snap, which was before the recent apinger changes. So I will easily be able to switch back slices if needed. (I am not at home with a real test box)

                          • J
                            jimp Rebel Alliance Developer Netgate
                            last edited by Aug 14, 2013, 12:54 PM

                            I pulled up another VM that has a better multi-WAN config and it was still OK there.

                            Though when I was experiencing problems before the latest round of fixes, it was worse with high-latency gateways, so it's possible that the issue is compounded by the actual latency there. To reproduce it you may have to artificially induce the same level of latency.

                            • V
                              vielfede
                              last edited by Aug 14, 2013, 1:23 PM

                              @jimp:

                              I pulled up another VM that has a better multi-WAN config and it was still OK there.

                              Though when I was experiencing problems before the latest round of fixes, it was worse with high-latency gateways, so it's possible that the issue is compounded by the actual latency there. To reproduce it you may have to artificially induce the same level of latency.

                              Did you try to test failover?
                              As I state on this thread http://forum.pfsense.org/index.php/topic,65455.0.html, on RC1-20130812 failover does not work anymore (in my case).
                              Thanks
                              FV

                              • G
                                ggzengel
                                last edited by Aug 14, 2013, 1:31 PM

                                I have 2 pfSense systems with this problem.

                                1. pfsense:
                                2.1-RC1  (amd64)
                                built on Thu Aug 8 14:25:22 EDT 2013
                                FreeBSD 8.3-RELEASE-p9
                                1 WAN (0.4ms) (always green). The apinger shows 0ms which is wrong since update (pfsense1_WAN.png).
                                2 OpenVPN Server (23ms + 16ms) which have growing latencies. The corresponding clients at the other sides are green.

                                2. pfsense:
                                2.1-RC1  (amd64)
                                built on Wed Aug 7 20:59:21 EDT 2013
                                FreeBSD 8.3-RELEASE-p9
                                2 WANs: static WAN (1.4ms) + DSL (22ms). The DSL has growing latency. WAN shows less latency (pfsense2_WAN.png).
                                2 OpenVPN Server. Both have growing latency.
                                1 OpenVPN Client which has growing latency, too.

                                pfsense1_WAN.png
                                pfsense2_DSL.png
                                pfsense2_LAN.png
                                pfsense2_OVPN1.png
                                pfsense2_OVPN2.png
                                pfsense2_OVPN3.png
                                pfsense2_WAN.png

                                • J
                                  jimp Rebel Alliance Developer Netgate
                                  last edited by Aug 14, 2013, 1:33 PM

                                  Those snapshots are known to have apinger issues; upgrade to a current snapshot.

                                  • G
                                    ggzengel
                                    last edited by Aug 14, 2013, 1:33 PM

                                    I forgot to write:
                                    The LAN shows strange values, too.

                                    • J
                                      jimp Rebel Alliance Developer Netgate
                                      last edited by Aug 14, 2013, 1:43 PM

                                      @vielfede:

                                      @jimp:

                                      I pulled up another VM that has a better multi-WAN config and it was still OK there.

                                      Though when I was experiencing problems before the latest round of fixes, it was worse with high-latency gateways, so it's possible that the issue is compounded by the actual latency there. To reproduce it you may have to artificially induce the same level of latency.

                                      Did you try to test failover?
                                      As I state on this thread http://forum.pfsense.org/index.php/topic,65455.0.html, on RC1-20130812 failover does not work anymore (in my case).
                                      Thanks
                                      FV

                                      It does appear as though the filter reload at the end of the apinger event isn't doing what it should there. I'll need to run some more tests to narrow it down though.

                                      • G
                                        ggzengel
                                        last edited by Aug 14, 2013, 1:49 PM

                                        I updated pfsense1.
                                        During the first few minutes I didn't see growing latencies.
                                        But WAN still shows 0 ms in the RRD, which is less than the real 0.4 ms.

                                        • J
                                          jimp Rebel Alliance Developer Netgate
                                          last edited by Aug 14, 2013, 2:17 PM

                                          The lack of failover working seems to be this:
                                          http://redmine.pfsense.org/issues/3146

                                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.