Netgate Discussion Forum

    An earnest appeal - please do fix APINGER in 2.2

    2.2 Snapshot Feedback and Problems - RETIRED
    • jimpJ
      jimp Rebel Alliance Developer Netgate
      last edited by

      The problem is that none of us can reproduce this on demand in an environment we control and can debug. Yes, we know some people have problems, but they don't affect the majority of users.

      Multi-WAN works fine here, and for many others. Ermal has been working on fixing this up but it's been a long process since the exact parameters to reproduce the problem have never been clearly identified or replicated.

      The notifications are a separate issue entirely; that's not slated to be cleaned up in 2.2, but sometime afterward.

      Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

      Need help fast? Netgate Global Support!

      Do not Chat/PM for help!

      • P
        pubmsu
        last edited by

        Thanks for your update jimp - we at least now know the difficulty in fixing it.

        If you want, I can help you with access to an otherwise good multi-WAN test environment that has this issue, which is an exact replication of our production environment. You can let me know in PM if this will help.

        We have been using pfSense for the last 7 years, I guess, and really need this to be resolved.

        • jimpJ
          jimp Rebel Alliance Developer Netgate
          last edited by

          Getting access wouldn't help as much as definitively identifying the specific condition leading to the problem, if possible (e.g. a latency over X for Y amount of time, or Z gateways with Q latency, etc.).

          Depending on how long it takes for the problem to repeat, it could still be difficult for us to find time to watch it closely enough to find when exactly the problem starts.
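
One way to pin down a "latency over X for Y amount of time" condition is to log probe results yourself and scan them afterwards. Below is a minimal Python sketch, assuming you already collect (timestamp in seconds, RTT in ms) samples from your own ping logging, with `None` standing for a lost probe. The function name and sample format are hypothetical and are not part of apinger or pfSense.

```python
from typing import Iterable, Optional, Tuple


def first_sustained_breach(
    samples: Iterable[Tuple[float, Optional[float]]],
    threshold_ms: float,
    min_duration_s: float,
) -> Optional[float]:
    """Return the start time of the first run in which every probe was
    either lost (RTT is None) or slower than threshold_ms, and the run
    lasted at least min_duration_s. Return None if no such run exists."""
    run_start: Optional[float] = None
    for ts, rtt_ms in samples:
        breached = rtt_ms is None or rtt_ms > threshold_ms
        if not breached:
            run_start = None  # a good probe ends the current run
            continue
        if run_start is None:
            run_start = ts  # first bad probe of a new run
        if ts - run_start >= min_duration_s:
            return run_start
    return None


# Hypothetical samples: one probe per second, None means no reply.
samples = [(0, 20.0), (1, 250.0), (2, 30.0), (3, 300.0),
           (4, None), (5, 400.0), (6, 25.0)]
print(first_sustained_breach(samples, threshold_ms=200, min_duration_s=2))
```

Running this over the period before apinger got stuck, with a few different X and Y values, might help narrow down which condition lines up with the first bad reading on the quality graph.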

          • P
            pubmsu
            last edited by

            Got it Jim. In our case though the problem starts within 5 to 10 minutes.

            • jimpJ
              jimp Rebel Alliance Developer Netgate
              last edited by

              Seeing the quality graph for your gateways may help as well, with notes about where apinger was restarted and where the problem was first noticed.

              I tried to artificially induce latency by putting one firewall in front of another and increasing the delay on a limiter for ICMP traffic. Each time, I let it run for 15+ minutes at various latencies and then lifted the limiter. Every time, it bounced back to close to 0 for me. I never saw it get stuck, so there must be a few different factors at work making it get stuck over time for others.

              • A
                athurdent
                last edited by

                Maybe this is something to consider: I never had any problems with my setup running NanoBSD for the last year. I switched to a full install recently (CF died, bought an SSD) and now I am seeing packet loss steadily increasing for my HENet tunnel. 120% packet loss ATM; uptime of the firewall is 4 days. I am pinging the same IPv6 host via Smokeping from a Linux host behind the pfSense GW, and the graphs look a little different. This is on 2.1.4, not on 2.2.

                [Attachments: Screenshot_1.png, status_rrd_graph_img.png, tunnel-endpoint_last_864000.png]

                • J
                  Jeremy11one
                  last edited by

                  I have this problem too. Apinger reports that my WAN connection keeps going up and down several times every hour. It started a few months ago. I have not switched ISPs or anything. I installed the latest snapshot (built on Mon Jul 28 12:22:20 CDT 2014) and still have the problem.

                  I do not use multiple WANs. Just one.

                  • R
                    ridnhard19
                    last edited by

                    I had this same issue as well and ended up entering the cable modem's local/private IP address (192.168.100.1) into the config, and that was the workaround I used. It doesn't do anything for monitoring the connection, but at least it's no longer constantly bouncing the connection up and down.

                    • J
                      Jeremy11one
                      last edited by

                      When you say "the config," do you mean the "Monitor IP"? My config was monitoring the default gateway IP, which is on the cable modem, and I still had the problem.

                      • G
                        georgeman
                        last edited by

                        I don't think this is related to the main issue described here, but I have observed similar behavior under high network load while using the traffic shaper: the ping probes are put in the default queue instead of the one specified by the floating rule on WAN that is supposed to handle them. This probably happens because apinger starts before the firewall itself, since killing the related states makes the probes go into the correct queue immediately.

                        If it ain't broke, you haven't tampered enough with it

                        • N
                          naras
                          last edited by

                          The issue still exists in recent builds in my testing environments.

                          • ?
                            Guest
                            last edited by

                            …and frequently results in tunnels (IPsec or OpenVPN) going down for no obvious reason, except for apinger freakin' out.

                            I increased the apinger alarm times significantly, which helps at least a little...
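
To illustrate why a longer alarm time can hide short dropouts, here is a minimal Python sketch of a simplified "down" alarm. It assumes the alarm fires whenever the gap between two consecutive probe replies reaches a configured number of seconds. This is a deliberately simplified model with a hypothetical function name and made-up timestamps, not apinger's actual algorithm.

```python
from typing import Sequence


def count_down_alarms(reply_times_s: Sequence[float], down_after_s: float) -> int:
    """Count the gaps between consecutive replies that last at least
    down_after_s seconds. Each such gap is treated as one "down" alarm."""
    alarms = 0
    for prev, cur in zip(reply_times_s, reply_times_s[1:]):
        if cur - prev >= down_after_s:
            alarms += 1
    return alarms


# Hypothetical reply timestamps (seconds): one 12 s gap and one 31 s gap.
replies = [0, 1, 2, 14, 15, 16, 47, 48]
for threshold in (10, 30, 60):
    print(threshold, count_down_alarms(replies, threshold))
```

In this model, raising the threshold only suppresses alarms for gaps shorter than the threshold; it does not fix whatever causes the missed replies, which would match the "helps at least a little" experience.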

                            • S
                              Supermule Banned
                              last edited by

                              I don't have these problems at all running 40+ pfSense boxes...

                              I use traceroute to monitor the wanted IP upstream to decide if the GW is down.

                              All are stable, currently running at 0% packet loss... No change from 2.0.X.

                              I don't like the idea of monitoring other external hosts that are not in your upstream environment. That way you don't get a real picture of your GW status.

                              [Attachment: Capture.PNG]

                              • Raul RamosR
                                Raul Ramos
                                last edited by

                                @Supermule:

                                I don't have these problems at all running 40+ pfSense boxes...

                                What's your config for the WAN interfaces? I see a lot of people write that they have problems, but they don't post their configs to help troubleshoot the problem.

                                Sometimes I have the problem on my multi-WAN setup (a PPPoE WAN with only the user and password configured, and a PPP (LTE) WAN with only the default number configured). I don't see the problem when I disconnect my PPP link.

                                pfSense:
                                ASRock -> Wolfdale1333-D667 (2GB TeamElite Ram)
                                Marvell 88SA8040 Sata to CF(Sandisk 4GB) Controller
                                NICs: RTL8100E (internal) and Intel® PRO/1000 PT Dual (Intel 82571GB)

                                • S
                                  Supermule Banned
                                  last edited by

                                  More or less the same for all 40+….

                                  [Attachment: Capture.PNG]

                                  • N
                                    naras
                                    last edited by

                                    @Supermule:

                                    I use traceroute to monitor the wanted IP upstream to decide if the GW is down.

                                    All are stable, currently running at 0% packet loss... No change from 2.0.X.

                                    I don't like the idea of monitoring other external hosts that are not in your upstream environment. That way you don't get a real picture of your GW status.

                                    Could you please tell us how to use traceroute to monitor the wanted IP upstream to decide if the GW is down?

                                    Multi-WAN with static IPs and a different gateway in each WAN subnet is stable, at least in my tests. But I use PPPoE connections, and we get the same gateway IP almost all the time, so we have to set at least one monitor IP outside the WAN subnet. The line with the outside monitor IP is always reported as offline by apinger, even though the connection works normally.

                                    If there is another way to monitor the gateway, that would really help.

                                    • S
                                      Supermule Banned
                                      last edited by

                                      http://ping.eu/traceroute/

                                      Use the first hop that's not in your WAN subnet.

                                      • N
                                        naras
                                        last edited by

                                        @Supermule:

                                        http://ping.eu/traceroute/

                                        Use the first hop that's not in your WAN subnet.

                                        So it's not done within pfSense, and not done automatically either?

                                        • S
                                          Supermule Banned
                                          last edited by

                                          I understand why you are confused…

                                          I use traceroute to monitor the wanted IP upstream to decide if the GW is down.

                                          I use traceroute to locate the IP to monitor and then use the built-in GW monitor tool in pfSense.

                                          Works fine here.
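
The "first hop that's not in your WAN subnet" selection can be sketched in Python like this. It is a minimal example, assuming you paste in the hop addresses from a traceroute you ran yourself. The function name is hypothetical, and the addresses are documentation-range placeholders, not real gateways.

```python
import ipaddress
from typing import Iterable, Optional


def pick_monitor_ip(hops: Iterable[str], wan_subnet: str) -> Optional[str]:
    """Return the first traceroute hop that lies outside wan_subnet,
    or None if every parsable hop is inside it."""
    net = ipaddress.ip_network(wan_subnet, strict=False)
    for hop in hops:
        try:
            addr = ipaddress.ip_address(hop)
        except ValueError:
            continue  # skip "*" timeouts and anything that is not an IP
        if addr.version != net.version:
            continue  # do not compare IPv4 hops against an IPv6 subnet
        if addr not in net:
            return str(addr)
    return None


# Placeholder hops as they might appear in traceroute output.
hops = ["203.0.113.1", "*", "198.51.100.1", "192.0.2.9"]
print(pick_monitor_ip(hops, "203.0.113.10/24"))  # first hop outside the WAN subnet
```

The result would then be entered by hand as the Monitor IP for that gateway; nothing here talks to pfSense itself.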

                                          • N
                                            naras
                                            last edited by

                                            @Supermule:

                                            I understand why you are confused…

                                            I use traceroute to monitor the wanted IP upstream to decide if the GW is down.

                                            I use traceroute to locate the IP to monitor and then use the built-in GW monitor tool in pfSense.

                                            Works fine here.

                                            OK, I did that several months ago, to no avail.
                                            The next-hop routers are always outside my WAN subnet :(

                                            Thanks anyway.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.