Netgate Discussion Forum

    iPhone IPsec working again

    2.1 Snapshot Feedback and Problems - RETIRED
    17 Posts 4 Posters 4.8k Views
    • jimpJ
      jimp Rebel Alliance Developer Netgate

      Not sure what it's not doing for you… I am on the latest snapshot, and I can connect right up with my phone and surf to things on the lan side.

      Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

      Need help fast? Netgate Global Support!

      Do not Chat/PM for help!

      • D
        daplumber

        @jimp:

        Not sure what it's not doing for you… I am on the latest snapshot, and I can connect right up with my phone and surf to things on the lan side.

        Would you mind forwarding me a copy of your settings? That way I can test to see if something else in my install is broken.

        ---------
        This user has been carbon dated to the 8-bit era...

        • D
          daplumber

          OK, so now with an update to:

          8.3-RELEASE-p2 FreeBSD 8.3-RELEASE-p2 #1: Tue Jun 5 23:58:17 EDT 2012 root@FreeBSD_8.3_pfSense_2.1.snaps.pfsense.org:/usr/obj./usr/pfSensesrc/src/sys/pfSense_SMP.8 i386

          It's working again. No config changes at all, and the same endpoint hardware and networks. What the <bleep> is going on? I didn't see anything to do with IPsec in the commits or activity for this last update.

          • rcfaR
            rcfa

            @daplumber:

            It's working again. No config changes at all, and the same endpoint hardware and networks. What the <bleep> is going on? I didn't see anything to do with IPsec in the commits or activity for this last update.

            Not sure if it's related, but after an upgrade I had to disable and re-enable IPsec and/or reboot the system a second time to get things working. After that, things seem to run fairly reliably, but when the system first comes up after an upgrade, IPsec usually doesn't work properly until I cycle it and/or reboot.

            • D
              daplumber

              This is getting insane. My anecdotal experience is that IPsec stops working every other update and then works again.

              Checkpoint: "8.3-RELEASE-p2 FreeBSD 8.3-RELEASE-p2 #1: Fri Jun 8 06:50:37 EDT 2012 root@FreeBSD_8.3_pfSense_2.1.snaps.pfsense.org:/usr/obj./usr/pfSensesrc/src/sys/pfSense_SMP.8 i386"

              is working again. The previous update wasn't. Bouncing the service seems to matter not one whit.

              • rcfaR
                rcfa

                Sorry to hear that bouncing the service doesn't work for you. I assume a second reboot after the install didn't do anything either?

                I have obviously slightly different issues with IPSec, one is, that when I pass massive amounts of traffic through the tunnel (easy, given that all my IPv4 traffic passes through that tunnel), it silently stops working: tunnel shows as up, gateways are up, etc. just traffic stops flowing (still trying to figure out how to debug that one). Things on the dashboard are indistinguishable from the working setup. Bounce the tunnel down and up, back to working condition, until the next time it happens. Tried the prefer older SA setting, too, no difference. ???

                Anyway, not trying to hijack your thread, just saying there are still some glitches somewhere in IPSec.

                • D
                  daplumber

                  Understood. I'm just wondering if there's any difference between the first snapshot build of the day and the second? The second is the one that seems to work for me. If no-one's working on it, it should be the same, right?  ;)  ::)

                  If you have a lot of traffic, have you checked to see if racoon is running out of some resource or maybe timing out somewhere? Generically speaking code under load can cause bugs to crawl out of the woodwork that may not otherwise show up, especially timing issues, resource allocation and cleanup, locks, and race conditions. One of the first test activities in a former life of mine as a tester is to ramp up the usage until something breaks. IMHO well-written "defensive" code should degrade gracefully, then refuse to service more requests and/or abort with a meaningful message about what resource was exhausted or error occurred.
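
A minimal sketch of the "ramp up until it breaks, then fail with a meaningful message" idea, in Python. This is not how racoon manages its resources; `BoundedPool`, `ResourceExhausted`, and `ramp_until_failure` are hypothetical names used only for illustration.

```python
import threading


class ResourceExhausted(Exception):
    """Raised when a named resource has no capacity left."""


class BoundedPool:
    """Hands out at most `capacity` slots and refuses further requests loudly."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self._in_use = 0
        self._lock = threading.Lock()

    def acquire(self):
        with self._lock:
            if self._in_use >= self.capacity:
                # Refuse instead of hanging or silently dropping the request.
                raise ResourceExhausted(
                    f"{self.name}: all {self.capacity} slots in use, request refused"
                )
            self._in_use += 1

    def release(self):
        with self._lock:
            if self._in_use > 0:
                self._in_use -= 1


def ramp_until_failure(pool, max_requests):
    """Add load one request at a time; return (served, error message or None)."""
    served = 0
    for _ in range(max_requests):
        try:
            pool.acquire()
        except ResourceExhausted as exc:
            return served, str(exc)
        served += 1
    return served, None
```

The point of the sketch is the failure mode: the caller gets an error that names the exhausted resource, rather than a tunnel that looks up while nothing flows.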

                  Programmers loathe testers, usually because they've finally got something to work after many hours and frustrations, and now: "This <bleep> wants the code to behave under abusive/crazy conditions?!" I was/am a very good tester. I can break any code; the point was that it should break only with a sufficiently high level of effort, and "go down screaming errors about the injustice of the insanity it is being subjected to."

                  The FreeBSD "fortunes" have always had some of the best quotes IMHO:

                  Osborn's Law:
                          Variables won't; constants aren't.

                  O'Toole's Commentary on Murphy's Law:
                          Murphy was an optimist.

                  Our OS who art in CPU, UNIX be thy name.
                          Thy programs run, thy syscalls done,
                          In kernel as it is in user!

                  (All starting with "O" for some reason…  ;D  )

                  • rcfaR
                    rcfa

                    @daplumber:

                    Understood. I'm just wondering if there's any difference between the first snapshot build of the day and the second? The second is the one that seems to work for me. If no-one's working on it, it should be the same, right?  ;)  ::)

                    One should think so…
                    ...but since you were quoting, here's one of my favorite ones:

                    The difference between theory and practice is, that in theory there is no such difference, but in practice, there is.

                    @daplumber:

                    If you have a lot of traffic, have you checked to see if racoon is running out of some resource or maybe timing out somewhere? Generically speaking code under load can cause bugs to crawl out of the woodwork that may not otherwise show up, especially timing issues, resource allocation and cleanup, locks, and race conditions. One of the first test activities in a former life of mine as a tester is to ramp up the usage until something breaks. IMHO well-written "defensive" code should degrade gracefully, then refuse to service more requests and/or abort with a meaningful message about what resource was exhausted or error occurred.

                    I understand what you say, I did a reasonable bit of software testing too, and I'm obviously still good at using things in a way that they break :D

                    I should explain, though, what I mean by "lots of traffic". Most of the day, the internet connection just sits there idle: here a page load on a web site, there an e-mail trickling in. We're talking about a few hundred e-mail messages per day, maybe a few hundred or low thousands of web pages visited. "Heavy traffic" is something like downloading a Mac OS X update disk image of, e.g., 1.4 GB, or streaming a Netflix movie.
                    So it's the raw data volume that's somewhat heavy, not the number of requests to racoon or such.
                    The "beauty" of it is that you look at the IPsec status page, the dashboard, etc., and everything looks fine and dandy. Just nothing is happening. Anything that brings the tunnel down and restarts it fixes it. The easiest way to do that is to toggle the "enable IPSec" checkbox on the IPsec page.

                    I guess I just have to keep my eyes peeled, and hopefully sooner or later I find that fried moth…

                    • W
                      wallabybob

                      @rcfa:

                      when I pass massive amounts of traffic through the tunnel (easy, given that all my IPv4 traffic passes through that tunnel), it silently stops working: tunnel shows as up, gateways are up, etc. just traffic stops flowing (still trying to figure out how to debug that one). Things on the dashboard are indistinguishable from the working setup.

                      Does data transfer recover after (say) 5 to minutes?

                      It might be worth doing a packet capture on the tunnel interface when it is in this state: maybe there is no traffic at all, maybe the only traffic is the two ends saying to each other "I'm here".

                      @rcfa:

                      Bounce the tunnel down and up, back to working condition, until the next time it happens.

                      Have you tried "less brutal" means, such as initiating a ping across the tunnel or starting a new TCP connection across the tunnel (e.g. to access a web page)?
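
The "start a new TCP connection across the tunnel" check can be sketched in Python as below. `tcp_probe` is a hypothetical helper, not a pfSense tool, and the host and port you pass are whatever is reachable only through the tunnel in your setup.

```python
import socket


def tcp_probe(host, port, timeout=3.0):
    """Return True if a new TCP connection to (host, port) completes in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks.
        return False
```

For example, `tcp_probe("192.0.2.10", 80)` (a placeholder address from the documentation range) would try a web server on the far side. Passing an IPv4 literal rather than a hostname keeps the probe on the IPv4 path, which matters if IPv6 connectivity exists alongside the tunnel.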

                      • rcfaR
                        rcfa

                        @wallabybob:

                        @rcfa:

                        when I pass massive amounts of traffic through the tunnel (easy, given that all my IPv4 traffic passes through that tunnel), it silently stops working: tunnel shows as up, gateways are up, etc. just traffic stops flowing (still trying to figure out how to debug that one). Things on the dashboard are indistinguishable from the working setup.

                        Does data transfer recover after (say) 5 to minutes?

                        Not that I'm aware of…
                        Here's a couple of typical modes of failure:
                        a) I return to my computer after a longer idle period, try to access a web page: nothing happens, eventually it times out with an error page. I notice Skype's off-line, too. So at this point, chances are that I'm catching it after having been in that state for a while...

                        b) I'm actively browsing the web, downloading something or another, suddenly the downloads "slow down" (which is of course just the effect of the browser calculating average download speed, when in reality the download just plain stops). Since there are speed fluctuations anyway, I may not notice, until I open another web page, or notice that Skype is off-line.
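
A small worked example of why a stalled download looks like a slowdown, assuming the progress display shows a cumulative average (total bytes divided by elapsed time). The numbers are made up for illustration.

```python
def average_rate(bytes_done, elapsed_seconds):
    """Cumulative average rate in bytes per second."""
    return bytes_done / elapsed_seconds


# Assumed scenario: 60 MB arrive in the first 60 s, then the transfer stalls.
bytes_done = 60_000_000
for elapsed in (60, 120, 240):
    print(elapsed, average_rate(bytes_done, elapsed))
# prints:
# 60 1000000.0
# 120 500000.0
# 240 250000.0
```

The byte count never changes after the stall, so the displayed rate halves every time the elapsed time doubles and only approaches zero gradually.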

                        @wallabybob:

                        It might be worth doing a packet capture on the tunnel interface when it is in this state: maybe there is no traffic at all, maybe the only traffic is the two ends saying to each other "I'm here".

                        Well, unless the SA monitor and the Dashboard are plain lying, the tunnel is up, so something is supposed to be working.

                        @wallabybob:

                        @rcfa:

                        Bounce the tunnel down and up, back to working condition, until the next time it happens.

                        Have you tried "less brutal" means, such as initiating a ping across the tunnel or starting a new TCP connection across the tunnel (e.g. to access a web page)?

                        Yep, see above.

                        The only thing that's somewhat "abnormal" about my setup is that this IPsec link is the IPv4 pseudo-default route. As far as pfSense is concerned, the default route is something else, i.e. the WAN interface, but since the remote network on the IPsec link is 0.0.0.0/0, it snarfs up all regular traffic. So maybe IPsec, which is usually used just for snarfing up a specific subnet's traffic, has some glitches that are exposed by my somewhat different use.
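
To illustrate why a remote network of 0.0.0.0/0 catches everything, here is a small Python sketch using the standard `ipaddress` module. `matches_selector` is a hypothetical helper; it only models the "does this destination fall inside the remote network" test, not how the kernel actually matches IPsec policies.

```python
import ipaddress


def matches_selector(destination, remote_network):
    """True if `destination` falls inside the tunnel's remote network."""
    return ipaddress.ip_address(destination) in ipaddress.ip_network(remote_network)


print(matches_selector("198.51.100.7", "0.0.0.0/0"))       # True: any IPv4 address matches
print(matches_selector("198.51.100.7", "192.168.1.0/24"))  # False: outside a typical subnet selector
print(matches_selector("2001:db8::1", "0.0.0.0/0"))        # False: IPv6 is not covered by an IPv4 network
```

The last line also shows why IPv6 traffic would bypass such a tunnel entirely.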

                        What has made things a bit more difficult since yesterday is that I now also have an IPv6 tunnelbroker interface, which is the default route for IPv6 traffic. So now I can have a mixed-mode situation where IPv6 works and IPv4 doesn't, and I'm not as quick to catch on when the IPsec tunnel acts up, because some things continue to work over the IPv6 network…

                        • jimpJ
                          jimp Rebel Alliance Developer Netgate

                          Do you control the other side of the IPsec tunnel?

                          What you describe is a classic symptom of the far side dropping the P1 but not informing you that it did so. pfSense, without DPD, has no way to know it's down, so it keeps trying to talk on the SA it has.

                          If you can enter a keep-alive IP on the far side for an IP in your LAN, that would make their end re-establish a P1 when it fails and maintain connectivity.
                          Otherwise, make sure both sides support DPD.
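
The idea behind dead peer detection can be sketched as a small state machine in Python. This is an illustration of the general technique only, not racoon's or pfSense's implementation; `PeerMonitor`, `delay`, and `max_failures` are hypothetical names, and the sketch assumes `on_tick` is called about once per `delay` seconds.

```python
class PeerMonitor:
    """Declares a peer dead after `max_failures` consecutive unanswered probes."""

    def __init__(self, delay, max_failures):
        self.delay = delay                # idle seconds before a probe is sent
        self.max_failures = max_failures  # unanswered probes tolerated
        self.last_heard = 0.0
        self.unanswered = 0
        self.alive = True

    def on_traffic(self, now):
        """Any valid packet from the peer (including a probe reply) proves liveness."""
        self.last_heard = now
        self.unanswered = 0
        self.alive = True

    def on_tick(self, now):
        """Return 'idle', 'probe' (send an are-you-there message), or 'dead'."""
        if not self.alive:
            return "dead"
        if now - self.last_heard < self.delay:
            return "idle"
        if self.unanswered >= self.max_failures:
            self.alive = False  # the caller would drop the SAs and renegotiate here
            return "dead"
        self.unanswered += 1
        return "probe"
```

Without something like this, a side whose peer silently dropped the P1 keeps sending on the SA it still holds, which matches the "tunnel shows up, no traffic flows" symptom.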

                          • rcfaR
                            rcfa

                            @jimp:

                            Do you control the other side of the IPsec tunnel?

                            What you describe is a classic symptom of the far side dropping the P1 but not informing you that it did so. pfSense, without DPD, has no way to know it's down, so it keeps trying to talk on the SA it has.

                            If you can enter a keep-alive IP on the far side for an IP in your LAN, that would make their end re-establish a P1 when it fails and maintain connectivity.
                            Otherwise, make sure both sides support DPD.

                            The other side is a ZyWALL unit, and the link is marked as "nailed up", i.e. permanent/auto-reestablish.
                            It also has a remote monitor IP, the LAN address of the pfSense box, which it is supposed to ping regularly.

                            DPD is enabled on the pfSense box, too.

                            • rcfaR
                              rcfa

                              I hope I'm not jinxing myself by posting this, but, things seem to have remained stable since I turned off NAT Traversal on both sides.

                              Strictly speaking, right now things don't go through NAT, but there are/were cases when I had to put a VoIP appliance between the WAN and the firewall, at which point there would be NAT even though the firewall would be an "exposed host". So for such cases, I always had NAT traversal turned on, and during link negotiation the systems notice that it's not needed and then don't use it.
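
How the two ends "notice" whether NAT is present can be sketched roughly as follows. In IKEv1 NAT-T negotiation (RFC 3947), each side sends hashes computed over the IKE cookies plus the IP address and port it believes are in use, and the receiver compares them against what it actually sees on the wire. The Python below is only an illustration of that comparison: SHA-1 is an assumption (the real exchange uses the negotiated hash), and `nat_d_hash` and `nat_detected` are hypothetical names.

```python
import hashlib
import ipaddress


def nat_d_hash(cookie_i, cookie_r, ip, port):
    """Hash over both IKE cookies, an IP address, and a port."""
    data = cookie_i + cookie_r + ipaddress.ip_address(ip).packed + port.to_bytes(2, "big")
    return hashlib.sha1(data).digest()


def nat_detected(cookie_i, cookie_r, claimed, observed):
    """Compare the (ip, port) the sender believes it has with what the receiver sees."""
    return nat_d_hash(cookie_i, cookie_r, *claimed) != nat_d_hash(cookie_i, cookie_r, *observed)
```

If the claimed and observed address/port pairs are identical, the hashes match and no NAT is assumed, so the peers would carry on without UDP encapsulation.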

                              This was the same with pfSense according to the logs, so I figured it was fine. For shits and giggles, I turned NAT-T off on both sides, and since then things have been up. (Of course, maybe I was just lucky and in a few hours I'll have to say: "Oops, back to the same old…")

                              Still, while it seems like I might have found a cure, why would it negotiate a NAT-T free connection, and then later fail?

                              Well, I'll keep an eye on things, to see if it now stays up reliably, which would be great.

                              Or have there been other recent changes that could have had an influence on this issue?

                              Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.