IPSec: odd tunnel behavior, and questions/suggestions…
-
So the good news:
the 2.1 box is up and running and already doing its basic work, which is to route a public class-C net over an IPSec tunnel to a collocation provider who routes my net (because the local ISP isn't willing to do so).
Doesn't do much firewall stuff yet, because I need to bring things to where I want them to be step by step.
In any case, throughput is now about twice as fast as with the hacked-together solution I had with my now-resting ZyWALL unit. That's the good part.
Now for the not-so-good:
-
For an IPSec tunnel where one endpoint gets its address via DHCP and authentication is via PSK, it seems that using an FQDN as the identifier string doesn't work. Since there is no pop-up entry for 'DNS name' (like on the ZyWALL) or "FQDN" (like on other boxes), I tried "Distinguished Name", but that didn't work. Only after I switched both sides of the tunnel to IP Address, and of course changed the string to some IP address, did I get the tunnel up.
-
In phase 2, I can't select a NULL encryption method. I know this is likely not a common requirement, but in my case encryption of any sort is just wasted CPU cycles, since all that traffic goes out to the public internet anyway.
-
In the ZyWALL and most other boxes I know of, one can set what kind of connection an IPSec tunnel is: manual, on-demand, or nailed-up. I can't find anything like this in the settings. Is an IPSec tunnel always up? Is it automatically established at system boot time, or on demand with traffic? Does it stay up, or time out after a certain amount of inactivity? I couldn't find any setting that looks like it would influence such behavior. Am I blind, or is this missing? If it's missing, what is the implied behavior?
And now the bad:
After staying up until the wee hours to get this all working, I went to bed with an active tunnel and a working connection to the public internet (all traffic to the outside world has to go through that tunnel, except the IPSec traffic that creates that tunnel, of course). When I got up, the tunnel was showing as active, but I couldn't connect to anywhere, i.e. the packets weren't flowing through the tunnel. No idea where they were going, but obviously nowhere.
So I was looking for a place where one can disconnect/reconnect a tunnel. Most VPN boxes have a button for that, e.g. next to the list of defined tunnels, or something like that. Couldn't find anything.
So I restarted racoon from the Dashboard. Still no connectivity. So then I stopped the racoon service. To my surprise, I noticed that even with racoon not running, the tunnel stayed up.
In the end, I had to go to the IPSec page (VPN:IPSec), uncheck the "Enable IPSec" box, hit save, wait a few moments, check that box again, and hit save again. That actually brought the tunnel down, then back up, and I had connectivity again.
So besides this being really complicated (and affecting other tunnels, too, if I had any others active), the question is what sort of state the tunnel got itself into, where it showed as active and up, but wouldn't pass any traffic until being terminated and reestablished.
With all this verbose background information, I guess what it boils down to is this:
a) I miss a way to see the "health" of an IPSec tunnel, because from what was visible on the Dashboard, everything should have been fine and dandy but it wasn't.
b) I miss a way to quickly bring down or up individual tunnels
c) I miss a way to specify a particular tunnel's behavior (manual connection, on-demand, permanent/nailed-up)
d) a NULL encryption would be useful on occasion
e) using FQDN as identification strings would be nicer than using IP addresses, but it doesn't seem to work at this point in time.
Not sure how much of this is 2.1 specific, but since I'm working with 2.1 at this point, I figured I'd post it here.
-
-
Hm, I can share some of this experience with mpd: status_interfaces might show that a connection is up while the logs show that there is a problem. Did you try to use gateway monitoring? This might be a temporary option for a quick check. But for the long run, I'm with you on the point that more GUI customisation would be a good thing to have.
As for the actual problem of losing connectivity at some point: what does Status:System Logs:IpSec show? Normally there should be some hint there about why this was happening.
Another option might be (if the connection is lost within a reasonable time) to enable Diagnostics:Packet Capture with minimal verbosity to see what's going on.
Thank you for that feedback, it's good to know what functionality administrators actually miss when prioritizing coding.
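To make the "shows as up but passes no traffic" case concrete, here is a rough Python sketch of the distinction a health indicator would need to draw. It is not pfSense code: `TunnelHealth` and `classify_tunnel` are made-up names, `sa_established` is assumed to come from whatever reports the SA as active, and `probe_results` is assumed to be a list of pass/fail results from pings sent through the tunnel (which is roughly what gateway monitoring gives you).

```python
from enum import Enum


class TunnelHealth(Enum):
    DOWN = "down"          # no SA at all
    UNKNOWN = "unknown"    # SA present, but nothing was probed
    STALLED = "stalled"    # SA present, yet no probe got through
    HEALTHY = "healthy"    # SA present and at least one probe got through


def classify_tunnel(sa_established, probe_results):
    """Combine SA state with probe results (hypothetical helper)."""
    if not sa_established:
        return TunnelHealth.DOWN
    if not probe_results:
        return TunnelHealth.UNKNOWN
    if not any(probe_results):
        # The situation from the first post: tunnel listed as active,
        # but packets are not flowing through it.
        return TunnelHealth.STALLED
    return TunnelHealth.HEALTHY
```

With this split, the overnight failure described above would show up as `STALLED` instead of looking identical to a working tunnel.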
-
There's actually another thing I came across, which I think is a bug, but I just want to make sure I'm not being boneheaded by bouncing it off here first, before filing it in redmine.
Same scenario as above. So I tried to save CPU cycles by switching from ESP to AH. It wouldn't work, because if I look at the SA, pfSense tries to establish encryption, which shouldn't exist in AH. I noticed that it kept the same encryption options active that I had checked before switching from ESP to AH. So I switched back from AH to ESP and unchecked all encryption options, then back to AH. But now, even though there shouldn't be any encryption with AH, the interface complains that at least one encryption method must be selected.
Screenshots attached of how pfSense reports AES active in an AH connection, and how it complains about no encryption being selected.
![Screen Shot 2012-05-24 at 14.30.35.png](/public/imported_attachments/1/Screen Shot 2012-05-24 at 14.30.35.png)
![Screen Shot 2012-05-24 at 14.31.28.png](/public/imported_attachments/1/Screen Shot 2012-05-24 at 14.31.28.png)
-
I don't doubt there are some bugs with AH because nobody in their right mind uses it. :-)
-
I don't doubt there are some bugs with AH because nobody in their right mind uses it. :-)
Well, as I said, I'd love to use ESP with "NULL" encryption, but that's not possible either.
I'm aware of the fact that my setup is non-standard. After all, how many people have a class-C network without also having the type of internet service that costs hundreds or thousands per month, with an ISP that routes that net?
Since Verizon with FIOS won't route a customer's directly allocated address block, even with commercial FIOS service (which costs twice as much as residential for the same thing plus a single fixed IP address), I had to find a reasonably cheap and reliable collocation service which does route my class-C net, and now my network must travel the last leg tunneled from the collocation service to my place.
Eventually, a second pfSense box will replace the tiny ZyWALL 1P which is now at the collocation service, but as long as that thing is there, IPSec is the only kind of tunnel I can do.
Since all the packets that go through that tunnel end up on the internet anyway, there's little point in encrypting that tunnel, unless Verizon were to start interfering with traffic somehow.
Thus NULL encryption would lower the CPU load and likely latency, but that's not an option on pfSense (it's an option on the ZyWALL). So the other unencrypted option that would have been sufficient for my purposes and current limitations is AH. It's clear that under normal circumstances there are few cases where AH is meaningful.
Still, the bug in question here seems to be with the GUI, because the web interface won't allow zero encryption methods to be selected, i.e. keeps checking hidden fields, even though AH and not ESP is selected.
The check for selected encryption methods should be contingent on ESP being selected, and should be disengaged when AH is selected. Further, regardless of which encryption methods were selected, they should not be passed on when AH is selected.
So aside from whatever bugs may be in AH itself, it seems the web GUI doesn't handle encryption parameters properly with respect to the AH vs ESP toggle, at least if what I conclude from my observations is correct.
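To spell out what I mean, here is a minimal sketch of the logic I'd expect, written in Python purely for illustration. This is not the actual pfSense GUI code; the function names, the field names and the algorithm strings are all my own assumptions.

```python
def validate_phase2(protocol, encryption_algos, hash_algos):
    """Return a list of error strings for a phase 2 entry (hypothetical model)."""
    errors = []
    if protocol not in ("esp", "ah"):
        errors.append("unknown protocol: %s" % protocol)
    # Only ESP carries encryption, so only ESP needs an encryption algorithm.
    if protocol == "esp" and not encryption_algos:
        errors.append("at least one encryption algorithm must be selected")
    # AH only authenticates, so it needs at least one hash algorithm.
    if protocol == "ah" and not hash_algos:
        errors.append("at least one hash algorithm must be selected")
    return errors


def build_proposal(protocol, encryption_algos, hash_algos):
    """Build the settings handed on to the IPSec daemon (hypothetical model)."""
    proposal = {"protocol": protocol, "hash": list(hash_algos)}
    if protocol == "esp":
        # Leftover ESP checkbox state is ignored when AH is selected.
        proposal["encryption"] = list(encryption_algos)
    return proposal
```

In this model, `validate_phase2("ah", [], ["hmac_sha1"])` returns no errors, and `build_proposal("ah", ["aes"], ["hmac_sha1"])` contains no encryption entry even though AES was still checked from the earlier ESP setup.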
If I'm correct, it probably would be a relatively easy fix, but I don't want to redmine it until someone can confirm that things are broken in the way I describe.
-
There are many better ways to tunnel than that, but yes equipment may limit what you can do…
GIF and GRE are best for tunneling traffic in the clear. Failing that, OpenVPN+null cipher works.
Feel free to open up a ticket in redmine for the AH bits, but that's something that will probably need someone with motivation+knowledge or funding to fully resolve.
-
There are many better ways to tunnel than that, but yes equipment may limit what you can do…
GIF and GRE are best for tunneling traffic in the clear. Failing that, OpenVPN+null cipher works.
That's the plan, as soon as I can deploy the nano-bsd based box to my collocation service, but I want to wait with that until 2.1 is stable and released, because if something goes bad with the upgrade process, etc., I'd have to get a plane ticket to Michigan to go there and fix things ;)
Feel free to open up a ticket in redmine for the AH bits, but that's something that will probably need someone with motivation+knowledge or funding to fully resolve.
If it is what I think it is, it's at this point primarily a GUI issue, i.e. the GUI creates a bogus configuration, because it doesn't allow deselection of all encryption methods and delivers the encryption algorithms from the ESP settings to the AH setup.
If there are problems with AH itself, that's another issue, but I don't even get that far, because AH is configured improperly by the GUI from what I can tell.
I'll file a redmine ticket…