Pfsense 2.1 : Trouble Initiating IPSec Tunnel



    • Hopefully someone here will be my second set of eyes on this IPsec config (and be up for a challenge!).

    For interoperability testing, I've set up a virtual environment using VirtualBox and Ethernet bridges on Linux.
    At the moment, I'm focusing on pfsense-to-pfsense IPSec tunneling since I had interop troubles with the other candidates.

    quick diagram of my set up

    (192.168.77.1/24) pf0 LAN –< pfsense0 (10.9.8.10) >--[Linux bridge interface]–< pfsense1 (10.9.8.15) >-- pf1 LAN (192.168.75.1/24)

    I have:

    • been able ping from pf0 WAN to pf1 WAN

    • disabled RFC1918 Private Network firewall rules

    • added firewall rules (for both hosts) on the WAN to allow:

      • proto ah

      • proto esp

      • proto udp port 500 – isakmp

      • proto udp port 4500 – NAT-traversal (disabled -- not necessary)

    • disabled NAT-Traversal – no NAT between WAN of pf0 and pf1

    • disabled Dead Peer Detection (DPD) – for the time being

    • added a firewall rule (on both hosts) to allow all traffic on IPSec interface (enc0) – for testing

    • verified that PSKs match on both pf0 and pf1

    AND to top it all off,

    # pfsense0

    
    [2.1-RELEASE][admin@pfsense0.localdomain]/root(4): cat /var/etc/ipsec/racoon.conf
    # This file is automatically generated. Do not edit
    path pre_shared_key "/var/etc/ipsec/psk.txt";
    
    path certificate  "/var/etc/ipsec";
    
    listen
    {
            adminsock "/var/db/racoon/racoon.sock" "root" "wheel" 0660;
            isakmp 10.9.8.10 [500];
            isakmp_natt 10.9.8.10 [4500];
    }
    
    extcfg { script "/var/etc/ipsec/ipsec.php" }
    
    <...snipped...>
    
    remote 10.9.8.15
    {
            ph1id 5;
            exchange_mode aggressive;
            my_identifier address 10.9.8.10;
            peers_identifier address 10.9.8.15;
            ike_frag on;
            generate_policy = off;
            initial_contact = on;
            nat_traversal = off;
    
            support_proxy on;
            proposal_check claim;
    
            proposal
            {
                    authentication_method pre_shared_key;
                    encryption_algorithm 3des;
                    hash_algorithm sha1;
                    dh_group 2;
                    lifetime time 28800 secs;
            }
    }
    
    <...snipped...>
    
    sainfo subnet 192.168.77.0/24 any subnet 192.168.75.0/24 any
    {
            remoteid 5;
            encryption_algorithm aes 256, aes 192, aes 128, blowfish 256, blowfish 248, blowfish 240, blowfish 232, blowfish 224, blowfish 216, blowfish 208, blowfish 200, blowfish 192, blowfish 184, blowfish 176, blowfish 168, blowfish 160, blowfish 152, blowfish 144, blowfish 136, blowfish 128, 3des, cast128;
            authentication_algorithm hmac_md5,hmac_sha1;
    
            lifetime time 3600 secs;
            compression_algorithm deflate;
    }
    
    [2.1-RELEASE][admin@pfsense0.localdomain]/root(3): cat /var/etc/ipsec/spd.conf
    spdadd -4 192.168.77.1/32 192.168.77.0/24 any -P out none;
    spdadd -4 192.168.77.0/24 192.168.77.1/32 any -P in none;
    <...snipped...>
    spdadd -4 192.168.77.0/24 192.168.75.0/24 any -P out ipsec esp/tunnel/10.9.8.10-10.9.8.15/unique;
    spdadd -4 192.168.75.0/24 192.168.77.0/24 any -P in ipsec esp/tunnel/10.9.8.15-10.9.8.10/unique;
    
    

    # pfsense1

    
    [2.1-RELEASE][admin@pfsense1.localdomain]/root(5): cat /var/etc/ipsec/racoon.conf
    # This file is automatically generated. Do not edit
    path pre_shared_key "/var/etc/ipsec/psk.txt";
    
    path certificate  "/var/etc/ipsec";
    
    listen
    {
            adminsock "/var/db/racoon/racoon.sock" "root" "wheel" 0660;
            isakmp 10.9.8.15 [500];
            isakmp_natt 10.9.8.15 [4500];
    }
    
    extcfg { script "/var/etc/ipsec/ipsec.php" }
    
    remote 10.9.8.10
    {
            ph1id 2;
            exchange_mode aggressive;
            my_identifier address 10.9.8.15;
            peers_identifier address 10.9.8.10;
            ike_frag on;
            generate_policy = off;
            initial_contact = on;
            nat_traversal = off;
    
            support_proxy on;
            proposal_check claim;
    
            proposal
            {
                    authentication_method pre_shared_key;
                    encryption_algorithm 3des;
                    hash_algorithm sha1;
                    dh_group 2;
                    lifetime time 28800 secs;
            }
    }
    
    sainfo subnet 192.168.75.0/24 any subnet 192.168.77.0/24 any
    {
            remoteid 2;
            encryption_algorithm aes 256, aes 192, aes 128, blowfish 256, blowfish 248, blowfish 240, blowfish 232, blowfish 224, blowfish 216, blowfish 208, blowfish 200, blowfish 192, blowfish 184, blowfish 176, blowfish 168, blowfish 160, blowfish 152, blowfish 144, blowfish 136, blowfish 128, 3des, cast128;
            authentication_algorithm hmac_md5,hmac_sha1;
    
            lifetime time 3600 secs;
            compression_algorithm deflate;
    }
    
    [2.1-RELEASE][admin@pfsense1.localdomain]/root(6): cat /var/etc/ipsec/spd.conf
    spdadd -4 192.168.75.1/32 192.168.75.0/24 any -P out none;
    spdadd -4 192.168.75.0/24 192.168.75.1/32 any -P in none;
    spdadd -4 192.168.75.0/24 192.168.77.0/24 any -P out ipsec esp/tunnel/10.9.8.15-10.9.8.10/unique;
    spdadd -4 192.168.77.0/24 192.168.75.0/24 any -P in ipsec esp/tunnel/10.9.8.10-10.9.8.15/unique;
    
    


  • I've done some additional testing (some items many times over, but it's good to be sure…).

    I can verify the following:

    • successfully send ICMP echo (ping) packets between both hosts' WAN addresses

    • successfully update NTP off of the host node (not a UDP problem alone)

    • successfully "chatted" between hosts with netcat (ala nc -u -l 501 and nc -u 10.9.8.15 501 on the respective pfsense hosts)

    • using the packet capture diagnostic tool, I see UDP:500 traffic inbound, but never a reply back from the receiving host

    • there is a WAN firewall rule using the ISAKMP port from the drop down list

    I have found the following:

    • bridge host sends ICMP Redirect packets

    • UDP port 500 has apparent problems reaching the other end (if I stop Racoon on both ends and switch the netcat port)

    • per /tmp/rules.debug in the "#VPN Rules" section, I see automatically added rules for the various VPN end points and port 500

    So my questions for anyone who can answer them…

    • Do I need my own USER_RULEs for AH, ESP, ISAKMP, NAT-T in the WAN firewall rules?

    • Can anyone replicate this problem?  (Please and thanks!)



  • Out of frustration, I went into System > Advanced > Firewall/NAT > Disable Firewall to disable the firewall.

    I did that on the one end and tried to initiate the IPSec VPN from the other end … no dice
    So I proceeded to disable the firewall on the other end as well … initiate from the other end and presto >> Aggressive, Informational, Quick Mode established … ping between the two hosts by sourcing the LAN address and it is successful (ESP packets flow end-to-end).

    So now the fun begins figuring out what exactly is blocking the ISAKMP (UDP port 500) traffic on both ends.  And then to determine what (if any) firewall allowances I need for any other IPSec related traffic (AH, ESP).

    I still have the auto-created VPN rules enabled, so ideally I should not have to add rules and I should expect the auto-created rules to "just work".

    Re-enabled the firewall, disabled my user added rules on the WAN, and tried again ... no tunnel initialization.



  • Check the WAN interfaces for the "block private addresses" option (although none of your tests should have been successful if these were checked)

    What is the racoon log saying about this?



  • @georgeman:

    Check the WAN interfaces for the "block private addresses" option (although none of your tests should have been successful if these were checked)

    What is the racoon log saying about this?

    Yes, private IPs (ie: RFC1918 addresses) have been allowed.

    Interestingly I see one problem, but it's one that wouldn't have been possible initially when I only had one tunnel.

    There appears to be a bug in the pfSense config it appears, it picks up on my disabled IPSec phase1 entry instead of the active one (later on down the list).  I'm seeing it try to initialize the wrong phase1 entry.  From Status > IPSec, I see the correct name on the tunnel list, but it certainly doesn't reflect the correct name in the IPSec system logs.

    I've since deleted my "defaults" phase1+phase2 entry and will stick with one tunnel per host regardless of (dis|en)abled status.

    There are no logs from the target/receiving end, but logs from the initiating end show the following:

    
    Feb 24 19:02:18 	racoon: [pfsense0-to-pfsense1_3DES]: INFO: IPsec-SA request for 10.9.8.15 queued due to no phase1 found.
    Feb 24 19:02:18 	racoon: [pfsense0-to-pfsense1_3DES]: INFO: initiate new phase 1 negotiation: 10.9.8.10[500]<=>10.9.8.15[500]
    Feb 24 19:02:18 	racoon: INFO: begin Identity Protection mode.
    Feb 24 19:02:50 	racoon: [pfsense0-to-pfsense1_3DES]: [10.9.8.15] ERROR: phase2 negotiation failed due to time up waiting for phase1 [Remote Side not responding]. ESP 10.9.8.15[0]->10.9.8.10[0]
    Feb 24 19:02:50 	racoon: INFO: delete phase 2 handler.
    Feb 24 19:03:08 	racoon: ERROR: phase1 negotiation failed due to time up. 5b092c37b01294f3:0000000000000000
    
    


  • I had a CF card lying around that had pfSense 2.1 RC0 on it … so I grabbed another spare CF and put a recent version of 2.1 Stable on that one.  I dropped both cards into an old sluggish pair of embedded boards (basically PC Engines WRAP-era gear ... ugh so slow).  I am able to initiate a tunnel from the 2.1 RC0 side, but not the 2.1 Stable side (using RFC1918 addresses in 192.168.15.x/24 subnet).  I have end-to-end IP connectivity, that is well tested at this point. ;)

    I saw this same behavior when testing last week, but didn't think much of it.  This time around, I switched the cards in the units and the working end follows the version (whichever unit has 2.1 RC0).  I've noticed that I had to restart Racoon on the 2.1 Stable end after restarting them both to break down the tunnel for more testing.

    The oldest image I could find on mirrors was 2.1 RC3 (if I remember correctly) and it has problems initiating tunnels just like I'm seeing with 2.1 Stable.

    I've been copying, tweaking, and loading firewall rules from my modified version of /tmp/rules.debug, but haven't nailed the firewall rule that's causing this.

    Time in these logs is slightly skewed between the two ends/units (not the case for my VBox VMs) because I don't have them on-net and able to get to NTP servers.

    Logs from 2.1 Stable End:

    
    Feb 28 20:48:03	racoon: INFO: begin Aggressive mode.
    Feb 28 20:48:35	racoon: [red-to-orange]: [192.168.15.2] ERROR: phase2 negotiation failed due to time up waiting for phase1 [Remote Side not responding]. ESP 192.168.15.2[0]->192.168.15.3[0]
    Feb 28 20:48:35	racoon: INFO: delete phase 2 handler.
    Feb 28 20:48:54	racoon: ERROR: phase1 negotiation failed due to time up. 3e3fa005e2570c3c:0000000000000000
    Feb 28 20:51:16	racoon: [red-to-orange]: INFO: IPsec-SA request for 192.168.15.2 queued due to no phase1 found.
    Feb 28 20:51:16	racoon: [red-to-orange]: INFO: initiate new phase 1 negotiation: 192.168.15.3[500]<=>192.168.15.2[500]
    Feb 28 20:51:16	racoon: INFO: begin Aggressive mode.
    Feb 28 20:51:48	racoon: [red-to-orange]: [192.168.15.2] ERROR: phase2 negotiation failed due to time up waiting for phase1 [Remote Side not responding]. ESP 192.168.15.2[0]->192.168.15.3[0]
    Feb 28 20:51:48	racoon: INFO: delete phase 2 handler.
    Feb 28 20:52:06	racoon: ERROR: phase1 negotiation failed due to time up. 15e9c5bfa64e1690:0000000000000000
    Feb 28 20:52:28	racoon: [red-to-orange]: INFO: respond new phase 1 negotiation: 192.168.15.3[500]<=>192.168.15.2[500]
    Feb 28 20:52:28	racoon: INFO: begin Aggressive mode.
    Feb 28 20:52:28	racoon: INFO: received broken Microsoft ID: FRAGMENTATION
    Feb 28 20:52:28	racoon: INFO: received Vendor ID: DPD
    Feb 28 20:52:29	racoon: [red-to-orange]: [192.168.15.2] NOTIFY: couldn't find the proper pskey, try to get one by the peer's address.
    Feb 28 20:52:38	racoon: NOTIFY: the packet is retransmitted by 192.168.15.2[500] (1).
    Feb 28 20:52:42	racoon: [red-to-orange]: [192.168.15.2] INFO: request for establishing IPsec-SA was queued due to no phase1 found.
    Feb 28 20:52:48	racoon: NOTIFY: the packet is retransmitted by 192.168.15.2[500] (1).
    Feb 28 20:52:59	racoon: NOTIFY: the packet is retransmitted by 192.168.15.2[500] (1).
    Feb 28 20:53:09	racoon: NOTIFY: the packet is retransmitted by 192.168.15.2[500] (1).
    Feb 28 20:53:13	racoon: [red-to-orange]: [192.168.15.2] ERROR: phase2 negotiation failed due to time up waiting for phase1 [Remote Side not responding]. ESP 192.168.15.2[0]->192.168.15.3[0]
    Feb 28 20:53:13	racoon: INFO: delete phase 2 handler.
    Feb 28 20:53:19	racoon: ERROR: phase1 negotiation failed due to time up. 4ea184ea5ab611ee:3960cdb822e6e885
    Feb 28 20:53:58	racoon: INFO: caught signal 15
    Feb 28 20:53:58	racoon: INFO: racoon process 87801 shutdown
    Feb 28 20:54:04	racoon: INFO: @(#)ipsec-tools 0.8.1 (http://ipsec-tools.sourceforge.net)
    Feb 28 20:54:04	racoon: INFO: @(#)This product linked OpenSSL 1.0.1e 11 Feb 2013 (http://www.openssl.org/)
    Feb 28 20:54:04	racoon: INFO: Reading configuration from "/var/etc/ipsec/racoon.conf"
    Feb 28 20:54:04	racoon: [Self]: INFO: 192.168.15.3[4500] used for NAT-T
    Feb 28 20:54:04	racoon: [Self]: INFO: 192.168.15.3[4500] used as isakmp port (fd=9)
    Feb 28 20:54:04	racoon: [Self]: INFO: 192.168.15.3[500] used for NAT-T
    Feb 28 20:54:04	racoon: [Self]: INFO: 192.168.15.3[500] used as isakmp port (fd=10)
    Feb 28 20:54:04	racoon: INFO: unsupported PF_KEY message REGISTER
    Feb 28 20:54:53	racoon: [red-to-orange]: INFO: respond new phase 1 negotiation: 192.168.15.3[500]<=>192.168.15.2[500]
    Feb 28 20:54:53	racoon: INFO: begin Aggressive mode.
    Feb 28 20:54:53	racoon: INFO: received broken Microsoft ID: FRAGMENTATION
    Feb 28 20:54:53	racoon: INFO: received Vendor ID: DPD
    Feb 28 20:54:54	racoon: [red-to-orange]: [192.168.15.2] NOTIFY: couldn't find the proper pskey, try to get one by the peer's address.
    Feb 28 20:54:54	racoon: [red-to-orange]: INFO: ISAKMP-SA established 192.168.15.3[500]-192.168.15.2[500] spi:c551135385fd8fcd:5477b18d46ed66d4
    Feb 28 20:54:54	racoon: [red-to-orange]: [192.168.15.2] INFO: received INITIAL-CONTACT
    Feb 28 20:54:54	racoon: [red-to-orange]: INFO: respond new phase 2 negotiation: 192.168.15.3[500]<=>192.168.15.2[500]
    Feb 28 20:54:54	racoon: [red-to-orange]: INFO: IPsec-SA established: ESP 192.168.15.3[500]->192.168.15.2[500] spi=165250624(0x9d98640)
    Feb 28 20:54:54	racoon: [red-to-orange]: INFO: IPsec-SA established: ESP 192.168.15.3[500]->192.168.15.2[500] spi=242230517(0xe7024f5)
    
    

    Logs from 2.1 RC0 End:

    
    Feb 28 20:40:02	racoon: INFO: caught signal 15
    Feb 28 20:40:02	racoon: INFO: racoon process 37405 shutdown
    Feb 28 20:40:07	racoon: INFO: @(#)ipsec-tools 0.8.1 (http://ipsec-tools.sourceforge.net)
    Feb 28 20:40:07	racoon: INFO: @(#)This product linked OpenSSL 1.0.1e 11 Feb 2013 (http://www.openssl.org/)
    Feb 28 20:40:07	racoon: INFO: Reading configuration from "/var/etc/ipsec/racoon.conf"
    Feb 28 20:40:07	racoon: [Self]: INFO: 192.168.15.2[4500] used for NAT-T
    Feb 28 20:40:07	racoon: [Self]: INFO: 192.168.15.2[4500] used as isakmp port (fd=9)
    Feb 28 20:40:07	racoon: [Self]: INFO: 192.168.15.2[500] used for NAT-T
    Feb 28 20:40:07	racoon: [Self]: INFO: 192.168.15.2[500] used as isakmp port (fd=10)
    Feb 28 20:40:07	racoon: INFO: unsupported PF_KEY message REGISTER
    Feb 28 20:40:08	racoon: ERROR: such policy already exists. anyway replace it: 192.168.100.1/32[0] 192.168.100.0/24[0] proto=any dir=out
    Feb 28 20:40:08	racoon: ERROR: such policy already exists. anyway replace it: 192.168.100.0/24[0] 192.168.100.1/32[0] proto=any dir=in
    Feb 28 20:40:08	racoon: ERROR: such policy already exists. anyway replace it: 192.168.100.0/24[0] 192.168.200.0/24[0] proto=any dir=out
    Feb 28 20:40:08	racoon: ERROR: such policy already exists. anyway replace it: 192.168.200.0/24[0] 192.168.100.0/24[0] proto=any dir=in
    Feb 28 20:52:26	racoon: [orange-to-red]: INFO: IPsec-SA request for 192.168.15.3 queued due to no phase1 found.
    Feb 28 20:52:26	racoon: [orange-to-red]: INFO: initiate new phase 1 negotiation: 192.168.15.2[500]<=>192.168.15.3[500]
    Feb 28 20:52:26	racoon: INFO: begin Aggressive mode.
    Feb 28 20:52:58	racoon: [orange-to-red]: [192.168.15.3] ERROR: phase2 negotiation failed due to time up waiting for phase1 [Remote Side not responding]. ESP 192.168.15.3[0]->192.168.15.2[0]
    Feb 28 20:52:58	racoon: INFO: delete phase 2 handler.
    Feb 28 20:53:17	racoon: ERROR: phase1 negotiation failed due to time up. 4ea184ea5ab611ee:0000000000000000
    Feb 28 20:54:51	racoon: [orange-to-red]: INFO: IPsec-SA request for 192.168.15.3 queued due to no phase1 found.
    Feb 28 20:54:51	racoon: [orange-to-red]: INFO: initiate new phase 1 negotiation: 192.168.15.2[500]<=>192.168.15.3[500]
    Feb 28 20:54:51	racoon: INFO: begin Aggressive mode.
    Feb 28 20:54:51	racoon: INFO: received broken Microsoft ID: FRAGMENTATION
    Feb 28 20:54:51	racoon: INFO: received Vendor ID: DPD
    Feb 28 20:54:52	racoon: [orange-to-red]: [192.168.15.3] NOTIFY: couldn't find the proper pskey, try to get one by the peer's address.
    Feb 28 20:54:52	racoon: [orange-to-red]: INFO: ISAKMP-SA established 192.168.15.2[500]-192.168.15.3[500] spi:c551135385fd8fcd:5477b18d46ed66d4
    Feb 28 20:54:52	racoon: [orange-to-red]: INFO: initiate new phase 2 negotiation: 192.168.15.2[500]<=>192.168.15.3[500]
    Feb 28 20:54:52	racoon: [orange-to-red]: [192.168.15.3] INFO: received INITIAL-CONTACT
    Feb 28 20:54:52	racoon: [orange-to-red]: INFO: IPsec-SA established: ESP 192.168.15.2[500]->192.168.15.3[500] spi=242230517(0xe7024f5)
    Feb 28 20:54:52	racoon: [orange-to-red]: INFO: IPsec-SA established: ESP 192.168.15.2[500]->192.168.15.3[500] spi=165250624(0x9d98640)
    
    


  • Advanced > Firewall/NAT > Disable all auto-added VPN rules
    I checked the box and saved settings.  I already had added an IPv4 allow all rule with logging enabled.  The tunnel establishes almost immediately with this change.

    This confirms (if it wasn't already evident) there is a firewall rule problem at play in my set up.
    When diffing /tmp/rules.debug with /tmp/rules.debug.old, I see only the VPN rules which are all set to "reply-to" and "route-to" the WAN gateway (which isn't necessary as both nodes are in the same "WAN subnet").  Maybe had I thrown another device in the middle to do the routing this would not have happened, but regardless of that fact, this is still a realistic scenario (VPN tunnels between two hosts in the same subnet).

    Advanced > Firewall/NAT > Disable reply-to on WAN rules
    Doesn't take effect as far as I can tell (at least not on the auto-created VPN rules which I re-enabled).  Reverting the change (unchecking the checkbox) and diffing rules.debug and rules.debug.old show only the USER_RULEs are affected (though all rules probably should be affected).

    If I copy the /tmp/rules.debug to another file in /tmp/ and tear out the (route-to|reply-to) keywords with vi … and reload the rules with pfctl, my tunnels magically initiate from either end (and establish).

    
    # different per host and depends on other rules, but the gist
    154,157s/ reply-to ( em0 10.9.8.1 ) //g
    154,157s/ route-to ( em0 10.9.8.1 ) //g
    
    

    It also appears there is a bug where the last phase1 that is saved is "latched on to" or used (I have duplicates due to testing, so I expect that is why it picks the wrong duplicate over the new one).

    And another apparent bug (on my production box) which is really messed up.
    ISAKMP is UDP 500 and NAT-T is UDP 4500 …

    
    # IPSec Logs from when I click the play button on Status > IPSec page
    Feb 25 20:52:34	racoon: [Self]: INFO: X.X.X.X[4500] used for NAT-T
    Feb 25 20:52:34	racoon: [Self]: INFO: X.X.X.X[4500] used as isakmp port (fd=9)
    Feb 25 20:52:34	racoon: [Self]: INFO: X.X.X.X[500] used for NAT-T
    Feb 25 20:52:34	racoon: [Self]: INFO: X.X.X.X[500] used as isakmp port (fd=10)
    
    # racoon.conf
    listen
    {
            adminsock "/var/db/racoon/racoon.sock" "root" "wheel" 0660;
            isakmp X.X.X.X [500];
            isakmp_natt X.X.X.X [4500];
    }
    
    

    _This is what I've found thus far.

    I would greatly appreciate it if someone would test this scenario to double check._