Pfsense in Azure - Cannot reach host on IPsec tunnel



  • Hi,

    I was hoping someone could help me here as i am getting a little lost and this is my first time setting this up.

    I have a pfsense box setup in azure with 1 WAN and 1 LAN interface. I have setup an IPsec tunnel from pfsense to a VPN in our DC. I've created an 'allow all' firewall rule in WAN, LAN and IPsec (just to be sure) and also 'allow all' NAT rule for LAN, WAN and IPsec.
    The WAN and LAN are on the same network but different subnets, and I have a VM set up on the LAN subnet. I have set up a route table with a route, linked to the subnet the VM is on, saying all traffic destined for the DC address space should be sent through the LAN interface. When I tracert from the VM it shows this static route but then gets a 'destination host unreachable'.
    The IPsec tunnel is established but for some reason even with all of the above i still can't ping/trace to any host in the DC

    Any ideas?

    thanks in advance

    Ben


  • Netgate Administrator

    Did you enable IP forwarding for the pfSense instance/interfaces in Azure?
    https://docs.microsoft.com/en-us/azure/virtual-network/tutorial-create-route-table-portal#turn-on-ip-forwarding

    Steve



  • Hi Stephen,

    Yup, i've enabled IP forwarding for both of the interfaces (WAN and LAN) attached to the pfsense box.

    Also switched off the firewall on the VM i'm doing the connection testing from.

    thanks
    Ben


  • Netgate Administrator

    I wouldn't expect you to need any NAT rules there since you are connecting directly between the private subnets. Even the automatic rules may not be needed. What exactly are the NAT rules you have added?

    I assume the IPSec tunnel is up?

    And you can connect across it? Ping out from pfSense using the internal subnet IP as source for example.

    Steve



  • I have the same problem. I NAT the private IP inside phase 2 of the tunnel and the traffic goes to the other side and returns to pfSense, but the VM that I initiate traffic from does not receive the reply. It is as if the traffic stops at pfSense and is not forwarded back to the VM. I have IP forwarding set up and a route table in Azure.



  • @stephenw10
    The tunnel is established but on the dashboard it says it's down but guessing that is a bug in pfsense?
    c5bf3563-ec29-4e80-92f1-f64dce6d7f17-image.png

    I can't ping or trace out from pfsense or VM, it doesn't receive any packets which makes me think it's not able to find a route there.

    here's my nat rules:
    2417b47b-6ec6-44f7-aeca-590dffe37909-image.png

    in case you need it:

    WAN rules:

    834e7385-b945-4e2a-8c43-2c071831e45e-image.png

    LAN rules:

    4651a045-8285-458e-aadc-1d96b837c045-image.png

    and IPsec rules:

    2383e25a-8f54-40e0-a1a5-6c0fee8a2ad7-image.png

    IPsec setup:

    P1:
    f4abd19e-978a-4cf4-8745-bd92f5ba4c42-image.png

    P2:

    1cbc2e4e-6c8b-435b-b6fa-2d645fffa9e6-image.png
    thanks
    Ben


  • Netgate Administrator

    Is the remote network a public subnet there?

    You should not need to NAT anywhere. You definitely should not be NATing on all the interfaces like that. And you absolutely should never NAT on the IPSec interface like that. NAT in the P2 config if you need to NAT over IPSec.
    So I would disable or remove those rules there.

    Also if you are adding outbound NAT rules you should use some source other than 'any'. For instance, with the rules you have, it will be NATing its own IPSec traffic outbound, which can only break things.

    If the tunnel is up you should be able to ping something on the remote side using the LAN IP as source. If you cannot, either the tunnel is not up or it's blocked at the remote end.
    Show us the IPSec status page.

    Steve



  • @stephenw10

    The remote address space is private and at the moment only consists of one IP which i am performing tests on.

    Ok i have removed all NAT rules but it doesn't work still if i try to ping. According to documentation and some videos i've watched you need to have NAT which is why i set that up but you're saying i don't need to set this up under firewall>nat? It seems if i try to use NAT in P2 it removes it once i've saved?

    Here's the IPsec status page:
    2d300f13-0a4b-4a47-a242-11105e848d98-image.png


  • Netgate Administrator

    Ok, so that's not up at phase 2, the private subnets. So you probably have a config mismatch there.
    You will see the 'Show Child SAs' button there when that is established.

    Check the IPSec log for errors. Make sure both ends are configured the same.

    The only place you need NAT there is in the phase 1 tunnel as it looks like there is some NAT in the route. However you can see it has detected that and connected in NAT-T mode.

    Steve



  • @stephenw10

    Oh i didn't know that box existed for show child sa's, yes it does appear that p1 is up and p2 is down.

    I've double checked the config and it all matches up.
    IPSec log does show information on child SA, i've pasted part of the log below:
    Dec 16 17:32:49 charon 09[IKE] <con1000|46> establishing CHILD_SA con1000{1773} reqid 38
    Dec 16 17:32:49 charon 09[ENC] <con1000|46> generating CREATE_CHILD_SA request 19 [ N(ESP_TFC_PAD_N) SA No KE TSi TSr ]
    Dec 16 17:32:49 charon 09[NET] <con1000|46> sending packet: from <local>[4500] to <remote>500] (348 bytes)
    Dec 16 17:32:49 charon 09[NET] <con1000|46> received packet: from <remote>[4500] to <local>[4500] (76 bytes)
    Dec 16 17:32:49 charon 09[ENC] <con1000|46> parsed CREATE_CHILD_SA response 19 [ N(NO_PROP) ]
    Dec 16 17:32:49 charon 09[IKE] <con1000|46> received NO_PROPOSAL_CHOSEN notify, no CHILD_SA built
    Dec 16 17:32:49 charon 09[CFG] <con1000|46> configured proposals: ESP:AES_CBC_256/HMAC_SHA2_256_128/MODP_1024/NO_EXT_SEQ
    Dec 16 17:32:49 charon 09[IKE] <con1000|46> failed to establish CHILD_SA, keeping IKE_SA
    Dec 16 17:32:49 charon 09[CHD] <con1000|46> CHILD_SA con1000{1773} state change: CREATED => DESTROYING
    Dec 16 17:32:49 charon 09[IKE] <con1000|46> activating new tasks
    Dec 16 17:32:49 charon 09[IKE] <con1000|46> nothing to initiate


  • Netgate Administrator

    @bhodges said in Pfsense in Azure - Cannot reach host on IPsec tunnel:

    NO_PROPOSAL_CHOSEN

    Yup so some mismatch with the other side, not the actual subnets:
    https://docs.netgate.com/pfsense/en/latest/vpn/ipsec/ipsec-troubleshooting.html#phase-2-encryption-algorithm-mismatch

    It's connected as AES-128/SHA1 at phase1. It's common to use the same at phase2 but not required. Do you know what the other side is set to?

    Steve



  • @stephenw10

    For phase 2 it is set to AES-256 and SHA-256 which is what pfsense is set to so i'm not sure it is a mismatch?:

    caabb19f-f1ce-4b86-bd09-bed3be9f1b40-image.png


  • Netgate Administrator

    Different PFS key set at the other side?

    The other side is sending that response so logs from that would probably show exactly what's not matching.
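
    As a rough sketch of how the proposal in the log could be compared field by field with the remote settings: the snippet below splits the 'configured proposals' entry from the charon log into its parts and lists what differs. The remote encryption and hash are what Ben reports (AES-256, SHA-256), written here on the assumption that they correspond to AES_CBC_256 and HMAC_SHA2_256_128. The remote PFS group is a hypothetical value for illustration only, since the thread never states it. The function names are also made up for this example.

```python
import re

# Charon log line copied from the thread.
LOG_LINE = (
    "Dec 16 17:32:49 charon 09[CFG] <con1000|46> configured proposals: "
    "ESP:AES_CBC_256/HMAC_SHA2_256_128/MODP_1024/NO_EXT_SEQ"
)


def parse_proposal(line):
    """Split a single 'configured proposals' entry into named fields."""
    match = re.search(r"configured proposals: (\w+):([\w/]+)", line)
    if match is None:
        raise ValueError("no proposal found in line")
    fields = {"protocol": match.group(1), "encryption": None,
              "integrity": None, "pfs_group": None}
    for transform in match.group(2).split("/"):
        if transform.startswith("AES"):
            fields["encryption"] = transform
        elif transform.startswith("HMAC"):
            fields["integrity"] = transform
        elif transform.startswith(("MODP", "ECP")):
            fields["pfs_group"] = transform
    return fields


def mismatches(local, remote):
    """Return the field names on which the two proposals differ."""
    return sorted(key for key in local if local[key] != remote.get(key))


local = parse_proposal(LOG_LINE)

# Encryption and hash follow what Ben reports for the remote side.
# The PFS group is a HYPOTHETICAL value; the thread never states it.
remote = {
    "protocol": "ESP",
    "encryption": "AES_CBC_256",
    "integrity": "HMAC_SHA2_256_128",
    "pfs_group": "MODP_2048",
}

print(local["pfs_group"])         # MODP_1024
print(mismatches(local, remote))  # ['pfs_group']
```

    The log shows pfSense proposing MODP_1024 (DH group 2) for PFS, so that is the value to check against the remote phase 2 settings.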

    Steve



  • Hey Stephen,

    Ok phase2 is up and the issue was with NAT. I added a NAT rule in phase2 for the whole network which pfsense sits on.

    I can now ping from within pfsense to a server down the tunnel and vice versa.

    My next issue is that i can't ping the server on the other side of the tunnel from another VM in the same network or another network.

    I've set up IP forwarding on the pfSense interfaces in Azure and created a route table which sends traffic for the address space on the other side of the tunnel through the pfSense LAN IP address.

    This is the response i get if i tracert from the other VM:

    Tracing route to <Server> [IP]
    over a maximum of 30 hops:

    1 1 ms <1 ms 1 ms <LANIP>
    2 * * * Request timed out.
    3 <LANIP> reports: Destination host unreachable.

    it's being picked up on firewall logs and is being accepted:
    eec17d44-5f15-449b-aebb-c0273afb3ee6-image.png

    do i need to be setting up a NAT forward so it knows what to do with the packet when it receives one destined for a host down the tunnel?

    thanks
    Ben


  • Netgate Administrator

    If you can ping from pfSense to something across the VPN when selecting the appropriate source but not from another host in that subnet it sounds like a routing issue at the host.
    You might have policy routing in place that re-routes traffic from the host but would not affect traffic from pfSense itself.

    Steve



  • it sounds as though i have the same issue as mbogoev explained above.

    it seems as though the host where the traffic is originating from is routing traffic correctly to the pfsense LAN interface but it doesn't go beyond that.

    in pfsense i have a static route for the network on the other side of the tunnel. If i disable this then tracert gets 'request timed out' errors for 30 hops. If i enable this then i get one 'request timed out' and then 'destination host unreachable' - the same error i sent in my previous message.

    This is what i see in the packet capture:

    12:18:51.842993 IP 10.233.2.4 > 10.128.2.181: ICMP echo request, id 1, seq 500, length 72
    12:18:51.844700 IP 10.233.2.4 > 10.128.2.181: ICMP echo request, id 1, seq 501, length 72
    12:18:51.846129 IP 10.233.2.4 > 10.128.2.181: ICMP echo request, id 1, seq 502, length 72
    12:18:52.862071 IP 10.233.2.4 > 10.128.2.181: ICMP echo request, id 1, seq 503, length 72
    12:18:52.862157 ARP, Request who-has 10.128.2.181 tell 10.239.3.5, length 28
    12:18:56.853012 IP 10.233.2.4 > 10.128.2.181: ICMP echo request, id 1, seq 504, length 72
    12:18:56.853074 ARP, Request who-has 10.128.2.181 tell 10.239.3.5, length 28
    12:19:00.860894 IP 10.233.2.4 > 10.128.2.181: ICMP echo request, id 1, seq 505, length 72
    12:19:00.860954 ARP, Request who-has 10.128.2.181 tell 10.239.3.5, length 28
    12:19:04.864868 IP 10.233.2.4 > 10.128.2.181: ICMP echo request, id 1, seq 506, length 72
    12:19:04.864932 ARP, Request who-has 10.128.2.181 tell 10.239.3.5, length 28
    12:19:08.857729 IP 10.233.2.4 > 10.128.2.181: ICMP echo request, id 1, seq 507, length 72
    12:19:08.857809 ARP, Request who-has 10.128.2.181 tell 10.239.3.5, length 28
    12:19:12.861699 IP 10.233.2.4 > 10.128.2.181: ICMP echo request, id 1, seq 508, length 72
    12:19:12.861766 ARP, Request who-has 10.128.2.181 tell 10.239.3.5, length 28

    it doesn't appear to be responding to the ARP requests


  • Netgate Administrator

    Where did you capture that? What hosts are those IPs shown there?

    What exactly is the static route you have in place?



  • @stephenw10

    i clicked on packet capture from the diagnostics menu within pfsense.

    10.233.2.4 is just a test vm in another network i've created
    10.128.2.181 is IP on other end of IPsec tunnel
    10.239.3.5 is the LAN IP of pfsense


  • Netgate Administrator

    I assume you're capturing on the LAN interface?

    It looks like you have a subnet conflict though. pfSense is ARPing for 10.128.2.181 from 10.239.3.5; it will only do that if they are inside the same subnet. It would be a huge subnet configured there though.

    Steve



  • @stephenw10

    yes that's right, i am capturing on the LAN interface.

    It has to come from 10.239.3.5 though? the test vm has to have a route table telling it to go to the LAN address of pfsense so it can contact the 10.128.2.181 because 10.239.3.5 is the only interface which knows of this network. So i've created a route table telling the VM to send all traffic for 10.128.2.0/24 through 10.239.3.5 and this is picked up on tracert, firewall logs on pfsense and packet trace.

    How else could VM's use the IPsec tunnel?

    thanks
    Ben


  • Netgate Administrator

    The static route is in Azure then? That's correct if so.

    Still left with pfSense trying to ARP for a device in a subnet on the other end of the VPN. That should never happen.

    It implies you have the LAN subnet configured as something huge that contains it like /9.
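
    To illustrate where the /9 figure comes from, here is a minimal sketch using Python's standard `ipaddress` module. It widens the prefix around the LAN IP until the remote host falls inside it. The helper name `smallest_common_network` is made up for this example, and it only handles two addresses of the same IP version.

```python
import ipaddress


def smallest_common_network(a, b):
    """Return the smallest network that contains both addresses."""
    first = ipaddress.ip_address(a)
    second = ipaddress.ip_address(b)
    if first.version != second.version:
        raise ValueError("addresses must be the same IP version")
    net = ipaddress.ip_network(first)  # starts as a single-host network
    while second not in net:
        net = net.supernet()           # widen the prefix by one bit
    return net


# pfSense LAN IP and the remote host from the packet capture.
print(smallest_common_network("10.239.3.5", "10.128.2.181"))  # 10.128.0.0/9

# With the /24 that Ben reports for the LAN, the remote host is off-link,
# so pfSense would have no reason to ARP for it.
lan = ipaddress.ip_interface("10.239.3.5/24").network
print(ipaddress.ip_address("10.128.2.181") in lan)  # False
```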

    Steve



  • Yes there is a static route in azure which is..

    24f4b9d7-cced-4194-8f2e-bf488253c90a-image.png

    then one in pfsense which is..
    ea7e97e3-1b24-48e2-9c15-0f473acc1802-image.png

    and then i've setup a firewall nat outbound rule:

    fceb7e2b-fd86-40b7-b215-92d6709ecfbd-image.png

    "It implies you have the LAN subnet configured as something huge that contains it like /9." the lan subnet is a /24, i have setup NAT on phase 2 of the whole network /16 and this is only setting which works.

    thanks
    Ben


  • Netgate Administrator

    Ok you should not need that static route in pfSense unless you need to send traffic from the firewall itself in which case it can be applied as a workaround:
    https://docs.netgate.com/pfsense/en/latest/vpn/ipsec/accessing-firewall-services-over-ipsec-vpns.html

    You should never apply NAT using pf on the IPSec interface, that's what is breaking the traffic here. The IPSec interface does not have an address in tunnel mode.
    Any NAT you need across the IPSec tunnel must be configured in the P2 settings.

    Steve



  • Ok now that i have removed the NAT outbound rule and the gateway the test vm isn't routing to pfsense LAN at all and just gets timeouts. i've set everything back up again but still it isn't routing to the LAN. I have rebooted the box and things seem to be back to normal again.

    The link you have sent is what i have in place already. I have a lan gateway setup with a static route. Or are you saying i shouldn't have that in place for my scenario?

    I have disabled the NAT outbound rule but only put it there as things aren't routing and i'm assuming pfsense LAN doesn't know what to do with the packet when it receives it.

    I essentially just need pfsense to act like the role in windows server which handles routing so when a VM sends a packet for a VM it routes to pfsense who then forwards that packet onto the destination since it is the only one who knows where to send it. i've set the azure side up the same in this scenario but just don't know what i'm missing in pfsense?

    thanks
    Ben


  • Netgate Administrator

    You need the static route in Azure otherwise VMs there have no way to reach the subnet over the VPN.

    You only need a static route in pfSense for services on the firewall itself. It is not required for connectivity from other VMs.

    You cannot have a NAT rule on the IPSec interface; it will break the traffic. If you need NAT between those subnets, it should be in the IPSec P2 config. You probably don't need it though, since those subnets do not overlap, unless the other side is configured for some other subnet.

    Other than that you just need to have firewall rules to allow the traffic into the firewall on LAN.

    Steve



  • Hey Stephen,

    I'm a little lost here, the traffic is being captured on the firewall and it is being allowed on both sides so it doesn't seem to be a firewall blocking issue. I have also turned off windows firewall on the test vm for testing purposes.

    I've now changed the NAT outbound to automatic mode but it hasn't made any difference.

    I still get this response "Destination host unreachable." when i try to tracert to the server on other side of tunnel from test vm.

    I am not sure what else i can change?

    thanks
    Ben


  • Netgate Administrator

    Ok so start a continuous ping to the VM in Azure from the other side of the tunnel. Now run pcaps on the IPSec and then LAN interfaces filtered by the source IP you're pinging from.

    Do you see the pings in both pcaps?

    If it's not leaving the LAN check the WAN. We have seen Azure do some weird things with traffic like that.

    Steve



  • so in IPsec pcaps if i ping pfsense LAN address it comes up but if i ping the test VM it doesn't.

    if i do the same for LAN pcaps nothing comes up for LAN address or test VM

    If i change to WAN neither come up but the public IP address on the other side of the tunnel comes up when i don't run any ping commands.

    thanks
    Ben


  • Netgate Administrator

    So when you ping the VM IP from the remote end of the tunnel that traffic never arrives over IPSec at the pfSense instance in Azure?

    Then something is blocking it at the remote end or the VPN is not configured correctly so it doesn't carry that traffic.

    Steve



  • Hi Steve,

    "the VPN is not configured correctly so it doesn't carry that traffic." - the pfsense VPN or the VPN we have on other side? I have followed all of the instructions from numerous videos (which all seem to be different) and numerous online guides and what you've said above so what else could it be? My only guess is routing under 'system' or NAT under 'firewall' but i have tried numerous different settings but it has made no difference other than breaking it further.

    "Then something is blocking it at the remote end" i have the firewall switched off on the test vm and the vm on other side of tunnel so it shouldn't be being blocked.

    Do you have any other suggestions what could be causing this?

    thanks
    Ben


  • Netgate Administrator

    The configured phase 2 policy has to carry the traffic you are sending. If it doesn't match it won't see the traffic as interesting and it won't grab it and send it over the tunnel.

    The defined P2 has to match at both ends or it won't establish. We know it is established since you could ping over it from pfSense itself but what exactly has it established as? pfSense will allow a P2 to establish that is a smaller subnet defined within whatever it has configured. So for example if pfSense has 10.239.0.0/16 defined as the local subnet and the other end has 10.239.3.0/24 pfSense will allow the /24 to establish but of course that will not carry traffic from anything else in the /16.

    Looking back your LAN IP is 10.239.3.5 but the VM you are trying to connect from/to is in a different subnet so the established P2 might not carry that.
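
    A quick way to reason about this is to test each source address against the local side of the phase 2. The sketch below uses Python's `ipaddress` module with the two selectors from the example above (10.239.0.0/16 and 10.239.3.0/24). These are assumptions for illustration, as the thread does not confirm what was actually negotiated. 10.239.7.10 is a hypothetical host added to show the /16 versus /24 difference; the other two addresses are from the thread.

```python
import ipaddress


def covered_by(selector, address):
    """True if the address falls inside the phase 2 local network."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(selector)


hosts = {
    "pfSense LAN IP": "10.239.3.5",
    "hypothetical host elsewhere in the /16": "10.239.7.10",
    "test VM": "10.233.2.4",
}

for selector in ("10.239.0.0/16", "10.239.3.0/24"):
    for label, address in hosts.items():
        verdict = "carried" if covered_by(selector, address) else "NOT carried"
        print(f"{selector}: {label} ({address}) {verdict}")
```

    Under these assumptions the test VM at 10.233.2.4 is outside both selectors, so its traffic would not match the phase 2 unless its network is added as another P2 or translated by NAT in the P2 settings.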

    Show us the established P2.

    You also mentioned adding NAT in the P2 but I don't see that other than the actual NAT rule which was breaking things.

    Additionally it's not clear why pfSense was ARPing for 10.128.2.181. It must have an interface in that subnet for that to happen.

    Steve



  • ok here is the established P2:

    699b29fc-f659-4a90-be87-fbe46e9d76fd-image.png

    the local subnet doesn't look right as I would expect to see the range there, but even if I change this it doesn't make much difference

    150ece96-7029-48fa-b5d8-251ffc001c1f-image.png

    what do you mean by "Looking back your LAN IP is 10.239.3.5 but the VM you are trying to connect from/to is in a different subnet so the established P2 might not carry that.". I am trying to test connection to the vm on other side of tunnel with a vm in a different network in azure.

    doing a packet capture now i can't see it ARPing anymore it just shows this:

    17:17:00.607947 IP 10.233.2.4 > 10.239.3.5: ICMP echo request, id 1, seq 710, length 40
    17:17:01.615256 IP 10.233.2.4 > 10.239.3.5: ICMP echo request, id 1, seq 711, length 40
    17:17:01.772138 IP 10.233.2.4 > 10.128.2.181: ICMP echo request, id 1, seq 712, length 72
    17:17:05.778564 IP 10.233.2.4 > 10.128.2.181: ICMP echo request, id 1, seq 713, length 72

