Outbound NAT failure using a CARP interface



  • I'm trying to route outbound traffic from a DMZ (in this case DMZ1) via a CARP IP (.44 in the screenshots).

    I already have a second DMZ which is working perfectly (although this is using the primary gateway CARP IP of .42 as the NAT address).

    I can ping all the CARP IPs from my internal LAN, so they're definitely up.

    If I change the outbound NAT rule to either use the WAN address (.43 on the primary firewall) or the primary CARP virtual IP (.42) it also works.

    However, I need hosts on DMZ1 to have .44 as the visible external source IP.

    I also have an inbound NAT rule to this same DMZ (DMZ1) which doesn't work either, although it has worked intermittently.

    There is only a single route in 'System / Routing' and this is .41.

    This has got me confused, as I'm not seeing anything of note in the logs either.

    Can anybody help please?

    ![Firewall_ NAT_ Port Forward.png](/public/imported_attachments/1/Firewall_ NAT_ Port Forward.png)
    ![Firewall_ Rules_DMZ1.png](/public/imported_attachments/1/Firewall_ Rules_DMZ1.png)
    ![Firewall_ Rules_DMZ2.png](/public/imported_attachments/1/Firewall_ Rules_DMZ2.png)
    ![Firewall_ Rules_LAN.png](/public/imported_attachments/1/Firewall_ Rules_LAN.png)
    ![Firewall_ Rules_WAN.png](/public/imported_attachments/1/Firewall_ Rules_WAN.png)
    ![Firewall_ Virtual_IP_Addresses.png](/public/imported_attachments/1/Firewall_ Virtual_IP_Addresses.png)



  • I didn't notice any access rules like: WAN CARP:port -> LAN IP:port, allow.
    Alternatively, change the NAT forwarding rule to "Filter rule association: Pass".



  • This does actually look like a bug.

    I have been able to replicate it.

    1. Start an outbound ping from the DMZ
    2. Edit a firewall rule (no need to make any changes)
    3. Save the rule, and click "Apply changes"
    4. Ping stops almost immediately
    5. Go to CARP, select the outbound CARP IP for the DMZ
    6. Edit, change nothing and save.
    7. Ping begins responding again.

    I also have an inbound service to this IP, and when the outbound ping stops, this service stops working too.

    After performing the above steps, the inbound service works again.

    This has to be a bug surely?

    Any outbound NAT traffic (both LAN and DMZ2) via the CARP virtual IP* is unaffected.

    Also, if I set DMZ1 to NAT out through the CARP virtual IP* - again, outbound traffic is unaffected from DMZ1.

    * By "CARP virtual IP" I mean the 'primary' firewall external IP (.42).


  • @Olman:

    I didn't notice any access rules like: WAN CARP:port -> LAN IP:port, allow.
    Alternatively, change the NAT forwarding rule to "Filter rule association: Pass".

    I'm sorry, can you explain this in more detail please?

    Thanks. :)



  • Pretty basic stuff there, you're not hitting any bugs.

    Sounds like a problem with the .44 IP in general, like having an IP conflict on that IP, or potentially a MAC conflict if its VHID is in use elsewhere.
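
To illustrate the VHID point: CARP derives the virtual MAC address from the VHID alone, in the form 00:00:5e:00:01:XX, so two VIPs with the same VHID on one layer 2 segment share a MAC even when their IP addresses differ. A minimal Python sketch of that mapping follows; the function name `carp_virtual_mac` is hypothetical and only here for illustration.

```python
def carp_virtual_mac(vhid: int) -> str:
    """Return the virtual MAC address CARP uses for a given VHID (1-255)."""
    if not 1 <= vhid <= 255:
        raise ValueError("VHID must be between 1 and 255")
    # The prefix is fixed; the last octet is the VHID in hexadecimal.
    return f"00:00:5e:00:01:{vhid:02x}"


# The MAC depends only on the VHID, never on the virtual IP address.
print(carp_virtual_mac(1))
print(carp_virtual_mac(44))
```

Comparing the output against the MAC addresses seen in `arp -a` on a neighbouring host is one way to tell a VHID clash apart from a plain IP conflict, where the offending MAC belongs to a physical NIC instead.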



  • I searched for MAC conflicts and:

    From the primary firewall log:

    kernel: arp: 00:26:55:d9:e8:6b is using my IP address x.x.x.45 on em2!

    kernel: arp: 00:26:55:d9:e8:6b is using my IP address x.x.x.44 on em2!

    That is the MAC address of em2 on the backup firewall.

    
    Apr 15 09:24:43 	php: rc.filter_synchronize: Filter sync successfully completed with http://192.168.100.2:80.
    Apr 15 09:24:39 	kernel: arp: 00:26:55:d9:e8:6b is using my IP address x.x.x.44 on em2!
    Apr 15 09:24:39 	kernel: arp: 00:26:55:d9:e8:6b is using my IP address x.x.x.45 on em2!
    Apr 15 09:24:39 	php: rc.filter_synchronize: XMLRPC sync successfully completed with http://192.168.100.2:80.
    Apr 15 09:24:38 	php: rc.filter_synchronize: Beginning XMLRPC sync to http://192.168.100.2:80.
    Apr 15 09:24:37 	check_reload_status: Reloading filter
    Apr 15 09:24:36 	check_reload_status: Syncing firewall
    
    

    This suggests to me that, as part of reloading the filter on the primary, the interface(s) are taken down momentarily, which causes the backup firewall to bring its interfaces up and hence produces an IP address conflict?
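
For anyone hunting the same symptom, a small script can pull these kernel messages out of a saved system log and group the affected IPs by offending MAC address. This is only a sketch: the regular expression assumes the exact message wording shown in the log excerpt above, and `find_arp_conflicts` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Matches kernel lines such as:
#   arp: 00:26:55:d9:e8:6b is using my IP address x.x.x.44 on em2!
ARP_CONFLICT = re.compile(
    r"arp: (?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}) is using my IP address "
    r"(?P<ip>\S+) on (?P<iface>\w+)!",
    re.IGNORECASE,
)


def find_arp_conflicts(log_text: str) -> dict:
    """Map (mac, interface) to the set of local IPs that MAC was seen claiming."""
    conflicts = defaultdict(set)
    for line in log_text.splitlines():
        match = ARP_CONFLICT.search(line)
        if match:
            conflicts[(match["mac"], match["iface"])].add(match["ip"])
    return dict(conflicts)


sample = """\
Apr 15 09:24:39 kernel: arp: 00:26:55:d9:e8:6b is using my IP address x.x.x.44 on em2!
Apr 15 09:24:39 kernel: arp: 00:26:55:d9:e8:6b is using my IP address x.x.x.45 on em2!
Apr 15 09:24:37 check_reload_status: Reloading filter
"""

for (mac, iface), ips in find_arp_conflicts(sample).items():
    print(mac, iface, sorted(ips))
```

If the reported MAC is a physical NIC address, as it is here, the other machine holds the address as an ordinary IP or IP Alias rather than as a CARP VIP.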

    EDIT:

    Ok - so it was my stupidity (do I really want to admit this on a public forum?) :)

    I don't know how I missed this but…

    During the setup phase, I created an "IP Alias" rather than a "CARP" Virtual IP address.

    However, at some point when the configuration synced across, the CARP VIPs were created on the secondary, but the old IP Aliases were left in place.

    These IP Aliases were not deleted, because the IP addresses were referenced in NAT rules.

    To fix it:

    1. Disabled NAT and Interface syncing in System / High Avail. Sync.
    2. On the backup, deleted all interfaces and NAT / firewall rules.
    3. Re-enabled sync.

    All working fine now.

    Like I said, I don't know how I missed looking at the interface status on the backup earlier; the problem was staring me in the face. :)



  • cmb,

    Just wanted to say thank you for explaining the conflict scenario as you did.
    I was running into a similar situation to the one the OP described here.

    Long story short: in our new school setup between buildings 10 miles apart, the two pfSense machines are now trunked via fibre rather than our old IPsec VPN setup.
    I had two different public IPs as CARP IPs on each of the pfSense machines, but the VHID number was the same on each of these.

    The port forward would function for about 30 seconds (or less) then quit.
    I too thought something might be flaky.
    My flaky brain was the problem.

    Changing the VHID up by one fixed the "bug"/conflict… :)

    Barry
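
The VHID clash described in this last post can be checked for before it bites. The following is a rough sketch under the assumption that every CARP VIP on a shared layer 2 segment should have its own VHID; the host names, addresses (taken from the documentation range 203.0.113.0/24) and the helper `find_vhid_conflicts` are all made up for illustration.

```python
from collections import defaultdict


def find_vhid_conflicts(vips):
    """Return {vhid: [ips]} for every VHID that is shared by more than one virtual IP.

    `vips` is an iterable of (host, virtual_ip, vhid) tuples. The same virtual IP
    appearing on several hosts (a normal HA pair) is not reported.
    """
    ips_by_vhid = defaultdict(set)
    for _host, ip, vhid in vips:
        ips_by_vhid[vhid].add(ip)
    return {vhid: sorted(ips) for vhid, ips in ips_by_vhid.items() if len(ips) > 1}


# Hypothetical inventory: two sites on one trunked segment reusing VHID 1.
vips = [
    ("building-a", "203.0.113.10", 1),
    ("building-b", "203.0.113.20", 1),
    ("building-b", "203.0.113.21", 2),
]

print(find_vhid_conflicts(vips))
```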

