New CARP setup seems to have a mind of its own and secret firewall rules - limited/no logs

  • I've set up a new firewall HA cluster with CARP (latest 2.4 image). Everything looks okay, but odd things don't work and there are no logs. It seems to be related to CARP VIPs and rules with IDs that I can't find, even when running "pfctl -vvsr" on both firewalls.

    I'll try to get a diagram up, but in summary I have a multi-WAN, multi-LAN setup. Outbound web access "seems" fine, but access between LAN networks is not, and there seem to be lots of other little quirks. Each internal LAN network has a CARP VIP default gateway of .1, with .2 and .3 going to the firewall interfaces. It's reversed on the WAN side and uses the middle of a block.

    Two example issues are:

    1. We couldn't use the CARP VIP for DNS. It just didn't respond to requests, but .2 and .3 do.

    2. I can't ping from one subnet to another, and there were no logs. I could ping the pfSense interface on the opposite subnet but not a connected device.
       After trying everything (adding DENY rules, doing packet captures, etc., and still seeing nothing), I changed the default gateway from the CARP VIP to the primary firewall's interface IP. Then boom, I saw the blocked traffic I was expecting. I removed the DENY I had added to see if I could now get it to work, and then I got blocked by a rule with an ID I can't track down.

    I've used pfSense for over 10 years and it's always been rock solid, but this is the first time I've deployed an HA cluster.

    If anyone has seen anything like this, please let me know, or if you know a way to "reset" all the firewalls back to the start, as I'm thinking that might help.

    Thanks in advance


  • LAYER 8 Netgate

    Start with one WAN and one LAN. Get that working, then move on.

  • Thanks @Derelict, we actually did that and it was fine. It was really only once we implemented CARP that things went strange, and I don't know the best way to reset it.

    Is it better to break the HA cluster and roll it back to a single node?

    Also, do you have any idea how I can trace a rule that doesn't come up with pfctl -vvsr?

    Thanks for the quick reply



  • LAYER 8 Netgate

    What does hovering over the red X say?

  • LAYER 8 Netgate

    I was talking about implementing CARP on one WAN and one LAN, then moving on to the multi-WAN/LAN configuration. Better to do two interfaces incorrectly than to do 20 only to find out they're all done wrong.

  • Sorry, I had to wait before it would let me post again.


  • This is a quick diagram; it doesn't show all of the access layer, etc.


  • LAYER 8 Netgate

    OK that tells us nothing about how you have it configured.

  • @Derelict I know, notes coming!

  • LAYER 8 Netgate

    Does grep 1566102523 /tmp/rules.debug show anything?
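
    A minimal sketch of that lookup as a reusable helper, so the same ID can be searched in the generated rules file and in the loaded ruleset. The function name find_tracker is hypothetical; it only wraps grep and assumes the ID appears literally in whatever text it is given.

```shell
# find_tracker ID [FILE...]
# Print every line (with its line number) that contains ID as a literal string.
# With no FILE arguments it reads standard input, so it can sit at the end of a pipe.
find_tracker() {
    id="$1"
    shift
    grep -n -F -- "$id" "$@"
}

# Usage on the firewall (commands and path as quoted in this thread):
#   find_tracker 1566102523 /tmp/rules.debug
#   pfctl -vvsr | find_tracker 1566102523
```

    If both searches come back empty on both nodes, the rule doing the blocking is not in the ruleset those nodes have loaded.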

  • @Derelict Nope, it returns nothing, and when I look in the debug file again I don't see that rule.

    FYI, the other item I'm not sure we changed was the manual outbound NAT rules, editing them the same way we did for the WAN: from the interface address to the VIP. I just thought of that; I'm not sure how it relates to these lost rules.

  • LAYER 8 Netgate

    No idea what that means. You only need outbound NAT for WANs in general.

  • Thanks, that answers that then!

  • LAYER 8 Netgate

    Set outbound NAT on WANs to a CARP VIP for sources that need NAT (inside networks, etc.; NOT localhost, the WAN subnet, or source any).

    Set DHCP to give a LAN CARP VIP as the default gateway to inside hosts.

    Set DHCP to give a CARP VIP as the DNS server to inside hosts if you are using pfSense for DNS.

    Configure the failover peer in the DHCP servers on the Primary to the IP address ON THAT INTERFACE on the secondary.

    Configure XMLRPC sync on the sync interface on the primary to the secondary - not the other way around.

    Configure pfsync on the sync interface bidirectionally on both nodes.

    That's pretty much it.
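
    One way to sanity-check the result of that list is to look at the CARP state of each VHID on both nodes: on a healthy pair the primary normally shows MASTER for every VHID and the secondary shows BACKUP. The sketch below assumes the FreeBSD ifconfig line format "carp: MASTER vhid 1 advbase 1 advskew 0"; the function name carp_states is hypothetical.

```shell
# carp_states
# Read ifconfig output on stdin and print one "vhid N: STATE" line per CARP VHID.
# Assumes lines of the form: "carp: MASTER vhid 1 advbase 1 advskew 0".
carp_states() {
    awk '$1 == "carp:" && $3 == "vhid" { print "vhid " $4 ": " $2 }'
}

# Usage on each node:
#   ifconfig | carp_states
```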

  • Thanks @Derelict, I'll step through all of this again now and check the items off. The only one I know didn't work last time was DNS (line below), but I'll test that again on a VLAN that is now in use. FYI, we're using DNS Resolver (within pfSense).

    Set DHCP to give a CARP VIP as the DNS server to inside hosts if you are using pfSense for DNS.

  • LAYER 8 Netgate

    It should work for DNS to send them to both nodes' interface addresses, but if one is down the query will have to time out, which might induce client-perceptible delays. If you want multiple DNS servers for your clients, my suggestion would be to put them on something off the firewall/HA pair, such as dedicated unbound/BIND servers, domain controllers, etc.

  • @Derelict Yes, understood. We tried setting it to the CARP VIP and it just didn't provide any DNS resolution. But I'm testing again now.

  • LAYER 8 Netgate

    You can always just dig @carp_vip from an inside host to see if unbound is listening there. It should be.
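
    A small sketch of that test, run from an inside host against the VIP and both node addresses in one go. The function name check_dns and the addresses in the usage comment are placeholders; it relies only on dig's +short, +time and +tries options and on dig returning a non-zero exit status when no server replies.

```shell
# check_dns NAME SERVER...
# Query each SERVER once for NAME with a 2 second timeout and report
# whether it answered. The DNS answer itself is discarded.
check_dns() {
    name="$1"
    shift
    for server in "$@"; do
        if dig "@$server" "$name" +short +time=2 +tries=1 >/dev/null 2>&1; then
            echo "$server: answered"
        else
            echo "$server: no answer"
        fi
    done
}

# Usage (placeholder addresses: .1 is the CARP VIP, .2 and .3 the nodes):
#   check_dns example.com 192.0.2.1 192.0.2.2 192.0.2.3
```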

  • @Derelict It looks like DNS on the .1 (CARP VIP) works on the two OPT networks (GUEST and LAB) but not on the LAN network. I just get a timeout when I try a lookup against DNS on the LAN CARP VIP.

  • LAYER 8 Netgate

    Are you passing DNS traffic to the LAN CARP VIP on that interface?

    Did you change the default listening interfaces on unbound from All?

    Is the source address of the query from an address on a pfSense interface or from a network routed from a downstream router on LAN?

    Anything interesting logged in the DNS Resolver logs?

  • Hi, I have the interfaces in DNS Resolver set up with LAN and the LAN CARP VIP highlighted.

    If I do a packet capture on the firewall's LAN interface for that LAN CARP VIP, I don't see any traffic for a DNS lookup or even a ping (which works).

    FYI, I checked this on both firewalls to be sure.
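
    When a ping to the VIP is answered but neither node's capture shows it, one thing worth checking is which MAC address the client has learned for the VIP. CARP VIPs answer from a virtual MAC of the form 00:00:5e:00:01:XX, where XX is the VHID in hex, so any other MAC means some other device is answering for that address. A minimal sketch, with a hypothetical helper name and a placeholder address, assuming zero-padded MAC notation:

```shell
# is_carp_mac MAC
# Succeed if MAC is in the CARP/VRRP virtual range 00:00:5e:00:01:XX.
# Assumes zero-padded, colon-separated notation; case is ignored.
is_carp_mac() {
    mac=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    case "$mac" in
        00:00:5e:00:01:[0-9a-f][0-9a-f]) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a LAN client, after pinging the VIP (placeholder address):
#   arp -an | grep '(192.0.2.1)'
#   is_carp_mac 00:00:5e:00:01:01 && echo "VIP is answered by a CARP virtual MAC"
```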


  • I tried a DENY rule to see if I could just get the firewall to say something, but no dice.


  • !!!!!!!! It might have been the old firewall. It had been disabled, but when I took a look at it, it still had a lot of lights on the ports... I'm going back through all my testing now.

    I just wanted to let you know @Derelict

  • @Derelict Just wanted to let you know it's looking a lot better now, and I think it was just that lingering interface that should have been down that caused the issue (which then caused others).

    Thanks for coming back so quickly on a Sunday. FYI, I've now hit another known Intel 10G issue, which I'll post about once I've re-read the previous threads.