NEW CARP Setup seems to have a mind of its own and secret firewall rules - limited/no logs

Hass

I've set up a new firewall HA cluster with CARP (latest 2.4 image). Everything looks okay, but odd things don't work and there are no logs. It seems to be related to CARP VIPs and rules with IDs that I can't find, even when running "pfctl -vvsr" on both firewalls.

I'll try to get a diagram up, but in summary I have a multi-WAN, multi-LAN setup. Outbound web access "seems" fine, but access between LAN networks is not, and there seem to be lots of other little quirks. Each internal LAN network has a CARP VIP default gateway of .1, with .2 and .3 going to the firewall interfaces. It's reversed on the WAN side and uses the middle of a block.

Two example issues:

We couldn't use the CARP VIP for DNS; it just didn't respond to requests, but .2 and .3 do.
I can't ping from one subnet to another, and there were no logs. I could ping the pfSense interface on the opposite subnet but not a connected device.
After trying everything (adding DENY rules, doing packet captures, etc.) and still seeing nothing, I changed the default gateway from the CARP VIP to the primary firewall's interface IP. Then, boom, I saw the blocked traffic I was expecting. I removed the DENY rule I had added to see if I could get it to work, and then I got blocked by a rule with an ID I can't track down.

I've used pfSense for over 10 years and it's always been rock solid, but this is the first time I've deployed an HA cluster.

If anyone has seen anything like this, please let me know, or if you know a way to "reset" all the firewalls back to the start, as I'm thinking that might help.

Thanks in advance

Hass

Derelict

Start with one WAN and one LAN. Get that working, then move on.

Hass

Thanks @Derelict, we actually did that and it was fine. It's only since we implemented CARP that things have all gone strange, and I don't know the best way to reset it.

Is it better to break the HA cluster and roll it back to a single node?

Also, do you have any idea how I can trace a rule that doesn't come up with pfctl -vvsr?

Thanks for the quick reply

Hass

Derelict

What does hovering over the red X say?

Derelict

I was talking about implementing CARP on one WAN and one LAN then moving on to the multi-wan/lan configuration. Better to do two interfaces incorrectly than to do 20 only to find out they're all done wrong.

Hass

Sorry, I had to wait before it would let me post again.

Hass

This is a quick diagram; it doesn't show all of the access layer, etc.

Derelict

OK that tells us nothing about how you have it configured.

Hass

@Derelict I know, notes coming!

Derelict

Does grep 1566102523 /tmp/rules.debug show anything?
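
For anyone following along, here is a rough sketch of chasing a rule ID from the shell on each node. It assumes the ID is the 1566102523 mentioned above, that the saved configuration lives at /cf/conf/config.xml (the usual pfSense location), and that user-created rules get their tracking ID from the Unix time at which they were created, so decoding it may hint at when the rule was added. The helper name epoch_to_utc is made up for this example.

```shell
#!/bin/sh
# Sketch only: look for a rule tracking ID in the places pfSense keeps rules.
ID=1566102523   # the ID being chased in this thread

# Hypothetical helper: decode an epoch-style tracking ID to a UTC timestamp.
# Tries BSD date syntax first (pfSense/FreeBSD), then GNU date syntax.
epoch_to_utc() {
    date -u -r "$1" '+%Y-%m-%d %H:%M:%S' 2>/dev/null ||
        date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

echo "If $ID is a creation time, it decodes to $(epoch_to_utc "$ID") UTC"

# Search the generated ruleset and the saved configuration.
for f in /tmp/rules.debug /cf/conf/config.xml; do
    if [ -r "$f" ]; then
        grep -n "$ID" "$f" || echo "no match in $f"
    fi
done

# Search the ruleset that pf actually has loaded (only where pfctl exists).
if command -v pfctl >/dev/null 2>&1; then
    pfctl -vvsr | grep "$ID" || echo "no match in the loaded ruleset"
fi
```

If the ID turns up on one node but not the other, that would at least say which firewall is logging the block.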

Hass

@Derelict Nope, it returns nothing, and when I look in the debug file again I don't see that rule.

FYI, the other item I'm not sure we changed was editing the manual outbound NAT rules in the same way we did for the WAN, i.e. changing the address from the interface to the VIP. I just thought of that; I'm not sure how it relates to these lost rules.

Derelict

No idea what that means. You only need outbound NAT for WANs in general.

Hass

Thanks, that answers that then!

Derelict

Set outbound NAT on WANs to a CARP VIP for sources that need NAT (inside networks, etc., NOT localhost, the WAN subnet, or source any).

Set DHCP to give a LAN CARP VIP as the default gateway to inside hosts.

Set DHCP to give a CARP VIP as the DNS server to inside hosts if you are using pfSense for DNS.

Configure the failover peer in the DHCP servers on the Primary to the IP address ON THAT INTERFACE on the secondary.

Configure XMLRPC sync on the sync interface on the primary to the secondary - not the other way around.

Configure pfsync on the sync interface bidirectionally on both nodes.

That's pretty much it.
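
A quick way to sanity-check the list above from a shell on each node is to confirm which node currently holds each VIP. The sketch below assumes FreeBSD-style ifconfig output, where each CARP VIP shows up as a "carp: MASTER vhid ..." or "carp: BACKUP vhid ..." line under its interface. The function name carp_summary is made up for this example.

```shell
#!/bin/sh
# Sketch only: summarise CARP state per interface from ifconfig output.
# Reads ifconfig-style text on stdin, so it can be tried on saved output too.
carp_summary() {
    awk '
        # Interface header lines start in column 1, e.g. "igb1: flags=..."
        /^[^ \t]/ { sub(/:$/, "", $1); ifname = $1 }
        # CARP lines look like: "carp: MASTER vhid 1 advbase 1 advskew 0"
        $1 == "carp:" { printf "%s vhid %s %s\n", ifname, $4, $2 }
    '
}

# Run against the live interfaces where ifconfig is available.
if command -v ifconfig >/dev/null 2>&1; then
    ifconfig | carp_summary
fi
```

In a healthy pair you would normally expect every VIP to be MASTER on the primary and BACKUP on the secondary; a VIP that is MASTER on both nodes, or on neither, is worth looking at before anything else.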

Hass

Thanks @Derelict, I'll step through all of this again now and check them off. The only one I know didn't work last time was DNS (line below), but I'll test that again on a VLAN that is now in use. FYI, we're using the DNS Resolver (within pfSense).

Set DHCP to give a CARP VIP as the DNS server to inside hosts if you are using pfSense for DNS.

Derelict

It should work for DNS to send them to both nodes' interface addresses but if one is down it will have to time out and might induce client-perceptible delays. If you want multiple DNS servers for your clients my suggestion would be to put them on something off the firewall/HA pair, such as dedicated unbound/BIND servers, domain controllers, etc.

Hass

@Derelict Yes, understood. We tried setting it to the CARP VIP and it just didn't provide any DNS resolution, but I'm testing again now.

Derelict

You can always just dig @carp_vip www.google.com from an inside host to see if unbound is listening there. It should be.
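
To compare the VIP against the node addresses in one go, something like the following can be run from an inside host. The addresses in the commented-out call are placeholders following the .1/.2/.3 layout described earlier, and the function name check_dns and the DIG override variable are made up for this sketch.

```shell
#!/bin/sh
# Sketch only: query each candidate DNS server once with a short timeout.
# DIG may be set to point at a different dig binary (assumes dig is installed).
check_dns() {
    name=$1; shift
    for server in "$@"; do
        # +time/+tries stop a dead server from stalling the loop; +short trims output.
        if answer=$("${DIG:-dig}" "@$server" "$name" +short +time=2 +tries=1) &&
            [ -n "$answer" ]; then
            echo "$server ok: $(echo "$answer" | head -n 1)"
        else
            echo "$server no answer"
        fi
    done
}

# Placeholder addresses: CARP VIP (.1) and the two node interface addresses (.2, .3).
# check_dns www.google.com 192.168.1.1 192.168.1.2 192.168.1.3
```

If .2 and .3 answer but .1 does not, the resolver itself is up and the problem is specific to traffic addressed to the VIP.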

Hass

@Derelict It looks like DNS on the .1 (CARP VIP) works on the two OPT networks (GUEST and LAB) but not on the LAN network. I just get a timeout when I try to look up against DNS on the LAN CARP VIP.

Derelict

Are you passing DNS traffic to the LAN CARP VIP on that interface?

Did you change the default listening interfaces on unbound from All?

Is the source address of the query from an address on a pfSense interface or from a network routed from a downstream router on LAN?

Anything interesting logged in the DNS Resolver logs?