Issues with the "slave" (GUI access, gateway, unbound etc) in HA-mode



  • I have two pfSense firewalls. Both have a public address, and I have a WAN CARP VIP. In addition, each firewall has an address on every interface, and every interface has a CARP VIP. I have a PFSYNC interface, and syncing between the devices seems to work fine (everything gets synced immediately).

    The issues are

    • Slave marks the gateway as down and cannot reach the internet
    • Slave GUI is accessible, but intermittently becomes completely unresponsive for many minutes
    • I see many log entries like these every second:
      Jan 3 23:03:43 kernel: arpresolve: can't allocate llinfo for 91.190.199.217 on vtnet0
      Jan 3 23:03:43 dpinger: MYGW XXX.40.88.73: sendto error: 22
      (.218 is my WAN CARP VIP and .217 is the gateway)
    • Because the slave has no internet connectivity, I presumably also see tens of thousands of these: unbound: [16041:3] error: outgoing tcp: connect: Invalid argument for 1.1.1.1 port 853

    If I manually fail over, the SLAVE becomes a really nice firewall that works perfectly, and the MASTER turns into the problematic one, just as described above.

    I have followed the HA instructions, read every possible blog on the matter and I am obviously blind to something really fundamental here.

    What info do you need to help me see what I am doing wrong? I'm not going to dump all configurations and screenshots, so point me towards something?

    Thanks.



  • "Never add outbound NAT rules that could match the WAN/Public IP addresses of the cluster. This includes both rules that have the public IP addresses listed explicitly and also rules that have any set as a source. These NAT rules will cause other problems/unintended behavior, and will break outbound connectivity from the secondary node when it is in a BACKUP state."

    Could someone explain this?


  • LAYER 8 Netgate

    Primary WAN: 192.0.2.2/29
    Secondary WAN: 192.0.2.3/29
    WAN CARP: 192.0.2.1/29

    If you have outbound NAT rules that match traffic sourced from 192.0.2.0/29, or from localhost (127.0.0.0/8), and NAT to the CARP VIP, you will be translating the source address of outbound traffic to the CARP VIP. That will cause all sorts of problems for connections made by the node that does not currently hold the CARP MASTER VIP.

    A common way this happens is people use any as a source network for outbound NAT.

    Only set outbound NAT to the CARP VIP for things that need it, like connections out WAN sourced from inside hosts.
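    The rule of thumb above can be sketched in Python. This is only an illustration using the example addresses from this post; `rule_is_risky` is a hypothetical helper, not part of pfSense, and it does not model aliases such as This Firewall.

```python
import ipaddress

# Example addresses from the post above (documentation range 192.0.2.0/29).
CARP_VIP = ipaddress.ip_address("192.0.2.1")
NODE_SOURCES = [
    ipaddress.ip_network("192.0.2.0/29"),  # WAN subnet holding both nodes
    ipaddress.ip_network("127.0.0.0/8"),   # localhost
]


def rule_is_risky(source: str, nat_address: str) -> bool:
    """Return True if a rule could translate a node's own traffic to the CARP VIP.

    `source` is a CIDR string or the literal "any" (hypothetical rule format).
    """
    if ipaddress.ip_address(nat_address) != CARP_VIP:
        return False  # not translating to the VIP at all
    if source == "any":
        return True   # "any" also covers the nodes' own addresses
    src = ipaddress.ip_network(source)
    return any(src.overlaps(net) for net in NODE_SOURCES)


print(rule_is_risky("any", "192.0.2.1"))             # True
print(rule_is_risky("127.0.0.0/8", "192.0.2.1"))     # True
print(rule_is_risky("192.168.1.0/24", "192.0.2.1"))  # False
```

    A rule sourced from an inside subnet only is not flagged, which matches the advice to NAT to the CARP VIP only for connections from inside hosts.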



  • It needs to be pointed out that:

    • my carp vip is .218 (from 91.190.199.216 /30)
    • my firewalls have an ip-address each from 91.190.195.32/28 (they are .33 and .34)

    These are the two networks I get from my ISP - the gateway to the internet being 91.190.199.217

    And my current NAT is

    	WAN	This Firewall	*	*	SG_HTTP		91.190.199.218	*			  
    	WAN	This Firewall	*	*	*	91.190.199.218	*			  
    	WAN	10.200.201.0/24	*	*	*	91.190.195.35	*			  
    	WAN	any	*	*	*	91.190.199.218	*

  • LAYER 8 Netgate

    That's pretty broken out-of-the-gate.

    That /30 should be a /29. But it should work though I've never tried it.
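    For reference, the arithmetic behind the /30 versus /29 remark can be checked with Python's `ipaddress` module. This is a sketch using the addresses posted above; how the ISP actually routes the /28 is not stated in the thread.

```python
import ipaddress

vip_net = ipaddress.ip_network("91.190.199.216/30")   # network holding the CARP VIP
node_net = ipaddress.ip_network("91.190.195.32/28")   # network holding the node addresses
gateway = ipaddress.ip_address("91.190.199.217")

# A /30 has exactly two usable hosts: here, the gateway and the CARP VIP.
usable_30 = [str(h) for h in vip_net.hosts()]
print(usable_30)            # ['91.190.199.217', '91.190.199.218']

# The gateway is not inside the subnet the nodes' own addresses come from.
print(gateway in node_net)  # False

# A /29 at the same base address would leave room for gateway, VIP and both nodes.
usable_29 = list(ipaddress.ip_network("91.190.199.216/29").hosts())
print(len(usable_29))       # 6
```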

    You should not be sourcing outbound NAT from either This Firewall or any. That's been the conversation above.



  • I read this "Never add outbound NAT rules that could match the WAN/Public IP addresses of the cluster" as "Never add outbound NAT rules that could match the public ip-addresses of the firewalls themselves, but rather always NAT from the wan CARP VIP".

    What you are saying (or how I should interpret this) is: always NAT outbound traffic from the public address of the currently active firewall, the "interface address"?

    Or, if my question is still stupid/uneducated: how should the NAT rules be constructed? Can you give me a concrete example that I can work from?



  • So this NAT thing is now as follows. If I understand correctly, each device will now NAT all its own outbound traffic to its own address, and the any rule should capture everything going from any of my subnets to the outside, using my WAN CARP VIP.

    	Interface	Source	Source Port	Destination	Destination Port	NAT Address	NAT Port	Static Port	Description	Actions
    	WAN	This Firewall	*	*	*	WAN address	*			  
    	WAN	10.200.201.0/24	*	*	*	91.190.195.35	*			  
    	WAN	any	*	*	*	91.190.199.218	*			  
    

    However, this does not really change anything. I still can't access the secondary firewall from my internal workstations, except intermittently for a few requests, and then it stops (resulting in a 504 gateway timeout).

    All problems and issues persist.


  • LAYER 8 Netgate

    YOU ARE STILL PERFORMING NAT FOR SOURCE ANY!

    A concrete example? How about the standard manual outbound NAT rules that you had when you first enabled manual outbound NAT? You decided they needed changing above and beyond simply changing the NAT address to the CARP VIP.

    [Screenshot: Screen Shot 2019-01-04 at 11.47.25 AM.png]



  • All right, so NAT rules are not processed like firewall rules, top to bottom (when a rule matches, no further processing is done). My assumption was that "any" in this case would mean "anything other than the above", which would effectively be a one-liner covering all my internal networks. If this is not the case, then I will go back to the standard set and just modify those. They are all now set to the WAN CARP VIP.

    What I have decided to do is, of course, follow the documentation step by step, trying to interpret what to do. One step in it was to change the NAT rules to the WAN CARP VIP.

    After this change, nothing really changes. The problems persist, and the slave cannot for instance ping 1.1.1.1 whereas the master can.

    	Interface	Source	Source Port	Destination	Destination Port	NAT Address	NAT Port	Static Port	Description	Actions
    	WAN	10.200.201.0/24	*	*	*	91.190.195.35	*			 
    	WAN	127.0.0.0/8	*	*	500 (ISAKMP)	91.190.199.218	*		Auto created rule for ISAKMP - localhost to WAN	 
    	WAN	127.0.0.0/8	*	*	*	91.190.199.218	*		Auto created rule - localhost to WAN	 
    	WAN	::1/128	*	*	500 (ISAKMP)	91.190.199.218	*		Auto created rule for ISAKMP - localhost to WAN	 
    	WAN	::1/128	*	*	*	91.190.199.218	*		Auto created rule - localhost to WAN	 
    	WAN	192.168.1.0/24	*	*	500 (ISAKMP)	91.190.199.218	*		Auto created rule for ISAKMP - LAN to WAN	 
    	WAN	192.168.1.0/24	*	*	*	91.190.199.218	*		Auto created rule - LAN to WAN	 
    	WAN	192.168.100.0/24	*	*	500 (ISAKMP)	91.190.199.218	*		Auto created rule for ISAKMP - BLUE to WAN	 
    	WAN	192.168.100.0/24	*	*	*	91.190.199.218	*		Auto created rule - BLUE to WAN	 
    	WAN	10.10.1.0/24	*	*	500 (ISAKMP)	91.190.199.218	*		Auto created rule for ISAKMP - DMZ to WAN	 
    	WAN	10.10.1.0/24	*	*	*	91.190.199.218	*		Auto created rule - DMZ to WAN	 
    	WAN	192.168.10.0/24	*	*	500 (ISAKMP)	91.190.199.218	*		Auto created rule for ISAKMP - MINTSERVERS to WAN	 
    	WAN	192.168.10.0/24	*	*	*	91.190.199.218	*		Auto created rule - MINTSERVERS to WAN	 
    	WAN	192.168.20.0/24	*	*	500 (ISAKMP)	91.190.199.218	*		Auto created rule for ISAKMP - MALMSERVERS to WAN	 
    	WAN	192.168.20.0/24	*	*	*	91.190.199.218	*		Auto created rule - MALMSERVERS to WAN	 
    	WAN	10.20.1.0/24	*	*	500 (ISAKMP)	91.190.199.218	*		Auto created rule for ISAKMP - SEGREGATED to WAN	 
    	WAN	10.20.1.0/24	*	*	*	91.190.199.218	*		Auto created rule - SEGREGATED to WAN	 
    	WAN	192.168.30.0/24	*	*	500 (ISAKMP)	91.190.199.218	*		Auto created rule for ISAKMP - MGMT to WAN	 
    	WAN	192.168.30.0/24	*	*	*	91.190.199.218	*		Auto created rule - MGMT to WAN	 
    	WAN	10.200.200.0/24	*	*	*	91.190.199.218	*		Auto created rule - OpenVPN server to WAN	 
    	WAN	10.100.100.0/24	*	*	*	91.190.199.218	*		Auto created rule - OpenVPN server to WAN	 
    	WAN	10.100.101.0/24	*	*	*	91.190.199.218	*		Auto created rule - OpenVPN server to WAN

  • LAYER 8 Netgate

    "All right, so NAT rules are not processed like firewall rules, top to bottom"

    Yes, they are. Your any rule at the bottom catches everything, so everything has NAT performed on it. The whole point is that some things should not have NAT to the CARP VIP performed on them.
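    First-match evaluation can be illustrated with a short Python sketch. It is a simplification: only the two subnet-based rules from the earlier post are modeled, the This Firewall rule is left out because alias expansion is not modeled here, and `first_match` is a hypothetical helper rather than anything pfSense provides.

```python
import ipaddress

# (source, NAT address) pairs, in the order they appear in the rule list.
RULES = [
    ("10.200.201.0/24", "91.190.195.35"),
    ("any", "91.190.199.218"),  # WAN CARP VIP
]


def first_match(src_ip, rules=RULES):
    """Return the NAT address of the first rule whose source matches src_ip."""
    addr = ipaddress.ip_address(src_ip)
    for source, nat_address in rules:
        if source == "any" or addr in ipaddress.ip_network(source):
            return nat_address  # first match wins, later rules are not consulted
    return None  # no rule matched: traffic leaves untranslated


print(first_match("10.200.201.5"))              # 91.190.195.35
print(first_match("91.190.195.34"))             # 91.190.199.218
print(first_match("91.190.195.34", RULES[:1]))  # None
```

    In this model, a node's own address (91.190.195.34) falls through to the any rule and is translated to the CARP VIP; with the any rule removed it is left alone.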

    In your example you are STILL natting for sources in 127.0.0.0/8, which is where traffic sourced from the firewall itself might be originating for NAT purposes, which is NOT the default configuration.

    If I were you, I would scrap what you have, start over with a simple WAN/LAN setup on a test bench, read the docs again, and get a handle on what is involved here.

    If NAT to the interface address doesn't work, then your /30 /28 scheme, which I said was broken out-of-the-gate, is incompatible with the upstream gear.

