Netgate Discussion Forum

    strange connectivity errors in HA

    HA/CARP/VIPs
    • prakti

      Hi experts,

      I have a pfSense+ HA cluster. Both members are connected directly via fiber for CARP / pfsync. In addition, both devices are connected redundantly to the LAN side (lagg0.x) and to the WAN side (lagg1.200). On the WAN side there is only one VIP for the public IP; on the LAN side there are 6 VIPs for different transfer networks to different LAN areas.

      In VLAN 200, the ISP routers also speak HSRP for their virtual gateway address. HA, CARP and the VIPs are working fine. The pfSense nodes seem to be doing what they should.

      Now the problem :

      Via Diagnostics -> Ping / Traceroute, each firewall member can correctly reach e.g. 8.8.8.8, the public IPs of the ISP routers, or the other member via its physical IP, as long as I don't manually set a source interface. If I set the source NIC (or VLAN IP interface), the second HA member can't reach (ping / traceroute) anything outside: not the first HA member, not the ISP routers, nothing.

      If I do the same on the first HA member, everything works EXCEPT reaching the public IP of the second HA member.

      When I reboot the primary HA member, the second member takes over the master role for all VIPs and can ping everything outside (without setting a source interface), but no other traffic reaches the WAN area.

      It seems I can't see the wood for the trees :-/
      Sorry for my German English ;-)

      Any ideas where to start?

      • prakti

        Some more hints after debugging the problem above:
        After setting System -> Advanced -> Firewall & NAT -> Disable Firewall AND running "pfctl -d", the problem still exists.

        From the console on the second member I can ping the provider gateway. But after adding the source address or source interface parameter to ping or traceroute, there are no answers. :-(

        With the firewall / packet filter disabled, I can be sure that it's not a firewall rule or NAT problem, correct?
        The NICs used in this firewall are sfxge0: Solarflare SFN7122F SFP+ Server Adapter

        • viragomann @prakti

          @prakti said in strange connectivity errors in HA:

          If I set the source NIC (or VLAN IP interface), the second HA member can't reach (ping / traceroute) anything outside.

          Can you give more detail on this, please?

          Which IP? The interface IP or the CARP VIP?
          How are the interfaces and virtual IPs configured on both nodes?

          • prakti

            @viragomann
            Thank you very much for your reply. From the second member, I'm testing "source pinging" with the interface IP. The interfaces are VLANs, trunked to the pfSense nodes as link aggregations (lagg1.x for the WAN trunk, lagg0.xxxx for the LAN trunk).

            Our IP range from Versatel is 83.x.x.48/28 ...
            HSRP address of Versatel is 83.x.x.49
            1st router of Versatel is 83.x.x.50
            2nd router of Versatel is 83.x.x.51

            1st of my pfSense is 83.x.x.60
            2nd of my pfSense is 83.x.x.61
            VIPs of both are
            83.x.x.53
            83.x.x.54
            83.x.x.55

            Without setting an internal source address, the .x.61 can ping everything above, 8.8.8.8 and everything else.

            The internal (LAN) transfer segments look like this:
            1st of my pfSense is 172.23.0.2
            2nd of my pfSense is 172.23.0.3
            VIP 172.23.0.1
            For example, the addressing towards the core network (Extreme Networks virtual fabric, VSP7400 platform) in this case:
            core switch 1 172.23.0.250
            core switch 2 172.23.0.251
            core switch 3 172.23.0.252
            core switch 4 172.23.0.253
            core virtual (VRRP) 172.23.0.254

            This continues for five more transfer networks, for example:
            1st of my pfSense is 172.23.1.2
            2nd of my pfSense is 172.23.1.3
            VIP 172.23.1.1
            and the addressing towards the core network (Extreme virtual fabric) in this case:
            core switch 1 172.23.1.250
            core switch 2 172.23.1.251
            core switch 3 172.23.1.252
            core switch 4 172.23.1.253
            core virtual (VRRP) 172.23.1.254
            etc.
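
The addressing pattern listed above can be written down compactly. The following is only a sketch: it assumes every transfer network is a /24 of the form 172.23.<n>.0 and that the host offsets shown for the first two networks repeat in the others. The thread confirms the /24 mask only for 172.23.3.1. The name `transfer_plan` and the role labels are hypothetical.

```python
import ipaddress

# Host offsets inside each transfer network, taken from the listing above.
ROLE_OFFSETS = {
    "CARP VIP": 1,
    "pfSense 1": 2,
    "pfSense 2": 3,
    "core switch 1": 250,
    "core switch 2": 251,
    "core switch 3": 252,
    "core switch 4": 253,
    "core virtual (VRRP)": 254,
}


def transfer_plan(index: int) -> dict:
    """Return role -> address for transfer network number `index`.

    Assumption: the network is 172.23.<index>.0/24 and reuses the same offsets.
    """
    net = ipaddress.ip_network(f"172.23.{index}.0/24")
    return {role: str(net.network_address + offset)
            for role, offset in ROLE_OFFSETS.items()}


if __name__ == "__main__":
    for role, address in transfer_plan(3).items():
        print(f"{role:20} {address}")
```

Under these assumptions, network 3 places the second pfSense member at 172.23.3.3, which is the source address used in the traceroute test.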

            The problem looks like this:

            traceroute -n -s 172.23.3.3 8.8.8.8
            traceroute to 8.8.8.8 (8.8.8.8) from 172.23.3.3, 64 hops max, 40 byte packets
            1 * * *
            2 * * *
            3 * * *
            4 * * *
            5 * * *

            The result is the same when tracing to 83.x.x.60 (1st firewall member) or to 83.x.x.49 with a source IP set :-/

            I'm a bit desperate

            • viragomann @prakti

              @prakti said in strange connectivity errors in HA:

              traceroute -n -s 172.23.3.3 8.8.8.8

              So which device is this source IP assigned to?

              Note that you can only use IPs assigned to the respective pfSense node itself. You cannot use an arbitrary internal IP, nor a CARP VIP or any other VIP that depends on a CARP VIP, while the node is in backup state.

              • prakti @viragomann

                @viragomann said in strange connectivity errors in HA:

                So which device is this source IP assigned to?

                172.23.3.3 is (one of) the internal IP addresses on the VLAN interface of the second HA member.
                172.23.3.2 is the address of the first HA member, and
                172.23.3.1 is the VIP for this VLAN:

                "172.23.3.1/24 (vhid: 8)"

                So "traceroute -s" should work?

                • viragomann @prakti

                  @prakti
                  Yes, it's the same here.

                  I investigated this by sniffing the WAN traffic. The reason was found immediately.
                  If you use an internal IP to ping a public host, the outbound NAT rule is applied to the traffic, which translates the source to the CARP WAN VIP.
                  Hence responses go to the master node and the backup doesn't get a reply.
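
As a rough illustration of the explanation above, the reply path can be modelled in a few lines of Python. This is a toy sketch, not pf's actual NAT logic. The 198.51.100.x addresses are documentation placeholders, because the thread elides the real public IPs. The names `outbound_source`, `reply_receiver` and `ping_gets_reply` are hypothetical. The sketch assumes a single outbound NAT rule that rewrites any source in 172.23.0.0/16 to the WAN CARP VIP.

```python
import ipaddress

# Placeholder addresses (RFC 5737 documentation range), not the real ones.
WAN_CARP_VIP = "198.51.100.53"
NODE_WAN_IPS = {"fw1": "198.51.100.60", "fw2": "198.51.100.61"}
# Assumption: all internal transfer networks sit inside 172.23.0.0/16.
INTERNAL_NET = ipaddress.ip_network("172.23.0.0/16")


def outbound_source(src_ip: str) -> str:
    """Source address on the wire after the assumed outbound NAT rule."""
    if ipaddress.ip_address(src_ip) in INTERNAL_NET:
        return WAN_CARP_VIP  # internal sources are translated to the CARP VIP
    return src_ip            # the node's own WAN address is left untouched


def reply_receiver(wire_src: str, carp_master: str) -> str:
    """Node that receives a reply addressed to `wire_src`."""
    if wire_src == WAN_CARP_VIP:
        return carp_master   # only the current CARP master holds the VIP
    for node, ip in NODE_WAN_IPS.items():
        if ip == wire_src:
            return node
    raise ValueError(f"no node owns {wire_src}")


def ping_gets_reply(sender: str, src_ip: str, carp_master: str) -> bool:
    """True if the reply comes back to the node that sent the ping."""
    return reply_receiver(outbound_source(src_ip), carp_master) == sender


if __name__ == "__main__":
    # Backup node pinging from its internal address while fw1 is master.
    print(ping_gets_reply("fw2", "172.23.3.3", carp_master="fw1"))
    # Backup node pinging from its own WAN address.
    print(ping_gets_reply("fw2", "198.51.100.61", carp_master="fw1"))
```

In this model the backup node never sees the reply to a ping sourced from an internal address, because the reply is addressed to the VIP held by the master. A ping sourced from the node's own WAN address comes back to the sender.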

                  • prakti @viragomann

                    @viragomann
                    Hi viragomann,

                    Thank you very much for your time and investigation. Your answer was very important in bringing me back onto the right path for debugging. The reason clients couldn't reach the internet was an inconsistent pfBlockerNG configuration between the two HA members. I had ignored errors like these:

                    /rc.filter_configure_sync: New alert found: Unresolvable source alias 'pfB_BinaryDefense_v4' for rule 'NAT Allow HTTPS_2_xxxxxxxx'
                    Dec 14 16:17:17 svrfw02 php-fpm[32037]: /rc.filter_configure_sync: New alert found: Unresolvable source alias 'pfB_DNSBLIP_v4' for rule 'NAT Allow HTTP_2_xxxxxxxx'
                    Dec 14 16:17:17 svrfw02 php-fpm[32037]: /rc.filter_configure_sync: New alert found: Unresolvable source alias 'pfB_DNSBLIP_v4' for rule 'NAT Allow HTTPS_2_xxxxxxxx'
                    Dec 14 16:17:18 svrfw02 php-fpm[32037]: /rc.filter_configure_sync: New alert found: There were error(s) loading the rules: /tmp/rules.debug:299: syntax error - The line in question reads [299]: rdr on lagg1.808 inet proto tcp from ! to 83.x.x.54 port 443 -> $SERVER_xxxxxxxx

                    After fixing this, switching between the CARP members works correctly.
                    Again, thank you for your assistance!
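
Alerts like the ones quoted above are easy to overlook in a busy system log. A minimal sketch of a scan for them follows, assuming the log text is available as a string and that the message wording matches the lines in this thread. The function name `unresolvable_aliases` is hypothetical, and the sample lines are copied from the post.

```python
import re
from collections import defaultdict

# Matches the alert wording seen in the thread; other variants are not covered.
ALIAS_RE = re.compile(r"Unresolvable source alias '([^']+)' for rule '([^']+)'")

SAMPLE_LOG = """\
/rc.filter_configure_sync: New alert found: Unresolvable source alias 'pfB_BinaryDefense_v4' for rule 'NAT Allow HTTPS_2_xxxxxxxx'
Dec 14 16:17:17 svrfw02 php-fpm[32037]: /rc.filter_configure_sync: New alert found: Unresolvable source alias 'pfB_DNSBLIP_v4' for rule 'NAT Allow HTTP_2_xxxxxxxx'
Dec 14 16:17:17 svrfw02 php-fpm[32037]: /rc.filter_configure_sync: New alert found: Unresolvable source alias 'pfB_DNSBLIP_v4' for rule 'NAT Allow HTTPS_2_xxxxxxxx'
"""


def unresolvable_aliases(log_text: str) -> dict:
    """Map each unresolvable alias to the rules that reference it."""
    found = defaultdict(list)
    for line in log_text.splitlines():
        match = ALIAS_RE.search(line)
        if match:
            alias, rule = match.groups()
            found[alias].append(rule)
    return dict(found)


if __name__ == "__main__":
    for alias, rules in unresolvable_aliases(SAMPLE_LOG).items():
        print(f"{alias}: {', '.join(rules)}")
```

Running the same scan against the log of each HA member and comparing the results is one way to spot aliases that exist on one node but not on the other.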

                    Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.