Netgate Discussion Forum

    Problem with standby node

    General pfSense Questions
    18 Posts 4 Posters 1.6k Views

    • cmouse

      We have an HA setup here in which the standby node is constantly having issues. It is mostly unreachable over LAN, and if you do manage to reach it, it kicks you out of SSH after a while and the web UI becomes unresponsive. We have rebooted it several times to no avail. Any ideas how this could be debugged further?

      The HA pair is Netgate SG-8860 version 2.4.2-RELEASE-p1 (amd64).

      • cmouse

        After much head-scratching, I found that:

        • Someone had configured a /32 instead of a /29 for the WAN IP, which caused some of the issues.

        • For some reason, device management does not work from outside the L2 network when the node is in standby mode. This has now been confirmed on both HA members after a CARP failover:

          • HTTP(S) does not even open most of the time, and if it does, it gets stuck.

          • SSH is cut after a few seconds.
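
To illustrate why a /32 in place of a /29 matters, here is a minimal Python sketch using the standard `ipaddress` module. The addresses are hypothetical, taken from the RFC 5737 documentation range, since the thread does not give the real WAN addresses. It only shows which neighbours count as on-link under each prefix length.

```python
import ipaddress

# Hypothetical addresses (RFC 5737 documentation range); not from the thread.
wan_misconfigured = ipaddress.ip_interface("203.0.113.10/32")
wan_intended = ipaddress.ip_interface("203.0.113.10/29")
upstream_gateway = ipaddress.ip_address("203.0.113.9")
peer_node = ipaddress.ip_address("203.0.113.11")


def on_link(iface, addr):
    """Return True if addr falls inside the interface's configured network."""
    return addr in iface.network


for label, iface in (("/32", wan_misconfigured), ("/29", wan_intended)):
    print(
        label,
        "network:", iface.network,
        "addresses:", iface.network.num_addresses,
        "gateway on-link:", on_link(iface, upstream_gateway),
        "peer on-link:", on_link(iface, peer_node),
    )
```

With the /32, the network contains only the interface's own address, so neither the assumed upstream gateway nor the assumed peer node is on-link. With the /29, both are.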

        • SammyWoo

          Someone? Sounds like somebody just quit and your boss just told you, Sam, this is now yours!  Have fun. :D

          • Derelict LAYER 8 Netgate

            Then you still have something misconfigured.

            A properly-configured HA pair is always accessible, both primary and secondary, using the interface IP addresses.

            You need to make sure ALL of your interfaces exactly match exactly match exactly match on both nodes in the same order in the same order in the same order. This means the physical interface (igb1, em2, ix1.102, etc.) and the internal, logical interface name (wan, lan, opt1, opt2, etc.). Making the descriptive name match exactly is also recommended for sanity's sake (LAN, WAN, DMZ, GUESTWIFI, SYNC, etc.).

            I use the Status > Interfaces screen to check this since it displays all of the pertinent information in the proper order.

            That is where I would start based on your trouble description.
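
The matching check described above can be sketched in Python once the interface assignments of both nodes have been copied out by hand (for example from Status > Interfaces). The `Iface` type, the `mismatches` helper, and the sample data are all hypothetical. This is not a pfSense API, just a position-by-position comparison of two lists.

```python
from typing import List, NamedTuple


class Iface(NamedTuple):
    logical: str      # internal name, e.g. wan, lan, opt1
    physical: str     # device name, e.g. igb1, ix1.102
    description: str  # descriptive name, e.g. WAN, LAN, SYNC


def mismatches(primary: List[Iface], secondary: List[Iface]) -> List[str]:
    """Compare interface lists position by position and report every difference."""
    problems = []
    if len(primary) != len(secondary):
        problems.append(f"interface count differs: {len(primary)} vs {len(secondary)}")
    for pos, (a, b) in enumerate(zip(primary, secondary)):
        for field in Iface._fields:
            if getattr(a, field) != getattr(b, field):
                problems.append(
                    f"position {pos}: {field} {getattr(a, field)!r} != {getattr(b, field)!r}"
                )
    return problems


# Hypothetical sample data, typed in manually for each node.
primary = [
    Iface("wan", "igb1", "WAN"),
    Iface("lan", "igb0.100", "LAN"),
    Iface("opt1", "igb0.217", "MGMT"),
]
secondary = [
    Iface("wan", "igb1", "WAN"),
    Iface("lan", "igb0.100", "LAN"),
    Iface("opt1", "igb0.218", "Mgmt"),  # differs in physical device and in case
]

for line in mismatches(primary, secondary):
    print(line)
```

The comparison is case-sensitive and order-sensitive, since the advice above is that the interfaces should match exactly and in the same order.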

            Chattanooga, Tennessee, USA
            A comprehensive network diagram is worth 10,000 words and 15 conference calls.
            DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
            Do Not Chat For Help! NO_WAN_EGRESS(TM)

            • cmouse

              I know who did it; it just wasn't me.

              The primary and secondary are both accessible, but the secondary refuses to be accessible from a different subnet. If I use its interface IP from the same subnet, it works as expected; if I try to reach it from a different subnet, it misbehaves.

              Interface names match on both nodes for the physical, logical, and descriptive names, including case.

              • Derelict LAYER 8 Netgate

                Then either your routing or your rules are wrong.

                You will probably have to be more specific and post screenshots of the addresses/interfaces/rules in question to receive more assistance.

                • cmouse

                  This is what happens when I try ssh from another L2

                  ~$ ip -4 addr show dev eno1
                  2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
                      inet 10.217.110.125/24 brd 10.217.110.255 scope global dynamic eno1
                        valid_lft 56325sec preferred_lft 56325sec
                  ~$ ssh 10.217.1.3 -l admin
                  Connection to 10.217.1.3 closed by remote host.
                  Connection to 10.217.1.3 closed.

                  And from same L2

                  ~$ ssh 10.217.110.3 -l admin
                  Password for admin@gw13.dovecot.fi:
                  Netgate SG-8860 …

                  Here is the relevant ruleset

                  Could not find anything related/useful in logs.

                  • Derelict LAYER 8 Netgate

                    Looks like the firewall is either blocking that connection if it has to time out or rejecting that connection if you are getting that connection closed immediately.

                    That image is too small to read clearly - even with my reader specs.

                    • cmouse

                      The "connection closed" message comes after a longish delay.

                      Here is a slightly larger image; hopefully this is clearer.

                      • Derelict LAYER 8 Netgate

                        What interface is that on? What is the interface subnet? What is the source address? What is the target address?

                        See my sig for the type of information required for us to help you.

                        • cmouse

                          The setup is like this:

                          igb0.217 = 10.217.1.1/24 (vip), 10.217.1.2/24 (gw1), 10.217.1.3/24 (gw2)
                          igb0.100 = 10.217.110.1/24 (vip), 10.217.110.2/24 (gw1), 10.217.110.3/24 (gw2)

                          When gw1 is master and gw2 is standby, I can access 10.217.1.2 from 10.217.110.125/24 just fine. I can't access 10.217.1.3/24 from that station, but I can access 10.217.110.3 just fine.

                          If I switch gw1 to standby and gw2 to master, I can't access 10.217.1.2 from 10.217.110.125 anymore, but I can access 10.217.110.2.
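
The addressing described above can be checked with a short Python sketch using the standard `ipaddress` module. The addresses come from this thread (10.217.110.3 for gw2, as used in the earlier ssh transcript). The helper names are made up for illustration. The sketch only computes which addresses are on-link for whom and makes no claim about how pfSense actually routes the traffic.

```python
import ipaddress

# Client and node addresses as given in the thread.
client = ipaddress.ip_interface("10.217.110.125/24")
nodes = {
    "gw1": [
        ipaddress.ip_interface("10.217.1.2/24"),
        ipaddress.ip_interface("10.217.110.2/24"),
    ],
    "gw2": [
        ipaddress.ip_interface("10.217.1.3/24"),
        ipaddress.ip_interface("10.217.110.3/24"),
    ],
}


def client_reaches_directly(target):
    """True if the client can reach target without going through its gateway."""
    return ipaddress.ip_address(target) in client.network


def node_has_interface_in_client_subnet(node):
    """True if the node owns an address in the same subnet as the client."""
    return any(client.ip in iface.network for iface in nodes[node])


for target in ("10.217.1.3", "10.217.110.3"):
    print(target, "on-link for client:", client_reaches_directly(target))

print("gw2 has an interface in the client's subnet:",
      node_has_interface_in_client_subnet("gw2"))
```

Under these addresses, the client can reach 10.217.1.3 only through its gateway, while gw2 also owns an address inside the client's own subnet. Whether that plays any role in the symptoms is not established in this thread.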

                          In the spirit of debugging, I have now tested this about 10 times using the 'persistent CARP maintenance mode' on gw1.

                          The pfSense nodes are in HA cluster mode, serving those subnets.

                          The symptoms are:

                          • The login page opens, but no matter how long I wait, it won't log in over the web UI. (The TCP connection is established, but the login does not complete.)

                          • The SSH connection behaves the same: TCP establishes, but the actual login won't complete. The few rare times it does, it kicks you out with 'write failed: Pipe broken' after some seconds.
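
To separate "TCP connects but nothing follows" from an outright refusal or a connect timeout, a small probe along these lines might help. It is a sketch using only Python's standard `socket` module; the `probe` function and its result strings are made up for illustration. It suits SSH, where the server normally sends its banner first. An HTTPS port would be expected to report `connected-no-data` even when healthy, because the client speaks first there.

```python
import socket


def probe(host, port, timeout=5.0):
    """Classify what happens when connecting to host:port and waiting for data."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"            # an RST came back: rejected, or nothing listening
    except socket.timeout:
        return "connect-timeout"    # no answer at all: likely dropped on the way
    except OSError as exc:
        return f"error: {exc}"
    with sock:
        sock.settimeout(timeout)
        try:
            data = sock.recv(256)
        except socket.timeout:
            return "connected-no-data"  # handshake completed, then silence
        except OSError as exc:
            return f"error: {exc}"
        if not data:
            return "connected-then-closed"
        return "data: " + data.decode("ascii", "replace").strip()


# Example use against the standby node's address from this thread:
# print(probe("10.217.1.3", 22))
```

Running it from both the same subnet and the other subnet, and comparing the results, would show at which stage the two paths diverge.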

                          • Derelict LAYER 8 Netgate

                            Sounds like you might not be setting the clients to use the CARP VIP as the gateway.

                            • cmouse

                              Unfortunately the CARP VIP is used. I think I'll just accept that it refuses to work over L3.

                              • Derelict LAYER 8 Netgate

                                It works fine. Maybe your switches aren't moving the CARP MAC address like they should.

                                • cmouse

                                  That would imply that nothing would work, but the problem is limited to the standby node only. Internet works and other resources on the other L2 work, so the gateway MAC cannot be blamed.

                                  • Derelict LAYER 8 Netgate

                                    Telling you, bro, it all works. You have something hosed up or are misunderstanding something.

                                    • cmouse

                                      No doubt. Just would be nice to know what.

                                      • bpina

                                        Hello,

                                        I have the same issue here. I'm using pfSense 2.4.4.
                                        From within the pfSense network, I can access the standby node without any problem.
                                        When I try to access the standby node from a different network, HTTPS access becomes unresponsive.

                                        cmouse, have you found a way to overcome this issue?

                                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.