Netgate Discussion Forum

Problem with standby node

General pfSense Questions
  • C
    cmouse
    last edited by Apr 4, 2018, 5:18 PM

    We have an HA setup here in which the standby node is constantly having issues. It's mostly unreachable over LAN, and if you manage to actually reach it, it keeps kicking you out of SSH after a while and the web UI becomes unresponsive. We have rebooted it several times to no avail. Any ideas how this could be debugged further?

    The HA pair is Netgate SG-8860 version 2.4.2-RELEASE-p1 (amd64).

    • C
      cmouse
      last edited by Apr 4, 2018, 6:17 PM

      After much headscratching, found out that

      • someone had configured a /32 instead of /29 for WAN IP, which caused some of the issues.

      • for some reason, device management does not work from outside the L2 network when the node is in standby mode. This has now been confirmed with both HA members after CARP failover:

        • HTTP(S) does not even open most of the time, and if it does, it gets stuck

        • SSH is cut after a few seconds
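The /32 versus /29 finding can be checked with a few lines of Python. This is only a sketch: the thread does not give the real WAN addresses, so the 203.0.113.x documentation-range addresses and the helper name `on_link` are assumptions for illustration. With a /32 mask the interface's network contains only its own address, so an upstream gateway or HA peer inside the intended /29 falls outside it.

```python
import ipaddress

def on_link(interface_cidr, other_ip):
    """Return True if other_ip lies inside the subnet configured on the interface."""
    network = ipaddress.ip_interface(interface_cidr).network
    return ipaddress.ip_address(other_ip) in network

# Hypothetical addresses (documentation range), not the poster's real WAN block.
UPSTREAM_GW = "203.0.113.1"
PEER_NODE = "203.0.113.3"

for cidr in ("203.0.113.2/32", "203.0.113.2/29"):
    size = ipaddress.ip_interface(cidr).network.num_addresses
    print(cidr, "addresses:", size,
          "gateway on-link:", on_link(cidr, UPSTREAM_GW),
          "peer on-link:", on_link(cidr, PEER_NODE))
```

Running the same check against the real WAN address, gateway and peer address of each node should show quickly whether a mask was mistyped.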

      • S
        SammyWoo
        last edited by Apr 4, 2018, 6:22 PM

        Someone? Sounds like somebody just quit and your boss just told you, Sam, this is now yours!  Have fun. :D

        • D
          Derelict LAYER 8 Netgate
          last edited by Apr 4, 2018, 6:23 PM

          Then you still have something misconfigured.

          A properly-configured HA pair is always accessible, both primary and secondary, using the interface IP addresses.

          You need to make sure ALL of your interfaces exactly match on both nodes, in the same order. This means the physical interface (igb1, em2, ix1.102, etc.) and the internal, logical interface name (wan, lan, opt1, opt2, etc.). Making the descriptive name match exactly is also recommended for sanity's sake (LAN, WAN, DMZ, GUESTWIFI, SYNC, etc.).

          I use the Status > Interfaces screen to check this since it displays all of the pertinent information in the proper order.

          That is where I would start based on your trouble description.

          Chattanooga, Tennessee, USA
          A comprehensive network diagram is worth 10,000 words and 15 conference calls.
          DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
          Do Not Chat For Help! NO_WAN_EGRESS(TM)
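The interface comparison described above can also be done offline from configuration backups. The following is a minimal sketch, assuming each node's backup is a pfSense `config.xml` whose `<interfaces>` section holds one child per logical interface (`wan`, `lan`, `opt1`, ...) with `<if>` and `<descr>` elements. The sample XML, the interface names in it and the function names are hypothetical and only illustrate the check; Status > Interfaces remains the reference view.

```python
import xml.etree.ElementTree as ET

def interface_table(config_xml):
    """Return [(logical_name, physical_if, description), ...] in file order."""
    section = ET.fromstring(config_xml).find("interfaces")
    if section is None:
        return []
    return [(node.tag, node.findtext("if", ""), node.findtext("descr", ""))
            for node in section]

def interface_mismatches(primary_xml, secondary_xml):
    """Compare the two tables position by position and list every difference."""
    a, b = interface_table(primary_xml), interface_table(secondary_xml)
    problems = []
    for i in range(max(len(a), len(b))):
        left = a[i] if i < len(a) else None
        right = b[i] if i < len(b) else None
        if left != right:
            problems.append((i, left, right))
    return problems

# Hypothetical, heavily trimmed backups; real files contain far more.
PRIMARY = """<pfsense><interfaces>
  <wan><if>igb1</if><descr>WAN</descr></wan>
  <lan><if>igb0.217</if><descr>LAN</descr></lan>
  <opt1><if>igb0.100</if><descr>OFFICE</descr></opt1>
</interfaces></pfsense>"""
SECONDARY = PRIMARY.replace("igb0.100", "igb0.101")

for index, left, right in interface_mismatches(PRIMARY, SECONDARY):
    print(f"position {index}: primary={left} secondary={right}")
```

An empty result means the logical name, physical interface and description agree at every position; it says nothing about addresses, rules or routing.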

          • C
            cmouse
            last edited by Apr 4, 2018, 6:33 PM

            I know who did it, just wasn't me.

            The primary and secondary are both accessible, but the secondary refuses to be accessible from a different subnet. If I use its interface IP from the same subnet, it works as expected; if I try to reach it from a different subnet, it misbehaves.

            Interface names match on both, for physical, logical and descriptive name, ensuring case is same too.

            • D
              Derelict LAYER 8 Netgate
              last edited by Apr 4, 2018, 6:42 PM

              Then either your routing or your rules are wrong.

              You will probably have to be more specific and post screenshots of the addresses/interfaces/rules in question to receive more assistance.

              • C
                cmouse
                last edited by Apr 4, 2018, 7:01 PM

                This is what happens when I try ssh from another L2

                ~$ ip -4 addr show dev eno1
                2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
                    inet 10.217.110.125/24 brd 10.217.110.255 scope global dynamic eno1
                      valid_lft 56325sec preferred_lft 56325sec
                ~$ ssh 10.217.1.3 -l admin
                Connection to 10.217.1.3 closed by remote host.
                Connection to 10.217.1.3 closed.

                And from same L2

                ~$ ssh 10.217.110.3 -l admin
                Password for admin@gw13.dovecot.fi:
                Netgate SG-8860 …

                Here is the relevant ruleset

                Could not find anything related/useful in logs.

                • D
                  Derelict LAYER 8 Netgate
                  last edited by Apr 4, 2018, 7:03 PM

                  Looks like the firewall is either blocking that connection if it has to time out or rejecting that connection if you are getting that connection closed immediately.

                  That image is too small to read clearly - even with my reader specs.
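The distinction drawn above (a blocked connection times out, a rejected one closes immediately) can be measured rather than guessed. Below is a rough sketch of a probe, not a pfSense tool: the function name `probe` and its return labels are made up for this example. It also reports a third case, where TCP is established but the server never sends anything. That label is only meaningful for services that speak first, such as SSH with its version banner; an HTTPS server normally stays quiet until the client starts the TLS handshake.

```python
import socket

def probe(host, port, timeout=5.0):
    """Classify a TCP service: 'refused', 'timeout', 'silent', 'closed', 'error', or its banner."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"   # reset received at once: typical of a reject or a closed port
    except socket.timeout:
        return "timeout"   # no answer to the connection attempt: typical of a silent drop
    except OSError:
        return "error"     # e.g. no route to host
    with sock:
        sock.settimeout(timeout)
        try:
            data = sock.recv(256)
        except socket.timeout:
            return "silent"  # handshake completed, but no data arrived in time
        except OSError:
            return "error"
    return data.decode("ascii", "replace").strip() or "closed"
```

Calling `probe("10.217.1.3", 22)` from the client on the other subnet, and again from a host on the same L2, gives two labels that can be compared directly.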

                  • C
                    cmouse
                    last edited by Apr 4, 2018, 7:06 PM

                    The connection closed comes after a longish delay.

                    Slightly larger image, hopefully this is clearer.

                    • D
                      Derelict LAYER 8 Netgate
                      last edited by Apr 5, 2018, 1:27 AM

                      What interface is that on? What is the interface subnet? What is the source address? What is the target address?

                      See my sig for the type of information required for us to help you.

                      • C
                        cmouse
                        last edited by Apr 5, 2018, 10:37 AM

                        The setup is like this:

                        igb0.217 = 10.217.1.1/24 (vip), 10.217.1.2/24 (gw1), 10.217.1.3/24 (gw2)
                        igb0.100 = 10.217.110.1/24 (vip), 10.217.110.2/24 (gw1), 10.217.110.3/24 (gw2)

                        When gw1 is master and gw2 is standby, I can access 10.217.1.2 from 10.217.110.125/24 just fine. I can't access 10.217.1.3 from that station, but I can access 10.217.110.3 just fine.

                        If I make gw1 standby and gw2 master, I can't access 10.217.1.2 from 10.217.110.125 anymore, but I can access 10.217.110.2.

                        In the spirit of debugging, I have now tested this about 10 times by using 'persistent CARP maintenance mode' on gw1.

                        The pfSense nodes are in an HA cluster, serving those subnets.

                        The symptoms are:

                        • The login page opens, but no matter how long I wait, the web UI login never completes. (The TCP connection is established, but the login does not complete.)

                        • SSH behaves the same: TCP establishes, but the actual login won't complete. The few rare times it does, it kicks you out with 'Write failed: Broken pipe' after some seconds.
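To make the pattern in this report easier to see, the addresses can be classified by whether the test client reaches them directly or through its gateway. This is a small sketch using Python's standard `ipaddress` module; the labels and the `path_type` helper are invented for the example, and 10.217.110.3 is taken as gw2's address on igb0.100, as in the ssh test. It only restates the addressing and does not diagnose the cause.

```python
import ipaddress

# The test client from the earlier ssh attempts.
CLIENT = ipaddress.ip_interface("10.217.110.125/24")

# Node addresses as described in the post.
NODES = {
    "gw1 igb0.217": "10.217.1.2",
    "gw2 igb0.217": "10.217.1.3",
    "gw1 igb0.100": "10.217.110.2",
    "gw2 igb0.100": "10.217.110.3",
}

def path_type(client, target):
    """'on-link' if the target is inside the client's subnet, otherwise 'routed'."""
    return "on-link" if ipaddress.ip_address(target) in client.network else "routed"

for name, addr in NODES.items():
    print(f"{name:14} {addr:14} {path_type(CLIENT, addr)}")
```

The two addresses that come out as routed are the ones reported as failing whenever their node is the standby, while the on-link addresses keep working in either role.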

                        • D
                          Derelict LAYER 8 Netgate
                          last edited by Apr 5, 2018, 3:49 PM

                          Sounds like you might not be setting the clients to use the CARP VIP as the gateway.

                          • C
                            cmouse
                            last edited by Apr 6, 2018, 6:21 PM

                            Unfortunately, the CARP VIP is already used as the gateway. I think I'll just accept that it refuses to work over L3.

                            • D
                              Derelict LAYER 8 Netgate
                              last edited by Apr 6, 2018, 6:23 PM

                              It works fine. Maybe your switches aren't moving the CARP MAC address like they should.

                              • C
                                cmouse
                                last edited by Apr 6, 2018, 6:44 PM

                                That would imply that nothing would work, but the problem is limited to the standby node only. Internet works and other resources on the other L2 work, so the gateway MAC cannot be blamed.

                                • D
                                  Derelict LAYER 8 Netgate
                                  last edited by Apr 6, 2018, 7:45 PM

                                  Telling you, bro. It all works. You have something hosed up or are misunderstanding something.

                                  • C
                                    cmouse
                                    last edited by Apr 7, 2018, 5:26 AM

                                    No doubt. Just would be nice to know what.

                                    • B
                                      bpina
                                      posted Jan 3, 2020, 6:41 PM, last edited by bpina Jan 3, 2020, 6:44 PM

                                      Hello,

                                      I have the same issue here. I'm using pfSense 2.4.4.
                                      From inside the pfSense network I can access the standby node without any problem.
                                      When I try to access the standby node from a different network, HTTPS access becomes unresponsive.

                                      cmouse, have you found a way to overcome this issue?
