Netgate Discussion Forum

    Subnet collapses periodically since 24.11-RELEASE

    DHCP and DNS
    38 Posts 5 Posters 1.4k Views
    • S
      SteveITS Galactic Empire @vf1954
      last edited by

      @vf1954 So, what is the .254 you mentioned?

      Screencap the change in pfSense when this happens.

      If the fields in pfSense aren’t changing, I suspect what you’re seeing is another DHCP server. Windows, and I’m sure other clients, will show the DHCP server in use, for example via “ipconfig /all”.

      Pre-2.7.2/23.09: Only install packages for your version, or risk breaking it. Select your branch in System/Update/Update Settings.
      When upgrading, allow 10-15 minutes to restart, or more depending on packages and device speed.
      Upvote 👍 helpful posts!

      • V
        vf1954 @SteveITS
        last edited by

        @SteveITS said in Subnet collapses periodically since 24.11-RELEASE:

        Screencap the change in pfSense when this happens.

        Not sure what you mean here. Does screencap mean screenshot? Screenshot what?

        The address being circulated is 192.168.0.xx but the other DHCP router is the wifi which is turned off.

        • S
          SteveITS Galactic Empire @vf1954
          last edited by

          @vf1954 yes, screenshot pfSense with the changed settings, or some evidence.

          If you’re not saying anything in pfSense actually changes then it’s not pfSense. Unplug pfSense LAN, restart a client, and see what its IP and DHCP server are.

          • V
            vf1954 @SteveITS
            last edited by

            @SteveITS Fair enough. But why would pfsense just give up its DHCP authority ... randomly ... after 6-14 days?

            • johnpozJ
              johnpoz LAYER 8 Global Moderator @vf1954
              last edited by johnpoz

              @vf1954 said in Subnet collapses periodically since 24.11-RELEASE:

              DHCP authority ... randomly ... after 6-14 days?

              what authority?? When a client does a discover - the first dhcp server that answers wins..

              If there is more than 1 dhcp server on your network - its a coinflip who will answer first.

              You can run more than 1 on the same network.. But they need to hand out the same info.. This can be done as a failover scenario - where you split the scope between them..

              Say dhcpd1 hands out 192.168.1.10-128
              Where dhcpd2 hands out 192.168.1.129-244

              Both point say to 192.168.1.1 for dns and gateway.. Leaving you .2-9 and .245-254 as IPs you can set statically on devices.

              But if you're handing out a different IP range and a different gateway - yeah you're going to have a bad day on clients that get an IP from that dhcp server.
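
A minimal sketch of the split-scope arithmetic above, using Python's standard `ipaddress` module. The /24 mask and the helper names (`dhcp_range`, `split_scope_report`) are assumptions for illustration; the ranges and the .1 gateway come from the post. It checks that the two pools do not overlap and lists what is left for static assignment.

```python
import ipaddress


def dhcp_range(first, last):
    """Return the set of integer addresses in an inclusive DHCP range."""
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    if end < start:
        raise ValueError(f"range {first}-{last} is reversed")
    return set(range(start, end + 1))


def as_text(addresses):
    """Convert integer addresses to dotted strings in numeric order."""
    return [str(ipaddress.IPv4Address(a)) for a in sorted(addresses)]


def split_scope_report(subnet, ranges, reserved):
    """Compare several DHCP ranges on one subnet (hypothetical helper)."""
    hosts = {int(h) for h in ipaddress.ip_network(subnet).hosts()}
    claimed, overlap = set(), set()
    for first, last in ranges:
        current = dhcp_range(first, last)
        overlap |= claimed & current   # addresses two servers would both hand out
        claimed |= current
    fixed = {int(ipaddress.IPv4Address(r)) for r in reserved}
    return {
        "overlap": as_text(overlap),
        "outside_subnet": as_text(claimed - hosts),
        "static_free": as_text(hosts - claimed - fixed),
    }


# Ranges from the post; the /24 mask is an assumption.
report = split_scope_report(
    "192.168.1.0/24",
    [("192.168.1.10", "192.168.1.128"), ("192.168.1.129", "192.168.1.244")],
    reserved=["192.168.1.1"],  # gateway/DNS from the post
)
print("overlap:", report["overlap"])
print("free for static use:", report["static_free"])
```

With these inputs the free list is .2-.9 plus .245-.254, matching the post.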

              An intelligent man is sometimes forced to be drunk to spend time with his fools
              If you get confused: Listen to the Music Play
              Please don't Chat/PM me for help, unless mod related
              SG-4860 24.11 | Lab VMs 2.7.2, 24.11

              • V
                vf1954 @johnpoz
                last edited by

                @johnpoz Hello John,

                Yes you taught me something new again. I thought DHCP holds authority.

                But regardless, even if two DHCP servers were vying for the same "authority" (to grant leases), I'd expect, statistically, that many of the clients would choose 192.168.0.x and lose network/internet access and that that would appear sporadically during the day/week. This is not the behaviour. It is perfectly stable with netgate "in charge", all the time, for all clients, until suddenly every client decides to pivot to 192.168.0.x (albeit at different times, but once one goes the rest will follow within an hour).

                You would think they all magically pick up netgate after a couple hours... but they don't either. The pfsense just becomes inaccessible until I console into it.

                my two switches are hardcoded to be on 192.168.3.x address.
                my 3 tp-link archer 5400 are set as 192.168.3.3 .4 .5 on easy-mesh with the primary dhcp = off.

                There is no other dhcp server afaik.

                • johnpozJ
                  johnpoz LAYER 8 Global Moderator @vf1954
                  last edited by johnpoz

                  @vf1954 clearly there is.. Here do this.. Look at your client currently.

                  What does it list for the dhcp server?

                  $ ipconfig /all                                                                       
                                                                                                        
                  Windows IP Configuration                                                              
                                                                                                        
                     Host Name . . . . . . . . . . . . : i9-win                                         
                     Primary Dns Suffix  . . . . . . . : home.arpa                                      
                     Node Type . . . . . . . . . . . . : Broadcast                                      
                     IP Routing Enabled. . . . . . . . : No                                             
                     WINS Proxy Enabled. . . . . . . . : No                                             
                     DNS Suffix Search List. . . . . . : home.arpa                                      
                                                                                                        
                  Ethernet adapter Local:                                                               
                                                                                                        
                     Connection-specific DNS Suffix  . :                                                
                     Description . . . . . . . . . . . : Killer E2600 Gigabit Ethernet Controller       
                     Physical Address. . . . . . . . . : B0-4F-13-0B-FD-16                              
                     DHCP Enabled. . . . . . . . . . . : Yes                                            
                     Autoconfiguration Enabled . . . . : Yes                                            
                     IPv4 Address. . . . . . . . . . . : 192.168.9.100(Preferred)                       
                     Subnet Mask . . . . . . . . . . . : 255.255.255.0                                  
                     Lease Obtained. . . . . . . . . . : Friday, February 14, 2025 2:01:59 PM           
                     Lease Expires . . . . . . . . . . : Tuesday, February 18, 2025 2:02:00 PM          
                     Default Gateway . . . . . . . . . : 192.168.9.253                                  
                     DHCP Server . . . . . . . . . . . : 192.168.9.253                                  
                     DNS Servers . . . . . . . . . . . : 192.168.3.10                                   
                     NetBIOS over Tcpip. . . . . . . . : Enabled                                        
                  

                  192.168.9.253 is my pfsense.. now if I look at the mac address

                  $ arp -a
                  
                  Interface: 192.168.9.100 --- 0x5
                    Internet Address      Physical Address      Type
                    192.168.9.10          00-11-32-7b-29-7d     dynamic
                    192.168.9.253         00-08-a2-0c-e6-24     dynamic
                    192.168.9.255         ff-ff-ff-ff-ff-ff     static
                    224.0.0.22            01-00-5e-00-00-16     static
                    239.255.255.250       01-00-5e-7f-ff-fa     static
                    255.255.255.255       ff-ff-ff-ff-ff-ff     static
                  

                  So its mac is 00-08-a2-0c-e6-24. If pfsense was out of the blue changing its IP and dhcp scope, then that mac address would be the same.

                  As to why you're not seeing a random distribution, maybe pfsense dhcp answers faster - but when it goes offline the only one to answer is your other dhcp server.

                  Pfsense is just not going to randomly change its IP address.. Either you're changing it, or you're loading a bad/old config? Looking at which mac address your dhcp server has will tell you for sure whether it's pfsense or some other box.
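
A rough sketch of that check in Python, assuming the Windows `arp -a` layout shown above (dash-separated MACs, three columns). The function names are made up for illustration. It pulls the MAC for the DHCP server IP out of the table and compares it with the MAC you know belongs to pfSense.

```python
import re

# One table row: IP address, dash-separated MAC, entry type.
ARP_ROW = re.compile(
    r"^\s*(\d+\.\d+\.\d+\.\d+)\s+"
    r"([0-9A-Fa-f]{2}(?:-[0-9A-Fa-f]{2}){5})\s+(\w+)\s*$"
)


def parse_arp_table(text):
    """Map each IP in Windows-style `arp -a` output to its lowercase MAC."""
    table = {}
    for line in text.splitlines():
        match = ARP_ROW.match(line)
        if match:
            ip, mac, _entry_type = match.groups()
            table[ip] = mac.lower()
    return table


def is_known_device(text, dhcp_server_ip, known_mac):
    """True only if the DHCP server IP resolves to the MAC we expect."""
    mac = parse_arp_table(text).get(dhcp_server_ip)
    return mac is not None and mac == known_mac.lower()


# Output copied from the post above.
SAMPLE = """
Interface: 192.168.9.100 --- 0x5
  Internet Address      Physical Address      Type
  192.168.9.10          00-11-32-7b-29-7d     dynamic
  192.168.9.253         00-08-a2-0c-e6-24     dynamic
  192.168.9.255         ff-ff-ff-ff-ff-ff     static
"""

print(parse_arp_table(SAMPLE)["192.168.9.253"])
print(is_known_device(SAMPLE, "192.168.9.253", "00-08-A2-0C-E6-24"))
```

If the lease comes from a 192.168.0.x server whose MAC differs from the pfSense LAN MAC, that points at some other box.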

                  • V
                    vf1954 @johnpoz
                    last edited by

                    @johnpoz I agree a second dhcp is somewhere lurking but I am at wits end to figure out where.

                    TP-Link: unless the tp link is acting out, it's off. I updated the firmware but that didn't have any effect.
                    Novell (OES2 server). It has dhcp disabled and the port to dhcp also blocked.
                    Pi-Hole: turned off (and even if it was turned on, it would serve 192.168.3.x)
                    Switches: no dhcp server capability (afaik)
                    We have several unmanaged switches connecting various PCs in an office back to one of the switches

                    ...

                    that's it.

                    • johnpozJ
                      johnpoz LAYER 8 Global Moderator @vf1954
                      last edited by johnpoz

                      @vf1954 well next time it happens, check the mac - that should help you track down what is doing it.

                      Or turn off the dhcp server in pfsense.. Do a release and renew on some client where you were seeing this before.. Does it get the 192.168.0 address? If so, what is the mac of the dhcp server? Hopefully you can track it down from that. The first 3 octets of the mac (the OUI) should tell you what brand of device it is at least.

                      Unless your switches are all just dumb switches: managed and smart switches can provide dhcp.

                      edit: I mean it could be possible if pfsense is rebooting to an old config or something.. When you console in, look to see what IPs are on the interfaces, etc. I just find that so highly improbable.. What makes more sense and quite possible to happen is something else serving dhcp..

                      Checking the mac address of dhcp server IP when you get the wrong lease and IP should tell you for sure.. My money is on rogue dhcp and not pfsense just spontaneously changing its IP of an interface and handing out different dhcp info
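
To make the "first 3 octets" point concrete, here is a small Python sketch that extracts the vendor prefix (OUI) from a MAC in any common notation and compares two devices. The function names and the second MAC are hypothetical; looking the prefix up in a vendor database is left out on purpose.

```python
import string


def oui(mac):
    """Return the first three octets of a MAC as six uppercase hex digits."""
    digits = "".join(c for c in mac if c in string.hexdigits)
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return digits[:6].upper()


def same_vendor_prefix(mac_a, mac_b):
    """True if both MACs share the same OUI (same vendor block)."""
    return oui(mac_a) == oui(mac_b)


pfsense_mac = "00-08-a2-0c-e6-24"   # from the arp output earlier in the thread
suspect_mac = "aa:bb:cc:11:22:33"   # hypothetical MAC of the 192.168.0.x server

print(oui(pfsense_mac))
print(same_vendor_prefix(pfsense_mac, suspect_mac))
```

A different prefix on the rogue lease narrows the search to another vendor's hardware.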

                      • w0wW
                        w0w
                        last edited by

                        Any chance that there's some mess with flow control on the switches or client devices?
                        Some USB and non-USB Realtek network adapters embedded into motherboards are known to cause similar issues, such as endless pauses on RX/TX, which can literally collapse the network. I've run into this twice, so it's likely not such an uncommon issue nowadays.
                        I would start by disabling FC on pfSense and on the switches too, if it is enabled.
                        Netgate Documentation - Flow Control
                        Also, disable FC on the switches and routers you are using in your LAN.

                        • V
                          vf1954 @w0w
                          last edited by

                          @w0w I don't know. I never use flow control. I will look more deeply into this.

                          • johnpozJ
                            johnpoz LAYER 8 Global Moderator @vf1954
                            last edited by

                            @vf1954 flow control issues are not going to have your client change its IP.. There is zero reason to turn off flow control on anything.

                            • V
                              vf1954 @johnpoz
                              last edited by

                              @johnpoz So I did some more testing.

                              Testing with arp and ipconfig /all reveals a DHCP server sending a 192.168.0.x broadcast. No picture attached, just letting you know.
                              Communicating to TP-Link engineers revealed that the wifi-router (Archer AX73v1) will act as a DHCP server as an emergency only if it detects no DHCP server anymore.

                              Since the network went down again this morning, I produced the following test results:

                              • Disconnecting the tp-link router(s) does NOT allow a client to establish a connection to netGATE pfSense.
                              • This means netGATE pfSense is somehow dropping the DHCP server randomly, and the tp-link notices this and says "uh oh" and does what it can.
                              • While in this strange state, I can enter into console and enter into shell and ping, for example, our OES server at 192.168.3.xx.
                              • In our OES server, it cannot access or ping anything back
                              • Our debian pihole dns on 192.168.3.yy server does seem to work with ping...
                              • Attempting to connect my laptop to the netGATE does not produce any connection (see picture).
                                Screenshot from 2025-02-27 09-57-04.png
                              • Running codes while in shell produced the following (I am using KEA)
                                Screenshot from 2025-02-27 09-57-14.png

                              With the wifi-router disconnected, I re-ran two commands on a windows PC but nothing really connects.
                              Screenshot from 2025-02-27 09-50-14.png
                              Screenshot from 2025-02-27 09-53-28.png

                              • johnpozJ
                                johnpoz LAYER 8 Global Moderator @vf1954
                                last edited by

                                @vf1954 169.254 is a link local IP range windows will use when a dhcp server is not available.

                                Run isc for your dhcp server - kea is still in preview to be honest..
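
As a quick way to tell the three situations in this thread apart, here is a hedged Python sketch using the standard `ipaddress` module. The 192.168.3.0/24 LAN is an assumption based on the addresses mentioned earlier (the mask is a guess), and the labels are just illustrative strings.

```python
import ipaddress

# Assumption: the pfSense LAN is 192.168.3.0/24 (addresses from the thread, mask guessed).
EXPECTED_LAN = ipaddress.ip_network("192.168.3.0/24")


def classify_client_address(addr):
    """Label a client's IPv4 address by where it most likely came from."""
    ip = ipaddress.ip_address(addr)
    if ip.is_link_local:
        # 169.254.0.0/16: the client configured itself, no DHCP server answered.
        return "link-local"
    if ip in EXPECTED_LAN:
        return "expected-lan"
    # Anything else was handed out by some other DHCP server.
    return "unexpected-server"


for address in ("169.254.17.5", "192.168.3.42", "192.168.0.23"):
    print(address, "->", classify_client_address(address))
```

A link-local result means nothing answered at all, while a 192.168.0.x result means a different server answered.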

                                • V
                                  vf1954 @johnpoz
                                  last edited by

                                  @johnpoz I did as you advised. I am back on ISC. I just see that it will be deprecated.
                                  Based on the ps aux output, only an ipv6 entry is visible, and no ipv4 at all. Is that correct? Is that the result of KEA?

                                  • johnpozJ
                                    johnpoz LAYER 8 Global Moderator @vf1954
                                    last edited by johnpoz

                                    @vf1954 said in Subnet collapses periodically since 24.11-RELEASE:

                                    I just see that it will be deprecated.

                                    And how many versions down the road do you think that is? 3 - 6, 12?

                                    kea is not at feature parity yet.. So there is no chance you're going to see isc removed as an option, that is for sure.

                                    I would bet you that there will be a switch over to where kea is default, and then some time later after that would it be removed. I don't see kea becoming default for at least a few more versions of pfsense.

                                    • w0wW
                                      w0w @johnpoz
                                      last edited by

                                      @johnpoz said in Subnet collapses periodically since 24.11-RELEASE:

                                      @vf1954 flow control issues are not going to have your client change its IP.. There is zero reason to turn off flow control on anything.

                                      As long as it's not Windows and not 169.x.x.x, you are right...

                                      @vf1954 said in Subnet collapses periodically since 24.11-RELEASE:

                                      I never use flow control.

                                      For example, I didn't even know it was enabled.

                                      BTW, I've been using KEA in a small network for over a year with VLANs, LAGs, VPN, and CARP. So far, there have been no issues with collapses or clients not receiving addresses.
                                      But switching to ISC is a good idea for debug anyway.

                                      • johnpozJ
                                        johnpoz LAYER 8 Global Moderator @w0w
                                        last edited by

                                        @w0w yeah it's coming along - but just look at the board, many posts about kea. Don't see any reason to use it if you're having issues. Try again next release to be honest.

                                        • V
                                          vf1954 @johnpoz
                                          last edited by vf1954

                                          @johnpoz Hello.

                                          Well knocking on wood. The switch back to ISC was the solution. So far no issues for 3 weeks straight.

                                          What should I do to report KEA malfunctioning?

                                          • johnpozJ
                                            johnpoz LAYER 8 Global Moderator @vf1954
                                            last edited by

                                            @vf1954 unless you're running 25.03 beta and want to report stuff in that section, I see little point in pointing out what might be wrong with the 24.11 version of kea. Now if you're using what is about to come out, and you see problems - they still might be able to be fixed before release.

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.