Subnet collapses periodically since 24.11-RELEASE
-
@johnpoz I agree a second DHCP server is lurking somewhere, but I am at my wits' end trying to figure out where.
TP-Link: unless the TP-Link is acting up, its DHCP is off. I updated the firmware but that didn't have any effect.
Novell (OES2 server): it has DHCP disabled and the DHCP port is also blocked.
Pi-Hole: turned off (and even if it was turned on, it would serve 192.168.3.x)
Switches: no dhcp server capability (afaik)
We have several unmanaged switches connecting various PCs in an office back to one of the switches...
that's it.
-
@vf1954 well next time it happens, check the mac - that should help you track down what is doing it.
Or turn off the dhcp server in pfsense.. Do a release and renew on some client where you were seeing this before.. Does it still get a 192.168.0 address? If so, what is the mac of the dhcp server? Hopefully you can track it down from that. The first 3 bytes of the mac should tell you what brand of device it is at least.
Unless your switches are all just dumb switches, keep in mind that managed and smart switches can provide dhcp.
edit: I mean it could be possible if pfsense is rebooting to an old config or something.. When you console in, look to see what IPs are on the interfaces, etc. I just find that so highly improbable.. What makes more sense and quite possible to happen is something else serving dhcp..
Checking the mac address of dhcp server IP when you get the wrong lease and IP should tell you for sure.. My money is on rogue dhcp and not pfsense just spontaneously changing its IP of an interface and handing out different dhcp info
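
To illustrate the MAC check above, here is a minimal Python sketch that pulls the vendor prefix (OUI, the first 3 bytes) out of a MAC address as shown by `arp -a` or `ipconfig /all`. The function names are made up for this example, and it does not look up the vendor itself; you would still paste the prefix into an OUI lookup. It also flags locally administered (randomized) MACs, which carry no vendor information.

```python
import re


def oui_prefix(mac: str) -> str:
    """Return the first three bytes of a MAC address as 'AA:BB:CC'.

    Accepts the usual notations, e.g. '00-1a-2b-3c-4d-5e' (Windows)
    or '00:1a:2b:3c:4d:5e' (FreeBSD/Linux).
    """
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac).upper()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in (0, 2, 4))


def is_locally_administered(mac: str) -> bool:
    """True if the 'locally administered' bit (0x02 of the first byte) is set.

    Such addresses are not assigned by a vendor, so an OUI lookup is pointless.
    """
    first_byte = int(oui_prefix(mac)[:2], 16)
    return bool(first_byte & 0x02)


if __name__ == "__main__":
    # Hypothetical MAC, only here to show the call.
    mac = "00-1a-2b-3c-4d-5e"
    print(oui_prefix(mac), is_locally_administered(mac))
```

If the prefix of the unknown DHCP server matches the prefix of a device you already know (for example the wifi router), that narrows it down quickly.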
-
Any chance that there's some mess with flow control on the switches or client devices?
Some USB and non-USB Realtek network adapters embedded into motherboards are known to cause similar issues, such as endless pauses on RX/TX, which can literally collapse the network. I've run into this twice, so it's likely not such an uncommon issue nowadays.
I would start by disabling FC on pfSense and on the switches too, if it is enabled.
Netgate Documentation - Flow Control
Also, disable FC on the switches and routers you are using in your LAN.
-
@w0w I don't know. I never use flow control. I will look more deeply into this.
-
@vf1954 flow control issues not going to have your client change its IP.. There is zero reason to turn off flow control on anything.
-
@johnpoz So I did some more testing.
Testing arp and ipconfig all reveals a DHCP server sending a 192.168.0.x broadcast. No picture attached, just letting you know.
Talking to TP-Link engineers revealed that the wifi router (Archer AX73v1) will act as a DHCP server only as an emergency fallback, when it no longer detects a DHCP server on the network.
Since the network went down again this morning, I produced the following test results:
- Disconnecting the tp-link router(s) does NOT allow a client to establish a connection to netGATE pfSense.
- This means netGATE pfSense is somehow dropping the DHCP server randomly, and the tp-link notices this and says "uh oh" and does what it can.
- While in this strange state, I can enter into console and enter into shell and ping, for example, our OES server at 192.168.3.xx.
- From our OES server, nothing can be accessed or pinged back.
- Our debian pihole dns on 192.168.3.yy server does seem to work with ping...
- Attempting to connect my laptop to the netGATE does not produce any connection (see picture).
- Running commands in the shell produced the following (I am using KEA)
With the wifi-router disconnected, I re-ran two commands on a windows PC but nothing really connects.
-
@vf1954 169.254 is a link local IP range windows will use when a dhcp server is not available.
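
As a rough sketch of how to read the address a client ends up with: 169.254.x.x means no DHCP server answered, an address in the pfSense LAN means the right server answered, and anything else (like the 192.168.0.x seen earlier) points at some other DHCP server. The Python below assumes the pfSense LAN is 192.168.3.0/24, which is only a guess based on the addresses mentioned in this thread; the function name is made up.

```python
import ipaddress

# Assumption: the pfSense LAN is 192.168.3.0/24. Adjust to the real subnet.
EXPECTED_LAN = ipaddress.ip_network("192.168.3.0/24")


def classify_lease(addr: str, expected=EXPECTED_LAN) -> str:
    """Classify the IPv4 address a client currently holds."""
    ip = ipaddress.ip_address(addr)
    if ip.is_link_local:
        # 169.254.0.0/16: the client configured itself, no DHCP reply arrived.
        return "no-dhcp-answer"
    if ip in expected:
        return "expected"
    # Valid lease, but from a subnet pfSense is not supposed to hand out.
    return "foreign"


if __name__ == "__main__":
    for sample in ("169.254.10.20", "192.168.3.50", "192.168.0.101"):
        print(sample, classify_lease(sample))
```

A "foreign" result is the case where checking the MAC of the DHCP server is worthwhile; "no-dhcp-answer" means there is no other server to find at that moment.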
Run isc for your dhcp server - kea is still in preview to be honest..
-
@johnpoz I did as you advised. I am back on ISC. I just see that it will be deprecated.
Based on the ps aux output, only an IPv6 entry is visible, and no IPv4 at all. Is that correct? Is that the result of KEA?
-
@vf1954 said in Subnet collapses periodically since 24.11-RELEASE:
I just see that it will be deprecated.
And how many versions down the road do you think that is? 3, 6, 12?
kea is not at feature parity yet.. So there is no chance you are going to see isc removed as an option, that is for sure.
I would bet you that there will first be a switch over to where kea is the default, and only some time after that would isc be removed. I don't see kea becoming default for at least a few more versions of pfsense.
-
@johnpoz said in Subnet collapses periodically since 24.11-RELEASE:
@vf1954 flow control issues not going to have your client change its IP.. There is zero reason to turn off flow control on anything.
As long as it's not Windows and not 169.x.x.x, you are right...
@vf1954 said in Subnet collapses periodically since 24.11-RELEASE:
I never use flow control.
For example, I didn't even know it was enabled.
BTW, I've been using KEA in a small network for over a year with VLANs, LAGs, VPN, and CARP. So far, there have been no issues with collapses or clients not receiving addresses.
But switching to ISC is a good idea for debugging anyway.
-
@w0w yeah it's coming along - but just look at the board, many posts about kea. Don't see any reason to use it if you're having issues. Try again next release to be honest.
-
@johnpoz Hello.
Well knocking on wood. The switch back to ISC was the solution. So far no issues for 3 weeks straight.
What should I do to report KEA malfunctioning?
-
@vf1954 unless you're running the 25.03 beta and want to report stuff in that section, I see little point in pointing out what might be wrong with the 24.11 version of kea. Now if you're using what is about to come out, and you see problems - they still might be able to be fixed before release.