Every 2 days Netgate 2100 Stops Routing Traffic
-
Next time that happens try running at the CLI:
etherswitchcfg
Make sure it can see the switch device and returns sane data.
Do you have any additional hardware in it? In particular a cellular modem?
Steve
-
@stephenw10 OK, will do. I was able to access the switch management in the webGUI, and that's what I used to change the LAN port going to the main switch from auto-select to None, to reset that port and see if it would help. It didn't.
Assuming that worked, I would assume etherswitchcfg would have worked as well, but I will try it next time.
-
Yes, it shows it was able to see the switch device at least. It may still have been returning bogus values though.
-
@stephenw10 just happened again right now, here is the output from
etherswitchcfg
The customer switch is connected to LAN 1 on the Netgate 2100.
I personally don't see any issue here in the output.
etherswitch0: VLAN mode: PORT
port1:
    state=8<FORWARDING>
    flags=0<>
    media: Ethernet autoselect (1000baseT <full-duplex>)
    status: active
port2:
    state=8<FORWARDING>
    flags=0<>
    media: Ethernet autoselect (none)
    status: no carrier
port3:
    state=8<FORWARDING>
    flags=0<>
    media: Ethernet autoselect (none)
    status: no carrier
port4:
    state=8<FORWARDING>
    flags=0<>
    media: Ethernet autoselect (none)
    status: no carrier
port5:
    state=8<FORWARDING>
    flags=1<CPUPORT>
    media: Ethernet 2500Base-KX <full-duplex>
    status: active
vlangroup1:
    port: 1
    members 2,3,4,5
vlangroup2:
    port: 2
    members 1,3,4,5
vlangroup3:
    port: 3
    members 1,2,4,5
vlangroup4:
    port: 4
    members 1,2,3,5
vlangroup5:
    port: 5
    members 1,2,3,4
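For anyone who wants to check this kind of output automatically rather than by eye, here is a minimal Python sketch. It is not an official tool: the function names parse_ports and find_problems are made up for illustration, the sample text is an excerpt of the output posted above, and the idea of "sane" used here (a port with an active link should be FORWARDING, and a CPU port should exist) is only an assumption about what to look for.

```python
import re

# Excerpt of the etherswitchcfg output from this thread (ports 1, 2 and 5 only).
SAMPLE = """\
etherswitch0: VLAN mode: PORT
port1:
    state=8<FORWARDING>
    flags=0<>
    media: Ethernet autoselect (1000baseT <full-duplex>)
    status: active
port2:
    state=8<FORWARDING>
    flags=0<>
    media: Ethernet autoselect (none)
    status: no carrier
port5:
    state=8<FORWARDING>
    flags=1<CPUPORT>
    media: Ethernet 2500Base-KX <full-duplex>
    status: active
vlangroup1:
    port: 1
    members 2,3,4,5
"""


def parse_ports(text):
    """Return {port_number: {"state": ..., "cpu": ..., "status": ...}}."""
    ports = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.fullmatch(r"port(\d+):", line)
        if match:
            current = int(match.group(1))
            ports[current] = {}
            continue
        if re.fullmatch(r"vlangroup\d+:", line):
            current = None  # lines that follow belong to a VLAN group, not a port
            continue
        if current is None:
            continue
        if line.startswith("state="):
            ports[current]["state"] = line.split("<", 1)[1].rstrip(">")
        elif line.startswith("flags="):
            ports[current]["cpu"] = "CPUPORT" in line
        elif line.startswith("status:"):
            ports[current]["status"] = line.split(":", 1)[1].strip()
    return ports


def find_problems(ports):
    """Assumed sanity rules: active links forward, and a CPU port exists."""
    problems = []
    for num, info in sorted(ports.items()):
        if info.get("status") == "active" and info.get("state") != "FORWARDING":
            problems.append(f"port{num}: link up but state is {info.get('state')}")
    if not any(info.get("cpu") for info in ports.values()):
        problems.append("no CPU port found")
    return problems


if __name__ == "__main__":
    print(find_problems(parse_ports(SAMPLE)))
```

An empty list only means the output passes these assumed checks, which matches what was seen here: the switch looked fine while traffic had stopped.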
-
Yeah that looks fine. In which case I'd expect to at least see some traffic on mvneta1. Try running packet capture there and make sure there is.
When this happens do LAN clients stop connecting entirely? Are they able to reach the pfSense webgui still? Do they still get a dhcp lease?
Steve
-
@artooro did you try a different port, 2-4? Or different patch cable? Seems unlikely, but...
-
@stephenw10 No, they can't get to the pfSense webGUI. If I look at the packet counters under Status / Interfaces, the "in" count stays static and, as you would expect, the packet capture doesn't show anything coming in either.
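The "in count stays static" symptom could be turned into a simple check along these lines. This is only a sketch: the function name input_stalled and the window size are hypothetical, and it assumes you collect the input packet counter yourself at regular intervals (for example by reading it off Status / Interfaces).

```python
def input_stalled(samples, window=3):
    """Return True if the last `window` counter readings are all identical.

    `samples` is a list of input packet counts taken at regular intervals,
    oldest first. Fewer than `window` readings is treated as "not enough data".
    """
    if len(samples) < window:
        return False
    recent = samples[-window:]
    return all(value == recent[0] for value in recent)


if __name__ == "__main__":
    # Hypothetical readings: traffic flows, then the counter freezes.
    print(input_stalled([100, 250, 250, 250]))
    # Hypothetical readings: the counter keeps increasing.
    print(input_stalled([100, 200, 300]))
```

A frozen counter on a quiet network is not necessarily a fault, so a check like this is only meaningful while clients are known to be active.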
-
@artooro This may seem random but is the switch going down at that time? We have a lab running 2.6 and realized that it drops its LAN when the switch is unplugged/replaced/etc. I don't recall ever seeing that before so just chalked it up to the Realtek NIC in that PC (which, try to avoid Realtek). We can access/restart that router from its WAN so it's not a big deal for us but it's on our list to look at.
-
@SteveITS It's not going down, at least not in a way I can tell remotely. I did already think about the possibility that rebooting the Netgate might somehow be causing the customer switch to start working again. That's why I shut down the LAN 1 port on the Netgate without rebooting it to test that theory, which did not make a difference.
So I highly doubt it's the customer switch at this point.
-
So do clients no longer get a dhcp lease from pfSense?
What is actually connected to the 2100 switch port? A client directly?
-
So even after swapping the cable and connecting the switch (which is a Ubiquiti USW-Lite-16-POE) to a different LAN port on the Netgate 2100, the same issue is recurring.
Now what's interesting is that we just had a long weekend in Canada, and the Netgate didn't stop routing until the employees came into the office and started working.
So it appears that the issue on the Netgate isn't triggered until there is some LAN-side load on it.
Regarding DHCP leases: no, that stops working as well. I'm also monitoring the servers, which have static IPs, and they all go down.
@stephenw10 would it make sense to go the RMA route at this point?
-
Are you able to test this with a default config after a clean install?
If it still stops routing in that situation then, yes, it's probably time to open an RMA request.
Steve
-
@stephenw10 that was the first thing we did. So yeah I'll go ahead and create a ticket.