Hotplug event every minute
I'm facing a strange issue. I'm using pfSense 2.4.2-RELEASE-p1 (amd64) with two interfaces: WAN, connected to my ISP's optical router, and LAN, connected to a Netgear R7000 configured as an access point. My LAN interface goes down and reloads every minute or so with the following log entries:
Mar 18 13:30:50 pfSense kernel: re0: watchdog timeout
Mar 18 13:30:50 pfSense kernel: re0: link state changed to DOWN
Mar 18 13:30:50 pfSense check_reload_status: Linkup starting re0
Mar 18 13:30:51 pfSense php-fpm: /rc.linkup: Hotplug event detected for LAN(lan) static IP (192.168.1.1 )
Mar 18 13:30:51 pfSense check_reload_status: Reloading filter
Mar 18 13:30:54 pfSense check_reload_status: Linkup starting re0
Mar 18 13:30:54 pfSense kernel: re0: link state changed to UP
Mar 18 13:30:55 pfSense php-fpm: /rc.linkup: Hotplug event detected for LAN(lan) static IP (192.168.1.1)
Mar 18 13:30:55 pfSense check_reload_status: rc.newwanip starting re0
Mar 18 13:30:55 pfSense check_reload_status: Reloading filter
Mar 18 13:30:56 pfSense php-fpm: /rc.newwanip: rc.newwanip: Info: starting on re0.
Mar 18 13:30:56 pfSense php-fpm: /rc.newwanip: rc.newwanip: on (IP address: 192.168.1.1) (interface: LAN[lan]) (real interface: re0).
Mar 18 13:30:56 pfSense check_reload_status: Reloading filter
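For anyone who wants to quantify how often this happens over a longer log, here is a minimal Python sketch that picks out the kernel watchdog and link-state lines and measures how long each outage lasts. The function names and the `year` parameter are my own assumptions for illustration (syslog lines carry no year), and the pattern only covers the line format shown above.

```python
import re
from datetime import datetime

# Matches kernel lines in the format shown in the log above, e.g.
#   Mar 18 13:30:50 pfSense kernel: re0: link state changed to DOWN
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"(?P<nic>\w+): (?P<msg>watchdog timeout|link state changed to (?:DOWN|UP))$"
)


def parse_kernel_events(lines, year=2000):
    """Group watchdog/link-state events by NIC name.

    `year` is an arbitrary placeholder because syslog timestamps omit it;
    it does not affect durations within the same day.
    """
    events = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # not a kernel watchdog/link-state line
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        events.setdefault(m["nic"], []).append((ts, m["msg"]))
    return events


def link_outages(nic_events):
    """Return the seconds between each DOWN and the next UP for one NIC."""
    outages = []
    down_at = None
    for ts, msg in nic_events:
        if msg.endswith("DOWN"):
            down_at = ts
        elif msg.endswith("UP") and down_at is not None:
            outages.append((ts - down_at).total_seconds())
            down_at = None
    return outages


# A few lines copied from the log above.
SAMPLE = [
    "Mar 18 13:30:50 pfSense kernel: re0: watchdog timeout",
    "Mar 18 13:30:50 pfSense kernel: re0: link state changed to DOWN",
    "Mar 18 13:30:51 pfSense check_reload_status: Reloading filter",
    "Mar 18 13:30:54 pfSense kernel: re0: link state changed to UP",
]

if __name__ == "__main__":
    events = parse_kernel_events(SAMPLE)
    print(link_outages(events["re0"]))  # [4.0]
```

Feeding it the full system log instead of `SAMPLE` would show whether the flaps really recur about once a minute and whether each one is preceded by a watchdog timeout.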
Physically, I can see the LAN port LEDs on my pfSense box go off and on every time the issue appears.
Tried replacing the LAN cable with 3 different LAN cables - same problem.
Tried replacing my Netgear R7000 access point with another one - same issue.
Tried switching off all network devices connected to my Netgear R7000 - same issue.
Tried shaking the cable to see if the LAN port is damaged - LAN port is fine. During the shaking both orange and yellow link lights are on.
Tried changing the interface speed and duplex to manual 1000baseT - same issue.
The only thing that helps is restarting my pfsense box. Then the issue disappears for a week or two then comes back randomly.
Unfortunately I'm out of ideas. Now the issue is present and I don't want to restart pfsense until I find the root cause. Guidance from your side for this troubleshooting will be highly appreciated.
Here is my LAN interface ifconfig:
options=8219b <rxcsum,txcsum,vlan_mtu,vlan_hwtagging,vlan_hwcsum,tso4,wol_magic,linkstate>
ether 74:XX:XX:XX:XX:73
inet6 fe80::XXXX:XXXX:XXXX:c473%re0 prefixlen 64 scopeid 0x1
inet 192.168.1.1 netmask 0xffffff00 broadcast 192.168.1.255
nd6 options=21 <performnud,auto_linklocal>
media: Ethernet 1000baseT <full-duplex,master>
status: active
Both my WAN and LAN interfaces are Realtek RTL8111F NICs. There is no USB LAN adapter connected to my pfSense box.
I'm looking forward to hearing from you.
Try disabling any powerd setting; the re driver does not seem to like power saving.
I run embedded RTL8111F NICs. They are not the best, but they will run anything on pfSense, including Suricata inline.
Thanks for your reply. In my configuration the PowerD settings were disabled a long time ago.
When you replaced the R7000 was it with another R7000?
Disconnecting and reconnecting the LAN cable does not bring the connection back?
It does seem like some sort of power saving option. Might not be powerd though, perhaps 'green ethernet' or similar.
Try putting a passive switch in between the LAN and the R7000.
Thanks for your reply.
When I replaced my R7000 it was with another fully functional and tested R7000.
Disconnecting and reconnecting the LAN cable brings the link up, but after one minute the same link goes down and up again in a loop.
Finding a passive switch will be a challenge for me. The issue came with the latest pfSense release. Did you make any amendments to the power saving settings in the latest release?
There were no changes to the re(4) defaults as far as I know. It depends what version you came from though. 2.4.2 to 2.4.2_1? Changes were minimal, very unlikely to have been anything that would affect it like this directly.
Can you try any other device on that link?
Can you swap the interface assignments so that link is using a different NIC?
I swapped the interfaces and the situation actually got worse. My defective interface went into a reset loop, constantly going down and up.
After interface swap, I managed to get LAN up and the DHCP gave me an IP, but the webgui became unresponsive, so I had to reset the box and restore original configuration.
Now the fault disappeared but I'm sure I haven't found the root cause.
The result of interface swap was: the fault stayed on re0.
Be sure your interfaces are set to Default (No preference, usually autoselect) in the interface Speed and Duplex settings.
Other than that, can only say "Realtek NICs suck" so many times.
I can confirm all interfaces are set to Default in terms of Speed and Duplex.
I found a topic claiming that updating the pfSense Realtek driver to the latest one available on the official Realtek website would solve my issue.
The above alone did not solve my problem; rather, a combination of things did.
What fixed it for me was uploading a compiled version of the Realtek driver 1.92, if_re.ko, to /boot/kernel/ and adding the following line to /boot/loader.conf.local: if_re_load="YES" then reboot and wupti.
However, I also ticked the three "Disable hardware ... offloading" options in System - Advanced - Networking.
Here are some links that helped me solve my issue, including the compiled if_re.ko. I hope it helps others having the same problem.
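To double-check that the loader entry mentioned above is actually in place before rebooting, something like the following Python sketch could be run against the contents of the file. The function name `module_load_enabled` is hypothetical, it only understands simple name="value" lines with # comments, and treating YES case-insensitively is an assumption of this sketch rather than something confirmed in this thread.

```python
def module_load_enabled(conf_text, module="if_re"):
    """Return True if the text contains an enabled <module>_load="YES" entry.

    Only simple name="value" lines are handled; anything after a '#'
    is treated as a comment. The last matching assignment wins.
    """
    wanted = f"{module}_load"
    enabled = False
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() == wanted:
            enabled = value.strip().strip('"').upper() == "YES"
    return enabled


if __name__ == "__main__":
    example = 'if_re_load="YES"\n'
    print(module_load_enabled(example))  # True
```

This only confirms that the line is present and enabled in the text it is given; it says nothing about whether the uploaded module matches the running kernel.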
From what I can see, the latest Realtek driver is 0007-rtl_bsd_drv_v194.01. Unfortunately, in order to compile the driver I have to install a FreeBSD release, which will take me a while.
Mmm, Realtek NICs do suck.
"My defective interface went in a reset loop constantly going down and up."
That in particular sounds like a negotiation issue that I have seen myself on Realtek NICs when connected to some other gear (nothing in particular though).
Are you actually seeing the watchdog timeout errors in the system log or on the console? If not, the alternate driver may not do much for you.
I thought it was a negotiation issue, but going through all the 1000baseT negotiation modes didn't help.
I confirm I can see watchdog timeout errors in the system log every time the problem appears.
So far I've uploaded the latest Realtek drivers to my pfsense.
For your convenience, guys, I've attached the latest drivers from Realtek compiled for FreeBSD. Version: 1.94.
Here is a guide if you want to compile them yourselves: https://gist.github.com/jovimon/524e116471f249626fd2ccd141f3fe05
Just as a warning: no-one should be uploading drivers, of any sort but particularly from an unknown source, into their firewall unless they have a really really good reason! ;)
What FreeBSD version was that compiled against? 64bit?
I fully support your statement. Do not upload files to your firewalls from unknown sources.
The driver was compiled on the FreeBSD 11.1 amd64 release.