WAN link state change up and down until reboot


  • Yesterday at about 22:30 my pfSense lost its internet connection. After I rebooted pfSense, it worked again. How can I find out what caused the problem? The WAN interface is an Intel PRO/1000 CT card.


  • This is what I see in the log:

    Aug 14 06:46:44 	kernel 		em2: link state changed to UP
    Aug 14 06:46:44 	check_reload_status 		Linkup starting em2
    Aug 14 06:46:41 	kernel 		em2: link state changed to DOWN
    Aug 14 06:46:41 	kernel 		em2: RX Next to Refresh = 1023
    Aug 14 06:46:41 	kernel 		em2: RX Next to Check = 0
    Aug 14 06:46:41 	kernel 		em2: RX discarded packets = 0
    Aug 14 06:46:41 	kernel 		em2: hw rdh = 0, hw rdt = 1023
    Aug 14 06:46:41 	kernel 		em2: RX Queue 0 ------
    Aug 14 06:46:41 	kernel 		em2: Tx Descriptors avail failure = 0
    Aug 14 06:46:41 	kernel 		em2: TX descriptors avail = 127
    Aug 14 06:46:41 	kernel 		em2: Tx Queue Status = -2147483648
    Aug 14 06:46:41 	kernel 		em2: hw tdh = 0, hw tdt = 897
    Aug 14 06:46:41 	kernel 		em2: TX Queue 0 ------
    Aug 14 06:46:41 	kernel 		em2: Watchdog timeout Queue[0]-- resetting
    Aug 14 06:46:41 	check_reload_status 		Linkup starting em2
    Aug 14 06:31:52 	php-fpm 		/rc.newwanip: rc.newwanip: on (IP address: xxxxxxx) (interface: WAN[wan]) (real interface: em2).
    Aug 14 06:31:52 	php-fpm 		/rc.newwanip: rc.newwanip: Info: starting on em2.
    Aug 14 06:31:51 	check_reload_status 		rc.newwanip starting em2
    Aug 14 06:31:50 	kernel 		em2: link state changed to UP
    Aug 14 06:31:50 	check_reload_status 		Linkup starting em2
    Aug 14 06:31:46 	kernel 		em2: link state changed to DOWN
    Aug 14 06:31:46 	kernel 		em2: RX Next to Refresh = 1023
    Aug 14 06:31:46 	kernel 		em2: RX Next to Check = 0
    Aug 14 06:31:46 	kernel 		em2: RX discarded packets = 0
    Aug 14 06:31:46 	kernel 		em2: hw rdh = 0, hw rdt = 1023
    Aug 14 06:31:46 	kernel 		em2: RX Queue 0 ------
    Aug 14 06:31:46 	kernel 		em2: Tx Descriptors avail failure = 0
    Aug 14 06:31:46 	kernel 		em2: TX descriptors avail = 127
    Aug 14 06:31:46 	kernel 		em2: Tx Queue Status = -2147483648
    Aug 14 06:31:46 	kernel 		em2: hw tdh = 0, hw tdt = 897 
    

  • @GeorgeCZ58 said in WAN link state change up and down until reboot:

    How to find, what caused that problem?

    How about:
    Swap the LAN and WAN NIC assignments and see if the problem follows the NIC.
    If it does, it's the NIC. If not, it's the cable or the device in front of it.
    Etc. etc.

    By the way: yes, even Intel NICs can die.

    edit :
    If the interface 'resets' after a

    /rc.newwanip: rc.newwanip: Info: starting on em2.
    

    event, check why that event fired (compare the events at the same moment in all the logs).

    Example: an upstream router decides to hand out a 'pfSense WAN IP lease' with a duration of 60 seconds. That means the pfSense DHCP client starts asking for renewal every 30 seconds.
    That will chain-gun the state of the WAN interface, as it gets restarted every 30 seconds.
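
    To put numbers on that example: a DHCP client normally tries to renew at half the lease time (T1) and falls back to rebinding at 87.5% of it (T2), which are the RFC 2131 defaults. The sketch below is only an illustration of that arithmetic. The function names are made up for this post, and it only matters if the WAN actually gets its address via DHCP.

```python
# Hypothetical helpers, not part of pfSense: they only illustrate the
# default DHCP timer arithmetic (T1 = 50%, T2 = 87.5% of the lease time).

def renewal_timers(lease_seconds, t1_factor=0.5, t2_factor=0.875):
    """Return (T1, T2) in seconds for a given lease duration."""
    return lease_seconds * t1_factor, lease_seconds * t2_factor


def renewals_per_hour(lease_seconds, t1_factor=0.5):
    """How often the client renews per hour if every renewal succeeds at T1."""
    t1, _ = renewal_timers(lease_seconds, t1_factor)
    return 3600 / t1


if __name__ == "__main__":
    for lease in (60, 3600, 86400):
        t1, t2 = renewal_timers(lease)
        print(f"lease {lease:>6}s: renew at {t1:.0f}s, rebind at {t2:.1f}s, "
              f"about {renewals_per_hour(lease):.2f} renewals per hour")
```

    With a 60 second lease that is a renewal every 30 seconds, so anything that reacts to a renewal gets triggered about 120 times per hour.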


  • Here is more information from the log, including the time when the problem started:

    Aug 13 22:30:14 	kernel 		em2: link state changed to DOWN
    Aug 13 22:30:14 	kernel 		em2: RX Next to Refresh = 1023
    Aug 13 22:30:14 	kernel 		em2: RX Next to Check = 0
    Aug 13 22:30:14 	kernel 		em2: RX discarded packets = 0
    Aug 13 22:30:14 	kernel 		em2: hw rdh = 0, hw rdt = 1023
    Aug 13 22:30:14 	kernel 		em2: RX Queue 0 ------
    Aug 13 22:30:14 	kernel 		em2: Tx Descriptors avail failure = 0
    Aug 13 22:30:14 	kernel 		em2: TX descriptors avail = 886
    Aug 13 22:30:14 	kernel 		em2: Tx Queue Status = -2147483648
    Aug 13 22:30:14 	kernel 		em2: hw tdh = 0, hw tdt = 138
    Aug 13 22:30:14 	kernel 		em2: TX Queue 0 ------
    Aug 13 22:30:14 	kernel 		Interface is RUNNING and ACTIVE
    Aug 13 22:30:14 	kernel 		em2: Watchdog timeout Queue[0]-- resetting
    Aug 13 22:30:14 	check_reload_status 		Linkup starting em2
    Aug 13 22:28:04 	dhcpleases 		/etc/hosts changed size from original!
    Aug 13 22:28:04 	php-fpm 		/rc.newwanip: rc.newwanip: on (IP address: xxx.xxx.xxx.xxx) (interface: WAN[wan]) (real interface: em2).
    Aug 13 22:28:04 	php-fpm 		/rc.newwanip: rc.newwanip: Info: starting on em2.
    Aug 13 22:28:03 	check_reload_status 		Reloading filter
    Aug 13 22:28:03 	check_reload_status 		rc.newwanip starting em2
    Aug 13 22:28:03 	php-fpm 		/rc.linkup: Hotplug event detected for WAN(wan) static IP (xxx.xxx.xxx.xxx )
    Aug 13 22:28:02 	kernel 		em2: link state changed to UP
    Aug 13 22:28:02 	check_reload_status 		Linkup starting em2
    Aug 13 22:28:01 	php-fpm 		/rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WAN_IP_ADD.
    Aug 13 22:28:00 	check_reload_status 		Reloading filter
    Aug 13 22:28:00 	check_reload_status 		Restarting OpenVPN tunnels/interfaces
    Aug 13 22:28:00 	check_reload_status 		Restarting ipsec tunnels
    Aug 13 22:28:00 	check_reload_status 		updating dyndns WAN_IP_ADD
    Aug 13 22:28:00 	rc.gateway_alarm 	41570 	>>> Gateway alarm: WAN_IP_ADD (Addr:8.8.8.8 Alarm:1 RTT:9.184ms RTTsd:.778ms Loss:21%)
    Aug 13 22:27:59 	check_reload_status 		Reloading filter
    Aug 13 22:27:59 	php-fpm 		/rc.linkup: Hotplug event detected for WAN(wan) static IP (xxx.xxx.xxx.xxx )
    Aug 13 22:27:58 	kernel 		em2: link state changed to DOWN
    Aug 13 22:27:58 	kernel 		em2: RX Next to Refresh = 1023
    Aug 13 22:27:58 	kernel 		em2: RX Next to Check = 0
    Aug 13 22:27:58 	kernel 		em2: RX discarded packets = 0
    Aug 13 22:27:58 	kernel 		em2: hw rdh = 0, hw rdt = 1023
    Aug 13 22:27:58 	kernel 		em2: RX Queue 0 ------
    Aug 13 22:27:58 	kernel 		em2: Tx Descriptors avail failure = 0
    Aug 13 22:27:58 	kernel 		em2: TX descriptors avail = 989
    Aug 13 22:27:58 	kernel 		em2: Tx Queue Status = -2147483648
    Aug 13 22:27:58 	kernel 		em2: hw tdh = 0, hw tdt = 35
    Aug 13 22:27:58 	kernel 		em2: TX Queue 0 ------
    Aug 13 22:27:58 	kernel 		Interface is RUNNING and ACTIVE
    Aug 13 22:27:58 	kernel 		em2: Watchdog timeout Queue[0]-- resetting
    Aug 13 22:27:58 	check_reload_status 		Linkup starting em2
    Aug 13 22:27:53 	check_reload_status 		Reloading filter
    Aug 13 22:27:53 	dhcpleases 		/etc/hosts changed size from original!
    Aug 13 22:27:53 	php-fpm 		/rc.newwanip: rc.newwanip: on (IP address: xxx.xxx.xxx.xxx) (interface: WAN[wan]) (real interface: em2).
    Aug 13 22:27:53 	php-fpm 		/rc.newwanip: rc.newwanip: Info: starting on em2.
    Aug 13 22:27:52 	check_reload_status 		Reloading filter
    Aug 13 22:27:52 	check_reload_status 		rc.newwanip starting em2
    Aug 13 22:27:52 	php-fpm 		/rc.linkup: Hotplug event detected for WAN(wan) static IP (xxx.xxx.xxx.xxx )
    Aug 13 22:27:51 	kernel 		em2: link state changed to UP
    Aug 13 22:27:51 	check_reload_status 		Linkup starting em2
    Aug 13 22:27:49 	check_reload_status 		Reloading filter
    Aug 13 22:27:49 	php-fpm 		/rc.linkup: Hotplug event detected for WAN(wan) static IP (xxx.xxx.xxx.xxx )
    Aug 13 22:27:48 	kernel 		em2: link state changed to DOWN
    Aug 13 22:27:48 	kernel 		em2: RX Next to Refresh = 733
    Aug 13 22:27:48 	kernel 		em2: RX Next to Check = 734
    Aug 13 22:27:48 	kernel 		em2: RX discarded packets = 0
    Aug 13 22:27:48 	kernel 		em2: hw rdh = 734, hw rdt = 733
    Aug 13 22:27:48 	kernel 		em2: RX Queue 0 ------
    Aug 13 22:27:48 	kernel 		em2: Tx Descriptors avail failure = 0
    Aug 13 22:27:48 	kernel 		em2: TX descriptors avail = 961
    Aug 13 22:27:48 	kernel 		em2: Tx Queue Status = -2147483648
    Aug 13 22:27:48 	kernel 		em2: hw tdh = 70, hw tdt = 133
    Aug 13 22:27:48 	kernel 		em2: TX Queue 0 ------
    Aug 13 22:27:48 	kernel 		Interface is RUNNING and ACTIVE
    Aug 13 22:27:48 	kernel 		em2: Watchdog timeout Queue[0]-- resetting
    Aug 13 22:27:48 	check_reload_status 		Linkup starting em2 
    

    We have several public addresses. Cisco routers on the other addresses seem to work OK. So you think replacing the Intel card is the solution? But after a restart it works OK. :-/
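
    In the excerpts above, every "link state changed to DOWN" line shares its timestamp with an "em2: Watchdog timeout" line. To check that over a longer log, something like the following rough sketch could help. It assumes the log was saved as plain text in the same layout as above. The function names are made up, and the year is an arbitrary placeholder because the log lines do not carry one.

```python
import re
from datetime import datetime

# Timestamp, process name, rest of the line (whitespace or tab separated).
LINE_RE = re.compile(r"^\s*(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(\S+)\s+(.*?)\s*$")
# Kernel messages of interest: link state changes and watchdog timeouts.
EVENT_RE = re.compile(r"^(\w+): (link state changed to (?:UP|DOWN)|Watchdog timeout)")


def parse_events(log_text, year=2000):
    """Return sorted (time, interface, kind) tuples; kind is up/down/watchdog.

    `year` is a placeholder assumption, since the log has no year field.
    """
    events = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m or m.group(2) != "kernel":
            continue
        e = EVENT_RE.match(m.group(3))
        if not e:
            continue
        when = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        text = e.group(2)
        kind = "watchdog" if text.startswith("Watchdog") else text.rsplit(" ", 1)[-1].lower()
        events.append((when, e.group(1), kind))
    return sorted(events)  # the GUI shows newest first, so sort oldest first


def summarize(events):
    """Count DOWN and watchdog events and the seconds between DOWN events."""
    downs = [t for t, _, kind in events if kind == "down"]
    watchdogs = [t for t, _, kind in events if kind == "watchdog"]
    gaps = [int((b - a).total_seconds()) for a, b in zip(downs, downs[1:])]
    return {"down": len(downs), "watchdog": len(watchdogs), "gaps_s": gaps}


SAMPLE = """
Aug 13 22:27:58  kernel   em2: link state changed to DOWN
Aug 13 22:27:58  kernel   em2: Watchdog timeout Queue[0]-- resetting
Aug 13 22:27:51  kernel   em2: link state changed to UP
Aug 13 22:27:48  kernel   em2: link state changed to DOWN
Aug 13 22:27:48  kernel   em2: Watchdog timeout Queue[0]-- resetting
Aug 13 22:27:48  check_reload_status   Linkup starting em2
"""

if __name__ == "__main__":
    print(summarize(parse_events(SAMPLE)))
```

    If the DOWN count and the watchdog count stay equal over the whole log, the link drops are most likely the driver resetting the card after a watchdog timeout, not the cable being pulled. That still does not say whether the card, the cable or the device in front of it is at fault.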