Daily interface up/down, state reset



  • At first I thought this was a fluke, but now I am getting suspicious…

    For the past two days at 2 a.m., my pfSense box has logged that the WAN interface went down and then came back up within seconds.  This is odd for a few reasons: one, the device on the other side is a small Cisco router provided by my cable company for business customers; two, that device has been up the whole time; three, it was 2 a.m. both times.

    Here are the system logs:

    
    today:
    
    Mar 25 02:01:38	check_reload_status: reloading filter
    Mar 25 02:01:38	check_reload_status: reloading filter
    Mar 25 02:01:28	apinger: alarm canceled: HE_V6(2001:470:x::1) *** down ***
    Mar 25 02:01:28	apinger: alarm canceled: GW_WAN(x.x.x.x) *** delay ***
    Mar 25 02:01:28	apinger: alarm canceled: GW_WAN(x.x.x.x) *** down ***
    Mar 25 02:00:56	php: : Hotplug event detected for wan but ignoring since interface is configured with static IP (x.x.x.x)
    Mar 25 02:00:55	check_reload_status: Linkup starting xl0
    Mar 25 02:00:55	kernel: xl0: link state changed to UP
    Mar 25 02:00:41	php: : Hotplug event detected for wan but ignoring since interface is configured with static IP (x.x.x.x)
    Mar 25 02:00:40	kernel: xl0: link state changed to DOWN
    Mar 25 02:00:40	check_reload_status: Linkup starting xl0
    Mar 25 02:00:31	check_reload_status: reloading filter
    Mar 25 02:00:21	apinger: ALARM: HE_V6(2001:470:x::1) *** down ***
    Mar 25 02:00:21	apinger: ALARM: GW_WAN(x.x.x.x) *** down ***
    Mar 25 02:00:15	php: : Hotplug event detected for wan but ignoring since interface is configured with static IP (x.x.x.x)
    Mar 25 02:00:14	check_reload_status: reloading filter
    Mar 25 02:00:14	kernel: xl0: link state changed to UP
    Mar 25 02:00:14	check_reload_status: Linkup starting xl0
    Mar 25 02:00:13	php: : Hotplug event detected for wan but ignoring since interface is configured with static IP (x.x.x.x)
    Mar 25 02:00:12	kernel: xl0: link state changed to DOWN
    Mar 25 02:00:12	check_reload_status: Linkup starting xl0
    Mar 25 02:00:04	apinger: ALARM: GW_WAN(x.x.x.x) *** delay ***
    
    yesterday:
    
    Mar 24 02:01:38	check_reload_status: reloading filter
    Mar 24 02:01:38	check_reload_status: reloading filter
    Mar 24 02:01:28	apinger: alarm canceled: HE_V6(2001:470:x::1) *** down ***
    Mar 24 02:01:28	apinger: alarm canceled: GW_WAN(x.x.x.x) *** delay ***
    Mar 24 02:01:28	apinger: alarm canceled: GW_WAN(x.x.x.x) *** down ***
    Mar 24 02:00:56	php: : Hotplug event detected for wan but ignoring since interface is configured with static IP (x.x.x.x)
    Mar 24 02:00:55	check_reload_status: Linkup starting xl0
    Mar 24 02:00:55	kernel: xl0: link state changed to UP
    Mar 24 02:00:41	php: : Hotplug event detected for wan but ignoring since interface is configured with static IP (x.x.x.x)
    Mar 24 02:00:40	check_reload_status: Linkup starting xl0
    Mar 24 02:00:40	kernel: xl0: link state changed to DOWN
    Mar 24 02:00:31	check_reload_status: reloading filter
    Mar 24 02:00:21	apinger: ALARM: HE_V6(2001:470:x::1) *** down ***
    Mar 24 02:00:21	apinger: ALARM: GW_WAN(x.x.x.x) *** down ***
    Mar 24 02:00:15	php: : Hotplug event detected for wan but ignoring since interface is configured with static IP (x.x.x.x)
    Mar 24 02:00:14	check_reload_status: reloading filter
    Mar 24 02:00:14	kernel: xl0: link state changed to UP
    Mar 24 02:00:14	check_reload_status: Linkup starting xl0
    Mar 24 02:00:13	php: : Hotplug event detected for wan but ignoring since interface is configured with static IP (x.x.x.x)
    Mar 24 02:00:12	kernel: xl0: link state changed to DOWN
    Mar 24 02:00:12	check_reload_status: Linkup starting xl0
    Mar 24 02:00:04	apinger: ALARM: GW_WAN(x.x.x.x) *** delay ***
    
    

    I don't mind so much that it goes down, but this also seems to trigger a reset of the firewall states, as any long-running sessions (e.g. a few dozen ssh sessions) time out right after this.

    This is 2.0-RC3 with a gitsync to the v6 branch from Sunday.
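
    To put numbers on how long each flap lasted, the kernel DOWN/UP lines can be paired up. This is only a rough sketch: it assumes the log lines look like the ones pasted above, the function name `flap_durations` is made up for illustration, and the year is a placeholder because syslog timestamps do not carry one.

```python
import re
from datetime import datetime

# Matches kernel link-state lines such as:
#   "Mar 25 02:00:12<TAB>kernel: xl0: link state changed to DOWN"
LINK_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+kernel: (\w+): link state changed to (UP|DOWN)$"
)


def flap_durations(lines, year=2011):
    """Return (interface, down_time, seconds_down) for each DOWN/UP pair.

    `year` is an assumed placeholder; the log format has no year field.
    """
    events = []
    for line in lines:
        m = LINK_RE.match(line.strip())
        if m:
            ts = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            events.append((ts, m.group(2), m.group(3)))

    # The log view lists newest entries first, so sort into chronological order.
    events.sort(key=lambda e: e[0])

    flaps = []
    down_at = {}  # interface -> time of the most recent unmatched DOWN
    for ts, iface, state in events:
        if state == "DOWN":
            down_at[iface] = ts
        elif iface in down_at:
            start = down_at.pop(iface)
            flaps.append((iface, start, (ts - start).total_seconds()))
    return flaps
```

    Fed the Mar 25 excerpt, this reports two xl0 flaps: 2 seconds (02:00:12 to 02:00:14) and 15 seconds (02:00:40 to 02:00:55).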



  • You can disable the state reset under advanced options.
    Otherwise this is an issue with your cabling/switch/etc., not a pfSense one.



  • @ermal:

    You can disable the state reset under advanced options.

    Perfect.  It might be a good idea to have that state-clearing option log its action, though.  I would never have guessed it was a feature.  Also, maybe add some type of delay: looking at the logs, the first down/up cycle lasts only 2 seconds.
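
    To illustrate the kind of delay I mean, here is a rough hold-down sketch: a link-down only counts as a real outage if the link stays down for at least some grace period. This is just the logic, not how pfSense actually handles state clearing; the name `outages_exceeding` and the `hold_down` parameter are hypothetical.

```python
def outages_exceeding(events, hold_down=5.0):
    """Return the DOWN timestamps of outages lasting at least hold_down seconds.

    events: list of (timestamp_in_seconds, "UP" or "DOWN") tuples in
    chronological order. Shorter flaps are ignored, so they would not
    trigger any follow-up action such as clearing states.
    """
    acted_on = []
    down_since = None
    for ts, state in events:
        if state == "DOWN" and down_since is None:
            down_since = ts
        elif state == "UP" and down_since is not None:
            if ts - down_since >= hold_down:
                acted_on.append(down_since)
            down_since = None
    # A link still down at the end of the input is left out here, because
    # its final duration is not yet known.
    return acted_on
```

    With the two flaps from the logs (2 seconds and 15 seconds) and a 5 second hold-down, only the second one would count.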

    @ermal:

    Otherwise this is an issue with your cabling/switch/etc., not a pfSense one.

    Same cable and hardware as when I was running 1.3, so I kind of doubt it.  If I can find an easy way to move the WAN to another interface, I'll try that and see whether it looks like an 8.x issue with the xl driver.  Odd that it's doing this at 2 a.m. every day, though…

