WAN keeps going down - static ip - replaced hardware



  • Hi

    I have a weird problem that started last week. I had 2.2.4 running perfectly fine for a couple of months, and then for some reason the WAN link started dropping. A simple modem restart (or perhaps even a WAN link disconnect/reconnect on the modem side) would bring things back up.

    The hardware was a Supermicro D525 with an SSD and a dual-gigabit Intel card. The drive was fine and there were no errors on the WAN interface. I called the ISP to replace the modem, thinking it was faulty.

    The modem was replaced and the problem was still happening, so I did a fresh install of 2.2.6 on the same hardware and restored the config. That did not fix anything.

    I then replaced the pfSense box, thinking maybe the WAN gigabit port was bad, and used an old machine that had been in service previously. I had to put a new HD in it, so I installed 2.2.6, and the same thing is happening on the new hardware. At this point I'm completely lost.

    Most of the sites run 2.1.5, but I have a couple on 2.2.4 and 2.2.5. I'm not sure it's a hardware problem at this point. I basically replaced everything between the network and the ISP. Any ideas?

    Thanks

    gateway log
    Jan 27 14:08:27 apinger: ALARM: WANGW(static ip) *** down ***
    Jan 27 14:09:12 apinger: alarm canceled: WANGW(static ip) *** down ***

    system log
    Jan 27 14:08:37 check_reload_status: Reloading filter
    Jan 27 14:08:53 php-fpm[50665]: /rc.newipsecdns: IPSEC: One or more IPsec tunnel endpoints has changed its IP. Refreshing.
    Jan 27 14:08:53 check_reload_status: Reloading filter
    Jan 27 14:08:53 check_reload_status: Linkup starting em0
    Jan 27 14:08:53 kernel: em0: link state changed to DOWN
    Jan 27 14:08:54 php-fpm[4316]: /rc.linkup: Hotplug event detected for WAN(wan) but ignoring since interface is configured with static IP (static ip )
    Jan 27 14:08:56 check_reload_status: Linkup starting em0
    Jan 27 14:08:56 kernel: em0: link state changed to UP
    Jan 27 14:08:57 php-fpm[4316]: /rc.linkup: Hotplug event detected for WAN(wan) but ignoring since interface is configured with static IP (static ip )
    Jan 27 14:08:57 check_reload_status: rc.newwanip starting em0
    Jan 27 14:08:58 php-fpm[4316]: /rc.newwanip: rc.newwanip: Info: starting on em0.
    Jan 27 14:08:58 php-fpm[4316]: /rc.newwanip: rc.newwanip: on (IP address: static ip) (interface: WAN[wan]) (real interface: em0).
    Jan 27 14:08:58 check_reload_status: Reloading filter
    Jan 27 14:09:08 check_reload_status: Linkup starting em0
    Jan 27 14:09:08 kernel: em0: link state changed to DOWN
    Jan 27 14:09:09 php-fpm[4316]: /rc.linkup: Hotplug event detected for WAN(wan) but ignoring since interface is configured with static IP (static ip )
    Jan 27 14:09:11 check_reload_status: Linkup starting em0
    Jan 27 14:09:11 kernel: em0: link state changed to UP
    Jan 27 14:09:12 php-fpm[4316]: /rc.linkup: Hotplug event detected for WAN(wan) but ignoring since interface is configured with static IP (static ip )
    Jan 27 14:09:12 check_reload_status: rc.newwanip starting em0
    Jan 27 14:09:13 php-fpm[4316]: /rc.newwanip: rc.newwanip: Info: starting on em0.
    Jan 27 14:09:13 php-fpm[4316]: /rc.newwanip: rc.newwanip: on (IP address: static ip) (interface: WAN[wan]) (real interface: em0).
    Jan 27 14:09:13 check_reload_status: Reloading filter
    Jan 27 14:09:22 check_reload_status: updating dyndns WANGW
    Jan 27 14:09:22 check_reload_status: Restarting ipsec tunnels
    Jan 27 14:09:22 check_reload_status: Restarting OpenVPN tunnels/interfaces
    Jan 27 14:09:22 check_reload_status: Reloading filter
    Jan 27 14:09:38 php-fpm[4316]: /rc.newipsecdns: IPSEC: One or more IPsec tunnel endpoints has changed its IP. Refreshing.
    Jan 27 14:09:38 check_reload_status: Reloading filter
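
    The lines that matter in a log like this are the kernel "link state changed" entries: they show the physical link on em0 itself going down and coming back a few seconds later, which is a layer-1 event and not something gateway monitoring causes. A minimal Python sketch for pulling those entries out of a saved system log is below. The function names are made up for illustration, and the year is an assumption, since syslog lines carry none. The sample is just the four kernel lines from the log above.

```python
import re
from datetime import datetime

# Matches kernel link-state lines such as:
#   Jan 27 14:08:53 kernel: em0: link state changed to DOWN
LINK_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+kernel:\s+"
    r"(?P<nic>\w+): link state changed to (?P<state>UP|DOWN)"
)


def link_events(lines, year=2016):
    """Return (timestamp, nic, state) for every link-state line.

    The year is an assumption: syslog timestamps do not include one.
    """
    events = []
    for line in lines:
        m = LINK_RE.match(line.strip())
        if m:
            ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
            events.append((ts, m["nic"], m["state"]))
    return events


def outages(events):
    """Pair each DOWN with the next UP on the same NIC.

    Returns a list of (nic, down_timestamp, seconds_down).
    """
    result = []
    down_since = {}
    for ts, nic, state in events:
        if state == "DOWN":
            # Keep the earliest DOWN if several arrive in a row.
            down_since.setdefault(nic, ts)
        elif nic in down_since:
            start = down_since.pop(nic)
            result.append((nic, start, (ts - start).total_seconds()))
    return result


# The four kernel lines from the system log in this post.
SAMPLE = """\
Jan 27 14:08:53 kernel: em0: link state changed to DOWN
Jan 27 14:08:56 kernel: em0: link state changed to UP
Jan 27 14:09:08 kernel: em0: link state changed to DOWN
Jan 27 14:09:11 kernel: em0: link state changed to UP
"""
```

    Calling `outages(link_events(SAMPLE.splitlines()))` gives one entry per flap. Run over a full log file, the same two functions give a timeline of link drops to hand to the ISP.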



  • You still seeing link loss on the WAN NIC? That's generally something pretty basic and almost never a software issue. Sounds like you've replaced most of the hardware involved though. Replaced the patch cables too?



  • hi cmb,

    Yes. At this point I don't know what's causing it, but I'm almost positive it's on the ISP/modem side.

    Steps done:

    • Cable modem was replaced (the issue started happening all of a sudden, so I figured the ISP was the first to contact)
    • Supermicro D525 (#1) with 2.2.4 (was running fine for 6+ with the original cable modem, only the siproxd package installed) - random WAN loss
    • Reflashed #1 with 2.2.6, with the new modem and new patch cables - random WAN loss
    • Replaced it with another box (#2) I had lying around, with an Intel dual-gigabit card and 2.2.6, thinking #1 could have had a bad WAN port - random WAN loss
    • Grabbed a different Supermicro D525 (#3) with 2.1.5 (I have about 13 sites running 2.1.5 with no problems, and this one was in production trouble-free for 1 year+) - random WAN loss
    • Tried turning off gateway monitoring on each setup

    So by now I have gone through three pfSense boxes on different versions, and every single scenario produces the WAN loss. At this point I don't think it's my hardware or even the pfSense software.

    I'm waiting for it to happen again overnight so I can call the cable provider and have them view the modem logs. Whenever it happens, someone at the location power cycles the modem to bring it back up, which flushes the logs, so support can't see what was happening before the WAN disconnect.

    I also have a SonicWall TZ200 that I will put in place tomorrow just to see, but the issue is still here after three pfSense boxes. Yes, the patch cables were replaced multiple times. It's really frustrating trying to pinpoint this problem, but I feel like I have exhausted all the options on my side of the network (that's also what the ISP rep said). If the ISP can't figure it out, I may be forced to switch to a different provider (FiOS), as the business depends on internet connectivity.

    Any advice is appreciated. The cable provider is Optimum (static IP).



  • Maybe a workaround.
    Can you place a router between your modem and your pfSense box?
    Then enable all the log options on the router.
    Then, if the connection fails, you can see whether the pfSense box still has an IP address from the router.
    If the pfSense box still has its IP address and the router has lost its WAN IP address, then the problem
    is on the ISP side.
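
    The same idea can be approximated from a machine on the LAN by probing each hop in order and noting the first one that stops answering. This is only a rough sketch under assumptions: the addresses below are placeholders (203.0.113.1 is a documentation address, not a real gateway), the function names are invented for illustration, and a hop that filters ICMP will look "down" even when it is fine.

```python
import subprocess
import time
from datetime import datetime


def ping(host, count=1):
    """Return True if `host` answers a ping.

    Uses only the -c option, which Linux, BSD and macOS ping share.
    """
    try:
        result = subprocess.run(
            ["ping", "-c", str(count), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=10,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def locate_fault(hops, probe=ping):
    """Probe hops from nearest to farthest.

    Returns the name of the first hop that does not answer,
    or None if every hop is reachable.
    """
    for name, address in hops:
        if not probe(address):
            return name
    return None


def watch(hops, probe=ping, rounds=1, interval=30.0, log=print):
    """Run `rounds` checks, logging a timestamped result for each."""
    for i in range(rounds):
        fault = locate_fault(hops, probe)
        stamp = datetime.now().isoformat(timespec="seconds")
        if fault is None:
            log(f"{stamp} all hops reachable")
        else:
            log(f"{stamp} first unreachable hop: {fault}")
        if i + 1 < rounds:
            time.sleep(interval)


# Placeholder addresses: replace with the real ones for the site.
HOPS = [
    ("intermediate router (LAN side)", "192.168.1.1"),
    ("ISP gateway", "203.0.113.1"),
    ("internet host", "8.8.8.8"),
]
```

    Running something like `watch(HOPS, rounds=2880, interval=30)` overnight leaves a timestamped record that survives a modem power cycle. If the intermediate router keeps answering while the ISP gateway does not, that points at the modem or ISP side.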

    Grtz
    DeLorean



  • Thanks for the suggestion, DeLorean. It happened again overnight, and an ISP tech came out in the morning to inspect the modem logs. It turned out the modem was having issues. I can finally sleep at night.



  • Nice to hear that the problem is solved ;-)

    Grtz
    DeLorean