Netgate Discussion Forum
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Search
    • Register
    • Login

    Rapid DOWN-UP of WAN PHY + IP change causes complete loss of WAN sometimes

    Scheduled Pinned Locked Moved General pfSense Questions
    3 Posts 2 Posters 498 Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • T
      TauCeti
      last edited by

      This has been plaguing me for a long time and now with 2.4.0 stable it's been easier to try to tease out what might be going wrong.

      The setup:
      LAN <–--> SG-1000 <----> HFC Cable Modem
      NOTE: I have zero control over the cable modem as it is controlled by the infrastructure provider (NBN Australia).

      The problem is that Internet connectivity will sometimes completely disappear and not come back unless the SG-1000 is power cycled (have not confirmed yet whether it can be accessed via web console when it is in this state).

      From studying what happens in multiple logs it seems that:

      1. ISP sets WAN IP via DHCP with 150 second renew time.

      2. Seemingly randomly (2-3 times a day or once a week) WAN IP is changed.

      3. A moment before the IP is changed the cable modem will in under one second force the link to pfsense DOWN then UP.

      4. From this point on, it's all bets are off. Either:
        a) ~ 50% of the time IP will update and WAN connection stays
        b) ~ 50% of the time WAN dies. Internet cannot be accessed from the LAN and also the unit cannot be pinged from the Internet.

      In the case where Internet access dies, it seems that the new IP sticks, but some other odd things happen. Combined logs:

      
      07:35:35 dhclient: RENEW
      07:35:35 dhclient: Creating resolv.conf
      07:38:05 dhclient: RENEW
      07:38:05 dhclient: Creating resolv.conf
      07:40:35 dhclient: RENEW
      07:40:35 dhclient: Creating resolv.conf
      07:43:05 dhclient: RENEW
      07:43:05 dhclient: Creating resolv.conf
      07:45:35 dhclient: RENEW
      07:45:35 dhclient: Creating resolv.conf
      07:48:05 dhclient: RENEW
      07:48:05 dhclient: Creating resolv.conf
      07:49:35 kernel: cpsw0: link state changed to DOWN
      07:49:35 kernel: cpsw0: link state changed to UP
      07:49:35 check_reload_status: Linkup starting cpsw0
      07:49:35 check_reload_status: Linkup starting cpsw0
      07:49:36 php-fpm[88066]: /rc.linkup: DEVD Ethernet detached event for wan
      07:49:36 dhclient[9971]: connection closed
      07:49:36 dhclient[9971]: exiting.
      07:49:37 miniupnpd[58525]: ioctl(s, SIOCGIFADDR, ...): Can't assign requested address
      07:49:37 miniupnpd[58525]: Failed to get IP for interface cpsw0
      07:49:37 miniupnpd[58525]: SendNATPMPPublicAddressChangeNotification: cannot get public IP address, stopping
      07:49:37 check_reload_status: Reloading filter
      07:49:37 php-fpm[13091]: /rc.linkup: DEVD Ethernet attached event for wan
      07:49:37 php-fpm[13091]: /rc.linkup: HOTPLUG: Configuring interface wan
      07:49:38 miniupnpd[58525]: SendNATPMPPublicAddressChangeNotification: sendto(s_udp=10): No route to host
      07:49:38 dhclient: PREINIT
      07:49:38 dhclient[59642]: DHCPREQUEST on cpsw0 to 255.255.255.255 port 67
      07:49:38 dhclient[59642]: DHCPNAK from ##ISP GATEWAY##
      07:49:38 dhclient[59642]: DHCPDISCOVER on cpsw0 to 255.255.255.255 port 67 interval 1
      07:49:39 miniupnpd[58525]: ioctl(s, SIOCGIFADDR, ...): Can't assign requested address
      07:49:39 miniupnpd[58525]: Failed to get IP for interface cpsw0
      07:49:39 miniupnpd[58525]: SendNATPMPPublicAddressChangeNotification: cannot get public IP address, stopping
      07:49:39 miniupnpd[58525]: SendNATPMPPublicAddressChangeNotification: sendto(s_udp=10): No route to host
      07:49:39 dhclient[59642]: DHCPDISCOVER on cpsw0 to 255.255.255.255 port 67 interval 2
      07:49:39 dhclient[59642]: DHCPOFFER from ##ISP GATEWAY##
      07:49:39 dhclient: ARPSEND
      07:49:39 dhclient[59642]: DHCPOFFER from ##ISP GATEWAY##
      07:49:39 dhclient[59642]: DHCPOFFER already seen.
      07:49:39 dhclient[59642]: DHCPOFFER from ##ISP GATEWAY 2##
      07:49:39 dhclient: ARPSEND
      07:49:39 dhclient: ARPCHECK
      07:49:39 dhclient[59642]: DHCPREQUEST on cpsw0 to 255.255.255.255 port 67
      07:49:39 dhclient[59642]: DHCPACK from ##ISP GATEWAY##
      07:49:39 dhclient: BOUND
      07:49:39 dhclient: Starting add_new_address()
      07:49:39 dhclient: ifconfig cpsw0 inet ##WAN IP## netmask 255.255.0.0 broadcast ##WAN BROADCAST##
      07:49:39 dhclient: New IP Address (cpsw0): ##WAN IP##
      07:49:39 dhclient: New Subnet Mask (cpsw0): 255.255.0.0
      07:49:39 dhclient: New Broadcast Address (cpsw0): ##WAN BROADCAST##
      07:49:39 dhclient: New Routers (cpsw0): ##ISP GATEWAY##
      07:49:39 dhclient: Adding new routes to interface: cpsw0
      07:49:40 dnsmasq[97316]: exiting on receipt of SIGTERM
      07:49:40 dhclient: /sbin/route add default ##ISP GATEWAY##
      07:49:40 dhclient: Creating resolv.conf
      07:49:40 dhclient[59642]: bound to ##WAN IP## -- renewal in 150 seconds.
      07:49:40 check_reload_status: rc.newwanip starting cpsw0
      07:49:40 php-fpm[13091]: /rc.linkup: ROUTING: setting default route to ##ISP GATEWAY##
      07:49:40 check_reload_status: Restarting ipsec tunnels
      07:49:41 dhcpleases: /etc/hosts changed size from original!
      07:49:41 dhcpleases: Sending HUP signal to dns daemon(97316)
      07:49:41 dhcpleases: Could not deliver signal HUP to process because its pidfile (/var/run/dnsmasq.pid) does not exist, No such process.
      07:49:41 kernel: arpresolve: can't allocate llinfo for ##ISP GATEWAY 2## on cpsw0
      07:49:41 php-fpm[88066]: /rc.newwanip: rc.newwanip: Info: starting on cpsw0.
      07:49:41 kernel: arpresolve: can't allocate llinfo for ##ISP GATEWAY 2## on cpsw0
      07:49:41 php-fpm[88066]: /rc.newwanip: rc.newwanip: on (IP address: ##WAN IP##) (interface: WAN[wan]) (real interface: cpsw0).
      07:49:41 dhcpleases: /etc/hosts changed size from original!
      07:49:41 php-fpm[88066]: /rc.newwanip: IP Address has changed, killing states on former IP Address 106.71.88.7.
      07:49:41 dhcpleases: Could not deliver signal HUP to process because its pidfile (/var/run/dnsmasq.pid) does not exist, No such process.
      07:49:42 dhcpleases: /etc/hosts changed size from original!
      07:49:42 dhcpleases: Sending HUP signal to dns daemon(97316)
      07:49:42 dhcpleases: Could not deliver signal HUP to process because its pidfile (/var/run/dnsmasq.pid) does not exist, No such process.
      07:49:42 dhcpleases: kqueue error: unkown
      07:49:42 dhcpleases: Sending HUP signal to dns daemon(71274)
      07:49:42 dnsmasq[71274]: started, version 2.78 cachesize 10000
      07:49:42 dnsmasq[71274]: compile time options: IPv6 GNU-getopt no-DBus i18n IDN2 DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth DNSSEC loop-detect no-inotify
      07:49:42 dnsmasq[71274]: reading /etc/resolv.conf
      07:49:42 dnsmasq[71274]: using nameserver ##ISP DNS###53
      07:49:42 dnsmasq[71274]: using nameserver ##ISP DNS 2###53
      07:49:42 dnsmasq[71274]: read /etc/hosts - 2 addresses
      07:49:42 dnsmasq[71274]: read /etc/hosts - 4 addresses
      07:49:42 dhcpleases: /etc/hosts changed size from original!
      07:49:42 dhcpleases: Could not deliver signal HUP to process because its pidfile (/var/run/dnsmasq.pid) does not exist, No such process.
      07:49:42 dhcpleases: kqueue error: unkown
      07:49:53 rc.gateway_alarm[85709]: >>> Gateway alarm: WAN_DHCP (Addr:##ISP GATEWAY 2## Alarm:1 RTT:16620ms RTTsd:2417ms Loss:22%)
      07:49:53 check_reload_status: updating dyndns WAN_DHCP
      07:49:53 check_reload_status: Restarting ipsec tunnels
      07:49:53 check_reload_status: Restarting OpenVPN tunnels/interfaces
      07:49:53 check_reload_status: Reloading filter
      
      

      After all that the IP lease is renew via DHCP although the behaviour is a bit different:

      
      07:52:09 dhclient: Creating resolv.conf
      07:52:09 dhclient[66767]: bound to ##WAN IP## -- renewal in 150 seconds.
      07:54:39 dhclient[66767]: DHCPREQUEST on cpsw0 to ##ISP GATEWAY## port 67
      07:54:40 dhclient[66767]: DHCPACK from ##ISP GATEWAY##
      
      

      This is a remote site so it is hard to debug, but I thought of trying to maybe recreate the issue by:

      1. Unplugging then replugging the WAN cable
      2. Power cycling the cable modem with a 1 minute off time

      I've found several issues and threads which are similar but the solution either didn't work for me or it's been patched. The fact that it is 50/50 makes me wonder if this is some race condition in the rc.linkup and/or rc.newwanip scripts which pop up in the logs.

      Any ideas, thoughts on what it might be or other ways I can dig up information would be greatly appreciated.

      Thanks

      EDIT 1:

      • I get the same behaviour with unbound instead of dnsmasq
      • Internet does not go down with the ISP supplied crappy router

      EDIT 2:
      Currently running 2.4.0-RELEASE (arm)

      1 Reply Last reply Reply Quote 0
      • T
        TauCeti
        last edited by

        This is the system log of the fail event but also up until the unit was power cycled.  I noticed further down shortly before the power cycle several of these messages:

        08:26:06 kernel: sonewconn: pcb 0xc3abe840: pru_attach() failed
        08:26:15 kernel: cpsw1: link state changed to DOWN
        08:26:15 kernel: sonewconn: pcb 0xc3abe840: pru_attach() failed
        08:26:27 kernel: cpsw1: link state changed to UP

        where cpsw1 is the LAN

        Also:
        08:31:37 kernel: sonewconn: pcb 0xc3abe840: pru_attach() failed
        08:31:37 kernel: cpsw0: link state changed to DOWN
        08:31:37 kernel: sonewconn: pcb 0xc3abe840: pru_attach() failed
        08:31:40 kernel: cpsw0: link state changed to UP
        08:31:40 kernel: sonewconn: pcb 0xc3abe840: pru_attach() failed

        where cpsw0 is the WAN.

        Didn't really take note of it earlier because I was so focused on the messages at the failure time, but maybe they mean something. There's not a lot of information on the net about these messages.

        Full system.log from start of failure until power cycle:

        
        07:49:35 kernel: cpsw0: link state changed to DOWN
        07:49:35 kernel: cpsw0: link state changed to UP
        07:49:35 check_reload_status: Linkup starting cpsw0
        07:49:35 check_reload_status: Linkup starting cpsw0
        07:49:36 php-fpm[88066]: /rc.linkup: DEVD Ethernet detached event for wan
        07:49:37 check_reload_status: Reloading filter
        07:49:37 php-fpm[13091]: /rc.linkup: DEVD Ethernet attached event for wan
        07:49:37 php-fpm[13091]: /rc.linkup: HOTPLUG: Configuring interface wan
        07:49:40 check_reload_status: rc.newwanip starting cpsw0
        07:49:40 php-fpm[13091]: /rc.linkup: ROUTING: setting default route to ###ISP GATEWAY 1###
        07:49:40 check_reload_status: Restarting ipsec tunnels
        07:49:41 kernel: arpresolve: can't allocate llinfo for ###ISP GATEWAY 2### on cpsw0
        07:49:41 php-fpm[88066]: /rc.newwanip: rc.newwanip: Info: starting on cpsw0.
        07:49:41 kernel: arpresolve: can't allocate llinfo for ###ISP GATEWAY 2### on cpsw0
        07:49:41 php-fpm[88066]: /rc.newwanip: rc.newwanip: on (IP address: ###NEW WAN IP###) (interface: WAN[wan]) (real interface: cpsw0).
        07:49:41 dhcpleases: /etc/hosts changed size from original!
        07:49:41 php-fpm[88066]: /rc.newwanip: IP Address has changed, killing states on former IP Address 106.71.88.7.
        07:49:41 dhcpleases: Could not deliver signal HUP to process because its pidfile (/var/run/dnsmasq.pid) does not exist, No such process.
        07:49:42 dhcpleases: /etc/hosts changed size from original!
        07:49:42 dhcpleases: Could not deliver signal HUP to process because its pidfile (/var/run/dnsmasq.pid) does not exist, No such process.
        07:49:42 dhcpleases: kqueue error: unkown
        07:49:53 rc.gateway_alarm[85709]: >>> Gateway alarm: WAN_DHCP (Addr:###ISP GATEWAY 2### Alarm:1 RTT:16620ms RTTsd:2417ms Loss:22%)
        07:49:53 check_reload_status: updating dyndns WAN_DHCP
        07:49:53 check_reload_status: Restarting ipsec tunnels
        07:49:53 check_reload_status: Restarting OpenVPN tunnels/interfaces
        07:49:53 check_reload_status: Reloading filter
        07:51:36 sshd[82454]: Timeout, client not responding.
        07:59:46 check_reload_status: Linkup starting cpsw1
        07:59:46 kernel: cpsw1: link state changed to DOWN
        08:00:06 kernel: cpsw1: link state changed to UP
        08:00:06 check_reload_status: Linkup starting cpsw1
        08:00:15 check_reload_status: Linkup starting cpsw1
        08:00:15 kernel: cpsw1: link state changed to DOWN
        08:00:28 check_reload_status: Linkup starting cpsw1
        08:00:28 kernel: cpsw1: link state changed to UP
        08:25:43 kernel: cpsw1: link state changed to DOWN
        08:25:43 check_reload_status: Linkup starting cpsw1
        08:26:06 kernel: cpsw1: link state changed to UP
        08:26:06 kernel: sonewconn: pcb 0xc3abe840: pru_attach() failed
        08:26:15 kernel: cpsw1: link state changed to DOWN
        08:26:15 kernel: sonewconn: pcb 0xc3abe840: pru_attach() failed
        08:26:27 kernel: cpsw1: link state changed to UP
        08:26:27 kernel: sonewconn: pcb 0xc3abe840: pru_attach() failed
        08:30:51 kernel: cpsw1: link state changed to DOWN
        08:30:51 kernel: sonewconn: pcb 0xc3abe840: pru_attach() failed
        08:30:55 kernel: cpsw0: link state changed to DOWN
        08:30:55 kernel: sonewconn: pcb 0xc3abe840: pru_attach() failed
        08:31:14 kernel: cpsw1: link state changed to UP
        08:31:14 kernel: sonewconn: pcb 0xc3abe840: pru_attach() failed
        08:31:20 kernel: cpsw1: link state changed to DOWN
        08:31:20 kernel: sonewconn: pcb 0xc3abe840: pru_attach() failed
        08:31:25 kernel: cpsw0: link state changed to UP
        08:31:25 kernel: sonewconn: pcb 0xc3abe840: pru_attach() failed
        08:31:37 kernel: cpsw1: link state changed to UP
        08:31:37 kernel: sonewconn: pcb 0xc3abe840: pru_attach() failed
        08:31:37 kernel: cpsw0: link state changed to DOWN
        08:31:37 kernel: sonewconn: pcb 0xc3abe840: pru_attach() failed
        08:31:40 kernel: cpsw0: link state changed to UP
        08:31:40 kernel: sonewconn: pcb 0xc3abe840: pru_attach() failed
        08:31:43 kernel: cpsw0: link state changed to DOWN
        08:31:43 kernel: sonewconn: pcb 0xc3abe840: pru_attach() failed
        08:33:08 syslogd: kernel boot file is /boot/kernel/kernel
        08:33:08 kernel: Copyright (c) 1992-2017 The FreeBSD Project.
        08:33:08 kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        
        

        [

        1 Reply Last reply Reply Quote 0
        • S
          seanmcne
          last edited by

          Does your wan Access come back if you power cycle your cable modem without rebooting PFSense?  I have, what may be, a similar issue but I can restore internet by rebooting only
          The cable modem. It started the day after I upgraded to 2.4 and unfortunately it is still randomly happening after 2.4.1 - I’m considering rolling back for a while until the dust settles more on 2.4 🙁

          1 Reply Last reply Reply Quote 0
          • First post
            Last post
          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.