Netgate Discussion Forum
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Search
    • Register
    • Login

    Netgate 1100 becoming unresponsive intermittently

    Scheduled Pinned Locked Moved General pfSense Questions
    4 Posts 2 Posters 276 Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • D
      DominikHoffmann
      last edited by

      I have a client with a brand-new Netgate 1100. She reports that it becomes unresponsive intermittently a couple of times a week. When that happens she neither has internet access nor can she log into the WebConfigurator interface. Because in those instances she has just power-cycled the unit, we only recently discovered that the device eventually recovers and starts operating normally again.

      What could be the cause of this malfunction?

      Here are the system logs from today:

      Apr 8 04:18:03	rc.gateway_alarm	39093	>>> Gateway alarm: WAN_DHCP (Addr:100.93.160.1 Alarm:1 RTT:2.930ms RTTsd:1.385ms Loss:21%)
      Apr 8 04:18:03	check_reload_status	463	updating dyndns WAN_DHCP
      Apr 8 04:18:03	check_reload_status	463	Restarting IPsec tunnels
      Apr 8 04:18:03	check_reload_status	463	Restarting OpenVPN tunnels/interfaces
      Apr 8 04:18:03	check_reload_status	463	Reloading filter
      Apr 8 04:18:05	php-fpm	740	/rc.openvpn: Gateway, none 'available' for inet, use the first one configured. 'WAN_DHCP'
      Apr 8 04:18:05	php-fpm	740	/rc.openvpn: Gateway, NONE AVAILABLE
      Apr 8 04:18:05	php-fpm	740	/rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use WAN_DHCP.
      Apr 8 05:06:00	sshguard	68389	Exiting on signal.
      Apr 8 05:06:00	sshguard	78042	Now monitoring attacks.
      Apr 8 06:02:00	sshguard	78042	Exiting on signal.
      Apr 8 06:02:00	sshguard	54186	Now monitoring attacks.
      Apr 8 06:36:00	sshguard	54186	Exiting on signal.
      Apr 8 06:36:00	sshguard	64978	Now monitoring attacks.
      Apr 8 06:58:00	sshguard	64978	Exiting on signal.
      Apr 8 06:58:00	sshguard	20813	Now monitoring attacks.
      Apr 8 07:54:00	sshguard	20813	Exiting on signal.
      Apr 8 07:54:00	sshguard	13378	Now monitoring attacks.
      Apr 8 07:55:00	sshguard	13378	Exiting on signal.
      Apr 8 07:55:00	sshguard	37932	Now monitoring attacks.
      Apr 8 08:50:00	sshguard	37932	Exiting on signal.
      Apr 8 08:50:00	sshguard	96590	Now monitoring attacks.
      Apr 8 09:05:09	check_reload_status	463	rc.newwanip starting mvneta0.4090
      Apr 8 09:05:10	php-fpm	415	/rc.newwanip: rc.newwanip: Info: starting on mvneta0.4090.
      Apr 8 09:05:10	php-fpm	415	/rc.newwanip: rc.newwanip: on (IP address: 100.93.183.31) (interface: WAN[wan]) (real interface: mvneta0.4090).
      Apr 8 09:05:10	check_reload_status	463	Reloading filter
      Apr 8 09:05:28	php-fpm	740	/index.php: webConfigurator authentication error for user 'admin ' from: 192.168.42.122
      Apr 8 09:06:06	rc.gateway_alarm	13935	>>> Gateway alarm: WAN_DHCP (Addr:100.93.160.1 Alarm:0 RTT:3.029ms RTTsd:1.577ms Loss:6%)
      Apr 8 09:06:06	check_reload_status	463	updating dyndns WAN_DHCP
      Apr 8 09:06:06	check_reload_status	463	Restarting IPsec tunnels
      Apr 8 09:06:06	check_reload_status	463	Restarting OpenVPN tunnels/interfaces
      Apr 8 09:06:06	check_reload_status	463	Reloading filter
      Apr 8 09:06:08	php-fpm	415	/rc.openvpn: Gateway, NONE AVAILABLE
      Apr 8 09:06:08	php-fpm	415	/rc.openvpn: Gateway, NONE AVAILABLE
      Apr 8 09:06:08	php-fpm	415	/rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed IP addresses. Reloading endpoints that may use WAN_DHCP.
      Apr 8 10:08:21	php-fpm	416	/index.php: Session timed out for user 'admin' from: 192.168.1.33 (Local Database)
      Apr 8 10:08:40	php-fpm	416	/index.php: Successful login for user 'admin' from: 192.168.1.33 (Local Database)
      Apr 8 10:08:41	php-fpm	416	/index.php: The command '/usr/local/sbin/ping-auth -s > /etc/thoth/thothid 2>/dev/null' returned exit code '127', the output was ''
      
      D 1 Reply Last reply Reply Quote 0
      • D
        DominikHoffmann @DominikHoffmann
        last edited by DominikHoffmann

        I also looked at the DNS Resolver logs and found these suspicious entries:

        Apr 8 09:05:10	unbound	86407	[86407:0] info: service stopped (unbound 1.18.0).
        Apr 8 09:05:10	unbound	86407	[86407:0] info: server stats for thread 0: 1081 queries, 326 answers from cache, 755 recursions, 0 prefetch, 0 rejected by ip ratelimiting
        Apr 8 09:05:10	unbound	86407	[86407:0] info: server stats for thread 0: requestlist max 6 avg 1.49934 exceeded 0 jostled 0
        Apr 8 09:05:10	unbound	86407	[86407:0] info: average recursion processing time 69.285267 sec
        Apr 8 09:05:10	unbound	86407	[86407:0] info: histogram of recursion processing times
        Apr 8 09:05:10	unbound	86407	[86407:0] info: [25%]=5.55147e-07 median[50%]=16.4444 [75%]=101.3
        Apr 8 09:05:10	unbound	86407	[86407:0] info: lower(secs) upper(secs) recursions
        Apr 8 09:05:10	unbound	86407	[86407:0] info: 0.000000 0.000001 340
        Apr 8 09:05:10	unbound	86407	[86407:0] info: 0.000256 0.000512 1
        Apr 8 09:05:10	unbound	86407	[86407:0] info: 0.008192 0.016384 2
        

        I cannot be sure, however, but recursions that are not terminated can cause a system hang. Is this normal?

        I looked at my own appliance’s unbound log, which has recursions that last a small fraction of the time those of my client take:

        Mar 6 22:22:18	unbound	17207	[17207:0] info: 0.131072 0.262144 8
        Mar 6 22:22:18	unbound	17207	[17207:0] info: server stats for thread 1: 229 queries, 72 answers from cache, 157 recursions, 0 prefetch, 0 rejected by ip ratelimiting
        Mar 6 22:22:18	unbound	17207	[17207:0] info: server stats for thread 1: requestlist max 5 avg 0.55414 exceeded 0 jostled 0
        Mar 6 22:22:18	unbound	17207	[17207:0] info: average recursion processing time 0.039812 sec
        Mar 6 22:22:18	unbound	17207	[17207:0] info: histogram of recursion processing times
        Mar 6 22:22:18	unbound	17207	[17207:0] info: [25%]=0.00949527 median[50%]=0.0224619 [75%]=0.052992
        Mar 6 22:22:18	unbound	17207	[17207:0] info: lower(secs) upper(secs) recursions
        Mar 6 22:22:18	unbound	17207	[17207:0] info: 0.000000 0.000001 5
        
        1 Reply Last reply Reply Quote 0
        • D DominikHoffmann referenced this topic on
        • stephenw10S
          stephenw10 Netgate Administrator
          last edited by

          You have an OpenVPN gateway configured on the system?

          Make sure the system default gateway in Sys > Routing > Gateways is set to WAN_DHCP specifically and not automatic if so. Otherwise if the WAN gateway ever goes down the system may try to switch to a different gateway and choose the OpenVPN gateway which is probably invalid. It will not switch back unless that too goes down or the system is rebooted.

          Steve

          D 1 Reply Last reply Reply Quote 1
          • D
            DominikHoffmann @stephenw10
            last edited by

            @stephenw10: Thanks very much! Excellent point!

            1 Reply Last reply Reply Quote 1
            • First post
              Last post
            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.