Netgate Discussion Forum

    pfSense 2.6 and 2.7 crash on Zotac Mini PC

    General pfSense Questions
    16 Posts 2 Posters 1.3k Views
    • T
      tzalmaves
      last edited by tzalmaves

      Hello all,

      pfSense 2.6 and 2.7 crash every few days on my Zotac Mini PC (ZBOX-CI323NANO). By "crash", I mean that from my wired PC the webUI and SSH console are unavailable and the box cannot be pinged. Devices connected to my stand-alone wireless access points lose internet connectivity, although this last time I noticed that my wired PC could still use a web browser, for example. To recover, I must press and hold the power button to turn it off, then turn it back on.

      This issue did not exist in pfSense 2.4.4. It started when I upgraded to 2.6.0. I upgraded to 2.7.0 in hopes of a fix, but the crashes continue under 2.7.0.

      /var/crash contains only one file, minfree, whose contents are "2048" (without quotes).

      I have the Realtek 1.98 driver installed, and it is loaded at boot time.

      Under System > Advanced > Networking, I have Disable hardware checksum offload, Disable hardware TCP segmentation offload, Disable hardware large receive offload, and Enable the ALTQ support for hn NICs checked.

      This box only serves the computers and devices in my home, so I wouldn't consider it high load. CPU usage seems to be 3% to 10%, and memory usage is 11%. Temperature is reported as 26.9C.

      I have 4 VLANs set up to segment devices from one another.

      Using cron to reboot the machine every 24 hours doesn't help; it still crashes between reboots.

      Any help would be greatly appreciated. My next step would be to buy a Netgate 1100 appliance, but I'm hoping it won't come to that.

      • stephenw10S
        stephenw10 Netgate Administrator
        last edited by

        Is it still responsive at the local console on the box itself? Any errors shown there?

        Do you see a crash report after it reboots? Any errors logged?

        Steve

          • T
            tzalmaves @stephenw10
            last edited by

            @stephenw10 No, there's no crash report shown in the GUI when I log in after reboot.

            To see errors at the local console, I need to log in and let it sit at the console menu, correct?

            • stephenw10S
              stephenw10 Netgate Administrator
              last edited by

              Most significant errors are shown on the console whether or not you're logged in. If it lost access to the boot drive, for example, you would see a bunch of disk errors.

              If it is still responsive and there are no errors then you should try to connect out from the command line and see what still works, if anything, and what errors that produces.

              Steve

              • T
                tzalmaves @stephenw10
                last edited by

                @stephenw10 OK thanks.

                In /var/log, based on the symptoms above, what should I be looking for? Could it be a DHCP problem?

                • stephenw10S
                  stephenw10 Netgate Administrator
                  last edited by

                  No, you wouldn't lose access from the LAN side if it was DHCP.

                  It could be the LAN side NIC seeing some error. Is it a Realtek NIC?
                  I would expect to see something logged after rebooting from the driver though.

                  Some Zotac boxes had issues with a conflicting driver for the sd/mmc device that required it to be disabled. That prevented it from booting entirely, though; I haven't seen it happen after boot.

                  If you find the console is still responsive, run dmesg and cat /var/log/system.log and see what errors are shown.
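
A minimal sketch of that console check, assuming a POSIX /bin/sh on the firewall. The helper name `scan_errors` and its keyword list are illustrative assumptions, not part of pfSense; widen or narrow the pattern as needed.

```shell
#!/bin/sh
# scan_errors [FILE...]
# Print (with line numbers) every line that contains a common failure keyword.
# Reads stdin when no file is given. The keyword list is an assumption chosen
# for illustration, not an exhaustive set of driver or disk messages.
scan_errors() {
    grep -Ein 'error|fail|timeout|watchdog|panic' "$@"
}

# Example usage at the console (not executed here):
#   dmesg | scan_errors
#   scan_errors /var/log/system.log
```

Lines it prints are only candidates to read more closely; a clean result does not prove that nothing went wrong.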

                  • T
                    tzalmaves @stephenw10
                    last edited by

                    @stephenw10 Here's system.log from the time of the crash. Note the gap in the log timestamps. It looks like logging failed even though pfSense kept working well enough that I didn't notice the failure until almost 2 hours later, when I rebooted.

                    Aug 24 18:44:36  sshd[15774]: Disconnected from authenticating user root 177.19.162.241 port 43266 [preauth]
                    Aug 24 18:44:36  sshguard[25997]: Attack from "177.19.162.241" on service SSH with danger 10.
                    Aug 24 18:45:02  sshd[87435]: Invalid user network from 192.99.59.56 port 56558
                    Aug 24 18:45:02  sshguard[25997]: Attack from "192.99.59.56" on service SSH with danger 10.
                    Aug 24 18:45:02  sshd[87435]: Received disconnect from 192.99.59.56 port 56558:11: Bye Bye [preauth]
                    Aug 24 18:45:02  sshd[87435]: Disconnected from invalid user network 192.99.59.56 port 56558 [preauth]
                    Aug 24 18:45:02  sshguard[25997]: Attack from "192.99.59.56" on service SSH with danger 10.
                    Aug 24 18:45:39  sshguard[25997]: 191.242.105.133: unblocking after 969 secs
                    Aug 24 18:46:41  sshd[10619]: Received disconnect from 191.242.105.133 port 60080:11: Bye Bye [preauth]
                    Aug 24 18:46:41  sshd[10619]: Disconnected from authenticating user root 191.242.105.133 port 60080 [preauth]
                    Aug 24 18:46:41  sshguard[25997]: Attack from "191.242.105.133" on service SSH with danger 10.
                    Aug 24 18:46:53  sshguard[25997]: 181.49.178.6: unblocking after 995 secs
                    Aug 24 20:21:39  syslogd: kernel boot file is /boot/kernel/kernel
                    Aug 24 20:21:39  kernel: ---<<BOOT>>---
                    Aug 24 20:21:39  kernel: Copyright (c) 1992-2023 The FreeBSD Project.
                    Aug 24 20:21:39  kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
                    Aug 24 20:21:39  kernel: 	The Regents of the University of California. All rights reserved.
                    Aug 24 20:21:39  kernel: FreeBSD is a registered trademark of The FreeBSD Foundation.
                    Aug 24 20:21:39  kernel: FreeBSD 14.0-CURRENT #1 RELENG_2_7_0-n255866-686c8d3c1f0: Wed Jun 28 04:21:19 UTC 2023
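
A gap like the one between 18:46:53 and 20:21:39 above can be found mechanically instead of by eye. The following is a rough sketch, assuming the classic syslog "Mon DD HH:MM:SS" prefix shown in the excerpt; `find_log_gaps` is a hypothetical helper name, and the arithmetic assumes both timestamps fall within the same month.

```shell
#!/bin/sh
# find_log_gaps FILE [MIN_SECONDS]
# Report consecutive log lines whose timestamps are at least MIN_SECONDS
# apart (default 600). Only the day of month and the time are compared, so a
# gap that spans a month boundary is missed (assumption made for simplicity).
find_log_gaps() {
    awk -v min="${2:-600}" '
    $3 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
        split($3, t, ":")
        now = $2 * 86400 + t[1] * 3600 + t[2] * 60 + t[3]
        if (seen && now - prev >= min)
            printf "gap of %d s: %s -> %s\n", now - prev, stamp, $1 " " $2 " " $3
        prev = now
        stamp = $1 " " $2 " " $3
        seen = 1
    }' "$1"
}

# Example usage (not executed here):
#   find_log_gaps /var/log/system.log 600
```

Run against the excerpt above, the last line before the silence is the most useful one, since it bounds when logging stopped.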
                    
                    • stephenw10S
                      stephenw10 Netgate Administrator
                      last edited by

                      Looks like you have SSH open to the world which is generally not a good idea.

                      Nothing logged at all like that could be a failing drive. It can take a while after the drive goes AWOL for the firewall services to all fail.
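
To gauge how much unsolicited SSH traffic is arriving, the sshguard lines in a log such as the excerpt earlier in the thread can be reduced to a list of distinct source addresses. This is a sketch only: `ssh_attack_sources` is a hypothetical helper name, and it relies on the `Attack from "IP"` wording seen in that excerpt.

```shell
#!/bin/sh
# ssh_attack_sources [FILE...]
# Print each distinct address that sshguard reported as an attack source.
# Reads stdin when no file is given. The message format is taken from the
# log excerpt in this thread and may differ in other sshguard versions.
ssh_attack_sources() {
    sed -n 's/.*sshguard.*Attack from "\([^"]*\)".*/\1/p' "$@" | sort -u
}

# Example usage (not executed here):
#   ssh_attack_sources /var/log/system.log | wc -l
```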

                      • T
                        tzalmaves @stephenw10
                        last edited by

                        @stephenw10 I'm not seeing anything that indicates a failing drive, am I missing something?

                        >smartctl --test long /dev/ada0
                        smartctl 7.3 2022-02-28 r5338 [FreeBSD 14.0-CURRENT amd64] (local build)
                        Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org
                        
                        === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
                        Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
                        Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
                        Testing has begun.
                        Please wait 2 minutes for test to complete.
                        Test will complete after Fri Aug 25 09:11:00 2023 EDT
                        Use smartctl -X to abort test.
                        
                        smartctl -a /dev/ada0
                        smartctl 7.3 2022-02-28 r5338 [FreeBSD 14.0-CURRENT amd64] (local build)
                        Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org
                        
                        === START OF INFORMATION SECTION ===
                        Device Model:     SC2 MSATA SSD
                        Serial Number:    39DD07471ED000000340
                        Firmware Version: S9FM01.9
                        User Capacity:    60,022,480,896 bytes [60.0 GB]
                        Sector Size:      512 bytes logical/physical
                        Rotation Rate:    Solid State Device
                        Form Factor:      2.5 inches
                        TRIM Command:     Available
                        Device is:        Not in smartctl database 7.3/5319
                        ATA Version is:   ACS-3 (minor revision not indicated)
                        SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
                        Local Time is:    Fri Aug 25 09:14:31 2023 EDT
                        SMART support is: Available - device has SMART capability.
                        SMART support is: Enabled
                        
                        === START OF READ SMART DATA SECTION ===
                        SMART overall-health self-assessment test result: PASSED
                        
                        General SMART Values:
                        Offline data collection status:  (0x00) Offline data collection activity
                                                                was never started.
                                                                Auto Offline Data Collection: Disabled.
                        Self-test execution status:      (   0) The previous self-test routine completed
                                                                without error or no self-test has ever
                                                                been run.
                        Total time to complete Offline
                        data collection:                (   30) seconds.
                        Offline data collection
                        capabilities:                    (0x7b) SMART execute Offline immediate.
                                                                Auto Offline data collection on/off support.
                                                                Suspend Offline collection upon new
                                                                command.
                                                                Offline surface scan supported.
                                                                Self-test supported.
                                                                Conveyance Self-test supported.
                                                                Selective Self-test supported.
                        SMART capabilities:            (0x0003) Saves SMART data before entering
                                                                power-saving mode.
                                                                Supports SMART auto save timer.
                        Error logging capability:        (0x01) Error logging supported.
                                                                General Purpose Logging supported.
                        Short self-test routine
                        recommended polling time:        (   1) minutes.
                        Extended self-test routine
                        recommended polling time:        (   2) minutes.
                        Conveyance self-test routine
                        recommended polling time:        (   2) minutes.
                        
                        SMART Attributes Data Structure revision number: 16
                        Vendor Specific SMART Attributes with Thresholds:
                        ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
                          1 Raw_Read_Error_Rate     0x000a   100   100   000    Old_age   Always       -       0
                          9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       23637
                         12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       3175
                        168 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       -       0
                        170 Unknown_Attribute       0x0013   100   100   010    Pre-fail  Always       -       21
                        173 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       674825059
                        192 Power-Off_Retract_Count 0x0012   100   100   000    Old_age   Always       -       53
                        194 Temperature_Celsius     0x0023   070   070   000    Pre-fail  Always       -       30
                        218 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       0
                        241 Total_LBAs_Written      0x0012   100   100   000    Old_age   Always       -       16018186
                        
                        SMART Error Log Version: 1
                        No Errors Logged
                        
                        SMART Self-test log structure revision number 1
                        Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
                        # 1  Extended offline    Completed without error       00%        12         -
                        
                        SMART Selective self-test log data structure revision number 0
                        Note: revision number not 1 implies that no selective self-test has ever been run
                         SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
                            1        0        0  Not_testing
                            2        0        0  Not_testing
                            3        0        0  Not_testing
                            4        0        0  Not_testing
                            5        0        0  Not_testing
                        Selective self-test flags (0x0):
                          After scanning selected spans, do NOT read-scan remainder of disk.
                        If Selective self-test is pending on power-up, resume after 0 minute delay.
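
For repeated checks, the few lines that matter in `smartctl -a` output can be filtered out rather than read by eye. A sketch under stated assumptions: `smart_summary` is a hypothetical helper, and its patterns match only the wording that appears in the output above, which may differ between drives and smartctl versions.

```shell
#!/bin/sh
# smart_summary [FILE...]
# Keep the overall health verdict, the error-log summary, and the self-test
# log entries from smartctl -a output. Reads stdin when no file is given.
# Patterns are copied from the output posted in this thread (assumption:
# other drives may word these lines differently).
smart_summary() {
    grep -E 'overall-health|No Errors Logged|^# *[0-9]+ ' "$@"
}

# Example usage (not executed here):
#   smartctl -a /dev/ada0 | smart_summary
```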
                        
                        • stephenw10S
                          stephenw10 Netgate Administrator
                          last edited by

                          Nope, I don't see anything there either.

                          • T
                            tzalmaves @stephenw10
                            last edited by

                            @stephenw10 Hmm, OK, so if there's nothing in /var/crash, the system log shows nothing on failure, and the disk looks OK, is there anything else to do or try?

                            • stephenw10S
                              stephenw10 Netgate Administrator
                              last edited by

                              Wait for it to fail again and check on the local console. If it's responsive, try connecting out.

                              • T
                                tzalmaves @stephenw10
                                last edited by tzalmaves

                                @stephenw10 Just to provide more information, here are my installed packages:

                                [image: 1074f1a4-68d9-4421-a726-c1adef500a2c-image.png, screenshot of installed packages, not preserved]

                                I have not installed any system patches; I just installed the package.

                                • T
                                  tzalmaves @stephenw10
                                  last edited by

                                  @stephenw10 When you say "connect out", do you mean ping, curl, etc., or something else?

                                  • stephenw10S
                                    stephenw10 Netgate Administrator
                                    last edited by

                                    Any of those. I would start with ping to a local IP, then to some remote IP.
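
The local-then-remote ping sequence can be sketched as a small script to run from the console during the next failure. Everything specific here is an assumption: the two addresses are placeholders to replace with a real LAN host and a public IP, the function names are hypothetical, and the hints printed by `diagnose` are rough starting points rather than conclusions from this thread.

```shell
#!/bin/sh
# Placeholder addresses (assumptions): set these to a host on your LAN and
# to any public IP you trust to answer ICMP.
LOCAL_IP="${LOCAL_IP:-192.168.1.10}"
REMOTE_IP="${REMOTE_IP:-1.1.1.1}"

# reachable HOST: succeed if HOST answers at least one of three pings.
reachable() {
    ping -c 3 "$1" >/dev/null 2>&1
}

# diagnose LOCAL_OK REMOTE_OK: turn two yes/no results into a rough hint.
diagnose() {
    case "$1/$2" in
        yes/yes) echo "local and remote both answer; look at services rather than links" ;;
        yes/no)  echo "local answers, remote does not; look at the WAN side" ;;
        no/yes)  echo "remote answers, local does not; look at the LAN side" ;;
        *)       echo "nothing answers; look at the NICs or a wider hang" ;;
    esac
}

# run_check: ping the local address first, then the remote one.
run_check() {
    l=no
    r=no
    reachable "$LOCAL_IP" && l=yes
    reachable "$REMOTE_IP" && r=yes
    diagnose "$l" "$r"
}

# Example usage at the console (not executed here):
#   LOCAL_IP=192.168.1.10 REMOTE_IP=1.1.1.1 run_check
```

Keeping the ping step separate from the interpretation makes it easy to note the raw yes/no results alongside whatever the console shows at the time.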

                                    Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.