Netgate Discussion Forum

    How to automate fsck? (SG-2440)

    • pfbolt

      So, I rebooted an SG-2440 at a remote site, and it didn't come back up.

      I went over there, plugged in the console cable, pressed <enter>, and got a # prompt.

      Stupidly, instead of poking around, I typed "exit", and it immediately booted, complaining about some fsck fixes it had to do.
      Thus I don't know what kind of shell I was in, or why. What was printed to the console before I connected is lost.

      Then it gave me a ton of lines like these:
      ----------------------------------------------------------------
      281.465158 [ 274] generic_find_num_queues  called, in txq 0 rxq 0
      281.476184 [ 799] generic_netmap_dtor      Restored native NA 0
      281.483175 [ 266] generic_find_num_desc    called, in tx 1024 rx 1024
      281.490567 [ 274] generic_find_num_queues  called, in txq 0 rxq 0
      281.497547 [ 799] generic_netmap_dtor      Restored native NA 0
      281.504807 [ 266] generic_find_num_desc    called, in tx 1024 rx 1024
      281.512232 [ 274] generic_find_num_queues  called, in txq 0 rxq 0
      done.
      281.519241 [ 799] generic_netmap_dtor      Restored native NA 0
      281.526864 [ 266] generic_find_num_desc    called, in tx 1024 rx 1024
      281.534269 [ 274] generic_find_num_queues  called, in txq 0 rxq 0
      281.541352 [ 799] generic_netmap_dtor      Restored native NA 0
      281.548217 [ 266] generic_find_num_desc    called, in tx 1024 rx 1024
      281.555776 [ 274] generic_find_num_queues  called, in txq 0 rxq 0
      281.562758 [ 799] generic_netmap_dtor      Restored native NA 0
      281.569795 [ 266] generic_find_num_desc    called, in tx 1024 rx 1024
      281.577263 [ 274] generic_find_num_queues  called, in txq 0 rxq 0
      281.584263 [ 799] generic_netmap_dtor      Restored native NA 0
      281.595263 [ 266] generic_find_num_desc    called, in tx 1024 rx 1024
      281.603180 [ 274] generic_find_num_queues  called, in txq 0 rxq 0
      281.610788 [ 799] generic_netmap_dtor      Restored native NA 0
      Starting NTP tim281.618288 [ 266] generic_find_num_desc    called, in tx 1024 rx 1024
      e client…281.627131 [ 274] generic_find_num_queues  called, in txq 0 rxq 0
      281.635177 [ 799] generic_netmap_dtor      Restored native NA 0
      281.642505 [ 266] generic_find_num_desc    called, in tx 1024 rx 1024
      281.650094 [ 274] generic_find_num_queues  called, in txq 0 rxq 0
      281.657131 [ 799] generic_netmap_dtor      Restored native NA 0
      281.664235 [ 266] generic_find_num_desc    called, in tx 1024 rx 1024
      281.671654 [ 274] generic_find_num_queues  called, in txq 0 rxq 0
      281.678689 [ 799] generic_netmap_dtor      Restored native NA 0
      281.685705 [ 266] generic_find_num_desc    called, in tx 1024 rx 1024
      281.693152 [ 274] generic_find_num_queues  called, in txq 0 rxq 0
      281.702990 [ 799] generic_netmap_dtor      Restored native NA 0
      done.
      Starting DHCP service…done.
      Configuring firewall.....0 addresses deleted.
      0 addresses deleted.
      .done.
      Generating RRD graphs...done.
      Starting syslog...done.
      [boot process continues…..]
      ----------------------------------------------------------------

      Well, the box is up now, but what the hey?
      Is it normal for these to get stuck at fsck and require manual intervention?
      What do all the generic_ lines mean?

      • doktornotor

        Wait for 2.4 (or use the snapshots) and switch to ZFS. UFS and its fsck are totally broken and unfixable. Trying to fix things with fsck will eventually destroy the filesystem.

        • pfbolt

          Guessing that'll mean a reinstall, then…
          Any idea about the generic_ lines?

          • doktornotor

            Yes, reinstall is the only way to fix UFS.

            I've filed a multitude of bugs about UFS and fsck. fsck is so broken that it needs multiple successive manual runs to even try to repair the filesystem, and then it gets all sorts of things wrong, segfaults, or spits out various confused nonsense, and eventually screws up the filesystem to the point where you cannot boot any more.

            I got the patch below from one of the pfSense devs for debugging. It runs fsck much more aggressively, but as noted above, the only result in the end was complete FS destruction. Also, it would apparently need updating for 2.3.2 or newer.

            
            diff --git a/src/etc/rc b/src/etc/rc
            index e82a5ba..970fa9c 100755
            --- a/src/etc/rc
            +++ b/src/etc/rc
            @@ -54,7 +54,7 @@ fi
            
             if [ -e /root/force_fsck ]; then
             	echo "Forcing filesystem(s) check..."
            -	/sbin/fsck -y -F -t ufs
            +	/sbin/fsck -y
             fi
            
             if [ "${PLATFORM}" != "cdrom" ]; then
            @@ -77,18 +77,37 @@ if [ "${PLATFORM}" != "cdrom" ]; then
            
             	if [ ${FSCK_ACTION_NEEDED} = 1 ]; then
             		echo "WARNING: Trying to recover filesystem from inconsistency..."
            -		/sbin/fsck -yF
            +		ntries=0
            +		fsck_rc=1
            +		until [ $ntries -ge 3 -o $fsck_rc -eq 0 ]; do
            +			/sbin/fsck -y
            +			fsck_rc=$?
            +			ntries=$((ntries+1))
            +			echo "DEBUG: Run #${ntries} - rc = ${fsck_rc}"
            +			sleep 1
            +
            +			# Sometimes first call returns 0 but filesystem is still broken
            +			# Run fsck in preen mode again just to be sure
            +			/sbin/fsck -p -F
            +			fsck_rc=$?
            +			echo "DEBUG: (-p) #${ntries} - rc = ${fsck_rc}"
            +			sleep 1
            +		done
            +
            +		if [ $fsck_rc -ne 0 ]; then
            +			echo "Automatic filesystem recovery failed. Starting recovery shell!"
            +			tcsh
            +			reboot
            +		fi
             	fi
            
             	/sbin/mount -a 2>/dev/null
            -	mount_rc=$?
            -	attempts=0
            -	while [ ${mount_rc} -ne 0 -a ${attempts} -lt 3 ]; do
            -		/sbin/fsck -yF
            -		/sbin/mount -a 2>/dev/null
            -		mount_rc=$?
            -		attempts=$((attempts+1))
            -	done
            +
            +	if [ $? -ne 0 ]; then
            +		echo "Filesystems could not be mounted. Starting recovery shell!"
            +		tcsh
            +		reboot
            +	fi
            
             	if [ "${PLATFORM}" = "nanobsd" ]; then
             		# XXX This script does need all filesystems rw!!!!
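
For anyone who wants to experiment with the retry idea outside of /etc/rc, the loop in the patch above can be sketched as a small standalone function. This is only an illustration under assumptions: run_with_retries is a made-up helper name, not something shipped with pfSense, and the sketch leaves out the extra preen-mode pass (fsck -p -F) that the patch runs after each attempt.

```shell
#!/bin/sh
# Sketch of a bounded retry loop, modelled on the patch above.
# run_with_retries is a hypothetical helper name, not part of pfSense.
#
# Usage: run_with_retries MAX_TRIES COMMAND [ARGS...]
# Runs COMMAND until it exits 0 or MAX_TRIES attempts have been made.
# Returns the exit status of the last attempt (1 if no attempt was made).
run_with_retries() {
	max_tries=$1
	shift
	ntries=0
	rc=1
	while [ "$ntries" -lt "$max_tries" ] && [ "$rc" -ne 0 ]; do
		"$@"
		rc=$?
		ntries=$((ntries + 1))
		echo "DEBUG: run #${ntries} - rc = ${rc}"
	done
	return "$rc"
}

# Example call (assumption: issued from a recovery shell, as in the patch):
#   run_with_retries 3 /sbin/fsck -y || echo "Automatic filesystem recovery failed."
```

Passing the command as arguments keeps the loop itself free of any fsck specifics, so it can be tried with a harmless command first. Given the experience described above, repeated fsck runs may well not rescue a damaged UFS volume, so treat this as a debugging aid only.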
            
            
            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.