Netgate Discussion Forum

    Diagnosing a "Dead" box

    General pfSense Questions
    • Stewart

      I have an APU2 unit that appears to have died, but I'm not sure what happened. It was working until about 10:00 am the other day, and then it just stopped passing traffic. I've tried getting into the GUI but it doesn't respond. Sniffing with Wireshark shows no packets. A console cable gave me nothing until I rebooted.

      On reboot it boots pfSense and everything seems to load fine until it reaches the line "Bootup Complete", where it just hangs. Normally a hang at that point is a 9600 vs. 115200 baud rate issue, but that isn't the case here. This unit is on 2.4.4-p3 and the bootup shows "/boot/config: -S115200 -h". I reinstalled from a config backup and it lets me in on 115200. On every boot it checks the filesystem and reports it clean. Booting to Single User Mode gives similar results: it appears to boot normally until "Bootup Complete", where it hangs.

      We have a second client that appears to have had something similar happen at about the same time. Their unit is running, but you can't get into the GUI or SSH, and the console doesn't respond. I'm afraid that if I reboot it, it'll do the same thing as the first client's unit. There is no connection between the two devices: different companies, different industries, different cities. The only thing they share is the ISP. I've built another unit to replace the second one if need be, but I'd like to learn how to troubleshoot the issue. Any advice would be great. Thanks.

      • stephenw10 Netgate Administrator @Stewart

        If it gets to 'Bootup complete', that's beyond the point where it would fail because of a baud rate mismatch, which would be right after the bootloader. That usually indicates the serial console is not enabled: you see all the boot output but no console menu. Check the config file and make sure the <system> section contains:

        		<enableserial></enableserial>
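
For anyone following along, here is a rough sketch of that check as a small POSIX sh function. It only looks for the tag anywhere in the file rather than parsing the <system> section, and the function name has_enableserial is made up for this example. On a running box the live config is normally /cf/conf/config.xml; on a drive pulled from the unit the path depends on where you mounted it.

```sh
#!/bin/sh
# has_enableserial is a hypothetical helper name, not a pfSense tool.
# It reports whether a pfSense config file contains the <enableserial> tag.
# Exit status: 0 = tag present, 1 = tag missing, 2 = file not readable.
has_enableserial() {
    cfg="$1"
    if [ ! -r "$cfg" ]; then
        echo "cannot read $cfg" >&2
        return 2
    fi
    # The pattern matches <enableserial></enableserial> as well as a
    # self-closing <enableserial/> form, should one appear.
    if grep -q '<enableserial' "$cfg"; then
        echo "serial console tag found in $cfg"
    else
        echo "no <enableserial> tag in $cfg"
        return 1
    fi
}
```

Example: has_enableserial /cf/conf/config.xml on a live system, or has_enableserial /mnt/drive/cf/conf/config.xml if the old disk is mounted at /mnt/drive (that mount point is an assumption).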
        

        What image was used to install to this APU?

        Steve

        • JKnott @Stewart

          @stewart

          FWIW, I had a problem last year where the performance dropped. I rebooted and it wouldn't come up. It would partially load, but I couldn't get anything to work. I wound up replacing the computer with the one in my sig. Incidentally, I had previously used the old computer for a Linux firewall and I couldn't even install Linux on it again, so that was a definite clue the hardware had failed.

          PfSense running on Qotom mini PC
          i5 CPU, 4 GB memory, 32 GB SSD & 4 Intel Gb Ethernet ports.
          UniFi AC-Lite access point

          I haven't lost my mind. It's around here...somewhere...

          • Stewart @stephenw10

            @stephenw10 It was made from the USB installer something like 3 years ago. It's worked fine up until this point.

            How do I get to the config file? I guess I could pull the drive and plug it into the USB port of another unit. I have an internal mSATA adapter that I can power externally and plug into the SATA port. I'm a bit rusty on my mount commands, though. I'll give it a shot.

            • Stewart @JKnott

              @jknott The unit seems fine. Could be the drive, though. Usually I can get in to run fsck but that doesn't seem to be an option so far.

              • stephenw10 Netgate Administrator

                You said in the first post that you installed a config backup, so I assumed you had access to backups to check?

                • Stewart @stephenw10

                  @stephenw10 Oh, yes, it's enabled in the old configs. I got the old drive mounted:
                  mkdir /mnt/drive
                  mount /dev/ada1s1 /mnt/drive

                  Looks like it ran out of space. It's that old Suricata log bug where the log doesn't rotate and just keeps being written. I thought I had fixed all of those. Once that happens, the config file gets corrupted with whole sections missing. That's why it seems to boot but doesn't.
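
In case it helps someone checking a mounted drive for the same thing, this is a minimal sketch of how to find what is eating the space. The function name largest_files is made up for this example; it relies only on standard find, du, sort and head. Running df -h /mnt/drive first shows whether the filesystem is actually full.

```sh
#!/bin/sh
# largest_files is a hypothetical helper name.
# Print the N largest regular files under a directory, biggest first,
# as "size-in-KB<TAB>path". N defaults to 10.
largest_files() {
    dir="$1"
    count="${2:-10}"
    # -xdev keeps find on the filesystem that holds $dir.
    find "$dir" -xdev -type f -exec du -k {} + 2>/dev/null \
        | sort -rn \
        | head -n "$count"
}
```

Example: largest_files /mnt/drive/var/log 5. The Suricata package usually logs under /var/log/suricata, so that is the first place I would expect oversized files to show up, but treat the path as an assumption for your install.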

                  I'll be looking at the second unit today. Maybe the same issue? Guess we'll see.

                  • stephenw10 Netgate Administrator

                    Hmm, interesting. If the config is corrupted it should try to load the last good config.

                    • Stewart @stephenw10

                      @stephenw10 Back when we got bit by this bug a few times, it never seemed to load a good config. However, if you look at the config backups, there are generally several that are 100 KB or more (can't remember exactly), but the last few are just a few KB in size. Perhaps it does load the last config, but that one is also corrupted, and it can't do anything about it since there is no disk space to work with. That's just been my assumption.
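
A rough sketch of how to pick out the last backup that still looks whole, based on the size difference described above. The function name newest_intact_config, the 10 KB default threshold, and the check for a closing </pfsense> tag near the end of the file are all assumptions for this example, not how pfSense itself chooses a fallback config. It also assumes the backup directory and file names contain no whitespace, which holds for the usual config-<timestamp>.xml names in /cf/conf/backup.

```sh
#!/bin/sh
# newest_intact_config is a hypothetical helper name.
# Print the newest config-*.xml in a directory that is at least MIN_BYTES
# long and has the closing </pfsense> tag within its last few lines.
# Exit status is 1 if no file qualifies.
newest_intact_config() {
    dir="$1"
    min_bytes="${2:-10240}"
    # ls -t lists newest first; file names are assumed whitespace-free.
    for f in $(ls -t "$dir"/config-*.xml 2>/dev/null); do
        size=$(wc -c < "$f" | tr -d ' ')
        if [ "$size" -ge "$min_bytes" ] && tail -n 5 "$f" | grep -q '</pfsense>'; then
            echo "$f"
            return 0
        fi
    done
    echo "no intact config found in $dir" >&2
    return 1
}
```

Example: newest_intact_config /mnt/drive/cf/conf/backup 51200 (the mount point is an assumption). A file that passes both checks can still have sections missing, so it is worth opening and comparing it against a known good config before restoring it.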

                      As for the second unit, everything was fine after a reboot. Disk space is fine and there are no errors in the logs; I just couldn't access the box via GUI, SSH, or console. Strange. It was on 2.4.5-p1, so I've upgraded it to 2.5.2 and we'll see how it goes.

                      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.