Netgate Discussion Forum
    all services fail to start all packages gone

    General pfSense Questions
    • wgstarks

      Upgraded to 21.05.1 yesterday and everything was working properly after the upgrade, so I'm not sure if this is related.

      Got a couple of notifications from watchdog that all services had stopped and were being restarted, so I logged into the firewall and found that all services were stopped, all packages were gone, and there aren't any logs. I restarted the 3100 but the issues persist. I've tried restarting the services but they just report that they failed to start. Any idea what to try next?

      Box: SG-4200

      • wgstarks

        Tried a restore but it didn't help. Looks like all monitoring is gone.

        Screen Shot 2021-08-08 at 6.10.57 PM.png

        • wgstarks

          Serial interface shows this repeating-

          kern.ipc.maxpipekva exceeded; see tuning(7)
          kern.ipc.maxpipekva exceeded; see tuning(7)
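
The message refers to a kernel limit on the memory set aside for pipe buffers. As a rough sketch, assuming the FreeBSD sysctls `kern.ipc.pipekva` (current usage) and `kern.ipc.maxpipekva` (the limit) are readable on the box, something like the following could show how close the system is to that limit. The helper name `pipe_kva_percent` is made up for illustration and is not part of pfSense.

```shell
#!/bin/sh
# Sketch: report how much of the pipe KVA limit is currently in use.

# pipe_kva_percent USED MAX
# Prints the integer percentage of MAX consumed by USED.
pipe_kva_percent() {
    used=$1
    max=$2
    if [ "$max" -le 0 ]; then
        echo "invalid limit: $max" >&2
        return 1
    fi
    echo $(( used * 100 / max ))
}

# Query the kernel only if both sysctls can be read (assumed: FreeBSD/pfSense).
if cur=$(sysctl -n kern.ipc.pipekva 2>/dev/null) &&
   limit=$(sysctl -n kern.ipc.maxpipekva 2>/dev/null); then
    echo "pipe KVA: ${cur} of ${limit} ($(pipe_kva_percent "$cur" "$limit")%)"
fi
```

A value that keeps climbing toward 100% between checks would be consistent with some process creating pipes faster than they are released.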
          
          

          • stephenw10 Netgate Administrator

            That is a symptom of something else doing something it shouldn't be: spawning loads of pipes.

            Try running this at the command line:

            ps -auxwwd
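
The full listing can be long. A minimal sketch for narrowing it down, assuming the culprit shows up as many copies of the same command, is to count processes per command name. `top_commands` is a hypothetical helper written for this example, not a pfSense tool.

```shell
#!/bin/sh
# Sketch: list the most frequently running command names.

# top_commands N
# Reads one command name per line on stdin and prints the N most frequent
# names as "COUNT NAME", highest count first.
top_commands() {
    sort | uniq -c | sort -rn | head -n "$1" | sed 's/^ *//'
}

# Live usage: -A selects all processes, -o comm= prints only the command
# name with no header line.
if command -v ps >/dev/null 2>&1; then
    ps -A -o comm= | top_commands 10
fi
```

If one name dominates the list with a count that grows between runs, that process is a reasonable first suspect.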
            

            Steve

            • wgstarks @stephenw10

              Thanks. I should have posted this last night but it was late. I started a support ticket with Netgate and they sent me a recovery image for 21.05 within just a few minutes. Took me a bit to re-configure the basic settings and restore the rest from recent backups but got everything up and running a little after midnight.

              Tried to capture a status report for tech support before recovery but couldn’t get that to work. Wouldn’t retrieve any of the info. Tech Support also sent me a recovery image for 21.05.1. I’ll try updating via that process later this week when I have a few hours free and see what happens. Hopefully this was just gremlins and they’re gone now.😁 If not I’ve got the recovery images already and can revert to 21.05 again.

              • stephenw10 Netgate Administrator

                Ah, good to hear. 👍

                • wgstarks @stephenw10

                  @stephenw10
                  Looks like I jinxed myself. System went right back into the same state this afternoon.

                  Is it possible that an automated local backup could cause this? Maybe it's just coincidence but both days this has occurred at roughly the same time as my automated backup runs.

                  Backup script-

                  #!/bin/bash
                  
                  BACKUP_HOST=10.0.1.1
                  BACKUP_USER=backup
                  BACKUP_PASSWORD=redacted
                  
                  # Create config file directory if it doesn't exist
                  [ -d files/ ] || mkdir files
                  
                  # Fetch the login form and save the cookies and CSRF token:
                  wget -qO- --keep-session-cookies --save-cookies cookies.txt \
                    --no-check-certificate https://${BACKUP_HOST}/diag_backup.php \
                    | grep "name='__csrf_magic'" | sed 's/.*value="\(.*\)".*/\1/' > csrf.txt
                  
                  # Submit the login form along with the first CSRF token and save the second CSRF token (can’t reuse the same file) – now the script is logged in and can take action:
                  wget -qO- --keep-session-cookies --load-cookies cookies.txt \
                    --save-cookies cookies.txt --no-check-certificate \
                    --post-data "login=Login&usernamefld=${BACKUP_USER}&passwordfld=${BACKUP_PASSWORD}&__csrf_magic=$(cat csrf.txt)" \
                    https://${BACKUP_HOST}/diag_backup.php  | grep "name='__csrf_magic'" \
                    | sed 's/.*value="\(.*\)".*/\1/' > csrf2.txt
                  
                  # Submit the download form along with the second CSRF token to save a copy of config.xml:
                  wget --keep-session-cookies --load-cookies cookies.txt --no-check-certificate \
                    --post-data "download=download&donotbackuprrd=yes&__csrf_magic=$(head -n 1 csrf2.txt)" \
                    https://${BACKUP_HOST}/diag_backup.php -O ./files/config_${BACKUP_HOST}_$(date +%Y-%m-%d-%H-%M-%S).xml 2>/dev/null
                  
                  # Clean up
                  rm cookies.txt csrf.txt csrf2.txt
                  unset BACKUP_HOST BACKUP_USER BACKUP_PASSWORD
                  
                  # Remove files older than 100 days
                  find /mnt/user/odin_backup/OdinBackUp/files/ -type f -name '*.xml' -mtime +100 -exec rm {} \;
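
One thing the script above does not do is check what it actually downloaded. If the login or a CSRF token lookup fails, the saved `.xml` file may be an HTML page or empty rather than a configuration. A small sketch of a sanity check, assuming a valid backup contains the `<pfsense>` root element; `is_pfsense_config` is a hypothetical helper name, and the file to check is passed as the first argument.

```shell
#!/bin/sh
# Sketch: reject a download that does not look like a pfSense config.

# is_pfsense_config FILE
# Succeeds only if FILE is non-empty and contains the <pfsense> root element
# (assumption: every valid config.xml backup contains it).
is_pfsense_config() {
    [ -s "$1" ] && grep -q '<pfsense>' "$1"
}

# Optional command-line use: pass the path of the file to check.
if [ "$#" -gt 0 ]; then
    if is_pfsense_config "$1"; then
        echo "looks like a pfSense config: $1"
    else
        echo "not a pfSense config: $1" >&2
        exit 1
    fi
fi
```

Calling this on the newly written file before the cleanup step would make a failed backup visible instead of silently storing a bad file.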
                  

                  I've done another recovery and the system is back up and running but wondering for how long.😕

                  • stephenw10 Netgate Administrator

                    It seems unlikely but I guess it could if it's doing something unexpected.

                    If it fails again, run ps -auxwwd and see what it's actually doing.

                    Steve

                    • stephenw10 Netgate Administrator

                      Looks like this is the gw_leds script, which it appears you're also running:
                      https://forum.netgate.com/topic/165680/sg-3100-21-05-1-kern-ipc-maxpipekva-exceeded-see-tuning-7

                      Steve

                      • wgstarks @stephenw10

                        Thanks. I’ll follow that post.

                        Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.