    PfSense 2.2 503 - Service Not Available

    • gessel

      If it is possible to recover from a tarball, then perhaps a script could run on startup, test for some indication of this problem, and automatically restore /etc from the archive?  An ugly hack, but the problem I can easily foresee (my pfSense instances are 20 hours of travel apart) is that the manual fix is not an easy talk-through for a non-technical hands-on person, and if the system goes down it is awfully hard to get in from the WAN side to do the work remotely.

      I don't want to attempt this again unintentionally, but does SSH successfully start when this happens and are the rules that permit WAN side access working?

      Otherwise a remote box pretty much necessitates a remote KVM on an accessible IP outside the firewall to give console access. Having a tarball of /etc squirreled away would save a reinstall, so I'll prepare my instances for the worst by doing that and making sure WAN-side SSH works.
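
A minimal sketch of the "tarball of /etc" idea, offered as an illustration only. The function names `backup_etc` and `restore_etc` are hypothetical helpers, not pfSense tools, and the sketch makes no attempt to detect the corruption itself, since the thread does not say what a reliable indicator would be.

```shell
#!/bin/sh
# Hypothetical helpers: archive a directory such as /etc and restore it later.

# backup_etc SRC_DIR ARCHIVE
# Pack the contents of SRC_DIR into a gzip-compressed tarball at ARCHIVE.
backup_etc() {
    src=$1
    archive=$2
    # -C changes into SRC_DIR first, so paths inside the archive are relative.
    tar -czf "$archive" -C "$src" .
}

# restore_etc ARCHIVE DEST_DIR
# Unpack ARCHIVE over DEST_DIR, overwriting files with the same names.
restore_etc() {
    archive=$1
    dest=$2
    # Refuse to restore from a missing or unreadable archive.
    tar -tzf "$archive" >/dev/null 2>&1 || return 1
    tar -xzf "$archive" -C "$dest"
}
```

Usage might look like `backup_etc /etc /root/etc-backup.tgz` (the destination path is an assumption), keeping the archive outside the directory being archived and ideally a second copy off the box.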

      • jimp (Rebel Alliance, Developer, Netgate)

        It may be possible to make an ugly hack like that, but it's not something we'd actually code up and put in the images (not that I can see happening anyhow) unless things got really desperate.

        For those especially prone to this, you might also try adding "sync,noatime" (sans quotes) to the mount options for the disk in /etc/fstab. In my testing it still ran fsck and found errors, but I didn't see any corruption, though whether that was pure luck or due to the change is not yet clear. For example:

        Before:

        /dev/ufsid/552d6d027debc466		/		ufs	rw		1	1
        

        After:

        /dev/ufsid/552d6d027debc466		/		ufs	rw,sync,noatime		1	1
        

        Disk performance may take a slight hit for that but if it does help, it's worth the extra stability.
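
To confirm whether a box already carries the change, something like the following could be used. It is only a sketch: `root_has_sync` is a hypothetical helper name, and it assumes the root entry looks like the Before/After lines above (mount point `/`, type `ufs`, options in the fourth field).

```shell
#!/bin/sh
# root_has_sync FSTAB
# Exit 0 if the UFS root entry in FSTAB lists "sync" among its mount options,
# exit 1 otherwise (including when no such entry exists).
root_has_sync() {
    awk '
        $1 ~ /^#/ { next }                 # skip comment lines
        $2 == "/" && $3 == "ufs" {
            n = split($4, opts, ",")       # options are comma-separated
            for (i = 1; i <= n; i++)
                if (opts[i] == "sync") found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}
```

For example, `root_has_sync /etc/fstab && echo "sync is set"` prints the message only when the option is present.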

        Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

        Need help fast? Netgate Global Support!

        Do not Chat/PM for help!

        • gessel

          This seems like a sensible fix.  It should help reduce the risk of corruption or data loss.  The mitigations, as I see them:

          • Make sure SSH access works from wherever one needs to manage a dead firewall from (probably WAN)

          • Back up /etc to someplace sensible

          • Adjust /etc/fstab to trade performance for reliability

          Hopefully this will get sorted.

          I would think that moving to boot on ZFS would be a reasonable migration path.  No more fsck.

          • jimp (Rebel Alliance, Developer, Netgate)

            ZFS is more of a long-term goal (and it is one of our goals, definitely): not something we can implement fast or without lots of testing, and not an option for upgrades. So it is great for the future, but not what we need to fix right now.


            • prairie-sky

              I'm having this same issue at a remote site.

              Does it just kill the GUI, or does it kill the routing as well?  I can still ping the box, but I'm really hoping it's still allowing traffic to flow through...

              • yaplej

                I just ran into this issue on two VM instances of pfSense I was setting up.  I've been struggling to get CARP working between two KVM hosts and would reboot the hosts without shutting down the pfSense VM first (simulating a power failure).  I have now reinstalled the pair 3 times.  I'm trying the ",sync,noatime" option in /etc/fstab to see if it prevents needing to reinstall.

                • gessel

                  yaplej: please report if it does help - seems like you're doing the right kind of testing to verify.  I've made the changes on all my pfSense instances: fingers crossed power doesn't go out at a remote site and kill it.

                  • donpfsform

                    I had this happen after a power outage.  Version 2.2.0.  I received a 503 error on the web console and tried enabling SSH from the console with no luck; routing did work as long as I gave the workstation a static IP and a DNS server other than pfSense.  I ended up reinstalling to fix the issue.  I installed several instances this time to different partitions so I can at least boot something.  http://www.blog.unflap.com/2009/12/28/dual-boot-pfsense-for-testing-new-versions/    You can just choose to go back to the main menu instead of rebooting and can install as many instances as you need.

                    Note: for some reason a clean install of 2.2.2 would not boot on my Dell Optiplex 320.  I would get the F1, F2, F3 selection and then it would reboot in an endless loop.  It had listed ad0 for the drive.  I finally went back to 2.1.0 and installed multiple instances without issue to ad4.  Not sure why the drive name changed.  All three instances updated to 2.2.2 and restored a config just fine.

                    • KOM

                      I just had this happen after installing Bandwidthd on a new test 2.2.2 config that I spun up today.  No power outage or anything else strange going on.  I selected the LAN interface and clicked Save.  Then I went to Access Bandwidthd and got:

                      Please start bandwidthd to populate this directory.

                      I went back to the Bandwidthd config page and again clicked Save, and that's when I got the 503.  All WebGUI attempts now give 503 until I restart PHP-FPM.  I have never, ever seen this before until now with bandwidthd.

                      With PHP-FPM restarted, everything appeared to be normal again.  Then when I went to my dashboard, all the widgets were good except for the NTP widget, which showed 503, and my LAN traffic graph, which shows:

                      Cannot get data about interface vmx1

                      ntopng seems to have died as well.

                      [Screenshot attachment: pfsense.png]

                      • jimp (Rebel Alliance, Developer, Netgate)

                        That's likely a different root cause than the filesystem corruption others are seeing.


                        • KOM

                          And the problem persists through a reboot.  Now both squid and squidGuard, which were working fine, are crashing as fast as they can start.  WebGUI gives the exact same display as my screencap, with 503 for the NTP widget and the LAN graph not able to talk to vmx1.  Bizarre.

                          • doktornotor (Banned)

                            With the squid* censored, it's quite possible you are getting an unclean reboot, which is perfectly enough to screw up the filesystem.

                            On that note, I wonder if anyone tested with fsck from some previous FBSD versions. The one in 10.1 is simply mad.

                            • KOM

                              I didn't see any warnings about a dirty filesystem.  The system never crashed or anything that might cause an obvious filesystem error.  fsck comes up with about a dozen unreferenced inodes though.

                              • KOM

                                I left it running overnight and came back to a dead GUI and an endless stream of:

                                swap_pager_getswapspace(n): failed.

                                I lost my VMware heartbeat around 1am and it didn't return for 45 minutes.

                                Hard reset allowed me to boot.  The LAN graph is still borked but the NTP widget now seems to be working again.

                                • pgmillon

                                  Hi,

                                  I just encountered the same problem today. I'm running pfSense in VirtualBox on an old computer, probably with HDD troubles.
                                  PHP-FPM could not start.

                                  I had to reinstall and restore a backup of config.xml.

                                  I hope not to experience this trouble too often.

                                  • kujina

                                    Version 2.2.2 direct install (no vm).

                                    Power cut yesterday and problems here too: 503 and 500 errors, and neither SSH nor DHCP was working.

                                    • jimp (Rebel Alliance, Developer, Netgate)

                                      For those who want to avoid having their filesystem corrupted, here's a quick shell one-liner to add sync to fstab and remount / with sync:

                                      cp /etc/fstab /etc/fstab.orig; /usr/bin/sed -i '' 's/^\(\/.*[[:space:]]*\/[[:space:]]*ufs[[:space:]]*\)rw\([[:space:]]*[[:digit:]][[:space:]]*[[:digit:]]\)$/\1rw,sync\2/' /etc/fstab; mount -o sync /
                                      

                                      (Be sure to copy the entire line)

                                      This will happen automatically during the 2.2.3 upgrade, and new installs of 2.2.3 will have sync enabled.
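
Before running the one-liner on a remote box, it may be worth previewing what it would change. The sketch below reuses the same sed expression but prints the result instead of editing in place; `preview_sync_fstab` is a hypothetical helper name, not part of pfSense.

```shell
#!/bin/sh
# preview_sync_fstab FSTAB
# Print FSTAB with ",sync" appended to a UFS root entry whose options are
# exactly "rw". The file itself is not modified.
preview_sync_fstab() {
    sed 's/^\(\/.*[[:space:]]*\/[[:space:]]*ufs[[:space:]]*\)rw\([[:space:]]*[[:digit:]][[:space:]]*[[:digit:]]\)$/\1rw,sync\2/' "$1"
}
```

Running `preview_sync_fstab /etc/fstab | diff /etc/fstab -` shows the pending change, if any. Because the expression only matches an options field that is exactly `rw`, an entry that already reads `rw,sync` or carries other options is left untouched.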


                                      • jimp (Rebel Alliance, Developer, Netgate)

                                        I also started a wiki doc about it, with the commands there as well. https://doc.pfsense.org/index.php/Filesystem_Corruption_%28503_errors,_cannot_get_uid%29


                                        • gcoltharp

                                          The fix for this is quite simple

                                          Go to the command prompt…

                                          vi /etc/group
                                          Add a line at the end using the same syntax as the line above it. Just use wheel as the group name and use a GID number that is not already used in the file.
                                          Save it and restart the PHP service. The error will go away and you will be golden.
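
Before editing, it can help to check whether the wheel entry is really missing and which GIDs are taken. A rough sketch, assuming the usual colon-separated `name:password:gid:members` layout of the group file; `group_exists` and `gid_in_use` are hypothetical helper names.

```shell
#!/bin/sh
# group_exists GROUP_FILE NAME
# Exit 0 if GROUP_FILE has an entry whose group name is NAME.
group_exists() {
    awk -F: -v name="$2" '
        $1 == name { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# gid_in_use GROUP_FILE GID
# Exit 0 if any non-comment entry in GROUP_FILE already uses GID.
gid_in_use() {
    awk -F: -v gid="$2" '
        $1 !~ /^#/ && $3 == gid { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$1"
}
```

For instance, `group_exists /etc/group wheel || echo "wheel is missing"` reports the condition this fix addresses without changing anything.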

                                          • Gertjan

                                            Another fix: upgrade to the repaired pfSense version, 2.2.4.

                                            No "help me" PM's please. Use the forum, the community will thank you.
                                            Edit : and where are the logs ??

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.