Netgate Discussion Forum

    PfSense Instability Help

    General pfSense Questions
    • M
      mattlach
      last edited by

      Hello all,

      I would appreciate some help solving a pfSense instability issue I have been having.

      Every now and then pfSense seems to stop working properly.  The time intervals seem random (though hastened by heavy use, like torrents).

      The specs are as follows:

      2.0.1-RELEASE (amd64) running under ESXi 5 (vSphere Hypervisor) on the following hardware:
      AMD E-350 (1.6 GHz dual core)
      8GB RAM (4GB provisioned for pfSense)
      Intel EXPI9402PT dual gigabit NIC (via ESXi virtual switches, as this system is not IOMMU compatible)

      I don't think it's a hardware problem, as when pfSense stops working properly ESXi and my other server on the ESXi box keep running normally.

      When pfSense stops working properly, all clients lose external internet access.  Clients that are already on and active keep their IP addresses, but new clients do not obtain an IP from the DHCP server; they self-assign an address instead and cannot reach even internal clients.

      When this happens, I can force my desktop to use a static IP, and then can sometimes reach the pfSense web interface, sometimes not.  I go to the console and hit "5" to reboot the system, which doesn't appear to work, and then finally force a reboot from inside ESXi.  When it comes back up again, everything works as normal, but it seems to pull a new external IP.

      It takes 1-2 days for the same issue to recur.

      I logged onto the console to see if I could track down what happened in the logs, but I am not sure which log I should be looking in.

      /var/log/system.log only appears to have information from the most recent boot, so it is not helpful.

      I'd appreciate any assistance in figuring this one out, including what log information to pull to look at it.

      If it is helpful, I will post my network diagram shortly (as soon as I am done drawing it).

      Thanks,
      Matt

      • M
        mattlach
        last edited by

        Here is the network diagram.

        Appreciate any help/suggestions.

        • W
          wallabybob
          last edited by

          @mattlach:

          Every now and then pfSense seems to stop working properly.

          When this happens does the pfSense console still respond to shell commands?

          Sometimes the kernel can get stuck looping because of resource exhaustion. The following shell script could be run soon after startup to monitor kernel network resource usage:
          ```
          $ more t.sh
          while true
          do
            date
            netstat -m
            sleep $1
          done
          $
          ```
          where the first parameter gives the interval (in seconds) between the time-stamped reports, for example:
          ```
          $ sh -x ./t.sh 3600
          ```
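Since the reports are only useful if they survive the forced reset, one possible variant appends each report to a file. This is a minimal sketch, not part of the script above: the function name `log_mbufs`, the optional report count, and the log path are all assumptions made for illustration.

```shell
#!/bin/sh
# log_mbufs INTERVAL LOGFILE [COUNT]
# Append a time-stamped `netstat -m` report to LOGFILE every INTERVAL seconds.
# COUNT is a hypothetical convenience argument: stop after that many reports
# (0 or omitted means run until interrupted).
log_mbufs() {
    interval=$1
    logfile=$2
    count=${3:-0}
    n=0
    while true
    do
        {
            date
            netstat -m
            echo    # blank line between reports
        } >> "$logfile" 2>&1
        n=$((n + 1))
        if [ "$count" -gt 0 ] && [ "$n" -ge "$count" ]; then
            break
        fi
        sleep "$interval"
    done
}
```

Saved as, say, `mbuflog.sh` (a made-up name), it could be started with something like `nohup sh -c '. ./mbuflog.sh; log_mbufs 3600 /root/mbuf.log' &`. The last report in the file then shows the state shortly before the hang. The log should live on storage that persists across reboots; `/root/mbuf.log` is only an example path.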

          • M
            mattlach
            last edited by

            @wallabybob:

            @mattlach:

            Every now and then pfSense seems to stop working properly.

            When this happens does the pfSense console still respond to shell commands?

            Sometimes the kernel can get stuck looping because of resource exhaustion. The following shell script could be run soon after startup to monitor kernel network resource usage:
            ```
            $ more t.sh
            while true
            do
              date
              netstat -m
              sleep $1
            done
            $
            ```
            where the first parameter gives the interval (in seconds) between the time-stamped reports, for example:
            ```
            $ sh -x ./t.sh 3600
            ```

            Thank you, I will do this and give it a shot.

            I am a beginner at BSD though, so please bear with me here.

            How can I install an editor to perform this task?

            It looks like gcc is installed with pfSense.  Should I compile one manually?  I have heard of the ports package manager, but I have no clue how to use it.  (I am familiar with Gentoo Linux's Portage, as well as Red Hat's RPM and Debian's APT.)

            How can I get an editor like vi, nano or emacs on the system so I can edit and save the script?

            Thanks for your help,
            Matt

            • W
              wallabybob
              last edited by

              @mattlach:

              How can I install an editor to perform this task?

              No need, see later.

              @mattlach:

              It looks like gcc is installed with pfSense.  Should I compile one manually?  I have heard of the ports package manager, but I have no clue how to use it.  (I am familiar with Gentoo Linux's Portage, as well as Red Hat's RPM and Debian's APT.)

              Compile one what? A shell? sh is already installed.

              @mattlach:

              How can I get an editor like vi, nano or emacs on the system so I can edit and save the script?

              vi and ee are installed as part of the base install.

              You don't need to do anything to the base system (except create the shell script) to run the script I provided.

              • B
                bkamen
                last edited by

                is your system completely locked hard? (i.e. you hit the HW reset button or power-cycled?)

                I recently installed a new setup using a SuperMicro X7SPE Atom MB – and have had similar issues.

                I think I may have solved it - but don't have enough up-time to say yet.

                -Ben

                –
                Ben - O.D.T., S.P.

                • M
                  mattlach
                  last edited by

                  @wallabybob:

                  vi and ee are installed as part of the base install.

                  Ahh, I see, that they are.  My mistake.

                  I was trying to launch "vim" instead of "vi", as I am used to that being installed on my Linux systems.

                  Thanks for the help.  I will be running your netstat script.

                  Can you give me an idea of what I might be looking for in the netstat -m output?

                  Thanks,
                  Matt

                  • M
                    mattlach
                    last edited by

                    @bkamen:

                    is your system completely locked hard? (i.e. you hit the HW reset button or power-cycled?)

                    I recently installed a new setup using a SuperMicro X7SPE Atom MB – and have had similar issues.

                    I think I may have solved it - but don't have enough up-time to say yet.

                    -Ben

                    Mine does not lock hard.

                    I am running it in a VM under VMware ESXi.  The rest of the system (and the other VMs) remain up and stable.

                    The pfSense VM remains somewhat accessible (web interface sometimes, console always) but it is slow and unresponsive after the issue occurs.  I typically try to go to console and restart it using command "5", but after waiting what seems like forever without a reboot occurring, I usually just get tired of waiting and force a reset from within ESXi.

                    What was your issue?  How did you solve it?

                    • W
                      wallabybob
                      last edited by

                      @mattlach:

                      Can you give me an idea of what I might be looking for in the netstat -m output?

                      Look for allocation failures, a trend of increasing resource use, and a current or total figure that is "near" its maximum. The maximum figures need to be big enough for worst-case use. The total figures may rise for a time and should then level off, but they will rise again if you encounter bigger peak loads.
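As a rough illustration of those checks, the sketch below reads `netstat -m` output on standard input and flags denied requests and a cluster total close to its maximum. The function name `check_mbufs` and the 80% default threshold are assumptions chosen for the example, not values from pfSense or FreeBSD; what counts as "near" depends on the peak load expected.

```shell
#!/bin/sh
# check_mbufs [THRESHOLD_PERCENT]
# Reads `netstat -m` output on stdin. Exits non-zero if any request was
# denied or if the mbuf cluster total reaches THRESHOLD_PERCENT of max.
# The default of 80 is an arbitrary assumption for illustration.
check_mbufs() {
    awk -v limit="${1:-80}" '
        /mbuf clusters in use/ {
            # First field has the form current/cache/total/max
            split($1, f, "/")
            if (f[4] > 0) {
                pct = 100 * f[3] / f[4]
                printf "clusters: total %d of max %d (%.0f%%)\n", f[3], f[4], pct
                if (pct >= limit) warn = 1
            }
        }
        /requests for .* denied/ {
            # First field is one or more slash-separated counters
            split($1, d, "/")
            for (i in d) {
                if (d[i] + 0 > 0) denied = 1
            }
        }
        END {
            if (denied) print "WARNING: allocation requests were denied"
            if (warn) print "WARNING: cluster total is near the maximum"
            if (denied || warn) exit 1
        }
    '
}
```

It could be run as `netstat -m | check_mbufs`, or with a stricter limit as `netstat -m | check_mbufs 50`. Spotting a rising trend still means comparing successive time-stamped reports by eye.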

                      • H
                        heper
                        last edited by

                        i've experienced similar issues with esxi, even to the point where all VM's on the same box stop working (i've had this on multiple systems running esxi4.1 & pfsense)

                        thus far i've been unable to solve it but the problem does not repeat frequently (sometimes +150d uptime without issues).

                        esxi forum posts indicate that this could be related to a couple of types of raid controller … (check esxi logs for warnings)

                        • M
                          mattlach
                          last edited by

                          @heper:

                          i've experienced similar issues with esxi, even to the point where all VM's on the same box stop working (i've had this on multiple systems running esxi4.1 & pfsense)

                          thus far i've been unable to solve it but the problem does not repeat frequently (sometimes +150d uptime without issues).

                          esxi forum posts indicate that this could be related to a couple of types of raid controller … (check esxi logs for warnings)

                          Thanks for the suggestion.

                          In my case I know it's not due to any RAID controllers, as I am not using RAID.    I appreciate the input though.

                          • jimpJ
                            jimp Rebel Alliance Developer Netgate
                            last edited by

                            Probably worth applying the em tweaks from here:
                            http://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards


                            • M
                              mattlach
                              last edited by

                              @jimp:

                              Probably worth applying the em tweaks from here:
                              http://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards

                              Thank you, I'll have to take a look at this.

                              I don't think mbufs are the issue for me.

                              The pfSense guest became unresponsive again on Friday (I am just getting around to posting now), and the following was the last entry in my scripted log file:

                              
                              Fri Mar 30 16:16:44 EDT 2012
                              514/5758/6272 mbufs in use (current/cache/total)
                              513/5541/6054/25600 mbuf clusters in use (current/cache/total/max)
                              512/5376 mbuf+clusters out of packet secondary zone in use (current/cache)
                              0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
                              
                              

                              With no denied requests and cluster use far below the maximum, this seems to suggest that mbufs are not my issue.

                              Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.