    Bug report - pfsense on ESXi 5 freeze

    • AshleyRBlack

      "Chris Buechler @cbuechler
      @AshleyRBlack post info like hypervisor, config overview, anything else that might be pertinent, and i'll check it out and reply."

      "Chris Buechler @cbuechler
      @AshleyRBlack it's not Windows, bad band aid and prob won't fix. twitter too short to troubleshoot, I'll reply @ forum or mailing list post"

      OK, the problem occurs both at home and in the lab VM at work: after a few days, pfSense just freezes, including the VM console.
      At work, the boss set up a cron job a few weeks ago to restart it every night, which I have copied at home for now.

      I will give details of my home one, as I don't have access to the work lab one yet.

      *** Welcome to pfSense 2.0.1-RELEASE-pfSense (amd64) on pfSense ***

      WAN (wan)                 -> em1        -> 86.x.x.x (DHCP)
      LAN (lan)                 -> em0        -> x.x.x.x
      WAN2 (opt1)               -> em2        -> 87.x.x.x
      …
      ...
      Enter an option: 8

      [2.0.1-RELEASE][admin@pfSense.localdomain]/root(1): uname -a
      FreeBSD pfSense.localdomain 8.1-RELEASE-p6 FreeBSD 8.1-RELEASE-p6 #0: Mon Dec 12 18:15:35 EST 2011     root@FreeBSD_8.0_pfSense_2.0-AMD64.snaps.pfsense.org:/usr/obj./usr/pfSensesrc/src/sys/pfSense_SMP.8  amd64
      [2.0.1-RELEASE][admin@pfSense.localdomain]/root(2):

      It is running on VMware ESXi version 5.0.0, build 469512.

      I can provide anything else that's needed.

      • heper

        Is it pfSense that freezes, or are you unable to start/restart other VMs on the same machine? Are you able to reboot the host machine? Is the host machine a Dell R310?

        • AshleyRBlack

          @heper:

          Is it pfSense that freezes, or are you unable to start/restart other VMs on the same machine? Are you able to reboot the host machine? Is the host machine a Dell R310?

          OK, all other VMs carry on working. Only pfSense freezes.

          I have not rebooted the ESXi host, as there is no need; it is running fine, as are the rest of my VMs, including a Juniper SA SSL VPN, Ubuntu, and an XP VM for emergency access.

          The host is an HP ProLiant ML115 G5 with an added Intel dual GigE card.

          Does that help any? Do you need some log files or something?

          • heper

            Is any version of the VM tools installed?

            Also, did you check this on the wiki:

            Certain intel igb cards, especially multi-port cards, can very easily/quickly exhaust mbufs and cause panics, especially on amd64. The following tweaks should help:

            In /boot/loader.conf.local - Add the following (or create the file if it does not exist):

            kern.ipc.nmbclusters="131072"
            hw.igb.num_queues=1

            That will increase the amount of network memory buffers, and make the card use one queue instead of multiple queues, to reduce the strain on the system.

            The same settings can also apply to em(4) cards, just use "em" in place of "igb" in the setting(s) above.

            I'm not sure if this is relevant on ESXi, though.

            Personally, I've only had issues on a Dell R310 with ESXi 4.1, where running a pfSense VM would bring the hypervisor into a semi-unresponsive state (I could enter the console, but any action such as reboot/shutdown would fail). This was solved by updating to ESXi 5.0 and might have been related to the cheap, basic hardware RAID card inside.
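
            The loader.conf.local tweak quoted from the wiki above can be scripted so that re-running it doesn't pile up duplicate lines. A minimal sketch, assuming a plain POSIX shell; set_tunable is a made-up helper name, not something pfSense or FreeBSD ships, and it writes every value quoted, which loader.conf accepts. Try it on a scratch file before pointing it at /boot/loader.conf.local:

            ```shell
            #!/bin/sh
            # set_tunable FILE KEY VALUE  (hypothetical helper, for illustration only)
            # Replace any existing KEY=... line in FILE, or add one if it is missing.
            set_tunable() {
                file=$1
                key=$2
                value=$3
                # Create the file if it does not exist yet.
                [ -f "$file" ] || : > "$file"
                # Copy every line except earlier assignments of KEY, then append the new one.
                awk -F= -v k="$key" '$1 != k' "$file" > "$file.tmp"
                printf '%s="%s"\n' "$key" "$value" >> "$file.tmp"
                mv "$file.tmp" "$file"
            }

            # Example run against a scratch copy (uncomment to try):
            # set_tunable ./loader.conf.local kern.ipc.nmbclusters 131072
            # set_tunable ./loader.conf.local hw.igb.num_queues 1
            ```

            Loader tunables are only read at boot, so the VM needs a reboot before the new values take effect.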

            • cmb

              I suspect this is because of a timecounter issue some people see on occasion, where the system clock stops. The console is generally still responsive in those cases, but many services stop functioning because they're time-dependent. Try running:

              sysctl kern.timecounter.hardware=i8254

              and see if it happens again. Once you know it's something you want applied permanently, add it under System>Advanced.

              The couple of other times we've seen this, the console was responsive, and running the above immediately brought everything back to life as it fixed the system clock. Why it applies to so few people I'm not sure. ESX is the most widely used hypervisor by far; our production firewalls run in ESX, as do many of our customers and other users in the community, and only an extremely small percentage see this.

              Definitely interested in whether that fixes it for you.
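
              Before forcing a timecounter, it can help to check which ones the guest actually offers. On FreeBSD, sysctl kern.timecounter.choice lists them as name(quality) entries and kern.timecounter.hardware shows the one in use. A minimal sketch; has_timecounter is a made-up helper, and the sample string in the comment is illustrative, not output from the system in this thread:

              ```shell
              #!/bin/sh
              # has_timecounter CHOICES NAME  (hypothetical helper, for illustration only)
              # CHOICES is the value of kern.timecounter.choice, a space-separated list of
              # name(quality) entries, for example: "TSC(-100) ACPI-safe(850) i8254(0)".
              # Succeeds if NAME is one of the listed timecounters.
              has_timecounter() {
                  choices=$1
                  name=$2
                  for entry in $choices; do
                      # Strip the "(quality)" suffix and compare the bare name.
                      if [ "${entry%%\(*}" = "$name" ]; then
                          return 0
                      fi
                  done
                  return 1
              }

              # On the firewall itself (FreeBSD only), report what is in use and on offer.
              if [ "$(uname -s)" = "FreeBSD" ]; then
                  echo "current: $(sysctl -n kern.timecounter.hardware)"
                  if has_timecounter "$(sysctl -n kern.timecounter.choice)" i8254; then
                      echo "i8254 is available; switch with: sysctl kern.timecounter.hardware=i8254"
                  fi
              fi
              ```

              A sysctl set at runtime like this does not survive a reboot, which is why it also needs adding under System>Advanced once it has proven itself.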

              • AshleyRBlack

                In /boot/loader.conf.local - Add the following (or create the file if it does not exist):

                kern.ipc.nmbclusters="131072"
                hw.igb.num_queues=1

                I have added this, so we will see what happens.

                sysctl kern.timecounter.hardware=i8254

                and see if it happens again. Once you know it's something you want applied permanently, add it under System>Advanced.

                Well, the console is always unresponsive. Should I try this preemptively?

                Thanks

                • biggsy

                  Hope cmb's suggestion works, but:

                  … running on VMware ESXi Version 5.0.0 build 469512

                  There's nothing in its list of bug fixes that hints it might help you, but there is a 5.0 Update 1 (build 623860).

                  … 2.0.1-RELEASE-pfSense (amd64) ...

                  Is it practical for you to build a 32-bit pfSense VM, restore your config and see if it suffers from the same problem?

                  • AshleyRBlack

                    Is it practical for you to build a 32-bit pfSense VM, restore your config and see if it suffers from the same problem?

                    Yes, this could be done. It would actually be quite easy, and if it all goes Pete Tong, I can just spin up the original version.

                    I could build it today and switch over tonight. The only problem is that I have two weeks' holiday from Friday, so monitoring becomes a problem…

                    EDIT: Just checked, and the lab was installed with the 32-bit version, and it experiences the same problem (same ESX host and version as well).

                    • biggsy

                      Also, have you seen this:

                      http://forums.freebsd.org/showthread.php?t=31929

                      • AshleyRBlack

                        @biggsy:

                        Also, have you seen this:

                        http://forums.freebsd.org/showthread.php?t=31929

                        I have now… erm… that kind of leaves very few options other than just going physical.

                        EDIT: I ran sysctl kern.timecounter.hardware=ACPI-safe as per the FreeBSD forum post, and added it to my "System Tunables". We'll see what happens now.

                        • cmb

                          Looks like this quote in particular from VMware in the above linked FreeBSD forum thread has the answer:

                          " just wanted to get in touch with you to let you know that I've reviewed the logs and information you have provided. I've sent the details on to our Engineering team - it appears other customers are experiencing this issue and a case was only opened with Engineering last week regarding this issue. The same workaround you found (manually force the guest OS to use the ACPI-safe source) appears to be working for other customers as well.

                          We are in the process of drafting a KB article for this issue while Engineering work on a fix."

                          • AshleyRBlack

                            @cmb:

                            Looks like this quote in particular from VMware in the above linked FreeBSD forum thread has the answer:

                            " just wanted to get in touch with you to let you know that I've reviewed the logs and information you have provided. I've sent the details on to our Engineering team - it appears other customers are experiencing this issue and a case was only opened with Engineering last week regarding this issue. The same workaround you found (manually force the guest OS to use the ACPI-safe source) appears to be working for other customers as well.

                            We are in the process of drafting a KB article for this issue while Engineering work on a fix."

                            Thanks very much for all the help on here and on Twitter. So far so good: the web UI and console seem to be more responsive, and there has been nothing like a slowdown or freeze. But I guess I won't know for sure until I get back in 2 1/2 weeks.

                            I will add an update then and confirm whether this has worked.

                            • AshleyRBlack

                              After a power cut,

                              uptime 14 days, 15:29

                              which I think means that this is a fix/workaround. Before, it would fail within a week and need to be restarted.
