Netgate Discussion Forum

    PfSense 2.2.x Panics with "Sleeping thread owns a non-sleepable lock"

    General pfSense Questions
      heper

      As far as I can tell, some patches were merged in 2.2.4, but the problem still doesn't appear to be completely fixed (so it probably isn't fixed in the 2.2.5 snapshots either).

      Perhaps you could provide some intel about your config and situation, and a way to replicate the panic, to help narrow down the causes. It seems to be hard to reproduce in a lab environment.

        tomierna

        My hardware is a fitPC2i, with an Atom processor. The two built-in Gig-E ports are using the Realtek driver. Verizon FiOS on one side, my company LAN on the other. I'm not using the wireless in the fitPC, so it's a pretty basic two-interface firewall config, with a touch of NAT and about a page's worth of explicit rules. No traffic shaping, no proxy, no captive portal. I'm not running any third-party packages on this pfSense install.

        I bought the hardware in 2010, and about a year ago, I replaced the spinning disk in it with an industrial SSD.

        In July, I updated to pfSense 2.2 from 2.1.1. Since July, there have been 19 spontaneous reboots.

        Conversely, I have another fitPC2i (identical hardware) at another location, and it's running 2.1.1. No spontaneous reboots I can recall.

        The panic log is exactly the same (excepting PIDs) as the one in bug report 4685, so I'm eager to test if this patch to if_ether.c in D2828 addresses the problem.

        What other intel can I provide?

          firewalluser

          @tomierna:

          Admins: if there is a better place to put this, please move it!

          After updating to 2.2.x, I've been seeing occasional panics under load, and over the last few days (in which we've been doing some offsite backups), crashes have happened much more frequently. Yesterday, we had four crashes, a couple within less than an hour of each other.

          I'm on 2.2.3 right now on this box.

          While I don't have historical crash logs for the previous crashes, in the most recent cases, it's been a kernel panic with the main cause "Sleeping thread (tid 100067, pid 12) owns a non-sleepable lock".

          I read in this bug report: https://redmine.pfsense.org/issues/4685 that it's a problem with the underlying FreeBSD code, and I see that a fix has been committed to base on June 17: https://reviews.freebsd.org/D2828

          The pfSense bug report 4685 says that the target is 2.2.5 with the change from the initial target of 2.3 being made just a couple of days ago. I'm not versed in how the pfSense team keeps track of which FreeBSD base system is included with which pfSense release, but based on this, I figure that the fix is not in 2.2.4.

          My question is: does the nightly 2.2.5 snapshot now contain this fix? And, should I update to 2.2.5 snapshot to get this fix, or wait a little, if a release 2.2.5 is imminent?

          Or, is there another way to skin this cat that I'm not thinking of, like taking just the kernel or some replacement object from 2.2.5 snapshot and drop it into my existing system?

          Replacing FreeBSD 10.1 with FreeBSD 11, following the steps in the link below, is one possibility.

          https://forum.pfsense.org/index.php?topic=83785.msg459222#msg459222

          What Intel Atom chip is it?

          There are an awful lot of complaints about Intel Atom chips randomly crashing, for any number of reasons on any number of platforms, if you search the internet.

          Have you tried switching off Hyper-Threading in the BIOS? Some people report that doing so stops the crashes, so it might be worth a try.


            tomierna

            It's an Intel Atom Z530.

            I'll try turning off HT the next time it crashes.

            While there may be complaints about Atom crashes on the internet, this box was extremely stable up until the pfSense switch to FreeBSD 10.1p13.

            Putting FBSD11 on this router is probably more effort than I want to put into the problem. I suspect I might downgrade to pfSense 2.1.x instead.

              cmb

              The initial fix that went in did fix a problem, but it wasn't the root problem of bug #4685. The root issue there was identified and one of our developers committed a fix into FreeBSD in the past few days. It's not in any snapshots yet.

              The only trigger we're aware of is proxy ARP VIPs. Do you have any proxy ARP VIPs configured? Changing those to IP aliases instead would prevent that from occurring.
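
For anyone who wants to check a saved configuration backup for proxy ARP VIPs before touching the GUI, here is a minimal sketch in Python. It is not from the pfSense developers. It assumes that virtual IPs are stored in config.xml as `<virtualip><vip>` entries with a `<mode>` of `proxyarp` or `ipalias`; verify that against your own backup. The sample XML, the addresses, and the helper name `find_proxy_arp_vips` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. The element names (virtualip/vip/mode/subnet) are an
# assumption about the config.xml layout; check them against a real backup.
SAMPLE_CONFIG = """<pfsense>
  <virtualip>
    <vip>
      <mode>proxyarp</mode>
      <interface>wan</interface>
      <subnet>203.0.113.10</subnet>
      <subnet_bits>32</subnet_bits>
      <descr>NAT target</descr>
    </vip>
    <vip>
      <mode>ipalias</mode>
      <interface>wan</interface>
      <subnet>203.0.113.11</subnet>
      <subnet_bits>32</subnet_bits>
      <descr>Already an alias</descr>
    </vip>
  </virtualip>
</pfsense>"""


def find_proxy_arp_vips(config_xml: str) -> list[tuple[str, str]]:
    """Return (interface, address) for every VIP whose mode is proxyarp."""
    root = ET.fromstring(config_xml)
    found = []
    for vip in root.findall("./virtualip/vip"):
        mode = vip.findtext("mode", default="").strip().lower()
        if mode == "proxyarp":
            found.append((
                vip.findtext("interface", default=""),
                vip.findtext("subnet", default=""),
            ))
    return found


if __name__ == "__main__":
    # List the VIPs that would need switching to IP Alias.
    for interface, address in find_proxy_arp_vips(SAMPLE_CONFIG):
        print(f"proxy ARP VIP on {interface}: {address}")
```

The script only reads the file and reports what it finds. Changing the VIP type is still best done through the GUI so the firewall reloads its configuration properly.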

                tomierna

                Thanks cmb.

                I have a single ARP Proxy VIP configured on this machine. I'm using it to NAT to several different internal services.

                Can I simply click the "IP Alias" radio button and reload the config with little other impact to my config?

                I see in the chart in the docs for VIPs that IP Aliases and Proxy ARPs have similar enough features.

                  cmb

                  @tomierna:

                  Can I simply click the "IP Alias" radio button and reload the config with little other impact to my config?

                  Yes, everything else will remain the same.

                    tomierna

                    OK, I've changed it.

                    Should I update to 2.2.4 as well, or see how this change affects stability first?

                      mer

                      @tomierna:

                      OK, I've changed it.

                      Should I update to 2.2.4 as well, or see how this change affects stability first?

                      Past experience would lead me to tell you "one thing at a time".  :)  See how it affects stability on your current release.  If things improve, plan an update to 2.2.4.  If you do both, you can't be sure which one fixed it.

                        cmb

                        @mer:

                        Past experience would lead me to tell you "one thing at a time".  :)  See how it affects stability on your current release.  If things improve, plan an update to 2.2.4.  If you do both, you can't be sure which one fixed it.

                        Very often true. In this case, the original issue is understood well enough to know that upgrading won't change things, and that switching from proxy ARP to IP alias will prevent the issue. So I wouldn't hesitate to upgrade in this case.

                          tomierna

                          It's been a few days since I changed the Proxy ARP to an IP Alias.

                          No crashes so far!

                          I'll do the update during the next maintenance window.

                          Thanks for the help!

                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.