Netgate Discussion Forum

    Snif: pfSense randomly hangs, how to diagnose please (peep)?

    General pfSense Questions
      Mr. Jingles

      G'day all  :-[

      Over the last couple of days, my first pfSense box (in my sig) has been hanging spontaneously. Nothing works anymore, not even keyboard input on the console. I have to hard power off and power on again, after which it runs the file system checks and starts up normally.

      The last change I made was uninstalling Suricata some days ago, since it conflicted with Snort.

      In the GUI, under Status/System logs, there is nothing to be seen (the log only goes back to the reboot), except that I suppose it happened after this:

      
      Aug 5 12:47:51 syslogd: kernel boot file is /boot/kernel/kernel
      Aug 5 12:30:35 php: rc.update_urltables: /etc/rc.update_urltables
      

      As you can see, the update ran at 12:30, and at 12:47 we see the system booting up again after my hard reset and the file system checks.

      How would one diagnose this further? Which logs should I look in to find out what happened?
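A minimal sketch of how the silent window before a reboot could be measured from log lines like the two above. It assumes the `Mon D HH:MM:SS` stamp format shown in the excerpt, an English locale for the month name, and a placeholder year (syslog does not record one); `silent_window` and `parse_stamp` are hypothetical helper names. The hang can have happened anywhere inside that window, not necessarily right after the last logged entry.

```python
from datetime import datetime

BOOT_MARKER = "kernel boot file is"  # syslogd startup line, as seen in the excerpt

def parse_stamp(line, year):
    """Parse the leading 'Mon D HH:MM:SS' stamp; the year is an assumption."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")

def silent_window(lines, year=2000):
    """Return (last entry before boot, boot time, gap) for the first boot marker."""
    entries = sorted((parse_stamp(l, year), l.strip()) for l in lines if l.strip())
    for i in range(1, len(entries)):
        stamp, text = entries[i]
        if BOOT_MARKER in text:
            prev_stamp, prev_text = entries[i - 1]
            return prev_text, stamp, stamp - prev_stamp
    return None

log = [
    "Aug 5 12:47:51 syslogd: kernel boot file is /boot/kernel/kernel",
    "Aug 5 12:30:35 php: rc.update_urltables: /etc/rc.update_urltables",
]
last_entry, boot_time, gap = silent_window(log)
print(last_entry)
print("silent for", gap)  # prints "silent for 0:17:16"
```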

      Thank you in advance for any help  :-*

      Bye,

      EDIT:  I just thought: these urltables are updated more than once a day (BB's script for updating the IR* tables also starts the updater hourly), so most of the time the update does not cause a crash, since the crash frequency is about once every 1.5-2 days.

      6 and a half billion people know that they are stupid, aggressive, lower life forms.
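The arithmetic behind that EDIT, as a small sketch. It assumes only the hourly run is counted (the post says updates happen "more than once a day", so the real number may be higher); `updates_between_crashes` is a hypothetical helper.

```python
UPDATES_PER_HOUR = 1  # assumption: only the hourly updater run is counted

def updates_between_crashes(days_between_crashes, updates_per_hour=UPDATES_PER_HOUR):
    """Number of update runs that complete without a crash between two hangs."""
    return days_between_crashes * 24 * updates_per_hour

low = updates_between_crashes(1.5)
high = updates_between_crashes(2)
print(f"{low:.0f} to {high:.0f} update runs per crash")  # prints "36 to 48 update runs per crash"
```

With roughly 36 to 48 clean runs for every hang, the update being the last logged entry before a crash is weak evidence on its own.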

        charliem

        In my experience, this is almost always hardware.  Trouble is, there's not a lot of diagnosing to be done except systematic replacement; I'd start with a new power supply.  (I've seen this exact symptom caused by a bad power supply twice: first time took a few months of intermittent failures to figure it out, second time was a day  :) )

        You can try running memtest and/or a cpu stressing utility (to look for temperature issues).  But most times with memory issues, you would still see an oops or something in the log.

          Mr. Jingles

          Thanks Charlie  ;D

          I memtested it only a couple of days ago, it was fine. The PSU for this mobo is very expensive, I'd rather not buy a new one if not necessary  :-[

          CPU stress testing is something I could try, but then I have to put the machine offline again, I was hoping to rule out other causes first. Isn't the great FreeBSD keeping logs of everything everywhere all the time? Even Windows does a lot of logging, so I would be expecting FreeBSD to do the same(?)

            charliem

            It sounds like you have a monitor and keyboard hooked up; I take it no console error messages appeared?

            Yes, of course there are logs ('cd /var/log; clog system.log | less' for example), but if the CPU is halted by safety circuits on the chip due to Vdd being out of range or over temperature (just for example), such logs never make it to the disk.  In any case, if the CPU stops, there's no notice given to the OS.

            After the system hangs, is there any response to the NumLock key?  Any response to pings?  I know Linux will flash the keyboard LEDs in Morse code to signal a kernel fault, but I don't think FreeBSD does so.
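One way to answer the ping question without sitting at the console is to probe the firewall from another machine on the LAN and record when it stops replying. A rough sketch, assuming a Unix-like `ping` that accepts `-c 1`; the address `192.0.2.1` is a placeholder, and `ping_once` and `watch` are hypothetical names.

```python
import subprocess
import time
from datetime import datetime

def ping_once(host, timeout_s=3):
    """Return True if a single echo request gets a reply within timeout_s."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # enforced by Python, so no platform-specific ping flag is needed
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0

def watch(host, probe=ping_once, interval_s=10, max_checks=None,
          sleep=time.sleep, log=print):
    """Probe host repeatedly and log every up/down transition with a timestamp."""
    was_up = None
    transitions = []
    checks = 0
    while max_checks is None or checks < max_checks:
        is_up = probe(host)
        if is_up != was_up:
            stamp = datetime.now().isoformat(timespec="seconds")
            log(f"{stamp} {host} is {'up' if is_up else 'DOWN'}")
            transitions.append((stamp, is_up))
            was_up = is_up
        checks += 1
        sleep(interval_s)
    return transitions

# Example (runs until interrupted): watch("192.0.2.1")
```

The timestamp of the first `DOWN` line narrows down when the hang occurred, which the firewall's own log cannot record once the box has stopped.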

            Sorry I have no other suggestions.  I had the same reluctance when I ran into this the first time, hence the two-month time frame to fix it …

              jasonlitka

              @Hollander:

              I memtested it only a couple of days ago, it was fine. The PSU for this mobo is very expensive, I'd rather not buy a new one if not necessary  :-[

              Running MemTest86+ (or similar) is only useful if you commit to it for 48-72 hours.  I've had machines that would run overnight without issue but would crumble if left to run over a weekend.

              The same goes for power supplies.  If your PSU is expensive because it's a high-wattage "gaming" PSU, well, I've had more of those fail than a standard Enermax or Antec.

              I can break anything.

                Mr. Jingles

                Thank you to both of you  ;D

                The PSU is not expensive because it is a high-wattage gaming PSU, au contraire. The problem is that it is some special sort of adapter required for this Intel mobo. And as soon as the word 'special' comes up, it appears to be expensive.

                My guess, but I am not sure, is that there are some package conflicts somewhere. I uninstalled Squid, Squidguard, Lightsquid (Squid wasn't behaving with Snort anyway), and some other packages I have already forgotten, and there has been no crash in 24 hours.

                I did manage to fetch one crash log that pfSense presented to me in the GUI when I logged in. Of course, being the eternal noob, I have no clue how to interpret this. Would any of you perhaps be able to see anything in there that might give a clue?

                Thank you for your help very much  ;D

                EDIT: sorry, I don't know where I saved the log  :-[ ( >:( )

                I will have to wait for the next crash. The only thing I actually wrote down is this:

                Fatal trap 12: page fault while in kernel mode
                cpuid = 0; apic id = 00
                fault virtual address   = 0x420
                fault code              = supervisor read data, page not present
                instruction pointer     = 0x20:0xf8023be83
                stack pointer           = 0x28:0xff80000fd5e0
                frame pointer           = 0x28:0xff80000fd610
                code segment            = base 0x0, limit 0xfffff, type 0x1b
                                        = DPL 0, pres 1, long 1, def32 0, gran 1
                processor eflags        = interrupt enabled, resume, IOPL = 0
                current process         = 0 (em0 que)

                But I am not sure if that is sufficient information(?)
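A hedged sketch of what can be read out of those hand-copied fields. Two of them are suggestive: a fault address of `0x420` lies inside the first memory page, which usually points to a NULL pointer being dereferenced at a small offset, and `em0 que` appears to be the kernel queue thread of the first `em` (Intel Ethernet) interface. Neither is conclusive without a backtrace. `parse_trap` and `summarize` are hypothetical helpers, not part of any pfSense or FreeBSD tool.

```python
import re

def parse_trap(text):
    """Split the 'key = value' lines of a fatal-trap report into a dict."""
    fields = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key:  # continuation lines such as the DPL line have no key; skip them
            fields[key] = value.strip()
    return fields

def summarize(fields):
    """Pull out the two fields that hint at where the fault happened."""
    addr = int(fields["fault virtual address"], 16)
    proc = re.search(r"\((.*?)\)", fields.get("current process", ""))
    return {
        "fault_address": addr,
        "near_null": addr < 4096,  # below one 4 KiB page
        "thread": proc.group(1) if proc else None,
    }

report = """\
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x420
fault code = supervisor read data, page not present
current process = 0 (em0 que)
"""
print(summarize(parse_trap(report)))
```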

                  Mr. Jingles

                  @charliem:

                  After the system hangs, is there any response to the NumLock key?  Any response to pings?  I know linux will flash the keyboard LEDs in morse code to signal a kernel fault, but I don't think freebsd does so.

                  I forgot to answer this, sorry  :(

                  No, it does nothing, and doesn't respond to anything at all.

                  I will go into the CLI and look at the logs. Are there particular keywords I should grep for?
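A possible starting point for the keyword question, as a sketch rather than an authoritative list. The keywords are assumptions about what a FreeBSD kernel may log before a crash, and after a hard hang none of them may be present at all. It expects the log as plain text, for example dumped with `clog /var/log/system.log > system.txt` (the `clog` command mentioned earlier in the thread; `system.txt` is a placeholder name).

```python
import re

# Assumed starting list; extend or trim it to match what the box actually logs.
KEYWORDS = ["panic", "fatal trap", "page fault", "watchdog timeout", "out of swap", "mca:"]
PATTERN = re.compile("|".join(re.escape(k) for k in KEYWORDS), re.IGNORECASE)

def suspicious_lines(lines):
    """Return the log lines that contain any of the keywords, case-insensitively."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]

# Example:
# with open("system.txt") as fh:
#     for hit in suspicious_lines(fh):
#         print(hit)
```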

                    BBcan177 Moderator

                    Google "Fatal trap 12: page fault while in kernel mode" and there are lots of people with that error. What kind of machine is it? Are you virtualizing this machine?

                    "Experience is something you don't get until just after you need it."

                    Website: http://pfBlockerNG.com
                    Twitter: @BBcan177  #pfBlockerNG
                    Reddit: https://www.reddit.com/r/pfBlockerNG/new/

                      Mr. Jingles

                      @BBcan177:

                      Google "Fatal trap 12: page fault while in kernel mode" and there are lots of people with that error. What kind of machine is it? Are you virtualizing this machine?

                      'Tis the first machine in my sig, BB; not virtualized  ;D

                      I don't think it was hardware; I uninstalled the packages mentioned before, and so far there have been no more hangs. I'll see what happens next.

                      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.