Netgate Discussion Forum

    Random? reboots, is there any way to know if it's software or hardware…

    Off-Topic & Non-Support Discussion
    13 Posts 5 Posters 2.9k Views
    • Knight

      Hi!

      My pfSense box has suddenly decided to reboot once in a while (3rd time in a few weeks now)…

      Is there any way to know if it is initiated by a software problem or a hardware one?

      I checked the logs but couldn't find anything, though maybe I did not look in the right place...

      Thank you and have a nice day!

      Nick

      • Derelict LAYER 8 Netgate

        Is it creating crash dumps?

        Chattanooga, Tennessee, USA
        A comprehensive network diagram is worth 10,000 words and 15 conference calls.
        DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
        Do Not Chat For Help! NO_WAN_EGRESS(TM)

        • Knight

          Hi and thank you for your reply!

          From what I understood from this page:

          https://doc.pfsense.org/index.php/Obtaining_Panic_Information_for_Developers

          I should have a way to submit it from the dashboard since I have a full install…

          The dashboard seems to have nothing more than usual so I apparently don't have a crash dump...

          I saw that I apparently had problems connecting to my ISP about 6-7 hours before it rebooted but everything had gone back to normal...

          Thank you and have a nice day!

          Nick

          • Knight

            Hi!

            I think I found what is causing those reboots…

            The CPU is overheating...

            Thing is, it doesn't quite make sense...

            It's an Atom 330 CPU which is passively cooled (and always has been stock from the factory).

            Why would it suddenly start to overheat like that???

            CPU usage seems fine so it doesn't appear to be overheating because it has to do much processing...

            Could anything on the software side cause that? I doubt it, but I cannot explain why it suddenly overheats when nothing cooling-wise has changed...

            Thank you and have a nice day!

            Nick

            • Derelict LAYER 8 Netgate

              What does the pfSense CPU graph show in Status > Monitoring?

              No matter what, your cooling seems insufficient.

              • doktornotor Banned

                @Knight:

                It's an Atom 330 CPU which is passively cooled (and always has been stock from the factory).
                Why would it suddenly start to overheat like that???

                • Harvy66

                  I also like to cover my computer with a protective anti-static layer of cigarette-smoke tar enhanced dust.

                  • Knight

                    Hi Derelict!

                    @Derelict:

                    What does the pfSense CPU graph show in Status > Monitoring?

                    Around 215 processes but the highest percentage I had there was around 5%…

                    @Derelict:

                    No matter what, your cooling seems insufficient.

                    That was my own fault, and it turned out not to be the problem…

                    My gut feeling is that this motherboard probably has some bad caps or something similar...

                    I tried to replace the power supply for the same reason and my pfSense box is still unstable...

                    It was my own fault because I had temporarily removed a 120 mm fan I had added (it is not supposed to be necessary, but I had added it just in case) and had forgotten about it...

                    That fan is not supposed to be there and I actually had to be creative to make it fit there and not touch anything...

                    Thank you very much for your help and have a nice day!

                    Season's Greetings!

                    Nick

                    • Knight

                      Hi doktornotor!

                      @doktornotor:

                      Actually there is barely any dust in the computer that is failing…

                      As I mentioned in the post I made just before this one, I was wrong about overheating being the cause...

                      My guess is bad caps...

                      Thank you, have a nice day and Season's Greetings!

                      Nick

                      • MasterX-BKC- Banned

                        It's entirely possible that the heatsink isn't properly mounted from the factory, or is just insufficient in design. What model is the box, and where is it from?

                        You could try pressing the heatsink down onto the processor, but not too hard, just enough to make sure the thermal paste is compacted nicely.

                        I've seen many a time where the heatsink is just sitting on the paste and not actually compacting it, because the mounting bracket/pins are loose-fitting and don't compress it at all.

                        • Knight

                          Hi MasterX-BKC!

                          @MasterX-BKC-:

                          It's entirely possible that the heatsink isn't properly mounted from the factory, or is just insufficient in design

                          I was actually wrong about the reboot being caused by overheating… The box was overheating because when it started having problems I opened it and temporarily removed a fan I had added...

                          That fan:

                          • is not actually supposed to be there
                          • is a pain to fit and keep working, because if I move it ever so slightly it touches something in the casing and stops spinning.

                          So, in retrospect, the case I used is absolute c...

                          @MasterX-BKC-:

                          What model is the box, and where is it from?

                          If you mean the motherboard, it's a Zotac IONITX-F-E I believe…

                          (Only the last two letters I am not sure of but the IONITX-F-E seems to be what I have...)

                          If you mean the case, I really don't know who made it...

                          @MasterX-BKC-:

                          You could try pressing the heatsink down onto the processor, but not too hard, just enough to make sure the thermal paste is compacted nicely.

                          I've seen many a time where the heatsink is just sitting on the paste and not actually compacting it, because the mounting bracket/pins are loose-fitting and don't compress it at all.

                          As far as I can tell, and it's the first time ever I have seen this, the heatsink is screwed in from the back side of the board using the same kind of screws as PCI/PCIe slots…

                          With proper airflow the box doesn't overheat, and it had proper airflow before I had to open it because it was getting unstable, so I think the heatsink is OK...

                          I thought I had found the cause of my problems when I saw it overheat but it was because of something I did after it started to be unstable…

                          Thank you and have a nice day!

                          Nick

                          • Knight

                            OK guys (and gals if there are any reading this…), I have some bad news...

                            I replaced the box with another new one and it did it again today…

                            :( :( :( :( :( :( :( :(

                            I reused 3 parts from the old one...

                            • the NIC, an Intel I340-T4 (those things are kinda costly...)
                            • the optical drive (I put one in, but it will probably never see much use since I installed from a USB key).
                            • the SSD... SMART isn't reporting any problem with it, and since my problem seemed more power-supply or motherboard related, I decided to reuse it...

                            Today I was greeted with the same screen as last time...

                            The box rebooted and seems to freeze while trying to do a PXE boot...

                            It's like, when it gets to choose between

                            F1 pfSense
                            F6 PXE boot

                            It chooses to do F6 and as far as I can tell only after it rebooted by itself…

                            Why does it do that, and why did it reboot in the first place?

                            What do you guys think I should try next?

                            I doubt the optical drive is to blame so that leaves the Intel I340-T4 and the SSD...
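                            The elimination above can be written out as a tiny sketch. It assumes the fault travelled with one of the reused parts, and the "judged unlikely" set is only my own guess, not something I measured; the helper name is made up for illustration.

```python
# Parts carried over from the old box to the new one (taken from this post).
reused = {"Intel I340-T4 NIC", "optical drive", "SSD"}

# Parts judged unlikely to cause a reboot. This is an assumption, not a test result.
judged_unlikely = {"optical drive"}


def remaining_suspects(reused_parts, ruled_out):
    """Return the reused parts that have not been ruled out, in sorted order."""
    return sorted(reused_parts - ruled_out)


suspects = remaining_suspects(reused, judged_unlikely)
print(suspects)
```

                            Swapping one remaining suspect at a time and waiting for the next reboot narrows the list further.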

                            Apparently, if the number of MBUFs is too low with that card it could cause problems, so I followed the recommendations I found here https://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards#Intel_igb.284.29_and_em.284.29_Cards but it looks like this is no longer necessary with the more recent versions, as I ended up with slightly fewer MBUFs after doing it (before: 1,009,342, after: 1,000,000).
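                            To make the before/after comparison concrete, here is a minimal sketch. Only the two MBUF figures come from this post; the function name is hypothetical, and it simply compares two numbers rather than reading anything from the system.

```python
def tunable_effect(current_limit: int, proposed_limit: int) -> str:
    """Describe what replacing current_limit with proposed_limit would do."""
    delta = proposed_limit - current_limit
    if delta > 0:
        return f"raises the limit by {delta:,}"
    if delta < 0:
        return f"lowers the limit by {-delta:,}"
    return "leaves the limit unchanged"


before = 1_009_342  # MBUF figure reported before applying the recommendation
after = 1_000_000   # MBUF figure reported after applying it

print(tunable_effect(before, after))  # lowers the limit by 9,342
```

                            In other words, the recommended value was slightly below what the system already had, so checking the existing figure first would have shown the change was not needed.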

                            Could the card's hardware itself, and not its software, cause a reboot? I am not so sure of that.

                            And there's the SSD... Maybe it is starting to go bad but shouldn't I get errors in the SMART reports and shouldn't my box behave even more strangely?

                            Any ideas?

                            Thank you and have a nice day!

                            Nick

                            • Knight

                              Hi!

                              I changed the SSD about 10 days ago…

                              I didn't touch the optical drive (since it's quite improbable that it is the cause) nor the NIC (since those things are kinda costly)...

                              Last time it took close to two weeks before it became unstable again, IIRC, so I should know soon enough if the SSD was truly to blame...

                              Thank you and have a nice day!

                              Nick

                              Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.