Netgate Discussion Forum

    Netgate 7100 freezes when temperature above 50°C

    Official Netgate® Hardware
    26 Posts 5 Posters 2.7k Views
    • geek_at

      Update: With the script of @stephenw10 I was able to lower the temperature to levels that would stop the crashes. So this topic should be resolved. Thanks for the help

      • johnpoz LAYER 8 Global Moderator @geek_at

        @geek_at this got me curious about what my temps were at.. I used to run around 50ish, but after the update to 23.09 there seems to have been a drastic drop in normal running temp

        drop.jpg

        Where I see the drastic drop seems to correspond to when 23.09 came out. It's possible I stopped using some package or something? But I don't recall turning off anything, or making any sort of significant change that could account for the drastic drop in overall temp.. I looked at the number of processes running and the CPU use over the same period and don't see any changes there that could account for it.

        And the temp in the house or my office where pfsense sits sure didn't change all that much..

        Curious as to why the sudden drop off now - maybe the reading of the sensors changed? My sg4860 doesn't even have any fans that I am aware of - sure can't hear anything.. So I don't think it could have been a change to how the fans run..

        Curious - I do like the lower temps..

        Also curious as to why it was running hot there for a while.. Either way, even when it was running on the warm side of 50+ I never had any issues with it hanging or rebooting or anything - the thing runs, and the only time it reboots is when I update it to a new version of pfsense.

        50ish seems a bit low to be causing any sort of thermal problem..

        An intelligent man is sometimes forced to be drunk to spend time with his fools
        If you get confused: Listen to the Music Play
        Please don't Chat/PM me for help, unless mod related
        SG-4860 24.11 | Lab VMs 2.7.2, 24.11

        • stephenw10 Netgate Administrator @johnpoz

          @johnpoz said in Netgate 7100 freezes when temperature above 50°C:

          50ish seems a bit low to be causing any sort of thermal problem..

          It does. Turning up the fans seems to confirm it's a thermal issue but I wonder if it's actually some other component that's overheating.

          I'm not sure what might have affected a 4860 like that. Do the CPU graphs show a load reduction?

          • johnpoz LAYER 8 Global Moderator @stephenw10

            @stephenw10 no, I looked: no reduction in CPU use, nor any change in running processes that would reflect such a drop..

            cpu.jpg

            • stephenw10 Netgate Administrator

              Hmm, interesting. I'm not aware of anything that would affect that in the power/thermal management.

              • geek_at @johnpoz

                @johnpoz that's actually a very good point. I too was still on 22 and just upgraded to 23. Let's see if that changes anything without the fan script

                • geek_at

                  Update: after the update to the latest version the 7100 didn't find any boot partition anymore.. so I'm done with this and I'm going to virtualize from now on.

                  55e9100b-daa1-4a88-8644-f1f19fb3a585-image.png

                  • geek_at @geek_at

                    @geek_at Okay one more update.

                    I opened the 7100 up and found out that the heatsinks were installed in the wrong direction

                    09c19fba-a922-46dc-bd82-e2c340981356-image.png

                    They should align with the airflow but are rotated 90°

                    This doesn't seem to be normal, as in the official documentation of the 7100 the heatsinks are aligned correctly

                    d1ce6bda-b7f7-40be-92fd-7001cc162ab2-image.png

                    https://docs.netgate.com/pfsense/en/latest/solutions/xg-7100-1u/m-2-sata-installation.html

                    It seems there was a mistake in the production process around the time my clients bought their Firewalls. This explains the overheating and random crashes my clients were experiencing.

                    • stephenw10 Netgate Administrator

                      It still seems as though something else is happening there because 50C really isn't that hot for that CPU. 🤔

                      But if that lowers the temps and prevents it....

                      • johnpoz LAYER 8 Global Moderator @stephenw10

                        @stephenw10 but temps are only known where the measurements are taken. I would think it possible, without proper cooling, that while the measurement point might only show 50, it's hotter at some other spot.. And maybe that is where the problem is?

                        It is a good finding.. Is this a known issue where the heat sink seems to be installed in the wrong orientation?

                        A simple google for heat sinks and air flow does point to the wrong orientation being problematic for proper cooling.

                        • stephenw10 Netgate Administrator

                          Well, the heatsink only cools the CPU in the 7100. The measurement uses the on-die sensors in the CPU package. Having the heatsink oriented incorrectly causes the CPU to run hotter and hence the fans to run faster. It probably actually lowers the temps of everything else.

                          • geek_at

                            Okay I'm back with new thermal paste and correctly aligned heatsinks.

                            I ran a CPU benchmark for 45 minutes. It started out with a CPU temp of 40°C

                            For the first half it ran on default settings at 100% CPU. Temp went up to 58°C without crashes. So the orientation did improve things greatly, which also adds to my suspicion that the problem in the production process was to blame for my freezing problems.

                            I also tested running the "smart fan control" script which reduced the temperatures back to ~42°C even at 100% load. Here's the graph:

                            812d27ac-e3f6-4564-bb22-38db3467784e-image.png
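
For anyone wanting to repeat this kind of before/after comparison, here is a minimal Python sketch for summarising temperature readings taken during a load test. It assumes readings arrive as text in a form like `58.0C` (for example the kind of string a FreeBSD `sysctl` temperature node tends to print); that format, and the helper names `parse_temp_c` and `summarize`, are assumptions for illustration and not something taken from this thread.

```python
import re

# Matches a number followed by "C", e.g. "58.0C" or "42 C" (assumed format).
_TEMP_RE = re.compile(r"(-?\d+(?:\.\d+)?)\s*C\b")


def parse_temp_c(text):
    """Extract a temperature in degrees Celsius from a reading string."""
    match = _TEMP_RE.search(text)
    if match is None:
        raise ValueError(f"no temperature found in {text!r}")
    return float(match.group(1))


def summarize(readings):
    """Return the starting temperature, the peak, and the rise between them."""
    if not readings:
        raise ValueError("need at least one reading")
    temps = [parse_temp_c(r) for r in readings]
    return {"start": temps[0], "peak": max(temps), "rise": max(temps) - temps[0]}
```

Feeding it one reading per minute from a run with and without the fan script would give two small summaries that can be compared directly instead of reading values off a graph.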

                            I will check all remaining 3 firewalls of my clients to see if the heatsinks are aligned wrong there too

                            • stephenw10 Netgate Administrator

                              I run that script here and have never seen any issues with it. However, there is a risk that if it crashes or is otherwise killed for any reason the fans will just remain at whatever speed they are running. That means if it was under very light load at the time, the fan speed may then be insufficient at higher loads.
                              As an alternative we do have a script that resets the lookup tables in the fan controller but leaves the controller in charge, independent of the CPU. It's not as good as actually maintaining the temperature though, since it still relies on the board sensors.
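
To illustrate the failure mode described above, here is a hedged Python sketch of a control loop that forces the fans to full speed whenever it exits, whether normally or through an error. This is not the script discussed in this thread; `run_fan_loop`, the callables passed to it, and the `FULL_SPEED` value of 255 are all hypothetical names and assumptions.

```python
FULL_SPEED = 255  # assumed maximum duty value for an 8-bit fan controller


def run_fan_loop(read_temp, set_duty, duty_for, max_steps):
    """Run a bounded control loop and always leave the fans at full speed.

    read_temp: callable returning the current temperature in Celsius
    set_duty:  callable that applies a duty value to the fans
    duty_for:  callable mapping a temperature to a duty value
    """
    try:
        for _ in range(max_steps):
            set_duty(duty_for(read_temp()))
    finally:
        # Runs on normal exit and on exceptions, so the fans are never
        # left at a low speed chosen under light load.
        set_duty(FULL_SPEED)
```

Note that a `finally` block only covers exits the interpreter can see. If the process is killed outright, nothing runs, which is why leaving the hardware controller in charge is the more robust arrangement.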

                              Do you have a temperature value for the CPU without the script running? Just comparing the before and after remounting the heatsink?

                              Steve

                              • geek_at @stephenw10

                                @stephenw10 In my last screenshot, the part where the temperature rises was without the script.

                                I started the script between 19:25 and 19:33 and you can see it working and the temperature falling. The script works perfectly (well, except for the errors you get when the script tries to spin up the fans over a value of 256)
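
The out-of-range error mentioned above is the kind of thing a clamp avoids. A minimal Python sketch, assuming the fan controller takes an 8-bit duty value from 0 to 255 (the thread only says that values over 256 produce errors); the function name `duty_for_temp` and all thresholds are made up for illustration:

```python
def duty_for_temp(temp_c, low_c=40.0, high_c=60.0, min_duty=64, max_duty=255):
    """Map a temperature to a fan duty value, clamped to [min_duty, max_duty].

    Below low_c the fans stay at min_duty; at or above high_c they run at
    max_duty; in between the duty rises linearly.
    """
    if high_c <= low_c:
        raise ValueError("high_c must be greater than low_c")
    fraction = (temp_c - low_c) / (high_c - low_c)
    raw = min_duty + fraction * (max_duty - min_duty)
    # Clamp so the controller is never handed a value outside its range.
    return max(min_duty, min(max_duty, round(raw)))
```

With the clamp in place a very hot reading simply pins the fans at the maximum instead of producing a value the controller rejects.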

                                I don't really have a "before" benchmark with CPU load but I will try to do it at another customer's location soon

                                • stephenw10 Netgate Administrator

                                  Ok great. I'd guess it's not dramatically different. Those numbers seem to indicate that.

                                  • geek_at

                                    I wanted to give an update on the matter.

                                    I have opened and checked all of the Netgate boxes of my clients and my suspicions were correct. All clients who have experienced outages and random crashes indeed had the heatsinks mounted in the wrong direction (against airflow), and all clients who had no problems had them in the correct orientation (same as in the pictures of the official documentation).

                                    After fixing the heatsinks I had no more crashes, even with forced heat and burn-in tests. So this really was the cause of the crashes.

                                    @stephenw10 please talk with your QA people about this

                                    Best wishes from Austria

                                    • stephenw10 Netgate Administrator

                                      I will do. Thanks for checking that.

                                      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.