Topton N100 Reporting 402 MHz
-
One remaining quirk that would be nice but not necessary to solve: any idea why I would have (seemingly accurate and changing) core temps reported on the dashboard, but nothing at all for Thermal Sensors under Status > Monitoring?
EDIT: Found this post and will try what was described there and report back.
-
@TheNarc I tried deleting the file from that thread but it just removed my sensor data completely. Was not recreated on reboot either. Instead I used the delete data button which deleted all data but also recreated the sensor data which now shows all sensors!
In other news, I ran memtest86+ today and it passed on the first run but failed on the second. With no fan running, temperatures climbed very slowly throughout the first run, up to ~60-63C. I didn't see the exact temperature at the moment the failure was recorded; it was around 60 degrees when I noticed the error, but there was also a recorded maximum of 73 degrees. I removed the back of the case and touched the RAM, but it was too hot to hold for long without burning myself.
I have now ordered a thin SODIMM RAM heatsink, but I'm doubtful it will help all that much if the RAM and NVMe stick get cooked in this little oven with no air exchange. Therefore I also ordered a 120mm adjustable-speed USB fan which I could just place on top of the case. I would prefer to run it fanless though, since that looks much nicer and requires less power. As a temporary measure I have increased my board's default Temperature Activation Offset from 25 to 53, which I think should activate the Intel "Thermal Control Circuit" (TCC) at TjMax - offset. TjMax should be 105 for the N100, so with my offset it should activate at 105 - 53 = 52 degrees. It seems to be working, since I haven't really seen a temperature above ~53 degrees yet (Intel mentions some overshoot will happen). Intel also says, though, that a system's thermal constraint should really be controlled by the PL1 setting and not the TCC offset. Probably much more performance is lost by the actions the TCC has to take to limit the temperature than by a power limit from PL1.
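To make the offset arithmetic easy to re-check for other settings, here is a minimal Python sketch. It assumes the relationship described above (activation temperature = TjMax - offset) and the TjMax of 105 quoted in this post; the function name is made up for illustration and is not part of any BIOS or Intel tool.

```python
def tcc_activation_temp(tj_max_c: int, offset_c: int) -> int:
    """Temperature (deg C) at which the TCC would start throttling: TjMax minus the offset."""
    if offset_c < 0 or offset_c > tj_max_c:
        raise ValueError("offset must be between 0 and TjMax")
    return tj_max_c - offset_c


TJ_MAX_N100 = 105  # assumption: the TjMax value quoted in the post, not read from hardware

# Board default (25) versus the temporary setting (53).
for offset in (25, 53):
    temp = tcc_activation_temp(TJ_MAX_N100, offset)
    print(f"offset {offset} -> TCC should activate near {temp} C")
```

Some overshoot above the computed value is expected, so treat the result as the point where throttling starts rather than a hard ceiling.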
It's a bit sad to have to limit the CPU so heavily because of the other components in the box, since the CPU itself is fine with much higher temperatures than this. The much higher temperature offset I'm running now while waiting for the additional parts to arrive seems to have lowered my OpenSSL test performance by about 20-25% when running fanless. I tried with my old fan as well, and performance was a bit better but still much worse than with the original settings; I don't remember the exact numbers. For now I won't do further testing (I'm not even sure the limit I set makes the system fully stable), because it seems pointless to spend time testing a temporary solution. The final solution will probably be either the SODIMM heatsink plus the custom BIOS with a lower PL1 setting and a less restrictive TCC offset, or a slightly less restrictive PL1 and TCC offset with the fan (maybe just the original settings).
-
@AnonymousRetard Yeah not sure what the deal with the RRD was but I saw the same thing. Deleted all the data, and then the thermal sensors came back. So far so good!
Sorry to hear about your RAM test results. I'm not an expert, but I'd be a bit surprised if 63C was too hot. Even 73 doesn't seem outrageous. Not ideal, for sure, but I wonder if it's just a bad stick as opposed to a heat-induced failure. I guess you'd need to run a control test in a more temperature-controlled environment to be certain. Best of luck with your heatsink approach. It is annoying that these little boxes don't make it easier to add a fan if you're so inclined. The totally passive cooling is attractive, but only when well-designed, which these are not so much haha. The one I deployed for my family is going to be in a basement, so I'm hoping it will be alright without needing to add a fan to it, but time will tell!
-
@TheNarc I know all components get a lot less stable when overclocked, which is something I have done a lot with regular desktop systems. Both modern CPUs and GPUs automatically downclock to maintain stability when temps go higher because of this, but RAM sticks don't. Now, this RAM stick shouldn't be overclocked, but there seem to be pretty much no settings regarding RAM in the BIOS, not even in your unlocked one. Another possible solution would perhaps be to downclock the RAM, if that were possible, and/or give it a slightly lower voltage. But it really should be running at factory settings. I haven't been able to find any specification of what ambient temperatures it should tolerate at stock settings. There definitely won't be any room for overclocking the RAM at these ambient temperatures, though. Unfortunately, since there's no working ambient temperature sensor in the box and no temperature sensor I've been able to read from the RAM stick, I don't fully know what either temperature is. I feel like the ambient temperature in the box will probably settle not much lower than the CPU temps, since the box is very small and badly ventilated and the big CPU heatsink will probably almost equalize with the air temperature inside the box...
All I know is the RAM stick after it had failed was so hot that it burned my fingers by just touching the backside of its PCB for ~2 seconds, which I don't think is very good. I wouldn't be surprised if the RAM without any heatsink becomes 20-40 degrees higher than the ambient temperatures around it when stressed, so if the CPU has made the ambient temperature close to 70 I wouldn't be surprised if the RAM starts approaching or even exceeding 100C... But yeah I'm making a lot of guesses here. I don't really know what normal ambient temperatures in a laptop are either, I guess SODIMM ram sticks should be expected to be installed in laptop systems.
I'll have to continue testing once I have more cooling solutions... If it turns out to be unstable even with much better temps I guess I'll just order another RAM stick instead and try with that. I'll post updates here once I have done more testing but it'll take a while before everything arrives.
-
@AnonymousRetard An update: My system has now been stable for about 2 weeks with no crashes, with a speed-adjustable 120mm USB fan on top of the case running on the lowest setting. This is a very long uptime compared to what I could get before without cooling, so it seems like the system can be made stable with a better cooling solution.
Today the RAM heatsinks arrived but they are just super-thin copper/graphene films and I don't have very high hopes that they'll do much. But I am now running memtest86+ again without any fan and with the heatsinks.
This test is now being run with a temperature activation offset of 37 instead of the 53 I mentioned earlier because with 53 and no fan I actually got another type of problem where my internet connection died after a while and never came back until I rebooted pfSense. There was some strange message in the kernel logs (dmesg) about some buffer or something being full but I don't fully remember what it said. But I googled it and there was some recommendation to increase the size of this buffer but I already had it quite large (above the default) so I didn't think it was appropriate to increase even more.
I think an offset of 53 should mean that the CPU will try to keep temperatures below 52C (105-53) by downclocking and eventually turning off clocks completely. That might not really be doable without any active cooling, and the extreme measures the CPU takes to try to stay below that temp might trigger some strange FreeBSD bug or something... (In my mind a full buffer should be a recoverable problem, not requiring a reboot to fix.) The new setting of 37 should make the CPU try to stay below 68C, which should be more reasonable with only passive cooling.
I will update the thread probably one last time when I feel like I have a final stable solution. Preferably one without a fan but if this test fails that's probably what I will end up with.
-
Good afternoon all, this week I received my equipment: a CWWK-N100-4L-01 (LH-N100-4L-V2). I came to this thread while searching because I have observed the same problem with the CPU, and in its current state it does not show more than 2GHz when I stress it.
During the initial installation (the first bare-metal one) I had problems trying to set up two NVMe drives in RAID. The installation failed, and when I tried to manage the disks with other utilities to remove the RAID swap that I had created and zero the disk, the computer crashed because it was very hot. I gave up, removed the adapter that came with it for placing the second M.2 in the port for the Wi-Fi card, and installed on a single NVMe with a cooler. I was afraid of burning it.
The S.M.A.R.T. data shows no problems and the system seems to be over 50°C. The CPU at idle is around 40°C. I expected a lower temperature, since the system has some packages installed but has no load. I was already scared when I read about burned memory while trying to solve the problem, and I stopped touching the BIOS since I don't have much knowledge.
I didn't initially enable PowerD in pfSense, so that it would use the more modern Speed Shift technology, but when I saw this I tried all the combinations... I enabled C-States, I disabled SpeedStep, etc... I loaded the default values again and nothing has worked.
The device does not appear on their website, I cannot find the manual, I do not know which BIOS corresponds to it, the ftp is in Chinese...
Does anyone reading the post have the same equipment as me, or any clue as to what could be happening? I think they have to be the default values provided by the BIOS, but I don't know what they are. I bought it without disk or memory (my memory and disks are Crucial and it recognizes them without problems) but it would bother me if the BIOS values are not set by default so that the CPU works correctly.
-
@AnonymousRetard Thanks for the update and glad to hear it's going well for you. For reference, although I'm not sure how useful it will be to you, my machine has been stable with CPU temps generally being between about 40 and 45 with brief maximums up to about 53. Here's the graph over the last week:
I'm not as sure what to make of the nvme temp, because the SMART data reports "Temperature" as 45 but "Temperature Sensor 1" as 63, which is a rather large difference. But also it was a $15 drive, so I'm not inclined to worry about it too much now and figure I'll just see whether it remains stable (and if it does fail, after how long).
-
@zoltar I might advise running the same command that I'd been using to test my N100 machine, to see if it really seems to be getting "underclocked":
openssl speed -elapsed -evp aes-256-cbc
I've posted my own results in this thread that you can compare to. I'm not getting any useful hits on the two model numbers you mentioned, so I can't be sure whether we have the same hardware or not. Mine was Topton branded, not CWWK, but I know a lot of these are just rebrands of pretty much identical hardware. I certainly wouldn't want to advise you to flash a BIOS without being 100% certain though.
As you can see from the other post I just made, my CPU temps tend toward the mid-40s most of the time. I think it's fair to say that 50 is not bad at all either for passive cooling. I wouldn't expect that to be causing crashes. That said, there have been reports of varying mechanical quality of these machines, with some exhibiting gaps between the surface of the processor die and the heatsink. If you're convinced that it was crashing due to overheating, and you're comfortable doing so (and have replacement thermal compound), you might disassemble it and check for that.
-
Yes, I have Arctic paste, but the temperatures do not worry me now. What worries me is getting the processor unlocked so that its frequency starts to rise when the device has a workload.
My output:
-
@zoltar Those results look to be pretty much in line with mine after I loaded the unlocked BIOS and played around with the PL1 and PL2 settings. Before that I would get about half those values. So I question whether you're truly throttled. Or if you are, it's not by much. The reported CPU speed indicator has never seemed to be especially accurate, so I'm not sure how much stock I'd place in it. Do you have other reasons to believe your CPU speed is being limited?
-
False alarm. I launched a simple iperf and the frequency rose to 2.9GHz. The temperature is relatively high for the low workload, but I can possibly improve it by replacing the thermal compound with my Arctic Silver as you recommend.
Edit: Uploaded screenshots to the folder in Dropbox.
For now I will continue with the installation of packages and I will start migrating the VLANs one by one to see how it evolves.
If I see poor performance I will start playing with Speed Shift, and failing that with the motherboard values.
I leave you the configuration of my system and the default BIOS configuration of my device in case it can help you (link text).
If you need a specific capture, just ask and I will upload it to the same folder.
-
@TheNarc said in Topton N100 Reporting 402 MHz:
@zoltar Those results look to be pretty much in line with mine after I loaded the unlocked BIOS and played around with the PL1 and PL2 settings. Before that I would get about half those values. So I question whether you're truly throttled. Or if you are, it's not by much. The reported CPU speed indicator has never seemed to be especially accurate, so I'm not sure how much stock I'd place in it. Do you have other reasons to believe your CPU speed is being limited?
No, I had only noticed the speed reported in the GUI. The max still seems frozen at 806, but as you will see in the screenshots, when it is working it goes up to 2.9GHz. I hadn't read your reply before posting again.
If you need anything, ask for it. Thanks for the recommendation, and for everything you are sharing too. When I am more comfortable with the system I will surely start playing.
-
@AnonymousRetard Not sure whether it might be of interest to you as well, but in case it would . . . I haven't really been concerned about my temperatures, because they were mostly in the mid-40s spiking to just below 60 under load, but nevertheless decided that for $8 I'd get two USB-powered fans and strap them on with cable ties, one to the top and one to the bottom. Here are the fans I got. I was shocked at the difference, it dropped the temps by about 20C.
Edited to add: I was only able to put one on the bottom because I've got my router on a wire rack, though I suspect the one on the top is doing the heavy lifting anyway.
-
@TheNarc Thanks for all the information. Yeah, I also got a fan, this one, a bit larger, and with three different speed settings. But only one. I was also planning to attach it with cable ties if needed, but I would still prefer to make do with just passive cooling. My memtest86+ experiment with just the added SODIMM heatsinks unfortunately failed as well after about 1.5 hours. Again the RAM was too hot to touch.
Currently though I have 10.5 days of uptime with no crashes and no active cooling. What I changed is that I left the SODIMM heatsinks on (even though they probably change next to nothing), I changed the TCC offset to 40 (which causes the CPU to throttle at temps above 65C), and I lowered the SpeedStep setting from two steps toward performance to two steps toward energy efficiency.
The box is probably not 100% stable still, it could probably crash during prolonged stress-testing, but hopefully during normal usage it will be stable enough for what I want to use it for.
As for my NVMe drive, I got one with the box and it seems to be some cheap Chinese brand: BKKJ nvme 128G. It seems the current temperature it reports through SMART data is broken (it always says 40C). But it does have other historical thermal information which is probably correct:
Warning Comp. Temp. Threshold: 83 Celsius
Critical Comp. Temp. Threshold: 85 Celsius
Warning Comp. Temperature Time: 12
Critical Comp. Temperature Time: 1
Thermal Temp. 1 Transition Count: 51
Thermal Temp. 2 Transition Count: 1
Thermal Temp. 1 Total Time: 5488
Thermal Temp. 2 Total Time: 12
I'm not sure what the unit is for the time but obviously it thinks it has spent some amount of time above 83C (warning temp) and a small amount of time over 85C (critical temp). Probably the SODIMM RAM increases to similar temps when the ambient temp in the box becomes really high. For now I think I'll only use the fan if the box keeps crashing during my normal usage or maybe during critical heavy operations such as full system upgrades. If I do use the fan performance does increase a bit as the CPU doesn't have to limit itself because of thermals but for my use I don't really need every last bit of the possible performance.
-
@TheNarc
Just one question:
In the graphs, where do you get cpu_0..3 from?
I only get tz0, and that always stays at 27.9°C.
(Btw, on Topton N5105.)
Found the reference to the posting above.
OK, a simple reset of monitor data did the trick.
-
Hello everyone,
Unfortunately, I'm afraid I have very similar symptoms.
Initially I was happy to see this thread, but as time goes by I start to doubt whether there is a solution.
I have one of these Topton Intel N100 devices, 4 x 2.5 Gbps Intel 226 NIC
My bios is BK-1264NP Ver: 1.5, 09/28/2023 17:23:35
Unfortunately, I don't see the "Performance temperature" settings in the BIOS :-(
However, I only noticed the possibility of changing "CPU Flex Ratio Settings"
The default value of "Disabled" can only be changed to "7". Then pfSense shows me the max speed is 691 MHz.
I have experimented with "7" and "8" (max speed is 800 MHz), but in all cases I still see the issue.
The issue is that in idle mode, pfSense reports quite high values for "Current CPU speed".
However, if I start an iperf3 test, then the speed decreases from 2 GHz to the mentioned 402 MHz.
I can also observe this in the iperf3 transfer speed, which decreases from 2.5 Gbps to ~1 Gbps.
This is weird to me. To summarize the issue:
- when there is no traffic, the current CPU speed is 2 GHz
- when I start iperf3, which gives my pfSense a lot of work to do :-), the CPU speed decreases to 402 MHz
-
@roxy Have you run the openssl benchmark referenced in this thread? That may help to determine whether the N100 is being throttled. Do you see CPU usage hitting 100% when your transfer speed drops?
I only got access to the power level settings in the BIOS that seemed to make the difference for me by loading a modded BIOS (also referenced in this thread). My whole LAN is only 1Gbps though so I can't run your same 2.5Gbps test as a point of comparison.
-
Yes, and the benchmark result depends on how often I run it.
If pfSense is in idle mode, then I get very good results :-)
You have chosen to measure elapsed time instead of user CPU time.
Doing AES-256-CBC for 3s on 16 size blocks: 106710142 AES-256-CBC's in 3.00s
Doing AES-256-CBC for 3s on 64 size blocks: 35718014 AES-256-CBC's in 3.00s
Doing AES-256-CBC for 3s on 256 size blocks: 9240105 AES-256-CBC's in 3.00s
Doing AES-256-CBC for 3s on 1024 size blocks: 2329749 AES-256-CBC's in 3.00s
Doing AES-256-CBC for 3s on 8192 size blocks: 293649 AES-256-CBC's in 3.02s
Doing AES-256-CBC for 3s on 16384 size blocks: 146186 AES-256-CBC's in 3.00s
version: 3.0.12
built on: reproducible build, date unspecified
options: bn(64,64)
compiler: clang
CPUINFO: OPENSSL_ia32cap=0x7ffaf3bfffebffff:0x98c007bc239ca7eb
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
AES-256-CBC    569120.76k   761984.30k   788488.96k   795220.99k   795641.59k   798370.47k
But when I run the benchmark several times, or if I start iperf3, the CPU gets slower and slower (~503 MHz this time), and then I see the following results.
RESULTS when CPU speed is 503 MHz
You have chosen to measure elapsed time instead of user CPU time.
Doing AES-256-CBC for 3s on 16 size blocks: 20752586 AES-256-CBC's in 3.61s
Doing AES-256-CBC for 3s on 64 size blocks: 11339018 AES-256-CBC's in 3.81s
Doing AES-256-CBC for 3s on 256 size blocks: 2117270 AES-256-CBC's in 3.00s
Doing AES-256-CBC for 3s on 1024 size blocks: 474071 AES-256-CBC's in 3.20s
Doing AES-256-CBC for 3s on 8192 size blocks: 62937 AES-256-CBC's in 3.00s
Doing AES-256-CBC for 3s on 16384 size blocks: 22461 AES-256-CBC's in 3.00s
version: 3.0.12
built on: reproducible build, date unspecified
options: bn(64,64)
compiler: clang
CPUINFO: OPENSSL_ia32cap=0x7ffaf3bfffebffff:0x98c007bc239ca7eb
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
AES-256-CBC     91994.15k   190346.79k   180673.71k   151925.27k   171859.97k   122667.01k
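For anyone who wants to compare two openssl speed runs without eyeballing the columns, here is a small Python sketch. It only parses the final AES-256-CBC summary row of each run posted above; the helper name is made up for illustration, and the block-size list is assumed to match the column headers in the output.

```python
def parse_speed_row(row: str) -> list:
    """Parse an 'AES-256-CBC 569120.76k ...' summary row into a list of kB/s floats."""
    fields = row.split()
    # fields[0] is the cipher name; the remaining fields look like '569120.76k'.
    return [float(field.rstrip("k")) for field in fields[1:]]


# Column headers from the openssl output (block sizes in bytes).
BLOCK_SIZES = [16, 64, 256, 1024, 8192, 16384]

# Summary rows copied from the two runs in this post.
idle_row = "AES-256-CBC 569120.76k 761984.30k 788488.96k 795220.99k 795641.59k 798370.47k"
slow_row = "AES-256-CBC 91994.15k 190346.79k 180673.71k 151925.27k 171859.97k 122667.01k"

idle = parse_speed_row(idle_row)
slow = parse_speed_row(slow_row)

for size, fast, throttled in zip(BLOCK_SIZES, idle, slow):
    print(f"{size:>6} bytes: {throttled / fast:.0%} of idle throughput")
```

With these two rows the slow run comes out at roughly 15 to 25 percent of the idle run depending on block size, which is in the same ballpark as the drop in reported clock speed.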
-
That seems like you might be hitting thermal throttling. Check the per core temperatures.
-
I see the following thermal sensors output:
hw.acpi.thermal.tz0.temperature: 27.9C
dev.cpu.3.temperature: 44.0C
dev.cpu.2.temperature: 43.0C
dev.cpu.1.temperature: 44.0C
dev.cpu.0.temperature: 45.0C
However, after a few minutes of benchmarking, the CPU speed increased to 600 - 800 MHz, and the transfer speed is about 1.9 Gbps.
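If you want to log these readings while a benchmark runs, a sketch like the following could turn the sensor lines into numbers. It assumes the "name: value C" format shown in the output above; the regular expression and function name are illustrative, and the sample text is simply the output pasted from this post rather than a live reading.

```python
import re

# Matches lines such as 'dev.cpu.0.temperature: 45.0C'.
SENSOR_LINE = re.compile(r"^(?P<name>[\w.]+):\s*(?P<temp>-?\d+(?:\.\d+)?)C$")


def parse_temperatures(text: str) -> dict:
    """Map sensor names to temperatures in degrees Celsius, skipping unrelated lines."""
    readings = {}
    for line in text.splitlines():
        match = SENSOR_LINE.match(line.strip())
        if match:
            readings[match.group("name")] = float(match.group("temp"))
    return readings


sample = """\
hw.acpi.thermal.tz0.temperature: 27.9C
dev.cpu.3.temperature: 44.0C
dev.cpu.2.temperature: 43.0C
dev.cpu.1.temperature: 44.0C
dev.cpu.0.temperature: 45.0C
"""

readings = parse_temperatures(sample)
# Keep only the per-core sensors; tz0 is a separate ACPI thermal zone.
cores = {name: temp for name, temp in readings.items() if name.startswith("dev.cpu.")}
hottest = max(cores, key=cores.get)
print(f"hottest core sensor: {hottest} at {cores[hottest]} C")
```

Sampling this every few seconds during an iperf3 or openssl run would show whether the per-core temperatures actually climb when the clock speed drops.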