Topton N100 Reporting 402 MHz
-
@TheNarc No problem! I have not run iperf before, but apparently iperf3 was already installed on my Gentoo machine in the LAN, so I installed it on pfSense as well. Here's what I get when pfSense acts as the server:
iperf3 -c 192.168.1.1 -p 5201
Connecting to host 192.168.1.1, port 5201
[  5] local 192.168.1.11 port 49202 connected to 192.168.1.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   114 MBytes   951 Mbits/sec    0    502 KBytes
[  5]   1.00-2.00   sec   108 MBytes   909 Mbits/sec    0    502 KBytes
[  5]   2.00-3.00   sec   104 MBytes   876 Mbits/sec    0    502 KBytes
[  5]   3.00-4.00   sec   112 MBytes   935 Mbits/sec    0    502 KBytes
[  5]   4.00-5.00   sec   113 MBytes   947 Mbits/sec    0    502 KBytes
[  5]   5.00-6.00   sec   112 MBytes   943 Mbits/sec    0    502 KBytes
[  5]   6.00-7.00   sec   114 MBytes   953 Mbits/sec    0    765 KBytes
[  5]   7.00-8.00   sec   112 MBytes   935 Mbits/sec    0    765 KBytes
[  5]   8.00-9.00   sec   112 MBytes   941 Mbits/sec    0    765 KBytes
[  5]   9.00-10.00  sec   112 MBytes   939 Mbits/sec    0    765 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.09 GBytes   933 Mbits/sec    0             sender
[  5]   0.00-10.00  sec  1.08 GBytes   930 Mbits/sec                  receiver
I ran it multiple times and there's not much of a difference. The worst bitrate I saw was 814 Mbits/sec, and some runs had no bitrates below 900, but the speeds definitely don't degrade after multiple runs. Under "htop" it looks like all cores reach almost 100% during this test, but I have no idea what frequency they run at during the test; since I have speed shift set to 100% efficiency, I guess they might not be clocking as high as they can. I also tried disabling/enabling Suricata but it really didn't make any difference. Oh, and I also have the tunables "dev.igc.0.fc" (WAN) and "dev.igc.1.fc" (LAN) both set to 0 because that seemed to give more consistent speeds on the Ookla speedtest website.
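To compare several runs without eyeballing each table, the per-interval lines can be summarized with a few lines of Python. This is only a sketch: the function names are made up here, and the regular expression assumes iperf3's default human-readable output with bitrates reported in Mbits/sec, as in the run above.

```python
import re
from statistics import mean

# Matches per-interval lines such as:
# [  5]   0.00-1.00   sec   114 MBytes   951 Mbits/sec    0    502 KBytes
INTERVAL_RE = re.compile(
    r"\[\s*\d+\]\s+[\d.]+-[\d.]+\s+sec\s+[\d.]+\s+\w+\s+([\d.]+)\s+Mbits/sec"
)


def interval_bitrates(text):
    """Return the per-interval bitrates (Mbits/sec), skipping summary lines."""
    rates = []
    for line in text.splitlines():
        if "sender" in line or "receiver" in line:
            continue  # end-of-run summary, not a one-second interval
        match = INTERVAL_RE.search(line)
        if match:
            rates.append(float(match.group(1)))
    return rates


def summarize(rates):
    """Minimum, mean and maximum over the interval bitrates."""
    return {"min": min(rates), "mean": round(mean(rates), 1), "max": max(rates)}


# First three intervals and the sender summary from the run above.
sample = """\
[  5]   0.00-1.00   sec   114 MBytes   951 Mbits/sec    0    502 KBytes
[  5]   1.00-2.00   sec   108 MBytes   909 Mbits/sec    0    502 KBytes
[  5]   2.00-3.00   sec   104 MBytes   876 Mbits/sec    0    502 KBytes
[  5]   0.00-10.00  sec  1.09 GBytes   933 Mbits/sec    0             sender
"""
print(summarize(interval_bitrates(sample)))
```

Saving each run to a file and feeding the text through `summarize` makes a drop-off between runs easy to spot as a falling minimum or mean.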
But about the temperatures, I can tell you my box was initially idling with core temperatures around 50-52C with the default pfSense settings. After enabling speed shift set to 100% efficiency, disabling PowerD and enabling all C-states through the tunable, the idle temperatures dropped to ~45-46C. At the moment though the idle temperatures are only ~27C, because after the last crash I brought a USB desk fan from work to blow on the case to see if it helps with my stability issue. I don't think ~50C should be a problem for the CPU, but my hope is that the ambient temp inside the case was getting too high for the RAM to be fully stable... I really should run memtest86 on the machine, but it's currently "in production" already running my home network, and the wife gets a bit mad every time I reboot the machine or tinker with something else which brings the internet down.
Since just blowing on the top of the case with a fan is able to drop the CPU core temps by close to 20C, almost down to ambient room temp, it feels like the CPU must already have pretty good thermal coupling to the fins on the outside. But I doubt the NVMe drive and SODIMM RAM stick inside have any heatsinks at all, and there's no working "ambient" thermal sensor on the inside that I've been able to find. Because the case has no holes, I wouldn't be surprised if the ambient temp inside the case gets close to the CPU core temps.
-
@AnonymousRetard Thank you, much appreciated! So you're definitely getting the results I'd expect. I really don't know what the deal with my machine is. I disassembled it and the heatsink solution isn't what you'd love to see: an aluminum block sandwiched between the CPU dies and the (painted!) top inside surface of the case. There were no obvious gaps though. So I cleaned off the existing thermal compound, used a crappy little rotary tool I have to get off as much of the paint as I was able to with a wire wheel, cleaned everything, and reassembled using my thermal compound. That did have the effect of dropping my temps a few degrees I think, but they were only getting up to the low 40s before.
Curiously (or maybe not, since C-states are low power states) I'm finding that I get better performance on the openssl test when I disable C-states in the BIOS. It sure feels like FreeBSD is just throttling the CPU as extremely as it can. I did notice that it's not logging the temps in System > Monitoring, which my N5105 system does. But the 4 core temps do show up and seem accurate on the dashboard.
Honestly I was out of my depth a number of posts ago in this thread, and I'm not sure I have the stamina to try changing one BIOS setting at a time and rebooting to see whether it makes any difference. Just very frustrating . . . if it weren't for the normal performance I'm seeing under Linux, I could at least write it off to a potential hardware defect, but that not being the case there's the tantalizing possibility of being able to get it to work like it should if I could only tweak the right things.
Anyway, thanks again for your time. And I certainly empathize with the difficulty of fooling around with these things in a "live" environment! Thankfully for the time being I've got the old apu2 swapped back in.
-
This post is interesting and seems to suggest that the FreeBSD scheduler with respect to E-cores is lacking when compared to Linux. Of course, it's just a random forum post, but the N100 has no P-cores, and the poster was specifically referring to variability in results of the same openssl speed test I've been running.
-
@TheNarc No problem! Really strange problem you have... I took another round of thinking about what it could be and realized that we probably do not have the same board. I never followed the link to the BIOS you said you used until now, and when I did I realized it's not the same one I had linked to as a possible thing to try later. For me it's this one: https://forums.servethehome.com/index.php?threads/cwwk-topton-nxxx-quad-nic-router.39685/page-75#post-405132 . I haven't double-checked yet that it matches 100%, and I might never do that, but I do remember my BIOS said "BK-1264NP Ver 1.5" somewhere as I was setting up the device. I have not yet checked dates and numbers on stickers but I'm pretty sure that system matches mine.
I guess my specific board/BIOS version could be better supported in FreeBSD than what yours is? It does really sound like your CPU is not being allowed to clock as high as it should in FreeBSD, especially with the temperatures you say you've had. I mean mine started out at 50-52 before any tweaking for lower power use...
Edit: Oh nvm.. you did link to the same BIOS... it's just you had another link as well to the unlocking procedure which talks about another BIOS... This makes it even stranger if we started from the same board and BIOS... Are you also running the latest pfSense 2.7.2?
-
@AnonymousRetard Yeah it definitely seems like we have the same machine. And I am also running 2.7.2. I even tried booting the live FreeBSD distribution NanoBSD and got similarly poor results there, so it's not just a pfSense thing. It definitely is confusing; if I didn't have the data point of Linux seeming to perform as I'd expect, I'd definitely chalk it up to losing the hardware lottery. But I can't do that so it will just keep driving me crazy until I give up at some point :)
-
Well, in late-breaking news, something I wangjangled in the BIOS seems to have made a difference. The results are still different than yours, but kind of on-par, plus I can see pfSense claiming it's boosting up to 3+GHz now:
/home/john/better_openssl_speed.jpeg
Now the bad news is: I touched a lot of settings in the BIOS this last time. I was kind of going for broke. I may try to get pictures of all the screens (in fact I probably should in case the settings ever get defaulted and I need to get back to this point), but it's going to be a lot of screens! I've wasted enough time on this today that I may wait until tomorrow to do that. And then I need to decide whether I care enough to methodically try to determine which setting or few settings out of the ones I changed actually made the difference.
-
I do still see the drop-off in the iperf testing though. That one's odd because I can now see the CPU speed going up and corresponding to the more expected results of 900+ Mbps. But if I keep re-running the test, it's as if something decides it no longer needs to increase the CPU speed, and I see it fall again along with the iperf speeds. That I don't care as much about so long as traffic through the system isn't impacted, but to test that I'll need to get it fully back in service again and run some Internet speed tests.
-
Hmm, fun*!
-
@TheNarc Very interesting! In my original BIOS I don't have these settings unlocked, but in general I think these CPUs should have a PL1 setting which says what the maximum power consumption for a low load scenario is. Then there should be a PL2 setting which says what the maximum wattage should be while boosting. Then there should also be a configured time limit for how long the CPU is allowed to stay in the PL2 state. After this time limit it should be forced back to the maximum power usage from the PL1 setting.
If your CPU is not thermal throttling, could it perhaps be that you reached the configured time limit for the PL2 state?
If you are lucky maybe you can find these settings in your unlocked BIOS and tweak them further?
For example, maybe with a higher PL1 setting you could remove the visible iperf performance drop even after the PL2 time limit has run out?
Bad PL1 and PL2 settings don't quite explain how you could get better performance in Linux, though... so yeah, if you could find out what you changed to finally see the 3GHz in pfSense I would be very happy to learn that as well.
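To make the PL1/PL2/time-limit idea concrete, here is a small Python sketch of the simplified model described above: the CPU may draw up to PL2 for the first tau seconds of sustained load and is then held to PL1. The function names are made up, the 6 W / 25 W values are the ones reported as defaults elsewhere in this thread, and the 28 second window is an assumption about the unit of the "28" time-window suggestion. As far as I understand it, real hardware tracks a running average of power rather than a hard timer, so treat this as an approximation only.

```python
def power_cap_w(t_s, pl1_w, pl2_w, tau_s):
    """Power cap after t_s seconds of sustained load (simplified step model)."""
    return pl2_w if t_s < tau_s else pl1_w


def average_cap_w(duration_s, pl1_w, pl2_w, tau_s):
    """Average power cap over a sustained load lasting duration_s seconds."""
    boost_s = min(duration_s, tau_s)  # time spent under the PL2 cap
    return (boost_s * pl2_w + (duration_s - boost_s) * pl1_w) / duration_s


# Illustrative values only: PL1 = 6 W, PL2 = 25 W, tau = 28 s (assumed unit).
for t in (5, 27, 28, 60):
    print(t, "s ->", power_cap_w(t, pl1_w=6, pl2_w=25, tau_s=28), "W")
print("average over 56 s:", average_cap_w(56, pl1_w=6, pl2_w=25, tau_s=28), "W")
```

In this model a short benchmark sees the full PL2 budget, while back-to-back runs spend most of their time at PL1, which would look like a speed drop after the first run or two.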
I would be very interested in anything you can learn about this board and BIOS really. I might also change from my original BIOS in the future and would like to learn as much as possible. If you decide to keep digging and find out exactly what fixed or at least partly fixed your issue I would be very interested in that as well. But no rush at all. It's time for me to go to bed for today as well so you're not likely to get more responses from me tonight either.
-
@AnonymousRetard Thanks! Have a good night. I did play around some with the PL1 and PL2 settings and those changes alone did not seem to make a difference, but I didn't configure time limits on them either. I also lied and went ahead and took pictures of all the BIOS screens on which I know I changed at least one setting from the default. Which ones? It's anyone's guess, as I had lost any incentive to be methodical by the point I was fiddling with a bunch of them :) So I am not asking anyone to analyze or postulate on these, but posting them here just in case they're of interest.
Of note, both of the modded BIOSes that I tried were indistinguishable (version-wise) from the one that was originally loaded. So perhaps it only looked like we had the same BIOS, but behind the scenes they had different default values. Who knows.
Anyway, here's a bunch of low-quality phone pics of BIOS setting screens :)
/home/john/Pictures/bios1.jpeg
/home/john/Pictures/bios2.jpeg
/home/john/Pictures/bios3.jpeg
/home/john/Pictures/bios4.jpeg
/home/john/Pictures/bios5.jpeg
/home/john/Pictures/bios6.jpeg
/home/john/Pictures/bios7.jpeg
/home/john/Pictures/bios8.jpeg
/home/john/Pictures/bios9.jpeg
/home/john/Pictures/bios10.jpeg
/home/john/Pictures/bios11.jpeg
/home/john/Pictures/bios12.jpeg
/home/john/Pictures/bios13.jpeg
/home/john/Pictures/bios14.jpeg
/home/john/Pictures/bios15.jpeg
-
@TheNarc Thanks for all the pictures! Something I would try would be to set PL1 much higher and see what happens. When I googled it I found some useful data for you here: https://forum.netgate.com/topic/181999/hunsn-rj38-n100-cpu-clock-speeds/22
Unfortunately I don't know what my default values are since I still run the original BIOS, which doesn't expose them. I'm still not sure I'll want to risk flashing the modified one unless there's something specific I can't make work without it.
Something I learned from my system today is that the speed shift setting (which controls dev.hwpstate_intel.%d.epp) greatly limits the max boost when dragged towards energy efficiency. I didn't do many tests, and initially thought it would just increase the time before the CPU decides to raise the clocks, but it turns out it also greatly limits the max boost allowed.
I kept running
openssl speed -elapsed -evp aes-256-cbc
with various settings for this value and found that any setting that is not at least one or two steps to the left of the middle value clearly decreases the scores and max boost by much more than normal variance between tests. For now I have settled on 40 for this slider (two steps to the left of the middle); going any further towards the performance side, it becomes unclear whether there is a real difference or just normal run-to-run variance. My full settings on this page are now:
- Enable speed shift
- Core Level Control (Recommended)
- Power preference: 40%
- PowerD disabled
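For reference, the %d in dev.hwpstate_intel.%d.epp is the core index, so with Core Level Control the preference is one sysctl per core. A small Python sketch that spells out the assignments for the four cores of an N100, assuming the GUI percentage is passed through unchanged as the epp value (I have not verified how pfSense maps the slider). hwpstate_intel(4) documents the range as 0 (performance) to 100 (energy efficiency); the helper name is made up.

```python
def epp_sysctl_lines(num_cores, epp_percent):
    """Build one 'dev.hwpstate_intel.N.epp=value' assignment per core."""
    if not 0 <= epp_percent <= 100:
        raise ValueError("epp must be between 0 (performance) and 100 (efficiency)")
    return [
        f"dev.hwpstate_intel.{core}.epp={epp_percent}"
        for core in range(num_cores)
    ]


# The N100 has four cores; 40 is the power preference listed above.
for line in epp_sysctl_lines(num_cores=4, epp_percent=40):
    print("sysctl", line)
```

Reading the same OIDs back with `sysctl dev.hwpstate_intel` is a quick way to confirm what the GUI actually applied.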
The only other thing I have tuned in regards to the CPU is enabling C-states through the tunable I mentioned earlier. This has caused these counters:
sysctl -a | grep cx_usage
dev.cpu.3.cx_usage_counters: 1100374 3254544 5422369
dev.cpu.3.cx_usage: 11.25% 33.28% 55.45% last 104us
dev.cpu.2.cx_usage_counters: 749072 2829209 5473164
dev.cpu.2.cx_usage: 8.27% 31.25% 60.46% last 80us
dev.cpu.1.cx_usage_counters: 895261 3207828 5610663
dev.cpu.1.cx_usage: 9.21% 33.02% 57.75% last 232us
dev.cpu.0.cx_usage_counters: 3198717 53634141 0
dev.cpu.0.cx_usage: 5.62% 94.37% 0.00% last 168us
to show values in columns 2 and 3, which they never did before, so it has clearly done something. It should reduce power use, and I could imagine it possibly increasing single-thread performance as well if some cores are sleeping deeper and the remaining core gets more of the total power budget. Of course it will increase latency as well, but that's not something I've been able to notice.
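If you want to track these residencies over time rather than reading them by eye, the cx_usage lines are easy to parse. A minimal Python sketch, with made-up function names; I am assuming the percentage columns follow the order of the C-states the system lists for each CPU (shallowest first), which matches how the output above reads.

```python
import re

# Matches e.g. "dev.cpu.3.cx_usage: 11.25% 33.28% 55.45% last 104us"
# (the "cx_usage_counters" lines do not match because of the colon).
CX_RE = re.compile(r"dev\.cpu\.(\d+)\.cx_usage:\s+(.*?)\s+last\s+(\d+)us")


def parse_cx_usage(text):
    """Map CPU index -> list of C-state residency percentages."""
    usage = {}
    for match in CX_RE.finditer(text):
        cpu = int(match.group(1))
        usage[cpu] = [float(p.rstrip("%")) for p in match.group(2).split()]
    return usage


def deepest_state_share(usage):
    """Residency (%) of the last listed C-state for each CPU."""
    return {cpu: shares[-1] for cpu, shares in usage.items()}


sample = """\
dev.cpu.3.cx_usage_counters: 1100374 3254544 5422369
dev.cpu.3.cx_usage: 11.25% 33.28% 55.45% last 104us
dev.cpu.0.cx_usage_counters: 3198717 53634141 0
dev.cpu.0.cx_usage: 5.62% 94.37% 0.00% last 168us
"""
print(deepest_state_share(parse_cx_usage(sample)))
```

On the numbers above this makes it obvious that CPU 0 never enters the third state while the other cores spend more than half their time there.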
With these settings I now see the CPU boost to slightly above 3400MHz sometimes on the pfSense web dashboard and the speed no longer decreases when I stress test but instead increases.
Here are my new scores with these settings:
openssl speed -elapsed -evp aes-256-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing AES-256-CBC for 3s on 16 size blocks: 159931447 AES-256-CBC's in 3.00s
Doing AES-256-CBC for 3s on 64 size blocks: 54773786 AES-256-CBC's in 3.00s
Doing AES-256-CBC for 3s on 256 size blocks: 14077112 AES-256-CBC's in 3.00s
Doing AES-256-CBC for 3s on 1024 size blocks: 3525849 AES-256-CBC's in 3.00s
Doing AES-256-CBC for 3s on 8192 size blocks: 444070 AES-256-CBC's in 3.00s
Doing AES-256-CBC for 3s on 16384 size blocks: 221421 AES-256-CBC's in 3.00s
version: 3.0.12
built on: reproducible build, date unspecified
options: bn(64,64)
compiler: clang
CPUINFO: OPENSSL_ia32cap=0x7ffaf3bfffebffff:0x98c007bc239ca7eb
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
AES-256-CBC     852967.72k  1168507.43k  1201246.89k  1203489.79k  1212607.15k  1209253.89k
Something to note here though is that my system is currently being cooled with an external USB desk fan blowing on the case, so the CPU temperature only gets to 40-45C during the test and then quickly drops back to ~24-27C.
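For anyone comparing their own openssl numbers against the scores above, the final result row is in thousands of bytes per second, so dividing by 1000 gives MB/s. A small Python sketch with made-up names; it assumes the default six block sizes printed by this openssl version.

```python
# Block sizes in the order openssl speed printed them above.
BLOCK_SIZES = [16, 64, 256, 1024, 8192, 16384]


def parse_speed_row(row):
    """Turn an 'openssl speed' result row into {block size: MB/s}."""
    fields = row.split()
    cipher, values = fields[0], fields[1:]
    # Each value looks like '1212607.15k', i.e. thousands of bytes per second.
    mb_per_s = [float(v.rstrip("k")) / 1000 for v in values]
    return cipher, dict(zip(BLOCK_SIZES, mb_per_s))


row = ("AES-256-CBC 852967.72k 1168507.43k 1201246.89k "
       "1203489.79k 1212607.15k 1209253.89k")
cipher, rates = parse_speed_row(row)
for size, rate in rates.items():
    print(f"{cipher} {size:>6} bytes: {rate:8.1f} MB/s")
```

Comparing the large-block figures (8192 and 16384 bytes) between two machines or two BIOS settings is usually the least noisy way to see a clock-speed difference.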
-
@AnonymousRetard Hey thank you so much for this! That's great information and I'm glad you were able to boost your own speeds as well. I'm definitely going to try the advice from the other thread you referenced of setting PL1=9, PL2=10, and PL3=30 in the BIOS and I'll see what impact that has. I've also been running with my SpeedShift slider at 50 but I'm going to try 40 as well. And I did add the Cmax system tunable that you suggested.
I'm still confused why these BIOS power level changes would be needed in order to get this sort of expected performance out of the processor in FreeBSD, whereas no changes to the default settings seemed necessary to get expected performance under Linux, but then I have basically no understanding of the vagaries of BIOS settings, what gets passed to different OSes, etc.
Funny you mention the fan because I'm thinking I may try to do the same thing with mine. It is a pain because I know the "fanless" thing is a big selling point but it would be really nice if the cases for these things at least provided the option for mounting a fan somewhere to get a little air circulation!
Oh yeah and I would probably not flash your BIOS since your machine is behaving as expected. It went well for me but did make me nervous, and was not especially straightforward.
-
@TheNarc I agree that it is quite strange that you only got bad performance in FreeBSD specifically, but it seems it is possible to change the PL1 & PL2 settings, and probably many more things as well, directly from the OS. One example is a script for Linux found here: https://github.com/horshack-dpreview/setPL
So I guess it's not impossible that the Linux or Windows kernels are changing some of the BIOS values while FreeBSD doesn't. But of course it could also be a lot of other things than PL1 & PL2 specifically that will bring about different behaviors from different OSes in how they will control the CPU frequencies.
But there are a lot of other settings in your unlocked BIOS that will definitely have an impact on the CPU frequency behavior as well.
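One way to check what limits Linux is actually running with, assuming the kernel's intel-rapl powercap driver is loaded, is to read the constraint files under /sys/class/powercap. The Python sketch below is independent of the setPL script linked above and the function name is made up; on the systems I know of, the "long_term" constraint corresponds to PL1 and "short_term" to PL2.

```python
from pathlib import Path


def read_rapl_constraints(zone):
    """Read the named power limits (watts) and time windows (seconds) of a zone."""
    limits = {}
    for name_file in sorted(zone.glob("constraint_*_name")):
        idx = name_file.name.split("_")[1]  # "constraint_0_name" -> "0"
        name = name_file.read_text().strip()
        limit_uw = int((zone / f"constraint_{idx}_power_limit_uw").read_text())
        entry = {"watts": limit_uw / 1e6}
        window_file = zone / f"constraint_{idx}_time_window_us"
        if window_file.exists():  # not assumed to exist for every constraint
            entry["window_s"] = int(window_file.read_text()) / 1e6
        limits[name] = entry
    return limits


# Typical use on a Linux box (package zone 0; may need root on some systems):
# print(read_rapl_constraints(Path("/sys/class/powercap/intel-rapl:0")))
```

Comparing these values under Linux with what the BIOS screens show would tell you whether the OS is overriding the firmware defaults.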
-
@AnonymousRetard said in Topton N100 Reporting 402 MHz:
https://github.com/horshack-dpreview/setPL
Ah that's very interesting. Honestly this is the first time I've really been aware of these power levels. It definitely seems plausible that you're right and perhaps Linux is overriding the BIOS settings for them or something to that effect. I've sure never had such trouble just getting the expected speed out of a CPU before though, but I did realize that I was potentially signing up for an adventure like this when I bought one of these boxes, ha.
-
Yes I could believe Linux/Windows is updating some values there to allow the CPU to run at full speed. It would surprise me if it's those Power Level values though since, as I understand it, those are supposed to be set by the system builder based on the thermal management available. But that data seems to suggest it is so.....
-
@stephenw10 Yeah, and I could certainly believe that the power levels were set low, as the thermal management available on these things is . . . underwhelming. Although out of the box it seems to be set for PL1=6W and PL2=25W which, as far as I've been able to tell, is what it should be for the N100. And for that matter, that's still what I have mine set at, so it had to have been some of the other settings I changed that got my current performance boost. But I am going to try the PL1=9W, PL2=10W, PL3=30W today to see whether it gets even better performance w/o getting too hot. I also came across some suggestions (like here) that the PL1 time window should be set to 28 instead of 0. But I stopped knowing what I was doing a while ago :)
-
Yup I'm guessing at this point! Seems like you're making progress though.
-
So I tried changing PL1 from 6W to 9W and PL2 from 25W to 10W and get much better openssl speed results:
/home/john/Pictures/PL1_9W_PL2_10W.jpeg
Not sure about thermals yet so I'll need to keep an eye on that, but this is much better. And yet, it is nominally a 6W processor so if this is basically just forcing it to run at 9W all the time, it seems like there must be a better way. But I'll take what I can get!
-
Got even slightly better numbers when I left PL1 at 9W and set PL2 back to 25W. Which is interesting because it suggests that the PL2 setting is more impactful, and does make sense because my understanding is that it's the upper power limit the processor is allowed to go to for brief periods of time. Except when PL2 was 25W and PL1 was 6W, I was getting significantly worse results, which I'm not sure I can really explain. But I think I'm going to leave it at PL1=9W and PL2=25W unless the thermal situation proves untenable.
-
@TheNarc Nice! From my understanding you are not forcing it to run at 9W all the time, but rather allowing it to use 9W of power for an unlimited amount of time and as the maximum in "low load scenarios". When the CPU usage goes up the CPU will be allowed to use the PL2 power for a certain time limit which should be configurable as well. The PL2 limit is supposed to handle bursty loads since it usually takes a while before the temperature builds up enough to become a problem.
Sometimes if the CPU is constrained by thermals you will get better performance though by setting a lower power limit as the efficiency usually drops at higher power draws. As an example imagine the thermal downclock limit gets hit at a sustained 10W power draw and then the CPU downclocks because of the thermal limit. In that scenario you might get better performance if the CPU instead runs for long amounts of time at 6-9W if that causes it to not hit the thermal limit where it starts downclocking.
It shouldn't be dangerous to the CPU to set both of the limits really high, like putting both at 25W. It will limit itself because of thermals anyway but your performance is likely to drop at some point because of what I described above. Another problem though is that it could also increase the ambient temperature in the box too much which the other components might not like. I don't think the NVMe drive inside or the RAM has any heatsinks in our boxes but I haven't double-checked. But I'm sure neither is connected to the big heatsink on the top of the case. A third problem could be that the power supply isn't designed for too large power draws for longer amounts of time but in your unlocked BIOS it looks like the total power draw is also controlled and constrained by a separate setting (the 65W AC Brick setting). Normal PSUs though just shut down if they get overloaded (OCP).
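The thermal trade-off described above can be illustrated with a first-order steady-state model, where temperature rises linearly with sustained power. This is a rough Python sketch, not a measurement: the thermal resistance, ambient and throttle temperatures are hypothetical numbers chosen so that the throttle point lands at the 10W from the example, and the function names are made up.

```python
def steady_state_temp_c(power_w, ambient_c, r_theta_c_per_w):
    """Steady-state temperature for a sustained power draw (first-order model)."""
    return ambient_c + power_w * r_theta_c_per_w


def sustainable_power_w(throttle_c, ambient_c, r_theta_c_per_w):
    """Highest sustained power that stays at or below the throttle temperature."""
    return (throttle_c - ambient_c) / r_theta_c_per_w


# Hypothetical values: 25C ambient, 7.5 C/W case-to-air, 100C throttle point.
AMBIENT_C, R_THETA, THROTTLE_C = 25.0, 7.5, 100.0

print("max sustained power:", sustainable_power_w(THROTTLE_C, AMBIENT_C, R_THETA), "W")
for pl1 in (6, 9, 25):
    temp = steady_state_temp_c(pl1, AMBIENT_C, R_THETA)
    state = "throttles" if temp > THROTTLE_C else "ok"
    print(f"PL1={pl1}W -> {temp:.1f}C ({state})")
```

With a fan on the case the effective thermal resistance drops, which in this model raises the sustainable power and is one reason an external fan can change benchmark results.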