SG-1000 100% CPU Usage



  • Is anyone else seeing 100% CPU usage on the SG-1000 even with very low utilization? We are on the latest version and even sitting almost idle it is maxed out on CPU. All other variables within normal limits.



  • They had it working fine; the last 2 updates made it so bad I had to use other hardware.



  • I bought an SG-1000 to set up and install at my parents' house, but the performance that I'm seeing while I have it connected to my home network has me worried. I'm only seeing max throughput of around 100Mbps, which is fine; their connection is less than that. Tried to run pfBlockerNG and unbound with DNSBL but the CPU utilization keeps spiking to 100%. Other than pfBlockerNG I have it set up with a remote access VPN and a VPN tunnel between their network and mine. OpenVPN can hit a CPU pretty hard but there is no traffic going over the VPNs.

    Is the ARM CPU in this so weak that it can't handle the VPN, unbound, and pfBlockerNG?

    Is the 2.4 Beta software not fully optimized for this platform?

    Worst part is I had my retired father spend $150 so that I could get his network purring like a kitten using pfSense and now I am seriously worried about sending this to him. I had it crash on me several times when I was setting it up; think I was overloading it due to impatience. Now that I have accepted that it takes a while to make changes I am seeing very high utilization.



  • I think they're still fiddling with it.  Like I said, 2 updates ago it was great - barely any CPU unless downloading something huge off of Steam.  2 Netflix streams going with smartphones and several computers with 0 issues... 2 updates ago.

    I'm brute forcing with a 6 core Xeon until they get it stable again.  2.5W compared to 150W is nothing to sneeze at.



  • @W4RH34D:

    I think they're still fiddling with it.  Like I said, 2 updates ago it was great - barely any CPU unless downloading something huge off of Steam.  2 Netflix streams going with smartphones and several computers with 0 issues... 2 updates ago.

    I'm brute forcing with a 6 core Xeon until they get it stable again.  2.5W compared to 150W is nothing to sneeze at.

    This is where I'm at; I want to wait until they have it stable before I send it to my father because he doesn't have the technical knowledge to correct possible issues and I don't want to send him a buggy product.


  • Netgate Administrator

    I'm not seeing CPU usage any different in today's snap than a few days ago.

    pfBlocker with DNSBL could be an issue for the 512MB of RAM it has if you're using a large list.

    At the console run

    top -aSH
    

    Let it run for a few seconds to get the interrupt usage at the top and hit q to quit. Grab a screenshot or copy/paste it. What process is using all the CPU cycles?

    Steve



  • I'll plug it in and get that soon.



  • @stephenw10:

    I'm not seeing CPU usage any different in today's snap than a few days ago.

    pfBlocker with DNSBL could be an issue for the 512MB of RAM it has if you're using a large list.

    At the console run

    top -aSH
    

    Let it run for a few seconds to get the interrupt usage at the top and hit q to quit. Grab a screenshot or copy/paste it. What process is using all the CPU cycles?

    Steve

    Attached, this was after a boot. Direct console connection.

    It crashed a few minutes after bootup. Only connected via console; no Ethernet connections at the moment.

    ![2017-02-05 (1).png](/public/imported_attachments/1/2017-02-05 (1).png)



  • I've got it set up with OpenVPN (one outbound and an inbound server) and unbound.
    It works fine - in fact it works amazingly well, lower latency than my previous router.
    I've not tried pfBlockerNG, I took a look but I've not used it before so I've removed it for the moment.
    The outbound VPN connection works really well and saves me hassle.
    The inbound connection is for when my wife is at work and needs files off the NAS.
    Not had any complaints, in fact she commented that it was great and "just worked".

    It is still in beta, so I'm expecting a few issues.
    If you update every day, given it's in beta - I would expect some pain here and there.

    I've not seen any load issues, or had any reboots. (touch wood)
    But I tried a few other packages:
    squid - way too sluggish, transformed from fibre to 9600 baud modem!
    acme - not worth the effort, good idea though. [ CORRECTION - not worth the effort for a "home firewall" ]
    pfBlockerNG - not got the time at the moment to understand and configure it properly.


  • Netgate Administrator

    Hmm, Ok.
    Not seeing high CPU usage there. Not sure what that panic might be. Did you get a crash report at reboot?

    Steve



  • You can reduce the interrupt usage by fiddling in loader.conf.

    You can reduce the timer to a 100Hz refresh rate instead of 1000Hz. Usually I only tune this on virtual machines, but I would consider tuning it on very low-end hardware as well.

    I will paste a link to the FreeBSD page as it has other tunables as well.

    https://wiki.freebsd.org/TuningPowerConsumption

    Also, it might be a good idea to add swap to that system, as 512MB of maximum addressable memory is quite low when you are doing things like using pfBlockerNG, especially together with unbound.



  • I can't get in to the GUI or the console - getting a resource busy.  I think I fried it.  I handed it to a friend who is going to take a crack at it.



  • @stephenw10:

    Hmm, Ok.
    Not seeing high CPU usage there. Not sure what that panic might be. Did you get a crash report at reboot?

    Steve

    I noticed that before I sent it. The crash is what really worries me. My thought is that the CPU usage and performance will improve with finalized software.


  • Netgate Administrator

    I certainly hope so.

    If you can catch it running at high CPU though and grab an output from top that will help us track down whatever that is.

    It doesn't seem to be that widespread; it's not an issue affecting all users.

    Steve



  • I'm seeing very high load on mine, and very low throughput. The CPU usage in top says it's mostly idle, but the load average hovers around 1.
    Also, I'm getting very low throughput. Simply running curl on a large file and piping to /dev/null, I get <10mbit both on the SG-1000 itself and when I connect a computer to the LAN port.

    I took the thing with me to work today, to try it in another network, but the results are the same.

    Really disappointed :(

    If there's any output I can give you that would help, please let me know. I really want to use this, but if it can't get close to my 100mbit connection I can't.
    Currently it has the 20170207 image freshly flashed, and I've only run through the initial web config. No other changes.


  • Netgate Administrator

    Same thing I suggested above should show it. Run at the console:

    top -aSH
    

    Let it run for a few seconds to get all the info then quit and copy paste it here.

    Steve



  • It seems it depends very much on what I do. Testing with speedtest.net, I get the full 100mbit down, and almost the same up.
    However, downloading a single file from a site that normally maxes out my 100mbit, both on wireless and when stealing the WAN cable from the SG-1000, I get around 10mbit. Back to a smooth 100mbit as soon as I steal the WAN cable and plug it into my client machine.

    Attached is the output of top, as requested, after having run top and the download for about a minute.

    top.txt



  • Any ideas? Anything else I can provide?



  • The load on mine always hovers around 1, but it can handle my internet speed OK, it's the max speed I'm pegged at.

    So I don't think the load is the problem.
    If you want load less than 1, disable ssh, only use the console and don't use the web interface.
    Obviously that doesn't mean never use the web interface!
    The 20% interrupt seems very high.

    last pid:  7799;  load averages:  0.95,  1.26,  1.07    up 0+00:33:30  21:55:09
    117 processes: 2 running, 93 sleeping, 22 waiting
    CPU: 16.7% user,  0.0% nice, 44.9% system,  5.1% interrupt, 33.3% idle
    Mem: 59M Active, 227M Inact, 115M Wired, 56M Buf, 81M Free
    Swap:

    The only things I can think of immediately are:
    Is the traffic already saturated on the WAN port? (install iftop and use it from the command line)
    Is there a duplex mismatch? Can you put a(nother) switch between the WAN cable and the SG-1000?
    Are you logging any packets on the firewall? (that may help, try not logging anything)
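
For anyone comparing numbers from `top -aSH` across posts, here is a minimal Python sketch that pulls the load averages and the CPU breakdown out of a header like the one quoted above. The function name `parse_top_header` is made up for illustration, and the regular expressions assume the FreeBSD-style header layout shown in this post; other versions of top may format it differently.

```python
import re


def parse_top_header(text):
    """Extract load averages and CPU percentages from a top header (assumed FreeBSD layout)."""
    result = {}
    load = re.search(r"load averages:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", text)
    if load:
        # 1-, 5- and 15-minute load averages
        result["load"] = tuple(float(x) for x in load.groups())
    cpu = re.search(r"^\s*CPU:\s*(.+)$", text, re.MULTILINE)
    if cpu:
        # Each entry looks like "44.9% system"
        for pct, name in re.findall(r"([\d.]+)%\s+(\w+)", cpu.group(1)):
            result[name] = float(pct)
    return result


# Header lines copied from the output quoted in this post
sample = """last pid:  7799;  load averages:  0.95,  1.26,  1.07    up 0+00:33:30  21:55:09
CPU: 16.7% user,  0.0% nice, 44.9% system,  5.1% interrupt, 33.3% idle"""

stats = parse_top_header(sample)
busy = 100.0 - stats["idle"]  # everything that is not idle
print(stats["load"], round(busy, 1))
```

Note that load average and CPU percentage measure different things: the load average counts threads that are running or waiting to run, so a value near 1 does not by itself mean the CPU is pegged at 100%.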



  • It's absolutely stock, so I doubt anything is being logged. However, speedtest also runs well on my SG-1000. I get the full 100mbit. However, when downloading a single file, the story is entirely different. Could you by any chance give downloading http://ipv4.download.thinkbroadband.com/1GB.zip a try? (it's a test file from http://www.thinkbroadband.com/download.html)



  • If speedtest is working fine, then there is something weird going on.
    It doesn't make much sense, which means we are missing something (probably obvious!)

    I get(ish) around 4-5MB/sec (30-40Mbit) downloading the file.
    That seems around right for me, downloading from standard endpoints.
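
Since this thread mixes MB/sec and Mbit figures, a quick Python sketch of the conversion may help when comparing results. It assumes 1 byte = 8 bits and ignores protocol overhead; the function names are only illustrative.

```python
def mbytes_to_mbits(mbytes_per_sec):
    """Convert megabytes per second to megabits per second (8 bits per byte)."""
    return mbytes_per_sec * 8


def mbits_to_mbytes(mbits_per_sec):
    """Convert megabits per second to megabytes per second."""
    return mbits_per_sec / 8


# The 4-5MB/sec range mentioned above, expressed in Mbit
print(mbytes_to_mbits(4), mbytes_to_mbits(5))
# A 100Mbit line expressed in MB/sec
print(mbits_to_mbytes(100))
```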

    If it's stock, then I'd definitely look at the settings.
    I'm still a newbie when it comes to pfSense (even though I've been using it for a long time) but
    If I was having these problems I would:

    • swear a lot ;-)

    • "Reset to factory defaults"

    • disable ssh

    • don't use the web-ui whilst you are doing any performance testing. (web browsers are the work of the devil anyway)

    • On the WAN, I would recommend a rule saying drop everything, don't log.

    Something like this:

    From my understanding of gossip, blogs and forums, the SG-1000 is currently rated at around 100Mbit.
    I could be wrong there, that is purely my guess and bad memory at work!
    I know they are and have been working on improving this.
    virus scanning, proxy etc. IMHO is the last thing that should be on it at the moment if you are using it, or want to use it as your main firewall.

    So anything you can do to reduce the overhead is a good thing.
    It may be painful to start again from factory, but if it's a niggling issue that can't be solved the only way is one step at a painful time.
    Make one change and re-test every time.

    Another question, are you plugging directly into the SG-1000 or via a switch - for a laugh if you can try both.
    Hardware is also the work of the devil - in fact anything to do with infrastructure is there to bend your mind!



  • I am plugging directly into the SG-1000, both with my laptop at home and the machine at work. That's also the two different networks I've tried it in with the same results. I've reflashed the thing multiple times with no improvements in throughput. I also see the same performance when downloading the file directly to the SG-1000 through the console with curl, writing to /dev/null, so I doubt the connection between my machines and the SG-1000 has anything to do with it, at least.



  • Using it on two different networks with two different machines doesn't allow any comparisons as there are too many variables involved.
    There are many, many things that can make a difference.
    I would concentrate on debugging it where you are going to use it.
    The only conclusion I could draw at the moment, given I'm not seeing any problems is that something is interfering with the downloads between the SG-1000 and the WAN connection.
    I'm not saying that is right, but it's a starting point.

    When you are using this at home, how is your WAN connection configured?
    i.e. are you connected directly to the internet with it, or is it going through a router (or multiple routers) first.
    For example, my setup I have my old wifi router on my LAN (in case sg-1000 breaks I can reconfigure that as the WAN connection).
    My SG-1000 is connected via the WAN port using a shielded network cable to my ISP's fibre router.
    My SG-1000 does pppoe on the wan port to connect me to the internet and the LAN port connects to my old wifi router (which is acting as a switch and NAS box, until I can afford a decent switch, NAS and wifi access point[cisco/synology?/Ubiquiti])



  • So it took me a while to get all this together; system wouldn't even boot until I had a fan blowing directly on it.

    All these screenshots were taken while I was logged in and looking at the dashboard; the last two, which show lower usage, came after I logged off. I also included a screenshot of the installed packages.

    Side note: if it would help, I could ship it back to Netgate or bring it down for you to look at. I am just an hour away to your north.

    1st  Picture: is an attempt to boot up the SG-1000
    2nd Picture: still trying to boot
    3rd Picture: Yup still won't boot
    4th Picture: No Joy
    5th Picture: ok finally booted and updated from console; no traffic, one user viewing the Dashboard
    6th Picture: no traffic, one user viewing the Dashboard
    7th Picture: no traffic, one user viewing the Dashboard
    8th Picture: Installed Packages
    9th Picture: no traffic, one user viewing the Dashboard
    10th Picture:  no traffic, one user logged in



    ![2017-02-11 (1).png](/public/imported_attachments/1/2017-02-11 (1).png)
    ![2017-02-11 (2).png](/public/imported_attachments/1/2017-02-11 (2).png)
    ![2017-02-11 (3).png](/public/imported_attachments/1/2017-02-11 (3).png)
    ![2017-02-11 (4).png](/public/imported_attachments/1/2017-02-11 (4).png)
    ![2017-02-11 (5).png](/public/imported_attachments/1/2017-02-11 (5).png)
    ![2017-02-11 (6).png](/public/imported_attachments/1/2017-02-11 (6).png)
    ![2017-02-11 (7).png](/public/imported_attachments/1/2017-02-11 (7).png)
    ![2017-02-11 (8).png](/public/imported_attachments/1/2017-02-11 (8).png)
    ![2017-02-11 (9).png](/public/imported_attachments/1/2017-02-11 (9).png)



  • Wow! If you're saying "system wouldn't even boot until I had a fan blowing directly on it",
    I'd speak to netgate and look at a replacement.
    My initial guess would have been there's a dodgy package causing that, but you're seeing it during initial boot!
    My next guess would be a bad flash of the image.
    Assuming you have reflashed the image onto it, the next guess (only one left) is hardware fault.
    But bang a call into netgate and see what they come back with.



  • @deadmalc:

    Using it on two different networks with two different machines doesn't allow any comparisons as there are too many variables involved.
    There are many, many things that can make a difference.

    That's the whole point. I use it on two different networks with two different machines, and the download speed is equally bad on both. Thus I am thinking the environment has nothing to do with it.


  • Banned

    @cplmayo:

    So it took me a while to get all this together; system wouldn't even boot until I had a fan blowing directly on it.

    Well, that thing is clearly faulty and needs to be RMA-ed.



  • @pfbolt:

    That's the whole point. I use it on two different networks with two different machines, and the download speed is equally bad on both. Thus I am thinking the environment has nothing to do with it.

    Possibly, but you are using it in two totally different environments. All that really proves is that whatever is the same in both environments isn't causing the issue.
    It's not a case of proving whether it's specifically environmental, that won't get you far.
    What you want to do is find out what is causing the issue on the SG-1000; that may be some environmental factor that is specific to adding the SG-1000 to it, or a bug in pfSense.
    There is an issue that causes lower than gigabit downloads when using pppoe:
    https://redmine.pfsense.org/issues/4821
    But if you aren't using pppoe…then that may not be the problem.
    So at home are you going through a router? or using the SG-1000 as a router?
    It's a case of stepping through every part of your infrastructure, slowly and logically.



  • @deadmalc:

    All that really proves is that whatever is the same in both environments isn't causing the issue.

    Actually, I'd say it proves that whatever is different isn't causing the issue.

    I have tried using the SG-1000 as a router, as well as using it as a client inside my NAT. There's really no difference to the speed at which it can download a large file and send it to /dev/null. Also, when all the other clients I'm aware of on the same networks (both of the networks) have no issues attaining a much higher speed, that means that if there's a compatibility issue, the fault rests on the SG-1000. Even if other equipment is not behaving to spec, the rest of the stuff connecting to said equipment handles the situation just fine.



  • I'm not saying the fault doesn't lie with the SG-1000, but it's working out what is going on.



  • After a bit more digging and testing, I see what you mean.
    It's most strange. I'd swear this wasn't happening last month.
    There seems to be something funky going on.



  • You observe slow throughput for single large transfers?



  • It seems that single connections are definitely limited in their throughput.
    3-4MB/sec (at most) as against the 80Mbit (10MB/sec) I should be getting - and was getting.
    The speed test tools on speedtest.net don't show any issue, however I suspect these do multi-downloads.
    However the amazon ones and the ones on http://www.thinkbroadband.com/speedtest.html do single files and both respectively.


    The single file ones are consistently bad, the multi-file ones are fine.
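
To make the single-stream versus multi-stream comparison repeatable, here is a rough Python sketch that times one download against several parallel downloads of the same URL. The helper names (`fetch_size`, `mbit_per_sec`, `measure`) are invented for illustration, and running the same URL in parallel threads is only an approximation of what multi-connection speed testers do.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_size(url, chunk=64 * 1024):
    """Download url, discard the data, and return the number of bytes read."""
    total = 0
    with urllib.request.urlopen(url) as resp:
        while True:
            data = resp.read(chunk)
            if not data:
                break
            total += len(data)
    return total


def mbit_per_sec(num_bytes, seconds):
    """Throughput in megabits per second (1 Mbit = 1,000,000 bits)."""
    return num_bytes * 8 / 1_000_000 / seconds


def measure(url, streams=1, fetch=fetch_size):
    """Fetch url over `streams` parallel connections and return aggregate Mbit/s."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=streams) as pool:
        sizes = list(pool.map(fetch, [url] * streams))
    elapsed = time.perf_counter() - start
    return mbit_per_sec(sum(sizes), elapsed)
```

Calling `measure(url, streams=1)` and then `measure(url, streams=4)` against a large test file should show whether the aggregate rate rises with more connections. Pick a file size you can afford to download several times over, since every stream fetches the whole file.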
    My openvpn connection to my server now is at 400KB-800KB/sec too which used to be about 2MB/sec and when I do transfers there isn't a lot of CPU activity.

    The other thing that is really noticeable on websites that have images/media etc.
    The main site loads up quickly (i.e. no media) but then there is a delay in loading the images - which is really irritating on an 80Mbit connection.
    Two main examples are:

    http://www.bbc.co.uk/news
    http://www.speedtest.net

    When I look at it, it really is glaringly obvious that something is wrong.

    Those may be two separate issues, or they may be the same - but as I said, once looked at it's really obvious.
    There is no interrupt overflow or anything obvious, but my bsd experience is tiny.

    So anyone who can suggest a way of debugging this, I would really appreciate it.

    I've stepped through everything I can think of, and it only happens when the SG-1000 is present.

    I'm not sure how reliable these tests are…
    On my linux boot system I get this:

    But downloading pycharm I got 6MB/sec, yet I can't get near this using the broadband checkers.
    lies, damned lies and speed testers ;-)

    I've disabled all ipv6 stuff on my box and on the firewall, things seem better.
    Perhaps issue was ipv6 being "half enabled"?
    I'll keep looking…



  • Yes this post is WAY old. But in case anyone else runs into this issue with a SG-1000 device,
    I reduced CPU utilization from near constant 1.0+% to .7 - .8% tops via webui changes:

    1. go to Status > Logs > Settings, deselect 'Log packets blocked by Block Bogon Networks rules' and 'Log packets blocked by Block Private Networks rules'
    2. System > Advanced > Firewall & NAT, change the dropdown list selection for 'Firewall Optimization Options' from 'Normal' to 'Aggressive'