CPU loaded to 100% and pfSense hangs
-
Now that you've got rid of whatever modified source was there, I'm wondering if the remaining issues just come down to the fact that you're running possibly the worst NICs ever created, and an old Celeron CPU (the lack of cache hits network throughput hard in a firewall scenario; there's a huge difference between a Celeron and a P4 of the same clock speed for firewall purposes). 80 Mbps through crap NICs and an old Celeron may just be the top of what your hardware can accomplish. A P4 of the same clock speed would be drastically faster for firewall purposes. Better NICs would reduce CPU usage, but I'm not sure by enough to make much difference.
-
cmb, I replaced the network card with a TP-LINK TG-3269 and the processor is still loaded at 100%. Perhaps you are right and we have to change the CPU.
Or is there another way?
-
Is this really a P4 era Celeron?
http://en.wikipedia.org/wiki/List_of_Intel_Celeron_microprocessors#.22Willamette-128.22_.28180_nm.29
I am running a P4-M at 1.2GHz. It can pass >300Mbps. Yes, it has 512KB cache vs 128KB in the Celeron, but I find it hard to believe you couldn't pass 80Mbps. :)
Interesting information about cache being so important though.
Do you have hundreds of firewall rules? What is using the CPU time in top -SH?
Replacing the Celeron with a P4 should be easy though; they are very cheap. I have several here you could have for free if you were near enough. ;)
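As an aside, if you save the top -SH output to a file, the busiest threads can be picked out by sorting on the WCPU column. A minimal Python sketch (the sample lines are abbreviated from a typical snapshot, and the parsing heuristic — first "%"-suffixed field — is my own assumption about the layout):

```python
# Sort saved `top -SH` process lines by the WCPU column.
# Sample lines are abbreviated; the real output has more fields/rows.
lines = [
    "11 root 171 ki31 0K 16K RUN 0 7:41 45.75% {idle: cpu0}",
    "0 root -68 0 0K 64K - 1 3:40 36.96% {ath0 taskq}",
    "13 root 55 - 0K 16K sleep 1 0:16 15.48% {ng_queue0}",
]

def wcpu(line: str) -> float:
    """Return the weighted CPU percentage from a top process line."""
    for field in line.split():
        if field.endswith("%"):
            return float(field.rstrip("%"))
    return 0.0

# Busiest first; the idle threads usually top the list, so the first
# non-idle entries are the interesting ones.
busiest = sorted(lines, key=wcpu, reverse=True)
for line in busiest:
    print(f"{wcpu(line):6.2f}%  {line.split('%', 1)[1].strip()}")
```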
Steve
-
There have been a couple instances of people on here running really old Celerons that got horrid performance, just slapping a really old P4 with the same clock speed into the same box quadrupled throughput in one case. Way more than I would have expected, the cache makes a massive difference.
-
I replaced the processor with an Intel(R) Pentium(R) 4 CPU 3.00GHz, and the speed is now fine at the full 80-90 Mbps. :)
But the CPU load is still high, 35-85% when downloading a file. What could be wrong?

last pid: 31994;  load averages: 0.86, 0.64, 0.37  up 0+00:10:32  10:02:44
94 processes: 5 running, 74 sleeping, 15 waiting
CPU: 1.7% user, 0.0% nice, 38.8% system, 21.3% interrupt, 38.2% idle
Mem: 45M Active, 15M Inact, 40M Wired, 108K Cache, 23M Buf, 884M Free
Swap: 2048M Total, 2048M Free

  PID USERNAME PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
   11 root     171 ki31     0K    16K RUN    0   7:41 45.75% {idle: cpu0}
   11 root     171 ki31     0K    16K RUN    1   5:40 40.58% {idle: cpu1}
    0 root     -68    0     0K    64K -      1   3:40 36.96% {ath0 taskq}
   12 root     -28    -     0K   120K WAIT   0   0:32 34.86% {swi5: +}
   13 root      55    -     0K    16K sleep  1   0:16 15.48% {ng_queue0}
   13 root      55    -     0K    16K RUN    0   0:16 15.38% {ng_queue1}
55418 root      47    0 54620K 20792K piperd 0   0:06  0.68% php
   12 root     -32    -     0K   120K WAIT   0   0:11  0.39% {swi4: clock}
55510 root      76    0 53596K 20132K accept 0   0:06  0.29% php
    0 root      76    0     0K    64K sched  1   1:02  0.00% {swapper}
53409 root      47    0 54620K 20524K accept 1   0:03  0.00% php
53607 root      47    0 53596K 16348K accept 1   0:02  0.00% php
   14 root     -16    -     0K     8K -      1   0:01  0.00% yarrow
    0 root     -68    0     0K    64K -      1   0:01  0.00% {ath0 taskq}
33903 root      44    0  4948K  2516K select 0   0:00  0.00% syslogd
20429 root      64   20  3316K  1356K select 1   0:00  0.00% apinger
62838 root      64   20  5564K  3256K kqread 0   0:00  0.00% lighttpd
    4 root      -8    -     0K     8K -      1   0:00  0.00% g_down
 2937 root      76   20  3656K  1440K wait   0   0:00  0.00% sh
49585 root      44    0  3712K  2012K CPU0   0   0:00  0.00% top
    3 root      -8    -     0K     8K -      1   0:00  0.00% g_up
   12 root     -68    -     0K   120K WAIT   0   0:00  0.00% {irq18: ath0}
-
Help me, Please!
-
I don't think you have a problem, other than still having poor quality NIC hardware that induces significantly more CPU load than good quality NICs would. It's working fine as is; the fact that it's using 30% CPU is irrelevant. You're maxing out your Internet connection without coming close to maxing out your hardware.
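The headroom point can be sanity-checked with a rough calculation. A sketch under my own assumption (not from the thread) that CPU cost scales roughly linearly with firewall throughput, which is only approximately true:

```python
# Back-of-the-envelope CPU headroom estimate.
# Assumption (mine): interrupt/system load scales ~linearly with throughput.
throughput_mbps = 80    # observed WAN throughput
busy_fraction = 0.62    # ~62% busy (100% - 38.2% idle in the top snapshot)

ceiling_mbps = throughput_mbps / busy_fraction
print(f"~{ceiling_mbps:.0f} Mbps before this CPU would saturate")  # ~129 Mbps
```

So even at peak the box has real headroom beyond the 80 Mbps line rate, which is why the 30-85% spikes are not a problem in themselves.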
-
I have never used L2TP from a pfSense box. I have no idea how much overhead that might represent or which process might show it in top. I have to assume it uses some CPU cycles though, which might explain why your box looks more heavily loaded than I would have expected. Chris?
In the top output I assume you are maxing out your 80Mb WAN connection? And using the wifi interface (ath0)?
Steve
-
cmb, a good network card is expensive, from $100 up. Can you suggest something budget-friendly, but not less effective?
-
stephenw10, at peak, WAN traffic can reach 90-100 Mbit/s.
And yes, I use the WiFi network for mobile devices and netbooks.
-
But were you using the wifi interface at the same time as maxing out your WAN and what bandwidth was ath0 seeing when you ran 'top' above?
80Mbps over wifi is going to take more CPU cycles than 80Mbps via ethernet, if only because of the encryption. I'm just trying to determine exactly what the conditions were so that I might run a comparable test. Do you know exactly which CPU you used?
Steve
-
…plus you'd never see more than 45Mbps over wifi anyhow.
-
For the sake of getting some comparable figures up, here is the output of top -SH from my home box which is, as I previously mentioned, a 1.8 P4-M underclocked to 1.2GHz. It's quite low end. ;)
Here there is nothing much happening, no throughput to speak of.
last pid: 57933;  load averages: 0.63, 0.79, 0.44  up 155+21:01:22  12:28:57
109 processes: 4 running, 90 sleeping, 15 waiting
CPU: 0.4% user, 0.0% nice, 0.0% system, 0.4% interrupt, 99.3% idle
Mem: 72M Active, 51M Inact, 62M Wired, 204K Cache, 59M Buf, 300M Free
Swap:

  PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
   10 root     171 ki31     0K     8K RUN    3610.3 94.97% idle
   11 root     -32    -     0K   128K WAIT   462:52  0.00% {swi4: clock}
   11 root     -68    -     0K   128K RUN    237:24  0.00% {irq18: em0 ath0+}
   11 root     -68    -     0K   128K WAIT    90:45  0.00% {irq17: fxp2 fxp6}
    0 root     -68    0     0K    88K -       31:11  0.00% {em1 taskq}
    0 root     -68    0     0K    88K -       31:01  0.00% {ath0 taskq}
   13 root     -16    -     0K     8K -       30:51  0.00% yarrow
    0 root     -68    0     0K    88K -       28:33  0.00% {em0 taskq}
   11 root     -44    -     0K   128K WAIT    27:13  0.00% {swi1: netisr 0}
47006 root      76   20  3656K  1600K wait    21:20  0.00% sh
    0 root     -68    0     0K    88K -       20:23  0.00% {em2 taskq}
 6810 nobody    74  r30  3368K  1484K RUN     14:37  0.00% LCDd
   35 root      -8    -     0K     8K mdwait  13:50  0.00% md1
   20 root      44    -     0K     8K syncer  11:39  0.00% syncer
    2 root      -8    -     0K     8K -       11:16  0.00% g_event
 5227 root      65   20 46428K 17508K nanslp  11:08  0.00% php
17355 root      44    0  7612K  5184K kqread  10:50  0.00% lighttpd
   12 root     -16    -     0K     8K sleep   10:35  0.00% ng_queue
   11 root     -68    -     0K   128K WAIT     9:07  0.00% {irq19: fxp0 fxp4}
    4 root      -8    -     0K     8K -        8:26  0.00% g_down
   29 root      -8    -     0K     8K mdwait   7:15  0.00% md0
27610 root      44    0  8464K  4440K select   7:00  0.00% {mpd5}
47631 root      44    0  8984K  6272K bpf      4:10  0.00% tcpdump
44283 dhcpd     44    0  8436K  6552K select   4:02  0.00% dhcpd
    3 root      -8    -     0K     8K -        3:43  0.00% g_up
33556 root      44    0  3352K  1308K select   3:42  0.00% miniupnpd
 9595 root      64   20  3316K  1336K select   3:24  0.00% apinger
 1433 root      44    0  4948K  2456K select   2:48  0.00% syslogd
    7 root     -16    -     0K     8K pftm     2:38  0.00% pfpurge
    0 root     -16    0     0K    88K sched    2:36  0.00% {swapper}
   11 root     -64    -     0K   128K WAIT     2:27  0.00% {irq14: ata0}
32508 nobody    44    0  5556K  2824K select   1:28  0.00% dnsmasq
   21 root     -16    -     0K     8K sdflus   1:25  0.00% softdepflush
   14 root     -64    -     0K    96K -        1:11  0.00% {usbus2}
47955 root      44    0  3316K   892K piperd   1:10  0.00% logger
   18 root     -16    -     0K     8K psleep   1:05  0.00% bufdaemon
Here is the same situation, no throughput, but I have the webGUI dashboard open on another machine. You'll see it's quite a resource hog on a low end machine like this, and it doesn't appear as a process, only in the 'CPU: system' figure.
last pid: 52329;  load averages: 1.78, 0.74, 0.31  up 155+20:57:37  12:25:12
111 processes: 4 running, 91 sleeping, 16 waiting
CPU: 19.0% user, 0.4% nice, 38.4% system, 0.4% interrupt, 41.8% idle
Mem: 73M Active, 51M Inact, 62M Wired, 204K Cache, 59M Buf, 299M Free
Swap:

  PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
   10 root     171 ki31     0K     8K RUN    3610.3 32.96% idle
36179 root      76    0 43356K 18008K lockf    0:10  3.96% php
20371 root      76    0 43356K 17896K lockf    0:09  2.98% php
14491 root      76    0 43356K 16680K piperd   0:28  1.95% php
10354 root      76    0 43356K 15464K piperd   0:27  1.95% php
   11 root     -32    -     0K   128K WAIT   462:52  0.00% {swi4: clock}
   11 root     -68    -     0K   128K WAIT   237:24  0.00% {irq18: em0 ath0+}
   11 root     -68    -     0K   128K WAIT    90:45  0.00% {irq17: fxp2 fxp6}
    0 root     -68    0     0K    88K -       31:11  0.00% {em1 taskq}
    0 root     -68    0     0K    88K -       31:01  0.00% {ath0 taskq}
   13 root     -16    -     0K     8K -       30:51  0.00% yarrow
    0 root     -68    0     0K    88K -       28:33  0.00% {em0 taskq}
   11 root     -44    -     0K   128K WAIT    27:13  0.00% {swi1: netisr 0}
47006 root      76   20  3656K  1600K wait    21:20  0.00% sh
    0 root     -68    0     0K    88K -       20:23  0.00% {em2 taskq}
 6810 nobody    74  r30  3368K  1484K nanslp  14:37  0.00% LCDd
   35 root      -8    -     0K     8K mdwait  13:50  0.00% md1
   20 root      44    -     0K     8K syncer  11:39  0.00% syncer
    2 root      -8    -     0K     8K -       11:16  0.00% g_event
 5227 root      65   20 46428K 17508K nanslp  11:08  0.00% php
17355 root      44    0  7612K  5184K kqread  10:49  0.00% lighttpd
   12 root     -16    -     0K     8K sleep   10:35  0.00% ng_queue
   11 root     -68    -     0K   128K WAIT     9:07  0.00% {irq19: fxp0 fxp4}
    4 root      -8    -     0K     8K -        8:26  0.00% g_down
   29 root      -8    -     0K     8K mdwait   7:15  0.00% md0
27610 root      44    0  8464K  4440K select   7:00  0.00% {mpd5}
47631 root      44    0  8984K  6272K bpf      4:10  0.00% tcpdump
44283 dhcpd     44    0  8436K  6552K select   4:02  0.00% dhcpd
    3 root      -8    -     0K     8K -        3:43  0.00% g_up
33556 root      44    0  3352K  1308K select   3:42  0.00% miniupnpd
 9595 root      64   20  3316K  1336K select   3:24  0.00% apinger
 1433 root      44    0  4948K  2456K select   2:48  0.00% syslogd
    7 root     -16    -     0K     8K pftm     2:38  0.00% pfpurge
    0 root     -16    0     0K    88K sched    2:36  0.00% {swapper}
   11 root     -64    -     0K   128K WAIT     2:27  0.00% {irq14: ata0}
32508 nobody    44    0  5556K  2824K select   1:28  0.00% dnsmasq
Here I am maxing out my two WAN connections at 20Mbps and 23Mbps (the best I can get at midday) but don't have the dashboard open. The actual figures dance around a bit, but this looks like a good average. The interfaces used are fxp5 and fxp6 (the two WAN connections) and em1. Interestingly, fxp5 doesn't appear, so I assume it shares an IRQ. (Aside: could I improve matters by using a different fxp interface? Hmm.)
last pid: 17219;  load averages: 0.90, 0.63, 0.43  up 155+21:09:45  12:37:20
109 processes: 5 running, 89 sleeping, 15 waiting
CPU: 0.0% user, 0.4% nice, 17.6% system, 6.7% interrupt, 75.3% idle
Mem: 73M Active, 51M Inact, 62M Wired, 204K Cache, 59M Buf, 300M Free
Swap:

  PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
   10 root     171 ki31     0K     8K RUN    3610.4 69.97% idle
   11 root     -68    -     0K   128K WAIT   237:42 10.99% {irq18: em0 ath0+}
   11 root     -68    -     0K   128K RUN     91:01  6.98% {irq17: fxp2 fxp6}
    0 root     -68    0     0K    88K RUN     31:25  6.98% {em1 taskq}
   12 root     -16    -     0K     8K sleep   10:38  0.98% ng_queue
   11 root     -32    -     0K   128K WAIT   462:53  0.00% {swi4: clock}
    0 root     -68    0     0K    88K -       31:01  0.00% {ath0 taskq}
   13 root     -16    -     0K     8K -       30:52  0.00% yarrow
    0 root     -68    0     0K    88K -       28:33  0.00% {em0 taskq}
   11 root     -44    -     0K   128K WAIT    27:13  0.00% {swi1: netisr 0}
47006 root      76   20  3656K  1600K wait    21:21  0.00% sh
    0 root     -68    0     0K    88K -       20:24  0.00% {em2 taskq}
 6810 nobody    74  r30  3368K  1484K RUN     14:38  0.00% LCDd
   35 root      -8    -     0K     8K mdwait  13:50  0.00% md1
   20 root      44    -     0K     8K syncer  11:39  0.00% syncer
    2 root      -8    -     0K     8K -       11:16  0.00% g_event
 5227 root      65   20 46428K 17508K nanslp  11:09  0.00% php
17355 root      44    0  7612K  5184K kqread  10:50  0.00% lighttpd
   11 root     -68    -     0K   128K WAIT     9:07  0.00% {irq19: fxp0 fxp4}
    4 root      -8    -     0K     8K -        8:26  0.00% g_down
   29 root      -8    -     0K     8K mdwait   7:15  0.00% md0
27610 root      44    0  8464K  4440K select   7:00  0.00% {mpd5}
47631 root      44    0  8984K  6272K bpf      4:10  0.00% tcpdump
44283 dhcpd     44    0  8436K  6552K select   4:02  0.00% dhcpd
    3 root      -8    -     0K     8K -        3:43  0.00% g_up
33556 root      44    0  3352K  1308K select   3:42  0.00% miniupnpd
 9595 root      64   20  3316K  1336K select   3:24  0.00% apinger
 1433 root      44    0  4948K  2456K select   2:48  0.00% syslogd
    7 root     -16    -     0K     8K pftm     2:38  0.00% pfpurge
    0 root     -16    0     0K    88K sched    2:36  0.00% {swapper}
   11 root     -64    -     0K   128K WAIT     2:27  0.00% {irq14: ata0}
32508 nobody    44    0  5556K  2824K select   1:28  0.00% dnsmasq
   21 root     -16    -     0K     8K sdflus   1:25  0.00% softdepflush
   14 root     -64    -     0K    96K -        1:11  0.00% {usbus2}
47955 root      44    0  3316K   892K piperd   1:10  0.00% logger
   18 root     -16    -     0K     8K psleep   1:05  0.00% bufdaemon
Since my ath0 interface is 54Mbps theoretical max, I probably couldn't max my WAN interfaces through it, so I haven't tried. Edit: like Jim just said!
Steve
-
stephenw10
The WiFi network is not loaded at all for me.
The CPU is an Intel Pentium 4 at 3 GHz.
-
You haven't been able to get ath0 working?
There are a number of different cpus that could be 'Pentium 4 3GHz'. Any of them should be plenty powerful enough.
Steve
-
stephenw10, ath0 works well, I would say even with no problems.
The processor is really powerful, but it is strange that the load reaches 85% at peak. Is this weird, or is it normal?
-
Maybe you could try to replicate the test conditions I used and produce figures we can compare. Your throughput is going to be higher, but your CPU is substantially more powerful.
Did you try comparing with the webGUI dashboard open and closed? When I first realised that, I was quite surprised at the CPU load. I don't know what the exact conditions were when you took your 'top' screenshot earlier, but it looks like ath0 and swi5 are using a large number of CPU cycles, perhaps not an unreasonable amount for 100Mbps. However, you could not get that bandwidth through an ath0 interface, as Jim said.
You also have ng_queue using quite a bit. You have two, presumably because your CPU supports hyper-threading and hence appears as two CPUs, but both are far higher than mine. Are you doing QoS or any sort of traffic shaping?
Steve
-
Yes, I tried the WebGUI, and indeed the CPU load is lower when the dashboard is closed than when it is open.
The WiFi is little used and its load is very small; there is no such problem with it.
No, I do not use QoS and Traffic Shaper.