CPU Usage when network used
-
Hi Steve,
It's unfortunate about this RSS issue. I have another board that I plan to try out, though it's quite overkill, especially if only 1 core is going to be used for PPPoE. It does have some better onboard hardware that may help overall, but it is still just 2 GHz per core.
https://www.supermicro.com/products/motherboard/atom/A2SDi-H-TP4F.cfm
Cheers!
-
Yes. I have a PPPoE WAN but fortunately/unfortunately it's nowhere near fast enough to worry about this.
No benchmarks for the C3958, but if we assume it's the same as the C3858 with 4 more cores then it should give about 40% better single-thread performance.
It does seem like a waste of cores unless you virtualise it.
Steve
-
Hi Steve,
So I flipped it over. Performance so far looks drastically better. CPU in the GUI was about 5-6% while transferring over PPPoE. I believe it is still using just the 1 core.
PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
11 root 155 ki31 0K 256K CPU1 1 7:39 97.26% [idle{idle: cpu1}]
11 root 155 ki31 0K 256K CPU10 10 7:41 97.12% [idle{idle: cpu10}]
11 root 155 ki31 0K 256K CPU13 13 7:33 96.96% [idle{idle: cpu13}]
11 root 155 ki31 0K 256K CPU7 7 7:45 96.85% [idle{idle: cpu7}]
11 root 155 ki31 0K 256K CPU11 11 7:38 96.51% [idle{idle: cpu11}]
11 root 155 ki31 0K 256K RUN 4 7:43 96.46% [idle{idle: cpu4}]
11 root 155 ki31 0K 256K CPU3 3 7:44 96.46% [idle{idle: cpu3}]
11 root 155 ki31 0K 256K CPU9 9 7:36 96.26% [idle{idle: cpu9}]
11 root 155 ki31 0K 256K CPU5 5 7:42 95.99% [idle{idle: cpu5}]
11 root 155 ki31 0K 256K RUN 8 7:19 95.56% [idle{idle: cpu8}]
11 root 155 ki31 0K 256K CPU6 6 7:42 95.12% [idle{idle: cpu6}]
11 root 155 ki31 0K 256K CPU2 2 7:42 94.98% [idle{idle: cpu2}]
11 root 155 ki31 0K 256K CPU12 12 7:40 93.93% [idle{idle: cpu12}]
11 root 155 ki31 0K 256K RUN 15 7:35 87.04% [idle{idle: cpu15}]
11 root 155 ki31 0K 256K CPU14 14 7:31 82.95% [idle{idle: cpu14}]
11 root 155 ki31 0K 256K RUN 0 7:24 79.60% [idle{idle: cpu0}]

irq298: ix0:q0 2716423 6058
irq299: ix0:q1 244578 545
irq300: ix0:q2 461159 1029
irq301: ix0:q3 243416 543
irq302: ix0:q4 378891 845
irq303: ix0:q5 124788 278
irq304: ix0:q6 478729 1068
irq305: ix0:q7 125913 281
irq306: ix0:link 1 0
irq307: ix1:q0 326596 728
irq308: ix1:q1 254938 569
irq309: ix1:q2 614196 1370
irq310: ix1:q3 250402 558
irq311: ix1:q4 388996 868
irq312: ix1:q5 128709 287
irq313: ix1:q6 492403 1098
irq314: ix1:q7 130143 290
irq315: ix1:link 1 0
ix0 is PPPoE and ix1 is the internal LANs.
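To put a number on how lopsided the ix0 queues are, here is a minimal Python sketch that takes the ix0 totals pasted above and works out each queue's share of the interrupts. The function name `queue_shares` is made up for illustration; it is not part of any pfSense tooling.

```python
# Interrupt totals for the ix0 queues, copied from the output above.
IX0_TOTALS = {
    "q0": 2716423,
    "q1": 244578,
    "q2": 461159,
    "q3": 243416,
    "q4": 378891,
    "q5": 124788,
    "q6": 478729,
    "q7": 125913,
}


def queue_shares(totals):
    """Return each queue's fraction of the summed interrupt count."""
    grand_total = sum(totals.values())
    return {name: count / grand_total for name, count in totals.items()}


if __name__ == "__main__":
    for name, share in sorted(queue_shares(IX0_TOTALS).items()):
        print(f"{name}: {share:.1%}")
```

On these numbers q0 carries a bit over half of all ix0 interrupts, which would be consistent with the PPPoE traffic landing on a single queue.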
I was thinking about virtualizing. However, I've seen a lot of discussion suggesting this is not a great choice for a firewall. I'm open to exploring it more, though. Do you have any thoughts? Proxmox was my first choice.
Cheers!
-
Nice, what sort of throughput were you seeing at that point?
I can't really advise on hypervisors, I'm not using anything right now.
A lot of people here are using Proxmox though. ESXi is also popular.
Steve
-
Same throughput, but I believe this is more because of the source. I have not had a chance to test the internal network to see if anything there has improved. Will update once I have.
-
So testing with iperf3, I still don't seem to be getting anywhere close to 10G bandwidth.
It looks about spot on for 1G.
[ 41] 0.00-10.00 sec 56.4 MBytes 47.4 Mbits/sec 3258 sender
[ 41] 0.00-10.00 sec 56.4 MBytes 47.3 Mbits/sec receiver
[ 43] 0.00-10.00 sec 58.1 MBytes 48.8 Mbits/sec 3683 sender
[ 43] 0.00-10.00 sec 58.0 MBytes 48.6 Mbits/sec receiver
[SUM] 0.00-10.00 sec 1.10 GBytes 943 Mbits/sec 69930 sender
[SUM] 0.00-10.00 sec 1.10 GBytes 941 Mbits/sec receiver
Any ideas?
This is literally the SFP+ 10G interface on pfSense, to the switch, to the file server. The file server has two 10G bonded links. Nothing else running.
Cheers!
-
How many processes are you running there?
You have 8 queues so I don't expect to see any advantage over 8.
Is that result testing over 1G? What do you actually see over 10G?
I would anticipate something ~4Gbps maybe. Though if you're running iperf on the firewall it may reduce that.
Steve
-
My test with iperf was sending 20 connections (based on an example I saw online) and it looks pretty much saturated, as if it were 1G.
This is not 1G. This is using my internal network. pfSense reports it as 10G, the switch is all 10G, and the file server has 2x10G.
Curious: why would running iperf on the firewall reduce this?
FYI, the CPU did not appear stressed in any way.
Cheers!
-
That seems far too much like a 1G link limit to be coincidence.
Check that each part is actually linked at 10G.
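To put a number on that, here is a rough Python sketch of the best-case TCP goodput on an Ethernet link. It assumes a standard 1500-byte MTU, IPv4, untagged frames and TCP timestamps enabled, none of which has been confirmed for this setup, and the function name is just for illustration.

```python
def max_tcp_goodput_mbps(link_mbps, mtu=1500, tcp_timestamps=True):
    """Best-case TCP payload rate over Ethernet in Mbit/s (IPv4, no VLAN tag)."""
    ip_header = 20
    tcp_header = 20 + (12 if tcp_timestamps else 0)
    payload = mtu - ip_header - tcp_header
    # Per-frame cost on the wire: preamble + SFD (8), Ethernet header (14),
    # FCS (4) and inter-frame gap (12) on top of the MTU-sized packet.
    wire_bytes = mtu + 8 + 14 + 4 + 12
    return link_mbps * payload / wire_bytes


if __name__ == "__main__":
    print(f"1G:  {max_tcp_goodput_mbps(1000):.1f} Mbit/s")
    print(f"10G: {max_tcp_goodput_mbps(10000):.1f} Mbit/s")
```

Under those assumptions a 1G link tops out at roughly 941 Mbit/s, which is right where the [SUM] lines landed.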
Steve
-
So on my pfSense I can see all my internal interface VLANs are listed with:
media: Ethernet autoselect (10Gbase-T <full-duplex>)
on my NAS I see the bonded interfaces:
Settings for eth4:
Supported ports: [ FIBRE ]
Supported link modes: 1000baseKX/Full
10000baseKR/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: No
Advertised link modes: 1000baseKX/Full
10000baseKR/Full
Advertised pause frame use: Symmetric
Advertised auto-negotiation: No
Speed: 10000Mb/s
Duplex: Full
Port: Direct Attach Copper
PHYAD: 0
Transceiver: internal
Auto-negotiation: off
Cannot get wake-on-lan settings: Operation not permitted
Current message level: 0x00000014 (20)
link ifdown
Link detected: yes
Settings for eth5:
Supported ports: [ FIBRE ]
Supported link modes: 1000baseKX/Full
10000baseKR/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: No
Advertised link modes: 1000baseKX/Full
10000baseKR/Full
Advertised pause frame use: Symmetric
Advertised auto-negotiation: No
Speed: 10000Mb/s
Duplex: Full
Port: Direct Attach Copper
PHYAD: 0
Transceiver: internal
Auto-negotiation: off
Cannot get wake-on-lan settings: Operation not permitted
Current message level: 0x00000014 (20)
link ifdown
Link detected: yes
On the switch:
0/3 PC Mbr Enable Auto D 10G Full Up Enable Enable Disable (nas)
0/4 PC Mbr Enable Auto D 10G Full Up Enable Enable Disable (nas)
...
0/16 Enable Auto 10G Full Up Enable Enable Disable (pfsense)
-
Do you use traffic shaping/limiters?
-
Unless there is something configured from a default install I have not set anything myself. Going into the traffic shaper area it does not appear to have anything set.
For reference, I have dismantled my NAS bonded interfaces and am just using 1 interface now. Results are about the same, showing about 1G speed.
Thanks!
-
Update: I have now separated the NAS from the rest of the VLANs I had, to try and ensure nothing is going on there. Now it's on its own 10G interface. Results are about the same.
Another interesting fact: if I reverse the iperf direction (NAS to PFsense), I can see the bandwidth spike up to more around the 2G range.
Doing -P20 (20 transfers at once):
[SUM] 0.00-10.00 sec 2.71 GBytes 2.33 Gbits/sec receiver
Without, it will drop down to a little over 1G.
Any ideas?
-
Is that using the -R switch? Can you try running the actual client on the NAS and server on pfSense? That will open firewall states differently.
You could also try disabling pf as a test. If there is a CPU restriction still that should show far higher throughput.
Steve
-
I had not used -R before, but I tried it with and without -P20 and the results seem to be about the same.
I have also tried replacing the SFP+ cables with brand new ones. No difference.
Disabling PF (firewall) did not appear to do anything noticeable.
Two things I have noticed now.
1. With PFSense as the client and the file server as the server, the speed is best, and with parallel connections (-P20) it gets a little over 2G. However, when I reverse this and have PFSense as the server and the file server as the client, the speeds are drastically worse.
2. There appear to be a lot of retries on the iperf sending side. I am not sure if this is a "normal" result or not. It happens regardless of the direction, but always on the sender.
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 304 MBytes 255 Mbits/sec 15 sender
[ 5] 0.00-10.00 sec 302 MBytes 253 Mbits/sec receiver
[ 7] 0.00-10.00 sec 25.5 MBytes 21.4 Mbits/sec 9 sender
[ 7] 0.00-10.00 sec 24.2 MBytes 20.3 Mbits/sec receiver
[ 9] 0.00-10.00 sec 210 MBytes 176 Mbits/sec 15 sender
[ 9] 0.00-10.00 sec 208 MBytes 174 Mbits/sec receiver
[ 11] 0.00-10.00 sec 116 MBytes 97.5 Mbits/sec 9 sender
[ 11] 0.00-10.00 sec 114 MBytes 95.9 Mbits/sec receiver
[ 13] 0.00-10.00 sec 35.9 MBytes 30.1 Mbits/sec 19 sender
[ 13] 0.00-10.00 sec 34.2 MBytes 28.7 Mbits/sec receiver
[ 15] 0.00-10.00 sec 104 MBytes 87.1 Mbits/sec 17 sender
[ 15] 0.00-10.00 sec 102 MBytes 85.5 Mbits/sec receiver
[ 17] 0.00-10.00 sec 127 MBytes 106 Mbits/sec 13 sender
[ 17] 0.00-10.00 sec 124 MBytes 104 Mbits/sec receiver
[ 19] 0.00-10.00 sec 449 MBytes 377 Mbits/sec 11 sender
[ 19] 0.00-10.00 sec 447 MBytes 375 Mbits/sec receiver
[ 21] 0.00-10.00 sec 64.1 MBytes 53.8 Mbits/sec 18 sender
[ 21] 0.00-10.00 sec 62.4 MBytes 52.3 Mbits/sec receiver
[ 23] 0.00-10.00 sec 261 MBytes 219 Mbits/sec 19 sender
[ 23] 0.00-10.00 sec 258 MBytes 216 Mbits/sec receiver
[ 25] 0.00-10.00 sec 182 MBytes 153 Mbits/sec 15 sender
[ 25] 0.00-10.00 sec 180 MBytes 151 Mbits/sec receiver
[ 27] 0.00-10.00 sec 129 MBytes 108 Mbits/sec 13 sender
[ 27] 0.00-10.00 sec 127 MBytes 106 Mbits/sec receiver
[ 29] 0.00-10.00 sec 288 MBytes 242 Mbits/sec 13 sender
[ 29] 0.00-10.00 sec 285 MBytes 239 Mbits/sec receiver
[ 31] 0.00-10.00 sec 48.7 MBytes 40.8 Mbits/sec 11 sender
[ 31] 0.00-10.00 sec 47.3 MBytes 39.6 Mbits/sec receiver
[ 33] 0.00-10.00 sec 332 MBytes 279 Mbits/sec 13 sender
[ 33] 0.00-10.00 sec 330 MBytes 277 Mbits/sec receiver
[ 35] 0.00-10.00 sec 76.5 MBytes 64.2 Mbits/sec 17 sender
[ 35] 0.00-10.00 sec 74.6 MBytes 62.6 Mbits/sec receiver
[ 37] 0.00-10.00 sec 233 MBytes 196 Mbits/sec 16 sender
[ 37] 0.00-10.00 sec 230 MBytes 193 Mbits/sec receiver
[ 39] 0.00-10.00 sec 78.1 MBytes 65.5 Mbits/sec 16 sender
[ 39] 0.00-10.00 sec 76.6 MBytes 64.3 Mbits/sec receiver
[ 41] 0.00-10.00 sec 58.4 MBytes 49.0 Mbits/sec 16 sender
[ 41] 0.00-10.00 sec 57.1 MBytes 47.9 Mbits/sec receiver
[ 43] 0.00-10.00 sec 67.5 MBytes 56.6 Mbits/sec 18 sender
[ 43] 0.00-10.00 sec 65.8 MBytes 55.2 Mbits/sec receiver
[SUM] 0.00-10.00 sec 3.11 GBytes 2.68 Gbits/sec 293 sender
[SUM] 0.00-10.00 sec 3.08 GBytes 2.64 Gbits/sec receiver
iperf Done.
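The per-stream numbers above are very uneven (from about 20 to 377 Mbit/s). A quick way to summarise a paste like this is a small Python sketch along the following lines. The regex is only an assumption about the text layout shown here, the function name is hypothetical, and iperf3's `--json` output would be a more robust thing to parse.

```python
import re

# A few per-stream sender lines copied from the output above.
SAMPLE = """\
[ 5] 0.00-10.00 sec 304 MBytes 255 Mbits/sec 15 sender
[ 7] 0.00-10.00 sec 25.5 MBytes 21.4 Mbits/sec 9 sender
[ 9] 0.00-10.00 sec 210 MBytes 176 Mbits/sec 15 sender
[ 19] 0.00-10.00 sec 449 MBytes 377 Mbits/sec 11 sender
"""

# Matches numbered stream lines only; the [SUM] line is skipped on purpose.
SENDER_LINE = re.compile(
    r"\[\s*(\d+)\]\s+[\d.]+-[\d.]+\s+sec\s+[\d.]+\s+\w+\s+"
    r"([\d.]+)\s+([KMG])bits/sec\s+(\d+)\s+sender"
)
TO_MBITS = {"K": 1e-3, "M": 1.0, "G": 1e3}


def parse_senders(text):
    """Return (stream_id, mbits_per_sec, retransmits) for each sender line."""
    rows = []
    for line in text.splitlines():
        match = SENDER_LINE.search(line)
        if match:
            stream_id, rate, unit, retr = match.groups()
            rows.append((int(stream_id), float(rate) * TO_MBITS[unit], int(retr)))
    return rows


if __name__ == "__main__":
    rows = parse_senders(SAMPLE)
    rates = [rate for _, rate, _ in rows]
    print(f"streams: {len(rows)}")
    print(f"slowest: {min(rates)} Mbit/s, fastest: {max(rates)} Mbit/s")
    print(f"retransmits: {sum(retr for _, _, retr in rows)}")
```

Feeding it the full paste instead of the four sample lines gives the spread across all 20 streams.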
I have started engaging support with the file server manufacturer to see if they have any thoughts. It's looking more and more likely that PFSense is not the issue here. But as always, I'm open to any suggestions...
Cheers!
-
Use iperf3 if you can. That's available for installing from the command line in pfSense.
pfSense is not optimised to be a server (or client in this case). It will almost certainly perform better testing through it rather than to it.
Steve
-
Yes, sorry, I am using iperf3, as I had compatibility issues with the NAS before.
I will also be trying to fire up some sort of test box to see if it can achieve more desirable results.
Cheers!
-
Hi again,
So I did manage to fire up a test box. Results are better, but unfortunately it does look like there is something with the PFSense hw/config.
Test 1: Connected the test box to the same network as the NAS. Did a basic iperf3 -c mynas and speeds show about 10G.
Test 2: Connected the test box to another network (same switch), which has the test box route through the PFSense box, and the speeds dropped. I'll give it that the speeds were still better than in my other testing, but still considerably lower than without PFSense.
I also checked the CPU graph on PFSense and it stayed pretty steady around the 12% mark during my tests.
Thoughts?
Nas same network:
Through PFSense:
-
Your CPU did sit around 12%, right? What about running "top" and looking for Interrupts etc.? Could be that the buffers, caches, interfaces are maxed out via IRQ handling?
-
Yeah, 12% overall tells us nothing really but that's not miles away from what I expect for that CPU.
Were you able to test with pf disabled?
pfctl -d
Steve