Abysmal Performance after pfSense hardware upgrade
-
Hmm, so still interrupt load with all three disabled? There must be something else set there. You have any custom sysctls set?
-
@stephenw10 said in Abysmal Performance after pfSense hardware upgrade:
Hmm, so still interrupt load with all three disabled? There must be something else set there. You have any custom sysctls set?
Not that I recall; then again, this has been an evolution of my early usage of pfSense, which is going on about 15 years at this point. It always seemed too complicated to start over, and over time that feeling continued to grow.
Here are my current System Tunables:
-
Hmm, nothing unexpected there. You have any custom loader values in /boot/loader.conf.local?
What other packages do you have installed?
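If it helps, this is one way to check from the shell (a sketch; pfSense regenerates /boot/loader.conf itself, which is why persistent custom values normally go in loader.conf.local):

```shell
# Show loader.conf.local if it exists; nothing printed means no custom
# loader values are hiding there.
ls -l /boot/loader.conf.local 2>/dev/null && cat /boot/loader.conf.local
```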
-
[2.7.2-RELEASE][admin@pfSense-Edge01.scs.lan]/boot: vi loader.conf
kern.cam.boot_delay=10000
kern.geom.label.disk_ident.enable="0"
kern.geom.label.gptid.enable="0"
kern.ipc.nmbclusters="1000000"
kern.ipc.nmbjumbo9="524288"
kern.ipc.nmbjumbop="524288"
opensolaris_load="YES"
zfs_load="YES"
opensolaris_load="YES"
zfs_load="YES"
kern.cam.boot_delay=10000
kern.geom.label.disk_ident.enable="0"
kern.geom.label.gptid.enable="0"
kern.ipc.nmbclusters="1000000"
kern.ipc.nmbjumbo9="524288"
kern.ipc.nmbjumbop="524288"
kern.geom.label.disk_ident.enable="0"
kern.geom.label.gptid.enable="0"
cryptodev_load="YES"
zfs_load="YES"
boot_serial="NO"
autoboot_delay="3"
hw.hn.vf_transparent="0"
hw.hn.use_if_start="1"
net.link.ifqmaxlen="128"
machdep.hwpstate_pkg_ctrl="1"
net.pf.states_hashsize="4194304"
-
No loader.conf.local file though?
-
Not that I'm seeing. I could create one if there are persistent items that need to be added.
-
Ok good, nothing unexpected hiding there.
Is Snort running on the interfaces passing traffic during the test? I don't see it in any of your output.
The interrupt load shown really seems to line up with the ntop load though. It makes me wonder if something there is actually still enabled.
-
I'm not running Snort at the moment, but I do have pfBlockerNG running.
-
Hmm, pfBlocker doesn't run continually against all traffic like that. Any load created by large lists just appears as firewall load in the task queues.
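For what it's worth, you can see that load directly in the pf counters rather than as a separate process (standard pfctl queries, nothing pfBlocker-specific):

```shell
# pfBlockerNG lists become pf tables, so their cost shows up as table
# size and rule-evaluation counters, not as a per-packet daemon.
pfctl -s Tables | wc -l   # number of tables currently loaded
pfctl -s info             # state table counters and rule evaluations
```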
It's almost as if the NICs are running in a different mode.
-
Let me know if there is anything you can think of for me to try, or anything else you'd like me to check.
-
The throughput you're seeing now is as expected though?
-
It is. You've just got me curious now about what is causing the interrupts.
I'm considering getting the 1U version of my new router. If I do, I'll perform a clean install and rebuild my system one brick at a time to see if I can figure out what has been causing the GUI slowdown I've had since I moved to my last hardware. I can try to keep an eye on the interrupts as well.
Here are the stats from Status -> Interfaces.
I seem to be a little beyond the interrupt range you said shouldn't be "unusual".
https://forum.netgate.com/topic/179674/netgate-6100-significant-interface-interrupt-rates/9
-
Mmm, but nowhere near 10K! I agree, though, I find it odd that you see the interrupt loading in the top output and I do not on a similar C3K system. Like whilst passing 1Gbps iperf traffic on a 5100:
last pid: 57718;  load averages:  0.55,  0.36,  0.34    up 0+06:27:17  22:57:28
339 threads:   7 running, 288 sleeping, 44 waiting
CPU 0:  0.0% user,  0.0% nice, 28.6% system,  0.0% interrupt, 71.4% idle
CPU 1:  0.0% user,  0.0% nice, 23.1% system,  0.0% interrupt, 76.9% idle
CPU 2:  0.4% user,  0.0% nice, 24.7% system,  0.0% interrupt, 74.9% idle
CPU 3:  0.0% user,  0.0% nice, 34.1% system,  0.0% interrupt, 65.9% idle
Mem: 45M Active, 258M Inact, 505M Wired, 3028M Free
ARC: 127M Total, 28M MFU, 93M MRU, 416K Anon, 962K Header, 4535K Other
     92M Compressed, 229M Uncompressed, 2.48:1 Ratio
Swap: 1024M Total, 1024M Free

  PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   11 root     187 ki31     0B    64K RUN     0 377:22  73.68% [idle{idle: cpu0}]
   11 root     187 ki31     0B    64K RUN     2 376:55  73.23% [idle{idle: cpu2}]
   11 root     187 ki31     0B    64K RUN     1 377:47  72.55% [idle{idle: cpu1}]
   11 root     187 ki31     0B    64K CPU3    3 376:45  72.51% [idle{idle: cpu3}]
    0 root     -60    -     0B  1648K -       0   0:06  19.51% [kernel{if_io_tqg_0}]
    0 root     -60    -     0B  1648K -       3   0:05  18.78% [kernel{if_io_tqg_3}]
    0 root     -60    -     0B  1648K CPU1    1   0:04  18.60% [kernel{if_io_tqg_1}]
57718 root      34    0    19M  8644K CPU0    0   0:03  17.05% iperf3 -c 172.21.16.8 -P 3 -t 30{iperf3}
57718 root      36    0    19M  8644K sbwait  3   0:04  16.93% iperf3 -c 172.21.16.8 -P 3 -t 30{iperf3}
57718 root      40    0    19M  8644K sbwait  1   0:03  16.74% iperf3 -c 172.21.16.8 -P 3 -t 30{iperf3}
    0 root     -60    -     0B  1648K -       1   0:36   0.14% [kernel{if_config_tqg_0}]
78943 root      20    0    14M  4716K CPU2    2   0:00   0.12% top -HaSP
    7 root     -16    -     0B    16K pftm    0   0:09   0.03% [pf purge]
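If you want to see where the interrupts are landing per device rather than the aggregate in top, the standard FreeBSD counter report works on pfSense too (device names will differ on your hardware):

```shell
# Per-device interrupt totals since boot, plus an average rate column
# (interrupts/second). High rates pinned to one NIC queue stand out here.
vmstat -i
```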
-
What settings did you utilize for the iperf test?
I turned on the server on my pfSense box
Then connected from my Windows box with
iperf3.exe -c 192.168.1.1 -P 2
This way I get over 1Gbps.
-
I ran the server on a Linux box on my network and then ran a client on pfSense:
[24.03-BETA][admin@5100-2.stevew.lan]/root: iperf3 -c 172.21.16.8 -P 3 -t 30
Connecting to host 172.21.16.8, port 5201
[  5] local 172.21.16.75 port 22757 connected to 172.21.16.8 port 5201
[  7] local 172.21.16.75 port 1280 connected to 172.21.16.8 port 5201
[  9] local 172.21.16.75 port 22575 connected to 172.21.16.8 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  39.2 MBytes   329 Mbits/sec    0    938 KBytes
[  7]   0.00-1.00   sec  39.2 MBytes   329 Mbits/sec    0    939 KBytes
[  9]   0.00-1.00   sec  39.2 MBytes   329 Mbits/sec    0    937 KBytes
[SUM]   0.00-1.00   sec   118 MBytes   986 Mbits/sec    0
The 5100 ix NICs are 1G but the SoC is the same.
-
I averaged about 1.9Gbps with the interrupts at 24%-31% using your settings.
I'm guessing there are gremlins in the config from being carried forward over a decade of multiple hardware and OS upgrades.
-
Perhaps a long shot, but could it be that you have some power-saving settings for the CPU pegging it at a low frequency? I remember reading about that in relation to high interrupt load.
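One way to check, assuming the standard FreeBSD cpufreq sysctls are present on this platform:

```shell
# Current CPU frequency and the available levels (MHz). If freq stays
# well below the top freq_levels entry while under load, something is
# pinning it down.
sysctl dev.cpu.0.freq dev.cpu.0.freq_levels
# Is powerd running? (Configured under System > Advanced in the GUI.)
ps aux | grep '[p]owerd'
```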
-
Hmm, possible.
-
I didn't see any power-saving settings explicitly enabled.
-
I'm ordering the rackmount version shortly, and I'll test restoring one component at a time to see if the interrupts persist, or at what point they increase.