UPDATE: It seems that the ixl driver has been ported to iflib and is available in FreeBSD 12. As pfSense is adopting FreeBSD 12 for its 2.5 release, I decided to try a development snapshot of pfSense 2.5 (pfSense-CE-memstick-2.5.0-DEVELOPMENT-amd64-20191031-1313.img).
Below are the updated results for both pfSense 2.4.4-p3 and pfSense 2.5.0-dev:
+------+---------+---------------+---------------+
| MTU  | streams | pfSense 2.4.4 | pfSense 2.5.0 |
+------+---------+---------------+---------------+
|      |    1    |   1.30 Gbps   |   1.66 Gbps   |
| 1500 +---------+---------------+---------------+
|      |    8    |   1.81 Gbps   |   2.52 Gbps   |
+------+---------+---------------+---------------+
|      |    1    |   6.13 Gbps   |   2.92 Gbps*  |
| 9000 +---------+---------------+---------------+
|      |    8    |   8.76 Gbps   |   9.77 Gbps   |
+------+---------+---------------+---------------+
*The result for pfSense 2.5.0 with 9000 MTU and 1 stream is suspect. Throughput dropped dramatically with pfSense 2.5.0 in this test, while it improved with pfSense 2.5.0 in all of the other tests. I also saw some curious behavior during this specific test. In all other tests the throughput was relatively constant from second to second, but during this test it varied wildly between 1 and 6 Gbps from one second to the next.
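To put numbers on how far the single-stream 9000 MTU result departs from the others, here is a minimal Python sketch that computes the relative change from 2.4.4 to 2.5.0 for each row. The values are copied from the table above; the `results` dictionary and the `percent_change` helper are just illustrative names, not part of any test tooling.

```python
# Throughput in Gbps, copied from the results table:
# (MTU, streams) -> (pfSense 2.4.4, pfSense 2.5.0)
results = {
    (1500, 1): (1.30, 1.66),
    (1500, 8): (1.81, 2.52),
    (9000, 1): (6.13, 2.92),
    (9000, 8): (8.76, 9.77),
}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Relative change per test case, keyed the same way as `results`.
changes = {key: percent_change(old, new) for key, (old, new) in results.items()}

for (mtu, streams), pct in changes.items():
    print(f"MTU {mtu}, {streams} stream(s): {pct:+.1f}%")
```

Three of the four cases come out as gains of roughly 12% to 39%, while the 9000 MTU single-stream case is a drop of roughly 52%, which is why that one number stands out.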
Interrupt processing definitely improved with the move to pfSense 2.5.0. Below is a snapshot of the output from "top" on the left pfSense server during the 1500 MTU test. Interrupt processing and CPU usage seem to be distributed nicely across the cores. In fact, most of the CPUs seem to be underutilized, so I am no longer sure what bottleneck is limiting performance.
[2.5.0-DEVELOPMENT][admin@left]/root: top -P
last pid: 46970; load averages: 1.65, 1.10, 0.97 up 0+16:56:20 08:44:13
49 processes: 2 running, 47 sleeping
CPU 0: 3.5% user, 0.0% nice, 15.7% system, 3.1% interrupt, 77.6% idle
CPU 1: 2.4% user, 0.0% nice, 14.2% system, 2.0% interrupt, 81.5% idle
CPU 2: 0.8% user, 0.0% nice, 23.1% system, 2.0% interrupt, 74.1% idle
CPU 3: 2.7% user, 0.0% nice, 18.0% system, 5.5% interrupt, 73.7% idle
CPU 4: 3.5% user, 0.0% nice, 20.4% system, 2.4% interrupt, 73.7% idle
CPU 5: 1.6% user, 0.0% nice, 23.1% system, 4.3% interrupt, 71.0% idle
CPU 6: 2.4% user, 0.0% nice, 16.9% system, 1.6% interrupt, 79.2% idle
CPU 7: 7.5% user, 0.0% nice, 52.5% system, 0.8% interrupt, 39.2% idle
CPU 8: 5.1% user, 0.0% nice, 24.7% system, 1.6% interrupt, 68.6% idle
CPU 9: 2.7% user, 0.0% nice, 16.9% system, 2.4% interrupt, 78.0% idle
CPU 10: 3.1% user, 0.0% nice, 16.1% system, 3.1% interrupt, 77.6% idle
CPU 11: 2.7% user, 0.0% nice, 24.3% system, 2.4% interrupt, 70.6% idle
CPU 12: 4.7% user, 0.0% nice, 21.2% system, 2.4% interrupt, 71.8% idle
CPU 13: 2.4% user, 0.0% nice, 22.0% system, 2.0% interrupt, 73.7% idle
CPU 14: 0.0% user, 0.0% nice, 100% system, 0.0% interrupt, 0.0% idle
CPU 15: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
Mem: 77M Active, 39M Inact, 522M Wired, 69M Buf, 15G Free
Swap: 3979M Total, 3979M Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
8645 root 17 52 0 70M 20M sigwai 7 8:08 111.07% charon
26414 root 2 92 0 18M 6380K CPU0 0 6:52 99.59% ntpd
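For anyone who wants to compare snapshots like the one above without eyeballing them, here is a rough Python sketch that parses the per-CPU lines of "top -P" output and summarizes them. It assumes the line format shown in this snapshot (user, nice, system, interrupt, idle); the regular expression and the `parse_cpu_lines` and `summarize` helpers are my own illustrative names, and the three sample lines are taken from the output above.

```python
import re

# Matches per-CPU lines in the format shown in the snapshot, e.g.
# "CPU 0: 3.5% user, 0.0% nice, 15.7% system, 3.1% interrupt, 77.6% idle"
CPU_LINE = re.compile(
    r"CPU\s+(\d+):\s+([\d.]+)% user,\s+([\d.]+)% nice,\s+([\d.]+)% system,"
    r"\s+([\d.]+)% interrupt,\s+([\d.]+)% idle"
)


def parse_cpu_lines(text):
    """Return one dict per CPU line found in the text."""
    rows = []
    for match in CPU_LINE.finditer(text):
        user, nice, system, interrupt, idle = (float(g) for g in match.groups()[1:])
        rows.append({
            "cpu": int(match.group(1)),
            "user": user,
            "nice": nice,
            "system": system,
            "interrupt": interrupt,
            "idle": idle,
        })
    return rows


def summarize(rows):
    """Mean idle and interrupt percentages, plus the CPU with the least idle time."""
    count = len(rows)
    return {
        "cpus": count,
        "mean_idle": sum(r["idle"] for r in rows) / count,
        "mean_interrupt": sum(r["interrupt"] for r in rows) / count,
        "busiest_cpu": min(rows, key=lambda r: r["idle"])["cpu"],
    }


# Three lines copied from the snapshot above, used as sample input.
sample = """\
CPU 0: 3.5% user, 0.0% nice, 15.7% system, 3.1% interrupt, 77.6% idle
CPU 7: 7.5% user, 0.0% nice, 52.5% system, 0.8% interrupt, 39.2% idle
CPU 14: 0.0% user, 0.0% nice, 100% system, 0.0% interrupt, 0.0% idle
"""

rows = parse_cpu_lines(sample)
summary = summarize(rows)
print(summary)
```

Feeding it the full snapshot instead of the three sample lines would give the averages across all 16 CPUs and point out the core with the least idle time, which in this snapshot is CPU 14 at 100% system.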