Intel 10Gb ix X552
-
FreeBSD 10.3 with pf enabled still operates at full speed.
driver:
root@firewall01:~ # sysctl dev.ix.0.%desc
dev.ix.0.%desc: Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 3.1.13-k
-
Have you made any configuration changes for the NIC? May I suggest trying these, then running a test (in loader.conf):
legal.intel_iwi.license_ack=1
legal.intel_ipw.license_ack=1
hw.igb.enable_aim=0
hw.igb.rxd=4096
hw.igb.txd=4096
hw.igb.rx_process_limit=-1
hw.igb.max_interrupt_rate=64000
In system tunables, for each port 0, 1, etc.:
dev.igb.0.fc=0
dev.igb.0.eee_disabled=1
If you have more ports, also configure dev.igb.1.fc=0, dev.igb.1.eee_disabled=1, and so on.
You can also set hw.igb.num_queues; for a quad-core CPU set it to 2, e.g. hw.igb.num_queues=2 in loader.conf.
Note that these are igb tunables. You will need to find the ix equivalents, e.g. hw.igb.max_interrupt_rate becomes hw.ix.max_interrupt_rate.
Some may be here: https://downloadmirror.intel.com/14687/eng/readme.txt
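For the ix driver, the equivalents would look roughly like this in /boot/loader.conf. This is just a sketch; the exact tunable names depend on your driver version, so check what actually exists with `sysctl -a | grep hw.ix` before relying on any of them:

```shell
# /boot/loader.conf -- hypothetical ix equivalents of the igb tunables above;
# verify each name against your driver version first
hw.ix.rxd="4096"              # RX descriptors per queue
hw.ix.txd="4096"              # TX descriptors per queue
hw.ix.rx_process_limit="-1"   # no limit on RX packets processed per interrupt
hw.ix.max_interrupt_rate="64000"
hw.ix.enable_aim="0"          # disable adaptive interrupt moderation
```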
Also, do a test with all offloading disabled (checksum, LRO, segmentation), then test again with only checksum enabled. Some features of the NIC must all be enabled together to work properly; for example, checksum may need LRO and segmentation too. The details are in the driver notes, IIRC. 10GbE is a different beast to configure at times. BTW, can you post a dmesg?
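A quick way to run that offload test, assuming the usual ifconfig option names (availability varies by driver):

```shell
# baseline: all offloads off on ix0
ifconfig ix0 -rxcsum -txcsum -tso4 -tso6 -lro
# then re-enable checksum offload only and re-test
ifconfig ix0 rxcsum txcsum
# confirm which options are actually set
ifconfig ix0 | grep -i options
```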
-
Have you tried polling?
-
/root: dmesg
Copyright 1992-2016 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 10.3-RELEASE-p19 #1 76a12c4e6(RELENG_2_3_4): Fri Jul 14 15:02:35 CDT 2017
root@ce23-amd64-builder:/builder/pfsense-234/tmp/obj/builder/pfsense-234/tmp/FreeBSD-src/sys/pfSense amd64
FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
CPU: Intel(R) Pentium(R) CPU D1508 @ 2.20GHz (2200.05-MHz K8-class CPU)
Origin="GenuineIntel" Id=0x50663 Family=0x6 Model=0x56 Stepping=3
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x7ffefbff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x121<LAHF,ABM,Prefetch>
  Structured Extended Features=0x21cbfbb<FSGSBASE,TSCADJ,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,NFPUSG,PQE,RDSEED,ADX,SMAP,PROCTRACE>
  XSAVE Features=0x1<XSAVEOPT>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
TSC: P-state invariant, performance statistics
real memory = 4294967296 (4096 MB)
avail memory = 3957551104 (3774 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <SUPERM SMCI--MB>
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s) x 2 SMT threads
cpu0 (BSP): APIC ID: 0
cpu1 (AP): APIC ID: 1
cpu2 (AP): APIC ID: 2
cpu3 (AP): APIC ID: 3
random: <Software, Yarrow> initialized
ioapic0 <Version 2.0> irqs 0-23 on motherboard
lapic0: Forcing LINT1 to edge trigger
wlan: mac acl policy registered
netmap: loaded module
kbd1 at kbdmux0
module_register_init: MOD_LOAD (vesa, 0xffffffff81017260, 0) error 19
cryptosoft0: <software crypto> on motherboard
padlock0: No ACE support.
acpi0: <SUPERM SMCI--MB> on motherboard
acpi0: Power Button (fixed)
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
atrtc0: <AT realtime clock> port 0x70-0x71,0x74-0x77 irq 8 on acpi0
Event timer "RTC" frequency 32768 Hz quality 0
attimer0: <AT timer> port 0x40-0x43,0x50-0x53 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 550
Event timer "HPET1" frequency 14318180 Hz quality 440
Event timer "HPET2" frequency 14318180 Hz quality 440
Event timer "HPET3" frequency 14318180 Hz quality 440
Event timer "HPET4" frequency 14318180 Hz quality 440
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
pcib0: <ACPI Host-PCI bridge> on acpi0
pci255: <ACPI PCI bus> on pcib0
pcib1: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> irq 16 at device 2.0 on pci0
pci2: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> irq 16 at device 2.2 on pci0
ACPI Warning: \134_SB_.PCI0.BR2C._PRT: Return Package has no elements (empty) (20150515/nsprepkg-137)
pci3: <ACPI PCI bus> on pcib4
pcib4: no PRT entry for 3.0.INTA
pcib4: no PRT entry for 3.0.INTB
ix0: <Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 3.1.14> mem 0xfbc00000-0xfbdfffff,0xfbe04000-0xfbe07fff irq 11 at device 0.0 on pci3
ix0: Using MSIX interrupts with 5 vectors
ix0: Ethernet address: 0c:c4:7a:c9:44:b2
ix0: netmap queues/slots: TX 4/2048, RX 4/2048
ix1: <Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 3.1.14> mem 0xfba00000-0xfbbfffff,0xfbe00000-0xfbe03fff irq 10 at device 0.1 on pci3
ix1: Using MSIX interrupts with 5 vectors
ix1: Ethernet address: 0c:c4:7a:c9:44:b3
ix1: netmap queues/slots: TX 4/2048, RX 4/2048
pcib5: <ACPI PCI-PCI bridge> irq 16 at device 3.0 on pci0
pci4: <ACPI PCI bus> on pcib5
xhci0: <Intel Lynx Point USB 3.0 controller> mem 0xfb300000-0xfb30ffff irq 19 at device 20.0 on pci0
xhci0: 32 bytes context size, 64-bit DMA
usbus0: waiting for BIOS to give up control
xhci0: Port routing mask set to 0xffffffff
usbus0 on xhci0
pci0: <simple comms> at device 22.0 (no driver attached)
pci0: <simple comms> at device 22.1 (no driver attached)
ehci0: <Intel Lynx Point USB 2.0 controller USB-B> mem 0xfb314000-0xfb3143ff irq 18 at device 26.0 on pci0
usbus1: EHCI version 1.0
usbus1 on ehci0
pcib6: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
pci5: <ACPI PCI bus> on pcib6
pcib7: <ACPI PCI-PCI bridge> irq 16 at device 28.4 on pci0
pci6: <ACPI PCI bus> on pcib7
pcib8: <ACPI PCI-PCI bridge> at device 0.0 on pci6
pci7: <ACPI PCI bus> on pcib8
vgapci0: <VGA-compatible display> port 0xe000-0xe07f mem 0xfa000000-0xfaffffff,0xfb000000-0xfb01ffff irq 18 at device 0.0 on pci7
vgapci0: Boot video device
ehci1: <Intel Lynx Point USB 2.0 controller USB-A> mem 0xfb313000-0xfb3133ff irq 18 at device 29.0 on pci0
usbus2: EHCI version 1.0
usbus2 on ehci1
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
ahci0: <Intel Lynx Point AHCI SATA controller> port 0xf070-0xf077,0xf060-0xf063,0xf050-0xf057,0xf040-0xf043,0xf020-0xf03f mem 0xfb312000-0xfb3127ff irq 16 at device 31.2 on pci0
ahci0: AHCI v1.30 with 6 6Gbps ports, Port Multiplier not supported
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich1: <AHCI channel> at channel 1 on ahci0
ahcich2: <AHCI channel> at channel 2 on ahci0
ahcich3: <AHCI channel> at channel 3 on ahci0
ahcich4: <AHCI channel> at channel 4 on ahci0
ahcich5: <AHCI channel> at channel 5 on ahci0
ahciem0: <AHCI enclosure management bridge> on ahci0
acpi_button0: <Power Button> on acpi0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
orm0: <ISA Option ROM> at iomem 0xc0000-0xc7fff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: CGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3d0-0x3db iomem 0xb8000-0xbffff on isa0
ppc0: cannot reserve I/O port range
coretemp0: <CPU On-Die Thermal Sensors> on cpu0
est0: <Enhanced SpeedStep Frequency Control> on cpu0
coretemp1: <CPU On-Die Thermal Sensors> on cpu1
est1: <Enhanced SpeedStep Frequency Control> on cpu1
coretemp2: <CPU On-Die Thermal Sensors> on cpu2
est2: <Enhanced SpeedStep Frequency Control> on cpu2
coretemp3: <CPU On-Die Thermal Sensors> on cpu3
est3: <Enhanced SpeedStep Frequency Control> on cpu3
Timecounters tick every 1.000 msec
random: unblocking device.
usbus0: 5.0Gbps Super Speed USB v3.0
usbus1: 480Mbps High Speed USB v2.0
usbus2: 480Mbps High Speed USB v2.0
ugen0.1: <0x8086> at usbus0
uhub0: <0x8086 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
ugen2.1: <Intel> at usbus2
uhub1: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus2
ugen1.1: <Intel> at usbus1
uhub2: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
uhub1: 2 ports with 2 removable, self powered
uhub2: 2 ports with 2 removable, self powered
uhub0: 21 ports with 21 removable, self powered
ugen0.2: <vendor 0x05e3> at usbus0
uhub3: <vendor 0x05e3 USB2.0 Hub, class 9/0, rev 2.00/32.98, addr 1> on usbus0
uhub3: MTT enabled
ugen2.2: <vendor 0x8087> at usbus2
uhub4: <vendor 0x8087 product 0x8000, class 9/0, rev 2.00/0.05, addr 2> on usbus2
ugen1.2: <vendor 0x8087> at usbus1
uhub5: <vendor 0x8087 product 0x8008, class 9/0, rev 2.00/0.05, addr 2> on usbus1
uhub4: 4 ports with 4 removable, self powered
uhub5: 4 ports with 4 removable, self powered
uhub3: 4 ports with 4 removable, self powered
ugen0.3: <vendor 0x0557> at usbus0
uhub6: <vendor 0x0557 product 0x7000, class 9/0, rev 2.00/0.00, addr 2> on usbus0
uhub6: 4 ports with 3 removable, self powered
ugen0.4: <vendor 0x0557> at usbus0
ukbd0: <vendor 0x0557 product 0x2419, class 0/0, rev 1.10/1.00, addr 3> on usbus0
kbd0 at ukbd0
ses0 at ahciem0 bus 0 scbus6 target 0 lun 0
ses0: <AHCI SGPIO Enclosure 1.00 0001> SEMB S-E-S 2.00 device
ses0: SEMB SES Device
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <TS64GSSD370S P1225CA> ACS-2 ATA SATA 3.x device
ada0: Serial Number D646741877
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 1024bytes)
ada0: Command Queueing enabled
ada0: 61057MB (125045424 512 byte sectors)
ada0: Previously was known as ad4
lapic1: Forcing LINT1 to edge trigger
SMP: AP CPU #1 Launched!
lapic2: Forcing LINT1 to edge trigger
SMP: AP CPU #2 Launched!
lapic3: Forcing LINT1 to edge trigger
SMP: AP CPU #3 Launched!
Timecounter "TSC-low" frequency 1100023234 Hz quality 1000
Trying to mount root from ufs:/dev/ufsid/59b7cd4862406270 [rw]…
pflog0: promiscuous mode enabled
ix0: link state changed to UP
ix1: link state changed to UP
ix0: link state changed to DOWN
ix0: link state changed to UP
-
Intel(R) Pentium(R) CPU D1508 @ 2.20GHz
Did you disable hyper-threading when the system was installed/using pfSense? It is enabled in the dmesg.
-
Hyper-threading is enabled, but it is also enabled on FreeBSD, where the driver works OK.
Upon further testing we discovered that it is the receiving end that is slowing down, i.e.:
sending data from pfSense to FreeBSD works at 9.4-9.5 Gbit/s,
sending data from FreeBSD to pfSense works at 2.8 Gbit/s,
and sending data between two FreeBSD boxes works at 9.5 Gbit/s either way.
I also tried device polling to no effect.
-
I'm not 100% sure I have the screenshots matched up correctly, but it looks like iperf itself is using far more CPU in pfSense than it is in FreeBSD, and that the interrupt load from the driver queues is far higher in pfSense. Would you agree?
That can be typical of a CPU running at a lower frequency. Did you check the sysctls to make sure the CPUs are running at the same speed in both cases?
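On FreeBSD/pfSense, the sysctls to compare on both boxes would be something like:

```shell
sysctl dev.cpu.0.freq         # current frequency, MHz
sysctl dev.cpu.0.freq_levels  # available frequency/power states
sysctl hw.clockrate           # nominal clock rate at boot
```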
The loading caused by pf appears as the interrupt load on the NIC queues. If it was not actually disabled that's where it would appear. That would also match the one direction is OK finding as pfSense allows out all traffic by default.
It's hard to believe that CPU would push close to 10Gbps with pf enabled in FreeBSD if it was actually filtering anything. Did you actually see any drop in speed or increase in CPU load when you tested that?
Interesting issue though.
Steve
-
Hi, I'm working on that problem with Belgarath.
sysctl show the same for both systems.
-
pfSense is not pushing 10Gbps even with pf disabled, which is why I am confused.
pfctl -d has no impact on the throughput on pfSense.
-
Hmm, pfctl -d will disable pf only until any change is made in the GUI or the ruleset is reloaded, which can be triggered by a number of things. However, it will report that it's already disabled if you run it again (and it is still disabled). You might try disabling it in the GUI under System > Advanced > Firewall/NAT to be sure.
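To be certain pf really is off at test time, you can also check its status right before running iperf:

```shell
pfctl -d        # disable pf (only until the next ruleset reload)
pfctl -s info   # the first line should read "Status: Disabled"
```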
It's interesting that neither box is running at 2201MHz, which would be Turbo mode. But they are certainly comparable with both set the same.
Steve
-
Hardware on both is identical; we disabled Turbo mode in the BIOS on both.
We made sure pf was disabled for the duration of the test, but will check again via the GUI suggestion just to be sure.
-
Same results when NAT is disabled from the GUI.
-
I would recommend not using pfSense as a client or server for iperf, as that does not reflect actual routing performance AT ALL. pfSense is configured to be good at routing, not hosting. Both use the network, but in completely different ways.
-
Try with both a client and server other than pfSense, but on different sides of the box. You'll probably have to open a firewall port for that.
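Something like this, with both endpoints being ordinary hosts on opposite sides of the firewall (addresses are hypothetical; iperf3 assumed installed on both):

```shell
# on a LAN-side host, e.g. 192.168.1.10
iperf3 -s
# on a WAN-side host, testing through pfSense with several parallel streams
iperf3 -c 192.168.1.10 -t 30 -P 8
```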
-
On Ubuntu and FreeBSD it works at above 9Gbps both ways.
-
A bit late but perhaps it helps out @belgarath.
I have an issue where pfSense on the same hardware gets about 2-4 Gbit/s out of those interfaces but FreeBSD gets 9.5 Gbit/s; the load on the FreeBSD side is also lower.
Linux and FreeBSD are not doing any NAT work or passing pf rules on top of this, so they are bound to be faster. The second thing is that you will be able to play around with some more settings to get different numbers out of these tests. But the main and most urgent thing here is to test with NetIO or iPerf 3 through pfSense, either from LAN port to LAN port or between the WAN and LAN ports, and not on the machine itself. By the way, I really think that pfSense is not just FreeBSD plus a new GUI running like an ordinary program; it is more than that. Too many changes and other things turn it into its own class of system.
It seems that the CPU is exhausted while doing the work with pfSense. As the CPU seemed to be an issue, I tried disabling firewall processing on those interfaces, but the results improved only by decimal parts, so it does not look like a firewall issue.
If you are using PPPoE you will be single-threaded on one CPU core; if not, multi-core CPU usage will be the result! With a pfSense version that uses all cores plus HT you might get totally different results and numbers.
I tried different versions of pfSense and the results are more or less consistent; I'm getting anywhere between 1.8 and 3.5 Gbit/s.
Normally you will see something around 2-4 Gbit/s of real throughput between two 10 Gbit/s connections, depending on the protocols, programs, or services used. But if you want to see more from the iPerf test, you could try producing more streams; something like 8 or 10 streams could do the job.
General:
- HT enabling or disabling in the BIOS
- powerd (hiadaptive, adaptive, or maximum)
- Fast and enough RAM
Tunings:
These can, as above, be tried out as a single change, all together, or in some combination:
- mbuf size to 65000 or 1000000: with a Broadcom NIC, 65000 matched well; with Intel NICs, 1000000 was fine.
- changing the number of network queues from 2 to 4 (try fewer or more): each CPU core (including HT threads) opens one or more queues per LAN port, depending on the driver. You can try limiting or raising these numbers so they best match your hardware and deliver the best results.
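As a concrete starting point, the two tunables above might look like this in /boot/loader.conf. This is a sketch: the values are things to experiment with, not recommendations, and the hw.ix.num_queues name should be verified against your driver version:

```shell
# /boot/loader.conf -- experiment with these, do not treat as recommended values
kern.ipc.nmbclusters="1000000"  # mbuf clusters; 65000 may be enough for some NICs
hw.ix.num_queues="4"            # queues per port; try 2 or 4
```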