[Solved] Supermicro X9SCA-F and AOC-SG-12 Dual port Server LAN Adapter problem
-
Hi,
I have a big problem after updating to pfSense 2.1.2 and 2.1.3. My board is a server-class Supermicro X9SCA-F with an additional AOC-SG-12 dual-port adapter. Both use Intel PRO/1000 chipsets and worked just fine with pfSense 2.1.
Since the upgrade I have multiple problems: many "Hotplug detected" events, very low SIP quality, errors on all interfaces, lag between VLANs, etc.
I do not want to downgrade pfSense. What else can I do?
-
The Intel Gigabit drivers were updated for 2.1.1. Which drivers are your NICs using? em or igb?
Do you have any of the NIC tuning tweaks in your loader.conf?
Steve
-
After the 2.1.1 update my VLANs do not work properly: they have a big lag (300 ms on the local network). I have both em and igb NICs and neither works well with VLANs (I tested with VLANs on em only and on igb only). There is no problem with the default LAN/WAN connections, but all VLANs have a big lag.
I have no modifications or tweaks in loader.conf.
From pciconf:
igb0@pci0:1:0:0: class=0x020000 card=0x10a715d9 chip=0x10a78086 rev=0x02 hdr=0x00
igb1@pci0:1:0:1: class=0x020000 card=0x10a715d9 chip=0x10a78086 rev=0x02 hdr=0x00
em0@pci0:3:0:0: class=0x020000 card=0x000015d9 chip=0x10d38086 rev=0x00 hdr=0x00
em1@pci0:4:0:0: class=0x020000 card=0x000015d9 chip=0x10d38086 rev=0x00 hdr=0x00
From phpsysinfo:
- em0: Intel(R) PRO/1000 Network Connection 7.3.8
- em1: Intel(R) PRO/1000 Network Connection 7.3.8
- igb0: Intel(R) PRO/1000 Network Connection version - 2.4.0
- igb1: Intel(R) PRO/1000 Network Connection version - 2.4.0
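For anyone reading pciconf lines like the ones above: the chip= field packs the PCI device ID into the upper 16 bits and the vendor ID (0x8086 for Intel) into the lower 16 bits. A minimal sketch that splits it per interface is below. The function name pci_ids is made up for illustration and is not part of pfSense or FreeBSD.

```shell
# Split the chip= field of `pciconf -l` lines (read on stdin) into
# driver name, PCI device ID and PCI vendor ID.
# pci_ids is a hypothetical helper name, not an existing tool.
pci_ids() {
  awk '{
    split($1, name, "@")            # "em0@pci0:3:0:0:" -> "em0"
    for (i = 2; i <= NF; i++) {
      if ($i ~ /^chip=0x/) {
        id = substr($i, 8)          # drop the leading "chip=0x"
        printf "%s device=0x%s vendor=0x%s\n", name[1], substr(id, 1, 4), substr(id, 5, 4)
      }
    }
  }'
}
```

On the firewall it would be used as `pciconf -l | pci_ids`. The device ID is what to look up when checking exactly which Intel controller a port uses.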
Ping to internal VLAN address:
64 bytes from 192.168.170.50: icmp_seq=0 ttl=128 time=848.800 ms
64 bytes from 192.168.170.50: icmp_seq=1 ttl=128 time=1001.225 ms
64 bytes from 192.168.170.50: icmp_seq=2 ttl=128 time=0.249 ms
64 bytes from 192.168.170.50: icmp_seq=3 ttl=128 time=372.626 ms
64 bytes from 192.168.170.50: icmp_seq=4 ttl=128 time=244.640 ms
64 bytes from 192.168.170.50: icmp_seq=5 ttl=128 time=743.548 ms
64 bytes from 192.168.170.50: icmp_seq=6 ttl=128 time=97.518 ms
64 bytes from 192.168.170.50: icmp_seq=7 ttl=128 time=326.277 ms
64 bytes from 192.168.170.50: icmp_seq=8 ttl=128 time=4.810 ms
64 bytes from 192.168.170.50: icmp_seq=9 ttl=128 time=2.120 ms
64 bytes from 192.168.170.50: icmp_seq=10 ttl=128 time=15.078 ms
64 bytes from 192.168.170.50: icmp_seq=11 ttl=128 time=12.480 ms
64 bytes from 192.168.170.50: icmp_seq=12 ttl=128 time=1008.188 ms
Ping to outside:
64 bytes from 193.201.172.98: icmp_seq=0 ttl=59 time=4.084 ms
64 bytes from 193.201.172.98: icmp_seq=1 ttl=59 time=2.710 ms
64 bytes from 193.201.172.98: icmp_seq=2 ttl=59 time=4.037 ms
64 bytes from 193.201.172.98: icmp_seq=3 ttl=59 time=2.516 ms
64 bytes from 193.201.172.98: icmp_seq=4 ttl=59 time=3.040 ms
-
Ok, seems pretty clearly a problem.
Does it make any difference if you ping between VLANs just on em interfaces? Or just on igb interfaces?
What about any NIC tuning? Have you done anything listed here: https://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards#Intel_igb.284.29_and_em.284.29_Cards
It's been reported that the newer drivers do not need tuning and that some of these can in fact now cause problems.
Do you have VLAN hardware off loading options enabled? They are by default:
[2.1.3-RELEASE][root@pfsense.fire.box]/root(2): ifconfig em1|grep VLAN
options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
Steve
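To compare several interfaces quickly, the VLAN-related capability flags can be pulled out of the ifconfig options line. This is only a sketch: vlan_flags is a hypothetical helper name, and it reads ifconfig output from stdin so it can also be run on pasted output.

```shell
# Print the VLAN-related capability flags found on the options= line
# of ifconfig output read from stdin, one per line.
# vlan_flags is a hypothetical helper name.
vlan_flags() {
  grep -i 'options=' | grep -io 'vlan_[a-z]*' | tr 'a-z' 'A-Z' | sort -u
}
```

Usage would be `ifconfig em1 | vlan_flags`. For a test, an individual offload can be switched off with the flags documented in ifconfig(8), for example `ifconfig em1 -vlanhwtag`. Whether that helps on this particular hardware is not known.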
-
Thanks for the advice, stephenw10! I tried all possible combinations for my igb interface but nothing seems to help. The lag between VLANs is the same as before.
I tried:
kern.ipc.nmbclusters="131072"
hw.igb.num_queues=1
ifconfig igb0 -vlanhwfilter -vlanhwtso -tso
hw.igb.fc_setting=0
Unchecked "Disable hardware TCP segmentation offload" and "Disable hardware large receive offload"
… and nothing works. My VLANs are now unusable with this lag. In summary: pings between VLAN hosts lag, pings between the VLANs and the outside lag, and pings between the default VLAN and the other VLANs lag. Pings between the default VLAN and the outside, and between hosts in the default VLAN, have no lag.
I had the same problems with em0, which is why I moved the VLANs to igb0, but the firewall is at a remote location and I cannot easily switch them back.
-
Hmm. :-
So just to be clear when did you switch your VLANs from em NICs to igb NICs? After you upgraded to 2.1.3?
And there was no difference in the behaviour of the em and igb interfaces? Both showed lag in the same way?
By 'default VLAN' do you mean untagged traffic or traffic tagged VLAN1?
The em interfaces in that box are using the 82574L chip, which is an extremely widely used NIC. I'd be very surprised if others weren't using this with VLANs successfully. Some sort of firmware problem perhaps. :-\
Steve
-
I had no problems for more than a year. Same configuration, no changes in hardware or software. I just upgraded to pfSense 2.1.1 and my VLANs went crazy.
All my VLANs were on the em0 and em1 interfaces. Just to be sure I moved them to the igb0 interface, but this doesn't help at all.
By "default VLAN" I mean my untagged traffic. All my tagged traffic has a big lag now. And again, I had no problems at all until the 2.1.1 upgrade.
I will now downgrade to 2.1, since my network is not usable. Is there any way to use the "old" driver?
-
You can probably load the old driver at boot. I've never tried it with an older driver but it works fine for newer. You need to get hold of the kernel modules, if_em.ko and if_igb.ko from FreeBSD 8.3. Either from a FreeBSD 8.3 image or I think Jim has them here: http://files.pfsense.org/jimp/ko-8.3/
Copy the modules to /boot/modules on your pfSense box. Now create a file /boot/loader.conf.local and add the following lines to it:
if_em_load="yes"
if_igb_load="yes"
However I'd be very surprised to find you're the only person running VLANs on 82574L NICs and other people don't seem to be having a problem. :-\
Steve
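The two steps above (copy the modules, add the load lines) could be scripted roughly as follows. This is a sketch under assumptions: install_old_nic_drivers is a made-up function name, the first argument is a directory where the FreeBSD 8.3 if_em.ko and if_igb.ko have already been downloaded, and the boot directory is a parameter so the function can be tried on a scratch directory before touching /boot.

```shell
# Copy replacement NIC modules into place and enable them at boot.
# $1: directory holding if_em.ko and if_igb.ko (assumed already downloaded)
# $2: boot directory, defaults to /boot
# install_old_nic_drivers is a hypothetical helper name.
install_old_nic_drivers() {
  src_dir=$1
  boot_dir=${2:-/boot}
  conf="$boot_dir/loader.conf.local"

  mkdir -p "$boot_dir/modules" || return 1
  for mod in if_em if_igb; do
    [ -f "$src_dir/$mod.ko" ] || { echo "missing $src_dir/$mod.ko" >&2; return 1; }
    cp "$src_dir/$mod.ko" "$boot_dir/modules/" || return 1
    # Append the load line only if it is not already in the file.
    line="${mod}_load=\"yes\""
    grep -qxF "$line" "$conf" 2>/dev/null || echo "$line" >> "$conf"
  done
}
```

A dry run against a scratch directory, e.g. `install_old_nic_drivers /tmp/ko-8.3 /tmp/boot-test`, shows what would be written. The modules are only loaded at the next reboot, so on a remote box keep a way back in (console or out-of-band access) before rebooting.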
-
Steve, thanks a million times!
I've changed only the igb driver (because it is the LAN side and I can do this from a remote location) and now it works! Just look at the ping to this VLAN address:
64 bytes from 192.168.170.50: icmp_seq=28 ttl=128 time=0.394 ms
64 bytes from 192.168.170.50: icmp_seq=29 ttl=128 time=0.180 ms
64 bytes from 192.168.170.50: icmp_seq=30 ttl=128 time=0.152 ms
64 bytes from 192.168.170.50: icmp_seq=31 ttl=128 time=0.181 ms
64 bytes from 192.168.170.50: icmp_seq=32 ttl=128 time=0.231 ms
64 bytes from 192.168.170.50: icmp_seq=33 ttl=128 time=0.229 ms
64 bytes from 192.168.170.50: icmp_seq=34 ttl=128 time=0.227 ms
So what to do with the next pfSense versions?
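For comparing ping runs before and after a driver change, the reply lines can be reduced to min/avg/max. A minimal sketch, where ping_summary is a hypothetical helper name; it reads reply lines from stdin, so it works on live output or on a saved log.

```shell
# Summarize round-trip times from ping reply lines read on stdin.
# ping_summary is a hypothetical helper name.
ping_summary() {
  awk -F'time=' '/time=/ {
    t = $2 + 0                      # "848.800 ms" -> 848.8
    if (n == 0 || t < min) min = t
    if (n == 0 || t > max) max = t
    sum += t; n++
  }
  END {
    if (n) printf "n=%d min=%.3f avg=%.3f max=%.3f ms\n", n, min, sum / n, max
  }'
}
```

For example: `ping -c 20 192.168.170.50 | ping_summary`. ping prints its own statistics at the end of a run, so this is mainly useful for pasted or saved output.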
-
Nice. :)
Glad it worked out for you but you should consider why nobody else seems to be having those problems. It seems likely you have some odd driver setting in place or an unusual combination of hardware.
Those newer drivers were backported from FreeBSD 10. pfSense 2.2 will be built on 10. Try one of the snapshots if you can to see how it will run on your hardware.
Steve
-
You are right, I cannot rely on this driver forever, but the problem is that this is a production firewall at a big customer in a remote location. So it is not easy to test new snapshots, but I will try to upgrade the firmware on the motherboard or just ask Supermicro for help. FYI, all my other firewalls based on Supermicro boards and Intel chipsets work just fine.
-
Running different software you mean or just slightly different hardware?
Perhaps the switch at that location is doing something odd?
Steve
-
All my firewalls run the latest pfSense version. The hardware differs, because I buy what is in stock at the time, and is mainly based on Supermicro server boards. The switches are D-Link DGS-3100-48 (two connected with an HDMI cable) and they work as one.
-
Hmm, odd. I can't believe nobody else has seen this. There were many other issues when the newer drivers were being backported, for a long while they weren't stable and were in fact backed out of the 2.1.1 snapshots at one point.
There might be something in this locked thread: https://forum.pfsense.org/index.php?topic=72763.0
Otherwise maybe time to start a new thread with a more descriptive title.
Steve
-
After 2 years I have similar problems with the same hardware at a different location. pfSense ran at that location for more than 3 years without any problem, but now all connections have been upgraded and the introduction of 1 Gbit Internet (via a media converter) is causing trouble.
After the speed upgrade it is impossible to get an IP from my ISP via DHCP. I tested with two laptops and a cheap router and it works. Remembering my problems with this hardware combination, I tried setting the interface to 100baseTX full-duplex and it works (I can get an IP), but then I have 3-10% packet loss and lag from 100 ms to 1000 ms, so the connection is not usable at all.
I use pfSense 2.3.2-RELEASE (i386). I will replace the motherboard (that was the final solution to my previous problem), but I need something to keep the location up until then. Can I use some old drivers again (that was the temporary solution in the other case)?
-
OK, I solved it again. Here are the steps:
1. Remove pfSense and install VMware ESXi (free).
2. Install pfSense on ESXi as VM.
3. Configure Ethernet interfaces, but do not use AOC-SG-12 Dual port Server LAN Adapter ports. They will not work as expected.
4. Run tests: 850 Mbps NAT.