New Version 2.4.4 - Interface Error --> aq_add_macvlan err -53, aq_error 14
-
Hi everyone.
I am installing the new pfSense version 2.4.4 and getting this error (aq_add_macvlan err -53, aq_error 14) whenever I configure interfaces or VLANs. Is anyone else testing the new version and seeing the same error?
P.S. I did not have this error on version 2.4.3.
Cheers,
Peter -
What type of interface is that happening on? Which driver?
Are you spoofing the MAC address?
Steve
-
What do you mean by type of interface? Virtual or physical? If so, the type is physical.
The driver I am using is the Intel(R) Ethernet Connection 700 Series PF Driver, version 1.9.9-k.
I am not spoofing the MAC address.
Cheers,
-
Sorry, what I meant was which driver, such as em0, igb0, ix0, etc.
That looks like ixl though?
I have seen that error once, also on an ixl card. It did not seem to be associated with any sort of problem in that instance.
Steve
-
You are correct, the driver is ixl0.
Is there anything I can try to get around it?
Cheers,
Peter Franca -
Do you have a problem with that interface not working correctly? Or is it just logging that error repeatedly?
-
The interface is working correctly but that error keeps logging repeatedly.
Sometimes the server reboots without a single warning. Just today I went to enable SSH remote access and the server crashed and then rebooted.
Cheers,
Peter Franca -
Did it happen to offer a crash report after the reboot?
Does that hardware have a serial console available? If so, try to set up a console client to log the output. If it does print crash data to the console, that would help track down what is happening.
If it reboots without any crash data at all, that's more troubling and tends to lean more toward a hardware issue. FreeBSD 11.2 may be driving the hardware in a more strenuous way that didn't trigger an issue on older versions.
-
I had to post a pic because the Antispam Solution of the Forum didn't allow the text. -
Was that just adding a new VLAN where others existed or is that the only VLAN on that card?
I assume you only see that when adding to ixl0 as the parent?
Steve
-
Hello Steve, we have about 10 VLANs, which were created under 2.4.3. These VLANs work fine after the upgrade to 2.4.4, but creating a new VLAN with ixl0 as parent is impossible.
-
Hmm, interesting. The fact existing VLANs on ixl work OK implies it's something happening when they are created in the webgui.
Where exactly do you see this error?
How do you have the ixl NIC configured?
Are you able to try adding a VLAN by editing the config directly and rebooting?
Steve
-
-
Same hardware. Same issues.
It does not seem to affect traffic flows, but editing one (of 21) interfaces and saving the changes takes roughly 20 to 30 minutes.
For every interface, pfSense seems to retry setting a VLAN MAC several times:
-
Hi there,
just configured a new cluster for a customer last Wednesday and saw those exact same error messages pop up in the general log. I never had these before, but I also never had ixl interfaces (the 10G ones were almost always ix). Could it have something to do with the driver?
As those are two completely new hardware machines that have a module bay with a 4-port SFP+ module (ixl0-3) and both had those messages right from the start, I don't think the hardware is at fault.
The messages popped up when adding the 6 VLANs from the customer's setup to the ixl0 interface, and they still appear sparsely and seemingly at random in the logs. Now the customer is worried about whether hardware or software is at fault.
If there's anything to help analyze, I'm sure we can provide a few details to catch the meaning of it.
Greets
Jens -
Hmm, there does seem to be something that has snuck in here.
The previous reports of this suggested that VLANs already added to the interface in 2.4.3 were not affected and continued to function. That implies it's something in the actual addition process that is triggering the error.
It would be interesting to try manually editing the config to add a new VLAN with an ixl parent and see if that works.
If that was the case though you would think that simply rebooting after adding those new VLANs would bring them up correctly.
It does seem to be VLAN hardware offloading failing.
Steve
-
Does it have something to do with the older "error" in this thread, which mentioned the problem would be gone with a further driver update? Could this be related to a newer driver or driver changes to ixl on FreeBSD 11.2 perhaps?
https://communities.intel.com/thread/103549
Otherwise the VLANs came up alright; what I did see was that CARP on those VLAN interfaces was somewhat "jittery". If you refresh the CARP status on both nodes, you can see them switching master roles very briefly, but noticeably, for a second. After witnessing this, I rebooted both nodes. After a bit of research this weekend I found this thread and tried salvaging the reboot log from those boxes:
Nov 2 15:48:06 kernel done.
Nov 2 15:48:05 php-cgi rc.bootup: Configuring CARP settings finalize...
Nov 2 15:48:05 php-cgi rc.bootup: pfsync done in 0 seconds.
Nov 2 15:48:05 php-fpm 334 /rc.carpbackup: HA cluster member "(192.168.91.4@ixl0.91): (V091_PHONE)" has resumed CARP state "BACKUP" for vhid 4
Nov 2 15:48:05 php-fpm 334 /rc.carpbackup: HA cluster member "(192.168.82.4@ixl0.82): (V082_BZD)" has resumed CARP state "BACKUP" for vhid 4
Nov 2 15:48:05 php-fpm 334 /rc.carpbackup: HA cluster member "(192.168.80.4@ixl0.80): (V080_HNR)" has resumed CARP state "BACKUP" for vhid 4
Nov 2 15:48:05 php-cgi rc.bootup: waiting for pfsync...
Nov 2 15:48:05 php-fpm 335 /rc.carpbackup: HA cluster member "(192.168.95.4@ixl0.95): (V095_ADMIN)" has resumed CARP state "BACKUP" for vhid 4
Nov 2 15:48:05 php-fpm 334 /rc.carpbackup: HA cluster member "(10.0.0.4@ixl0.10): (V010_VERWA)" has resumed CARP state "BACKUP" for vhid 4
Nov 2 15:48:04 kernel carp: 4@ixl0.91: INIT -> BACKUP (initialization complete)
Nov 2 15:48:04 kernel ixl0.91: promiscuous mode enabled
Nov 2 15:48:04 check_reload_status Carp backup event
Nov 2 15:48:04 kernel carp: 4@ixl0.82: INIT -> BACKUP (initialization complete)
Nov 2 15:48:04 kernel ixl0.82: promiscuous mode enabled
Nov 2 15:48:04 check_reload_status Carp backup event
Nov 2 15:48:04 kernel carp: 4@ixl0.80: INIT -> BACKUP (initialization complete)
Nov 2 15:48:04 kernel ixl0.80: promiscuous mode enabled
Nov 2 15:48:04 check_reload_status Carp backup event
Nov 2 15:48:04 kernel carp: 4@ixl0.10: INIT -> BACKUP (initialization complete)
Nov 2 15:48:04 kernel ixl0.10: promiscuous mode enabled
Nov 2 15:48:04 kernel carp: demoted by 240 to 720 (interface down)
Nov 2 15:48:04 kernel igb4: promiscuous mode enabled
Nov 2 15:48:04 kernel carp: demoted by 240 to 480 (interface down)
Nov 2 15:48:04 kernel igb0: promiscuous mode enabled
Nov 2 15:48:04 check_reload_status Carp backup event
Nov 2 15:48:04 kernel carp: 4@ixl0.95: INIT -> BACKUP (initialization complete)
Nov 2 15:48:04 kernel ixl0.95: promiscuous mode enabled
Nov 2 15:48:04 kernel ixl0: promiscuous mode enabled
Nov 2 15:48:04 kernel carp: demoted by 240 to 240 (interface down)
Nov 2 15:48:04 kernel igb1: promiscuous mode enabled
Nov 2 15:48:04 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:04 check_reload_status Carp backup event
Nov 2 15:48:04 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:04 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:04 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:04 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:04 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:04 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:04 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:04 kernel done.
Nov 2 15:48:03 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:03 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:03 kernel done.
Nov 2 15:48:03 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:03 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:03 kernel vlan4: changing name to 'ixl0.95'
Nov 2 15:48:03 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:03 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:03 syslogd Logging subprocess 11242 (exec /usr/local/sbin/sshguard) exited due to signal 15.
Nov 2 15:48:03 sshd 10982 Server listening on 0.0.0.0 port 22.
Nov 2 15:48:03 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:03 sshd 10982 Server listening on :: port 22.
Nov 2 15:48:03 kernel vlan3: changing name to 'ixl0.91'
Nov 2 15:48:03 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:03 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:03 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:03 kernel vlan2: changing name to 'ixl0.82'
Nov 2 15:48:03 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:02 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:02 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:02 kernel vlan1: changing name to 'ixl0.80'
Nov 2 15:48:02 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:02 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:02 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:02 kernel ixl0: aq_add_macvlan err -53, aq_error 14
Nov 2 15:48:02 kernel vlan0: changing name to 'ixl0.10'
Nov 2 15:48:02 kernel device_attach: est3 attach returned 6
Nov 2 15:48:02 kernel est: cpu_vendor GenuineIntel, msr 211200002200
Nov 2 15:48:02 kernel est: CPU supports Enhanced Speedstep, but is not recognized.
Nov 2 15:48:02 kernel est3: <Enhanced SpeedStep Frequency Control> on cpu3
Nov 2 15:48:02 kernel coretemp3: <CPU On-Die Thermal Sensors> on cpu3
(log is newest on top)
Seems to me that assigning and renaming the VLANs somehow triggers that error, too. -
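One way to test the impression that the errors follow the VLAN rename events is to attribute each error line to the most recent rename. The following is only a rough sketch: it assumes the log has been split into one entry per line, newest first, in the format quoted in this thread, and the function name `errors_per_vlan` is made up for this illustration. The attribution is a heuristic, since the log itself does not say which VLAN an error belongs to.

```python
import re

# Matches the rename line as it appears in this thread's log,
# e.g. "kernel vlan4: changing name to 'ixl0.95'".
RENAME_RE = re.compile(r"kernel vlan\d+: changing name to '([\w.]+)'")

def errors_per_vlan(lines_newest_first):
    """Count aq_add_macvlan errors logged after each VLAN rename.

    Errors seen before any rename are counted under the key None.
    """
    counts = {}
    current = None  # VLAN renamed most recently, in chronological order
    for line in reversed(lines_newest_first):
        match = RENAME_RE.search(line)
        if match:
            current = match.group(1)
            counts.setdefault(current, 0)
        elif "aq_add_macvlan" in line:
            counts[current] = counts.get(current, 0) + 1
    return counts

# A short excerpt from the log above (newest entry first).
sample = [
    "Nov 2 15:48:03 kernel vlan2: changing name to 'ixl0.82'",
    "Nov 2 15:48:03 kernel ixl0: aq_add_macvlan err -53, aq_error 14",
    "Nov 2 15:48:02 kernel ixl0: aq_add_macvlan err -53, aq_error 14",
    "Nov 2 15:48:02 kernel ixl0: aq_add_macvlan err -53, aq_error 14",
    "Nov 2 15:48:02 kernel vlan1: changing name to 'ixl0.80'",
]
print(errors_per_vlan(sample))  # {'ixl0.80': 3, 'ixl0.82': 0}
```

Run against a full boot log, a result in which every renamed VLAN collects a handful of errors would be consistent with the rename or assignment step being the trigger, though it would not prove it.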
I reported a bug for this case.
https://redmine.pfsense.org/issues/9123 -
It does look like there have been some driver updates in FreeBSD that might apply to this.
If you're able to test FreeBSD 11-stable or 12 that would be useful.
Steve
-
I'd like to help, but as those are remote installations at a customer's site, I'm not at liberty to drive there, pull out the standby one and throw FreeBSD on it. ;) I'd like to (at least to help sort things out), but unfortunately that will be a hard one.
-
@stephenw10 said in New Version 2.4.4 - Interface Error --> aq_add_macvlan err -53, aq_error 14:
It does look like there have been some driver updates in FreeBSD that might apply to this.
If you're able to test FreeBSD 11-stable or 12 that would be useful.
Steve
I can help, but I don't know how to upgrade to a newer version of FreeBSD in pfSense. The only way I know to upgrade FreeBSD, freebsd-update, does not exist there, so brief instructions would be helpful.
-
You would really need to install FreeBSD instead of pfSense. The changes look significant; I don't think the driver from 11-stable would load into 11.2. Our current dev snapshots are still built on 11.2.
Steve
-
I've found that if I disable a physical interface (the parent so to speak) and only have tagged vlans on that interface, the error does not show. Also, if the error does occur, it can provoke a kernel panic immediately or at shutdown. There is definitely something fishy going on.
Is there a way to log the commands that are applied after a save in the web interface so I can further debug which command is causing the problem?
-
Not really. The best you can probably do is boot in verbose mode to log more debug info, other than recompiling the driver with debugging enabled, but that will probably give you far too much detail.
Add to /boot/loader.conf.local:
boot_verbose="YES"
Steve
-
@stephenw10 Adding boot_verbose did not give more info, other than the occasional printing of "vlanx: bpf attached" in between the errors.
I found from the Intel source code that aq_error 14 means "invalid argument".
It is also not consistent. In one test I moved all the VLANs from ixl0 (which was throwing the error) to ixl1 (which did not) and then back: error gone. The resulting config.xml did not show any difference from before moving the VLANs back and forth.
After a reboot the errors were back. To provoke the error reliably: press Save on an interface detail page (no change needed, just press Save on e.g. the WAN page), then "Apply Changes" will throw the error.
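The message carries two separate numbers, a driver status (`err`) and an admin queue return code (`aq_error`). The sketch below pulls both out of a log line so they can be looked up separately. The only meaning it maps is the one reported in this thread (code 14, "invalid argument"); other codes are deliberately left unmapped, and the names `decode` and `AQ_ERROR_NAMES` are made up for this illustration.

```python
import re

# Only mapping taken from this thread: aq_error 14 was reported
# (from the Intel source) to mean "invalid argument".
AQ_ERROR_NAMES = {14: "invalid argument"}

LINE_RE = re.compile(r"(\w+): aq_add_macvlan err (-?\d+), aq_error (\d+)")

def decode(line):
    """Split an aq_add_macvlan log line into its fields, or return None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    aq_error = int(match.group(3))
    return {
        "iface": match.group(1),
        "err": int(match.group(2)),
        "aq_error": aq_error,
        "meaning": AQ_ERROR_NAMES.get(aq_error, "unknown"),
    }

print(decode("Nov 2 15:48:04 kernel ixl0: aq_add_macvlan err -53, aq_error 14"))
```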
-
Really I think the only way to do this is to try to replicate it in FreeBSD. First in 11.2 and then in 12.
Unfortunately I don't have access to any ixl NICs to try that.
Steve
-
@stephenw10 I've ifconfig-ed myself a finger hernia but I cannot get the error that way.
My feeling is that the error occurs in the calls made to pfSense.so, specifically when interfaces are unconfigured before being reconfigured, since it does not seem to occur when loading the initial config but does when reconfiguring.
I also tried the driver in debug mode (compiled with -DIXL_DEBUG; you need to add two functions to the header that Intel seems to have missed), but that did not produce any useful output. So I am at my wits' end.
I can give you remote access to a box with IXL interfaces if you like, PM me for that.
-
I'm not the right guy to be doing that. You'd be better off offering in the bug report. Or just updating that with everything you have found to help developers investigating.
https://redmine.pfsense.org/issues/9123
Steve
-
Hi,
I am facing the same issue:
A system with two dual-port XL710 cards, using a lagg over ixl0 and ixl2; no traffic is passing inbound.
vlan0: changing name to 'lagg0.50'
ixl0: aq_add_macvlan err -53, aq_error 14
ixl0: aq_add_macvlan err -53, aq_error 14
ixl0: aq_add_macvlan err -53, aq_error 14
ixl0: aq_add_macvlan err -53, aq_error 14
vlan1: changing name to 'lagg0.51'
ixl0: aq_add_macvlan err -53, aq_error 14
I can't find any solution; I am still investigating right now.
Does anyone have a clue?
Thank you all
-
Looks like the error is coming from the way the php module is configuring the vlan on the interface.
-
Data on the bug report suggests this is a FreeBSD 11.2 issue. So try a 2.5 snapshot if you can.
Steve
-
Hi,
I too have an X710-based system that I'm testing with. I have had some success with the following tweaks to the network adapters under pfSense 2.4.3:
ifconfig ixl0 -vlanhwfilter -vlanhwtso -tso
ifconfig ixl1 -vlanhwfilter -vlanhwtso -tso
ifconfig ixl2 -vlanhwfilter -vlanhwtso -tso
ifconfig ixl3 -vlanhwfilter -vlanhwtso -tso
To reiterate, the error was still being thrown, but the system continued to process packets.
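Since the same three flags are repeated for every port, the command lines can be generated instead of typed. This is a dry-run sketch only: it builds and prints the commands without running them, it assumes the ports are named ixl0 through ixl3 as in the post above, and the helper name `offload_commands` is made up for this illustration. Making the setting persist across reboots is a separate question it does not address.

```python
import shlex

# Offload flags quoted in the post above.
FLAGS = ["-vlanhwfilter", "-vlanhwtso", "-tso"]

def offload_commands(count, driver="ixl"):
    """Build one ifconfig argument list per port, e.g. ixl0..ixl3 for count=4."""
    return [["ifconfig", f"{driver}{n}", *FLAGS] for n in range(count)]

# Print the commands for review; nothing is executed here.
for argv in offload_commands(4):
    print(shlex.join(argv))
```

Returning argument lists rather than one shell string keeps the commands easy to pass to a process runner later without extra quoting.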
-
So, I did a lot of testing and tried the latest driver compiled for FreeBSD 11.2. My conclusion is that the problems are related to LACP lagg and not the driver itself.
If you don't use LACP lagg, or use a failover lagg, there are no issues.
If you use LACP mode you will suffer "queue hanging" problems under traffic.
If you use a newer driver, the kernel error message "aq_add_macvlan err -53, aq_error 14" no longer appears.
For the moment I am running stock 2.4.4-p3 with the embedded driver (1.9.9-k) and a failover lagg, and I am not seeing any issues. The error message logged at configure time (aq_add_macvlan err -53, aq_error 14) seems to be harmless. I stressed the system with iperf (14 threads) for 30 minutes without any packet drops or kernel messages about queue hanging.
I'll try to see whether the latest driver brings any performance improvement.
-
Please add that info to the bug report if you have confirmed it.
https://redmine.pfsense.org/issues/9123
Steve
-
I have to say that in my case the firewall did eventually freeze after showing those "macvlan" errors. I've disabled WOL in the BIOS, which apparently controls something like 10 different power-saving options. It can also be disabled with an Intel utility without a reboot. Since then I've put a firewall under test on a 10G link and have not been able to crash or freeze it.
-
Interesting. The hardware I'm working with is an HPE ProLiant DL360. The BIOS tuning is
maximal performance, no virtualization, no C-states, no hyper-threading, no boot on LAN.
It looks like ixl and LACP don't play well together, at least on 11.x: https://lists.freebsd.org/pipermail/freebsd-net/2016-April/045091.html
@stephenw10, I will do so once I have tested the system long enough to tell it is working as expected. We need to be sure.
-
@Juve said in New Version 2.4.4 - Interface Error --> aq_add_macvlan err -53, aq_error 14:
Interesting. The hardware I'm working with is an HPE ProLiant DL360. The BIOS tuning is
maximal performance, no virtualization, no C-states, no hyper-threading, no boot on LAN.
It looks like ixl and LACP don't play well together, at least on 11.x: https://lists.freebsd.org/pipermail/freebsd-net/2016-April/045091.html
@stephenw10, I will do so once I have tested the system long enough to tell it is working as expected. We need to be sure.
I'm under the same impression. Check out the advisory document from Intel regarding bugs with that adapter. Many, many things are broken.
-
One day of testing my setup:
- 2x HPE DL360 with HPE NC562 SFP+ (Intel X710 DA2), 6 cores @ 3.4 GHz, 48 GB of RAM.
I stressed the boxes with iperf3 because that was all I had at hand.
Common tuning:
- earlyshellcmd doing "ifconfig ixl -vlanhwtso"
- tunable hw.intr_storm_threshold set at 0
- IXL cards are configured in two LAGGS configured in failover mode
- each interface is a vlan tagged on one of those laggs
- /boot/loader.conf.local with
net.pf.source_nodes_hashsize="1048576"
net.pf.states_hashsize="67108864"
Testing the box with the 1.9.9-k driver (stock FreeBSD 11.2):
- iperf3, 5 threads for 900 s with pf enabled: 6.13 Gbit/s, peak at 4.25 Mpps
- iperf3, 5 threads for 900 s with pf disabled: 9.45 Gbit/s
- one "queue hung" on the slave node at boot in dmesg, but nothing after
- no problem with CARP failover, going from master to slave and failing back as expected in a timely manner
Testing the box with the 1.11.9 driver (latest on the Intel website):
- iperf3, 5 threads for 900 s with pf enabled: 6.55 Gbit/s, peak at 4.58 Mpps
- iperf3, 5 threads for 900 s with pf disabled: 9.45 Gbit/s
- some "queue hung" on the slave node (mainly at boot)
- problem with CARP failover: the slave node randomly stays master on different interfaces (IPv4 or IPv6) for anywhere from a few seconds to 10 minutes. A tcpdump on the slave node shows it can't see advertisement packets from the master. A tcpdump on the master shows packets from both master and slave!
To sum up, the newest driver seems to slightly improve performance with pf.
Disabling TSO on VLANs seems to have solved a lot of problems for me, and I also gained around 900 Mbit/s of throughput and 300 kpps.
I decided to stay on the stock driver 1.9.9-k without LACP (see the LLDP agent problem in the Intel document) and with vlanhwtso disabled. I will continue stress testing that setup soon.
Side note: for those struggling with IPv6 CARP refusing to configure (the "file exists" issue). The rule is to not use capital letters in the hexadecimal digits and to remove leading zeroes. OK, but that rule does not always win. If you have only one all-zero group in the address, you should keep that zero. You will see in the shell that even if you configured the address using "::" in the UI, FreeBSD imposes the 0. If you have more than one zero group, you are good.
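The rule in the side note matches the canonical IPv6 text form described in RFC 5952: lowercase hex digits, no leading zeroes, and "::" never used to replace a single all-zero group. Python's standard `ipaddress` module prints addresses in that form, so it can be used to check what an address should look like before typing it into the UI. This is only a sketch: the addresses use the 2001:db8::/32 documentation prefix as placeholders, the helper name `canonical` is made up here, and whether a mismatch with this form is the actual cause of the "file exists" issue is an assumption based on the observation above.

```python
import ipaddress

def canonical(addr):
    """Return the compressed, lowercase text form of an IPv6 address."""
    return str(ipaddress.IPv6Address(addr))

# Capitals and leading zeroes are removed, a long zero run becomes "::".
print(canonical("2001:0DB8:0000:0000:0000:0000:0000:0001"))  # 2001:db8::1

# A single zero group is kept as "0", even if it was entered as "::".
print(canonical("2001:db8::1:1:1:1:1"))  # 2001:db8:0:1:1:1:1:1
```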
-
Some followup because it will help others for sure.
After a lot more testing I can confirm that the only working solution for me (= 10 days uptime without any issue) is:
- stock driver 1.9.9k
- failover LAGG
- no IPV6 at all
As soon as you use IPv6 you will get hanging queues.
-
Thanks for the update. I'm somewhat convinced we are hitting a similar issue, but in different ways.
Do you have "Allow IPv6" under Advanced/Networking enabled or disabled? We don't actually have IPv6 at any of the locations where we have the issue, but in our office, where we do have IPv6, there isn't an issue. Also, we are not using LAGG.