Netgate Discussion Forum

    New Version 2.4.4 - Interface Error --> aq_add_macvlan err -53, aq_error 14

    Problems Installing or Upgrading pfSense Software
    59 Posts 11 Posters 15.6k Views 10 Watching
      Juve

      We experienced a hard reboot during VLAN configuration on the slave node.
      It looks like the problem only occurs when reconfiguring interfaces.

      I'll keep you updated

        Juve

        Regarding the crash, the message was:
        Fatal trap 9: general protection fault while in kernel mode
        cpuid = 0; apic id = 00
        instruction pointer = 0x20:0xffffffff80e38d40
        stack pointer = 0x28:0xfffffe0b9ceba130
        frame pointer = 0x28:0xfffffe0b9ceba170
        code segment = base 0x0, limit 0xfffff, type 0x1b
        = DPL 0, pres 1, long 1, def32 0, gran 1
        processor eflags = interrupt enabled, resume, IOPL = 0
        current process = 12 (swi4: clock (0))

        The trace is:
        Tracing pid 12 tid 100034 td 0xfffff8000a331620
        carp_master_down_locked() at carp_master_down_locked+0xf0/frame 0xfffffe0b9ceba170
        carp_master_down() at carp_master_down+0x21/frame 0xfffffe0b9ceba190
        softclock_call_cc() at softclock_call_cc+0x13a/frame 0xfffffe0b9ceba240
        softclock() at softclock+0x79/frame 0xfffffe0b9ceba260
        intr_event_execute_handlers() at intr_event_execute_handlers+0xe9/frame 0xfffffe0b9ceba2a0
        ithread_loop() at ithread_loop+0xe7/frame 0xfffffe0b9ceba2f0
        fork_exit() at fork_exit+0x83/frame 0xfffffe0b9ceba330
        fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0b9ceba330
        --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
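When comparing several crash reports like this one, it can help to reduce each backtrace to its list of functions. The following is only a minimal sketch: it assumes frame lines keep the `func() at func+0xNN/frame 0x...` layout shown above, and the names `FRAME_RE`, `parse_frames` and `innermost_frame` are made up for this illustration.

```python
import re

# One frame line of the backtrace, for example:
#   carp_master_down() at carp_master_down+0x21/frame 0xfffffe0b9ceba190
FRAME_RE = re.compile(
    r"^\s*(?P<func>\w+)\(\) at (?P=func)\+(?P<offset>0x[0-9a-f]+)"
    r"/frame 0x[0-9a-f]+\s*$"
)

# First lines of the trace posted above, plus the terminating trap line.
TRACE = """\
carp_master_down_locked() at carp_master_down_locked+0xf0/frame 0xfffffe0b9ceba170
carp_master_down() at carp_master_down+0x21/frame 0xfffffe0b9ceba190
softclock_call_cc() at softclock_call_cc+0x13a/frame 0xfffffe0b9ceba240
softclock() at softclock+0x79/frame 0xfffffe0b9ceba260
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
"""


def parse_frames(trace):
    """Return (function, offset) pairs for every frame line, innermost first."""
    frames = []
    for line in trace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append((match.group("func"), int(match.group("offset"), 16)))
    return frames


def innermost_frame(trace):
    """Return the frame where the fault was reported, or None if none parsed."""
    frames = parse_frames(trace)
    return frames[0] if frames else None


if __name__ == "__main__":
    for func, offset in parse_frames(TRACE):
        print(f"{func} +{offset:#x}")
```

For the trace above, the innermost frame is `carp_master_down_locked`, which fits the later observation that the hard reboot comes from CARP code running in the clock software interrupt rather than from the NIC driver itself.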

        Regarding the IXL driver, I noticed that version 1.11.20 was released in September, but the download page was serving the wrong driver (IAVF instead of IXL).
        So I emailed Intel today; they fixed the issue, and we can now download version 1.11.20.
        We will try it tomorrow.

          Juve

          So here is where we are after a lot of trial and error:

          • we are sure that reconfiguring interface capabilities (TSO) while traffic is being transmitted is what causes the queue hang (as explained in the FreeBSD ticket)

          • we know that adding a VLAN with 1.9.9k produces a queue hang and a MAC VLAN error

          • we know IPv6 was causing a hang, but we did not test it again in the latest setup (see below)

          • the hard reboot is caused by an issue not directly related to the interface problem; it results from a lock that occurs when a queue hangs

          • we compiled and installed version 1.11.20 of the IXL driver and stopped running our script that removed HWVLANTSO at boot. When that script ran, the NIC was already transmitting traffic, so we ended up with one or two hung queues at boot, which would cause issues later on.

          • we tested adding a VLAN this morning and nothing bad happened: no error message, no queue hang

          So we are starting a new monitoring phase to see how it goes.
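For readers unfamiliar with the kind of boot script mentioned above, here is a rough sketch of the command such a script might issue. The script itself was never posted, so this is an assumption: the interface name `ixl0` and the helper `offload_disable_argv` are hypothetical, while `-tso` and `-vlanhwtso` are the standard FreeBSD `ifconfig` forms for turning those capabilities off.

```python
def offload_disable_argv(interface, capabilities=("tso", "vlanhwtso")):
    """Build the ifconfig argument list that disables the given capabilities.

    FreeBSD's ifconfig turns a capability off when its name is prefixed
    with "-", for example "-tso" or "-vlanhwtso".
    """
    if not interface:
        raise ValueError("interface name must not be empty")
    return ["ifconfig", interface] + ["-" + cap for cap in capabilities]


if __name__ == "__main__":
    # "ixl0" is a placeholder interface name, not one taken from the thread.
    print(" ".join(offload_disable_argv("ixl0")))
```

Note that, according to the findings above, changing these capabilities while the NIC is already passing traffic is exactly what triggered the queue hangs, so a command like this would only be safe before the interface carries traffic.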

            Juve

            A quick follow-up, running 1.11.20:

            • 9 days uptime
            • no errors so far
            • throughput is good, no issues
            • we added 6 new VLANs in the last 9 days and everything works as expected for the moment
            • we tested multiple master/slave failovers and failbacks, no issues

            I'll continue to keep you informed.

              JeGr LAYER 8 Moderator @stephenw10

              @stephenw10 said in New Version 2.4.4 - Interface Error --> aq_add_macvlan err -53, aq_error 14:

              Please add that info to the bug report if you have confirmed it.
              https://redmine.pfsense.org/issues/9123

              Steve

              @Juve could you add that info to the ticket Stephen mentioned?
              It would provide additional help/info.

              Also, @stephenw10, if that driver release seems to solve the problem, what are the chances that it will be included in a 2.4.5 release? We currently have reports from two customers who are seeing similar problems (one with an unresponsive, "dead" pfSense when adding VLANs, one with a reboot and queue and VLAN errors shown), so that would be a great thing.

              best regards

              Don't forget to upvote 👍 those who kindly offered their time and brainpower to help you!

              If you're interested, I'm available to discuss details of German-speaking paid support (for companies) if needed.

                stephenw10 Netgate Administrator

                If it's in 11 stable it would likely make it into a 2.4.X release. The drivers in 12 look significantly different, so I'm not sure those could be brought back.

                Steve

                  Juve

                  I did update the ticket :-)

                    Juve

                    Quick update:
                    50 days uptime, no error so far.
                    Looks like the driver was the issue.

                      JeGr LAYER 8 Moderator @Juve

                      @Juve said in New Version 2.4.4 - Interface Error --> aq_add_macvlan err -53, aq_error 14:

                      Quick update:
                      50 days uptime, no error so far.
                      Looks like the driver was the issue.

                      What were the latest changes you were running? Latest driver as you wrote (1.11.20) and HWVLANTSO removal?

                      If so, @stephenw10, is there any chance of bringing that driver version into the mix for the upcoming 2.4.5?

                        Juve

                        It is still super stable.

                        The only two things we did at the end are:

                        • use the latest driver
                        • use failover LAGG rather than LACP LAGG (but we did not test LACP with the latest driver, so we can't confirm whether it works with it)

                        No VLANTSO removal, etc.

                          stephenw10 Netgate Administrator

                          Is anyone still seeing this with the 1.11.9 driver in 2.4.5?

                          Steve

                            DarkMasta @stephenw10

                            @stephenw10
                            with 2.4.5 no error (LACP LAGG, 15 VLANs)

                              stephenw10 Netgate Administrator

                              Ah, good. We were trying to replicate it in 2.4.5 and have not done so.

                              If anyone is seeing it though please let us know the steps required.

                              Steve

                                fsir @stephenw10

                                @stephenw10

                                Hi, I think I can reproduce this problem with pfSense version 2.4.5-RELEASE-p1 (amd64) and native driver 1.11.9-k.

                                We have ~45 VLANs on one of the interfaces. Adding a new VLAN on this interface makes it impossible to ping or reach the system on any of the related sub-interfaces.
                                The console shows "queue appears hung" messages, and a reboot seems to be the only way to get the system working again (new VLAN included).
                                I set net.inet.tcp.tso = 0 via Advanced / System tunables: the messages no longer appear, but the problem remains (system unreachable when adding a VLAN, reboot needed).
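Since the thread distinguishes two symptoms (the `aq_add_macvlan` error and the "queue appears hung" message), a small log scan can show which of them a given system actually logs. This is only a sketch: the two message fragments are taken from this thread, while the sample log lines, the `ixl0:` prefix and the names `PATTERNS` and `classify` are invented for the example.

```python
import re
from collections import Counter

# Message fragments quoted in this thread; the rest of a real log line
# (timestamp, interface prefix, queue number) may look different.
PATTERNS = {
    "queue_hung": re.compile(r"queue appears hung", re.IGNORECASE),
    "macvlan_error": re.compile(r"aq_add_macvlan err -?\d+, aq_error \d+"),
}


def classify(lines):
    """Count how many lines match each known symptom."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines, for illustration only.
    sample = [
        "ixl0: aq_add_macvlan err -53, aq_error 14",
        "ixl0: queue appears hung",
        "ixl0: queue appears hung",
        "ixl0: link state changed to UP",
    ]
    for name, count in sorted(classify(sample).items()):
        print(f"{name}: {count}")
```

In practice the input would be the lines of the system log or `dmesg` output captured right after adding a VLAN.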

                                Tested on X710 and X722 Intel NICs, Supermicro server.

                                I will test upgrading the Intel driver and/or removing VLANTSO, and will keep you informed.
                                Don't hesitate to ask if I can provide more info.
                                Have a nice day,
                                Fabien

                                  stephenw10 Netgate Administrator

                                  Are you seeing the error messages:

                                  Interface Error --> aq_add_macvlan err -53, aq_error 14
                                  

                                  Or just the 'queue appears hung' logs?

                                  Steve

                                    fsir @stephenw10

                                    @stephenw10 Hi !
                                    Just the 'queue appears hung' and server unreachable on the interface.
                                    Fabien

                                    Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.