Netgate Discussion Forum

    23.09d - Is QAT Broken?

    Plus 23.09 Development Snapshots (Retired)
    86 Posts 10 Posters 19.3k Views
      jaltman @RobbieTT

      @jimp I'm seeing the same lack of qat references from "vmstat -i" as @RobbieTT on my 4100.

        jimp Rebel Alliance Developer Netgate

        Do you also have IPsec-MB / IIMB loaded?

        If so, it may be handling whatever encryption has been requested.

        I don't see any interrupts on qat when I have both loaded here, but if I disable IPsec-MB, I do:

        irq158: qat0:b2                        1          0
        

        Remember: Upvote with the 👍 button for any user/post you find to be helpful, informative, or deserving of recognition!

        Need help fast? Netgate Global Support!

        Do not Chat/PM for help!

          jaltman @jimp

          @jimp said in 23.09d - Is QAT Broken?:

          Do you also have IPsec-MB / IIMB loaded?

          On my 4100, IPsec-MB is unchecked in the UI and "Intel QuickAssist (QAT)" is selected as the cryptographic hardware.

            jaltman @jaltman

            @jimp Tonight I will boot a 23.05.1 snapshot to confirm the prior behavior.

              RobbieTT @jimp

              @jimp said in 23.09d - Is QAT Broken?:

              Do you also have IPsec-MB / IIMB loaded?

              [Screenshot: 2023-09-29 at 21.38.17.png]

              I've not selected anything different, so just the default.

              ☕️

                RobbieTT @jaltman

                @jaltman said in 23.09d - Is QAT Broken?:

                @jimp Tonight I will boot a 23.05.1 snapshot to confirm the prior behavior.

                @jimp
                Just rolled back to 23.05.1 and QAT works as expected:

                [Screenshot: 2023-09-30 at 17.05.23.png]

                [23.05.1-RELEASE][admin@Router-8.redacted.me]/root: vmstat -i | grep qat
                irq175: qat0:b1                       46          0
                irq176: qat0:b2                       46          0
                [23.05.1-RELEASE][admin@Router-8.redacted.me]/root: vmstat -i | grep qat
                irq175: qat0:b1                      114          0
                irq176: qat0:b2                       90          0
                [23.05.1-RELEASE][admin@Router-8.redacted.me]/root:
                

                Back to the latest dev load and things go south:

                [23.05.1-RELEASE][admin@Router-8.redacted.me]/root: vmstat -i | grep qat
                irq175: qat0:b1                      176          0
                irq176: qat0:b2                      208          0
                [23.05.1-RELEASE][admin@Router-8.redacted.me]/root: 
                
                Netgate 6100 - Serial: 2xxxxxxxx8 - Netgate Device ID: redacted
                
                *** Welcome to Netgate pfSense Plus 23.09-DEVELOPMENT (amd64) on Router-8 ***
                
                 Current Boot Environment:  default
                    Next Boot Environment:  quick-20230930155240
                
                [23.09-DEVELOPMENT][admin@Router-8.redacted.me]/root: vmstat -i | grep qat
                [23.09-DEVELOPMENT][admin@Router-8.redacted.me]/root: 
                

                It's only the 23.09d load that I have QAT issues with - something has broken, at least on my 6100.

                  RobbieTT @RobbieTT

                  @jimp
                  Loaded 23.09.a.20231002.0600 dev this morning - still no functioning QAT on my 6100:

                  [23.09-DEVELOPMENT][admin@Router-8.redacted.me]/root: vmstat -i | grep qat
                  [23.09-DEVELOPMENT][admin@Router-8.redacted.me]/root:
                  

                    jimp Rebel Alliance Developer Netgate

                    If you check on the dashboard, does the list of accelerated algorithms match an algorithm in use by your VPNs?

                    For example on 23.09 the dashboard shows:

                    AES-CBC, AES-CCM, AES-GCM, AES-ICM, AES-XTS, SHA1, SHA256, SHA384, SHA512
                    

                    I have two tunnels on the 4100 I am looking at. One has a P2 using AES-128, the other has a P2 using AES128-GCM. As I pass traffic over the tunnel, I see the interrupts on the QAT device increase. Note that it won't show any activity until the tunnel is connected and passing traffic.

                    irq157: qat0:b1                        7          0
                    irq158: qat0:b2                        6          0
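Rather than reading the counters by eye, the QAT lines can be totalled. The sketch below is illustrative only: `qat_irq_total` is a hypothetical helper name (not part of pfSense), and it assumes the `vmstat -i` layout shown above, where the second-to-last column is the running interrupt total.

```shell
#!/bin/sh
# Hypothetical helper (not part of pfSense): sum the "total" column of every
# qat line in `vmstat -i`-style output read from stdin.
# Assumes the layout shown in this thread: "irqN: qat0:bN  <total>  <rate>".
qat_irq_total() {
    awk '/qat[0-9]/ { sum += $(NF - 1) } END { print sum + 0 }'
}

# Example with the two lines quoted above (7 + 6), prints 13:
printf 'irq157: qat0:b1 7 0\nirq158: qat0:b2 6 0\n' | qat_irq_total
```

On the firewall itself this would be run as `vmstat -i | qat_irq_total`. A result of 0 means that no qat lines were listed at all, which is what the empty `grep qat` output on 23.09-DEVELOPMENT corresponds to.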
                    

                    It's also possible that something else that used to use QAT on 23.05.1 isn't using the same algorithm on 23.09 and now isn't being accelerated. For example if something used AES-128 before but now selected ChaCha20-Poly1305, then it wouldn't be using QAT.
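The check described here, matching the algorithm a tunnel actually uses against the dashboard list, could be sketched as follows. The list is copied from the 23.09 dashboard output quoted above. The function name `is_qat_accelerated` is hypothetical, and the caller is assumed to pass the algorithm family (for example `AES-GCM`) rather than a P2 label such as AES128-GCM.

```shell
#!/bin/sh
# Accelerated algorithms as shown on the 23.09 dashboard in this thread.
QAT_ALGOS="AES-CBC AES-CCM AES-GCM AES-ICM AES-XTS SHA1 SHA256 SHA384 SHA512"

# Hypothetical helper: print "yes" if $1 is in the list above, otherwise "no".
is_qat_accelerated() {
    case " $QAT_ALGOS " in
        *" $1 "*) echo yes ;;
        *)        echo no ;;
    esac
}

is_qat_accelerated AES-GCM            # prints "yes"
is_qat_accelerated ChaCha20-Poly1305  # prints "no"
```

The list is hard-coded for illustration; on a real system it should be taken from whatever the dashboard reports for that release.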

                      RobbieTT @jimp

                      @jimp
                      I'm not using a VPN at the moment, so that is not a factor. As described earlier, in this state the QAT would only be used for encrypted traffic that is requested and received by the router itself. I guess that almost all (if not all) will be TLS traffic.

                      If I actively reach out from the router CLI with a given cypher, e.g.:

                      openssl s_client -host sdcstest.blob.core.windows.net -port 443 -cipher ECDHE-RSA-AES256-GCM-SHA384

                      I can achieve a correct SSL handshake with the cypher chosen but this traffic is also missing the QAT offload:

                      ---
                      No client certificate CA names sent
                      Peer signing digest: SHA256
                      Peer signature type: RSA-PSS
                      Server Temp Key: ECDH, secp384r1, 384 bits
                      ---
                      SSL handshake has read 5764 bytes and written 442 bytes
                      Verification: OK
                      ---
                      New, TLSv1.2, Cipher is ECDHE-RSA-AES256-GCM-SHA384
                      Server public key is 2048 bit
                      Secure Renegotiation IS supported
                      Compression: NONE
                      Expansion: NONE
                      No ALPN negotiated
                      SSL-Session:
                          Protocol  : TLSv1.2
                          Cipher    : ECDHE-RSA-AES256-GCM-SHA384
                          Session-ID: 552B0000932731A77EE20EBC73D3535D7A55980427FAC5D3...
                          Session-ID-ctx: 
                          Master-Key: 99ED51BB752865C07DC71A98871DFEE6EBA67368F54CAE50...
                          PSK identity: None
                          PSK identity hint: None
                          SRP username: None
                          Start Time: 1696249971
                          Timeout   : 7200 (sec)
                          Verify return code: 0 (ok)
                          Extended master secret: yes
                      ---
                      
                      

                      There may be other test methods you prefer for diagnostics, so happy to try them too.

                        jimp Rebel Alliance Developer Netgate @RobbieTT

                        @RobbieTT said in 23.09d - Is QAT Broken?:

                        @jimp
                        I'm not using a VPN at the moment, so that is not a factor. As described earlier, in this state the QAT would only be used for encrypted traffic that is requested and received by the router itself. I guess that almost all (if not all) will be TLS traffic.

                        What traffic exactly? What daemons do you have running on the firewall that would be using TLS/algorithms you expect to be accelerated?

                        Just the GUI (nginx) or something else? The s_client example you showed would only be hitting the GUI (nginx).

                        "QAT is broken" is a completely different statement than "QAT isn't working for daemons X, Y, Z"

                        We have already demonstrated that the former statement is untrue; what remains is determining the latter.

                          RobbieTT @jimp

                          @jimp
                          Am I not correct in presuming that encrypted traffic originating from the router that uses TLS will use QAT?

                          I don't know which daemons running on the router reach out. I'm not good at guessing, but router updates/checks, package updates, DoT (unbound), pfBlocker et al. seem likely. I would be surprised if any of the services hosted on pfSense used unencrypted traffic but again, that is a guess from a non-developer.

                          The use of QAT has changed from 23.05.1 (and below) to 23.09 dev. I genuinely don't know why or how routine TLS traffic or a test openssl session with Microsoft fails to show any QAT usage. I read through the Intel QAT white-paper on testing and used the confirmation methods they gave.

                          Perhaps 'broken' is the wrong verb in dev-speak so if I need to use English in a different way please suggest a better one. As a customer I am trying my best here.

                            jimp Rebel Alliance Developer Netgate

                            What I'm saying is you need to be a lot more specific. Yes, there is a change in behavior but you haven't even clearly defined what that is.

                            I ran some tests locally and QAT doesn't appear to be used by ssh, nginx, or openvpn, though I seem to recall it worked with nginx in the past. I am not sure about outbound. I don't have anything left on 23.05.1 with QAT to confirm all of the old behavior.

                            Primarily what gets accelerated is use of the accelerated algorithms in the kernel, so IPsec and OpenVPN DCO are the main consumers.

                            What you'd need to do is watch the interrupts on the QAT device and see if they go up in proportion with traffic you send to things like SSH (type some lines, check the count, or SCP a large file), the GUI (check after refreshing the page), and so on. For testing outbound, you can try pkg update -f or try to curl https://<something>.
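The before/after comparison suggested here can be made repeatable by saving two `vmstat -i` snapshots and subtracting the QAT totals. This is only a sketch: the helper names and the `/tmp` file paths are hypothetical, and it assumes the `vmstat -i` layout seen elsewhere in this thread, with the running total in the second-to-last column.

```shell
#!/bin/sh
# Hypothetical helpers (not part of pfSense). Assumes `vmstat -i` prints
# "irqN: qat0:bN  <total>  <rate>" as shown elsewhere in this thread.

# Sum the total column of all qat lines in a saved `vmstat -i` snapshot file.
qat_irq_total() {
    awk '/qat[0-9]/ { sum += $(NF - 1) } END { print sum + 0 }' "$1"
}

# Print how many QAT interrupts occurred between two snapshot files.
qat_irq_delta() {
    before=$(qat_irq_total "$1")
    after=$(qat_irq_total "$2")
    echo $((after - before))
}

# Intended use on the firewall (commented out so the sketch runs anywhere):
#   vmstat -i > /tmp/qat.before
#   pkg update -f            # or scp a large file, reload the GUI, curl ...
#   vmstat -i > /tmp/qat.after
#   qat_irq_delta /tmp/qat.before /tmp/qat.after
```

A delta of zero after a given test means that test did not touch the QAT device. It does not by itself say whether that is expected.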

                              jimp Rebel Alliance Developer Netgate

                              Speaking with some others here, the only thing that would use QAT is encryption in the kernel, so IPsec and OpenVPN DCO as I mentioned. It wouldn't get used for userspace daemons or clients.

                              So it would help to confirm what exactly was using QAT on 23.05.1 that you aren't seeing now.

                                RobbieTT @jimp

                                @jimp

                                I am trying to be as specific as I can but I just don't have your knowledge. I can only give you exactly what I have tested and paste in the exact results given. Not that it matters, but I did not state 'QAT is broken'; I posted a question asking if it was, specific to a firmware load.

                                I have tried the testing methods listed in the Intel QAT paper, I have monitored the routine traffic originating from the router, and I have run updates, basic pkg update -f and curl to HTTPS sites, and the openssl test session with Microsoft (as per their developer guide). There remain zero QAT interrupts, whatever I try.

                                I am sure there are different crypto uses I have not tried and may work, but I can say that all the different (and somewhat basic) things I would expect QAT to work with now produce zero interrupts.

                                Have you tried any of the tests I have used and, if so, is everything working ok for you?

                                Apologies for being a pfSense newcomer and not having enough knowledge to resolve this on my own. If you need someone who can fly a fighter-jet or run a complex flight test profile I'm your guy but I need help with pfSense!

                                  RobbieTT @jimp

                                  @jimp said in 23.09d - Is QAT Broken?:

                                  Speaking with some others here the only things that would use QAT would be encryption in the kernel, so IPsec and OpenVPN DCO as I mentioned. It wouldn't get used for userspace daemons or clients.

                                  So it would help to confirm what exactly was using QAT on 23.05.1 that you aren't seeing now.

                                  Our messages crossed. How would you like me to test for that on a 23.05.1 snapshot?

                                    jimp Rebel Alliance Developer Netgate

                                    Test that exactly as I mentioned before -- try running traffic through each daemon individually and see if you see interrupts on the QAT device when you do.

                                    I just set up and tested an OpenVPN DCO tunnel here and it is also using QAT just like IPsec, both of which are in the kernel. That's what we expect to see.

                                    So we need to figure out what you were seeing using DCO on 23.05.1 to narrow down what has changed in your environment.

                                      RobbieTT @jimp

                                      @jimp
                                      Thanks @jimp and happy to do so. To avoid more errata, how specifically do you want to achieve traffic through each daemon?

                                      Does this not conflict with your note that QAT is not used for userspace daemons?

                                      Does the lack of QAT interrupts when using TLS, or a curl to HTTPS, or pkg update -f or an openssl test session mean nothing and a zero result is actually the expected behaviour?

                                      I've become slightly confused about whether I have a problem or not.

                                      I need more tea.

                                        jaltman @RobbieTT

                                        @jimp One of the big changes in 23.09 is the switch to OpenSSL v3. OpenSSL has a QAT engine if it's built with it. Is it possible that OpenSSL 1.1.x was built with QAT engine support and 3.x is not?

                                        [23.05.1-RELEASE][root@pfsense.bayside.sara-jeff.nyc]/root: openssl engine
                                        (devcrypto) /dev/crypto engine
                                        (rdrand) Intel RDRAND engine
                                        (dynamic) Dynamic engine loading support
                                        
                                        [23.09-DEVELOPMENT][root@hostname]/root: openssl engine
                                        (rdrand) Intel RDRAND engine
                                        (dynamic) Dynamic engine loading support
                                        

                                        I suspect that the OpenSSL QAT engine is unavailable as part of 23.05 because qatengine.so is not installed in the default location and /etc/ssl/openssl.cnf does not configure it.

                                        [23.05.1-RELEASE][root@pfsense.bayside.sara-jeff.nyc]/usr/bin: ./openssl engine -t -c -v qatengine
                                        59772833685504:error:25066067:DSO support routines:dlfcn_load:could not load the shared library:/var/jenkins/workspace/pfSense-Plus-snapshots-23_05_1-main/sources/FreeBSD-src-plus-RELENG_23_05_1/crypto/openssl/crypto/dso/dso_dlfcn.c:118:filename(/usr/lib/engines/qatengine.so): Cannot open "/usr/lib/engines/qatengine.so"
                                        59772833685504:error:25070067:DSO support routines:DSO_load:could not load the shared library:/var/jenkins/workspace/pfSense-Plus-snapshots-23_05_1-main/sources/FreeBSD-src-plus-RELENG_23_05_1/crypto/openssl/crypto/dso/dso_lib.c:162:
                                        59772833685504:error:260B6084:engine routines:dynamic_load:dso not found:/var/jenkins/workspace/pfSense-Plus-snapshots-23_05_1-main/sources/FreeBSD-src-plus-RELENG_23_05_1/crypto/openssl/crypto/engine/eng_dyn.c:434:
                                        59772833685504:error:2606A074:engine routines:ENGINE_by_id:no such engine:/var/jenkins/workspace/pfSense-Plus-snapshots-23_05_1-main/sources/FreeBSD-src-plus-RELENG_23_05_1/crypto/openssl/crypto/engine/eng_list.c:421:id=qatengine
                                        

                                        The Intel QAT engine is supported for FreeBSD so it might be worthwhile adding to a future release:

                                        https://github.com/intel/QAT_Engine/blob/master/README.md
                                        https://www.intel.com/content/www/us/en/download/19735/intel-quickassist-technology-driver-for-freebsd-hw-version-1-x.html?
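The difference between the two `openssl engine` listings above can be checked mechanically. This is a small sketch, not a Netgate tool: `has_engine` is a hypothetical helper that looks for an engine id in `openssl engine` output supplied on stdin, assuming the `(id) description` line format shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the engine id in $1 appears in
# the `openssl engine` listing read from stdin.
has_engine() {
    grep -qF "($1) "
}

# The 23.09-DEVELOPMENT listing quoted above, which has no devcrypto entry:
listing='(rdrand) Intel RDRAND engine
(dynamic) Dynamic engine loading support'

if printf '%s\n' "$listing" | has_engine devcrypto; then
    echo "devcrypto engine present"
else
    echo "devcrypto engine missing"
fi
```

On a live system the same check would be `openssl engine | has_engine devcrypto`, or with `qatengine` as the id. The only difference between the two listings above is the `devcrypto` entry. Whether that accounts for the change in QAT interrupts is not established in this thread.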

                                          jimp Rebel Alliance Developer Netgate @RobbieTT

                                          @RobbieTT said in 23.09d - Is QAT Broken?:

                                          @jimp
                                          Thanks @jimp and happy to do so. To avoid more errata, how specifically do you want to achieve traffic through each daemon?

                                          openvpn: Pinging the tunnel IP addresses on the other end and checking is sufficient.

                                          IPsec: For tunnel mode, ping LAN to LAN, for VTI, pinging the VTI address on the far side is enough.

                                          nginx: Try reloading various GUI pages in a browser and see if the interrupts increase, or try transferring data from a remote curl client. s_client alone may not do enough to be meaningful since it's just negotiating the connection.

                                          ssh: Even just running the command to check the interrupts should increase them, but using scp to transfer a large-ish file and checking before and after would really show it.

                                          outbound/curl: Try fetching a remote file using https and see if the interrupts increase.

                                          Does this not conflict with your note that QAT is not used for userspace daemons?

                                          We expect it only to be used by the kernel. You're seeing some difference on 23.05.1, which is why we need more data to isolate what that might be.

                                          Does the lack of QAT interrupts when using TLS, or a curl to HTTPS, or pkg update -f or an openssl test session mean nothing and a zero result is actually the expected behaviour?

                                          We don't expect any of those to cause interrupts on QAT since they aren't running through the kernel. So not seeing an increase is the correct behavior.

                                          So far I haven't seen anything that suggests there is a problem on 23.09 but we need more data about what you were seeing on 23.05.1 to say for sure.

                                            jimp Rebel Alliance Developer Netgate

                                            I reloaded 23.05.1 on a 4100 and I don't see any QAT activity on there at all for the GUI, ssh, curl, etc.

                                            Are you certain you don't have any IPsec or OpenVPN DCO tunnels on 23.05.1 or 23.09?

                                            Maybe if you had an OpenVPN DCO tunnel on 23.05.1 it was using AES-GCM (accelerated) but on 23.09 it may be using ChaCha20-Poly1305 (not accelerated).

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.