Netgate Discussion Forum

    Epyc 3251 and Wireguard

    General pfSense Questions
    50 Posts 3 Posters 8.1k Views
    • stephenw10S
      stephenw10 Netgate Administrator

      Ah, OK. Sorry I misinterpreted the responses there then.

      Is it possible to switch back to igb and try to generate a crash report?
      The crash you saw there in cxgbe looks very similar to some we have seen in other drivers but that I expect to be fixed in igb.

      I'm not aware of any testing that has been done on an Epyc platform device.

      Steve

      • J
        Jarhead @stephenw10

        @stephenw10 Definitely possible, just don't know when I can get to it. Got a lot of painting to do tonight so maybe I can play around as I watch the paint dry. 😀

        • stephenw10S
          stephenw10 Netgate Administrator

          Ha, sounds like a good option!

          I'll keep digging here, see if anyone has any suggestions.

          • J
            Jarhead @stephenw10

            @stephenw10 Boy that paint took a long time to dry! 😀

            Gave me a lot of time to try this out.
            Kept going back to that config being messed up so I started from scratch.
            New install, chelsio card not connected, just changed my network address with WAN on igb0 and LAN on igb1.
            Installed Wireguard. It ran fine.
            Installed Chelsio. Still ran fine on igb interfaces.
            Moved WAN and LAN to chelsio. Still ran fine!
            So the Chelsio is not causing the issue.
            Spent some time (a lot! long night) setting the config back to my usual. Not importing anything, had my old router up and used it as a reference.
            Wireguard crashed.
            In the process of setting up my old router as the new endpoint so I can rule out opnsense being the cause. (secretly hoping it is so I can get rid of it!!)

            Did get a new crash dump that might show something. This is still on the chelsio though.

            textdump.tar.0
            info.0

            • stephenw10S
              stephenw10 Netgate Administrator

              Mmm, pretty much identical crash and it's in the cxgbe driver.
              I have no idea how the WG encrypted traffic could be triggering it though.

              I'd be willing to bet it would not crash with igb as WAN. Though you said you were still seeing the errors logged with igb?

              After reconfiguring it with your previous settings did it start crashing immediately?
              Was it actually panicking and rebooting or just Wireguard erroring out?

              Steve

              • J
                Jarhead @stephenw10

                @stephenw10
                But it did crash with the igb driver.
                When I have a clean config (meaning no other interfaces assigned) it runs fine on both igb and cxgbe.
                When I put my config back on it, it crashes on igb and cxgbe.
                Gonna try disabling all other interfaces when I get a chance later.

                • stephenw10S
                  stephenw10 Netgate Administrator

                  The firewall itself crashed and rebooted with the igb NIC as WAN?

                  • J
                    Jarhead @stephenw10

                    @stephenw10
                    Yes, pointed that out many posts ago. That's why I keep saying stop focusing on the chelsio.
                    But I did make some progress.
                    I have WG working from pfSense to pfSense.
                    Don't think that matters because I did have it running to this same opnsense box.
                    What I'm thinking, and about to try, is that I've been recreating the tunnel I used with the old router to this opnsense junk.
                    I now created a whole new tunnel as a test between pfSenseseses.

                    Gonna now try to create a new tunnel to the opnsense.

                    • stephenw10S
                      stephenw10 Netgate Administrator

                      Hmm, no crash report from when it was running over igb though?

                      Just very odd that it shows as a crash in the driver not in Wireguard.
                      It must be some very unusual traffic that WG is creating and that the driver is trying to do something with. Hard to imagine what that could be though.

                      • J
                        Jarhead @stephenw10

                        @stephenw10
                        No crash report from igb. I had mentioned that when it crashed with the igb too, I went back to the chelsio.

                        Got it working though.
                        Never would have guessed you can't "reuse" a tunnel but creating a new one fixed it.
                        Do the WireGuard tunnels 'key' to the actual hardware somehow?

                        • stephenw10S
                          stephenw10 Netgate Administrator

                          No way that I can think of. Hmm. I've imported configs with WG tunnels defined before and never seen an issue. 🤔

                          • J
                            Jarhead @stephenw10

                            @stephenw10
                            I didn't import them, I recreated them with the same values. Shouldn't make a difference and I'm now guessing it didn't.
                            Tunnels have been up for about an hour or so and it just went down.
                            Same thing, tried to access the opnsense box.

                            • J
                              Jarhead @Jarhead

                              @stephenw10 Gonna try to switch LAN and WAN back to the igb at some point.
                              If this turns out to be the chelsio card, any ideas why it would have a problem?
                              Any suggested 'tuning' to it maybe?

                              • stephenw10S
                                stephenw10 Netgate Administrator

                                You might check the mbuf usage history in the monitoring graphs. We have seen odd traffic create an mbuf leak before in cxgbe. The Wireguard traffic just isn't that odd though. Makes me wonder if it fails in the driver but is actually triggered somewhere else....
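
For a quick spot-check outside the graphs, FreeBSD's `netstat -m` prints the mbuf counters on its first line. Below is a minimal sketch, assuming that line has the usual `current/cache/total mbufs in use` form; the helper names `parse_mbufs` and `keeps_growing` are made up for illustration, and the sample numbers are not from the system in this thread.

```python
import re

# Format of the mbuf summary line printed by FreeBSD's `netstat -m`.
MBUF_LINE = re.compile(
    r"^(\d+)/(\d+)/(\d+) mbufs in use \(current/cache/total\)", re.MULTILINE
)


def parse_mbufs(netstat_output):
    """Return (current, cache, total) mbuf counts, or None if the line is absent."""
    match = MBUF_LINE.search(netstat_output)
    if match is None:
        return None
    return tuple(int(n) for n in match.groups())


def keeps_growing(samples):
    """True if the 'current' count rose between every consecutive pair of samples."""
    return len(samples) > 1 and all(b > a for a, b in zip(samples, samples[1:]))


if __name__ == "__main__":
    # Made-up sample line, not taken from the poster's firewall.
    sample = "16645/4490/21135 mbufs in use (current/cache/total)\n"
    print(parse_mbufs(sample))
    # Hypothetical 'current' readings collected a few minutes apart.
    print(keeps_growing([16645, 17010, 18230]))
```

A count that only ever climbs while traffic stays roughly constant would be consistent with a leak, though a few samples alone would not prove one.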

                                • J
                                  Jarhead @stephenw10

                                  @stephenw10 So this is getting more weird by the minute!

                                  Last night I switched the WAN to igb0. I had also planned on switching the LAN to igb1 but didn't, although I did free igb1 up and unassigned it. Might be relevant.

                                  All was working great but I didn't have high hopes since it was all working fine with cxl3 for an hour yesterday.
                                  Went to sleep, this morning I checked on it and it was still running. One tunnel has my camera network, another has backups.
                                  I was confused since it crashed twice on igb0.
                                  Trying to find out what the difference was and I remembered igb1.
                                  I went back and reassigned igb1 to my guest wifi network.
                                  Tried to access the opnsense box on the other end and the page wouldn't load. Went straight to the monitor connected to my pfSense expecting to see the screen scrolling but it wasn't.
                                  Went back to my pfSense webgui and saw the tunnel went down and was coming back up in the time it took me to get back to the interface.
                                  That was at 7am, and it still is not completely back. Right now the tunnel gateway shows 1,235.1ms RTT, 174.9ms RTTsd, 0.0% loss. Earlier it was at over 5000ms RTT and over 500ms RTTsd.

                                  So with 5 physical interfaces up, it went down but didn't crash.

                                  Could this be a wireguard bug?
                                  Gonna disable the 5th interface and see if it changes anything when I can.

                                  For reference:

                                  My side.
                                  LAN = 10.12.8.0/26
                                  camera = 10.12.8.64/27
                                  Guest wifi = 10.12.8.96/27
                                  IoT = 10.12.8.128/25

                                  1st tunnel:
                                  LAN = 10.8.19.0/26
                                  camera = 10.8.19.248/29

                                  2nd tunnel:
                                  LAN = 192.168.1.0/24

                                  Nothing overlapping.
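
The "nothing overlapping" claim can be double-checked mechanically. Here is a minimal sketch using Python's standard `ipaddress` module; the subnets are copied from the list above, while the dictionary labels and the `find_overlaps` helper are only illustrative names.

```python
import ipaddress
from itertools import combinations

# Subnets as listed in the post; the labels are just for readability.
SUBNETS = {
    "local LAN": "10.12.8.0/26",
    "local camera": "10.12.8.64/27",
    "local guest wifi": "10.12.8.96/27",
    "local IoT": "10.12.8.128/25",
    "tunnel 1 LAN": "10.8.19.0/26",
    "tunnel 1 camera": "10.8.19.248/29",
    "tunnel 2 LAN": "192.168.1.0/24",
}


def find_overlaps(subnets):
    """Return a list of (label, label) pairs whose networks overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    return [
        (a, b)
        for a, b in combinations(nets, 2)
        if nets[a].overlaps(nets[b])
    ]


if __name__ == "__main__":
    overlaps = find_overlaps(SUBNETS)
    print(overlaps if overlaps else "no overlapping subnets")
```

This only rules out address overlap between the listed networks; it says nothing about the Allowed IPs configured on each WireGuard peer, which are not shown in the thread.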

                                  • Bob.DigB
                                    Bob.Dig LAYER 8 @Jarhead

                                    @jarhead I ran WG to Android, Windows and a Privacy-VPN, so far no crashes.

                                    • J
                                      Jarhead @Bob.Dig

                                      @bob-dig said in Epyc 3251 and Wireguard:

                                      @jarhead I ran WG to Android, Windows and a Privacy-VPN, so far no crashes.

                                      But did you have 5 physical interfaces up?

                                      • Bob.DigB
                                        Bob.Dig LAYER 8 @Jarhead
                                        last edited by Bob.Dig

                                        @jarhead I had 4 WG tunnels to the privacy VPN and 2 other tunnels. When it comes to physical interfaces, most are VLANs here, so no. Also, I just wanted to mention it; I don't really think that I could help here anyway.

                                        • stephenw10S
                                          stephenw10 Netgate Administrator

                                          Hmm, hard to see how the number of interfaces would affect Wireguard. It would increase the mbuf allocation.
                                          Do you see any errors on the console or logged even if it doesn't panic/reboot?

                                          Not sure if it would be better to get a crash report from igb at this point. 😉

                                          I could imagine the cxgbe driver is crashing completely but igb, when faced with the same traffic, just starts dropping it, causing the massive loss/latency you are seeing.

                                          Steve

                                          • J
                                            Jarhead @stephenw10

                                            @stephenw10
                                            No errors anywhere I can see but now the other WG tunnel is going out too.

                                            pfsense.png

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.