Netgate Discussion Forum

    Super Micro C2758 crashes

    • chrismacmahon

      Looking at both crash reports (yesterday's and today's), they show different callouts.

      You could try adding hw.igb.num_queues=1 to loader.conf.
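
A minimal sketch of how the tunable could be added without creating duplicate lines. The helper name set_tunable is made up for illustration, and the path /boot/loader.conf.local in the usage comment is an assumption (on pfSense that file is normally used for local loader settings); adjust both to your system.

```shell
# Hypothetical helper: set NAME=VALUE in a loader.conf-style file,
# replacing an existing line for NAME or appending a new one.
set_tunable() {
    file=$1
    name=$2
    value=$3
    touch "$file"
    # Drop any existing assignment for this tunable, then append the new one.
    # (grep exits non-zero when nothing is left to print; that is fine here.)
    grep -v "^${name}=" "$file" > "${file}.tmp" || true
    printf '%s=%s\n' "$name" "$value" >> "${file}.tmp"
    mv "${file}.tmp" "$file"
}

# Example (path is an assumption, see above):
#   set_tunable /boot/loader.conf.local hw.igb.num_queues 1
# Loader tunables only take effect after a reboot. On the FreeBSD 11 based
# builds discussed here, the active value should be readable with:
#   sysctl hw.igb.num_queues
```

Calling the helper twice with different values leaves a single line holding the last value, so it is safe to re-run while experimenting with queue counts.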

      Hope that helps.

      Need help fast? Our support is available 24/7 https://www.netgate.com/support/

      Do Not PM For Help!

      • athurdent

        Thanks, I have added this and rebooted. Will report back.

        • w0w

          If the "safe" hw.igb.num_queues=1 works for you, then you can try
          hw.igb.num_queues=2
          The C2758 is an 8-core CPU and you have 4 Intel ports, which works out to 2 queues per port, according to the FreeBSD tuning guide: https://calomel.org/freebsd_network_tuning.html
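
The arithmetic behind that suggestion, as a small sketch: spread the CPU cores evenly over the NIC ports. The helper name queues_per_port is hypothetical, and the "cores divided by ports" rule is only my reading of the post above, not something the igb driver enforces.

```shell
# Hypothetical helper: divide CPU cores evenly across NIC ports,
# never returning less than one queue per port.
queues_per_port() {
    cores=$1
    ports=$2
    q=$((cores / ports))
    if [ "$q" -lt 1 ]; then
        q=1
    fi
    echo "$q"
}

queues_per_port 8 4   # C2758: 8 cores, 4 igb ports -> prints 2
```

On a live FreeBSD system the core count could be taken from `sysctl -n hw.ncpu` instead of being typed in.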

          • athurdent

            The board has been stable, so I modified loader.conf to use 2 queues. While there, I also installed the latest snapshot and rebooted.
            On Sunday I upgraded to BIOS 1.1a, the board was on 1.1 before, forgot to mention that. Don't know if anything relevant has changed with this BIOS, as I cannot find any changelog / release notes from Supermicro.

            • athurdent

              OK, no luck with 2 queues.

              I can reboot the board with

              hping3 -c 100 -d 120 -S -w 64 -p 443 --flood 192.168.x.10
              

              from LAN to my DMZ (this produces about 130000 states).
              I just sent in the crash report for that.

              With only one queue I can even use "--rand-source" until the state table is full, no reboot.

              • w0w

                I'll do the test on similar hardware later this week.

                • athurdent

                  Thanks w0w, looking forward to your results.

                  BTW the hping3 crash looked different:

                  panic: bpf_mcopy
                  cpuid = 6
                  KDB: enter: panic

                  Edit: Just found this on Reddit, so I guess only one queue per interface should be OK for everyone who is just passing traffic through?

                  https://www.reddit.com/r/PFSENSE/comments/5obhlm/what_are_the_ramifications_of_less_nic_queues/dci77px/

                  • w0w

                    Tested, but I could not replicate the issue.
                    How much physical RAM do you have installed?
                    Do you have any custom tunables enabled, other than the ones you have mentioned in this topic?
                    Below are the network stack tunables I have enabled:
                    #some magic numbers :)
                    kern.ipc.nmbjumbo9="20000"
                    kern.ipc.nmbclusters="1000000"
                    kern.ipc.maxsockbuf="256000000"
                    #some more igb tune for GIG links
                    hw.igb.rxd="4096"
                    hw.igb.txd="4096"
                    net.inet.tcp.syncache.hashsize=1024
                    net.inet.tcp.syncache.bucketlimit=100
                    net.isr.defaultqlimit=4096
                    net.link.ifqmaxlen=10240
                    hw.igb.rx_process_limit="-1"
                    hw.igb.num_queues=2
                    #disable flow control on all igb interfaces
                    dev.igb.0.fc=0
                    dev.igb.1.fc=0
                    dev.igb.2.fc=0
                    dev.igb.3.fc=0
                    I'll run some tests with all tunables disabled and will try to use all the bandwidth I have on WAN…
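
The per-port flow-control lines in the list above follow a fixed pattern, so they can be generated rather than typed by hand. A minimal sketch, assuming the ports are numbered from igb0 upwards; the helper name fc_lines is made up for illustration.

```shell
# Hypothetical helper: print one "dev.igb.N.fc=0" line per igb port.
# A value of 0 disables flow control, as in the list above.
fc_lines() {
    ports=$1
    i=0
    while [ "$i" -lt "$ports" ]; do
        echo "dev.igb.${i}.fc=0"
        i=$((i + 1))
    done
}

fc_lines 4   # four igb ports, igb0 through igb3
```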

                    • athurdent

                      Thanks for looking into this. I tend to stick with the defaults until I hit a problem. So I only have nmbclusters defined. I have also disabled Hardware Checksum Offloading, can't remember why, though.
                      Amongst others, I'm using CARP, VLANs, the on-board SNMP and Snort. Those could also be responsible for problems, I guess.

                      • chrcoluk

                        Did you try with 1 igb queue? If not, please test with it.

                        --edit--

                        I see you did, thanks for reporting the findings, it does seem igb currently on FreeBSD 11 has issues with multi queue.

                        pfSense CE 2.8.0

                        • athurdent

                          @chrcoluk:

                          I see you did, thanks for reporting the findings, it does seem igb currently on FreeBSD 11 has issues with multi queue.

                          Thanks, do you have any FreeBSD Bugtracker or forum reference for this?

                          • chrcoluk

                            Sadly I did not bookmark it, but I will have a look later and post it here if I find it.

                            pfSense CE 2.8.0

                            • w0w

                              Tested with over 196015 states, but nothing happened. However, I have found that my test setup is a little bit wrong. The problem is that my test machine has a PPPoE link enabled on the WAN test interface, which means that WAN uses only one queue by FreeBSD design.
                              Sorry, I cannot test it any other way. It looks like the igb(4) drivers are not that good, but at least they are not as bad as the stock Realtek FreeBSD drivers ;)
                              I have not researched it yet, but there are some custom patches to be found in the FreeBSD community.
                              But if your SuperMicro C2758 is doing its job and you have no performance issues, then just leave it alone :)

                              • chrcoluk

                                The problem seems to be triggered by one of these two things, or perhaps even both.

                                FreeBSD 11 changes regarding the igb driver:

                                RSS awareness has been added to the igb(4) driver (r268028)
                                Automatic disabling of multi queue if ALTQ is enabled in kernel.

                                The latter, I think, is the most likely, as it seems it is not fully disabling multi queue: the loader tunable still defaults to the number of CPU cores, and the problem goes away when it is forced to 1.

                                https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=212413

                                That problem report also links to a couple of others.

                                pfSense CE 2.8.0

                                • athurdent

                                  @chrcoluk: Thank you!

                                  • athurdent

                                    Might be fixed now: https://redmine.pfsense.org/issues/7149

                                    Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.