Netgate Discussion Forum

    Firebox Marvell ports locking up (CORE-E SERIES)

    79 Posts 12 Posters 14.0k Views
    • E
      Engineer

      Had my first LAN lockup with a watchdog timeout in three days.  Just turned off all hardware offloading, including checksum offloading, then saved and rebooted.  Now to start monitoring all over again.

      If this doesn't work, I will probably go to 2.1.5 (if that works) until it's resolved.

      Edit:  After reading around more, I'm more inclined to believe that the error is in FreeBSD and that a fix for the same error on the older Intel cards (em) has already been committed: https://reviews.freebsd.org/D3192

      As to when it will appear, not sure.  It seems to have been sent to Intel networking for review too, so time will tell.

      As for the OP's issue…if this is a FreeBSD issue (it seems to appear on lots of different brands of hardware), hopefully the FreeBSD patches will fix it.  I don't know enough about that to guess at this point, though.

      • C
        corvey

        2.1.5 works.

        pfSensational™

        • E
          Engineer

          Not sure if it matters or not, but I just noticed that even though I had checked the option to disable TSO under System > Advanced > Networking, the system tunable net.inet.tcp.tso was still at 1.  Changed it to 0, saved, and then added net.inet.tcp.tso=0 to /boot/loader.conf.local just to make sure.

          I guess I'll reboot now instead of waiting a few days, because I had the disable TSO option checked before and it made no difference.
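
          For anyone wanting to double-check the same thing, here is a minimal Python sketch (not something that ships with pfSense) that works on a copy of the file's text.  It assumes the simple key=value layout of loader.conf-style files; the helper names parse_loader_conf and ensure_tunable and the sample contents are made up for illustration.

```python
def _key_of(raw_line):
    """Return the tunable name on a loader.conf-style line, or None."""
    body = raw_line.split("#", 1)[0].strip()  # naive: drops anything after '#'
    if "=" not in body:
        return None
    return body.split("=", 1)[0].strip()


def parse_loader_conf(text):
    """Map tunable name -> value (quotes stripped); later lines win."""
    tunables = {}
    for raw in text.splitlines():
        key = _key_of(raw)
        if key is None:
            continue
        value = raw.split("#", 1)[0].split("=", 1)[1].strip().strip('"')
        tunables[key] = value
    return tunables


def ensure_tunable(text, key, value):
    """Return text with exactly one line setting key to value."""
    wanted = f'{key}="{value}"'
    out, placed = [], False
    for raw in text.splitlines():
        if _key_of(raw) == key:
            if not placed:      # rewrite the first occurrence in place
                out.append(wanted)
                placed = True
            continue            # drop any later duplicates
        out.append(raw)
    if not placed:
        out.append(wanted)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    sample = "# local overrides\nnet.inet.tcp.tso=1\n"
    updated = ensure_tunable(sample, "net.inet.tcp.tso", "0")
    print(updated, end="")
    print(parse_loader_conf(updated))
```

          This only looks at what is written in the file.  The value the kernel is actually using is a separate question, which sysctl net.inet.tcp.tso at the shell answers.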

          • D
            deanot

            Keep us informed on how it works out.  I have been up for a couple of days without issue so far.  I did make the changes you mentioned earlier in the topic, so it could be more stable because of them, but this thing is temperamental and will flake out when it feels like it.

            PFSense System Specs.
            ---------------
            Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
            4 CPUs: 1 package(s) x 4 core(s)
            4 port HP Branded Intel Ethernet Card

            • D
              deanot

              I changed the topic title to reflect more detail on the product.

              • E
                Engineer

                Didn't take long this time.  It just went down and a watchdog timeout reset it.  Going to try 2.1.5 to see if that takes care of it until it's fixed.  The folks at OPNsense seemed to think the commits wouldn't be pushed into production until FreeBSD 11. :(

                Edit:  As before, once the lockup occurs, the watchdog will reset it and restore IPv4 Internet access, but IPv6 stays down until it is turned off/on or the pfSense system is rebooted.  Right now, my browsing is very slow as it tries IPv6 first and then falls back to IPv4 (haven't rebooted yet).  It was a Netflix stream, and possibly a TWC TV stream at the same time (high traffic load), that triggered this latest lockup.

                • D
                  deanot

                  Now you mention it, it does happen with high traffic.  My wife watches a lot of online TV, and it normally happens then.  Especially if I log into the box using the GUI: as soon as I do that, BOOM!

                  • E
                    Engineer

                    Tried the MSI / MSI-X disable and it wouldn't even let me get to the LAN / Internet after a reboot.  Edited those back out via console / shell, rebooted, and it ran for 5 minutes before locking up.  Just downloaded 2.1.5 and will try it as soon as I can get people off Netflix long enough to get it started.

                    • D
                      deanot

                      @Engineer:

                      Tried the MSI / MSI-X disable and it wouldn't even let me get to the LAN / Internet after a reboot.  Edited those back out via console / shell, rebooted, and it ran for 5 minutes before locking up.  Just downloaded 2.1.5 and will try it as soon as I can get people off Netflix long enough to get it started.

                      Interesting.  I did the MSI / MSI-X fix on mine and it worked OK after a reboot.

                      • E
                        Engineer

                        @corvey:

                        2.1.5 works.

                        It might if I could get past the ROOT MOUNT ERROR on the Live CD.  I guess I'll try the memstick version.

                        Edit:  I've given up for tonight.  Can't get past the ROOT MOUNT ERROR on the LiveCD version nor the Memstick version.  Have followed pages of advice and just can't get past it.  Might try later….sigh

                        Edit #2:  Never got it to work and will try later.  I did find a link that describes pretty much the same problem for my card as for the em cards (PCI vs PCIe, I think): https://lists.freebsd.org/pipermail/freebsd-net/2014-January/037694.html  If so, this would seem to be an issue with ANY card with TSO, no matter the driver.

                        • S
                          stephenw10 Netgate Administrator

                          Not seen that at all on any of the fireboxes I have here.
                          Even on the msk ports, with msi/msix disabled they are stable. I've never seen an issue with the sk interfaces.
                          The first thing I would do is disable hardware checksum offload if you still have that on. I did see that break dhcp relay inexplicably while everything else worked fine.

                          Many of those boxes have seen a lot of hours in hot conditions. If you have turned the fans down they might be running even hotter. If all the NICs are failing it could be some hardware fault. Bad caps?

                          Steve

                          • D
                            deanot

                            Well, I am done with it.  I have tried everything I can to make this stable; it went 7 days before shitting itself 4 times today so far.  I am back to my Cisco router/firewall, at least until I can get my hands on a 4 port network card.

                            Anyone interested in a Firebox?

                            • E
                              Engineer

                              Sorry to hear that, deanot! :(

                              My board (igb driver) ran for 4 days before crashing twice this morning, once under almost no load at 4:30 am.  I've managed to compile the latest Intel driver, 2.4.3, for mine and will try it later.  Damn, it took hours of reading and trial and error to get it going (I had to compile it in a FreeBSD 10.1 image installed in a VM).  Hope this solves the issue.

                              Anyway, good luck in the future.

                              Edit:  Running on new driver but looks like it doesn't support traffic shaping, which is beyond my ability to fix.  Oh well, I guess I will try it out still to see how well it works (or doesn't).

                              Edit #2:  Looks like the driver source would need to be edited to allow 'altq' additions.  Too much to worry about right now…will just test as is to see if it locks up or not.

                              Edit #3: Still locked up during the night.  I'm going to move the LAN over to another, unused port to see if this is a hardware issue.

                              • D
                                deanot

                                I do hope you get it working.  I feel lost without the red brick, kind of unsecured in a way.  I am going away for a couple of weeks, and with me out of the country, something this unstable is not something I can leave running.

                                My phone server is protected on the other side of this brick, plus all the other devices that require it for internet access.

                                I thought you were going to load 2.1 back on your machine?  Did you do that, and are you still seeing the same problems?  Maybe once 2.2.5 is released on nano the problem will go away?

                                I am going to look for a couple of cards, just to get something up and running, at least I can dump my config and put it on a new build, might be something to try.

                                • E
                                  Engineer

                                  Since this was new, I never had 2.1.5 on it in the first place.  I spent the better part of a day trying to load it with the full CD install only to be greeted by the ROOT MOUNT ERROR no matter what I did.  I finally gave up on that one (for now).

                                  Not sure what to do now other than to switch ports, or replace the switch that igb1 is attached to (a Zyxel GS1900-16 semi-managed smart switch) to see if the LAN port is having issues with the switch.

                                  Edit:  Probably a long shot, but I did have it connected to another (dumb) switch when I did the week-long burn-in test (with no WAN though).  So either the switch makes a difference, or something about routing traffic from the WAN to the LAN is locking up the port so that it needs a reset.

                                  Edit #2:  Moved the LAN connection to another switch and it just went down for the count again.  Will move it back to the original switch and then move the LAN from igb1 to igb2 to see if there is a hardware issue with port 1 (igb1) of the board.

                                  Edit #3:  The igb2 port also hit a watchdog timeout and reset.  Unless I'm mistaken and the ports run off the same chip (I don't think so), this has got to be a software or configuration error. :(
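
                                  To compare ports more systematically than by eyeballing the log, something like the following Python sketch could tally watchdog resets per interface.  The sample lines are made up for illustration and the exact wording of the kernel message is an assumption, so adjust the pattern to whatever your system log actually shows.

```python
import re
from collections import Counter

# Assumed message shape: "<ifname>: watchdog timeout ..." (case-insensitive).
WATCHDOG_RE = re.compile(r"\b([a-z]+\d+):\s+watchdog timeout", re.IGNORECASE)


def count_watchdog_resets(log_text):
    """Count lines mentioning a watchdog timeout, keyed by interface name."""
    counts = Counter()
    for line in log_text.splitlines():
        match = WATCHDOG_RE.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical log excerpt, not copied from a real system.
    sample_log = (
        "kernel: igb1: Watchdog timeout -- resetting\n"
        "kernel: igb1: link state changed to UP\n"
        "kernel: igb2: Watchdog timeout -- resetting\n"
        "kernel: igb1: Watchdog timeout -- resetting\n"
    )
    for ifname, n in sorted(count_watchdog_resets(sample_log).items()):
        print(f"{ifname}: {n} watchdog reset(s)")
```

                                  If the counts are spread across several ports rather than concentrated on one, that points away from a single bad port.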

                                  • C
                                    corvey

                                    Engineer, have you not tried 2.1.5 on the Firebox yet?  I've been using that for over a year without any problems, 24/7.  FreeBSD 10.1 / pfSense 2.2.4 has a bug somewhere for sure with this old hardware.  I'm building a new system now, so I can just retire my Firebox on 2.1.5.

                                    • E
                                      Engineer

                                      @corvey:

                                      Engineer, have you not tried 2.1.5 on the Firebox yet?  I've been using that for over a year without any problems, 24/7.  FreeBSD 10.1 / pfSense 2.2.4 has a bug somewhere for sure with this old hardware.  I'm building a new system now, so I can just retire my Firebox on 2.1.5.

                                      corvey, I'm not using a Firebox…I just jumped into this thread because it was very similar to my issue.  I'm running new hardware (Supermicro X11SBA-LN4F board with an Intel N3700 and 4 Intel I210-AT ports).  I tried to install 2.1.5 but couldn't get around the ROOT MOUNT ERROR no matter what I did.  I spent half a day trying to get it to install from both a USB stick and a burned CD....no luck.

                                      I might try again later when people aren't in the house using the Internet (they yell within 10 seconds of the watchdog triggering...sigh).

                                      I know, I should have started my own topic (or stayed in the X11SBA-LN4F thread right below this one)....just trying to solve both my and deanot's issue.

                                      • D
                                        deanot

                                        I have an HP 4 port NIC coming, with a dual Intel chipset.  I will build a machine outside of the Firebox and see how it works out.  I do hope this is not a driver issue with the build.  If it is, I will use an older version and avoid the newer builds until I'm certain the issues are resolved.

                                        A few things I do know:

                                        (1) It is not device dependent.
                                        (2) It started to happen on the 2.2.4 NanoBSD build for me and the regular 2.2.4 build for Engineer.
                                        (3) Pulling the network cable from the port resets the port and the issue goes away.  For a while at least.
                                        (4) Time is not a factor.  I have seen my box run for up to 14 days without issue, then BAM.  Or it could happen within minutes, hours or days of being up.
                                        (5) Heavy or minimal traffic does not make much difference.  For me, just logging into the GUI could trigger it.
                                        (6) Slowing the NIC speed down and changing other NIC-related settings had little to no effect.

                                        • E
                                          Engineer

                                          I turned off VLAN_HWTSO (ifconfig igb2 -vlanhwtso) and also apinger (WAN port monitoring).  The log is down to less than 5% of what it was (things were restarting and checking VERY often).  One+ days up with little in the log.  Time will tell.
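
                                          To confirm that a change like this actually took, the options= line of the ifconfig output can be checked.  A minimal Python sketch, assuming the usual FreeBSD layout of that line (options=, a hex value, then a comma-separated list in angle brackets); the function names are made up here and the sample text is illustrative rather than captured from a real system.

```python
import re

OPTIONS_RE = re.compile(r"\boptions=[0-9a-fA-F]+<([^>]*)>")


def enabled_capabilities(ifconfig_text):
    """Return the set of capability names on the first options= line."""
    match = OPTIONS_RE.search(ifconfig_text)
    if not match:
        return set()
    return {name for name in match.group(1).split(",") if name}


def still_enabled(ifconfig_text, unwanted):
    """Return, sorted, the unwanted capabilities that are still switched on."""
    return sorted(enabled_capabilities(ifconfig_text) & set(unwanted))


if __name__ == "__main__":
    # Illustrative output for one interface, not from a real machine.
    sample = (
        "igb2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500\n"
        "\toptions=403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,"
        "VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO>\n"
    )
    print(still_enabled(sample, ["TSO4", "TSO6", "VLAN_HWTSO"]))
```

                                          An empty list back means none of the named offloads are still advertised as enabled on that interface.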

                                          • D
                                            deanot

                                            Keep us posted, you might drill down into what the actual issue is.
