X11SBA-LN4F vs A1SRi-2558F



  • No problem. :) I'll be sure to post results when I do order and get it up and running.



  • @Jailer:

    I would agree that a 2558-based system would be a much better choice for someone whose bandwidth requirements need it. My ISP is very limited, so QuickAssist means nothing to me, but to others it may. Like I said, the X11SBA-F-O is a good choice for me.

    I understand.

    However, given the cost, this is something I expect to buy/upgrade once in a few years (I hope it will last at least five years).

    My bandwidth is currently 70Mbps/4Mbps, which is really slow, so the C2558 might even be overkill, but I plan to run a few packages, not to mention that internet speeds can only go up in the future, so I consider this future-proofing.



  • Just wanted to update the thread.  We received our box back from supermicro yesterday and will be installing it into production tomorrow.  The repair report is somewhat vague about what they changed, but maybe it makes more sense to someone else:

    Customer Reported Symptoms:
    Watchdog timeout on ethernet ports.  Per TS, need ECO 18137
    
    Test result notes and repair:
    REPORTED PROBLEM FOUND.  WATCHDOG TIMEOUTS ON ETHERNET PORTS.  M/B HW ECO COMPLETED BY REWORK.  M/B BIOS, IPMI FW UPDATED TO CURRENT REVISION DONE.  CPU, DIMM SLOT DETECTION VERIFIED.  NIC PORT, USB PORT, IPMI CONNECTION TEST PASS.  NIC PORT LAN EEPROM FW UPDATED TO CURRENT REVISION COMPLETE.  NIC PORT PASSED OVERNIGHT PING TEST.  COM PORT CONNECTION VERIFIED.  SYSTEM HARDWARE FUNCTIONAL TEST PASS.  ECO VERIFIED.  ALL M/B SCREWS CHECKED.  TEST PASSED.
    

    I'm not too sure what the ECO refers to.  Anyone have an idea?



  • I just got the 2 port version and have the same issue with the 2nd port.
    I put both ports in a LAGG and see the 2nd port stop passing traffic. I don't see watchdog timeouts in the log though.
    The rest of the machine keeps running fine on only 1 port.

    ECO means Engineering Change Order.
    Definitely sounds like a hardware change, looks like I'll have to send mine back too.
    I was hoping it would be resolved by a BIOS update.



  • Well I guess this last reply counts me out for either version of this board. I'm not knowledgeable enough to be a beta tester for motherboards so I'll stick with a proven performer.

    Bummer, either one of these would have suited me perfectly.  :(



  • @ldean:

    Just wanted to update the thread.  We received our box back from supermicro yesterday and will be installing it into production tomorrow.  The repair report is somewhat vague about what they changed, but maybe it makes more sense to someone else:

    Customer Reported Symptoms:
    Watchdog timeout on ethernet ports.  Per TS, need ECO 18137
    
    Test result notes and repair:
    REPORTED PROBLEM FOUND.  WATCHDOG TIMEOUTS ON ETHERNET PORTS.  M/B HW ECO COMPLETED BY REWORK.  M/B BIOS, IPMI FW UPDATED TO CURRENT REVISION DONE.  CPU, DIMM SLOT DETECTION VERIFIED.  NIC PORT, USB PORT, IPMI CONNECTION TEST PASS.  NIC PORT LAN EEPROM FW UPDATED TO CURRENT REVISION COMPLETE.  NIC PORT PASSED OVERNIGHT PING TEST.  COM PORT CONNECTION VERIFIED.  SYSTEM HARDWARE FUNCTIONAL TEST PASS.  ECO VERIFIED.  ALL M/B SCREWS CHECKED.  TEST PASSED.
    

    I'm not too sure what the ECO refers to.  Anyone have an idea?

    This one is interesting too…..

    NIC PORT LAN EEPROM FW UPDATED TO CURRENT REVISION COMPLETE.



  • I saw that too. Obviously a firmware issue with the controller causing the timeouts.

    I wonder how long production on them will run before the issue gets fixed in the manufacturing process?



  • Side note…my board has now been up 35 days on 2.2.5 with NO issues.  ZERO watchdog timeouts now.

    @Jailer:

    I saw that too. Obviously a firmware issue with the controller causing the timeouts.

    I wonder how long production on them will run before the issue gets fixed in the manufacturing process?

    I was wondering the same.  Don't know how many were in the supply chain before this was caught and SM seems content to ship them out defective and fix them IF the customer catches it (not always the case).

    @timotl:

    I just got the 2 port version and have the same issue with the 2nd port.

    What hardware revision does your 2 port board have on it (1.01 on the 4 port version so far is all that I have seen)?



  • Just noticed a new firmware 1.0b is now available.  Anyone want to try to see if it possibly solves this issue before sending the board back?



  • @Engineer:

    Just noticed a new firmware 1.0b is now available.  Anyone want to try to see if it possibly solves this issue before sending the board back?

    Is there a place to see what the BIOS revisions are? There has to be somewhere but I'm obviously not smart enough to find it on my own.



  • @Jailer:

    @Engineer:

    Just noticed a new firmware 1.0b is now available.  Anyone want to try to see if it possibly solves this issue before sending the board back?

    Is there a place to see what the BIOS revisions are? There has to be somewhere but I'm obviously not smart enough to find it on my own.

    I've not been able to find any sort of change log at SM for any firmware.



  • Bios update didn't make any difference.
    Time to RMA the board for repair..



  • I found this thread via Google and registered an account because I have a similar problem. I have a new X11SBA-LN4F, installed the newest pfSense build on it, and configured it (I switched from an X10SBA because I need a minimum of 3 NICs). So far so good, but I also get these watchdog timeouts. They occur only on igb2, which is my LAN interface. I ran some tests and saw that the timeout only occurs under heavy network load (LAN -> WAN / WAN -> LAN). The firmware and BIOS of my X11SBA-LN4F are also up to date. I bought this board 1 week ago.
    Is an RMA the only way to get this board working?



  • @endy66:

    I found this thread via Google and registered an account because I have a similar problem. I have a new X11SBA-LN4F, installed the newest pfSense build on it, and configured it (I switched from an X10SBA because I need a minimum of 3 NICs). So far so good, but I also get these watchdog timeouts. They occur only on igb2, which is my LAN interface. I ran some tests and saw that the timeout only occurs under heavy network load (LAN -> WAN / WAN -> LAN). The firmware and BIOS of my X11SBA-LN4F are also up to date. I bought this board 1 week ago.
    Is an RMA the only way to get this board working?

    Yes, appears to be a hardware issue.  Talking to ldean, SuperMicro has stated that the board has been updated to hardware revision 1.02 and the remaining 1.01 boards were to be changed (ECO) to the same level via modification.  Seems that they have not done them all yet if you have received one.

    Is your hardware level 1.01?



  • @Engineer:

    Yes, appears to be a hardware issue.  Talking to ldean, SuperMicro has stated that the board has been updated to hardware revision 1.02 and the remaining 1.01 boards were to be changed (ECO) to the same level via modification.  Seems that they have not done them all yet if you have received one.

    Is your hardware level 1.01?

    Do I find the revision number on the board itself? I will check that. The shop that sells the board has many in stock, so I think mine is from an older delivery. Is the only option then to send it back? If so, does it go back directly to Supermicro or via my reseller?



  • @endy66:

    @Engineer:

    Yes, appears to be a hardware issue.  Talking to ldean, SuperMicro has stated that the board has been updated to hardware revision 1.02 and the remaining 1.01 boards were to be changed (ECO) to the same level via modification.  Seems that they have not done them all yet if you have received one.

    Is your hardware level 1.01?

    Do I find the revision number on the board itself? I will check that. The shop that sells the board has many in stock, so I think mine is from an older delivery. Is the only option then to send it back? If so, does it go back directly to Supermicro or via my reseller?

    Unless you can get your reseller to swap out to a 1.02 board, you'll need to get in contact with SuperMicro and request an RMA.  You might open a tech support ticket up first and reference this thread as well as Ken Huang of SuperMicro.  He should know the details by now.



  • I have some good and some bad news. First, the bad news: I have read about another user with the same problem. He had his board changed to the new revision, but he still has the same problem. The timeouts do not occur as often as before, but they are still present. He said it is because of the cheap PCIe switch on the board.

    Now the good news. On my pfSense box, I have increased the mbuf size to one million; with 8 GB of RAM that should not be a problem. I did this last Sunday evening, so 3 days ago. Today I checked my logs (logging to an external syslog server), and there was not a single entry about a watchdog timeout. Before the increase of the mbuf size, I only had to wait a few hours for one. I will see what happens over the next few days.
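
    For anyone who wants to check their own syslog the same way, here is a minimal sketch that counts watchdog timeout entries per interface. It assumes the kernel messages look like "igb2: Watchdog timeout -- resetting"; the exact wording may differ between driver versions, and the sample lines are made up for illustration, so adjust the pattern to what your logs actually show.

```python
import re
from collections import Counter

# Assumed message format: "<iface>: Watchdog timeout ..." as logged by the igb driver.
# The pattern is case-insensitive because the wording may vary between versions.
WATCHDOG_RE = re.compile(r"\b(igb\d+): watchdog timeout", re.IGNORECASE)


def count_watchdog_timeouts(lines):
    """Return a Counter mapping interface name to its number of watchdog timeout lines."""
    counts = Counter()
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


# Hypothetical sample log lines, for illustration only.
sample_log = [
    "Mar  1 10:00:00 pfsense kernel: igb2: Watchdog timeout -- resetting",
    "Mar  1 10:00:01 pfsense kernel: igb2: link state changed to DOWN",
    "Mar  1 12:30:00 pfsense kernel: igb2: Watchdog timeout -- resetting",
]

for iface, hits in sorted(count_watchdog_timeouts(sample_log).items()):
    print(f"{iface}: {hits}")
```

    To run it against a real log, pass an open file instead of the sample list, since the function accepts any iterable of lines.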



  • @endy66:

    I have some good and some bad news. First, the bad news: I have read about another user with the same problem. He had his board changed to the new revision, but he still has the same problem. The timeouts do not occur as often as before, but they are still present. He said it is because of the cheap PCIe switch on the board.

    Now the good news. On my pfSense box, I have increased the mbuf size to one million; with 8 GB of RAM that should not be a problem. I did this last Sunday evening, so 3 days ago. Today I checked my logs (logging to an external syslog server), and there was not a single entry about a watchdog timeout. Before the increase of the mbuf size, I only had to wait a few hours for one. I will see what happens over the next few days.

    My board would timeout between 1 hour and 4 days, depending on load.  That is bad news about the board revision still doing it.  My board has been up 50+ days since the correction with no timeouts.



  • Day 4 without any issue. I will keep checking for the next few days and report here, but so far it looks good :)



  • After 13 days without any watchdog timeout, I can say that the problem is solved for me by increasing the mbuf size to 1000000. I hope it stays stable :)



  • @endy66:

    After 13 days without any watchdog timeout, I can say that the problem is solved for me by increasing the mbuf size to 1000000. I hope it stays stable :)

    Interesting.  I had the same mbuf size of 1000000 and still had the watchdog timeouts.  Regardless, up over 60 days since the board has been 'repaired' (whatever that means from SuperMicro).



  • Tomorrow I will have my new 500 Mbit/s cable connection installed and run some tests; I hope there will be no watchdog timeouts. Does this problem also occur on the A1SRi-2558F board, or only on the X11SBA-LN4F?



  • @endy66:

    Tomorrow I will have my new 500 Mbit/s cable connection installed and run some tests; I hope there will be no watchdog timeouts. Does this problem also occur on the A1SRi-2558F board, or only on the X11SBA-LN4F?

    I know it happens on the X11SBA-LN4F and X11SBA-F (from what I've read) but have seen nothing on the 2558F board.



  • Still watching this thread closely. This board is everything I'm looking for but if the timeout issue doesn't get resolved I won't go ahead with it.

    Gonna have to wait for a bit anyway. Life gets in the way and we are on a spending moratorium until jr's medical bills stop rolling in.  :-\



  • OK, after a lot of testing, the watchdog timeouts are still here :(. Is this problem related to the x64 platform? I haven't tested the x86 version of pfSense on this board yet, but if no one with these problems has done so, I can test it.



  • @endy66:

    Ok after a lot of testing, the watchdog timeouts are still here :(. Is this problem related to the x64 platform?

    No. I'd try 2.3; the newer drivers may fix it.



  • @cmb:

    @endy66:

    Ok after a lot of testing, the watchdog timeouts are still here :(. Is this problem related to the x64 platform?

    No. I'd try 2.3; the newer drivers may fix it.

    OK, then I will install the newest build now and test it. I will report back later on whether the newer included drivers fix the problem.



  • So, I have tested with the newest 2.3 build, but still the same issue: the watchdog times out. I think there is no other solution than to RMA the board.



  • I think there is no other solution than to RMA the board.

    Did you try raising the mbuf size? It could be a solution, but it also depends on the
    amount of RAM in your pfSense box! If there is too little RAM after doing so, you
    will end up in a boot loop. With 8 GB you will be fine using 500,000 or 1,000,000.
    It could solve it, but there is no guarantee!
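
    A rough way to sanity-check those numbers is sketched below. It assumes that the "mbuf size" discussed here is the kern.ipc.nmbclusters tunable and that each standard cluster is 2048 bytes; the value is an upper limit rather than memory reserved at boot, so this is a worst-case estimate, not a measurement.

```python
MCLBYTES = 2048  # assumed size of one standard mbuf cluster, in bytes


def worst_case_cluster_bytes(nmbclusters: int) -> int:
    """Upper bound on memory consumed if every allowed cluster is in use."""
    return nmbclusters * MCLBYTES


def fraction_of_ram(nmbclusters: int, ram_gib: float) -> float:
    """Worst-case cluster memory as a fraction of installed RAM."""
    return worst_case_cluster_bytes(nmbclusters) / (ram_gib * 1024 ** 3)


# Compare the two values mentioned above on a box with 8 GiB of RAM.
for clusters in (500_000, 1_000_000):
    gib = worst_case_cluster_bytes(clusters) / 1024 ** 3
    share = fraction_of_ram(clusters, 8)
    print(f"{clusters} clusters: up to {gib:.2f} GiB ({share:.0%} of 8 GiB)")
```

    Under these assumptions, one million clusters tops out at just under 2 GiB, which is why 8 GB of RAM leaves plenty of headroom while a box with very little RAM may not.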



  • @BlueKobold:

    I think there is no other solution than to RMA the board.

    Did you try raising the mbuf size? It could be a solution, but it also depends on the
    amount of RAM in your pfSense box! If there is too little RAM after doing so, you
    will end up in a boot loop. With 8 GB you will be fine using 500,000 or 1,000,000.
    It could solve it, but there is no guarantee!

    Yes, I tried that. It works a little bit better, but the timeout still occurs under high load.



  • Today I looked up the exact specs of my previous board, which I used as a pfSense box. It was a Supermicro X10SBA, which uses the Intel i210-AT chipset for its NICs... EXACTLY the same chipset that is on the X11SBA-LN4F. So how is it possible that there are such problems (e.g. the "watchdog timeouts") on the newer board, which contains the same NIC chipset? On the older X10SBA I NEVER had such problems; that board ran for over 1 year without any issue, from the first day I set up pfSense.



  • @endy66:

    Today I looked up the exact specs of my previous board, which I used as a pfSense box. It was a Supermicro X10SBA, which uses the Intel i210-AT chipset for its NICs... EXACTLY the same chipset that is on the X11SBA-LN4F. So how is it possible that there are such problems (e.g. the "watchdog timeouts") on the newer board, which contains the same NIC chipset? On the older X10SBA I NEVER had such problems; that board ran for over 1 year without any issue, from the first day I set up pfSense.

    It's not the i210-AT that's the issue from what we can tell.  Port 1 is fine.  It's ports 2, 3 and 4 that have the issues.  It was posted very early in the thread that ports 2, 3 and 4 are connected to the PCIe lanes via a PCIe switching chip, the Pericom 608GP.  It seems there is some sort of hardware design issue in the original board that causes the watchdog timeouts on the 3 ports attached to the 608GP.  Whatever the issue, SM states that a hardware modification was done (they don't specify what) which corrects the issue.  I have been up 70 days with no issues after getting my board back.

    Now if the X10SBA uses the Pericom 608GP chip, it would seem that SuperMicro didn't make the same mistake on that board as they did on the X11SBA-L / LN4F boards.

    Edit:  the X10SBA uses the PLX_PEX8605 PCIe switching chip, not the Pericom 608GP.  Looks like that's the difference between the board with timeouts vs the one without.



  • Ok that makes sense, thank you for the clarification.



  • As I read all the comments in this thread, I am becoming more and more confused.

    Between the two boards X11SBA-LN4F vs A1SRi-2558F, please give your final thoughts. QuickAssist and AES-NI are a must, and this will be deployed in a small lab/office (Plex, NAS, 5x AP, VMware, 5x PCs, 5x wireless devices, etc).



  • Between the two boards X11SBA-LN4F vs A1SRi-2558F, please give your final thoughts.

    Take an SG-4860, SG-8860 or XG-2758 1U with mSATA and that's it.

    Quick assist and AES-NI is a must,

    AES-NI already works, and Intel QuickAssist support is coming in 2016, as far as is known today.

    and this will be deployed on small lab/office (Plex, NAS, 5x AP, VMwares, 5x PCs, 5x wireless devices, etc).

    It all depends on which functions, options, and services will be used or offered, which packages will be installed,
    and what Internet connection speed is in use. A smaller office that needs a real UTM device
    will also need much more power than pfSense as a pure firewall with captive portal does. On top of this
    there are many other points to consider: for example, if the DMZ and LAN switches are capable of 10 GbE
    or are fitted with SFP+ ports, you would perhaps be happier with the XG-2758 1U rack mount version.
    But for Internet speeds under 1 Gbit/s and pfSense as a pure firewall only, the SG-2440 would perhaps
    be enough. Who knows; tell us more about your use case please.



  • Has anyone with the watchdog timeout problem been able to resolve it by updating the BIOS or firmware?



  • Bumping this thread.

    Just bought an SM SuperServer E200-9B. Added a Kingston 120 GB SSD and 8 GB of Crucial 1333 SODIMM.

    Currently I am getting watchdog timeouts on the LAN (igb1) interface. I am watching the new 2.3 thread here:

    https://forum.pfsense.org/index.php?topic=110710.15

    Possible root cause:
    It seems like it might be specific to SMP (>1 CPU core)

    I did find an older post on watchdog timeouts:

    https://doc.pfsense.org/index.php/Disable_ACPI

    Now I have found this thread. The SuperServer E200-9B is currently running pfSense 3.2.1. BIOS revision is 1.0 (there is a 1.0b, but I can find no changelog information). IPMI firmware is 00.55 (newest).

    I do have some fallback ALIX boards that I used previously, but I am concerned that the Pericom 608GP in the SM SuperServer E200-9B is the issue here. The earlier post on the RMA does show an EEPROM firmware update of the NIC:

    @ldean:
    Just wanted to update the thread.  We received our box back from supermicro yesterday and will be installing it into production tomorrow.  The repair report is somewhat vague about what they changed, but maybe it makes more sense to someone else:

    Customer Reported Symptoms:
    Watchdog timeout on ethernet ports.  Per TS, need ECO 18137
    
    Test result notes and repair:
    REPORTED PROBLEM FOUND.  WATCHDOG TIMEOUTS ON ETHERNET PORTS.  M/B HW ECO COMPLETED BY REWORK.  M/B BIOS, IPMI FW UPDATED TO CURRENT REVISION DONE.  CPU, DIMM SLOT DETECTION VERIFIED.  NIC PORT, USB PORT, IPMI CONNECTION TEST PASS.  NIC PORT LAN EEPROM FW UPDATED TO CURRENT REVISION COMPLETE.  NIC PORT PASSED OVERNIGHT PING TEST.  COM PORT CONNECTION VERIFIED.  SYSTEM HARDWARE FUNCTIONAL TEST PASS.  ECO VERIFIED.  ALL M/B SCREWS CHECKED.  TEST PASSED.
    

    I'm not too sure what the ECO refers to.  Anyone have an idea?

    https://dl.dropboxusercontent.com/u/42296/SMSSE200-9B%20block.JPG

    My question: is this hardware related or pfSense 2.3 related (as others are experiencing the issue as well)? I have no problem RMAing this board (although I really like it apart from this watchdog issue).

    Any response is appreciated.



  • The board, since repair (modification?), has been running for 120 days now.  No issues.

    Much, much better.

    @OLBaID - the watchdog timeouts are a hardware issue.  SM made an unknown modification on my board to eliminate the watchdog timeouts.  Seems related to the PCIe switching chip (the first Ethernet port is attached directly to the PCIe bus of the N3700 while the other ports go through a port switching chip.  Those three ports all have watchdog timeout issues).

    There's quite a bit of information in this thread including the contact (Ken Huang IIRC) that has experience in this issue.



  • @Engineer:

    The board, since repair (modification?), has been running for 120 days now.  No issues.

    There's quite a bit of information in this thread including the contact (Ken Huang IIRC) that has experience in this issue.

    Great work Engineer… You saved me (and likely a few others) a lot of grief... I was going to buy that board, and with my current level of BSD knowledge I'd be F#@ked!

    Based on your work, I've tried to contact SM to find out if this has been incorporated into new boards and how to positively identify which boards have been modified.  Time will tell if I get an answer.  I'll report back to the board for the benefit of all.

    Based on the way things are now, would you say that this board is a good choice (assuming the mod is done)?

    If yes, can you please comment on:

    • Parts you used (Case/Power Supply/Fan)

    • What operating temp is like

    • What version you are running

    • What packages

    • What throughput you are getting.

    Thanks again for all your good work!



  • @guardian:

    • Parts you used (Case/Power Supply/Fan)

    • What operating temp is like

    • What version you are running

    • What packages

    • What throughput you are getting.

    Thanks again for all your good work!

    Case was an Antec ISK-110 (no fans)
    Power supply was the 90W built in Antec supply (fanless) - measures 10-11 watts at the wall.
    Temperatures hang around 45-50C on the four cores (no fans though)
    Still running 2.2.5
    No packages
    I'm getting my full speed but that isn't much.  Waiting on TWC Maxx

    I built this more out of curiosity and enjoyment than really needing it.  I wanted as much CPU power as I could get at as low a power draw as I could get (within pricing reason of course).  I do run an IPsec VPN.  CPU load stays mostly around 3%.  Based on research, it should be good for 1Gbit plus and with AES-NI, it should be pretty good on encrypted VPN stuff.  I wanted a stable router that was future proof.  After the headaches of hardware issues, I think I have it.  I'm not sure that SM has implemented this into production yet even though they say that they have.