X11SBA-LN4F vs A1SRi-2558F
-
OK, after a lot of testing, the watchdog timeouts are still here :(. Is this problem related to the x64 platform? I haven't tested the x86 version of pfSense on this board yet, but if no one else with these problems has done so, I can test it.
-
OK, after a lot of testing, the watchdog timeouts are still here :(. Is this problem related to the x64 platform?
No. I'd try 2.3; the newer drivers may fix it.
-
-
So, I have tested with the newest 2.3 build, but it is still the same issue: the watchdog times out. I think there is no other solution than to RMA the board.
-
I think there is no other solution than to RMA the board.
Did you try raising the mbuf size? It could be a solution, but it also depends on the
amount of RAM in your pfSense box! If too little RAM is left after doing so, you
will end up in a boot loop. With 8 GB you will be fine using 500,000 or 1,000,000.
It could solve it, but not necessarily!
-
@BlueKobold:
Yes, I tried that. It works a little better, but the timeout is still there under high load.
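For anyone weighing the mbuf values suggested above (500,000 or 1,000,000), the worst-case memory they can tie up is easy to estimate. This is only a rough sketch: it assumes the standard 2 KiB mbuf cluster size and that the limit in question is the cluster count (commonly set through `kern.ipc.nmbclusters`), and it ignores jumbo clusters and other network buffers. The function name is made up for illustration.

```python
# Rough estimate of the RAM that a given mbuf cluster limit can consume.
# Assumption: standard 2 KiB clusters (MCLBYTES = 2048); jumbo clusters are ignored.
MCLBYTES = 2048


def mbuf_cluster_memory_mib(nmbclusters: int) -> float:
    """Return the worst-case memory for nmbclusters clusters, in MiB."""
    return nmbclusters * MCLBYTES / (1024 * 1024)


if __name__ == "__main__":
    # The two values suggested earlier in the thread.
    for count in (500_000, 1_000_000):
        mib = mbuf_cluster_memory_mib(count)
        print(f"{count} clusters -> about {mib:.0f} MiB worst case")
```

By this estimate, 1,000,000 clusters can pin close to 2 GiB if fully used, which fits the advice that the value has to be matched to the RAM installed in the box.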
-
Today I looked up the exact specs of my previous board, which I used as a pfSense box. It was a Supermicro X10SBA, which uses an Intel i210-AT chipset for the NICs... EXACTLY the same chipset that is on the X11SBA-LN4F. So how is it possible that there are such problems (e.g. the "watchdog timeouts") on the newer board, which contains the same NIC chipset? On the older X10SBA I NEVER had such problems; this board ran for over a year without any issue, from the first day I set up pfSense.
-
It's not the i210-AT that's the issue, from what we can tell. Port 1 is fine. It's ports 2, 3 and 4 that have the issues. It was posted very early in the thread that ports 2, 3 and 4 are connected to the PCIe lanes via a PCIe switching chip, the Pericom 608GP. It seems there is some sort of hardware design issue in the original board that causes the watchdog timeouts on the three ports attached to the 608GP. Whatever the issue, SM states that a hardware modification was done (they don't specify what) which corrects it. I have been up 70 days with no issues since getting my board back.
Now, if the X10SBA uses the Pericom 608GP chip, it would seem that Supermicro didn't make the same mistake on that board as they did on the X11SBA-L / LN4F boards.
Edit: the X10SBA uses the PLX_PEX8605 PCIe switching chip, not the Pericom 608GP. Looks like that's the difference between the board with timeouts and the one without.
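If you want to check whether your own timeouts follow the same pattern (only the ports behind the switching chip), counting the messages per interface in the system log is a quick way to see it. The sketch below assumes the log lines look like `igb1: Watchdog timeout -- resetting`; the exact wording can differ between driver versions, and which `igbN` corresponds to which physical port is something you would need to confirm on your own board. The function name and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Assumed log format: "<ifname>: Watchdog timeout ..." (wording may vary by driver version).
WATCHDOG_RE = re.compile(r"\b(igb\d+): Watchdog timeout", re.IGNORECASE)


def count_watchdog_timeouts(log_text: str) -> Counter:
    """Count watchdog timeout messages per interface name."""
    return Counter(m.group(1).lower() for m in WATCHDOG_RE.finditer(log_text))


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = (
        "kernel: igb1: Watchdog timeout -- resetting\n"
        "kernel: igb2: Watchdog timeout -- resetting\n"
        "kernel: igb1: Watchdog timeout -- resetting\n"
        "kernel: igb0: link state changed to UP\n"
    )
    for ifname, hits in sorted(count_watchdog_timeouts(sample).items()):
        print(ifname, hits)
```

If one interface never shows up in the counts while the others do, that would be consistent with the directly attached port being unaffected.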
-
Ok that makes sense, thank you for the clarification.
-
As I'm reading all the comments here in this thread, I am becoming more and more confused as time goes on.
Between the two boards, X11SBA-LN4F vs A1SRi-2558F, please give your final thoughts. QuickAssist and AES-NI are a must, and this will be deployed in a small lab/office (Plex, NAS, 5x AP, VMwares, 5x PCs, 5x wireless devices, etc).
-
Between the two boards, X11SBA-LN4F vs A1SRi-2558F, please give your final thoughts.
Take an SG-4860, SG-8860 or XG-2758 1U with mSATA and that's it.
QuickAssist and AES-NI are a must,
AES-NI is already working, and Intel QuickAssist support is coming in 2016, as far as is known today.
and this will be deployed in a small lab/office (Plex, NAS, 5x AP, VMwares, 5x PCs, 5x wireless devices, etc).
It all depends on which functions, options and services will be used or offered, which packages will be
installed, and what Internet connection speed is in use. A smaller office that needs a real UTM device
will also need much more power than pfSense as a pure firewall with the captive portal does. On top of
this, many other points come into play. For example, if the DMZ and LAN switches are capable of 10 GbE
or are fitted with SFP+ ports, you would perhaps be happier with the XG-2758 1U rack-mount version.
But for Internet speeds under 1 GBit/s and pfSense as a pure firewall, the SG-2440 would perhaps be
enough. Who knows; please tell us more about your use case.
-
Has anyone with the watchdog timeout problem been able to resolve it by updating the BIOS or firmware?
-
Bumping this thread.
Just bought an SM SuperServer E200-9B. Added a Kingston 120 GB SSD and 8 GB of Crucial 1333 SODIMM.
Currently I am getting watchdog timeouts on the LAN (IGB1) interface. I am watching the new 2.3 thread here:
https://forum.pfsense.org/index.php?topic=110710.15
Possible root cause:
It seems like it might be specific to SMP (>1 CPU core).
I did find an older post on watchdog timeouts:
https://doc.pfsense.org/index.php/Disable_ACPI
Now I have found this thread. The SuperServer E200-9B is currently running pfSense 3.2.1. BIOS revision is 1.0 (there is a 1.0b, but I can find no changelog information for it). IPMI firmware is 00.55 (newest).
I do have some fallback ALIX boards that were in use previously, but I am concerned that the Pericom 608GP on the SM SuperServer E200-9B is the issue here. The previous post on the RMA does show an EEPROM firmware update of the NIC:
@ldean:
Just wanted to update the thread. We received our box back from supermicro yesterday and will be installing it into production tomorrow. The repair report is somewhat vague about what they changed, but maybe it makes more sense to someone else:
Customer Reported Symptoms: Watchdog timeout on ethernet ports. Per TS, need ECO 18137
Test result notes and repair: REPORTED PROBLEM FOUND. WATCHDOG TIMEOUTS ON ETHERNET PORTS. M/B HW ECO COMPLETED BY REWORK. M/B BIOS, IPMI FW UPDATED TO CURRENT REVISION DONE. CPU, DIMM SLOT DETECTION VERIFIED. NIC PORT, USB PORT, IPMI CONNECTION TEST PASS. NIC PORT LAN EEPROM FW UPDATED TO CURRENT REVISION COMPLETE. NIC PORT PASSED OVERNIGHT PING TEST. COM PORT CONNECTION VERIFIED. SYSTEM HARDWARE FUNCTIONAL TEST PASS. ECO VERIFIED. ALL M/B SCREWS CHECKED. TEST PASSED.
I'm not too sure what the ECO refers to. Anyone have an idea?
https://dl.dropboxusercontent.com/u/42296/SMSSE200-9B%20block.JPG
My question: is this hardware related or pfSense 2.3 related (as others are experiencing the issue as well)? I have no problem sending this board back for RMA (although I really like it apart from this watchdog issue).
Any response is appreciated.
-
The board, since repair (modification?), has been running for 120 days now. No issues.
Much, much better.
@OLBaID - the watchdog timeouts are a hardware issue. SM made an unknown modification on my board to eliminate the watchdog timeouts. Seems related to the PCIe switching chip (the first Ethernet port is attached directly to the PCIe bus of the N3700 while the other ports go through a port switching chip. Those three ports all have watchdog timeout issues).
There's quite a bit of information in this thread including the contact (Ken Huang IIRC) that has experience in this issue.
-
The board, since repair (modification?), has been running for 120 days now. No issues.
There's quite a bit of information in this thread including the contact (Ken Huang IIRC) that has experience in this issue.
Great work Engineer… You saved me (and likely a few others) a lot of grief... I was going to buy that board, and with my current level of BSD knowledge I'd be F#@ked!
Based on your work, I've tried to contact SM to find out if this has been incorporated into new boards and how to positively identify which boards have been modified. Time will tell if I get an answer. I'll report back to the board for the benefit of all.
Based on the way things are now, would you say that this board is a good choice (assuming the mod is done)?
If yes, can you please comment on:
- Parts you used (Case/Power Supply/Fan)
- What operating temp is like
- What version you are running
- What packages
- What throughput you are getting.
Thanks again for all your good work!
-
Case was an Antec ISK-110 (no fans)
Power supply was the 90W built in Antec supply (fanless) - measures 10-11 watts at the wall.
Temperatures hang around 45-50C on the four cores (no fans though)
Still running 2.2.5
No packages
I'm getting my full speed, but that isn't much. Waiting on TWC Maxx.
I built this more out of curiosity and enjoyment than from really needing it. I wanted as much CPU power as I could get at as low a power draw as I could get (within pricing reason, of course). I do run an IPsec VPN. CPU load stays mostly around 3%. Based on research, it should be good for 1 Gbit plus, and with AES-NI it should be pretty good on encrypted VPN stuff. I wanted a stable router that was future-proof. After the headaches of hardware issues, I think I have it. I'm not sure that SM has implemented this in production yet, even though they say that they have.
-
-
Hint: Read the contents of the NIC EEPROMs.
My guess: a change to the ASPM parameters (I still can't fathom who invented that atrocity, or who decided to enable it on server boards :))
I have/had three X11SBA-LN4F boards in production and they decided to fail this weekend. :o
Now testing: ASPM disabled, MSI-X disabled.
-
Engineer,
Thanks so much. I actually did reference this post and have talked to Ken and several others at SM, and the board is already back for RMA. I also mentioned the EEPROM update listed in ldean's post.
I will update this thread once it is returned to me but please note this as well:
https://redmine.pfsense.org/issues/6296
as it is relevant (same issue) but with many hardware configurations.
I appreciate everyone who added to this thread. The community is great!
-
but please note this as well:
https://redmine.pfsense.org/issues/6296
as it is relevant (same issue) but with many hardware configurations.
I appreciate everyone who added to this thread. The community is great!
That's something new with 2.3, it seems. I'm running 2.2.5 (I had the watchdog issues with 2.2.4 and 2.2.5 on the board before repair). I tried as many options as possible and even figured out how to compile Intel's latest driver for the I210 chip. It ran, but with the same watchdog timeouts on ports 2, 3 and 4 (port 1, which is connected directly to an N3700 PCIe lane, always worked just fine).
Keep us updated and good luck!
-
I posted this information on another thread, but I thought putting it here
might save somebody some time.
Engineer seems to have figured out through a lot of hard work that an RMA
to incorporate ECO 18137 is what is required to make the
Supermicro X11SBA-LN4F-O N3700 stable.
AND
From what I understand from reading the forum, it seems to be very low power, is
easy to keep cool, and has decent performance for its class.
This motivated me to follow up with SM Tech Support, and I got
the following back:
----------------------------------------------------------------------------------------
-------- Forwarded Message --------
Subject: RE: X11SBA-LN4F-O - Pre-Sales Enquiry [WT]
Date: Tue, 10 May 2016 00:02:16 +0000
From: Technical Support <support@supermicro.com>
To: <–---@---.ca>, Technical Support <support@supermicro.com>
Hello
After further investigation, the ECOs has been implemented onto
PCB 1.02 for the aforementioned issues. When you place an order
with your distributor, please ensure to specify a PCB 1.02 to be
shipped to you.
Regards,
Technical Support
If I understand what I've read, the X11SBA-LN4F-O N3700 PCB 1.02
should make a decent pfSense platform, or am I missing something?
Clearly it's no A1SRi-2558F, but in Canada the difference between the
two boards is $141 CDN based on the best prices I could find today.
Unless broadband costs drop a lot, I can't see outgrowing it for 4 or 5
years (minimum), and by that time I'll likely have a cap dry out and have
to replace whatever I buy anyway, so I might as well put the $141 toward
a case and memory, or am I missing something?