X11SBA-LN4F vs A1SRi-2558F
-
I found this thread via Google and registered an account because I have a similar problem. I have a new X11SBA-LN4F with the newest pfSense build installed and configured (I switched from an X10SBA because I need at least 3 NICs). So far so good, but I am also seeing these watchdog timeouts. They occur only on igb2, which is my LAN interface. In my tests, the timeouts only occurred under heavy network load (LAN -> WAN / WAN -> LAN). The firmware and BIOS of my X11SBA-LN4F are up to date. I bought this board 1 week ago.
Is an RMA the only way to get this board working?
-
Yes, appears to be a hardware issue. Talking to ldean, SuperMicro has stated that the board has been updated to hardware revision 1.02 and the remaining 1.01 boards were to be changed (ECO) to the same level via modification. Seems that they have not done them all yet if you have received one.
Is your hardware level 1.01?
-
Do I find the revision number on the board itself? I will check that. The shop that sells the board has many in stock, so I think mine is from an older delivery. Is sending it back the only option then? If so, does it go back directly to Supermicro or via my reseller?
-
Unless you can get your reseller to swap out to a 1.02 board, you'll need to get in contact with SuperMicro and request an RMA. You might open a tech support ticket up first and reference this thread as well as Ken Huang of SuperMicro. He should know the details by now.
-
I have some good and some bad news. First, I read about another user with the same problem. He had the board changed to the new revision, but he still has the same problem. The timeouts do not occur as often as before, but they are still present. He said it is because of the cheap PCIe switch on the board. That's the bad news.
Now the good news. On my pfSense, I have increased the mbuf size to one million; with 8 GB of RAM that should not be a problem. I did this last Sunday evening, so 3 days ago. Today I checked my logs (logging to an external syslog server), and there was not a single entry about a watchdog timeout. Before I increased the mbuf size, I only had to wait a few hours for one. I will see what happens over the next few days.
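As a rough sanity check on the RAM question, here is a minimal Python sketch. It assumes that "mbuf size" refers to the kern.ipc.nmbclusters limit and that each cluster is the standard 2 KiB; neither is stated above, and the function names are made up for illustration. It ignores the small mbuf headers and jumbo clusters, so treat the result as a ballpark upper bound, not an exact figure.

```python
# Rough upper bound on memory that mbuf clusters could occupy.
# Assumption: "mbuf size" means the kern.ipc.nmbclusters limit and
# each cluster is the standard 2 KiB (2048 bytes).
CLUSTER_BYTES = 2048


def cluster_memory_gib(nmbclusters: int) -> float:
    """Worst-case cluster memory in GiB if every cluster were in use."""
    return nmbclusters * CLUSTER_BYTES / 2**30


def fraction_of_ram(nmbclusters: int, ram_gib: float) -> float:
    """Share of installed RAM that the worst case would take."""
    return cluster_memory_gib(nmbclusters) / ram_gib


if __name__ == "__main__":
    # Compare the two values discussed in this thread on an 8 GiB box.
    for n in (500_000, 1_000_000):
        print(f"{n:>9} clusters: up to {cluster_memory_gib(n):.2f} GiB "
              f"({fraction_of_ram(n, 8.0):.0%} of 8 GiB)")
```

Under those assumptions, one million clusters could take a bit under 2 GiB in the worst case, which fits the expectation that 8 GB of RAM is enough.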
-
My board would time out after anywhere between 1 hour and 4 days, depending on load. It is bad news that the revised board is still doing it. My board has been up 50+ days since the correction with no timeouts.
-
Day 4 without any issue. I will keep checking for the next few days and report back here, but it is looking good :)
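For anyone who wants to check an exported syslog file the same way, here is a small Python sketch that counts watchdog timeout entries per interface. The log line format ("igb2: Watchdog timeout -- resetting") and the script name in the usage comment are assumptions, not something quoted in this thread, so adjust the pattern to whatever your syslog server actually stores.

```python
import re
from collections import Counter

# Assumption: the kernel logs a timeout as something like
# "igb2: Watchdog timeout -- resetting". Adjust if your lines differ.
WATCHDOG_RE = re.compile(r"\b(igb\d+): watchdog timeout", re.IGNORECASE)


def count_watchdog_timeouts(lines):
    """Count watchdog timeout log lines per interface name."""
    counts = Counter()
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python count_watchdog.py exported.log
    with open(sys.argv[1], errors="replace") as log:
        for iface, n in sorted(count_watchdog_timeouts(log).items()):
            print(f"{iface}: {n} watchdog timeout(s)")
```

An empty result over several days of logs would match the "no entries" observation; a count that clusters on one interface would point at the affected port.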
-
After 13 days without any watchdog timeout, I can say that the problem is solved for me by increasing the mbuf size to 1000000. I hope it stays stable :)
-
Interesting. I had the same mbuf size of 1000000 and still had the watchdog timeouts. Regardless, up over 60 days since the board has been 'repaired' (whatever that means from SuperMicro).
-
Tomorrow I will install my new 500 Mbit/s cable connection and run some tests; I hope there will be no watchdog timeouts. Does this problem also occur on the A1SRi-2558F board, or only on the X11SBA-LN4F?
-
I know it happens on the X11SBA-LN4F and X11SBA-F (from what I've read) but have seen nothing about the 2558F board.
-
Still watching this thread closely. This board is everything I'm looking for but if the timeout issue doesn't get resolved I won't go ahead with it.
Gonna have to wait for a bit anyway. Life gets in the way and we are on a spending moratorium until jr's medical bills stop rolling in. :-\
-
OK, after a lot of testing, the watchdog timeouts are still here :(. Is this problem related to the x64 platform? I haven't tested the x86 version of pfSense on this board yet, but if no one with these problems has done so, I can test it.
-
Ok after a lot of testing, the watchdog timeouts are still here :(. Is this problem related to the x64 platform?
No. I'd try 2.3; the newer drivers may fix it.
-
So, I have tested with the newest 2.3 build, but it is still the same issue: the watchdog times out. I think there is no other solution than to RMA the board.
-
I think there is no other solution, as to RMA the board.
Did you try raising the mbuf size? That could be a solution, but it depends on the amount of RAM in your pfSense box! If too little RAM is left after doing so, you will end up in a boot loop. With 8 GB you will be fine using 500,000 or 1,000,000. It could solve the problem, but there is no guarantee!
-
@BlueKobold:
Yes, I tried that. It works a little better, but the timeout is still there under high load.
-
Today I looked up the exact specs of my previous board, which I used as a pfSense box. It was a Supermicro X10SBA, which uses an Intel i210-AT chipset for the NICs… EXACTLY the same chipset that is on the X11SBA-LN4F. So how is it possible that there are such problems (e.g. the "watchdog timeouts") on the newer board, which contains the same NIC chipset? On the older X10SBA I NEVER had such problems; this board ran for over 1 year without any issue from the first day I set up pfSense.
-
It's not the i210-AT that's the issue from what we can tell. Port 1 is fine. It's ports 2, 3 and 4 that have the issues. It was posted very early in the thread that ports 2, 3 and 4 are connected to the PCIe lanes via a PCIe switching chip, the Pericom 608GP. It seems there is some sort of hardware design issue in the original board that causes the watchdog timeouts on the 3 ports attached to the 608GP. Whatever the issue, SM states that a hardware modification was done (they don't specify what) which corrects the issue. I have been up 70 days with no issues after getting my board back.
Now if the X10SBA uses the Pericom 608GP chip, it would seem that SuperMicro didn't make the same mistake on that board as they did on the X11SBA-L / LN4F boards.
Edit: the X10SBA uses the PLX_PEX8605 PCIe switching chip, not the Pericom 608GP. Looks like that's the difference between the board with timeouts vs the one without.