X11SBA-LN4F vs A1SRi-2558F
-
Two more crashes this morning, one with no traffic load at 4:50 am. Will work on it later to see if I can solve this mess…sigh
Edit: Spent the better part of Saturday learning how to compile the latest Intel driver (2.4.3) for igb ports on FreeBSD (using FreeBSD 10.1 running in a VM). Finally got it installed and working but it broke Traffic Shaping (because I didn't have ALTQ support in the driver source). Regardless, it did a watchdog timeout within 6 hours of installation.
Edit #2: Put the original driver back and connected the LAN port to another switch. It ran for over a week with nothing in the log on a 'dumb switch' so it's worth a shot to see if it's the 'smart switch' or the LAN port hardware / driver. If that doesn't work, I'll switch to an unused port on the motherboard (surely, couldn't be two ports malfunctioning if it were hardware, right? - don't answer.....:P )
Edit #3: Moved the connection on the LAN to another switch and it just went down for the count again. Will move it back to the original switch and then move the LAN from igb1 to igb2 to see if there is a hardware issue with port 1 (igb1) of the board.
-
The igb2 port did a watchdog and reset also. Unless I'm mistaken and the ports run off the same chip (don't think so), this has got to be a software or configuration error. :(
Edit: Removed VLANHWTSO (ifconfig igb2 -vlanhwtso) and turned off apinger (WAN Monitoring). Cut 95% of the log out and things are running OK so far…will see if this is the magic pill that fixes this thing.
-
Watching this thread with anticipation. This looks like a good board on paper and will be on my short list for a build if you can get it to run.
-
I'm doing all that I can right now. Never expected in my life to learn how to compile an Intel NIC driver in FreeBSD 10.1 just to try to get pfsense to run, lol. The only thing I've not been able to get going was 2.1.5 and I'll go back to trying that if it's not stable this time.
-
Update: Two+ days of uptime since I made the last changes. About 20 entries TOTAL in the GENERAL logs and 95% of those are login / logout entries.
Keeping fingers crossed that either VLAN_HWTSO removal or apinger removal fixed this. Will probably turn one or the other back on if this thing runs for some time (weeks) just to figure out which one was taking down the LAN - time will tell….
Update 2: Three+ days. Only thing in logs other than login / logout entries is a Dynamic DNS check (no update required).
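For anyone tracking the same symptom, here is a rough sketch of how the saved system log text could be scanned for watchdog resets per port. The message wording ("Watchdog timeout") and the function name are assumptions for illustration only; check what your own log actually prints and adjust the pattern.

```python
import re
from collections import Counter

# Assumed log wording for the igb driver; the exact text may differ
# between driver versions, so adjust the pattern to match your log.
WATCHDOG = re.compile(r"\b(igb\d+): Watchdog timeout")

def count_watchdog_resets(lines):
    """Return a Counter mapping interface name -> number of watchdog lines."""
    hits = Counter()
    for line in lines:
        match = WATCHDOG.search(line)
        if match:
            hits[match.group(1)] += 1
    return hits

# Hypothetical sample lines, only to show the call.
sample = [
    "kernel: igb1: Watchdog timeout -- resetting",
    "php: webConfigurator login",
    "kernel: igb2: Watchdog timeout -- resetting",
]
per_port = count_watchdog_resets(sample)
```

Feeding it the log lines from before and after a change (such as removing VLAN_HWTSO or apinger) gives a quick per-port count to compare, rather than eyeballing the log.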
-
Update: At 4 plus days now with no watchdog. Again, only login/logout entries in the system log now.
Also, since turning off apinger, have not lost IPV6 connectivity.
Edit: Well shit….down again. Made it 4 days this time. I'm out of ideas at this point....other than 2.1.5, which I can't get to install (Root Mount Error). I will try later (2.1.5 or 2.2.5).
-
If your IPv6 Internet connection uses DHCP6, SLAAC and/or DHCP-PD over PPPoE, 2.2.5 correctly handles link up -> link down and link down -> link up scenarios. 2.2.4 and earlier versions simply trust that IPv6 will start working again exactly as it did before when the link returns, which is not necessarily true. My ISP doesn't install a necessary route until DHCP-PD has delegated a prefix.
If this fix is relevant to you, it won't do anything to stabilise your troublesome NICs, but it might help things recover properly after the interface's watchdog reset.
pfSense uses a custom kernel. If you want to experiment with kernel patches (including replacing the NIC driver entirely), you really need to be using a pfSense build environment to do so in order to avoid random breakage and loss of functionality. pfSense 2.2.5-RELEASE has 111 patches applied to the stock FreeBSD 10.1 operating system, many of them affecting the kernel in some way. The kernel is also configured differently to the FreeBSD GENERIC kernel, for example by including ALTQ support. Sadly, setting up a pfSense 2.2.x build environment is not that straightforward, especially if you have no previous experience with FreeBSD and git (the version control system used by pfSense).
Backporting fixes from one FreeBSD branch to another (for example from HEAD, which is currently FreeBSD 11, to releng/10.1, which is the base OS for pfSense 2.2) is often non-trivial and may well be difficult without some experience with Subversion (the version control system used by FreeBSD). You will struggle to manage more than the most straightforward backport without experience of programming in C, diff and patch.
I suspect your ROOT MOUNT ERROR in 2.1.5 is because FreeBSD 8.3 is too old to support the controller for the device you booted from (the USB controller if you booted from a memory stick). I would give up on 2.1.5 - pfSense 2.1 is End of Life, FreeBSD 8 is End of Life and FreeBSD ports for FreeBSD 8 are End of Life. There are almost certainly unfixed security issues in both pfSense and FreeBSD lurking within 2.1. Snort for pfSense 2.1 has been pulled, as VRT rule updates for the last version of Snort that was available for FreeBSD 8 have been discontinued.
-
Thanks David and from what little I've learned, I have about figured out what you posted above (not how to do it, just the general idea). I learned about some of it by compiling the newer Intel driver using standard stock FreeBSD 10.1 - missing ALTQ for Traffic Shaping (still had a watchdog though).
I don't think I have a hardware problem. I think something is broken at the kernel or driver level. I'm just out of ideas and not experienced enough (nor do I have time) to solve it past what I've already done. If 2.2.5 doesn't fix it, I'll have to try something else or go back to previous Asus router and simply wait to see if it ever gets solved. That's always a risk when you choose new hardware. Regardless, I thank everyone for helping and those that put this whole thing together and keep it moving forward.
Edit: Updated to 2.2.5 and will see how that goes.
-
Any luck with 2.2.5?
-
Update was FAST and smooth (i.e. no issues). Been up for a day+ now with no issues (again, I don't run much yet as I'm waiting for stability first). It made it to 4+ days before going down last time (2.2.4), so time will tell. The driver for the NIC is, as expected, Intel 2.4.0 (FreeBSD version - probably won't change until FreeBSD 11 - but 2.4.3 compiled on FreeBSD 10.1 didn't fix it anyway).
I'll keep you updated.
-
Well, went down on 2.2.5 but never received a watchdog timeout (waited nearly 5 minutes). Console worked fine and ifconfig shows both ports as active. Don't know where to go from here except back to the Asus router. I don't think it's a hardware issue (unless multiple ports have bad chipsets). Either a BIOS issue or a FreeBSD/driver issue (possibly pfsense but I suspect FreeBSD/driver).
sigh
-
Have you tried disabling MSIX and MSI? Instructions at https://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards#MSI.2FMSIX
-
I tried that once before but it locked the system (tried to disable both at same time). I'll try again to see if it makes a difference. Will have to wait as I have a house full of people - all using the web continuously. I believe I would know about the Internet going down faster (with all of the crying about it) than I would know about someone breaking into my house and shooting up the place, lol. :P
From dmesg, it would appear that the ports are using MSIX. Will disable that first and see what happens.
Edit: msix disabled now (confirmed via dmesg). Will report back. If it goes down again, will try to disable both.
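As a rough illustration of the "confirmed via dmesg" step, the sketch below classifies each igb port by the interrupt mode its dmesg line reports. The two message strings are assumptions about the igb driver's wording, and `interrupt_modes` is a made-up helper name; a port that matches neither pattern is simply left out, which would be the expected result once both MSIX and MSI are disabled.

```python
import re

# Assumed dmesg wording for the igb driver; adjust to what your dmesg shows.
PATTERNS = [
    ("msix", re.compile(r"^(igb\d+): Using MSIX interrupts")),
    ("msi", re.compile(r"^(igb\d+): Using an MSI interrupt")),
]

def interrupt_modes(dmesg_lines):
    """Map each igb interface to 'msix' or 'msi' based on its dmesg line."""
    modes = {}
    for line in dmesg_lines:
        text = line.strip()
        for mode, pattern in PATTERNS:
            match = pattern.match(text)
            if match:
                modes[match.group(1)] = mode
    return modes

# Hypothetical sample input, only to show the call.
modes = interrupt_modes([
    "igb0: Using MSIX interrupts with 5 vectors",
    "igb1: Using an MSI interrupt",
])
```

Running it against the dmesg text saved before and after a reboot shows at a glance whether the change from the linked tuning page actually took effect on every port.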
-
Disabling MSIX didn't fix it. Just went down again.
When I get a chance, I'll try to disable both again.
:-[
Edit: Just disabled both MSIX and MSI and the LAN went down. I logged in via IPMI and the console indicated all was fine (ifconfig - lists all as active). Tried pinging from the shell and couldn't reach anything on the LAN. Could ping the cable modem on the WAN port just fine.
Something is borking the LAN…what....I have no idea?!?
-
I have an X10SBA-L build:
-
SSD: 120 GB Crucial BX100
-
Case + PSU Combo: M350 and 90W PicoPSU
-
RAM: 2x4GB Crucial
I had a couple of problems that might be worth mentioning:
-
My first build had a faulty PSU/12V adaptor (wavy lines on VGA output, buzzing noise). Wouldn't stay up for more than 30 seconds, I replaced both, after which no more PSU issues. I've ordered a 12V straight cable (in theory no PicoPSU necessary) but haven't gotten around to installing it yet.
-
I had lots of trouble getting the exact RAM, at first I ordered some 1.35V/1.5V from Amazon, which passed Memtest for about 15 minutes until it failed. Bizarre. The replacements I got from Microcenter (didn't want to wait for shipping) work perfectly.
-
I added a fancy Swellder Delta TFB0412EHN PWM fan - sounds like a rocket ship on startup but afterwards the fan never turns on, system barely gets warm.
It was a bit of a dog to install 2.2.4, I had endless crashes and even after successful install it wouldn't stay up for more than a few hours before corrupting the file system. However after running the exhaustive Memtest I saw one failure, changing the RAM out with a slightly different model fixed all my issues.
Since then I had 51 days of uptime (I use pfsense on my home network for firewall, VPN server, DNS server), reset this morning to upgrade to pfsense 2.2.5 which appears to be running just great.
2c
-
-
My X10SBA-L build use since initial startup, I have a 150MB Comcast connection, about 10-15 concurrently connected devices:
Basically total overkill for my network…
-
Thimee, did your entire system crash or did your interfaces go down (and console still work)?
My console still works fine…it's just the LAN (not WAN) port that goes down and resets via Watchdog timeout.
-
Engineer: Doing some double checking on SuperMicro web site:
1. Still no BIOS update
2. Certified 4GB ram is only one item listed: 4GB – Hynix - MEM-DR340L-HL02-SO16 - HMT451S6BFR8A-PB - H5TC4G83BFR
Specs above further detailed:
• 4GB Memory Module
• DDR3-1600MHz
• PC3L-12800
• Non-ECC
• 1.35v Low Voltage LV
• SODIMM
• 204-pin
• Cas Latency 11
• 1Rx8 (H5TC4G83BFR - 512MB chips each)
3. Certified 8GB stick has the same exact component chip as the 4GB - H5TC4G83BFR - 512MB chips each
Your spec from page one listed as follows:
2 x 4GB Samsung PC1600 DDR3L: $36 shipped from eBay
Found Samsung 4GB PC1600 SODIMM items for comparison:
1. Samsung Part number M471B5273DH0-CK0 - 1.50v
2. Samsung Part number M471B5173DBO-YK0 - 1.35v - CL11 - ?Single sided 1Rx4?
3. Samsung Part number M471B5173BHO-YK0 - 1.35v - CL11 - ?Single sided 1Rx4?
4. Samsung MV-3T4G3D/US - 1.35v - CL11 - 2Rx4 Double sided - 8 each
My questions:
1. Do these certified specs match exactly to yours especially 1Rx8 with 512 and CAS 11?
2. Have you run with just one chip of ram at 4GB - number one cycles solo and then number two cycled solo as well to isolate behavior?
3. Have you swapped the slots of the existing 2 x 4GB Samsung sticks?
-
I have re-seated the ram but not swapped it or run it in single channel (one stick). Once I completed the memtest+ test, I had assumed all was good. I suppose it could be ram but I tend to think that takes the whole ship down including the console (aka - one big crash). I suppose it might be ram. The ram is DDR3L 1.35V (I did make sure of that when I purchased it).
I'll see about removing a stick later. As for the certified memory stuff, I understand the testing and stuff but there's no real reason for most of the good stuff not to run. Just hasn't been certified….YET. When I bought the board and ram, there were NO certified sticks listed on the SuperMicro site. ;)
Oh, and for now, I've turned the Asus RT-AC68U back from AP to Router while trying to sort this out. Busy lately so I'll have to put it on hold a little. Hard to even do anything as people scream faster about the Internet being down than they do if we were being robbed, lol.
-
This is extremely interesting to me - the LAN1 LAN2 LAN3 and LAN4 ports on the back are not the same!!
Manual from SuperMicro - page 17 block diagram - shows:
A. 1x intel I210 is a single off the SoC on PCIe(1) and
B. the other three are 2 stepped off PCIe(2) then to a Pericom 608GP “PI7C9X2G608GP” (PCIe2 6-Port/ 8-Lane Packet Switch, GreenPacketTM Family), which adds 150ns latency to the stream per spec.
My thought would be to move the LAN connection, if it is currently on one of the three ports behind the Pericom switch, to the port on PCIe(1) directly off the SoC. I'm guessing this is the lower left corner LAN1 port when looking from the back side (see page 26 of the manual).