Server Locking Up: SYS-5015A-EHF-D525



  • Hi,

    I'm currently having an issue with a new pfSense build that's causing some headaches. I have the SuperMicro SYS-5015A-EHF-D525 barebones kit and when pfSense is installed, everything seems to work great.

    Our operation requires more NIC ports, so I added a dual-port NIC: http://www.amazon.com/gp/product/B000BMXME8/ref=oh_details_o02_s00_i00?ie=UTF8&psc=1 With it installed, everything runs fine for a day or two and then locks up.

    I don't have anything plugged into the dual-port NIC because I'm only testing right now, but basically the system is stable without the dual-port NIC and crashes with it. The last entries in the system log have the dual port NIC going up and down before the server locks up. Kinda seems odd because nothing is plugged into them…

    Any ideas? My current theory is that some power-saving feature on the NIC and/or PCIe, one that I'm missing, causes the lockup after X amount of inactivity, but I won't be able to test this until tomorrow. Any advice would be great!



  • Stick it in a different box; if it crashes that one as well, RMA it?
    Do you have another dual-port NIC to put in?


  • Netgate Administrator

    You may as well try this since it's an easy test:
    http://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards#Intel_igb.284.29_and_em.284.29_Cards

    However, since you're not moving any packets through it, it seems unlikely to help.

    Steve



  • @SeventhSon:

    Stick it in a different box; if it crashes that one as well, RMA it?
    Do you have another dual-port NIC to put in?

    I have a Realtek one that I've tried, with the same issue. When this first happened, I thought the board was faulty so I RMA'ed it. Got the new board, same issue with the Realtek dual-port NIC, so I purchased the Intel dual-port NIC and again, same issue. The HD is new, as is the RAM; I've stress tested both and they came back OK.

    I'll try to stick it in another SuperMicro server and see if the same result happens. Thanks for the advice!



  • @rgrobbel:

    The last entries in the system log have the dual port NIC going up and down before the server locks up. Kinda seems odd because nothing is plugged into them…

    Can you post the log entries, or some of them?

    Perhaps the unterminated NIC connector is picking up electrical noise, causing the link state to flap. Have you configured the interfaces in pfSense? Perhaps pfSense is looping: trying to bring those interfaces UP, failing, trying again, and so on.

    What build of pfSense are you using?



  • @wallabybob:

    What build of pfSense are you using?

    Thanks for the reply. pfSense 2.0.3 at the moment.

    I'll post the log entries when I'm back at work tomorrow. I have just the onboard, 2 port NIC configured for WAN/LAN.

    Your assumption makes sense. Hopefully some log data I post tomorrow will help. Thanks again guys!



  • Is this an 82574L-based card by any chance? These have known issues with ASPM (PCIe power saving mechanism); see e.g. this post.



  • Oh, looks like this is a PRO/1000 PT, which is actually 82571-based; however, that chipset appears to have known ASPM issues as well (see e.g. the description for this Linux commit). Not sure if there's some kind of tunable for disabling ASPM on FreeBSD, though. Perhaps there's a setting in the BIOS?



  • Might also be worth checking if disabling MSI-X (hw.em.enable_msix=0) helps, as some cards/chips apparently have trouble with that as well.
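
    For reference, a minimal sketch of how that tunable would typically be set, assuming the card attaches via the em(4) driver. The file name /boot/loader.conf.local is an assumption here (it is the usual place for custom loader settings on pfSense; create it if it doesn't exist). Loader tunables are only read at boot, so a reboot is needed before it takes effect:

    ```
    # /boot/loader.conf.local (assumed location for custom loader tunables)
    # Disable MSI-X for NICs using the em(4) driver
    hw.em.enable_msix=0
    ```

    If the lockups continue with this in place, MSI-X is probably not the culprit and the line can be removed again.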



  • Hi Guys,

    Thanks for all the suggestions. I was able to figure out the issue and it was something stupid I did :P



  • Hey, do you mind describing your mistake? There's nothing worse than finding a thread that describes your problem perfectly and then seeing "Oh I fixed it. Silly mistake." :)

    Relevant xkcd, as always: http://xkcd.com/979/



  • Haha, that's what I was thinking as well :-)

