Faulty hardware? Incompatible drivers? em1 is flaky



  • I'm at my wits' end trying to figure out what's wrong.  I have a Supermicro SYS-5017C 1U server with two onboard Intel NICs, and I added an Intel PRO/1000 PT Dual Port to the riser card for a total of 4 NICs.  The PT Dual Port card is em0 and em1.

    em0 is LAN
    em1 is WAN1
    em2 is WAN2
    em3 is DMZ

    Testing with just LAN and WAN1, I boot up the firewall and everything works great.  Then, maybe 3-4 minutes later, em1 stops working: I get an "Unable to check for updates" error and I can no longer ping the WAN1 gateway.  If I switch configs and move WAN1 to em3 (one of the onboard NICs), it works perfectly for days.  But the DMZ doesn't work correctly on em1 either; pinging things is inconsistent at best.  I can't seem to get em1 to be stable for anything!

    What I've tried so far.  I swapped out the NIC with an identical one.  I've swapped out the Riser card.  I even bought a whole new SYS-5017C 1U server and I'm still getting the same results.  It doesn't make any sense!

    Is there something I'm missing?  Thanks for any advice, I'm really needing to get this firewall in production soon.  Been messing with it for weeks.



  • Have you gone into Supermicro IPMI and set the management LAN port to "dedicated"? It does not ship this way by default, and as a result the LAN ports will act flaky.  If you do not set it to dedicated, then it TRIES TO MULTIPLEX YOUR IN-USE LAN PORT AND CRIPPLES ITS PERFORMANCE.



  • I set a static IP on the IPMI management interface and then logged in through that.  I was then able to set it to dedicated like you suggested.

    I'm still getting the same behavior, there's just something about eth1 that it doesn't like.  I have since switched my LAN port to eth1 and put the WAN1 on eth0.  Everything seems to be working ok right now, but I was certain I had tried that configuration before and was getting sketchy results.  I really wish I knew why it was behaving the way it was.  I'm going to keep an eye on this unless anybody has any other suggestions.

    Thanks!



  • I went ahead and did a fresh install just to make sure there wasn't some strange configuration problem.  em1 seems to be working perfectly fine right now, so it must have been something I did.  I had Multi-WAN gateway groups set up, so perhaps something wasn't working correctly there.  We'll see what the outcome is when I get it all set up again.


  • Netgate Administrator

    If it just stops working completely, check for mbuf exhaustion on the dashboard or with:

    [2.1.2-RELEASE][root@pfsense.fire.box]/root(2): sysctl dev.em.0.mbuf_alloc_fail
    dev.em.0.mbuf_alloc_fail: 0
    

    That counter should always be 0. Obviously look at em1-3 also, and check the system logs.

    If allocation failures are happening, then try:
    https://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards#Intel_igb.284.29_and_em.284.29_Cards
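
    A minimal sketch of checking that counter on every port in one go. The helper name check_mbuf_fail is made up for illustration, and the loop shown in the comment assumes the four NICs are em0 through em3 as in the first post. The helper reads sysctl-style "name: value" lines and prints any counter that is not 0:

```shell
# Hypothetical helper (not part of pfSense): reads "name: value" lines
# on stdin, prints every line whose value is not 0, and returns 1 if
# any such line was seen.
check_mbuf_fail() {
    awk -F': ' '
        NF == 2 && $2 + 0 != 0 { print "non-zero: " $0; bad = 1 }
        END { exit bad + 0 }
    '
}

# On the firewall it could be fed like this (assumes em0 through em3):
#   for i in 0 1 2 3; do
#       sysctl dev.em.$i.mbuf_alloc_fail
#   done | check_mbuf_fail
```

    No output and an exit status of 0 would mean all four counters are still at 0.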

    Steve