Internal NIC crashes down / no buffer space available

Gimli

I'm experiencing the same issue.

Most of the time when I start a multi-threaded (10 threads) FTPES transfer, the LAN interface on my pfSense box stops responding, with the same error message. Whenever that happens I notice that the OACTIVE flag gets turned on for that interface (does anybody know what that flag is? Google is of no help). If I just bring the interface down/up it goes back to normal. Sometimes I have to reset the interface 3-5 times before it starts transferring without bugging out.
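On the OACTIVE question: FreeBSD's interface headers describe this flag as marking a full transmit hardware queue, which would fit the "no buffer space available" error. A minimal Python sketch for spotting it in an `ifconfig` status line follows. The interface name `hn0`, the sample lines, and the function names are assumptions for illustration only, not output captured from the system described here.

```python
import re

# Matches the flag list on an ifconfig status line, e.g. "flags=8c43<UP,...>".
FLAGS_RE = re.compile(r"flags=[0-9a-fA-F]+<([^>]*)>")


def interface_flags(status_line):
    """Return the set of flag names found on an ifconfig status line."""
    match = FLAGS_RE.search(status_line)
    if match is None:
        return set()
    return {name for name in match.group(1).split(",") if name}


def is_oactive(status_line):
    """True if the OACTIVE flag appears in the flag list."""
    return "OACTIVE" in interface_flags(status_line)


# Illustrative sample lines (assumed, not captured from a real system).
SAMPLE_STUCK = "hn0: flags=8c43<UP,BROADCAST,RUNNING,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500"
SAMPLE_OK = "hn0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500"

if __name__ == "__main__":
    print(is_oactive(SAMPLE_STUCK))
    print(is_oactive(SAMPLE_OK))
```

Feeding it the first line of real `ifconfig <interface>` output would show whether the interface is in the stuck state at that moment.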

I'm running pfSense 2.2.5-RELEASE (it happened on 2.2.4-RELEASE as well) as a Hyper-V 2012 R2 VM with a quad-port gigabit Intel i350 NIC. It seems to happen more often the faster the transfers are (I have gigabit internet). It also takes longer to happen if I reduce the number of download threads.

I've looked at MBUF counters and nothing seems out of the ordinary when it happens. It generally stays under 5000/1000000. Can't see any other counter going wild either but I'm willing to experiment if anybody has a suggestion.
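To log the mbuf cluster counters over time rather than eyeballing them, something like the following Python sketch could help. It assumes the cluster line of `netstat -m` has the form `current/cache/total/max mbuf clusters in use`; the sample line and the function names are made up for illustration.

```python
import re

# Assumed layout of the cluster line in `netstat -m` output:
#   "<current>/<cache>/<total>/<max> mbuf clusters in use (current/cache/total/max)"
CLUSTER_RE = re.compile(r"(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use")


def parse_cluster_usage(netstat_output):
    """Return (current, max) mbuf cluster counts, or None if the line is absent."""
    match = CLUSTER_RE.search(netstat_output)
    if match is None:
        return None
    current, _cache, _total, maximum = (int(g) for g in match.groups())
    return current, maximum


def usage_fraction(current, maximum):
    """Fraction of the cluster limit currently in use."""
    return current / maximum if maximum else 0.0


# Illustrative sample line (assumed values, not real measurements).
SAMPLE = "4800/1200/6000/1000000 mbuf clusters in use (current/cache/total/max)"

if __name__ == "__main__":
    current, maximum = parse_cluster_usage(SAMPLE)
    print(current, maximum, usage_fraction(current, maximum))
```

Running this every few seconds during a transfer and keeping the output would show whether the counter spikes just before the interface stops responding.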

Gimli

I think I found the cause of this problem, and the solution to it.

What happens is that the host OS puts the adapter in low power consumption mode. When the pfSense VM suddenly starts a very fast, multi-threaded network transfer, the host OS doesn't react quickly enough to bring the adapter to full power and the virtual interface just panics and shuts down (OACTIVE flag goes up).

To fix it, from your host OS, go into Device Manager and, for each of the network interfaces that you use for your pfSense VM, right-click the interface and select "Properties". From there, where you go will depend on the driver, but for my Intel NIC with Intel drivers there is a tab called "Power Management". Click on that and un-check the "Energy Efficient Ethernet" box. Click OK and off you go, you shouldn't have this problem anymore. On some adapters that setting may be buried somewhere in the "Advanced" tab.

I did this a couple of days ago on my box and haven't experienced any interface dropping since. Hopefully this works for you too.

NXLF

Hello Gimli,

thank you for your help. I've checked the properties of our NICs but the EEE option was already disabled.

But I think it is good to know that this problem is caused by the host's NIC. Tonight we will update the NIC drivers. Maybe this will help. I will also check whether the OACTIVE flag is set the next time our NIC crashes.

Gimli

You're welcome. It's been four days now and still no crash for me.

I'll note that I also changed the size of the receive and transmit buffers under the advanced tab > performance options as well, to 1024 and 2048 respectively. I don't think this is what did it but that's another thing you may try if disabling the energy efficient ethernet option didn't do it.

Gimli

Well, that was short-lived. My interface started crashing again on really fast transfers tonight. Same symptoms as before, it just took a few more days to start happening. Guess it wasn't the energy-efficient Ethernet setting after all.

Back to the drawing board…

Kaavi

Gimli, I have the exact same problem as you - did you find any solution back at the drawing board? :)

Gimli

I haven't had a lot of time to do any more testing but I'm starting to think it may be a bug in the FreeBSD driver for the Hyper-V virtual NIC. I have a different box on which I installed pfSense natively (i.e. not as a VM) with the same NIC and I don't see the issue on that one.

Kaavi

Gimli, thanks for your reply. I tried to use another NIC (X552/X557-AT) instead of the I350 - unfortunately it is the same error :(

So I guess you are correct about it being a FreeBSD/Hyper-V issue :(

Gimli

Alright, here's an update on this issue.

For the last few weeks I haven't experienced the problem, but I don't think I fixed it; it's more of a workaround. I added a cron job on the pfSense box that runs at midnight every day and resets the interface that usually goes down under heavy usage. It appears that cycling the interface down/up before it crashes keeps it from crashing. The cycling is so fast that it doesn't even break connections that are active, it just delays them for a few milliseconds.
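The post does not say what the cron job actually runs; it was presumably a plain shell command. Purely as an illustration, here is a minimal Python sketch of the same down/up cycle. The interface name `hn0`, the function names, and the script path in the cron comment are all assumptions, and it needs root and a Python interpreter on the box.

```python
import subprocess

# Hypothetical crontab entry to run this at midnight every day:
#   0 0 * * * /usr/local/bin/python /root/cycle_iface.py


def cycle_commands(interface):
    """Build the two ifconfig invocations that take the interface down, then up."""
    return [["ifconfig", interface, "down"], ["ifconfig", interface, "up"]]


def cycle_interface(interface, run=subprocess.run):
    """Run the down/up commands in order; `run` is injectable for dry runs."""
    for argv in cycle_commands(interface):
        run(argv, check=True)


if __name__ == "__main__":
    # "hn0" is an assumed interface name; substitute the one that stalls.
    cycle_interface("hn0")
```

Passing a different `run` callable makes it possible to check which commands would be issued without touching the interface.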

Vorland

We had the same problem, our setup:

Windows Server 2012 (without R2) - Hyper-V host
pfSense 2.2.6-RELEASE (amd64)
NIC: HP NC382i DP Multifunction Gigabit Server Adapter

The problem was resolved by installing all Windows updates and updating the NIC driver.

Hope this information will be helpful.

Gimli

What time frame are you talking about, Vorland? My servers have always been up-to-date on updates and drivers. Maybe it's one of the December updates that fixed it.

I'll disable my cron job for a while to see if it comes back.

Vorland

I installed all updates on 2016-01-06. I guess the updated NIC drivers resolved the issue.

rodymcamp

I also had this issue running FreeNAS on top of ESXi. The only information I could find hinted that I needed to stop using the VMXNET NIC, due to an issue with the FreeBSD driver, and switch back to the Intel virtual NIC. I have not had a crash since.

LEckley

Hi Guys,

Has anyone found a solution to this issue beyond restarting the interface with a cron job?

Gimli

I've disabled my cron job since the 22nd and haven't experienced the issue since. I don't think it's a question of drivers as I've had the same Intel drivers since last October (they're the latest) but I think the December patches from Microsoft may have fixed it.

LEckley

@Gimli:

Thanks! I pushed the last round of MS updates last night, will see how it goes.

Cheers!

SkinnyT

I've been having the same issue with the connection dropping when running a speed test. It'll only crash on the upload though.

After about 3 days of pulling my hair out, I ended up disabling the Energy Efficient Ethernet setting on both NICs and also turning off Flow Control. I also changed nmbclusters to 1000000. I tried each of these settings on their own with no luck, but all three together seem to have made a difference.

All week I couldn't run one speed test without dropping my WAN connection, and last night I ran a test every 15 minutes for about two hours with no drops.

Now that I'm at work, I'm a little more hesitant to remote in and run one for fear of it dropping again, but hopefully in a few days I'll have a bit more confidence in it.

bean72

I have been running into the same issue. For now I have my network running on a backup router until I can resolve it. What's weird is that when it drops, I can still ping certain external IPs. My network adapter is a BCM5716; I tried updating the drivers and still have the same problem.
Is anyone seeing anything in the logs on the Hyper-V host?
It seems that the error starts at the same time as Event Viewer logs the error: "The network link is down. Check to make sure the network cable is properly connected"

I thought it was originally a faulty cable, but after switching the cable 3 times I'm guessing it's something else.

Gimli

All I can recommend at this point is to make sure that you're running the latest version of pfSense, your host is fully up-to-date with Microsoft patches and NIC driver updates and that you disable the Energy Saving mode(s) in your host's NIC configuration.

If that all fails, try with a different NIC.

jsingh

Losing connectivity with external switch on Hyper-V
I have installed the 2.3.1 release as a Hyper-V guest on Server 2012 R2. WAN is working fine with an external switch, but the LAN is connected through another NIC (connected to the LAN). The LAN keeps losing connectivity; most of the time I restart the LAN interface, otherwise I have to reboot the pfSense guest OS.

I have even tried using an internal switch on the LAN side and the issue still exists. It is for sure not my NIC. It is something to do with the Hyper-V settings or pfSense.
I ran a similar setup in a test lab on VMware Workstation and it works like a charm there.

Any solutions, guys?