MBUF usage at 87%

theaddies

Sorry for the simple question folks, but I couldn't figure out what to do after searching. I have a Supermicro A1SAM-2550F pfSense machine running version 2.2.1. Since my motherboard has 4 Intel NICs I thought this might be the issue. I have 8GB of Kingston ECC RAM. For a while the MBUF usage was running at 33%, but today I noticed it is quite high. Is this a problem?

Guest

Perhaps this will help you tune your NICs a little:
Tuning and Troubleshooting Network Cards

stephenw10

What actual MBUF values (used/max) are you seeing?

Steve

theaddies

I will post the exact used and max values tonight. The used value was 87% but I don't remember out of how much. I did reboot the computer last night and the values dropped back down to about 33%. I did find the statement about 4 NICs in the link posted above and found the line kern.ipc.nmbclusters="1000000", but I honestly wasn't sure how to implement it. I don't know how to get a command prompt within pfSense to make the change in the referenced file. Thanks folks.

almabes

You don't need 1M mbufs. Gonzopancho smacked me around for posting that suggestion in a separate thread about RCC-VE hardware, but also included some educational material.

I have production firewalls with 40+ users that run with 25k mbufs, only actively using roughly 1800 or so.

See:
@gonzopancho:

@almabes:

Another tweak…

Certain intel igb cards, especially multi-port cards, can very easily exhaust mbufs and cause kernel panics, especially on amd64. The following tweak will prevent this from being an issue:
In /boot/loader.conf.local - Add the following (or create the file if it does not exist):
kern.ipc.nmbclusters="1000000"
That will increase the amount of network memory buffers, allowing the driver enough headroom for its optimal operation.

see: https://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards#Intel_igb.284.29_and_em.284.29_Cards

The kernel doesn't panic when you exhaust mbufs; it panics when you set this limit too high (and your number is too high), because the system runs out of memory.

For each mbuf cluster an “mbuf” structure is needed. These each consume 256 bytes and are used to organize mbuf clusters in chains. An mbuf cluster takes another 2048 bytes (or more, for jumbo frames). It is possible to store roughly 100 bytes of additional data in the mbuf itself, but this is not always used.

When there are no free mbuf clusters available, FreeBSD enters the zonelimit state and stops answering network requests. You can see it as the zoneli state in the output of the top command. It doesn't panic, it appears to 'freeze' for network activity.

If your box has 1GB of RAM or more, 25K mbuf clusters will be created by default. Occasionally this is not enough. If that is the case, then perhaps doubling that value, and maybe doubling again, is in order. But 1M mbuf clusters? Are you serious?

You just advised people to consume 1,000,000 mbuf clusters (at 2K each). Let me know if I need to explain how much RAM you needlessly advised people to allocate for no good purpose.

I am well-aware that someone wrote something completely uninformed here: https://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards#mbuf_.2F_nmbclusters
so please don't quote it back to me.
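
To put a rough number on that, here is a back-of-the-envelope sketch in Python using only the sizes quoted above (2048 bytes per standard cluster plus one 256-byte mbuf structure each). The function name is made up for illustration, and the result is a worst case: it is what the limit would permit if every cluster were in use at once, not a measurement of what the system allocates at boot.

```python
MBUF_BYTES = 256        # per-mbuf structure size quoted above
CLUSTER_BYTES = 2048    # standard cluster size quoted above (jumbo clusters are larger)

def worst_case_bytes(nmbclusters):
    """Memory needed if every cluster, plus one mbuf each, were in use."""
    return nmbclusters * (MBUF_BYTES + CLUSTER_BYTES)

# Compare the default-sized limit, a doubled limit, and the 1M suggestion.
for limit in (25_000, 50_000, 1_000_000):
    mib = worst_case_bytes(limit) / 2**20
    print(f"{limit:>9} clusters -> {mib:8.1f} MiB worst case")
```

Under these assumptions a limit of 1,000,000 permits a little over 2 GiB of network buffers, while 25,000 stays under 60 MiB.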

theaddies

My MBUF usage readings are below.
37% (9876/26584).
Is this acceptable? As I said, after a period of about a month or so I noticed the value was at 87%.
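
For anyone who wants to track these readings over time, a minimal Python sketch for pulling the numbers out of a reading in the format shown above might look like this. The function name is hypothetical, and it assumes the reading always contains a "(used/max)" pair.

```python
import re

def parse_mbuf_usage(text):
    """Return (used, limit, percent) from a string like '37% (9876/26584)'."""
    match = re.search(r"\((\d+)\s*/\s*(\d+)\)", text)
    if match is None:
        raise ValueError(f"no used/max pair found in {text!r}")
    used, limit = int(match.group(1)), int(match.group(2))
    return used, limit, 100.0 * used / limit

used, limit, percent = parse_mbuf_usage("37% (9876/26584)")
print(f"{used} of {limit} in use ({percent:.1f}%), {limit - used} free")
```

Logging the free count at intervals shows whether usage is stable or still climbing, which matters more than any single reading.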

stephenw10

If it was up to 87% and still climbing then that's an issue because you don't want to run out. Try doubling it at first and keep an eye on the mbuf RRD graphs.
To do that, add the line shown to the file /boot/loader.conf.local. You can do that from the GUI by executing the following in the Diagnostics > Command Prompt box:

echo 'kern.ipc.nmbclusters="50000"' >> /boot/loader.conf.local

That will create the file. If you need to change it again you can do so via Diagnostics > Edit file. It only takes effect at boot though.
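
One thing to watch: running the echo command a second time appends a second line for the same setting. As a sketch of how a later change could be made without leaving duplicates, here is a small Python helper that works on the file's text. The function name is made up for illustration, the value 100000 is only an example of doubling again, and reading or writing /boot/loader.conf.local itself is deliberately left out.

```python
def set_loader_tunable(conf_text, key, value):
    """Return conf_text with key="value" present exactly once."""
    new_line = f'{key}="{value}"'
    # Drop any existing assignment of this key; keep everything else,
    # including comment lines, untouched.
    kept = [line for line in conf_text.splitlines()
            if line.split("=", 1)[0].strip() != key]
    kept.append(new_line)
    return "\n".join(kept) + "\n"

before = 'kern.ipc.nmbclusters="50000"\n'
after = set_loader_tunable(before, "kern.ipc.nmbclusters", 100000)
print(after, end="")
```

The same result can be had by hand through Diagnostics > Edit file; the point is only that the setting should appear once.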

Steve

antillie

My MBUF usage was sitting at 200006/~256000, or about 78% with no traffic going through the box. Granted it didn't go up much when I had small amounts of testing traffic but sitting at over 75% of capacity all the time really made me nervous.

I added kern.ipc.nmbclusters="1000000" to the /boot/loader.conf.local file and now my MBUF usage is comfortably at 2%. Memory usage is also comfortably at 5% vs 3% previously. A small price to pay, I think, to ensure that my firewall won't stop passing traffic if things get busy.

I suppose I could use a lower number. But the default just seemed off.

stephenw10

What CPU and NICs are you using?

theaddies

I ran the command suggested by stephenw10 and the values are now 18% (9120/50000) after a reboot. Thanks for the help everyone. You guys were awesome.

antillie

I am using a Supermicro A1SRi-2758F.

stephenw10

Ah, OK. You will see a lot then. Cores X NICs X mbuf allocation = big. :)
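
As a rough illustration of that rule of thumb, here is a small Python sketch. It assumes each NIC gets one queue per core and that every queue holds a fixed number of clusters. The 512 per queue is a made-up placeholder, not a value taken from the igb or em driver, and real drivers may cap the number of queues. The function name is also just for illustration.

```python
def baseline_clusters(cores, nics, clusters_per_queue):
    """Rule of thumb from the post: cores x NICs x per-queue allocation."""
    return cores * nics * clusters_per_queue

DEFAULT_LIMIT = 26584   # limit reported earlier in this thread
PER_QUEUE = 512         # hypothetical figure, NOT taken from any driver

for cores, nics in ((2, 4), (4, 4), (8, 4)):
    need = baseline_clusters(cores, nics, PER_QUEUE)
    share = 100 * need / DEFAULT_LIMIT
    print(f"{cores} cores x {nics} NICs -> {need} clusters "
          f"({share:.0f}% of {DEFAULT_LIMIT})")
```

Whatever the real per-queue figure is, the product grows linearly with core count, which is why an 8-core board can sit at a high baseline with no traffic.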

Steve

almabes

Can someone knowledgeable post some guidelines for mbuf configuration? There's incomplete and conflicting information out there which is confusing folks. Some information as to what might cause mbuf utilization to climb would be useful too.

Thanks

tattinger

I'm running a Supermicro MBD-A1SRM-LN7F-2758 with 16GB of memory and pfSense 2.2.2. My initial MBUF usage was 73% (19496/26584), with memory usage at 4%. I edited /boot/loader.conf.local using Diagnostics > Edit file and added kern.ipc.nmbclusters="1000000"

Now, after a reboot, my MBUF usage is 2% (19750/1000000) and memory usage is 2%.

Sir Loin

I have the Supermicro A1SRi-2758F with 16GB RAM. I ran into the same MBUF problem initially. I had to up kern.ipc.nmbclusters to 1000000 and now all is good.

robi

@stephenw10:

Ah, OK. You will see a lot then. Cores X NICs X mbuf allocation = big. :)

Steve

I've got a system with Atom D525 (4 cores - 1 package(s) x 2 core(s) x 2 HTT threads) and 5 Intel Gigabit NICs, all use the em driver, MBUF is at 2%, no tweak.
I've also got a new A1SRi-2758F with Atom C2758 (8 cores - 1 package(s) x 8 core(s)) and 4 Intel Gigabit NICs, all use the igb driver, MBUF is at 14%, no tweak.

Both in the same place using exactly the same config (the 5th NIC on the D525 not connected to keep machines interchangeable).

Don't see why the MBUFs are so much higher on the C2758. According to your math it shouldn't be more than 5%…
Couldn't we somehow force the em driver to be used instead of the igb driver on the Intel NICs of the A1SRi-2758F?

stephenw10

I guess there are more variables in play than I'm aware of. Most likely the usage scales with traffic throughput. Though I'm guessing now…. ::)
There's no way to use the em driver with newer Intel NICs as far as I know.

Steve

robi

There could be some automatic detection at boot which would pre-set the correct value based on the specific hardware, using some math like you suggested.

got0

I own a new SG-4860 and only did some basic configuration and testing so far. However, the usage of MBUF is causing issues:

MBUF Usage: 81% (21516/26584) <- just booted
…
MBUF Usage: 100% (26584/26584) <- climbing without anything really happening on the box
...
kernel: [zone: mbuf_cluster] kern.ipc.nmbclusters limit reached
Uptime ~18h

As suggested by Steve before, I am now starting the game of doubling nmbclusters until the box is not freezing anymore. But is it just my box, or is that a general issue with the SG-*? Aren't they already tuned?
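
For reference, a quick Python sketch of what the doubling game looks like starting from the 26584 limit above, with the worst-case memory each step would permit. It uses the 2048-byte cluster and 256-byte mbuf sizes quoted earlier in the thread. The helper name is hypothetical, and three steps is an arbitrary cut-off.

```python
def doubling_steps(start, steps):
    """Yield successive doublings of the nmbclusters limit."""
    limit = start
    for _ in range(steps):
        limit *= 2
        yield limit

BYTES_PER_CLUSTER = 2048 + 256   # cluster plus mbuf sizes quoted in the thread

candidates = list(doubling_steps(26584, 3))
for limit in candidates:
    mib = limit * BYTES_PER_CLUSTER / 2**20
    print(f"{limit:>7} clusters, up to {mib:.0f} MiB if all in use")
```

Each candidate only takes effect after a reboot, so it is worth watching usage for a while at each step before doubling again.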

stephenw10

We were discussing that internally just recently. In testing the limit was not reached apparently but as always the real world can be different to the test bench. You're not the first person to query that setting.
It's likely that value will be set higher by default in future releases for the SG series. If you run real world tests and come to a conclusion about a suitable value we'd love to hear it.

Steve

Edit: managed to leave out an entire word there!