Inactive Memory problems

kelsen

Hi,

I'm facing a problem that I can't understand: a pfSense 2.0.3 box with 6GB of memory is consuming 3+GB as inactive memory. When free memory drops below 30MB it causes a network outage, and I need to reboot to clear the memory. A week later inactive memory starts filling up again until the same problem happens.
I know free memory is a waste, but somehow the kernel doesn't free up memory to avoid that problem.
I have squid 2.7 with AD authentication and 168 clients.

cache_mem 256 MB
maximum_object_size_in_memory 128 KB
memory_replacement_policy lru
cache_replacement_policy heap LFUDA
cache_dir aufs /var/squid/cache 51200 32 256
minimum_object_size 0 KB
maximum_object_size 5120 KB
offline_mode off
cache_swap_low 94
cache_swap_high 95

I can't paste top now because I just restarted the firewall.

jimp

What leads you to believe that the lack of "free" memory is causing your problem?

If you explain the symptoms in more detail it would help us find the actual cause and solution.

kelsen

Well, I think it is because, when free memory drops below 30MB, all interfaces stop responding to ping and all gateways go offline. The first time it happened I saw that the firewall had only 512MB of memory, so we increased it to 6GB, but like I said, inactive memory consumes everything until there is no free memory available, which I think is related to the squid cache.
When it happens I need to reboot the system because I can't wait until it recovers by itself.
I would gladly provide more info if you need it.
Sorry if my English is poor.

[Attached images: top.jpg, mem.jpg]

jimp

You are probably looking at two symptoms of the same root cause though. The lack of free memory would not cause that. It would start swapping and it hasn't touched swap yet in your top output.

Keep looking in other places, but the memory alone is not the problem.

kelsen

Actually this top output is not from when the issue happens; it is just to show the amount of inactive memory since yesterday, when I rebooted the firewall. It does indeed swap a little when the problem happens.
But what else could it be, if it has happened three times and each of those times free memory was below 30MB?
I'm sorry if I'm insisting on this, but it would be too much of a coincidence.

jimp

I'm not saying it's not related, I'm saying it's not likely the cause, just another symptom. It's like getting shot in the leg and then saying the blood and the limp are the problem, not the gunshot.

Keep working until you find the bullet. :-)

wallabybob

Unless you are running the amd64 variant of pfSense you won't be able to use more than about 3GB of the 6GB you have allocated.

@kelsen:

Well, I think it is because, when free memory drops below 30MB, all interfaces stop responding to ping and all gateways go offline.

This is the sort of symptom you would see if mbufs (kernel network buffers) are (nearly) exhausted. pfSense shell command

```
netstat -m
```

reports mbuf statistics. It could be worth running a shell script on the console to loop giving a timestamp, reporting the statistics and sleeping for an hour. You could also run that in a SSH session to capture history while the console run will (hopefully) give you statistics after you lose network access.
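
A minimal sketch of such a logging loop, assuming a plain /bin/sh script. The function names `snapshot` and `monitor`, the `run` argument and the file names in the usage note are made up for illustration; only `netstat -m`, `date`, `sleep` and `tee` are standard commands.

```shell
#!/bin/sh
# Sketch only: function names and the "run" argument are illustrative.

# snapshot CMD [ARGS...]: print a timestamp line, then the output of CMD.
snapshot() {
    echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
    "$@"
}

# monitor INTERVAL COUNT CMD [ARGS...]: take a snapshot every INTERVAL
# seconds. COUNT=0 means loop until interrupted.
monitor() {
    interval=$1
    count=$2
    shift 2
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        snapshot "$@"
        i=$((i + 1))
        sleep "$interval"
    done
}

# Only start the hourly loop when explicitly asked to.
if [ "${1:-}" = "run" ]; then
    monitor 3600 0 netstat -m
fi
```

Saved as, say, `mbuf_watch.sh` (a hypothetical name), it could be started on the console with `sh mbuf_watch.sh run | tee -a /root/mbuf-history.log`, so the statistics stay on screen and also accumulate in a file. The same command in an SSH session gives a second copy of the history.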

kelsen

@wallabybob:

Unless you are running the amd64 variant of pfSense you won't be able to use more than about 3GB of the 6GB you have allocated.

Sure, I am running amd64.

This is the sort of symptom you would see if mbufs (kernel network buffers) are (nearly) exhausted. pfSense shell command

```
netstat -m
```

reports mbuf statistics. It could be worth running a shell script on the console to loop giving a timestamp, reporting the statistics and sleeping for an hour. You could also run that in a SSH session to capture history while the console run will (hopefully) give you statistics after you lose network access.

I hadn't thought about that. For now it's fine; when free memory reaches about 100MB I'll run this script.
Thank you for your tips!

adam65535

@wallabybob:

This is the sort of symptom you would see if mbufs (kernel network buffers) are (nearly) exhausted. pfSense shell command

```
netstat -m
```

reports mbuf statistics. It could be worth running a shell script on the console to loop giving a timestamp, reporting the statistics and sleeping for an hour. You could also run that in a SSH session to capture history while the console run will (hopefully) give you statistics after you lose network access.

I wish mbuf counts were on an RRD graph in pfSense. It is such an important thing to keep an eye on. It would be great to see the history of that over time.

Thinking about it… It would be great if we could get a consensus on some very important things to monitor like this and get a script going to send an email alert when the values are approaching the maximum values.
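
A rough sketch of what the checking half of such a script might look like, in /bin/sh. It assumes the FreeBSD `netstat -m` line of the form `N/N/N/N mbuf clusters in use (current/cache/total/max)` and treats total/max as the usage figure. The function names `cluster_pct` and `check_mbufs` and the 80% threshold are made up for illustration, and actually mailing the warning is left out because it depends on which mailer is installed.

```shell
#!/bin/sh
# Sketch only: function names and thresholds are illustrative.

# cluster_pct: read `netstat -m` output on stdin and print the integer
# percentage of mbuf clusters in use (total / max).
# Assumes the first field of the matching line is cur/cache/total/max.
cluster_pct() {
    awk '/mbuf clusters in use/ {
        split($1, f, "/")
        if (f[4] > 0) printf "%d\n", (f[3] * 100) / f[4]
        exit
    }'
}

# check_mbufs THRESHOLD: read `netstat -m` output on stdin.
# Returns 0 if usage is below THRESHOLD percent, 1 (with a warning on
# stdout) if it is at or above it, 2 if the line could not be parsed.
check_mbufs() {
    threshold=$1
    pct=$(cluster_pct)
    [ -n "$pct" ] || return 2
    if [ "$pct" -ge "$threshold" ]; then
        echo "WARNING: mbuf clusters at ${pct}% of maximum"
        return 1
    fi
    return 0
}
```

Run periodically as `netstat -m | check_mbufs 80`, it prints a warning line only when usage reaches the threshold, and that line could then be handed to whatever notification method is available.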