Unbound crashes daily, 'out of swap space'
-
@steveits I should have mentioned it's an SG-1100 running 22.01 which has 1G RAM and an 8G disk. Memory usage is typically at 50%, disk is zfs at 11%. I do not have the pcscd service running.
-
The GUI reports right now :
I'm using the acme, cron, filer, Notes, Avahi, openvpn-client-export, System_Patches, NUT and Shellcmd packages. These have a very small disk footprint and do not run permanently. When they do run, they don't use a lot of memory.
The big one could be pfBlockerNG-devel, but I use it more as a case study, with fewer than 20 thousand lines of DNSBL and about a hundred IPs.
When I take down pfBlockerNG-devel (disable IP and DNSBL), the memory footprint drops to about 500 Mbytes.
I'm also using FreeRADIUS. Its memory usage looks stable to me.
Looking again at the memory graph: 1 Gbyte would work for me (I guess).
The very moment the system starts using one byte of swap space, consider taking down processes.
Service Watchdog is useless here, as it will kill a dying system. unbound was probably killed by the OS for Out Of Memory reasons, picked by a flip-of-a-coin election. It didn't crash, and if it did, it was because it ran out of memory. The Watchdog doesn't make more memory. It will just eat more scarce system resources.
Use the command line tool 'top' over ssh, or better, install htop ('pkg install htop') and use it: sort on memory usage and watch what happens over time.
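Watching which processes hold the most memory over time can also be scripted. A minimal sketch in Python, assuming a Python 3 interpreter is available on the box and that `ps -axo pid,rss,comm` prints the PID, the resident set size in kilobytes and the command name; the function names are made up for illustration:

```python
import subprocess


def parse_ps(output):
    """Parse `ps -axo pid,rss,comm` text into (pid, rss_kb, command) tuples."""
    rows = []
    for line in output.splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)       # command names may contain spaces
        if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
            rows.append((int(parts[0]), int(parts[1]), parts[2].strip()))
    return rows


def top_by_rss(rows, n=5):
    """Return the n entries with the largest resident set size."""
    return sorted(rows, key=lambda r: r[1], reverse=True)[:n]


def snapshot(n=5):
    """Run ps once and return the n biggest memory users."""
    out = subprocess.run(
        ["ps", "-axo", "pid,rss,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return top_by_rss(parse_ps(out), n)
```

Calling `snapshot()` every few minutes and appending the result to a file gives a rough trace of which process is growing, without keeping an interactive session open.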
-
@gertjan Service Watchdog is a necessary evil since Unbound started crashing. I have a house full of people, some of whom work from home and need a stable Internet connection. If Unbound dies while I'm in the office, then everyone is out of luck which is not acceptable. I also suspect pfB but can't reconcile the error with the RAM and disk usage stats. I already have an ssh session open running top sorted by memory use.
-
@kom said in Unbound crashes daily, 'out of swap space':
Service Watchdog is a necessary evil since Unbound started crashing.
I understand. Some solution is better than no solution.
I don't have an example log line at hand, but there are typical log messages that show that a process was thrown off the running list, terminated, for OOM (Out Of Memory) reasons. Swapping is a very resource-intensive task, and when that starts, things go downhill very fast.
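As a rough illustration of what to look for: on FreeBSD the kernel normally records such a kill in a line containing `was killed:` followed by a reason. The Python sketch below filters log text for that pattern. The exact wording differs between FreeBSD versions, so both the regular expression and the sample line in the comment are assumptions, not lines taken from this thread.

```python
import re

# Assumed shape of the kernel message, for example:
#   "... kernel: pid 1234 (unbound), jid 0, uid 59, was killed: out of swap space"
# Anything between the process name and "was killed" is matched loosely.
KILL_RE = re.compile(r"pid (\d+) \(([^)]+)\).*?was killed: (.+)$")


def find_kills(log_text):
    """Return (pid, process, reason) for each line that looks like a kernel kill."""
    hits = []
    for line in log_text.splitlines():
        m = KILL_RE.search(line)
        if m:
            hits.append((int(m.group(1)), m.group(2), m.group(3).strip()))
    return hits
```

Feeding it the output of `dmesg` or the contents of the system log should show whether unbound was the process picked, and with which reason.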
Btw, about the subject, "out of swap space": any OS would OOM-kill process(es) at that point, otherwise it's instant kernel death time.
No choice: lower your needs, or convert an old desktop PC, or .... use a Q-Box clone or visit 'the store'.
I don't have hands-on experience with an SG-1100. I saw a red SG-1000 once and loved it, but I wouldn't dare put a "2022" household behind it, as I would be condemned to sleeping in the dog house in no time ....
-
@kom I don't think the 1100 or other ARM (maybe all eMMC?) devices have swap, so it's just running out of memory and saying it can't use the (zero) swap space. Our 3100 doesn't have swap. Swap on an eMMC would be glacially slow, I think.
re: Service Watchdog, it's a double edged sword since I've seen comments it will interfere with service restarts as you noted. Are you sure unbound isn't being started twice in some instances? I recall that being the issue with it and Snort/Suricata.
-
I'm still suspecting pfB but I don't see anything definitive in the logs. pfB seems to have 29 different log files so who knows.
-
@kom Unbound RAM usage went from 147M to 152M in a day. I have pfB set to update only once a day at 12:30am. I did not receive any watchdog alerts last night or all day yesterday so far.
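To put those numbers in perspective, here is a back-of-the-envelope extrapolation in Python. It assumes growth stays linear, which is a strong assumption (a DNS cache usually levels off), and the 300 MB of headroom in the example is a made-up figure, not a measurement from this system.

```python
def growth_per_day(start_mb, end_mb, days):
    """Average change in memory use per day, in MB."""
    if days <= 0:
        raise ValueError("days must be positive")
    return (end_mb - start_mb) / days


def days_until_exhausted(headroom_mb, rate_mb_per_day):
    """Days until the headroom is used up; None if usage is flat or shrinking."""
    if rate_mb_per_day <= 0:
        return None
    return headroom_mb / rate_mb_per_day


rate = growth_per_day(147, 152, 1)      # 147M -> 152M in one day
print(rate)                             # 5.0
print(days_until_exhausted(300, rate))  # 60.0 (the 300 MB is hypothetical)
```

At that rate a slow leak alone would take weeks to matter, which fits poorly with a daily crash and points more towards a short spike, such as a list update.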
-
@kom
Here is mine:
Does this mean that you've added about 50 Mbytes worth of DNSBL and IPs?
I'm not using the 'arm' version, but the classic 'amd64'.
-
@gertjan Perhaps. I don't know. I have no idea how much space the various lists consume. I haven't added any custom lists, I just use the defaults.
-
@kom
This morning it shows 108 Mbytes, that's 2 Mbytes more.
Could be the dns cache, I'm not sure.
I'll follow the mem usage for some time.
-
Stupid me.
Someone is already tracing unbound memory usage every 5 minutes.
Here.
There is a memory usage chart.
As you can see, everything (the stats) came to a 'crashing' halt last night (GMT +2): pfBlocker decided to restart unbound because some DNSBL was updated.
That's actually a positive effect of restarting a process: if there was a memory leak, even a small one, a restart would clean it up.
Your setup should do the same, I guess, but your memory ceiling is much lower.
-
@gertjan I also have a running session tracking memory. It grew at first by 3-5M and then has been stable for 2 days now. We'll see over the next week or two.