System self destructed overnight - need some guidance



  • Woke up this morning to find that I had no internet access. Logged into pfSense and things were all borked: no dashboard, menu items missing, etc. Everything has completely reset itself. The firewall rules are back to default and all my rules are gone; I had to add a LAN rule to get internet access. About 75% of my static mappings are gone. All services are stopped and reset; for example, the DNS Resolver has none of my settings. The entire system is like this.

    Looked at the system logs and this is what I see:

    Nov 23 05:00:00	kernel		pid 42139 (freshclam), uid 106 inumber 10353329 on /: filesystem full
    Nov 23 05:00:01	kernel		pid 55915 (ntopng), uid 0 inumber 410356 on /: filesystem full
    Nov 23 05:00:28	kernel		pid 53743 (pfctl), uid 0 inumber 3290523 on /: filesystem full
    Nov 23 05:01:02	kernel		pid 55915 (ntopng), uid 0 inumber 410356 on /: filesystem full
    Nov 23 05:01:30	kernel		pid 84160 (pfctl), uid 0 inumber 3290523 on /: filesystem full
    Nov 23 05:02:02	kernel		pid 55915 (ntopng), uid 0 inumber 410356 on /: filesystem full
    Nov 23 05:02:33	kernel		pid 15866 (pfctl), uid 0 inumber 3290523 on /: filesystem full
    Nov 23 05:03:01	kernel		pid 55915 (ntopng), uid 0 inumber 410356 on /: filesystem full
    Nov 23 05:03:36	kernel		pid 43490 (pfctl), uid 0 inumber 3290523 on /: filesystem full
    Nov 23 05:04:02	kernel		pid 55915 (ntopng), uid 0 inumber 410356 on /: filesystem full
    Nov 23 05:04:39	kernel		pid 74470 (pfctl), uid 0 inumber 3290523 on /: filesystem full
    Nov 23 05:05:01	kernel		pid 55915 (ntopng), uid 0 inumber 410356 on /: filesystem full
    Nov 23 05:05:41	kernel		pid 2973 (pfctl), uid 0 inumber 3290523 on /: filesystem full
    Nov 23 05:06:02	kernel		pid 55915 (ntopng), uid 0 inumber 410356 on /: filesystem full
    Nov 23 05:06:44	kernel		pid 31098 (pfctl), uid 0 inumber 3290523 on /: filesystem full
    

    Did a restart and I see this:

    Nov 23 05:37:40	yukon.lan		nginx: 2016/11/23 05:37:40 [error] 23468#100181: *37 open() "/usr/local/www/apple-touch-icon-precomposed.png" failed (2: No such file or directory), client: 192.168.1.5, server: , request: "GET /apple-touch-icon-precomposed.png HTTP/1.1", host: "192.168.1.1"
    Nov 23 05:37:40	yukon.lan		nginx: 2016/11/23 05:37:40 [error] 23468#100181: *37 open() "/usr/local/www/apple-touch-icon-precomposed.png" failed (2: No such file or directory), client: 192.168.1.5, server: , request: "GET /apple-touch-icon-precomposed.png HTTP/1.1", host: "192.168.1.1"
    Nov 23 05:37:41	yukon.lan		nginx: 2016/11/23 05:37:41 [error] 23468#100181: *37 open() "/usr/local/www/apple-touch-icon-precomposed.png" failed (2: No such file or directory), client: 192.168.1.5, server: , request: "GET /apple-touch-icon-precomposed.png HTTP/1.1", host: "192.168.1.1"
    Nov 23 05:37:41	yukon.lan		nginx: 2016/11/23 05:37:41 [error] 23468#100181: *37 open() "/usr/local/www/apple-touch-icon-precomposed.png" failed (2: No such file or directory), client: 192.168.1.5, server: , request: "GET /apple-touch-icon-precomposed.png HTTP/1.1", host: "192.168.1.1"
    Nov 23 05:37:41	yukon.lan		nginx: 2016/11/23 05:37:41 [error] 23468#100181: *37 open() "/usr/local/www/apple-touch-icon-precomposed.png" failed (2: No such file or directory), client: 192.168.1.5, server: , request: "GET /apple-touch-icon-precomposed.png HTTP/1.1", host: "192.168.1.1"
    Nov 23 05:37:41	yukon.lan		nginx: 2016/11/23 05:37:41 [error] 23468#100181: *37 open() "/usr/local/www/apple-touch-icon-precomposed.png" failed (2: No such file or directory), client: 192.168.1.5, server: , request: "GET /apple-touch-icon-precomposed.png HTTP/1.1", host: "192.168.1.1"
    Nov 23 05:37:41	yukon.lan		nginx: 2016/11/23 05:37:41 [error] 23468#100181: *37 open() "/usr/local/www/apple-touch-icon-precomposed.png" failed (2: No such file or directory), client: 192.168.1.5, server: , request: "GET /apple-touch-icon-precomposed.png HTTP/1.1", host: "192.168.1.1"
    Nov 23 05:37:41	yukon.lan		nginx: 2016/11/23 05:37:41 [error] 23468#100181: *37 open() "/usr/local/www/apple-touch-icon-precomposed.png" failed (2: No such file or directory), client: 192.168.1.5, server: , request: "GET /apple-touch-icon-precomposed.png HTTP/1.1", host: "192.168.1.1"
    Nov 23 05:37:42	yukon.lan		nginx: 2016/11/23 05:37:42 [error] 23468#100181: *37 open() "/usr/local/www/apple-touch-icon-precomposed.png" failed (2: No such file or directory), client: 192.168.1.5, server: , request: "GET /apple-touch-icon-precomposed.png HTTP/1.1", host: "192.168.1.1"
    Nov 23 05:37:42	yukon.lan		nginx: 2016/11/23 05:37:42 [error] 23468#100181: *37 open() "/usr/local/www/apple-touch-icon-precomposed.png" failed (2: No such file or directory), client: 192.168.1.5, server: , request: "GET /apple-touch-icon-precomposed.png HTTP/1.1", host: "192.168.1.1"
    Nov 23 05:37:42	yukon.lan		nginx: 2016/11/23 05:37:42 [error] 23468#100181: *37 open() "/usr/local/www/apple-touch-icon-precomposed.png" failed (2: No such file or directory), client: 192.168.1.5, server: , request: "GET /apple-touch-icon-precomposed.png HTTP/1.1", host: "192.168.1.1"
    Nov 23 05:37:42	yukon.lan		nginx: 2016/11/23 05:37:42 [error] 23468#100181: *37 open() "/usr/local/www/apple-touch-icon-precomposed.png" failed (2: No such file or directory), client: 192.168.1.5, server: , request: "GET /apple-touch-icon-precomposed.png HTTP/1.1", host: "192.168.1.1"
    Nov 23 05:37:42	yukon.lan		nginx: 2016/11/23 05:37:42 [error] 23468#100181: *37 open() "/usr/local/www/apple-touch-icon-precomposed.png" failed (2: No such file or directory), client: 192.168.1.5, server: , request: "GET /apple-touch-icon-precomposed.png HTTP/1.1", host: "192.168.1.1"
    

    Was able to get the System Information panel up on the dashboard and it says the disk is 101% full.

    The only things I changed on the system yesterday are these:

    • I enabled "Log packets matched from the default block rules in the ruleset" in the firewall rules settings. I've had this enabled before and it has never filled up the disk, thanks to the log management settings.

    • I installed the Notes package. This was the first time I have ever installed it.

    Just using simple deduction, I suspect the Notes package conflicts with something.

    Am I looking at a complete format and re-install? I'm guessing so.

    PS I just tried to re-enable the DNS Resolver and I got this:

    The following input errors were detected:
    The generated config file cannot be parsed by unbound. Please correct the following errors:
    /var/unbound/test/unbound_server.pem: No such file or directory
    [1479911345] unbound-checkconf[80306:0] fatal error: server-cert-file: "/var/unbound/test/unbound_server.pem" does not exist
    

  • Banned

    @AR15USR:

    Just using simple deduction, I suspect the Notes package conflicts with something.

    No, Notes package certainly does NOT cause full disk, unlike the ntopng stats and logs.



  • @doktornotor:

    @AR15USR:

    Just using simple deduction, I suspect the Notes package conflicts with something.

    No, Notes package certainly does NOT cause full disk, unlike the ntopng stats and logs.

    I wouldn't think so either, but it could conflict, no? Turning on logging for the default block rules shouldn't either, as the log limit should have stopped that. Not to mention I have a 250 GB disk and I've never gotten anywhere near even a few % full.


  • Banned

    No, it does not conflict with anything; the only thing it does is save base64-encoded plaintext notes into config.xml. It would take about a century of typing to fill a 250 GB drive with notes. SSH into the box and wipe some ntopng logs and other cruft:

    rm -rf /var/db/ntopng/
    

    After that, remove the package from the developer shell menu.

    playback uninstallpkg pfSense-pkg-ntopng
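
    Before deleting anything, it may help to confirm where the space actually went. The following is only a minimal sketch, assuming a POSIX sh with df, du, sort and tail available (as on a FreeBSD-based box). The function name largest_dirs and the choice of /var as the starting point are illustrative assumptions, not something from the posts above.

```shell
#!/bin/sh
# Sketch: find where the disk space went before deleting anything.
# Assumes a POSIX sh with df, du, sort and tail.
# largest_dirs is a hypothetical helper name, not a pfSense command.

# Overall usage of the root filesystem, in POSIX output format.
df -P /

# Print the N largest entries one level below DIR, sizes in KiB.
# -x stays on one filesystem, -d 1 limits the depth to one level.
largest_dirs() {
    dir=$1
    count=${2:-10}
    # du exits non-zero on unreadable directories; ignore that here.
    { du -xk -d 1 "$dir" 2>/dev/null || true; } | sort -n | tail -n "$count"
}

# Assumed starting point: /var, where logs and package data usually live.
largest_dirs /var 10
```

    The largest entries are printed last. If /var/db/ntopng shows up at the bottom of that list, that would support the suspicion about ntopng; if something else does, it is worth looking there first.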
    


  • @doktornotor:

    No, it does not conflict with anything; the only thing it does is save base64-encoded plaintext notes into config.xml. It would take about a century of typing to fill a 250 GB drive with notes. SSH into the box and wipe some ntopng logs and other cruft:

    rm -rf /var/db/ntopng/
    

    After that, remove the package from the developer shell menu.

    playback uninstallpkg pfSense-pkg-ntopng
    

    Roger.

    I'll wipe the ntop logs when I get a chance tonight.

    Question: In your opinion, could the setting "Log packets matched from the default block rules in the ruleset" cause the disk to fill up from only a few % to 100%+ overnight (overnight being 7 hours)? Is it possible that ntop caused that to happen overnight? What else could cause that? Just curious as to what could have caused this.

    Thanks


  • LAYER 8 Global Moderator

    Who says the disk filled up overnight? Maybe it has been growing in usage from day one, let's call it 1% a day. Yesterday it was at 99%, you go to bed, and now today it's 100% full.


  • Banned

    Also, do not ever, ever enable the historical data "feature". (There's a giant warning about this.) And yeah, I'm with johnpoz here, this does not happen overnight. The storage space can be monitored via SNMP and you can alert yourself accordingly on set thresholds.
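
    As a rough illustration of threshold alerting, here is a minimal sketch. It is a local stand-in that reads df directly rather than polling over SNMP, so it is not the SNMP setup described above. The function names usage_pct and check_usage and the 80% limit are assumptions made up for this example.

```shell
#!/bin/sh
# Sketch: warn when a filesystem crosses a usage threshold.
# Local df-based stand-in for SNMP polling; the 80% limit and the
# function names are illustrative assumptions.

# Print the usage of the filesystem holding the given path as a bare integer.
usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Print a WARN or OK line depending on whether PCT has reached LIMIT.
check_usage() {
    pct=$1
    limit=$2
    if [ "$pct" -ge "$limit" ]; then
        echo "WARN: disk usage ${pct}% is at or above ${limit}%"
    else
        echo "OK: disk usage ${pct}% is below ${limit}%"
    fi
}

check_usage "$(usage_pct /)" 80
```

    Run periodically (for example from cron) and hooked up to some notification, something like this would flag a disk that is creeping toward full well before it reaches 100%.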



  • Understood about it happening slowly, but I'm pretty sure I would have noticed. I watch the Dashboard System Information panel religiously and the Disk Usage was nowhere near even 30% yesterday. I have the web console up on my screen (with the system info panel displayed) pretty much the entire time I'm around. Before today I had yet to see any of the MBUF/CPU/Memory/Disk Usage ever get above 25%.

    I suppose I could have missed it though.

    I did notice that when I turned on "Log packets matched from the default block rules in the ruleset" in the firewall rules settings, a crap ton of blocks started flying by. More than I have ever seen before. Seemed unusual, but I paid it no mind at the time.

    I did not have the historical data turned on in ntop btw.

    Oh, I also forgot to mention that I left nmap running a scan on my LAN when I left for the night. It was running from a machine on the LAN; I wouldn't think this could cause anything like this.

    At any rate, I save my configuration routinely, so worst case I just reinstall and reload the configuration file.

    @doktornotor, what application is that a screenshot of?


  • LAYER 8 Global Moderator

    "@doktornotor, what application is that a screenshot of?"

    LibreNMS would be my guess ;)  Dok had mentioned setting that up recently ;)

    There are plenty of tools that can do it though. If you're a Windows shop, look at PRTG; it's free for 100 sensors. LibreNMS is just a fork of Observium. But there are shittons of monitoring tools that can provide you that sort of information via SNMP.


  • Banned

    @johnpoz:

    "@doktornotor, what application is that a screenshot of?"

    LibreNMS would be my guess ;)  Dok had mentioned setting that up recently ;)

    Yeah, that's LibreNMS. There are ready-to-use, self-updating VMs available for download (Ubuntu 16 LTS or CentOS 7):

    http://docs.librenms.org/Installation/Ubuntu-image/
    http://docs.librenms.org/Installation/CentOS-image/



  • Yeah, that's LibreNMS.

    Sorry to hijack the thread, but how do you find working with LibreNMS?  I just implemented Zabbix here and while it is nice, it was a tremendous pain in the ass to set up, and doing anything custom was more effort than it was worth.


  • LAYER 8 Netgate

    @AR15USR:

    Question: In your opinion could the setting of "Log packets matched from the default block rules in the ruleset" cause the disk to fill up

    No. The firewall log is circular. It never grows beyond its set limit.



  • @doktornotor:

    @johnpoz:

    "@doktornotor, what application is that a screenshot of?"

    LibreNMS would be my guess ;)  Dok had mentioned setting that up recently ;)

    Yeah, that's LibreNMS. There are ready-to-use, self-updating VMs available for download (Ubuntu 16 LTS or CentOS 7):

    http://docs.librenms.org/Installation/Ubuntu-image/
    http://docs.librenms.org/Installation/CentOS-image/

    Thanks, maybe this will work better for me than trying to get ELK to work.


  • LAYER 8 Global Moderator

    ELK is more for syslog, as opposed to monitoring of interface traffic, disk sizes, services, etc.



  • @johnpoz:

    ELK is more for syslog, as opposed to monitoring of interface traffic, disk sizes, services, etc.

    I was referring to the fact that it has a prebuilt VM, as I can't get ELK to work for the life of me.


  • Banned

    @KOM:

    Yeah, that's LibreNMS.

    Sorry to hijack the thread, but how do you find working with LibreNMS?  I just implemented Zabbix here and while it is nice, it was a tremendous pain in the ass to set up, and doing anything custom was more effort than it was worth.

    Oh… well, that's about 10,000% easier for anything capable of SNMP, plus no damned agents, proxies etc. required. Adding the devices and getting loads of basic monitoring data, graphs and a list/dashboard with an overview of stuff shouldn't take more than a couple of hours. After that, you can play with tuning things, like disabling irrelevant SNMP plugins for various types of devices, customized alerting, and monitoring of services (plus possibly some "one-click" remediation procedures if required; I have not had time for that yet). It does a fairly good job with a pretty much default configuration when it comes to categorizing devices and producing relevant graphs for those.

    Quick shrunken screenshot of random stuff added to a testing LibreNMS instance:

    An overview of a switch:

    Anyway, we are totally OT here, LOL.  ;D



  • Anyway, we are totally OT here, LOL.

    They can sue us.  Thanks for the info.  Muchly appreciated.



  • Go ahead and threadjack, I don't mind. I'm going to install and learn this app, so the knowledge is helpful…