Corrupt Configs



  • A couple of weeks ago I had a box die with a bad config. The description on the config that went bad pointed to pfBlocker. On Thursday I had another box die with a bad config. Looking at it I see:

    15. 6/3/19 05:00:54     v15.8   (system)
        Suricata pkg: updated status for updated rules package(s) check.
    
    14. 6/17/19 05:03:26    v15.8   (system)
        Suricata pkg: updated status for updated rules package(s) check.
    
    13. 6/17/19 12:07:03    v15.8   admin@IP.Addr
        Suricata pkg: disabled Suricata on WAN
    
    12. 8/8/19 07:36:03     v15.8   (system)
        Intermediate config write during package install for Cron.
    
    11. 8/8/19 07:36:04     v15.8   (system)
        Installed Lightsquid package.
    
    10. 8/8/19 07:36:05     v15.8   (system)
        Intermediate config write during package install for pfBlockerNG.
    
    09. 8/8/19 07:36:06     v15.8   (system)
        Installed RRD Summary package.
    
    08. 8/8/19 07:36:07     v15.8   (system)
        Intermediate config write during package install for squid3.
    
    07. 8/8/19 07:36:08     v15.8   (system)
        Upgraded Squid configuration during package install.
    
    
    -rw-r--r--  1 root  wheel   221K May  9 14:27 config-1557426439.xml
    -rw-r--r--  1 root  wheel   221K May  9 15:37 config-1557426467.xml
    -rw-r--r--  1 root  wheel   222K Jun  3 11:50 config-1559552454.xml
    -rw-r--r--  1 root  wheel   222K Aug 12 16:48 config-1560762206.xml
    -rw-r--r--  1 root  wheel   222K Jun 17 12:44 config-1560787623.xml
    -rw-r--r--  1 root  wheel   8.6K Aug  8 07:36 config-1565264163.xml
    -rw-r--r--  1 root  wheel    11K Aug  8 07:36 config-1565264164.xml
    -rw-r--r--  1 root  wheel    13K Aug  8 07:36 config-1565264165.xml
    -rw-r--r--  1 root  wheel    16K Aug  8 07:36 config-1565264166.xml
    
    

    The config changes that started on Aug 8th appear to break the config. I think it's the

    12. 8/8/19 07:36:03     v15.8   (system)
        Intermediate config write during package install for Cron.
    

    All of this is a little confusing. First, we weren't doing any updates to the firewall so I don't know why the packages were installing / updating. Second, how is whatever process running corrupting the config file? Restoring the previous version of the config file fixed it but this is the third unit we've had in the last few months that has had this problem. Can someone help me figure it out? Thanks.


  • Netgate Administrator

    You could be hitting this: https://redmine.pfsense.org/issues/8811

    Steve



  • @stephenw10 I don't think so. That bug indicates that the config continues to keep growing. In my case most of the config is removed. The last time I noted this it was pfBlockerNG that was the most recent config change. This time it appears it was cron.

    Perhaps what the most recent change was isn't the issue. If I look at all of the backups:

    30. 5/1/19 12:14:23     v15.8   admin@IP.Addr
        Suricata pkg: save enabled rule categories for lan.
    
    29. 5/1/19 12:15:03     v15.8   admin@IP.Addr
        Suricata pkg: new rules configuration for lan.
    
    28. 5/1/19 17:01:33     v15.8   (system)
        Suricata pkg: updated status for updated rules package(s) check.
    
    27. 5/5/19 05:00:32     v15.8   (system)
        Suricata pkg: updated status for updated rules package(s) check.
    
    26. 5/5/19 15:19:12     v15.8   (system)
        Updated cron job for /usr/local/bin/freshclam --config-file=/usr/local/etc/freshclam.conf
    
    25. 5/6/19 05:00:06     v15.8   (system)
        Suricata pkg: updated status for updated rules package(s) check.
    
    24. 5/6/19 12:33:14     v15.8   (system)
        Updated cron job for /usr/local/bin/freshclam --config-file=/usr/local/etc/freshclam.conf
    
    23. 5/6/19 12:33:21     v15.8   (system)
        Updated cron job for /usr/local/bin/freshclam --config-file=/usr/local/etc/freshclam.conf
    
    22. 5/6/19 13:15:29     v15.8   admin@IP.Addr
        /firewall_aliases_edit.php made unknown change
    
    21. 5/6/19 13:18:51     v15.8   admin@IP.Addr
        /firewall_aliases_edit.php made unknown change
    
    20. 5/6/19 13:19:55     v15.8   admin@IP.Addr
        Suricata pkg: modified PASS LIST passlist_29427.
    
    19. 5/6/19 13:20:58     v15.8   admin@IP.Addr
        Suricata pkg: modified interface configuration for WAN
    
    18. 5/9/19 05:01:34     v15.8   (system)
        Suricata pkg: updated status for updated rules package(s) check.
    
    17. 5/9/19 14:27:19     v15.8   admin@IP.Addr
        Widget configuration has been changed.
    
    16. 5/9/19 14:27:47     v15.8   admin@IP.Addr
        System:
    
    15. 6/3/19 05:00:54     v15.8   (system)
        Suricata pkg: updated status for updated rules package(s) check.
    
    14. 6/17/19 05:03:26    v15.8   (system)
        Suricata pkg: updated status for updated rules package(s) check.
    
    13. 6/17/19 12:07:03    v15.8   admin@IP.Addr
        Suricata pkg: disabled Suricata on WAN
    
    12. 8/8/19 07:36:03     v15.8   (system)
        Intermediate config write during package install for Cron.
    
    11. 8/8/19 07:36:04     v15.8   (system)
        Installed Lightsquid package.
    
    10. 8/8/19 07:36:05     v15.8   (system)
        Intermediate config write during package install for pfBlockerNG.
    
    09. 8/8/19 07:36:06     v15.8   (system)
        Installed RRD Summary package.
    
    08. 8/8/19 07:36:07     v15.8   (system)
        Intermediate config write during package install for squid3.
    
    07. 8/8/19 07:36:08     v15.8   (system)
        Upgraded Squid configuration during package install.
    
    06. 8/8/19 07:36:15     v15.8   (system)
        Intermediate config write during package install for squidGuard.
    
    05. 8/8/19 07:36:26     v15.8   (system)
        Intermediate config write during package install for suricata.
    
    04. 8/8/19 07:36:38     v15.8   (system)
        Installed suricata package.
    
    03. 8/12/19 16:48:47    v15.8   (system)
        pfBlockerNG: saving Aliases
    
    02. 8/12/19 16:48:49    v15.8   (system)
        pfBlockerNG: saving Firewall rules
    
    01. 8/12/19 16:48:53    v15.8   (system)
        Updated cron job for /usr/local/bin/freshclam --config-file=/usr/local/etc/freshclam.conf
    

    It appears there is some form of update that happens. It configures Cron, (re?)Installs LightSquid, updates pfBlockerNG, (re?)Installs RRD Summary, updates Squid3, updates SquidGuard, updates Suricata, and (re?)Installs suricata. What is happening in the system that all of that would fire off at 7:30 in the morning? It would have to be automated as nobody was in the unit at the time.


  • Netgate Administrator

    Ah, sorry I read that in reverse!

    There's nothing I'm aware of that install packages like that automatically. Only if you restored a config that had them in might it re-install them. pfBlocker updates and other cronjobs can of course be triggered at their allotted times.

    I see you're running an old version, looks like 2.3.5 from the config version. I would suggest upgrading before doing anything else if you're able to.

    Steve



  • @stephenw10 We have a bunch of these out in the wild and so updates don't make it to all in a timely fashion. The last one this happened on was 2.4.4-p1. This one is 2.3.4-p1 but I'd like to wait until I figure this out to update it so that nothing important to the diagnosis gets lost.


  • Netgate Administrator

    The only possible way I could see that happening (other than someone actually manually doing that) would be if the box rebooted at that point and it somehow pulled in a new config via ECL which had those packages in it.

    Do you have system logs going back to that?

    Steve



  • @stephenw10 Is there a way to view the logs from the command line or to export the existing logs to a text file?


  • Netgate Administrator

    You can get the complete system log by running: clog /var/log/system.log > /tmp/systemlog.txt then downloading /tmp/systemlog.txt.
    You can do both those things from Diag > Command Prompt in the webgui.

    The default log size is 500k though, it may no longer contain that data.

    Steve



  • @stephenw10 Interesting. The logs show that the filesystem was full! But right now it shows:

    Filesystem                     Size    Used   Avail Capacity  Mounted on
    /dev/ufsid/57758f7e2e26ac75     50G     42G    4.6G    90%    /
    
    

    I think we're on to something. Is there a way to have the units send an email if the disk gets over a certain percentage used?


  • Netgate Administrator

    Not that I'm aware of. I thik Zabbix can do it if you had that configured.

    What's using 42GB there though?

    Usually that only happens if some package is not behaving, usually failing to rotate it's logs. I would guess it's Suricata.

    Steve



  • I see Squid listed as an installed/re-installed package in the first post. The Squid cache can eat up a lot of disk space. The Suricata package did have an issue with log rotation not working properly, but that was fixed a few versions back. So unless you have a really old version of Suricata, its log rotation problem should not be impacting you now.



  • @stephenw10

     16G    ./suricata_igb128922
     15G    ./suricata_igb252880
     31G    .
    
    

    igb128922 is an http.log files that goes back to 7/13/2017.
    igb252880 is http.log and alerts.log that goes back only to 5/1/2019.



  • @bmeeks Yeah, I got bit by the stats log not rotating on some other units. Not sure if this is the same. The Squid cache is 8.7GB as well.



  • The Suricata log rotation fix was this year I believe, but I don't remember precisely when. One aggravation with the Suricata binary is that not all of the log output options have built-in size limits. So I have to use a separate outside process to try and trim the files. What was happening is that Suricata would continue to write to the old file handle and thus the old file would continue to grow even though a new one was started.


  • Netgate Administrator

    That probably never made a 2.3.X package though....

    Make sure you set a total log directory size limit as well as the individual log sizes.

    Steve



  • Yeah, that fix was definitely for pfSense-2.4.x.



  • I've updated to 2.4.4-p3 but the available packages page is blank:

    d6580b3e-564b-426e-b83c-086cdb80a730-image.png


  • Netgate Administrator

    Try running pkg update see what error it gives.

    Steve



  • @stephenw10

    Updating pfSense-core repository catalogue...
    pkg: Repository pfSense-core load error: access repo file(/var/db/pkg/repo-pfSense-core.sqlite) failed: No such file or directory
    pkg: https://pkg.pfsense.org/pfSense_v2_4_4_amd64-core/meta.txz: Network is unreachable
    repository pfSense-core has no meta file, using default settings
    pkg: https://pkg.pfsense.org/pfSense_v2_4_4_amd64-core/packagesite.txz: Network is unreachable
    Unable to update repository pfSense-core
    Updating pfSense repository catalogue...
    pkg: Repository pfSense load error: access repo file(/var/db/pkg/repo-pfSense.sqlite) failed: No such file or directory
    pkg: https://pkg.pfsense.org/pfSense_v2_4_4_amd64-pfSense_v2_4_4/meta.txz: Network is unreachable
    repository pfSense has no meta file, using default settings
    pkg: https://pkg.pfsense.org/pfSense_v2_4_4_amd64-pfSense_v2_4_4/packagesite.txz: Network is unreachable
    Unable to update repository pfSense
    Error updating repositories!
    
    

    EDIT: WAN is up. I'm connecting out through the unit right now.


  • Netgate Administrator

    Can the firewall itself connect out? Ping? DNS lookup?

    If you're policy routing it may have lost it's default route coming from 2.3.X.

    Make sure you have a default gateway set in System > Routing > Gateways rather than set to 'automatic' there.

    Steve



  • Using the force command fixed it, though.

    /root: pkg update -f
    Updating pfSense-core repository catalogue...
    pkg: Repository pfSense-core load error: access repo file(/var/db/pkg/repo-pfSense-core.sqlite) failed: No such file or directory
    Fetching meta.txz: 100%    944 B   0.9kB/s    00:01
    Fetching packagesite.txz: 100%    2 KiB   1.7kB/s    00:01
    Processing entries: 100%
    pfSense-core repository update completed. 7 packages processed.
    Updating pfSense repository catalogue...
    Fetching meta.txz: 100%    944 B   0.9kB/s    00:01
    Fetching packagesite.txz: 100%  138 KiB 140.9kB/s    00:01
    Processing entries: 100%
    pfSense repository update completed. 518 packages processed.
    All repositories are up to date.
    
    

  • Netgate Administrator

    Ah, interesting. And it sees packages now?



  • Yes. So far so good. I don't know why it ran out of space while in service but after taking it offline suddenly there is 4.6GB free. Certainly need to clear it out but where did that other space run off to? I wonder...



  • I've also found an oddity in /var/squid/logs:

    /var/squid/logs: ls -lah | grep access
    -rw-r-----  1 squid  proxy     0B Aug 13 00:00 access.log
    -rw-r-----  1 squid  proxy   6.1G Jun  8 05:14 access.log.0
    -rw-r-----  1 squid  proxy    52M May  5 00:00 access.log.1
    -rw-r-----  1 squid  proxy     0B Apr 25 00:00 access.log.10
    -rw-r-----  1 squid  proxy     0B Apr 24 00:00 access.log.11
    -rw-r-----  1 squid  proxy     0B Apr 23 00:00 access.log.12
    -rw-r-----  1 squid  proxy     0B Apr 22 00:00 access.log.13
    -rw-r-----  1 squid  proxy     0B Apr 21 00:00 access.log.14
    -rw-r-----  1 squid  proxy     0B Apr 20 00:00 access.log.15
    -rw-r-----  1 squid  proxy     0B Apr 19 00:00 access.log.16
    -rw-r-----  1 squid  proxy     0B Apr 18 00:00 access.log.17
    -rw-r-----  1 squid  proxy     0B Apr 17 00:00 access.log.18
    -rw-r-----  1 squid  proxy     0B Apr 16 00:00 access.log.19
    -rw-r-----  1 squid  proxy    56M May  4 00:00 access.log.2
    -rw-r-----  1 squid  proxy     0B Apr 15 00:00 access.log.20
    -rw-r-----  1 squid  proxy     0B Apr 14 00:00 access.log.21
    -rw-r-----  1 squid  proxy     0B Apr 13 00:00 access.log.22
    -rw-r-----  1 squid  proxy     0B Apr 12 00:00 access.log.23
    -rw-r-----  1 squid  proxy     0B Apr 11 00:00 access.log.24
    -rw-r-----  1 squid  proxy     0B Apr 10 00:00 access.log.25
    -rw-r-----  1 squid  proxy     0B Apr  9 00:00 access.log.26
    -rw-r-----  1 squid  proxy     0B Apr  8 00:00 access.log.27
    -rw-r-----  1 squid  proxy     0B Apr  7 00:00 access.log.28
    -rw-r-----  1 squid  proxy     0B Apr  6 00:00 access.log.29
    -rw-r-----  1 squid  proxy    35M May  3 00:00 access.log.3
    -rw-r-----  1 squid  proxy   9.6M May  2 00:00 access.log.4
    -rw-r-----  1 squid  proxy   1.9M May  1 00:00 access.log.5
    -rw-r-----  1 squid  proxy     0B Apr 29 00:00 access.log.6
    -rw-r-----  1 squid  proxy     0B Apr 28 00:00 access.log.7
    -rw-r-----  1 squid  proxy     0B Apr 27 00:00 access.log.8
    -rw-r-----  1 squid  proxy     0B Apr 26 00:00 access.log.9
    
    

    with access.log.0 being 6.1GB in size, would that indicate that it isn't rotating?



  • @Stewart said in Corrupt Configs:

    I've also found an oddity in /var/squid/logs:

    /var/squid/logs: ls -lah | grep access
    -rw-r-----  1 squid  proxy     0B Aug 13 00:00 access.log
    -rw-r-----  1 squid  proxy   6.1G Jun  8 05:14 access.log.0
    -rw-r-----  1 squid  proxy    52M May  5 00:00 access.log.1
    -rw-r-----  1 squid  proxy     0B Apr 25 00:00 access.log.10
    -rw-r-----  1 squid  proxy     0B Apr 24 00:00 access.log.11
    -rw-r-----  1 squid  proxy     0B Apr 23 00:00 access.log.12
    -rw-r-----  1 squid  proxy     0B Apr 22 00:00 access.log.13
    -rw-r-----  1 squid  proxy     0B Apr 21 00:00 access.log.14
    -rw-r-----  1 squid  proxy     0B Apr 20 00:00 access.log.15
    -rw-r-----  1 squid  proxy     0B Apr 19 00:00 access.log.16
    -rw-r-----  1 squid  proxy     0B Apr 18 00:00 access.log.17
    -rw-r-----  1 squid  proxy     0B Apr 17 00:00 access.log.18
    -rw-r-----  1 squid  proxy     0B Apr 16 00:00 access.log.19
    -rw-r-----  1 squid  proxy    56M May  4 00:00 access.log.2
    -rw-r-----  1 squid  proxy     0B Apr 15 00:00 access.log.20
    -rw-r-----  1 squid  proxy     0B Apr 14 00:00 access.log.21
    -rw-r-----  1 squid  proxy     0B Apr 13 00:00 access.log.22
    -rw-r-----  1 squid  proxy     0B Apr 12 00:00 access.log.23
    -rw-r-----  1 squid  proxy     0B Apr 11 00:00 access.log.24
    -rw-r-----  1 squid  proxy     0B Apr 10 00:00 access.log.25
    -rw-r-----  1 squid  proxy     0B Apr  9 00:00 access.log.26
    -rw-r-----  1 squid  proxy     0B Apr  8 00:00 access.log.27
    -rw-r-----  1 squid  proxy     0B Apr  7 00:00 access.log.28
    -rw-r-----  1 squid  proxy     0B Apr  6 00:00 access.log.29
    -rw-r-----  1 squid  proxy    35M May  3 00:00 access.log.3
    -rw-r-----  1 squid  proxy   9.6M May  2 00:00 access.log.4
    -rw-r-----  1 squid  proxy   1.9M May  1 00:00 access.log.5
    -rw-r-----  1 squid  proxy     0B Apr 29 00:00 access.log.6
    -rw-r-----  1 squid  proxy     0B Apr 28 00:00 access.log.7
    -rw-r-----  1 squid  proxy     0B Apr 27 00:00 access.log.8
    -rw-r-----  1 squid  proxy     0B Apr 26 00:00 access.log.9
    
    

    with access.log.0 being 6.1GB in size, would that indicate that it isn't rotating?

    It would look that way to me at first glance.


Log in to reply