Manually ran reload and the lights went dim [SOLVED]



  • I have two nearly identical APU2 2.3.3 based systems with mSATA discs - mine and my brother's.  I installed pfBlockerNG on mine some time ago and tested it.  Today I installed it on my brother's system.  I went through the settings side by side and then ran through the update options, waiting for the successful message after each one.  When I ran the last one - reload - the system went unresponsive.

    The web configurator stopped working but after around 30 minutes I was able to SSH in.  The load averages were heading down nicely.  After some poking around I noticed that config.xml was over 5 Mbytes in size and now contained over 6000 rules on the OpenVPN interface - it looks like the two rules there were repeated continuously until something died.
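A quick way to confirm this kind of rule duplication from an SSH session is to count rules per interface. The following is only a sketch: `count_rules_by_interface` is a made-up helper name, and it assumes the usual pretty-printed config.xml layout where `<rule>`, `</rule>` and `<interface>` each sit on their own line. It also counts NAT rules, since they use the same tags.

```shell
#!/bin/sh
# Sketch: count <rule> entries per interface in a pfSense-style config.xml.
# ASSUMPTION: <rule>, </rule> and <interface> each appear on their own line.
# count_rules_by_interface is a hypothetical helper, not a pfSense command.
count_rules_by_interface() {
    awk '
        /<rule>/    { in_rule = 1 }
        /<\/rule>/  { in_rule = 0 }
        in_rule && /<interface>/ {
            name = $0
            sub(/.*<interface>/, "", name)    # strip everything up to the tag
            sub(/<\/interface>.*/, "", name)  # strip the closing tag onwards
            count[name]++
        }
        END { for (n in count) print count[n], n }
    ' "$1" | sort -rn
}
```

Running `count_rules_by_interface /conf/config.xml` should put any interface with thousands of rules at the top of the list, and `ls -l /conf/config.xml` shows the file size alongside.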

    This system is a three-hour drive away from me and the firewall itself is fully functional, so I am loath to thrash around making random changes.  The only changes I have made were restarting php-fpm and attempting to add an easyrule in the mistaken belief that I had locked myself out.

    I have kept a copy of the current config.xml and could revert to the last known good config from the menu, but I'm worried that it might not complete that task.  When I tried to add the easy rule from the PHP shell it did not finish after 20-odd minutes, so I killed it.  I've just taken a copy of everything in /var/log.

    Could someone give me some hints on how to get this thing back in order safely?

    Cheers
    Jon

    [Edit]  To cut a (very) long story short:  I think that having the Service Watchdog "watching" unbound was the root cause of the problem.  I suspect that when pfb was trying to update the firewall rules the watchdog was restarting unbound and the two ended up in a fight.  The end result was 6000-odd rules and a very large config.xml.

    Even after reverting to an old config and hacking out pfb, the web configurator still had to be started manually and all the VLAN interfaces failed to be configured, but I could run /etc/rc.interfaces_opt_configure to get them to work and then unbound would start OK.

    After much head scratching and Googling I attached "truss" to the pid of php-fpm as it spiralled out of control to 100% on a CPU.  I noticed it reading and writing the backup config.xml files.  I killed the process, deleted the files under /cf/conf/backup/, restarted php-fpm and the web configurator, and things started working properly.

    With hindsight, reverting to a known good config and deleting the rubbish config.xml files would probably have been the correct fix from the start.
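For anyone in the same spot, a gentler alternative to deleting everything under /cf/conf/backup/ is to move only the oversized backups out of the way. This is a rough sketch rather than anything pfSense ships: both function names are invented for illustration, the 1024 KB default threshold is arbitrary, and the destination directory should sit outside the backup directory.

```shell
#!/bin/sh
# Sketch: find and quarantine oversized config backups.
# ASSUMPTIONS: backups end in .xml; the default directory is the one named in
# the post; the 1024 KB default is arbitrary. Both helpers are hypothetical.

# Print backups under $1 that are larger than $2 kilobytes.
list_large_backups() {
    dir=${1:-/cf/conf/backup}
    min_kb=${2:-1024}
    find "$dir" -type f -name '*.xml' -size +"${min_kb}"k
}

# Move oversized backups from $1 into $2 instead of deleting them.
# $2 must not be inside $1, or find may pick the moved files up again.
quarantine_large_backups() {
    dir=$1
    dest=$2
    min_kb=${3:-1024}
    mkdir -p "$dest" || return 1
    list_large_backups "$dir" "$min_kb" | while IFS= read -r f; do
        mv "$f" "$dest"/
    done
}
```

For example, `quarantine_large_backups /cf/conf/backup /root/bad-backups` would leave the normal-sized history in place. As described above, php-fpm and the web configurator still needed restarting afterwards.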

    OpenVPN and ssh saved me a four hour round trip and a reinstall 8)


  • Moderator

    ps: Don't add Unbound to Service Watchdog when using DNSBL … :)

    It would be a good idea for the devs to exclude Unbound when DNSBL is used... Same goes for Snort/Suricata...
    When these packages are updating, the watchdog thinks the service is down and restarts it midstream...


  • Moderator

    A quick way to restore a config file from the shell:

    cp /conf/backup/config.xml.backup /conf/config.xml
    rm /tmp/config.cache
    

    Then go to the dashboard and it will reload the config from the backup…
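The same restore can be wrapped with a couple of safety checks, which matters when the box is hours away. This is a minimal sketch, not an official procedure: `restore_config` and the `.pre-restore` suffix are made up for illustration, and the default paths are the ones used in the commands above.

```shell
#!/bin/sh
# Sketch: restore a backup over the live config, keeping the current one.
# ASSUMPTIONS: restore_config and the .pre-restore suffix are hypothetical;
# default paths follow the commands quoted in this thread.
restore_config() {
    backup=$1
    live=${2:-/conf/config.xml}
    cache=${3:-/tmp/config.cache}

    # Refuse to restore from a missing or empty file.
    [ -s "$backup" ] || { echo "backup missing or empty: $backup" >&2; return 1; }

    # Keep the current (possibly broken) config for later inspection.
    if [ -f "$live" ]; then
        cp -p "$live" "$live.pre-restore" || return 1
    fi

    cp "$backup" "$live" || return 1

    # Drop the cached copy so the restored file is re-read.
    rm -f "$cache"
}
```

Call it with the path of the backup you want to restore as the first argument; the broken config stays next to the live one as config.xml.pre-restore so it can still be examined later.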



  • @BBcan177:

    ps: Don't add Unbound to Service Watchdog when using DNSBL … :)

    It would be a good idea for the devs to exclude Unbound when DNSBL is used... Same goes for Snort/Suricata...
    When these packages are updating, the watchdog thinks the service is down and restarts it midstream...

    So I discovered.  Unfortunately I had to find this out for myself, armed only with a black belt in Linux sysadmin - I only speak BSD with a really strong accent and a limited vocabulary.  On the bright side I now know a lot more about how pfSense is put together.

    Could pfBlockerNG do a test for the existence of the service watchdog package when the DNSBL is enabled and issue a warning?
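Until something like that exists, a manual check from the shell is possible. The sketch below is heavily hedged: it assumes the watchdog keeps its watched services inside a `<servicewatchdog>` section of config.xml with one `<name>` element per service, which should be verified against a real config before relying on it. `watchdog_watches` is a made-up helper name.

```shell
#!/bin/sh
# Sketch: report whether Service Watchdog appears to be watching a service.
# ASSUMPTION: watched services live inside <servicewatchdog>...</servicewatchdog>
# with a <name> element per service; verify against your own config.xml.
# The match is a substring test, so "unbound" would also match "unbound2".
watchdog_watches() {
    service=$1
    conf=${2:-/conf/config.xml}
    awk -v svc="$service" '
        /<servicewatchdog>/   { in_wd = 1 }
        /<\/servicewatchdog>/ { in_wd = 0 }
        in_wd && /<name>/ && index($0, svc) { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$conf"
}
```

Something like `watchdog_watches unbound && echo "Service Watchdog is watching unbound"` would then flag the risky combination before DNSBL is enabled.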