[SOLVED] pfBlockerNG sync and (occasional) LAN subnet blocks
Been using pfSense for a couple of years now pretty much without issue in a multi-WAN single LAN config. I had some initial trauma a year ago when I tried to set up a BT Vision multicast box, but once I had my head around how the IGMP proxy works, I nailed that and it's been pretty solid ever since!
Recently (15th Nov) upgraded from 2.2.3 to 2.2.5 successfully (it seemed).
Friday last week I added the firewall widget to the home screen and set it to monitor "LAN only" after noticing an occasional inability to connect to the router (browse or ssh) and a lack of connectivity to the outside world. Anything on the local subnet that does not traverse pfSense is fine - I have a bunch of servers etc. that I can connect to (via RDP/ssh/http/https etc.) without issue when this problem manifests itself.
When the problem occurs, I can see a whole bunch of LAN block messages on the firewall home page widget. So I naturally looked into the log files to see what I could see.
filter.log had rotated completely so it did not record the start of the "event". It just shows thousands of block messages from a LAN IP to a WAN or router IP against the default LAN deny rule (1000000103). Interestingly (or not!) these block messages (and those of pfBlockerNG) do not have their friendly descriptions listed in the GUI version of the firewall log.
I have none of the "Log Firewall Default Blocks" check boxes checked on the Status>System Logs>Settings tab.
Whilst I am in a multi-WAN configuration, I am directing all traffic from a host via one WAN gateway or the other under normal circumstances. In other words a single host has affinity to a single WAN gateway. In the event of a WAN failure, all traffic from an affected host is directed at the remaining gateway.
system.log showed that a pfBlockerNG sync had kicked off immediately prior to the "event". Every time the event has occurred, a pfBlockerNG sync has just kicked off.
However pfBlockerNG sync kicks off regularly and there is no 1:1 correlation between a sync and the loss of router/WAN connectivity - it is only occasional. The last occurrence was at around 7am this morning - I can't say for sure as filter.log had rotated again by the time I checked. But at 7am system.log shows that pfBlockerNG sync kicked off.
The "event" seems to last for approximately 15 minutes, after which normal connectivity resumes. I don't have to change/reset/reboot anything, it just starts to work again, always after around 15 minutes.
I increased the log file size to 10MB per log yesterday, but this was still insufficient to prevent filter.log rotating. I've just (as in ten minutes ago) increased the log file size to 100MB and will wait for the problem to appear again. Like an idiot, I hit the "reset log files" button to resize them and have consequently nothing to show at this point! :-(
Other than the firewall/system logs, what else do you need me to provide to be able to help? Of course sod's law is that I'll now not see another re-occurrence for days!
My thoughts so far are that the cron job might be somehow negating the default allow any rule (I have an IPv4 allow any:any catchall at the bottom of my LAN filter list). This could be memory related, though the system widget currently shows 25% of 2GB used, 9% mbuf and generally low CPU use (hovers between 2% and 10%). Swap usage is at 1%.
If I can't catch the footprint with 100MB log files, I think I'll have to resort to uninstalling pfBlockerNG and seeing if things remain stable without it for a week.
Anyone any idea what might be wrong?
UPDATE Happened again between exactly 3pm and exactly 3:15pm as indicated by the screenshots (last three) of the logs. pfBlockerNG sync definitely has a bearing. The logs now capture the output.
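For anyone who wants to line up the start and end of the block storm against the cron times, here is a rough Python sketch that counts default-deny blocks per minute. It assumes the log has already been dumped to plain text and that each filterlog entry is comma-separated with the tracker ID in the fourth field and the action in the seventh; that layout is my assumption and should be checked against your own log. The sample lines are made up for illustration, and the `1770000001` and `100` tracker IDs are hypothetical.

```python
from collections import Counter

DEFAULT_DENY_TRACKER = "1000000103"  # tracker ID quoted earlier in this thread


def blocks_per_minute(lines, tracker=DEFAULT_DENY_TRACKER):
    """Count block entries for one tracker ID, keyed by 'Mon D HH:MM'."""
    counts = Counter()
    for line in lines:
        head, sep, payload = line.partition("filterlog: ")
        if not sep:
            continue  # not a filterlog entry
        fields = payload.split(",")
        # Assumed layout: rule, sub-rule, anchor, tracker, interface, reason, action, ...
        if len(fields) < 7 or fields[3] != tracker or fields[6] != "block":
            continue
        month, day, clock = head.split()[:3]
        counts[f"{month} {day} {clock[:5]}"] += 1
    return counts


# Made-up sample lines in the assumed format, for illustration only.
sample = [
    "Dec  8 14:59:58 pfsense cron[1234]: some other message",
    "Dec  8 15:00:02 pfsense filterlog: 5,16777216,,1000000103,em1,match,block,in,4",
    "Dec  8 15:00:40 pfsense filterlog: 5,16777216,,1000000103,em1,match,block,in,4",
    "Dec  8 15:00:41 pfsense filterlog: 9,16777216,,1770000001,em0,match,block,in,4",
    "Dec  8 15:01:05 pfsense filterlog: 5,16777216,,1000000103,em1,match,block,in,4",
    "Dec  8 15:16:00 pfsense filterlog: 7,16777216,,100,em1,match,pass,in,4",
]

for minute, count in sorted(blocks_per_minute(sample).items()):
    print(minute, count)
```

The first and last minutes with a non-zero count give the edges of the event window, which can then be compared with the sync entries in system.log.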
Can't understand the 15 minute duration though. It is as if pfBlockerNG sync removes the default LAN allow any:any rule for a period of 15 minutes and then auto-magically loads that filter after 15 minutes (perhaps cron reloads the filters after 15 mins or there is a 15 minute timeout for something somewhere). Weird!
Is there something that operates around the quarter hour cycle? Just noticed (you can see it in the crontab screenshot) rc.filter_configure_sync runs every 15 minutes. Surely that's got to be it?!?
I reckon that pfBlocker wipes out (however that translates to filter loading terms?!?) the default LAN allow any rule - and possibly others, no way of knowing as I can't access the firewall to check or save the state of the rules when it happens - then 15 minutes later, filter_configure_sync comes along and reloads the defined rules: restoring connectivity.
Why does it happen only sometimes? Perhaps it's because the regular quarter-hour run of filter_configure_sync occasionally collides with a pfBlockerNG sync that starts at the same moment.
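To make the overlap idea concrete, here is a small Python sketch that expands two cron minute fields and lists the minutes where both jobs would start together. The fields used below are assumptions for illustration (a quarter-hourly job and an hourly job); the real entries are in the crontab screenshot. It only looks at the minute field and assumes both jobs run every hour.

```python
def cron_minutes(field):
    """Expand a cron minute field such as '0', '*/15' or '0,15,30,45' into a set."""
    minutes = set()
    for part in field.split(","):
        if part == "*":
            minutes.update(range(60))
        elif part.startswith("*/"):
            minutes.update(range(0, 60, int(part[2:])))
        else:
            minutes.add(int(part))
    return minutes


def collisions(field_a, field_b):
    """Return the sorted minutes past the hour at which both jobs start."""
    return sorted(cron_minutes(field_a) & cron_minutes(field_b))


# Quarter-hourly job versus a job on the hour: they meet at minute 0.
print(collisions("0,15,30,45", "0"))  # [0]
# Same quarter-hourly job versus a job offset to 5 past: no shared start minute.
print(collisions("0,15,30,45", "5"))  # []
```

An empty list only means the two jobs never start in the same minute; it says nothing about how long either one runs, so a long sync could still overlap a later reload.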
![Screenshot 2015-12-08 11.51.08.png](/public/imported_attachments/1/Screenshot 2015-12-08 11.51.08.png)
![Screenshot 2015-12-08 11.55.13.png](/public/imported_attachments/1/Screenshot 2015-12-08 11.55.13.png)
![Screenshot 2015-12-08 12.34.23.png](/public/imported_attachments/1/Screenshot 2015-12-08 12.34.23.png)
![Screenshot 2015-12-08 13.00.49.png](/public/imported_attachments/1/Screenshot 2015-12-08 13.00.49.png)
![Screenshot 2015-12-08 18.12.02.png](/public/imported_attachments/1/Screenshot 2015-12-08 18.12.02.png)
![Screenshot 2015-12-08 18.12.59.png](/public/imported_attachments/1/Screenshot 2015-12-08 18.12.59.png)
![Screenshot 2015-12-08 18.14.42.png](/public/imported_attachments/1/Screenshot 2015-12-08 18.14.42.png)
Going to disable pfBlockerNG in the short term and give it 24hrs. If I don't perceive this behavior during that period, I'll start digging deeper and see if I can manually offset pfBlocker by 5 minutes to start at 5 minutes past the hour (00:05,01:05,02:05 etc.) or even change rc.filter_configure_sync to run every 5 minutes and see if the event window lasts 5 minutes instead of 15.
When you are looking at the firewall logs, the pfBlockerNG rules begin with "177". None of the blocked/rejected alerts in those screenshots are from pfBlockerNG.
You shouldn't be manually editing the pfBlockerNG cron task. Please see the General tab, which has a setting to alter the cron task for the package.
If you re-enable pfBlockerNG, hit "save", then review your rules, please ensure that the "Order of the Rules" is appropriate for your network.
Are you moving any of the firewall rules after pfBlockerNG runs its cron task?
If you want to see what pfBlockerNG is blocking, please refer to the pfBlockerNG Alerts tab. Alternatively, go to the Firewall log settings, and change "Where to show rule descriptions" to "Display as column".
pfBlockerNG does not alter any of the pfSense filter.log files; all it does is read the log file to populate the Alerts tab details.
What lists are you using? Maybe send a screenshot of the widget?
Been using pfBlocker(NG) for quite some time (did it used to be called ipBlockList or something like that?) so familiar with the config and the GUI - thank you for the pointers though. Just confirms that my config is broadly within spec :-)
I never change rule order. I'm aware of the consequences. I have the rule order set to default in pfBlockerNG config - this has not changed since I first installed it on this box between 2 and 2.5yrs ago. I experimented with pfSense for around 6 months prior to that before taking the plunge and moving off a dual-WAN Draytek 2930Vn whose licensing costs for what they called their "Web Filter" went through the roof!
I totally get that pfBlockerNG rules begin with 177 - as I suggested, I think that there is some kind of timing conflict that results in my default LAN any:any rule being removed or not being loaded. That would explain why all LAN traffic ends up blocked.
I know where to go for fired pfBlockerNG rules :-) I had one of my three webservers hacked (I think from China, though unlike with pfSense I had no way of knowing for sure!) when using my old Draytek router, so I was determined to implement country-blocking for my open WAN ports from the start - I regularly check the alerts tab and I have the widget on the home page too.
I'll take some screenshots of my pfBlockerNG list config and post shortly.
The update this morning is that having disabled pfBlockerNG yesterday shortly after I posted that I might, I have had nearly a day (19hrs) without the problem. Rock solid.
I know I should not be tinkering with crontab - I am tearing my hair out because there is no way to alter the minute/hour components of the pfBlockerNG cron job from the GUI, outside the normal 0,15,30,45 minutes. I just want to test offsetting either/or filter_configure_sync or pfBlockerNG so that there is never a case where they both kick off at exactly the same time. They currently overlap on the hour every hour.
My gut feel is that for whatever reason, the filter_configure_sync function is unable to load my IPv4 LAN any:any rule from the existing config because the pfBlockerNG sync process has reached the assign rules stage and is reading the existing rules from the same config. Is that even possible? I don't mind being wrong :-)
I might also try a full config backup, fresh install and config restore - just to make sure this is not an artifact of a previous version of one of the packages (only have snort, service_watchdog, pfBlockerNG, ntopg and the OpenVPN Export Utility) as I upgraded from 2.2.3 straight to 2.2.5.
To prove a point, I re-enabled pfBlockerNG just before 11am GMT today and within seconds I had the same fault. The workaround is to go to the console and issue a pfctl -d and then a pfctl -e, which immediately brings connectivity back.
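Since I can't get at the GUI while it is happening, it may be worth grabbing the loaded ruleset from the console before bouncing pf, to see whether the LAN allow rule is actually present at that moment. Below is a rough Python sketch of that idea. It assumes a Python 3 interpreter is available on the box (not a given on pfSense) and that it runs as root; `snapshot_then_bounce` is a name I made up. The only commands it calls are `pfctl -sr` (show loaded rules), `pfctl -d` and `pfctl -e`.

```python
import subprocess


def snapshot_then_bounce(run=subprocess.run):
    """Return the loaded ruleset as text, then disable and re-enable pf.

    `run` is injectable so the sequence can be exercised without pfctl present.
    """
    shown = run(["pfctl", "-sr"], capture_output=True, text=True)
    for flag in ("-d", "-e"):  # the two console commands from the workaround
        run(["pfctl", flag], capture_output=True, text=True)
    return shown.stdout


# Example use on the firewall itself (not executed here):
#     rules = snapshot_then_bounce()
#     with open("/tmp/rules_during_event.txt", "w") as fh:
#         fh.write(rules)
```

Bear in mind that between the two pfctl calls the firewall is not filtering at all, so this is a diagnostic aid rather than something to leave running from cron.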
I am reasonably resigned to backup/wipe/restore now….what do you think?
Including some pfBlockerNG config screenshots
![Screenshot 2015-12-09 11.33.43.png](/public/imported_attachments/1/Screenshot 2015-12-09 11.33.43.png)
![Screenshot 2015-12-09 11.34.36.png](/public/imported_attachments/1/Screenshot 2015-12-09 11.34.36.png)
![Screenshot 2015-12-09 11.34.50.png](/public/imported_attachments/1/Screenshot 2015-12-09 11.34.50.png)
![Screenshot 2015-12-09 11.35.13.png](/public/imported_attachments/1/Screenshot 2015-12-09 11.35.13.png)
![Screenshot 2015-12-09 11.38.47.png](/public/imported_attachments/1/Screenshot 2015-12-09 11.38.47.png)
![Screenshot 2015-12-09 11.39.14.png](/public/imported_attachments/1/Screenshot 2015-12-09 11.39.14.png)
![Screenshot 2015-12-09 11.46.00.png](/public/imported_attachments/1/Screenshot 2015-12-09 11.46.00.png)