pfSense blocks websites after some hours of uptime
-
I have a SOHO installation where pfsense begins blocking access to websites, but after a reboot works fine for several hours. After much research, I am still unable to make a dent in the problem and need some expert pointers on what diagnostic steps to take to isolate the root cause.
The box originally ran version 1.2.x, which had a similar problem. I upgraded to 2.3.2-RELEASE-p1 and reinstalled the packages pfBlockerNG, Snort and a few utilities. The problem recurred, and after reading that files could be left over from the original packages, I did a clean install from CD. This did not help. After that I moved the box from its position between the modem and the wireless router to between the router and a PC so I could work on it without bothering others (still double NAT). Any complications from moving it should be immediate, not delayed. The problem did recur: it works fine for half a day, then starts blocking.
Observations when the problem is happening:
The version status line in the dashboard is unable to update.
The Package Manager could not retrieve the list of installed packages.
I cannot ping an external website, but the DNS lookup obtained the correct IP address.
I can ping Google's DNS IP.
Restarting the DNS Forwarder does not help.
Rebooting the upstream wireless router does not help.
I can ping the wireless router.
The Snort Blocked list is empty.
Turning off pfBlockerNG does not help.
Rebooting the pfsense box always clears the problem.
Any advice on how to track this down?
-
Any time there are weird connection issues, I usually recommend temporarily disabling ALL IDS/IPS packages like Snort, Suricata, pfBlockerNG.
Is it all Internet access in general that slows down, or only HTTP/S?
-
It seems to affect all websites attempted thru a browser and also pf trying to retrieve version status.
The DNS requests seem to go through alright, as evident in the DNS resolver log. (odd because the Resolver is disabled and Forwarder enabled. They must write to the same log?)
I am going to have to do more testing, which is rather slow given the delay.
-
Use the upstream wireless gear's diagnostics to see if the problem is there. Can it ping out to sites you can't from behind pfSense? When the problem is happening, do a packet capture on WAN and see if you're getting any replies. Can you ping problem sites via pfSense Diagnostics - Ping? Can you resolve new FQDNs (eg somewhere you haven't been lately or ever so their address isn't cached by your DNS)? Can you browse anywhere based on IP address instead of FQDN? Are you running squid proxy?
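If it helps, the name-versus-address questions above can be scripted so the same checks run every time the problem shows up. This is only a minimal Python sketch: `check_host` and its parameters are made-up names, port 443 is an assumption, and a TCP connect stands in for ping because ICMP needs raw sockets.

```python
import socket


def check_host(name, port=443, timeout=3.0,
               resolve=socket.gethostbyname,
               connect=socket.create_connection):
    """Separate "DNS works" from "the address is reachable".

    resolve and connect are injectable so the logic can be exercised
    without a live network; by default they use the real socket calls.
    """
    result = {"name": name, "ip": None, "dns_ok": False, "tcp_ok": False}

    # Step 1: name resolution only.
    try:
        result["ip"] = resolve(name)
        result["dns_ok"] = True
    except OSError:
        return result  # no address, so there is nothing to connect to

    # Step 2: reachability of the resolved address (TCP, not ICMP).
    try:
        conn = connect((result["ip"], port), timeout)
        conn.close()
        result["tcp_ok"] = True
    except OSError:
        pass  # refused, unreachable, or timed out

    return result
```

Run from a LAN client during an outage, a result with `dns_ok` true and `tcp_ok` false would match the symptom described in the first post (names resolve, but the sites cannot be reached).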
-
Well, wouldn't you know, as soon as I get a thread going, pfsense stays up for 5 days with no problem. There is a background level of Snort being overzealous about blocking, but the original problem has not recurred. I can answer one question:
I am not running any proxy. Will report back on the other questions as soon as I get a malfunction.
-
Finally, I have another malfunction, and another clue. First let me answer the questions that were outstanding from before.
1. Can you ping from upstream of pfsense? Did not try this, but since reboot instantly fixed it, would assume that would work.
2. Do packet capture on WAN. Any replies? Looks like the WAN port (192.168.1.46) and the modem (192.168.1.1) had interactions, but a normal capture shows much more variety. There are no external IPs in this capture.
No. Time Source Destination Protocol Length Info
1 0.000000 192.168.1.46 192.168.1.1 ICMP 60 Echo (ping) request id=0x7829, seq=258/513, ttl=64 (reply in 2)
2 0.000422 192.168.1.1 192.168.1.46 ICMP 60 Echo (ping) reply id=0x7829, seq=258/513, ttl=64 (request in 1)
3 0.296613 WestellT_72:4b:f0 Broadcast ARP 60 Who has 192.168.1.119? Tell 192.168.1.1
4 0.502001 192.168.1.46 192.168.1.1 ICMP 60 Echo (ping) request id=0x7829, seq=259/769, ttl=64 (reply in 5)
5 0.502430 192.168.1.1 192.168.1.46 ICMP 60 Echo (ping) reply id=0x7829, seq=259/769, ttl=64 (request in 4)
6 1.004017 192.168.1.46 192.168.1.1 ICMP 60 Echo (ping) request id=0x7829, seq=260/1025, ttl=64 (reply in 7)
7 1.004443 192.168.1.1 192.168.1.46 ICMP 60 Echo (ping) reply id=0x7829, seq=260/1025, ttl=64 (request in 6)
8 1.316873 192.168.1.46 192.168.1.1 DNS 85 Standard query 0x0eed A 12-courier.push.apple.com
9 1.346712 192.168.1.1 192.168.1.46 DNS 312 Standard query response 0x0eed A 12-courier.push.apple.com CNAME
10 1.496623 WestellT_72:4b:f0 Broadcast ARP 60 Who has 192.168.1.120? Tell 192.168.1.1
11 1.506012 192.168.1.46 192.168.1.1 ICMP 60 Echo (ping) request id=0x7829, seq=261/1281, ttl=64 (reply in 12)
12 1.506411 192.168.1.1 192.168.1.46 ICMP 60 Echo (ping) reply id=0x7829, seq=261/1281, ttl=64 (request in 11)
13 2.008015 192.168.1.46 192.168.1.1 ICMP 60 Echo (ping) request id=0x7829, seq=262/1537, ttl=64 (reply in 14)
14 2.008441 192.168.1.1 192.168.1.46 ICMP 60 Echo (ping) reply id=0x7829, seq=262/1537, ttl=64 (request in 13)
15 2.133592 192.168.1.46 192.168.1.1 DNS 81 Standard query 0xcc34 A alt4-mtalk.google.com
16 2.163621 192.168.1.1 192.168.1.46 DNS 132 Standard query response 0xcc34 A alt4-mtalk.google.com CNAME
17 2.510013 192.168.1.46 192.168.1.1 ICMP 60 Echo (ping) request id=0x7829, seq=263/1793, ttl=64 (reply in 18)
18 2.510466 192.168.1.1 192.168.1.46 ICMP 60 Echo (ping) reply id=0x7829, seq=263/1793, ttl=64 (request in 17)
19 2.696604 WestellT_72:4b:f0 Broadcast ARP 60 Who has 192.168.1.121? Tell 192.168.1.1
20 3.011004 192.168.1.46 192.168.1.1 ICMP 60 Echo (ping) request id=0x7829, seq=264/2049, ttl=64 (reply in 21)
21 3.011432 192.168.1.1 192.168.1.46 ICMP 60 Echo (ping) reply id=0x7829, seq=264/2049, ttl=64 (request in 20)
22 3.513002 192.168.1.46 192.168.1.1 ICMP 60 Echo (ping) request id=0x7829, seq=265/2305, ttl=64 (reply in 23)
23 3.513428 192.168.1.1 192.168.1.46 ICMP 60 Echo (ping) reply id=0x7829, seq=265/2305, ttl=64 (request in 22)
24 3.896641 WestellT_72:4b:f0 Broadcast ARP 60 Who has 192.168.1.122? Tell 192.168.1.1
25 4.014643 192.168.1.46 192.168.1.1 ICMP 60 Echo (ping) request id=0x7829, seq=266/2561, ttl=64 (reply in 26)
26 4.015057 192.168.1.1 192.168.1.46 ICMP 60 Echo (ping) reply id=0x7829, seq=266/2561, ttl=64 (request in 25)
27 4.537490 192.168.1.46 192.168.1.1 ICMP 60 Echo (ping) request id=0x7829, seq=267/2817, ttl=64 (reply in 28)
28 4.537913 192.168.1.1 192.168.1.46 ICMP 60 Echo (ping) reply id=0x7829, seq=267/2817, ttl=64 (request in 27)
29 5.038242 192.168.1.46 192.168.1.1 ICMP 60 Echo (ping) request id=0x7829, seq=268/3073, ttl=64 (reply in 30)
30 5.038669 192.168.1.1 192.168.1.46 ICMP 60 Echo (ping) reply id=0x7829, seq=268/3073, ttl=64 (request in 29)
31 5.096618 WestellT_72:4b:f0 Broadcast ARP 60 Who has 192.168.1.123? Tell 192.168.1.1
32 5.540002 192.168.1.46 192.168.1.1 ICMP 60 Echo (ping) request id=0x7829, seq=269/3329, ttl=64 (reply in 33)
33 5.540433 192.168.1.1 192.168.1.46 ICMP 60 Echo (ping) reply id=0x7829, seq=269/3329, ttl=64 (request in 32)
34 6.042002 192.168.1.46 192.168.1.1 ICMP 60 Echo (ping) request id=0x7829, seq=270/3585, ttl=64 (reply in 35)
35 6.042439 192.168.1.1 192.168.1.46 ICMP 60 Echo (ping) reply id=0x7829, seq=270/3585, ttl=64 (request in 34)
36 6.100124 192.168.1.1 192.168.1.46 ICMP 60 Echo (ping) request id=0x5c90, seq=0/0, ttl=64 (no response found!)
37 6.543999 192.168.1.46 192.168.1.1 ICMP 60 Echo (ping) request id=0x7829, seq=271/3841, ttl=64 (reply in 38)
38 6.544426 192.168.1.1 192.168.1.46 ICMP 60 Echo (ping) reply id=0x7829, seq=271/3841, ttl=64 (request in 37)
39 7.046003 192.168.1.46 192.168.1.1 ICMP 60 Echo (ping) request id=0x7829, seq=272/4097, ttl=64 (reply in 40)
40 7.046420 192.168.1.1 192.168.1.46 ICMP 60 Echo (ping) reply id=0x7829, seq=272/4097, ttl=64 (request in 39)
41 7.100756 WestellT_72:4b:f0 Broadcast ARP 60 Who has 192.168.1.124? Tell 192.168.1.1
42 7.548005 192.168.1.46 192.168.1.1 ICMP 60 Echo (ping) request id=0x7829, seq=273/4353, ttl=64 (reply in 43)
43 7.548429 192.168.1.1 192.168.1.46 ICMP 60 Echo (ping) reply id=0x7829, seq=273/4353, ttl=64 (request in 42)
44 8.050011 192.168.1.46 192.168.1.1 ICMP 60 Echo (ping) request id=0x7829, seq=274/4609, ttl=64 (reply in 45)
45 8.050432 192.168.1.1 192.168.1.46 ICMP 60 Echo (ping) reply id=0x7829, seq=274/4609, ttl=64 (request in 44)
46 8.300646 WestellT_72:4b:f0 Broadcast ARP 60 Who has 192.168.1.125? Tell 192.168.1.1
47 8.552000 192.168.1.46 192.168.1.1 ICMP 60 Echo (ping) request id=0x7829, seq=275/4865, ttl=64 (reply in 48)
48 8.552398 192.168.1.1 192.168.1.46 ICMP 60 Echo (ping) reply id=0x7829, seq=275/4865, ttl=64 (request in 47)
49 9.053994 192.168.1.46 192.168.1.1 ICMP 60 Echo (ping) request id=0x7829, seq=276/5121, ttl=64 (reply in 50)
50 9.054485 192.168.1.1 192.168.1.46 ICMP 60 Echo (ping) reply id=0x7829, seq=276/5121, ttl=64 (request in 49)
3. Try a new FQDN (not cached), any difference? No response to ping. However, the DNS lookup through the ISP appeared to be successful; there was no response from Google DNS.
4. Can you browse by IP address? No.
5. Running Squid proxy? No, but the package is installed now. Did not seem to matter.
The new clue is that this condition arose immediately after a 5 second power outage. The pfsense box is on a UPS, but the modem and router are not. As before, a reboot instantly fixed the problem. Could interrupted transactions on the WAN have put it into an unusual state?
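The "no external IPs" observation in item 2 can be checked mechanically instead of by eye. A minimal Python sketch, assuming the capture is exported as a whitespace-separated packet summary like the one pasted above; `external_addresses` is a hypothetical helper name.

```python
import ipaddress


def external_addresses(capture_text):
    """Return the sorted non-private, non-multicast IPs seen as source or
    destination in a packet-summary export (No., Time, Source, Destination, ...).
    """
    found = set()
    for line in capture_text.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a numbered packet line.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        for token in fields[2:4]:  # Source and Destination columns
            try:
                addr = ipaddress.ip_address(token)
            except ValueError:
                continue  # MAC-style names, "Broadcast", etc.
            if not addr.is_private and not addr.is_multicast:
                found.add(str(addr))
    return sorted(found)
```

An empty list for a capture taken during an outage would suggest that nothing from the pfSense WAN side is being sent toward the Internet at all, only to the local upstream device.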
-
It happened again: all websites inaccessible for no apparent reason. This time people were watching and there was definitely NO power glitch involved. The only solution in sight is to train people to reset the box.
-
I think the best troubleshooting is what KOM suggested, or at least that is where I would start. Turn off all packages. Leave pfsense running for 7 days, then re-enable one package, like Snort. Reboot and leave it for 7 days. Repeat until you find the problem child.
-
I had the same issue - I found some Apple and Google IPs in Snort's block list and suppressed them.
-
Try to turn off the SNORT service for a while (if not already tried). Just to verify IF it could have some relationship. Might not be the answer, but it's more to eliminate potential speedbumps…
Since PING does not seem to give the wanted results (and I expect TRACEROUTE won't be any better), could there be some kind of ICMP blocking on the modem? It does not seem very likely, but check it out if possible (since the traffic goes from WAN to the modem but there is no response from beyond the modem).
Knottolf
-
Thanks all for the suggestions. It's been 11 days without a glitch… About time for another outage. I'm going to try the package removal idea on the theory that it's some interaction or incompatibility triggered by some rare event, but first I'm going to install the 2.3.3 upgrade and see if that cleans it.
For the record, I did turn off snort, pfBlockerNG and clear the blocked list during one of the episodes to no effect, but maybe that wouldn't be guaranteed to revert things.
Interesting idea about the modem.
-
Problem solved!
Uptime is over 30 days with no issues, so I'm going to declare success. The root cause seems to have been a misconfiguration that was probably left over from the initial setup a couple of software versions back. I discovered that the LAN interface was set as the default gateway. The clue was that the gateway status showed Online for WAN_DHCP but always Pending for LANGW, which made me dig into why. I changed the default gateway to WAN_DHCP and removed LANGW, and there have been no more problems. Wonder why it worked most of the time instead of not at all…
-
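For anyone landing here with the same symptom, the gateway mix-up described above can be spotted from the routing table. A rough Python sketch, assuming `netstat -rn` output in the usual FreeBSD column layout (Destination, Gateway, Flags, Netif); the function names are hypothetical, and the WAN subnet passed in is something you have to supply yourself.

```python
import ipaddress


def default_route(netstat_output):
    """Return (gateway, interface) for the first 'default' route, or None.

    Assumes columns: Destination Gateway Flags Netif [Expire].
    """
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0] == "default":
            return fields[1], fields[3]
    return None


def gateway_on_wan(netstat_output, wan_cidr):
    """True only if the default gateway is an address inside the WAN subnet."""
    route = default_route(netstat_output)
    if route is None:
        return False  # no default route at all
    gateway, _interface = route
    try:
        gw_addr = ipaddress.ip_address(gateway)
    except ValueError:
        return False  # e.g. a link#N entry rather than an IP address
    return gw_addr in ipaddress.ip_network(wan_cidr, strict=False)
```

With the WAN address from the capture earlier in the thread (192.168.1.46) and an assumed /24 mask, `gateway_on_wan(output, "192.168.1.0/24")` returning false would point at a default gateway sitting on the wrong interface.
-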
I had some similar issues as well. It turns out I had set three things:
I enabled SNORT as the IDS.
I had checked the option for SNORT to automatically block systems.
The SNORT IDS automatically blocked some web pages that had been flagged by innocuous http inspect errors (BYTE BLOCK etc.).
Once I suppressed the false http inspect flags, I reset (cleared) all the blocked sites and, poof, I could get to where I had been unable to previously.
~Zackis