Periodically since 2.2: pages load blank, certs invalid

Trel

This has now happened three times

The symptoms I can see are

1. HTTP webpages load blank
2. HTTPS webpages give a security error
3. Accessing pages by IP works
4. Any IP based connection works
5. Tracert appears valid

When this happens, either of the following fixes it:
1. Releasing and renewing the WAN IP
2. Rebooting pfSense

Additionally, while this is occurring
1. I CAN access the firewall's GUI internally (correct behavior)
2. I CAN access the firewall's GUI externally (correct behavior)

This started happening after I upgraded to 2.2 on Saturday morning.
No new rules have been created, no new firewall log entries show up when this happens, and there is nothing unusual in any of the log tabs.

This is a physical box with PFSense installed directly.
I just rebooted it remotely, to get everything back up and working.

Does anyone know what's going on here, or where I can look for more info?

EDIT: I should add, that the packages I have installed are
1. arping
2. Cron
3. File Manager
4. Notes
5. OpenVPN Client Export Utility

cmb

What specifically is the cert error you get on HTTPS sites in that circumstance? Short of some hacking to try to transparently proxy HTTPS, which wouldn't happen with that list of packages you have there, there wouldn't be anything within the firewall itself that'd cause a cert error.

Trel

@cmb:

What specifically is the cert error you get on HTTPS sites in that circumstance? Short of some hacking to try to transparently proxy HTTPS, which wouldn't happen with that list of packages you have there, there wouldn't be anything within the firewall itself that'd cause a cert error.

I can get that information if this happens again.
I had only seen the blank-page symptom myself, but I was told about a site giving the invalid cert. I told them to try another site to see if it was their machine's clock being wrong or similar, and that's how I found out the blank page thing was happening again.

At that point I remotely rebooted and it hasn't happened again yet.

If I had to guess, considering the webpages didn't report an error, just loaded blank, I'd guess that it was the same for the cert and as such, they were invalid.

Trel

It's happening again. I can't get the cert error to occur on my machine, but I am getting the blank pages.

Are there any debug steps I can take before I reboot?

EDIT: I can reproduce the cert issue. Any affected site shows up with a self-signed cert issued by and to "lolcat".
EDIT2: It suddenly resolved itself and the pages began loading again and certs were valid.

I think I'm likely going to reinstall 2.1.5 if I can't find a reason for this. It coincides with the upgrade, and the cert part has me seriously worried, especially with an actual name showing up there (unless that's some default name Firefox shows when it can't load a cert, which I don't know about).

cmb

Packet capture of the problem traffic would be telling.

The whole getting a self-signed certificate with "lolcat" is a serious cause for concern, nothing on your firewall would be doing that, that suggests some kind of malware somewhere. Potentially on a client machine that's running an ARP poisoning tool and hijacking your connections on occasion.

Check your system log for any indications of "xx:xx:xx:xx:xx:xx is using my IP 192.168.1.1" (replace 1.1 with your LAN IP), that's one place to see if you have something trying to ARP poison.
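
If the log is long, a small script can do the searching. This is only a sketch in Python: the regular expression assumes the message looks like the line quoted above (optionally with the word "address" after "IP"), and the function name find_arp_conflicts is made up for illustration. Adjust the pattern to whatever your log actually prints.

```python
import re

# Assumed message shape: "<mac> is using my IP [address] <ip>".
# The exact wording in a given system log may differ.
ARP_CONFLICT = re.compile(
    r"(?P<mac>(?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2}) is using my IP(?: address)? "
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3})"
)


def find_arp_conflicts(lines):
    """Return (mac, ip) pairs for every log line that reports an IP conflict."""
    hits = []
    for line in lines:
        match = ARP_CONFLICT.search(line)
        if match:
            # Lower-case the MAC so the same device is not listed twice.
            hits.append((match.group("mac").lower(), match.group("ip")))
    return hits
```

Feed it the lines of an exported system log, e.g. find_arp_conflicts(open("system-log.txt")), where system-log.txt is a hypothetical file name. An empty result only means nothing matched the assumed pattern, not that no poisoning took place.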

Trel

@cmb:

Packet capture of the problem traffic would be telling.

The whole getting a self-signed certificate with "lolcat" is a serious cause for concern, nothing on your firewall would be doing that, that suggests some kind of malware somewhere. Potentially on a client machine that's running an ARP poisoning tool and hijacking your connections on occasion.

Check your system log for any indications of "xx:xx:xx:xx:xx:xx is using my IP 192.168.1.1" (replace 1.1 with your LAN IP), that's one place to see if you have something trying to ARP poison.

I'll try that when next it happens.
I don't see anything in the system log that has any mention of the IP.

Someone in IRC suggested that when it happens next, I run 'openssl s_client -showcerts -connect site:443' from an SSH to the firewall to verify it's not something upstream.
The fact that I don't necessarily have to reboot pfSense to resolve it, and that releasing/renewing the WAN IP fixes it as well, has me worried about that.
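
To make the baseline-versus-incident comparison less error-prone, the output of that openssl command can be checked by a script. The following is a rough Python sketch, not a definitive tool: it assumes the output contains "subject=" and "issuer=" lines for the server certificate, treats "subject equals issuer" as a heuristic for self-signed, and the function names are invented for this example.

```python
import subprocess


def fetch_s_client_output(host, port=443, timeout=15):
    """Run the openssl command from the thread and return its text output."""
    result = subprocess.run(
        ["openssl", "s_client", "-showcerts", "-connect", f"{host}:{port}"],
        stdin=subprocess.DEVNULL,  # no input, so s_client exits after the handshake
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout


def cert_names(s_client_output):
    """Extract the first subject= and issuer= values from s_client text."""
    subject = issuer = None
    for raw in s_client_output.splitlines():
        line = raw.strip()
        if line.startswith("subject=") and subject is None:
            subject = line[len("subject="):].strip()
        elif line.startswith("issuer=") and issuer is None:
            issuer = line[len("issuer="):].strip()
    return subject, issuer


def looks_self_signed(s_client_output):
    """Heuristic: a certificate whose subject equals its issuer."""
    subject, issuer = cert_names(s_client_output)
    return subject is not None and subject == issuer
```

Running cert_names(fetch_s_client_output("site")) once while everything works and again during an incident gives two pairs to compare. If the second one names "lolcat" even from the firewall itself, the problem is not on the LAN side.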

If it is something upstream, I'm extra worried, because my connection is ISP--Modem--pfSense--Switch(es)--Computers

cmb

Running openssl from the firewall itself is a good idea, that'll at least bisect the issue. It seems unlikely it'd be on your ISP's side at least if it's the same ISP you're using to hit the forum, Comcast. It's possible, and given a change of WAN IP fixes it, that makes it seem more likely it's WAN-side, as nothing LAN-side would be impacted by that (unless it's just a coincidence). Some inept small ISP with a bunch of customers on the same broadcast domain and inadequate protection against things like ARP poisoning from customer to customer, I could see as being potentially more likely.

Trel

@cmb:

Running openssl from the firewall itself is a good idea, that'll at least bisect the issue. It seems unlikely it'd be on your ISP's side at least if it's the same ISP you're using to hit the forum, Comcast. It's possible, and given a change of WAN IP fixes it, that makes it seem more likely it's WAN-side, as nothing LAN-side would be impacted by that (unless it's just a coincidence). Some inept small ISP with a bunch of customers on the same broadcast domain and inadequate protection against things like ARP poisoning from customer to customer, I could see as being potentially more likely.

I should mention though, that when I release/renew the WAN interface, I'm not getting a new IP. I'm getting the same one. Breaking the connection seems to be what fixes it.
I can't speculate further until I can run the openssl command from pfsense when it happens again.

And yes, Comcast is currently the ISP in question.

I have one other theory about the lolcat cert: that it's placeholder text in Firefox for when a cert loads completely blank, which would make sense as non-SSL pages load blank while this is happening. I would need to look at (or ask someone familiar with) Firefox's source code to know for sure.

But that will be answered when I run openssl from pfSense, at least if I get something other than the cert I got when I ran it as a baseline for comparison.

MikeV7896

@Trel:

I should mention though, that when I release/renew the WAN interface, I'm not getting a new IP. I'm getting the same one. Breaking the connection seems to be what fixes it.
…
And yes, Comcast is currently the ISP in question.

Comcast's DHCP leases are for a few days, which is why you don't get a new address with a release/renew. From looking at the DHCP client leases file on my box, it looks like they're about 4 days (renew time is half of the lease, and there's 2 days from renew to expire). IPv6 prefix leases are 7 days, from what I was told by a Comcast network engineer in another forum.

Trel

I know. I meant that the issue wasn't being fixed by my WAN IP changing when I release/renew.

saywhat

We have had the exact same issue here in the UK. We use BT as the ISP.

Same lolcat 3rd party self signed cert appearing for many sites, all DNS being redirected to 195.22.26.248 which shows as being a malicious IP in Portugal, used for lots of spammy domains.

Interestingly, we had Google DNS set on pfsense. When I changed this to OpenDNS the problem immediately went away, pings began to return correct IPs again etc.

I know Google DNS was hijacked before, so that is a possibility, but I would have thought an attack such as that would have hit the news on twitter by now.
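
For anyone who wants to check this themselves, the same name can be asked of each resolver directly and the answers compared. Below is a minimal Python sketch using only the standard library. It hand-builds a plain UDP query for an A record, ignores truncation and retries, and all function names are made up for this example, so treat it as an illustration rather than a proper DNS client.

```python
import socket
import struct


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query packet asking for the A record of `name`."""
    # Flags 0x0100 = recursion desired; one question, no other sections.
    header = struct.pack(">HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack(">HH", 1, 1)  # QTYPE=A, QCLASS=IN


def _skip_name(packet, offset):
    """Return the offset just past a (possibly compressed) name."""
    while True:
        length = packet[offset]
        if length == 0:
            return offset + 1
        if length & 0xC0 == 0xC0:  # a compression pointer is two bytes long
            return offset + 2
        offset += 1 + length


def parse_a_records(packet):
    """Return the IPv4 addresses found in the answer section of a response."""
    _, _, qdcount, ancount, _, _ = struct.unpack(">HHHHHH", packet[:12])
    offset = 12
    for _ in range(qdcount):
        offset = _skip_name(packet, offset) + 4  # skip QTYPE and QCLASS
    addresses = []
    for _ in range(ancount):
        offset = _skip_name(packet, offset)
        rtype, rclass, _ttl, rdlength = struct.unpack(">HHIH", packet[offset:offset + 10])
        offset += 10
        if rtype == 1 and rclass == 1 and rdlength == 4:
            addresses.append(socket.inet_ntoa(packet[offset:offset + 4]))
        offset += rdlength  # also skips non-A records such as CNAME
    return sorted(addresses)


def resolve_via(server, name, timeout=3.0):
    """Ask one specific resolver for `name` over UDP port 53."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        response, _ = sock.recvfrom(4096)
    return parse_a_records(response)


def compare_resolvers(name, servers):
    """Map each resolver to its answer so an odd one stands out."""
    results = {}
    for server in servers:
        try:
            results[server] = resolve_via(server, name)
        except OSError as exc:  # includes timeouts
            results[server] = f"error: {exc}"
    return results
```

Something like compare_resolvers("example.com", ["8.8.8.8", "8.8.4.4", "208.67.222.222"]) would query Google DNS and an OpenDNS address side by side. Answers can legitimately differ between resolvers for sites behind CDNs, so a difference alone proves nothing. The thing to look for is many unrelated names all coming back as 195.22.26.248.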

Trel

@saywhat:

We have had the exact same issue here in the UK. We use BT as the ISP.

Same lolcat 3rd party self signed cert appearing for many sites, all DNS being redirected to 195.22.26.248 which shows as being a malicious IP in Portugal, used for lots of spammy domains.

Interestingly, we had Google DNS set on pfsense. When I changed this to OpenDNS the problem immediately went away, pings began to return correct IPs again etc.

I know Google DNS was hijacked before, so that is a possibility, but I would have thought an attack such as that would have hit the news on twitter by now.

Now THAT actually gives me something to go on.

My DNS servers are
4.2.2.2
8.8.8.8
4.2.2.1
8.8.4.4

The 8.8.8.8 and 8.8.4.4 being Google DNS.
Are those the ones you have configured?

EDIT: and that could also explain how releasing and renewing the WAN connection fixes it. If it loses connection with 4.2.2.2 and goes to Google's 8.8.8.8 and that's the problem, releasing/renewing might re-establish contact with 4.2.2.2.

saywhat

We were using 8.8.8.8 and 8.8.4.4

Changed them over to opendns and machines responded almost immediately

Trel

@saywhat:

We were using 8.8.8.8 and 8.8.4.4

Changed them over to opendns and machines responded almost immediately

That looks like it has a good chance at being the cause then.
I'm going to remove those from my list and just keep the Level3 ones (4.2.2.1 and 4.2.2.2) and see if it ever happens again.

That might also explain why it didn't happen until right after the 2.2 upgrade, if dnsmasq had a higher tolerance before failing over to the secondary DNS server than unbound does.

Pakken

God, and I thought I was the only one having this problem until I came across this thread.

Any news about that? Same invalid cert, same google dns.
Spent the last night trying to figure out what the he** could have happened.

Trel

@Pakken:

God, and I thought I was the only one having this problem until I came across this thread.

Any news about that? Same invalid cert, same google dns.
Spent the last night trying to figure out what the he** could have happened.

Other than us three, I haven't found anyone who reported it anywhere but here.

But it's way too coincidental that three people got the same symptoms and had the same dns.

kejianshi

Me too. That's the main reason I turned off the forwarder and turned on unbound on one of my systems.
The kids were reporting the exact same issues as you…

Unbound with DNSSEC is technically slower than a forwarder, but it seems faster in actual use and the kids report it's solid.
I'm also using it over the VPN for my private use.

doktornotor

NSA testing some new (broken) toys? :D

kejianshi

I will just say I like unbound and leave it at that… (-;

Unbound + VPN = my tinfoil hat

Trel

I just had this happen with level3 DNS (4.2.2.1 and 4.2.2.2) as the DNS servers. I removed them leaving ONLY OpenDNS and it immediately started resolving correctly again.