New project: accountability software for a network (prevent porn)
I have an idea for a project, and I'm wondering whether it makes sense to build on top of pfSense or directly on top of FreeBSD. And I'm looking for suggestions on how to get started. Please poke holes in my ideas here, if you see flaws.
General idea for accountability software
I see a lot of people use pfSense to protect kids from accidentally seeing porn. People usually seem to recommend OpenDNS or SquidGuard for this. I think that's great.
Covenant Eyes (http://www.covenanteyes.com/) monitors URLs visited on Windows, Mac, Android, and iOS devices. It then generates reports of URLs that likely contain adult content and sends them to the user's chosen accountability partner(s).
However, these days we have game consoles, smart TVs, etc. that have web browsers built in, and you can't install such software on them. You can also boot Linux from a USB stick or DVD and bypass Covenant Eyes or similar software installed on the computer.
I'd like to do something similar to Covenant Eyes, but protect an entire network from a home firewall. pfSense looks like a great platform for this, because of the BSD license, the great networking stack in FreeBSD, and the community in pfSense.
But the pfSense UI is for networking nerds, so I'd want to develop a very simplified UI that the average home user could configure. Keeping the full power of pfSense available under advanced settings would be cool.
If I could actually accomplish this (big if), I'd probably want some of my work open source (thinking BSD or Apache licensed), some closed source, and have a monthly subscription for customers to generate reports and send them to accountability partners.
I develop enterprise application software for a living, but I have a lot of lower-level networking to learn. The algorithms and statistics to identify what is likely porn won't be easy either, but I really like math.
Questions
Would it make sense to build something like this on top of pfSense? Or would FreeBSD itself be a better platform?
What existing open-source projects should I be aware of? For transparent proxies, I can think of Squid, SquidGuard, DansGuardian, and Apache Traffic Server. What about tools for logging DNS requests, and for forcing the "safe search" modes of Google, Bing, and YouTube?
Are there instructions on how to set up a development environment for pfSense packages? How are they related to FreeBSD ports?
Requirements & Design
1. Log DNS requests. I think I can get hostnames this way even for HTTPS sites. Include the client's IP address and hostname, plus a timestamp. This can't identify individual users unless it's paired with client software.
2. I think I want to avoid decrypting HTTPS, logging URLs, and re-encrypting. Installing certificates on each client is a bit much for the typical user. And I worry about introducing security holes this way that could leak passwords, credit card numbers, etc. But this might be required to tell whether you were viewing NSFW content on Reddit or general stuff on Reddit, for instance.
3. Log HTTP URLs from a transparent proxy. Record IP address (and hostname) of client and timestamp.
4. MAYBE also inspect HTML from the transparent proxy looking for porn-related keywords. This would be similar to email spam filtering algorithms. But... this might be too much load for a passively cooled Atom CPU.
5. Every X minutes, send the DNS and HTTP traffic log to a web service in the cloud.
6. In the cloud, run jobs that identify adult content by checking the logs against a database of domain names and by looking for keywords in the URL.
7. Force Google, Bing, and YouTube safe search.
8. Try to prevent people from getting around logging with SSH tunnels and proxy servers??? This sounds really hard, and probably would be a lower priority.
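To make items 1, 3, and 6 a bit more concrete, here is a rough Python sketch of what a log record and the cloud-side matching job might look like. Everything in it is an assumption for illustration: the `LogEntry` fields, the function names, and the sample domain and keyword sets are made up, and a real service would load its lists from a maintained database rather than hard-coding them.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List, Optional, Set
from urllib.parse import urlsplit

# Hypothetical sample data, for illustration only.
FLAGGED_DOMAINS: Set[str] = {"adult-example.test"}
FLAGGED_KEYWORDS: Set[str] = {"nsfw", "xxx"}


@dataclass(frozen=True)
class LogEntry:
    """One record as the firewall might upload it (items 1 and 3)."""
    timestamp: datetime
    client_ip: str
    hostname: str               # from the DNS query, or the host part of a URL
    url: Optional[str] = None   # only known for plain-HTTP traffic seen by the proxy


def domain_is_flagged(hostname: str, flagged: Set[str] = FLAGGED_DOMAINS) -> bool:
    """True if the hostname or any of its parent domains is in the flagged set."""
    labels = hostname.lower().rstrip(".").split(".")
    # "www.a.test" is checked as "www.a.test", then "a.test", then "test".
    return any(".".join(labels[i:]) in flagged for i in range(len(labels)))


def url_has_keyword(url: str, keywords: Set[str] = FLAGGED_KEYWORDS) -> bool:
    """True if the URL path or query string contains any flagged keyword."""
    parts = urlsplit(url.lower())
    text = parts.path + "?" + parts.query
    return any(keyword in text for keyword in keywords)


def flag_entries(entries: Iterable[LogEntry]) -> List[LogEntry]:
    """Return the entries that would go into an accountability report (item 6)."""
    return [
        entry
        for entry in entries
        if domain_is_flagged(entry.hostname)
        or (entry.url is not None and url_has_keyword(entry.url))
    ]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = [
        LogEntry(now, "192.168.1.20", "www.adult-example.test"),
        LogEntry(now, "192.168.1.21", "example.org"),
        LogEntry(now, "192.168.1.21", "forum.example", "http://forum.example/r/nsfw/top"),
    ]
    for entry in flag_entries(sample):
        print(entry.client_ip, entry.hostname, entry.url)
```

Plain substring matching on keywords will produce false positives, which is part of why I expect the statistics side to be the hard part. Note also that for HTTPS traffic only the hostname is available without decryption (item 2), so the URL keyword check only applies to plain HTTP.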