Squid - help with HTTPS and a regular expression?



  • Hi.

    I recently upgraded my pfsense box and decided to give v2.0.2 a spin.

    I am having issues with squid/squidguard and non-transparent proxying.

    Setup:

    DSL > pfsense (wan pppoe | lan 192.168.1.1) > switches > clients
    
    DHCP:
    192.168.1.0/27 are all static mapped using MAC (this is least restricted range)
    192.168.1.90-99 is dynamically assigned (this is guest range)
    192.168.1.100-?? static mapped using MAC (kids range, most restricted)
    
    Aliases:
    all known PCs have an alias
    aliases are then grouped (e.g. kids, guests, etc.)
    
    Forwarding DNS using Norton IP addresses (I used to use OpenDNS)
    
    firewall rules:
    allow 53 for all to norton dns alias
    deny 53 for all
    deny 80 for kids alias ip group
    deny 443 for kids alias ip group
    (default allow all is after these)
    
    squid:
    Non transparent mode
    
    squidguard:
    using MDES blacklist
    common ACL (deny all) !all
    groups ACL:
    guests > allow all except some MDES categories
    kids > allow ONLY custom target categories
    
    Set client proxy to HTTP and HTTPS using 192.168.1.1:3128.
    

    With no proxy settings, the firewall rules deny port 80/443 traffic for the target machines, as intended. With proxy settings, whitelisted sites are allowed. However, if I whitelist mail.google.com, it fails. The error reports that accounts.google.com is needed, so I allowed both mail.google.com and accounts.google.com. It still fails. I was under the impression that this group (kids) would deny everything by default EXCEPT the whitelisted sites. In that case, mail.google.com should work.

    I don't understand the error here. Prior to this I was using transparent proxy, which of course allowed https://mail.google.com to go through. Another thing of note: going to, say, http://google.com shows my squid denied page, whereas trying https://google.com just gives a browser error without the squid denied page. Is there something different in how squid and/or squidguard handle HTTPS that I am not aware of? I understand that HTTPS is encrypted, so squid cannot really "see into" the traffic, but I read that it will filter HTTPS just like HTTP if you are not in transparent mode. What exactly am I not understanding here?
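To make the HTTPS question concrete, here is a minimal Python sketch of what an explicitly configured (non-transparent) proxy receives in clear text. For plain HTTP the browser sends the full URL in the request line; for HTTPS it sends a `CONNECT host:port` request and everything after that is encrypted. The function name `visible_parts` is made up for this illustration and is not part of squid or squidguard.

```python
from urllib.parse import urlsplit


def visible_parts(request_line):
    """Return the URL parts a non-intercepting proxy can read in clear text."""
    method, target, _version = request_line.split()
    if method == "CONNECT":
        # HTTPS via an explicit proxy: only "host:port" is visible,
        # the path and query string travel inside the TLS tunnel.
        host, _, port = target.rpartition(":")
        return {"host": host, "port": int(port), "path": None}
    # Plain HTTP via an explicit proxy: the absolute URL is visible.
    parts = urlsplit(target)
    path = parts.path + ("?" + parts.query if parts.query else "")
    return {"host": parts.hostname, "port": parts.port or 80, "path": path}


if __name__ == "__main__":
    print(visible_parts("GET http://google.com/imgres?q=cats HTTP/1.1"))
    print(visible_parts("CONNECT mail.google.com:443 HTTP/1.1"))
```

If this picture is right, domain-based rules can still apply to HTTPS requests, but rules that depend on the path or query string have nothing to match against.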

    And finally, if I wanted to block google images from showing, how would I write a regular expression for that? If we assume that the images will always start with

    google.com/imgres?
    

    How do you write that for squid or squidguard?
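As a starting point, here is a small Python sketch for trying out a candidate pattern before putting it into a squid or squidguard expression list. The pattern itself is only an assumption about what should match (google.com with or without a subdomain, followed by `/imgres?`), and Python's `re` module is not identical to the regex engine squid or squidguard use, so treat it as a way to sanity-check the idea rather than a finished rule. The names `IMGRES` and `is_image_result` are made up for the example.

```python
import re

# Escape the dots and the question mark so they match literally.
# (^|[/.]) keeps "notgoogle.com" from matching while still allowing
# "google.com", "www.google.com" and "http://google.com".
IMGRES = re.compile(r"(^|[/.])google\.com/imgres\?", re.IGNORECASE)


def is_image_result(url):
    """Return True if the URL looks like a Google image result link."""
    return IMGRES.search(url) is not None


if __name__ == "__main__":
    for url in (
        "http://www.google.com/imgres?imgurl=example",
        "google.com/imgres?q=cats",
        "http://www.google.com/search?q=imgres",
        "http://notgoogle.com/imgres?x=1",
    ):
        print(url, is_image_result(url))
```

The important detail is the escaping: an unescaped `.` matches any character and an unescaped `?` makes the preceding character optional, which is rarely what is intended for a URL.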
    Further, this is a list I have gathered that might also work. I realize some entries would go in a domain list and some in a URL list:

    google.com
    gstatic.com
    images.google.com
    tbn.l.google.com
    t0.gstatic.com
    t1.gstatic.com
    t2.gstatic.com
    t3.gstatic.com
    t4.gstatic.com
    google.com/imghp
    google.com/images
    

    In short, I want to filter HTTPS with squid and squidguard, and get rid of Google search images for certain IP groups. Unfortunately, the SafeSearch option doesn't filter out everything I want it to. I feel like I have a good grasp on most of this, but the inclusion of HTTPS is confusing me and regular expressions are, well, not really regular at all LOL.

    Thanks to any takers.

    Sul.



  • For Google images and videos I'm blacklisting the following tags, so as soon as one of them appears in the URL, the URL is blocked:
    tab=ii
    tab=iv
    tab=ti
    tab=vi
    tab=vv
    tab=wi
    tab=wv
    tbm=isch
    tbm=vid
    tbs=vid:1
    Hope this is of some benefit to you and others.
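The tag list above could be folded into a single expression. The Python sketch below is one way to check such a pattern against sample URLs; the pattern and the names `BLOCKED_TAGS` and `has_blocked_tag` are assumptions made for illustration, not the rule actually used in the reply. It requires each tag to be a whole query parameter, so `tab=ii` matches but `tab=iii` does not. Note that these tags live in the query string, so they can only be matched where the filter sees the full URL.

```python
import re

# Each tag must follow "?" or "&" and end at "&", "#" or the end of the URL.
BLOCKED_TAGS = re.compile(
    r"[?&](tab=(ii|iv|ti|vi|vv|wi|wv)|tbm=(isch|vid)|tbs=vid:1)(&|#|$)",
    re.IGNORECASE,
)


def has_blocked_tag(url):
    """Return True if the URL carries one of the image/video search tags."""
    return BLOCKED_TAGS.search(url) is not None


if __name__ == "__main__":
    for url in (
        "http://www.google.com/search?q=cats&tbm=isch",
        "http://www.google.com/search?tab=wi&q=cats",
        "http://www.google.com/search?q=cats",
    ):
        print(url, has_blocked_tag(url))
```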

