Netgate Discussion Forum

Caching a sharepoint library with HTTPS reverse proxy

Cache/Proxy
4 Posts 1 Posters 975 Views
  • O
    olivier-m
    posted Aug 24, 2017, 12:52 PM, last edited Aug 25, 2017, 2:26 PM

    Hello,

    I have set up an HTTPS reverse proxy (accelerator) with pfSense and Squid for a SharePoint Online library in Office 365.
    The reverse HTTPS is working fine, and I can see all the downloaded documents in the Squid logs.

    Now I would like to cache the documents to reduce latency for our branch office.
    By default, the documents are served with a Cache-Control header of "no-cache" or "private".
    Still, I would like to force caching of the shared library documents (otherwise this setup has little value).

    My setup is a classic one, as shown below:

    User PC <--- request tenant.sharepoint.com ---> pfSense reverse proxy with internal CA certificate <---> Microsoft SharePoint.com online library

    My squid conf file:

    
    ---------------------------------------------------------------------
    # This file is automatically generated by pfSense
    # Do not edit manually !
    
    http_port 10.10.10.10:3128
    icp_port 0
    digest_generation off
    dns_v4_first on
    pid_filename /var/run/squid/squid.pid
    cache_effective_user squid
    cache_effective_group proxy
    error_default_language en
    icon_directory /usr/local/etc/squid/icons
    visible_hostname pfSense Firewall
    cache_mgr pfsense@mycomp.cloud
    access_log /var/squid/logs/access.log
    cache_log /var/squid/logs/cache.log
    cache_store_log none
    netdb_filename /var/squid/logs/netdb.state
    pinger_enable on
    pinger_program /usr/local/libexec/squid/pinger
    
    logfile_rotate 7
    debug_options rotate=7
    shutdown_lifetime 3 seconds
    # Allow local network(s) on interface(s)
    acl localnet src  10.10.10.0/24
    forwarded_for on
    uri_whitespace strip
    
    cache_mem 128 MB
    maximum_object_size_in_memory 20 MB
    memory_replacement_policy heap GDSF
    cache_replacement_policy heap LFUDA
    minimum_object_size 0 KB
    maximum_object_size 20 MB
    cache_dir ufs /var/squid/cache 300 16 256
    offline_mode on
    cache_swap_low 90
    cache_swap_high 95
    cache allow all
    # Add any of your own refresh_pattern entries above these.
    refresh_pattern ^ftp:    1440  20%  10080
    refresh_pattern ^gopher:  1440  0%  1440
    refresh_pattern -i (/cgi-bin/|\?) 0  0%  0
    refresh_pattern .    0  20%  4320
    refresh_pattern -i \.jpg$ 30 50% 4320 ignore-reload ignore-no-cache ignore-no-store ignore-private
    refresh_pattern -i \.pdf$ 30 50% 4320 ignore-reload ignore-no-cache ignore-no-store ignore-private
    refresh_pattern -i \.docx$ 30 50% 4320 ignore-reload ignore-no-cache ignore-no-store ignore-private
    
    #Remote proxies
    
    # Setup some default acls
    # ACLs all, manager, localhost, and to_localhost are predefined.
    acl allsrc src all
    acl safeports port 21 70 80 210 280 443 488 563 591 631 777 901 4443 3128 3129 1025-65535
    acl sslports port 443 563 4443
    ---------------------------------------------------------------------
    
    

    The Squid access log:

    Date IP Status Address User Destination
    24.08.2017 12:42:18 10.10.10.100 TCP_MISS/200 https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/picture.jpg
    24.08.2017 12:42:17 10.10.10.100 TCP_MISS/200 https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/large1.pdf
    24.08.2017 12:42:16 10.10.10.100 TCP_MISS/200 https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/large1.docx

    The cache manager info:

    Cache information for squid:
    Hits as % of all requests: 5min: 0.0%, 60min: 0.0%
    Hits as % of bytes sent: 5min: 0.0%, 60min: 0.0%
    Memory hits as % of hit requests: 5min: 0.0%, 60min: 0.0%
    Disk hits as % of hit requests: 5min: 0.0%, 60min: 0.0%
    Storage Swap size: 0 KB
    Storage Swap capacity: 0.0% used, 100.0% free
    Storage Mem size: 216 KB
    Storage Mem capacity: 0.2% used, 99.8% free
    Mean Object Size: 0.00 KB

    If I download the same files again, I keep getting TCP_MISS. I can't get any file into the cache.
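
    A minimal sketch, assuming the log lines keep the layout of the excerpt above (date, time, client IP, status, URL), of how the result codes could be tallied to confirm that nothing is served from the cache. The names tally_statuses and hit_ratio are hypothetical helpers, not part of Squid or pfSense.

```python
from collections import Counter

# Sample lines in the same layout as the log excerpt above:
# date, time, client IP, Squid result code / HTTP status, URL.
SAMPLE_LOG = """\
24.08.2017 12:42:18 10.10.10.100 TCP_MISS/200 https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/picture.jpg
24.08.2017 12:42:17 10.10.10.100 TCP_MISS/200 https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/large1.pdf
24.08.2017 12:42:16 10.10.10.100 TCP_MISS/200 https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/large1.docx
"""


def tally_statuses(log_text):
    """Count Squid result codes (the part before the slash) per log line."""
    counts = Counter()
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        result_code = fields[3].split("/", 1)[0]
        counts[result_code] += 1
    return counts


def hit_ratio(counts):
    """Fraction of requests whose result code contains 'HIT'."""
    total = sum(counts.values())
    hits = sum(n for code, n in counts.items() if "HIT" in code)
    return hits / total if total else 0.0


counts = tally_statuses(SAMPLE_LOG)
print(dict(counts), hit_ratio(counts))
```

    For the three sample lines this reports only TCP_MISS entries and a hit ratio of 0.0, which matches the cache manager figures above. Squid's native access.log uses a different column layout, so the field index would need adjusting there.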

    Microsoft is tagging all documents with the following cache tag:

    HTTP/1.1 Cache-Control Header is present: private,max-age=0
    private: This response MUST NOT be cached by a shared cache.
    max-age: This resource will expire immediately. (0 sec)
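
    To illustrate why a standards-following shared cache refuses these responses, here is a rough sketch that parses the header value quoted above. The function names are hypothetical, and the parser is simplified: it does not handle quoted directive arguments that themselves contain commas.

```python
def parse_cache_control(header_value):
    """Split a Cache-Control value into a {directive: argument-or-None} dict."""
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def shared_cache_may_store(directives):
    """True if a shared cache that honours the header is allowed to store it."""
    return "private" not in directives and "no-store" not in directives


def freshness_seconds(directives):
    """Explicit freshness lifetime for a shared cache: s-maxage wins over max-age."""
    for name in ("s-maxage", "max-age"):
        value = directives.get(name)
        if value is not None and value.isdigit():
            return int(value)
    return None  # no explicit lifetime; a cache would fall back to heuristics


cc = parse_cache_control("private,max-age=0")
print(cc)
print(shared_cache_may_store(cc))
print(freshness_seconds(cc))
```

    With "private,max-age=0" the response is both off-limits to a shared cache and immediately stale, so any caching here means deliberately overriding both directives on the proxy.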

    I suppose that is the default for security reasons.
    But I would like to know whether I can override it, and to evaluate the security risk of doing so.
    Microsoft offers a lot of online storage space, but without a caching proxy it is pretty much useless for us.

    Thank you for any help you could give.

    • O
      olivier-m
      last edited by Aug 24, 2017, 3:18 PM

      I forgot to add that I am actually using the WebDAV protocol.
      But it seems that Squid supports caching WebDAV: http://www.webdav.org/other/proxy.html

      I am not sure that this is going to work:

      1 - using WebDAV
      2 - over HTTPS
      3 - with files tagged with Cache-Control: no-cache / private

      Even if I finally get it working, I have read elsewhere that there is a high risk of corrupting data in the WebDAV repository.

      Please tell me if I'm wrong.

      • O
        olivier-m
        last edited by Aug 25, 2017, 2:25 PM

        Long story short, I am now able to cache SharePoint documents with the following configuration file:

        
        # This file is automatically generated by pfSense
        # Do not edit manually !
        
        http_port 10.10.10.10:3128
        icp_port 0
        digest_generation off
        dns_v4_first on
        pid_filename /var/run/squid/squid.pid
        cache_effective_user squid
        cache_effective_group proxy
        error_default_language en
        icon_directory /usr/local/etc/squid/icons
        visible_hostname sv-1101-wvp01.virtualdesk.cloud
        cache_mgr pfsense@virtualdesk.cloud
        access_log /var/squid/logs/access.log
        cache_log /var/squid/logs/cache.log
        cache_store_log none
        netdb_filename /var/squid/logs/netdb.state
        pinger_enable on
        pinger_program /usr/local/libexec/squid/pinger
        
        logfile_rotate 7
        debug_options rotate=7
        shutdown_lifetime 3 seconds
        # Allow local network(s) on interface(s)
        acl localnet src  92.222.209.0/24
        forwarded_for on
        uri_whitespace strip
        
        cache_mem 128 MB
        maximum_object_size_in_memory 512 KB
        memory_replacement_policy heap GDSF
        cache_replacement_policy heap LFUDA
        minimum_object_size 0 KB
        maximum_object_size 20 MB
        cache_dir ufs /var/squid/cache 100 16 256
        offline_mode on
        cache_swap_low 90
        cache_swap_high 95
        cache allow all
        
        # Cache documents regardless what the server says
        refresh_pattern .jpg 60 90% 600 override-expire override-lastmod ignore-reload ignore-private
        refresh_pattern .gif 60 90% 600 override-expire override-lastmod ignore-reload ignore-private
        refresh_pattern .png 60 90% 600 override-expire override-lastmod ignore-reload ignore-private
        refresh_pattern .txt 60 90% 600 override-expire override-lastmod ignore-reload ignore-private
        refresh_pattern .doc 60 90% 600 override-expire override-lastmod ignore-reload ignore-private
        refresh_pattern .docx 60 90% 600 override-expire override-lastmod ignore-reload ignore-private
        refresh_pattern .xls 60 90% 600 override-expire override-lastmod ignore-reload ignore-private
        refresh_pattern .xlsx 60 90% 600 override-expire override-lastmod ignore-reload ignore-private
        refresh_pattern .pdf 60 90% 600 override-expire override-lastmod ignore-reload ignore-private
        
        # Setup acls
        acl allsrc src all
        http_access allow all
        
        request_body_max_size 0 KB
        delay_pools 1
        delay_class 1 2
        delay_parameters 1 -1/-1 -1/-1
        delay_initial_bucket_level 100
        delay_access 1 allow allsrc
        
        # Reverse Proxy settings
        https_port 10.10.10.10:443 accel cert=/usr/local/etc/squid/599eae0080989.crt key=/usr/local/etc/squid/599eae0080989.key
        cache_peer tenant.sharepoint.com parent 443 0 no-query no-digest originserver login=PASSTHRU connection-auth=on ssl sslflags=DONT_VERIFY_PEER front-end-https=auto name=rvp_sharepoint
        deny_info TCP_RESET allsrc
        
        

        Unfortunately, it is still not working.
        The WebDAV client (Windows) refuses to download from the cache.

        Squid logs errors like this one:

        TCP_OFFLINE_HIT_ABORTED/000

        (see attachment: ice_screenshot_20170825-152338.png)

        • O
          olivier-m
          last edited by Aug 31, 2017, 10:05 AM

          I found the right configuration with the help of the Squid Users mailing list.
          I had to add several options to ignore the Cache-Control headers and force the cache to keep and serve the content.
          It is working now.
          For the record, I'm posting the working Squid configuration below.

          
          http_port 10.10.10.10:3128
          icp_port 0
          digest_generation off
          dns_v4_first on
          pid_filename /var/run/squid/squid.pid
          cache_effective_user squid
          cache_effective_group proxy
          error_default_language en
          icon_directory /usr/local/etc/squid/icons
          visible_hostname pfSense Firewall
          cache_mgr pfsense@virtualdesk.cloud
          access_log /var/squid/logs/access.log
          cache_log /var/squid/logs/cache.log
          cache_store_log none
          netdb_filename /var/squid/logs/netdb.state
          pinger_enable on
          pinger_program /usr/local/libexec/squid/pinger
          
          logfile_rotate 7
          debug_options rotate=7
          shutdown_lifetime 3 seconds
          forwarded_for on
          uri_whitespace strip
          
          refresh_pattern -i \.(jpg|gif|png|txt|docx|xlsx|pdf) 30240 100% 43800 override-expire ignore-private ignore-reload store-stale
          
          cache_mem 128 MB
          maximum_object_size_in_memory 20480 KB
          memory_replacement_policy lru
          cache_replacement_policy lru
          minimum_object_size 0 KB
          maximum_object_size 50 MB
          cache_dir ufs /var/squid/cache 100 16 256
          offline_mode on
          cache_swap_low 90
          cache_swap_high 95
          cache allow all
          
          # Add any of your own refresh_pattern entries above these.
          refresh_pattern ^ftp:    1440  20%  10080
          refresh_pattern ^gopher:  1440  0%  1440
          refresh_pattern -i (/cgi-bin/|\?) 0  0%  0
          refresh_pattern .    0  20%  4320
          
          #ACL allow all
          acl allsrc src all
          http_access allow allsrc
          
          request_body_max_size 0 KB
          delay_pools 1
          delay_class 1 2
          delay_parameters 1 -1/-1 -1/-1
          delay_initial_bucket_level 100
          delay_access 1 allow allsrc
          
          # Reverse Proxy settings
          https_port 10.10.10.10:443 accel cert=/usr/local/etc/squid/599eae0080989.crt key=/usr/local/etc/squid/599eae0080989.key defaultsite=tenant.sharepoint.com vhost
          
          #
          cache_peer 13.107.6.151 parent 443 0 ignore-cc no-query no-digest originserver login=PASSTHRU connection-auth=on round-robin ssl sslflags=DONT_VERIFY_PEER front-end-https=auto name=rvp_sharepoint
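
          One difference from the first configuration in this thread is that the document refresh_pattern now sits above the default catch-all entry. Squid checks refresh_pattern lines in file order and uses the first one that matches, so the sketch below imitates that lookup. It is only an approximation: Python's re module stands in for Squid's regex engine, and RULES and the helper names are hypothetical.

```python
import re

# (regex, case_insensitive, min_minutes, percent, max_minutes) in file order.
# The first tuple mirrors the custom rule in the config above; the others are
# the pfSense default entries that follow it.
RULES = [
    (r"\.(jpg|gif|png|txt|docx|xlsx|pdf)", True, 30240, 100, 43800),
    (r"^ftp:", False, 1440, 20, 10080),
    (r"^gopher:", False, 1440, 0, 1440),
    (r"(/cgi-bin/|\?)", True, 0, 0, 0),
    (r".", False, 0, 20, 4320),
]


def first_matching_rule(url, rules):
    """Return the first rule whose regex matches the URL, or None."""
    for rule in rules:
        pattern, case_insensitive = rule[0], rule[1]
        flags = re.IGNORECASE if case_insensitive else 0
        if re.search(pattern, url, flags):
            return rule
    return None


def minutes_to_days(minutes):
    """refresh_pattern min/max values are given in minutes."""
    return minutes / 1440


url = "https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/large1.pdf"
rule = first_matching_rule(url, RULES)
print(rule[0], minutes_to_days(rule[2]), round(minutes_to_days(rule[4]), 1))

# With the catch-all "." listed first, the document rule is never reached.
shadowed = first_matching_rule(url, [RULES[-1]] + RULES[:-1])
print(shadowed[0])
```

          In this ordering the PDF URL picks up the document rule, whose minimum of 30240 minutes is 21 days and whose maximum of 43800 minutes is roughly 30.4 days. Moving the catch-all to the top makes it win instead, which is consistent with the TCP_MISS results seen with the first configuration.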
          
          
          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.