Netgate Discussion Forum

    Caching a sharepoint library with HTTPS reverse proxy

      olivier-m

      Hello,

      I have set up an HTTPS reverse proxy (accelerator) with pfSense and Squid in front of a SharePoint Online library in Office 365.
      The HTTPS reverse proxy is working fine, and I can see all the downloaded documents in the Squid logs.

      Now I would like to cache the documents to reduce latency for our branch office.
      By default, the documents are served with Cache-Control directives such as "no-cache" or "private".
      Still, I would like to force caching of the documents in the shared libraries (otherwise this setup has little value).

      My setup is a classic one, as shown below:

      User PC <--- request tenant.sharepoint.com ---> pfSense reverse proxy with internal CA certificate <---> Microsoft SharePoint Online library

      My squid conf file:

      
      ---------------------------------------------------------------------
      # This file is automatically generated by pfSense
      # Do not edit manually !
      
      http_port 10.10.10.10:3128
      icp_port 0
      digest_generation off
      dns_v4_first on
      pid_filename /var/run/squid/squid.pid
      cache_effective_user squid
      cache_effective_group proxy
      error_default_language en
      icon_directory /usr/local/etc/squid/icons
      visible_hostname pfSense Firewall
      cache_mgr pfsense@mycomp.cloud
      access_log /var/squid/logs/access.log
      cache_log /var/squid/logs/cache.log
      cache_store_log none
      netdb_filename /var/squid/logs/netdb.state
      pinger_enable on
      pinger_program /usr/local/libexec/squid/pinger
      
      logfile_rotate 7
      debug_options rotate=7
      shutdown_lifetime 3 seconds
      # Allow local network(s) on interface(s)
      acl localnet src  10.10.10.0/24
      forwarded_for on
      uri_whitespace strip
      
      cache_mem 128 MB
      maximum_object_size_in_memory 20 MB
      memory_replacement_policy heap GDSF
      cache_replacement_policy heap LFUDA
      minimum_object_size 0 KB
      maximum_object_size 20 MB
      cache_dir ufs /var/squid/cache 300 16 256
      offline_mode on
      cache_swap_low 90
      cache_swap_high 95
      cache allow all
      # Add any of your own refresh_pattern entries above these.
      refresh_pattern ^ftp:    1440  20%  10080
      refresh_pattern ^gopher:  1440  0%  1440
      refresh_pattern -i (/cgi-bin/|\?) 0  0%  0
      refresh_pattern .    0  20%  4320
      refresh_pattern -i \.jpg$ 30 50% 4320 ignore-reload ignore-no-cache ignore-no-store ignore-private
      refresh_pattern -i \.pdf$ 30 50% 4320 ignore-reload ignore-no-cache ignore-no-store ignore-private
      refresh_pattern -i \.docx$ 30 50% 4320 ignore-reload ignore-no-cache ignore-no-store ignore-private
      
      #Remote proxies
      
      # Setup some default acls
      # ACLs all, manager, localhost, and to_localhost are predefined.
      acl allsrc src all
      acl safeports port 21 70 80 210 280 443 488 563 591 631 777 901 4443 3128 3129 1025-65535
      acl sslports port 443 563 4443
      ---------------------------------------------------------------------
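Squid checks refresh_pattern lines in the order they are listed and uses the first one that matches, which is also why the generated file says to add custom entries above the defaults. The following is a minimal Python sketch of that first-match behaviour, using the patterns from the config above. It only simulates rule selection with Python's `re` module and is not Squid's own code; the function name `first_matching_rule` is made up for illustration. Whether this ordering was the actual cause of the misses is not confirmed in this thread.

```python
import re

# (pattern, case_insensitive) pairs in the order they appear in the config above.
RULES = [
    (r"^ftp:", False),
    (r"^gopher:", False),
    (r"(/cgi-bin/|\?)", True),
    (r".", False),
    (r"\.jpg$", True),
    (r"\.pdf$", True),
    (r"\.docx$", True),
]


def first_matching_rule(url, rules=RULES):
    """Return the pattern of the first rule that matches url, or None."""
    for pattern, ignore_case in rules:
        flags = re.IGNORECASE if ignore_case else 0
        # Unanchored search: the pattern may match anywhere in the URL.
        if re.search(pattern, url, flags):
            return pattern
    return None


url = "https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/picture.jpg"

# With the posted order, the catch-all "." wins before the .jpg rule is reached.
print(first_matching_rule(url))

# Moving the three file-type rules to the front lets them win instead.
reordered = RULES[4:] + RULES[:4]
print(first_matching_rule(url, reordered))
```

In this simulation the catch-all `.` is selected for the picture.jpg URL, so the options attached to the `\.jpg$` line would never be consulted.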
      
      

      The Squid access log:

      Date IP Status Address User Destination
      24.08.2017 12:42:18 10.10.10.100 TCP_MISS/200 https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/picture.jpg
      24.08.2017 12:42:17 10.10.10.100 TCP_MISS/200 https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/large1.pdf
      24.08.2017 12:42:16 10.10.10.100 TCP_MISS/200 https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/large1.docx
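To track whether a config change helps, the result codes in these log lines can be tallied. The sketch below is a small Python helper that assumes lines shaped exactly like the table above (date, time, client IP, result/status, URL); the native Squid access.log format is different and would need other field positions. The function names are hypothetical.

```python
from collections import Counter


def summarize(lines):
    """Count Squid result codes (TCP_MISS, TCP_HIT, ...) in lines shaped like the table above."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Expected fields: date, time, client IP, result/status, URL.
        if len(fields) < 5 or "/" not in fields[3]:
            continue  # skip the header row and malformed lines
        result_code = fields[3].split("/", 1)[0]
        counts[result_code] += 1
    return counts


def hit_ratio(counts):
    """Fraction of requests whose result code contains HIT (a simplification)."""
    total = sum(counts.values())
    hits = sum(n for code, n in counts.items() if "HIT" in code)
    return hits / total if total else 0.0


sample = [
    "Date IP Status Address User Destination",
    "24.08.2017 12:42:18 10.10.10.100 TCP_MISS/200 https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/picture.jpg",
    "24.08.2017 12:42:17 10.10.10.100 TCP_MISS/200 https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/large1.pdf",
    "24.08.2017 12:42:16 10.10.10.100 TCP_MISS/200 https://tenant.sharepoint.com/sites/Marketing/Shared%20Documents/large1.docx",
]
counts = summarize(sample)
print(dict(counts), hit_ratio(counts))
```

For the three lines above this counts three TCP_MISS entries and a hit ratio of zero, consistent with the cache manager figures.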

      The cache manager info:

      Cache information for squid:
      Hits as % of all requests: 5min: 0.0%, 60min: 0.0%
      Hits as % of bytes sent: 5min: 0.0%, 60min: 0.0%
      Memory hits as % of hit requests: 5min: 0.0%, 60min: 0.0%
      Disk hits as % of hit requests: 5min: 0.0%, 60min: 0.0%
      Storage Swap size: 0 KB
      Storage Swap capacity: 0.0% used, 100.0% free
      Storage Mem size: 216 KB
      Storage Mem capacity: 0.2% used, 99.8% free
      Mean Object Size: 0.00 KB

      If I download the same files again, I keep getting TCP_MISS. I can't get any file into the cache.

      Microsoft serves all documents with the following Cache-Control header:

      HTTP/1.1 Cache-Control Header is present: private,max-age=0
      private: This response MUST NOT be cached by a shared cache.
      max-age: This resource will expire immediately. (0 sec)
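As a rough illustration of why a shared cache such as Squid refuses to store these responses by default, here is a minimal Python sketch that parses a Cache-Control value and applies a simplified storage check. It only looks at `private` and `no-store`, ignores quoted arguments that contain commas, and is not how Squid itself decides; the function names are invented for this example.

```python
def parse_cache_control(value):
    """Parse a Cache-Control header value into a dict of directive -> argument (or None)."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def shared_cache_may_store(directives):
    """Simplified rule: a shared cache must not store private or no-store responses."""
    return "private" not in directives and "no-store" not in directives


header = "private,max-age=0"
parsed = parse_cache_control(header)
print(parsed)
print(shared_cache_may_store(parsed))
```

With the header shown above, the check returns False, which is the behaviour that options such as ignore-private are meant to override.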

      I suppose Microsoft sets this header by default for security reasons.
      But I would like to know whether I can override it, and evaluate the security risk of doing so.
      Microsoft offers a lot of online storage space, but without a caching proxy it's pretty much useless for us.

      Thank you for any help you could give.

        olivier-m

        I forgot to add that I am actually using the WebDAV protocol.
        But it seems that Squid supports caching WebDAV: http://www.webdav.org/other/proxy.html

        I am not sure that this combination is going to work:

        1 - using WebDAV
        2 - over HTTPS
        3 - with files tagged with Cache-Control: no-cache / private

        Even if I finally get it working, I have read elsewhere that I run a high risk of corrupting data in the WebDAV repository.

        Please tell me if I'm wrong.

          olivier-m

          Long story short, I am now able to cache SharePoint documents with the following configuration file:

          
          # This file is automatically generated by pfSense
          # Do not edit manually !
          
          http_port 10.10.10.10:3128
          icp_port 0
          digest_generation off
          dns_v4_first on
          pid_filename /var/run/squid/squid.pid
          cache_effective_user squid
          cache_effective_group proxy
          error_default_language en
          icon_directory /usr/local/etc/squid/icons
          visible_hostname sv-1101-wvp01.virtualdesk.cloud
          cache_mgr pfsense@virtualdesk.cloud
          access_log /var/squid/logs/access.log
          cache_log /var/squid/logs/cache.log
          cache_store_log none
          netdb_filename /var/squid/logs/netdb.state
          pinger_enable on
          pinger_program /usr/local/libexec/squid/pinger
          
          logfile_rotate 7
          debug_options rotate=7
          shutdown_lifetime 3 seconds
          # Allow local network(s) on interface(s)
          acl localnet src  92.222.209.0/24
          forwarded_for on
          uri_whitespace strip
          
          cache_mem 128 MB
          maximum_object_size_in_memory 512 KB
          memory_replacement_policy heap GDSF
          cache_replacement_policy heap LFUDA
          minimum_object_size 0 KB
          maximum_object_size 20 MB
          cache_dir ufs /var/squid/cache 100 16 256
          offline_mode on
          cache_swap_low 90
          cache_swap_high 95
          cache allow all
          
          # Cache documents regardless what the server says
          refresh_pattern .jpg 60 90% 600 override-expire override-lastmod ignore-reload ignore-private
          refresh_pattern .gif 60 90% 600 override-expire override-lastmod ignore-reload ignore-private
          refresh_pattern .png 60 90% 600 override-expire override-lastmod ignore-reload ignore-private
          refresh_pattern .txt 60 90% 600 override-expire override-lastmod ignore-reload ignore-private
          refresh_pattern .doc 60 90% 600 override-expire override-lastmod ignore-reload ignore-private
          refresh_pattern .docx 60 90% 600 override-expire override-lastmod ignore-reload ignore-private
          refresh_pattern .xls 60 90% 600 override-expire override-lastmod ignore-reload ignore-private
          refresh_pattern .xlsx 60 90% 600 override-expire override-lastmod ignore-reload ignore-private
          refresh_pattern .pdf 60 90% 600 override-expire override-lastmod ignore-reload ignore-private
          
          # Setup acls
          acl allsrc src all
          http_access allow all
          
          request_body_max_size 0 KB
          delay_pools 1
          delay_class 1 2
          delay_parameters 1 -1/-1 -1/-1
          delay_initial_bucket_level 100
          delay_access 1 allow allsrc
          
          # Reverse Proxy settings
          https_port 10.10.10.10:443 accel cert=/usr/local/etc/squid/599eae0080989.crt key=/usr/local/etc/squid/599eae0080989.key
          cache_peer tenant.sharepoint.com parent 443 0 no-query no-digest originserver login=PASSTHRU connection-auth=on ssl sslflags=DONT_VERIFY_PEER front-end-https=auto name=rvp_sharepoint
          deny_info TCP_RESET allsrc
          
          

          Unfortunately, it is not fully working yet.
          The WebDAV client (Windows) refuses to download from the cache.

          Squid logs this error:

          TCP_OFFLINE_HIT_ABORTED/000

          (see attachment: ice_screenshot_20170825-152338.png)

            olivier-m

            I found the right configuration with the help of the Squid Users mailing list.
            I had to add several options to ignore Cache-Control and force the cache to keep and serve the content.
            It's working now.
            For the record, I'm posting the working Squid configuration below.

            
            http_port 10.10.10.10:3128
            icp_port 0
            digest_generation off
            dns_v4_first on
            pid_filename /var/run/squid/squid.pid
            cache_effective_user squid
            cache_effective_group proxy
            error_default_language en
            icon_directory /usr/local/etc/squid/icons
            visible_hostname pfSense Firewall
            cache_mgr pfsense@virtualdesk.cloud
            access_log /var/squid/logs/access.log
            cache_log /var/squid/logs/cache.log
            cache_store_log none
            netdb_filename /var/squid/logs/netdb.state
            pinger_enable on
            pinger_program /usr/local/libexec/squid/pinger
            
            logfile_rotate 7
            debug_options rotate=7
            shutdown_lifetime 3 seconds
            forwarded_for on
            uri_whitespace strip
            
            refresh_pattern -i \.(jpg|gif|png|txt|docx|xlsx|pdf) 30240 100% 43800 override-expire ignore-private ignore-reload store-stale
            
            cache_mem 128 MB
            maximum_object_size_in_memory 20480 KB
            memory_replacement_policy lru
            cache_replacement_policy lru
            minimum_object_size 0 KB
            maximum_object_size 50 MB
            cache_dir ufs /var/squid/cache 100 16 256
            offline_mode on
            cache_swap_low 90
            cache_swap_high 95
            cache allow all
            
            # Add any of your own refresh_pattern entries above these.
            refresh_pattern ^ftp:    1440  20%  10080
            refresh_pattern ^gopher:  1440  0%  1440
            refresh_pattern -i (/cgi-bin/|\?) 0  0%  0
            refresh_pattern .    0  20%  4320
            
            #ACL allow all
            acl allsrc src all
            http_access allow allsrc
            
            request_body_max_size 0 KB
            delay_pools 1
            delay_class 1 2
            delay_parameters 1 -1/-1 -1/-1
            delay_initial_bucket_level 100
            delay_access 1 allow allsrc
            
            # Reverse Proxy settings
            https_port 10.10.10.10:443 accel cert=/usr/local/etc/squid/599eae0080989.crt key=/usr/local/etc/squid/599eae0080989.key defaultsite=tenant.sharepoint.com vhost
            
            #
            cache_peer 13.107.6.151 parent 443 0 ignore-cc no-query no-digest originserver login=PASSTHRU connection-auth=on round-robin ssl sslflags=DONT_VERIFY_PEER front-end-https=auto name=rvp_sharepoint
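The min and max fields of refresh_pattern are given in minutes, so the values 30240 and 43800 in the working config are easier to read once converted to days. A tiny Python sketch of that arithmetic follows; the helper name `minutes_to_days` is just for illustration.

```python
MINUTES_PER_DAY = 24 * 60


def minutes_to_days(minutes):
    """Convert a refresh_pattern min/max value (minutes) to days."""
    return minutes / MINUTES_PER_DAY


min_days = minutes_to_days(30240)
max_days = minutes_to_days(43800)
print(min_days)            # 21.0
print(round(max_days, 1))  # 30.4
```

So, under this config, matching documents are treated as fresh for at least 21 days and at most roughly a month, which is worth weighing against how often the shared documents change.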
            
            