Netgate Discussion Forum

    [SOLVED] Corrupted PDF download through new pfSense installation

    General pfSense Questions
    • RustBucket

      I have run into a very strange issue while testing a new pfSense build for my home network. When I download a file with the pfSense device in the network path, the file is corrupted. So far I've only noticed it with this one file, but since it's so reproducible, I'm afraid of becoming the house pariah (more so than usual ;D) if I go live with the new system and it turns out this is the tip of the iceberg. I've calculated the MD5 checksum of the corrupted file, and it is always the same. I get one checksum and a viewable file when using my old router, and a different checksum and a file with missing pages when using the new router. This leads me to believe that the cause isn't a hardware issue or some other 'normal' problem, which presumably would corrupt the file more randomly, but rather something that is consistently changing the content. The URL of the file I'm downloading is https://dl.ubnt.com/guides/UniFi/UniFi_Controller_V5_UG.pdf
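For anyone repeating this comparison, here is a minimal Python sketch of the checksum step. The function names `file_digest` and `same_content` are made up for illustration, and the file paths in the usage note are placeholders; the post itself only says that MD5 checksums were compared.

```python
import hashlib


def file_digest(path, algorithm="md5", chunk_size=65536):
    """Return the hex digest of the file at `path`, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_content(path_a, path_b, algorithm="sha256"):
    """True if both files hash to the same digest under `algorithm`."""
    return file_digest(path_a, algorithm) == file_digest(path_b, algorithm)
```

Calling `same_content("via_pfsense.pdf", "via_old_router.pdf")` on two saved downloads (hypothetical file names) reports whether they match, and `file_digest(path, "sha256")` gives a value that can be compared with a digest someone else publishes for the same file.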

      My observations:

      • Download this Ubiquiti manual with pfSense inline (using Chrome or Safari browser) = corrupted file

      • Download bypassing pfSense (using original TP Link router) = intact file

      • Download with pfSense inline (using curl) = intact file

      • Sampling of other file downloads with pfSense inline (using Chrome or Safari) = apparently intact files

      I've checked the LAN and WAN interfaces, and they show 0 errors. I've also done a Wireshark packet capture, and the Analyze->Expert Information menu item shows nothing remarkable (a couple of out-of-order packets and a couple of duplicate ACKs, with the good and bad downloads showing roughly the same number). Although this is an HTTPS download, I disabled the transparent Squid proxy to be safe, and I get the same bad MD5 checksum with and without Squid active.
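A checksum only says that two downloads differ, not where. As a rough sketch (the helper name `first_difference` is hypothetical, not something used in the thread), the following Python function reports the byte offset at which two saved copies first diverge, which can help distinguish a truncated file from one whose content was altered partway through.

```python
def first_difference(path_a, path_b, chunk_size=65536):
    """Return the offset of the first differing byte, or None if identical.

    If one file is a strict prefix of the other, the length of the
    shorter file is returned.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Locate the exact differing byte inside this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # All overlapping bytes match, so one file ended early.
                return offset + min(len(a), len(b))
            if not a:
                # Both reads returned b"": reached EOF with no difference.
                return None
            offset += len(a)
```

Comparing the returned offset with the two file sizes shows whether the bad copy is simply shorter or differs somewhere in the middle.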

      I'm really at a loss as to how the combination of pfSense and a browser could affect this particular file download, while it works with curl, and works with the original router regardless of whether I use a browser or curl. I'd write it off as some weird anomaly, except that I don't have enough experience with the new hardware and software combination to know whether I should have confidence in it, and also because hell hath no fury like a family with flaky internet.

      Any bright ideas on what I should check next, or entertaining theories on what could be going on would be greatly appreciated.

      • Derelict LAYER 8 Netgate

        You do realize that downloading a PDF like that via HTTPS means the firewall cannot see any of the content inside that stream and has no concept of PDF or otherwise? The browser and web server ensure end-to-end encryption and authentication of every bit.

        Unless you've done something like HTTPS man-in-the-middle.

        Hint: Whatever you're seeing is not pfSense. Unless maybe ^

        ETA: Downloads fine here: SHA256(UniFi_Controller_V5_UG.pdf)= 7542671dac5d5f743ae4e56529872cff3b70f4e6557d947537926647785baab7

        Chattanooga, Tennessee, USA
        A comprehensive network diagram is worth 10,000 words and 15 conference calls.
        DO NOT set a source address/port in a port forward or firewall rule unless you KNOW you need it!
        Do Not Chat For Help! NO_WAN_EGRESS(TM)

        • RustBucket

          Yeah, that all occurred to me, yet I still turned off the Squid transparent proxy, even though I knew it shouldn't affect HTTPS traffic. To be honest, I didn't expect moving my Ethernet connection from the pfSense box to the original router to make a difference either, yet it did.

          Anyway, mystery (mostly) solved after stepping back and looking at the packet captures more closely. The PDF file is hosted on the Amazon CloudFront content delivery network. It turns out that I was downloading the PDF from different servers depending on which device I was using as a router. Not too surprising in retrospect, since different DNS resolvers could have different answers in their caches. I think what really threw me (apart from sitting at my computer for too many hours straight) was that curl always downloaded the correct content even when I was connected through my new pfSense installation. For whatever reason, curl on OS X was consistently getting the 'good' IP, while the browsers consistently used the 'bad' IP, which matched what I got when using dig against the pfSense resolver.
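To check which CDN addresses a given network path hands out, a small Python sketch along these lines may help. The names `resolve_ips` and `compare_answers` are hypothetical helpers, and `socket.getaddrinfo` only reflects what the operating system's resolver returns at that moment; it is an assumption that this matches what any particular browser or curl build would use, and the sketch does not explain why they differed.

```python
import socket


def resolve_ips(hostname, port=443):
    """Return the sorted, de-duplicated addresses the OS resolver gives."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


def compare_answers(ips_a, ips_b):
    """Return (shared, only_in_a, only_in_b) as sorted lists."""
    a, b = set(ips_a), set(ips_b)
    return sorted(a & b), sorted(a - b), sorted(b - a)


# Example use (requires network access): run once behind each router,
# then compare the two recorded lists.
#   behind_pfsense = resolve_ips("dl.ubnt.com")
#   behind_old_router = [...]  # list recorded on the other path
#   print(compare_answers(behind_pfsense, behind_old_router))
```

CDN answers often rotate between queries, so a single mismatch is not conclusive on its own; repeating the lookup a few times on each path gives a better picture.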

          In any case, my confidence in my new installation is restored, and I guess I'm just going to have to live with the curl vs. browser DNS resolution mystery.

          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.