Navigation

    Netgate Discussion Forum
    • Register
    • Login
    • Search
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Search

    ACME standalone broken in pfSense 2.4.2?

    ACME
    2
    6
    2020
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • K
      Ketchup_Bomb last edited by

      I made a few other changes to my pfSense machine, but upgraded to 2.4.2 was the biggest change. Wondering if it broke my existing ACME setup.

      I run standalone mode, with an HAproxy ACL:

         acl         ACME   path_beg -i /.well-known/acme-challenge/
      
      

      Which forwards HTTP traffic to the backend of the acme.sh script when it's running.

      –-

      When I attempted to create a new certificate for a new domain today, I got a message reading:

      [Tue Dec 5 17:32:14 PST 2017] REDACTED.example.com:Verify error:Fetching [url]http://REDACTED.example.com/.well-known/acme-challenge/GVi_q_EG8jQVcwj2w2v2O_bJRja29oqbarIYNOEb1To[/url]: Timeout
      
      

      I checked the REDACTED.example.com/acme_issuecert.log file and saw:

      [Tue Dec  5 17:32:04 PST 2017] checking
      [Tue Dec  5 17:32:04 PST 2017] GET
      [Tue Dec  5 17:32:04 PST 2017] url='[url]https://acme-staging.api.letsencrypt.org/acme/challenge/mN6TKFUw4D4TW0eU88l455rsOMzO5Y100c1xg-FOOBAR/81224058[/url]'
      [Tue Dec  5 17:32:04 PST 2017] timeout
      [Tue Dec  5 17:32:04 PST 2017] curl exists=0
      [Tue Dec  5 17:32:04 PST 2017] wget exists=127
      [Tue Dec  5 17:32:04 PST 2017] _CURL='curl -L --silent --dump-header /tmp/acme/REDACTED.example.com//http.header '
      [Tue Dec  5 17:32:05 PST 2017] ret='0'
      [Tue Dec  5 17:32:05 PST 2017] original='{
        "type": "http-01",
        "status": "pending",
        "uri": "https://acme-staging.api.letsencrypt.org/acme/challenge/mN6TKFUw4D4TW0eU88l455rsOMzO5Y100c1xg-FOOBAR/81224058",
        "token": "GVi_q_EG8jQVcwj2w2v2O_bJRja29oqbarIYNOEb1To",
        "keyAuthorization": "GVi_q_EG8jQVcwj2w2v2O_bJRja29oqbarIYNOEb1To.SWhpVHEbWXGHa_ckmbUPkkwK6AJ-328vsI-mQuZhiAE"
      }'
      [Tue Dec  5 17:32:05 PST 2017] response='{"type":"http-01","status":"pending","uri":"https://acme-staging.api.letsencrypt.org/acme/challenge/mN6TKFUw4D4TW0eU88l455rsOMzO5Y100c1xg-FOOBAR/81224058","token":"GVi_q_EG8jQVcwj2w2v2O_bJRja29oqbarIYNOEb1To","keyAuthorization":"GVi_q_EG8jQVcwj2w2v2O_bJRja29oqbarIYNOEb1To.SWhpVHEbWXGHa_ckmbUPkkwK6AJ-328vsI-mQuZhiAE"}'
      [Tue Dec  5 17:32:05 PST 2017] Pending
      [Tue Dec  5 17:32:05 PST 2017] sleep 2 secs to verify
      
      

      These lines are repeated ~4 times before finally reporting:

      [Tue Dec  5 17:32:14 PST 2017] checking
      [Tue Dec  5 17:32:14 PST 2017] GET
      [Tue Dec  5 17:32:14 PST 2017] url='[url]https://acme-staging.api.letsencrypt.org/acme/challenge/mN6TKFUw4D4TW0eU88l455rsOMzO5Y100c1xg-FOOBAR/81224058[/url]'
      [Tue Dec  5 17:32:14 PST 2017] timeout
      [Tue Dec  5 17:32:14 PST 2017] curl exists=0
      [Tue Dec  5 17:32:14 PST 2017] wget exists=127
      [Tue Dec  5 17:32:14 PST 2017] _CURL='curl -L --silent --dump-header /tmp/acme/REDACTED.example.com//http.header '
      [Tue Dec  5 17:32:14 PST 2017] ret='0'
      [Tue Dec  5 17:32:14 PST 2017] original='{
        "type": "http-01",
        "status": "invalid",
        "error": {
          "type": "urn:acme:error:connection",
          "detail": "Fetching [url]http://REDACTED.example.com/.well-known/acme-challenge/GVi_q_EG8jQVcwj2w2v2O_bJRja29oqbarIYNOEb1To[/url]: Timeout",
          "status": 400
        },
        "uri": "https://acme-staging.api.letsencrypt.org/acme/challenge/mN6TKFUw4D4TW0eU88l455rsOMzO5Y100c1xg-FOOBAR/81224058",
        "token": "GVi_q_EG8jQVcwj2w2v2O_bJRja29oqbarIYNOEb1To",
        "keyAuthorization": "GVi_q_EG8jQVcwj2w2v2O_bJRja29oqbarIYNOEb1To.SWhpVHEbWXGHa_ckmbUPkkwK6AJ-328vsI-mQuZhiAE",
        "validationRecord": [
          {
            "url": "http://REDACTED.example.com/.well-known/acme-challenge/GVi_q_EG8jQVcwj2w2v2O_bJRja29oqbarIYNOEb1To",
            "hostname": "REDACTED.example.com",
            "port": "80",
            "addressesResolved": [
              "1.2.3.4"
            ],
            "addressUsed": "1.2.3.4",
            "addressesTried": []
          }
        ]
      }'
      [Tue Dec  5 17:32:14 PST 2017] response='{"type":"http-01","status":"invalid","error":{"type":"urn:acme:error:connection","detail":"Fetching [url]http://REDACTED.example.com/.well-known/acme-challenge/GVi_q_EG8jQVcwj2w2v2O_bJRja29oqbarIYNOEb1To[/url]: Timeout","status": 400},"uri":"https://acme-staging.api.letsencrypt.org/acme/challenge/mN6TKFUw4D4TW0eU88l455rsOMzO5Y100c1xg-FOOBAR/81224058","token":"GVi_q_EG8jQVcwj2w2v2O_bJRja29oqbarIYNOEb1To","keyAuthorization":"GVi_q_EG8jQVcwj2w2v2O_bJRja29oqbarIYNOEb1To.SWhpVHEbWXGHa_ckmbUPkkwK6AJ-328vsI-mQuZhiAE","validationRecord":[{"url":"http://REDACTED.example.com/.well-known/acme-challenge/GVi_q_EG8jQVcwj2w2v2O_bJRja29oqbarIYNOEb1To","hostname":"REDACTED.example.com","port":"80","addressesResolved":["73.158.251.232"],"addressUsed":"73.158.251.232","addressesTried":[]}]}'
      [Tue Dec  5 17:32:14 PST 2017] error='"error":{"type":"urn:acme:error:connection","detail":"Fetching [url]http://REDACTED.example.com/.well-known/acme-challenge/GVi_q_EG8jQVcwj2w2v2O_bJRja29oqbarIYNOEb1To[/url]: Timeout","status": 400'
      [Tue Dec  5 17:32:14 PST 2017] errordetail='Fetching [url]http://REDACTED.example.com/.well-known/acme-challenge/GVi_q_EG8jQVcwj2w2v2O_bJRja29oqbarIYNOEb1To[/url]: Timeout'
      [Tue Dec  5 17:32:14 PST 2017] REDACTED.example.com:Verify error:Fetching [url]http://REDACTED.example.com/.well-known/acme-challenge/GVi_q_EG8jQVcwj2w2v2O_bJRja29oqbarIYNOEb1To[/url]: Timeout
      
      

      After this point it seems to make another request to localhost:8082 (where the standalone instance is running)? This fails because the socket is closed at this point. And finally tries to ping LE one more time before quitting:

      [Tue Dec  5 17:32:16 PST 2017] original='{
        "type": "urn:acme:error:malformed",
        "detail": "Unable to update challenge :: The challenge is not pending.",
        "status": 400
      }'
      [Tue Dec  5 17:32:16 PST 2017] responseHeaders='HTTP/1.1 100 Continue
      Expires: Wed, 06 Dec 2017 01:32:16 GMT
      Cache-Control: max-age=0, no-cache, no-store
      Pragma: no-cache
      
      HTTP/1.1 400 Bad Request
      Server: nginx
      Content-Type: application/problem+json
      Content-Length: 132
      Boulder-Requester: 2578726
      Replay-Nonce: 4UZdr2lNsdFgkD_GW5-COZ5bWPKOxK-R7r4TBOvuEAs
      Expires: Wed, 06 Dec 2017 01:32:16 GMT
      Cache-Control: max-age=0, no-cache, no-store
      Pragma: no-cache
      Date: Wed, 06 Dec 2017 01:32:16 GMT
      Connection: close
      ^M'
      [Tue Dec  5 17:32:16 PST 2017] response='{"type":"urn:acme:error:malformed","detail":"Unable to update challenge :: The challenge is not pending.","status": 400}'
      [Tue Dec  5 17:32:16 PST 2017] code='400'
      
      

      (FOOBAR has replaced a portion of the LE token for redaction purposes.)

      –-

      I verified with a packet capture that LE is successfully reaching in through the firewall and HAproxy frontend to the backend:

      GET /.well-known/acme-challenge/WH7JUzQRpshMc5cvz3MDkHX3NBK-VozfezHARFQdViI HTTP/1.1
      Host: REDACTED.example.com
      User-Agent: Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)
      Accept: */*
      Accept-Encoding: gzip
      Connection: close
      
      

      And the response from my device:

      HTTP/1.1 200 OK
      
      WH7JUzQRpshMc5cvz3MDkHX3NBK-VozfezHARFQdViI.SWhpVHEbWXGHa_ckmbUPkkwK6AJ-328vsI-mQuZhiAE
      
      

      So the token is in place, as well as the keying mechanism. (Right? As far as I know the ACME protocol, things seem ok to this point.)

      –-

      I don't fully understand what is supposed to come next for the HTTP challenge model. From the logs it looks like the ACME client on my device is attempting to reach out to LE again but the challenge has moved from pending state to invalid, and rejects the ACME client's message with urn:acme:error:malformed.

      All of this process was working before the update to 2.4.2, though I never dug this deep into what was happening in the protocol since it was working.  :P
      Any thoughts? Thanks for any help.

      1 Reply Last reply Reply Quote 0
      • jimp
        jimp Rebel Alliance Developer Netgate last edited by

        It's possible there is something broken there. I know the acme.sh project has changed how they handle standalone mode and we haven't updated to the latest version of the script yet. There are changes coming (or maybe already done) on the LE side where they verify updates in a way that isn't compatible with the current standalone method.

        1 Reply Last reply Reply Quote 0
        • jimp
          jimp Rebel Alliance Developer Netgate last edited by

          I just synchronized us up to the latest acme.sh and I tested standalone on the new code and it worked. Update your ACME package to 0.1.30 and give it a shot.

          1 Reply Last reply Reply Quote 0
          • K
            Ketchup_Bomb last edited by

            Thanks for the input, jimp.

            I checked yesterday if the ACME package had an update and it didn't. I see a new commit in the pfsense/FreeBSD-ports repo – is this the update? Thanks for doing this work!

            I had done a few executions of the acme.sh script from the command line with the –debug 2 argument and noticed that it was complaining about nc and socat. I was going to look into that today.

            –-

            I just updated the ACME package to 2.7.5 and am running into the same timeout issue. When running with –debug 2, I'm seeing this towards the end of the execution:

            socat:
            socat by Gerhard Rieger and contributors - see www.dest-unreach.org
            Usage:
            socat [options] <bi-address><bi-address>options:
               ...</bi-address></bi-address> 
            

            Which makes me think that the invocation of socat is off in some way? This is similar output that the previous version of ACME had when trying to invoke nc.
            It still seems like LE reaches into my infrastructure, I reply with the proper nonce for the token, and then the ACME client polls the LE challenge a few times, waiting for the state to change from pending to something. The state eventually becomes invalid, and the ACME client fails.

            I suppose I can post in the acme.sh repo to see if Neilpang has any input since you've confirmed it's working on pfSense. Perhaps there is some file state on disk that is still causing it to fail on my end? I'll try with a fresh domain and confirmation process.

            Thanks jimp.

            1 Reply Last reply Reply Quote 0
            • jimp
              jimp Rebel Alliance Developer Netgate last edited by

              Yes that is the correct commit you linked from FreeBSD-ports.

              Did you update the ACME package to 0.1.30 or did you manually change files or copy things from the acme.sh project? Our files are not straight copies of theirs for the bulk of the code, there are some platform-specific differences, so if you tried to use the acme.sh code directly it may not work properly.

              I'm not sure what might be happening with your setup but I don't run acme.sh by hand, I let the package invoke everything and it all works properly. Unless you carefully setup the environment, running it manually is doomed to fail. I wouldn't consider that a good/proper test.

              If you get a timeout, the #1 suspect is that you aren't allowing traffic to hit the firewall's WAN address on port 80.

              1 Reply Last reply Reply Quote 0
              • K
                Ketchup_Bomb last edited by

                Ah ha! I found the issue! jimp, was HAProxy updated with the upgrade to 2.4.2?

                Using the standalone method, I created a backend in the HAProxy UI with a single server bound to localhost on port 8082, with no health checks, or timeout/retry settings.
                (I didn't use health checks was because this backend is "down" whenever the acme.sh script isn't running. So I just ignored the overhead of doing a health check.)

                The problem here is there is an IPv4 and IPv6 address for localhost, so in the newest version of HAProxy, it actually created two servers even though the UI only had one specified:

                backend 0_HTTP_ACME_Standalone_http_ipvANY
                	mode			http
                	log			global
                	timeout connect		30000
                	timeout server		30000
                	retries			3
                	server			pfsense_0 127.0.0.1:8082  
                	server			pfsense_1 ::1:8082  
                
                

                The ACME client only binds to the IPv4 interface using socat:

                [2.4.2-RELEASE][root@pfsense]/tmp/acme: sockstat -l46 | grep 8082
                root     socat      96563 5  tcp4   *:8082                *:*
                
                

                (I ran this sockstat command during the execution of certificate creation/renewal and it only ever listens on 127.0.0.1:8082.)

                And the nail in the coffin is that with the None option specified in the load balancing section of the backend, it defaults to Round Robin:

                @HAProxy:

                The load balancing algorithm of a backend is set to roundrobin when no other
                algorithm, mode nor option have been set. The algorithm may only be set once
                for each backend.

                So LE was successfully reaching in to my infrastructure (as I noted with the packet capture) on the 127.0.0.1:8082 server, but when it attempted to reach in again, it would be Round Robin'd to the ::1:8082 server, to which the ACME client wasn't bound. This would then timeout and cause the validation process to fail.

                –-

                I've adjusted the backend to only listen on 127.0.0.1, not localhost, but I would be more satisfied if I knew the proper knobs to turn such that the HAProxy backend would timeout quickly, and a new request would be issued to the next server in the list. Alas, I will have to figure that out another day.

                Thanks for your help, jimp!

                –-

                PS - I actually tried to pull the HEAD of acme.sh's repo to execute on the pfSense box, but as soon as I saw the output I knew there had to be adjustments made to the source code that made it compatible with pfSense. I gave that up quickly.

                I also only ran it from the command line because the ACME UI actually prints out the full command of what it's executing under the covers. I thought it was safe to copy and paste it so I could have control over execution during debugging.  :)

                Thanks again

                1 Reply Last reply Reply Quote 0
                • First post
                  Last post

                Products

                • Platform Overview
                • TNSR
                • pfSense
                • Appliances

                Services

                • Training
                • Professional Services

                Support

                • Subscription Plans
                • Contact Support
                • Product Lifecycle
                • Documentation

                News

                • Media Coverage
                • Press
                • Events

                Resources

                • Blog
                • FAQ
                • Find a Partner
                • Resource Library
                • Security Information

                Company

                • About Us
                • Careers
                • Partners
                • Contact Us
                • Legal
                Our Mission

                We provide leading-edge network security at a fair price - regardless of organizational size or network sophistication. We believe that an open-source security model offers disruptive pricing along with the agility required to quickly address emerging threats.

                Subscribe to our Newsletter

                Product information, software announcements, and special offers. See our newsletter archive to sign up for future newsletters and to read past announcements.

                © 2021 Rubicon Communications, LLC | Privacy Policy