HaProxy and LetsEncrypt Cert Renewal Failure without 443 Port Forward



  • I have a pfSense 2.4.1 device running haproxy .52.14.

    Haproxy forwards 8 http and 8 https sites to two different Apache2 servers (one server hosts wordpress, the other hosts nextcloud).

    I'm using Let's Encrypt certs and since they expire every 30 days, haproxy forwards based on the port and what the name contains as opposed to handling the certificate itself.  The apache2 server on each host then forwards any port 80 requests to 443 via Rewrite.

    This works great.  Each request is forwarded to the correct host and ends up with a valid SSL certificate regardless of whether it was initiated with SSL or not.

    The ONLY issue that I've run into is that my CRON job to renew the certificates fails (error below).  I can work around this by enabling a Port Forward rule in the firewall that temporarily redirects all port 443 traffic to the host needing renewal.  After renewal, I disable the port forward and all is back to normal.  It takes about 25 seconds so it isn't a big deal.  To clarify, if I run the renewal process manually, without the port forward, it fails too.  Didn't want to mislead anyone into thinking it might be an issue with my CRON job.

    Any idea what I'm missing?

    The error letsencrypt throws without the port forward:

    2017-10-26 06:36:27,694:WARNING:letsencrypt.cli:Attempting to renew cert from /etc/letsencrypt/renewal/foo.com.conf produced an unexpected error: Failed authorization procedure. foo.com (tls-sni-01): urn:acme:error:connection :: The server could not connect to the client to verify the domain :: Error getting validation data, www.foo.com (tls-sni-01): urn:acme:error:connection :: The server could not connect to the client to verify the domain :: Error getting validation data. Skipping.
    
    

    I can immediately enable the port forward and the renewal process works successfully.


  • Rebel Alliance Developer Netgate

    What exact setup are you using for your ACME certificate config? What validation method?



  • I'm NOT using ACME.

    My wordpress host and nextcloud host manage their own certificates.


  • Rebel Alliance Developer Netgate

    That doesn't change the question though…

    It does at least appear to be triggering tls-sni-01 so it should be sending a hostname, but probably it's something in the request that haproxy can't differentiate well enough. How is the web server performing the redirect? I have heard that ACME doesn't like following things like a 301 before.

    You would be much better off putting the certificates on the firewall and letting the ACME package and HAProxy deal with the ACME validation.

    Otherwise you might have to set it up so that the proxy and web servers do not perform an HTTPS redirect on the request for the ACME validation files, so it stays HTTP.



    I've done nothing with the Let's Encrypt client other than install it and request the certs.  So whatever config its ACME client started with, it still has.

    Perhaps something on my haproxy front end(s) needs to be tweaked?  I'm in SSL mode forwarding to the specified host based on "Server Name Indication TLS extension contains".

    The webserver is doing the redirect with a RewriteEngine on in the conf file.

    RewriteEngine on
    RewriteCond %{SERVER_NAME} =foo.com [OR]
    RewriteCond %{SERVER_NAME} =foo.foo.com
    RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [END,QSA,R=permanent]

    Is this what you meant by your redirect question?

    I'm hesitant to move to the ACME package, simply because there still seem to be some growing pains.  My process works well, I just have to SSH into my hosts once every couple of weeks and run the LE update after enabling my port forward.  It would be great if I could determine what I had missed.

    By the way, thanks for your time and assistance.  You help so many people on here.


  • Rebel Alliance Developer Netgate

    First I would try setting up rewrite conditions to exclude the ACME request. There should be some examples around for that, based on the path requested, for example:

    RewriteCond %{REQUEST_URI} !^.well-known/acme-challenge
    

    That should eliminate the bulk of your issues. If that works, you could slowly tweak other things but I'd just let ACME through to HTTP like it wants.



  • Ok, so I added this to my sites-available and restarted apache.

    RewriteEngine on
    RewriteCond %{SERVER_NAME} =foo.com [OR]
    RewriteCond %{SERVER_NAME} =www.foo.com
    RewriteCond %{REQUEST_URI} !^.well-known/acme-challenge
    RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [END,QSA,R=permanent]

    Running letsencrypt renew resulted in:
    2017-10-27 08:18:43,783:WARNING:letsencrypt.cli:Attempting to renew cert from /etc/letsencrypt/renewal/foo.com.conf produced an unexpected error: Failed authorization procedure. foo.com (tls-sni-01): urn:acme:error:connection :: The server could not connect to the client to verify the domain :: Error getting validation data. Skipping.

    For my understanding, why do we think it's a rewrite issue if adding the port forward for 443 on pfSense allows the unchanged Rewrite config to work successfully?  If I'm showing a lack of understanding of the ACME client, forgive me, but it seems to me that once the verification gets to the web server, all is fine.  Without the port forward, the validation request gets hung up at haproxy and dies - but I'm far from an expert.


  • Rebel Alliance Developer Netgate

    The port forward bypasses the need for SNI to function, which would appear to be at least part of the problem.

    My mod_rewrite skills are rusty so I'm not sure if that's correct syntax, maybe put it first? Or google around for ACME mod_rewrite examples, there are bound to be some complete ones out there.



  • Thanks, that makes a lot more sense to me now.

    I've found a couple different options, trying this now:

    RewriteCond %{REQUEST_URI} ^/.well-known
    RewriteRule . - [L]

    Unfortunately, I've requested too many renewals now, so I have to wait a bit before I can test further.



  • While waiting for my rate limit to reset, I did a bit more research.

    My site.conf file rewrite looks like this now:

    RewriteEngine on
    RewriteCond %{REQUEST_URI} !^/.well-known/acme-challenge
    RewriteCond %{SERVER_NAME} =www.foo.com
    RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [END,QSA,R=permanent]

    I created a www.foo.com/.well-known/acme-challenge/test.txt and can confirm it does NOT redirect when accessed via HTTP.  So, I guess we'll see what happens when I can test my cert renewal in another 35 minutes.
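
A possible explanation for why the earlier condition (without the leading slash) had no visible effect while this one does: in a virtual host, REQUEST_URI begins with "/", so a pattern anchored as ^.well-known can never match it. The sketch below reproduces the two patterns from this thread with Python's re module, on the assumption that it behaves like mod_rewrite's PCRE for patterns this simple. The helper name is_excluded is made up for illustration.

```python
import re

# The two conditions tried in this thread, without the leading "!" negation.
NO_SLASH = re.compile(r"^.well-known/acme-challenge")
WITH_SLASH = re.compile(r"^/.well-known/acme-challenge")


def is_excluded(pattern, request_uri):
    """True if the pattern matches, i.e. the negated RewriteCond would fail
    and the HTTPS redirect would be skipped for this request."""
    return pattern.search(request_uri) is not None


# REQUEST_URI as Apache would see it for the test file: it starts with "/".
uri = "/.well-known/acme-challenge/test.txt"

print(is_excluded(NO_SLASH, uri))    # False: "." consumes "/", then "w" != "."
print(is_excluded(WITH_SLASH, uri))  # True: the redirect is skipped
```

Note that this only concerns plain HTTP requests on port 80. The renewal error in this thread names tls-sni-01, which validates over TLS on port 443, so the rewrite rules may not be involved in the failure at all.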



  • No love.

    As I mentioned, I confirmed www.foo.com/.well-known/acme-challenge isn't redirecting, but the renewal is still failing.

    Oddly, the error.log shows this:

    AH00112: Warning: DocumentRoot [/var/lib/letsencrypt/tls_sni_01_page/] does not exist
    [Fri Oct 27 09:47:19.831094 2017] [ssl:warn] [pid 35459] AH01906: b0e858dd145cedac69bf9f2ff813bdce.4aff4cc0f637d578fdb7e19834ea33dc.acme.invalid:443:0 server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)
    [Fri Oct 27 09:47:19.831626 2017] [mpm_prefork:notice] [pid 35459] AH00163: Apache/2.4.18 (Ubuntu) OpenSSL/1.0.2g configured -- resuming normal operations
    [Fri Oct 27 09:47:19.831636 2017] [core:notice] [pid 35459] AH00094: Command line: '/usr/sbin/apache2'
    [Fri Oct 27 09:47:26.617630 2017] [mpm_prefork:notice] [pid 35459] AH00171: Graceful restart requested, doing restart
    [Fri Oct 27 09:47:26.657069 2017] [mpm_prefork:notice] [pid 35459] AH00163: Apache/2.4.18 (Ubuntu) OpenSSL/1.0.2g configured -- resuming normal operations
    [Fri Oct 27 09:47:26.657091 2017] [core:notice] [pid 35459] AH00094: Command line: '/usr/sbin/apache2'

    If I can figure out how to force a renewal, I'll test on my other host and see if I can replicate it.
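
The acme.invalid name in the error.log above hints at why the port forward helps. With tls-sni-01, the validation server connects on port 443 and sends a generated *.acme.invalid name as the SNI value, not foo.com. A frontend that routes on "SNI contains <site name>" would then have nothing to match on. The sketch below is a toy model of that routing decision, not HAProxy's actual logic. The BACKENDS mapping and the pick_backend helper are hypothetical, and the second site name is invented.

```python
# Toy model of "Server Name Indication TLS extension contains" routing.
# This is an illustration only; it is not how HAProxy is implemented.

BACKENDS = {
    "foo.com": "wordpress",        # assumed mapping, for illustration only
    "cloud.example": "nextcloud",  # hypothetical second site
}


def pick_backend(sni, rules=BACKENDS):
    """Return the first backend whose substring appears in the SNI name,
    or None when no rule matches (the connection has nowhere to go)."""
    name = sni.lower()
    for needle, backend in rules.items():
        if needle in name:
            return backend
    return None


# Ordinary browser request: the SNI name contains the site name.
print(pick_backend("www.foo.com"))  # wordpress

# SNI name of the kind logged by Apache during the tls-sni-01 attempt.
challenge = (
    "b0e858dd145cedac69bf9f2ff813bdce."
    "4aff4cc0f637d578fdb7e19834ea33dc.acme.invalid"
)
print(pick_backend(challenge))  # None
```

If this is what is happening, the temporary port forward works because it sends every port 443 connection to the one host regardless of SNI. A default backend or an extra rule for names ending in acme.invalid might achieve the same thing, though with more than one web server the proxy would still need some way to know which host is renewing.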

