HAproxy + Acme package = 503 Error servers not available locally
-
Hello All,
Let me start by saying I have spent a few weeks researching this on and off. Despite many posts here and elsewhere, I still can't seem to get things working. I appreciate any help anyone can provide! If I don't provide all the details you need, please just let me know, and I'll try to post them. THANKS in advance!
Background:
I'm running pfSense 2.4.4-Release on an old Dell R610 dedicated to be just the firewall/router. My installation is pretty vanilla with only Suricata, pfBlockerNG, ntopng and OpenVPN added as packages. On an adjacent R910 running as virtual machines, I have Plex, NextCloud, Home Assistant and a basic web server. While all are in initial stages of deployment, they are all accessible and working via their local IP addresses.
Wanting access from outside my network, I've set up a domain with Google and all my sub-domains are mapped. Next, I started down the path of reverse proxies. Initially I got things working with Squid both internally and externally, but the performance was deplorable and most posts suggested HAProxy is better integrated into pfSense, so I disabled Squid and implemented HAProxy version 1.7.11.
Current Situation:
At this point, I have my important backends set up, along with redirecting frontends listening on WAN and Let's Encrypt certificates via the ACME package. All servers are accessible from outside my network via the FQDN. Redirects to https work as planned and they all show up as trusted, so that's great. Basically everything works from outside the house, or most of the time from inside the house using a cellular connection.
The Problems:
- When I try to access any of the servers via their domain names from the LAN, things don't fall apart so much as just stare blankly at me. (That gives me an idea for a 404 page though.) I do get slightly different results from one server, though, which I think can be attributed to DNS.
Main landing webserver: www.[domain].net results in CONNECTION TIMEOUT
Nextcloud server: cloud.[domain].net results in INVALID REDIRECT
Plex Media Server: plex.[domain].net results in CONNECTION TIMEOUT
Note: From what I see, I have set each up the same way with one exception: the hostname of the Nextcloud server is "cloud", so its host name on the domain matches the FQDN. I can rename the others if that will fix it.
- The acme script fails with a 503 error. It was working initially, but since things got elaborate (not sure when), I cannot renew certificates.
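For anyone hitting the same LAN symptom, here is a minimal sketch of a check that can be run from a LAN client to see what a name resolves to and whether ports 80 and 443 answer at that address. It is only a diagnostic aid, not part of the original setup; the function names and the `www.example.net` hostname are made up for illustration. Keep in mind that the WAN address in the config further down is itself a private address, so the "private/public" label is only a hint and the actual IP printed is what matters.

```python
import ipaddress
import socket


def classify_address(ip: str) -> str:
    """Return 'private' for private or loopback addresses, otherwise 'public'."""
    return "private" if ipaddress.ip_address(ip).is_private else "public"


def probe(hostname: str, ports=(80, 443), timeout: float = 3.0) -> None:
    """Print what `hostname` resolves to and whether each port accepts a TCP connection."""
    try:
        ip = socket.gethostbyname(hostname)
    except socket.gaierror as exc:
        print(f"{hostname}: DNS lookup failed ({exc})")
        return
    print(f"{hostname} -> {ip} ({classify_address(ip)})")
    for port in ports:
        try:
            # A successful connect only proves something is listening there,
            # not that it is the expected proxy or web server.
            with socket.create_connection((ip, port), timeout=timeout):
                print(f"  port {port}: connected")
        except OSError as exc:
            print(f"  port {port}: failed ({exc})")


# Example call (hypothetical hostname, substitute your own):
# probe("www.example.net")
```

If the name resolves to an address where nothing listens for LAN clients, a timeout like the ones listed above would be the expected result.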
What I've tried:
An exhaustive Google search, trying every option that made even remote sense, and some that didn't.
Backed out all the crazy options to return to my initial configuration.
Health check on and off.
Transparent ClientIP on and off. This one is interesting because in Squid there is a Transparent proxy setting that is what got things working. In HAProxy, though, it seems you should only enable that setting if you have a legitimate DMZ. Since my servers are on the same subnet, I think I should leave this disabled. Please correct me if I'm wrong.
Frontends for both http and https listening on LAN. These have been deleted just to keep the config shorter.
Various misguided NAT rules directing LAN:80 and LAN:443 traffic to various places on the firewall (192.168.1.1, 127.0.0.1 and 192.168.1.250, because that option under NAT says "Virtual IP for Reverse Proxy". I'm not sure at this point if this came from something I read or if it was set up automatically.)
Flipping random buttons and switches, then hopefully putting them all back.
KEY: Following one particular tutorial, I used the HTTP redirect instead of Use Backend for many ACLs. Some worked, but eventually I switched all actions back to Use Backend. I think this might be key because the HAProxy configuration contains a few lines that seem to be remnants of the old http-redirect, and I don't know if they could be part of the problem:
From the HTTPS Section:
Despite this in the GUI:
From the HTTP Section
Despite this in the GUI:
At this point, Squid is disabled although not uninstalled. I can get to each server remotely via FQDN and via LAN using IP address, but I cannot get to them over LAN using FQDN and I am unable to renew or get new certificates. I am not sure if the two are related, but HAProxy is at the center of everything so I probably need to solve that first.
Here is my full config, with domain and ports replaced with [alias]:
# Automaticaly generated, dont edit manually.
# Generated on: 2019-02-08 20:34
global
    maxconn 50
    stats socket /tmp/haproxy.socket level admin
    uid 80
    gid 80
    nbproc 1
    hard-stop-after 15m
    chroot /tmp/haproxy_chroot
    daemon
    tune.ssl.default-dh-param 2048
    server-state-file /tmp/haproxy_server_state

listen HAProxyLocalStats
    bind 127.0.0.1:2200 name localstats
    mode http
    stats enable
    stats admin if TRUE
    stats show-legends
    stats uri /haproxy/haproxy_stats.php?haproxystats=1
    timeout client 5000
    timeout connect 5000
    timeout server 5000

frontend [domain].net_HTTPS_Frontend
    bind 192.168.254.210:443 name 192.168.254.210:443 no-sslv3 ciphers ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256 ssl crt-list /var/etc/haproxy/[domain]_HTTPS_Frontend.crt_list
    mode http
    log global
    option http-keep-alive
    option forwardfor
    acl https ssl_fc
    http-request set-header X-Forwarded-Proto http if !https
    http-request set-header X-Forwarded-Proto https if https
    timeout client 30000
    acl www-acl var(txn.txnhost) -m str -i www.[domain].net
    acl cloud-acl var(txn.txnhost) -m str -i cloud.[domain].net
    acl plex-acl var(txn.txnhost) -m str -i plex.[domain].net
    acl aclcrt_ [domain].net_HTTPS_Frontend var(txn.txnhost) -m reg -i ^cloud\.[domain]\.net(:([0-9]){1,5})?$
    acl aclcrt_ [domain].net_HTTPS_Frontend var(txn.txnhost) -m reg -i ^plex\. [domain] \.net(:([0-9]){1,5})?$
    acl aclcrt_ [domain].net_HTTPS_Frontend var(txn.txnhost) -m reg -i ^sol\. [domain] \.net(:([0-9]){1,5})?$
    acl aclcrt_ [domain].net_HTTPS_Frontend var(txn.txnhost) -m reg -i ^www\. [domain] \.net(:([0-9]){1,5})?$
    http-request set-var(txn.txnhost) hdr(host)
    use_backend webserver_ipvANY if www-acl aclcrt_[domain]_HTTPS_Frontend
    use_backend nextcloud_ipvANY if cloud-acl aclcrt_[domain]_HTTPS_Frontend
    use_backend plex_ipvANY if plex-acl aclcrt_[domain]_HTTPS_Frontend

frontend [domain]_HTTP_Frontend
    bind 192.168.254.210:80 name 192.168.254.210:80
    mode http
    log global
    option http-keep-alive
    option forwardfor
    acl https ssl_fc
    http-request set-header X-Forwarded-Proto http if !https
    http-request set-header X-Forwarded-Proto https if https
    timeout client 30000
    acl acme-acl var(txn.txnpath) -m beg -i .well-known/acme-challenge/
    acl www-redirect-acl var(txn.txnhost) -m str -i www.[domain].net
    acl cloud-redirect-acl var(txn.txnhost) -m str -i cloud.[domain].net
    acl plex-redirect-acl var(txn.txnhost) -m str -i plex.[domain].net
    acl vpn-redirect-acl var(txn.txnpath) -m str -i vpn.[domain].net
    acl mqtt-redirect-acl var(txn.txnpath) -m str -i mqtt.[domain].net
    http-request set-var(txn.txnpath) path
    http-request set-var(txn.txnhost) hdr(host)
    use_backend acme-gateway-server_ipvANY if acme-acl
    use_backend webserver_ipvANY if www-redirect-acl
    use_backend plex_ipvANY if plex-redirect-acl
    use_backend nextcloud_ipvANY if cloud-redirect-acl
    use_backend VPN_Server_ipvANY if vpn-redirect-acl
    use_backend mqtt-http_ipvANY if mqtt-redirect-acl

backend webserver_ipvANY
    mode http
    id 103
    log global
    timeout connect 30000
    timeout server 30000
    retries 3
    option httpchk OPTIONS /
    server webserver 192.168.1.223:443 id 104 ssl check-ssl check inter 1000 verify none

backend nextcloud_ipvANY
    mode http
    id 105
    log global
    errorfile 503 /var/etc/haproxy/errorfile_nextcloud_ipvANY_503_CLOUDDOWN
    timeout connect 30000
    timeout server 30000
    retries 3
    option httpchk OPTIONS /
    server nextcloud_server 192.168.1.8:443 id 106 ssl check-ssl check inter 1000 verify none

backend plex_ipvANY
    mode http
    id 107
    log global
    timeout connect 30000
    timeout server 30000
    retries 3
    server plex_media_server 192.168.1.222:[plex_port] id 108 ssl check-ssl check inter 1000 verify none

backend acme-gateway-server_ipvANY
    mode http
    id 109
    log global
    timeout connect 30000
    timeout server 30000
    retries 3
    server webserver 127.0.0.1:80 id 110

backend VPN_Server_ipvANY
    mode http
    id 102
    log global
    timeout connect 30000
    timeout server 30000
    retries 3
    server OpenVPN 127.0.0.1:[VPN_port] id 111

backend mqtt-http_ipvANY
    mode http
    id 114
    log global
    timeout connect 30000
    timeout server 30000
    retries 3
    server mqtt_http 192.168.1.249:[mqtt_port] id 115
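To make the routing logic in a config like this easier to reason about, here is a rough Python model of how `use_backend` rules with `-m str -i` and `-m beg -i` ACLs select a backend. It is a simplified sketch, not HAProxy itself: the rule list, backend names and `example.net` hostnames are placeholders, and it ignores details such as a port in the Host header. It relies on two things that do hold in HAProxy: the first matching `use_backend` rule wins, and the `path` sample starts with a slash.

```python
from typing import Callable, List, Optional, Tuple


def match_str(value: str, pattern: str) -> bool:
    """Model of '-m str -i': case-insensitive exact match."""
    return value.lower() == pattern.lower()


def match_beg(value: str, pattern: str) -> bool:
    """Model of '-m beg -i': case-insensitive prefix match."""
    return value.lower().startswith(pattern.lower())


# Each rule is (backend name, predicate over host and path).
Rule = Tuple[str, Callable[[str, str], bool]]


def pick_backend(host: str, path: str, rules: List[Rule]) -> Optional[str]:
    """Return the backend of the first rule that matches, or None if no rule does."""
    for backend, predicate in rules:
        if predicate(host, path):
            return backend
    return None


# Placeholder rules loosely mirroring an HTTP frontend (names are illustrative).
RULES: List[Rule] = [
    ("acme-gateway-server", lambda h, p: match_beg(p, "/.well-known/acme-challenge/")),
    ("webserver", lambda h, p: match_str(h, "www.example.net")),
    ("nextcloud", lambda h, p: match_str(h, "cloud.example.net")),
]

print(pick_backend("www.example.net", "/", RULES))
print(pick_backend("www.example.net", "/.well-known/acme-challenge/token", RULES))
# A prefix pattern without the leading slash never matches a real request path:
print(match_beg("/.well-known/acme-challenge/token", ".well-known/acme-challenge/"))
```

When no rule matches and there is no default backend, HAProxy answers with a 503, so it is worth checking that a prefix pattern written without the leading slash, or a hostname compared against the path variable, is not simply a copy-and-paste slip in the posted config.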
-
I found this very useful. https://blog.devita.co/pfsense-to-proxy-traffic-for-websites-using-pfsense/
-
As traffic forwarding to my other hosts seemed to be working, I decided to troubleshoot the problem by taking HAProxy out of the mix and focusing on the ACME script. I set up a VM with no web server and installed the acme.sh script. By enabling the very verbose mode I found little, other than that the standalone server didn't seem to be connecting. Digging one layer deeper and enabling debugging mode, I could see the messages coming from socat and started getting somewhere. The script fails on the line that establishes the socat connection. It didn't really say why, but after a couple more days of reading seemingly unrelated posts, I literally stumbled across an answer.
I had been using port 80 in the acme script which in turn asks socat to use port 80. Port 80 is a "privileged" port. Long story short, I changed it to something above 1024 and bingo. That cryptic message in the acme package in the pfSense GUI seems to suggest using port 80 will somehow be easier - wrong. I had changed it to 80 when I ran into earlier problems. Setting it to an unprivileged port and using a matching port in the HAproxy backend server fixes everything.
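As a rough illustration of the port issue, the sketch below tries to bind a listening socket the way a standalone challenge server would, and reports whether the failure is a permission problem or a port that is already taken. It is not what acme.sh or socat actually run; the helper names are hypothetical and 8080 is just an example of an unprivileged port.

```python
import errno
import socket


def is_privileged_port(port: int) -> bool:
    """Ports 1-1023 traditionally require elevated privileges on Unix-like systems."""
    return 0 < port < 1024


def try_bind(port: int, host: str = "127.0.0.1") -> str:
    """Attempt to bind a TCP socket and describe the outcome as a short string."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return "ok"
    except PermissionError:
        # Typical for an unprivileged user binding a port below 1024.
        return "permission denied"
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            # Something else (a web GUI, another proxy) already owns the port.
            return "in use"
        return f"error: {exc}"
    finally:
        sock.close()


for candidate in (80, 8080):
    kind = "privileged" if is_privileged_port(candidate) else "unprivileged"
    print(f"port {candidate} ({kind}): {try_bind(candidate)}")
```

Whatever the result, the port chosen for the standalone server has to be the same one the HAProxy ACME backend server points at.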
It might be valuable to enable verbosity and debugging for the ACME script in pfSense to get a little bit more info in the logs.
Maybe this will help others!
-
@interloper Do you have a guide on how you set up your Google domain settings for your subdomains? I am trying to figure it out but having a hard time. Here is my open topic on this forum (https://forum.netgate.com/post/830593).
Thanks