multiple https with haproxy
-
@piba
It happened again this morning. I had haproxy disabled so I don't think it is related to that (nothing I have is in 'production' at this time).
I enabled the Cloudflare protection, as I had it off while trying to get my MTA to work.
I am currently watching traffic logs to see if I can see anything but was wondering about snort.
-
@coreyjohnson75
It doesn't sound familiar in relation to haproxy, and you already pretty much ruled that out by disabling it.
I've not really used Snort, but I have seen it mentioned that it can cause trouble if not properly configured...
-
@piba In reading some older posts, people have had similar issues and ended up backing up their config and then reinstalling. This may be what I end up doing, since I have ruled out (backed out) all my recent changes with the exception of the recent pfSense update to the latest stable version.
Right now I have to reboot pfsense every 4-10 hours to restore internet access. The oddest thing is that my vpn in/out always works, it only seems to affect normal traffic.
I've never done a restore so a little nervous as it took me 2 weeks earlier this year to get my pfsense into good working order.
-
@coreyjohnson75
Disabled snort? When internet 'fails' can you still login to the webgui / ssh ? Are firewall rules still loaded pfctl -sr ? With correct outbound nat rules pfctl -sn ? Anything in the system/firewall/routing -logs? PfSense itself still has internet access? Can it ping to 8.8.8.8 and to google.com ?
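One way to collect all of those answers in a single pass while the outage is happening is a small /bin/sh script run over SSH. This is only a sketch: the `run_check` and `snapshot` helper names and the /root/outage-*.log location are made up for illustration, while `pfctl -sr`, `pfctl -sn`, `pfctl -si`, `netstat -rn` and `ping -c` are the standard FreeBSD commands.

```sh
#!/bin/sh
# Sketch: record the state of the firewall during an outage.
# run_check and snapshot are hypothetical helper names.

# Run one command and append its output and exit status to $LOG.
run_check() {
    label=$1; shift
    {
        echo "=== $label: $*"
        "$@" 2>&1
        echo "--- exit status: $?"
    } >> "$LOG"
}

# Run the whole checklist; the log path argument is optional.
snapshot() {
    LOG=${1:-/root/outage-$(date +%Y%m%d-%H%M%S).log}
    run_check "filter rules"  pfctl -sr          # are firewall rules loaded?
    run_check "nat rules"     pfctl -sn          # are outbound NAT rules loaded?
    run_check "state info"    pfctl -si          # state table counters
    run_check "routing table" netstat -rn
    run_check "ping by IP"    ping -c 3 8.8.8.8
    run_check "ping by name"  ping -c 3 google.com
    echo "wrote $LOG"
}
```

Calling `snapshot` once while everything works and once while it doesn't gives two logs that can be compared side by side.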
If vpn works, then a large part of the basic functions is already 'working'. IMHO reinstalling is rarely the real solution. Personally I prefer to figure out why it breaks: is memory usage, the number of firewall states, or any other metric showing an issue that might be slowly building up? Does tcpdump / packet capture show traffic on the lan/wan sides?
-
@piba
I have had 24 hours without internet issues, but here is what I did to get there:
- I went to Cloudflare and deleted everything
- in pfSense I deactivated my port forwards from WAN --> WAN ADDRESS for 80/443
While this has kept my home online (and kept my wife happy), it has defeated the entire purpose of what I am trying to do: serve email and websites to the internet.
I have no doubt that I had something configured wrong which led to my issues. I have run outside access via duckdns and port forwards before without issue, so I am unsure what I did wrong in setting up an actual domain. Should it have been set up in NAT rather than port forwards?
Here is what I had in cloudflare. At one point I also had it with all the clouds unchecked.
-
@coreyjohnson75
Port forwards should not be needed with haproxy; firewall rules would be (normally, anyhow).
I'm not sure why Cloudflare settings would matter for having regular internet access, unless they are sending so much traffic that it's saturating the wan or otherwise causing the pfSense box to 'overload'? Or is it only the website that becomes unreachable?
-
@piba It was anything outbound that became unreachable: browser/Google, FireTV, cellphones... I actually didn't try the website during these periods. What I would see from a pfSense point of view is a large spike on the WAN, then very low, flat traffic. Perhaps it was the way my rules (poorly referred to as forwards) were set up before.
Here is what I had, enabled at the time:
-
@coreyjohnson75
Protocol should have been TCP only, but that wouldn't explain any issue. There wouldn't be anything listening on UDP 80/443, so no harm should really have come from that.
pfSense itself has the wan IP directly, right? No ISP router device in between that's trying to keep state for all traffic and might run out? No other ideas atm.
-
@piba Correct, cable modem to pfsense. I'm about 2 hours into a partial test. I have re-added the A record to my domain and a CNAME to the webmail sub in Cloudflare. If nothing happens I will then re-enable the rules (after changing to TCP only) and enable haproxy again.
-
@piba That test failed I guess. Within an hour of that post I could not connect. I had a traceroute but lost it but here are two pings, 1 before and 1 after a reboot. Internal pings, ssh, and pages still worked.
-
@coreyjohnson75
That was without any pass rules on the wan interface, right? And without haproxy running.
That "Time to live exceeded" is a rather strange response. It means the packet passed through too many 'hops' on the way to the target, which can happen if it bounced around for a while between 2 routers. The traceroute would have been interesting to see: it could show the same effect, and show which router is forwarding traffic in the wrong direction.
I'm thinking that maybe the isp or isp-modem is 'blocking' you from hosting a website? Either intentionally, or perhaps the modem is buggy.? I don't suppose you could swap that for a different one?
-
@piba
Here is what my WAN rules look like. I have not had the top two rules or haproxy enabled for days. Haproxy is installed but not enabled.
I'm at work right now, VPN'd in, and my wife just told me she doesn't have access from the house.
Here is a current traceroute:
I had thought of the ISP, but I have had a page served via duckdns for almost a year. I deleted those settings so I could get a new domain (wwolf.us) and use it to serve that page and email via haproxy rather than a port forward to a single server.
Unfortunately I do not have a spare modem to test with. Definitely not opposed to upgrading if it offered any benefit but would rather not buy just for a test. I currently have an Arris MTA TM1602AP2.
Odd thing: I turned on logging on my rules and it stopped all internet-facing traffic until I took them back off... CPU was not overloaded. As soon as I turned them back off and rebooted, internet was up again for the wife. My head hurts now. Ugh.
Thank you for all your help btw. This has become a mission to not be beaten by pfsense, cloudflare, isp, whatever.
Did some searching and found this on Spectrum's site (Charter):
Customer may set up one (1) web page per service account for personal use of the Service, but Customer may not establish a web page using a server located at Customer's home. Customer will not use, or allow others to use, Customer's home computer as a web server, FTP server, file server or game server or to run any other server applications or to provide network or host services to others via Charter's network. Customer will not use, or allow others to use, the Service to operate any type of business or commercial enterprise, including, but not limited to, IP address translation or similar facilities intended to provide additional access.
-
@coreyjohnson75
That traceroute looks like it worked all right; I guess a ping would also have worked at that moment?
As for being allowed to create 1 web page, I'm not sure how they would monitor that. And even for personal use, having a domain name would be handy, so you don't need to remember an IP. But I somehow doubt it's the ISP at this point. Buying a new modem indeed wouldn't be an obvious step.
Anyhow, your vpn from outside to pfSense works, while internet access from inside the house doesn't? If the ISP was blocking traffic, I doubt this would be the way they do it.
So back to square one.. :(.
- Simultaneous packet captures on lan and wan interface while the issue is happening? (in two SSH sessions run
tcpdump -s 0 -w /root/capture1.pcap -ni <interfacename>
for each interface, using a different file name for each)
- Create a copy of /tmp/rules.debug to read through and compare after it works again
- Check diagnostics/states: is traffic on the wan interface being natted properly?
- traceroute and ping
- Try and fix the issue with a smaller action than a complete reboot:
  - status/filter reload
  - save an interface configuration
  - save system/routes settings
- Check every available log file for any event that happens around that same time.
I'm not sure what other things to try atm if/when it happens again..
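The two-session capture from the first bullet can also be started from a single shell. A minimal sketch, assuming the WAN and LAN interfaces are called `em0` and `em1` (the real names are listed under Interfaces > Assignments); the `start_captures` / `stop_captures` helpers, the `TCPDUMP` override and the pid file path are invented here for illustration:

```sh
#!/bin/sh
# Sketch: capture on several interfaces at once and stop them all later.
TCPDUMP=${TCPDUMP:-tcpdump}              # override to dry-run the script
PIDFILE=${PIDFILE:-/tmp/captures.pids}   # hypothetical location

# Usage: start_captures em0 em1
start_captures() {
    : > "$PIDFILE"
    n=1
    for ifname in "$@"; do
        # -s 0: full packets, -w: write raw pcap, -n: no name lookups, -i: interface
        "$TCPDUMP" -s 0 -w "/root/capture$n.pcap" -ni "$ifname" &
        echo $! >> "$PIDFILE"
        n=$((n + 1))
    done
}

# Stop every capture started by start_captures.
stop_captures() {
    while read -r pid; do
        kill "$pid" 2>/dev/null || true
    done < "$PIDFILE"
    rm -f "$PIDFILE"
}
```

Run `start_captures em0 em1` when the outage starts, try a ping or a page load from a LAN client, then run `stop_captures` and copy /root/capture1.pcap and /root/capture2.pcap off the box.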
-
Well, I set everything up again last night. It worked for hours, so I went to bed wondering what I had done differently to get it to work. Then I woke up to not being able to access the internet again...
Ping/traceroute both failed.
I ran the tcpdumps on my wan and normal access interfaces, but if I try to open them in pfSense it just locks up the 'edit file' screen. I ssh'd in and did a cat on the files but the output was just garbled. Not sure what to do with them...
I made copies of /tmp/rules.debug but didn't see anything.
I reloaded the rules and haproxy - no change
Looked at logs but didn't see anything sticking out to me in the last 8 hours not that I really know what to look for
So I rebooted, turned off haproxy and deleted my stuff from Cloudflare so my wife would have internet (she has to work this morning).
When using something like Cloudflare, other than running dynamic dns, is there anything I need to add/change in pfsense? add their nameservers or dns ?
At this point I am thinking of buying another domain and using a different host just to make sure it isn't something there.
-
Let's see if I can be of any help here.
You have one external IP (maybe even a dynamic one assigned by your ISP over DHCP or so?).
You want to have multiple sites published over 443 from that IP.
You probably want the cert presented by HAProxy to be accepted by most browsers too?
Right.
Well, let's get to it then ;) It can be done with pfSense and a couple of plugins. As it happens, this is what I'm running myself. I will be using example.com as an example domain in this.
My setup uses
- Cloudflare as my DNS
- the built-in Dynamic DNS component to keep Cloudflare updated
- the Acme Certificates plugin to get a wildcard certificate for my domain
- HAProxy to do the redirection of SSL traffic.
I only use Cloudflare as a DNS. I have turned off all features that can be turned off, so I don't use the caching function at all. It might be a good idea to turn it off during troubleshooting (one less thing involved).
As it seems that you already got valid DNS entries I'm skipping the Dynamic DNS part.
I am using Let's Encrypt to provide a wildcard (e.g. *.example.com) cert for my domain. By using a wildcard cert I can simplify my HAProxy setup later on. There are guides on setting up Let's Encrypt with pfSense so I'm not going into details here, but there are some specifics.
To get a wildcard cert you must use DNS validation and the V2 API. Since the Acme Certificates plugin has a DNS-Cloudflare method, it is very easy to do when using Cloudflare. You will need the name of your domain, the mail account you used to create the Cloudflare account, and your Cloudflare API key. The latter is available from the overview page of your Cloudflare account: in the domain summary pane there is a "Get your API key" link.
Now you have the data needed to fill in the required fields to get a wildcard cert from Let's Encrypt. Just remember to fill in *.example.com in the SAN field.
Time to start working on HAProxy.
Start with creating the needed backends.
For this I would create a Mailbackend and a WWWbackend pointing to port 443 on the servers for these services.
The next thing is the frontends. I use two of them: one for port 80 and one for port 443.
Basically I catch port 80 traffic and send out redirects to 443, so let's skip that. It's just my way to make sure that people use SSL ;)
On 443 my frontend is set to listen on the wan address port 443, I have SSL offloading selected, and the type is set to HTTP/HTTPS (offloading).
Now it's time for the fun part. Access Control Lists and actions.
This is the part where I control what traffic should be sent to which backend.
Let's create two entries here, one for mail and one for a web server.
The first one:
Name: mailACL
Expression: Host starts with
Value: mail.example.com
The second one:
Name: WWWACL
Expression: Host starts with
Value: www.example.com
To map them to a backend we need to define actions.
The first "rule" would be
Action: Use Backend (and select the Mailbackend)
Skip parameters and in the Condition acl names type the name of the ACL you created (i.e. mailACL)
And to create a "rule" for the web server we need
Action: Use Backend (and select the WWWbackend)
Skip parameters and in the Condition acl names type the name of the ACL you created (i.e. WWWACL)
Scroll down to SSL offloading and select the wildcard cert you created earlier.
That should be it :)
-
@mats said in multiple https with haproxy:
That should be it :)
Except for the part where serving websites isn't actually the biggest issue; it's losing outbound internet access 'shortly' after requests are sent from outside towards pfSense. Other than that it's a nice write-up of how to get going with haproxy.
-
Ping/traceroute: that they fail to reach the intended target is expected (that is the issue we are trying to diagnose), but what messages are displayed? Does the ping complain about no route? Does the trace get 1 hop out?
The pcap files can be read / analyzed by a program like Wireshark if you copy them to your local computer. They are in a 'binary' format that cat and 'edit file' wouldn't know how to display.
Not a single difference in the rules.debug before/after trouble?
Try rebooting the Arris modem next time instead of pfSense ?
DNS settings for your public domain, and where they are hosted or pointing to, really shouldn't affect pfSense in any way. Unless someone is trying to DDoS your domain...
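For the rules.debug comparison, `diff` will point out changes more reliably than reading both copies by eye, and `tcpdump -r` can print a capture as text on the firewall itself if Wireshark is not at hand. A sketch, with made-up file names and a hypothetical `rules_changed` helper:

```sh
#!/bin/sh
# Sketch: report whether two saved copies of /tmp/rules.debug differ.
# Returns 0 and prints a unified diff if they differ, 1 if identical.
rules_changed() {
    diff -u "$1" "$2"
    case $? in
        0) echo "no difference"; return 1 ;;
        1) return 0 ;;
        *) echo "diff failed" >&2; return 2 ;;
    esac
}

# Example calls (file names are placeholders):
#   cp /tmp/rules.debug /root/rules.working      # while things work
#   cp /tmp/rules.debug /root/rules.broken       # during the outage
#   rules_changed /root/rules.working /root/rules.broken
#   tcpdump -nr /root/capture1.pcap | head -50   # first 50 packets as text
```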
-
Thanks @Mats. I've been able to access my sites externally with SSL, so other than the wildcard part I think I have that set up. If you think that makes a difference then I will definitely look into it. I'll post my configuration below if you don't mind glancing over it.
Well, I thought about going back to duckdns but at least at the time I couldn't get 2 sites to work.
Here is where I am right now, about 4 hours into a stable system where I went to GoDaddy (vs Cloudflare), bought a 2nd domain (.xyz versus .us) and have the DNS through them. I have 2 https and 1 http site all working and have not dropped outbound internet. I didn't change anything in my configuration other than the .xyz in my haproxy frontends. I'll need to get new SSL certs if this holds up as I am currently pushing through the previous .us one and of course getting an error but can reach my backends nonetheless (?).
Now that doesn't mean anything until I at least go 12-24 hours without dropping outbound internet, fingers crossed.
Now, IF this all holds up, I'll be where I wanted to be a week ago and trying to figure out how to HAproxy the rest of my mailserver ports through lol.
global
    maxconn 10
    stats socket /tmp/haproxy.socket level admin
    uid 80
    gid 80
    nbproc 1
    hard-stop-after 15m
    chroot /tmp/haproxy_chroot
    daemon
    tune.ssl.default-dh-param 2048
    server-state-file /tmp/haproxy_server_state

listen HAProxyLocalStats
    bind 127.0.0.1:2200 name localstats
    mode http
    stats enable
    stats refresh 10
    stats admin if TRUE
    stats show-legends
    stats uri /haproxy/haproxy_stats.php?haproxystats=1
    timeout client 5000
    timeout connect 5000
    timeout server 5000

frontend Frontend_www
    bind XX.XX.XXX.XX:80 name XX.XX.XXX.XX:80
    mode http
    log global
    option http-keep-alive
    timeout client 30000
    acl www_acl var(txn.txnhost) -m beg -i www.wwolf.xyz
    http-request set-var(txn.txnhost) hdr(host)
    use_backend Test_Backend_ipvANY if www_acl

frontend CF_443-merged
    bind XX.XX.XXX.XX:443 name XX.XX.XXX.XX:443 ssl crt-list /var/etc/haproxy/CF_443.crt_list
    mode http
    log global
    option http-keep-alive
    timeout client 30000
    acl zim_acl var(txn.txnhost) -m beg -i webmail.wwolf.xyz
    acl ha_acl2 var(txn.txnhost) -m beg -i ha.wwolf.xyz
    acl Zimbra_acl var(txn.txnhost) -m str -i webmail.wwolf.xyz
    http-request set-var(txn.txnhost) hdr(host)
    use_backend Zimbra_Backend_ipvANY if zim_acl
    use_backend HA_Backend_ipvANY if ha_acl2
    use_backend Zimbra_Backend_ipvANY if Zimbra_acl

backend Test_Backend_ipvANY
    mode http
    id 103
    log global
    timeout connect 30000
    timeout server 30000
    retries 3
    option httpchk GET /
    server testpage 192.168.30.11:80 id 104 check inter 1000

backend Zimbra_Backend_ipvANY
    mode http
    id 102
    log global
    timeout connect 30000
    timeout server 30000
    retries 3
    option httpchk GET /
    server Zimbra_Backend_Server 192.168.30.5:443 id 101 ssl check inter 1000 verify none

backend HA_Backend_ipvANY
    mode http
    id 100
    log global
    timeout connect 30000
    timeout server 30000
    retries 3
    option httpchk GET /
    server HA_Backend_Server 192.168.30.6:18122 id 101 check inter 1000
-
If this change to GoDaddy 'fixes' the issue, then it's likely still a problem waiting to happen. Does Cloudflare do 'health checks' or something? They might be requesting the website's main page every few seconds and filling up some 'state table' somewhere (Arris modem?), which eventually overflows and causes trouble.
@coreyjohnson75 said in multiple https with haproxy:
how to HAproxy the rest of my mailserver ports
That will be limited to 'mode tcp', without any 'smart' backend selection. It can balance between multiple servers for the same domain if required. It won't, for example, switch port 25 between 2 servers that each only accept mail for different specific domains.
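To make the 'mode tcp' point concrete, here is a hand-written sketch of a plain TCP pass-through for SMTP. The pfSense package generates its own haproxy.cfg from the GUI, so this file is only for illustration; the names `smtp_in` and `mail_smtp`, the /tmp path, and port 25 on 192.168.30.5 are assumptions. `haproxy -c -f` only checks the syntax and does not start anything.

```sh
#!/bin/sh
# Sketch: write a minimal tcp-mode config and syntax-check it if haproxy is installed.
cfg=${CFG:-/tmp/haproxy-tcp-sketch.cfg}   # hypothetical scratch file

cat > "$cfg" <<'EOF'
defaults
    mode tcp
    timeout connect 5s
    timeout client  1m
    timeout server  1m

# In tcp mode haproxy does not parse the traffic, so there is no
# Host header or mail domain to route on; only the listening port decides.
frontend smtp_in
    bind :25
    default_backend mail_smtp

backend mail_smtp
    server mail1 192.168.30.5:25 check
EOF

# -c checks the configuration and exits without starting the proxy.
if command -v haproxy >/dev/null 2>&1; then
    haproxy -c -f "$cfg"
fi
```

In the GUI the equivalent would be a frontend of type TCP with a single default backend, repeated per mail port.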
-
I wanted to let you guys know that the problem ended up being related to the rules on the primary VLAN, which kept going down. I'm still confused about why it would all work for a while and then go out, AND why it would work longer (it never actually crashed) with GoDaddy. But I reworked all my rules with the help of someone, moved everything back to Cloudflare, and have been up for over 20 hours without ANY issue. I had the right rules, but they were below some bad rules. I have since moved many IoT devices onto their own separate VLAN to make everything prettier and simpler.
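For anyone who finds this later: pfSense evaluates the rules on an interface from the top down, and the first rule that matches decides what happens to the packet, which is why a correct rule sitting below a bad one never gets used. Below is a toy shell model of that behaviour. It is not pf syntax, and the `first_match` helper, the rule format and all addresses are made up for illustration.

```sh
#!/bin/sh
# Toy model of top-down, first-match rule evaluation (not pf syntax).
# Rules arrive on stdin, one per line: ACTION SOURCE_GLOB DEST_GLOB
first_match() {
    src=$1; dst=$2
    while read -r action rsrc rdst; do
        case $src in $rsrc) ;; *) continue ;; esac   # source must match
        case $dst in $rdst) ;; *) continue ;; esac   # destination must match
        echo "$action"                               # first hit wins
        return 0
    done
    echo "block"                                     # nothing matched: default deny
}

# The same two rules in two different orders.
bad_order='block 192.168.10.* *
pass 192.168.10.* *'
good_order='pass 192.168.10.* *
block 192.168.10.* *'

printf '%s\n' "$bad_order"  | first_match 192.168.10.25 8.8.8.8   # prints: block
printf '%s\n' "$good_order" | first_match 192.168.10.25 8.8.8.8   # prints: pass
```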
Now... if either of you could assist me with 1) best practices for running an email server behind pfSense and 2) not get my emails flagged as spam, that would be awesome.