multiple https with haproxy

PiBa

@coreyjohnson75
That doesn't sound like any haproxy-related problem I'm familiar with, and you've already pretty much ruled haproxy out by disabling it.

I've not really used Snort, but I have seen it mentioned that it can cause trouble if not properly configured...

coreyjohnson75

@piba In reading some older posts, I see people have had similar issues and ended up backing up their config and then reinstalling. This may be what I end up doing, since I have ruled out (backed out) all my recent changes with the exception of the recent pfsense update to the latest stable version.

Right now I have to reboot pfsense every 4-10 hours to restore internet access. The oddest thing is that my vpn in/out always works, it only seems to affect normal traffic.

I've never done a restore so a little nervous as it took me 2 weeks earlier this year to get my pfsense into good working order.

PiBa

@coreyjohnson75
Disabled Snort? When the internet 'fails', can you still log in to the webgui / ssh? Are the firewall rules still loaded (pfctl -sr)? With the correct outbound NAT rules (pfctl -sn)? Anything in the system/firewall/routing logs? Does pfSense itself still have internet access? Can it ping 8.8.8.8 and google.com?
If the vpn works, then a large part of the basic functions is already 'working'. Imho reinstalling is rarely the real solution.. Personally I prefer to figure out why it breaks: does memory usage, the number of firewall states, or any other metric show an issue slowly building up? Does a tcpdump / packet capture show traffic on the lan/wan sides?
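
The checks above can be collected in one go from an SSH session while the outage is happening. This is only a rough sketch: the snapshot function name and the output directory are made up for illustration, while pfctl -sr, pfctl -sn and pfctl -si list the loaded filter rules, the NAT rules and the state table counters.

```shell
#!/bin/sh
# Rough sketch: save the state of the firewall while internet is 'down',
# so it can be compared with a snapshot taken when everything works.
# The function name and directory layout are illustrative assumptions.

snapshot() {
    out="$1"
    mkdir -p "$out"
    pfctl -sr > "$out/rules.txt" 2>&1                 # filter rules currently loaded
    pfctl -sn > "$out/nat.txt" 2>&1                   # NAT rules currently loaded
    pfctl -si > "$out/info.txt" 2>&1                  # counters, incl. state table entries
    ping -c 3 8.8.8.8 > "$out/ping_ip.txt" 2>&1       # reachability by IP
    ping -c 3 google.com > "$out/ping_name.txt" 2>&1  # reachability incl. DNS
}

# Example: snapshot /root/snap-broken   (and later: snapshot /root/snap-working)
# then:    diff -r /root/snap-broken /root/snap-working
```

If the ping by IP works but the ping by name does not, that points at DNS rather than at routing or NAT.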

coreyjohnson75

@piba
I have had 24 hours without internet issue but here is what I did to get it...

I went to Cloudflare and deleted everything
in pfsense I deactivated my port forwards from WAN-->WAN ADDRESS for 80/443

While this has kept my home on the internet (and kept my wife happy), it defeats the entire purpose of what I am trying to do: serve email and websites to the internet.

I have no doubt that I had something configured wrong which led to my issues. I have run outside access via duckdns & port forwards before without issue, so I am unsure what I did wrong in setting up an actual domain. Should it have been set up in NAT rather than port forwards?

Here is what I had in cloudflare. At one point I also had it with all the clouds unchecked.

[screenshot: Cloudflare DNS records]

PiBa

@coreyjohnson75
Port-forwards should not be needed with haproxy, firewall rules would be. (normally anyhow)

I'm not sure why Cloudflare settings would matter for having regular internet access, unless they are sending so much traffic that it's saturating the wan or otherwise causing the pfSense box to 'overload'..? Or is it only the website that becomes unreachable?

coreyjohnson75

@piba It was anything outbound that became unreachable.. browser/google, FireTV, cellphones... I actually didn't try the website during these periods. What I would see from a pfsense point of view is a large spike in traffic on the WAN, then very low and flat. Perhaps it was the way my rules (which I poorly referred to as forwards) were set up before.

Here is what I had, enabled at the time:

[screenshot: rules enabled at the time]

PiBa

@coreyjohnson75
Protocol should have been TCP only, but that wouldn't explain any issue.. There wouldn't be anything listening on UDP 80/443 so no harm should really have come from that.

pfSense itself has the wan-ip directly, right? No isp router device in between that's trying to keep state for all traffic and might run out? No other ideas atm..

coreyjohnson75

@piba Correct, cable modem to pfsense. I'm about 2 hours into a partial test. I have re-added the A record to my domain and a CNAME to the webmail sub in Cloudflare. If nothing happens I will then re-enable the rules (after changing to TCP only) and enable haproxy again.

coreyjohnson75

@piba That test failed I guess. Within an hour of that post I could not connect. I had a traceroute but lost it but here are two pings, 1 before and 1 after a reboot. Internal pings, ssh, and pages still worked.

[two screenshots: ping output before and after a reboot]

PiBa

@coreyjohnson75
That was without any pass rules on the wan interface, right? And without haproxy running?

That "Time to live exceeded" is a rather strange response. It would mean that the path to the target is too many 'hops' long, which could happen if the packet bounced back and forth for a while between 2 routers. The traceroute would have been rather interesting to see, as it could show the same effect and show which router is forwarding traffic in the wrong direction.

I'm thinking that maybe the isp or the isp modem is 'blocking' you from hosting a website? Either intentionally, or perhaps the modem is buggy? I don't suppose you could swap it for a different one?
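
A packet bouncing between two routers usually shows up in a traceroute as the same address appearing at more than one hop. As a rough sketch (the find_loop helper is made up for illustration and assumes the numeric output format of traceroute -n), such repeats could be flagged like this:

```shell
#!/bin/sh
# Sketch: read `traceroute -n` output on stdin and report any address
# that appears at more than one hop, which hints at a routing loop.
# Only the first address on each hop line is looked at.

find_loop() {
    awk '
        $1 ~ /^[0-9]+$/ && $2 != "*" {
            if ($2 in seen) {
                printf "hop %s repeats hop %s (%s)\n", $1, seen[$2], $2
                found = 1
            } else {
                seen[$2] = $1
            }
        }
        END { exit (found ? 0 : 1) }
    '
}

# Example: traceroute -n 8.8.8.8 | find_loop
```

The function exits with status 0 only when a repeated address was found, so it can also be used in an if statement.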

coreyjohnson75

@piba
Here is what my WAN rules look like.. I have not had the top two rules, or haproxy, enabled for days. Haproxy is installed but not enabled.

[screenshot: WAN firewall rules]

I'm at work right now VPN'd in and wife just told me she doesn't have access from the house.
Here is a current traceroute:
[screenshot: current traceroute]

I had thought of the isp, but I had a page served via duckdns for almost a year. I deleted those settings so I could get a new domain (wwolf.us) and use it to serve that page and email via haproxy rather than a port forward to a single server.

Unfortunately I do not have a spare modem to test with. Definitely not opposed to upgrading if it offered any benefit but would rather not buy just for a test. I currently have an Arris MTA TM1602AP2.

Odd thing: I turned on logging on my rules and it stopped all internet-facing traffic until I took it back off... CPU was not overloaded... As soon as I turned it back off and rebooted, internet was up again for the wife. My head hurts now.. ugh.

Thank you for all your help btw. This has become a mission to not be beaten by pfsense, cloudflare, isp, whatever.

Did some searching and found this on Spectrum's site (Charter):

Customer may set up one (1) web page per service account for personal use of the Service, but Customer may not establish a web page using a server located at Customer's home. Customer will not use, or allow others to use, Customer's home computer as a web server, FTP server, file server or game server or to run any other server applications or to provide network or host services to others via Charter's network. Customer will not use, or allow others to use, the Service to operate any type of business or commercial enterprise, including, but not limited to, IP address translation or similar facilities intended to provide additional access.

PiBa

@coreyjohnson75
That traceroute looks like it worked alright; I guess a ping would also have worked at that moment?

As for being allowed to create 1 web page, I'm not sure how they would monitor that.. And even for personal use, having a domain name would be handy, so you don't need to remember an IP.. But I somehow doubt it's the ISP at this point.. Buying a new modem indeed wouldn't be an obvious step..

Anyhow, your vpn from outside to pfSense works, while internet access from inside the house doesn't? If the ISP was blocking traffic, I doubt this would be the way they do it.

So back to square one.. :(.

Simultaneous packet captures on the lan and wan interfaces while the issue is happening? (In two SSH sessions, run tcpdump -s 0 -w /root/capture1.pcap -ni <interfacename>, once for each interface and with a different file name for each.)
Create a copy of /tmp/rules.debug to read through and compare after it works again.
Check diagnostics/states: is traffic on the wan interface being natted properly?
traceroute and ping
Try to fix the issue with a smaller action than a complete reboot:
- status/filter reload
- save an interface configuration
- save system/routes settings
Check every available log file for any event that happens around that same time.

I'm not sure what other things to try atm if/when it happens again..
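
The two captures and the rules.debug copy could also be started from a single SSH session. A minimal sketch, assuming you pass in your own interface names and an output directory; the capture_both helper and the file naming are made up for illustration:

```shell
#!/bin/sh
# Sketch: capture on two interfaces at the same time for a fixed period.
# Usage: capture_both WAN_IF LAN_IF SECONDS OUTDIR
# Interface names and the output directory are whatever applies to your box.

capture_both() {
    wan_if="$1"; lan_if="$2"; secs="$3"; out="$4"
    stamp=$(date +%Y%m%d-%H%M%S)

    # Keep a copy of the generated ruleset for comparison later.
    [ -f /tmp/rules.debug ] && cp /tmp/rules.debug "$out/rules.debug.$stamp"

    # Same tcpdump options as above, one background process per interface.
    tcpdump -s 0 -w "$out/wan-$stamp.pcap" -ni "$wan_if" &
    wan_pid=$!
    tcpdump -s 0 -w "$out/lan-$stamp.pcap" -ni "$lan_if" &
    lan_pid=$!

    sleep "$secs"
    kill "$wan_pid" "$lan_pid"
    wait "$wan_pid" "$lan_pid" 2>/dev/null || true
    echo "$out/wan-$stamp.pcap $out/lan-$stamp.pcap"
}

# Example: capture_both em0 em1 60 /root
```

Both files carry the same timestamp in their name, which makes it easier to line up the lan and wan side of the same outage afterwards.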

coreyjohnson75

Well I set everything up again last night, and it worked for hours, so I went to bed wondering what I had done differently to get it to work. Then I woke up to not being able to access the internet again...

Ping/traceroute both failed.
I ran the tcpdumps on my wan and normal access interfaces but if I try to open them in pfsense it just locks up the 'edit file' screen. I ssh'd in and did a cat on the files but it was just garbled. not sure what to do with them...

I made copies of /tmp/rules.debug but didn't see anything.

I reloaded the rules and haproxy - no change

Looked at logs but didn't see anything sticking out to me in the last 8 hours not that I really know what to look for

So I rebooted, turned off haproxy and deleted my stuff from Cloudflare so the wife would have internet (she has to work this morning).

When using something like Cloudflare, other than running dynamic dns, is there anything I need to add/change in pfsense? add their nameservers or dns ?

At this point I am thinking of buying another domain and using a different host just to make sure it isn't something there.

Mats

Let's see if I can be of any help here.

You have one external IP. (maybe even a dynamic one assigned from your ISP over DHCP or so?)
You want to have multiple sites published over 443 from that IP
You probably want the cert presented by your HAProxy to be accepted by most browsers too?

Right.
Well, let's get to it then ;) It can be done with pfSense and a couple of plugins. As it happens, this is what I'm running myself. I will be using example.com as an example domain in this.

My setup uses
Cloudflare as my DNS
The built in Dynamic DNS component to keep Cloudflare updated
the Acme Certificates plugin to get a wildcard certificate for my domain
HA proxy to do the redirection of SSL traffic.

I only use Cloudflare as a DNS. I have turned off all features that can be turned off, so I don't use the caching function at all. It might be a good idea to turn it off during troubleshooting (one less thing involved).

As it seems that you already got valid DNS entries I'm skipping the Dynamic DNS part.

I am using Let's Encrypt to provide a wildcard cert (e.g. *.example.com) for my domain. By using a wildcard cert I can simplify my HAProxy setup later on. There are guides on setting up Let's Encrypt with pfSense so I'm not going into details here, but there are some specifics.
To get a wildcard cert you must use DNS validation and the V2 API. Since the Acme Certificates plugin has a DNS-Cloudflare method, this is very easy to do when using Cloudflare. You will need the name of your domain, the mail account you used to create the Cloudflare account, and your Cloudflare API key. The latter is available from the overview page of your Cloudflare account: in the domain summary pane there is a "Get your API key" link.

Now you have the data needed to fill in the required fields and get a wildcard cert from Let's Encrypt. Just remember to fill in *.example.com in the SAN field.

Time to start working on HA proxy.

Start with creating the needed backends.
For this I would create a Mailbackend and a WWWbackend pointing to port 443 on the servers for these services.

The next thing is the frontends. I use two of them. One for port 80 and one for port 443.
Basically I catch port 80 traffic and send out redirects to 443, so let's skip that. It's just my way to make sure that people use SSL ;)

My 443 frontend is set to listen on the wan address, port 443. I have SSL offloading selected and the type is set to HTTP/HTTPS (offloading).

Now it's time for the fun part. Access Control Lists and actions.
This is the part where I control what traffic should be sent to which backend.
Let's create two entries here, one for mail and one for a webserver.
The first one:
Name: mailACL
Expression: Host starts with
Value: mail.example.com

The second one:
Name: WWWACL
Expression: Host starts with
Value: www.example.com

To map them to a backend we need to define actions.
The first "rule" would be:
Action: Use Backend (and select the Mailbackend)
Skip the parameters, and in "Condition acl names" type the name of the ACL you created (i.e. mailACL)

And to create a "rule" for the webserver we need:
Action: Use Backend (and select the WWWbackend)
Skip the parameters, and in "Condition acl names" type the name of the ACL you created (i.e. WWWACL)

Scroll down to SSL offloading and select the wildcard cert you created earlier.

That should be it :)
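
To check that each hostname really ends up on its own backend, something like the following can be run from a machine outside your network. It is only a sketch: check_vhost is a made-up helper, and 203.0.113.10 stands in for your WAN address. curl's --resolve option sends the request to that address while still using the given hostname for SNI and the Host header, which is what the ACLs match on.

```shell
#!/bin/sh
# check_vhost HOSTNAME WAN_IP
# Prints the HTTP status code returned for https://HOSTNAME/ via WAN_IP.
# No -k is used, so curl also fails if the certificate is not accepted.

check_vhost() {
    curl -sS -o /dev/null -w '%{http_code}\n' \
        --resolve "$1:443:$2" "https://$1/"
}

# Example:
#   check_vhost mail.example.com 203.0.113.10
#   check_vhost www.example.com  203.0.113.10
```

Using --resolve also means the test does not depend on the DNS records having propagated yet.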

PiBa

@mats said in multiple https with haproxy:

That should be it :)

Except for the part where serving websites isn't actually the biggest issue; it's losing outbound internet access 'shortly' after sending requests from outside towards pfSense. Other than that it's a nice write-up of how to get going with haproxy.

PiBa

@coreyjohnson75

It's expected that ping/traceroute fail to reach the intended target (that's the issue we are trying to diagnose, of course), but what messages are displayed? Does the ping complain about no route? Does the trace get 1 hop out?

The pcap files can be read / analyzed by a program like Wireshark if you copy them to your local computer. They are in a 'binary' format that cat and 'edit file' wouldn't know how to display..
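
For a quick look before copying anything, tcpdump itself can decode a saved capture with -r. A small sketch, assuming the file name from the earlier command; peek_capture is a made-up helper, and the address in the scp comment is a placeholder:

```shell
#!/bin/sh
# Sketch: print a saved capture as text instead of using cat.
# peek_capture FILE [FILTER...]   e.g. peek_capture /root/capture1.pcap icmp

peek_capture() {
    file="$1"
    shift
    # -r reads packets from a file, -nn skips name/port lookups,
    # -c 50 stops after 50 packets so the terminal is not flooded.
    tcpdump -nn -c 50 -r "$file" "$@"
}

# To analyse it in Wireshark instead, copy it to your own computer first,
# e.g. (192.168.1.1 is a placeholder for the pfSense LAN address):
#   scp root@192.168.1.1:/root/capture1.pcap .
```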

Not a single difference in the rules.debug before/after trouble?

Try rebooting the Arris modem next time instead of pfSense ?

Dns settings for your public domain and where they are hosted or pointing to really shouldn't affect pfSense in any way.. Unless someone is trying to ddos your domain..

coreyjohnson75

@piba & @Mats

Thanks @Mats, I've been able to access my sites externally with SSL, so other than the wildcard part I think I have that set up. If you think that makes a difference then I will definitely look into it. I'll post my configuration below if you don't mind glancing over it.

Well, I thought about going back to duckdns but at least at the time I couldn't get 2 sites to work.

Here is where I am right now, about 4 hours into a stable system where I went to GoDaddy (vs Cloudflare), bought a 2nd domain (.xyz versus .us) and have the DNS through them. I have 2 https and 1 http site all working and have not dropped outbound internet. I didn't change anything in my configuration other than the .xyz in my haproxy frontends. I'll need to get new SSL certs if this holds up as I am currently pushing through the previous .us one and of course getting an error but can reach my backends nonetheless (?).

Now that doesn't mean anything until I at least go 12-24 hours without dropping outbound internet, fingers crossed.

Now, IF this all holds up, I'll be where I wanted to be a week ago and trying to figure out how to HAproxy the rest of my mailserver ports through lol.

global
	maxconn			10
	stats socket /tmp/haproxy.socket level admin 
	uid			80
	gid			80
	nbproc			1
	hard-stop-after		15m
	chroot				/tmp/haproxy_chroot
	daemon
	tune.ssl.default-dh-param	2048
	server-state-file /tmp/haproxy_server_state

listen HAProxyLocalStats
	bind 127.0.0.1:2200 name localstats
	mode http
	stats enable
	stats refresh 10
	stats admin if TRUE
	stats show-legends
	stats uri /haproxy/haproxy_stats.php?haproxystats=1
	timeout client 5000
	timeout connect 5000
	timeout server 5000

frontend Frontend_www
	bind			XX.XX.XXX.XX:80 name XX.XX.XXX.XX:80   
	mode			http
	log			global
	option			http-keep-alive
	timeout client		30000
	acl			www_acl	var(txn.txnhost) -m beg -i www.wwolf.xyz
	http-request set-var(txn.txnhost) hdr(host)
	use_backend Test_Backend_ipvANY  if  www_acl 

frontend CF_443-merged
	bind			XX.XX.XXX.XX:443 name XX.XX.XXX.XX:443   ssl crt-list /var/etc/haproxy/CF_443.crt_list  
	mode			http
	log			global
	option			http-keep-alive
	timeout client		30000
	acl			zim_acl	var(txn.txnhost) -m beg -i webmail.wwolf.xyz
	acl			ha_acl2	var(txn.txnhost) -m beg -i ha.wwolf.xyz
	acl			Zimbra_acl	var(txn.txnhost) -m str -i webmail.wwolf.xyz
	http-request set-var(txn.txnhost) hdr(host)
	use_backend Zimbra_Backend_ipvANY  if  zim_acl 
	use_backend HA_Backend_ipvANY  if  ha_acl2 
	use_backend Zimbra_Backend_ipvANY  if  Zimbra_acl 

backend Test_Backend_ipvANY
	mode			http
	id			103
	log			global
	timeout connect		30000
	timeout server		30000
	retries			3
	option			httpchk GET / 
	server			testpage 192.168.30.11:80 id 104 check inter 1000  

backend Zimbra_Backend_ipvANY
	mode			http
	id			102
	log			global
	timeout connect		30000
	timeout server		30000
	retries			3
	option			httpchk GET / 
	server			Zimbra_Backend_Server 192.168.30.5:443 id 101 ssl check inter 1000  verify none 

backend HA_Backend_ipvANY
	mode			http
	id			100
	log			global
	timeout connect		30000
	timeout server		30000
	retries			3
	option			httpchk GET / 
	server			HA_Backend_Server 192.168.30.6:18122 id 101 check inter 1000

PiBa

If this change to GoDaddy 'fixes' the issue, then it's likely still a problem waiting to happen. Does Cloudflare do 'healthchecks' or something? Then they might be requesting the website's main page every few seconds, filling up some 'state table' somewhere (arris modem?) which eventually overflows and causes trouble?

@coreyjohnson75 said in multiple https with haproxy:

how to HAproxy the rest of my mailserver ports

That will be limited to 'mode tcp', without any 'smart' backend selection.. It can balance between multiple servers for the same domain if required.. But it won't, for example, switch port 25 between 2 servers that each only accept mail for different specific domains..

coreyjohnson75

I wanted to let you guys know that the problem ended up being rules-related, on the primary VLAN that kept going down. I'm still confused about why it would all work for a while and then go out, AND why it would work longer (never actually crashed) with GoDaddy, but I reworked all my rules with the help of someone, moved everything back to Cloudflare, and have been up for over 20 hours without ANY issue. I had the right rules but they were below some bad rules. I have since moved many IoT devices onto their own separate vlan to make everything prettier and simpler.

Now... if either of you could assist me with 1) best practices for running an email server behind pfSense and 2) not get my emails flagged as spam, that would be awesome.

Mats

Well, I can give you my best practice, i.e. it works for me (tm).

I have a Windows server running hMailServer behind a pfSense box.
I don't use HAProxy for my mail server, so I simply have NAT rules for ports 25 and 993 to my mail server. I only allow encrypted IMAP to the mail server, so I only need one port besides port 25.
I also allow port 25 outbound from my mail server to my ISP's mail relay for sending mail. My ISP doesn't allow direct outbound traffic on port 25, so I have to send it through them.

There can be many reasons for your mail being marked as spam.
Being on a DHCP-assigned (dynamic) IP is one. Having a PTR record in DNS that doesn't match the hostname of your mail server is another.

So it's hard to answer that part, but one thing that might help is to use a mail relay service with a decent (or better) reputation. One example is Amazon's Simple Email Service (I haven't tested it myself though).
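
The PTR point above is easy to check from any machine that has dig installed. A minimal sketch, where check_ptr is a made-up helper and the address and hostname in the example are placeholders for your public IP and the name your mail server announces:

```shell
#!/bin/sh
# check_ptr IP MAILHOST
# Compares the PTR record of IP with the hostname the mail server uses.

check_ptr() {
    ip="$1"; mail_host="$2"
    # dig -x does the reverse lookup; strip the trailing dot from the answer.
    ptr=$(dig +short -x "$ip" | head -n 1 | sed 's/\.$//')
    if [ "$ptr" = "$mail_host" ]; then
        echo "PTR ok: $ip -> $ptr"
    else
        echo "PTR mismatch: $ip -> ${ptr:-none}, expected $mail_host"
        return 1
    fi
}

# Example: check_ptr 203.0.113.10 mail.example.com
```

On a residential connection the PTR record is normally controlled by the ISP, so a mismatch here is one more argument for sending through a relay.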