Connection Drop after 10 Seconds, TCP, HTTP
-
OK, so based on my research, I found 2 threads with similar issues.
https://forum.pfsense.org/index.php?topic=102175.0
https://forum.pfsense.org/index.php?topic=51423.0
Basically, I just want to let the outgoing HTTP traffic go out, even if its state has disappeared, expired, etc.
The webserver is 192.168.1.2, and as it is answering requests from external systems, its source port will always be 80.
What rules do I need to create to accomplish this?
For the moment I created a rule on LAN: pass, TCP flags: any, State type: none.
I suspect there is more to it than that. I believe I also need a floating rule, but I'm not sure of the specifics.
-
Switching the Firewall Optimization to High-Latency helps, but it still times out occasionally.
Is there a way to manually adjust the Firewall Optimization just for outgoing source port 80 connections, to, say, double whatever the High-Latency option provides?
-
I would really hate to have to use one of my support tickets to solve what should be such a simple, rudimentary, though very thinly documented, issue.
-
There is no timer that would be set to 10 seconds.
https://doc.pfsense.org/index.php/Advanced_Setup
[2.3.2-RELEASE][root@pfsense.local.lan]/root: pfctl -st
tcp.first                   120s
tcp.opening                  30s
tcp.established           86400s
tcp.closing                 900s
tcp.finwait                  45s
tcp.closed                   90s
tcp.tsdiff                   30s
udp.first                    60s
udp.single                   30s
udp.multiple                 60s
icmp.first                   20s
icmp.error                   10s
other.first                  60s
other.single                 30s
other.multiple               60s
frag                         30s
interval                     10s
adaptive.start            58800 states
adaptive.end             117600 states
src.track                     0s
[2.3.2-RELEASE][root@pfsense.local.lan]/root:
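If you want to compare these values between boxes or before and after a change, a small script can turn the pfctl -st text into a lookup table. This is only a sketch: the function name parse_pf_timeouts is made up for illustration, and the sample string is simply the output pasted above.

```python
import re

# Output of `pfctl -st` as pasted in this thread (from the poster's box, not the OP's).
SAMPLE = (
    "tcp.first 120s tcp.opening 30s tcp.established 86400s tcp.closing 900s "
    "tcp.finwait 45s tcp.closed 90s tcp.tsdiff 30s udp.first 60s udp.single 30s "
    "udp.multiple 60s icmp.first 20s icmp.error 10s other.first 60s "
    "other.single 30s other.multiple 60s frag 30s interval 10s "
    "adaptive.start 58800 states adaptive.end 117600 states src.track 0s"
)

def parse_pf_timeouts(text):
    """Hypothetical helper: map each timeout name to its integer value.

    Units ("s" or "states") are dropped, so the caller has to know which is which.
    Works on both the one-line and the one-entry-per-line layout.
    """
    pattern = r"([a-z]+(?:\.[a-z]+)?)\s+(\d+)\s*(?:s\b|states\b)"
    return {name: int(value) for name, value in re.findall(pattern, text)}

timeouts = parse_pf_timeouts(SAMPLE)
# Smallest TCP-related timeout in the table above.
lowest_tcp = min(v for k, v in timeouts.items() if k.startswith("tcp."))
```

With the values above, no tcp.* entry is anywhere near 10 seconds; the smallest ones are tcp.opening and tcp.tsdiff.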
-
To be clear, this is a fully open TCP connection that loses state after ~30 seconds?
If so, there seems to be a problem. No sane default timeout would ever be that low, so I doubt changing any of them would help.
Have you done a packet capture or monitored the states table?
-
I have monitored the state table and I did a packet capture before. Here is how it happens:
A client connects to the webserver via a browser to request a report.
The server answers back and begins generating the report.
If it takes longer than ~10 seconds to generate, the server sends the report, but pfSense blocks it from going out because it has closed the state/connection.
The client spins forever until it times out, not knowing the report was sent, because pfSense blocked it.
-
"For the moment i created a rule on LAN, to pass, tcp flags: any, State type: none."
State type: none? You sure you want to do that?
I'd be very hesitant to start changing things since, by default, things should be working fine, keeping states for ~24 hours. If you start playing with a bunch of options you may run into many unforeseen problems later.
-
Actually, I finally got it to work, strangely enough using an unusual combination of settings.
On the rule corresponding to the NAT policy for port 80 inbound, I went under Advanced and set the following:
State timeout: 60
TCP flags: any
State type: sloppy
I tried those options individually, and it seems to require all of them for some reason. In addition, I also changed the following under
System > Advanced > Firewall & NAT:
TCP First: 60
TCP Opening: 60
TCP Established: 60 (tested again and discovered this one has no effect on the issue; it works great with it left empty again)
Other First: 60
I doubt all of these need to be set this way, but I'm afraid to touch anything, as report generation is now working flawlessly. To prove it, I even added an extra 30 second delay to the report generator so that reports take nearly 50 seconds to complete,
and with these settings even a 50 second report generation delay still works perfectly.
I'm sure an admin, or someone else familiar, could direct me to the better way to achieve these same results.
Interestingly, I first tried just TCP Established: 60, but that wasn't enough to make it work either.
UPDATE: TCP Established seems to not be involved; turning it off didn't break it.
My test file is here: http://pfmon.black-knights.org/test.php
Without the options set, it will count to 4-6 and then the connection stops working and hangs; with the settings above, it counts and processes all the way to completion.
-
"UPDATE: TCP Established seems to not be involved, turning it off didnt break it."
Turning it off defaults it to 86400 seconds, or smaller/larger depending on the "Firewall Optimization" setting, I think.
You can run the "pfctl -st" command to see what it's set to.
-
"someone else familiar could direct me to the better way to achieve these same results….."
There should be no reason why you have to edit such settings. Did you take a look at pftop while your connections were active to see what the timeouts were in real time for your states?
Shouldn't that have been first place to look for such an issue?
-
Indeed, these hacks digging holes into your setup are just horrible and absolutely should not be required for anything.
-
They were not required before, when I was using a Cisco 7507 at the gateway. The issue first came up when I moved this system to where I have pfSense, but it was manageable and only intermittent until the reports grew in size.
doktornotor, the fact that I'm looking for a better way to do this in and of itself shows that I'm aware this is not ideal, so your post was not called for. If you aren't going to contribute, please move along.
"There should be no reason why you have to edit such settings. Did you take a look at pftop while your connections were active to see what the timeouts were in real time for your states?"
I agree pftop would help narrow down the issue, if it were not for the fact that this network hosts 7 servers and a total of 27 websites. The one server the issue occurs on hosts 8 such sites, all on the same ports, using Apache virtual hosts, if you're familiar with them (it's not virtualization related). The number of states at peak times has hit 450,000.
This isn't a small one-off network. This is at a datacenter, with a LOT of traffic, and the server in question is a 12 core (24 thread), 144 GB RAM monster box that handles MySQL for all the other servers as well as internet-based systems using HTTPS APIs.
Not your average John Boy setup hosting a personal webpage from his basement on an extra PC.
-
"My test file is here: http://pfmon.black-knights.org/test.php"
I don't suppose you would share your code so I could test here, eh?
Curious if you have tried 1:1 NAT instead of port forwarding?
-
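All that file does is:
<?php
$i = 1; // start value assumed; the original post omits it
while ($i <= 30) {
    echo $i, "\n";
    flush(); // push each number out now instead of buffering it
    $i = $i + 1;
    sleep(11);
}
It just sends a number every 11 seconds to see if the connection is still alive.
If the browser counts all the way to 30, then the issue is fixed. If it stops for more than 11 seconds, then the connection has died.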
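For anyone who wants to run the same check without watching a browser, here is a rough client-side sketch. It assumes the test page prints each number on its own line and flushes it right away; the names last_number_seen and stream_completes are invented for this example, and the 30 second gap limit is an arbitrary choice.

```python
import socket
import urllib.request

def last_number_seen(stream):
    """Read numbers line by line until EOF or a read timeout.

    Returns (last_number, timed_out). `stream` is any iterable of byte lines,
    such as the response object returned by urllib.request.urlopen().
    """
    last = None
    try:
        for raw in stream:
            text = raw.decode("ascii", "ignore").strip()
            if text.isdigit():
                last = int(text)
    except (socket.timeout, TimeoutError):
        # No data arrived within the socket timeout: the connection stalled.
        return last, True
    return last, False

def stream_completes(url, expected_last=30, gap_timeout=30.0):
    """Hypothetical wrapper: True if the page counted all the way to expected_last."""
    # The timeout applies to each blocking socket read, so it bounds the wait
    # between two numbers rather than the total run time.
    with urllib.request.urlopen(url, timeout=gap_timeout) as resp:
        last, timed_out = last_number_seen(resp)
    return (not timed_out) and last == expected_last
```

Calling stream_completes() with the test page URL would return False if the count stops partway, which is the behaviour described for the default settings.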
-
"The number of states at peak times has hit 450,000."
So maybe you're running into state exhaustion and pfSense is killing off the idle ones?
"The one server the issue occurs "
So you have other servers serving up stuff behind pfSense and this sort of thing doesn't happen with them? Why don't you isolate this box, or try to duplicate the problem on a test setup?
Dok is pointing out that what you're doing is not a good idea, and that is very much a valid contribution to the thread. If someone like dok says it's a bad idea, then it's a BAD idea! And I agree that what you're doing is a hack that should not have to be done. You've got something else going on; what you're doing is hiding the actual problem.
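On the state exhaustion idea: pf.conf(5) describes adaptive timeouts, where every timeout is scaled by (adaptive.end - number of states) / (adaptive.end - adaptive.start) once the state count passes adaptive.start. Below is a minimal sketch of that formula. The adaptive.start and adaptive.end numbers are the ones from the pfctl -st output posted earlier in the thread, which came from a different box; the state count of 98,000 is purely hypothetical, and the limits on the affected firewall are not known from this thread.

```python
def adaptive_timeout(base_seconds, states, start, end):
    """Effective timeout under pf's adaptive scaling, per pf.conf(5).

    Below `start` the base value applies unchanged; at or above `end`
    every timeout becomes zero; in between it shrinks linearly.
    """
    if states <= start:
        return float(base_seconds)
    if states >= end:
        return 0.0
    return base_seconds * (end - states) / (end - start)

# Illustration only: tcp.opening of 30s with the adaptive limits shown
# earlier in the thread and an assumed (hypothetical) count of 98,000 states.
effective_opening = adaptive_timeout(30, 98000, 58800, 117600)
```

With those assumed numbers a 30 second tcp.opening shrinks to 10 seconds, so it may be worth comparing the live state count against adaptive.start on the affected box before ruling this out.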
-
I really hate to state the obvious again, but – have you tried this with a physical machine?
-
The issue is solved. If it were a virtualization related issue, I would not have solved it by changing the timeouts in pfSense.
I think the source of the issue is this:
pfSense terminates sessions that are opening if the machine behind pfSense doesn't respond within 10 seconds, period.
When Apache/PHP is doing a large report processing job, it can take anywhere from 2 seconds for a small report to 15-20 seconds for a large report.
If there were a problem in the virtualization, it would be affecting more than this one program.
This is not your average situation; this is a workload the likes of which you may not have seen before.
I agree this is not an ideal fix, but please, doktornotor, explain why this is a bad idea from a technical standpoint, so maybe I can see your thought process behind this assumption.
-
You know, because… well, this just happens to no one but you, pretty much.
"PFSense terminates sessions that are openning, if the machine behind pfsense doesnt respond within 10 seconds, period."
Errr…. huh. No.
-
If you're not going to back up your responses with anything technical, then find someone else to not help.
-
"if your not going to back up your responses with anything technical, then find someone else to not help."
Your current fix includes lowering timeout values well below the defaults?