Netgate Discussion Forum

Connection Drop after 10 Seconds, TCP, HTTP

Scheduled Pinned Locked Moved NAT
26 Posts 5 Posters 8.1k Views
  • M
    MasterX-BKC- Banned
    last edited by Jan 19, 2017, 8:17 PM

    I would really hate to have to use one of my support tickets to solve what should be such a simple, rudimentary, though very thinly documented, issue.

    • J
      johnpoz LAYER 8 Global Moderator
      last edited by Jan 19, 2017, 8:41 PM

      There is no timer that would be set to 10 seconds.

      https://doc.pfsense.org/index.php/Advanced_Setup

      
      [2.3.2-RELEASE][root@pfsense.local.lan]/root: pfctl -st    
      tcp.first                   120s                           
      tcp.opening                  30s                           
      tcp.established           86400s                           
      tcp.closing                 900s                           
      tcp.finwait                  45s                           
      tcp.closed                   90s                           
      tcp.tsdiff                   30s                           
      udp.first                    60s                           
      udp.single                   30s                           
      udp.multiple                 60s                           
      icmp.first                   20s                           
      icmp.error                   10s                           
      other.first                  60s                           
      other.single                 30s                           
      other.multiple               60s                           
      frag                         30s                           
      interval                     10s                           
      adaptive.start            58800 states                     
      adaptive.end             117600 states                     
      src.track                     0s                           
      [2.3.2-RELEASE][root@pfsense.local.lan]/root:              
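A quick way to check that claim against the table above is to parse it and list every time-valued entry at or below 10 seconds. The following is a minimal Python sketch, not a pfSense tool: the helper names `parse_timeouts` and `at_or_below` are made up for illustration, and the sample text is simply the `pfctl -st` output quoted above pasted into a string.

```python
import re

# Sample copied from the `pfctl -st` output quoted above (2.3.2-RELEASE).
PFCTL_ST = """\
tcp.first                   120s
tcp.opening                  30s
tcp.established           86400s
tcp.closing                 900s
tcp.finwait                  45s
tcp.closed                   90s
tcp.tsdiff                   30s
udp.first                    60s
udp.single                   30s
udp.multiple                 60s
icmp.first                   20s
icmp.error                   10s
other.first                  60s
other.single                 30s
other.multiple               60s
frag                         30s
interval                     10s
adaptive.start            58800 states
adaptive.end             117600 states
src.track                     0s
"""


def parse_timeouts(text):
    """Return {name: seconds} for each line whose value is given in seconds.

    Lines such as `adaptive.start 58800 states` are skipped because they
    are state counts, not durations.
    """
    timeouts = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(\S+)\s+(\d+)s\s*$", line)
        if match:
            timeouts[match.group(1)] = int(match.group(2))
    return timeouts


def at_or_below(timeouts, limit, prefix=""):
    """Return the sorted names whose value is <= limit, optionally filtered by prefix."""
    return sorted(
        name
        for name, seconds in timeouts.items()
        if seconds <= limit and name.startswith(prefix)
    )


if __name__ == "__main__":
    table = parse_timeouts(PFCTL_ST)
    print("tcp.* entries at or below 10s:", at_or_below(table, 10, "tcp."))
    print("all entries at or below 10s:", at_or_below(table, 10))
```

On this sample, no `tcp.*` entry is at or below 10 seconds. The entries that are (`icmp.error`, `interval`, `src.track`) are not TCP state timeouts; `interval` is how often pf purges expired states.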
      
      

      An intelligent man is sometimes forced to be drunk to spend time with his fools
      If you get confused: Listen to the Music Play
      Please don't Chat/PM me for help, unless mod related
      SG-4860 24.11 | Lab VMs 2.7.2, 24.11

      • N
        Nullity
        last edited by Jan 19, 2017, 8:44 PM

        To be clear, this is a fully open TCP connection that loses state after ~30 seconds?

        If so, there seems to be a problem. No sane default timeout would ever be that low, so I doubt changing any of them would help.

        Have you done a packet capture or monitored the states table?

        Please correct any obvious misinformation in my posts.
        -Not a professional; an arrogant ignoramous.

        • M
          MasterX-BKC- Banned
          last edited by Jan 19, 2017, 9:28 PM

          I have monitored the state table, and I did a packet capture earlier. Here is how it happens:

          A client connects to the webserver via a browser to request a report.

          The server answers back and begins generating the report.

          If it takes longer than ~10 seconds to generate, the server sends the report, but pfSense blocks it from going out because it has already closed the state/connection.

          The client spins forever until it times out, not knowing the report was sent, because pfSense blocked it.

          • N
            Nullity
            last edited by Jan 19, 2017, 10:03 PM

            @MasterX-BKC-:

            For the moment i created a rule on LAN, to pass, tcp flags: any, State type: none.

            State type: none? You sure you want to do that?

            I'd be very hesitant to start changing things since, by default, things should be working fine, keeping states for ~24 hours. If you start playing with a bunch of options you may run into many unforeseen problems later.

            • M
              MasterX-BKC- Banned
              last edited by Jan 20, 2017, 6:18 AM Jan 20, 2017, 6:06 AM

              @Nullity:

              @MasterX-BKC-:

              For the moment i created a rule on LAN, to pass, tcp flags: any, State type: none.

              State type: none? You sure you want to do that?

              I'd be very hesitant to start changing things since, by default, things should be working fine, keeping states for ~24 hours. If you start playing with a bunch of options you may run into many unforeseen problems later.

              Actually, I finally got it to work, strangely enough by using an unusual combination of settings.

              On the rule corresponding to the NAT policy for port 80 inbound, I went under Advanced and set the following:
              State timeout: 60
              TCP flags: any
              State type: sloppy

              I tried those options individually, and it seems to require all of them for some reason. In addition, I also changed the following under
              System > Advanced > Firewall & NAT:
              TCP First: 60
              TCP Opening: 60
              TCP Established: 60 - Tested again and discovered this one has no effect on the issue; it works great with it left empty again.
              Other First: 60

              I doubt all of these need to be set this way, but I'm afraid to touch it now that the reports generate flawlessly. To prove it, I even added an extra 30-second delay to the report generator so that reports take nearly 50 seconds to complete.

              With these settings, even a 50-second report generation delay still works perfectly.

              I'm sure an admin, or someone else familiar could direct me to the better way to achieve these same results…..

              Interestingly, I first tried just TCP Established: 60, but that wasn't enough to make it work either.....

              UPDATE:  TCP Established seems to not be involved, turning it off didnt break it.

              My test file is here:  http://pfmon.black-knights.org/test.php
              Without the options set, it counts to 4-6 and then the connection hangs; with the settings above, it counts all the way to completion.

              • N
                Nullity
                last edited by Jan 20, 2017, 7:08 AM

                @MasterX-BKC-:

                UPDATE:  TCP Established seems to not be involved, turning it off didnt break it.

                Turning it off defaults it to 86400 seconds or smaller/larger depending on the "Firewall Optimization" setting, I think.

                You can run the "pfctl -st" command to see what it's set to.

                • J
                  johnpoz LAYER 8 Global Moderator
                  last edited by Jan 20, 2017, 1:27 PM

                  "someone else familiar could direct me to the better way to achieve these same results….."

                  There should be no reason why you have to edit such settings.  Did you take a look at pftop when your connections were active to see what the timeouts were in real time for your states??

                  Shouldn't that have been first place to look for such an issue?

                  • D
                    doktornotor Banned
                    last edited by Jan 20, 2017, 5:02 PM

                    Indeed, these hacks digging holes into your setup are just horrible and absolutely should not be required for anything.

                    • M
                      MasterX-BKC- Banned
                      last edited by Jan 20, 2017, 7:40 PM Jan 20, 2017, 7:31 PM

                      They were not required before, when I was using a Cisco 7507 at the gateway. The issue first came around when I moved this system to where I have pfSense, but it was manageable and only intermittent until the reports grew in size.

                      doktornotor, the fact that I'm looking for a better way to do this in and of itself shows that I'm aware this is not ideal, so your post was not called for. If you aren't going to contribute, please move along.

                      @johnpoz:

                      There should be no reason why you have to edit such settings.  Did you take a look at pftop when your connections were active to see what the timeouts were in real time for your states??

                      I agree pftop would help narrow down the issue, if it were not for the fact that this network hosts 7 servers and a total of 27 websites.  The one server the issue occurs on hosts 8 such sites, all on the same ports using Apache virtual hosts, if you're familiar with them (it's not virtualization related).  The number of states at peak times has hit 450,000.

                      This isn't a small one-off network; this is at a datacenter, with a LOT of traffic, and the server in question is a 12-core (24-thread), 144 GB RAM monster box that handles MySQL for all the other servers as well as internet-based systems using HTTPS APIs.

                      Not your average john boy setup hosting a personal webpage from his basement on an extra PC.

                      • C
                        chpalmer
                        last edited by Jan 21, 2017, 3:53 AM

                        My test file is here:  http://pfmon.black-knights.org/test.php

                        I don't suppose you would share your code so I could test here eh?

                        Curious if you have tried 1:1 NAT in favor of port forwarding?

                        Triggering snowflakes one by one..
                        Intel(R) Core(TM) i5-4590T CPU @ 2.00GHz on an M400 WG box.

                        • M
                          MasterX-BKC- Banned
                          last edited by Jan 21, 2017, 4:20 AM

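                          All that file does is:

                          $i = 1;          // starting value assumed; not shown in the original post
                          while ($i <= 30) {
                              echo $i;
                              flush();     // send each number now instead of buffering it
                              $i = $i + 1;
                              sleep(11);
                          }

                          It just sends a number every 11 seconds to see if the connection is still alive.

                          If the browser counts all the way to 30, then the issue is fixed.  If it stops for more than 11 seconds, then the connection has died.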
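For anyone reproducing the test from the client side, the pass/fail rule described above can be written down as a small check. This is a Python sketch under assumptions: `check_run` and its parameters are hypothetical names, the arrival log is a list of `(seconds_since_request, number)` pairs that you would record yourself, and the 3-second slack is an arbitrary tolerance, not something taken from the thread.

```python
def check_run(arrivals, expected_last=30, interval=11.0, slack=3.0):
    """Evaluate one run of the counting test.

    arrivals: list of (seconds_since_request, number) in arrival order.
    Returns (ok, last_number, worst_gap), where ok is True only if the
    final expected number arrived and no gap between consecutive numbers
    exceeded interval + slack.
    """
    last_number = None
    worst_gap = 0.0
    previous_time = None
    for when, number in arrivals:
        if previous_time is not None:
            worst_gap = max(worst_gap, when - previous_time)
        previous_time = when
        last_number = number
    ok = last_number == expected_last and worst_gap <= interval + slack
    return ok, last_number, worst_gap


if __name__ == "__main__":
    # A healthy run: numbers 1..30, one every 11 seconds.
    healthy = [(i * 11.0, i + 1) for i in range(30)]
    # A failing run like the one described: the count stops at 5.
    stalled = healthy[:5]
    print(check_run(healthy))
    print(check_run(stalled))
```

A run that stops early is reported as failed together with the last number seen, which matches the "counts to 4-6 and then hangs" symptom reported earlier in the thread.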

                          • J
                            johnpoz LAYER 8 Global Moderator
                            last edited by Jan 21, 2017, 11:52 AM Jan 21, 2017, 11:47 AM

                            "The number of states at peak times has hit 450,000."

                            So maybe you're running into state exhaustion and pfSense is killing off the idle ones?
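One mechanism that could tie a high state count to shortened timeouts is pf's adaptive timeout scaling: per pf.conf(5), once the number of states exceeds `adaptive.start`, all timeouts are scaled linearly by (adaptive.end - states) / (adaptive.end - adaptive.start), reaching zero at `adaptive.end`. The sketch below just evaluates that formula in Python. The function names are illustrative, and the 58800/117600 limits are taken from the `pfctl -st` output posted earlier in this thread, not from the affected firewall, whose limits are not shown.

```python
def adaptive_factor(states, start, end):
    """Scaling factor pf applies to timeouts for a given state count."""
    if states <= start:
        return 1.0  # below adaptive.start: timeouts are unchanged
    if states >= end:
        return 0.0  # at adaptive.end: timeouts drop to zero
    return (end - states) / (end - start)


def scaled_timeout(base_seconds, states, start, end):
    """Effective timeout after adaptive scaling."""
    return base_seconds * adaptive_factor(states, start, end)


def states_for_timeout(base_seconds, target_seconds, start, end):
    """State count at which base_seconds would be scaled down to target_seconds.

    Only meaningful for 0 <= target_seconds <= base_seconds.
    """
    return end - (target_seconds / base_seconds) * (end - start)


if __name__ == "__main__":
    # Example limits from the pfctl -st output earlier in the thread
    # (an assumption for illustration; the affected box may differ).
    START, END = 58800, 117600
    for count in (50000, 80000, 98000, 110000):
        print(count, round(scaled_timeout(30, count, START, END), 1))
```

With those example limits, a 30-second `tcp.opening` timeout would shrink to roughly 10 seconds at about 98,000 states. Whether that is what happened here cannot be told from the thread.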

                            "The one server the issue occurs "

                            So you have other servers serving up stuff behind pfSense and this sort of thing doesn't happen with them?  Why don't you isolate this box, or try to duplicate the problem on a test setup?

                            Dok is pointing out that what you're doing is not a good idea, and that is very much a valid contribution to the thread. If someone like dok says it's a bad idea, then it's a BAD idea!!  And I agree that what you're doing is a hack that should not have to be done…  You've got something else going on; what you're doing is hiding the actual problem.

                            • D
                              doktornotor Banned
                              last edited by Jan 21, 2017, 8:01 PM

                              I really hate to state the obvious again, but – have you tried this with a physical machine?

                              • M
                                MasterX-BKC- Banned
                                last edited by Jan 21, 2017, 9:31 PM

                                The issue is solved. If it were a virtualization-related issue, I would not have solved it by changing the timeouts in pfSense.

                                I think the source of the issue is this.

                                PFSense terminates sessions that are opening, if the machine behind pfSense doesn't respond within 10 seconds, period.

                                When Apache/PHP is doing a large report processing job, it can take anywhere from 2 seconds for a small report to 15-20 seconds for a large one.

                                If there were a problem in the virtualization, it would be affecting more than this one program.

                                This is not your average situation; this is a workload the likes of which you may not have seen before.

                                I agree this is not an ideal fix, but please, doktornotor, explain why this is a bad idea from a technical standpoint, so maybe I can see the thought process behind this assumption.

                                • D
                                  doktornotor Banned
                                  last edited by Jan 21, 2017, 10:36 PM

                                  You know, because… well, this just happens to no one but you, pretty much.

                                  PFSense terminates sessions that are opening, if the machine behind pfSense doesn't respond within 10 seconds, period.

                                  Errr…. huh. No.

                                  • M
                                    MasterX-BKC- Banned
                                    last edited by Jan 23, 2017, 10:14 PM

                                    If you're not going to back up your responses with anything technical, then find someone else to not help.

                                    • N
                                      Nullity
                                      last edited by Jan 24, 2017, 2:39 PM

                                      @MasterX-BKC-:

                                      If you're not going to back up your responses with anything technical, then find someone else to not help.

                                      Your current fix includes lowering timeout values well below the defaults?

                                      • M
                                        MasterX-BKC- Banned
                                        last edited by Jan 24, 2017, 3:37 PM

                                        Actually, making them longer. It seems that while the stack is building the report, it doesn't respond at all until the report is actually complete, and then sends it.

                                        But the sending was happening just after the timeout, so the response from the stack was blocked from going out, as its state had already been dropped.

                                        stack = MySQL, PHP, Apache.

                                        Sidenote: I only came to this conclusion after thoroughly testing all of the timeout settings in PHP and Apache; nothing I did made any difference to the issue. I was also able to confirm it with a proof-of-concept test file that simulated the same delay without actually doing anything: the identical behavior was seen, thus it isn't a load issue.

                                        A PHP file that slept 11 seconds and then printed a word would never actually print anything. If the sleep was lowered to 9 seconds it responded every time, and at 10 seconds it responded intermittently because it was right on the wire timing-wise.

                                        A touch command inserted into the PHP file after the print revealed that even when it failed to respond, it was indeed processing to completion, but pfSense was not allowing the data out because it had closed the state. This was further evidenced by an outbound denial in the firewall logs with a source port of 80, coming from the webserver, meaning it was a response to an HTTP request.

                                        • N
                                          Nullity
                                          last edited by Jan 24, 2017, 4:22 PM

                                          I wonder why tcp.first & tcp.opening made an impact, since I assume tcp.established should be the only relevant parameter.

                                          I'm too curious to leave it alone but I guess if it works, it works.

                                          Why did you change state tracking to sloppy (or none?)?

                                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.