Netgate Discussion Forum

Unpredictable connection timeouts

General pfSense Questions
  • P Offline
    pfsenselessness
    last edited by Aug 19, 2019, 11:14 AM

    Hello,

    I have a modest home network setup running pfsense. With my previous ISP, I was using double NAT, as it was not possible to obtain my PPPoE configuration settings (i.e. the ISP's device was giving a static IP address to my pfsense box's WAN).

    I never had any problems with this setup.

    I have since changed ISPs, so I have removed the double NAT and now have a direct connection on the WAN side using PPPoE. Ever since making this change, I have had strange and unpredictable connectivity issues, mostly when connecting to large websites with a CDN (for example Netflix or Amazon). Sometimes the connection will consistently fail and time out from one device (for example my phone), but it will connect and work as expected from another device on the same network (for example my laptop). This is consistent in the sense that I can refresh multiple times on the "broken" device and nothing will change, while the service continues to work on another device.

    Furthermore, when this happens, if I curl directly from my pfsense box, I always receive a response from the server. Unfortunately, these connectivity issues are unpredictable and I have not been able to debug or identify the cause (or even a predictable way to make this problem occur).

    My pfsense machine (with the web GUI open) runs at about 6% CPU, 0% state table size, 3% MBUF usage and 28% memory. I also tried disabling "heavy" services such as suricata and squid, but the problem persists.

    The only real change I can see here is that I am directly on the Internet, so my pfsense box is getting hit with a lot more external connections (which were previously filtered by the ISP's device).

    Does anybody have any ideas for what could be causing this?

    Any suggestions to fix the issue or debug the cause?

    Thanks and apologies if this is not the right section to ask this question.

    • S Offline
      stephenw10 Netgate Administrator
      last edited by Aug 19, 2019, 6:15 PM

      The symptoms sound more like a DNS issue or possibly MTU related.

      You need to test from a failing client and see what's actually failing. Try running packet capture for that traffic to see what is being sent and whether it appears on both LAN and WAN in pfSense.
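To see what is actually failing from an affected client, a small connection probe can help separate a silent timeout from an active reset or refusal. The following is only a minimal sketch using Python's standard `socket` module; the names `probe` and `classify_error` are hypothetical, and the default port 443 is an assumption based on the HTTPS sites mentioned in the thread.

```python
import socket


def classify_error(exc: BaseException) -> str:
    """Map a socket exception to a short label describing how the attempt failed."""
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"   # nothing came back within the time limit
    if isinstance(exc, ConnectionRefusedError):
        return "refused"   # typically a RST sent in reply to the SYN
    if isinstance(exc, ConnectionResetError):
        return "reset"     # connection torn down by a RST
    return f"error: {exc}"  # anything else, e.g. a name resolution failure


def probe(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Attempt one TCP connection and report how it ended."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except OSError as exc:
        return classify_error(exc)
```

Running `probe("<failing IP>")` on both the failing and the working client while a packet capture is active on LAN and WAN gives a label to line up against what the capture shows.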

      Steve

      • P Offline
        pfsenselessness
        last edited by Aug 19, 2019, 11:23 PM

        Thanks Steve,

        I have done this previously and what I saw with wireshark was that the TCP connections were being RST on the LAN side, but not on the WAN side. I couldn't identify any reason for this RST though.

        I don't think it's a DNS issue, as if I try to directly curl the IP address I also have the same problem (external hostnames resolve without an issue).

        How would I go about debugging the MTU? Or are there some variables/settings somewhere I can tinker with?

        • S Offline
          stephenw10 Netgate Administrator
          last edited by Aug 20, 2019, 12:32 PM

          The TCP connections are set up from the client to the server directly. There is no TCP termination on the firewall. Any TCP RST packets would have to be coming from the remote server. It's hard to see how those could be on LAN but not WAN.
          The only exception to that would be if you are running Squid in pfSense.

          You said you only see this on CDN destinations. Do clients that are failing resolve to a different IP than those that can connect? They may have a different route if that's the case and then MTU size might come into play.
          https://docs.netgate.com/pfsense/en/latest/interfaces/low-throughput-troubleshooting.html#mtu-issues
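To check whether failing and working clients resolve a CDN hostname to different addresses, the same lookup can be run on each machine and the results compared. This is a rough sketch using the standard-library `socket.getaddrinfo`; the function names `resolve_ipv4` and `compare_answers` are made up for illustration, and only IPv4 answers are considered here.

```python
import socket
from typing import Dict, Set


def resolve_ipv4(hostname: str) -> Set[str]:
    """Return the set of IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def compare_answers(working: Set[str], failing: Set[str]) -> Dict[str, Set[str]]:
    """Split two sets of answers into addresses seen by one client only and by both."""
    return {
        "only_working": working - failing,
        "only_failing": failing - working,
        "shared": working & failing,
    }
```

Run `resolve_ipv4` on each client, then pass both results to `compare_answers`. Any address that only the failing client receives is a candidate for a direct connection test.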

          Steve

          • P Offline
            pfsenselessness
            last edited by pfsenselessness Aug 21, 2019, 10:43 AM Aug 21, 2019, 10:36 AM

            @stephenw10 said in Unpredictable connection timeouts:

            might

            Hi Steve,

            Seems you are correct.

            When I nslookup from a machine which works, I receive a different result than from a machine which doesn't work.

            I tried playing around with the MSS clamping settings. The default MTU on the WAN is 1492 (PPPoE). I tried setting the MSS to 1452, but it didn't make any difference.
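For reference, the arithmetic behind those numbers can be written out as below. This is only an illustrative sketch: it assumes IPv4 and TCP headers without options (20 bytes each) and the usual 8 bytes of PPPoE overhead on a 1500-byte Ethernet link, and the function names are hypothetical.

```python
IPV4_HEADER = 20      # bytes, IPv4 header without options
TCP_HEADER = 20       # bytes, TCP header without options
ICMP_HEADER = 8       # bytes, ICMP echo header
PPPOE_OVERHEAD = 8    # bytes, 6-byte PPPoE header + 2-byte PPP protocol field


def pppoe_mtu(ethernet_mtu: int = 1500) -> int:
    """MTU left for IP packets once PPPoE overhead is removed."""
    return ethernet_mtu - PPPOE_OVERHEAD


def tcp_mss(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - TCP_HEADER


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


print(pppoe_mtu())             # 1492
print(tcp_mss(1492))           # 1452
print(max_ping_payload(1492))  # 1464
```

The last value is the payload size to use when pinging with the don't-fragment bit set to test whether full-size packets get through on a given path.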

            Should I be looking to change these settings on the WAN, or the VLANs?

            Do you have any values you would suggest to try?

            This is what I see in Wireshark when the connection is failing:
            fc219c30-8ff1-4dab-9982-3fec275b0631-image.png

            • S Offline
              stephenw10 Netgate Administrator
              last edited by Aug 21, 2019, 2:38 PM

              Hmm, all of those packets are tiny. Unlikely to be an MTU issue.
              You are seeing traffic back from the target too so the route is good.
              Hard to say why it's failing then. I don't see any RST packets there.

              Steve

              • P Offline
                pfsenselessness
                last edited by Aug 22, 2019, 6:27 AM

                Hi Steve,

                Any suggestions on what I could look into in order to debug this?

                I'm at a loss on what to look for to identify the problem.

                • S Offline
                  stephenw10 Netgate Administrator
                  last edited by Aug 22, 2019, 1:51 PM

                  I would still be looking at packet captures to see what happens when it fails. Does the remote end just send a RST?

                  Are all devices using the same DNS servers?

                  Do you have any sort of VPN involved here?

                  Steve

                  • P Offline
                    pfsenselessness
                    last edited by Aug 27, 2019, 9:14 AM

                    Hi Steve.

                    They are all using the same DNS servers (received via DHCP from pfsense).

                    There is no VPN, direct Internet connection from my ISP.

                    The error I see over the wire is the error I posted from wireshark.

                    I have nothing else to go on to debug.

                    • S Offline
                      stephenw10 Netgate Administrator
                      last edited by Aug 27, 2019, 10:32 AM

                      The only actual issue I see there is two retransmissions, but that may be normal packet loss. Not really something that should kill the connection. You are seeing traffic in both directions there.

                      Was that pcap on the WAN? How was it filtered? Do you see anything different on the internal interface?

                      Steve

                      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.