Netgate Discussion Forum

    Intermittent connection issue

    DHCP and DNS
    115 Posts 6 Posters 24.6k Views
    • K
      kevindd992002 @johnpoz
      last edited by

      @johnpoz said in Non-forwarding Resolver intermittent operation:

      Where you're going to have a problem is pinging stuff off their network and getting packet loss - they can always just say it's not their network..

      You need to have pings going to their network, and not getting back answers.. Also, many ISPs (all of them, really) do not promise zero loss, so unless you have significant packet loss - good luck.. TCP can work just fine with a small amount of packet loss.. Do they have anything in their SLA about the amount of packet loss?

      If you call and say yeah, over 10k pings I saw .01% loss - they will just laugh and say, ok so? But if you show that you have sustained loss of, say, 5%, then you might have something to complain about.

      So I am going to say it again, a few lost packets here or there are not going to be an issue.. And they are not the root of your problem with resolving stuff.. It's not like DNS only does 1 query, and if there is no answer just says F it, doesn't work... DNS will send multiple queries before it gives up, and can even switch to TCP vs UDP, etc. So for packet loss to be a problem with DNS resolution it really needs to be a significant issue.
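To put rough numbers on the retry argument quoted above, here is a minimal sketch. It assumes packet loss is independent per packet and that a lookup is retried a fixed number of times; the retry count of 3 is an illustrative assumption, not Unbound's actual retry policy.

```python
def dns_failure_probability(loss_rate: float, attempts: int) -> float:
    """Chance that every attempt fails, assuming independent packet loss.

    An attempt succeeds only if both the query and the response arrive,
    so the per-attempt success rate is (1 - loss_rate) ** 2.
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    per_attempt_failure = 1.0 - (1.0 - loss_rate) ** 2
    return per_attempt_failure ** attempts


if __name__ == "__main__":
    # Hypothetical retry count of 3; real resolvers vary.
    for loss in (0.0001, 0.05, 0.5, 1.0):
        chance = dns_failure_probability(loss, 3)
        print(f"loss={loss:.2%} -> all 3 attempts fail: {chance:.6%}")
```

Under these assumptions, small loss rates barely register, while a total outage (loss rate of 1.0) fails every lookup no matter how many retries are made.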

      I'm going to ask this again too: Would a 10-minute ping to google.com with ALL RTOs not be considered an issue for you? I'm really having a hard time understanding why you wouldn't consider that an issue.

      Like I said, I couldn't care less about a few RTOs because they will be retried anyway; I agree with you completely. But if you start getting 100% RTO for a span of even just one minute and your clients cannot browse the Internet at all, then where in the world is that not an issue?
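One way to separate "a few RTOs" from "100% RTO for a minute" is to log every probe with a timestamp and look for the longest unbroken run of failures. The sketch below assumes the ping results have already been collected as (timestamp, replied) pairs; the function name and input shape are hypothetical.

```python
from typing import List, Optional, Tuple


def longest_outage(samples: List[Tuple[float, bool]]) -> Optional[Tuple[float, float]]:
    """Return (start, end) timestamps of the longest run of failed probes.

    samples: (unix_timestamp, replied) pairs in chronological order.
    Returns None if every probe got a reply.
    """
    best: Optional[Tuple[float, float]] = None
    run_start: Optional[float] = None
    run_end: Optional[float] = None

    def close_run() -> None:
        # Keep the current run if it is strictly longer than the best so far.
        nonlocal best
        if run_start is None or run_end is None:
            return
        if best is None or (run_end - run_start) > (best[1] - best[0]):
            best = (run_start, run_end)

    for timestamp, replied in samples:
        if replied:
            close_run()
            run_start = None
            run_end = None
        else:
            if run_start is None:
                run_start = timestamp
            run_end = timestamp
    close_run()  # the log may end in the middle of an outage
    return best
```

A window lasting minutes is something concrete to hand to an ISP; isolated single failures are not.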

      • K
        kevindd992002
        last edited by

        I will record a video of the issue when I get the chance and post it here as proof.

        • johnpozJ
          johnpoz LAYER 8 Global Moderator
          last edited by johnpoz

          10 minutes, yeah, that is a problem - but maybe it's outside the ISP's control, maybe it's somewhere past where the traffic leaves the ISP.. You can get them to troubleshoot if it is happening all the time and you cannot get to google. But 8.8.8.8 is not google.

          And 8.8.8.8 does not have anything to do with overall DNS, unless it's the authoritative NS for what you're looking for and you cannot talk to it.. Which seems really odd, since it's anycast.. Unless you are forwarding to it, which per your title you are not.

          If you can show an outage pinging 8.8.8.8 for 10 minutes - then for sure you can bring that to your ISP's attention and say hey - WTF... But unless you can show them that it happens more than once in a while, you're going to have a hard time getting their attention.

          edit: Do not post any stupid videos.. JFC nobody is going to watch such nonsense... You can either resolve something or you cannot, you can either ping something or you cannot... Show the sniff of the traffic, show the traceroute to the IP.. Show the sniffs of the resolving action getting no response, etc.

          BTW - here are the NS involved in resolving google.com.. Notice 8.8.8.8 is not there

          [2.4.4-RELEASE][admin@sg4860.local.lan]/: unbound-control -c /var/unbound/unbound.conf lookup google.com
          The following name servers are used for lookup of google.com.
          ;rrset 78708 4 0 2 0
          google.com.     78708   IN      NS      ns2.google.com.
          google.com.     78708   IN      NS      ns1.google.com.
          google.com.     78708   IN      NS      ns3.google.com.
          google.com.     78708   IN      NS      ns4.google.com.
          ;rrset 78623 1 0 1 0
          ns4.google.com. 78623   IN      A       216.239.38.10
          ;rrset 78623 1 0 1 0
          ns4.google.com. 78623   IN      AAAA    2001:4860:4802:38::a
          ;rrset 78623 1 0 1 0
          ns3.google.com. 78623   IN      A       216.239.36.10
          ;rrset 78623 1 0 1 0
          ns3.google.com. 78623   IN      AAAA    2001:4860:4802:36::a
          ;rrset 78623 1 0 1 0
          ns1.google.com. 78623   IN      A       216.239.32.10
          ;rrset 78623 1 0 1 0
          ns1.google.com. 78623   IN      AAAA    2001:4860:4802:32::a
          ;rrset 78623 1 0 1 0
          ns2.google.com. 78623   IN      A       216.239.34.10
          ;rrset 78623 1 0 1 0
          ns2.google.com. 78623   IN      AAAA    2001:4860:4802:34::a
          Delegation with 4 names, of which 0 can be examined to query further addresses.
          It provides 8 IP addresses.
          2001:4860:4802:34::a    expired, rto 154191216 msec, tA 0 tAAAA 0 tother 0.
          216.239.34.10           rto 223 msec, ttl 776, ping 7 var 54 rtt 223, tA 0, tAAAA 0, tother 0, EDNS 0 probed.
          2001:4860:4802:32::a    rto 376 msec, ttl 776, ping 0 var 94 rtt 376, tA 0, tAAAA 0, tother 0, EDNS 0 assumed.
          216.239.32.10           not in infra cache.
          2001:4860:4802:36::a    rto 376 msec, ttl 776, ping 0 var 94 rtt 376, tA 0, tAAAA 0, tother 0, EDNS 0 assumed.
          216.239.36.10           rto 311 msec, ttl 776, ping 3 var 77 rtt 311, tA 0, tAAAA 0, tother 0, EDNS 0 probed.
          2001:4860:4802:38::a    rto 376 msec, ttl 776, ping 0 var 94 rtt 376, tA 0, tAAAA 0, tother 0, EDNS 0 assumed.
          216.239.38.10           rto 252 msec, ttl 776, ping 4 var 62 rtt 252, tA 0, tAAAA 0, tother 0, EDNS 0 probed.
          [2.4.4-RELEASE][admin@sg4860.local.lan]/: 
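The per-address lines at the end of that output are easy to pull apart if you want to compare RTO values over time. This is only a sketch: the function name is hypothetical, the line format is assumed from the output shown above rather than from any documented guarantee, and expired or uncached entries are reported as None.

```python
import ipaddress
import re
from typing import Dict, Optional

_RTO = re.compile(r"\brto (\d+) msec")


def parse_infra_rto(output: str) -> Dict[str, Optional[int]]:
    """Map each name server address to its cached RTO in milliseconds.

    Lines that do not start with an IP address (rrset dumps, headers,
    shell prompts) are skipped. Addresses marked 'expired' or
    'not in infra cache' map to None.
    """
    rto_by_addr: Dict[str, Optional[int]] = {}
    for line in output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        token, rest = parts
        try:
            ipaddress.ip_address(token)
        except ValueError:
            continue  # not an address line
        if rest.startswith("expired"):
            rto_by_addr[token] = None
            continue
        match = _RTO.search(rest)
        rto_by_addr[token] = int(match.group(1)) if match else None
    return rto_by_addr
```

Saving the command output to a file during a good period and again during the problem, then comparing the two dictionaries, would show whether the authoritative servers became unreachable from the resolver's point of view.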
          

          An intelligent man is sometimes forced to be drunk to spend time with his fools
          If you get confused: Listen to the Music Play
          Please don't Chat/PM me for help, unless mod related
          SG-4860 24.11 | Lab VMs 2.7.2, 24.11

          • K
            kevindd992002 @johnpoz
            last edited by

            @johnpoz said in Non-forwarding Resolver intermittent operation:

            10 minutes, yeah, that is a problem - but maybe it's outside the ISP's control, maybe it's somewhere past where the traffic leaves the ISP.. You can get them to troubleshoot if it is happening all the time and you cannot get to google. But 8.8.8.8 is not google.

            And 8.8.8.8 does not have anything to do with overall DNS, unless it's the authoritative NS for what you're looking for and you cannot talk to it.. Which seems really odd, since it's anycast.. Unless you are forwarding to it, which per your title you are not.

            If you can show an outage pinging 8.8.8.8 for 10 minutes - then for sure you can bring that to your ISP's attention and say hey - WTF... But unless you can show them that it happens more than once in a while, you're going to have a hard time getting their attention.

            edit: Do not post any stupid videos.. JFC nobody is going to watch such nonsense... You can either resolve something or you cannot, you can either ping something or you cannot... Show the sniff of the traffic, show the traceroute to the IP.. Show the sniffs of the resolving action getting no response, etc.

            I know 8.8.8.8 is not google.com. These are two different servers that I showed in my tests above. I'm not sure why you're not following.

            Well, I thought a video would help you believe me that there's a problem. If you don't want it, then fine. Obviously, your network troubleshooting skills are way better than mine, but I make it a point to give any information I deem necessary for everyone to check. This is why I'm asking for guidance.

            I thought we were already past that and no longer consider this to be a DNS issue? If it were, then the resolution part is where I'd have issues, but when the issue happens, pinging random servers shows RTOs as well. I'm not saying here that it is a DNS issue. 8.8.8.8 was just the monitor IP I have had for the WAN gateway from the very start, so that's what I showed everyone in this forum.

            • K
              kevindd992002
              last edited by kevindd992002

              Here's a traceroute that I sent them a few months ago: https://pastebin.com/JqPx326v

              That looks like the RTO starts from the hop that's within the ISP network. Is that enough evidence for them to conclude that the problem is in their network?

              And then 3 minutes after the issue, it got resolved and the traceroute results became like this: https://pastebin.com/XYbNMiWy

              • johnpozJ
                johnpoz LAYER 8 Global Moderator
                last edited by johnpoz

                You got to the end point in that trace.. That not all hops along the way answer does not always mean anything..

                Not sure what you think that shows as a problem?

                Tracing route to www.pfsense.org [208.123.73.69]
                over a maximum of 30 hops:
                
                  1    <1 ms    <1 ms    <1 ms  192.168.9.253
                  2    10 ms     9 ms    16 ms  50.4.132.1
                  3    11 ms    17 ms    10 ms  76.73.191.106
                  4     9 ms     9 ms     8 ms  76.73.164.142
                  5    12 ms    10 ms     9 ms  76.73.164.154
                  6    13 ms    10 ms    10 ms  76.73.191.242
                  7    11 ms    21 ms    10 ms  143.59.95.224
                  8    30 ms    15 ms    18 ms  75.76.35.8
                  9     *       13 ms    11 ms  4.16.38.157
                 10     *        *        *     Request timed out.
                 11    36 ms    46 ms    37 ms  4.14.49.2
                 12    41 ms    35 ms    35 ms  64.20.229.158
                 13    36 ms    35 ms    35 ms  66.219.34.194
                 14    34 ms    38 ms    35 ms  208.123.73.4
                 15    39 ms    35 ms    39 ms  208.123.73.69
                

                So from that trace I guess I am having issues getting to www.pfsense.org?

                Same goes for cnn, it seems.

                $ tracert -d www.cnn.com
                
                Tracing route to turner-tls.map.fastly.net [151.101.185.67]
                over a maximum of 30 hops:
                
                  1    <1 ms     3 ms    <1 ms  192.168.9.253
                  2    10 ms    11 ms    16 ms  50.4.132.1
                  3    19 ms    20 ms     8 ms  76.73.191.106
                  4    11 ms    10 ms     9 ms  76.73.164.142
                  5    13 ms    10 ms    11 ms  76.73.164.154
                  6    10 ms    11 ms    11 ms  76.73.191.242
                  7    10 ms    10 ms    10 ms  143.59.95.224
                  8    13 ms     9 ms    10 ms  75.76.35.8
                  9     *        *        *     Request timed out.
                 10    12 ms    10 ms    10 ms  151.101.185.67
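The point of both traces can be checked mechanically: what matters is whether the last hop that answered is the destination, not whether every hop in between replied. A rough sketch follows, assuming Windows `tracert` output in the layout shown above; the function name is hypothetical and only IPv4 is handled.

```python
import re
from typing import List, Tuple

# A hop line starts with the hop number; header lines do not.
_HOP = re.compile(r"^\s*(\d+)\s+(.*)$")
# The responder address is the last field, optionally in [brackets].
_IPV4 = re.compile(r"\[?(\d{1,3}(?:\.\d{1,3}){3})\]?\s*$")


def summarize_trace(tracert_output: str, destination: str) -> Tuple[bool, List[int]]:
    """Return (reached_destination, hop numbers that never answered)."""
    silent_hops: List[int] = []
    last_responder = None
    for line in tracert_output.splitlines():
        hop = _HOP.match(line)
        if not hop:
            continue
        addr = _IPV4.search(hop.group(2))
        if addr:
            last_responder = addr.group(1)
        else:
            silent_hops.append(int(hop.group(1)))  # e.g. "Request timed out."
    return last_responder == destination, silent_hops
```

A trace that reaches the destination with one or two silent hops is normal. A trace where every hop past a certain point is silent and the destination is never reached is the kind worth saving.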
                

                • K
                  kevindd992002 @johnpoz
                  last edited by

                  @johnpoz said in Non-forwarding Resolver intermittent operation:

                  You got to the end point in that trace.. That not all hops along the way answer does not always mean anything..

                  Not sure what you think that shows as a problem?

                  Tracing route to www.pfsense.org [208.123.73.69]
                  over a maximum of 30 hops:
                  
                    1    <1 ms    <1 ms    <1 ms  192.168.9.253
                    2    10 ms     9 ms    16 ms  50.4.132.1
                    3    11 ms    17 ms    10 ms  76.73.191.106
                    4     9 ms     9 ms     8 ms  76.73.164.142
                    5    12 ms    10 ms     9 ms  76.73.164.154
                    6    13 ms    10 ms    10 ms  76.73.191.242
                    7    11 ms    21 ms    10 ms  143.59.95.224
                    8    30 ms    15 ms    18 ms  75.76.35.8
                    9     *       13 ms    11 ms  4.16.38.157
                   10     *        *        *     Request timed out.
                   11    36 ms    46 ms    37 ms  4.14.49.2
                   12    41 ms    35 ms    35 ms  64.20.229.158
                   13    36 ms    35 ms    35 ms  66.219.34.194
                   14    34 ms    38 ms    35 ms  208.123.73.4
                   15    39 ms    35 ms    39 ms  208.123.73.69
                  

                  So from that trace I guess I am having issues getting to www.pfsense.org?

                  Of course not! Some routers are set up to not respond to ICMP requests, I know that.

                  But how do you explain my first and second traceroutes, before and after the issue (three minutes apart)? Is it because the route to the same server changed in a span of 3 minutes?

                  • Raffi_R
                    Raffi_ @kevindd992002
                    last edited by

                    @kevindd992002 said in Non-forwarding Resolver intermittent operation:

                    Here's a traceroute that I sent them a few months ago: https://pastebin.com/JqPx326v

                    That looks like the RTO starts from the hop that's within the ISP network. Is that enough evidence for them to conclude that the problem is in their network?

                    And then 3 minutes after the issue, it got resolved and the traceroute results became like this: https://pastebin.com/XYbNMiWy

                    @kevindd992002 That's not really proof of an issue on their network. Not all hops along the route will always respond. It's common to have hops that don't respond along the route. As long as at the end you get to the server, that's what matters. Also, the hops are not taking very long so that also looks OK.

                    • K
                      kevindd992002 @Raffi_
                      last edited by

                      @Raffi_ said in Non-forwarding Resolver intermittent operation:

                      @kevindd992002 said in Non-forwarding Resolver intermittent operation:

                      Here's a traceroute that I sent them a few months ago: https://pastebin.com/JqPx326v

                      That looks like the RTO starts from the hop that's within the ISP network. Is that enough evidence for them to conclude that the problem is in their network?

                      And then 3 minutes after the issue, it got resolved and the traceroute results became like this: https://pastebin.com/XYbNMiWy

                      @kevindd992002 That's not really proof of an issue on their network. Not all hops along the route will always respond. It's common to have hops that don't respond along the route. As long as at the end you get to the server, that's what matters. Also, the hops are not taking very long so that also looks OK.

                      Right, that's what I thought. I just posted the screenshots here in case you guys see something out of the ordinary.

                      • Raffi_R
                        Raffi_ @kevindd992002
                        last edited by

                        @kevindd992002 said in Non-forwarding Resolver intermittent operation:

                        Yeah, I know how ISPs react when you say that. They panic and get things solved faster. There are competitors, for sure, but in my condo there's only one offering FTTH connections, the one that I'm using now. The others are using crappy copper phone cables which are very substandard. And I only switched from that copper ISP to this fiber ISP in January 2019, so not long ago.

                        Off topic, but we're actually in the copper test industry. Believe it or not, they have technology now that is able to get close to Gigabit speeds on those old copper lines if the ISP is willing to invest in it. I'm curious are you in Australia? Here in the US, the old phone lines have been mostly abandoned in terms of further investment.

                        • K
                          kevindd992002 @Raffi_
                          last edited by

                          @Raffi_ said in Non-forwarding Resolver intermittent operation:

                          @kevindd992002 said in Non-forwarding Resolver intermittent operation:

                          Yeah, I know how ISPs react when you say that. They panic and get things solved faster. There are competitors, for sure, but in my condo there's only one offering FTTH connections, the one that I'm using now. The others are using crappy copper phone cables which are very substandard. And I only switched from that copper ISP to this fiber ISP in January 2019, so not long ago.

                          Off topic, but we're actually in the copper test industry. Believe it or not, they have technology now that is able to get close to Gigabit speeds on those old copper lines if the ISP is willing to invest in it. I'm curious are you in Australia? Here in the US, the old phone lines have been mostly abandoned in terms of further investment.

                          I can imagine. I was mostly talking about how substandard the copper wires are here in our condo. Even the copper ISPs themselves tell me that the copper wires that the contractors used in this condo are crap. The copper wires in the building's cabinet look worse than spaghetti. And no one wants to invest in replacing them. I'm in the Philippines, so a third-world country, but Internet service here has come a long way already. My two service plans are 35 down/35 up (around $31) and 300 down/300 up (around $87).

                          • K
                            kevindd992002
                            last edited by kevindd992002

                            @Raffi_

                            The Asus router as the main router, without pfsense, has been issue-free for the last two days. It's still too early to tell, but I'll continue monitoring during the weekend (the time when the issue usually occurs most) before I come to a conclusion. If it does run flawlessly until Monday though, I'm not sure how to continue troubleshooting pfsense except to uninstall and reinstall it from scratch. I mean, that's an easy task when I just need to reload the config, but if I go that route I would not want to carry over any settings from the config (which might be corrupted or something, for all we know).

                            • Raffi_R
                              Raffi_
                              last edited by

                              @kevindd992002 Interesting. Ok, that sounds like a good plan. Yea give it a little while to see how it goes. We'll see what the next step is from there. Have a good weekend.
                              Raffi

                              • K
                                kevindd992002
                                last edited by

                                @Raffi_

                                After 5 days of continuously using the ASUS router, I haven't had a single occurrence of the issue! That rules out the ASUS router, cables, and ISP modem as the root cause of the issue.

                                I decided, just now, to switch back to pfsense, and as soon as I plugged it in and waited for everything to go green in the Dashboard, I experienced the issue. It's got to be either the pfsense software itself or the physical hardware that hosts pfsense (although I doubt this). What can you recommend as a next step here?

                                • johnpozJ
                                  johnpoz LAYER 8 Global Moderator
                                  last edited by

                                  And your asus router was actually resolving for dns?

                                  Your title says non-forwarding problems.. I find it unlikely that your asus router was resolving for dns vs forwarding..

                                  Do you understand what the difference is?

                                  • K
                                    kevindd992002 @johnpoz
                                    last edited by kevindd992002

                                    @johnpoz said in Non-forwarding Resolver intermittent operation:

                                    And your asus router was actually resolving for dns?

                                    Your title says non-forwarding problems.. I find it unlikely that your asus router was resolving for dns vs forwarding..

                                    Do you understand what the difference is?

                                    Yes, I understand the difference between DNS resolver and DNS forwarder. I've already established this a few posts above. How can I rename the title for this whole thread and move it to the correct section? So that we can all be over the technicalities. Are my test results still not convincing for you that pfsense is causing my issue? What can I do to convince you?

                                    The ASUS router is NOT a DNS resolver. It is a DNS forwarder and I was forwarding to the OpenDNS servers. That's the only main difference I see: pfsense was set as a DNS resolver (using root hints and not forwarding) while the ASUS router does not have this feature and is simply doing DNS forwarding.

                                    • johnpozJ
                                      johnpoz LAYER 8 Global Moderator
                                      last edited by

                                      So then - troubleshoot your dns problem when you're "resolving", or set pfsense to forward to opendns like your asus was doing.

                                      • Raffi_R
                                        Raffi_ @kevindd992002
                                        last edited by

                                        @kevindd992002 said in Non-forwarding Resolver intermittent operation:

                                        How can I rename the title for this whole thread and move it to the correct section?

                                        FYI, you can change the title of this topic by going up to your very first post, then click on the 3 dots and go to edit. Then above the text box where it shows the title, you can go in and edit it to help avoid any confusion. Maybe something more generic like intermittent connection issue or whatever you like. As for moving it to the correct section, that would have to be done by someone else. You can start a new topic in the correct section if you'd like, but I think we already made some progress here.

                                        After 5 days of continuously using the ASUS router, I haven't had a single occurrence of the issue! That rules out the ASUS router, cables, and ISP modem as the root cause of the issue.

                                        I decided, just now, to switch back to pfsense, and as soon as I plugged it in and waited for everything to go green in the Dashboard, I experienced the issue. It's got to be either the pfsense software itself or the physical hardware that hosts pfsense (although I doubt this). What can you recommend as a next step here?

                                        Here is what I would suggest doing if you haven't already.

                                        • Make a backup of your current pfSense config.
                                        • Do a fresh install of the latest pfSense (2.4.4-RELEASE-p3).
                                        • Configure your interfaces as needed.
                                        • Do not install any additional packages.
                                        • Leave all the default settings. pfSense is pretty darn secure out of the box, certainly more secure than the Asus anyway.

                                        Then let's see where that gets you.

                                        Raffi

                                        • K
                                          kevindd992002 @Raffi_
                                          last edited by

                                          @johnpoz said in Non-forwarding Resolver intermittent operation:

                                          So then - troubleshoot your dns problem when you're "resolving", or set pfsense to forward to opendns like your asus was doing.

                                          I thought even you agreed that this is not a DNS problem? So why would I concentrate on troubleshooting DNS? I'm pinging an IP address when the issue happens so that in itself tells everyone already that it isn't a DNS issue.

                                          @Raffi_ said in Non-forwarding Resolver intermittent operation:

                                          @kevindd992002 said in Non-forwarding Resolver intermittent operation:

                                          How can I rename the title for this whole thread and move it to the correct section?

                                          FYI, you can change the title of this topic by going up to your very first post, then click on the 3 dots and go to edit. Then above the text box where it shows the title, you can go in and edit it to help avoid any confusion. Maybe something more generic like intermittent connection issue or whatever you like. As for moving it to the correct section, that would have to be done by someone else. You can start a new topic in the correct section if you'd like, but I think we already made some progress here.

                                          After 5 days of continuously using the ASUS router, I haven't had a single occurrence of the issue! That rules out the ASUS router, cables, and ISP modem as the root cause of the issue.

                                          I decided, just now, to switch back to pfsense, and as soon as I plugged it in and waited for everything to go green in the Dashboard, I experienced the issue. It's got to be either the pfsense software itself or the physical hardware that hosts pfsense (although I doubt this). What can you recommend as a next step here?

                                          Here is what I would suggest doing if you haven't already.

                                          • Make a backup of your current pfSense config.
                                          • Do a fresh install of the latest pfSense (2.4.4-RELEASE-p3).
                                          • Configure your interfaces as needed.
                                          • Do not install any additional packages.
                                          • Leave all the default settings. pfSense is pretty darn secure out of the box, certainly more secure than the Asus anyway.

                                          Then let's see where that gets you.

                                          Raffi

                                          Changed the title already, thanks.

                                          I was afraid that the only step I'm left with is to reinstall. But yeah, I can do that, but I'll probably not be able to until next week because I don't have a serial-to-USB adapter with me right now. I'll report back when I'm done with this but I'll continue monitoring while using the current pfsense installation.

                                          • K
                                            kevindd992002
                                            last edited by kevindd992002

                                            @Raffi_ , since I didn't have time to start pfsense from scratch yet, I decided to just reset to factory defaults and restore the config. Same thing still happens.

                                            One semi-consistent thing that I observed though is that the issue happens when I turn on my desktop computer after it has been off for a couple of hours. I say "semi-consistent" because when I restart it, or shut it down and turn it back on immediately, the issue cannot be recreated. I've been observing this for a couple of days and it's as if the desktop is the one bogging down the whole network. Then I wait for 5 to 10 minutes and things come back to normal.

                                            There's no special network configuration with my desktop computer aside from these things:

                                            1. Uses NUT (Network UPS Tools) to connect to the UPS master connected to pfsense.
                                            2. Wireshark/NPCap is installed but not running (the Npcap Loopback Adapter is installed)
                                            3. Hyper-V Manager is installed and the default virtual switch (has its own internal subnet) is configured.
                                            4. A mapped drive connecting to the Synology NAS on the other end of the OpenVPN tunnel is configured.

                                            Can you think of anything out of those that could cause this?

                                            Could it perhaps be a broadcast storm caused by the desktop PC? If so, how can I confirm this?
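One possible way to check for a broadcast storm is to capture on the LAN interface while the desktop boots and count broadcast frames per second per source MAC. The sketch below assumes a text capture taken with something like `tcpdump -e -n -i <lan_if>` (the interface name is a placeholder) and assumes the default timestamp and link-header layout of that output. The function name is hypothetical.

```python
import re
from collections import Counter
from typing import Counter as CounterType, Tuple

_MAC = r"[0-9a-f]{2}(?::[0-9a-f]{2}){5}"
# Assumed line shape: "HH:MM:SS.frac src_mac > dst_mac, ethertype ..."
_FRAME = re.compile(
    rf"^(?P<second>\d{{2}}:\d{{2}}:\d{{2}})\.\d+\s+"
    rf"(?P<src>{_MAC})\s+>\s+(?P<dst>{_MAC})",
    re.IGNORECASE,
)
BROADCAST = "ff:ff:ff:ff:ff:ff"


def broadcasts_per_second(capture_text: str) -> CounterType[Tuple[str, str]]:
    """Count broadcast frames, keyed by (second, source MAC)."""
    counts: CounterType[Tuple[str, str]] = Counter()
    for line in capture_text.splitlines():
        frame = _FRAME.match(line)
        if frame and frame.group("dst").lower() == BROADCAST:
            counts[(frame.group("second"), frame.group("src").lower())] += 1
    return counts
```

Calling `broadcasts_per_second(text).most_common(5)` lists the busiest seconds and senders. A handful of ARP or DHCP broadcasts at boot is normal; a single MAC sending a sustained, very high rate during the outage window would point at that machine.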

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.