Intermittent connection issue
-
@kevindd992002 said in Intermittent connection issue:
how can I confirm this?
Sniff and see - be it on any device on that same L2 or pfsense... If you want to know what is going on - just look.
-
I agree with what John said. If you suspect something like the desktop traffic being an issue, run a packet capture on pfsense using the LAN interface.
I would suggest starting a packet capture on pfsense LAN before you turn on that PC which is causing the issues. Then while the capture is running, turn on the PC (after it's been off long enough to cause trouble) and see what's happening. Take note of the time when you first connect the PC so you can figure out at what point in the capture the PC was connected. I think it's a little coincidental that the default DHCP lease time in pfsense is 2 hours and the time it takes for your PC to stay disconnected before it starts causing trouble matches up with that.
Have you tried manually configuring a fixed IP on that PC instead of requesting a DHCP lease?
Raffi
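
To make the "take note of the time" step concrete, here is a minimal sketch of how you could work out how far into the capture the PC came online. The function name and both timestamps are made up for illustration; use whatever times you actually write down.

```python
from datetime import datetime


def capture_offset(capture_start: str, event_time: str,
                   fmt: str = "%Y-%m-%d %H:%M:%S") -> float:
    """Return how many seconds into the capture an event happened."""
    start = datetime.strptime(capture_start, fmt)
    event = datetime.strptime(event_time, fmt)
    return (event - start).total_seconds()


# Hypothetical times: capture started at 06:45:00, PC switched on at 06:51:31.
offset = capture_offset("2019-11-06 06:45:00", "2019-11-06 06:51:31")
print(f"PC connected {offset:.0f} seconds into the capture")
```

With that offset you can jump straight to the matching relative timestamp in the capture instead of scrolling from the start.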
-
Ok, I'll try that and report back.
No, I have not tried a static IP on the PC but it has an IP reservation from the DHCP server, like all my other servers/PC's that I need to RDP from outside do. But yeah, I'll see if a static IP makes a difference to rule out DHCP.
-
Keep in mind that the default packet count for a capture on pfsense is 100, so you will most likely want to adjust that.. 0 (no limit) would be a good setting.. Just don't forget it's running ;)
-
Alright, so after some more testing I can say it's not the desktop causing the issue; that was more of a coincidence. I did some more steps regarding DNS and the results were reproducible this time:
- Changed from regular DNS Resolver to DNS Resolver with forwarding (set to forward to 192.168.100.1 which is the modem's IP/gateway). Ping'ed 208.78.70.23 (one IP I saw in Status -> DNS Resolver) from pfsense's ping utility and there were no issues. Internet is back up and running.
- Changed back to regular DNS Resolver. Ping'ed the same IP from pfsense's ping utility and no response from 192.168.100.1.
So it looks like the problem is DNS but not the resolution part of DNS per se. For some reason, when DNS Resolver without forwarding is set the WAN gateway doesn't respond properly (or responds intermittently) as well, at least that's what's happening during my testing. So if you try to resolve with DNS Resolver, it's "as if" it is not resolving properly because of course it doesn't get any responses from the root hints servers consistently.
I know, the issue doesn't make sense but will getting a WAN packet capture while recreating the issue help in this situation? Or both WAN and LAN?
EDIT:
Also, I get a ton of Timeout A's under Status -> DNS Resolver when forwarding is NOT set. When forwarding, I get 0.
-
Ooooohhh I can't believe it didn't hit me sooner. This is a pretty well known issue with the DNS resolver (unbound) in pfSense. Here is one thread on it that I ran into when I had similar issues. https://forum.netgate.com/topic/120838/unbound-appears-to-restart-frequently-and-fails-to-resolve-domains-sometimes.
There are a number of solutions to this. As you found, use DNS forwarding. I would suggest forwarding to servers like Cloudflare (1.1.1.1 and 1.0.0.1) or Google (8.8.8.8 and 8.8.4.4). If you are going to use pfSense for resolving, go into the DNS resolver settings and uncheck the DHCP Registration option.
That would explain exactly the issue you were seeing. When a PC is connected after 2 hours, it would request a new DHCP lease. When that lease is handed out, it also causes Unbound to restart because Unbound has to update its info since DHCP leases are being registered. This was a major pain for me. If you also have things that slow down the unbound startup process, like lots of pfblocker IP lists, that will make matters worse.
My current setup has DHCP registration disabled in the DNS Resolver options. I also have forwarding enabled to 1.1.1.1 and 1.0.0.1. Ideally, I wanted to have DNS resolution done by pfSense without having to forward, but it was not always reliable. I think the better solution would be to have a DNS resolver completely separate from pfSense, like Pi-hole. I hear good things about it but never had a chance to play around with it.
-
I'm aware that DHCP registration restarts unbound, but it only does so for a few seconds and then everything should be back to normal. It doesn't explain what I was seeing with my desktop because that machine has a DHCP IP reservation. I thought those don't restart unbound? Also, if the unbound restart were the issue, I wouldn't be experiencing it for more than a few seconds. But in my case, I experience it for around 5 minutes or so, every single time.
Another weird thing is that I have the same exact setup on my network in the other end of the tunnel. That network also uses unbound without forwarding. And that network is 10x bigger than the network I'm trying to troubleshoot here. The network in question in this thread is made up of just 1 desktop, 1 nas, 1 laptop, 2 mobile devices, 1 nvidia shield, and 1 ps4. That's it.
Also, if it was a matter of unbound not working, I should be able to ping a public IP address (that's known to be pingable, of course) using pfsense. But like I said above, as long as I use unbound without forwarding, the issue of me not being able to ping presents itself right away.
-
Right, that was the thing that I couldn't figure out as well. Even if unbound was restarting and DNS was not working, I would still expect pings to 8.8.8.8 to work.
Do you have "Register DHCP static mappings" enabled in the DNS Resolver settings? I would expect that will cause unbound to restart as well. To be clear, I don't think the DHCP/Static IP requests are the cause of the problem, but I think they seem to be triggering the problem. You pretty much have it isolated down to unbound. Enabling forwarding bypasses unbound.
Bypassing unbound = working.
Not bypassing unbound = not working
Is that correct?
It sounds like you haven't actually checked whether the action of connecting the PC was causing unbound to restart. Check the DNS Resolver log under system logs. It could be that unbound is not behaving as expected so look into that. Is unbound actually restarting quickly as expected? Is it hanging and causing pfSense to hang as well? pfSense itself will use unbound if that's the only option available for DNS resolution. There is even a setting to disable that default behavior in the general setup.
I'm not an expert on the topic of unbound but I would suggest looking into that though. Make sure that unbound is running and there aren't multiple instances of it or something crazy like that. I'm not being sarcastic here, I honestly do not know how pfSense would behave if the DNS server it was relying on to do its job was not working.
-
Yes, that's pretty much correct. Since your last post, I decided to keep forwarding enabled in unbound and things were smooth similar to how things were when I used my Asus router. And then I decided to turn on my desktop at 11/6/2019 6:51 AM today and surprisingly the monitoring graph caught the issue:
These were the system logs at that time:
Nov 6 06:53:13 php-fpm 77800 /index.php: Successful login for user 'kevindd992002' from: 192.168.20.21 (Local Database)
Nov 6 06:53:18 rc.gateway_alarm 54269 >>> Gateway alarm: WAN_DHCP (Addr:8.8.8.8 Alarm:1 RTT:21.129ms RTTsd:.260ms Loss:21%)
Nov 6 06:53:18 check_reload_status updating dyndns WAN_DHCP
Nov 6 06:53:18 check_reload_status Restarting ipsec tunnels
Nov 6 06:53:18 check_reload_status Restarting OpenVPN tunnels/interfaces
Nov 6 06:53:18 check_reload_status Reloading filter
Nov 6 06:53:20 php-fpm 380 /rc.dyndns.update: MONITOR: WAN_DHCP is down, omitting from routing group Failover 8.8.8.8|192.168.100.2|WAN_DHCP|21.139ms|0.274ms|21%|down
Nov 6 06:53:20 php-fpm 77800 /rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. ''
Nov 6 06:53:20 php-fpm 77800 /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WAN_DHCP.
Nov 6 06:53:20 php-fpm 380 /rc.dyndns.update: 380MONITOR: WAN_DHCP is available now, adding to routing group Failover 8.8.8.8|192.168.100.2|WAN_DHCP|21.141ms|0.276ms|20%|loss
Nov 6 06:53:20 php-fpm 73145 /rc.filter_configure_sync: MONITOR: WAN_DHCP is down, omitting from routing group Failover 8.8.8.8|192.168.100.2|WAN_DHCP|21.139ms|0.3ms|21%|down
Nov 6 06:53:21 php-fpm 380 /rc.dyndns.update: 380MONITOR: WAN_DHCP is available now, adding to routing group Failover 8.8.8.8|192.168.100.2|WAN_DHCP|21.142ms|0.296ms|20%|loss
Nov 6 06:53:22 php-fpm 380 /rc.dyndns.update: phpDynDNS (Condo): No change in my IP address and/or 25 days has not passed. Not updating dynamic DNS entry.
Nov 6 06:53:30 rc.gateway_alarm 83271 >>> Gateway alarm: WAN_DHCP (Addr:8.8.8.8 Alarm:0 RTT:21.140ms RTTsd:.302ms Loss:20%)
Nov 6 06:53:30 check_reload_status updating dyndns WAN_DHCP
Nov 6 06:53:30 check_reload_status Restarting ipsec tunnels
Nov 6 06:53:30 check_reload_status Restarting OpenVPN tunnels/interfaces
Nov 6 06:53:30 check_reload_status Reloading filter
Nov 6 06:53:31 rc.gateway_alarm 87686 >>> Gateway alarm: WAN_DHCP (Addr:8.8.8.8 Alarm:1 RTT:21.144ms RTTsd:.318ms Loss:21%)
Nov 6 06:53:31 check_reload_status updating dyndns WAN_DHCP
Nov 6 06:53:31 check_reload_status Restarting ipsec tunnels
Nov 6 06:53:31 check_reload_status Restarting OpenVPN tunnels/interfaces
Nov 6 06:53:31 check_reload_status Reloading filter
Nov 6 06:53:31 php-fpm 73145 /rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. ''
Nov 6 06:53:31 php-fpm 73145 /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WAN_DHCP.
Nov 6 06:53:31 php-fpm 380 /rc.filter_configure_sync: MONITOR: WAN_DHCP is down, omitting from routing group Failover 8.8.8.8|192.168.100.2|WAN_DHCP|21.144ms|0.318ms|21%|down
Nov 6 06:53:32 php-fpm 380 /rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. ''
Nov 6 06:53:32 php-fpm 380 /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WAN_DHCP.
Nov 6 06:53:33 php-cgi notify_monitor.php: Message sent to kevindd992002@yahoo.com OK
Nov 6 06:53:36 php-fpm 381 /rc.dyndns.update: phpDynDNS (Condo): No change in my IP address and/or 25 days has not passed. Not updating dynamic DNS entry.
Nov 6 06:53:40 php-fpm 77800 /rc.dyndns.update: phpDynDNS (Condo): No change in my IP address and/or 25 days has not passed. Not updating dynamic DNS entry.
Nov 6 06:53:57 php-cgi notify_monitor.php: Message sent to kevindd992002@yahoo.com OK
Nov 6 06:54:11 rc.gateway_alarm 30595 >>> Gateway alarm: WAN_DHCP (Addr:8.8.8.8 Alarm:0 RTT:21.204ms RTTsd:.290ms Loss:20%)
Nov 6 06:54:11 check_reload_status updating dyndns WAN_DHCP
Nov 6 06:54:11 check_reload_status Restarting ipsec tunnels
Nov 6 06:54:11 check_reload_status Restarting OpenVPN tunnels/interfaces
Nov 6 06:54:11 check_reload_status Reloading filter
Nov 6 06:54:12 php-fpm 33202 /rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. ''
Nov 6 06:54:12 php-fpm 33202 /rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WAN_DHCP.
Nov 6 06:54:12 php-fpm 380 /rc.dyndns.update: 380MONITOR: WAN_DHCP is available now, adding to routing group Failover 8.8.8.8|192.168.100.2|WAN_DHCP|21.195ms|0.296ms|18%|loss
Nov 6 06:54:14 php-fpm 380 /rc.dyndns.update: phpDynDNS (Condo): No change in my IP address and/or 25 days has not passed. Not updating dynamic DNS entry.
Nov 6 06:54:32 php-cgi notify_monitor.php: Message sent to kevindd992002@yahoo.com OK
I don't see anything in the DNS Resolver logs either. I mean, the unbound service restarted as expected, but the restart took just 1 second and that shouldn't be the issue:
Nov 6 06:46:22 unbound 28470:0 notice: Restart of unbound 1.9.1.
Nov 6 06:46:22 unbound 28470:0 notice: init module 0: validator
Nov 6 06:46:22 unbound 28470:0 notice: init module 1: iterator
Nov 6 06:46:23 unbound 28470:0 info: start of service (unbound 1.9.1).
Nov 6 06:46:29 unbound 28470:1 info: generate keytag query _ta-4f66. NULL IN
Nov 6 06:53:21 filterdns merge_config: configuration reload
Nov 6 06:53:21 filterdns Adding Action: pf table: HostsToTunnel host: plex.tv
Nov 6 06:53:31 filterdns merge_config: configuration reload
Nov 6 06:53:31 filterdns Adding Action: pf table: HostsToTunnel host: plex.tv
Nov 6 06:53:33 filterdns merge_config: configuration reload
Nov 6 06:53:33 filterdns Adding Action: pf table: HostsToTunnel host: plex.tv
Nov 6 06:54:13 filterdns merge_config: configuration reload
Nov 6 06:54:13 filterdns Adding Action: pf table: HostsToTunnel host: plex.tv
Nov 6 07:43:46 unbound 28470:0 info: service stopped (unbound 1.9.1).
Nov 6 07:43:46 unbound 28470:0 info: server stats for thread 0: 740 queries, 216 answers from cache, 524 recursions, 0 prefetch, 0 rejected by ip ratelimiting
Nov 6 07:43:46 unbound 28470:0 info: server stats for thread 0: requestlist max 2 avg 0.0877863 exceeded 0 jostled 0
Nov 6 07:43:46 unbound 28470:0 info: average recursion processing time 0.065975 sec
Nov 6 07:43:46 unbound 28470:0 info: histogram of recursion processing times
Nov 6 07:43:46 unbound 28470:0 info: [25%]=0.0055808 median[50%]=0.0104123 [75%]=0.0318903
Nov 6 07:43:46 unbound 28470:0 info: lower(secs) upper(secs) recursions
I also don't see anything unusual in the DHCP logs:
Nov 6 06:51:31 dhcpd DHCPDISCOVER from <MAC> via igb1
Nov 6 06:51:31 dhcpd DHCPOFFER on 192.168.20.21 to <MAC> via igb1
Nov 6 06:51:34 dhcpd DHCPREQUEST for 192.168.20.21 (192.168.20.1) from <MAC> via igb1
Nov 6 06:51:34 dhcpd DHCPACK on 192.168.20.21 to <MAC> via igb1
Nov 6 07:00:41 dhcpd DHCPREQUEST for 192.168.20.21 from <MAC> via igb1
Nov 6 07:00:41 dhcpd DHCPACK on 192.168.20.21 to <MAC> via igb1
Nov 6 07:00:42 dhcpd DHCPREQUEST for 192.168.20.21 from <MAC> via igb1
Nov 6 07:00:42 dhcpd DHCPACK on 192.168.20.21 to <MAC> via igb1
So at this point, it's a mix of unbound and the desktop causing the problem? I'm scratching my head hard here. I can definitely say that with forwarding enabled, it was stable for 4 straight days. Turning on my laptop did not do anything to the network compared to when it also caused issues when using just unbound.
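
For anyone trying to line these three logs up, a rough sketch of how the gaps between events could be computed is below. It only uses four lines copied from the logs above; the year 2019 comes from the date mentioned earlier in this post, and the helper names are made up for illustration.

```python
import re
from datetime import datetime

# Four lines copied from the unbound, DHCP and system logs in this post.
LOG = """\
Nov 6 06:46:22 unbound 28470:0 notice: Restart of unbound 1.9.1.
Nov 6 06:46:23 unbound 28470:0 info: start of service (unbound 1.9.1).
Nov 6 06:51:34 dhcpd DHCPACK on 192.168.20.21 to <MAC> via igb1
Nov 6 06:53:18 rc.gateway_alarm 54269 >>> Gateway alarm: WAN_DHCP (Addr:8.8.8.8 Alarm:1 RTT:21.129ms RTTsd:.260ms Loss:21%)
"""

STAMP = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d) (.*)$")


def parse(line, year=2019):
    """Split a syslog-style line into (datetime, message), or None if no match."""
    match = STAMP.match(line)
    if match is None:
        return None
    # Syslog lines carry no year, so one is supplied explicitly.
    when = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
    return when, match.group(2)


events = [event for event in map(parse, LOG.splitlines()) if event]


def first(needle):
    """Timestamp of the first event whose message contains `needle`."""
    return next(when for when, text in events if needle in text)


restart_secs = (first("start of service") - first("Restart of unbound")).total_seconds()
dhcp_to_alarm = (first("Gateway alarm") - first("DHCPACK")).total_seconds()
unbound_to_alarm = (first("Gateway alarm") - first("start of service")).total_seconds()

print(f"unbound restart took {restart_secs:.0f} s")
print(f"gateway alarm fired {dhcp_to_alarm:.0f} s after the DHCPACK")
print(f"gateway alarm fired {unbound_to_alarm:.0f} s after unbound was back up")
```

This only measures timing. A short gap between the DHCPACK and the alarm does not by itself show that one caused the other.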
-
@kevindd992002 said in Intermittent connection issue:
Loss:21%)
That triggered a bunch of different things... since you prob have it set to reset on loss of gateway.
Why are you monitoring 8.8.8.8 and not pfsense gateway?
-
Right, but is it just a big coincidence that the loss happened right after I turned on the desktop?
I'm monitoring 8.8.8.8 because pfsense's gateway is 192.168.100.1 (modem IP because of CGNAT). So if Internet is down, that private IP will still be up, and monitoring it obviously would not be accurate for gateway monitoring, would it?
-
That is really weird. Now it would seem like unbound isn't an issue, maybe? This doesn't make a lot of sense.
8.8.8.8 should be fine for a monitor IP. I use it and have not had issues with it so far. I have other issues with my network but that isn't one :)
I have no idea why pfSense is unable to ping 8.8.8.8 when that PC is plugged in. It doesn't make a lot of sense. This may be a band-aid, but if you get around 21% loss when that PC is connected, maybe increase the packet loss threshold.
This is how mine is currently setup because I didn't want my main WAN being taken down from packet loss that was able to correct itself after several seconds.
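
To illustrate why a threshold sitting right at the observed loss makes the gateway flap, here is a small sketch. It is not my actual configuration. It assumes the alarm is raised whenever loss is strictly above a 20% threshold, which is consistent with the alarm firing at 21% and clearing at 20% in the log posted earlier, but the exact comparison pfSense uses is an assumption here. The 30% value is purely hypothetical.

```python
def alarm_states(loss_samples, threshold):
    """Alarm state after each sample: True when loss is above the threshold (assumed rule)."""
    return [loss > threshold for loss in loss_samples]


def count_transitions(states):
    """Number of times the alarm flips between up and down."""
    return sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)


# Loss percentages from the four "Gateway alarm" lines in the posted system log.
samples = [21, 20, 21, 20]

flaps_at_20 = count_transitions(alarm_states(samples, threshold=20))
flaps_at_30 = count_transitions(alarm_states(samples, threshold=30))  # hypothetical setting

print(f"threshold 20%: {flaps_at_20} state changes")
print(f"threshold 30%: {flaps_at_30} state changes")
```

Each state change in that log kicked off a dyndns update, a tunnel restart and a filter reload, so a threshold with some headroom avoids that churn. It does not fix whatever is causing the loss in the first place.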
Sorry if you already mentioned this, but is this issue ONLY happening when that specific PC is plugged in? If so, I would suggest double checking the settings on that PC. For example, does the network interface on the PC have any additional IP aliases which could be conflicting when it is first plugged in? In Windows, they do a pretty good job of hiding these settings in deeper menus. I may be reaching here, but everything I'm reading does seem to boil down to that PC being connected to the network.
-
It really doesn't make any sense. But like I said in my last reply to johnpoz, it could be just a coincidence that when I turned on my desktop there was really a packet loss to 8.8.8.8 causing the gateway to be down. If you remember, in my past tests I do not have any gateway down indications when the issue is happening. I turned on the desktop again just now and it did not have any issues. I'm still going towards unbound being the issue. I'll test more later.
That's indeed a band-aid and would not work for me.
When pfsense was still set to use unbound, I thought it was only this desktop causing the issue but after a few days my laptop also caused the same issue. This is what made me try and use forwarding with unbound. So I don't think this desktop PC is the issue.
-
Your issue is you had 21% packet loss that triggered a gateway down event.. So no shit unbound would not be able to resolve during that period..
-
Yes, I agree 100%. Like I said, this could just be a coincidence with the power-up event of my desktop PC.
Do you have any comments on my observation when using unbound w/ forwarding vs. without? I have not experienced a single occurrence of the issue (except when the gateway went down) when I enabled forwarding. So that tells us that unbound w/o forwarding is the issue here but I can't point out why because I have another pfsense box on the same ISP that uses unbound w/o forwarding flawlessly.
-
As I have already gone over - if your line is having packet loss, then yes, you can have more of an issue with resolving than with, say, a forward. You have to talk to multiple servers all over the internet to resolve.. With a forward you're just asking that 1 guy for what the answer is..
Upping your logging level in unbound (in the advanced section of unbound), and also logging queries and answers, will give you some insight into what might be the problem with resolving specific sites.
In the options box
server:
    log-queries: yes
    log-replies: yes
Look at your unbound cache for any problem sites that are not resolving.. If unbound is restarting you lose your cache.. Just because you haven't seen packet loss issues in the past doesn't mean you're not having them.. Your path to 8.8.8.8 is not the whole internet... It's an anycast address.. There are MULTIPLE paths to get to that address.
If you're having issues with unbound resolving something - you have to troubleshoot the resolving issue.. Which is why you set up your log to log more info, and log the queries.. and the answers..
On a problematic connection, forwarding can be less likely to show problems than resolving. Especially if you're restarting unbound and losing your local cache. Especially if you have issues talking to specific NS, which unbound keeps track of and doesn't try to use via its infra info.. But when the cache is lost on a restart, all of that info is lost as well..
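
A back-of-the-envelope sketch of why resolving suffers more than forwarding on a lossy line is below. It is a deliberately crude model, not how unbound actually behaves: it assumes each query/answer exchange independently fails at the observed loss rate, ignores retries and caching, and assumes a cold full resolution needs three exchanges (root, TLD, authoritative). Real lookups can need more or fewer.

```python
def success_probability(loss_rate: float, exchanges: int) -> float:
    """Chance that every one of `exchanges` independent round trips gets through."""
    return (1.0 - loss_rate) ** exchanges


LOSS = 0.21  # the 21% loss from the gateway alarm lines in this thread

forward = success_probability(LOSS, exchanges=1)    # one question to one forwarder
recursive = success_probability(LOSS, exchanges=3)  # assumed: root, TLD, authoritative

print(f"forwarded lookup succeeds first try: {forward:.1%}")
print(f"full resolution succeeds first try:  {recursive:.1%}")
```

Under those assumptions the forwarded lookup gets through about 79% of the time on the first try, while the full resolution gets through about 49% of the time, which is the "asking 1 guy" versus "talking to multiple servers" difference in numbers.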
-
Like I said though, it's not even really the resolving part that's the issue. When I'm using unbound and the issue is present, I don't receive responses when I ping the IP address of www.google.com (so no resolution involved here, but unbound is the one causing it). When I switch to a forwarder, I don't encounter this issue.
So the problem is when using unbound, somehow pfsense cannot reach the multiple NS that it's trying to query from. But yeah, upping the logging level wouldn't hurt for me to try.
Is there such a thing as an ISP that doesn't work well with unbound set up on a customer's premises? Maybe bad routes to some NS servers or something? I'm just trying to think outside the box here.
Also, is it recommended to disable all these options so that unbound will not restart? If so, how can I resolve my local clients by their FQDN?
-
@kevindd992002 said in Intermittent connection issue:
(so no resolution involved here, but unbound is the one causing it)
unbound has ZERO!!! Let me repeat that, ZERO!!! to do with you pinging some IP.. 8.8.8.8 is not resolved, so your resolver has ZERO to do with it... If you can not ping 8.8.8.8 then you have a connectivity issue, and that has ZERO!!! Again, ZERO to do with any forwarder or resolver you would be running..
Yes I would recommend you turn off registering dhcp or vpn in unbound - that causes a restart of it. Static is fine.
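
The point about pinging an IP can be shown in a few lines. This is just an illustrative sketch with a made-up helper name: a target that already parses as an IP address needs no name lookup at all, so no resolver or forwarder is consulted before the packet goes out.

```python
import ipaddress


def needs_dns_lookup(target: str) -> bool:
    """True if `target` is a hostname that must be resolved before it can be pinged."""
    try:
        ipaddress.ip_address(target)
    except ValueError:
        return True   # not an IP literal, so a resolver has to be asked first
    return False      # already an address: DNS is not involved


for target in ("8.8.8.8", "www.google.com"):
    verdict = "DNS lookup needed" if needs_dns_lookup(target) else "no DNS involved"
    print(f"{target}: {verdict}")
```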
-
Again, I know that pinging an IP address DOES NOT involve DNS resolution, I'm not a beginner here. I don't know how else to describe what I'm observing, but like I said, when I switch to unbound the issue randomly shows itself, and when I use forwarding I do not experience it. So as I see it, unbound is bugging the whole pfsense box so that it intermittently fails to reach external servers.
If I turn off dhcp registration, how do I resolve my internal clients?
-
So I switched to unbound again while using my laptop and NOTHING ELSE. It was working for maybe around 30 mins until I experienced the issue again. Here's what I see in my cache:
When it was working, I had 0 Timeout A's for all those servers. And then it just started happening.
As for the logging level for DNS Resolver, what level should I set it to? Increase it to level 5 right away? I've noticed that when I do that, I cannot see all of the logs under System Logs even though I increase the log filter quantity to an insane amount. That just means that there's too much data.