DNS Resolver not resolving part 1234

henkbart

Yes,
45451 root 1 52 0 12M 2252K kqread 1 0:00 0.00% dhcpleases

is still running

jrey

@henkbart said in DNS Resolver not resolving part 1234:

Lucky you,

But I think the point being made by both myself and @johnpoz is that this is not likely an unbound issue as such.

Many people do indeed report problems, and when doing so they assume it is unbound because that is what they see. But that is generally not the root cause.
They see the effect, not the cause.

unbound generally just works. Right out of the box.

That's all we are saying.

@Gertjan is giving you some good advice on things to look for, including the path you are currently on: DHCP.
Edit: however, if DHCP were causing the issue, you'd likely see a stream of unbound restarts, and you say they are not there in the log.
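
If you want to check for that stream of restarts without scrolling through the log by hand, a rough sketch like this could count them per hour. It assumes unbound writes a line containing "start of service" on every start, and that each line begins with a syslog-style timestamp; verify both against your own resolver log. The sample lines are hypothetical and only there to make the sketch runnable.

```python
import re
from collections import Counter

# Assumption: unbound logs "start of service" each time it (re)starts.
START_MARKER = "start of service"

# Assumption: lines begin with a syslog-style "Mon DD HH:MM:SS" timestamp.
STAMP = re.compile(r"^(\w{3}\s+\d{1,2} \d{2}):\d{2}:\d{2}")

# Hypothetical sample lines, for illustration only (not from a real log).
SAMPLE = [
    "Dec 12 11:01:24 unbound 1111 [1111:0] info: start of service (unbound 1.18.0).",
    "Dec 12 11:20:03 unbound 1111 [1111:0] info: service stopped (unbound 1.18.0).",
    "Dec 12 11:20:04 unbound 2222 [2222:0] info: start of service (unbound 1.18.0).",
    "Dec 12 12:05:40 unbound 3333 [3333:0] info: start of service (unbound 1.18.0).",
]


def restarts_per_hour(lines):
    """Count unbound start-ups per hour bucket, e.g. 'Dec 12 11'."""
    counts = Counter()
    for line in lines:
        if START_MARKER not in line:
            continue
        match = STAMP.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for hour, count in sorted(restarts_per_hour(SAMPLE).items()):
        print(f"{hour}:00  {count} start(s)")
```

A handful of starts per day is nothing special; dozens per hour would point back at DHCP registration or something else restarting the resolver.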

henkbart

@jrey

The problem is that every now and then (and it can be weeks in between) I lose the ability to connect to the internet.
I have my own PBX here that uses VoIP and SIP trunks.
Sometimes they cannot register with the host.
From that moment on, nothing else can connect to any internet address either.
Modem is UP,
WAN is UP.
LAN is UP.
Then all DHCP must fail, because they are on different IP addresses.
But there are also no entries in the log files that give any clue.

So where else should I look?

jrey

@henkbart said in DNS Resolver not resolving part 1234:

The problem is that every now and then (and it can be weeks in between) I lose the ability to connect to the internet.

Sounds interesting. So when you "lose the ability to connect to the internet", unbound would not be able to resolve upstream, but it would still be running. If it can't talk upstream, how would it resolve?

So the question you might start looking into is: why have you lost the ability to connect to the internet?

Do you have a timestamp from the last time this happened?

Check the logs (not the unbound logs specifically) for events that might tell you why you lost the internet.

If you can't find anything in the current logs regarding the last time it happened, then the next time it does, do this:

@johnpoz said in DNS Resolver not resolving part 1234:

A good test might be to try and resolve something just local, say your pfSense FQDN, via your fav local tool: nslookup, dig, host, doggo, etc. Does that work, just not external? It's best to use a cmd line tool because then you can see the actual response from unbound, be it NX or servfail, refused, etc.
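
If none of those tools is at hand, the same local-versus-external test can be sketched in plain Python. This is only a minimal illustration: it builds a bare A-record query by hand, sends it over UDP to the resolver, and reports the response code from the reply header. The function names are made up for this sketch, and it does not validate the reply beyond reading the RCODE.

```python
import socket
import struct
import sys

# Standard DNS response codes (low 4 bits of the header flags).
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query: one question, recursion desired."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class IN


def rcode_of(response):
    """Return the response code name from a raw DNS reply."""
    if len(response) < 12:
        return "MALFORMED"
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F
    return RCODES.get(code, f"RCODE{code}")


def ask(server, name, timeout=3.0):
    """Send one query to the resolver and describe what came back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            data, _ = sock.recvfrom(4096)
        except socket.timeout:
            return "TIMEOUT"
        except OSError as exc:
            return f"ERROR ({exc})"
    return rcode_of(data)


# Only touches the network when a resolver and at least one name are given.
if __name__ == "__main__" and len(sys.argv) > 2:
    resolver, *names = sys.argv[1:]
    for name in names:
        print(f"{name}: {ask(resolver, name)}")
```

Saved under any name you like, it would be run with the resolver address first and then the names to test, for example 192.168.1.1 followed by firewall.private.lan and one external name. NOERROR for the local name but SERVFAIL or TIMEOUT for the external one points at the uplink rather than at unbound itself.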

Gertjan

@henkbart said in DNS Resolver not resolving part 1234:

45451 root 1 52 0 12M 2252K kqread 1 0:00 0.00% dhcpleases

Ok.
Doesn't make any sense.

I propose:
Switch back to 'dhcp' mode.
Save.
Go to the Resolver settings.
Now the "DHCP Client Registration" (and "Static DHCP Client Registration") options should be visible.
I suspect your "DHCP Client Registration" is checked. Is this the case?
Uncheck it.
Save, and then Apply.

Go back to System > Advanced > Networking and select kea again.

The "dhcpleases" process should be gone now. Correct?

Btw:
As far as I can see on my pfSense, while using kea, the /var/dhcpd/var/db/dhcpd.leases file isn't used. That file is "watched" by the dhcpleases process, and if it changes, unbound is sent a signal to restart.
So, harmless, I guess.
But still strange, as it shouldn't even be started in the first place.

Gertjan

@henkbart said in DNS Resolver not resolving part 1234:

The problem is that every now and then (and it can be weeks in between) I lose the ability to connect to the internet.

You can check your uplink quality.

It should be constant, flat and as small as possible.
If it starts to go up and down, or worse:

you are saturating your connection, and if the 'pipe', up or down, is too full, dpinger starts to miss ping packets, it can go into panic mode, and 'restart' your WAN interface.
No need to explain that if the pipe (uplink) is bad, full, or not working well, the resolver can't do its work either. Right?
.....
Call your ISP and say: goodbye, I'll leave you for a better one.

henkbart

@Gertjan
Hello,
I did what you wrote.
Enabled DHCP.
Got:
FireShot Pro Webpage Capture 278 - 'firewall.private.lan - Services_ DNS Resolver_ General Settings' - 192.168.1.1.png

So I disabled the DHCP Client Registration.
Saved it.
And then switched to Kea.

And the dhcpleases process is gone now.

jrey

@Gertjan said in DNS Resolver not resolving part 1234:

dpinger starts to miss ping packets,

If that's the problem, OP might be able to mitigate some of it by changing the ping times, loss interval, etc., or by selecting a different monitor IP (if that is even set up).

Most users wouldn't notice the difference between the default 500ms setting and even 2-3-4 or 5 seconds.

If applicable, OP could look at https://docs.netgate.com/pfsense/en/latest/routing/gateway-configure.html

for the Probe Interval, Loss Interval, Time Period, and Alert Interval, to see how the adjustments could be made and the rules to follow.
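
As a rough illustration of those rules, here is a small sketch that checks a set of values before you type them into the GUI. The three constraints are how I read the linked page, so treat them as an assumption and confirm them there; the function name and parameter names are made up for this example, and the defaults are the usual pfSense values in milliseconds.

```python
def check_gateway_settings(probe_ms=500, loss_ms=2000, period_ms=60000,
                           alert_ms=1000, high_latency_ms=500):
    """Return a list of rule violations (an empty list means it looks fine).

    Assumed rules, to be confirmed against the Netgate documentation:
      - Loss Interval >= high latency threshold
      - Time Period   >  2 x Probe Interval + Loss Interval
      - Alert Interval >= Probe Interval
    """
    problems = []
    if loss_ms < high_latency_ms:
        problems.append("Loss Interval should be >= the high latency threshold")
    if period_ms <= 2 * probe_ms + loss_ms:
        problems.append("Time Period should be > 2 x Probe Interval + Loss Interval")
    if alert_ms < probe_ms:
        problems.append("Alert Interval should be >= Probe Interval")
    return problems


if __name__ == "__main__":
    # Example: slowing the probe to 5 seconds while leaving the rest alone.
    for problem in check_gateway_settings(probe_ms=5000) or ["looks consistent"]:
        print(problem)
```

The point is that raising the Probe Interval on its own is not enough; the Alert Interval (and, for large values, the Time Period) has to move with it.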

henkbart

@Gertjan
Here is mine

FireShot Pro Webpage Capture 279 - 'firewall.private.lan - Status_ Monitoring' - 192.168.1.1.png

jrey

@Gertjan said in DNS Resolver not resolving part 1234:

It should be constant, flat and as small as possible.

Like anything, that depends a lot on the scaling, the resolution, and the duration of the sample, as well as on the connection. Great guideline, but not to be interpreted as a blanket statement.

For example, when everything is mostly sub-1 ms, as shown here from my system, the automatic scaling makes the graph appear much less flat, but that doesn't mean there is a problem.

Screen Shot 2023-12-12 at 11.06.59 AM.png

Looking at the data summary under the pretty graph might tell us more?

Screen Shot 2023-12-12 at 11.18.51 AM.png

henkbart

@jrey

FireShot Pro Webpage Capture 280 - 'firewall.private.lan - Status_ Monitoring' - 192.168.1.1.png

That is mine

jrey

@henkbart

Nothing really interesting here. Did you have an issue with connectivity, as you've described, in the past day?

jrey

@henkbart

Anything interesting in

System Logs > System (Tab) -> Gateways (Tab)?

henkbart

@jrey

This is in the Gateways log:

Dec 12 11:01:24 dpinger 91974 send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% alarm_hold 10000ms dest_addr 213.93.180.1 bind_addr 213.93.180.238 identifier "WAN_DHCP "
Dec 12 11:01:24 dpinger 91974 exiting on signal 15
Dec 12 11:01:24 dpinger 97044 send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% alarm_hold 10000ms dest_addr 213.93.180.1 bind_addr 213.93.180.238 identifier "WAN_DHCP "
Dec 12 11:01:26 dpinger 97044 exiting on signal 15
Dec 12 11:01:26 dpinger 57389 send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% alarm_hold 10000ms dest_addr 213.93.180.1 bind_addr 213.93.180.238 identifier "WAN_DHCP "
Dec 12 11:01:27 dpinger 57389 exiting on signal 15
Dec 12 11:01:27 dpinger 81718 send_interval 500ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 1 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% alarm_hold 10000ms dest_addr 213.93.180.1 bind_addr 213.93.180.238 identifier "WAN_DHCP "

jrey

@henkbart
Ah - so read this

https://forum.netgate.com/topic/174601/dpinger-exiting-on-signal-15?_=1702390469713

Is that logging constant, or just those few entries? What else happened at the same time?

Take that timestamp and start looking at the system and other logs; there will likely be something obvious, or something you observed at that time.

Do you have both the Gateways and Interfaces widgets on the dashboard?
yes -> do you see any of the ports bouncing up and down?
no -> put them both on the dashboard.

You'll need to figure out what is killing dpinger that rapidly. As mentioned in the other thread, it is being explicitly terminated. That can then lead to a whole bunch of other things happening.
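
To put numbers on "that rapidly", here is a small sketch that pairs each dpinger start line with its "exiting on signal" line and reports how long each process lived. The log lines are the ones posted above (signal 15 is SIGTERM); the year is not in the log, so 2023 is an assumption used only to make the timestamps parseable, and the function name is made up for this example.

```python
from datetime import datetime

# Parameter string exactly as it appears in the posted log lines.
PARAMS = ('send_interval 500ms loss_interval 2000ms time_period 60000ms '
          'report_interval 0ms data_len 1 alert_interval 1000ms '
          'latency_alarm 500ms loss_alarm 20% alarm_hold 10000ms '
          'dest_addr 213.93.180.1 bind_addr 213.93.180.238 '
          'identifier "WAN_DHCP "')

LOG = [
    f"Dec 12 11:01:24 dpinger 91974 {PARAMS}",
    "Dec 12 11:01:24 dpinger 91974 exiting on signal 15",
    f"Dec 12 11:01:24 dpinger 97044 {PARAMS}",
    "Dec 12 11:01:26 dpinger 97044 exiting on signal 15",
    f"Dec 12 11:01:26 dpinger 57389 {PARAMS}",
    "Dec 12 11:01:27 dpinger 57389 exiting on signal 15",
    f"Dec 12 11:01:27 dpinger 81718 {PARAMS}",
]


def dpinger_lifetimes(lines, year=2023):
    """Return (pid, seconds alive) for every dpinger that was seen to exit."""
    started = {}
    lifetimes = []
    for line in lines:
        parts = line.split(maxsplit=5)
        if len(parts) < 6 or parts[3] != "dpinger":
            continue
        # The log has no year, so one is supplied (assumption) for parsing.
        stamp = f"{year} {' '.join(parts[:3])}"
        when = datetime.strptime(stamp, "%Y %b %d %H:%M:%S")
        pid, message = parts[4], parts[5]
        if message.startswith("exiting on signal"):
            if pid in started:
                alive = (when - started.pop(pid)).total_seconds()
                lifetimes.append((pid, alive))
        else:
            started.setdefault(pid, when)
    return lifetimes


if __name__ == "__main__":
    for pid, seconds in dpinger_lifetimes(LOG):
        print(f"dpinger {pid} lived {seconds:.0f}s before being terminated")
```

Three processes started and terminated within about three seconds means something is restarting the gateway monitor in quick succession, so whatever else was logged in that same window is the thing to look at.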

Is the port connection speed and duplex what you expect and what it should be?

Have you tried a different cable (modem <-> WAN)?

dest_addr 213.93.180.1, so VODAFONE_ZIGGO;
that's your gateway.
Ping something further out: set up a monitor IP under
System -> Routing -> Gateways -> Edit
The field is "Monitor IP". Try something external but local to you, or pick one of the anycast big boys: 8.8.8.8, 1.1.1.1, etc.
What kind of response do you get from that?
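
For a quick check from a LAN client before changing the monitor IP, something like the sketch below could give a first impression. Note that it is not a ping: ICMP from Python needs raw sockets, so this swaps in a timed TCP connect to port 53, which is only a stand-in for what dpinger measures. The function names are made up for this example.

```python
import socket
import sys
import time


def tcp_rtt_ms(host, port=53, timeout=2.0):
    """Time one TCP connect in milliseconds; None if it fails or times out."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.monotonic() - start) * 1000.0
    except OSError:
        return None


def probe(host, count=5, pause=0.5, port=53, timeout=2.0):
    """Run several connects; return (loss percent, average ms or None)."""
    if count < 1:
        raise ValueError("count must be at least 1")
    results = []
    for _ in range(count):
        results.append(tcp_rtt_ms(host, port=port, timeout=timeout))
        time.sleep(pause)
    ok = [r for r in results if r is not None]
    loss = 100.0 * (count - len(ok)) / count
    average = sum(ok) / len(ok) if ok else None
    return loss, average


# Only touches the network when at least one target is given.
if __name__ == "__main__" and len(sys.argv) > 1:
    for target in sys.argv[1:]:
        loss, average = probe(target)
        shown = "n/a" if average is None else f"{average:.1f} ms"
        print(f"{target}: loss {loss:.0f}%, average connect time {shown}")
```

Run it with 1.1.1.1 and 8.8.8.8 as arguments while things are working, and again during an outage; if the far targets fail while the modem still answers, the problem sits beyond your own box.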

It could still be DHCP, but I think you have already tried all those checks, based on the previous posts.