Gateway monitor down
-
Or maybe they have well-known DNS servers (like Google, Cloudflare, etc.) whitelisted or something? I tried disabling DNSSEC in both cases and it just doesn't work in resolver mode. Do you have any other ideas? What I know is that this started at exactly the same time they put my ONU in route mode.
Snippet of unbound logs:
Dec 3 21:41:55 unbound 6032 [6032:0] info: control cmd: dump_infra
Dec 3 21:41:54 unbound 6032 [6032:3] info: query response was THROWAWAY
Dec 3 21:41:54 unbound 6032 [6032:3] info: reply from <.> 202.12.27.33#53
Dec 3 21:41:54 unbound 6032 [6032:3] info: response for . NS IN
Dec 3 21:41:54 unbound 6032 [6032:3] info: error sending query to auth server 2001:500:1::53 port 53
Dec 3 21:41:54 unbound 6032 [6032:3] info: query response was THROWAWAY
Dec 3 21:41:54 unbound 6032 [6032:3] info: reply from <.> 202.12.27.33#53
Dec 3 21:41:54 unbound 6032 [6032:3] info: response for . NS IN
Dec 3 21:41:54 unbound 6032 [6032:3] info: error sending query to auth server 2001:500:2d::d port 53
Dec 3 21:41:54 unbound 6032 [6032:3] info: query response was THROWAWAY
Dec 3 21:41:54 unbound 6032 [6032:3] info: reply from <.> 192.112.36.4#53
Dec 3 21:41:54 unbound 6032 [6032:3] info: response for . NS IN
Dec 3 21:41:54 unbound 6032 [6032:3] info: error sending query to auth server 2001:503:c27::2:30 port 53
Dec 3 21:41:54 unbound 6032 [6032:3] info: query response was THROWAWAY
Dec 3 21:41:54 unbound 6032 [6032:3] info: reply from <.> 199.9.14.201#53
Dec 3 21:41:54 unbound 6032 [6032:3] info: response for . NS IN
Dec 3 21:41:54 unbound 6032 [6032:3] info: error sending query to auth server 2001:500:200::b port 53
Dec 3 21:41:54 unbound 6032 [6032:3] info: error sending query to auth server 2001:7fe::53 port 53
Dec 3 21:41:54 unbound 6032 [6032:3] info: query response was THROWAWAY
Dec 3 21:41:54 unbound 6032 [6032:3] info: reply from <.> 198.97.190.53#53
Dec 3 21:41:54 unbound 6032 [6032:3] info: response for . NS IN
Dec 3 21:41:54 unbound 6032 [6032:3] info: query response was THROWAWAY
Dec 3 21:41:54 unbound 6032 [6032:3] info: reply from <.> 198.41.0.4#53
Dec 3 21:41:54 unbound 6032 [6032:3] info: response for . NS IN
Dec 3 21:41:54 unbound 6032 [6032:3] info: error sending query to auth server 2001:503:c27::2:30 port 53
Dec 3 21:41:54 unbound 6032 [6032:3] info: query response was THROWAWAY
Dec 3 21:41:54 unbound 6032 [6032:3] info: reply from <.> 192.36.148.17#53
Dec 3 21:41:54 unbound 6032 [6032:3] info: response for . NS IN
Dec 3 21:41:54 unbound 6032 [6032:3] info: error sending query to auth server 2001:500:2::c port 53
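To make the pattern in the unbound log easier to see, here is a minimal sketch that tallies replies and send errors per IP version. The sample lines are copied from the log above; the regular expressions and the `tally` helper are my own assumptions about the log layout, not anything unbound or pfSense ships.

```python
import ipaddress
import re
from collections import Counter

# Sample lines copied from the unbound log excerpt above.
SAMPLE = """\
Dec 3 21:41:54 unbound 6032 [6032:3] info: reply from <.> 202.12.27.33#53
Dec 3 21:41:54 unbound 6032 [6032:3] info: error sending query to auth server 2001:500:1::53 port 53
Dec 3 21:41:54 unbound 6032 [6032:3] info: reply from <.> 192.112.36.4#53
Dec 3 21:41:54 unbound 6032 [6032:3] info: error sending query to auth server 2001:500:2d::d port 53
"""

# Assumed line layouts, inferred only from the excerpt above.
REPLY = re.compile(r"reply from <\S*> (\S+)#\d+")
SEND_ERROR = re.compile(r"error sending query to auth server (\S+) port \d+")


def tally(log_text):
    """Count replies and send errors, keyed by (event, IP version)."""
    counts = Counter()
    for line in log_text.splitlines():
        for label, pattern in (("reply", REPLY), ("send_error", SEND_ERROR)):
            match = pattern.search(line)
            if match:
                version = ipaddress.ip_address(match.group(1)).version
                counts[(label, f"IPv{version}")] += 1
    return counts


if __name__ == "__main__":
    for key, count in sorted(tally(SAMPLE).items()):
        print(key, count)
```

On the excerpt, every send error is to an IPv6 address and every reply comes from an IPv4 address, which is what you would expect from a firewall with no routable IPv6 address.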
-
Mmm, well you seem to have v6 servers configured that are not responding or that you cannot reach, so the first thing I would do is disable those.
-
@stephenw10 said in Gateway monitor down:
Mmm, well you seem to have v6 servers configured that are not responding or that you cannot reach, so the first thing I would do is disable those.
How do I disable those? I tried removing the IPv6 link-local in "Network Interfaces" of the DNS Resolver settings and nothing changed.
That screenshot above is when DNS Resolver is NOT FORWARDING. So not sure how to instruct unbound to not query ipv6 DNS servers.
-
Do you actually have a routable IPv6 address on that firewall?
-
@stephenw10 said in Gateway monitor down:
Do you actually have a routable IPv6 address on that firewall?
I don't. I even have ipv6 disabled:
-
Checking that box allows IPv6. But it doesn't matter. Unbound is trying to reach an external v6 server and obviously cannot if you don't have a public v6 IP it can use.
That in itself should not be an issue as long as it's not only using IPv6 servers, which it isn't.
Steve
-
@stephenw10 said in Gateway monitor down:
Checking that box allows IPv6. But it doesn't matter. Unbound is trying to reach an external v6 server and obviously cannot if you don't have a public v6 IP it can use.
That in itself should not be an issue as long as it's not only using IPv6 servers, which it isn't.
Steve
Oops, you're right. I had this unchecked before but checked it just recently. Correct, it shouldn't be an issue since it still tries IPv4. That's why I think all DNS queries from unbound are being blocked somehow. What I don't understand is how the ISP distinguishes this type of traffic from forwarding to a few DNS servers, and why it all works just fine in bridge mode.
In this day and age, is it recommended to just allow all IPv6? I know that for Windows the official recommendation from MS is not to disable IPv6.
-
Either enable it completely if your ISP supports it or disable it completely.
The worst case is where you have some IPv6 configured but no actual connectivity, and clients try to use it in preference to v4.
Steve
-
@stephenw10 said in Gateway monitor down:
Either enable it completely if your ISP supports it or disable it completely.
The worst case is where you have some IPv6 configured but no actual connectivity, and clients try to use it in preference to v4.
Steve
I know my ISP supports it but I haven't gotten to setting it up yet because of the main issue I'm having. Do clients prefer v6 over v4 if they're both set up?
-
Yes, most OSes will prefer v6 if they think they have a valid IP, and that can introduce lengthy delays whilst it times out.
-
@stephenw10 They put it back to bridge mode now and DNS resolving is working properly again. However, my DHCP lease issue is back. So to sum it all up:
Bridge mode: no DNS resolving issue but DHCP lease issue is present
Route mode: no DHCP lease issue but DNS resolving issue is present
I need to be in bridge mode so that's where I'll focus my troubleshooting. When my gateway went down, this is what I saw:
https://pastebin.com/tP1wm3Uf
However, a new IP was given to my WAN interface (gateway went up) after around 3 minutes of downtime. I'm seeing DHCPNAKs. What does that tell us? Also, why am I seeing frequent "renewal in 1800 seconds" messages? Does that mean the DHCP lease is just every 30 minutes?
I also got a packet capture while this was happening. Since that contains public IP addresses, do you want me to send it to you?
-
Could it be similar to this issue?
https://forum.netgate.com/topic/112869/dhclient-on-wan-occasionally-fails-to-renew-lease-with-cable-isp
-
It happened again at 4:10PM. Here's a clearer view of what's happening (I filtered the dhclient process only):
Dec 8 16:44:57 dhclient 26504 bound to {New WAN IP} -- renewal in 1800 seconds.
Dec 8 16:44:57 dhclient 17600 Creating resolv.conf
Dec 8 16:44:57 dhclient 16975 RENEW
Dec 8 16:44:57 dhclient 26504 unknown dhcp option value 0x52
Dec 8 16:44:57 dhclient 26504 DHCPACK from {DHCP Server/WAN interface Gateway}
Dec 8 16:44:57 dhclient 26504 DHCPREQUEST on igb0 to {DHCP Server/WAN interface Gateway} port 67
Dec 8 16:14:57 dhclient 26504 bound to {New WAN IP} -- renewal in 1800 seconds.
Dec 8 16:14:57 dhclient 83905 Creating resolv.conf
Dec 8 16:14:57 dhclient 83772 /sbin/route add default {DHCP Server/WAN interface Gateway}
Dec 8 16:14:57 dhclient 83557 /sbin/route add -host {DHCP Server/WAN interface Gateway} -iface igb0
Dec 8 16:14:57 dhclient 82650 Adding new routes to interface: igb0
Dec 8 16:14:57 dhclient 82454 New Routers (igb0): {DHCP Server/WAN interface Gateway}
Dec 8 16:14:57 dhclient 82184 New Broadcast Address (igb0): {New WAN Broadcast IP}
Dec 8 16:14:57 dhclient 81938 New Subnet Mask (igb0): 255.255.224.0
Dec 8 16:14:57 dhclient 81784 New IP Address (igb0): {New WAN IP}
Dec 8 16:14:57 dhclient 81144 ifconfig igb0 inet {New WAN IP} netmask 255.255.224.0 broadcast {New WAN Broadcast IP}
Dec 8 16:14:57 dhclient 80989 Starting add_new_address()
Dec 8 16:14:57 dhclient 80666 BOUND
Dec 8 16:14:57 dhclient 26504 unknown dhcp option value 0x52
Dec 8 16:14:57 dhclient 26504 DHCPACK from {DHCP Server/WAN interface Gateway}
Dec 8 16:14:56 dhclient 26504 DHCPREQUEST on igb0 to 255.255.255.255 port 67
Dec 8 16:14:56 dhclient 80354 ARPCHECK
Dec 8 16:14:54 dhclient 79636 ARPSEND
Dec 8 16:14:54 dhclient 26504 unknown dhcp option value 0x52
Dec 8 16:14:54 dhclient 26504 DHCPOFFER from {DHCP Server/WAN interface Gateway}
Dec 8 16:14:54 dhclient 26504 DHCPDISCOVER on igb0 to 255.255.255.255 port 67 interval 1
Dec 8 16:14:54 dhclient 26504 DHCPNAK from {DHCP Server/WAN interface Gateway}
Dec 8 16:14:54 dhclient 26504 DHCPREQUEST on igb0 to {DHCP Server/WAN interface Gateway} port 67
Dec 8 16:14:26 dhclient 26504 DHCPREQUEST on igb0 to {DHCP Server/WAN interface Gateway} port 67
Dec 8 16:13:46 dhclient 26504 DHCPREQUEST on igb0 to {DHCP Server/WAN interface Gateway} port 67
Dec 8 16:13:32 dhclient 26504 DHCPREQUEST on igb0 to {DHCP Server/WAN interface Gateway} port 67
Dec 8 16:13:20 dhclient 26504 DHCPREQUEST on igb0 to {DHCP Server/WAN interface Gateway} port 67
Dec 8 16:13:09 dhclient 26504 DHCPREQUEST on igb0 to {DHCP Server/WAN interface Gateway} port 67
Dec 8 16:12:37 dhclient 26504 DHCPREQUEST on igb0 to {DHCP Server/WAN interface Gateway} port 67
Dec 8 16:12:22 dhclient 26504 DHCPREQUEST on igb0 to {DHCP Server/WAN interface Gateway} port 67
Dec 8 16:12:09 dhclient 26504 DHCPREQUEST on igb0 to {DHCP Server/WAN interface Gateway} port 67
Dec 8 16:11:57 dhclient 26504 DHCPREQUEST on igb0 to {DHCP Server/WAN interface Gateway} port 67
Dec 8 16:11:52 dhclient 26504 DHCPREQUEST on igb0 to {DHCP Server/WAN interface Gateway} port 67
Dec 8 16:11:50 dhclient 26504 DHCPREQUEST on igb0 to {DHCP Server/WAN interface Gateway} port 67
Dec 8 16:11:49 dhclient 26504 DHCPREQUEST on igb0 to {DHCP Server/WAN interface Gateway} port 67
Dec 8 16:11:48 dhclient 26504 DHCPREQUEST on igb0 to {DHCP Server/WAN interface Gateway} port 67
Dec 8 16:11:47 dhclient 26504 DHCPREQUEST on igb0 to {DHCP Server/WAN interface Gateway} port 67
So the client sends a DHCPREQUEST several times until it finally receives a DHCPNAK from the server, which makes it start the whole DORA process again. At 4:44PM, it does the same thing but the server sends a DHCPACK after the first DHCPREQUEST from the client.
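To put a number on how long the unanswered renewal lasts, here is a rough sketch that measures the gap between the first DHCPREQUEST, the DHCPNAK and the BOUND event. The four sample lines are copied from the dhclient log above; the helper names and the placeholder year are assumptions of mine, since syslog lines carry no year.

```python
from datetime import datetime

# Lines copied from the dhclient log above (newest first, as posted).
SAMPLE = """\
Dec 8 16:14:57 dhclient 80666 BOUND
Dec 8 16:14:54 dhclient 26504 DHCPNAK from {DHCP Server/WAN interface Gateway}
Dec 8 16:14:26 dhclient 26504 DHCPREQUEST on igb0 to {DHCP Server/WAN interface Gateway} port 67
Dec 8 16:11:47 dhclient 26504 DHCPREQUEST on igb0 to {DHCP Server/WAN interface Gateway} port 67
"""

ASSUMED_YEAR = "2000"  # placeholder only; the log lines do not include a year


def parse_time(line):
    """Parse the leading 'Mon D HH:MM:SS' syslog timestamp of a line."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{ASSUMED_YEAR} {stamp}", "%Y %b %d %H:%M:%S")


def renewal_gap(log_text):
    """Seconds from the first DHCPREQUEST to the DHCPNAK and to BOUND."""
    first_request = nak = bound = None
    for line in reversed(log_text.splitlines()):  # walk oldest first
        stamp = parse_time(line)
        if "DHCPREQUEST" in line and first_request is None:
            first_request = stamp
        elif "DHCPNAK" in line and nak is None:
            nak = stamp
        elif "BOUND" in line and nak is not None and bound is None:
            bound = stamp
    return ((nak - first_request).total_seconds(),
            (bound - first_request).total_seconds())


if __name__ == "__main__":
    to_nak, to_bound = renewal_gap(SAMPLE)
    print(f"first request to NAK: {to_nak:.0f} s, to BOUND: {to_bound:.0f} s")
```

For the 16:11 event this gives a bit over three minutes from the first unanswered request until the interface is bound again, which lines up with the downtime seen on the gateway.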
-
@kevindd992002 said in Gateway monitor down:
I'm seeing DHCPNAK
Looks like the upstream DHCP server sent a DHCPNAK. This was with your ISP device in bridge mode? So it was the DHCP server from the ISP ...?
"I'm seeing DHCPNAK" => The ISP is seeing your DHCPDISCOVERs and didn't expect them? It tells the pfSense DHCP client 'to shut up'.
@kevindd992002 said in Gateway monitor down:
seeing frequent "renewal in 1800 seconds" messages? Does that mean the DHCP lease is just every 30 minutes?
This part :
Dec 8 12:11:44 dhclient 26504 bound to {WAN IP} -- renewal in 1800 seconds.
Dec 8 12:11:43 dhclient 24813 Creating resolv.conf
Dec 8 12:11:43 dhclient 24678 RENEW
Dec 8 12:11:43 dhclient 26504 unknown dhcp option value 0x52
Dec 8 12:11:43 dhclient 26504 DHCPACK from {DHCP server/WAN interface gateway}
Dec 8 12:11:43 dhclient 26504 DHCPREQUEST on igb0 to {DHCP server/WAN interface gateway} port 67
Dec 8 11:45:06 dhcpd 21997 DHCPACK on 192.168.20.253 to 0a:d6:94:12:78:5c via igb1
Dec 8 11:45:06 dhcpd 21997 DHCPREQUEST for 192.168.20.253 from 0a:d6:94:12:78:5c via igb1
Dec 8 11:45:06 dhcpd 21997 reuse_lease: lease age 20117 (secs) under 25% threshold, reply with unaltered, existing lease for 192.168.20.253
You're getting an RFC1918 address: 192.168.20.253 is an IP from your ISP device in router mode ....
Your ISP would not give you an RFC1918 address if the device was in bridge mode. It shouldn't.
Btw: the DHCP client (pfSense) receives an option 0x52 = 82 decimal = "Relay Agent Information", and the client doesn't know what that means / isn't aware of that option / doesn't know what to do with it.
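For reference, the hex-to-decimal step behind "0x52 = 82" can be sketched as below. The option names come from RFC 2132 and RFC 3046; the table is deliberately tiny and `describe_option` is a hypothetical helper, not part of dhclient.

```python
# A few well-known DHCP option codes (RFC 2132, and RFC 3046 for option 82).
DHCP_OPTIONS = {
    51: "IP Address Lease Time",
    53: "DHCP Message Type",
    54: "Server Identifier",
    58: "Renewal (T1) Time Value",
    59: "Rebinding (T2) Time Value",
    82: "Relay Agent Information",
}


def describe_option(hex_code):
    """Translate a hex option code, as printed by dhclient, to its name."""
    code = int(hex_code, 16)  # int() accepts the "0x" prefix when base is 16
    name = DHCP_OPTIONS.get(code, "not in this table")
    return f"{hex_code} = option {code}: {name}"


if __name__ == "__main__":
    print(describe_option("0x52"))
```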
You saw several
dhcpleases xxxxx Sending HUP signal to dns daemon
Go to Services > DNS Resolver > General Settings and uncheck "DHCP Registration". You can deal with that issue later (it's a well-known one: every time any device on your LAN asks for a new IP by DHCP, or renews, the Resolver gets restarted. If you have many devices, or a device that likes to ask for a new IP every xx seconds, the resolver (unbound) spends more time restarting than doing its actual job = handling your DNS).
Just uncheck "DHCP Registration" and have this stopped.
edit :
What about telling the dhcp pfSense client to wait for a minute or two when a WAN UP/DOWN event is detected ?
Check this one :
look up the meaning of the several time out values here https://www.freebsd.org/cgi/man.cgi?query=dhclient.conf&sektion=5&n=1
You could also enter the IP (RFC1918) of your ISP device to be rejected :
Read :
To have the DHCP client reject offers from specific DHCP servers, enter their IP addresses here (separate multiple entries with a comma). This is useful for rejecting leases from cable modems that offer private IP addresses when they lose upstream sync.
So :
if you don't want to accept an IP from your ISP device - its internal DHCP server (when it is in bridge mode).
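As a quick way to decide whether a DHCP server address is the kind of private address that "reject leases from" is meant for, here is a small sketch. `is_rfc1918` is a hypothetical helper and pfSense does not run anything like this; it only checks the three RFC1918 ranges.

```python
import ipaddress

# The three private ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(address):
    """Return True if the address falls inside one of the RFC1918 ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in RFC1918_NETWORKS)


if __name__ == "__main__":
    for candidate in ("192.168.20.253", "202.12.27.33"):
        print(candidate, is_rfc1918(candidate))
```

If the DHCP server that answers on WAN turns out to be a public address, the reject list is unlikely to be the relevant knob.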
-
@gertjan said in Gateway monitor down:
@kevindd992002 said in Gateway monitor down:
I'm seeing DHCPNAK
Looks like the upstream DHCP server sent a DHCPNAK. This was with your ISP device in bridge mode? So it was the DHCP server from the ISP ...?
"I'm seeing DHCPNAK" => The ISP is seeing your DHCPDISCOVERs and didn't expect them? It tells the pfSense DHCP client 'to shut up'.
Correct. That is the upstream DHCP server from the ISP because it is in bridge mode.
You mean the ISP is seeing DHCPREQUESTs and not DHCPDISCOVERs, right? I'm seeing multiple DHCPREQUESTs that aren't being answered.
@kevindd992002 said in Gateway monitor down:
seeing frequent "renewal in 1800 seconds" messages? Does that mean the DHCP lease is just every 30 minutes?
This part :
Dec 8 12:11:44 dhclient 26504 bound to {WAN IP} -- renewal in 1800 seconds.
Dec 8 12:11:43 dhclient 24813 Creating resolv.conf
Dec 8 12:11:43 dhclient 24678 RENEW
Dec 8 12:11:43 dhclient 26504 unknown dhcp option value 0x52
Dec 8 12:11:43 dhclient 26504 DHCPACK from {DHCP server/WAN interface gateway}
Dec 8 12:11:43 dhclient 26504 DHCPREQUEST on igb0 to {DHCP server/WAN interface gateway} port 67
Dec 8 11:45:06 dhcpd 21997 DHCPACK on 192.168.20.253 to 0a:d6:94:12:78:5c via igb1
Dec 8 11:45:06 dhcpd 21997 DHCPREQUEST for 192.168.20.253 from 0a:d6:94:12:78:5c via igb1
Dec 8 11:45:06 dhcpd 21997 reuse_lease: lease age 20117 (secs) under 25% threshold, reply with unaltered, existing lease for 192.168.20.253
You're getting an RFC1918 address: 192.168.20.253 is an IP from your ISP device in router mode ....
Your ISP would not give you an RFC1918 address if the device was in bridge mode. It shouldn't.
Please ignore the dhcpd events in the 1st set of logs that I posted today. Those are for pfsense acting as the DHCP server to my "LAN clients", which is why you see RFC1918 addresses in the logs. This is why I posted a 2nd set of logs that only shows the dhclient entries, which is what's important for my WAN DHCP lease renewal issue.
Btw: the DHCP client (pfSense) receives an option 0x52 = 82 decimal = "Relay Agent Information", and the client doesn't know what that means / isn't aware of that option / doesn't know what to do with it.
Yes, I saw that too. So that means the ISP is using a DHCP relay, so this whole issue could be related to that, no?
You saw several
dhcpleases xxxxx Sending HUP signal to dns daemon
Go to Services > DNS Resolver > General Settings and uncheck "DHCP Registration". You can deal with that issue later (it's a well-known one: every time any device on your LAN asks for a new IP by DHCP, or renews, the Resolver gets restarted. If you have many devices, or a device that likes to ask for a new IP every xx seconds, the resolver (unbound) spends more time restarting than doing its actual job = handling your DNS).
Just uncheck "DHCP Registration" and have this stopped.
I am totally aware of this and this only affects the DHCP server service (dhcpd) in pfsense, not the dhclient. I don't really care if the DHCP server restarts every now and then because of DHCP registrations. I accept the fact that it does this.
-
I've edited my reply above and added another part.
-
@gertjan said in Gateway monitor down:
edit :
What about telling the dhcp pfSense client to wait for a minute or two when a WAN UP/DOWN event is detected ?
Check this one :
look up the meaning of the several time out values here https://www.freebsd.org/cgi/man.cgi?query=dhclient.conf&sektion=5&n=1
I'm looking into this too but I don't want to be breaking any RFC rules that aren't supposed to be broken. Not sure if the problem is in the client side or the ISP DHCP server side.
You could also enter the IP (RFC1918) of your ISP device to be rejected :
Read :
To have the DHCP client reject offers from specific DHCP servers, enter their IP addresses here (separate multiple entries with a comma). This is useful for rejecting leases from cable modems that offer private IP addresses when they lose upstream sync.
So :
if you don't want to accept an IP from your ISP device - its internal DHCP server (when it is in bridge mode).
The IP of my ISP's DHCP server is a public IP which is expected. So not sure if this has some effect.
-
@kevindd992002 said in Gateway monitor down:
I don't really care if the DHCP server restarts every now and then because of DHCP registrations. I accept the fact that it does this.
No, not the pfSense DHCP server. It's far worse.
When the pfSense DHCP server gives an IP lease to a LAN-based device, it will:
Sending HUP signal to dns daemon
This means: it will restart unbound, the DNS resolver.
That's OK if it does so once in a while.
Not every minute or so, as you will be losing your DNS cache every time it restarts.
The DNS functionality on your LAN will not be available during the restart.
And that's bad ....
-
@gertjan said in Gateway monitor down:
@kevindd992002 said in Gateway monitor down:
I don't really care if the DHCP server restarts every now and then because of DHCP registrations. I accept the fact that it does this.
No, not the pfSense DHCP server. It's far worse.
When the pfSense DHCP server gives an IP lease to a LAN-based device, it will:
Sending HUP signal to dns daemon
This means: it will restart unbound, the DNS resolver.
That's OK if it does so once in a while.
Not every minute or so, as you will be losing your DNS cache every time it restarts.
The DNS functionality on your LAN will not be available during the restart.
And that's bad ....
Ohhh, you're right. Yeah, then I should probably disable that if it deletes the cache every single time :) Even though I have my own DNS server (adguard home), it is still pointed to pfsense's unbound for faster resolution.
-
@kevindd992002 said in Gateway monitor down:
Also, why am I seeing frequent "renewal in 1800 seconds" messages? Does that mean the DHCP lease is just every 30 minutes?
The dhcp client will typically renew at half the lease time to prevent the lease ever expiring. So it looks like the ISP is handing you a 1 hour lease.
Steve
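The arithmetic behind that estimate can be sketched as follows, assuming the RFC 2131 default timers (renew at 0.5 and rebind at 0.875 of the lease). A server can also send explicit T1/T2 values, in which case the inferred lease length would be off; the function names are just illustrative.

```python
# RFC 2131 default timer fractions; a server may override them explicitly.
T1_FRACTION = 0.5    # renewal time as a fraction of the lease
T2_FRACTION = 0.875  # rebinding time as a fraction of the lease


def lease_from_renewal(renewal_seconds, t1_fraction=T1_FRACTION):
    """Infer the lease length from the 'renewal in N seconds' log value."""
    return renewal_seconds / t1_fraction


def default_timers(lease_seconds):
    """Return the renew, rebind and expiry times for a given lease."""
    return {
        "T1_renew": lease_seconds * T1_FRACTION,
        "T2_rebind": lease_seconds * T2_FRACTION,
        "expiry": lease_seconds,
    }


if __name__ == "__main__":
    lease = lease_from_renewal(1800)  # value taken from the dhclient log
    print(f"inferred lease: {lease:.0f} s")
    print(default_timers(lease))
```

With the 1800 seconds from the log, this gives a 3600-second lease, with rebinding starting at 3150 seconds if renewals keep going unanswered.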