Gateway monitor down
-
@kevindd992002 said in Gateway monitor down:
I know DNS is UDP
Get ready to know something more. When do DNS queries use TCP instead of UDP?
And because we are in the 2020s, DNS, which has been around since the 1980s, has evolved: DNS queries and answers have become bigger. Too big. UDP becomes useless, and DNS traffic shifts over to TCP.
A good example is DNSSEC: DNS traffic becomes a pure TCP thing, as answers just don't fit into "one packet" of UDP any more.
Btw: it's a known issue. pfSense is installed, and outgoing traffic is all blocked, with some exceptions: they allowed DNS ... over UDP only. Suddenly it looks like 'something' failed. It's of course 'the firewall' - or, no, wait, the ISP!! Or no, it's someone, but not me!!
That's a classic newbie fail: one should understand things before using them.
-
@gertjan said in Gateway monitor down:
@kevindd992002 said in Gateway monitor down:
I know DNS is UDP
Get ready to know something more. When do DNS queries use TCP instead of UDP?
And because we are in the 2020s, DNS, which has been around since the 1980s, has evolved: DNS queries and answers have become bigger. Too big. UDP becomes useless, and DNS traffic shifts over to TCP.
A good example is DNSSEC: DNS traffic becomes a pure TCP thing, as answers just don't fit into "one packet" of UDP any more.
Right, that's as much as I know: it mainly uses UDP and sometimes uses TCP. Thanks for the link that explains when it uses TCP.
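For readers following along, the UDP-then-TCP behaviour discussed here can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names (`build_query`, `is_truncated`, `resolve`) are made up for this example, the query carries no EDNS0 option, and the response ID is not verified. The idea is that a client asks over UDP first, and if the server sets the TC (truncated) bit in the header, it repeats the same query over TCP, where each message is prefixed with a 2-byte length.

```python
import socket
import struct


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query: one question, class IN, RD flag set."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = [part for part in name.rstrip(".").split(".") if part]
    qname = b"".join(bytes([len(p)]) + p.encode("ascii") for p in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def is_truncated(response):
    """Return True if the TC bit (0x0200) is set in the DNS header flags."""
    flags = struct.unpack("!H", response[2:4])[0]
    return bool(flags & 0x0200)


def recv_exact(sock, count):
    """Read exactly `count` bytes from a TCP socket."""
    chunks = []
    remaining = count
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("connection closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def resolve(name, server, timeout=3.0):
    """Query over UDP first; retry over TCP if the answer came back truncated."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp:
        udp.settimeout(timeout)
        udp.sendto(query, (server, 53))
        response, _ = udp.recvfrom(4096)
    if not is_truncated(response):
        return response, "udp"
    # DNS over TCP prefixes every message with its length as a 2-byte integer.
    with socket.create_connection((server, 53), timeout=timeout) as tcp:
        tcp.sendall(struct.pack("!H", len(query)) + query)
        length = struct.unpack("!H", recv_exact(tcp, 2))[0]
        return recv_exact(tcp, length), "tcp"
```

If a firewall only allows port 53 over UDP, the first half of `resolve` still works, but the TCP retry times out, which is the failure mode described in the quoted post.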
Btw: it's a known issue. pfSense is installed, and outgoing traffic is all blocked, with some exceptions: they allowed DNS ... over UDP only. Suddenly it looks like 'something' failed. It's of course 'the firewall' - or, no, wait, the ISP!! Or no, it's someone, but not me!!
That's a classic newbie fail: one should understand things before using them.
Not sure what your point is here. The only things that were being blocked were DNS resolving and things like torrenting with multiple TCP connections. DNS forwarding was not blocked. For reference, here's my past thread: https://forum.netgate.com/topic/159232/dns-resolver-timeouts
It's pointless to discuss that here, though, as it is off-topic. My main issue here is the packet loss I experience with my current ISP, not the DNS resolving issues I had with my past ISP. It's not like I'll be going back to my past ISP anytime soon, because of the lock-in period I have with my current one.
-
@stephenw10 said in Gateway monitor down:
Yes, just run a pcap on WAN when it fails and see what's happening.
If the DHCP lease has expired you should see pfSense requesting a new lease.
Steve
Gotcha, I'll make sure to get a pcap when it happens again. Do you have any ideas regarding the unstable ping WAN monitor results though?
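When reading such a capture, the interesting part is which DHCP messages appear and who sends them. Below is a minimal sketch, not a full dissector, of pulling the message type, server identifier and lease time out of a raw DHCP (BOOTP) payload. The helper name `parse_dhcp` is hypothetical, and extracting the UDP payload from the pcap file is assumed to be done elsewhere (Wireshark shows the same fields directly).

```python
import socket
import struct

DHCP_MAGIC = b"\x63\x82\x53\x63"  # magic cookie that follows the 236-byte BOOTP header
MESSAGE_TYPES = {
    1: "DISCOVER", 2: "OFFER", 3: "REQUEST", 4: "DECLINE",
    5: "ACK", 6: "NAK", 7: "RELEASE", 8: "INFORM",
}


def parse_dhcp(payload):
    """Extract a few fields from a DHCP message (UDP payload on ports 67/68)."""
    if len(payload) < 240 or payload[236:240] != DHCP_MAGIC:
        raise ValueError("not a DHCP message")
    info = {"op": payload[0], "xid": struct.unpack("!I", payload[4:8])[0]}
    i = 240
    while i < len(payload):
        code = payload[i]
        if code == 255:          # end option
            break
        if code == 0:            # pad option, single byte
            i += 1
            continue
        if i + 1 >= len(payload):
            break                # truncated option header
        length = payload[i + 1]
        data = payload[i + 2:i + 2 + length]
        if len(data) < length:
            break                # truncated option body
        if code == 53 and length == 1:
            info["type"] = MESSAGE_TYPES.get(data[0], "UNKNOWN(%d)" % data[0])
        elif code == 54 and length == 4:
            info["server_id"] = socket.inet_ntoa(data)
        elif code == 51 and length == 4:
            info["lease_seconds"] = struct.unpack("!I", data)[0]
        i += 2 + length
    return info
```

A renewal shows up as a REQUEST from the client followed by an ACK from the server; REQUESTs that never get an ACK around the time the lease expires would point at the ISP side.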
-
Not really. Is anything logged?
Do you see the same loss if you choose a different IP?
If you use the ISP gateway IP directly?
Steve
-
@stephenw10 said in Gateway monitor down:
Not really. Is anything logged?
Do you see the same loss if you choose a different IP?
If you use the ISP gateway IP directly?
Steve
Yes, so far I tried 8.8.8.8, 8.8.4.4, 1.1.1.1, and my ISP gateway IP. It's less prevalent on the ISP gateway IP but it still fluctuates. With my past ISP, here's how it looks with 8.8.4.4:
You can just see the difference with the graph in my original post.
-
Mmm, that sounds like an actual WAN issue. Especially since it hits your VoIP too and that's not going through pfSense as I understand it.
Steve
-
@stephenw10 said in Gateway monitor down:
Mmm, that sounds like an actual WAN issue. Especially since it hits your VoIP too and that's not going through pfSense as I understand it.
Steve
Yeah, probably. What I noticed is that those peaks in the standard deviation mostly recur at a 14-15 minute interval, which is kinda weird.
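One way to check whether the peaks really recur at a fixed interval, rather than eyeballing the graph, is to measure the gaps between them. The sketch below assumes the monitoring data has been exported as `(timestamp_seconds, stddev_ms)` pairs; the function names, the threshold and the 10% tolerance are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def spike_times(samples, threshold_ms):
    """Return the timestamps where the value first rises above the threshold."""
    times = []
    above = False
    for timestamp, value in samples:
        if value > threshold_ms and not above:
            times.append(timestamp)
        above = value > threshold_ms
    return times


def spike_intervals(samples, threshold_ms):
    """Return the gaps, in seconds, between consecutive spikes."""
    times = spike_times(samples, threshold_ms)
    return [later - earlier for earlier, later in zip(times, times[1:])]


def looks_periodic(intervals, tolerance=0.1):
    """True if the gaps vary by no more than `tolerance` of their mean."""
    if len(intervals) < 3:
        return False  # too few spikes to say anything
    return pstdev(intervals) <= tolerance * mean(intervals)
```

If the gaps cluster tightly around one value (for example 900 seconds), that hints at a timer somewhere, such as a cache or lease expiry; widely scattered gaps hint at ordinary congestion or line noise.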
For the DHCP lease issue, could this setting in the WAN interface potentially be what's causing the issue?
I'm just thinking out loud as their DHCP server can be a private IP address.
-
@kevindd992002 said in Gateway monitor down:
I'm just thinking out loud as their DHCP server can be a private IP address.
"pcap"ing will tell you that.
(during testing, remove that block rule on the WAN).
-
@kevindd992002 said in Gateway monitor down:
could this setting in the WAN interface potentially be what's causing the issue?
No, that will only prevent incoming connections from a private IP on WAN. The DHCP client initiates the connections outbound.
If it was exactly 15min or some other exact interval I'd be looking at some ARP problem perhaps. But I'm not seeing a pattern that matches that.
Steve
-
@stephenw10 said in Gateway monitor down:
@kevindd992002 said in Gateway monitor down:
could this setting in the WAN interface potentially be what's causing the issue?
No, that will only prevent incoming connections from a private IP on WAN. The DHCP client initiates the connections outbound.
If it was exactly 15min or some other exact interval I'd be looking at some ARP problem perhaps. But I'm not seeing a pattern that matches that.
Steve
That's my exact hunch. I know dhclient on pfSense is initiating the connection to the ISP DHCP server, so it's got to be an outbound connection.
The ISP is still troubleshooting the issue from their end, and even though I told them not to touch the "bridge" mode setting on my ONU, they did. Surprise surprise. And now I'm in route mode, where my pfSense WAN interface is getting a private IP from the ONU DHCP server (NAT).
What I notice is that I don't get the DHCP lease issue when in route mode. The problem now is that unbound as a resolver won't work. All it gives my clients are THROWAWAY/SERVFAIL results. When I switch it to a forwarder, it works. So there's got to be something in route mode that's hijacking DNS for some reason. I'm still pushing them to put everything back in bridge mode, as that is very important for me, and I told them that the issue is potentially caused by the DHCP lease.
-
If they are hijacking DNS it would still fail in forwarding mode, unless you're forwarding to their DNS servers perhaps. Or maybe you are disabling DNSSEC in forwarding mode, which allows them to.
-
Or maybe they have famous DNS servers (like Google, Cloudflare, etc.) whitelisted or something? I tried disabling DNSSEC in both cases and it just doesn't work in resolver mode. Do you have any other ideas? What I know is that this started at exactly the same time they put my ONU in route mode.
Snippet of unbound logs:
Dec 3 21:41:55 unbound 6032 [6032:0] info: control cmd: dump_infra
Dec 3 21:41:54 unbound 6032 [6032:3] info: query response was THROWAWAY
Dec 3 21:41:54 unbound 6032 [6032:3] info: reply from <.> 202.12.27.33#53
Dec 3 21:41:54 unbound 6032 [6032:3] info: response for . NS IN
Dec 3 21:41:54 unbound 6032 [6032:3] info: error sending query to auth server 2001:500:1::53 port 53
Dec 3 21:41:54 unbound 6032 [6032:3] info: query response was THROWAWAY
Dec 3 21:41:54 unbound 6032 [6032:3] info: reply from <.> 202.12.27.33#53
Dec 3 21:41:54 unbound 6032 [6032:3] info: response for . NS IN
Dec 3 21:41:54 unbound 6032 [6032:3] info: error sending query to auth server 2001:500:2d::d port 53
Dec 3 21:41:54 unbound 6032 [6032:3] info: query response was THROWAWAY
Dec 3 21:41:54 unbound 6032 [6032:3] info: reply from <.> 192.112.36.4#53
Dec 3 21:41:54 unbound 6032 [6032:3] info: response for . NS IN
Dec 3 21:41:54 unbound 6032 [6032:3] info: error sending query to auth server 2001:503:c27::2:30 port 53
Dec 3 21:41:54 unbound 6032 [6032:3] info: query response was THROWAWAY
Dec 3 21:41:54 unbound 6032 [6032:3] info: reply from <.> 199.9.14.201#53
Dec 3 21:41:54 unbound 6032 [6032:3] info: response for . NS IN
Dec 3 21:41:54 unbound 6032 [6032:3] info: error sending query to auth server 2001:500:200::b port 53
Dec 3 21:41:54 unbound 6032 [6032:3] info: error sending query to auth server 2001:7fe::53 port 53
Dec 3 21:41:54 unbound 6032 [6032:3] info: query response was THROWAWAY
Dec 3 21:41:54 unbound 6032 [6032:3] info: reply from <.> 198.97.190.53#53
Dec 3 21:41:54 unbound 6032 [6032:3] info: response for . NS IN
Dec 3 21:41:54 unbound 6032 [6032:3] info: query response was THROWAWAY
Dec 3 21:41:54 unbound 6032 [6032:3] info: reply from <.> 198.41.0.4#53
Dec 3 21:41:54 unbound 6032 [6032:3] info: response for . NS IN
Dec 3 21:41:54 unbound 6032 [6032:3] info: error sending query to auth server 2001:503:c27::2:30 port 53
Dec 3 21:41:54 unbound 6032 [6032:3] info: query response was THROWAWAY
Dec 3 21:41:54 unbound 6032 [6032:3] info: reply from <.> 192.36.148.17#53
Dec 3 21:41:54 unbound 6032 [6032:3] info: response for . NS IN
Dec 3 21:41:54 unbound 6032 [6032:3] info: error sending query to auth server 2001:500:2::c port 53
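To make a log like this easier to read, the entries can be tallied by address family. The following is a rough sketch; the function name `summarize` and the regular expressions are written only against the message formats visible in the snippet above, so other unbound versions or verbosity levels may need different patterns.

```python
import ipaddress
import re
from collections import Counter

SEND_ERROR = re.compile(r"error sending query to auth server (\S+) port \d+")
REPLY_FROM = re.compile(r"reply from <[^>]*> ([0-9a-fA-F:.]+)#\d+")


def summarize(log_lines):
    """Count send errors and replies per IP version, plus THROWAWAY responses."""
    summary = {"send_errors": Counter(), "replies": Counter(), "throwaway": 0}
    for line in log_lines:
        match = SEND_ERROR.search(line)
        if match:
            version = ipaddress.ip_address(match.group(1)).version
            summary["send_errors"][version] += 1
            continue
        match = REPLY_FROM.search(line)
        if match:
            version = ipaddress.ip_address(match.group(1)).version
            summary["replies"][version] += 1
            continue
        if "query response was THROWAWAY" in line:
            summary["throwaway"] += 1
    return summary
```

Run against the snippet, this would show every send error going to an IPv6 root server, while the IPv4 root servers do answer but unbound throws each of those answers away.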
-
Mmm, well, you seem to have v6 servers configured that are not responding or that you cannot reach, so the first thing I would do is disable those.
-
@stephenw10 said in Gateway monitor down:
Mmm, well, you seem to have v6 servers configured that are not responding or that you cannot reach, so the first thing I would do is disable those.
How do I disable those? I tried removing the IPv6 link-local in "Network Interfaces" of the DNS Resolver settings and nothing changed.
That screenshot above is when DNS Resolver is NOT FORWARDING. So not sure how to instruct unbound to not query ipv6 DNS servers.
-
Do you actually have a routable IPv6 address on that firewall?
-
@stephenw10 said in Gateway monitor down:
Do you actually have a routable IPv6 address on that firewall?
I don't. I even have ipv6 disabled:
-
Checking that box allows IPv6. But it doesn't matter. Unbound is trying to reach an external v6 server and obviously cannot if you don't have a public v6 IP it can use.
That in itself should not be an issue as long as it's not only using IPv6 servers, which it isn't.
Steve
-
@stephenw10 said in Gateway monitor down:
Checking that box allows IPv6. But it doesn't matter. Unbound is trying to reach an external v6 server and obviously cannot if you don't have a public v6 IP it can use.
That in itself should not be an issue as long as it's not only using IPv6 servers, which it isn't.
Steve
Oops, you're right. I had this unchecked before but checked it just recently. Correct, it shouldn't be an issue since it still tries ipv4. That's why I think all DNS queries from unbound are being blocked somehow. As to how the ISP detects this type of traffic vs when just forwarding to a few DNS servers is what I don't understand. And how when in bridge mode it all works just fine.
In this day and age, is it recommended to just allow all IPv6? I know that for Windows the official recommendation from MS is to not disable IPv6.
-
Either enable it completely if your ISP supports it or disable it completely.
The worst case is where you have some IPv6 but no actual connectivity, and clients try to use it in preference to v4.
Steve
-
@stephenw10 said in Gateway monitor down:
Either enable it completely if your ISP supports it or disable it completely.
The worst case is where you have some IPv6 but no actual connectivity, and clients try to use it in preference to v4.
Steve
I know my ISP supports it, but I haven't gotten around to setting it up yet because of the main issue I'm having. Do clients prefer v6 over v4 if they're both set up?