Odd DNS lookup issue (via PPPoE)
-
Hello everyone,
I'm a new pfsense user, and this is my first post here.
I am running 2.4.5 (CE) on vanilla amd64 hardware.
Now, I'll try to describe my issue as best I can. I have one WAN interface, connected to a provider-supplied modem/router (in router mode). That interface gets its IP from said router. On the LAN interface I have the DHCP service running, handing out the pfsense LAN address as the DNS server option. That all works as expected. DNS resolver is in forwarding mode and forwards to the provider router. No issues on the LAN computers with the pfsense box set as DNS server, and Diagnostics -> DNS Lookup in the webgui works as well.
Now, I added a third interface. This time PPPoE, using the WAN interface above as parent. (PPP passthrough is enabled on the modem, and multiple PPP logins are allowed by the provider. You may ask yourself "why?" - I eventually want to use the provider box in bridge mode only.) That works fine; it gets a PPP-provided IP as well as two DNS server addresses. I checked, and they are the same as the ones handed to the provider box, which is no surprise. As per the pfsense configuration, they are added to the list of available DNS servers. Naturally, pfsense adds automatic routes that send packets for these DNS server addresses through the PPP interface. Expected and desired behavior. Now, going to Diagnostics -> DNS Lookup and looking up any name works, but the two servers received via PPP show "No Response" in the Query Time column. Odd. Resolution still works, however, because the provider box answers.
So, I assumed there's some connectivity or routing issue. Pinging those DNSes works fine however (from the webgui, and the LAN clients.) Also, doing an nslookup using any of those servers works fine...strange. Time for some trial and error testing. I disabled the WAN interface, leaving only the PPP one active for anything going out. Works fine as long as I configure the clients to use the PPP provided DNSes directly. Huh. Maybe DNS resolver/unbound is having some issue. Disabled it (note: DNS forwarder is also disabled.) I also replaced the DNS entries on pfsense with a single one: Google's 8.8.8.8. Now, doing a DNS Lookup in the webgui: gets an answer, but still no response in the query time column (and it takes a while.) On the console shell: nslookup works fine using 8.8.8.8, no timeout.
Disabling the PPP interface, and re-enabling the "WAN" one -> dns lookup via webgui works fine again.
Back to PPP only, WAN disabled. Did a packet capture on the PPP interface for port 53. There are indeed replies from 8.8.8.8 to the PPP address (how else would it be able to show the correct responses, right?) They are also not late, they do arrive quickly, so no timeout.
I got a bit frustrated, but since I've been messing around with pfsense's capabilities for the last few days, I thought maybe I had messed up some firewall or NAT rules or something. So: factory reset. Changed nothing, except adding the PPPoE interface. No luck, same issue.
Before I start pulling out my few remaining hairs, I thought I'd ask here whether anyone has another idea of what to check. At this point, it's more about me obsessing over why this happens; it's not like I couldn't live without the DNS resolver, or keep running the modem in router mode. :)
Sorry for the long description, but I wanted to be as comprehensive as possible.
-
I did some further investigating. I compared the packet capture from one of the (semi-)failed lookups to a working one. It seems it tries to get the root servers (f.root-servers.net. and so forth) first. In the working case, it gets them immediately, and in the failed case, it attempts a few times, times out and continues on to get the A and AAAA records (which succeeds.)
On the console, I did a dig NS . and sure enough, it times out when going out via the PPP interface. Tried the same on a client connected through the provider router: works. So I guess the issue may come from something the provider box does when passing through PPPoE, or from some obscure (or maybe only obscure to me...) DNS config. To test this, I used another computer and established a PPPoE connection using its own client (in this case macOS's PPP client) -> same issue, cannot dig the root servers. Hmph.
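For anyone who wants to reproduce the comparison without dig, here is a minimal Python sketch that sends the same kinds of queries (NS for the root, then A and AAAA for a name) over plain UDP and reports whether each one gets a reply in time. The function names, the 3 second timeout, and the example.com test name are my own assumptions, not anything pfsense uses internally. It also sends no EDNS options, so the packets are not byte-identical to what dig or unbound emit.

```python
import socket
import struct
import time

# Record type codes from RFC 1035 (A, NS) and RFC 3596 (AAAA).
QTYPES = {"A": 1, "NS": 2, "AAAA": 28}


def build_query(name, qtype, query_id=0x1234):
    """Build a minimal DNS query packet with the 'recursion desired' flag set."""
    # Header: ID, flags (RD=1), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME is a series of length-prefixed labels ending in a zero byte.
    # The root name "." has no labels, so it is just the zero byte.
    labels = [label for label in name.split(".") if label]
    qname = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels)
    # Question section: QNAME, QTYPE, QCLASS=IN (1).
    return header + qname + b"\x00" + struct.pack("!HH", qtype, 1)


def timed_query(server, name, qtype, timeout=3.0):
    """Return (seconds, answer_count), or None if no usable reply arrived in time."""
    packet = build_query(name, qtype)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(packet, (server, 53))
            reply, _addr = sock.recvfrom(4096)
        except socket.timeout:
            return None
        elapsed = time.monotonic() - start
    if len(reply) < 12:
        return None  # shorter than a DNS header, not a valid reply
    # ANCOUNT sits in bytes 6-7 of the header. This sketch does not verify
    # that the reply ID matches the query ID.
    answer_count = struct.unpack("!H", reply[6:8])[0]
    return elapsed, answer_count


def compare(server, host="example.com"):
    """Query the root NS set, then A and AAAA for host, and print the outcome."""
    for name, rtype in ((".", "NS"), (host, "A"), (host, "AAAA")):
        result = timed_query(server, name, QTYPES[rtype])
        if result is None:
            print(f"{rtype:4} {name}: no response")
        else:
            elapsed, answers = result
            print(f"{rtype:4} {name}: {answers} answer(s) in {elapsed * 1000:.0f} ms")
```

Running compare("8.8.8.8") once over the PPPoE link and once through the provider router should show whether only the root NS query goes unanswered, which is what the dig test above suggests.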
Next thing to try is actually not using PPPoE pass-through, but using a modem in bridge mode. Unfortunately, that neutered provider-supplied box cannot (easily) be convinced to work in bridge mode (and my AllNet modem has not yet arrived). I do have a Lancom 1790VA router around, which can be put into bridge mode easily, but I can't do this during the day because I need the internet connection for my office VPN.
For those few of you still reading: any ideas about the root cause (no pun intended :P)? Or a means of preventing unbound or dnsmasq from trying to get the root NSes?