Changes in DNS?



  • Have there been changes in the DNS code recently?
    This weekend I started getting severe connection problems. After some investigation I found out the problem isn't related to a specific internet link (I use 4 different providers right now). After some more investigation it looked like some DNS requests don't get answered. I think restarting the DNS service sometimes helped.
    I tried different snapshots and always had the same problems. Even snapshots.pfsense.org wasn't found once.
    Manually setting a different DNS (not pfsense) helped…
    Now I went back to the oldest snapshot online (17th) and it looks as if the problem weren't there in that version.



  • Hm, I found something now
    (going back to an older snapshot didn't really help, I just forgot to delete the manually configured dns from my system…)

    I had the option 'Allow DNS server list to be overridden by DHCP/PPP on WAN' active and at least one of the DNS servers in that list wasn't working. So probably that's what caused the problems. But then, the forwarder code doesn't seem to be very robust when one non-working server out of 8 causes severe problems…


  • Rebel Alliance Global Moderator

    Would depend on how that 1 wasn't working.. And were you sending to all, or had you enabled sequential?

    Query DNS servers sequentially
    If this option is set, pfSense DNS Forwarder (dnsmasq) will query the DNS servers sequentially in the order specified (System - General Setup - DNS Servers), rather than all at once in parallel.

    If you're asking all of them, and this 1 bad one answers with nxdomain - and it comes in first, guess what..  You're out of luck, or if it answers first with wrong info, etc.  But if it does not answer then NO, it would not have any effect on anything other than pfsense sending out extra packets it doesn't need to.

    Now if you were using sequential queries and it was first on the list, then sure your clients on the other side of pfsense might get some timeouts and not resolve stuff because pfsense was wasting time asking something that is either really slow or does not answer (you said it was bad)
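    The difference between the two modes described above can be sketched with a toy model. This is not dnsmasq's actual logic, only an illustration of "first reply wins" versus "ask one at a time". The `Upstream` class, the function names and the timeout values are assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Upstream:
    name: str
    latency: Optional[float]  # seconds until a reply arrives; None = never answers
    answer: str = "NOERROR"   # what the server says when it does reply


def resolve_parallel(upstreams, timeout=5.0):
    """Ask every server at once; the earliest reply inside the timeout wins."""
    replies = [u for u in upstreams if u.latency is not None and u.latency <= timeout]
    if not replies:
        return None, timeout
    first = min(replies, key=lambda u: u.latency)
    return first.answer, first.latency


def resolve_sequential(upstreams, per_server_timeout=2.0):
    """Ask servers in list order; every silent server costs one full timeout."""
    elapsed = 0.0
    for u in upstreams:
        if u.latency is not None and u.latency <= per_server_timeout:
            return u.answer, elapsed + u.latency
        elapsed += per_server_timeout
    return None, elapsed


if __name__ == "__main__":
    servers = [
        Upstream("dead-1", None),
        Upstream("dead-2", None),
        Upstream("good", 0.03),
    ]
    print("parallel:  ", resolve_parallel(servers))
    print("sequential:", resolve_sequential(servers))
```

    In this model two silent servers cost nothing in parallel mode, while in sequential mode the client waits through two timeouts (about 4 seconds here) before the working server is asked. A server that answers quickly with NXDOMAIN wins the parallel race, which is the failure case the post describes.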

    Your lack of understanding of how dns works, or how the forwarder works, does not mean it's not robust ;)

    A better option vs bouncing around snapshots when you thought there might be something wrong with dns would have been to actually troubleshoot why you were getting dns issues.  " looked like some dns requests don't get answered."  Why were they not getting answered - did you think to see if pfsense was sending out the queries, and were they getting answered, via a simple sniff?  Or just manually doing the queries via say nslookup or dig with either debug or +trace set to see what is happening in the query, etc.
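    As a rough stand-in for running dig against each server by hand, the sketch below sends one A query to every listed server separately and reports whether it answered, and with which response code. It uses only the Python standard library. The function names (`build_query`, `rcode_of`, `probe`) and the 2-second timeout are assumptions for this example, and it only distinguishes "no answer" from a reply code, nothing more.

```python
import socket
import struct
import sys

RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (only "recursion desired" set), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def rcode_of(response: bytes) -> str:
    """Extract the response code from the low 4 bits of the flags field."""
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F
    return RCODES.get(code, f"RCODE{code}")


def probe(server: str, name: str, timeout: float = 2.0) -> str:
    """Send one UDP query to one server and describe what came back."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            response, _ = sock.recvfrom(512)
        except socket.timeout:
            return "no answer (timeout)"
        except OSError as exc:
            return f"error: {exc}"
    if len(response) < 12 or response[:2] != query[:2]:
        return "malformed reply"
    return rcode_of(response)


if __name__ == "__main__" and len(sys.argv) > 2:
    # Usage: python probe.py example.com 8.8.8.8 8.8.4.4
    lookup_name, servers = sys.argv[1], sys.argv[2:]
    for server in servers:
        print(f"{server:>15}  {probe(server, lookup_name)}")
```

    Running it once per WAN link (or from a host routed through each link) would show whether a given server is silent, refusing, or answering NXDOMAIN, which is the distinction the thread keeps coming back to.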



  • I agree with the above.  You can't blame all outages on pfsense or even the ISP.
    As a for instance, I was getting outages last week and some quick diagnosing revealed that pfsense wasn't the issue and neither was my internet provider.  My DNS was failing because about 5 hops down the path from my computer to the dyndns servers there is one particular link that is suffering on average 95% packet loss.  This condition has existed between me and all their servers for a week or so.  I'm on the East coast.  Logging into another of my computers more central USA and using the same DNS servers, there is no issue.  Different route, so no packet loss.

    In short, it's not pfsense's fault, not my ISP's fault, not the fault of my DNS provider - it's no one's fault I can directly motivate.  So, here, I just changed my DNS servers in the short term and I do an MTR about once a day to check and see when/if that weak link is ever repaired.


  • Rebel Alliance Global Moderator

    ^ exactly..  Troubleshooting a bit is a much better course of action vs just trying different snaps..  And blaming something without RCA is just pointless.



    I set up my first BINDs back in the 90s, thank you. I think some of them are still running. So I think I have some very basic knowledge at least.
    The DNSs that weren't responding weren't answering at all, not NX DOMAIN.
    I'm glad you're protecting your providers so much. So you think if they provide you with non-working DNS servers via DHCP they're clearly not to blame… How nice of you, but I don't see it that way. One of the ISPs answered I should use 2 different ones or even better 8.8.8.8 and 8.8.4.4 anyway. LOL.
    Of course I'm happy to troubleshoot, but first one has to find out where to shoot at.
    I was quite surprised to find out that pfsense used the DNS servers sent by DHCP, I was quite confident it wasn't that way until at least some weeks ago. Seems I was wrong. It just never gave me problems in the past. In fact that's nice, but I just wasn't aware of that fact. Of course I first checked the ones I entered manually and those all worked.

    I didn't mean to insult anybody, but if 2 non-responding DNS servers out of 8 cause that kind of problem, that's not my understanding of redundancy. Then it's clearly better to only use 1 DNS server. In my opinion a DNS server that doesn't answer should not break lookups as long as there are alternate DNS servers configured. I also think that's the way it was intended...
    (Query sequentially is not active)

    My intention wasn't to 'blame' anybody (I don't believe that does anybody any good). Could just have been that somebody says: Of course, I changed something in the code recently, and now I see that could be the problem.


  • Banned

    @sirdir:

    I didn't mean to insult anybody, but if 2 non-responding DNS servers out of 8 cause that kind of problem, that's not my understanding of redundancy.

    You configured 8 DNS servers? Well, that is pretty amazing, considering the GUI allows just 4. In case you did not and you got those 8 DNS servers assigned via DHCP, I'd like to remind you that some platforms (such as Linux/glibc) allow for only 3 nameservers. Might be something for your ISP to think about. Since if their first 3 DNS servers are useless, then no others will ever get used.
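    To illustrate the glibc limit mentioned above, here is a minimal sketch that reads a resolv.conf-style text and splits the `nameserver` lines into the ones a glibc stub resolver would use (the first `MAXNS`, which is 3) and the ones it would ignore. The function names are made up for this example, and the limit applies to the libc resolver on clients, not necessarily to how a forwarder picks its upstream servers.

```python
MAXNS = 3  # glibc's compile-time limit on nameserver entries it will use


def nameservers(resolv_conf_text):
    """Return all nameserver addresses in file order, skipping comment lines."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def split_used_ignored(resolv_conf_text):
    """Split the nameserver list into (used by glibc, silently ignored)."""
    servers = nameservers(resolv_conf_text)
    return servers[:MAXNS], servers[MAXNS:]


if __name__ == "__main__":
    sample = """\
domain local.lan
nameserver 127.0.0.1
nameserver 64.81.159.2
nameserver 129.250.35.250
nameserver 75.75.75.75
"""
    used, ignored = split_used_ignored(sample)
    print("used:   ", used)
    print("ignored:", ignored)
```

    If the first three entries handed out are dead, such a client never reaches the fourth, no matter how many more follow.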



  • @doktornotor:

    @sirdir:

    I didn't mean to insult anybody, but if 2 non-responding DNS servers out of 8 cause that kind of problem, that's not my understanding of redundancy.

    You configured 8 DNS servers? Well, that is pretty amazing, considering the GUI allows just 4. In case you did not and you got those 8 DNS servers assigned via DHCP, I'd like to remind you that some platforms (such as Linux/glibc) allow for only 3 nameservers. Might be something for your ISP to think about. Since if their first 3 DNS servers are useless, then no others will ever get used.

    Yes I'm talking about the ones assigned by DHCP. And as I said I have 4 providers, so every provider just gives me 2…


  • Banned

    Best course of action would be to stop using those DHCP-assigned DNS servers at all. If 25% of them fail at best, clearly those are useless. Either set up your own or use the public ones, such as Google public DNS, OpenDNS or whatever.



  • I'd agree with you in most cases, but just when you think you have gone and made something like DNS idiot proof, they go and invent a better idiot.  I wanted to see how badly I could shoot myself in the foot, so just to be stupid, I loaded 21 DNS servers on a VM.  (It won't be staying that way)




  • @doktornotor:

    Best course of action would be to stop using those DHCP-assigned DNS servers at all. If 25% of them fail at best, clearly those are useless. Either set up your own or use the public ones, such as Google public DNS, OpenDNS or whatever.

    Yes, that's what I did. Like I said, I wasn't even aware they were used in the first place.

    Maybe we could learn one thing from the whole story: maybe it's not really clear what happens when there are DNS servers configured in 'General setup' and provided by DHCP as well. Are the manually set ones used at all? Is only the assigned gateway used if the same server is provided via DHCP, etc…

    PS: Another thing. The gui says:
    When using multiple WAN connections there should be at least one unique DNS server per gateway.

    Now, you can only enter 4 nameservers in 'general setup'. Maybe that's why I subconsciously used the DHCP provided ones? I used to have 5 WAN links, so I wasn't able to provide one DNS server per gateway….


  • Rebel Alliance Global Moderator

    Yes, you would normally want to have at least 1 dns server per wan connection.. In case your other connection goes down, etc..  If that name server is only available via that connection.

    Here is the thing with ISP dns - they are normally only able to be queried from their NETWORK!!  So if you have multiple wan connections, which path are you taking to the name server's IP?  Since it's unlikely the name server is on the same segment the connection is on, you could be taking any of your other connections' paths to try and get to a specific IP - what is your default route, do you have specific routes set up for those dns IPs?

    So if you're having issues doing queries to ISP based dns - it's quite possible you're trying to hit them from a source IP that is not on their network.  And then yeah, they most likely will not answer you.
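    The routing question raised above can be checked on paper with a small sketch: given which WAN handed out which DNS server and a list of static routes, a longest-prefix match shows which WAN a query to each server would actually leave through. All names and addresses below are hypothetical (the addresses come from the documentation ranges), and the model ignores policy routing and gateway state.

```python
import ipaddress


def egress_for(dst, routes, default_gateway):
    """Longest-prefix match of dst against (prefix, gateway) static routes."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, gateway in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway)
    return best[1] if best else default_gateway


def misrouted(dns_by_wan, routes, default_gateway):
    """List (server, owning WAN, actual egress WAN) for servers leaving via the wrong WAN."""
    problems = []
    for wan, servers in dns_by_wan.items():
        for server in servers:
            actual = egress_for(server, routes, default_gateway)
            if actual != wan:
                problems.append((server, wan, actual))
    return problems


if __name__ == "__main__":
    # Hypothetical setup: two WANs, each ISP hands out one DNS server.
    dns_by_wan = {"WAN1": ["192.0.2.53"], "WAN2": ["198.51.100.53"]}
    print(misrouted(dns_by_wan, routes=[], default_gateway="WAN1"))
    print(misrouted(dns_by_wan, routes=[("198.51.100.53/32", "WAN2")],
                    default_gateway="WAN1"))
```

    Without the host route, the second ISP's server would be queried through WAN1 and, if that ISP only answers its own customers, stay silent. With the route in place the list of misrouted servers is empty.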

    Again - your lack of understanding does not mean a system is not robust ;)



  • @johnpoz:

    Again - your lack of understanding does not mean a system is not robust ;)

    Please, could you stop making a fool of yourself? I set up RIP, OSPF, EIGRP, static and, last but not least, BGP4 routing in the 90s, and I built an ISP we sold in the year 2000, so you can guess I know some things about routing. I'm even capable of distinguishing between 'not reachable' and 'no dns service running'.
    Anyway, even if my routing would be screwed up, having 2 DNS servers that are not reachable (never mind the reason) breaking pfsense couldn't be called robust, could it?

    No, don't answer, I already know the answer… My lack of understanding is responsible for every bug that has ever been in pfsense…


  • Banned

    Oh, I see… My DNS servers are unreachable -> pfsense suxxxx, it does not resolve. Makes a lot of sense. facepalm



  • I tend to prefer public servers. I've been testing the OpenNIC servers for a while to see how reliable they are.
    I usually give pfsense 4 geographically separated DNS servers not too far away and then point all the clients at pfsense only.
    I think we should all have about 3 double espressos and chat this some more ;D
    Maybe during a traffic jam on the way home…


  • Netgate

    If I were OP I would turn off the DNS forwarder in pfSense and set up a couple or three local, caching name servers (with no forwarders configured) and point my local clients at them.

    They would do recursion on behalf of the clients using whatever WAN links happen to be available at the time.  They would only be seeking answers from authoritative servers so the "local queries only" problem with multiple WANs would not exist.

    I would completely disregard the name servers the WAN links set.



  • @doktornotor:

    Oh, I see… My DNS servers are unreachable -> pfsense suxxxx, it does not resolve. Makes a lot of sense. facepalm

    Probably you had too many facepalms.
    What do you have several DNS for? Redundancy? So, if 2 out of 8 don't work, of course it's normal that name resolution doesn't work anymore?



  • @kejianshi:

    I tend to prefer public servers. I've been testing the OpenNIC servers for a while to see how reliable they are.
    I usually give pfsense 4 geographically separated DNS servers not too far away and then point all the clients at pfsense only.
    I think we should all have about 3 double espressos and chat this some more ;D
    Maybe during a traffic jam on the way home…

    My clients are pointing to pfsense, too (caching…). I still like to use the ISP nameservers whenever possible. Why? My internet connections aren't the fastest ones and no DNS server can be nearer than the ISP's own - possibly one with an overloaded upstream…



  • I am swilling coffee as we speak and also taking isoproterenol (an adrenaline antagonist).
    I'll be ready to share my feelings on DNS forwarder function in pfsense momentarily.

    As far as "fast", I agree that the local ones ping faster but once the local ones have proven unreliable, fast doesn't matter.
    I'd prefer reasonable ping time + reliability over speed.  Especially once I realized that when one of my WAN links drop that DNS server is just going to become a big speed bump in my internet.



  • @Derelict:

    If I were OP I would turn off the DNS forwarder in pfSense and set up a couple or three local, caching name servers (with no forwarders configured) and point my local clients at them.

    They would do recursion on behalf of the clients using whatever WAN links happen to be available at the time.  They would only be seeking answers from authoritative servers so the "local queries only" problem with multiple WANs would not exist.

    I would completely disregard the name servers the WAN links set.

    I do disregard them now. But don't you think your setup is somewhat overkill for a private household? 3 additional nameservers? Disabling the DHCP provided DNS servers already solved my problems, I think that's good enough for me. By the way, WAN links weren't the problem, there the failover works. And there's no 'local queries only' problem, the routes are correct. Of course, I don't know whether pfsense is smart enough not to query over a gateway that is marked down… But I guess so.
    Well I have one BIND running in my network already, of course I could use that one. On the other hand I have to reboot that machine from time to time…



  • @kejianshi:

    As far as "fast", I agree that the local ones ping faster but once the local ones have proven unreliable, fast doesn't matter.
    I'd prefer reasonable ping time + reliability over speed.  Especially once I realized that when one of my WAN links drop that DNS server is just going to become a big speed bump in my internet.

    Of course you're right. But in the last years the DNS never were a problem, the only problem was that 2 providers sent out 2 non working servers. The 'first' ones in the list always worked.



  • Well - Now that that's been solved…
    On to new challenges.




  • @kejianshi:

    Well - Now that that's been solved…
    On to new challenges.

    Well, maybe you wish to share your thoughts on the forwarder?



  • The forwarder has always worked well for me.  I did have one problem once but that was self inflicted.  My list of DNS servers were pretty much co-located servers, so when the path to one went down, they were all down.


  • Rebel Alliance Global Moderator

    "So, if 2 out of 8 don't work, of course it's normal that name resolution doesn't work anymore?"

    What part are you just not getting??  Who's the one making a fool out of themselves?

    This is NOT the case, unless as I asked at the start of the thread you are doing sequential.  Forwarder by default asks ALL your dns listed at the same time and uses the first one that answers.

    Does not matter as long as 1 answers in a reasonable amount of time..  Now if they answer nxdomain - like in my first example - then no, they won't resolve at your client..  Is this what is happening?  Don't know, because you couldn't be bothered to take 2 seconds and actually see what pfsense was or was not doing, and what you were or were not getting back from the dns servers you had listed to use.

    So you can see, look how pfsense asked all the nameservers I have listed in /etc/resolv.conf - I added more so you could see ones that don't answer

    [2.1-RC0][admin@pfsense.local.lan]/root(3): cat /etc/resolv.conf
    domain local.lan
    nameserver 127.0.0.1
    nameserver 64.81.159.2
    nameserver 129.250.35.250
    nameserver 75.75.75.75
    nameserver 1.1.1.1
    nameserver 2.2.2.2
    nameserver 3.3.3.3
    nameserver 4.4.4.4
    nameserver 5.5.5.5
    nameserver 6.6.6.6

    See how pfsense asked them all!  And 3 answered..  WAD!

    Now in the second example - I made sure I cleared my local client cache, and restarted the dns forwarder so nothing was cached on pfsense.  Notice how it asks ALL of them again, but 1 answers first..  That's the answer that gets used. Look at that one straggler, he answers but a bit later than the rest.  But what, 6 out of the 9 I have set did not answer at all.. But resolution still worked.. Fancy that, not bad for such a non robust setup ;)

    So what part of this do you just not get??

    edit: BTW on side note - notice that my local isp dns, 75.75 comcast did not answer first in the nxdomain query.  the x.ns.gin.ntt.net one did, you would think my local isp 1 should answer first ;)  Not always the case as already mentioned.






  • @johnpoz:

    "So, if 2 out of 8 don't work, of course it's normal that name resolution doesn't work anymore?"

    What part are you just not getting??  Who's the one making a fool out of themselves?

    This is NOT the case, unless as I asked at the start of the thread you are doing sequential.  Forwarder by default asks ALL your dns listed at the same time and uses the first one that answers.

    Which part are you not getting? That's not what has happened in my case! I know it should be like that, but it wasn't.
    And I TOLD you sequential is not active. And I also told you the failing servers didn't answer NXDOMAIN. There seems to be no DNS service active at all (and NO, I was querying via the correct gateway, thank you)

    So what part of this do you just not get??

    That it's not what happened in my case. Maybe something is handled differently when the servers are provided by DHCP?
    I don't know, I just know it didn't work as expected. Maybe you could just for one second imagine that what I'm describing actually happened instead of trying to make a fool of me.


  • Rebel Alliance Global Moderator

    I would love to see what you were seeing, why should I have to imagine it?

    For such a tech guy, what, you couldn't post a screenshot of your sniff of what pfsense was doing or not doing for dns?

    And no, I cannot imagine what you described because that is NOT how it works..  So all those snaps you switched to all had the bad code?  Come on dude, really?  A simple sniff would have shown everyone what was happening..

    I don't have to try anything - anyone that jumps to multiple snaps without basic troubleshooting already painted a very clear picture ;)


  • Netgate

    @sirdir:

    I do disregard them now. But don't you think your setup is somewhat an overkill for a private household?

    And 4 WAN links isn't?  Never occurred to me we were talking about a private home network.  Good luck.



  • @Derelict:

    @sirdir:

    I do disregard them now. But don't you think your setup is somewhat an overkill for a private household?

    And 4 WAN links isn't?  Never occurred to me we were talking about a private home network.  Good luck.

    Guess it is ;)
    It was even 5 but I suspended one (and will probably cancel it). It's difficult to explain. First I had ADSL which is slow and flaky, then I added a WIFI link, then Sat, then a better WIFI link and then another WIFI link that (because it's very cheap) should replace ADSL as a backup. I'll probably cancel the Sat link when the contract period is over…

    @Johnpoz: I just wanted to jump to the last known working version but I wasn't sure which one that was… so simple…
    When I did this I wasn't even aware that it's a DNS problem. First idea was that it's an ISP problem. As you might know, most websites load pics/ads/whatever from different servers, and when one of the lookups fails that may cause problems that don't directly point to dns problems.


  • Banned

    @sirdir:

    When I did this I wasn't even aware that it's a DNS problem. First idea was that it's an ISP problem.

    Broken DNS being served via DHCP by the ISP sure as hell is an ISP problem.



  • @sirdir:

    It was even 5 but I suspended one (and will probably cancel it). It's difficult to explain. First I had ADSL which is slow and flaky, then I added a WIFI link, then Sat, then a better WIFI link and then another WIFI link that (because it's very cheap) should replace ADSL as a backup.

    Makes me wonder…who operates the WiFi APs? Your neighbor, or your landlord, or some idiot who forgot to enable security on his AP...? :-)

    It might just be someone trying to perform an attack utilizing a fake DNS server (but obviously too incompetent to succeed).

    Well, I might just be paranoid. But that doesn't mean that conspiracy theories must be all wrong, right? Seen anything suspicious lately? UFOs? Elvis? Any droids which weren't the droids you were looking for? ;-)



  • As an X-Conspirator, I believe in some conspiracy theories…  No reptiles though...  That's just crazy talk  :P


  • Rebel Alliance Global Moderator

    "one of the lookup fails that may cause problems that don't directly point to dns problems."

    How is that?  That would be the first thing it would point to; if something doesn't load you would verify name resolution.  Once you verify name resolution, then you check connectivity.  Your name resolution problem may well be a connectivity issue.

    Some websites don't load, images not working - so try a different snap?? Come on dude seriously??



  • @doktornotor:

    @sirdir:

    When I did this I wasn't even aware that it's a DNS problem. First idea was that it's an ISP problem.

    Broken DNS being served via DHCP by the ISP sure as hell is an ISP problem.

    I agree. On the other hand, 2 bad ones out of 8 shouldn't be a problem (even 7 out of 8 shouldn't). But we're running in circles. Maybe I'll try to reproduce the problem some day. What's the best way to capture dns requests on pfsense? Seems to be possible within the gui as I saw in the other posting? For whatever reason the list of available packets doesn't load right now…



  • Nonsense!  Don't stop now.  I've just gotten my popcorn and soda :-[



  • @Klaws:

    Makes me wonder…who operates the WiFi APs? Your neighbor, or your landlord, or some idiot who forgot to enable security on his AP...? :-)

    In this area there are a lot of ISPs that provide their services with directed pt2pt WIFI links. The other side of my main link is more than 8 km away on a hill. There's no neighbor signal I could pick up ;)



  • @johnpoz:

    "one of the lookup fails that may cause problems that don't directly point to dns problems."

    How is that?  That would be the first thing it would point to; if something doesn't load you would verify name resolution.  Once you verify name resolution, then you check connectivity.  Your name resolution problem may well be a connectivity issue.

    Some websites don't load, images not working - so try a different snap?? Come on dude seriously??

    It was not the first thing I did… Listen, I already know you're a genius, OK? As I didn't have any DNS problems in the last years, when some pages didn't load correctly it wasn't the first thing to come to my mind. And unfortunately dig and nslookup behave quite differently from Safari. It wouldn't have been the first time my multi WAN setup was causing problems and it wouldn't have been the first time 'trying another snap' would resolve it. Heck, the last few builds even crashed Safari beta builds.