Netgate Discussion Forum

    Extremely slow DNS solved by disabling & re-enabling Python mode (unbound)

    DHCP and DNS
    19 Posts 5 Posters 1.2k Views
    • HLPPC (Galactic Empire) @bernieke

      @bernieke Could be a DNS rebinding attack. You could try capturing the traffic sent to and from unbound.

      https://youtu.be/y9-0lICNjOQ?si=61DIfH4BWwwGqGIK

      Says that pfSense can have issues

      A friend of mine says that some ISPs' poor Resource Public Key Infrastructure (RPKI) deployment means people are subject to replay attacks and DNS rebinding attacks, and also to random interference if someone else out there has the same IPv4 address as you.

      Also, maybe try running the tests on internet.nl to see if your DNS is encrypting correctly.

      https://youtu.be/YKxKnVE5FaE?si=gCvUZ9IOFOVISHf4

      Python is weird sometimes too. I do like the CINS Army blocklists on a WAN.

      • HLPPC (Galactic Empire) @bernieke

        @bernieke

        Maybe the ISP wants your attention, or wants you to do actual routing with BGP or something. They sometimes downgrade DNS.

        https://youtu.be/jXG8fuJ-fUI?si=zTHyKmZNEy5vlDAC

        If you are blocking DNS, sometimes the same DNS requests will repeat and go all the way around the world until successful.

        pfBlocker has sent me to Qatar for Google ads while blocking Google but using Google DNS. Plausible too, since jitter and whatnot with fq_codel is like zero. You can encrypt DNS extensions and use DNS encryption curves and whatnot on some sides of the planet more bearably than others.

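        DNSKEY algorithms (number, algorithm, signing, validation):
        1  RSA/MD5    Must Not Implement  Must Not Implement
        3  DSA/SHA-1  Must Not Implement  Must Not

        DS digest types (number, digest, source, delegation, validation):
        2  SHA-256    RFC 4509  Required  Required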

        Also, some DNS encryption allows for DNSSEC and others forbid it. Fiber may employ TR-069, which uses MD5 somewhere, and that isn't supposed to show up in DNS encryption tests, according to a friend who wrote the internet.nl website.

        Their dnssec resolver algorithm test is pretty good.

        https://en.m.wikipedia.org/wiki/Domain_Name_System_Security_Extensions

        Might be some Big-O stuff I didn't go to school for 😂

        • Gertjan @HLPPC

          @HLPPC said in Extremely slow DNS solved by disabling & re-enabling Python mode (unbound):

          according to a friend who wrote the internet.nl website.

          When you see him next time, show him this :

          [image: 2811bf77-8759-4fea-8b68-e0c5f7803ed8-image.png]

          compared to :

          [image: bd005442-200a-4175-a6d9-7598a1e484c7-image.png]

          Mails I send to Gmail (for example) confirm that my domain name and my mail server support DANE.
          This makes me doubt the "internet.nl" DANE test.

          @HLPPC said in Extremely slow DNS solved by disabling & re-enabling Python mode (unbound):

          Their dnssec resolver algorithm test is pretty good.

          Those who implement DNSSEC will probably use :
          https://dnsviz.net/ example : https://dnsviz.net/d/test-domaine.fr/dnssec/
          https://dnssec-analyzer.verisignlabs.com/

          as these sites will indicate what is wrong if things are wrong.

          @HLPPC said in Extremely slow DNS solved by disabling & re-enabling Python mode (unbound):

          Also, some dns encryption allows for dnssec and others forbid it

          DNSSEC isn't about encryption ( ≈ making your DNS requests invisible to others ).
          Resolvers have to go without TLS, as the DNS root servers don't support secured DNS (DNS over TLS) yet. And IMHO, it will take some time before they do: if all DNS had to go over TLS, the capacity of each DNS server, root and TLD, would need to grow "a thousandfold", as TLS needs way more system resources than small plain-text UDP packets.

          The (your) wiki page :

          DNSSEC does not provide confidentiality of data; in particular, all DNSSEC responses are authenticated but not encrypted.

          DNSSEC is only about getting the correct answer.

          No "help me" PM's please. Use the forum, the community will thank you.
          Edit : and where are the logs ??

          • bernieke

            The past two days without pfblockerng worked perfectly fine. So I think we can conclude the problem isn't with DNS itself.

            As soon as I enable pfblockerng + perform an update, I get the slow resolving. (I didn't realize the update was necessary when debugging this on Sunday, which is why the problem took a while to reappear, it only came after the next update on the hour.)

            It's not just related to the update itself either, the problem doesn't go away afterwards, even after multiple hours. So simply changing the schedule to daily / weekly won't help me either.

            There was mention of unbound restarts, but looking at the logs and the start time in a "ps aux|grep unbound" this doesn't seem to be the case for me, both indicate the time of my last change.

            Disabling DNSBL, and updating again, fixes the problem as well. So clearly the problem is specifically with DNSBL.

            I've now been trying to see if it's any specific group of feeds that's causing the problem.

            The first step was enabling DNSBL with all groups disabled. This was stable.

            I then tried enabling groups one by one. But what I'm seeing is that the problem starts occurring at some point (not immediately after the update though, so it's unclear as of now which (if any) group might be the cause), but simply disabling groups and updating won't make it go away afterwards either (even if I disable all the groups), until I actually disable DNSBL altogether! So it seems something is getting stuck on the unbound side? Maybe the cache which I see mention of being restored after the unbound restart?

            I'll continue trying to play with enabling groups of feeds to see if I can further pinpoint the problem (but this time with DNSBL disables / enables in between the changes). Although I'm starting to think it's not with any specific feed, but rather something (cache?) which keeps being added to instead of replacing / updating...

            EDIT: It just happened with all groups disabled as well.

            • HLPPC (Galactic Empire) @bernieke

              The ambient temperature with my ISP's equipment has to be like, 180 degrees F (82 C) right now. I wonder if that matters with dns and weird retransmissions.

              • Gertjan @bernieke

                @bernieke said in Extremely slow DNS solved by disabling & re-enabling Python mode (unbound):

                There was mention of unbound restarts, but looking at the logs and the start time in a "ps aux|grep unbound" this doesn't seem to be the case for me, both indicate the time of my last change.

                Use this to get start (and thus stop) moments :

                grep 'start' /var/log/resolver.log
                

                Your stats :

                @bernieke said in Extremely slow DNS solved by disabling & re-enabling Python mode (unbound):

                Jun 16 10:09:15 cerberus unbound[35808]: [35808:0] info: average recursion processing time 12.517736 sec

                @bernieke said in Extremely slow DNS solved by disabling & re-enabling Python mode (unbound):

                Jun 16 10:09:15 cerberus unbound[35808]: [35808:0] info: average recursion processing time 10.398178 sec

                These two are the times it takes for a resolve process : get in contact with a root server (maybe already cached), then a TLD server (maybe already cached), and then a domain name server for, as an example, an A record.
                These times are huge !

                Just to be sure : these two, do they become way faster when 'pfBlockerng' is not running ?
                Or with 'pfBlockerng' but no 'DNSBL' ?
                Btw : python mode was introduced so 'plugins' or 'packages' or 'add-ons' could be written for unbound. Doing 'DNSBL' is just one of the possibilities = comparing the requested host name with a big list.
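
To make "comparing the requested host name with a big list" a bit more concrete, here is a minimal Python sketch. It is not pfBlockerNG's actual module: the function name `is_blocked` and the sample domains are made up for illustration, and it assumes the list fits in a set and that a listed domain also covers its subdomains.

```python
def is_blocked(qname, blocklist):
    """Return True if qname, or any parent domain of it, is in blocklist."""
    # Normalise the name: lower-case it and drop the trailing root dot.
    labels = qname.lower().rstrip(".").split(".")
    # For "a.b.example.com" this checks "a.b.example.com",
    # "b.example.com", "example.com" and "com", in that order.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in blocklist:
            return True
    return False


# Hypothetical entries, for illustration only.
sample_blocklist = {"ads.example.com", "tracker.example.net"}
print(is_blocked("cdn.ads.example.com", sample_blocklist))
```

With a set, each query costs only a handful of membership tests, so in a design like this the size of the list should matter far less than how often the check runs.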

                Typically, I see :

                total.recursion.time.avg=0.128068
                

                Which is still a whopping 128 msec

                How big is your main DNSBL list ?

                [image: b0648c60-d29c-48e3-b40e-7a15837f1537-image.png]

                The list is stored in /var/unbound/pfb_py_hsts.txt ( I guess ).

                Btw : you can see stats all the time with :

                unbound-control -c /var/unbound/unbound.conf stats
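
If you want to keep an eye on just a couple of numbers from that output, something like the following Python sketch could help. It assumes the output is plain `key=value` lines; the helper names `parse_unbound_stats` and `summarize` are made up here, and the query counts in the sample are invented (only the 0.128068 average comes from this post).

```python
def parse_unbound_stats(text):
    """Parse 'key=value' lines into a dict of floats."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue  # skip blank or malformed lines
        try:
            stats[key] = float(value)
        except ValueError:
            pass  # ignore values that are not numeric
    return stats


def summarize(stats):
    """Return (cache hit ratio, average recursion time in seconds)."""
    queries = stats.get("total.num.queries", 0.0)
    hits = stats.get("total.num.cachehits", 0.0)
    ratio = hits / queries if queries else 0.0
    return ratio, stats.get("total.recursion.time.avg", 0.0)


# Invented counts; the recursion average is the value quoted above.
sample = """
total.num.queries=1000
total.num.cachehits=900
total.recursion.time.avg=0.128068
"""
ratio, avg = summarize(parse_unbound_stats(sample))
print(f"cache hit ratio {ratio:.1%}, avg recursion {avg * 1000:.0f} ms")
```

Save the command's output to a file and pass its contents to `parse_unbound_stats` to compare a good and a bad run side by side.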
                

                • bernieke @Gertjan

                  @Gertjan said in Extremely slow DNS solved by disabling & re-enabling Python mode (unbound):

                  Use this to get start (and thus stop) moments :

                  grep 'start' /var/log/resolver.log
                  

                  No starts except for the ones I initiated.

                  Your stats :
                  These times are huge !

                  Just to be sure : these two, do they become way faster when there is not 'pfBlockerng' ?
                  or 'pfBlockerng' but no 'DNSBL' ?

                  This is with pfblockerng enabled and DNSBL disabled:

                  thread0.recursion.time.avg=0.031121
                  thread0.recursion.time.median=0.0216847
                  thread1.recursion.time.avg=0.027506
                  thread1.recursion.time.median=0.015819
                  total.recursion.time.avg=0.029363
                  total.recursion.time.median=0.0187519
                  

                  How big is your main DNSBL list ?

                  It happens even with all the groups disabled, just having DNSBL enabled is sufficient (or, I guess, more correctly: having the unbound python module enabled?):

                   UPDATE PROCESS START [ v3.2.0_10 ] [ 06/19/24 08:38:02 ]
                  
                  ===[  DNSBL Process  ]================================================
                  
                   Loading DNSBL Statistics... completed
                   Missing DNSBL stats and/or Unbound DNSBL files - Rebuilding
                  
                   Loading DNSBL SafeSearch... enabled
                   Loading DNSBL Whitelist... completed
                   Loading TOP1M Whitelist... completed
                  
                  Clearing all DNSBL Feeds
                  Added DNSBL Unbound python integration settings
                  Adding DNSBL Unbound python mounts:
                    Creating: /var/unbound/usr/local/bin
                    Mounting: /usr/local/bin
                    Creating: /var/unbound/usr/local/lib
                    Mounting: /usr/local/lib
                  
                  DNS Resolver ( enabled ) unbound.conf modifications:
                    Added DNSBL Unbound Python mode
                    Added DNSBL Unbound Python mode script
                  
                  VIP address(es) configured
                  Restarting DNSBL Service
                  
                  TLD Analysis not required.
                  Stopping Unbound Resolver
                  Unbound stopped in 1 sec.
                  Additional mounts (DNSBL python):
                     Mounting: /dev
                  Starting Unbound Resolver... completed [ 06/19/24 08:38:22 ]
                  Restarting DNSBL Service (DNSBL python)
                  DNSBL update [ 0 | PASSED  ]... completed
                  ------------------------------------------------------------------------
                  
                  ===[  GeoIP Process  ]============================================
                  
                  
                  ===[  IPv4 Process  ]=================================================
                  
                  [ Amazon_AWS_v4 ]		 exists.
                  [ Atlassian_v4 ]		 exists.
                  [ Cloudflare_v4 ]		 exists.
                  [ GitHub_v4 ]			 exists.
                  [ Google_v4 ]			 exists.
                  [ Office_365_v4 ]		 exists.
                  [ Zendesk_v4 ]			 exists.
                  [ Abuse_Feodo_C2_v4 ]		 exists.
                  [ Abuse_SSLBL_v4 ]		 exists.
                  [ CINS_army_v4 ]		 exists.
                  [ ET_Block_v4 ]			 exists.
                  [ ET_Comp_v4 ]			 exists.
                  [ ISC_Block_v4 ]		 exists.
                  [ Spamhaus_Drop_v4 ]		 exists.
                  [ Spamhaus_eDrop_v4 ]		 exists.
                  [ Talos_BL_v4 ]			 exists.
                  [ MS_1_v4 ]			 exists.
                  
                  ===[  IPv6 Process  ]=================================================
                  
                  [ Amazon_AWS_v6 ]		 exists.
                  [ Atlassian_v6 ]		 exists.
                  [ Cloudflare_v6 ]		 exists.
                  [ GitHub_v6 ]			 exists.
                  [ Google_v6 ]			 exists.
                  [ Office_365_v6 ]		 exists.
                  
                  ===[  Aliastables / Rules  ]==========================================
                  
                  No changes to Firewall rules, skipping Filter Reload
                  No Changes to Aliases, Skipping pfctl Update
                  
                  
                  ** Restarting firewall filter daemon **
                  
                   UPDATE PROCESS ENDED [ 06/19/24 08:38:32 ]
                  

                  I just now tested this again, and I got timeout errors five minutes after enabling DNSBL without any feeds.

                  These are the unbound stats when the problem is occurring (note that getting these stats took a few seconds as well, while with DNSBL disabled they are immediate):

                  thread0.num.queries=86937
                  thread0.num.queries_ip_ratelimited=0
                  thread0.num.queries_cookie_valid=0
                  thread0.num.queries_cookie_client=0
                  thread0.num.queries_cookie_invalid=0
                  thread0.num.cachehits=83450
                  thread0.num.cachemiss=3487
                  thread0.num.prefetch=0
                  thread0.num.queries_timed_out=0
                  thread0.query.queue_time_us.max=0
                  thread0.num.expired=0
                  thread0.num.recursivereplies=3485
                  thread0.num.dnscrypt.crypted=0
                  thread0.num.dnscrypt.cert=0
                  thread0.num.dnscrypt.cleartext=0
                  thread0.num.dnscrypt.malformed=0
                  thread0.requestlist.avg=7.79036
                  thread0.requestlist.max=27
                  thread0.requestlist.overwritten=0
                  thread0.requestlist.exceeded=0
                  thread0.requestlist.current.all=2
                  thread0.requestlist.current.user=2
                  thread0.recursion.time.avg=10.313877
                  thread0.recursion.time.median=9.76303
                  thread0.tcpusage=0
                  thread1.num.queries=84599
                  thread1.num.queries_ip_ratelimited=0
                  thread1.num.queries_cookie_valid=0
                  thread1.num.queries_cookie_client=0
                  thread1.num.queries_cookie_invalid=0
                  thread1.num.cachehits=81243
                  thread1.num.cachemiss=3356
                  thread1.num.prefetch=0
                  thread1.num.queries_timed_out=0
                  thread1.query.queue_time_us.max=0
                  thread1.num.expired=0
                  thread1.num.recursivereplies=3355
                  thread1.num.dnscrypt.crypted=0
                  thread1.num.dnscrypt.cert=0
                  thread1.num.dnscrypt.cleartext=0
                  thread1.num.dnscrypt.malformed=0
                  thread1.requestlist.avg=8.15435
                  thread1.requestlist.max=22
                  thread1.requestlist.overwritten=0
                  thread1.requestlist.exceeded=0
                  thread1.requestlist.current.all=1
                  thread1.requestlist.current.user=1
                  thread1.recursion.time.avg=11.145005
                  thread1.recursion.time.median=10.6155
                  thread1.tcpusage=0
                  total.num.queries=171536
                  total.num.queries_ip_ratelimited=0
                  total.num.queries_cookie_valid=0
                  total.num.queries_cookie_client=0
                  total.num.queries_cookie_invalid=0
                  total.num.cachehits=164693
                  total.num.cachemiss=6843
                  total.num.prefetch=0
                  total.num.queries_timed_out=0
                  total.query.queue_time_us.max=0
                  total.num.expired=0
                  total.num.recursivereplies=6840
                  total.num.dnscrypt.crypted=0
                  total.num.dnscrypt.cert=0
                  total.num.dnscrypt.cleartext=0
                  total.num.dnscrypt.malformed=0
                  total.requestlist.avg=7.96887
                  total.requestlist.max=27
                  total.requestlist.overwritten=0
                  total.requestlist.exceeded=0
                  total.requestlist.current.all=3
                  total.requestlist.current.user=3
                  total.recursion.time.avg=10.721543
                  total.recursion.time.median=10.1893
                  total.tcpusage=0
                  time.now=1718782293.781636
                  time.up=3192.962215
                  time.elapsed=3192.962215
                  mem.cache.rrset=292578
                  mem.cache.message=309466
                  mem.mod.iterator=16716
                  mem.mod.validator=0
                  mem.mod.respip=0
                  mem.cache.dnscrypt_shared_secret=0
                  mem.cache.dnscrypt_nonce=0
                  mem.mod.dynlibmod=0
                  mem.streamwait=0
                  mem.http.query_buffer=0
                  mem.http.response_buffer=0
                  histogram.000000.000000.to.000000.000001=41
                  histogram.000000.000001.to.000000.000002=0
                  histogram.000000.000002.to.000000.000004=0
                  histogram.000000.000004.to.000000.000008=0
                  histogram.000000.000008.to.000000.000016=0
                  histogram.000000.000016.to.000000.000032=0
                  histogram.000000.000032.to.000000.000064=0
                  histogram.000000.000064.to.000000.000128=0
                  histogram.000000.000128.to.000000.000256=0
                  histogram.000000.000256.to.000000.000512=0
                  histogram.000000.000512.to.000000.001024=0
                  histogram.000000.001024.to.000000.002048=0
                  histogram.000000.002048.to.000000.004096=0
                  histogram.000000.004096.to.000000.008192=2
                  histogram.000000.008192.to.000000.016384=79
                  histogram.000000.016384.to.000000.032768=57
                  histogram.000000.032768.to.000000.065536=50
                  histogram.000000.065536.to.000000.131072=34
                  histogram.000000.131072.to.000000.262144=5
                  histogram.000000.262144.to.000000.524288=75
                  histogram.000000.524288.to.000001.000000=92
                  histogram.000001.000000.to.000002.000000=252
                  histogram.000002.000000.to.000004.000000=676
                  histogram.000004.000000.to.000008.000000=1229
                  histogram.000008.000000.to.000016.000000=3014
                  histogram.000016.000000.to.000032.000000=1054
                  histogram.000032.000000.to.000064.000000=177
                  histogram.000064.000000.to.000128.000000=3
                  histogram.000128.000000.to.000256.000000=0
                  histogram.000256.000000.to.000512.000000=0
                  histogram.000512.000000.to.001024.000000=0
                  histogram.001024.000000.to.002048.000000=0
                  histogram.002048.000000.to.004096.000000=0
                  histogram.004096.000000.to.008192.000000=0
                  histogram.008192.000000.to.016384.000000=0
                  histogram.016384.000000.to.032768.000000=0
                  histogram.032768.000000.to.065536.000000=0
                  histogram.065536.000000.to.131072.000000=0
                  histogram.131072.000000.to.262144.000000=0
                  histogram.262144.000000.to.524288.000000=0
                  num.query.type.A=9419
                  num.query.type.PTR=8
                  num.query.type.AAAA=161889
                  num.query.type.SRV=28
                  num.query.type.HTTPS=192
                  num.query.class.IN=171536
                  num.query.opcode.QUERY=171536
                  num.query.tcp=136
                  num.query.tcpout=0
                  num.query.udpout=4380
                  num.query.tls=0
                  num.query.tls.resume=0
                  num.query.ipv6=0
                  num.query.https=0
                  num.query.flags.QR=0
                  num.query.flags.AA=0
                  num.query.flags.TC=0
                  num.query.flags.RD=171536
                  num.query.flags.RA=0
                  num.query.flags.Z=0
                  num.query.flags.AD=25
                  num.query.flags.CD=0
                  num.query.edns.present=161685
                  num.query.edns.DO=0
                  num.answer.rcode.NOERROR=82178
                  num.answer.rcode.FORMERR=0
                  num.answer.rcode.SERVFAIL=0
                  num.answer.rcode.NXDOMAIN=89355
                  num.answer.rcode.NOTIMPL=0
                  num.answer.rcode.REFUSED=0
                  num.answer.rcode.nodata=72787
                  num.query.ratelimited=0
                  num.answer.secure=0
                  num.answer.bogus=0
                  num.rrset.bogus=0
                  num.query.aggressive.NOERROR=0
                  num.query.aggressive.NXDOMAIN=0
                  unwanted.queries=0
                  unwanted.replies=0
                  msg.cache.count=1096
                  rrset.cache.count=977
                  infra.cache.count=2
                  key.cache.count=0
                  msg.cache.max_collisions=4
                  rrset.cache.max_collisions=3
                  dnscrypt_shared_secret.cache.count=0
                  dnscrypt_nonce.cache.count=0
                  num.query.dnscrypt.shared_secret.cachemiss=0
                  num.query.dnscrypt.replay=0
                  num.query.authzone.up=0
                  num.query.authzone.down=0
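
The histogram lines say more than the averages do. Adding them up by hand: the buckets total 6840 replies (matching total.num.recursivereplies), and 6405 of those, roughly 94%, sit in buckets starting at 1 second or above. A minimal Python sketch of that sum follows, assuming the bucket names follow the `histogram.<sec>.<usec>.to.<sec>.<usec>` pattern seen above; the function name `slow_fraction` is made up for this example.

```python
def slow_fraction(stats_text, threshold_s=1.0):
    """Fraction of replies in buckets whose lower bound is >= threshold_s."""
    total = slow = 0
    for line in stats_text.splitlines():
        line = line.strip()
        if not line.startswith("histogram."):
            continue
        name, _, count = line.partition("=")
        parts = name.split(".")
        # parts: ["histogram", sec, usec, "to", sec, usec]
        lower = int(parts[1]) + int(parts[2]) / 1_000_000
        n = int(count)
        total += n
        if lower >= threshold_s:
            slow += n
    return slow / total if total else 0.0


# Three buckets copied from the stats above.
sample = """
histogram.000000.524288.to.000001.000000=92
histogram.000001.000000.to.000002.000000=252
histogram.000008.000000.to.000016.000000=3014
"""
print(f"{slow_fraction(sample):.1%} of these replies took 1 s or longer")
```

Feeding it the whole stats dump instead of the three sample lines gives the share across all buckets.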
                  
                  • Gertjan @bernieke

                    @bernieke

                    Ok.
                    Hummmm.
                    That info doesn't look suspicious at all.

                    How do you use your IP feeds ? Are they used so that you, pfSense, and your LAN devices can't connect to them ?
                    I mean, if these lists also contain DNS domain name servers, then that's a shoot-yourself-in-the-foot situation.

                    So, what about the other way around :
                    Use some DNSBL lists,
                    but no IP lists (or have these list block for inbound only, not outbound, which will make them useless, as per default WAN rules, all inbound is already blocked anyway).

                    • bernieke @Gertjan

                      @Gertjan This was all working fine up to Saturday evening.

                      I'm not sure what you're asking concerning the IP feeds, this is what I have configured for them:
                      [image: 6bb3789e-e84c-4043-97b9-d5ed0062c548-image.png]
                      I think it's a pretty standard setup.

                      I've just tried the following:

                      • enabled DNSBL in unbound (non-python) mode, with full reload: works fine
                      • changed to python mode, with full reload: immediately have the problem again (30+% cpu usage of unbound, very slow recursion time with lots of timeouts)
                      • changed back to non-python mode, with full reload: works fine (and again only 3% unbound cpu usage)

                      So it seems the issue is pretty clearly with the python module? Unless you think the IP feeds could have a different effect in the unbound vs unbound-python module?

                      The only thing I'm seeing with the non-python mode is an unbound-control 100% cpu spike every few seconds, but that's not hurting my dns resolution in any way:

                      total.recursion.time.avg=0.031341
                      

                      I switched to python mode a few weeks back since I needed to exclude our chromecast from the filtering, as some apps from the local broadcast networks wouldn't work otherwise.

                      I think I'm going to have a look at pi-hole or adguard home, and kick out pfblockerng altogether...

                      GertjanG 1 Reply Last reply Reply Quote 0
                      • GertjanG
                        Gertjan @bernieke
                        last edited by Gertjan

                        @bernieke said in Extremely slow DNS solved by disabling & re-enabling Python mode (unbound):

                        I'm not sure what you're asking concerning the IP feeds,

                        How you use your listed IP feeds.
                        If some of them are outbound blocking
                        and
                        the list contains DNS servers you use
                        then ... well, that's problematic.
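
One way to check this is to test the addresses of the DNS servers unbound talks to against the networks in an outbound-blocking feed. A minimal Python sketch using the standard `ipaddress` module follows; the function name `blocked_resolvers` is made up, and the addresses are documentation-range placeholders, not real feed contents.

```python
import ipaddress


def blocked_resolvers(resolver_ips, feed_cidrs):
    """Map each resolver IP to the feed networks that contain it."""
    # strict=False accepts entries with host bits set; a bare IP becomes a /32 or /128.
    networks = [ipaddress.ip_network(c, strict=False) for c in feed_cidrs]
    hits = {}
    for ip_text in resolver_ips:
        ip = ipaddress.ip_address(ip_text)
        # Only compare addresses and networks of the same IP version.
        matches = [str(n) for n in networks
                   if ip.version == n.version and ip in n]
        if matches:
            hits[ip_text] = matches
    return hits


# Documentation-range addresses (RFC 5737), for illustration only.
feed = ["192.0.2.0/24", "198.51.100.17"]
resolvers = ["192.0.2.53", "203.0.113.1"]
print(blocked_resolvers(resolvers, feed))
```

An empty result means none of the listed servers is covered by that feed. It says nothing about feeds you did not pass in.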

                        @bernieke said in Extremely slow DNS solved by disabling & re-enabling Python mode (unbound):

                        So it seems the issue is pretty clearly with the python module?

                        I get what you mean.
                        But the thing is : not every pfSense user (a million or so ?) has pfBlockerng installed.
                        But some, a couple of hundred thousand ( ? ), do use this pfSense package, I'm just one of them.
                        We all have this single file in common. That's the "python module" file, everybody is using that file. No other files are involved.
                        => compare that file with yours.
                        Of course, there could be a bug. A bug you hit, and "no one" else... and that's why it's way more plausible that it's a config (local setup) issue.
                        The question is : which one.

                        Btw : I'm using a 4100 with pfSense Plus 24.03, like you.
                        Only difference : you have an ARM, I have an x86.

                        • HLPPC (Galactic Empire) @Gertjan

                          Maybe unbound and pfBlocker should be set up before your cloud software and everything else. Immutable data types were just mentioned on YouTube. I had to have a crack at threads misbehaving. You do have cache misses. Also, you only have 4 gigs of memory.

                          Last time I ran wildcards in pfBlocker with it maxed out (and with Suricata maxed out), I ended up chewing through over 32 gigs of RAM and halfway into swap. With or without the Python part, and with or without Suricata, it still took loads of memory.

                          Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.