Netgate Discussion Forum

Help in understanding Unbound's host cache limit

DHCP and DNS
9 Posts 3 Posters 892 Views
  • C
    chickendog
    last edited by Sep 17, 2024, 11:56 AM

    I tried my best to Google this but haven't been able to get a clear answer.

    I've been watching the msg.cache.count metric via this command:

    unbound-control -c /var/unbound/unbound.conf stats_noreset | egrep 'total.num|cache.count'
    

    I have host cache num-hosts set to the default of 10000 but the msg.cache.count is exceeding this number. Should Unbound not be evicting records once 10k is reached?

    The output of the command is:

    total.num.queries=200478
    total.num.queries_ip_ratelimited=0
    total.num.queries_cookie_valid=0
    total.num.queries_cookie_client=0
    total.num.queries_cookie_invalid=0
    total.num.cachehits=181155
    total.num.cachemiss=19323
    total.num.prefetch=82558
    total.num.queries_timed_out=0
    total.num.expired=82558
    total.num.recursivereplies=18665
    total.num.dnscrypt.crypted=0
    total.num.dnscrypt.cert=0
    total.num.dnscrypt.cleartext=0
    total.num.dnscrypt.malformed=0
    msg.cache.count=13625
    rrset.cache.count=9984
    infra.cache.count=2
    key.cache.count=0
    dnscrypt_shared_secret.cache.count=0
    dnscrypt_nonce.cache.count=0
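
The hit ratio can be read straight off these counters. Below is a minimal Python sketch, assuming the stats output has been captured as plain `name=value` text. `parse_stats` and `cache_hit_ratio` are hypothetical helper names (not part of Unbound), and the sample holds only a subset of the lines above.

```python
def parse_stats(text):
    """Parse `name=value` lines (unbound-control stats style) into a dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition("=")
        if sep:  # skip blank lines and anything without '='
            stats[name] = float(value)
    return stats


def cache_hit_ratio(stats):
    """Fraction of all queries that were answered from the cache."""
    return stats["total.num.cachehits"] / stats["total.num.queries"]


# Subset of the counters posted above.
sample = """
total.num.queries=200478
total.num.cachehits=181155
total.num.cachemiss=19323
msg.cache.count=13625
rrset.cache.count=9984
"""

stats = parse_stats(sample)
print(f"hit ratio: {cache_hit_ratio(stats):.1%}")
```

With the numbers posted here, hits plus misses add up to the query total, and roughly nine out of ten queries were served from cache.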
    

    I recently turned on serve-expired and set serve-expired-ttl: 86400, but I'm not sure that matters in this case, as the cache limit should still be in effect?

    • J
      johnpoz LAYER 8 Global Moderator @chickendog
      posted Sep 17, 2024, 1:27 PM, last edited by johnpoz Sep 17, 2024, 1:38 PM

      @chickendog said in Help in understanding Unbound's host cache limit:

      rrset.cache.count=9984

      Pretty sure that is your host cache count. Msg count would be more validation results and rcodes, etc., i.e. the headers, so to speak, from a query.

      I take it you're forwarding and not resolving, because your infra count is super low.

      As to serve zero counting against your host cache - hmmm, never looked into that. Just an off-the-cuff guess: even if the TTL is zero, the record would still be in the cache, so I would think it would be purged as you hit your limit. You could set your host cache to something super low and do some testing, but 10k hosts cached is a lot of records ;)

      If you're concerned, bump it up.

      An intelligent man is sometimes forced to be drunk to spend time with his fools
      If you get confused: Listen to the Music Play
      Please don't Chat/PM me for help, unless mod related
      SG-4860 24.11 | Lab VMs 2.7.2, 24.11

      • C
        chickendog @johnpoz
        last edited by Sep 17, 2024, 11:53 PM

        @johnpoz I see, thanks for enlightening me.

        Yep I am forwarding.

        10k is indeed a lot but that's just over one day in my environment with serving expired turned on. I ended up restarting unbound for something else so I will monitor the rrset.cache.count and see what it does.

        I think I'll leave it at 10k for now. I want a fairly up-to-date cache where records for frequently used domains stay fresh but stale records don't linger for more than a day. Serve-expired in tandem with serve-expired-ttl seems to do what I want.

        Prefetch doesn't work that well because you need to hit the record within the last 10% of its TTL. For low-TTL records this isn't that good: devices may request a given record frequently for some time, but if the TTL is 5 min and the domain then goes unused for 5 min, the record expires and is removed from the cache. Many domains have low TTLs, so this happens a lot.

        Take a scenario where a device is doing X on a website at 9am, then stops, comes back at 11am. All records are expired and gone so new ones must be acquired.
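
The scenario above can be put into a small Python sketch. It assumes prefetch only fires when a query arrives during the last tenth of the original TTL, that serve-expired is off, and that nothing is evicted early. `prefetch_outcome` is a hypothetical name for illustration, not an Unbound function.

```python
def prefetch_outcome(original_ttl, age):
    """Classify a query arriving `age` seconds after the record was cached.

    Assumption: the prefetch window is the last 10% of the original TTL,
    and an expired record is simply gone (no serve-expired).
    """
    prefetch_after = original_ttl - original_ttl // 10
    if age >= original_ttl:
        return "expired, full lookup needed"
    if age >= prefetch_after:
        return "hit, prefetch triggered"
    return "hit, no prefetch"


ttl = 300  # a 5 minute TTL
print(prefetch_outcome(ttl, 100))       # queried early in the TTL
print(prefetch_outcome(ttl, 280))       # queried in the last 10%
print(prefetch_outcome(ttl, 2 * 3600))  # 9am record queried again at 11am
```

Only the middle case keeps the record alive. A name that is not requested during that short window simply expires.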

        I know what you're thinking: the other thing I could do is set the minimum TTL higher, say one day. BUT the downside is that, assuming these domain owners set the TTL low for a reason, my record would be even more stale than with serve-expired, where the record is served and then refreshed then and there. So it's kind of the best of both worlds.

        Anyway I have gone off on a tangent and yes I am probably over-engineering it but these settings are there for a reason ;)

        Thanks again @johnpoz

        • J
          johnpoz LAYER 8 Global Moderator @chickendog
          last edited by Sep 18, 2024, 12:04 AM

          @chickendog said in Help in understanding Unbound's host cache limit:

          need to hit the record within 10% of the TTL.

          I don't think that is the way it works. It's worded a bit funny, where they say it could increase your DNS traffic by 10%.

          total.num.prefetch=7349

          is what I show for the number of times unbound has prefetched.

          Not a fan of very low TTLs - it's pointless. Unless you were about to change your record to point to a new IP, having TTLs in the 30 and 60 second range does nothing but increase DNS traffic - for what point, other than them wanting to figure out how long maybe you've been on some site?

          I set my min TTL to 3600 seconds. Yeah, it's not good practice to mess with TTLs set by the owners, but it's stupid to have to do a query every 30 freaking seconds for something. Defeats the whole point of a "cache" if you ask me.

          I serve zero, prefetch and min ttl 3600.. And I have yet to run into any sort of issue that I am aware of or have noticed anything odd, etc.
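
To put rough numbers on the minimum-TTL point, here is a minimal Python sketch. It assumes a name that clients request constantly, one upstream lookup per effective TTL, and no prefetch or serve-expired in play. `upstream_lookups_per_hour` is a hypothetical helper; its `cache_min_ttl` parameter just mirrors the idea of a minimum TTL setting.

```python
def upstream_lookups_per_hour(record_ttl, cache_min_ttl=0):
    """Upstream lookups per hour for one constantly requested name.

    Assumption: the cache holds the record for max(record_ttl, cache_min_ttl)
    seconds and does exactly one upstream lookup each time it expires.
    """
    effective_ttl = max(1, record_ttl, cache_min_ttl)
    return 3600 / effective_ttl


print(upstream_lookups_per_hour(30))        # 30 second TTL, honoured as-is
print(upstream_lookups_per_hour(30, 3600))  # same record, min TTL of 3600
```

Under these assumptions a 30 second TTL costs 120 upstream lookups an hour for that one name, while a 3600 second floor brings it down to one, at the price of serving data that can be up to an hour old.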

          • C
            chickendog @johnpoz
            posted Sep 18, 2024, 12:17 AM, last edited by chickendog Sep 18, 2024, 5:47 AM

            @johnpoz I read that this is how it works in the Unbound docs here: https://unbound.docs.nlnetlabs.nl/en/latest/topics/core/serve-stale.html#serve-expired

            total.num.prefetch has its count incremented just from having serve-expired turned on. I don't have prefetch turned on anymore. It makes sense though: it technically is a prefetch that it does after it serves the record. You can check the code to validate this; the above docs also describe this.

            And yeah I have minimum TTL set to 5min from the upstream DNS provider NextDNS - so I catch that silly situation as well.

            • G
              Gertjan @chickendog
              posted Sep 18, 2024, 5:38 AM, last edited by Gertjan Sep 18, 2024, 5:41 AM

              @chickendog

              From what I make of it, when

              [image: a40faaf3-e44c-45a8-8547-10f2f614459e-image.png]

              is activated, records won't expire anymore and are refreshed when needed.
              So "serve-expired" becomes irrelevant.
              Your local unbound dns cache slowly fills up with the DNS names you most often use, no more waiting for DNS.

              No "help me" PM's please. Use the forum, the community will thank you.
              Edit : and where are the logs ??

              • C
                chickendog @Gertjan
                last edited by Sep 18, 2024, 5:47 AM

                @Gertjan No that's not correct. You need to read the Unbound docs, see my link above.
                Or better yet sift through the code.

                • G
                  Gertjan @chickendog
                  last edited by Sep 18, 2024, 6:16 AM

                  @chickendog

                  What is incorrect ?
                  Prefetching ?

                  The description seems fine to me :

                  [image: e80ef4c8-f289-4b72-b2d6-3299bf835bd8-image.png]

                  When records already present in the cache are refreshed at 90% of their TTL (so not expired yet) and updated within a second, these records can't expire anymore (except if the TTL was less than 10 seconds ^^)

                  My own measurements (a munin script running unbound-control to query the unbound stats) show me that nearly all requests are handled with data available in the cache.

                  • C
                    chickendog @Gertjan
                    last edited by Sep 18, 2024, 7:36 AM

                    @Gertjan Not correct as in serve-expired is not irrelevant.
                    Your case might work ok for you, as it depends on how many clients you have and what domains they are requesting.
                    But if a domain is not requested within the last 10% of its TTL then it will not be prefetched.

                    If you don't believe me you can check the code, or ask the dev
                    https://github.com/search?q=repo%3ANLnetLabs%2Funbound%20prefetch&type=code

                    So a record that is fetched, not reused within its TTL, and then expired (and thereby removed from the cache) has to be fetched again, even with prefetch enabled.
                    Say you are only using prefetch. With TTLs of 5 min or less (or even 30 min or less) being the norm today, you can have scenarios where peak periods on your network are served well by the cache, but if there is another peak period later in the day, the cache has to be almost entirely rebuilt.

                    With serve-expired you can keep these records in the cache from one peak period to the next. Then use serve-expired-ttl to optimise how long the records are kept. For me 1 day is good so that the peak periods throughout the day are served with an already healthy cache before the device requests it.
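
A minimal Python sketch of that behaviour, under simplifying assumptions: an expired record stays usable for `serve_expired_ttl` seconds past expiry when serve-expired is on, it is refreshed in the background once it has been served stale, and nothing gets evicted early because of cache size. `lookup_outcome` is a hypothetical name, not an Unbound function.

```python
def lookup_outcome(original_ttl, age, serve_expired=False, serve_expired_ttl=86400):
    """Classify a query arriving `age` seconds after the record was cached.

    Assumption: with serve_expired on, an expired record can still be
    answered from cache until `serve_expired_ttl` seconds after expiry.
    """
    if age < original_ttl:
        return "fresh hit"
    if serve_expired and age - original_ttl < serve_expired_ttl:
        return "stale hit, refreshed in background"
    return "miss, full lookup needed"


ttl = 300            # a 5 minute TTL
two_hours = 2 * 3600
two_days = 2 * 86400

print(lookup_outcome(ttl, two_hours))                      # prefetch-only setup
print(lookup_outcome(ttl, two_hours, serve_expired=True))  # next peak, same day
print(lookup_outcome(ttl, two_days, serve_expired=True))   # older than one day
```

With a one day limit, a record from the morning peak is still answered from cache at the next peak, while anything untouched for more than a day falls out and has to be looked up again.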

                    Hope that makes sense.

                    Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.