Netgate Discussion Forum

Serve Expired - Clarification :)

DHCP and DNS
  • T
    Taz79
    last edited by Taz79 Apr 15, 2019, 6:46 AM Apr 15, 2019, 6:37 AM

    Hello!

    I have some questions about the "Serve Expired" feature. It might be basic DNS knowledge, but I have tried googling it without finding much about exactly how it is handled.

    After 2 days my DNS statistics look like this:

    unbound-control -c /var/unbound/unbound.conf stats_noreset | grep total.num
    total.num.queries=131025
    total.num.queries_ip_ratelimited=0
    total.num.cachehits=127505
    total.num.cachemiss=3520
    total.num.prefetch=15811
    total.num.zero_ttl=15027
    total.num.recursivereplies=3520
    

    With Cache count:

    unbound-control -c /var/unbound/unbound.conf stats_noreset |grep cache.count
    msg.cache.count=4722
    rrset.cache.count=10998
    infra.cache.count=3620
    key.cache.count=680
    

    What I gather from this is that more than 10% of the queries (15027 of 131025, about 11.5%) are answered from a DNS entry which has TTL=0. I also seem to have very few cache misses, only 2.7%.
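
For anyone who wants to check those percentages, here is a rough Python sketch. The `parse_stats` and `pct` helpers are made-up names for illustration only, and the sample text is simply copied from the `unbound-control` output in this post.

```python
# Sample copied from the stats_noreset output in this post.
STATS = """\
total.num.queries=131025
total.num.queries_ip_ratelimited=0
total.num.cachehits=127505
total.num.cachemiss=3520
total.num.prefetch=15811
total.num.zero_ttl=15027
total.num.recursivereplies=3520
"""


def parse_stats(text):
    """Turn 'key=value' lines into a dict of integers (hypothetical helper)."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:  # skip lines that do not contain '='
            stats[key] = int(value)
    return stats


def pct(part, whole):
    """Percentage of `whole` that `part` represents, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


stats = parse_stats(STATS)
queries = stats["total.num.queries"]
zero_ttl_pct = pct(stats["total.num.zero_ttl"], queries)
miss_pct = pct(stats["total.num.cachemiss"], queries)
hit_pct = pct(stats["total.num.cachehits"], queries)
print(f"zero_ttl: {zero_ttl_pct}%  cachemiss: {miss_pct}%  cachehits: {hit_pct}%")
# prints: zero_ttl: 11.5%  cachemiss: 2.7%  cachehits: 97.3%
```

Per the documentation quoted further down, prefetches are included in cachehits, so the hit percentage also covers queries that triggered a prefetch.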

    So my questions are:
    How does Serve Expired work? Will the records stay in the DNS cache with TTL=0 forever until they get a hit again? Or will the TTL=0 entries eventually be purged by some setting?

    This is what I have found in the Unbound documentation regarding the different statistics counters:

    num.queries
    number of queries received by thread

    num.cachehits
    number of queries that were successfully answered using a cache
    lookup

    num.cachemiss
    number of queries that needed recursive processing

    num.prefetch
    number of cache prefetches performed. This number is included
    in cachehits, as the original query had the unprefetched answer
    from cache, and resulted in recursive processing, taking a slot
    in the requestlist. Not part of the recursivereplies (or the
    histogram thereof) or cachemiss, as a cache response was sent.

    num.zero_ttl
    number of replies with ttl zero, because they served an expired
    cache entry.

    num.recursivereplies
    The number of replies sent to queries that needed recursive
    processing. Could be smaller than threadX.num.cachemiss if due to
    timeouts no replies were sent for some queries.

    msg.cache.count
    The number of items (DNS replies) in the message cache.

    rrset.cache.count
    The number of RRsets in the rrset cache. This includes rrsets
    used by the messages in the message cache, but also delegation
    information.

    infra.cache.count
    The number of items in the infra cache. These are IP addresses
    with their timing and protocol support information.

    key.cache.count
    The number of items in the key cache. These are DNSSEC keys,
    one item per delegation point, and their validation status.

    • T
      Taz79
      last edited by Apr 16, 2019, 6:26 AM

      I have been monitoring this since yesterday and I cannot see the cache.count values declining at all. So it seems all the TTL=0 records stay in the cache?

      [2.4.4-RELEASE][admin@Fenix.localdomain]/root: unbound-control -c /var/unbound/unbound.conf stats_noreset | egrep 'total.num|cache.count'

      15/4-2019 10:30
      total.num.queries=138229
      total.num.queries_ip_ratelimited=0
      total.num.cachehits=134153
      total.num.cachemiss=4076
      total.num.prefetch=17233
      total.num.zero_ttl=16396
      total.num.recursivereplies=4076
      msg.cache.count=5893
      rrset.cache.count=13071
      infra.cache.count=4319
      key.cache.count=884
      
      15/4-2019  23:11
      total.num.queries=178540
      total.num.queries_ip_ratelimited=0
      total.num.cachehits=173816
      total.num.cachemiss=4724
      total.num.prefetch=23519
      total.num.zero_ttl=22422
      total.num.recursivereplies=4724
      msg.cache.count=6518
      rrset.cache.count=13949
      infra.cache.count=4848
      key.cache.count=957
      
      16/4-2019 08:11
      total.num.queries=203688
      total.num.queries_ip_ratelimited=0
      total.num.cachehits=198712
      total.num.cachemiss=4976
      total.num.prefetch=25949
      total.num.zero_ttl=24683
      total.num.recursivereplies=4976
      msg.cache.count=6774
      rrset.cache.count=14133
      infra.cache.count=5119
      key.cache.count=961
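
To make the trend between the three snapshots easier to see, here is a small Python sketch that prints the change in a few counters from one snapshot to the next. The numbers are copied from the output above; the short labels and the `deltas` helper are made up for illustration.

```python
# Three snapshots copied from the output above: (label, selected counters).
SNAPSHOTS = [
    ("15/4 10:30", {"queries": 138229, "zero_ttl": 16396, "msg": 5893, "rrset": 13071}),
    ("15/4 23:11", {"queries": 178540, "zero_ttl": 22422, "msg": 6518, "rrset": 13949}),
    ("16/4 08:11", {"queries": 203688, "zero_ttl": 24683, "msg": 6774, "rrset": 14133}),
]


def deltas(snapshots):
    """Return (from_label, to_label, per-counter change) for consecutive snapshots."""
    result = []
    for (label_a, a), (label_b, b) in zip(snapshots, snapshots[1:]):
        change = {name: b[name] - a[name] for name in a}
        result.append((label_a, label_b, change))
    return result


results = deltas(SNAPSHOTS)
for label_a, label_b, change in results:
    print(f"{label_a} -> {label_b}: {change}")
```

In both intervals msg.cache.count and rrset.cache.count go up (by 625 and 878, then by 256 and 184). Keep in mind that these counters do not say which of the cached entries are expired, so a growing count on its own only shows that the cache as a whole is not shrinking.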
      
      • T
        Taz79
        last edited by Apr 16, 2019, 6:47 AM

        I found some more configuration entries for serve-expired, and these parameters explain it all. The TTL=0 records will stay in the cache as long as these options are not set. That is what I was looking for. :) Case closed! :)

           serve-expired-ttl: <seconds>
                  Limit serving of expired responses to configured seconds after
                  expiration. 0 disables the limit. This option only applies when
                  serve-expired is enabled. The default is 0.
        
           serve-expired-ttl-reset: <yes or no>
                  Set the TTL of expired records to the serve-expired-ttl value
                  after a failed attempt to retrieve the record from upstream.
                  This makes sure that the expired records will be served as long
                  as there are queries for it. Default is "no".
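
The serve-expired-ttl rule quoted above can be written down as a tiny Python sketch. This is only a toy model of the documented behaviour, not Unbound's code: `may_serve_expired` is a made-up name, and treating the exact boundary second as still servable is an assumption.

```python
def may_serve_expired(seconds_since_expiry, serve_expired_ttl=0):
    """Toy model: may an expired record still be served?

    serve_expired_ttl == 0 means "no limit", matching the documented default.
    Whether the exact boundary second counts as servable is an assumption.
    """
    if serve_expired_ttl == 0:
        return True  # limit disabled: expired records may be served indefinitely
    return seconds_since_expiry <= serve_expired_ttl


# With the default of 0, a record that expired long ago may still be served.
unlimited = may_serve_expired(1_000_000)

# With a hypothetical limit of one day (86400 seconds):
within_limit = may_serve_expired(3600, serve_expired_ttl=86400)
past_limit = may_serve_expired(90000, serve_expired_ttl=86400)
print(unlimited, within_limit, past_limit)
# prints: True True False
```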
        
        • C
          chrcoluk
          last edited by Apr 16, 2019, 8:25 AM

          I am the one who got this feature added to pfSense.

          So basically.

          The reason it was added is that on the modern internet many mainstream services use DNS to route their traffic, and because of things like maintenance, DDoS attacks and so forth, they use extremely low TTL values so they can reroute very quickly if required.

          TTL values of 30 seconds or less are now fairly common.

          As you can imagine, having to do a new DNS lookup so often has a performance hit.

          The issue with the prefetch feature is that it only works if you do a DNS lookup when less than 10% of the TTL is left. So with a 30 second TTL, if you don't do another lookup within the last 3 seconds of the TTL, prefetch isn't providing you any benefit. Its operating scope is too narrow.
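
The arithmetic behind that 3 second window, as a quick Python sketch. The function names are made up for illustration and the 10% figure is the one described above; this is not how Unbound itself is written.

```python
def prefetch_window(ttl_seconds, percent=10):
    """Length in seconds of the final slice of the TTL in which a hit triggers a prefetch."""
    return ttl_seconds * percent / 100


def triggers_prefetch(original_ttl, remaining_ttl, percent=10):
    """True if a cache hit with `remaining_ttl` seconds left falls inside the prefetch window."""
    return 0 < remaining_ttl < prefetch_window(original_ttl, percent)


print(prefetch_window(30))        # 3.0 seconds for a 30 second TTL
print(triggers_prefetch(30, 2))   # True: hit arrives with 2 seconds left
print(triggers_prefetch(30, 10))  # False: hit arrives too early, no prefetch
print(prefetch_window(3600))      # 360.0 seconds for a 1 hour TTL
```

With a one hour TTL the window is six minutes wide, which is why prefetch helps far more with long TTLs than with the 30 second ones.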

          So Unbound implemented serve expired. When a record expires, it stays in the cache with a TTL value of 0. If another lookup comes in from the LAN (or from whatever networks your Unbound is serving), the expired record is served from the cache for performance. At the same time, however, Unbound initiates a new lookup to the authoritative server, so a later lookup will be served a newer record.

          So it's important to note that the same expired record isn't served forever; it's only served once, then a new one is fetched.
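
To illustrate the "serve the expired record once, then refresh" idea, here is a toy Python model. It is only a sketch of the behaviour described in this post, not Unbound's implementation: the class, the fake upstream, and the addresses are all made up, and the refresh is done inline instead of in the background.

```python
class ToyServeExpiredCache:
    """Toy model of serve-expired; not Unbound's implementation."""

    def __init__(self, upstream):
        self.upstream = upstream  # callable: name -> (answer, ttl_seconds)
        self.entries = {}         # name -> (answer, expires_at)

    def _refresh(self, name, now):
        """Fetch a fresh answer from upstream and store it with its new expiry."""
        answer, ttl = self.upstream(name)
        self.entries[name] = (answer, now + ttl)
        return answer

    def lookup(self, name, now):
        """Return (answer, how), where how is 'miss', 'hit' or 'expired'."""
        entry = self.entries.get(name)
        if entry is None:
            return self._refresh(name, now), "miss"
        answer, expires_at = entry
        if now < expires_at:
            return answer, "hit"
        # Expired: hand out the old answer, but refresh so the next lookup is fresh.
        self._refresh(name, now)
        return answer, "expired"


# Fake upstream that returns a different documentation address on every fetch.
answers = iter(["192.0.2.1", "192.0.2.2", "192.0.2.3"])


def fake_upstream(name):
    return next(answers), 30  # every answer has a 30 second TTL


cache = ToyServeExpiredCache(fake_upstream)
r1 = cache.lookup("example.com", now=0)    # ('192.0.2.1', 'miss')
r2 = cache.lookup("example.com", now=10)   # ('192.0.2.1', 'hit')
r3 = cache.lookup("example.com", now=100)  # ('192.0.2.1', 'expired'), old answer served once
r4 = cache.lookup("example.com", now=105)  # ('192.0.2.2', 'hit'), refreshed answer
print(r1, r2, r3, r4)
```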

          Newer versions of Unbound allow this to be tweaked further, and the good news is that the latest stable build of pfSense has the newer version (it was updated for security). I am considering getting another commit done to take advantage of it. For example, if you are uncomfortable using a cached record that might have been sitting there for a day, you can now set an effective expiry on the cached record itself using the more granular controls. I will see if I can get that field added to the UI as well.

          pfSense CE 2.7.2

          • C
            chrcoluk
            last edited by Apr 16, 2019, 8:34 AM

            I cannot edit my post, so sorry, I cannot fix the typos.

            But also to clarify: there is a reason this is off by default. As you can imagine, it is down to the admin whether they are OK with records being served from the cache after they have expired upstream :)

            I tried to make the description in pfSense as understandable as possible whilst keeping it as short as possible, so it wasn't bloating the interface.

            pfSense CE 2.7.2

            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.