What is the optimal cache size?

Thanks for this article, it explains things very clearly. I am using the beta with unbound as described and it works very well for me. I am happy to accept a small performance penalty for the privacy benefits, but in actual fact we have no perceivable performance hit - it's not noticeable at all.

I did experiment with setting a larger cache (up from the default 10,000 to 250,000) but found I could only make this work by editing 01-pihole.conf. When I edited /etc/dnsmasq.conf, Pi-hole's DNS service wouldn't run. The same happened if I created a new file (e.g. cache-increase.conf) in the /etc/dnsmasq.d directory.

Question: Is it only possible to do this in the 01-pihole.conf file because it somehow works around the native dnsmasq 10,000 cache limit?

Thanks for a great product!

1 Like

We've had some discussion about cache values lately. I think the native 10,000 limit is patched out, but the config may still be in force. @DL6ER, was the patch for the total cache, the TTL, or none of the above?

The modification (removal of the upper limit) is for the total cache. You'll have to change it in 01-pihole.conf - not because this file is special, but because the option can only be set in one config file at a time. If you configured it in another file, it would be defined both there and in 01-pihole.conf; the resolver wouldn't know which value you actually want and hence fails.
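
For reference, a minimal sketch of the change (assuming the default file location /etc/dnsmasq.d/01-pihole.conf; the 250,000 is just the figure from the post above):

# in /etc/dnsmasq.d/01-pihole.conf, adjust the existing line, e.g.
cache-size=250000

# then restart Pi-hole's DNS service so the new value takes effect
pihole restartdns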

1 Like

Thanks, that solves the mystery. It's certainly working fine changing it in 01-pihole.conf. I'll just need to remember to change it again after future upgrades.

This looks really cool - just installed it and so far looking good.

I followed the tutorial as listed. Once all those steps are done, will the unbound service start automatically when the Pi reboots? Or is there another command we need to run to get it running on startup?

1 Like

Yes, unbound starts automatically afterwards.
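
If you want to double-check on a systemd-based install (this assumes the service unit is simply called unbound, as with the standard Debian/Raspbian package):

systemctl is-enabled unbound    # should print "enabled"
systemctl status unbound        # should show "active (running)"

# if it is not enabled for some reason:
sudo systemctl enable unbound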

What number do you recommend for the cache limit? I am trying this:

cache-size=100000

Cache sizes of more than maybe 1,000 are only useful in very specific environments (very heterogeneous clients querying a large number of different domains). I added a cache efficiency measure to FTLDNS (similar to what dnsmasq offers). Although it is not (yet) exposed nicely in the GUI, you can query it manually.

Run

echo ">cacheinfo" | nc pi.hole 4711

on your Pi-hole.

You should get something like

cache-size: 1000
cache-live-freed: 0
cache-inserted: 12620

The individual numbers mean the following:

  • cache-size - the (maximum) cache size. With Pi-hole, you typically specify this number directly in 01-pihole.conf. It is the number of entries that can be actively cached at the same time
  • cache-live-freed - the number of cache entries that had to be removed although they had not yet expired. Entries get removed when the cache is full and older entries have to make room for newer queries. The cache size should only be increased when this number is larger than zero
  • cache-inserted - the total number of insertions into the cache. This number may be substantially higher than cache-size because it is a global sum and cache entries naturally make room for new insertions over time as they expire

TL;DR: As long as cache-live-freed is really low (or even zero), your cache size is sufficient. It may even be too large.
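
If you want to keep an eye on cache-live-freed over time, something along these lines works (assuming nc is available and FTL is listening on its default port 4711):

watch -n 60 'echo ">cacheinfo" | nc 127.0.0.1 4711'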

I see that there is a certain lack of clarity when it comes to DNS caching and will consider creating a description for our documentation pages before we release Pi-hole v4.0.

4 Likes

Thanks, this is helpful and educational.

I had to use echo ">cacheinfo" | nc 127.0.0.1 4711 instead (maybe because I had edited my hosts file to show "unbound", or because I'm on the beta - not sure).

Anyway, I get the result below, which suggests 250,000 is OVERKILL and not necessary at all.

Could you explain how the dnsmasq cache and the unbound cache work together?
Output of "unbound-control stats_noreset" is also below. Many thanks.

cache-size: 250000
cache-live-freed: 0
cache-inserted: 143
---EOM---

thread0.num.queries=8721
thread0.num.cachehits=640
thread0.num.cachemiss=8081
thread0.num.prefetch=167
thread0.num.zero_ttl=0
thread0.num.recursivereplies=8081
thread0.requestlist.avg=0.677861
thread0.requestlist.max=17
thread0.requestlist.overwritten=0
thread0.requestlist.exceeded=0
thread0.requestlist.current.all=0
thread0.requestlist.current.user=0
thread0.recursion.time.avg=0.298242
thread0.recursion.time.median=0.237763
thread0.tcpusage=0
total.num.queries=8721
total.num.cachehits=640
total.num.cachemiss=8081
total.num.prefetch=167
total.num.zero_ttl=0
total.num.recursivereplies=8081
total.requestlist.avg=0.677861
total.requestlist.max=17
total.requestlist.overwritten=0
total.requestlist.exceeded=0
total.requestlist.current.all=0
total.requestlist.current.user=0
total.recursion.time.avg=0.298242
total.recursion.time.median=0.237763
total.tcpusage=0
time.now=1528747140.238917
time.up=252015.844783
time.elapsed=252015.844783

Agreed.

They are separate. However, while dnsmasq will only cache results of actual requests (e.g. some.domain.de), unbound will also cache the intermediate steps along the DNS resolution path, e.g.

;rrset 86392 6 0 2 0
de.     172792  IN      NS      n.de.net.
de.     172792  IN      NS      l.de.net.
de.     172792  IN      NS      z.nic.de.
de.     172792  IN      NS      a.nic.de.
de.     172792  IN      NS      s.de.net.
de.     172792  IN      NS      f.nic.de.
;rrset 86392 1 0 1 0
n.de.net.       172792  IN      AAAA    2001:67c:1011:1::53
;rrset 86392 1 0 1 0
l.de.net.       172792  IN      A       77.67.63.105
;rrset 86392 1 0 1 0
s.de.net.       172792  IN      AAAA    2003:8:14::53
;rrset 86392 1 0 1 0
f.nic.de.       172792  IN      AAAA    2a02:568:0:2::53
;rrset 86392 1 0 1 0
f.nic.de.       172792  IN      A       81.91.164.5
;rrset 86392 4 0 2 0
google.de.      86392   IN      NS      ns2.google.com.
google.de.      86392   IN      NS      ns4.google.com.
google.de.      86392   IN      NS      ns3.google.com.
google.de.      86392   IN      NS      ns1.google.com.
;rrset 86392 1 0 8 0
ns1.google.com. 345592  IN      AAAA    2001:4860:4802:32::a
;rrset 86392 1 0 8 0
ns1.google.com. 345592  IN      A       216.239.32.10
;rrset 3592 1 0 8 3
google.de.      3592    IN      A       172.217.16.195

Knowledge about how to resolve .de or google domains may come in handy for subsequent queries and can make them notably faster.
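
If you want to inspect this on your own resolver, a listing like the one above can be obtained from unbound's cache dump (this assumes unbound-control / remote-control is set up on your system):

sudo unbound-control dump_cache | less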

1 Like

Great read, thanks to all for posting.

What is the "lifespan" (length of time until expiration) of an insertion?

What specifically is the difference between an insertion and an entry?

It's the TTL ("time-to-live"). See how it decreased from 206 seconds to 201 seconds in my example:

nanopi@nanopi:~$ dig google.com

; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 8595
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1472
;; QUESTION SECTION:
;google.com.			IN	A

;; ANSWER SECTION:
google.com.		206	IN	A	172.217.22.206

;; Query time: 1 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mi Nov 11 06:12:33 CET 2020
;; MSG SIZE  rcvd: 55

nanopi@nanopi:~$ dig google.com

; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 53469
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;google.com.			IN	A

;; ANSWER SECTION:
google.com.		201	IN	A	172.217.22.206

;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mi Nov 11 06:12:38 CET 2020
;; MSG SIZE  rcvd: 55

1 Like

(please see edit at end)
Thank you, and can you please explain.

That seems counterintuitive. TTL is, in some sense, the number of hops a packet can take (sure, "time to live").

I interpret "expiration" to be chronological in nature, applying to an entry in a database - perhaps something like "elapsed time since last access of a given entry". That entry is written in the database/cache, and not going anywhere... not sure I understand how hops through a network matches this.

TIA for your explanation.

EDIT/AFTERTHOUGHT: OK, I need to read more before posting... just read this, and need to read more, I guess...

1 Like

You are describing the IPv4 TTL header field (equivalent to IPv6 hop limit).

There are multiple definitions of TTL applicable in different contexts.

In DNS, an authoritative DNS server sets the TTL for a given DNS record to tell a client like a recursive or local resolver how long it should cache such a record for.
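
One way to see the difference for yourself is to compare the full TTL published by an authoritative name server with the counting-down TTL returned by your local resolver (ns1.google.com is just used as an example authoritative server here):

# full TTL straight from an authoritative name server
dig +noall +answer google.com @ns1.google.com

# cached, decreasing TTL from the local resolver
dig +noall +answer google.com @127.0.0.1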

1 Like

Sorry, this turned out a much bigger reply than I first thought.

After reading this thread I worried: how could I quantify 'too large'? I looked at the FTLDNS docs:

"the DNSSEC validation process uses the cache."

Now, just looking at my main Pi-hole's dashboard, 'Query Types' shows DS and DNSKEY requests making up ~3% of queries. Does this mean that ~3% of my cache is from DNSSEC?

To evaluate this: since TTLs seem to be short (broadly 150-250 seconds), we can look at spikes in requests to the Pi-hole. Again, looking at the dashboard, I'm seeing ~600 total permitted and blocked queries in the highest 10-minute period of just today.

This ^ makes more sense now, and building on top of that, from the FTLDNS docs:

"The number of cache entries that had to be removed although the corresponding entries were NOT EXPIRED. Old cache entries get removed if the cache is full to make space for more recent domains. The cache size should be increased when this number is larger than zero."

Again, making sure to differentiate: the number of cache insertions does not mean you have that much stuff cached.

Going through all this, my basic takeaway is that with a bigger cache you're allocating more memory to a cache that may only be 10% utilized most of the time, peaking at maybe 50% - memory that could be allocated to other services on your Pi-hole and your server. Is analyzing the peak the right way to go? Is there a better approach? If I'm heading in the right direction, I believe the default cache size should have an asterisk telling you when to change it.

The default cache size is sufficient for most at-home network scenarios and still yields high performance. Performance degrades as the cache grows - a larger cache also means you have to search through more entries before finding that something is not in your cache.

The takeaway message builds on this and on the quote above:

If the latter is not happening for you, we recommend leaving it at the default value. The default value will neither eat an awful lot of memory nor require you to come back at regular intervals to check whether your lowered value is still sufficient.


Other topic:

Open your Pi-hole, switch to a page that requires authentication and enter the password. Once you have done that, open http://pi.hole/admin/api.php?getCacheInfo in the same browser. This will give you a breakdown of how the cache is filled at this point in time.
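
If you prefer the command line over a logged-in browser session, the same information should be retrievable with the API token from Settings > API (a sketch; YOUR_API_TOKEN is a placeholder and details may differ between versions):

curl "http://pi.hole/admin/api.php?getCacheInfo&auth=YOUR_API_TOKEN"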

1 Like

Okay I have never seen this, and this is awesome:

{
  "cacheinfo": {
    "cache-size": 10000,
    "cache-live-freed": 0,
    "cache-inserted": 100031,
    "ipv4": 76,
    "ipv6": 31,
    "srv": 0,
    "cname": 165,
    "ds": 223,
    "dnskey": 25,
    "other": 1,
    "expired": 1229,
    "immortal": 20
  }
}

First of all, immortal is the coolest name for them, and secondly, looking up 'immortal memory' or 'immortal cache' only gets me obituaries. Are these the root servers from named.root for unbound? Going into the GitHub repo for FTLDNS, I see the definition of the getCacheInformation() method gives more info:

// <immortal> cache records never expire (e.g. from /etc/hosts)

So it could be from my hosts file, but knowing my hosts file has no more than 10 entries, maybe it is from unbound. The FTL API's getCacheInformation is much more verbose; I would like to see this not behind a finicky GET request. Thank you for your help!

When you run

sudo killall -USR1 pihole-FTL

your Pi-hole will dump the content of the cache into its log file /var/log/pihole.log.

When you look under Flags, you'll notice an I next to each immortal cache entry. H means it comes from a HOSTS file; other sources of immortal cache records could, for instance, be definitions in config files.
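
For example, to trigger the dump and then skim the end of the log for the flags column (the number of lines to tail depends on how large your cache is):

sudo killall -USR1 pihole-FTL
tail -n 200 /var/log/pihole.log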

Not sure if you have looked into the cache yet; I just want to give a heads-up on the documentation we just added about how to read the cache dump report.

https://docs.pi-hole.net/ftldns/cache_dump/