tl;dr: prefetching could increase Pi-hole cache utilization and improve overall network speed, at the cost of increased outbound DNS queries.
Prefetching is a common DNS practice designed to increase the cached response rate of DNS servers, as cached responses are typically faster than forwarded responses - especially for users who run a recursive DNS resolver like Unbound.
For the uninitiated, prefetching is the act of preemptively querying domains prior to their TTL expiring.
Functionally, from what I've found, there are two common methods for prefetching domains at the DNS server level:
Top X Prefetching - where the top X domains are prefetched as long as they have been queried within the last Y minutes. In the paper linked, the researchers found a "sweet spot" somewhere around the top 20 domains for a home network. For personal reference, on my network the top 20 non-blocked domains represent roughly 50% of all non-blocked queries. This style of prefetching is best suited to smaller networks where the number of unique devices accessing the network is relatively low, but it ultimately results in significantly more outbound queries. In the study linked, the number of outbound queries for popular domains on home networks increased by a factor of ~5-150. When properly tuned with a good cutoff algorithm, this factor was capped at <10. Since Pi-hole actively tracks top domains (assuming your privacy settings allow this), it's not unreasonable to imagine a system where this data is used for prefetching.
TTL EOL Prefetching - as used by Unbound, where a domain is prefetched in response to any query for it that occurs within the last 10% of its cached TTL. This approach does not require a top list per se, but in practice it prefetches domains that are queried more often than their respective TTL. The upper limit on additional outbound queries is 10%, but this limit is only reached on networks where popular domains are queried frequently enough by unique users. The downside of this approach is that if you have a limited number of devices, or different browsing habits among members of your household, devices may hold a DNS response in their own cache until the TTL expires, so prefetching may never occur.
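To make the TTL EOL rule concrete, here is a rough Python sketch of the trigger check as described above. The function name and the `cutoff_fraction` parameter are made up for illustration; this is not taken from Unbound or Pi-hole code.

```python
def in_prefetch_window(original_ttl: int, remaining_ttl: int,
                       cutoff_fraction: float = 0.10) -> bool:
    """Return True when a cache hit lands in the final part of the TTL.

    original_ttl    - TTL (seconds) the record had when it was cached
    remaining_ttl   - seconds left before the cached record expires
    cutoff_fraction - hypothetical setting; 0.10 mirrors the 10% figure above
    """
    if original_ttl <= 0 or remaining_ttl <= 0:
        return False  # nothing cached, or already expired
    return remaining_ttl <= original_ttl * cutoff_fraction


# A record cached with a 300 s TTL:
print(in_prefetch_window(300, 20))   # True  (20 s left, inside the last 30 s)
print(in_prefetch_window(300, 200))  # False (200 s left, too early)
```

A server using this rule would answer the client from cache as usual and, when the check returns True, refresh the record upstream in the background.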
The purpose of this post is to request implementation of a prefetching system on Pi-hole to improve network performance, and to start a dialog around my personal assertion that the optimal route would be to let users toggle Top X Prefetching and TTL EOL Prefetching independently, depending on their personal network configuration and tolerance for added load.
Implementation of prefetching, especially TTL EOL Prefetching designed to mimic Unbound prefetching, could resolve the main issue that drives users to set the Pi-hole cache size to 0 against developer recommendation, since Unbound would now receive the near-EOL requests that are typically absorbed by Pi-hole. Everyone wins.
Additional thoughts on potential configuration parameters:
Top X Prefetching
- # of domains - how many of the top domains are kept prefetched (default ~20)
- cutoff time - how long between organic queries before a top domain is removed (temporarily) from the prefetching queue (default ~600s)
TTL EOL Prefetching
- % of TTL Cutoff - queries in the final % of TTL trigger prefetching (default ~10% to match Unbound)
- global TTL response scale - a global scaling factor (0-1) applied to all received TTLs before they are forwarded to clients (default ~0.95-1). Scaling all TTLs by something like 0.95 could further increase the effectiveness of this prefetching approach, as device caches would expire just in time to trigger a cached response and signal Pi-hole to prefetch the domain. This scaling would be especially beneficial on smaller networks with fewer unique client requests.
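As a rough illustration of how these parameters could fit together, here is a Python sketch. Every name in it (`PrefetchConfig`, `top_x_queue`, `client_ttl`) is hypothetical and none of it reflects existing Pi-hole settings; it only shows the Top X cutoff and the TTL scaling idea under the defaults suggested above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class PrefetchConfig:
    # All fields are hypothetical settings taken from the list above.
    top_n: int = 20             # number of top domains kept prefetched
    cutoff_seconds: int = 600   # idle time before a domain leaves the queue
    ttl_cutoff: float = 0.10    # final fraction of TTL that triggers prefetch
    ttl_scale: float = 0.95     # scale applied to TTLs sent to clients


def top_x_queue(counts: Counter, last_seen: dict, now: float,
                cfg: PrefetchConfig) -> list:
    """Top-N domains by query count, minus those idle past the cutoff."""
    return [
        domain
        for domain, _ in counts.most_common(cfg.top_n)
        if now - last_seen[domain] <= cfg.cutoff_seconds
    ]


def client_ttl(upstream_ttl: int, cfg: PrefetchConfig) -> int:
    """TTL handed to clients after applying the global scale."""
    if upstream_ttl <= 0:
        return 0  # leave uncacheable answers alone
    return max(1, round(upstream_ttl * cfg.ttl_scale))


cfg = PrefetchConfig(top_n=2)
counts = Counter({"a.example": 50, "b.example": 30, "c.example": 5})
last_seen = {"a.example": 1000, "b.example": 200, "c.example": 990}

# b.example is in the top 2 but has been idle for 800 s, so it is dropped.
print(top_x_queue(counts, last_seen, now=1000, cfg=cfg))  # ['a.example']

# A 300 s record is handed out as 285 s, so a client that re-asks right
# after its own cache expires arrives with 15 s left on the server side.
print(client_ttl(300, cfg))                               # 285
remaining = 300 - client_ttl(300, cfg)
print(remaining <= 300 * cfg.ttl_cutoff)                  # True
```

The last check is the point of the scaling factor: with a scale of 0.95 and a 10% cutoff, a client's first query after its local cache expires falls inside the prefetch window instead of arriving after the record has already expired.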
Thoughts?