Prefetch Popular Domains to Improve Cached Response Frequency

Per DL6ER: Caching in Pi-hole inhibits Unbound's prefetching algorithm.

Prefetching in Pi-hole would both support DNSSEC validation (because you wouldn't have to disable the cache) and address the behaviour that causes the prefetching limitations in the first place (entries being cached for the full TTL duration).

Additionally, as proposed, there's more than one way to prefetch, and the first algorithm mentioned is likely favorable in smaller home networks where queries to popular sites aren't as statistically likely to hit during the final 10% of TTL.
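
For illustration, here is a minimal sketch of the two trigger styles as I understand them. The 10% figure follows the Unbound behaviour described in the mail thread; the function names, the top-20 default and everything else are made up for the example:

```python
# Illustrative only: two ways a resolver could decide to refresh a cached entry
# before it expires. Names and thresholds are hypothetical.

def should_prefetch_ttl_eol(remaining_ttl, original_ttl):
    """TTL end-of-life trigger: refresh when a query arrives while less than
    10% of the original TTL is left (the behaviour described for Unbound)."""
    return remaining_ttl < 0.10 * original_ttl

def should_prefetch_top_x(domain, hit_counts, top_x=20):
    """Popularity trigger: refresh expiring entries only for the X most
    frequently queried domains, regardless of when the last hit arrived."""
    ranked = sorted(hit_counts, key=hit_counts.get, reverse=True)
    return domain in ranked[:top_x]
```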

Plus, pre-fetching in Pi-hole would benefit everyone, not just those who use Unbound.

Quite the jump to go from "might not have an effect" to changing his words to make it an absolute.

Give us some actual numbers showing what kind of benefit this gives and you'll stand a better chance of getting it. Anecdotes and "it feels faster" won't do much.

Happy to get some numbers comparing prefetching stats

As for twisting DL6ER's words, that was not my intent, which is why I quoted him directly - you can't get any less twisted than verbatim. But the specific phrasing of his statement was a bit hard to follow: when he says "prefetching might not have an effect", what he's saying is "if the algorithm follows what is stated in the mail thread, prefetching may not happen at all if Pi-hole caches queries". Maybe "inhibit" was the wrong word choice on my part, as it is vague and could be read as "completely prevents", but my intent was to say that it hinders the algorithm described by the Unbound developers. There may be more going on behind the scenes, so it's not totally blocked, but you will get fewer queries arriving within the last 10% of the TTL if Pi-hole holds the entry for the entire TTL duration.

Maybe ask him to clarify instead of assuming you know what he meant?

@DL6ER could you clarify what you meant please?

I guess you have read the whole topic you linked above? I also thought that unbound might not prefetch if I left Pi-hole's cache enabled (as I understood the explanation in the mail thread), so I just went and looked at the unbound stats: a lot of prefetching happened despite Pi-hole's cache being enabled. My conclusion was that the description in the mail is not accurate (any more?).

I had been following in read-only mode for a long time before deciding to create an account here to gain read-write access. I also read the linked article about the addition of CACHE_SIZE to setupVars.conf, and I agree with @yubiuser that @DL6ER's message was quite clear. It was just quoted out of context (by quoting only parts of it).

When I first read this other discussion two months ago, I was confused. I am, still.

  • Does unbound do prefetching? This is what @yubiuser suggests
  • What would be the benefit of adding (another layer of) prefetching to Pi-hole?
    (I write "another layer" but this is obviously only meant for the unbound users)

Pi-hole already caches domains. No upstream query is performed for the duration of the TTL, which is typically quite long. When a domain is requested after the TTL has expired, it is requested upstream once and then cached again for the entire TTL.

I see only two effects of implementing prefetching in Pi-hole:

  1. PRO: The delay for the queries (every hour?) is decreased by some 10-100 milliseconds.
  2. CON: Pi-hole has to implement its own algorithm to decide what is a "hot" domain and what is a "cold" domain. Maybe based on how often the domain is queried in absolute terms (> 2 per minute) or relative to others (> 2% of the total number of queries) or or or... (see the sketch right below this list)
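
To illustrate the kind of decision Pi-hole would have to make, here is a hypothetical sketch using the two example thresholds from the list above. Neither the function name nor the numbers come from any actual Pi-hole code; they are just the examples restated as code:

```python
# Hypothetical "hot vs. cold" decision, restating the two example thresholds
# from the list above; nothing here is taken from Pi-hole's actual code.

def is_hot(hits_per_minute, domain_hits, total_hits,
           abs_threshold=2.0, rel_share=0.02):
    """A domain counts as 'hot' (worth prefetching) if it is queried more than
    2 times per minute in absolute terms, or accounts for more than 2% of all
    queries relative to the rest of the network."""
    absolute = hits_per_minute > abs_threshold
    relative = total_hits > 0 and (domain_hits / total_hits) > rel_share
    return absolute or relative
```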

If you add it, can we have an option to control the algorithm used here? :slight_smile: Maybe use the same option to provide a way to disable it.

As I see it, the terms used to distinguish the prefetching variants (Top X and TTL EOL) do not describe fundamentally different approaches to prefetching at all.

The authors of the quoted study "Accelerating Last-Mile Web Performance with Popularity-Based Prefetching" seem to use a similar time-based cache eviction strategy as unbound; they may just choose a different set of input variables for their parameters, i.e. an arbitrary threshold instead of the TTL. Likewise, unbound has to decide which entries get evicted from its cache once it hits the cache size limit, which in the study is again an arbitrary value of 20, based on heuristics from some small data sets.

More importantly, the study combines DNS prefetching with TCP connection caching for HTTP, collocating them on the same router and calling this combination "popularity-based prefetching".
It does so in order to "mitigate latency bottlenecks in the last mile", but falls short of providing actual latency numbers. Instead, it relies solely on separate figures for DNS and TCP cache hit ratio improvements.
This makes it difficult to assess both the overall benefit and the respective contributions of DNS, TCP and same-device collocation to that total.
Furthermore, the study doesn't detail the traffic structure (remote vs. local), which would have an impact on the latency incurred on cache misses.

So I can only guess here: Based on the fact that DNS makes up a very small fraction of a network's data traffic, I'd expect the major benefit of the study's proposal to be attributable to the reuse of TCP connections, with a significantly smaller contribution from DNS, and maybe some effect of collocating those on the same device.

As Pi-hole is not involved in HTTP traffic (or any traffic other than DNS), the benefit of DNS prefetching - according to the study - would be to raise the cache hit ratio from 15% to 50% while increasing the number of DNS requests tenfold (if optimised).

If you had to pay for each lookup, this would be an ineffective cost driver.
For every 1,000 DNS requests, you'd pay for:
1,000 x (100% - 15%) = 850 lookups without prefetching
1,000 x 50% x 10 = 5,000 lookups with prefetching

The benefit would be that you incur the higher latency less often with prefetching enabled, affecting your average "latency" as follows (assuming 1 ms for a cache hit and 50 ms for a forward):
(15% x 1 ms) + (85% x 50 ms) ≈ 43 ms
(50% x 1 ms) + (50% x 50 ms) ≈ 26 ms
That's an advantage of about 17 ms on average, occurring once every 43 seconds or so (based on the average number of daily DNS lookups from the study).
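
If it helps, here is the same back-of-the-envelope arithmetic as a small script, so the assumptions (the hit ratios and the tenfold amplification from the study, plus the 1 ms / 50 ms latency guesses) can be swapped out easily:

```python
# Reproduces the rough numbers above; every input is an assumption stated in
# the text, not a measurement.

QUERIES = 1_000
HIT_NO_PREFETCH = 0.15     # cache hit ratio without prefetching (study)
HIT_PREFETCH = 0.50        # cache hit ratio with prefetching (study)
AMPLIFICATION = 10         # tenfold increase in upstream requests (study)
T_CACHE_MS = 1             # assumed latency of a cached reply
T_FORWARD_MS = 50          # assumed latency of a forwarded reply

# Upstream lookups you'd "pay" for per 1,000 client queries
lookups_plain = QUERIES * (1 - HIT_NO_PREFETCH)            # 850
lookups_prefetch = QUERIES * HIT_PREFETCH * AMPLIFICATION  # 5,000

# Average client-visible latency
avg_plain = HIT_NO_PREFETCH * T_CACHE_MS + (1 - HIT_NO_PREFETCH) * T_FORWARD_MS  # ~42.7 ms
avg_prefetch = HIT_PREFETCH * T_CACHE_MS + (1 - HIT_PREFETCH) * T_FORWARD_MS     # 25.5 ms

print(f"{lookups_plain:.0f} vs {lookups_prefetch:.0f} upstream lookups")
print(f"{avg_plain:.1f} ms vs {avg_prefetch:.1f} ms average latency "
      f"({avg_plain - avg_prefetch:.1f} ms saved per query)")
```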

(Note that different metrics would apply to unbound, as a full recursion may take significantly longer than a straight DNS lookup. Prefetching therefore would seem more beneficial to unbound, so it's not surprising it actually can be configured for it.)

Of course, this ignores the unavoidable penalty of the first lookup for a non-cached entry (applicable in both scenarios), and any look at averages hides the minimum and maximum observed values as well as their frequencies.
I acknowledge there is the occasional DNS resolution that takes a rather long time. I do not have any long-term data here; my own last 24 hours show 0.5% of queries taking a second or more.
It would take a more detailed study to verify why those occur, and whether they would typically be queried repeatedly beyond their cache expiration time to benefit from prefetching.

By and large, I doubt that a user will note a difference at all.

If latency gains of that order are really important to you, you should start by optimising the bulk of your network traffic, not the tiny fraction that DNS accounts for.


Your math is spot on, but it wouldn't be every three minutes, it would apply to every query. By default (anecdotally, this roughly matches my statistics with Pi-hole), ~15% of all requests are cached and ~85% are forwarded. Prefetching could make this closer to a 50/50 split for ALL QUERIES, not just unique ones. Based on your estimates, for normal users (not using Unbound) that's roughly a 40% decrease in average DNS latency, not just unique DNS latency - a non-trivial improvement.

"Every 3 minutes" bit would be more applicable to Unbound, because forwarded domains could still hit Unbound's cache, but unique domains would be more likely to trigger a recursive lookup which would, as stated, have a much longer response time.

I'm gathering data now, which will take some time. Preliminarily, Unbound still appears to do some prefetching regardless of Pi-hole's caching, as others have stated, but time will tell whether it does so to the same extent when Pi-hole's caching affects its internal statistics gathering.

Can you point me to where response time data is stored? I'd also like to leverage this for a comparison.

Thanks, you are right, I picked the wrong numbers as baseline there.
I'll rework that bit. :wink:
EDIT: The study lists just below 2,000 DNS queries a day on average - one query every 43 seconds.

That's more around the 20% mark for me. I suspect that to be highly individual - and quite volatile, too.

As far as I'm aware, response times are held in memory only.
You can query those values over Pi-hole's Telnet API; you are looking for the last column of the getallqueries result set. Note that these values lose their decimal fraction, as the query returns only the integer part.

EDIT: I just realised Pi-hole 5 adds another two columns to that output.

I've added an example to clarify the output structure, with response times of 14.2 ms:
| timestamp | query type | domain | client | status type | DNSSEC | reply type | delay | CNAME domain | RegEx# |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1593693424 | AAAA | fonts.gstatic.com | 192.168.1.42 | 1 | 0 | 3 | 142 | N/A | -1 |
| 1593695428 | A | flurry.com | smartphone.lan | 1 | 0 | 4 | 142 | N/A | -1 |
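
For anyone wanting to automate this, here is a rough sketch that pulls those values over the Telnet API. It assumes FTL's default API port 4711, that replies are terminated with ---EOM---, and that the delay column sits at the position shown in the example above (in tenths of a millisecond, so 142 means 14.2 ms); adjust the column index for your Pi-hole version:

```python
# Rough sketch: pull per-query response times from FTL's Telnet API and print
# a summary. Assumes the default API port 4711 on localhost and that the delay
# value is in tenths of a millisecond (142 -> 14.2 ms, as in the example above).

import socket
import statistics

def getallqueries(host="127.0.0.1", port=4711):
    """Send >getallqueries to pihole-FTL and return the raw result lines."""
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall(b">getallqueries\n")
        data = b""
        while b"---EOM---" not in data:   # FTL terminates replies with ---EOM---
            chunk = s.recv(4096)
            if not chunk:
                break
            data += chunk
    return [l for l in data.decode().splitlines() if l and "---EOM---" not in l]

def response_times_ms(lines, delay_column=7):
    """Extract the delay column and convert tenths of a millisecond to ms."""
    times = []
    for line in lines:
        fields = line.split()
        if len(fields) > delay_column:
            try:
                times.append(int(fields[delay_column]) / 10.0)
            except ValueError:
                pass
    return times

if __name__ == "__main__":
    times = response_times_ms(getallqueries())
    if times:
        print(f"{len(times)} queries, mean {statistics.mean(times):.1f} ms, "
              f"max {max(times):.1f} ms")
```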

Yes.

I'm working to gather data to indicate whether or not this is the case, but prefetching at the Pi-hole level could actually trigger additional prefetching (in a good way) at the Unbound level, based on the publicly shared Unbound prefetching algorithm. This assumes Pi-hole's algorithm functionally mimics Unbound's and that caching is done at the Pi-hole. If Pi-hole's algorithm differs from Unbound's in a complementary fashion, it's possible that it could further increase the overall cached response rate.

Additionally, you highlighted the other obvious benefit: non-Unbound users would be able to capitalize on DNS prefetching on their Pi-hole.

  1. Does your network only see one query per hour? If not, cached responses are likely to occur more often than hourly... See Bucking_Horn's math below your comment for an estimate of the impact on the average user. I'm still gathering my own statistics, but using an existing post from sawsanders as a reference, he had 13137 recursive look-ups in a 6-day period, or ~90 recursive look-ups per hour (assuming they were evenly distributed over a 24-hour period, and not preferentially happening during hours when people are actually using the network). For me, at least, an Unbound recursive look-up takes somewhere between 100 and 1000 ms. Anything done to reduce the number of recursive look-ups in a day is beneficial - a 1 s lookup has a noticeable impact on page loading. (A rough estimate follows after this list.)

  2. Yes, the "con" to every feature request is that it will require development. That's the basic premise of a feature request. The beautiful thing is that once the basic prefetching code is in place, algorithms can be tuned to optimize prefetching over time. The system can start with a basic pre-fetching system like either or both of the algorithms I suggested in my first post, but over time can evolve. This is very much not the scope of the request today, but if a feature like this were to be implemented and allowed to evolve over time, prefetching would be a perfect candidate for a machine learning algorithm - they're designed for pattern recognition and prediction.

Abstract: An increasingly popular technique for decreasing user-perceived latency while browsing the Web is to optimistically pre-resolve (or prefetch) domain name resolutions. In this paper, we present a large-scale evaluation of this practice using data collected over the span of several months, and show that it leads to noticeable increases in load on name servers - with questionable caching benefits. Furthermore, to assess the impact that prefetching can have on the deployment of security extensions to DNS (DNSSEC), we use a custom-built cache simulator to perform trace-based simulations using millions of DNS requests and responses collected campus-wide. We also show that the adoption of domain name prefetching raises privacy issues. Specifically, we examine how prefetching amplifies information disclosure attacks to the point where it is possible to infer the context of searches issued by clients.

See section V.A (Results)

@DanSchaper Speaking of taking things out of context - this article is about browser hyperlink prefetching:

The soup du jour for decreasing user perceived latency is to optimize the use of the domain name system by pre-resolving (or prefetching) names in hyperlinks. Since DNS is responsible for translating human-readable names into IP addresses, nearly every initial visit to a website involves a name resolution. Thus, by proactively resolving hyperlinks in pages a user visits, the sites being referred to can be immediately contacted if, and when, the user decides to click on one of the links.

Pi-hole does not have exposure to hyperlinks, and thus is not impacted by the privacy concerns discussed in this article. No additional data would be exposed, and the prefetched queries at the Pi-hole level would be for domains that you actually want to connect to, not just every domain linked on a given search page.

Recall that our main goal is to study the effects of browser-based DNS pre-resolution.

So you didn't read it?

To summarize - Pi-hole code should be changed to do pre-fetching because it may offer a very small speed benefit, regardless of the impact on any of the nameservers?

If you are using a third-party DNS, query times are typically on the order of tens of milliseconds. Saving a few milliseconds, or tens of milliseconds, will have zero impact on the performance of any apps. I don't think I see the benefit here, other than that it may look cool to have "better" cache usage stats shown on your dashboard. What am I missing?

I don't think anyone said "regardless of the impact on the nameservers" - that would naturally be a factor in determining the optimal prefetching algorithm. It is true, by the very nature of prefetching, that there will be an increase in DNS queries in order to optimize network performance. You may have just made a good point, though, for limiting the amount of user control over these settings. Without some amount of testing and QA, people could wind up prefetching over-aggressively.

So prefetch once the TTL expires

My point is that it's not worth implementing this at all if there are no significant benefits. A few msec here and there...

It may be a few ms on your local nameserver, but on my connection, which is likely more crowded or a bit farther from the popular providers, I'm about 100-200 ms from any of the major DNS providers based on my own benchmarking, erring higher if I lean towards providers with privacy-conscious policies. On the flip side, I can get a response from my Pi-hole in <5 ms. 0.1 s may not seem like a lot, but in aggregate, when loading pages and fetching potentially dozens of DNS records per site, it adds up.
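
If anyone wants to reproduce that comparison, here is a quick-and-dirty sketch. It assumes dig is installed; the resolver addresses and the test domain are just examples, so swap in your own Pi-hole's address:

```python
# Quick-and-dirty latency comparison using dig's reported "Query time".
# Assumes dig (dnsutils/bind-tools) is installed; addresses are examples only.

import re
import subprocess

def query_time_ms(server, domain="example.com"):
    """Run dig against a resolver and return its reported query time in ms."""
    out = subprocess.run(
        ["dig", f"@{server}", domain, "+tries=1", "+time=2"],
        capture_output=True, text=True,
    ).stdout
    match = re.search(r"Query time: (\d+) msec", out)
    return int(match.group(1)) if match else None

for server in ("192.168.1.2",   # local Pi-hole (example address)
               "9.9.9.9",       # a public resolver
               "1.1.1.1"):      # another public resolver
    print(server, query_time_ms(server), "ms")
```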

If you are concerned with privacy, why use any of them? Use unbound and eliminate the third party and get pre-fetching in the bargain.

I do. But I'm saying there's room for improvement. Especially now that the results are slowly coming in - it's too early to say anything definitive, but based on my testing, disabling Pi-hole's cache appears to decrease the proportion of total.num.recursivereplies as a fraction of total unblocked queries on my network (~4851/180k, or 3%, with cache = 0, vs. ~1685/28k, or 6%, with cache = 10000).

I'd like to extend the test to at least two weeks, one week with the cache on and one with it off, to better understand the impacts. Normally I wouldn't even mention the results before then, because I'm not totally comfortable with them, but to be quite frank, the amount of hostility towards a feature request here is staggering, and I'm a bit on the defensive.

Unfortunately, Pi-hole does not log the response times in the db - it would be much more interesting to be able to specifically compare average response time under each scenario.
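
For reference, here is a minimal sketch of how the unbound counters above can be pulled. It assumes unbound's remote-control interface is enabled so that unbound-control works, and note that it uses unbound's own total.num.queries as the denominator, which is not quite the same as Pi-hole's count of unblocked queries:

```python
# Sketch for tracking how many replies required a full recursion, based on
# unbound-control's stats output. "stats_noreset" leaves the counters intact.

import subprocess

def unbound_stats():
    """Return unbound-control stats as a dict of counter name -> value."""
    out = subprocess.run(["unbound-control", "stats_noreset"],
                         capture_output=True, text=True, check=True).stdout
    stats = {}
    for line in out.splitlines():
        key, _, value = line.partition("=")
        try:
            stats[key] = float(value)
        except ValueError:
            pass   # ignore any line whose value isn't numeric
    return stats

s = unbound_stats()
recursive = s.get("total.num.recursivereplies", 0)
total = s.get("total.num.queries", 0)
if total:
    print(f"{recursive:.0f}/{total:.0f} replies required recursion "
          f"({100 * recursive / total:.1f}%)")
```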