Prefetch Popular Domains to Improve Cached Response Frequency

It may be a few ms on your local nameserver, but my connection is likely more crowded, or a bit farther from the popular providers: based on my own benchmarking, I'm about 100-200 ms from any of the major DNS providers, erring higher if I lean towards providers with privacy-conscious policies. On the flip side, I can get a response from my Pi-hole in <5 ms. 0.1 s may not seem like a lot, but in aggregate, when loading pages and fetching potentially dozens of DNS records per site, it adds up.
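For anyone who wants to reproduce this kind of comparison, dig reports the round-trip time of each query; the resolver addresses below are just placeholders for an upstream provider and a local Pi-hole:

```
# Compare the ";; Query time:" lines for upstream vs. local resolution
dig @9.9.9.9 example.com +noall +stats | grep 'Query time'
dig @192.168.1.2 example.com +noall +stats | grep 'Query time'
```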

If you are concerned with privacy, why use any of them? Use unbound, eliminate the third party, and get prefetching into the bargain.
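For reference, this is what turning that on looks like in unbound.conf (option names as documented in unbound.conf(5)):

```
server:
    # re-resolve popular entries when a cache hit finds <10% of the TTL left
    prefetch: yes
    # keep DNSKEY/DS records used for validation fresh as well
    prefetch-key: yes
```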

I do. But I'm saying there's room for improvement. Especially now that the results are slowly coming in. It's too early to say anything definitive, but based on my testing, disabling Pi-hole's cache appears to decrease the proportion of total.num.recursivereplies relative to total unblocked queries on my network (~4851/180k, or 3%, with cache-size = 0, vs. ~1685/28k, or 6%, with cache-size = 10000).
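For anyone who wants to check this on their own setup, the proportion can be computed from `unbound-control stats_noreset` output. This is only a rough sketch: it divides by unbound's own query counter rather than Pi-hole's unblocked-query count, and the sample numbers are taken from the second scenario above.

```python
# Sketch: what fraction of unbound's answers required full recursion
# (i.e. were not served straight from cache)?
def recursive_reply_ratio(stats_text: str) -> float:
    stats = {}
    for line in stats_text.splitlines():
        key, _, value = line.partition("=")
        if value:
            stats[key.strip()] = float(value)
    return stats["total.num.recursivereplies"] / stats["total.num.queries"]

# Sample figures resembling the ~1685/28k case from the post
sample = """total.num.queries=28000
total.num.recursivereplies=1685"""

print(f"{recursive_reply_ratio(sample):.1%}")  # → 6.0%
```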

I'd like to extend the test to at least two weeks, one with the cache on and one with it off, to better understand the impact. Normally I wouldn't even mention the results before then, because I'm not totally comfortable with them, but to be quite frank, the amount of hostility towards a feature request here is staggering, and I'm a bit on the defensive.

Unfortunately, Pi-hole does not log response times in the database; it would be much more interesting to be able to directly compare average response times under each scenario.

I don't see hostility, I see questions. As with any feature request, there has to be a benefit commensurate with writing and maintaining additional code.

Show me something I can look at and see the cost/benefit ratio. If this is to increase speeds, show me the increased speeds. Show me pages loading, TTFB, anything that's conclusive.

You're talking about a big undertaking as far as code goes, so the request needs some equally big benefits.

I already talked with @DL6ER over PM and he has an idea for how to implement this without too much additional cost as far as code is concerned. The idea, if I got this right, is to change the resolver thread in FTL to not only resolve host names for clients and upstream destinations but also to periodically check the TTLs of all known domains.

If it sees that a domain is approaching end-of-life (EOL), it can either prefetch it unconditionally (he called this "aggressive") or only when some algorithm says so (he called this "smart(er)"). The algorithm might look at the share of queries for that domain relative to total queries, or the total number of queries, or ...
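To illustrate the two variants (this is only a sketch of the idea, not actual FTL code; the 10% EOL threshold, the 1% traffic share, and all names are made up for the example):

```python
def should_prefetch(ttl_remaining: float, original_ttl: float,
                    domain_queries: int, total_queries: int,
                    aggressive: bool = False,
                    min_share: float = 0.01) -> bool:
    """Decide whether to refresh a cached record that is nearing end-of-life."""
    near_eol = ttl_remaining < 0.10 * original_ttl
    if not near_eol:
        return False
    if aggressive:
        # "aggressive": refresh every known domain approaching EOL
        return True
    # "smart(er)": only refresh domains carrying a meaningful share of traffic
    return domain_queries / total_queries >= min_share

# A domain with 3 s left of a 45 s TTL, 500 of 10,000 recent queries:
print(should_prefetch(3, 45, 500, 10_000))  # True
# Same record, but only 5 of 10,000 queries (below the 1% share):
print(should_prefetch(3, 45, 5, 10_000))    # False
```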

Let's see what he says about this. I think he is a bit hesitant because it is yet another feature to implement and maintain; however, it may be worth it to keep an always up-to-date cache in Pi-hole. I'm still undecided whether I'd vote for this myself (I haven't, so far).

More opinions?

I voted for this.

Today many domains have really short TTLs (especially when they use anycast behind the scenes), which makes caching completely useless.

Looking at my Pi-hole setup, for example, the most queried domains are Netflix and Google by far. Most of the queries for those domains are forwards because of the TTLs; the Netflix API has a TTL of 45 seconds or something like that.

DNS prefetching is the only thing I see being able to help here. I personally don't like tampering with TTLs at the caching-resolver layer or serving expired records.

I think if you want prefetching, let Unbound do it. Maybe just give users the ability to set Pi-hole's cache size to zero from the GUI, to possibly improve Unbound's prefetching.
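For the record, this can already be done by hand today; on current Pi-hole versions it boils down to the cache-size setting (the exact file and variable name may differ between Pi-hole versions, so treat this as a sketch):

```
# /etc/pihole/setupVars.conf — then apply with `pihole restartdns`
CACHE_SIZE=0
```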

How long is it taking to resolve one of these DNS requests using your chosen upstream resolver? It takes me 22 msec with Cloudflare, which makes pre-fetching somewhat pointless.

Prefetching will just increase the number of queries happening.

How?

Prefetching would lead to querying this domain every 45 seconds to keep the cache up-to-date. How does that (=the total number of upstream-forwarded queries per hour for this domain) differ from the current way Pi-hole works (only forward when the cache entry is already expired)?
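A quick back-of-the-envelope for the 45-second-TTL example, assuming clients query the domain continuously: either way you end up with roughly one upstream query per TTL window, which is the point being made here.

```python
ttl = 45       # seconds, per the Netflix API example above
window = 3600  # one hour

# Prefetching: refresh the record once per TTL window, unconditionally
prefetch_forwards = window / ttl
# Current behaviour: the first client query after expiry triggers a forward,
# which under continuous traffic also happens once per TTL window
on_demand_forwards = window / ttl

print(prefetch_forwards, on_demand_forwards)  # 80.0 80.0
```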

I'm not trying to argue around things here, but I'm not seeing what you want to show with your example, because

is really also the only thing I currently see being the result of prefetching in the suggested manner.

When posting details of a "private" (as in: direct) conversation, you should be complete. I will leave it to you to also quote me on the set of disadvantages involved in this. Like I never really said the "too much" in

In addition to what you quoted (the resolve thread), we'd also need to add properties to the queries data structure to mark a query as being prefetched. We'd maybe even have to store this in the long-term database, as users would ask why the heck localhost started to query Netflix domains even when there is, e.g., not even a browser installed on the Pi. This additional information then also needs to be parsed and shown on the dashboard in a few places, plus users will (understandably) want to see extra statistics around prefetching.


It will increase the number of queries done by the caching-resolver layer (Pi-hole in this case), fully agreed. But the point of DNS prefetching isn't to reduce the number of queries on the Pi-hole itself, but to increase the cache hit ratio for the clients.

A forwarded query from my computer (client) takes around 20 ms to resolve. A cached query takes around 2 ms. And that should be the purpose of a caching DNS resolver: serve DNS requests to clients as fast as possible.
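Using those two numbers, the average latency a client sees is just a blend weighted by the cache hit ratio, which is why raising the hit ratio (e.g. via prefetching) is where the payoff would be:

```python
def avg_latency_ms(hit_ratio: float,
                   cached: float = 2.0, forwarded: float = 20.0) -> float:
    """Expected per-query latency given a cache hit ratio (ms figures above)."""
    return hit_ratio * cached + (1 - hit_ratio) * forwarded

print(f"{avg_latency_ms(0.50):.1f} ms")  # 11.0 ms
print(f"{avg_latency_ms(0.90):.1f} ms")  # 3.8 ms
```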

With that being said, I totally understand if this feature isn't implemented because of the amount of work it requires and/or performance trade-offs the Pi-hole project is unwilling to make (many people run Pi-hole on Pi Zeros and Pi 1s, which would suffer performance-wise with this). Messing with TTLs (which I'm not a fan of) or setting up unbound between Pi-hole and the internet achieves pretty much the same thing.

The difference between 20ms and 2ms is not worth the effort to tweak things.

I agree that the list of points you raised was long(ish). I may have been too enthusiastic. I agree this should not be added just yet.

I agree with the others here. Just use Unbound. I have compiled Unbound for use as an authoritative/validating/recursive/caching DNS server and have disabled Pi-hole's caching altogether. Although the devs recommend you do not do this, I prefer Unbound's caching and prefetching over Pi-hole's cache. Most of Pi-hole's blocked queries take <2-3 ms, and I don't see the need to have Pi-hole process these queries and use resources to store them in cache when you have an all-around better caching solution in Unbound. You can also set up auth zones and stub zones so Unbound also acts as an authoritative DNS server for your home lab.

Once your cache on Unbound has warmed up, you'll barely see any upstream queries over 1 ms; most cached entries are answered in <1 ms. Uncached entries can take from 5 ms all the way to hundreds of milliseconds, depending on server location. But even on uncached requests I rarely see queries over 20-30 ms.

The number one complaint of Unbound users was the loss of cache during a restart/reboot. But if you enable unbound-control, you can easily dump_cache to a file and then load_cache after rebooting the system, and your cache will be restored. I've been using Unbound for several months and it simply blows away any other upstream provider I've tried, and I've tried most of them.
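For completeness, the save/restore dance looks like this (the dump path is arbitrary, and unbound-control must be enabled in unbound.conf first):

```
# before stopping/rebooting
unbound-control dump_cache > /tmp/unbound.cache
# ... restart/reboot ...
unbound-control load_cache < /tmp/unbound.cache
```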


I don't think this is correct. They only said that you cannot do this while also using DNSSEC validation inside Pi-hole, as otherwise the entire DNSSEC chain would have to be re-requested for each query. However, when using unbound as a validating resolver, you do not need DNSSEC in Pi-hole and everything is fine.

This is not something you should be doing routinely.

Yes, I agree. I used the guide from the Pi-hole docs page and everything has been running perfectly for more than two years.


Hello. I've observed on several threads here the moderators/devs telling users not to disable Pi-hole's cache at all when using Unbound. Their recommendation is to leave Pi-hole's cache as is. I mentioned that because I don't want users going against developer recommendations just because I decided to do so on my setup.

As far as the cache is concerned, dumping and reloading it just has to be done responsibly. The main concern is restoring bad/expired data into the cache, which you might already have depending on your cache settings, e.g. prefetch and serve-expired.

Let's say you're performing simple maintenance or updates and reboot your system; it should only take about a minute to reload the cache. I usually get it done under a minute to avoid issues, and in that scenario it's completely safe. Now, if you plan to dump the cache and restore it a few days later, then that can definitely cause issues. It just has to be used with a little bit of common sense and caution.

I've honestly done it dozens of times, as I'm always tinkering with Unbound settings as I learn more about it, and I haven't run into a single issue with reloading the cache. Again, I'm usually restoring the dumped cache file within a minute of reloading or restarting Unbound. I also run flush_bogus right after restoring to get rid of any bad data in the cache. This isn't something people should be doing on a regular basis, just for maintenance or when you're forced to restart and don't want to lose a warm cache.

Yeah, it's amazing what Unbound can do. It's insanely fast, and there's so much to configure and tweak that you can use Unbound in just about any scenario for home/pro use. I don't think I'll ever use third-party upstream DNS servers again. :smiley:



The manpage says:

Copy the DNSSEC Authenticated Data bit from upstream servers to downstream clients. This is an alternative to having dnsmasq validate DNSSEC, but it depends on the security of the network between dnsmasq and the upstream servers, and the trustworthiness of the upstream servers. Note that caching the Authenticated Data bit correctly in all cases is not technically possible. If the AD bit is to be relied upon when using this option, then the cache should be disabled using --cache-size=0. In most cases, enabling DNSSEC validation within dnsmasq is a better option.
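In dnsmasq configuration terms, the combination the manpage is describing would look like this (both are documented dnsmasq options):

```
# pass the upstream AD bit through instead of validating locally
proxy-dnssec
# required if downstream clients are to rely on that AD bit
cache-size=0
```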

Which does what with this bit?

But isn't this also the case without proxy-dnssec? And do we know if any client out there cares about the AD bit at all? I'm afraid their only concern is whether they get an address or not.