"Cached" vs "Forwarded" Queries Incorrectly Labeled Sometimes

sujay · July 20, 2017, 4:15pm

Some DNS queries have query chains, where a query might have a few CNAMEs that end in a final A record. All these parts of the chain can have different TTLs. So parts of the chain might still be cached while one part with a low TTL has expired, and dnsmasq will then forward the original request again because part of it is out of date. I've noticed, though, that when this happens Pi-hole incorrectly labels the query as cached, even though dnsmasq forwards the request again.

Example:

Pi-hole always considers this as:

Can this behavior be corrected? Thanks.

DanSchaper · July 20, 2017, 4:19pm

The log is asynchronous, so you may be seeing the cached response listed first when in fact it's the reply that happens first; they just get listed out of order in the log. We could force the log to be synchronous, but that has performance penalties involved. @DL6ER knows the dnsmasq source code better than I, so he may have some insight into this.

sujay · July 20, 2017, 4:22pm

It's possible you're right about it just being log ordering but this is behavior I've noticed for a while now.

If you want you could try to reproduce it yourself with the "download.qnap.com" domain. You'll notice that the final CNAME (e6994.dscb.akamaiedge.net) that it gets forwarded to has a very low TTL, and that's what triggers it to be forwarded again.

DanSchaper · July 20, 2017, 4:25pm

We could prioritize this better if we knew what the end goal was. Is this for cache tuning and improving performance?

sujay · July 20, 2017, 4:27pm

Neither really, it's a minor request for the sake of improving accuracy in the admin interface.

In cases like this, where there is a query chain with differing TTLs, the Pi-hole interface always marks the query as cached when in fact it was forwarded, so it can potentially throw off the cumulative stats about cached queries.

DanSchaper · July 20, 2017, 4:28pm

Fair point, may be possible with a newer version of dnsmasq.

DL6ER · July 20, 2017, 4:39pm

I see what you are after, but let me explain what the technical background is, so you know what I'm talking about.
Currently, FTL implements what we think is the best approach to scan the asynchronous log written by dnsmasq:

Once we find a query entry, we look for the next line that (a) contains the domain name that was just requested and (b) says whether it was forwarded, answered from cache, etc. Hence, in your case, FTL determines the status "cached", and any later "forwarded" entry is ignored.
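A minimal sketch of that first-match scan, in Python rather than FTL's own code. The line shapes are the simplified ones used in the snippets in this thread, not the real dnsmasq log format, and the function name `classify_first_match` is made up for illustration:

```python
import re

# Simplified line shapes (an assumption), e.g.
#   "12:00:00 query[A] download.com from 1.2.3.4"
#   "12:00:00 cached download.com is <CNAME>"
QUERY = re.compile(r"^\S+ query\[\w+\] (\S+) from \S+$")
STATUS = re.compile(r"^\S+ (cached|forwarded) (\S+) ")

def classify_first_match(lines):
    """For each query line, take the status of the next line naming the same domain."""
    results = []
    for i, line in enumerate(lines):
        m = QUERY.match(line)
        if not m:
            continue
        domain = m.group(1)
        status = "unknown"
        for later in lines[i + 1:]:
            s = STATUS.match(later)
            if s and s.group(2) == domain:
                status = s.group(1)
                break  # first match wins; anything after it is ignored
        results.append((domain, status))
    return results

log = [
    "12:00:00 query[A] download.com from 1.2.3.4",
    "12:00:00 cached download.com is <CNAME>",
    "12:00:00 cached abc.download.com is <CNAME>",
    "12:00:01 forwarded download.com to 8.8.4.4",
    "12:00:01 reply download.com is <CNAME>",
]
print(classify_first_match(log))  # [('download.com', 'cached')]
```

The scan stops at the first `cached` line, so the `forwarded` line one second later never influences the result.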

In the case you are seeing, we could instead merely remember that there was a cached reply and keep looking for a forwarded entry anyway. However, nothing guarantees that the forwarded entry we find actually belongs to the very same query.

Example log snippet where this would work (similar to yours):

...
12:00:00 query[A] download.com from 1.2.3.4
12:00:00 cached download.com is <CNAME>
12:00:00 cached abc.download.com is <CNAME>
12:00:01 forwarded download.com to 8.8.4.4
12:00:01 reply download.com is <CNAME>
...

With the new scheme we would recognize that it was cached, but we would also find the forwarded entry, so we could show the query as forwarded.

Another example (assume the first query is not forwarded because the cache is still valid):

...
12:00:00 query[A] download.com from 1.2.3.4
12:00:00 cached download.com is <CNAME>
12:00:00 cached abc.download.com is <CNAME>
...
14:00:00 query[A] download.com from 1.2.3.4
14:00:00 forwarded download.com to 8.8.4.4
14:00:00 reply download.com is <CNAME>
...

Here, we would skip the first (correct) cached entry and would find the forwarded entry that appeared later in the log, so the first query would wrongly be shown as forwarded.
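Both cases can be reproduced with a small sketch of the "remember cached, but keep looking for forwarded" idea. As before, this is Python for illustration only, the line shapes are the simplified ones from the snippets above rather than real dnsmasq output, and `classify_prefer_forwarded` is a hypothetical name:

```python
import re

# Simplified line shapes (an assumption), as in the snippets above.
QUERY = re.compile(r"^\S+ query\[\w+\] (\S+) from \S+$")
STATUS = re.compile(r"^\S+ (cached|forwarded) (\S+) ")

def classify_prefer_forwarded(lines):
    """Remember a cached hit, but let any later forwarded line for the domain override it."""
    results = []
    for i, line in enumerate(lines):
        m = QUERY.match(line)
        if not m:
            continue
        domain = m.group(1)
        status = "unknown"
        for later in lines[i + 1:]:
            s = STATUS.match(later)
            if not s or s.group(2) != domain:
                continue
            if s.group(1) == "forwarded":
                status = "forwarded"
                break
            status = "cached"  # remember it, but keep scanning
        results.append((domain, status))
    return results

log_a = [
    "12:00:00 query[A] download.com from 1.2.3.4",
    "12:00:00 cached download.com is <CNAME>",
    "12:00:00 cached abc.download.com is <CNAME>",
    "12:00:01 forwarded download.com to 8.8.4.4",
    "12:00:01 reply download.com is <CNAME>",
]
log_b = [
    "12:00:00 query[A] download.com from 1.2.3.4",
    "12:00:00 cached download.com is <CNAME>",
    "12:00:00 cached abc.download.com is <CNAME>",
    "14:00:00 query[A] download.com from 1.2.3.4",
    "14:00:00 forwarded download.com to 8.8.4.4",
    "14:00:00 reply download.com is <CNAME>",
]
print(classify_prefer_forwarded(log_a))  # [('download.com', 'forwarded')]
print(classify_prefer_forwarded(log_b))  # both queries come out as 'forwarded'
```

In `log_b` the 12:00:00 query picks up the `forwarded` line that belongs to the 14:00:00 query, which is exactly the misattribution described above. Limiting the scan to a time window would reduce this, but nothing in the lines themselves ties an answer to its query.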

The general problem is that we are not able to identify which answer belongs to which query. We implement the best analysis we could come up with, but it has some intrinsic weak spots. As @DanSchaper already pointed out, more modern versions of dnsmasq can add IDs to the log file, which would allow unambiguous identification of requests and answers, but the problem is that our software implementation has to stay general enough to also work with legacy OSs.
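To illustrate why per-query IDs would remove the ambiguity, here is a minimal sketch that groups log lines by ID before deciding on a status. dnsmasq's `log-queries=extra` option is the likely source of such IDs, but the line layout below (ID right after the timestamp) is invented for this example and will not match what dnsmasq actually writes; `classify_by_id` is likewise a hypothetical name:

```python
import re
from collections import defaultdict

# Hypothetical layout: "<time> <id> <event> <domain> ..." (the ID column is an assumption).
LINE = re.compile(r"^\S+ (\d+) (query\[\w+\]|cached|forwarded|reply) (\S+)")

def classify_by_id(lines):
    """Group events by query ID, then label each query from its own events only."""
    domains = {}
    events = defaultdict(list)
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue
        qid, event, domain = m.groups()
        if event.startswith("query"):
            domains[qid] = domain
        else:
            events[qid].append(event)
    result = {}
    for qid, domain in domains.items():
        if "forwarded" in events[qid]:
            status = "forwarded"
        elif "cached" in events[qid]:
            status = "cached"
        else:
            status = "unknown"
        result[qid] = (domain, status)
    return result

log_with_ids = [
    "12:00:00 1 query[A] download.com from 1.2.3.4",
    "12:00:00 1 cached download.com is <CNAME>",
    "12:00:00 1 cached abc.download.com is <CNAME>",
    "14:00:00 2 query[A] download.com from 1.2.3.4",
    "14:00:00 2 forwarded download.com to 8.8.4.4",
    "14:00:00 2 reply download.com is <CNAME>",
]
print(classify_by_id(log_with_ids))
# {'1': ('download.com', 'cached'), '2': ('download.com', 'forwarded')}
```

Because each answer line carries the ID of its query, the order in which the asynchronous log writes the lines no longer matters, and a cached-then-forwarded chain within one ID can safely be shown as forwarded.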

sujay · July 20, 2017, 4:48pm

Makes a lot of sense, thanks for the detailed reply.

Seems like you have the right approach right now with the info that is currently made available to you. Maybe in the future when the dnsmasq IDs feature is more reliably available you might be able to improve it. Good thing this is more of a "would be nice to have" request and not a critical issue.

Thanks again for the detailed reply and keep up the great work. I wouldn't have even noticed this if I didn't love Pi-hole so much.