reduce upstream requests as much as possible

Hi,
I'm using the newest Pi-hole (6.0.6+6.1) on a Debian machine.
My general use case is that I need to significantly reduce the number of DNS requests sent upstream.

So I added min-cache-ttl=1800 to the configuration, but it doesn't seem to work. (Yes, I am aware that some hosts may change their IP multiple times in such a long period, and I am aware of the risk.)
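For completeness, this is roughly where I put the option (a sketch of my own setup, assuming Pi-hole v6 - the file location and restart command may differ on other installs):

# /etc/pihole/pihole.toml (excerpt) - extra lines passed verbatim to the embedded dnsmasq
[misc]
  dnsmasq_lines = [ "min-cache-ttl=1800" ]

followed by a restart with sudo systemctl restart pihole-FTL.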

If I run tcpdump on the outgoing interface, I see a lot of requests sent to the upstream servers, sometimes only a few seconds apart. If I run pihole -t, I also see the same host names, forwarded twice for the A records, not to mention that they are repeated again after just a few minutes.

I think I also see queries for BLOCKED domains being sent to the upstream servers, and this part I don't understand at all.

Please give me some hints on how I should reconfigure Pi-hole to force caching and reduce upstream queries.
I'm thinking about adding some top hosts to static lists like /etc/hosts, but only as a last resort.

Washuu
PS. I'm overall very pleased with Pi-hole, and thank you very much for the great job you've done with this project.

Why?
Can you detail your use case and the reasons for that?

What do you want to achieve by reducing the number of upstream DNS queries?

I've found the config variable BLOCK_TTL to be very useful in this scenario. That's a v5 variable, but I'd imagine v6 also has an equivalent. Pi-hole by default attaches a very short TTL to blocked queries, which can quickly get out of hand with a large number of clients.
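As a sketch (the v5 setting goes into /etc/pihole/pihole-FTL.conf; the v6 equivalent, if I recall correctly, is dns.blockTTL in /etc/pihole/pihole.toml - values in seconds, adjust to taste):

# v5: /etc/pihole/pihole-FTL.conf
BLOCK_TTL=300

# v6: /etc/pihole/pihole.toml
[dns]
  blockTTL = 300

A restart of pihole-FTL is needed afterwards in either case.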

I would recommend against adjusting TTLs for legitimate queries. You will experience disruptions due to the skewed TTLs.

Thanks for the reply.
I cannot tell you all the details, but in short: the upstream link is VERY SLOW and VERY EXPENSIVE (I am charged for every MB sent). I am trying to use caching wherever possible (I also force outgoing web access through Squid). There are just 6 hosts in this network.
I just captured a nice fragment of pihole -t output showing some weird duplicates (believe me, I didn't add any lines, just replaced the host name):

Apr  1 23:07:10: query[A] some.public.host from 192.168.0.10
Apr  1 23:07:10: forwarded some.public.host to 9.9.9.9
Apr  1 23:07:10: query[AAAA] some.public.host from 192.168.0.10
Apr  1 23:07:10: forwarded some.public.host to 9.9.9.9
Apr  1 23:07:10: forwarded some.public.host to 9.9.9.9
Apr  1 23:07:10: forwarded some.public.host to 149.112.112.112
Apr  1 23:07:10: forwarded some.public.host to 9.9.9.9
Apr  1 23:07:10: forwarded some.public.host to 149.112.112.112
Apr  1 23:07:10: reply error is SERVFAIL
Apr  1 23:07:10: query[A] some.public.host from 192.168.0.10
Apr  1 23:07:10: forwarded some.public.host to 9.9.9.9
Apr  1 23:07:10: reply error is SERVFAIL
Apr  1 23:07:10: query[AAAA] some.public.host from 192.168.0.10
Apr  1 23:07:10: forwarded some.public.host to 9.9.9.9
Apr  1 23:07:10: forwarded some.public.host to 9.9.9.9
Apr  1 23:07:10: forwarded some.public.host to 149.112.112.112
Apr  1 23:07:10: forwarded some.public.host to 9.9.9.9
Apr  1 23:07:10: forwarded some.public.host to 149.112.112.112
Apr  1 23:07:11: reply error is SERVFAIL
Apr  1 23:07:11: reply error is SERVFAIL
Apr  1 23:07:11: query[A] some.public.host from 192.168.0.10
Apr  1 23:07:11: forwarded some.public.host to 9.9.9.9
Apr  1 23:07:11: query[AAAA] some.public.host from 192.168.0.10
Apr  1 23:07:11: forwarded some.public.host to 9.9.9.9
Apr  1 23:07:11: forwarded some.public.host to 9.9.9.9
Apr  1 23:07:11: forwarded some.public.host to 149.112.112.112
Apr  1 23:07:11: forwarded some.public.host to 9.9.9.9
Apr  1 23:07:11: forwarded some.public.host to 149.112.112.112
Apr  1 23:07:11: reply error is SERVFAIL
Apr  1 23:07:11: reply error is SERVFAIL

And let me repeat: I don't really care too much about the rate of INCOMING DNS queries, but I want to reduce the number of upstream DNS requests, within reasonable limits.

Your log excerpt shows the requests as created by your clients and received by Pi-hole.

Pi-hole has received an A and an AAAA request from 192.168.0.10 and forwarded them upstream, but the immediate upstream answer is SERVFAIL, indicating that either the upstream or an authoritative DNS server encountered an error trying to serve an answer.
That error may be temporary (e.g. if an authoritative DNS server was in the process of re-signing its DNS records, causing the upstream's DNSSEC validation to fail) or permanent (e.g. if an authoritative DNS server is misconfigured for the requested domain).

Upon receiving the SERVFAIL answer, your client at 192.168.0.10 then immediately resends the DNS request, which Pi-hole again forwards to its upstream to try to retrieve the correct reply.

DNS replies come with a kind of best-before date (TTL), indicating that clients may cache the reply until at most that date before they should re-request it.
That TTL is specific for each domain, and its value is controlled by the maintainers of the authoritative DNS servers for that domain.
Pi-hole already keeps DNS record replies in its cache for as long as that TTL (actually, v6 even keeps them a bit longer, to be able to serve a reply immediately while refreshing the stale record).

As Pi-hole is not serving DNS records for some.public.host from its cache, that would indicate a zero TTL.
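For reference, the remaining TTL is the second column of dig's answer section, counted in seconds (illustrative output for a placeholder domain, not your actual host):

dig some.public.host +noall +answer

some.public.host.    300    IN    A    203.0.113.10

A value of 0 there - or no answer section at all in case of an error - would explain why nothing can be cached.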

What's the output of dig some.public.host?

As BLOCK_TTL / dns.blockTTL controls blocking behaviour, it has no impact on requests forwarded to upstreams, and thus cannot reduce your data usage.

You should note that DNS requests make up but a fraction of your overall network traffic, with a typical single reply accounting for 100 to 200 bytes. For my own network, e.g., that would be ~0.04% (~600K of DNS replies compared to ~800MB of download volume) for yesterday's traffic.
You may want to determine your own average rate, as your specific rate will differ, based on what your clients typically request. That would allow you to better decide whether investing further effort into DNS improvements would be worthwhile.
With my numbers, reducing DNS replies by 50% would result in only a 0.02% traffic reduction.

Would you be able to share an example of that?

Hi, thank you for the reply.

As for the rationale: when a webpage contains many elements like banners, pictures, frames, etc., loading it requires several (sometimes more than 10) DNS lookups beforehand. That can slow down loading regardless of the content transfer (which is sometimes cached by the Squid web proxy).

As for the SERVFAIL, you are totally right.
I've checked that the domain of the public host is registered in the root DNS servers, BUT it does not have the relevant NS or SOA records - that's why I get SERVFAIL instead of NXDOMAIN.
As this host indeed does not resolve and is not needed, I should add it as a static entry to stop sending queries about it, right?

;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 29629
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; EDE: 9 (DNSKEY Missing)
;; QUESTION SECTION:
;some.public.host.         IN      A

;; Query time: 274 msec
;; SERVER: 9.9.9.9#53(9.9.9.9) (UDP)
;; WHEN: Wed Apr 02 10:19:06 UTC 2025
;; MSG SIZE  rcvd: 56
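(For reference, one way to check this with dig - illustrative commands only, with the names replaced just like above:

# is the domain delegated at all, i.e. does it have NS records?
dig NS public.host @9.9.9.9

# retry with DNSSEC validation disabled (checking-disabled bit),
# to see whether the SERVFAIL comes from the missing DNSKEY
dig some.public.host @9.9.9.9 +cd
)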

As for the blocked and forwarded entries, I investigated it a bit, and in most cases the queries were not sent upstream, but sometimes they still were:

Apr  2 10:52:20 dnsmasq[1707711]: query[A] www.should_be_blocked.com from 192.168.0.210
Apr  2 10:52:20 dnsmasq[1707711]: exactly denied www.should_be_blocked.com is 0.0.0.0
Apr  2 10:52:22 dnsmasq[1707711]: query[A] www.should_be_blocked.com from 192.168.0.210
Apr  2 10:52:22 dnsmasq[1707711]: exactly denied www.should_be_blocked.com is 0.0.0.0
Apr  2 10:52:24 dnsmasq[1707711]: query[A] www.should_be_blocked.com from 192.168.0.210
Apr  2 10:52:24 dnsmasq[1707711]: exactly denied www.should_be_blocked.com is 0.0.0.0
Apr  2 10:52:26 dnsmasq[1707711]: query[A] www.should_be_blocked.com from 192.168.0.210
Apr  2 10:52:26 dnsmasq[1707711]: forwarded www.should_be_blocked.com to 9.9.9.9
Apr  2 10:52:26 dnsmasq[1707711]: reply www.should_be_blocked.com is <CNAME>
Apr  2 10:52:26 dnsmasq[1707711]: reply www.should_be_blocked.com.should_be_blocked.jiasu.com is <CNAME>
Apr  2 10:52:26 dnsmasq[1707711]: reply www.should_be_blocked.com.w.kunluncan.com is blocked during CNAME inspection
Apr  2 10:52:26 dnsmasq[1707711]: exactly denied www.should_be_blocked.com is 0.0.0.0

This happened only about 20 times in the last 24 hours, while the client queries are sent every five seconds, so it's not a big deal - I was just surprised to see the domain in the tcpdump capture of the upstream traffic.

I got the bottom line from the responses given: reducing upstream queries is not trivial and not worth the effort. Most likely you are right; even if I save a second or two, it's not much. I just hoped for a simple solution that wouldn't require digging deeper into the internals.

My last question: when I add a local DNS record, it seems to be used for A queries only, but not for HTTPS queries. Is there any way to force the local DNS records to be used in HTTPS replies as well (as IPv4 hints)?

That wouldn't stop clients from sending queries, but Pi-hole would not forward them upstream for resolution anymore.
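If you do go the static-entry route you mentioned, a minimal sketch (assuming Pi-hole v6; the address is just a placeholder) would be an entry under Local DNS Records in the web interface, or directly in the configuration:

# /etc/pihole/pihole.toml (excerpt)
[dns]
  # "IP domain" pairs, using the same syntax as /etc/hosts lines
  hosts = [ "203.0.113.10 some.public.host" ]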

Blocking some.public.host in Pi-hole would have much the same effect, and it would work for all requests, including A, AAAA and HTTPS.
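You can verify that from any client (a sketch - replace 192.168.0.1 with your Pi-hole's address; the exact blocked reply depends on your blocking mode, and older dig versions may need -t TYPE65 instead of HTTPS):

dig A     some.public.host @192.168.0.1
dig AAAA  some.public.host @192.168.0.1
dig HTTPS some.public.host @192.168.0.1

None of these should show up in a tcpdump of the upstream interface.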

If yours is a tightly controlled network where you happen to know exactly(!) which domains your clients are required to resolve successfully(!), you could also consider blocking everything and allowing only those very domains, as sketched below.
This could be the case for a network of IoT clients only, which would be programmed to contact a few well-known domains, some of which you would deem unnecessary to resolve.
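A minimal sketch of that approach, using the v5 command names (the v6 CLI and the web interface offer the same under Domains; the allowed domains are placeholders):

# deny everything via a catch-all regex...
pihole --regex '.*'

# ...then explicitly allow the handful of domains your clients actually need
pihole -w api.example-iot.com
pihole -w time.example-iot.com

Exact allows take precedence over the regex; subdomains would need their own entries or an allow regex of their own.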

In cases where clients are expected to connect to any domain, a block-all approach is bound to cause frustration for users and would quickly become a maintenance burden.

With your redacting of log output, I can't make any sensible statement.

If you would like us to have a look, please upload a debug log and post just the token URL that is generated after the log is uploaded by running the following command from the Pi-hole host terminal:

pihole -d

or if you run your Pi-hole as a Docker container:

docker exec -it <pihole-container-name-or-id> pihole -d

where you substitute <pihole-container-name-or-id> as required.

In addition, you could upload your log file findings to our servers, where they auto-delete after 48 hours and can only be accessed by trusted members of the Pi-hole team:

cat my.file.log | pihole tricorder

Again, you'd just need to share the token then.
