Pi-hole not caching

Hi all,

I am using Pi-hole with knot-resolver as its upstream and noticed that, for whatever reason, Pi-hole is not caching at all.


 echo ">cacheinfo" | nc 127.0.0.1 4711
cache-size: 10000
cache-live-freed: 0
cache-inserted: 0
ipv4: 4
ipv6: 6
srv: 0
cname: 0
ds: 1
dnskey: 0
other: 16
expired: 0
immortal: 27

It will stay like that.

Is there any reason why dnsmasq is not caching the answers from the upstream? I have absolutely no special config there (just root-forwarding).

DNSSEC is enabled in pihole, but disabling it makes no difference.

Contrary to your assumption, that result shows Pi-hole to be caching (54 entries in total).
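For reference, the 54 is just the sum of the per-type counters in that cacheinfo output; a quick sketch over the values pasted above:

```shell
# Sum the per-type entry counters from the >cacheinfo output above
# (the cache-size and freed/inserted bookkeeping lines are excluded).
cacheinfo='ipv4: 4
ipv6: 6
srv: 0
cname: 0
ds: 1
dnskey: 0
other: 16
expired: 0
immortal: 27'
echo "$cacheinfo" | awk -F': ' '{sum += $2} END {print sum}'
# prints 54
```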

If those numbers are indeed not changing, then I'd suspect your network is not using Pi-hole for DNS (yet).

What's the output for:

echo ">stats >quit" | nc localhost 4711

And please upload a debug log and post just the token URL that is generated after the log is uploaded by running the following command from the Pi-hole host terminal:

pihole -d

or do it through the Web interface:

Tools > Generate Debug Log

https://tricorder.pi-hole.net/rbgbHBD5/

Pi-hole is being used just fine; it's the only allowed DNS server in my network.

Network Setup:

Client --> knot frontend; from there, there are two ways:
Way 1: to my router, for forward and reverse lookups of local names/IPs, OR
Way 2: everything else to Pi-hole

From Way 2 it goes on to my knot backend and from there, without any further logic, recursively ...

Resolving is working fine; it's just dnsmasq not caching, for whatever reason :frowning:

Sorry, here is the output (I flushed the logs around 30-60 minutes ago, which makes it easier to debug):

root@pihole01:~$ echo ">stats >quit" | nc localhost 4711
domains_being_blocked 1316495
dns_queries_today 2293
ads_blocked_today 171
ads_percentage_today 7.457479
unique_domains 364
queries_forwarded 2000
queries_cached 9
clients_ever_seen 7
unique_clients 7
dns_queries_all_types 2293
reply_UNKNOWN 122
reply_NODATA 433
reply_NXDOMAIN 7
reply_CNAME 369
reply_IP 493
reply_DOMAIN 4
reply_RRNAME 5
reply_SERVFAIL 4
reply_REFUSED 0
reply_NOTIMP 0
reply_OTHER 0
reply_DNSSEC 0
reply_NONE 0
reply_BLOB 856
dns_queries_all_replies 2293
privacy_level 0
status enabled
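As an aside, the two relevant counters above already give the cache-hit share; a quick sketch over those values:

```shell
# Cache hits as a share of all upstream-or-cache-answered queries,
# using the queries_forwarded / queries_cached values from the stats above.
stats='queries_forwarded 2000
queries_cached 9'
echo "$stats" | awk '/forwarded/ {f=$2} /cached/ {c=$2} END {printf "%.2f%%\n", 100*c/(f+c)}'
# prints 0.45%
```

A share that low for a whole day of queries is indeed far below what a working dnsmasq cache usually yields.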

Your debug log shows your router to distribute your knot resolver at 192.168.1.10 as local DNS server:

*** [ DIAGNOSING ]: Discovering active DHCP servers (takes 10 seconds)
   Scanning all your interfaces for DHCP servers
   
   * Received 334 bytes from eth0:192.168.1.1
     Offered IP address: 192.168.1.12
     DHCP options:
      Message type: DHCPOFFER (2)
      router: 192.168.1.1
      dns-server: 192.168.1.10
      --- end of options ---

At the same time, your Pi-hole is using that knot resolver at 192.168.1.10#5353 as its upstream:

*** [ DIAGNOSING ]: contents of /etc/dnsmasq.d

-rw-r--r-- 1 root root 55 Jan 12 08:32 /etc/dnsmasq.d/00-pihole-extra.conf

-rw-r--r-- 1 root root 1.5K Jan 12 08:20 /etc/dnsmasq.d/01-pihole.conf
   server=192.168.1.10#5353

This would potentially close a DNS loop, where queries bounce back and forth between knot and Pi-hole forever or until time-out.

Also note that port 5353 is reserved for mDNS usage and may interfere with clients making use of it (like Apple's Bonjour or Linux's Avahi).

You should consider rearranging your DNS resolution chain:
Have your router distribute Pi-hole as the local DNS server, and configure Pi-hole to use your knot resolver as its only upstream.

EDIT: I just saw your edited post explaining your setup with front and upstream knot usage, together with your stats results. :wink:
Again, it shows Pi-hole is caching, if only a few entries.
Perhaps your front-end knot resolver is doing so as well, so it won't forward requests for known DNS records until their TTL expires, by which time they would also have expired from Pi-hole's cache.
I'd still recommend rearranging the resolution chain as suggested above.

Ok, let me try to explain it a bit more:

  • As you already figured out, there is no DNS loop :slight_smile:
  • Good point about 5353; I will change that, an easy thing to do (it should not have any impact, but why not use a different port).
  • The cached entries are maybe just the queries from Pi-hole itself; it does try to cache localhost, knot and so on, but the backend doesn't know them (negative cache).

The change from pihole --> knot to knot --> pihole --> knot was intentional: knot has much fancier options for splitting up domains and for conditions, and its caching is even better. Anyway, the frontend knot always asks Pi-hole to refresh the entries, so they should end up cached. When I use 8.8.8.8 as the backend, entries are cached again, so somehow dnsmasq/Pi-hole is being forced by the upstream NOT to cache results, but I don't know why, because everything in the upstream knot is at its defaults.

You wouldn't lose that if Pi-hole would be first in your chain, as your knot would still be seeing all traffic, apart from Pi-hole's blocking (which would happen at the earliest point with my suggestion).

What's the output of the following command:

dig heise.de @192.168.1.10 -p 5353

Knot Backend:

root@knot01:~$ dig heise.de @192.168.1.10 -p 5353

; <<>> DiG 9.16.33-Debian <<>> heise.de @192.168.1.10 -p 5353
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 27111
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;heise.de.                      IN      A

;; ANSWER SECTION:
heise.de.               75232   IN      A       193.99.144.80

;; Query time: 0 msec
;; SERVER: 192.168.1.10#5353(192.168.1.10)
;; WHEN: Thu Jan 12 11:15:52 UTC 2023
;; MSG SIZE  rcvd: 53

Knot Frontend:

root@knot01:~$ dig heise.de @192.168.1.10

; <<>> DiG 9.16.33-Debian <<>> heise.de @192.168.1.10
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 31301
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;heise.de.                      IN      A


;; ANSWER SECTION:
heise.de.               75176   IN      A       193.99.144.80

;; Query time: 0 msec
;; SERVER: 192.168.1.10#53(192.168.1.10)
;; WHEN: Thu Jan 12 11:16:48 UTC 2023
;; MSG SIZE  rcvd: 53

PiHole:

root@knot01:~$ dig heise.de @192.168.1.12

; <<>> DiG 9.16.33-Debian <<>> heise.de @192.168.1.12
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4124
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;heise.de.                      IN      A

;; ANSWER SECTION:
heise.de.               75133   IN      A       193.99.144.80

;; Query time: 28 msec
;; SERVER: 192.168.1.12#53(192.168.1.12)
;; WHEN: Thu Jan 12 11:17:31 UTC 2023
;; MSG SIZE  rcvd: 53

Having Pi-hole in front (again) would cost much more CPU time, because every single request, even internal ones, would be checked against the blocklist.

As you can see, Pi-hole is "slow" compared to knot, I guess because of the missing cache. (When I run the same query again directly against Pi-hole, it IS cached... very weird.)

Update:
When I MIRROR the queries to Pi-hole instead of FORWARDING them, they get cached. Can a client somehow say "please don't cache, Pi-hole", or does Pi-hole not cache forwarded queries?

On the contrary: your front-end must also compare domains to tell local ones from others.
By forwarding domains that Pi-hole would have blocked immediately, you'll have two resolvers inspect those domains.
So your assumption could only be correct if the number of DNS requests for local domains exceeded the number for blocked domains.

Besides, Pi-hole has been optimised for that very job - checking is very fast even for million-strong blocklists.

As you are running those commands from the knot resolver itself, they are inconclusive. The dig to your Pi-hole at least incurs the network latency penalty; it's also not clear whether the other resolvers would apply DNSSEC validation, or whether that lookup had been cached - which brings us back to your observation. :wink:

My request to run that dig against your upstream knot was trying to establish whether it would supply a shorter TTL, which could have explained why Pi-hole would cache entries for a shorter time.
As can be seen from your result, that is not the case - if Pi-hole had sent that request, your upstream knot's result would have prompted it to cache that A record for its TTL of 75232 seconds.
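For scale, that TTL amounts to nearly 21 hours; a quick arithmetic check:

```shell
# Convert the 75232-second TTL from the dig output above into hours.
awk 'BEGIN {printf "%.1f hours\n", 75232/3600}'
# prints 20.9 hours
```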

Nah, that sounds completely expected to me. :wink:

Could you verify if that previous request to your front-end knot has registered in Pi-hole's Query Log?

As you are running those commands from the knot resolver itself, they are inconclusive. The dig to your Pi-hole at least incurs the network latency penalty; it's also not clear whether the other resolvers would apply DNSSEC validation, or whether that lookup had been cached - which brings us back to your observation. :wink:

Containers :slight_smile: So no network latency; everything is running on the same HV. You can ignore that (of course, you could not have been aware of it). Anyway, I tested it from inside the Pi-hole container with exactly the same behaviour, just to prove my point :wink:

Nah, that sounds completely expected to me. :wink:

Could you verify if that previous request to your front-end knot has registered in Pi-hole's Query Log?

I can, and I even cleared knot's cache after the first and the second try (Pi-hole log):

SHORT Version:

Jan 12 12:31:41: query[A] HEiSE.de from 192.168.1.10
Jan 12 12:31:41: forwarded HEiSE.de to 192.168.1.10#5353
Jan 12 12:31:41: reply HEiSE.de is 193.99.144.80

Jan 12 12:31:41: query[A] hEiSE.DE from 192.168.1.10
Jan 12 12:31:41: forwarded hEiSE.DE to 192.168.1.10#5353
Jan 12 12:31:41: reply hEiSE.DE is 193.99.144.80

Long Version:

Jan 12 12:31:41: query[A] HEiSE.de from 192.168.1.10
Jan 12 12:31:41: forwarded HEiSE.de to 192.168.1.10#5353
Jan 12 12:31:41: reply HEiSE.de is 193.99.144.80
Jan 12 12:31:41: query[DS] hEiSe.De from 192.168.1.10
Jan 12 12:31:41: forwarded hEiSe.De to 192.168.1.10#5353
Jan 12 12:31:41: reply hEiSe.De is NODATA
Jan 12 12:31:41: query[NS] HEIsE.DE from 192.168.1.10
Jan 12 12:31:41: forwarded HEIsE.DE to 192.168.1.10#5353
Jan 12 12:31:41: reply HEIsE.DE is <NS>
Jan 12 12:31:41: reply HEIsE.DE is <NS>
Jan 12 12:31:41: reply HEIsE.DE is <NS>
Jan 12 12:31:41: reply HEIsE.DE is <NS>
Jan 12 12:31:41: reply HEIsE.DE is <NS>
Jan 12 12:31:41: query[DNSKEY] . from 192.168.1.10
Jan 12 12:31:41: forwarded . to 192.168.1.10#5353
Jan 12 12:31:41: reply . is <DNSKEY>
Jan 12 12:31:41: reply . is <DNSKEY>
Jan 12 12:31:41: query[DS] dE from 192.168.1.10
Jan 12 12:31:41: forwarded dE to 192.168.1.10#5353
Jan 12 12:31:41: reply dE is <DS>
Jan 12 12:31:41: query[DNSKEY] De from 192.168.1.10
Jan 12 12:31:41: forwarded De to 192.168.1.10#5353
Jan 12 12:31:41: reply De is <DNSKEY>
Jan 12 12:31:41: reply De is <DNSKEY>
Jan 12 12:31:41: query[A] hEiSE.DE from 192.168.1.10
Jan 12 12:31:41: forwarded hEiSE.DE to 192.168.1.10#5353
Jan 12 12:31:41: reply hEiSE.DE is 193.99.144.80
Jan 12 12:31:41: query[DS] hEiSe.dE from 192.168.1.10
Jan 12 12:31:41: forwarded hEiSe.dE to 192.168.1.10#5353
Jan 12 12:31:41: reply hEiSe.dE is NODATA
Jan 12 12:31:41: query[NS] HEISe.DE from 192.168.1.10
Jan 12 12:31:41: forwarded HEISe.DE to 192.168.1.10#5353
Jan 12 12:31:41: reply HEISe.DE is <NS>
Jan 12 12:31:41: reply HEISe.DE is <NS>
Jan 12 12:31:41: reply HEISe.DE is <NS>
Jan 12 12:31:41: reply HEISe.DE is <NS>
Jan 12 12:31:41: reply HEISe.DE is <NS>

As you can see, the same requests, but nothing is cached and everything is always forwarded...

echo "cache.clear()" | socat - UNIX-CONNECT:/run/knot-resolver/control/frontend1
dig A heise.de @192.168.1.10
echo "cache.clear()" | socat - UNIX-CONNECT:/run/knot-resolver/control/frontend1
dig A heise.de @192.168.1.10

FYI, that is the command sequence that was used.

I am not aware of the DNS protocol allowing a DNS client to control a DNS server's caching behaviour (and I also can't imagine there ever would be such a thing - it would be a major security concern).
It is designed to work in the opposite direction: as I've mentioned above, a DNS server can suggest to a client how long to cache a DNS record for (via the TTL).
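To make that direction of control concrete: the TTL is the second field of a dig ANSWER line, so a client can read it but never set it. A sketch, using the answer line from the dig output earlier in this thread:

```shell
# The server dictates the cache lifetime via the TTL, which is the
# second field of a dig ANSWER line; a client can only read it.
answer='heise.de.               75232   IN      A       193.99.144.80'
ttl=$(echo "$answer" | awk '{print $2}')
echo "$ttl"
# prints 75232
```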

Would MIRRORing perhaps cause your upstream knot (which I assume also receives the mirrored requests?) to behave differently, like injecting different TTLs into its results?

That's strange, as I cannot reproduce that behaviour.

This is what I see for a set of similar requests (click for details):
Jan 12 13:36:18  query[A] HEiSE.de from 192.168.0.20
Jan 12 13:36:18  forwarded HEiSE.de to 127.0.1.1#5335
Jan 12 13:36:18  reply HEiSE.de is 193.99.144.80
Jan 12 13:36:18  query[AAAA] HEiSE.de from 192.168.0.20
Jan 12 13:36:18  forwarded HEiSE.de to 127.0.1.1#5335
Jan 12 13:36:18  reply HEiSE.de is 2a02:2e0:3fe:1001:302::

Jan 12 13:36:35  query[A] hEiSE.DE from 192.168.0.20
Jan 12 13:36:35  cached hEiSE.DE is 193.99.144.80
Jan 12 13:36:35  query[AAAA] hEiSE.DE from 192.168.0.20
Jan 12 13:36:35  cached hEiSE.DE is 2a02:2e0:3fe:1001:302::

Let's see how many cache insertions your Pi-hole has handled:

dig +short chaos txt insertions.bind @192.168.1.12
 dig +short chaos txt insertions.bind @192.168.1.12
"0"

And yes, it IS strange; that's why I am trying to figure out what could cause this. Maybe the upstream is somehow telling Pi-hole not to cache, but the answers are the same and the flags are the same... so I can't see any difference here :frowning:

Edit:

I have tried other DNS upstreams (Level3 and Cloudflare) and there is still no caching... WTF?!

Jan 12 14:00:49: query[A] heise.de from 192.168.1.12
Jan 12 14:00:49: forwarded heise.de to 4.2.2.1
Jan 12 14:00:49: validation result is INSECURE
Jan 12 14:00:49: reply heise.de is 193.99.144.80
Jan 12 14:00:50: query[A] heise.de from 192.168.1.12
Jan 12 14:00:50: forwarded heise.de to 4.2.2.1
Jan 12 14:00:50: validation result is INSECURE
Jan 12 14:00:50: reply heise.de is 193.99.144.80
Jan 12 14:00:50: query[A] heise.de from 192.168.1.12
Jan 12 14:00:50: forwarded heise.de to 4.2.2.1
Jan 12 14:00:50: validation result is INSECURE
Jan 12 14:00:50: reply heise.de is 193.99.144.80

I can think of two reasons why Pi-hole would not bother inserting DNS records into its cache:
a) Pi-hole sees a zero-second TTL
b) Pi-hole's cache is disabled

This would have me suspect a) again, as I can see from your debug log that b) isn't the case:

*** [ DIAGNOSING ]: contents of /etc/dnsmasq.d

-rw-r--r-- 1 root root 1.5K Jan 12 08:20 /etc/dnsmasq.d/01-pihole.conf
   cache-size=10000
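To rule out reason a) without trawling logs, you could scan dig's answer lines for zero TTLs. A sketch; the sample lines below are made up for illustration (feed it real output via something like `dig +noall +answer heise.de @192.168.1.10 -p 5353`):

```shell
# Flag records that arrive with a zero TTL - dnsmasq will not cache those.
answers='heise.de.     0     IN   A   193.99.144.80
example.org.  3600  IN   A   93.184.216.34'
echo "$answers" | awk '$2 == 0 {print $1}'
# prints heise.de.
```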

Just in case this is somehow related to DNSSEC:
Could you try whether disabling Pi-hole's DNSSEC validation has an impact?

Jan 12 14:14:36: query[A] heise.de from 192.168.1.12
Jan 12 14:14:36: forwarded heise.de to 1.0.0.1
Jan 12 14:14:36: reply heise.de is 193.99.144.80
Jan 12 14:14:37: query[A] heise.de from 192.168.1.12
Jan 12 14:14:37: forwarded heise.de to 1.0.0.1
Jan 12 14:14:37: reply heise.de is 193.99.144.80
Jan 12 14:14:38: query[A] heise.de from 192.168.1.12
Jan 12 14:14:38: forwarded heise.de to 1.0.0.1
Jan 12 14:14:38: reply heise.de is 193.99.144.80
Jan 12 14:14:38: query[A] heise.de from 192.168.1.12
Jan 12 14:14:38: forwarded heise.de to 1.0.0.1
Jan 12 14:14:38: reply heise.de is 193.99.144.80
Jan 12 14:14:39: query[A] heise.de from 192.168.1.12
Jan 12 14:14:39: forwarded heise.de to 1.0.0.1
Jan 12 14:14:39: reply heise.de is 193.99.144.80
Jan 12 14:14:39: query[A] heise.de from 192.168.1.12
Jan 12 14:14:39: forwarded heise.de to 1.0.0.1
Jan 12 14:14:39: reply heise.de is 193.99.144.80

So, the same. We can remove knot from the list of suspects; I am querying Pi-hole directly and Pi-hole is forwarding to Cloudflare. Maybe it really is a bug?

I've installed a second Pi-hole in an Ubuntu container and caching works there... so: WTF. I am trying to find the difference now.

Maybe it's tied to your virtualisation somehow.
What kind of container is this anyhow? Your debug log would suggest it isn't Docker?

Could you run a dig to an upstream resolver (knot/public) from within your Pi-hole VM/container?

EDIT:
Also

I'm surprised that's working, as your debug log has previously shown your Pi-hole to be unable to talk to public upstreams:

*** [ DIAGNOSING ]: Name resolution (IPv4) using a random blocked domain and a known ad-serving domain
[✓] devilfishe.com is 0.0.0.0 on lo (127.0.0.1)
[✓] devilfishe.com is 0.0.0.0 on eth0 (192.168.1.12)
[✗] Failed to resolve doubleclick.com via a remote, public DNS server (8.8.8.8)

You wouldn't happen to have some firewall rules in place that prevent arbitrary DNS traffic or enforce the use of certain DNS servers?

What kind of container is this anyhow? Your debug log would suggest it isn't Docker?

It's LXC in Proxmox.

Could you run a dig to an upstream resolver (knot/public) from within your Pi-hole VM/container?

That's what I did in the last posts. The new container (Ubuntu instead of Debian; might be related, might not, I need to test that) is working fine so far.

I'm surprised that's working, as your debug log has previously shown your Pi-hole to be unable to talk to public upstreams:

I temporarily enabled it for debugging. My firewall is not enforcing anything, but it blocks all known public DoH servers, even over port 53. For testing purposes I disabled it to make sure the issue isn't knot-related, and it looks like it isn't.

OK, another update:
Caching works when querying Pi-hole directly, even with knot as upstream.
If knot is the one asking (not dig, but knot-resolver itself), responses are uncached... So somehow Pi-hole seems to show different behaviour depending on whether dig or knot asks for an address.

Indeed, it does not seem to be tied to a specific upstream resolver.

I won't rule out a possible misbehaviour in dnsmasq/pihole-FTL at this point, but I'd consider that rather unlikely. If caching didn't work at all, we'd see far more reports, and I am also unable to reproduce your issue on my bare-metal or Docker testing platforms.

If it is a bug, then the first hard part will be reproducing it.
Unfortunately, I do not run LXC/Proxmox myself. :frowning:

You could configure pihole-FTL to log some more details about how it's processing queries by adding the following flags to your /etc/pihole/pihole-FTL.conf:

DEBUG_QUERIES=true
DEBUG_EXTRA=true
DEBUG_EDNS0=true

Run pihole restartdns to apply those settings.

I will postpone this until tomorrow. I need to keep DNS running for a while again (currently back to client -> pihole -> knot), but with the additional logging I might be able to track the issue down. Thanks so far :slight_smile:

That actually doesn't help debug the problem.