Strange behavior once DNSSEC is enabled - resolving issues inside the Pi-Hole black box

Expected Behaviour:

Using DNSSEC does not create DNS resolving issues at all.

BUT (long story):
Over the past few years I tried several times to enable DNSSEC at http://pi.hole/admin/settings.php?tab=dns. Earlier, this gave pretty ugly results, such as:

  • de.wikipedia.org not loading (iOS)
  • buying apps on iOS not possible (p18-buy.itunes.apple.com)
  • Firefox sync not working (token.services.mozilla.com)
  • Updating QOwnNotes not possible (api.qownnotes.org)
  • YouTube not playing videos
  • Spotify not able to download content (iOS-App)
  • ...

Therefore I had to disable DNSSEC again. The problem is that all those issues are only discovered painfully, over days and weeks, as things turn out not to work as they should.

Today I gave it another try and enabled it again. So far all of the above seems to work (it's been a few years since those experiences), BUT: https://wander.science/projects/dns/dnssec-resolver-test/ no longer loads at all (on iOS; Windows seems to have it cached from before DNSSEC was enabled, details with nslookup below).

Actual Behaviour:

image

This is the setup:

  • Pi-Hole up to date (stable)
  • Upstream servers:
PIHOLE_DNS_1=9.9.9.9
PIHOLE_DNS_2=5.9.164.112

While Quad9 supports DNSSEC (https://quad9.net/support/faq/#dnssec), I could not verify that for 5.9.164.112 (https://digitalcourage.de/support/zensurfreier-dns-server).

BUT since, according to the Pi-Hole log, Quad9 seems to answer the request for wander.science, I don't get why the lookup fails.

Additionally from a Windows client:

  • [DNSSEC enabled] Using Pi-Hole as resolver (nslookup wander.science or nslookup wander.science ip-address-of-pi-hole):

Either (first test on Windows endpoint 1):

Server:  pi.hole
Address:  xxx.xxx.xxx.xxx

DNS request timed out.
    timeout was 2 seconds.
Non-authoritative answer:
Name:    wander.science

Or (second test on Windows endpoint 2):

Server:  pi.hole
Address:  xxx.xxx.xxx.xxx

*** pi.hole can't find wander.science: Unspecified error.

with query log output:
image

This is the query log output when trying to access wander.science in Safari browser on an iOS device:
image

  • [DNSSEC disabled] Once DNSSEC is disabled, using Pi-Hole (nslookup wander.science / nslookup wander.science ip-address-of-pi-hole gives):
Server:  pi.hole
Address:  xxx.xxx.xxx.xxx

Non-authoritative answer:
Name:    wander.science
Addresses:  2a01:4f8:13b:2048::113
          195.201.14.36

...and query log shows:
image

  • [DNSSEC enabled] Using Quad9 as resolver (nslookup wander.science 9.9.9.9):
Server:  dns9.quad9.net
Address:  9.9.9.9

Non-authoritative answer:
Name:    wander.science
Addresses:  2a01:4f8:13b:2048::113
          195.201.14.36
  • [DNSSEC enabled] Same for using the other upstream DNS server (nslookup wander.science 5.9.164.112):
Server:  dns3.digitalcourage.de
Address:  5.9.164.112

Non-authoritative answer:
Name:    wander.science
Addresses:  2a01:4f8:13b:2048::113
          195.201.14.36

I absolutely don't get what happens inside Pi-Hole. This is just one example, but likely a good one for finding out why DNSSEC creates such resolving issues.


Feel free to move this to https://discourse.pi-hole.net/c/bugs-problems-issues/community-help/36.

As those lookups are explicitly using public name servers, Pi-hole's DNSSEC setting would have no bearing on them.

Those time-outs would match the N/A replies for the DS lookup in your Pi-hole screenshots, i.e. the upstream DNS resolver has not answered that query (yet).

This happens if the upstream does not reply in time, e.g. if the upstream takes too long for the initial lookups. Consecutive lookups for the same domain could be expected to be faster.

Does your observation persist over several consecutive lookups?
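To check whether the failure persists over consecutive lookups, a small timing loop can help. This is only a minimal sketch: `repeat_lookup` is a hypothetical helper, and by default it asks the system resolver, which is assumed here to be the Pi-hole.

```python
import socket
import time


def repeat_lookup(name, attempts=5, resolve=None, pause=0.0):
    """Resolve `name` several times and record how long each attempt took."""
    if resolve is None:
        # Default: ask the system resolver (assumed to point at Pi-hole).
        def resolve(host):
            infos = socket.getaddrinfo(host, None)
            return sorted({info[4][0] for info in infos})

    results = []
    for attempt in range(1, attempts + 1):
        started = time.monotonic()
        try:
            outcome = resolve(name)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            outcome = f"failed: {exc}"
        elapsed = time.monotonic() - started
        results.append((attempt, elapsed, outcome))
        time.sleep(pause)
    return results


# Example usage (performs real lookups):
# for attempt, elapsed, outcome in repeat_lookup("wander.science"):
#     print(f"#{attempt}: {elapsed * 1000:.0f} ms -> {outcome}")
```

If only the first attempt is slow or fails and the later ones succeed, that would point at a slow upstream rather than a persistent validation problem.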

Also, instead of nslookup, you may want to consider using BIND9's delv for further analysis, as that would provide you with additional DNSSEC related information.

I know. That was just to check/prove there seems to be nothing wrong with the upstream DNS servers themselves.

Unfortunately yes, it does.

I did not (yet) have a closer look at all the options, but simply using delv @pi.hole wander.science gives

;; resolution failed: SERVFAIL

with DNSSEC enabled, and

;; broken trust chain resolving 'wander.science/DS/IN': xxx.xxx.xxx.xxx#53
;; broken trust chain resolving 'wander.science/DNSKEY/IN': xxx.xxx.xxx.xxx#53
;; broken trust chain resolving 'wander.science/A/IN': xxx.xxx.xxx.xxx#53
;; resolution failed: broken trust chain

with DNSSEC disabled, where xxx.xxx.xxx.xxx equals Pi-Hole's IP address.

With delv employing DNSSEC validation by default, that second output is expected.
When using it with Pi-hole, DNSSEC should be enabled.

+rtrace will list all DNS requests triggered for the validation, allowing you to align them with Pi-hole's Query Log:

delv +rtrace wander.science @pi.hole

Its output should look similar to mine:

;; fetch: wander.science/A
;; fetch: wander.science/DNSKEY
;; fetch: wander.science/DS
;; fetch: science/DNSKEY
;; fetch: science/DS
;; fetch: ./DNSKEY
; fully validated
wander.science.		1617	IN	A	195.201.14.36
wander.science.		1617	IN	RRSIG	A 13 2 1799 20240222000000 20240201000000 55755 wander.science. WlyicDagAt4SkY8DgUJi1K19GEGU8f/3wM054bq+GKNX5+Q6x+YI1hh1 H0FBMdth4LR0wMRWN2LR6qL9XzYNMg==

This is with using 9.9.9.9 as my Pi-hole's only upstream.
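For lining such a trace up with the Query Log, the output can be broken into its fetches and its final status. The following is a rough sketch that only understands the line shapes shown in this topic (`;; fetch: name/TYPE`, `; fully validated`, and other `;; ...` messages); `parse_delv_rtrace` is a hypothetical helper, not part of delv or Pi-hole.

```python
import re

# Matches lines such as ";; fetch: wander.science/DS" or ";; fetch: ./DNSKEY".
FETCH_RE = re.compile(r"^;; fetch: (?P<name>\S+)/(?P<rtype>[A-Z0-9]+)$")


def parse_delv_rtrace(output):
    """Split `delv +rtrace` output into fetches, problems and a validated flag."""
    fetches = []
    problems = []
    validated = False
    for raw in output.splitlines():
        line = raw.strip()
        match = FETCH_RE.match(line)
        if match:
            fetches.append((match.group("name"), match.group("rtype")))
        elif line == "; fully validated":
            validated = True
        elif line.startswith(";; "):
            # Any other ";; ..." line is treated as a diagnostic message.
            problems.append(line[3:])
    return {"fetches": fetches, "validated": validated, "problems": problems}
```

Each `(name, type)` pair in `fetches` should then have a matching entry in the Query Log; a fetch without a reply there is the one to look at.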

Interestingly enough, delv prompts a validation error for me when using 9.9.9.9 directly:

delv +rtrace wander.science @9.9.9.9
;; fetch: wander.science/A
;; fetch: science/DS
;; fetch: ./DNSKEY
;; fetch: wander.science/DS
;; fetch: science/DNSKEY
;; insecurity proof failed resolving 'wander.science/A/IN': 9.9.9.9#53
;; resolution failed: insecurity proof failed

However, consecutive calls were always successful.
This would suggest a short-lived, temporary issue on 9.9.9.9.
If that wasn't a hiccup in 9.9.9.9's resolver itself, I may have been unlucky enough to hit it during the domain's periodic signature renewal, which may have triggered the failure.

It wouldn't match your observation of repeatable lookup failures, though.

When DNSSEC is disabled in Pi-Hole:

;; fetch: wander.science/A
;; fetch: wander.science/DNSKEY
;; fetch: wander.science/DS
;; fetch: science/DNSKEY
;; broken trust chain resolving 'wander.science/DS/IN': xxx.xxx.xxx.xxx#53
;; broken trust chain resolving 'wander.science/DNSKEY/IN': xxx.xxx.xxx.xxx#53
;; broken trust chain resolving 'wander.science/A/IN': xxx.xxx.xxx.xxx#53
;; resolution failed: broken trust chain

with DNSSEC enabled in Pi-Hole:

;; fetch: wander.science/A
;; resolution failed: SERVFAIL

I tried that several times, always with the same output.

How can I make any progress here?

The zone is misconfigured.

image

image

So it's a... server side issue? I mean it's the service Pi-Hole links at http://pi.hole/admin/settings.php?tab=dns (and just the very first thing I tested once DNSSEC is enabled and that already fails...)...

If you depend on wander.science for some reason, you may consider contacting the domain owners/maintainers to inform them that they are using an invalid digest algorithm to sign their DNS records, making it difficult to access their site with DNSSEC enabled.

Well,

(first things first: it's not "difficult" to access, it's simply impossible. Not resolving, as simple as that. Broken. That's the reality :frowning: )

on the one hand - as stated in my last post - I'm wondering why you Pi-Hole experts are referring to that service as the ultimate test service if it fails to do what it is supposed to do. That would really make me laugh, in a strange way, if it weren't so sad :slight_smile:

On the other hand - as stated in my original post - it was just one (the very first) example of many issues I had in the past with DNSSEC enabled. Now, with even the recommended test website failing, that really leaves little confidence that all the DNS issues with DNSSEC from the past are actually a thing of the past.

Ultimate quiz question: Enable it and wait if/until things break (again) or leave it off again (forever)... hmm... :thinking:

Yeah, I'm done with you. You've been an asshat from day one.

I don't know where that harshness is coming from.

Relate to that please ...

...and keep calm, offending users trying to understand and fix things is probably not the path the Pi-Hole community management should choose. Thank you sir.

You are querying the domain wander.science but your screenshots showed that Pi-hole is querying that domain plus two other domains of the form

wander.science.<ext.domain>
wander.science.<int.domain>

At a guess, these appear to be possible intranet and extranet domains for a company. Why is your computer asking Pi-hole to query those domains? Is this a managed work computer?
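Those extra names are what a client produces when it appends its DNS suffix search list to a name that is not fully qualified. A simplified sketch of that expansion follows; `candidate_names` and the suffix `corp.example` are made up for illustration, and the real order and conditions differ between operating systems.

```python
def candidate_names(name, search_domains):
    """Names a stub resolver may try for `name`, given a suffix search list."""
    if name.endswith("."):
        # A trailing dot marks the name as fully qualified: nothing is appended.
        return [name]
    candidates = [name + "."]
    for suffix in search_domains:
        candidates.append(f"{name}.{suffix.strip('.')}.")
    return candidates


# Hypothetical search list; only the shape of the result matters here.
print(candidate_names("wander.science", ["corp.example"]))
# Fully qualified input is left alone.
print(candidate_names("wander.science.", ["corp.example"]))
```

With nslookup, ending the name with a dot (`nslookup wander.science.`) keeps the suffixes out of the test.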

Just your reality, though.
As mentioned, when trying to recreate your configuration, I cannot reproduce your issue.

Let me repeat that running delv when DNSSEC is disabled will always produce errors, as delv will apply DNSSEC validation. There is no need to and no benefit from resharing those results.

Then it's probably time to familiarize yourself with DNSSEC.

Assuming the time information on your Pi-hole machine is correct, I consider it highly likely that any of the non-resolving domains you've seen in the past when DNSSEC was enabled can be tracked down to either some server configuration issue or to compromised DNS replies, i.e. DNSSEC detected that the DNS records have indeed been tampered with and protects you from using them.
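The clock matters because every RRSIG record carries a validity window, and a validator rejects signatures outside of it. Below is a minimal sketch of that check, using the two timestamps from the RRSIG shown earlier in this topic (expiration `20240222000000`, inception `20240201000000`); the helper names are made up, and a real validator checks considerably more than this.

```python
from datetime import datetime, timezone

# RRSIG timestamps in presentation format are YYYYMMDDHHmmSS in UTC.
RRSIG_TIME_FORMAT = "%Y%m%d%H%M%S"


def parse_rrsig_time(stamp):
    """Convert an RRSIG timestamp string into an aware UTC datetime."""
    return datetime.strptime(stamp, RRSIG_TIME_FORMAT).replace(tzinfo=timezone.utc)


def signature_is_current(inception, expiration, now=None):
    """True if `now` lies within the signature's validity window."""
    if now is None:
        now = datetime.now(timezone.utc)
    return parse_rrsig_time(inception) <= now <= parse_rrsig_time(expiration)


# A machine whose clock says mid-February 2024 accepts the window,
# one whose clock is a month ahead does not.
print(signature_is_current("20240201000000", "20240222000000",
                           now=datetime(2024, 2, 10, tzinfo=timezone.utc)))
print(signature_is_current("20240201000000", "20240222000000",
                           now=datetime(2024, 3, 10, tzinfo=timezone.utc)))
```

A Pi-hole whose clock is off by days or weeks would therefore fail validation for otherwise healthy zones.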

I ruled out potential interference from VPN software (likely in split-tunnel mode) on one Windows endpoint used for initial testing. I have now repeated those tests with a "normal" Windows endpoint (and with an iOS and a Debian device, as I did when initially creating this forum topic, though I only used the Windows device output for screenshots) and updated the original post.

With the different upstream servers (and if conditional forwarding plays a role at all, at least that's where fritz.box is used, now visible in the test screenshots as wander.science.fritz.box) I don't think it's an upstream server issue.

After post #6 by DanSchaper in this topic I thought "OK, so it's a wander.science homemade issue", but since it resolves just fine for the others here WITH DNSSEC enabled, everything points back to my Pi-Hole setup.

Got it.

Assumption is correct I think. sudo systemctl status ntp gives green light. /etc/ntp.conf contains the default NTP servers

pool 0.debian.pool.ntp.org iburst
pool 1.debian.pool.ntp.org iburst
pool 2.debian.pool.ntp.org iburst
pool 3.debian.pool.ntp.org iburst

I really need to learn how DNSSEC works now. Currently I cannot imagine which part of the server configuration might be causing the issue. The only customized thing always was, and still is, that Pi-Hole runs on a virtual network interface (eth0:0). Might that be an issue for DNSSEC? I cannot imagine it (as other requests work just fine with DNSSEC enabled), but as I still don't understand DNSSEC, I'd better ask.

I can't speak for the former issues back then; that's why I am focusing so much on wander.science now, as that's an issue I can reproduce consistently on all devices in the network (every device uses Pi-Hole, as the router/DHCP server hands out Pi-Hole as the DNS server).


My ultimate confusion (still with no sufficient knowledge on how DNSSEC actually works in detail) is:
with DNSSEC disabled in Pi-Hole, I can surf at DNSSEC Resolver Test and run the DNSSEC Resolver Test ...which gives:

Not easy for me to understand, I am sorry.

It is the DNS resolver that you are using which determines if DNSSEC is enabled or disabled. In the case of Quad9's 9.9.9.9, and Unbound configured as per the Pi-hole guide, DNSSEC is enabled.

The setting to which you refer is in Pi-hole's Settings > DNS > Use DNSSEC. This setting effectively means "make use of the DNSSEC information via the DNS resolver". If the DNS resolver does not support DNSSEC then it should be left off (it is off by default). If the DNS resolver does support DNSSEC then it can be left off or turned on. If it is turned on then Pi-hole displays additional queries and information about the DNSSEC status for all its queries. This also increases the amount of info being logged in Pi-hole's query database.

That is why you are seeing that result even when the Use DNSSEC setting is turned off – your resolver is still doing DNSSEC.

Another possibility, when using a web browser, is that your browser is funneling its DNS requests through the browser vendor's DNS server, or a third-party add-on's DNS, via DNS-over-HTTPS or DNS-over-TLS, where that DNS server supports DNSSEC. This would cause that result to show that DNSSEC was enabled regardless of Pi-hole and its upstream resolver. However, in this case, this would mean you don't see these requests in the Query Log, nor would it be relevant for non-browser diagnostics such as delv.

There have been issues with DNSSEC in dnsmasq and, hence, in Pi-hole. That's true and there is no arguing around that. However, those were sorted out around 2018/2019.

Just to add one further datapoint: ever since then, I have always had DNSSEC enabled in my home Pi-hole and have never witnessed a single issue with pages not loading because DNS was not resolving. My setup uses a local unbound resolver as the sole upstream.

I concur with @chrislph that you are seeing the test pass because Quad9 does the validation for you. That's fine, but only in principle, as there is nothing ensuring the reply wasn't mangled on the last leg of the DNS query's journey (Quad9 to your home). That's why I am running my own resolver locally.

This is really simple to answer: because we are a free and open source project. Nobody makes a living from this project, and we have no employed developers or anything like that as a project. We simply have no manpower to monitor such external things. When the original test was taken down, we didn't notice either. A user created an issue ticket and we replaced the link. At that time, it worked properly. Between then and now it died and, again, nobody realized until you did. I am not sure there is anything wrong with this... I don't think so.

Hi,

I noticed the same thing happening over the last few days with lloydsbanking.com: it doesn't resolve when DNSSEC is enabled. As of today I was unable to log into the Apple Store, and 2FA authentication on Office365 failed. Disabling DNSSEC resolved the problem. (Pi-hole is running as a container.)

using nslookup (DNSSEC off)
Server: 127.0.0.53
Address: 127.0.0.53#53

Non-authoritative answer:
www.lloydsbank.com canonical name = s5933.cdn.lloydsbanking.com.
Name: s5933.cdn.lloydsbanking.com
Address: 95.101.250.249

nslookup DNSSEC On:
nslookup www.lloydsbank.com
;; communications error to 127.0.0.53#53: timed out
;; communications error to 127.0.0.53#53: timed out
;; communications error to 127.0.0.53#53: timed out
;; no servers could be reached

Trying a different address lookup with DNSSEC on:
nslookup www.saix.net
Server: 127.0.0.53
Address: 127.0.0.53#53

Non-authoritative answer:
Name: www.saix.net
Address: 196.25.1.204

Note: I am wondering whether my ISP is blocking DNSSEC queries; I'm not sure how to test this specifically.
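One rough way to test this is to send a query with the EDNS0 "DNSSEC OK" (DO) flag straight to an upstream resolver, bypassing the local stub at 127.0.0.53, and compare it with a plain query. The sketch below builds such a query by hand; the helper names are made up, the UDP size of 1232 is just a common choice, and a timeout only hints at filtering somewhere on the path rather than proving it.

```python
import socket
import struct

DO_FLAG = 0x8000  # "DNSSEC OK" bit in the EDNS0 flags
AD_FLAG = 0x0020  # "Authenticated Data" bit in the DNS header flags
TYPE_OPT = 41     # EDNS0 OPT pseudo-record type


def build_query(name, qtype=1, dnssec_ok=True, query_id=0x1234):
    """Build a minimal DNS query, optionally carrying an OPT record with DO set."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0,
                         1 if dnssec_ok else 0)  # RD flag set, one question
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # class IN
    if not dnssec_ok:
        return header + question
    # OPT record: root name, type 41, UDP size 1232, DO flag, no options.
    opt = b"\x00" + struct.pack("!HHIH", TYPE_OPT, 1232, DO_FLAG, 0)
    return header + question + opt


def summarize_response(response):
    """Return (rcode, ad_flag_set, answer_count) from a raw DNS response."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return flags & 0x000F, bool(flags & AD_FLAG), ancount


def probe(server, name, dnssec_ok=True, timeout=3.0):
    """Send the query over UDP; raises a timeout error if nothing comes back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, dnssec_ok=dnssec_ok), (server, 53))
        response, _ = sock.recvfrom(4096)
    return summarize_response(response)


# Example usage (sends real queries):
# print(probe("9.9.9.9", "www.lloydsbank.com", dnssec_ok=False))
# print(probe("9.9.9.9", "www.lloydsbank.com", dnssec_ok=True))
```

If the plain query is answered while the DO-flagged one consistently times out, that would be worth raising with the ISP; if both are answered, the problem is more likely closer to home.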
