Unbound & Pi-hole, latest versions: timeout only on the SERVFAIL test?!

Observed and expected behavior

Raspberry Pi 4B running the latest Raspberry Pi OS Lite (64-bit)

Debug token:

Debug Log

I set up a completely fresh Raspberry Pi OS Lite (64-bit).

I installed Pi-hole via

curl -sSL https://install.pi-hole.net | bash

then Unbound via

sudo apt install unbound

and followed the complete installation guide from here:

Docs Unbound

At first, once everything was up and running, I had problems with some settings in Pi-hole.
For example, I could not add groups or clients.

The error "Something went wrong, Attempt to write a Readonly Database" appeared in the top right.

I was able to fix that with usermod -a -G pihole www-data

as described here:
Error message: “Attempt to write a readonly database”

Now I have another problem that I have not been able to solve so far.

According to the Unbound guide, DNSSEC should be tested with:

dig fail01.dnssec.works @127.0.0.1 -p 5335
dig dnssec.works @127.0.0.1 -p 5335

The latter works fine. The first one always causes problems: on the first query I always get `dig fail01.dnssec.works @127.0.0.1 -p 5335

; <<>> DiG 9.16.33-Debian <<>> fail01.dnssec.works @127.0.0.1 -p 5335
;; global options: +cmd
;; connection timed out; no servers could be reached
`
If I run the query again it works, but usually with a very high response time, or it does not work at all.

Does anyone have an idea? Is this down to their server?

I also wanted to test DNSSEC with this site, as suggested in Pi-hole under Settings -> DNS -> DNSSEC, but it is offline: https://dnssec.vs.uni-due.de/

Here is a screenshot again: when I get the timeout, Pi-hole shows N/A

I found the "error".

In this thread:

https://community.letsencrypt.org/t/during-secondary-validation-no-valid-ip-addresses-found/121881/7

In /etc/unbound/unbound.conf.d/pi-hole.conf, the settings

msg-buffer-size: 4096
edns-buffer-size: 512

must be changed or added. After that, this works again:

dig fail01.dnssec.works @127.0.0.1 -p 5335

Apparently the buffer was too large, which caused the first query to run into a timeout.
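For anyone who wants to try the same change, here is a minimal sketch of that edit as a shell function. The name set_edns_buffer_size is made up for illustration, it needs GNU sed, and appending a missing line assumes the file ends inside its server: clause, as the pi-hole.conf from the Pi-hole docs does. It only touches edns-buffer-size.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pi-hole or unbound): set edns-buffer-size
# in an unbound config file, replacing an existing line or appending one.
# Assumption: when appending, the file ends inside its "server:" clause.
# Requires GNU sed (-i and -E).
set_edns_buffer_size() {
    file=$1
    size=$2
    if grep -Eq '^[[:space:]]*edns-buffer-size:' "$file"; then
        # Keep the existing indentation, swap only the value.
        sed -i -E "s/^([[:space:]]*)edns-buffer-size:.*/\1edns-buffer-size: $size/" "$file"
    else
        printf '    edns-buffer-size: %s\n' "$size" >> "$file"
    fi
}

# Usage (as root), followed by a syntax check and a restart:
#   set_edns_buffer_size /etc/unbound/unbound.conf.d/pi-hole.conf 512
#   unbound-checkconf && systemctl restart unbound
```

Calling it again with 1232 puts the value back, which makes it easy to compare both settings.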

This change conflicts with the change made by DL6ER, see here.

From the unbound docs (unbound.conf(5), Unbound 1.17.0 documentation, nlnetlabs.nl):

edns-buffer-size: <number>
Number of bytes size to advertise as the EDNS reassembly buffer size. This is the value put into datagrams over UDP towards peers. The actual buffer size is determined by msg-buffer-size: (both for TCP and UDP). Do not set higher than that value. Setting to 512 bypasses even the most stringent path MTU problems, but is seen as extreme, since the amount of TCP fallback generated is excessive (probably also for this resolver, consider tuning outgoing-num-tcp:).

Default: 1232 (DNS Flag Day 2020 recommendation)

msg-buffer-size: <number>
Number of bytes size of the message buffers. Default is 65552 bytes, enough for 64 Kb packets, the maximum DNS message size. No message larger than this can be sent or received. Can be reduced to use less memory, but some requests for DNS data, such as for huge resource records, will result in a SERVFAIL reply to the client.

Default: 65552

@DL6ER, would really like to know your opinion, and if this needs to be changed in the docs / configuration.

edit

as indicated here, I'm using good-A.test.dnssec-tools.org and good-AAAA.test.dnssec-tools.org

I also have the problem (not yet investigated, until now) that the test for good-A.test.dnssec-tools.org produces ;; connection timed out; no servers could be reached.

I only changed edns-buffer-size to 512, left msg-buffer-size at its default (65552), and restarted unbound; the test now succeeds.

There may be a valid reason to change the unbound setting to edns-buffer-size: 512. Note, however, that I didn't change the dnsmasq setting edns-packet-max=1232.

Would it be recommended to also change this?

/edit

edit2

note that the response for the test domains is different:
fail01.dnssec.works: info: query response was DNSSEC LAME
good-A.test.dnssec-tools.org: info: query response was THROWAWAY

/edit2

I have tested a bit more and removed msg-buffer-size again, because it is not involved in the timeout error.

In my opinion, too, only edns-buffer-size: 512 in the unbound config (pi-hole.conf) matters.

I left the dnsmasq setting at edns-packet-max=1232 in the config file in /etc/dnsmasq.d.

If I change edns-buffer-size: 512 back to edns-buffer-size: 1232, the timeout appears again, in my case only for the SERVFAIL DNSSEC test (fail01.dnssec.works).

I don't know why. I hope a developer can answer this.

A good way to find out where recursive resolution is failing is +trace:

; <<>> DiG 9.18.1-1ubuntu1.2-Ubuntu <<>> good-A.test.dnssec-tools.org +trace
;; global options: +cmd
.			1341	IN	NS	m.root-servers.net.
.			1341	IN	NS	f.root-servers.net.
.			1341	IN	NS	k.root-servers.net.
.			1341	IN	NS	d.root-servers.net.
.			1341	IN	NS	i.root-servers.net.
.			1341	IN	NS	a.root-servers.net.
.			1341	IN	NS	e.root-servers.net.
.			1341	IN	NS	b.root-servers.net.
.			1341	IN	NS	h.root-servers.net.
.			1341	IN	NS	j.root-servers.net.
.			1341	IN	NS	g.root-servers.net.
.			1341	IN	NS	c.root-servers.net.
.			1341	IN	NS	l.root-servers.net.
.			1341	IN	RRSIG	NS 8 0 518400 20221221050000 20221208040000 18733 . jaGkSMNOWgVH4lPHfpLLmwTBy1RMmyaJS9GZrNeyaOUl38RZctdVSacg UwesPpcCUL5pHZK9divZoJm5vAZKrFM5nbn/uwVBS3Knd5zXuaupmhVK kicx5QsxOLpc7e6DsXWjUM/R4YiC1C7kclPAdp9ix3YRPcOFVFsjlozw cF8s0C+h9SlUObhZRlIexMmzEubEUoICsaSRnZ+NqJvCDyRZJMXzwekt b3nMp5U/ZZQXvEPOnaoO+yub9IHkq2PoPPwOgdubBt/kW+oibDNbS92h RKFW+8MioSADqDGzpVunms7ubylfDcsgeN9mGRF2xCwmlDOdyQOQZ71h edoBrw==
;; Received 525 bytes from 192.168.2.11#53(192.168.2.11) in 8 ms

org.			172800	IN	NS	b2.org.afilias-nst.org.
org.			172800	IN	NS	d0.org.afilias-nst.org.
org.			172800	IN	NS	a0.org.afilias-nst.info.
org.			172800	IN	NS	b0.org.afilias-nst.org.
org.			172800	IN	NS	a2.org.afilias-nst.info.
org.			172800	IN	NS	c0.org.afilias-nst.info.
org.			86400	IN	DS	26974 8 2 4FEDE294C53F438A158C41D39489CD78A86BEB0D8A0AEAFF14745C0D 16E1DE32
org.			86400	IN	RRSIG	DS 8 1 86400 20221222050000 20221209040000 18733 . E0Ro4Bny4EED2MLaD1W71QSvK0uqtzMkR0P3FleX5/Dx275KFUYIoP6Y HIo0QLJgIpJM+P2o9wkIWebNRguKHkJctH6pOWDBgA2TyWEqW0dpiEZH pi/g1lPgLEU/Or6iTSclxmbGK+KrOhAH87EnA/0stsKwvPovZH41U0vL tLWrZXTM4JX4XO1VnyKFVagjeD8Izkf7yKSU6AdurGasPWPnqv0PV2xz LjaGWT70IIKcamuQWvi0YgIbCMM0g3+OzJ1e4XMp4RDX1Nb5WJ7YKNrj ESoJIslL0K0eNKEGtATbQo1wseWS596afWVtpZhZdOqPNedvIbh5Nw8i J8KvEQ==
;; Received 800 bytes from 202.12.27.33#53(m.root-servers.net) in 100 ms

dnssec-tools.org.	3600	IN	NS	nsw.dnssec-tools.org.
dnssec-tools.org.	3600	IN	NS	ns2.rollernet.us.
dnssec-tools.org.	3600	IN	NS	ns1.rollernet.us.
dnssec-tools.org.	3600	IN	NS	nsm.dnssec-tools.org.
dnssec-tools.org.	3600	IN	DS	9638 13 2 92551AA25C4ADE8E2882FBF4BEB5B54F9D84379B153848852B68BB3C 793F4B0B
dnssec-tools.org.	3600	IN	RRSIG	DS 8 2 3600 20221222152400 20221201142400 37749 org. k52BCz7auLaAp4vyPmYOcOPKSvGMwW7NGSd1F3kn8Q7DV02lx9ll4vlc BuygIewJ9zyXhx/3aRvlR1LJ6x+0/O/TbG18HgQkG01wnLFXa0XAaAgq OpH+KuCKMLnuosjB6PhvsGuVdR5dQehFayU/bsV+48/s7gEArFFNQfM4 awc=
;; Received 384 bytes from 199.19.53.1#53(c0.org.afilias-nst.info) in 188 ms

test.dnssec-tools.org.	300	IN	NS	dns2.test.dnssec-tools.org.
test.dnssec-tools.org.	300	IN	NS	dns1.test.dnssec-tools.org.
test.dnssec-tools.org.	300	IN	NSEC	www.dnssec-tools.org. NS RRSIG NSEC
test.dnssec-tools.org.	300	IN	RRSIG	NSEC 13 3 300 20221221125125 20221207112125 17868 dnssec-tools.org. 7USn9Qs/z7D6eRoW0+GwMMTgMn0riEU3C8wSYApTqyHSVmQzWt+kyIYF 7CKMEsNqNYrPbFbenC6TDbMX/FS2zA==
couldn't get address for 'dns2.test.dnssec-tools.org': failure
couldn't get address for 'dns1.test.dnssec-tools.org': failure
dig: couldn't get address for 'dns2.test.dnssec-tools.org': no more

We see that the actual problem here is couldn't get address for 'dns[1,2].test.dnssec-tools.org'.
This is confirmed in the unbound logs:

[...]
[1670585773] unbound[2916344:3] info: response for dns1.test.dnssec-tools.org. A IN
[1670585773] unbound[2916344:3] info: reply from <test.dnssec-tools.org.> 104.254.247.74#53
[1670585773] unbound[2916344:3] info: query response was THROWAWAY
[1670585774] unbound[2916344:2] info: resolving dns2.test.dnssec-tools.org. AAAA IN
[1670585774] unbound[2916344:2] info: response for dns2.test.dnssec-tools.org. AAAA IN
[1670585774] unbound[2916344:2] info: reply from <test.dnssec-tools.org.> 104.254.247.74#53
[1670585774] unbound[2916344:2] info: query response was THROWAWAY
[...]

And, indeed dig AAAA dns2.test.dnssec-tools.org leads to a SERVFAIL.
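To repeat that check for every nameserver +trace complained about, something like the following could help. It is only a sketch: unresolved_ns is a made-up name, and the usage lines assume unbound listens on 127.0.0.1 port 5335 as in the Pi-hole guide.

```shell
#!/bin/sh
# Hypothetical helper: read "dig +trace" output on stdin and print each
# nameserver that dig could not get an address for, once per name.
unresolved_ns() {
    sed -n "s/.*couldn't get address for '\([^']*\)'.*/\1/p" | sort -u
}

# Usage (assumes unbound on 127.0.0.1 port 5335):
#   dig good-A.test.dnssec-tools.org +trace 2>&1 | unresolved_ns |
#   while read -r ns; do
#       dig A "$ns" @127.0.0.1 -p 5335 +timeout=15 | grep status:
#   done
```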

Interestingly enough, I cannot confirm this. I made the change and made sure I restarted unbound properly, but the issue remained for me - I still can resolve neither dns1.test.dnssec-tools.org nor dns2.test.dnssec-tools.org.

Something similar happens for me when I test the other domain (fail01.dnssec.works) - this test always works for me, regardless of the edns-buffer-size - I do always get the DNSSEC LAME reply even when this takes really long (8 seconds), compare the pihole.log output here:

Dec  9 12:48:02 dnsmasq[3197856]: query[A] fail01.dnssec.works from 127.0.0.1
Dec  9 12:48:02 dnsmasq[3197856]: forwarded fail01.dnssec.works to 127.0.0.1#5335
Dec  9 12:48:10 dnsmasq[3197856]: validation fail01.dnssec.works is BOGUS
Dec  9 12:48:10 dnsmasq[3197856]: reply error is SERVFAIL

Just an idea: maybe you are hitting the timeout, and DNS resolution would actually have worked if the timeout were a bit longer? Try dig fail01.dnssec.works +timeout=15, and remember that the default timeout is only 5 seconds.
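To compare runs quickly, the status and query time can be pulled out of the dig output. A rough sketch, with dig_summary being a made-up helper name; it only looks at the header line and the "Query time" line that dig prints.

```shell
#!/bin/sh
# Hypothetical helper: read dig output on stdin and print
# "<status> <query time in ms>", or "TIMEOUT -" if dig got no answer.
dig_summary() {
    awk '
        /status:/ {
            # header: ";; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 38338"
            for (i = 1; i <= NF; i++)
                if ($i == "status:") { status = $(i + 1); sub(/,$/, "", status) }
        }
        /^;; Query time:/ { ms = $4 }
        END {
            if (status == "") print "TIMEOUT -"
            else print status, ms
        }
    '
}

# Usage: one attempt with a 15 second timeout
#   dig fail01.dnssec.works +timeout=15 +tries=1 | dig_summary
```

With +tries=1 a single slow answer is not hidden behind dig's retries, so the printed time is the time of that one attempt.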


Summary: I cannot reproduce what the two of you are describing here. My unbound version is 1.13.1 on a plain standard Ubuntu 22.04.1 LTS. My unbound config is the same as the one on the docs except for

    logfile: "/var/log/unbound/unbound.log"
    verbosity: 2
    prefer-ip6: yes
    num-threads: 4

edit

Second try: I flushed all buffers, tried again with my config set to exactly the same as in the docs, and found

Dec  9 12:55:19 dnsmasq[3197856]: query[A] fail01.dnssec.works from 127.0.0.1
Dec  9 12:55:19 dnsmasq[3197856]: forwarded fail01.dnssec.works to 127.0.0.1#5335
Dec  9 12:55:24 dnsmasq[3197856]: query[A] fail01.dnssec.works from 127.0.0.1
Dec  9 12:55:24 dnsmasq[3197856]: forwarded fail01.dnssec.works to 127.0.0.1#5335
Dec  9 12:55:29 dnsmasq[3197856]: query[A] fail01.dnssec.works from 127.0.0.1
Dec  9 12:55:29 dnsmasq[3197856]: forwarded fail01.dnssec.works to 127.0.0.1#5335
Dec  9 12:55:37 dnsmasq[3197856]: validation fail01.dnssec.works is BOGUS
Dec  9 12:55:37 dnsmasq[3197856]: reply error is SERVFAIL

So it took 18 seconds to get the reply with edns-buffer-size: 1232. When I reduce edns-buffer-size to 512, I can confirm getting the response within one second. FWIW, I still cannot use good-A.test.dnssec-tools.org, due to the same error as above, with either configuration.
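The elapsed time can also be read off the log mechanically. This is just a sketch: query_elapsed is a made-up name, and it assumes the query and the reply fall on the same day and that no other lookups are interleaved, which holds for the excerpt above but not for a busy pihole.log.

```shell
#!/bin/sh
# Hypothetical helper: read pihole.log lines on stdin and print the seconds
# between the first query for DOMAIN and the first "reply" line after it.
# Assumptions: same-day timestamps, no unrelated lookups in between.
query_elapsed() {
    awk -v domain="$1" '
        function secs(t,    p) { split(t, p, ":"); return p[1] * 3600 + p[2] * 60 + p[3] }
        start == "" && $5 ~ /^query\[/ && $6 == domain { start = secs($3) }
        start != "" && $5 == "reply" { print secs($3) - start; exit }
    '
}

# Usage (log path as on a default Pi-hole install of that time):
#   grep -A20 'query\[A\] fail01.dnssec.works' /var/log/pihole.log | query_elapsed fail01.dnssec.works
```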

As long as you are not observing any other issues with DNS resolution, I'd leave it at the recommended default; the justification is given right in the description:

Number of bytes size to advertise as the EDNS reassembly buffer size. This is the value put into datagrams over UDP towards peers. The actual buffer size is determined by msg-buffer-size: (both for TCP and UDP). Do not set higher than that value. Setting to 512 bypasses even the most stringent path MTU problems, but is seen as extreme, since the amount of TCP fallback generated is excessive (probably also for this resolver, consider tuning outgoing-num-tcp:).

I have tested this with the standard setting edns-buffer-size: 1232, and in my case I now get an answer.

dig fail01.dnssec.works +timeout=15
; <<>> DiG 9.16.33-Debian <<>> fail01.dnssec.works +timeout=15
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 38338
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;fail01.dnssec.works.           IN      A

;; Query time: 2719 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Fri Dec 09 18:09:04 CET 2022
;; MSG SIZE  rcvd: 48

Very slow: 2719 msec.

Without +timeout=15 I run into a timeout:

 dig fail01.dnssec.works

; <<>> DiG 9.16.33-Debian <<>> fail01.dnssec.works
;; global options: +cmd
;; connection timed out; no servers could be reached

systemctl status for unbound says:

unbound[7099:0] info: validation failure <fail01.dnssec.works. A IN>: no signatures from 5.45.109.212

The unbound version is 1.13.1.

Where is the problem here?

Additionally, I tested the examples from this site:

https://www.cyberciti.biz/faq/unix-linux-test-and-validate-dnssec-using-dig-command-line/

dig www.dnssec-failed.org
dig www.brokendnssec.net +dnssec

The last one above works without problems.

Both work for me using the standard settings, the first one is slow (2.4s), the other one is faster (0.4s).

As much as I hate to say "that's not my business", I still have to (kind of) say it here. In the end, this looks very clearly like an unbound issue, and a rather obscure one at that. I'm pretty sure that even if we invested hours of work in trying all sorts of combinations, this would still not tell us what is going wrong where.
This is a question that needs to be raised in an unbound forum where developers of unbound are present that know how to properly debug unbound. We could do this for any Pi-hole component ourselves but nobody has ever touched the unbound internals (to the best of my knowledge).

Good Morning @DL6ER

I can only thank you very much for your help. I will try to open an unbound issue then.

Your work on Pi-hole is great, keep it up :smiley:

I need to write "at" in this post (meaning @) because of this error: "Sorry, you can only mention 2 users in a post." Thanks for the tip, @rdwebdesign.

After some additional tests, I'm not really convinced this is an unbound problem.
I'm also running knot resolver on my system. pihole-FTL (dnsmasq) is configured to use unbound as the upstream resolver, with no backup. I use knot resolver only from scripts (@127.10.10.5 -p 5555) and for specific domain queries that cause problems with unbound.

Results:

  • unbound, configured with edns-buffer-size: 1232: dig @127.10.10.2 -p 5552 +trace fail01.dnssec.works -->> ;; connection timed out; no servers could be reached
  • kresd, configured with net.bufsize(512): dig @127.10.10.5 -p 5555 +trace fail01.dnssec.works -->> ;; Connection to 199.9.14.201#5555(199.9.14.201) for fail01.dnssec.works failed: timed out.

I verified kresd is actually using the setting:

 sudo kresd -c "/etc/knot-resolver/kresd.conf"
Interactive mode:
> net.bufsize()
512     -- result # 1
512     -- result # 2

Running the tests for the domain good-A.test.dnssec-tools.org produces identical results.

Looking at the results from DNSSEC Analyzer (verisignlabs.com):

Not sure here; my interpretation is that there is something wrong with the nameservers for the domains.

My workaround for these kinds of (unbound) problems:

/etc/dnsmasq.d/17-BypassUnbound.conf

server=/b2discourse.pi-hole.net/127.10.10.5#5555
server=/b2discourse.b-cdn.net/127.10.10.5#5555
server=/discourse.pi-hole.net/127.10.10.5#5555
server=/discourse-cdn.pi-hole.net/127.10.10.5#5555
server=/docs.pi-hole.net/127.10.10.5#5555
server=/piholediscourse.b-cdn.net/127.10.10.5#5555
server=/wp-cdn.pi-hole.net/127.10.10.5#5555

/etc/dnsmasq.d/18-serversfile.conf

servers-file=/etc/dnsmasq.d/17-BypassUnbound.conf

As you know, this configuration requires only a SIGHUP to reread the specific server configuration (see --servers-file=<file> in the dnsmasq man page):

pid=$(ps -e | grep 'pihole-FTL' | awk '{print $1}')
sudo /bin/kill -1 $pid

The entries in the file /etc/dnsmasq.d/17-BypassUnbound.conf are required (for me only) because of my unbound RPZ configuration and the ever-changing addresses of the domains (hosted by a cloud provider?). They are just examples for readers.

However ...

To overcome (unbound?) problems with the domains we have been using in this topic, one could use entries like:

server=/fail01.dnssec.works/8.8.8.8
server=/good-A.test.dnssec-tools.org/8.8.8.8

It looks like 8.8.8.8 has the missing information cached (dig @8.8.8.8 +trace fail01.dnssec.works), so with their cached information the domain query produces the expected reply.

If you put the text inside an inline code block like this @127.10.10.5 -p 5555, the @ doesn't trigger a mention.

Another example: @jpgpi250

Don't do that.

While this may allow the dig commands to produce the expected results, it would also completely defeat the purpose of verifying your unbound installation - you are running them via Google's DNS servers instead.

We've only recently switched our documentation to use the dnssec.works domain for verification, as the previous ones seemingly went out of business.
Given OP's observation, there may be a better alternative, but until we have found one, I'd stick with testing unbound instead of Google. :wink:


The entire point of me using this dnsmasq feature is to bypass my local unbound. I'm NOT saying I would use or recommend this to verify that DNSSEC is working; I'm using it to reach domains that aren't reachable through unbound. I also don't use @8.8.8.8 (Google) to bypass unbound; I'm using a locally installed knot resolver (also DNSSEC capable) to get an answer for the query (see the first section of examples, using @127.10.10.5#5555).

I just used @8.8.8.8 as an example to explain that using an alternative upstream resolver is possible, this for users that don't have a second local recursive resolver installed.

As far as I'm aware, @8.8.8.8, @1.1.1.1, @9.9.9.9, ... all use DNSSEC, so a user who has DNSSEC enabled in Pi-hole is warned about DNSSEC errors even when NOT using unbound as upstream.

Again, the proposed solution (that I have been using for a few months now) needs to be implemented only if the domain query is, and remains, unresolvable using unbound, i.e. as a workaround.

trying this, now, 6 @ code blocks in this reply, thanks for the hint, learned something here, again...

I have created an issue in the unbound GitHub repository:

https://github.com/NLnetLabs/unbound/issues/803
