Very slow DNS resolution through Pi-Hole using Stubby

Expected Behaviour:

Very quick DNS resolution when doing a forward lookup. Typical would expect under 100ms

Actual Behaviour:

After upgrade to v4.0 doing a forward lookup that returns a CNAME and subsequent lookup takes much much longer, maybe 400-1500ms which times out a lot of applications.

$ dig @192.168.1.27 yahoo.jp

; <<>> DiG 9.10.6 <<>> @192.168.1.27 yahoo.jp
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 38251
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;yahoo.jp.			IN	A

;; ANSWER SECTION:
yahoo.jp.		300	IN	A	183.79.23.196
yahoo.jp.		300	IN	A	183.79.227.111

;; Query time: 1629 msec
;; SERVER: 192.168.1.27#53(192.168.1.27)
;; WHEN: Fri Aug 17 13:55:19 CDT 2018
;; MSG SIZE  rcvd: 85

Debug Token:

7i4q6900ga

What upstream DNS is stubby using, and have you tried to query that one directly for the same DNS request, bypassing stubby?

Does this delay also happen if you take Stubby out of the picture (use, e.g. 8.8.8.8 or 9.9.9.9 as upstream DNS provider)?

Fair enough point. Looks like I have some of default servers in here in addition to the Quad9 server. First three appear to be test servers with spki in use. Might just straight disable those.

  • address_data: 145.100.185.15
  • address_data: 145.100.185.16
  • address_data: 185.49.141.37
  • address_data: 9.9.9.9

Difference in Google 8.8.8.8 resolution and Pi-Hole + Stubby:

pi@raspi-pihole:~ $ dig @8.8.8.8 yahoo.jp

; <<>> DiG 9.10.3-P4-Raspbian <<>> @8.8.8.8 yahoo.jp
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5913
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;yahoo.jp.			IN	A

;; ANSWER SECTION:
yahoo.jp.		186	IN	A	183.79.23.196
yahoo.jp.		186	IN	A	183.79.227.111

;; Query time: 25 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Fri Aug 17 15:38:37 CDT 2018
;; MSG SIZE  rcvd: 69

pi@raspi-pihole:~ $ dig @192.168.1.27 yahoo.jp

; <<>> DiG 9.10.3-P4-Raspbian <<>> @192.168.1.27 yahoo.jp
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 44127
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;yahoo.jp.			IN	A

;; ANSWER SECTION:
yahoo.jp.		300	IN	A	183.79.227.111
yahoo.jp.		300	IN	A	183.79.23.196

;; Query time: 672 msec
;; SERVER: 192.168.1.27#53(192.168.1.27)
;; WHEN: Fri Aug 17 15:38:50 CDT 2018
;; MSG SIZE  rcvd: 85

Removing the three test servers does speed things up somewhat. Now into the sub-200ms range. I'm thinking that might be closer to "good" with the TLS overhead.

Just as a single point of comparison, I use unbound and it's a few hundred msec for that cold query, and it had to go to the authoritative servers.

pi@Pi-3B:~ $ dig yahoo.jp

; <<>> DiG 9.10.3-P4-Raspbian <<>> yahoo.jp
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 53010
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 4, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1472
;; QUESTION SECTION:
;yahoo.jp.			IN	A

;; ANSWER SECTION:
yahoo.jp.		3600	IN	A	183.79.23.196
yahoo.jp.		3600	IN	A	183.79.227.111

;; AUTHORITY SECTION:
yahoo.jp.		3600	IN	NS	ns11.yahoo.co.jp.
yahoo.jp.		3600	IN	NS	ns01.yahoo.co.jp.
yahoo.jp.		3600	IN	NS	ns12.yahoo.co.jp.
yahoo.jp.		3600	IN	NS	ns02.yahoo.co.jp.

;; Query time: 346 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Fri Aug 17 14:21:03 CDT 2018
;; MSG SIZE  rcvd: 154

@jfb Thanks, that does help add some comparison. Mobile apps seem particularly sensitive to this. The kiddos have been bugging me about things not working the way they used to before. Learning opportunity, they too can learn Wireshark and networking!

With FTLDNS, directly queried on my Pi 3B:

$ dig yahoo.jp

; <<>> DiG 9.9.5-9+deb8u15-Raspbian <<>> yahoo.jp
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36674
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 4, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1472
;; QUESTION SECTION:
;yahoo.jp.                      IN      A

;; ANSWER SECTION:
yahoo.jp.               3600    IN      A       183.79.23.196
yahoo.jp.               3600    IN      A       183.79.227.111

;; AUTHORITY SECTION:
yahoo.jp.               3600    IN      NS      ns01.yahoo.co.jp.
yahoo.jp.               3600    IN      NS      ns11.yahoo.co.jp.
yahoo.jp.               3600    IN      NS      ns12.yahoo.co.jp.
yahoo.jp.               3600    IN      NS      ns02.yahoo.co.jp.

;; Query time: 842 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Fri Aug 17 23:11:54 CEST 2018
;; MSG SIZE  rcvd: 154

(this was the first round with my local unbound resolver not being aware of this domain)

$ dig yahoo.jp

; <<>> DiG 9.9.5-9+deb8u15-Raspbian <<>> yahoo.jp
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 29200
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;yahoo.jp.                      IN      A

;; ANSWER SECTION:
yahoo.jp.               3545    IN      A       183.79.227.111
yahoo.jp.               3545    IN      A       183.79.23.196

;; Query time: 11 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Fri Aug 17 23:12:49 CEST 2018
;; MSG SIZE  rcvd: 69

on my second try. I repeated some more and see reply times between 1 to 12 milliseconds but not more.

My subsequent queries to this domain were 1, 0 and 1 msec from cache.

I changed my upstream provider to Quad9 and saw:

$ dig yahoo.jp                       

; <<>> DiG 9.9.5-9+deb8u15-Raspbian <<>> yahoo.jp
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 54779
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;yahoo.jp.                      IN      A

;; ANSWER SECTION:
yahoo.jp.               300     IN      A       183.79.23.196
yahoo.jp.               300     IN      A       183.79.227.111

;; Query time: 284 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Fri Aug 17 23:16:50 CEST 2018
;; MSG SIZE  rcvd: 69

on subsequent queries the queries again completed in at most a few msec.

Looking at debugging data from FTL, it doesn't seem like yahoo.jp is a CNAME:

[2018-08-17 23:16:49.815] **** new UDP query[A] "yahoo.jp" from 127.0.0.1 (ID 4)
[2018-08-17 23:16:49.815] **** forwarded yahoo.jp to 9.9.9.9 (ID 4)
[2018-08-17 23:16:50.096] **** got reply yahoo.jp is 183.79.23.196 (ID 4)
[2018-08-17 23:16:50.097] **** got reply yahoo.jp is 183.79.227.111 (ID 4)

You can see that 281 milliseconds between "forwarded to 9.9.9.9" and "got reply" have really been caused by the upstream provider.

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.