Dig error using dnscrypt-proxy "Truncated, retrying in TCP mode"

dnscrypt

#1

I’m running pihole-ftl (port 53) and dnscrypt-proxy (port 54) on a raspberry pi zero w installed with raspbian stretch.

When run dig pointing at pihole-ftl, the first line of the output shows this:

;; Truncated, retrying in TCP mode

Note that this doesn’t show up when I point dig at dnscrypt-proxy.

Expected Behaviour:

Should be same output as when I run dig -p 54 google.com

; <<>> DiG 9.10.3-P4-Raspbian <<>> -p 54 google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15460
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1252
;; QUESTION SECTION:
;google.com.			IN	A

;; ANSWER SECTION:
google.com.		599	IN	A	172.217.26.78

;; Query time: 14 msec
;; SERVER: 127.0.0.1#54(127.0.0.1)
;; WHEN: Sun Oct 07 14:15:12 PDT 2018
;; MSG SIZE  rcvd: 65

Actual Behaviour:

Output of command dig google.com shown below:

;; Truncated, retrying in TCP mode.

; <<>> DiG 9.10.3-P4-Raspbian <<>> google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12362
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1472
;; QUESTION SECTION:
;google.com.			IN	A

;; ANSWER SECTION:
google.com.		1131	IN	A	172.217.20.78

;; Query time: 34 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Sun Oct 07 14:15:17 PDT 2018
;; MSG SIZE  rcvd: 55

Debug Token:

y0kk052dei


Pi-hole not caching when using dnscrypt-proxy
#3

Which guide did you use to set up DNSCrypt proxy?

To help narrow down the problem, temporarily change your upstream DNS server in Pi-Hole from 127.0.0.1#54 (dnscrypt), to one of the commercial IPV4 servers (say, Cloudflare). Then restart pihole-FTL (sudo service pihole-FTL restart) and repeat the dig google.com command.


#4

I mainly used this guide: https://github.com/jedisct1/dnscrypt-proxy/wiki/Installation-linux

Here’s my dnscrypt-proxy configuration, maybe this will help?

listen_addresses = ['127.0.0.1:54']
max_clients = 250

ipv4_servers = true
ipv6_servers = false

dnscrypt_servers = true
doh_servers = true

require_dnssec = true
require_nolog = true
require_nofilter = true

force_tcp = false
timeout = 2500
keepalive = 30

cert_refresh_delay = 240
tls_disable_session_tickets = true
tls_cipher_suite = [52392, 49199]

fallback_resolver = '9.9.9.9:53'
ignore_system_dns = false
netprobe_timeout = 30

log_files_max_size = 10
log_files_max_age = 7
log_files_max_backups = 1

block_ipv6 = false

cache = true
cache_size = 512
cache_min_ttl = 600
cache_max_ttl = 86400
cache_neg_min_ttl = 60
cache_neg_max_ttl = 600

[sources]
  [sources.'public-resolvers']
  urls = ['https://raw.githubusercontent.com/DNSCrypt/dnscrypt-resolvers/master/v2/public-resolvers.md', 'https://download.dnscrypt.info/resolvers-list/v2/public-resolvers.md']
  cache_file = 'public-resolvers.md'
  minisign_key = 'RWQf6LRCGA9i53mlYecO4IzT51TGPpvWucNSCh1CBM0QTaLn73Y7GFO3'
  refresh_delay = 72
  prefix = ''

Here’s my setupVars.conf as well:

PIHOLE_INTERFACE=eth0
IPV4_ADDRESS=192.168.1.2/24
IPV6_ADDRESS=
QUERY_LOGGING=true
INSTALL_WEB_SERVER=true
INSTALL_WEB_INTERFACE=true
LIGHTTPD_ENABLED=true
DHCP_ACTIVE=true
DHCP_START=192.168.1.100
DHCP_END=192.168.1.149
DHCP_ROUTER=192.168.1.1
DHCP_LEASETIME=24
PIHOLE_DOMAIN=lan
DHCP_IPv6=false
DNSMASQ_LISTENING=single
PIHOLE_DNS_1=127.0.0.1#54
DNS_FQDN_REQUIRED=true
DNS_BOGUS_PRIV=true
DNSSEC=true
CONDITIONAL_FORWARDING=false

I tried dig google.com after setting pihole-ftl’s usptream server to cloudflare, and the additional line does not show up:

pi@pi-hole:~ $ dig google.com

; <<>> DiG 9.10.3-P4-Raspbian <<>> google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 60894
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;google.com.			IN	A

;; ANSWER SECTION:
google.com.		112	IN	A	172.217.14.238

;; Query time: 105 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Sun Oct 07 20:12:44 PDT 2018
;; MSG SIZE  rcvd: 55

The message ;; Truncated, retrying in TCP mode. resurfaces the moment I switch back to dnscrypt-proxy.

I’m curious about why the combination of pihole and dnscrypt causes this issue (it doesn’t arise when either is used independently).


#6

Usually, this retrying is caused by the fact that the reply is too long to fit into an UDP package. The change to TCP is then necessary as UDP packets have a certain maximum allowed size. This limit may also be influenced by the MTU in your network.

How do the corresponding lines in /var/log/pihole.log look like when you try to resolve the domain once with dnscrypt and once with the other server (where this error doesn’t show up)?


#7

Thanks for taking interest in this question, I’ve been investigating various possibilities but haven’t looked in to the MTU thing so far. I will have a look at this now.

I hope the output of /var/log/pihole.log will help!

Output with upstream set to dnscrypt-proxy:

Oct  7 21:33:23 dnsmasq[27806]: 28533 192.168.1.145/51818 query[A] google.com from 192.168.1.145
Oct  7 21:33:23 dnsmasq[27806]: 28533 192.168.1.145/51818 forwarded google.com to 127.0.0.1
Oct  7 21:33:23 dnsmasq[27806]: * 192.168.1.145/51818 dnssec-query[DS] com to 127.0.0.1
Oct  7 21:33:23 dnsmasq[27806]: * 192.168.1.145/51818 dnssec-query[DNSKEY] . to 127.0.0.1
Oct  7 21:33:23 dnsmasq[27806]: 28533 192.168.1.145/51818 reply google.com is 172.217.4.174
Oct  7 21:33:23 dnsmasq[9256]: 28534 192.168.1.145/52460 query[A] google.com from 192.168.1.145
Oct  7 21:33:23 dnsmasq[9256]: 28534 192.168.1.145/52460 forwarded google.com to 127.0.0.1
Oct  7 21:33:23 dnsmasq[9256]: * 192.168.1.145/52460 dnssec-query[DS] com to 127.0.0.1
Oct  7 21:33:23 dnsmasq[9256]: * 192.168.1.145/52460 dnssec-query[DNSKEY] . to 127.0.0.1
Oct  7 21:33:23 dnsmasq[9256]: * 192.168.1.145/52460 reply . is DNSKEY keytag 20326, algo 8
Oct  7 21:33:23 dnsmasq[9256]: * 192.168.1.145/52460 reply . is DNSKEY keytag 41656, algo 8
Oct  7 21:33:23 dnsmasq[9256]: * 192.168.1.145/52460 reply . is DNSKEY keytag 19036, algo 8
Oct  7 21:33:23 dnsmasq[9256]: * 192.168.1.145/52460 reply . is DNSKEY keytag 2134, algo 8
Oct  7 21:33:23 dnsmasq[9256]: * 192.168.1.145/52460 reply com is DS keytag 30909, algo 8, digest 2
Oct  7 21:33:23 dnsmasq[9256]: * 192.168.1.145/52460 dnssec-query[DS] google.com to 127.0.0.1
Oct  7 21:33:23 dnsmasq[9256]: * 192.168.1.145/52460 dnssec-query[DNSKEY] com to 127.0.0.1
Oct  7 21:33:23 dnsmasq[9256]: * 192.168.1.145/52460 reply com is DNSKEY keytag 30909, algo 8
Oct  7 21:33:23 dnsmasq[9256]: * 192.168.1.145/52460 reply com is DNSKEY keytag 46475, algo 8
Oct  7 21:33:23 dnsmasq[9256]: * 192.168.1.145/52460 reply google.com is no DS
Oct  7 21:33:23 dnsmasq[9256]: 28534 192.168.1.145/52460 validation result is INSECURE
Oct  7 21:33:23 dnsmasq[9256]: 28534 192.168.1.145/52460 reply google.com is 172.217.4.174

Output with upstream set to google-dns:

Oct  7 21:35:46 dnsmasq[10600]: 6 192.168.1.145/65518 query[A] google.com from 192.168.1.145
Oct  7 21:35:46 dnsmasq[10600]: 6 192.168.1.145/65518 forwarded google.com to 8.8.4.4
Oct  7 21:35:46 dnsmasq[10600]: * 192.168.1.145/65518 dnssec-query[DS] com to 8.8.4.4
Oct  7 21:35:46 dnsmasq[10600]: * 192.168.1.145/65518 reply com is DS keytag 30909, algo 8, digest 2
Oct  7 21:35:46 dnsmasq[10600]: * 192.168.1.145/65518 dnssec-query[DS] google.com to 8.8.4.4
Oct  7 21:35:46 dnsmasq[10600]: * 192.168.1.145/65518 dnssec-query[DNSKEY] com to 8.8.4.4
Oct  7 21:35:46 dnsmasq[10600]: * 192.168.1.145/65518 reply com is DNSKEY keytag 46475, algo 8
Oct  7 21:35:46 dnsmasq[10600]: * 192.168.1.145/65518 reply com is DNSKEY keytag 30909, algo 8
Oct  7 21:35:46 dnsmasq[10600]: * 192.168.1.145/65518 reply google.com is no DS
Oct  7 21:35:46 dnsmasq[10600]: 6 192.168.1.145/65518 validation result is INSECURE
Oct  7 21:35:46 dnsmasq[10600]: 6 192.168.1.145/65518 reply google.com is 216.58.217.46

#8

When I tested dnscrypt-proxyv2, I noticed this to. I reported this to Simon Kelley, he even installed dnscrypt-proxy to troubleshoot this problem.
It has been fixed in dnsmasq2.80test2 (The latest test release is dnsmasq2.80test8). Unfortunately, you wil have to wait until dnsmasq2.80 is released and has been adopted by pihole.
Workaround: Tests showed the problem appears / disappears depending on the dnscrypt server you use. All d0wn servers seam to have the problem.
Servers I found that behave (no DNSSEC problems) are (as configured in /opt/dnscrypt-proxy/dnscrypt-proxy.toml:

server_names = ['scaleway-fr', 'de.dnsmaschine.net', 'dnscrypt.me']

edit
dnscrypt.eu-dk’ and ‘dnscrypt.eu-nl’ also work, but the signature for these servers is valid for a very long time, witch is unsecure, When using these servers, you’ll find a warning, regarding this, in the dnscrypt-proxy log, if you have configured logging.
/edit


#9

I tried setting dnscrypt-proxy with the upstream servers you mentioned:

pi@pi-hole:~ $ journalctl -f -u dnscrypt-proxy.service
-- Logs begin at Sat 2018-10-06 11:02:33 PDT. --
Oct 08 12:40:57 pi-hole systemd[1]: Started Encrypted/authenticated DNS proxy.
Oct 08 12:40:57 pi-hole dnscrypt-proxy[23718]: [2018-10-08 12:40:57] [NOTICE] Source [public-resolvers.md] loaded
Oct 08 12:40:57 pi-hole dnscrypt-proxy[23718]: [2018-10-08 12:40:57] [NOTICE] dnscrypt-proxy 2.0.17
Oct 08 12:40:57 pi-hole dnscrypt-proxy[23718]: [2018-10-08 12:40:57] [NOTICE] Now listening to 127.0.0.1:54 [UDP]
Oct 08 12:40:57 pi-hole dnscrypt-proxy[23718]: [2018-10-08 12:40:57] [NOTICE] Now listening to 127.0.0.1:54 [TCP]
Oct 08 12:40:58 pi-hole dnscrypt-proxy[23718]: [2018-10-08 12:40:58] [NOTICE] [de.dnsmaschine.net] OK (crypto v2) - rtt: 178ms
Oct 08 12:40:58 pi-hole dnscrypt-proxy[23718]: [2018-10-08 12:40:58] [NOTICE] [dnscrypt.me] OK (crypto v2) - rtt: 178ms
Oct 08 12:40:58 pi-hole dnscrypt-proxy[23718]: [2018-10-08 12:40:58] [NOTICE] [scaleway-fr] OK (crypto v2) - rtt: 158ms
Oct 08 12:40:58 pi-hole dnscrypt-proxy[23718]: [2018-10-08 12:40:58] [NOTICE] Server with the lowest initial latency: scaleway-fr (rtt: 158ms)
Oct 08 12:40:58 pi-hole dnscrypt-proxy[23718]: [2018-10-08 12:40:58] [NOTICE] dnscrypt-proxy is ready - live servers: 3

The message ;;Truncated, retrying in TCP mode. does not disappear.

Further, I compiled dnsmasq-2.80test8 with the following command: make COPTS='-DHAVE_DNSSEC -DHAVE_DNSSEC_STATIC' and ran it with this command sudo ./dnsmasq -q -d --conf-file=conf.

Here’s the contents of the conf file (I tried to match pihole-FTL’s configuration as much as possible):

localise-queries
no-resolv
cache-size=10000
log-queries=extra
local-ttl=2
log-async
domain-needed
bogus-priv
dnssec
trust-anchor=.,19036,8,2,49AAC11D7B6F6446702E54A1607371607A1A41855200FD2CE1CDDE32F24E8FB5
trust-anchor=.,20326,8,2,E06D44B80B8F1D39A95C0B0D7C65D08458E880409BBC683457104237C7F8EC8D
interface=eth0
port=55
server=127.0.0.1#54

Running dig -p 55 google.com gives me same truncated message :confused:


#10

dnsmasq2.80 cannot fix Truncated, retrying in TCP mode because:

This is explained further in a conversation with Simon Kelley (dnsmasq developer)
quote
Getting the answer via UDP or TCP depends on the state of the dnsmasq cache. This is because the answer to query for the root key is too long to fit in a UDP packet via dnscrypt. If dnsmasq queries the root key during its attempt to validate the query above, then the answer will be truncated, and dnsmasq will restart the whole process, using TCP
/quote
You shouldn’t’ be focusing on Truncated, retrying in TCP mode, but on other DNSSEC problems, such as unexpected ABANDONNED or BOGUS validations. that is what dnsmasq2.80 fixes.


#11

Okay, thank you for the explanation.

Your answer got me looking into why the message appears all the time when it should not appear if the result is already present in pihole-ftl’s cache.

I think pihole-ftl is not storing anything in its cache and forwarding all requests to dnscrypt-proxy. The logs seem to confirm this as well, two consecutive requests to are both forwarded to dnscrypt-proxy. Query times are similar for both requests.

Oct  9 18:02:38 dnsmasq[554]: 60859 127.0.0.1/45827 query[A] google.com from 127.0.0.1
Oct  9 18:02:38 dnsmasq[554]: 60859 127.0.0.1/45827 forwarded google.com to 127.0.0.1
Oct  9 18:02:38 dnsmasq[554]: * 127.0.0.1/45827 dnssec-query[DS] com to 127.0.0.1
Oct  9 18:02:38 dnsmasq[554]: * 127.0.0.1/45827 dnssec-query[DNSKEY] . to 127.0.0.1
Oct  9 18:02:38 dnsmasq[554]: 60859 127.0.0.1/45827 reply google.com is 172.217.194.102
Oct  9 18:02:38 dnsmasq[554]: 60859 127.0.0.1/45827 reply google.com is 172.217.194.113
Oct  9 18:02:38 dnsmasq[554]: 60859 127.0.0.1/45827 reply google.com is 172.217.194.101
Oct  9 18:02:38 dnsmasq[554]: 60859 127.0.0.1/45827 reply google.com is 172.217.194.139
Oct  9 18:02:38 dnsmasq[554]: 60859 127.0.0.1/45827 reply google.com is 172.217.194.138
Oct  9 18:02:38 dnsmasq[554]: 60859 127.0.0.1/45827 reply google.com is 172.217.194.100
Oct  9 18:02:38 dnsmasq[22103]: 60860 127.0.0.1/44171 query[A] google.com from 127.0.0.1
Oct  9 18:02:38 dnsmasq[22103]: 60860 127.0.0.1/44171 forwarded google.com to 127.0.0.1
Oct  9 18:02:38 dnsmasq[22103]: * 127.0.0.1/44171 dnssec-query[DS] com to 127.0.0.1
Oct  9 18:02:39 dnsmasq[22103]: * 127.0.0.1/44171 dnssec-query[DNSKEY] . to 127.0.0.1
Oct  9 18:02:39 dnsmasq[22103]: * 127.0.0.1/44171 reply . is DNSKEY keytag 2134, algo 8
Oct  9 18:02:39 dnsmasq[22103]: * 127.0.0.1/44171 reply . is DNSKEY keytag 41656, algo 8
Oct  9 18:02:39 dnsmasq[22103]: * 127.0.0.1/44171 reply . is DNSKEY keytag 19036, algo 8
Oct  9 18:02:39 dnsmasq[22103]: * 127.0.0.1/44171 reply . is DNSKEY keytag 20326, algo 8
Oct  9 18:02:39 dnsmasq[22103]: * 127.0.0.1/44171 reply com is DS keytag 30909, algo 8, digest 2
Oct  9 18:02:39 dnsmasq[22103]: * 127.0.0.1/44171 dnssec-query[DS] google.com to 127.0.0.1
Oct  9 18:02:39 dnsmasq[22103]: * 127.0.0.1/44171 dnssec-query[DNSKEY] com to 127.0.0.1
Oct  9 18:02:39 dnsmasq[22103]: * 127.0.0.1/44171 reply com is DNSKEY keytag 46475, algo 8
Oct  9 18:02:39 dnsmasq[22103]: * 127.0.0.1/44171 reply com is DNSKEY keytag 30909, algo 8
Oct  9 18:02:39 dnsmasq[22103]: * 127.0.0.1/44171 reply google.com is no DS
Oct  9 18:02:39 dnsmasq[22103]: 60860 127.0.0.1/44171 validation result is INSECURE
Oct  9 18:02:39 dnsmasq[22103]: 60860 127.0.0.1/44171 reply google.com is 172.217.194.102
Oct  9 18:02:39 dnsmasq[22103]: 60860 127.0.0.1/44171 reply google.com is 172.217.194.113
Oct  9 18:02:39 dnsmasq[22103]: 60860 127.0.0.1/44171 reply google.com is 172.217.194.101
Oct  9 18:02:39 dnsmasq[22103]: 60860 127.0.0.1/44171 reply google.com is 172.217.194.139
Oct  9 18:02:39 dnsmasq[22103]: 60860 127.0.0.1/44171 reply google.com is 172.217.194.138
Oct  9 18:02:39 dnsmasq[22103]: 60860 127.0.0.1/44171 reply google.com is 172.217.194.100
Oct  9 18:02:45 dnsmasq[554]: 60960 127.0.0.1/41794 query[A] google.com from 127.0.0.1
Oct  9 18:02:45 dnsmasq[554]: 60960 127.0.0.1/41794 forwarded google.com to 127.0.0.1
Oct  9 18:02:45 dnsmasq[554]: * 127.0.0.1/41794 dnssec-query[DS] com to 127.0.0.1
Oct  9 18:02:45 dnsmasq[554]: * 127.0.0.1/41794 dnssec-query[DNSKEY] . to 127.0.0.1
Oct  9 18:02:45 dnsmasq[554]: 60960 127.0.0.1/41794 reply google.com is 172.217.194.102
Oct  9 18:02:45 dnsmasq[554]: 60960 127.0.0.1/41794 reply google.com is 172.217.194.113
Oct  9 18:02:45 dnsmasq[554]: 60960 127.0.0.1/41794 reply google.com is 172.217.194.101
Oct  9 18:02:45 dnsmasq[554]: 60960 127.0.0.1/41794 reply google.com is 172.217.194.139
Oct  9 18:02:45 dnsmasq[554]: 60960 127.0.0.1/41794 reply google.com is 172.217.194.138
Oct  9 18:02:45 dnsmasq[554]: 60960 127.0.0.1/41794 reply google.com is 172.217.194.100
Oct  9 18:02:45 dnsmasq[22172]: 60961 127.0.0.1/56487 query[A] google.com from 127.0.0.1
Oct  9 18:02:45 dnsmasq[22172]: 60961 127.0.0.1/56487 forwarded google.com to 127.0.0.1
Oct  9 18:02:45 dnsmasq[22172]: * 127.0.0.1/56487 dnssec-query[DS] com to 127.0.0.1
Oct  9 18:02:45 dnsmasq[22172]: * 127.0.0.1/56487 dnssec-query[DNSKEY] . to 127.0.0.1
Oct  9 18:02:45 dnsmasq[22172]: * 127.0.0.1/56487 reply . is DNSKEY keytag 2134, algo 8
Oct  9 18:02:45 dnsmasq[22172]: * 127.0.0.1/56487 reply . is DNSKEY keytag 41656, algo 8
Oct  9 18:02:45 dnsmasq[22172]: * 127.0.0.1/56487 reply . is DNSKEY keytag 19036, algo 8
Oct  9 18:02:45 dnsmasq[22172]: * 127.0.0.1/56487 reply . is DNSKEY keytag 20326, algo 8
Oct  9 18:02:45 dnsmasq[22172]: * 127.0.0.1/56487 reply com is DS keytag 30909, algo 8, digest 2
Oct  9 18:02:45 dnsmasq[22172]: * 127.0.0.1/56487 dnssec-query[DS] google.com to 127.0.0.1
Oct  9 18:02:45 dnsmasq[22172]: * 127.0.0.1/56487 dnssec-query[DNSKEY] com to 127.0.0.1
Oct  9 18:02:45 dnsmasq[22172]: * 127.0.0.1/56487 reply com is DNSKEY keytag 46475, algo 8
Oct  9 18:02:45 dnsmasq[22172]: * 127.0.0.1/56487 reply com is DNSKEY keytag 30909, algo 8
Oct  9 18:02:45 dnsmasq[22172]: * 127.0.0.1/56487 reply google.com is no DS
Oct  9 18:02:45 dnsmasq[22172]: 60961 127.0.0.1/56487 validation result is INSECURE
Oct  9 18:02:45 dnsmasq[22172]: 60961 127.0.0.1/56487 reply google.com is 172.217.194.102
Oct  9 18:02:45 dnsmasq[22172]: 60961 127.0.0.1/56487 reply google.com is 172.217.194.113
Oct  9 18:02:45 dnsmasq[22172]: 60961 127.0.0.1/56487 reply google.com is 172.217.194.101
Oct  9 18:02:45 dnsmasq[22172]: 60961 127.0.0.1/56487 reply google.com is 172.217.194.139
Oct  9 18:02:45 dnsmasq[22172]: 60961 127.0.0.1/56487 reply google.com is 172.217.194.138
Oct  9 18:02:45 dnsmasq[22172]: 60961 127.0.0.1/56487 reply google.com is 172.217.194.100

I compared this behavior with the logs from dnsmasq-2.79 and the logs for this do show that the second response was served from the cache.

dnsmasq: started, version 2.79 cachesize 10000
dnsmasq: compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth DNSSEC loop-detect inotify
dnsmasq: DNSSEC validation enabled
dnsmasq: asynchronous logging enabled, queue limit is 5 messages
dnsmasq: using nameserver 127.0.0.1#54
dnsmasq: read /etc/hosts - 5 addresses
dnsmasq: 1 127.0.0.1/33543 query[A] google.com from 127.0.0.1
dnsmasq: 1 127.0.0.1/33543 forwarded google.com to 127.0.0.1
dnsmasq: * 127.0.0.1/33543 dnssec-query[DS] com to 127.0.0.1
dnsmasq: * 127.0.0.1/33543 dnssec-query[DNSKEY] . to 127.0.0.1
dnsmasq: 1 127.0.0.1/33543 reply google.com is 172.217.194.102
dnsmasq: 1 127.0.0.1/33543 reply google.com is 172.217.194.113
dnsmasq: 1 127.0.0.1/33543 reply google.com is 172.217.194.101
dnsmasq: 1 127.0.0.1/33543 reply google.com is 172.217.194.139
dnsmasq: 1 127.0.0.1/33543 reply google.com is 172.217.194.138
dnsmasq: 1 127.0.0.1/33543 reply google.com is 172.217.194.100
dnsmasq: 2 127.0.0.1/43569 query[A] google.com from 127.0.0.1
dnsmasq: 2 127.0.0.1/43569 forwarded google.com to 127.0.0.1
dnsmasq: * 127.0.0.1/43569 dnssec-query[DS] com to 127.0.0.1
dnsmasq: * 127.0.0.1/43569 dnssec-query[DNSKEY] . to 127.0.0.1
dnsmasq: * 127.0.0.1/43569 reply . is DNSKEY keytag 2134, algo 8
dnsmasq: * 127.0.0.1/43569 reply . is DNSKEY keytag 41656, algo 8
dnsmasq: * 127.0.0.1/43569 reply . is DNSKEY keytag 19036, algo 8
dnsmasq: * 127.0.0.1/43569 reply . is DNSKEY keytag 20326, algo 8
dnsmasq: * 127.0.0.1/43569 reply com is DS keytag 30909, algo 8, digest 2
dnsmasq: * 127.0.0.1/43569 dnssec-query[DS] google.com to 127.0.0.1
dnsmasq: * 127.0.0.1/43569 dnssec-query[DNSKEY] com to 127.0.0.1
dnsmasq: * 127.0.0.1/43569 reply com is DNSKEY keytag 46475, algo 8
dnsmasq: * 127.0.0.1/43569 reply com is DNSKEY keytag 30909, algo 8
dnsmasq: * 127.0.0.1/43569 reply google.com is no DS
dnsmasq: 2 127.0.0.1/43569 validation result is INSECURE
dnsmasq: 2 127.0.0.1/43569 reply google.com is 172.217.194.102
dnsmasq: 2 127.0.0.1/43569 reply google.com is 172.217.194.113
dnsmasq: 2 127.0.0.1/43569 reply google.com is 172.217.194.101
dnsmasq: 2 127.0.0.1/43569 reply google.com is 172.217.194.139
dnsmasq: 2 127.0.0.1/43569 reply google.com is 172.217.194.138
dnsmasq: 2 127.0.0.1/43569 reply google.com is 172.217.194.100
dnsmasq: 3 127.0.0.1/36602 query[A] google.com from 127.0.0.1
dnsmasq: 3 127.0.0.1/36602 cached google.com is 172.217.194.100
dnsmasq: 3 127.0.0.1/36602 cached google.com is 172.217.194.138
dnsmasq: 3 127.0.0.1/36602 cached google.com is 172.217.194.139
dnsmasq: 3 127.0.0.1/36602 cached google.com is 172.217.194.101
dnsmasq: 3 127.0.0.1/36602 cached google.com is 172.217.194.113
dnsmasq: 3 127.0.0.1/36602 cached google.com is 172.217.194.102

Note that upstream is dnscrypt-proxy for both pihole-ftl and dnsmasq2.79.

The response time for the second request is substantially lower (77ms for the 1st, 6ms for the 2nd). The truncated message also does not appear for the cached response.

I think I may be having the same problem mentioned in this thread: Pi-hole not caching when using dnscrypt-proxy

Should I move the conversation over there?

EDIT: This may also help

pi@pi-hole:~ $ echo ">cacheinfo" | nc 127.0.0.1 4711
cache-size: 10000
cache-live-freed: 0
cache-inserted: 0
---EOM---

#13

We can leave it here.


#15

From your cacheinfo snippet, it seems nothing is cached at all (zero cache insertions). Is this always the case (also when querying other domains or using a different upstream destination)?

Concerning the other discussion you mentioned: could it be that there was a recent change in dnscrypt that causes this? It’s strange that the issues are reported at the same time whereas we haven’t seen this before.


#16

Using google-dns as upstream works with the cache.

Here’s the log:

Oct  9 21:48:50 dnsmasq[26205]: 4 127.0.0.1/41633 query[A] google.com from 127.0.0.1
Oct  9 21:48:50 dnsmasq[26205]: 4 127.0.0.1/41633 forwarded google.com to 8.8.8.8
Oct  9 21:48:50 dnsmasq[26205]: * 127.0.0.1/41633 dnssec-query[DS] google.com to 8.8.8.8
Oct  9 21:48:50 dnsmasq[26205]: * 127.0.0.1/41633 reply google.com is no DS
Oct  9 21:48:50 dnsmasq[26205]: 4 127.0.0.1/41633 validation result is INSECURE
Oct  9 21:48:50 dnsmasq[26205]: 4 127.0.0.1/41633 reply google.com is 216.58.193.78
Oct  9 21:48:54 dnsmasq[26205]: 5 127.0.0.1/32930 query[A] google.com from 127.0.0.1
Oct  9 21:48:54 dnsmasq[26205]: 5 127.0.0.1/32930 cached google.com is 216.58.193.78

and the commands I ran to generate above logs:

pi@pi-hole:~ $ dig google.com

; <<>> DiG 9.10.3-P4-Raspbian <<>> google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 37237
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;google.com.			IN	A

;; ANSWER SECTION:
google.com.		296	IN	A	216.58.193.78

;; Query time: 71 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Oct 09 21:48:50 PDT 2018
;; MSG SIZE  rcvd: 55

pi@pi-hole:~ $ dig google.com

; <<>> DiG 9.10.3-P4-Raspbian <<>> google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 57799
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;google.com.			IN	A

;; ANSWER SECTION:
google.com.		292	IN	A	216.58.193.78

;; Query time: 1 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Oct 09 21:48:54 PDT 2018
;; MSG SIZE  rcvd: 55

Finally the cache statistics:

pi@pi-hole:~ $ echo ">cacheinfo" | nc 127.0.0.1 4711
cache-size: 10000
cache-live-freed: 0
cache-inserted: 37
---EOM---

#17

I’ve just tested dnscrypt-proxy v2.0.14, 2.0.15, 2.0.16 and 2.0.17 (latest). with dnsmasq2.80test8, they all react in the same way, i.e. executing dig google.com always generates Truncated, retrying in TCP mode.

I than removed the DNSSEC settings from dnsmasq.d configuration files, and tried again. Without DNSSEC, Truncated, retrying in TCP mode does not appear.

This would suggest it is NOT a dnscrypt-proxy problem, but a dnsmasq problem, when using DNSSEC.

I’ve contacted Simon, explaining the problem, concluding with the simple question:
Is it impossible to use the cache, when using dnsmasq + DNSSEC + dnscrypt-proxy, or is there something else going on?


#18

Yes removing DNSSEC validation does cause the TCP retry problem to disappear for both pihole-ftl and dnsmasq-2.79.

However the cache problem seems to be something specific to pihole-ftl as dnsmasq-2.79 caches responses even with DNSSEC validation enabled.


#19

This is NOT what I see. When using dnsmasq2.80test8, I do get Truncated, retrying in TCP mode twice (entering dig google.com twice after stop/start both dnscrypt-proxy and dnsmasq) and cached is NOT found in the log.


#20

Okay, I did more experimentation and found that it was caching when I ran dnsmasq as follows:

sudo ./dnsmasq -d --conf-file=conf

caching does work as I’ve posted logs previously showing.

BUT, running it without the -d flag does cause the caching issue, you are right. Maybe this more useful information?


#21

Simon says (not the game):

quote
What’s happening here is that dnsmasq spawns a new (child) process to handle each TCP connection, in classic unix-daemon style.

The child process inherits a copy of the cache, and updates it as usual, so that the cache is active within a single TCP connection. When the TCP connection closes, the child evaporates, and with it any additions to its copy of the the cache: they are not propagated back to the long-lived main process.

This is a great simplification, and in a world where almost all DNS goes over UDP, it’s a good idea. On your world, DNSSEC + dnscrypt-proxy are focing lots of stuff over TCP, and it will cost you performance.

Short term, after the roll-over (this week, I think), you’ll be back to on root key and things will OK again.
/quote

I asked for clarification of the last sentence, answer:
quote
At the moment, there TWO root keys, the an old one and a new one. The new root key will become live this week, and the old one will be removed. I expect that the size of queries for the root key will become smaller when that happens, so that they won’t force TCP.

Fixing the general problem won’t happen this week.
quote

I assume, the roll-over he mentioned refers to this. In Brief Overview, section purpose, the date 11 October 2018 is listed as the actual roll-over date.
Our test results may well change after tomorrow…


#22

Thank you this makes sense. The -d flag tells dnsmasq not to fork for handling TCP connections and this is why my logs showed that responses were being cached.

Copied verbatim from the man page for dnsmasq

-d, --no-daemon
    Debug mode: don't fork to the background, don't write a pid file, don't change user id,
    generate a complete cache dump on receipt on SIGUSR1, log to stderr as well as syslog, 
    don't fork new processes to handle TCP queries. Note that this option is for use in debugging 
    only, to stop dnsmasq daemonising in production, use -k. 

So the two issues were related! I will disable DNSSEC validation for now on pihole-ftl and try again in a week. Anyway my configuration for dnscrypt-proxy allows it to only choose upstream servers that support DNSSEC.