Hmm, yes, that's indeed interesting. Can you test the delay for some random domains you have not queried before? Like
dig ebay.com @1.1.1.1
dig ikea.com @1.1.1.1
and a few others, checking the query time (right at the bottom)?
; <<>> DiG 9.11.5-P4-5.1+deb10u2-Raspbian <<>> ikea.com @1.1.1.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 704
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;ikea.com. IN A
;; ANSWER SECTION:
ikea.com. 300 IN A 204.74.99.103
;; Query time: 65 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Tue Oct 13 13:25:48 IDT 2020
;; MSG SIZE rcvd: 53
I have never visited IKEA at all, so it's a new site for me. It took 65 ms.
This is one I ran:
dig blizzard.com @1.1.1.3
; <<>> DiG 9.11.5-P4-5.1+deb10u2-Raspbian <<>> blizzard.com @1.1.1.3
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 14091
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;blizzard.com. IN A
;; ANSWER SECTION:
blizzard.com. 129 IN A 137.221.106.104
;; Query time: 188 msec
;; SERVER: 1.1.1.3#53(1.1.1.3)
;; WHEN: Tue Oct 13 13:27:31 IDT 2020
;; MSG SIZE rcvd: 57
I will watch my network to see if something is using the upload up to the limit of my ISP's bandwidth. It would have been nice to have a dashboard in Pi-hole for traffic, at least for the traffic that goes through the Pi-hole.
; <<>> DiG 9.11.5-P4-5.1+deb10u2-Raspbian <<>> get.paleorecipebook.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 40500
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;get.paleorecipebook.com. IN A
;; ANSWER SECTION:
get.paleorecipebook.com. 300 IN CNAME unbouncepages.com.
unbouncepages.com. 60 IN A 54.93.101.66
unbouncepages.com. 60 IN A 18.196.95.178
;; Query time: 339 msec
;; SERVER: 1.1.1.3#53(1.1.1.3)
;; WHEN: Tue Oct 13 14:05:16 IDT 2020
;; MSG SIZE rcvd: 112
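To compare the query times reported in the dig outputs above without reading each one by hand, a small script can pull out the "Query time" line. This is only a minimal sketch: the function name `query_time_ms` and the shortened sample text are my own illustrations, and it assumes dig prints the time in the same `;; Query time: N msec` form as shown in this thread.

```python
import re
from typing import Optional

# Matches the statistics line that dig prints near the bottom of its output.
QUERY_TIME_RE = re.compile(r"^;; Query time: (\d+) msec\s*$", re.MULTILINE)


def query_time_ms(dig_output: str) -> Optional[int]:
    """Return the query time in milliseconds, or None if the line is missing."""
    match = QUERY_TIME_RE.search(dig_output)
    return int(match.group(1)) if match else None


# Shortened copy of the ikea.com answer posted above.
SAMPLE = """\
;; ANSWER SECTION:
ikea.com. 300 IN A 204.74.99.103
;; Query time: 65 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Tue Oct 13 13:25:48 IDT 2020
;; MSG SIZE rcvd: 53
"""

if __name__ == "__main__":
    print(query_time_ms(SAMPLE))
```

The saved output of any dig run can be passed to the same function, which makes it easy to collect the times for a list of domains that have not been queried before.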
Should be about zero (only a few kilobytes), so I don't think it'd really be worth it. Also, keep in mind Pi-hole should also run on very low-end devices like the Raspberry Pi Zero. Bandwidth monitoring isn't all that trivial and there is a ton of external software dedicated to monitoring. We would always be chasing what they already offer, so I suggest installing something like munin or RPi-monitor (note: I haven't checked whether they can actually do network traffic diagnostics, but I assume so).
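For a quick check without installing anything, the Linux kernel exposes per-interface byte counters in /proc/net/dev, and the upload rate is simply the difference between two readings divided by the time between them. The following is only a rough sketch under that assumption: the function names and the sample counter values are made up for illustration, and it only measures the traffic of the machine it runs on (on a Pi-hole that is essentially DNS and DHCP traffic).

```python
def parse_net_dev(text: str) -> dict:
    """Map interface name -> (rx_bytes, tx_bytes) from /proc/net/dev content."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        iface, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[iface.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def upload_rate(before: dict, after: dict, seconds: float, iface: str) -> float:
    """Average transmitted bytes per second between two readings."""
    return (after[iface][1] - before[iface][1]) / seconds


# Made-up sample in the layout of /proc/net/dev.
SAMPLE_BEFORE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0:  500000    4000    0    0    0     0          0         0   200000    3000    0    0    0     0       0          0
"""
SAMPLE_AFTER = SAMPLE_BEFORE.replace("200000", "300000")

if __name__ == "__main__":
    before = parse_net_dev(SAMPLE_BEFORE)
    after = parse_net_dev(SAMPLE_AFTER)
    print(upload_rate(before, after, 10.0, "eth0"), "bytes/s")
```

On a real system the two readings would come from reading /proc/net/dev twice with a pause in between instead of the sample strings.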
These are pretty slow. Is a ping to their servers similarly slow?
Reply from 137.221.106.104: bytes=32 time=217ms TTL=49
That is the ping to Blizzard.
I made sure to have at least 100 KB of upload free at all times for the Pi-hole, and it still happens.
Also, keep in mind Pi-hole should also run on very low-end devices like Raspberry Pi Zero
I understand that; I meant it as an optional component. But you are correct, only DHCP and DNS traffic goes through it.
I just noticed something strange: there's another type of unknown entry, not 0 but 12. Here's a screenshot:
/var/log/pihole.log:16747:Oct 14 06:19:39 dnsmasq[653]: 50855 192.168.55.100/57402 query[A] www.fosshub.com from 192.168.55.100
/var/log/pihole.log:16748:Oct 14 06:19:39 dnsmasq[653]: 50855 192.168.55.100/57402 forwarded www.fosshub.com to 208.67.222.123
/var/log/pihole.log:16749:Oct 14 06:19:39 dnsmasq[653]: 50856 192.168.55.100/57402 query[A] www.fosshub.com from 192.168.55.100
/var/log/pihole.log:16750:Oct 14 06:19:39 dnsmasq[653]: 50856 192.168.55.100/57402 forwarded www.fosshub.com to 1.1.1.3
/var/log/pihole.log:16751:Oct 14 06:19:39 dnsmasq[653]: 50856 192.168.55.100/57402 forwarded www.fosshub.com to 208.67.222.123
/var/log/pihole.log:16753:Oct 14 06:19:39 dnsmasq[653]: 50856 192.168.55.100/57402 reply www.fosshub.com is 104.20.136.9
/var/log/pihole.log:16754:Oct 14 06:19:39 dnsmasq[653]: 50856 192.168.55.100/57402 reply www.fosshub.com is 172.67.32.78
/var/log/pihole.log:16755:Oct 14 06:19:39 dnsmasq[653]: 50856 192.168.55.100/57402 reply www.fosshub.com is 104.20.137.9
You'll notice it's less than 300 ms, so at least this kind of unknown is not related to the delayed response.
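The log excerpt above is easier to read when the lines are grouped by the query ID that dnsmasq prints after its process ID (50855 and 50856 here). The sketch below does that grouping; the regular expression and the function name `summarize` are my own and are only fitted to the line format visible in this excerpt, not to every line dnsmasq can log. Note that the log timestamps only have one-second resolution, so this shows which upstream servers and replies belong to each query, not how long it took.

```python
import re
from collections import defaultdict

# Fitted to lines such as:
# "... dnsmasq[653]: 50856 192.168.55.100/57402 forwarded www.fosshub.com to 1.1.1.3"
LINE_RE = re.compile(
    r"dnsmasq\[\d+\]: (?P<id>\d+) (?P<client>\S+)/(?P<port>\d+) "
    r"(?P<action>query\[\w+\]|forwarded|reply) (?P<name>\S+) "
    r"(?:from|to|is) (?P<target>\S+)"
)


def summarize(lines):
    """Group forward targets and reply addresses by dnsmasq query ID."""
    summary = defaultdict(lambda: {"upstreams": [], "replies": []})
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # not one of the three line types handled here
        entry = summary[m["id"]]  # creates the entry even for a bare query line
        if m["action"] == "forwarded":
            entry["upstreams"].append(m["target"])
        elif m["action"] == "reply":
            entry["replies"].append(m["target"])
    return dict(summary)


LOG = """\
Oct 14 06:19:39 dnsmasq[653]: 50855 192.168.55.100/57402 query[A] www.fosshub.com from 192.168.55.100
Oct 14 06:19:39 dnsmasq[653]: 50855 192.168.55.100/57402 forwarded www.fosshub.com to 208.67.222.123
Oct 14 06:19:39 dnsmasq[653]: 50856 192.168.55.100/57402 query[A] www.fosshub.com from 192.168.55.100
Oct 14 06:19:39 dnsmasq[653]: 50856 192.168.55.100/57402 forwarded www.fosshub.com to 1.1.1.3
Oct 14 06:19:39 dnsmasq[653]: 50856 192.168.55.100/57402 forwarded www.fosshub.com to 208.67.222.123
Oct 14 06:19:39 dnsmasq[653]: 50856 192.168.55.100/57402 reply www.fosshub.com is 104.20.136.9
Oct 14 06:19:39 dnsmasq[653]: 50856 192.168.55.100/57402 reply www.fosshub.com is 172.67.32.78
Oct 14 06:19:39 dnsmasq[653]: 50856 192.168.55.100/57402 reply www.fosshub.com is 104.20.137.9
"""

if __name__ == "__main__":
    for query_id, entry in summarize(LOG.splitlines()).items():
        print(query_id, entry)
```

In this excerpt, query 50855 has a forward but no reply line of its own, while 50856 was forwarded to two upstream servers and carries all three replies.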
From this I cannot see that it is less than 300 ms. This new 12 is actually what I added. I'll need to update the web interface so it shows forwarded (retried later) here.
So it is new; I just almost never scroll down. Anyway, in the screenshot the green query shows 244.8 ms.
Okay, can you run the tcpdump and generate a pcap once again? It may just have been a coincidence that last time we picked up the retried scenario, which we can now distinguish from this one.
Sent it to you in a PM.
I have now done the required low-level DNS traffic inspection of the data you sent via PM. Thank you very much for the traffic snippet. Apparently, it is a specific timing issue we're facing here, one that was, admittedly, quite hard to extract from the wire capture. I will put up the exact sequence that led to what you're seeing (anonymized) below for later reference:
Delay | Sender <--> Destination | Activity |
---|---|---|
0 ms | Windows --> Pi-hole | Query A a.b.c is made * |
1 ms | Pi-hole --> upstream | Query A a.b.c is sent upstream |
60 ms | Pi-hole <-- upstream | Upstream reply arrives for A a.b.c |
60 ms | Pi-hole --> upstream | DNSSEC query: DS c |
118 ms | Pi-hole <-- upstream | Upstream reply arrives for DS c |
119 ms | Pi-hole --> upstream | DNSSEC query: DS b.c |
187 ms | Pi-hole <-- upstream | Upstream reply arrives for DS b.c |
188 ms | Pi-hole --> upstream | DNSSEC query: DNSKEY c |
219 ms | Windows --> Pi-hole | Query A a.b.c is retried ** |
220 ms | Pi-hole --> upstream | DNSSEC query: DNSKEY c is retried by Pi-hole *** |
264 ms | Pi-hole <-- upstream | Upstream reply arrives for DNSKEY c |
270 ms | Pi-hole --> Windows | Reply to the original request |
282 ms | Pi-hole <-- upstream | Upstream reply arrives for retried DNSKEY c **** |

(delay is relative time from first query)
Extra comments:

* This query is shown as green OK (forwarded) in the Query Log.

** This is the query that is shown as Unknown (0). FTL is ignoring a retried query here because it already has the reply; however, it is not ready to send it to the requestor because DNSSEC verification is still ongoing.

*** The too-soon retry of the query leads FTL to the (somewhat wrong) assumption that the DNSSEC verification took too long. As a result, the DNSKEY c query is retried. This is without any consequences otherwise, so I won't change this minor bit.

**** This is the reply to the retried DNSKEY query. It is silently ignored as we already have the answer.

Result of this analysis: Reproducing this exact chain of events is a bit tricky; however, you seem to be able to routinely get into this scenario due to your specific tuning settings.
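To make the timing in the sequence above easier to follow, the upstream exchanges can be written down as (sent, answered) pairs and checked against the moment the client retried. This is just a replay of the numbers posted above, not anything FTL itself runs; the names `Exchange`, `round_trips` and `in_flight_at` are made up for this sketch.

```python
from typing import List, NamedTuple


class Exchange(NamedTuple):
    name: str
    sent_ms: int      # when Pi-hole sent the query upstream
    answered_ms: int  # when the first upstream reply arrived


# First (non-retried) upstream exchanges, taken from the sequence above.
UPSTREAM = [
    Exchange("A a.b.c", 1, 60),
    Exchange("DS c", 60, 118),
    Exchange("DS b.c", 119, 187),
    Exchange("DNSKEY c", 188, 264),
]
CLIENT_RETRY_MS = 219  # Windows retries the A query
CLIENT_REPLY_MS = 270  # Pi-hole answers the original request


def round_trips(exchanges: List[Exchange]) -> dict:
    """Round-trip time in ms of each upstream exchange."""
    return {e.name: e.answered_ms - e.sent_ms for e in exchanges}


def in_flight_at(exchanges: List[Exchange], t_ms: int) -> list:
    """Names of upstream queries sent but not yet answered at time t_ms."""
    return [e.name for e in exchanges if e.sent_ms <= t_ms < e.answered_ms]


if __name__ == "__main__":
    print(round_trips(UPSTREAM))
    print("pending at client retry:", in_flight_at(UPSTREAM, CLIENT_RETRY_MS))
    print("retry arrived before reply:", CLIENT_RETRY_MS < CLIENT_REPLY_MS)
```

Each single round trip stays well below 100 ms, but the four of them happen one after another, so the client's retry at 219 ms arrives while the DNSKEY query is still pending and before the reply goes out at 270 ms.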
@Scepterus Please update your Pi-hole to get the latest version and verify that there are no Unknown (0) queries any longer. Instead, there should be only Unknown (12) (retried queries) and Unknown (13) (retried during DNSSEC verification).
Hey, that's great, glad I could help. I will keep checking this today to confirm that I do indeed get only the 12 and 13. They will show up at the bottom if I sort by status, right?
Also, if this is indeed the case, how do we proceed from here?
EDIT: Ah, sadly the saga is not over, see screenshots.
My bad, I told FTL to add the 13 status to the original instead of the retried query. While this is correct for the "regular" retried queries (the new ones take over, we flag the original query as being ignored), it is wrong in this case (the new query is simply ignored, the old one survives).
So one more update and try, please.
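The difference between the two cases described above can be written down as a tiny rule: which of the two queries gets flagged, and with which status. This is only an illustration of the rule as explained in this thread, not FTL's actual code; the enum, the constants and the function name are all hypothetical.

```python
from enum import Enum
from typing import Tuple


class RetryKind(Enum):
    REGULAR = "regular"              # the new query takes over
    DURING_DNSSEC = "during_dnssec"  # the new query is ignored, the old one survives


# Status numbers as they currently appear in the Query Log.
STATUS_RETRIED = 12
STATUS_RETRIED_DNSSEC = 13


def flag_retry(kind: RetryKind) -> Tuple[str, int]:
    """Return which query is flagged ("original" or "retry") and its status."""
    if kind is RetryKind.REGULAR:
        # The original query is superseded, so it is the one marked as ignored.
        return ("original", STATUS_RETRIED)
    # During DNSSEC verification the retry itself is the ignored query.
    return ("retry", STATUS_RETRIED_DNSSEC)


if __name__ == "__main__":
    for kind in RetryKind:
        print(kind.value, "->", flag_retry(kind))
```

The mistake described above corresponds to returning "original" in the second case as well.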
It will be merged into development and automatically become available for all through the next release of Pi-hole. You can stay on this test branch but have to remember to run pihole checkout master to get back on track before doing the next pihole -up. Otherwise, custom checked out branches are preserved across updates. This is intended behavior.
[i] Downloading and Installing FTL...pihole-FTL-arm-linux-gnueabihf: FAILED
sha1sum: WARNING: 1 computed checksum did NOT match
[✗] Downloading and Installing FTL
Error: Download of https://ftl.pi-hole.net/fix/retries_master/pihole-FTL-arm-linux-gnueabihf failed (checksum error)
[✗] FTL Engine not installed
Unable to complete update, please contact Pi-hole Support
That's what I get when I try to update.
This is strange. I triggered a rebuild on our automated system. However, due to many jobs being run at the same time, it may take up to 30 minutes until the new binaries are ready. Please try again after some time today.
OK, it updated now, and so far there are 0 Unknown (0) entries; they are now mostly 13 and a few 12.
What's next? Do we need to change something, or just keep watch to make sure no more 0's appear?
Are these (13 and 12) going to just stay that way? Or is this the first step in fixing them, or just renaming them?
Thanks for your patience so far with this!
No, nothing needs to be changed.
Keeping watch would be good; however, I'm fairly certain we found them all.
Yes, almost. They will get nicer names.
OK, great! The 0's have not come back since yesterday, so I'll return to master now.
This did, however, make me want to open another bug/feature request: fix the query list status sort. I'll open a new topic for that, thanks!