Thanks for providing the PCAP via PM. I checked what was going on in your network and found that the second query was in fact a resubmission because Windows was impatient.
Windows resubmitted after waiting only 0.1 seconds! That's pretty odd and a bit low for a timeout, but okay, this is probably among the things that cannot be fixed on Windows.
Now we know what is going on and I can look into reproducing this locally so we can work on a fix.
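For anyone who wants to check their own capture for the same pattern, here is a minimal sketch of how resubmitted queries can be spotted. It assumes the DNS queries have already been exported from the PCAP as (timestamp, client, query ID, name) records. The function name `find_retries`, the 0.5 second window and the sample records are all made up for illustration; this is not how pihole-FTL detects retries.

```python
def find_retries(queries, window=0.5):
    """Flag queries repeated by the same client with the same ID and name
    within `window` seconds of the previous occurrence."""
    last_seen = {}
    retries = []
    for ts, client, qid, name in sorted(queries):
        key = (client, qid, name.lower())
        prev = last_seen.get(key)
        if prev is not None and ts - prev <= window:
            # Record which query was repeated and how long the client waited
            retries.append((key, round(ts - prev, 3)))
        last_seen[key] = ts
    return retries


# Hypothetical records: (timestamp in seconds, client IP, DNS query ID, name)
sample = [
    (0.000, "192.168.1.10", 0x1A2B, "example.com"),
    (0.100, "192.168.1.10", 0x1A2B, "example.com"),  # resent after 0.1 s
    (0.250, "192.168.1.10", 0x3C4D, "example.org"),
]

for key, gap in find_retries(sample):
    print(key, "retried after", gap, "s")
```

Matching on client, query ID and name together is a deliberate simplification: a genuine retry reuses the ID, whereas a fresh lookup of the same domain normally gets a new one.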
That's a lot of restarts for something that surely won't fix this. I'll read into each tweak later to see whether any of them is relevant to this case. However, if it's a timeout thing: those entries were in the 600 ms range for some of the sites, so it could be a normal timeout.
It's just a single restart: the one that disables the entire list of changes. That will tell you with certainty if the issue is with pihole-FTL or if it's self-inflicted.
Just a quick update: Reproducing this locally turns out to be a lot trickier than I initially figured, because Linux (the only operating system I have at hand) is trying really hard to prevent me from doing DNS lookups with such a ridiculously low retry timeout.
Ah, Linux: it allows you to do stupid things if you want to, but you'll have to work hard for it. Yeah, Windows is a bit more flexible with user errors. Anyway, I might have time today to restore the TCP Optimizer settings to their defaults and check if that's the cause.
I honestly disagree. From what I know, the registry is a beast you don't want to edit manually. And you can only tweak such things in Windows using third-party software.
Anyway, even when I was able to reproduce retried queries by sending queries with the same query ID in short succession, I was not able to reproduce exactly what you saw. However, I'm currently on my somewhat limited mobile setup and will try to reproduce this at home next week.
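As a rough illustration of that approach, the sketch below builds a minimal DNS query by hand and sends it twice over the same UDP socket with the same query ID, 0.1 seconds apart. It is not the exact tool used here; the function names, the default ID `0x1234` and the server address in the usage note are assumptions for the example.

```python
import socket
import struct
import time


def build_query(name, query_id, qtype=1):
    """Encode a minimal DNS query: one question, recursion desired."""
    # Header: ID, flags (RD bit set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed with its length, terminated by a zero byte
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    # QTYPE (1 = A record by default) and QCLASS (1 = IN)
    question = labels + b"\x00" + struct.pack("!HH", qtype, 1)
    return header + question


def send_with_retry(server, name, query_id=0x1234, delay=0.1, port=53):
    """Send the same query twice, `delay` seconds apart, without waiting for a reply."""
    packet = build_query(name, query_id)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (server, port))
        time.sleep(delay)
        # Same socket, so the resend keeps the same source port and query ID
        sock.sendto(packet, (server, port))
```

Calling something like `send_with_retry("192.168.1.2", "example.com")` (with the address of your own Pi-hole in place of the placeholder) should show up as two queries with an identical ID in a capture. Whether the resolver treats the second one the way it treats a real Windows retry is a separate question.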
I said it's flexible with user errors. I worked as a PC tech for most of my career, and the ease with which normal users can destroy Windows with a few clicks, be it with third-party software or just randomly, is astounding.
As for the change, that's great! I hope this also helps other people, maybe ones with lower-end hardware or low memory or something.
I haven't gotten around to testing the TCP Optimizer yet, maybe tomorrow. I will update with the results.
Yep, I reverted to the Windows defaults and the issue disappeared. I will test further to see if it's just temporary.
EDIT: I was wrong, it did not change, and I also saw this happen on a computer in the network that I'm pretty sure I never used the optimizer on.
Hmm, strange that we are not seeing this from other users on Windows (at least there are no reports). Anyway, a method to handle this is on its way. I hope this will work for you as well.
I think I found a possible explanation in TCP Optimizer. There's a setting called "Retransmit Timeout"; its description says it determines the time before connections are aborted.
You can see in my screenshot that the initial time is 2 seconds and the minimum time is 300 ms. That would explain why the queries that took more than 600 ms to respond show up as unknown.
However, queries should mostly not take that long to respond. I'm using Cloudflare DNS, which has a very fast response time, around 60-80 ms.
I will watch my network to see if something is using the upload up to the limit of my ISP's bandwidth. It would have been nice to have a dashboard in Pi-hole for traffic, at least for the traffic that goes through the Pi-hole.