I put your pcap command in both of my containers, and now both are broken and not resolving DNS at all. It looks like that changed the FTL config to point to the pcap file? How can I quickly correct this without losing my configuration?
Logging into the web interface at IP:8080 and removing the pcap file variable under All Settings (Expert) brought back resolution. I would have hoped that writing a pcap file would mirror the traffic to an output rather than take over the interface. I do not believe I'll be able to capture a pcap of this behavior given my environment.
I'm open to having a screen share session to help troubleshoot, but this error is not easily reproducible on my end. The only records I can see that may cause it are related to netflix.com, which is blocked. Any other query log items I see at the error timestamps are either passed correctly or, for some reason, not logged at all.
Maybe the file location wasn't writable; this could theoretically prevent dnsmasq from starting. But in that case there would have been a clear error message about it in the log files.
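To rule out the permissions theory before changing any settings, a quick check along these lines might help. This is only a sketch: `can_write_file` and the example path are hypothetical names for illustration, not part of Pi-hole, and the result only means something if the script runs inside the container as the same user as the FTL process.

```python
import os


def can_write_file(path: str) -> bool:
    """Return True if `path` can be opened for appending by the current user.

    Note: opening in append mode creates the file if it does not exist yet,
    much like `touch` would.
    """
    try:
        with open(path, "ab"):
            return True
    except OSError:
        # Covers a missing directory, missing permissions, a read-only mount, etc.
        return False


if __name__ == "__main__":
    # Hypothetical location; substitute the path you intend to configure.
    candidate = os.path.join("/etc/pihole", "dump.pcap")
    print(candidate, "writable:", can_write_file(candidate))
```

If this reports the path as not writable, fixing ownership or permissions on the mapped folder first would be the thing to try.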
I created a file with touch in my /pihole mapped config folder and ran chmod 777 on it inside the container. Once I entered the command to write to the .pcap (or placed the path directly in the file field in the GUI), I got the following message on the dashboard:
Yes! This is working. I see the dump file growing and DNS is still active on the node.
I will restart the node to clear the error and allow the capture to run until it appears again.
Worth mentioning that I did not need to restart the nodes for the error to clear. After accepting the settings change, the error cleared itself.
Will monitor the next error and provide timestamps with the cap files.
Conditional forwarding was not working. I noticed in the firewall logs (the firewall also hosts internal DNS) that the request was received and an answer sent...
Also noticed "Case mismatch in DNS reply - check bit 0x20 encoding" in logs
As suggested, when using docker image with
-e FTLCONF_misc_dnsmasq_lines="no-0x20-encode" \
Thank you all for the logs and PCAP recording you sent me via direct messages.
Private communication between Simon Kelley and me has already led to the conclusion that this feature - while being a good idea in principle - just doesn't seem to work with some (large!) upstream servers such as Google's 8.8.8.8 or Cloudflare's 1.1.1.1. There also isn't anything we can really do to resolve this, as replying with the wrong capitalization is exactly what this anti-spoofing feature is designed to catch. Your logs and our analysis based on them made this very clear.
It is likely that this feature will become default-off in the next release unless there is some very good last-minute idea on how these legitimate replies with different capitalization can be told apart from real cache spoofing attacks.
You are invited to use no-0x20-encode in the meantime.
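For anyone wondering why a reply with different capitalization cannot simply be accepted, here is a minimal sketch of the idea behind 0x20 encoding. It is an illustration only, not dnsmasq's actual implementation; the function names `encode_0x20` and `classify_reply` are made up for this example.

```python
import random


def encode_0x20(name: str, rng: random.Random) -> str:
    """Randomize the case of every ASCII letter in a query name."""
    return "".join(
        (ch.upper() if rng.getrandbits(1) else ch.lower())
        if ch.isascii() and ch.isalpha()
        else ch
        for ch in name
    )


def classify_reply(sent_name: str, reply_name: str) -> str:
    """Compare the name echoed in a reply with the name that was sent."""
    if reply_name == sent_name:
        return "ok"  # exact case pattern echoed back
    if reply_name.lower() == sent_name.lower():
        # Same name, different case: either an upstream that does not
        # preserve case, or an off-path spoofer that could not see the query.
        return "case mismatch"
    return "different name"


if __name__ == "__main__":
    rng = random.Random()
    query = encode_0x20("netflix.com", rng)
    print("sent:", query)
    print("exact echo:", classify_reply(query, query))
    print("lowercased echo:", classify_reply(query, query.lower()))
```

The random case pattern acts as a few extra bits of secret per query. From the resolver's side, an upstream that normalizes the case and an attacker guessing blindly produce the same "case mismatch" outcome, which is why the two cannot be told apart.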
I'm happy to help troubleshoot this in my home deployment if this is ever picked up again.
Based on my previous post, it appears this is still experimental but may end up being deployed on upstream servers at some point.
I'm not seeing any issues with resolution with the option turned on in Pi-hole or Unbound at this point in time, just the error in the log and no correlated Query Log entry that I can find.