I'm running two pihole+unbound instances in Docker containers on Raspberry Pis. I use pihole-cloudsync to keep the adlists etc. in sync, with the primary pushing and the secondary pulling.
Today the secondary instance began blocking 100% of its DNS queries.
Running dig directly against the secondary's unbound was fine, but the same query against the secondary pihole failed with ";; connection timed out; no servers could be reached".
I noticed that while the primary had all the adlists present and could update gravity without errors, the secondary's adlist was empty.
If I attempt to update gravity on the secondary, I get:
[i] Neutrino emissions detected...
[✓] Pulling blocklist source list into range
[i] No source list found, or it is empty
[✓] Preparing new gravity database
[i] Creating new gravity databases...
[✗] Unable to copy data from /etc/pihole/gravity.db to /etc/pihole/gravity.db_temp
Parse error near line 11: no such table: OLD.group
Parse error near line 14: no such table: OLD.domainlist
Parse error near line 15: no such table: OLD.domainlist_by_group
Parse error near line 16: no such table: OLD.domainlist_by_group
Parse error near line 18: no such table: OLD.adlist
Parse error near line 19: no such table: OLD.adlist
Runtime error near line 20: FOREIGN KEY constraint failed (19)
Parse error near line 23: no such table: OLD.group
[✗] Unable to create gravity database. Please try again later. If the problem persists, please contact support.
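From what I can tell reading those errors, the gravity rebuild attaches the existing gravity.db as OLD and copies its rows table by table into gravity.db_temp, so "no such table: OLD.adlist" etc. presumably means my old gravity.db has lost its tables entirely. Here's a minimal sketch that reproduces the same class of failure with plain sqlite3 (scratch files in /tmp, not my real Pi-hole database):

```shell
# Scratch databases only -- stand-ins for gravity.db (old) and
# gravity.db_temp (new); nothing here touches a real Pi-hole install.
sqlite3 /tmp/old_gravity.db "VACUUM;"   # creates an empty db, like a wiped gravity.db

# The rebuild attaches the old db as OLD and copies rows table by table.
# With no tables in OLD, each copy fails just like the gravity output above.
# The "|| true" is only because the failure is the expected result here.
sqlite3 /tmp/new_gravity.db <<'EOF' || true
CREATE TABLE adlist (id INTEGER PRIMARY KEY, address TEXT);
ATTACH DATABASE '/tmp/old_gravity.db' AS OLD;
INSERT INTO adlist SELECT * FROM OLD.adlist;
EOF
# sqlite3 reports: no such table: OLD.adlist
```

So the errors themselves seem consistent with an empty or truncated gravity.db rather than a bug in the rebuild step.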
If I disable blocking, queries are forwarded to unbound and resolved correctly, and a dig against the secondary pihole then succeeds.
So many questions. What happened? How did my gravity db disappear, and why? What can I do to troubleshoot this further and resolve the issue(s)? Does pihole block all queries when there is no adlist rather than forwarding them upstream for resolution? And why would enabling or disabling blocking seem to interrupt communication between the pihole and unbound containers?
I'm pretty new to pihole and have been trying to learn to walk before running, so my environment has been changing over time. I started out with a single instance of pihole running on my Synology NAS, using Google as my upstream resolver. I then went to two instances, one on an RPi and one on the Synology. Next I introduced pihole-cloudsync with the RPi as primary and the NAS as secondary. The step after that was adding unbound to the primary RPi while continuing to use Google as the secondary pihole's resolver. The last "phase" was to duplicate the RPi by adding a second RPi with pihole+unbound, making it the secondary and retiring pihole on the NAS.
The dual-RPi pihole+unbound setup had been working properly (as far as I can tell) for about two weeks. Today I added Tailscale to the secondary RPi, but that was about 3 hours after pihole started to block 100% and about 3 hours before I noticed the problem. Although it's coincidental that this happened on the same day I installed Tailscale, the timing doesn't seem close enough to say that's what started the trouble. I have disabled Tailscale to see if there might be a relationship, but no: the problem persists whether Tailscale is running or not.
I'm hoping that with some help I can figure out what's going on without having to rip it all down and start over.