Please ensure that you are running the latest version of the beta code.
Check.
Problem with Beta 5.0:
Per the title, if I disable Pi-Hole for a set time, DNS is broken for at least 1 minute before operations resume as normal. What used to be a completely seamless transition now has an abrupt network disruption.
Once you run the disabling command, the tail window should show you some lines incoming like
[2020-01-26 11:30:31.296 9059] Reloading DNS cache
and
[2020-01-26 11:32:43.692 9059] Compiled 0 whitelist and 2 blacklist regex filters in [...] msec
What is the delay in between these two messages? Does name resolution work after the Compiled ... message?
Only if the answer to the latter is no: Please check /var/log/pihole.log in addition. Do you see the new queries (which are not being answered) incoming there at all?
[2020-01-27 12:44:48.649 9059] Reloading DNS cache
[2020-01-27 12:44:48.649 9059] Blocking status is disabled
[2020-01-27 12:46:12.284 9059] INFO: No regex whitelist entries found
[2020-01-27 12:46:12.389 9059] Compiled 0 whitelist and 2 blacklist regex filters in 110.5 msec
[2020-01-27 12:48:03.801 9059] Reloading DNS cache
[2020-01-27 12:48:03.801 9059] Blocking status is enabled
[2020-01-27 12:49:34.246 9059] INFO: No regex whitelist entries found
[2020-01-27 12:49:34.352 9059] Compiled 0 whitelist and 2 blacklist regex filters in 111.6 msec
The DNS downtime coincides almost exactly with the gap between those two events, with names resolving as soon as the Compiled message appears.
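For anyone who wants to measure the gap rather than eyeball it, here is a minimal sketch that pairs each "Reloading DNS cache" line with the next "Compiled" line and prints the difference. The sample lines are copied from the excerpt above; the function names are my own and nothing here is part of Pi-hole itself.

```python
from datetime import datetime

# Sample lines copied from the FTL log excerpt above.
LOG = """\
[2020-01-27 12:44:48.649 9059] Reloading DNS cache
[2020-01-27 12:44:48.649 9059] Blocking status is disabled
[2020-01-27 12:46:12.284 9059] INFO: No regex whitelist entries found
[2020-01-27 12:46:12.389 9059] Compiled 0 whitelist and 2 blacklist regex filters in 110.5 msec
[2020-01-27 12:48:03.801 9059] Reloading DNS cache
[2020-01-27 12:48:03.801 9059] Blocking status is enabled
[2020-01-27 12:49:34.246 9059] INFO: No regex whitelist entries found
[2020-01-27 12:49:34.352 9059] Compiled 0 whitelist and 2 blacklist regex filters in 111.6 msec
"""


def parse_timestamp(line):
    """Turn '[2020-01-27 12:44:48.649 9059] ...' into a datetime (the PID is dropped)."""
    stamp = line[1:line.index("]")].rsplit(" ", 1)[0]
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")


def reload_gaps(log_text):
    """Return the seconds between each 'Reloading DNS cache' and the next 'Compiled' line."""
    gaps, started = [], None
    for line in log_text.splitlines():
        if "Reloading DNS cache" in line:
            started = parse_timestamp(line)
        elif "Compiled" in line and started is not None:
            gaps.append((parse_timestamp(line) - started).total_seconds())
            started = None
    return gaps


if __name__ == "__main__":
    for seconds in reload_gaps(LOG):
        print(f"reload took {seconds:.3f} s")
```

On the excerpt above this gives roughly 84 and 91 seconds for the disable and enable runs respectively.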
Edit: the answer to your question is yes, so the below may or may not be useful - keeping it anyways since it's already been run.
From /var/log/pihole.log I get:
Jan 27 12:46:13 dnsmasq[9059]: read /etc/hosts - 4 addresses
Jan 27 12:46:13 dnsmasq[9059]: failed to load names from /etc/pihole/custom.list: No such file or directory
Jan 27 12:46:13 dnsmasq[9059]: read /etc/pihole/local.list - 4 addresses
Jan 27 12:49:35 dnsmasq[9059]: read /etc/hosts - 4 addresses
Jan 27 12:49:35 dnsmasq[9059]: failed to load names from /etc/pihole/custom.list: No such file or directory
Jan 27 12:49:35 dnsmasq[9059]: read /etc/pihole/local.list - 4 addresses
Jan 27 12:51:17 dnsmasq[9059]: read /etc/hosts - 4 addresses
Jan 27 12:51:17 dnsmasq[9059]: failed to load names from /etc/pihole/custom.list: No such file or directory
Jan 27 12:51:17 dnsmasq[9059]: read /etc/pihole/local.list - 4 addresses
There appear to be gaps during the periods between
Yes, in between FTL is re-reading the lists. I have never seen this take more than a few seconds at the very most. I wonder if this has to do with the 30 clients your tiny device handles. Could you put
DEBUG_DATABASE=true
into your /etc/pihole/pihole-FTL.conf, run pihole restartdns and repeat your test?
I wonder what the log may reveal to us.
PM'd the results, as they were too long to share here.
On a side-note, given that I'm the only one who seems to have experienced these issues (I keep hoping for my sanity that someone else chimes in with a "me too!") I'm starting to wonder if it's specific to my setup (RPi-0, running DietPi).
I may go with a clean install of Raspbian Lite to see if the issue persists, as I'd hate to take up too much developer time if I'm all alone!
It is worth mentioning, though, that everything was working flawlessly in v4; as far as I can tell, this issue is in some way related to the beta.
Due to the per-client options, a lot of the internal machinery got more complex. The issue is that we have to reread all of this whenever something changes somewhere. I just never saw any notable delay in any of my tests.
Okay, so let's look at your log excerpt. The disable command initiates this instantaneously:
[2020-01-27 16:26:43.842 638] Reloading DNS cache
Querying the configuration for all the clients is finished here:
This is very strange, as here we only COUNT the number of distinct domains. This almost looks as if you are missing the index on the gravity table?
Could you please run
time sqlite3 /etc/pihole/gravity.db "SELECT count(DISTINCT domain) FROM gravity;"
time sqlite3 /etc/pihole/gravity.db "SELECT count(DISTINCT domain) FROM vw_gravity;"
and report the output?
Once done, does a run of pihole -g help resolve this issue (test again with the above two lines)?
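If you want to check the index question directly, a rough sketch follows. It runs against an in-memory demo table rather than the real /etc/pihole/gravity.db, and the two-column schema and the index name idx_gravity_domain are assumptions made up for illustration, not the actual Pi-hole schema. PRAGMA index_list and EXPLAIN QUERY PLAN are standard SQLite features, so the same two helpers should work when pointed at a real database file.

```python
import sqlite3


def index_names(conn, table):
    """List the indexes SQLite knows for a table (column 1 of PRAGMA index_list is the name)."""
    return [row[1] for row in conn.execute(f"PRAGMA index_list('{table}')")]


def query_plan(conn, sql):
    """Return the human-readable steps of EXPLAIN QUERY PLAN (last column of each row)."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


def build_demo_db():
    """Build a small in-memory table; the schema is an assumption for this sketch."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE gravity (domain TEXT NOT NULL, adlist_id INTEGER NOT NULL)")
    # 2000 rows but only 500 distinct domains, to mimic overlapping blocklists.
    rows = [(f"ads{i % 500}.example", i % 3) for i in range(2000)]
    conn.executemany("INSERT INTO gravity VALUES (?, ?)", rows)
    return conn


COUNT_SQL = "SELECT count(DISTINCT domain) FROM gravity"

if __name__ == "__main__":
    conn = build_demo_db()
    print("indexes before:", index_names(conn, "gravity"))
    print("plan before:", query_plan(conn, COUNT_SQL))
    # Hypothetical index name, used only for this demonstration.
    conn.execute("CREATE INDEX idx_gravity_domain ON gravity (domain)")
    print("indexes after:", index_names(conn, "gravity"))
    print("plan after:", query_plan(conn, COUNT_SQL))
    print("distinct domains:", conn.execute(COUNT_SQL).fetchone()[0])
```

An empty index list for the real table would support the missing-index theory. The exact wording of the plan lines depends on the SQLite version, so treat them as a hint only.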
I have the same problem. When enabling or disabling blocking, there is an approx. 1 minute delay before DNS resolution works.
Might be on the right track, as this is the result of the above 2 time commands:
time sqlite3 /etc/pihole/gravity.db "SELECT count(DISTINCT domain) FROM gravity;"
2378482
real 1m1.658s
user 0m32.470s
sys 0m3.541s
time sqlite3 /etc/pihole/gravity.db "SELECT count(DISTINCT domain) FROM vw_gravity;"
2378482
real 0m51.983s
user 0m43.606s
sys 0m3.199s
I'm guessing the Pi (I'm running a Pi 3B) just can't easily handle picking out the distinct records when the blocklist is so large and has so many duplicates:
[i] Number of gravity domains: 4645932 (2378482 unique domains)
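To put a number on "so many duplicates", here is a quick calculation using only the two figures from the line above (the function name is mine):

```python
def duplicate_share(total, unique):
    """Return (duplicate rows, duplicates as a fraction of all rows)."""
    duplicates = total - unique
    return duplicates, duplicates / total


if __name__ == "__main__":
    # Figures taken from the gravity summary line quoted above.
    dupes, share = duplicate_share(4645932, 2378482)
    print(f"{dupes} duplicate rows, {share:.1%} of the list")
```

So nearly half of the rows have to be collapsed every time the distinct count is computed.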
The problem, as I see it, isn't limited to disabling Pi-hole. I noticed the delay (DNS resolution unavailable) while getting acquainted with group management. The group management documentation says you need to run pihole restartdns reload-lists whenever you change something in group management. I never assumed this could affect other Pi-hole features, such as disabling Pi-hole, so I requested that code already available in the sources be added to display a warning (countdown) whenever this delay occurs.
You can read my request here; unfortunately, nobody confirmed the delay. You might still want to add a comment in that topic, to ensure that you at least get a warning (on the console, and if possible in the web interface) in case the problem cannot be resolved entirely (I'm hoping for a performance improvement, but don't know if this is possible).
Off topic:
I haven't been using pihole disable for a long time, since the feature (disabling / enabling pihole) is applied to pihole and thus to ALL clients.
In pihole v4.3.2, disabling pihole simply commented out the gravity list in /etc/dnsmasq.d/01-pihole.conf and informed pihole-FTL.
In pihole beta5, it looks like this is achieved (NOT TESTED) by changing the variable BLOCKING_ENABLED=true and informing pihole-FTL.
Both solutions result in a delay, practically unnoticeable in v4.3.2, apparently very much noticeable in beta5.
To overcome this problem (impact on ALL clients), I searched and found a solution for Windows devices.
My solution is explained here; it assumes you are using unbound, thus avoiding having to go outside the LAN for DNS resolution.
Note that the solution (the Windows command script) can be used without unbound, by simply using another resolver (208.67.222.222 - OpenDNS). I even implemented (NOT explained in the topic, explained here) running the script without getting the UAC prompt every time.
The downside of this solution: you have to install the script + desktop shortcut on each Windows device you want to give the option.
My results for a Pi Zero. Reloading takes minutes. The Zero being single-core ain't helping.
time sqlite3 /etc/pihole/gravity.db "SELECT count(DISTINCT domain) FROM gravity;"
1672511
real 1m32.540s
user 0m54.671s
sys 0m11.784s
pi@RaspPi:~$ time sqlite3 /etc/pihole/gravity.db "SELECT count(DISTINCT domain) FROM vw_gravity;"
1672507
real 1m45.270s
user 1m15.627s
sys 0m13.049s
Thanks for confirming the delay. I will have pihole -g compute the number once and simply store it in the database as a numeric value. Your devices will be able to read this number instantly.
and check if this reduces the delay you've been observing here. If it does not work initially (strange number shown on the dashboard), make sure you run pihole -g.
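To illustrate the idea of counting once and reading the stored number afterwards, here is a minimal sketch on an in-memory database. The info table, its property/value columns and the gravity_count key are assumptions chosen for the example and should not be read as the actual change made in the beta.

```python
import sqlite3


def update_stored_count(conn):
    """Slow path, run once after rebuilding the list: count and store the result."""
    count = conn.execute("SELECT count(DISTINCT domain) FROM gravity").fetchone()[0]
    conn.execute(
        "INSERT OR REPLACE INTO info (property, value) VALUES ('gravity_count', ?)",
        (count,),
    )
    conn.commit()
    return count


def read_stored_count(conn):
    """Fast path, run on every reload: a single-row lookup instead of a full scan."""
    row = conn.execute(
        "SELECT value FROM info WHERE property = 'gravity_count'"
    ).fetchone()
    return None if row is None else int(row[0])


def build_demo_db():
    """Hypothetical schema for the sketch; table and column names are assumptions."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE gravity (domain TEXT NOT NULL)")
    conn.execute("CREATE TABLE info (property TEXT PRIMARY KEY, value TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO gravity VALUES (?)",
        [("a.example",), ("b.example",), ("a.example",), ("c.example",)],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    print("stored before update:", read_stored_count(conn))
    update_stored_count(conn)
    print("stored after update:", read_stored_count(conn))
```

The trade-off is that the stored number is only as fresh as the last update, which matches the note above about running pihole -g if the dashboard shows a strange number.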
[2020-02-03 10:04:27.036 28138] Reloading DNS cache
[2020-02-03 10:04:27.036 28138] Blocking status is disabled
[2020-02-03 10:04:27.098 28138] INFO: No regex blacklist entries found
[2020-02-03 10:04:27.099 28138] INFO: No regex whitelist entries found
[2020-02-03 10:04:27.140 28138] Compiled 0 whitelist and 0 blacklist regex filters in 43.4 msec
pihole enable
[2020-02-03 10:06:02.006 28138] Reloading DNS cache
[2020-02-03 10:06:02.006 28138] Blocking status is enabled
[2020-02-03 10:06:02.069 28138] INFO: No regex blacklist entries found
[2020-02-03 10:06:02.071 28138] INFO: No regex whitelist entries found
[2020-02-03 10:06:02.100 28138] Compiled 0 whitelist and 0 blacklist regex filters in 32.4 msec