Thanks for the thought, ShooflyPie.
However, there seems to be no obvious relation to CPU architecture.
So far, RPi models that encounter the issue cover ARM11 design (used in RPi 1 and Zero), Cortex-A7 (used in early RPi 2B (right, Bull?)), and Cortex-A72 (used in RPi 4B).
I am still unable to reproduce the issue, either on my Zero or on my RPi 3A+ (a Cortex-A53 design, also used in the 3B and 3B+ as well as the late RPi 2B v1.2).
EDIT: ShooflyPie, Zoaky, kusha, g_p: Are you using Pi-hole as your DHCP server, like Bull?
I'm not using Pihole as a DHCP server. DHCP is handled by my router (an Eero mesh, if it matters).
Is there anything I can run for you - any kind of report I can generate while it's happening that might help? I've tried to do the debug log thing...the log gets generated, but it fails to upload itself due to the lack of connectivity.
Yes, actually, there is something which will tell us what is going on, but it will create a potentially huge log file (which is a good thing in the end).
Please set
DEBUG_ALL=true
in /etc/pihole/pihole-FTL.conf and run pihole restartdns reload
The reload instruction does the same thing that happens when you modify a list or change the blocking status, so it should already trigger the delay. /var/log/pihole-FTL.log will now likely grow quickly with output. With the full output of the outage, we should be able to identify what is causing the delay so we can continue looking for the why and finally the how (to fix it).
Maybe trigger it twice just to ensure the debug option is already present before the delay begins.
The content of /var/log/pihole.log and /var/log/pihole-FTL.log will be of interest.
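If you would rather script the config change than edit the file by hand, here is a minimal Python sketch. It assumes only that /etc/pihole/pihole-FTL.conf is a plain KEY=value file, as the instruction above implies. The helper name set_option and the key SOME_OTHER_OPTION are made up for illustration. The function works on text only, so writing the result back to the file (as root) and running pihole restartdns reload are still up to you.

```python
def set_option(conf_text: str, key: str, value: str) -> str:
    """Return conf_text with key=value set exactly once.

    An existing line for the key is replaced, duplicates are dropped,
    and the option is appended if it was not present. Commented-out
    lines (e.g. "#DEBUG_ALL=false") are left untouched.
    """
    out = []
    replaced = False
    for line in conf_text.splitlines():
        name = line.split("=", 1)[0].strip()
        if "=" in line and name == key:
            if not replaced:
                out.append(f"{key}={value}")
                replaced = True
            # Any further occurrence of the same key is dropped.
        else:
            out.append(line)
    if not replaced:
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"


# Hypothetical existing config content, for illustration only.
example = "SOME_OTHER_OPTION=1\nDEBUG_ALL=false\n"
updated = set_option(example, "DEBUG_ALL", "true")
print(updated, end="")
```

Running it twice gives the same result, so it is safe to call again before triggering the reload a second time.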
Just registered to chime in here also. I am having the same issue on a Pi 2B and on a Pi 4 (different networks). When I disable blocking for any length of time, DNS completely breaks and I need to restart the Pi.
[2020-07-22 06:45:45.524 1511M] Reloading DNS cache
[2020-07-22 07:27:11.748 1511M] Blocking status is disabled
[2020-07-22 07:27:43.357 1511M] INFO: No regex whitelist entries found
[2020-07-22 07:27:43.568 1511M] Compiled 0 whitelist and 1 blacklist regex filters for 78 clients in 215.3 msec
[2020-07-22 07:28:13.822 1511M] Reloading DNS cache
[2020-07-22 07:28:13.822 1511M] Blocking status is enabled
[2020-07-22 07:28:42.839 1511M] INFO: No regex whitelist entries found
[2020-07-22 07:28:43.049 1511M] Compiled 0 whitelist and 1 blacklist regex filters for 78 clients in 214.8 msec
[2020-07-22 07:29:16.632 1511M] Reloading DNS cache
[2020-07-22 07:29:16.633 1511M] Blocking status is enabled
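To put numbers on the stall in an excerpt like the one above, here is a minimal Python sketch. It assumes the timestamp format shown in these lines, and it treats a "Blocking status is ..." line as the start of a reload and the next "Compiled ..." line as its end. That pairing is only my reading of this excerpt, not something documented by FTL, and the function name reload_durations is made up.

```python
import re
from datetime import datetime

# Matches lines such as:
# [2020-07-22 07:27:11.748 1511M] Blocking status is disabled
LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) [^\]]*\] (.*)$"
)


def reload_durations(lines):
    """Return seconds elapsed from each 'Blocking status is' line
    to the next 'Compiled' line (assumed to bracket one reload)."""
    durations = []
    start = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not a timestamped log line
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        message = match.group(2)
        if message.startswith("Blocking status is"):
            start = stamp
        elif message.startswith("Compiled") and start is not None:
            durations.append((stamp - start).total_seconds())
            start = None
    return durations


sample = [
    "[2020-07-22 07:27:11.748 1511M] Blocking status is disabled",
    "[2020-07-22 07:27:43.357 1511M] INFO: No regex whitelist entries found",
    "[2020-07-22 07:27:43.568 1511M] Compiled 0 whitelist and 1 blacklist regex filters for 78 clients in 215.3 msec",
    "[2020-07-22 07:28:13.822 1511M] Blocking status is enabled",
    "[2020-07-22 07:28:43.049 1511M] Compiled 0 whitelist and 1 blacklist regex filters for 78 clients in 214.8 msec",
]
for seconds in reload_durations(sample):
    print(f"{seconds:.3f} s")
```

On these lines the two intervals come out at roughly 31.8 and 29.2 seconds, even though the regex compilation itself reports only about 215 msec.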
I wonder if this comes from the large number of clients. Does 78 sound reasonable to you?
If you also have a very large number of domains, this may end up in FTL having to purge a lot of memory and, somehow, it takes too long for this (N * M problem). What is the average number of queries per day in your network and what hardware are you running your Pi-hole on?
I just checked my router's device list and 78 is accurate, though it's closer to 50 at any given moment as devices go to sleep, etc. That supports four people and several devices running all day. Smart TVs, wifi scales, doorbell, cameras, phones, tablets, e-readers, smart speakers, computers, etc.
Here are the queries from the 24-hour period yesterday. It peaked around 1100, but the averages mainly hover around 500-600 on any given day.
I'm running PiHole on a Raspberry Pi Zero W. I agree that it could be the weak link in the chain here, but it did run PiHole 4.x without a hiccup for years.
Yeah, I see that. However, Pi-hole v5.0 introduced per-client filtering, effectively making everything cost roughly
N(client) * (amount of work with v4.x treating blocking the same for every client)
In reality, it is somewhat less dramatic, however, it can explain why you are now seeing 30 seconds of delay where there was maybe 1-2 before (which went unnoticed).
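As a rough back-of-the-envelope check of that scaling, here is a small Python sketch using only the numbers from this thread (78 clients, about 30 seconds observed now, maybe 1-2 seconds before). The function names are made up, and the linear model is just the worst-case reading of the formula above, not a measurement of what FTL actually does.

```python
def worst_case_seconds(n_clients: int, v4_seconds: float) -> float:
    """Worst case: the v4.x reload cost is paid once per client."""
    return n_clients * v4_seconds


def implied_per_client_seconds(observed_seconds: float, n_clients: int) -> float:
    """Per-client cost implied by an observed delay, assuming linear scaling."""
    return observed_seconds / n_clients


clients = 78

# Worst case if the old 1-2 s cost were repeated for every client.
low = worst_case_seconds(clients, 1.0)
high = worst_case_seconds(clients, 2.0)
print(f"worst case: {low:.0f}-{high:.0f} s")

# What the observed ~30 s works out to per client.
per_client = implied_per_client_seconds(30.0, clients)
print(f"implied per-client cost: {per_client:.2f} s")
```

The observed 30 seconds sits well below the 78-156 second worst case, which fits the "somewhat less dramatic" remark: the implied cost is about 0.38 seconds per client rather than a full 1-2 seconds each.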
I don't know if we can do much about it, but I will think about it some more.
For comparison: I have 5 devices in my home network, with 2 people actively using the web. But setups differ.
And thanks @DL6ER for the context on how the new filtering style works. I remember seeing some messaging that suggested that weaker devices like the Zero W wouldn't be supported anymore during the 5.0 beta, but that all went away with the 5.0 release. Might need to revisit that decision. And it's all my fault!
I think there was some confusion here. The bugfix for this had not been merged into v5.1.1 at the time you posted. It was still open for review. They just merged it an hour ago. Please update and try again. Maybe this can save your Pi Zero from being scrapped.