Pi-hole stopped working; after upgrading to v5.18.2 it worked again

Expected Behaviour:

  • OS: Raspbian GNU/Linux 11 (bullseye)
  • hardware: RaspberryPi3
  • pi-hole configuration with unbound

Pi-hole stopped resolving DNS in the "middle of the night"; nothing had changed in the OS or the configuration.
I rebooted, still nothing.
After I upgraded to Pi-hole v5.18.2 via pihole -up, it worked again.

Actual Behaviour:

Even after a reboot, DNS resolution did not work.
Can I do something to prevent this behaviour in the future?

The FTL.log looked like this before the reboot:

[2024-04-05 06:50:01.523 15239/T15353] Notice: Database size is 4551.99 MB, deleted 951 rows
[2024-04-05 07:00:02.014 15239/T15353] Notice: Database size is 4551.99 MB, deleted 862 rows
[2024-04-05 07:02:46.504 15239M] Resizing "FTL-dns-cache" from 3170304 to (198400 * 16) == 3174400 (/dev/shm: 22.0MB used, 510.0MB total, FTL uses 21.9MB)
[2024-04-05 07:02:48.668 15239M] WARNING in dnsmasq core: Maximum number of concurrent DNS queries reached (max: 150)
[2024-04-05 07:02:54.495 15239M] WARNING in dnsmasq core: Maximum number of concurrent DNS queries reached (max: 150)
[2024-04-05 07:08:17.501 15239M] WARNING in dnsmasq core: Maximum number of concurrent DNS queries reached (max: 150)
[2024-04-05 07:08:26.216 15239M] WARNING in dnsmasq core: Maximum number of concurrent DNS queries reached (max: 150)
[2024-04-05 07:10:03.145 15239/T15353] Notice: Database size is 4551.99 MB, deleted 1043 rows
[2024-04-05 07:10:26.465 15239M] WARNING in dnsmasq core: Maximum number of concurrent DNS queries reached (max: 150)
[2024-04-05 07:10:33.328 15239M] WARNING in dnsmasq core: Maximum number of concurrent DNS queries reached (max: 150)
[2024-04-05 07:10:39.746 15239M] WARNING in dnsmasq core: Maximum number of concurrent DNS queries reached (max: 150)
[2024-04-05 07:10:47.060 15239M] WARNING in dnsmasq core: Maximum number of concurrent DNS queries reached (max: 150)
[2024-04-05 07:10:54.245 15239M] WARNING in dnsmasq core: Maximum number of concurrent DNS queries reached (max: 150)
[2024-04-05 07:11:04.795 15239M] WARNING in dnsmasq core: Maximum number of concurrent DNS queries reached (max: 150)
[2024-04-05 07:11:16.133 15239M] WARNING in dnsmasq core: Maximum number of concurrent DNS queries reached (max: 150)
[2024-04-05 07:11:23.585 15239M] WARNING in dnsmasq core: Maximum number of concurrent DNS queries reached (max: 150)
[2024-04-05 07:11:36.760 15239M] WARNING in dnsmasq core: Maximum number of concurrent DNS queries reached (max: 150)
[2024-04-05 07:11:42.159 15239M] WARNING in dnsmasq core: Maximum number of concurrent DNS queries reached (max: 150)
[2024-04-05 07:12:08.527 15239M] WARNING in dnsmasq core: Maximum number of concurrent DNS queries reached (max: 150)
[2024-04-05 07:12:15.083 15239M] WARNING in dnsmasq core: Maximum number of concurrent DNS queries reached (max: 150)
[2024-04-05 07:20:00.934 15239/T15353] Notice: Database size is 4551.99 MB, deleted 874 rows
[2024-04-05 07:23:09.461 15239M] Resizing "FTL-dns-cache" from 3174400 to (198656 * 16) == 3178496 (/dev/shm: 22.0MB used, 510.0MB total, FTL uses 22.0MB)
[2024-04-05 07:27:24.050 15239M] WARNING in dnsmasq core: Maximum number of concurrent DNS queries reached (max: 150)
[2024-04-05 07:27:40.018 15239M] Rate-limiting 192.168.78.161 for at least 37 seconds
[2024-04-05 07:28:17.738 15239/T15354] Still rate-limiting 192.168.78.161 as it made additional 2774 queries
[2024-04-05 07:29:17.285 15239/T15354] Still rate-limiting 192.168.78.161 as it made additional 3204 queries
[2024-04-05 07:30:03.076 15239/T15353] Notice: Database size is 4551.99 MB, deleted 858 rows
[2024-04-05 07:30:17.494 15239/T15354] Still rate-limiting 192.168.78.161 as it made additional 1541 queries
[2024-04-05 07:31:17.791 15239/T15354] Still rate-limiting 192.168.78.161 as it made additional 1443 queries
[2024-04-05 07:32:17.016 15239/T15354] Still rate-limiting 192.168.78.161 as it made additional 1534 queries
[2024-04-05 07:33:17.212 15239/T15354] Still rate-limiting 192.168.78.161 as it made additional 1449 queries
[2024-04-05 07:33:41.099 525M] Using log file /var/log/pihole/FTL.log
[2024-04-05 07:33:41.107 525M] ########## FTL started on pihole! ##########

Debug Token:

xH8vOKhN

"Maximum number of concurrent DNS queries reached" together with rate limiting would indicate that, at times, either at least one client is requesting resolution excessively, or your upstream is unresponsive or inaccessible.

Run from the machine hosting your Pi-hole, the following commands may reveal overly active clients or excessively requested domains.

echo ">stats >quit" | nc localhost 4711
echo ">top-clients >quit" | nc localhost 4711
echo ">top-domains >quit" | nc localhost 4711
echo ">top-ads >quit" | nc localhost 4711
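
To see when the warnings cluster and which clients were rate-limited, the log itself can be tallied. The following is only a sketch: the function names `warnings_per_hour` and `rate_limit_events` are made up for illustration, and it assumes the log lines have the same layout as the FTL.log excerpt above.

```shell
# Count "Maximum number of concurrent DNS queries reached" warnings per hour.
# Assumes lines look like: [2024-04-05 07:02:48.668 15239M] WARNING in dnsmasq core: ...
warnings_per_hour() {
    awk '/Maximum number of concurrent DNS queries reached/ {
             # $1 is "[2024-04-05", $2 starts with the hour, e.g. "07:02:48.668"
             hour = substr($1, 2) " " substr($2, 1, 2) ":00"
             count[hour]++
         }
         END { for (h in count) print h, count[h] }' "$1" | sort
}

# Count rate-limiting log entries per client address.
# Matches both "Rate-limiting <ip> ..." and "Still rate-limiting <ip> ..." lines.
rate_limit_events() {
    awk '{
             for (i = 1; i < NF; i++)
                 if (tolower($i) == "rate-limiting") { seen[$(i + 1)]++; break }
         }
         END { for (c in seen) print c, seen[c] }' "$1" | sort
}
```

Both functions take the log path as their only argument, e.g. `warnings_per_hour /var/log/pihole/FTL.log`, and print one line per hour or per client followed by the count.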

Thanks for the fast answer.
I regularly look at the stats on the dashboard, and I do sometimes get a "client has been rate-limited" message.

What I never got (until today at around 6 am) was a completely unresponsive DNS. I tried multiple clients; even the Pi-hole itself didn't get a DNS answer.

Admittedly, my network has quite a few clients (~60 always active and around 20 "on demand", like smart TVs and such). My laptop and my son's in particular tend to make a lot of queries. But both laptops were shut down during that period this morning.

The top domains are no surprise: mostly regular services like NTP, the API for my electric scooter (NIU), the API for my NAS, OneDrive, and of course another forum, which I had open in a tab yesterday.
The top ad domains are also no surprise: mostly Microsoft telemetry, PlayStation and my Whirlpool API (my.idigi.com), which I put on a blacklist.

From my point of view, there's not much to see, except both laptops running into an occasional rate limit (which neither I nor my son really notice). I was thinking that perhaps something was blocked, and that updating Pi-hole to the current version released that block? That's why I included the FTL.log in my initial post. Or is this also "normal behaviour"?

At present it works "as designed", so that's OK for now. I just don't want to run into something like this again, perhaps when I'm not home and have no chance to look into it...

The output of the commands you suggested:

pi@pihole:~ $ echo ">stats >quit" | nc localhost 4711
domains_being_blocked 4284882
dns_queries_today 255497
ads_blocked_today 101601
ads_percentage_today 39.766026
unique_domains 4379
queries_forwarded 117003
queries_cached 33414
clients_ever_seen 69
unique_clients 68
dns_queries_all_types 255497
reply_UNKNOWN 52212
reply_NODATA 12268
reply_NXDOMAIN 8307
reply_CNAME 57508
reply_IP 123092
reply_DOMAIN 1161
reply_RRNAME 4
reply_SERVFAIL 265
reply_REFUSED 94
reply_NOTIMP 0
reply_OTHER 2
reply_DNSSEC 18
reply_NONE 0
reply_BLOB 566
dns_queries_all_replies 255497
privacy_level 0
status enabled
pi@pihole:~ $ echo ">top-clients >quit" | nc localhost 4711
0 78375 192.168.78.161 LaptopThomas.fritz.box
1 26459 192.168.78.165 LaptopDavid.fritz.box
2 18061 fd80::f95d:xx:xx:xx S21-von-Thomas.fritz.box
3 17260 192.168.78.73 HS110.fritz.box
4 15446 fd80::cca3:xx:xx:xx EchoBadezimmer.fritz.box
5 12810 192.168.78.171 LaptopSarah.fritz.box
6 12135 192.168.78.124 PS5-Dave.fritz.box
7 11155 fd80::xx:xx:xx:xx EchoArbeiten.fritz.box
8 11049 192.168.78.164 HandyDave.fritz.box
9 8484 192.168.78.14 oktopusread.fritz.box
pi@pihole:~ $ echo ">top-domains >quit" | nc localhost 4711
0 8623 time.nist.gov
1 8612 pool.ntp.org
2 6256 app-api-fk.niu.com
3 4375 www.google.com
4 3697 wpad.fritz.box
5 3224 community.openhab.org
6 2002 account-fk.niu.com
7 1956 api.insight.synology.com
8 1867 alexa.amazon.de
9 1681 api.onedrive.com
pi@pihole:~ $ echo ">top-ads >quit" | nc localhost 4711
0 21445 self.events.data.microsoft.com
1 13875 eu-mobile.events.data.microsoft.com
2 11950 eu-office.events.data.microsoft.com
3 10137 telemetry-console.api.playstation.com
4 3874 teams.events.data.microsoft.com
5 3353 mobile.pipe.aria.microsoft.com
6 2897 my.idigi.com
7 2634 global.telemetry.insights.video.a2z.com
8 2111 www.msftncsi.com
9 1878 fls-eu.amazon.com

The top 5 or so blocked domains seem to be MS telemetry related, but they are not overly excessive.
Still, you may try to tune those down a bit by configuring your Windows clients according to BSI recommendations, see e.g. Schritt-für-Schritt-Anleitung: Telemetriedaten-Übertragung bei Windows 10 Home abschalten – Datenschutz – Unter dem Radar (in English: "Step-by-step guide: disabling telemetry data transmission in Windows 10 Home").

Likewise, 94 is not that large a count of REFUSED replies, indicating that while excessive DNS requests have happened, they are not likely to be your main issue.

Instead, this may suggest that your upstream DNS servers are unresponsive at times, which is further supported by roughly 20% of your DNS requests being shown as UNKNOWN. That commonly happens when Pi-hole did not receive a reply for a forwarded DNS request.
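
The share can be read straight off the `>stats` output: with the figures posted above, 52212 UNKNOWN replies out of 255497 total replies is about 20.4%. A small helper to recompute this later might look like the following sketch; the function name `reply_share` is made up, and it assumes the stats output keeps the `key value` layout shown above.

```shell
# Print the percentage of replies of one type, reading ">stats" output on stdin.
# $1 is the key to look up, e.g. reply_UNKNOWN or reply_REFUSED.
reply_share() {
    awk -v key="$1" '
        $1 == key                       { part = $2 }
        $1 == "dns_queries_all_replies" { total = $2 }
        END {
            if (total > 0) printf "%.1f\n", 100 * part / total
            else           print "n/a"
        }'
}
```

It could be used as `echo ">stats >quit" | nc localhost 4711 | reply_share reply_UNKNOWN`. A value that stays high over several days would point towards the upstream rather than towards individual clients.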

You are using unbound as Pi-hole's upstream, and unbound would be a bit slower than a public caching resolver when resolving a domain for the first time (but quicker for domains it has already cached).

In addition, unbound requires correct time information, which may be off on RTC-less systems like your RPi, especially after periods of longer shutdowns, resulting in no DNS resolution until the time is resync'ed with a time server.

And indeed, your two top allowed domains are for time servers.

As you are operating a FritzBox router, and that router can be configured to act as a local time server, you could consider shadowing time.nist.gov and pool.ntp.org via Local CNAME records pointing to time.fritz.box, and creating a Local DNS record for time.fritz.box that points to your router's IP address.

If Pi-hole's Conditional Forwarding is enabled, those DNS records would allow time to be synced via the given CNAMEs even if your system's time were off by too much for unbound to successfully resolve any domain at all (due to failing DNSSEC validation).
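
For reference, on a Pi-hole v5 install the records created through the web interface (Local DNS) should end up looking roughly like the fragment below. This is a sketch, not a copy of a real configuration: 192.168.78.1 is only an assumed router address (substitute your FritzBox's actual IP), and the lines starting with `#` are annotations naming the file each entry is expected to land in, not something to paste.

```
# /etc/pihole/custom.list  (Local DNS > DNS Records)
192.168.78.1 time.fritz.box

# /etc/dnsmasq.d/05-pihole-custom-cname.conf  (Local DNS > CNAME Records)
cname=time.nist.gov,time.fritz.box
cname=pool.ntp.org,time.fritz.box
```

Adding the entries through the web interface is the safer route. If the files are edited by hand instead, the resolver has to be restarted afterwards (for example with `pihole restartdns`) before the records take effect.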

The Win11 on my business laptop does not allow me to change that. I have contacted my group IT; perhaps they'll change the settings. Thanks for the link, I didn't know that such a recommendation existed.

Thanks, I configured Pi-hole accordingly. But my Pi-hole should already be asking my FritzBox, according to my /etc/systemd/timesyncd.conf:

[Time]
NTP=fritz.box
FallbackNTP=0.debian.pool.ntp.org 1.debian.pool.ntp.org 2.debian.pool.ntp.org 3.debian.pool.ntp.org
RootDistanceMaxSec=5
PollIntervalMinSec=32
PollIntervalMaxSec=2048
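
One way to check whether the clock is actually being kept in sync with this configuration is to ask systemd directly. The sketch below wraps `timedatectl show`; the function name `clock_state` is made up, and it assumes a systemd version that supports `timedatectl show` (the one shipped with bullseye should).

```shell
# Report whether systemd considers the system clock NTP-synchronised.
# Relies on: timedatectl show --property=NTPSynchronized --value  (prints "yes" or "no")
clock_state() {
    if [ "$(timedatectl show --property=NTPSynchronized --value)" = "yes" ]; then
        echo "synchronized"
    else
        echo "not synchronized"
    fi
}
```

Running `clock_state` right after a boot, before DNS works, would show whether the time had already been set at that point. `timedatectl timesync-status` additionally lists the server that timesyncd is currently using, which should be the FritzBox if the `NTP=fritz.box` line is being honoured.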
