Pi-hole freezes

Hey @gdw4fpj4tayxug8 / @Bucking_Horn

Actually, I have the exact same problem, albeit on a different setup. I run Pi-hole on a Raspberry Pi 4 and have been using it for years without issues. About 2 months ago, it suddenly started to "freeze" on random occasions (meaning it no longer responded to DNS queries). My debug log looks exactly like the one from @gdw4fpj4tayxug8 (everything seems to be running fine, but resolution fails and the admin GUI will not load). I even replaced my SD card with a brand new one and did a complete fresh install.
Restarting the pihole-FTL service fixes the issue, but I would also like to know why this happens and how to prevent it. If it helps, here is my debug token: ytOx77ix

Thank you :cowboy_hat_face:

Please don't jump to conclusions.
There are hundreds of lines that differ in your debug log, including OS distribution, processing unit, Pi-hole version, IPv4 connectivity, port allocation, choice of blocklists, to name just a few.

In particular, your debug log shows your pihole-FTL to be up and running, and resolution tests are succeeding (which seems to contradict your observation).

It also shows your Pi-hole has not received a single DNS request for about the last 24 hours at the time of debug log creation:

   -----tail of FTL.log------
   [2023-01-20 08:26:43.515 4934M] Imported 0 queries from the long-term database
   [2023-01-20 08:26:43.515 4934M]  -> Total DNS queries: 0
   [2023-01-20 08:26:43.515 4934M]  -> Cached DNS queries: 0
   [2023-01-20 08:26:43.515 4934M]  -> Forwarded DNS queries: 0
   [2023-01-20 08:26:43.515 4934M]  -> Blocked DNS queries: 0
   [2023-01-20 08:26:43.515 4934M]  -> Unknown DNS queries: 0
   [2023-01-20 08:26:43.515 4934M]  -> Unique domains: 0
   [2023-01-20 08:26:43.515 4934M]  -> Unique clients: 0
   [2023-01-20 08:26:43.515 4934M]  -> Known forward destinations: 0

If there have been active devices during that time, this may suggest that your clients are not sending their DNS requests to your Pi-hole.

Your debug log indicates you are using a FritzBox router (another difference).
Please share the details of its IPv6 DNS configuration.

[✓] IPv4 address(es) bound to the eth0 interface:
    192.168.178.36/24

      router: 192.168.178.1
      dns-server: 192.168.178.63
      domain-name: "fritz.box"
      broadcast: 192.168.178.255
      ntp-server: 192.168.178.1
      Port Control Protocol (PCP) server: 192.168.178.1

The IP of your Pi-hole and the IP your FritzBox sends out to your clients via DHCP don't match.

Hi @Bucking_Horn & @yubiuser, thank you for your answers :hugs:

Yes, you are absolutely right. From a user's point of view the issue looked similar, so I decided to use this thread rather than open a new one. Sorry if that is against the rules.
Unfortunately, I pulled the debug log after I had restarted the FTL service, which may explain why it shows no issue. I also believe the log shows no requests because of my settings: I have query logging turned off and the privacy level set to anonymous mode. Just to mention: my setup ran perfectly fine with those settings for years. The issue arose only recently, maybe 2 months ago. I also suspected my SD card had degraded (although the SD check turned out fine), so I replaced it and did a clean, fresh install of Raspbian OS and Pi-hole only, nothing else.
Indeed, I use a FB as my router, but I have IPv6 disabled by choice.

@yubiuser: yes, that's correct. I run my clients in a Microsoft Active Directory environment where the domain controllers act as a SPoC. Therefore, the DNS route looks as follows: client -> DC -> Pi-hole -> web.

I am by no means a developer and my understanding is surely limited. However, I find it hard to see how my router or the rest of my hardware setup could cause the pihole-FTL service itself to become unresponsive and freeze. I'll wait until it happens again and post a new debug token for further analysis.

Thanks a bunch guys :hugs:

Ah, of course.

That setting may also make it harder to find out what's happening.
You may want to re-enable logging until your issue reoccurs, or even until it is dealt with.

There are quite a few ways that could happen.

A change of IPv6 prefixes would have rendered your Pi-hole's IPv6 GUA address invalid.
If you'd put that GUA address in your FB configuration, clients with a preference for IPv6 would take a while before falling back to IPv4, potentially for every DNS request.
Depending on client behaviour, that may be observed as very slow resolution or a temporary outage.
But you have IPv6 disabled, so this can be ruled out as a cause.

Excessive DNS requests could overwhelm any DNS resolver.
Pi-hole's rate limiting tries to safeguard against this. Of course, to rate-limited clients, this will look like a DNS outage.
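
To illustrate why rate limiting looks like an outage from the client side, here is a minimal Python sketch of a per-client fixed-window limiter. It is not Pi-hole's actual code: the class name is made up, and the 1000 queries per 60 seconds are only what I believe Pi-hole's default to be, so please check your own configuration. Once a client has used up its budget, every further query in that window goes unanswered.

```python
class FixedWindowRateLimiter:
    """Toy per-client limiter (hypothetical, not Pi-hole's implementation)."""

    def __init__(self, limit=1000, interval=60):
        self.limit = limit          # assumed default: 1000 queries ...
        self.interval = interval    # ... per 60 seconds
        self.state = {}             # client -> (window start, query count)

    def allow(self, client, now):
        """Return True if a query from `client` at time `now` (seconds) is served."""
        start, count = self.state.get(client, (now, 0))
        if now - start >= self.interval:
            start, count = now, 0   # window expired: start a fresh one
        count += 1
        self.state[client] = (start, count)
        return count <= self.limit


# Small demo with a limit of 3 so the effect is visible.
limiter = FixedWindowRateLimiter(limit=3, interval=60)
results = [limiter.allow("192.168.178.63", t) for t in (0, 1, 2, 3, 59, 60)]
print(results)
```

Note that when a forwarding DNS server sits in front of Pi-hole, all forwarded queries arrive from that server's single address and would therefore share one budget.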

DNS loops may amplify or even cause excessive requests.
Your debug log shows you have enabled Pi-hole's Conditional Forwarding.
That may close a partial DNS loop if your router uses Pi-hole as its upstream DNS resolver.
With your router, that could mean that a query for an unknown host from a guest network client bounces between Pi-hole and your FritzBox until the FritzBox gets rate-limited.
However, that should only affect general DNS resolution very briefly, while your guest network may suffer from being rate-limited a bit longer or for a sustained period, depending on how desperate clients would be to repeat their failed requests.

You may also want to investigate the possibility of a DNS loop between your DC, Pi-hole and your router.

The next time the outage occurs, try running some DNS lookups from a client, and monitor your Pi-hole's Query Log for the requests (if you decide to re-enable logging).
Also take note of Pi-hole's diagnosis panel; maybe something shows up there.
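
If you'd rather script those lookups than type them by hand, something like the following Python sketch could help. It only uses the standard library, sends one plain UDP query for an A record, and tells apart an answer, a timeout, and an unreachable port. The function names are my own, and the 2 second timeout is just a guess at a sensible value.

```python
import socket
import struct


def build_query(name, query_id=0x1234, qtype=1):
    """Build a minimal DNS query (qtype 1 = A record, class IN, recursion desired)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels)
    return header + qname + b"\x00" + struct.pack("!HH", qtype, 1)


def probe(server, name, timeout=2.0):
    """Send one UDP query to server:53 and classify what comes back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect((server, 53))
            sock.send(build_query(name))
            reply = sock.recv(512)
        except socket.timeout:
            return "timeout"        # no reply within the timeout
        except OSError:
            return "unreachable"    # e.g. port closed or no route to host
    if len(reply) < 4:
        return "answered (malformed)"
    return f"answered (rcode {reply[3] & 0x0F})"


# Example call (address taken from the debug log above):
# print(probe("192.168.178.36", "pi-hole.net"))
```

A "timeout" while the host still responds otherwise would point at the resolver being stuck rather than the machine being down.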

It just happened again last night, and I noticed it this morning while trying to read your reply :sweat_smile:
The problem is: while FTL is frozen, I cannot access the admin GUI in the browser; it just won't load any further after logging in. Therefore I had no access to the diagnosis panel. I did enable query logging via SSH, but I had no idea that enabling it would restart the DNS service. And unfortunately I did not pull the -d first :smiling_face_with_tear: But now that I know, I'll do it next time.

I did, however, run some DNS requests from my client while it was still frozen. They are not that surprising, though, and I guess they won't really help:

C:\Users\lopov>nslookup google.ch
Server:  srv01.aw29.local
Address:  192.168.178.63
DNS request timed out.
    timeout was 2 seconds.
DNS request timed out.
    timeout was 2 seconds.
*** Zeitüberschreitung bei Anforderung an srv01.aw29.local.
C:\Users\lopov>nslookup bing.ch 192.168.178.36
DNS request timed out.
    timeout was 2 seconds.
Server:  UnKnown
Address:  192.168.178.36
DNS request timed out.
    timeout was 2 seconds.
DNS request timed out.
    timeout was 2 seconds.
DNS request timed out.
    timeout was 2 seconds.
^C
C:\Users\lopov>nslookup rag.ch 192.168.178.63
Server:  srv01.aw29.local
Address:  192.168.178.63
Nicht autorisierende Antwort:
Name:    rag.ch
Addresses:  2a06:c01:1:1102::31
          5.102.145.31
C:\Users\lopov>nslookup coop.ch 192.168.178.63
Server:  srv01.aw29.local
Address:  192.168.178.63
DNS request timed out.
    timeout was 2 seconds.
DNS request timed out.
    timeout was 2 seconds.
*** Zeitüberschreitung bei Anforderung an srv01.aw29.local.

FYI: the DC is set to a 3 second waiting time before forwarding the DNS query to the root servers when there's no answer from Pi-hole.

A loop between PH / FB and DC should not happen in theory. However, I disabled CF to see if it changes anything (and actually there is not a lot to see, since the only client sending DNS requests should be the DC itself).

I will keep you updated. Thanks again! :cowboy_hat_face:

At least that indicates Pi-hole's DNS service is still accepting requests, but is not capable of answering them in a timely manner. You'd probably have seen something like 'no servers could be reached' otherwise.

Likely unrelated to your issue, but where does that .local come from?

In case there's a DNS record for that domain somewhere:
Note that the .local TLD is reserved for mDNS usage and should NOT be used with plain DNS.

This is one of my MS-AD trees (srv01 being DC1) that I use in my setup; it is only used locally, with no connection to the outside world. There is no zone for local. and no conditional forwarding. Oh, and no Apple devices in my network :wink:

I'd still recommend moving away from .local.

AFAIAAO, somewhere in the evolution of Win10, MS has added (optional?) mDNS support, and apparently Win11 is employing it by default, see e.g. Aligning on mDNS: ramping down NetBIOS name resolution and LLMNR - Microsoft Community Hub and So nutzen Sie Multicast DNS auf Windows-Systemen (sorry, German link target).

I do partially agree with you. .local is not considered best practice anymore, and MS also recommends not using it.

However - and also AFAIAAO - clients will not use Bonjour / mDNS if they have correct DNS settings configured (why should they?). This especially applies to clients that are joined to a domain. Just imagine the potential security issues if domain clients started to blurt out mDNS requests all over :sweat_smile:

That is not the case.

Hey @Bucking_Horn
Okay, here we go again: FTL froze up and is not responding anymore. This time I pulled the debug log; the token is DvdV0zGJ.

nslookup results in:

PS C:\Users\xxx> nslookup bing.com
Server:  srv01.aw29.local
Address:  192.168.178.63
DNS request timed out.
    timeout was 2 seconds.
DNS request timed out.
    timeout was 2 seconds.
*** Zeitüberschreitung bei Anforderung an srv01.aw29.local.
PS C:\Users\xxx> nslookup bing.com
Server:  srv01.aw29.local
Address:  192.168.178.63
Nicht autorisierende Antwort:
Name:    bing.com
Addresses:  2620:1ec:c11::200
          204.79.197.200
          13.107.21.200
PS C:\Users\bluesunset.000> nslookup bing.com
Server:  srv01.aw29.local
Address:  192.168.178.63
DNS request timed out.
    timeout was 2 seconds.
DNS request timed out.
    timeout was 2 seconds.
*** Zeitüberschreitung bei Anforderung an srv01.aw29.local.
PS C:\Users\xxxx> nslookup pi-hole.net 192.168.178.36
DNS request timed out.
    timeout was 2 seconds.
Server:  UnKnown
Address:  192.168.178.36
DNS request timed out.
    timeout was 2 seconds.
DNS request timed out.
    timeout was 2 seconds.
DNS request timed out.
    timeout was 2 seconds.
DNS request timed out.
    timeout was 2 seconds.
*** Zeitüberschreitung bei Anforderung an UnKnown.

Also attached is a screenshot of my admin GUI after I log in to it. Additional info via Pastebin:

  • Result of "sudo service pihole-FTL status"
    Pastebin

After entering sudo service pihole-FTL restart everything is back to normal and working fine.

I hope you can find something within the info I provided. Thanks a lot :crossed_fingers:

These lines are unexpected:

*** [ DIAGNOSING ]: contents of /var/log/pihole

-rw-r--r-- 1 pihole pihole 4.8K 23. Jan 16:19 /var/log/pihole/FTL.log
   -----head of FTL.log------
   [2023-01-23 00:59:11.738 30554/F22466] TCP worker already terminating!
   [2023-01-23 00:59:13.822 22466/T22491] Error when obtaining outer SHM lock: Der Eigentümer-Prozess wurde beendet
   [2023-01-23 00:59:13.823 22466/T22491] Error when obtaining inner SHM lock: Der Eigentümer-Prozess wurde beendet
   [2023-01-23 01:18:20.708 22466M] Resizing "FTL-queries" from 1622016 to (40960 * 44) == 1802240 (/dev/shm: 2.9MB used, 2.0GB total, FTL uses 2.9MB)
   [2023-01-23 01:54:26.418 30934/F22466] TCP worker already terminating!
   [2023-01-23 01:54:36.874 22466M] Error when obtaining outer SHM lock: Der Eigentümer-Prozess wurde beendet
   [2023-01-23 01:54:36.874 22466M] Error when obtaining inner SHM lock: Der Eigentümer-Prozess wurde beendet
   [2023-01-23 02:54:46.489 31336/F22466] TCP worker already terminating!
   [2023-01-23 02:54:46.630 22466M] Error when obtaining outer SHM lock: Der Eigentümer-Prozess wurde beendet
   [2023-01-23 02:54:46.630 22466M] Error when obtaining inner SHM lock: Der Eigentümer-Prozess wurde beendet
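
In case you want to keep an eye on how often these messages turn up, here is a rough Python sketch that groups FTL.log lines by message type and remembers when each type was first seen. The regular expression is only inferred from the lines quoted above, and the category names are my own, so treat it as a starting point rather than a proper parser.

```python
import re
from collections import Counter

# Timestamp, process/thread tag, message; pattern inferred from the excerpt above.
LINE = re.compile(r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ ([^\]]+)\] (.*)")

SAMPLE = [
    "[2023-01-23 00:59:11.738 30554/F22466] TCP worker already terminating!",
    "[2023-01-23 00:59:13.822 22466/T22491] Error when obtaining outer SHM lock: Der Eigentümer-Prozess wurde beendet",
    "[2023-01-23 00:59:13.823 22466/T22491] Error when obtaining inner SHM lock: Der Eigentümer-Prozess wurde beendet",
    '[2023-01-23 01:18:20.708 22466M] Resizing "FTL-queries" from 1622016 to (40960 * 44) == 1802240',
]


def summarize(lines):
    """Count log messages per category and record the first timestamp of each."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        match = LINE.search(line)
        if not match:
            continue                      # skip headers and blank lines
        timestamp, _process, message = match.groups()
        if "SHM lock" in message:
            kind = "shm_lock"
        elif message.startswith("TCP worker"):
            kind = "tcp_worker"
        else:
            kind = "other"
        counts[kind] += 1
        first_seen.setdefault(kind, timestamp)
    return counts, first_seen


counts, first_seen = summarize(SAMPLE)
print(dict(counts), first_seen)
```

To run it against the real file, you could pass `open("/var/log/pihole/FTL.log", encoding="utf-8", errors="replace")` instead of `SAMPLE`.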

And your service status is unusual as well:

pi@masterberry:~ $ sudo service pihole-FTL status
* pihole-FTL.service - Pi-hole FTL
     Loaded: loaded (/etc/systemd/system/pihole-FTL.service; enabled; vendor preset: enabled)
     Active: active (running) since Sun 2023-01-22 09:24:11 CET; 1 day 8h ago
   Main PID: 22466 (pihole-FTL)
      Tasks: 20 (limit: 4915)
        CPU: 6min 2.233s
     CGroup: /system.slice/pihole-FTL.service
             |- 5176 /usr/bin/pihole-FTL -f
             `-22466 /usr/bin/pihole-FTL -f

I'd expect that to show only one pihole-FTL instance, not two.

You wouldn't perhaps invoke a second instance manually?
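
If you want to check this automatically the next time it freezes, a small Python sketch like the one below could list the processes in the service's CGroup and flag any that are not the main PID. It only parses text in the layout shown in the status output above; the function name and the sample constant are mine, and other systemd versions may draw the process tree differently.

```python
import re

# Excerpt of the status output quoted above.
SAMPLE = """\
   Main PID: 22466 (pihole-FTL)
      Tasks: 20 (limit: 4915)
        CPU: 6min 2.233s
     CGroup: /system.slice/pihole-FTL.service
             |- 5176 /usr/bin/pihole-FTL -f
             `-22466 /usr/bin/pihole-FTL -f
"""

# Tree entries look like "|- 5176 cmd" or "`-22466 cmd" (or the UTF-8 variants).
TREE_ENTRY = re.compile(r"^[ \t]*(?:[|`]-|[├└]─)[ \t]*(\d+)[ \t]+\S", re.MULTILINE)


def cgroup_processes(status_text):
    """Return (main_pid, [all pids listed under CGroup]) from status text."""
    main = re.search(r"Main PID:\s*(\d+)", status_text)
    main_pid = int(main.group(1)) if main else None
    pids = [int(m.group(1)) for m in TREE_ENTRY.finditer(status_text)]
    return main_pid, pids


main_pid, pids = cgroup_processes(SAMPLE)
extras = [pid for pid in pids if pid != main_pid]
print(f"main PID {main_pid}, additional processes: {extras}")
```

On the Pi itself, the text could come from `subprocess.run(["systemctl", "status", "pihole-FTL"], capture_output=True, text=True).stdout`.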

Hmm okay, I see. Although I lack the skills to fully understand what's going wrong :see_no_evil:

No, the only thing I did was: set up a new Raspbian OS on a new SD card, apt update && upgrade, install and configure Pi-hole. That's basically it.

Do you know if and how I can fix this?

Edit: oh, and I manually restart FTL whenever it freezes up. But aside from that: no manual interventions.