Pihole occasional service freeze

My vanilla Raspberry Pi Pi-hole, fully updated, occasionally locks up during the night, always at exactly the same time.

These entries are always the last to appear in the log:

2023-03-27 00:41:59 AAAA 3.uk.pool.ntp.org localhost OK
2023-03-27 00:41:59 A 3.uk.pool.ntp.org localhost OK

A reboot fixes it. For info, I cannot SSH into the device either when this occurs.

There does seem to be some DNS service, but it's inconsistent across devices. Were the service utterly dead, I'd have discovered this earlier in the morning.

Please upload a debug log and post just the token URL that is generated after the log is uploaded by running the following command from the Pi-hole host terminal:

pihole -d

or do it through the Web interface:

Tools > Generate Debug Log

https://tricorder.pi-hole.net/89Dt7zpO/

Your debug log contains no hints of actual DNS failures.

Your screenshot above indicates a period where no DNS requests have been processed by Pi-hole.

This may be explained by outages of Pi-hole's DNS resolver or by your clients' DNS requests never reaching your Pi-hole.
The latter would be more likely if you didn't have to restart Pi-hole's DNS resolver in order to get DNS resolution operational again.

If this is indeed always the exact same time, that would suggest some time-based events, e.g. a firewall option in your router shutting down access to your Pi-hole.

Another explanation would be your clients bypassing Pi-hole during that time frame.
As your debug log suggests that your network has link-local IPv6 connectivity, this could happen if your router advertised its own link-local IPv6 address as DNS resolver.
However, that seems unlikely, as it would then happen all the time rather than within a restricted time frame, and some IPv4 clients would likely still send their DNS requests to Pi-hole's IPv4 address.
Nevertheless, you should verify your router's IPv6 DNS configuration settings.

My OpenWRT router has IPv6 disabled completely

It's also more than curious that an NTP lookup from localhost is always the last recorded entry.

If you are able to connect a keyboard and HDMI screen to the Pi you will be able to investigate next time it happens even with the connection down. You can at least then check the networking or even see if the whole thing has crashed.

I did overlook that line.
Since your device is completely inaccessible, this would strongly suggest a network/OS level issue, out of Pi-hole's scope.

EDIT: Particularly if you can't ssh into your RPi via IP address.

Local checks:
Yes, I think I will move it next to my main workstation so I can easily connect a KVM and run pihole -d during the issue.

NTP localhost:
This feels like a smoking gun: once the Pi is back up, this is always the last log entry.

DNS service failure:
What is odd is that things do continue during the issue. What alerted me this time was the kids' laptops not working properly; they had just been powered on. My workstation and an Apple Mac still seemed OK.

SSH down:
Yes this does feel like an OS failure

Does it happen at the same time always, or at the same length of time passed since it was booted? With the display connected you'll be able to check the network and system logs to see if something is failing. Have you seen this same behaviour on both wifi and ethernet or is it only so far on one of those?

I think it dies at exactly the same time. Unfortunately, I didn't make notes the previous times.

It's a Pi Zero 1.3 with a LAN dongle so wired only.

When you are unable to ssh, is that by the machine's name, by IP, or either? If DNS is down, then I would fully expect ssh or ping by name to fail. I'm also wondering if the NTP time server lookup precedes a faulty time adjustment, one that maybe shifts your Pi-hole's clock way ahead of or behind your clients' times. I 'think' I recall that this can cause issues.

If it happens 24 hours after booting, rather than at a set time, then the NTP could be a coincidence. Pi starts up, NTP time lookup is made causing ntp pool query. A few seconds later the system is booted and ready. 24 hours later the NTP time is updated and another ntp pool lookup takes place, and a few seconds later the network goes to sleep (or whatever is happening 24 hours later that takes it offline).
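If you do start keeping notes, the distinction above can be checked mechanically. The following is only a rough sketch: the function names, the 5-minute tolerance, and the example boot and freeze dates are made up for illustration (only the 00:41:59 clock time comes from the log in this thread), and it ignores freezes that straddle midnight.

```python
from datetime import datetime

def seconds_into_day(ts):
    """Clock time of a timestamp, as seconds since midnight."""
    return ts.hour * 3600 + ts.minute * 60 + ts.second

def spread(values):
    """Difference between the largest and smallest value."""
    return max(values) - min(values)

def classify(events, tolerance_s=300):
    """events: list of (boot_time, freeze_time) datetime pairs.

    Returns a rough verdict on whether the freezes line up by
    clock time or by time elapsed since boot.
    Note: clock times on either side of midnight are not handled.
    """
    clock_times = [seconds_into_day(freeze) for _, freeze in events]
    uptimes = [(freeze - boot).total_seconds() for boot, freeze in events]
    same_clock = spread(clock_times) <= tolerance_s
    same_uptime = spread(uptimes) <= tolerance_s
    if same_clock and not same_uptime:
        return "fixed time of day"
    if same_uptime and not same_clock:
        return "fixed uptime"
    return "inconclusive"

# Hypothetical notes: boot times and dates are placeholders.
events = [
    (datetime(2023, 3, 20, 9, 0, 0), datetime(2023, 3, 22, 0, 41, 59)),
    (datetime(2023, 3, 25, 18, 30, 0), datetime(2023, 3, 27, 0, 41, 59)),
]
print(classify(events))  # prints "fixed time of day"
```

Two or three noted occurrences should be enough to tell the two cases apart.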

It's not happening 24 hours after booting

SSH / ping is by IP address not DNS

It's certain then that your issue is with your RPi's network connectivity rather than Pi-hole.

You should check the cables and ethernet ports involved, and also consider consulting other forums specialising in RPi/networking.

What make and model is your USB adaptor?

The partial service makes me doubt that the network is at fault.

Would DNS cache on end devices still be alive after 7 - 10 hours? I suspect not.

If it died 100% I'd not have raised this ticket. The adaptor isn't branded.

I've moved the Pi so I can easily add a KVM as required. I suspect it'll not occur again within the 7 day update/close window of this ticket, though.

Plan of attack for next occurrence:
Switch my workstation to other DNS (batch jobs prepped)
Try ping and SSH to Pi with Wireshark running on workstation
tcpdump on Pi to see what packets are arriving through the Pi IP stack
Telnet enabled as alternative to SSH
Take pihole -d during issue (save manually as upload likely to fail)
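
For the ping/SSH step, a small script run from the workstation could record which TCP services on the Pi still accept connections by IP. This is only a sketch using Python's standard library; the function names and the port list (22 for SSH, 23 for Telnet, 53 for DNS over TCP) are my own choices, not anything Pi-hole provides.

```python
import socket

def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False

def check_services(host, ports=(22, 23, 53), timeout=2.0):
    """Map each port to whether it currently accepts TCP connections."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}
```

Calling check_services with the Pi's IP address during an incident returns a dictionary of port numbers and True/False values. If every port reports False while other hosts on the same switch answer, that points at the Pi's network path rather than at any single service.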

Any other ideas? I'll bookmark this post and refer back during next event

Since both ping and SSH by IP address fail, you've already established that it's a network issue.
There's absolutely no involvement of Pi-hole with those. That is to say that Pi-hole wouldn't be able to affect those, even if it had crashed or were otherwise inoperative.

DNS cache entries on end devices surviving 7 - 10 hours is unlikely, but not impossible, as that would depend on the TTLs of the respective domains in use.
Most would be a few minutes to hours, but longer TTLs are not uncommon, e.g. CNAME records for www.youtube.com may have a TTL of 86400 seconds - a full day.
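
To put numbers on that, here is a back-of-the-envelope check. It is a sketch that assumes the client cached the record at its full TTL just before the outage and honours that TTL exactly; real OS and browser caches may cap or ignore TTLs. The function name is made up for illustration.

```python
def may_still_be_cached(ttl_seconds, hours_since_lookup):
    """True if a record cached with this TTL has not yet expired."""
    return hours_since_lookup * 3600 < ttl_seconds

# A one-day TTL (86400 s) outlasts a 10-hour gap.
print(may_still_be_cached(86400, 10))  # prints True
# A typical 5-minute TTL (300 s) does not survive 7 hours.
print(may_still_be_cached(300, 7))     # prints False
```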

Passively observing clients that still work would not tell you whether your Pi-hole is answering DNS requests. They could be using an alternative DNS server, in absence of Pi-hole.
To actively verify that, you could run nslookup pi.hole 192.168.222.2 from a client.
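
If nslookup isn't available on a client, roughly the same check can be scripted. The sketch below uses only Python's standard library to send a single A-record query straight to a given server over UDP port 53; the function names are my own, and it reads only the response header (response code and answer count) rather than decoding the answers.

```python
import random
import socket
import struct

def build_query(name, qtype=1, txid=0x1234):
    """Build a minimal DNS query packet (recursion desired, class IN)."""
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)

def parse_header(data):
    """Return (transaction id, response code, answer count) from a reply."""
    txid, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", data[:12])
    return txid, flags & 0x000F, ancount

def query_dns(server, name, timeout=2.0):
    """Ask `server` for the A record of `name`.

    Returns (rcode, answer_count), or None if no usable reply arrived.
    """
    txid = random.randint(0, 0xFFFF)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name, txid=txid), (server, 53))
            data, _addr = sock.recvfrom(512)
        except OSError:
            # Timeout or network error: the server did not answer.
            return None
    if len(data) < 12:
        return None
    reply_id, rcode, ancount = parse_header(data)
    if reply_id != txid:
        return None
    return rcode, ancount
```

Calling query_dns("192.168.222.2", "pi.hole") should return a tuple with response code 0 while Pi-hole is answering, and None if the query times out.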

Your USB adaptor wouldn't happen to carry a label reading 9700?