Pihole appears to crash until physical reboot

The issue I am facing:
I appear to have a bit of a strange issue. I have used Pi-hole in one form or another for a few years, and I'm currently running it standalone on a Pi Zero W w/ DietPi. I've tried a number of different releases, with the same result.

Pi-hole is installed natively with Unbound, and everything points to that static IP (192.168.0.168) for DNS. Everything works fine for a while, and then I believe the device locks up and stops resolving DNS until a physical reboot. After that it works without problems for another short while before the issue happens again. I do not believe it's overheating; it usually runs at about 50 °C permanently.

Details about my system:

What I have changed since installing Pi-hole:

I tried extending the DHCP lease time to see if that would make a difference, but rather than being a lease issue, it does appear to bring down the whole device.

I've included the debug token below - thanks.

[✓] Your debug token is: https://tricorder.pi-hole.net/Wmp16BeQ/

Your debug log doesn't show any issues.


Are you sure it is crashing, or can you just not reach the system?
Maybe it could be a problem with the router (I saw you are using wifi).

Are there any other services/jobs running on the system? Some other software could be causing the crash/disconnection.

@Nossie one way to check this is to move the Pi so that it can be connected to an HDMI display and a keyboard. Next time it goes offline you can access it directly and see if it has really crashed or if it has just lost its network.

You may also find a tool like atop useful. I mentioned it in another post linked below. Use the command atop -r to see today's log; append a space and one or more letters y to view yesterday's log, the day before's, and so on.
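If attaching a display and keyboard is awkward, a rough alternative is a heartbeat log written by the Pi itself. This is only a sketch: the `heartbeat` function, the log path and the router address in the usage comment are all assumptions, not anything from this setup. If the heartbeat lines stop until the reboot, the whole device froze; if they continue with a "net=down" note, only the network link was lost.

```shell
#!/bin/sh
# Append one heartbeat line to a log file: timestamp, seconds since boot, free-form note.
# "heartbeat" is a hypothetical helper name, not part of Pi-hole or DietPi.
heartbeat() {
    logfile=$1
    note=$2
    # First field of /proc/uptime is seconds since boot (Linux only).
    up=$(cut -d' ' -f1 /proc/uptime 2>/dev/null || echo unknown)
    printf '%s uptime=%ss %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$up" "$note" >> "$logfile"
}

# Possible use from a script run by cron every minute (path and router IP are assumptions):
#   if ping -c 1 -W 2 192.168.0.1 >/dev/null 2>&1; then
#       heartbeat /root/heartbeat.log "net=ok"
#   else
#       heartbeat /root/heartbeat.log "net=down"
#   fi
```

The log needs to live somewhere that survives a reboot, so a RAM-backed log directory would not be suitable.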

I'll try atop for a few days to troubleshoot

I have the Pi Zero 2 W running Unbound and Pi-hole on its own, and that's pretty much it. The most talkative device you see on the network is a Pi 4 8GB running 50 containers. In the past I had a 4GB Pi pushed to its limits, and when resources on that became low, network performance was obviously impacted, so I split out the real-time services and upgraded the 4GB. You can see when the Pi Zero goes down by the sudden drops in traffic.

It's quite a hassle gaining local access to the device; I'll need to dig out all the right cables (micro HDMI, micro USB to USB-A, etc.). Originally I thought the device was going down once a day due to some DHCP release, but it now appears to be happening more often, which COULD be an SD card issue. However, I've never seen resource usage on the device high enough to cause issues; I've been watching it with nmon or rpi-mon and have never seen usage outside the ordinary.

It's not an issue with the router. When the Pi Zero goes down, I can log in to the router, switch it back to handing out its own DNS via DHCP, and suddenly the internet is working again.

When the Pi Zero goes down, the DietPi interface, Pi-hole, rpi-mon etc. all stop working, and SSH won't connect to it (but I can still connect to other local devices).

This indicates the whole device is crashing, not only Pi-hole.

This is a very good guess.

If you have a new SD card, my suggestion is: make a backup of your Pi-hole config using Teleporter, backup Unbound config, install the OS on the new card and restore the backups.

Another option is to install Pi-hole in a Docker container.

Some additional thoughts to consider. :wink:

Your debug log shows you are not using an official Pi-hole release, but rather some development branch:
*** [ DIAGNOSING ]: Core version
[✓] Version: v5.13
[i] Remotes: origin	https://github.com/pi-hole/pi-hole.git (fetch)
             origin	https://github.com/pi-hole/pi-hole.git (push)
[i] Branch: development
[i] Commit: v5.13-50-ga8b6eb9

*** [ DIAGNOSING ]: Web version
[✓] Version: v5.16
[i] Remotes: origin	https://github.com/pi-hole/AdminLTE.git (fetch)
             origin	https://github.com/pi-hole/AdminLTE.git (push)
[i] Branch: devel
[i] Commit: v5.16-38-g18494ed

*** [ DIAGNOSING ]: FTL version
[✓] Version: vDev-9fe7fb5
[i] Branch: development
[i] Commit: 9fe7fb5

What's the motivation for using a development branch?

Did you try to connect to your Zero by name or by IP address?

The gaps in that graph do not necessarily imply that the Zero goes down.
They rather indicate that your Pi-hole running on that Zero hasn't received any DNS requests at certain times.

If clients in your network used another DNS server during those times, that might also explain why you wouldn't be able to access services by pi.hole's name.

That said, a sudden switch to another DNS server would be more likely if your router served as the local DNS server, funneling all your clients' requests and using Pi-hole as one of its upstreams.
Your graph indicates otherwise, as there seem to be multiple clients using Pi-hole directly, and your debug log shows you are correctly distributing your Zero's IPv4 via DHCP only.
Still, as your debug log shows your network to have link-local IPv6 connectivity, that would leave your router advertising its own link-local IPv6 as DNS server as a possible by-pass.
You may want to check your router's DNS configuration for IPv6.

But considering other explanations as well:
How is your Zero 2 connected to your network?
Your debug log suggests that's wlan0?

A severed network link between your Zero and your router could also explain your observation. For a wifi link, your Zero may cut wifi connectivity when entering a power save mode, or your router may temporarily block certain wifi channels, e.g. when detecting a weather radar activating in its vicinity.
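To check the power save theory, the `iw` tool can report the current setting. The sketch below assumes the `iw` package is installed and that the interface really is wlan0, as the debug log suggests; `power_save_state` is just a hypothetical helper that trims the output down to "on" or "off".

```shell
#!/bin/sh
# Reduce the output of "iw dev <iface> get power_save" (a line such as
# "Power save: on") to the bare state. Reads from standard input.
power_save_state() {
    sed -n 's/^Power save: *//p'
}

# On the Pi (interface name assumed to be wlan0):
#   iw dev wlan0 get power_save | power_save_state
# To turn it off for the current boot only (the setting is not persistent):
#   sudo iw dev wlan0 set power_save off
```

If the drop-outs stop with power save off, the setting would then need to be made permanent through whatever network configuration the OS uses.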

Hi, so I reinstalled on a smaller SD card. About 10 minutes in it crashed, and a bit later it crashed again. I'm still looking into this, but can anyone tell me why this isn't working?

root@DietPi:~# cd /var/log/atop
root@DietPi:/var/log/atop# cd /var/log/atop
root@DietPi:/var/log/atop# atop -r atop_2022118
atop_2022118 - stat raw file: No such file or directory
root@DietPi:/var/log/atop# atop -r atop_2022117
atop_2022117 - stat raw file: No such file or directory
root@DietPi:/var/log/atop#

root@DietPi:/var/log/atop# ls -l
total 0
-rw-r--r-- 1 root root 0 Nov 18 00:17 atop_20221117
-rw-r--r-- 1 root root 0 Nov 18 01:20 atop_20221118
-rw-r--r-- 1 root root 0 Nov 17 21:32 dummy_after
-rw-r--r-- 1 root root 0 Nov 17 21:32 dummy_before
root@DietPi:/var/log/atop#

You've missed a '1' from the filenames in your command. Note that from any directory you can just type

atop -r

to view today's log, and you can view yesterday's, the day before's, the day before that, etc. by adding one or more 'y':

atop -r y
atop -r yy
atop -r yyy
...
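For reference, the dated files in the directory listing above follow the pattern atop_YYYYMMDD. A small sketch, assuming GNU date (as on a Debian-based system) and using a hypothetical helper name `atop_log_for`, prints the file name that a given number of days back should map to, so it can be checked against what is actually on disk:

```shell
#!/bin/sh
# Print the expected atop log path for N days ago (default 0 = today).
# Relies on GNU date's -d option; the directory matches the listing in this thread.
atop_log_for() {
    days_back=${1:-0}
    printf '/var/log/atop/atop_%s\n' "$(date -d "$days_back days ago" +%Y%m%d)"
}

# Example: show size and timestamp of yesterday's log, if it exists.
#   ls -l "$(atop_log_for 1)"
```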

oops!

anyway I tried that too :frowning:
root@DietPi:~# atop -r
can not read raw file header
root@DietPi:~# atop -r y
can not read raw file header
root@DietPi:~# atop -r yyy
/var/log/atop/atop_20221115 - stat raw file: No such file or directory
root@DietPi:~# atop -r
can not read raw file header
root@DietPi:~#

so there is no file for yyy but there should be for y

Actually your original ls -l showed that they were zero length, so nothing to read. Either the atop service isn't logging properly or the same overall systemic symptoms are impacting atop too. Check the logging service is running with

systemctl status atop
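A quick way to confirm which logs are empty, sketched with a hypothetical helper `empty_atop_logs` and assuming a `find` that supports -maxdepth (GNU find does):

```shell
#!/bin/sh
# List zero-length atop log files in a directory (default: /var/log/atop).
# Only files named atop_* are considered, so dummy_before/dummy_after are ignored.
empty_atop_logs() {
    dir=${1:-/var/log/atop}
    find "$dir" -maxdepth 1 -type f -name 'atop_*' -size 0
}

# Example:
#   empty_atop_logs
```

Any file it prints has nothing for atop -r to read, which matches the "can not read raw file header" error.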

Other than that all I can suggest is installing Pi OS Lite and doing a fresh Pi-hole installation and then seeing if this is stable over a few days. This will help rule out hardware, PSU, SD card, network problems, and show whether the behaviour is specific to DietPi.
