Pihole periodically stops functioning

Steve_Zemlicka · January 1, 2024, 1:57pm

Environment:

Running Pi-hole on a Debian 11 VM under xcp-ng with 6 GB RAM and 89% of the VM's disk space free. Pi-hole versions:

Pi-hole [v5.17.2]
FTL [v5.23]
Web Interface [v5.21]

Actual Behaviour:

Everything works well for one to several days before breaking. When it breaks, DNS, DHCP, and the web interface all stop responding. Restarting the pihole-FTL service seems to make it functional again.

Debug Token:

https://tricorder.pi-hole.net/uhr7Dhcs/

Unfortunately this was taken after the service restart, since I didn't find the pihole -d command until after I had restarted the service. Waiting until it happens again and running it while the problem exists would probably be more helpful.
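
A minimal sketch of how the failing state could be captured before the next restart, so the evidence is not lost again. It assumes `dig` (from the dnsutils package) is installed and that Pi-hole answers on 127.0.0.1. The function names and the output directory are hypothetical; the log path is the one FTL reports in this thread.

```shell
#!/bin/sh
# Sketch: save pihole-FTL diagnostics while DNS is still unresponsive.
# Assumptions: dig is installed and Pi-hole listens on the address passed in.
# dns_ok, capture_state and check_once are illustrative names, not Pi-hole tools.

dns_ok() {
    # Succeeds if the resolver at $1 sends any reply within about 3 seconds.
    dig +time=3 +tries=1 +short @"$1" pi-hole.net >/dev/null 2>&1
}

capture_state() {
    # Write service status and recent logs into a timestamped folder under $1.
    out="$1/$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$out" || return 1
    systemctl status pihole-FTL > "$out/status.txt" 2>&1
    journalctl -u pihole-FTL -n 200 --no-pager > "$out/journal.txt" 2>&1
    tail -n 200 /var/log/pihole/FTL.log > "$out/FTL.log.txt" 2>&1
    echo "$out"
}

check_once() {
    # $1 = resolver address, $2 = directory for captured diagnostics.
    # Returns 0 when DNS answers, 1 after capturing state when it does not.
    if dns_ok "$1"; then
        return 0
    fi
    capture_state "$2"
    return 1
}
```

Calling something like `check_once 127.0.0.1 /var/tmp/ftl-hang` every minute from cron or a systemd timer would leave a snapshot from the moment DNS first stopped answering. The sketch deliberately does not restart the service, so `pihole -d` can still be run by hand while the problem exists.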

Additional details and logs:

The logs I've checked seem to show little helpful info:
pihole.log
Jan 1 06:12:31 dnsmasq[519]: query[A] rr4---sn-vgqsrnz6.googlevideo.com from 172.23.2.163
Jan 1 06:34:11 dnsmasq[95853]: started, version pi-hole-v2.89-9461807 cachesize 10000

pihole-FTL.log
[2024-01-01 06:10:00.078 519/T887] Notice: Database size is 857.99 MB, deleted 75 rows
[2024-01-01 06:34:11.308 95853M] Using log file /var/log/pihole/FTL.log

journalctl -u pihole-FTL
Dec 28 13:19:25 dns01 pihole-FTL[519]: [2023-12-28 13:19:25.838 519M] Creating mutex
Dec 28 13:19:26 dns01 pihole-FTL[519]: [2023-1
Jan 01 06:34:01 dns01 systemd[1]: Stopping Pi-hole FTL...
Jan 01 06:34:11 dns01 systemd[1]: pihole-FTL.service: State 'stop-sigterm' timed out. Killing.

As you may be able to see, I restarted the pihole-FTL service at approximately 6:34am this morning. Everything (from a user perspective) was functioning from when I woke up at about 4am until shortly (10-20 minutes) before restarting the service.

I suppose I could create a cron job to restart the service but I'd rather figure out what's happening and try to correct it. In my research, I found some others having issues with the pihole-FTL.db. AFAICT the db reports as "ok" when running:
sqlite3 /etc/pihole/pihole-FTL.db "pragma integrity_check;"
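
The same check can be wrapped so that the database size is recorded next to the result, which is handy for comparing runs given the 857.99 MB size in the FTL log. This is only a sketch: `check_ftl_db` is a hypothetical helper name, and it assumes `sqlite3` is installed. On a database this large the integrity check may take a while.

```shell
#!/bin/sh
# Sketch: report size and integrity of the FTL long-term database.
# Assumption: sqlite3 is installed; check_ftl_db is an illustrative name.

check_ftl_db() {
    # $1 = database path, defaulting to the path used in the post above.
    db="${1:-/etc/pihole/pihole-FTL.db}"
    [ -f "$db" ] || { echo "missing: $db"; return 2; }
    # File size in bytes; tr strips padding that some wc builds emit.
    size=$(wc -c < "$db" | tr -d ' ')
    echo "size_bytes=$size"
    # integrity_check prints the single word "ok" for a healthy database.
    result=$(sqlite3 "$db" "PRAGMA integrity_check;" 2>&1)
    echo "integrity=$result"
    [ "$result" = "ok" ]
}
```

Running `check_ftl_db` with no argument checks the default path; the exit status is 0 only when SQLite reports "ok".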

Any thoughts or suggestions on next steps to figure out what's going on here?

Steve_Zemlicka · January 1, 2024, 2:13pm

FWIW, I noticed that the last log from systemd on Dec. 28 is incomplete. I suspect this was from a snapshot reversion that I performed after a failed upgrade to Debian 12. I was attempting to upgrade from Debian 10 to 12. However, since I was having the issues before then (on Debian 10), I don't believe the snapshot or the snapshot reversion is causing the issue. I suppose I could just rebuild my DNS server on a clean Deb12 image if it looks like there's some sort of corruption mess. Obviously I'd rather not do that if it isn't necessary.

jfb · January 1, 2024, 2:21pm

Do this: rebuild on a clean Debian 12 image. In-place upgrades to a newer OS can be problematic.

Export your Pi-hole settings first with the teleporter function.
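
The export can also be done from the shell instead of the web interface. The sketch below assumes the Pi-hole v5 command line, where `pihole -a -t` is understood to write a Teleporter `.tar.gz` into the current directory; check `pihole -a -h` on your install before relying on it. `backup_pihole` and the destination directory are hypothetical.

```shell
#!/bin/sh
# Sketch: save a Teleporter archive before rebuilding the server.
# Assumptions: Pi-hole v5 CLI with `pihole -a -t`; backup_pihole and the
# default destination /root/pihole-backup are illustrative only.

backup_pihole() {
    # $1 = directory that should receive the archive.
    dest="${1:-/root/pihole-backup}"
    mkdir -p "$dest" || return 1
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$dest" && pihole -a -t ) || return 1
    # List what was written so the archive can be copied off the VM.
    ls "$dest"
}
```

Copy the resulting archive off the VM before rebuilding, then import it through the Teleporter page on the new install.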
