Long-term load on Pi-Hole Instance

Expected Behaviour:

The instance should not have such a high CPU load.

Actual Behaviour:

Lately I have been receiving multiple warnings about this on the Pi-hole Diagnosis page.

Debug Token:

https://tricorder.pi-hole.net/vNg0r9Rd/

I'm not exactly sure what the issue could be. I also checked htop and CPU usage looks fine, yet the load spikes here are quite big.
Thank you!

The warning is not saying Pi-hole is causing the issue. Actually, there is no information related to which process caused the issue. It is just saying your machine had a load around 4.92, at 13:22:54.
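
For context, a load value only means something relative to the number of CPU cores. A minimal sketch of that comparison, assuming a Linux system that provides /proc/loadavg and nproc (load_per_core is just an illustrative helper name, not a Pi-hole command):

```shell
#!/bin/sh
# load_per_core LOAD CORES: print LOAD divided by CORES with two decimals.
# (Hypothetical helper for illustration only.)
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Only query the live system where the Linux interfaces are available.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    cores=$(nproc)
    # /proc/loadavg starts with the 1, 5 and 15 minute load averages.
    read -r one five fifteen _ < /proc/loadavg
    echo "cores: $cores, load (1/5/15 min): $one $five $fifteen"
    echo "15 min load per core: $(load_per_core "$fifteen" "$cores")"
fi
```

A value that stays well above 1.00 means more tasks are waiting than the cores can serve. For example, a load of 4.92 on a four-core board works out to 1.23 per core.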

I also noticed you are using DietPi.
I think DietPi deletes your logs hourly, so it will be harder to identify what was causing the issue at 13:22.

Do you know what was running at that time?
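
If nothing obvious was running, note that on Linux the load average counts not only runnable tasks (state R) but also tasks in uninterruptible sleep (state D, often waiting on I/O). That is one way the load can be high while htop shows little CPU usage. A minimal sketch for listing such tasks, assuming the procps version of ps; filter_rd is an illustrative helper name:

```shell
# filter_rd: from "STATE PID COMMAND" lines on stdin, keep only tasks whose
# state begins with R (runnable) or D (uninterruptible sleep).
filter_rd() {
    awk '$1 ~ /^[RD]/'
}

# Take one snapshot of the current tasks; repeat it while the load is high.
if command -v ps >/dev/null 2>&1; then
    ps -eo stat,pid,comm 2>/dev/null | filter_rd
fi
```

Tasks that show up in state D again and again during a load spike are the first candidates to look at.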

Hey there. So, I had installed atop a couple of days ago to see if I could get a log on what was going on.

But I can't get it to read the log back.

root@orangedietpi:~# atop -r /var/log/atop/atop_20240424
file /var/log/atop/atop_20240424 does not contain raw atop/atopsar output (wrong magic number)
root@orangedietpi:~# cd /var/log/atop/
root@orangedietpi:/var/log/atop# atop -r y
can not read raw file header
root@orangedietpi:/var/log/atop# atop -r atop_20240423
can not read raw file header
root@orangedietpi:/var/log/atop# ls -la
total 20
drwxr-xr-x 2 root root 100 Apr 24 00:00 .
drwxr-xr-x 11 root root 440 Apr 24 17:27 ..
-rw-r--r-- 1 root root 0 Apr 23 00:17 atop_20240422
-rw-r--r-- 1 root root 0 Apr 24 00:17 atop_20240423
-rw-r--r-- 1 root root 1478992 Apr 24 20:20 atop_20240424

Not sure if there's another way I could tell what's causing this. I've tried by taking a look on htop but with no results.

I also saw there's another case pretty similar to my issue: Pihole appears to crash until physical reboot - #9 by chrislph

It could also be the microSD card. I could check that as well.

Thanks!

I'm not sure how and where atop stores the logs, but I think DietPi will remove every file from /var/log once an hour.

This load issue seems to be related to something else on the OS.
Maybe @MichaIng has a better suggestion on how to find DietPi logs pointing to the process causing the excessive load.

Ahhh, so that's why atop can't read those logs. If everything under /var/log gets deleted, there wasn't going to be any way for me to read them. I'll see if I can disable the deletion of logs...

For atop, you could turn off the RAM log:

dietpi-software uninstall 103
reboot

In the dietpi-software menu, there is also a logging option which stores the content of files to persistent disk, before clearing the /var/log tmpfs.

Perhaps the system or kernel logs already give a hint?

journalctl
dmesg -T

Generally, on most modern Linux distributions, journalctl shows all system logs for a very long time, stored in its own longer-lived tmpfs at /run/log/journal. /var/log contains only plain-text duplicates of the journal (when rsyslog or something similar is installed) or logs from particular other installed services. Nowadays, however, most server software packages ship with a native systemd service and log to the journal instead.

Hey MichaIng, I'll turn off the RAM log with that command, thank you.

In the meantime, I'm attaching the logs I received with those 2 other commands.

Also, here's the CPU load from the last 24 hours:

I assume this big CPU load would have to do with Internet activity. This is what I'm running at the moment (by the way, this started almost at the same time I went from Bullseye to Bookworm on my Orange Pi Zero):

Thank you.
logs.zip (91.8 KB)

I couldn't figure out the cause of this, so I'm changing the microSD card to see if that helps.

Your logs show CPU stalls every ~3 minutes (sometimes longer), suspiciously often at the 21st second of the minute. At first I suspected a cron job, but the journal does not show any cron job execution at those times; the jobs that do run start and finish quite clearly between the CPU stalls.
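
To check this kind of pattern on your own system, the stall messages can be grouped by the second of the minute at which they were logged. A rough sketch, assuming dmesg -T style timestamps such as [Wed Apr 24 13:22:21 2024] and that the relevant kernel messages contain the word "stall"; stall_seconds is an illustrative helper name:

```shell
# stall_seconds: read `dmesg -T` output on stdin and print "SECOND COUNT"
# for every line mentioning a stall, most frequent second first.
stall_seconds() {
    grep -i 'stall' |
        awk '{ split($4, t, ":"); count[t[3]]++ }
             END { for (s in count) print s, count[s] }' |
        sort -k2,2nr -k1,1
}

# Reading the kernel ring buffer may require root.
if command -v dmesg >/dev/null 2>&1; then
    dmesg -T 2>/dev/null | stall_seconds
fi
```

If one second value clearly dominates the output, something periodic is a likely trigger.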

I remember a similar report on our forum, other SBC but same SoC/kernel: NanoPi NEO running unstable since latest update - Troubleshooting - DietPi Community Forum

What is strange is that this kernel is from February, while the CPU stalls in this case seem to have started much later.

This might be related: High temperatures after CSC 6.6 Kernel upgrade - Orange Pi Zero - Armbian Community Forums

Thanks again for your answer, MichaIng.

Before I change the microSD, I'll flash the Armbian firmware and see if I can downgrade the kernel version to 6.1 (using the sudo armbian-config > System settings > Other option), since, from what I'm reading in those topics, the kernel might actually be the cause of all this.

Thanks again!

Yes, downgrading seems to solve it. However, it would be better to fix it going forward. I am running a new kernel build based on the latest Linux 6.6, so we can see whether this also solves it. The next step would be to test the latest stable Linux 6.8.

When I find time, I'll check for sunxi-related commits in the Armbian build system since the switch to Linux 6.6. Perhaps we'll find something, in case it is not an upstream issue.

Sounds good, and happy to do some testing when the newest comes out to see if it's finally solved.

Here is a new kernel package with Linux 6.6.29:

cd /tmp
wget https://dietpi.com/downloads/binaries/testing/linux-{image,dtb}-current-sunxi.deb
dpkg -i linux-{image,dtb}-current-sunxi.deb
rm linux-{image,dtb}-current-sunxi.deb
reboot
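
After the reboot, uname -r should report the new version. A small sketch of that check, assuming GNU sort with -V is available (as on Debian-based systems); version_ge is an illustrative helper name:

```shell
# version_ge A B: succeed if version A is greater than or equal to version B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

running=$(uname -r)      # e.g. 6.6.29-current-sunxi
base=${running%%-*}      # strip the suffix, leaving e.g. 6.6.29
if version_ge "$base" "6.6.29"; then
    echo "running $running: the test kernel (or newer) is active"
else
    echo "running $running: still on an older kernel"
fi
```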

By the way, I just installed Pi-hole on my NanoPi M1 (same SoC/kernel), and so far (with Linux 6.6.16) it works well and runs stable. I will let it run for a while and monitor CPU usage and kernel logs.

However, the Orange Pi Zero (1) and NanoPi NEO (1) probably have more in common with each other than with the NanoPi M1, at least in form factor and hardware features at first sight (aside from onboard WiFi).

Thanks, so that's for installing the kernel? So I shouldn't flash the Armbian image, and should install the .deb instead?
In my case, would that be wget https://dietpi.com/downloads/binaries/testing/linux-dtb-current-sunxi.deb ?

Yes, if you are still running the DietPi image, just run these commands to update the kernel.

The Armbian image should suffer from the same issue, unless the CPU governor is relevant. So it could be tested as well to verify that the issue really is the same, and the kernel update could then be tested in the same way.

(Reading database ... 24083 files and directories currently installed.)
Preparing to unpack linux-dtb-current-sunxi.deb ...
Armbian 'linux-dtb-current-sunxi' for '6.6.29-current-sunxi': 'preinst' starting.
Armbian 'linux-dtb-current-sunxi' for '6.6.29-current-sunxi': 'preinst' finishing.
Unpacking linux-dtb-current-sunxi (24.5.0-trunk) over (24.2.1) ...
Setting up linux-dtb-current-sunxi (24.5.0-trunk) ...
Armbian 'linux-dtb-current-sunxi' for '6.6.29-current-sunxi': 'postinst' starting.
Armbian: DTB: symlinking /boot/dtb to /boot/dtb-6.6.29-current-sunxi...
'dtb' -> 'dtb-6.6.29-current-sunxi'
Armbian 'linux-dtb-current-sunxi' for '6.6.29-current-sunxi': 'postinst' finishing.

Just rebooted, I'll see how it goes!

So, the CPU load seems to be more normal now... But the idle temperature is rather high, around 60°C. So I guess I'll test the downgrade with armbian-config for now.

I'll keep checking for up-to-date stable kernel versions.

Thank you very much!

However, that could be expected on this small SBC. Keep an eye on dmesg to see whether it still shows these CPU stalls.
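
To keep an eye on the temperature as well, the SoC sensor can usually be read from sysfs, where the kernel reports millidegrees Celsius. A minimal sketch, assuming the CPU sensor is exposed as thermal_zone0 (this can differ between boards and kernels); millic_to_c is an illustrative helper name:

```shell
# millic_to_c VALUE: convert millidegrees Celsius to degrees, one decimal.
millic_to_c() {
    awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 1000 }'
}

zone=/sys/class/thermal/thermal_zone0/temp   # assumed sensor path
if [ -r "$zone" ]; then
    echo "CPU temperature: $(millic_to_c "$(cat "$zone")") °C"
fi
```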

No CPU stalls so far, apparently, so that's good! And the CPU load graph definitely shows an improvement in the 15-minute average from the moment I rebooted into this kernel.

But before the kernel upgrade, this device used to be around 49°C or so, depending on how hot it is. This temperature increase seems to be related to the kernel then, as in the Armbian thread...

So far so good. Let's hope the guy on our forum also has good results with that one.