High CPU load when accessing the Web Interface

TheME · March 17, 2024, 10:55pm

I am running the v6 beta on a NanoPi Neo with DietPi.
This is just a bare beta installation, with nothing else.
LAN cable connection. Web access via HTTPS.
Everything is up to date and all repos are on the v6-development branches.

Today, once again (it already happened a day earlier), my first attempt to reach the Web Interface failed with a connection timeout.

The DietPi welcome message showed a CPU temperature above 70°C this time. (OK, I am running the system without a heatsink, so the temperature can rise quite fast. In normal operation, however, the CPU always stays between 35 and 45°C.)

On the Pi-hole diagnostics page, a message states that the 15-minute average load was higher than the number of processors.

But while I was observing the CPU load with htop, the 15-minute average fell below 0.3 within 20 minutes, even with a client connected to the dashboard.
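For reference, the comparison that the diagnostics message describes can be sketched in a few lines. This is only an illustration of the idea, not FTL's actual code; the function names `parse_loadavg` and `exceeds_cpu_count` are made up for this example.

```python
def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages from /proc/loadavg content."""
    one, five, fifteen = (float(value) for value in text.split()[:3])
    return one, five, fifteen


def exceeds_cpu_count(load15, cpu_count):
    """True when the 15 minute load average is above the number of processors."""
    return load15 > cpu_count


# Sample /proc/loadavg content (made-up numbers, not taken from this system)
sample = "4.52 3.10 1.25 2/180 12345"
one, five, fifteen = parse_loadavg(sample)
print(exceeds_cpu_count(fifteen, 4))
```

On the device itself, `os.getloadavg()` and `os.cpu_count()` from the Python standard library would supply the live values.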

What I noticed BTW:

FTL.log reported an SQLite3 error at 21:55:32.526.
The 'ERROR' wording in the taillog should be in red.
The long term overload warning appears multiple times in the log.
But on the Pi-hole diagnostics page, the critical average load value in the message decreased with every occurrence of the warning.
It would make sense to state the highest value in the message instead of the last one - right?

Any idea what else I could check or look for next time?
What is causing the SQLite3 error?

Debug token: https://tricorder.pi-hole.net/sVE5D4T1

rdwebdesign · March 18, 2024, 4:08am

It is hard to guess what could be causing the issue based only on the screenshot, and your debug log shows that FTL.log is empty:

-rw-r----- 1 pihole pihole 0 Mar 17 23:17 /var/log/pihole/FTL.log
   -----head of FTL.log------

   -----tail of FTL.log------

Did you delete it?

Also, are you sure the load is caused by Pi-hole (and not by some cronjob)?

Your screenshot shows load messages at 5-minute intervals:

2024-03-17 21:55:32
2024-03-17 22:00:32
2024-03-17 22:05:32
2024-03-17 22:10:32
...

chrislph · March 18, 2024, 4:57am

You may find the atop package useful for this. It's like a recording version of htop. You can review historical loads after something has happened and see what the cause was.

I mentioned it in another post with some shortcut tips, and a link to a larger guide.

TheME · March 18, 2024, 4:06pm

No, I didn't delete any log file myself.
But that seems to be normal due to DietPi's RAMlog configuration:

root@DietPi:/var/log/pihole# ls -l
total 0
-rw-r----- 1 pihole pihole 0 Mar 17 23:17 FTL.log
-rw-r----- 1 pihole pihole 0 Mar 18 10:17 pihole.log
-rw-r----- 1 pihole pihole 0 Mar 18 00:17 pihole.log.1
-rw-r----- 1 root   pihole 0 Mar 18 00:17 pihole_debug.log
-rw-r----- 1 pihole pihole 0 Mar 17 22:17 pihole_updateGravity.log
-rw-r----- 1 pihole pihole 0 Mar 17 22:17 webserver.log

Unfortunately, it looks like the logs were cleared on the hour, right before I created the debug log.
The logs are still visible via the web interface, even now.

I am not sure yet. I have not created or edited any cron jobs.
I assume that FTL checks the CPU loads every 5 minutes and therefore the warnings were displayed at this interval.

Thanks. I have now installed atop and am waiting for another occurrence.

rdwebdesign · March 18, 2024, 6:40pm

I think you are correct:

github.com/pi-hole/FTL, src/gc.c at 0d2fcd09e:

          // Resource checking interval
          // default: 300 seconds
          #define RCinterval 300
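
As a quick cross-check, the warning timestamps quoted earlier in the thread can be compared against that interval. This is a small sketch; `RC_INTERVAL` simply mirrors the value of the define above, and `gaps_in_seconds` is a helper name invented for this example.

```python
from datetime import datetime

RC_INTERVAL = 300  # seconds, mirrors the RCinterval value quoted above

# Warning timestamps as listed from the screenshot
WARNING_TIMES = [
    "2024-03-17 21:55:32",
    "2024-03-17 22:00:32",
    "2024-03-17 22:05:32",
    "2024-03-17 22:10:32",
]


def gaps_in_seconds(stamps):
    """Return the spacing in seconds between consecutive timestamps."""
    times = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in stamps]
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(times, times[1:])]


gaps = gaps_in_seconds(WARNING_TIMES)
print(all(gap == RC_INTERVAL for gap in gaps))
```

Every gap matching the interval is consistent with the warnings coming from the periodic resource check rather than from some unrelated 5-minute job.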

I don't know what could be causing such a high load.

Maybe @DL6ER will be able to understand what is happening.

DL6ER · March 18, 2024, 7:00pm

Load is a multi-dimensional metric. Your disk is slow and your system has to wait for data often? High load. Your network is slow and you are waiting? High load. Your RAM is slow or there are many context changes and cache invalidations? You name it!

It is a good metric for knowing that something is making your system less responsive than it could be. Finding out exactly what caused it is more work.

Bad analogy time: Compare it to lending your car to someone else and getting it back the next day without any fuel left. Did they go really fast on the Autobahn and use lots of fuel quickly, or did they drive a long way, or was the engine running in standby without any movement for 24 hours? You will never know from the mere fact of "empty fuel"/"high load".
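
One way to start separating these cases on Linux is to look at how CPU time was split between two samples of the first line of /proc/stat, in particular the iowait column. The sketch below assumes the standard field order of that line (user, nice, system, idle, iowait, irq, softirq, steal); the function names and the sample numbers are invented for illustration.

```python
FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]


def cpu_fields(stat_text):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of tick counters."""
    for line in stat_text.splitlines():
        if line.startswith("cpu "):
            values = [int(v) for v in line.split()[1:]]
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate cpu line found")


def iowait_share(before, after):
    """Fraction of the ticks between two samples that were spent waiting for I/O."""
    deltas = {name: after[name] - before[name] for name in before}
    total = sum(deltas.values())
    return deltas["iowait"] / total if total else 0.0


# Two made-up samples, as if /proc/stat had been read a few seconds apart
first = cpu_fields("cpu  100 0 50 800 50 0 0 0 0 0")
second = cpu_fields("cpu  150 0 70 900 380 0 0 0 0 0")
print(round(iowait_share(first, second), 2))
```

A large iowait share alongside a high load average would point towards storage rather than computation, although it still does not say which process is responsible.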

TheME · March 25, 2024, 8:49pm

Checking the atop logs, I see repeated high disk busy values while
FTL.log fills up with these messages:

ERROR add_message(type=6, message=excessive load) - SQL error step DELETE: database is locked
ERROR Error while trying to close database: database is locked
ERROR log_resource_shortage(): Failed to add message to database
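
For anyone unfamiliar with this SQLite error, the general mechanism can be reproduced with two connections to the same database file: while one holds a write lock, the other cannot write and gives up once its busy timeout expires. The sketch below is a generic illustration and not FTL's code; the `message` table and the function name are hypothetical, and the link to a slow SD card (writes holding the lock for longer) is only an assumption.

```python
import os
import sqlite3
import tempfile


def provoke_locked_error():
    """Return the error text a second writer gets while the first holds the lock."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")

    # isolation_level=None puts the connection in autocommit mode,
    # so transactions are controlled explicitly with BEGIN / ROLLBACK.
    writer = sqlite3.connect(path, isolation_level=None)
    writer.execute("CREATE TABLE message (id INTEGER PRIMARY KEY, text TEXT)")
    writer.execute("BEGIN EXCLUSIVE")  # take and keep the lock, like a slow write
    writer.execute("INSERT INTO message (text) VALUES ('excessive load')")

    # timeout=0 means: do not wait for the lock at all
    other = sqlite3.connect(path, timeout=0, isolation_level=None)
    try:
        other.execute("DELETE FROM message")
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        writer.execute("ROLLBACK")
        writer.close()
        other.close()


print(provoke_locked_error())
```

A longer busy timeout only helps if the lock holder finishes in time, so very slow storage can still surface this error.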

Looks to me like the SD card is about to fail. I will try a new SD card...

DanSchaper · March 25, 2024, 9:47pm

iotop, or something else that measures the IO wait state, could also help you see which processes are bingeing on or starving for IO.
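
If installing another tool is not an option, the per-process counters in /proc/<pid>/io give a rough view of the same thing. The sketch below only parses the text of such a file; the helper names and the sample numbers are invented, and reading this file for processes owned by other users typically requires root.

```python
def parse_proc_io(text):
    """Parse the 'name: value' lines of /proc/<pid>/io into a dict of integers."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            counters[name.strip()] = int(value)
    return counters


def disk_bytes(counters):
    """Bytes the process actually caused to be read from or written to storage."""
    return counters["read_bytes"] + counters["write_bytes"]


# Made-up sample in the layout of /proc/<pid>/io
sample_io = """rchar: 323934931
wchar: 323929600
syscr: 632687
syscw: 632675
read_bytes: 4096
write_bytes: 323932160
cancelled_write_bytes: 0
"""

print(disk_bytes(parse_proc_io(sample_io)))
```

Sampling this twice for each process and comparing the differences shows which process was writing during a load spike.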

rdwebdesign · March 26, 2024, 1:16am

Thanks for noticing.

This one is just a format issue and it was fixed by
https://github.com/pi-hole/web/pull/2984