lighttpd CPU hits 100% and /admin won't respond

I am having a problem with my Pi Zero 2 W. I recently upgraded from the original Pi Zero. Today I was using /admin in the morning to monitor traffic. In the afternoon I tried to reach /admin and it would not load. I ssh'ed into the device and ran top, which showed that the lighttpd process was using 100% CPU. I was unable to stop the process using systemctl, but restarting the Pi fixed the problem.

Expected Behaviour:

I would expect to be able to install Pi-hole on my Pi Zero 2 W with a static IP, acting as a DHCP server, and have it work fine. I would expect to be able to log in to the /admin page and look at the queries/dashboard. And I'd expect it to serve DNS data to all of the clients on my network.

Actual Behaviour:

This afternoon I tried to access the /admin page and it would not respond. I SSH-ed onto the Pi-hole, ran "top", and saw that the lighttpd process was running at 100% CPU. I attempted to run pihole -d and it got stuck trying to get the dashboard headers. I was also unable to use systemctl to stop the service.

This has happened before. This time DNS requests were still being serviced, but on earlier occasions DNS requests STOPPED being served.

I was using the /admin screen this morning and it worked just fine.

Debug Token:

https://tricorder.pi-hole.net/psoTIMC3/

Your debug log doesn't show any lighttpd issues.
lighttpd logs are clean. Just server started and server stopped messages.


Unrelated to the issue reported, your log shows wlan0 is configured in /etc/pihole/setupVars.conf as Pi-hole interface:

*** [ DIAGNOSING ]: Setup variables
    PIHOLE_INTERFACE=wlan0

but your current system doesn't have an active wifi interface. Only eth0.

Thanks a ton @rdwebdesign! I just changed PIHOLE_INTERFACE to eth0 in setupVars.conf per your recommendation. I must have installed Pi-hole on the Raspberry Pi before disabling the wifi interface.
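
For anyone making the same change, here is a minimal sketch of the edit described above. The function name `set_pihole_interface` is hypothetical (it is not a Pi-hole command), and the sketch only rewrites the `PIHOLE_INTERFACE` line in a setupVars-style file after keeping a backup. Whether Pi-hole then needs a restart or reconfigure to pick up the change is not covered here.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pi-hole): point PIHOLE_INTERFACE in a
# setupVars-style file at a different interface.
# Usage: set_pihole_interface <file> <interface>
set_pihole_interface() {
    file="$1"
    iface="$2"

    if [ ! -f "$file" ] || [ -z "$iface" ]; then
        echo "usage: set_pihole_interface <file> <interface>" >&2
        return 1
    fi

    # Warn (but continue) if the kernel does not list the interface.
    if [ ! -e "/sys/class/net/$iface" ]; then
        echo "warning: $iface not found in /sys/class/net" >&2
    fi

    # Keep a backup, then rewrite only the PIHOLE_INTERFACE line.
    cp "$file" "$file.bak" &&
        sed "s/^PIHOLE_INTERFACE=.*/PIHOLE_INTERFACE=$iface/" "$file.bak" > "$file"
}
```

Trying it on a copy first (for example `cp /etc/pihole/setupVars.conf /tmp/setupVars.conf`, then `set_pihole_interface /tmp/setupVars.conf eth0`) makes it easy to compare the result with the original before touching the real file, which would need sudo.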

@MightyHandy, I am a lighttpd developer and would like to see what was happening on your system for lighttpd to run at 100% CPU. Unfortunately, I do not have access to the tricorder log. If you can reproduce the issue, please strace the lighttpd process and share the result, either here or in the lighttpd forums (Support - Lighttpd - lighty labs).

Thanks.

Example:

$ systemctl cat lighttpd.service
[..]
ExecStart=/usr/sbin/lighttpd -D -f /etc/lighttpd/lighttpd.conf
$ sudo systemctl stop lighttpd.service
$
$ sudo strace /usr/sbin/lighttpd -D -f /etc/lighttpd/lighttpd.conf
execve("/usr/sbin/lighttpd", ["/usr/sbin/lighttpd", "-D", "-f", "/etc/lighttpd/lighttpd.conf"], 0xbeaf778c /* 15 vars */) = 0
brk(NULL)                               = 0xbc1000
uname({sysname="Linux", nodename="ph5b", ...}) = 0
access("/etc/ld.so.preload", R_OK)      = 0
openat(AT_FDCWD, "/etc/ld.so.preload", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
^C
$ sudo systemctl start lighttpd.service
$
$ sudo systemctl is-active lighttpd.service
active

EDIT: Oh, the strace argument below can be useful:

$ man strace
[..]
       -o filename
       --output=filename
                   Write  the  trace  output  to  the  file filename
                   rather than to stderr.  filename.pid form is used
                   if  -ff  option is supplied.  If the argument be‐
                   gins with '|' or '!', the rest of the argument is
                   treated  as  a command and all output is piped to
                   it.  This is convenient for piping the  debugging
                   output  to  a program without affecting the redi‐
                   rections of executed programs.  The latter is not
                   compatible with -ff option currently.

Eg:

sudo strace -o strace.txt /usr/sbin/lighttpd -D -f /etc/lighttpd/lighttpd.conf

Do I just run those commands if the problem happens again? Or do I need to run them in advance?

I reckon you need to keep the strace running until you get to the 100% CPU load point.
But maybe @gstrauss can explain what's expected?

If it happens again, you can run the command while lighttpd is spinning. If strace then shows no output, lighttpd is not making any system calls and is likely stuck in an infinite loop. If you have a debugger installed, you can attach to the lighttpd process and get a backtrace with: echo "bt full" | gdb -p <lighttpd-pid> (replace '<lighttpd-pid>' with the lighttpd pid).

I did not know of any such outstanding issue in lighttpd until I saw this, which is why I am asking questions to see if we can track it down so that I can fix it.
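
The procedure described above (attach strace while lighttpd is spinning, and fall back to a gdb backtrace if the trace stays empty) could be scripted roughly as follows. This is only a sketch: the function name `capture_spin`, the 10-second default window, and the output file names are assumptions made for illustration, and it presumes `strace`, `gdb`, `timeout`, and `sudo` are installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of lighttpd or Pi-hole): collect evidence
# from a process that is spinning at 100% CPU.
# Usage: capture_spin <pid> [seconds] [output-prefix]
capture_spin() {
    pid="$1"
    secs="${2:-10}"
    prefix="${3:-lighttpd-spin}"

    # Bail out early if the PID is missing or not running (Linux /proc check).
    if [ -z "$pid" ] || [ ! -d "/proc/$pid" ]; then
        echo "capture_spin: no such process: '$pid'" >&2
        return 1
    fi

    # Attach strace for a limited time. When timeout stops strace, strace
    # detaches and leaves the traced process running.
    sudo timeout "$secs" strace -o "$prefix.strace.txt" -p "$pid"

    # An empty trace means no system calls were seen during the window,
    # which points at a loop in user space; a backtrace shows where.
    if [ ! -s "$prefix.strace.txt" ]; then
        sudo gdb -p "$pid" -batch -ex "bt full" > "$prefix.gdb.txt" 2>&1
        echo "no syscalls seen; backtrace written to $prefix.gdb.txt"
    else
        echo "syscalls recorded in $prefix.strace.txt"
    fi
}
```

With a single lighttpd process running, a call such as `capture_spin "$(pidof lighttpd)" 10` would produce files that can be attached to a forum post or bug report.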

Aha oc.
So:

$ pidof lighttpd
21558
$ sudo strace -o strace.txt -p 21558
strace: Process 21558 attached
$ tail -F strace.txt
wait4(21561, 0xbed40bd0, WNOHANG, NULL) = 0
wait4(21562, 0xbed40bd0, WNOHANG, NULL) = 0
openat(AT_FDCWD, "/proc/loadavg", O_RDONLY) = 10
read(10, "0.09 0.05 0.07 2/131 31792\n", 64) = 27
close(10)                               = 0
[..]

That should do it, right?

Yes, that should do it. Thanks for providing a more detailed example.
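
Once a trace file such as the strace.txt above exists, a quick per-syscall tally can make it easier to see what the process was doing. The sketch below is an illustration only: `syscall_counts` is a hypothetical name, and it assumes the plain `-o` output format shown above (one call per line, no pid prefix). Reading a very large count for a single call as a sign of a busy loop around that call is a guess, not a diagnosis.

```shell
#!/bin/sh
# Hypothetical helper: count how often each syscall name appears in an
# strace output file, most frequent first.
# Usage: syscall_counts <strace-output-file>
syscall_counts() {
    # Split each line at the first "(" and tally the name before it.
    # Lines that do not start with "name(" (signals, exit notes) are skipped.
    awk -F'[(]' '/^[a-z_0-9]+[(]/ { n[$1]++ }
                 END { for (k in n) print n[k], k }' "$1" | sort -rn
}
```

For example, `syscall_counts strace.txt | head` lists the most frequent calls in the capture.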

lighttpd should not spin. Have you heard of any other similar reports?

Not that I can remember.

Also, MightyHandy's error log was clean. I remember I saw only server started and server stopped messages.