I am having a problem with my Pi Zero 2 W. I recently upgraded from the original Pi Zero. Today I was using /admin in the morning to monitor traffic. In the afternoon I tried to reach /admin and it would not load. I SSH'ed into the device and ran top, which showed that the lighttpd process was using 100% CPU. I was unable to stop the process using systemctl, but restarting the Pi fixed the problem.
Expected Behaviour:
I would expect to be able to install Pi-hole on my Pi Zero 2 W with a static IP, acting as a DHCP server, and have it work fine. I would expect to be able to log in to the /admin page and look at queries/dashboard. And I'd expect it to serve DNS to all of the clients on my network.
Actual Behaviour:
This afternoon I tried to access the /admin page and it would not respond. I SSH'ed into the Pi-hole, ran top, and saw that the lighttpd process was running at 100% CPU. I attempted to run pihole -d and it got stuck trying to get dashboard headers. I also was unable to use systemctl to stop the service.
This has happened before. This time DNS requests were still being serviced, but on earlier occasions DNS requests STOPPED being served.
I was using the /admin screen this morning and it worked just fine.
Debug Token:
https://tricorder.pi-hole.net/psoTIMC3/
Your debug log doesn't show any lighttpd issues.
lighttpd logs are clean. Just "server started" and "server stopped" messages.
Unrelated to the issue reported, your log shows wlan0 is configured in /etc/pihole/setupVars.conf as the Pi-hole interface:
*** [ DIAGNOSING ]: Setup variables
PIHOLE_INTERFACE=wlan0
but your current system doesn't have an active wifi interface, only eth0.
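A mismatch like this can be spotted by comparing the configured interface with what the kernel reports. The following is only a rough sketch: the function names are made up for illustration, and it assumes the PIHOLE_INTERFACE=... line format shown in the debug log and the usual Linux /sys/class/net layout.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Pi-hole.

# configured_iface FILE: print the value of PIHOLE_INTERFACE from FILE
configured_iface() {
    sed -n 's/^PIHOLE_INTERFACE=//p' "$1" | head -n 1
}

# iface_state NAME: print the kernel's operstate for NAME ("up", "down", ...)
# or "missing" when no such interface exists
iface_state() {
    if [ -r "/sys/class/net/$1/operstate" ]; then
        cat "/sys/class/net/$1/operstate"
    else
        echo "missing"
    fi
}

# Usage:
#   iface=$(configured_iface /etc/pihole/setupVars.conf)
#   echo "$iface: $(iface_state "$iface")"
```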
Thanks a ton @rdwebdesign! I just changed PIHOLE_INTERFACE to eth0 in setupVars.conf per your recommendation. I must have installed Pi-hole on the Raspberry Pi before disabling the wifi interface.
@MightyHandy, I am a lighttpd developer and would like to see what was happening on your system for lighttpd to run at 100% CPU. I unfortunately do not have access to the tricorder log. If you can reproduce the issue, please strace the lighttpd process and share the result, either here or in the lighttpd forums: Support - Lighttpd - lighty labs
Thanks.
Example:
$ systemctl cat lighttpd.service
[..]
ExecStart=/usr/sbin/lighttpd -D -f /etc/lighttpd/lighttpd.conf
$ sudo systemctl stop lighttpd.service
$
$ sudo strace /usr/sbin/lighttpd -D -f /etc/lighttpd/lighttpd.conf
execve("/usr/sbin/lighttpd", ["/usr/sbin/lighttpd", "-D", "-f", "/etc/lighttpd/lighttpd.conf"], 0xbeaf778c /* 15 vars */) = 0
brk(NULL) = 0xbc1000
uname({sysname="Linux", nodename="ph5b", ...}) = 0
access("/etc/ld.so.preload", R_OK) = 0
openat(AT_FDCWD, "/etc/ld.so.preload", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
^C
$ sudo systemctl start lighttpd.service
$
$ sudo systemctl is-active lighttpd.service
active
EDIT: Oh, the strace argument below can be useful:
$ man strace
[..]
-o filename
--output=filename
Write the trace output to the file filename
rather than to stderr. filename.pid form is used
if -ff option is supplied. If the argument be‐
gins with '|' or '!', the rest of the argument is
treated as a command and all output is piped to
it. This is convenient for piping the debugging
output to a program without affecting the redi‐
rections of executed programs. The latter is not
compatible with -ff option currently.
Eg:
sudo strace -o strace.txt /usr/sbin/lighttpd -D -f /etc/lighttpd/lighttpd.conf
Do I just run those commands if the problem happens again? Or do I need to run them in advance?
I reckon you need to keep the strace running until you get to the 100% CPU load point.
But maybe @gstrauss can explain what's expected?
If it happens again, you can run the command while lighttpd is spinning. If there is no output, then lighttpd would be stuck in an infinite loop. If you have a debugger installed, you can attach to the lighttpd process with echo "bt full" | gdb -p <lighttpd-pid> (replace '<lighttpd-pid>' with the lighttpd pid).
I did not know of any such outstanding issue in lighttpd until I saw this, which is why I am asking questions to see if we can track it down so that I can fix it.
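Since the spin may show up hours after startup, it can help to script the capture. What follows is a minimal sketch, not a tested tool: the function names, the 90% threshold, and the 5 and 10 second windows are assumptions chosen for illustration. It measures CPU use from /proc/<pid>/stat over a short window, then saves a strace and a gdb backtrace. Unlike the commands in this thread, it passes -f to strace so that child processes and threads are followed too.

```shell
#!/bin/sh
# Hypothetical helpers; thresholds and durations are assumed values.

# ticks_from_stat LINE: sum utime+stime (fields 14 and 15) of a /proc/<pid>/stat line
ticks_from_stat() {
    rest=${1##*") "}   # strip "pid (comm) " so the first word is field 3 (state)
    set -- $rest
    echo $(( ${12} + ${13} ))
}

# cpu_percent BEFORE AFTER SECONDS HZ: integer CPU usage of one core over the window
cpu_percent() {
    echo $(( ($2 - $1) * 100 / ($3 * $4) ))
}

# sample_cpu PID SECONDS: measure the CPU usage of PID over SECONDS
sample_cpu() {
    hz=$(getconf CLK_TCK)
    before=$(ticks_from_stat "$(cat "/proc/$1/stat")")
    sleep "$2"
    after=$(ticks_from_stat "$(cat "/proc/$1/stat")")
    cpu_percent "$before" "$after" "$2" "$hz"
}

# capture_spin PID: if PID is busy, save a 10 s strace and a full gdb backtrace
capture_spin() {
    [ "$(sample_cpu "$1" 5)" -ge 90 ] || return 1
    sudo timeout 10 strace -f -o "strace.$1.txt" -p "$1"
    sudo gdb -p "$1" -batch -ex "bt full" > "gdb.$1.txt" 2>&1
}

# Usage: capture_spin "$(pidof -s lighttpd)"
```

Nothing runs until capture_spin is called, so the functions can be sourced into a shell and invoked by hand or from a cron job once the problem reappears.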
Aha oc.
So:
$ pidof lighttpd
21558
$ sudo strace -o strace.txt -p 21558
strace: Process 21558 attached
$ tail -F strace.txt
wait4(21561, 0xbed40bd0, WNOHANG, NULL) = 0
wait4(21562, 0xbed40bd0, WNOHANG, NULL) = 0
openat(AT_FDCWD, "/proc/loadavg", O_RDONLY) = 10
read(10, "0.09 0.05 0.07 2/131 31792\n", 64) = 27
close(10) = 0
[..]
That should do it, right?
Yes, that should do it. Thanks for providing a more detailed example.
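Once a trace file exists, a quick tally of syscall names can show whether lighttpd is repeating the same calls in a tight loop. This is a small sketch with a made-up function name; it assumes the strace -o line format shown above, optionally prefixed with a pid as strace does when following multiple processes.

```shell
#!/bin/sh
# Hypothetical helper for summarizing a strace log.

# syscall_counts FILE: count how often each syscall name appears, most frequent first
syscall_counts() {
    awk '{
        sub(/^[0-9]+ +/, "")                       # drop an optional leading pid
        if (match($0, /^[a-z_][a-z0-9_]*\(/))      # line starts with "name("
            print substr($0, 1, RLENGTH - 1)
    }' "$1" | sort | uniq -c | sort -rn
}

# Usage: syscall_counts strace.txt | head
```

An empty trace while top still shows 100% CPU would instead point to a loop that makes no system calls, which is where the gdb backtrace becomes the more useful artifact.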
lighttpd should not spin. Have you heard of any other similar reports?
Not that I can remember.
Also, MightyHandy's error log was clean. I remember I saw only "server started" and "server stopped" messages.