Alright, I see myself as fairly competent in troubleshooting and fixing most problems by googling and reading docs, forums, etc. But now I just have to give up, as I'm starting to suspect there are problems with Pi-hole combined with the recent Raspbian update that came this February.
Note that the Raspbian system will work 100% flawlessly up until I attempt to get Pi-hole installed.
Symptoms:
Problems began immediately after updating two RPi3B+ devices here with the February Raspbian update a few days back. These devices have been running perfectly for maybe a year or so. They only run Pi-hole, and are basically idle 24/7.
Using VNC/SSH to connect to a headless Raspberry Pi 3B+. The desktop seems fine, with contents being refreshed over VNC, until I click/activate a window. Then key/mouse inputs are suddenly ignored. Window contents are still being updated normally (e.g. htop or Task Manager). I am sometimes able to click another window once more before it freezes completely, only allowing me to move the mouse pointer. Pi-hole stops working, and DNS lookups cannot be done anymore.
If I am connected over SSH at the same time, running sudo reboot does not work. Nothing happens. I have to unplug power to recover.
The freeze happens after running for maybe an hour or two, sometimes even less. It also happens if I am not connected over VNC or SSH.
If Task Manager is running, I can see the CPU usage bar at 100%, even though the process list shows no process with high processor use (all tasks are shown). The window is still being refreshed when this is happening.
Screenshot: https://i.imgur.com/uQ576cW.png (ignore the smartctl output, I was mid-troubleshooting when the system froze, and I could not move Task Manager to the front anymore)
CPU usage 100% reported over SNMP as well. snmpd seems to be unaffected by the system freeze, and my PRTG server can keep on monitoring the system just fine: https://i.imgur.com/tZ1PfX7.png (this is a clean install, only snmpd and Pi-hole. Note the uptime of only 15 minutes until 100% cpu load, and system crash).
RAM usage normal, no increase in number of processes running.
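One possible explanation for a 100% CPU bar with no busy process, which I have not confirmed on this system, is that the time is being spent in I/O wait or interrupt handling rather than in any listed task. A minimal Python sketch for checking this, assuming the standard Linux /proc/stat layout; the function names are my own and purely illustrative:

```python
# Field names for the aggregate "cpu" line in /proc/stat (values are clock ticks).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn the aggregate 'cpu ...' line of /proc/stat into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_breakdown(before, after):
    """Return each field's share (in percent) of the ticks elapsed between two samples."""
    first, second = parse_cpu_line(before), parse_cpu_line(after)
    delta = {name: second[name] - first[name] for name in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    return {name: round(100.0 * ticks / total, 1) for name, ticks in delta.items()}


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (Linux only)."""
    with open(path) as handle:
        return handle.readline()
```

Calling read_cpu_line() twice a few seconds apart and passing both results to cpu_breakdown() gives the split. A large iowait or softirq share would point towards storage or the network driver rather than a user process.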
vcgencmd get_throttled always shows 0x0 on this system, no problems with power or temperatures.
Temperature does not increase at all when this happens, it remains at ~45 degrees C.
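For anyone comparing their own readings: the value from vcgencmd get_throttled is a bitmask, and 0x0 means no flag is set. Below is a small Python sketch that decodes it, assuming the bit meanings given in the Raspberry Pi documentation; the helper name decode_throttled is hypothetical.

```python
# Bit positions in the `vcgencmd get_throttled` value (per the Raspberry Pi docs).
THROTTLE_FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(output):
    """Decode a line such as 'throttled=0x50005' into a list of active flags."""
    value = int(output.strip().split("=", 1)[-1], 16)
    return [label for bit, label in sorted(THROTTLE_FLAGS.items()) if value & (1 << bit)]
```

An empty list corresponds to the 0x0 I am seeing. The bits from 16 upwards are "has occurred since boot" flags, so they would also catch a brief power dip that is already over by the time the command is run.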
I've done maybe 15 clean image reinstalls of Raspbian, with various different approaches, and only installing what I really need to monitor the system, net-snmp, vnc, etc. I've tried with the last Raspbian image of 2019, and also with the newest 2020 image.
I've tried a brand new SD card.
I've tried a brand new Blitzwolf power adapter.
I've also tried moving the SD card over to another Raspberry Pi 3b+.
I was initially running the system from an external USB SSD, and thought this was the cause, but the problems are still there after I stopped using it.
I am unable to install Pi-hole unless I do apt update and apt upgrade. Meaning the system is fully updated.
Using the cURL install never works anymore, as it always complains about not having root (sorry, I don't have the exact text, I didn't copy/save it).
Using the git clone commands works fine, but when the install script comes to the downloading/installing FTL part, it will now ALWAYS fail. It seems like it's setting eth0 DNS to 127.0.0.1 too early (?), and it's unable to resolve internet addresses anymore.
I have to manually go into the network settings, set DNS to e.g. 8.8.8.8, reboot the system, and run the Pi-hole install once more. (The Pi-hole install will set DNS back to 127.0.0.1.)
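If my suspicion is right, the system ends up with only a loopback resolver at a point where nothing is answering on 127.0.0.1 yet. A minimal Python sketch for checking that state, assuming the usual /etc/resolv.conf format; the function names are illustrative and not part of Pi-hole:

```python
import ipaddress


def nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text, in order."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":  # skip blank lines and comments
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def only_loopback(servers):
    """True if at least one server is listed and every one is a loopback address."""
    if not servers:
        return False
    # Drop any IPv6 zone suffix (e.g. '%eth0') before parsing the address.
    return all(ipaddress.ip_address(s.split("%")[0]).is_loopback for s in servers)
```

Running only_loopback(nameservers(open("/etc/resolv.conf").read())) right before the FTL download step would show whether the resolver has already been switched to 127.0.0.1 at that point.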
When this is done installing, there is a new problem with eth0 caused by the install script: eth0 is now going crazy, and it seems to be in a loop spamming the DHCP server on my pfSense box. https://i.imgur.com/8uwbobY.png (Deckard = pfSense, mariette = Pi)
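To put a number on the "spamming", the DHCP server log can be tallied by message type. This is only a rough Python sketch: it assumes ISC dhcpd style log lines, where each line carries a message name such as DHCPDISCOVER, DHCPREQUEST or DHCPINFORM, and count_dhcp_messages is a name I made up.

```python
import re
from collections import Counter

# Matches DHCP message names such as DHCPDISCOVER, DHCPREQUEST, DHCPINFORM, DHCPACK.
MESSAGE = re.compile(r"\b(DHCP[A-Z]+)\b")


def count_dhcp_messages(log_lines, needle):
    """Count DHCP message types on the log lines that contain `needle`.

    `needle` is a plain substring (MAC, IP or hostname), so '192.168.1.5'
    would also match '192.168.1.50'; pick something unambiguous.
    """
    counts = Counter()
    for line in log_lines:
        if needle not in line:
            continue
        match = MESSAGE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it the exported log lines together with the Pi's MAC address or hostname gives a per-type count, which shows whether the flood is made of DISCOVER/REQUEST messages or of INFORM messages.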
I suspect this might be related to the kernel "crash" (or whatever to call it).
As you can see from my post, there are now so many problems that I really don't know where to begin troubleshooting properly. Every time I seem to "fix" something, there will always be another new problem popping up. And the most frustrating aspect of this is that the problems are already here from the very beginning after installing a clean image of Raspbian.
I have a bunch of Pi-holes I administer, and I am super terrified of running any Raspbian updates in the future now.
I've never had any issues setting up Pi-hole on brand new RPi3b+ devices. I must have set up maybe 30 Pi-holes for customers/clients over the last couple of years, and I literally never had any problems. Never. And now I am 100% stuck, no matter what I try.
I am not. The Pi-hole installation script causes this without my intervention. As you can see in my screenie from the eth0 config, there is a static config. But Raspbian is still requesting (read: spamming) DHCP. And now I cannot get eth0 back to normal operation again: even if I set all fields empty (automatic), or re-set them back to static values, eth0 still keeps spamming. Uninstalling Pi-hole does not help either.
I have posted in Raspberry Pi forums, but the post is still awaiting approval.
(This forum has a lot of restrictions to new users. I cannot add more links to the original post, so I'll dump new info here)
Another user is reporting identical installation problems, also in connection with the Feb Raspbian update: "Raspbian: FTL Engine installation failure, Feb 2020". So this is clearly not a one-off unique problem with my setup/hardware.
Are you aware there are two February Raspbian releases?
2020-02-05-raspbian-buster-lite.zip
2020-02-13-raspbian-buster-lite.zip
They normally never do that, so maybe something has been fixed in the 2020-02-13 version?
Edit: currently running v4.3.2 -> beta5 on the Raspbian 2020-02-05 version on a Pi 3B, encountered no problems at all.
I was not aware of the new Raspbian image released only a couple of days ago. But I assume apt update and apt upgrade would have installed anything that is also contained in this image update, right? I will attempt another clean install here, and see how it goes. Will report back when done.
Quoting the release notes doesn't really show anything related, but I agree that this is uncommon and there might be more going on behind the scenes than they're telling:
2020-02-13:
* Raspberry Pi Configuration - screen blanking setting disabled if Xscreensaver is installed
* Bug fix - switch to turn off VNC server in Raspberry Pi Configuration has no effect
* Bug fix - fix %20 characters in file names
* Linux kernel 4.19.97
* Raspberry Pi firmware 9a34efbf2fc6a27231607ce91a7cb6bf3bdbc0c5
- gencmd: Fix measure_clock name for CLOCK_OUTPUT_108
- mmal isp: Remote alignment requirements for RGB24 formats
- Add missing flags for VC_IMAGE_PROP_YUVUV_4K_CHROMA_ALIGN
- platform: Compromise on gpu overclock settings
2020-02-05:
* Version 3.2.6 of Thonny included - significant improvements in speed, particularly when debugging
* Version 1.0.4 of Scratch 3 included - adds new "display stage" and "display sprite" blocks to SenseHAT extension, and loading of files from command line
* Version 32.0.0.314 of Flash player included
* Version 1.0.3 of NodeRED included
* Version 6.6.0 of RealVNC Server and version 6.19.923 of RealVNC Viewer included - adds support for audio
* Version 78.0.3904.108 of Chromium included
* Mesa updated to 19.3.2 for OpenGL ES 3.1 conformance
* Pixel doubling option added in Raspberry Pi Configuration on platforms using FKMS display driver
* Orca screen reader added to Recommended Software
* Code The Classics Python games added to Recommended Software
* File manager - new "places" pane added at top of sidebar to show mounted drives in simplified view; "new folder" icon added to taskbar; expanders in directory browser now correctly show state of subfolders
* Multiple monitor support improved - alignment of icons on second desktop corrected, Appearance Settings opens on correct tab when launched from context menu
* Raspberry Pi Touchscreen correctly aligned with display
* System clock synchronised before installing new packages in startup wizard and Recommended Software
* Mixer dialogs added to taskbar volume plugin; separate Audio Preferences application removed
* Raspberry Pi Configuration - separate tab added for display options; screen blanking control added
* Volume taskbar plugin and raspi-config modified to support separate ALSA devices for internal audio outputs (analogue and HDMI 1 and 2)
* Robustness improvements in volume, ejecter and battery taskbar plugins
* Movement of mouse pointer to menu button on startup now controlled by point_at_menu parameter in Global section of lxpanel configuration file
* Ctrl-Alt-Del and Ctrl-Alt-End shortcuts added to open shutdown options box
* Ctrl-Shift-Esc shortcut added to open task manager
* Enabled NEON routines in OpenSSL
* Linux kernel 4.19.97
* Raspberry Pi firmware 149cd7f0487e08e148efe604f8d4d359541cecf4
Alright, I've done a clean image install of the newest Raspbian image "2020-02-13-raspbian-buster-full.zip" a total of three times now to ensure it's not just a fluke.
The Pi-hole installer script is able to get past "[✓] Downloading and Installing FTL" without issues. It seems the newest 2020-02-13 image has fixed what was previously causing issues.
... but it takes a very long time to get past "Downloading and Installing FTL" (almost a minute). Unsure if it used to do that previously (before these problems started).
I'm gonna leave my Pi-hole running, and have some of my LAN clients use it as DNS, and see if I get any more kernel crashes.
I'll post an update in 24h, or sooner if I'm back to the old problems.
No VPN. I'm 100% positive it was because of the install script setting 127.0.0.1 as DNS "too early", and causing the script to fail (seemed to be because of some kind of issue in Raspbian 2020-02-05, I don't know where/how to look for clues, tbh).
The snippet you copied in is not what we output. We don't do anything with DHCP so I'm not sure how we can cause DHCP lease requests to happen. The copy of the configuration provided was not produced by the Pi-hole installer, we don't output any inform lines, nor noipv6. There may have been something with the Feb 05 image that caused those lines, I haven't looked in to see:
Most of our users tend to use the Buster lite image and not use any kind of GUI.
I'm becoming more inclined to not have the installer touch IP addressing on the client at all, as it's not really foolproof for us to do that. Not automatically configuring that would remove the potential for both Pi-hole and the OS trying to override each other for network configuration.
The setting of 127.0.0.1 for the local resolver was removed in the beta for version 5, so that already is done.
Yes, you are correct, that config was not created by Pi-hole itself, but rather it seems like the OS did something... as a result of Pi-hole changing the network adapter settings.
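To see which tool left what behind, it may help to look at /etc/dhcpcd.conf per interface. As far as I understand the dhcpcd.conf man page, an inform line makes dhcpcd send DHCP INFORM messages even though the address itself is fixed, so a block containing it can still generate DHCP traffic. Below is a minimal Python sketch that groups the file by interface and labels each block; the function names and the three labels are my own simplification.

```python
def interface_blocks(dhcpcd_conf_text):
    """Group dhcpcd.conf option lines by the 'interface <name>' block they follow.

    Lines before the first 'interface' line are stored under the key None.
    Treating everything after '#' as a comment is a simplification.
    """
    blocks = {None: []}
    current = None
    for raw in dhcpcd_conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if fields[0] == "interface" and len(fields) > 1:
            current = fields[1]
            blocks.setdefault(current, [])
        else:
            blocks[current].append(line)
    return blocks


def addressing_mode(options):
    """Label a block 'inform', 'static' or 'dhcp' based on its option lines."""
    # 'inform' is checked first because it still produces DHCP traffic.
    if any(o.split()[0] == "inform" for o in options):
        return "inform"
    if any(o.startswith("static ip_address=") for o in options):
        return "static"
    return "dhcp"
```

Applying addressing_mode() to the eth0 block before and after running the installer, and again after touching the network settings in the desktop GUI, should show at which step an inform line appears.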
Also, I never saw this misbehaviour on Raspbian Stretch (I also tested on a VM to confirm this just now). I have made a post in the Raspbian Forums here, as this seems like a potential bug/flaw to me.
My two homelab RPi3b+ Pi-holes have been running on the 2020-02-13 image for a whole day now, without any snags or other issues. I am now entirely sure it was the "2020-02-05-raspbian-buster-lite.zip" image that was causing problems. Almost a whole week wasted on troubleshooting this. Well, that's life, I guess...
Thanks for looking into it anyway. It's nice to see the devs spending time in the user forum for their product.