Pi-hole freezes in DHCP mode

We could try to take a closer look at what's happening with your DHCP requests.

Create a custom configuration file for dnsmasq at /etc/dnsmasq.d/42-log-dhcp.conf containing the following line:

log-dhcp

To activate that option, run the following command:

pihole restartdns

Then try to connect a DHCP client.

This lets you see more verbose DHCP-related messages, e.g. to watch them as they happen:

tail -n 10 -F /var/log/pihole.log | grep dhcp

Or to search at a later time:

grep -n "dhcp" /var/log/pihole.log

Hi BuckingHorn,

Thanks again for your message. I've just seen in another thread that you're German as well, but let's continue in English anyway so other people can benefit from your help, too.

So I've followed your steps. How can I check if it is working as intended? If I'm right, the next step would be to disable the router DHCP, wait for the freeze and check pihole.log for the last entries before it froze?

Yes, and please share the grep output here.

Okay so at 14:45 I disabled my router's DHCP and enabled the pihole's.
At 15:48 I realized that the pihole's admin page is not reachable anymore and my VNC didn't connect. I enabled my router's DHCP and waited for the VNC to connect again, afterwards I extracted the following log:
log.txt (89 KB)

Token is: https://tricorder.pi-hole.net/ajciha7cjy

Your DHCP log shows that only four clients successfully requested an IPv4 address:

55417:Nov  7 15:06:41 dnsmasq-dhcp[14377]: 931615095 DHCPACK(wlan0) 192.168.0.73
61372:Nov  7 15:28:01 dnsmasq-dhcp[14377]: 915842021 DHCPACK(wlan0) 192.168.0.107
61586:Nov  7 15:29:07 dnsmasq-dhcp[14377]: 3662234458 DHCPACK(wlan0) 192.168.0.30
64158:Nov  7 15:41:22 dnsmasq-dhcp[14377]: 3249465666 DHCPACK(wlan0) 192.168.0.136 

Does that fit the number of clients on your network?

What client did you use to monitor your Pi-hole's connectivity?
Is it among the clients that requested a lease?
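
If you want to reproduce that list of leased addresses yourself, here is a minimal sketch. It assumes dnsmasq's usual line layout, where the address follows `DHCPACK(<interface>)`; `dhcp_ack_ips` is just a helper name I made up, not a Pi-hole command.

```shell
# Sketch: list the distinct IPv4 addresses that received a DHCPACK.
# Assumes the address directly follows "DHCPACK(<iface>) " in each line.
# dhcp_ack_ips is a hypothetical helper name, not part of Pi-hole.
dhcp_ack_ips() {
    grep 'DHCPACK(' "${1:-/var/log/pihole.log}" |
        sed -n 's/.*DHCPACK([^)]*) \([0-9.]*\).*/\1/p' |
        sort -u
}
```

Calling `dhcp_ack_ips` without an argument reads /var/log/pihole.log; pass another path to inspect a copy of the log.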

Also, your log doesn't show any DHCP activity between 15:47:43 (line 65178) and 15:49:27 (line 65362). Do the lines in between contain any hints to a possible cause?
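
To look at just that window, something along these lines could help. It is only a sketch: `log_window` is a made-up helper name, and the start and end values are the line numbers that `grep -n` printed for the surrounding DHCP entries.

```shell
# Sketch: print an inclusive range of lines from a log, hiding DHCP messages
# so that only the remaining (mostly DNS) entries are shown.
# log_window is a hypothetical helper: log_window <file> <first> <last>
log_window() {
    sed -n "${2},${3}p" "${1}" | grep -v 'dnsmasq-dhcp'
}
# e.g. log_window /var/log/pihole.log 65178 65362
```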

If you want to upload the complete log without making it public here, you can do so by executing:

cat /var/log/pihole.log | pihole tricorder

or substitute with any relevant other file, and then post the token(s) here.
(See also "How do I debug my Pi-hole installation?" for other ways; apologies for not mentioning this earlier.)

No, there should be more. At least my computer and my mobile are missing; both were active during this time, and I used both to access the pihole admin page (could I access it from outside my network?). I've set my PC and my TV to use the raspberry's IP as DNS server; the TV is among the devices, the PC is not.
Also, in pihole's DHCP settings, the list of currently active DHCP leases shows the DLAN adapter, which was active during that time, but its IP is not among those four. Further, I see IPv4 and IPv6 addresses for some of my devices in this list.
The 192.168.0.136's name is unknown, there are two further IPs like this in the list.
I've uploaded the whole log file here: https://tricorder.pi-hole.net/t1mgqycn9c
To me there was nothing noticeable in this timeframe but maybe I missed something?

Those may still hold on to their previous DHCP leases as issued by your router.
The moment you lose connectivity could have been the moment they tried to renew their lease with your router.

You could try to verify this by disconnecting and reconnecting your smartphone to your network by disabling WiFi.

EDIT:
There are no anomalies in the aforementioned timeframe, just plain DNS queries (from IPv6 client IPs ending in :74cc, :cd7e, :680a and :185a).

This confirms that Pi-hole itself has been operational during that time, making a DHCP renewal issue more likely.

Okay so I disabled the WLAN hotspot from the DLAN adapter and connected the raspberry directly to the router's WLAN. Then I repeated the procedure but this time, I reconnected all my known devices manually. None of them caused any issues.
The only devices from the "Currently active DHCP leases" list that I couldn't test are my DLAN and two unknown devices.
What about the unknown hostnames? Are those guests' mobiles? A search in a MAC-address-to-vendor database didn't show any results (it worked for the third unknown device though, my robot vacuum cleaner, but not for my Samsung mobile).
According to the tpPLC software, both DLAN adapters as well as the 2.4 and 5 GHz WLAN access points have their own MAC addresses, but only that of the 2.4 GHz access point shows up in pihole's "Currently active DHCP leases" list. Plugging the DLAN access point into the wall made my PC and TV (connected via LAN to this device) reconnect to the pi, but no IP for the DLAN adapter itself showed up in the log. Does that maybe point to the problem?
I will let the system run as it is and see if connecting devices to the DLAN hotspot causes the trouble. As my router's DHCP lease time is set to one hour, we should have data for that soon.

Okay, so that did not solve the problem; the pihole just froze again. Here is the log since my first attempt (with activated access point) at 14:33. At 15:26 I started the second attempt. At 16:22 I restarted the router's DHCP as I realized the system had crashed (lost VNC and couldn't reach the pihole admin page). On both attempts, all my known devices connected to the pihole as described above.
https://tricorder.pi-hole.net/ced4rky43i

I'm not sure I am following what you tried to do. :thinking:

My above suggestion was simply "to verify this (i.e. whether DHCP lease renewal would cause your issue) by disconnecting and reconnecting your smartphone to your network by disabling WiFi."
That would trigger your smartphone to request a DHCP lease, and if that indeed would introduce your issue, you'd be unable to access Pi-hole from that smartphone afterwards.
If you'd see no DHCP requests from your smartphone showing up in pihole.log then, it would suggest that your smartphone either cannot reach Pi-hole or it's talking to another DHCP server.
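
One way to check this is to filter the log for your smartphone's MAC address. This is a rough sketch that assumes the DHCP log lines carry the client's MAC, as dnsmasq's DHCP messages normally do; `dhcp_lines_for_mac` is a made-up helper name. Keep in mind that some phones present a different MAC address per network, so use the one shown in the phone's WiFi details for that network.

```shell
# Sketch: show DHCP-related log lines that mention one client's MAC address.
# dhcp_lines_for_mac is a hypothetical helper: dhcp_lines_for_mac <mac> [file]
# The MAC is matched as a fixed string, ignoring upper/lower case.
dhcp_lines_for_mac() {
    grep 'dnsmasq-dhcp' "${2:-/var/log/pihole.log}" | grep -i -F -- "$1"
}
```

No output after reconnecting the phone would suggest that its request never reached Pi-hole.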

As your logs have shown Pi-hole to be operational during times when clients fail to access Pi-hole's web UI, do you have indications that indeed a crash is happening? Did you observe corresponding errors in other logs?

Your observation that you can restore access to Pi-hole by simply switching your router's DHCP server back into action would imply that Pi-hole has been fully operational all along. If it weren't, you would've had to do something on your Pi-hole machine as well, like restarting Pi-hole or its webserver.

So I did exactly this with all my devices: I switched them off and back on again; that's what I meant by "manually reconnecting". I couldn't access pihole with the previously connected computer and mobile after the "crash"; beforehand it worked fine.
So far, I am still not sure why the "Currently active DHCP leases" list shows 15 connected devices instantly when I turn on DHCP mode, although the devices connect eventually, one after another (when the router's DHCP lease runs out, I guess? Or of course if I reconnect them to the network).
I will take a closer look at what is happening when I can't access the admin page anymore. So far I just know that this happens and VNC (from my PC) disconnects. This led me to the assumption that something has to be wrong with pihole, but maybe it is just the connection and other devices work fine. Thanks so far, I will get back as soon as I have found out more.

Hmm, this would mean that those devices don't have any issue in requesting a DHCP lease through Pi-hole.
If I understand you correctly though, you've changed your network environment by eliminating your DLAN access point, which in turn wouldn't allow any conclusions with regards to that device.

Yet as the error resurfaced even without your DLAN's involvement, it may well be something else that triggers your issue.

I can't really answer that. :wink:
The two MACs without hostname information that I am aware of resolve to LCFC(HeFei) Electronics Technology (54:E1:AD) and to Beijing Xiaomi Mobile (64:90:C1).

Network equipment does not necessarily require an IP to operate, though it has to have one if it can be administered via its own web interface.

If that's true for your DLAN AP: Is it still accessible from a client that acquired a lease through Pi-hole when Pi-hole is your DHCP server?

I've found out some interesting things and came up with a theory based on my limited knowledge of and experience with networks. Maybe you can tell me if it's plausible so far.

I guess I am facing two issues:
First, I think the router does not keep the raspberry's IP static with DHCP set off. Once I disable its DHCP the whole menu gets greyed out including the list with reserved IPs.
So I think that after my router's preset lease time of an hour, it assigns another IP to the raspberry, which makes it inaccessible from any device in the network (no matter whether it appeared with a DHCPACK message in the log before or not). This would explain why it never works longer than an hour. I am also not able to access the router config page from the raspberry after it disconnected. If I try to reconnect a device it says "IP address could not be retrieved".
The second issue is that the DLAN could in fact have its own DHCP server, which becomes active once the others in the network fail, as proposed in some threads (TL-WPA8630P v2 Acting as DHCP server- how do I turn it off - Home Network Community). This would explain why I don't have to reset the router most of the time: after the pihole's disconnection, the DLAN assigns an IP to my device, from which I can then access its config page. It also keeps up the connection to the pihole, which is then able to give IPs to some other devices, which keep using the DNS server. This would explain why the pihole kept on working for some devices after the issues. This time this did not happen.

Here is what I did: last time I just disabled the DLAN AP wifi. This time I connected all my devices directly to the router and unplugged both DLAN adapters. After the pihole's disconnection, all the devices lose their connection. If I try to reconnect a device, here is what I'm seeing in the log:

Nov 8 21:02:14 dnsmasq-dhcp[3510]: RTR-SOLICIT(wlan0) 86:a7:c5:5b:74:cb
Nov 8 21:02:21 dnsmasq-dhcp[3510]: no address range available for DHCP request via wlan0

After I reset the router and set the SSID back, the raspberry reconnected and continued working (with the devices whose IPs were assigned by it and not the router).

Here's the log, with comments marking when I started the test at 20:05, when the pihole disconnected at 20:53 and when I reset the router at 21:20: https://tricorder.pi-hole.net/3kxtowywj2

What do you think about my theory?

PS: I found out that Samsung mobiles seem to have a different MAC address when connected via 2.4 GHz than via 5 GHz. I connected both mobiles to the 2.4 GHz network of the router and saw both MAC addresses which were "unknown" in the pihole list before, so that mystery is solved. :wink:

Yes, I just checked that. It is.

Another thing regarding the TP Link DLAN:
If I have to reset the router, I always have to disconnect the DLAN; otherwise it will keep the SSID and DHCP settings as before and will still not be accessible.
it has this WiFi Move option which could be responsible for that:
So the adapters do in fact seem to have some influence on the network, which takes time to figure out for someone without IT knowledge...

No, that is definitely not happening.

If your Pi-hole machine would request a new DHCP lease through your router and receive a new IP address, your network would lose DNS resolution instantly, as all your clients would still talk to Pi-hole's previous IP.

After DNS cache entries would expire on your client OS and browser, you wouldn't be able to access any websites anymore, i.e. it would look like a complete Internet failure.


Besides, your log already shows that clients still request and receive DNS resolution during the times when you cannot access Pi-hole's UI.

In addition, Pi-hole's installation may have created a static network interface configuration in /etc/dhcpcd.conf on its host device if you chose to do so.
Your Pi-hole host machine never requests a DHCP lease in that case.
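
Whether such a static definition is actually in effect can be checked by listing only the uncommented lines. A minimal sketch, where `active_static_lines` is a hypothetical helper name and the default path assumes dhcpcd is managing the interface:

```shell
# Sketch: print the active (uncommented) interface/static lines of dhcpcd.conf
# together with their line numbers. No output means that no static
# definition is in effect. active_static_lines is a hypothetical helper name.
active_static_lines() {
    grep -nE '^[[:space:]]*(interface|static)[[:space:]]' "${1:-/etc/dhcpcd.conf}"
}
```

If the call prints nothing, all static settings are commented out and the machine would still request its address via DHCP.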

So far, we have two indications that DHCP is involved in your failure to access Pi-hole's UI:
a) Once you enable Pi-hole's and disable your router's DHCP server, access to Pi-hole's UI starts failing at about the same time your router's DHCP leases expire (one hour)
b) Pi-hole's UI is accessible again after you reenable your router's DHCP server

Those observations could be explained by another DHCP server (yet unknown) that wouldn't distribute Pi-hole as DNS server, since only Pi-hole knows how to resolve pi.hole into 192.168.0.10.

Once access to Pi-hole fails from one of your desktops, what's the output of the following command, run from that desktop:

nslookup pi.hole

Do not run this from a smartphone terminal; that often yields wrong results, as terminal emulation apps often use fixed DNS servers instead of the ones provided by your network.

Also, are you still able to access Pi-hole's UI via IP address then?

That is exactly what happened with the DLAN adapters unplugged. I had to reset the router before I could access the web again. So in the latest log, between the breakdown and the router reset, there is no DNS activity for nearly half an hour:

Log
Nov  8 20:53:14 dnsmasq[3510]: reply userlocation.googleapis.com is 216.58.207.42

# System breakdown

Nov  8 20:59:27 dnsmasq-dhcp[3510]: RTR-SOLICIT(wlan0) 86:a7:c5:5b:74:cb
Nov  8 20:59:29 dnsmasq-dhcp[3510]: no address range available for DHCP request via wlan0
Nov  8 20:59:49 dnsmasq-dhcp[3510]: RTR-SOLICIT(wlan0) 86:a7:c5:5b:74:cb
Nov  8 21:00:00 dnsmasq[3510]: query[PTR] 30.0.168.192.in-addr.arpa from 127.0.0.1
Nov  8 21:00:00 dnsmasq[3510]: DHCP 192.168.0.30 is TVWohnzimmer.lan
Nov  8 21:00:00 dnsmasq[3510]: query[PTR] 3.c.0.2.6.6.5.7.2.a.2.a.3.8.0.2.0.a.5.8.1.8.0.1.8.0.9.0.2.0.a.2.ip6.arpa from 127.0.0.1
Nov  8 21:00:00 dnsmasq[3510]: /etc/pihole/local.list 2a02:908:1081:85a0:2083:a2a2:7566:20c3 is raspberrypi
Nov  8 21:00:00 dnsmasq[3510]: query[PTR] a.0.8.6.0.a.5.7.0.3.a.4.9.1.9.a.0.a.5.8.1.8.0.1.8.0.9.0.2.0.a.2.ip6.arpa from 127.0.0.1
Nov  8 21:00:00 dnsmasq[3510]: cached 2a02:908:1081:85a0:a919:4a30:75a0:680a is NXDOMAIN
Nov  8 21:02:14 dnsmasq-dhcp[3510]: RTR-SOLICIT(wlan0) 86:a7:c5:5b:74:cb
Nov  8 21:02:21 dnsmasq-dhcp[3510]: no address range available for DHCP request via wlan0
Nov  8 21:02:28 dnsmasq-dhcp[3510]: no address range available for DHCP request via wlan0
Nov  8 21:04:45 dnsmasq-dhcp[3510]: no address range available for DHCP request via wlan0
Nov  8 21:04:46 dnsmasq-dhcp[3510]: RTR-SOLICIT(wlan0) 86:a7:c5:5b:74:cb
Nov  8 21:04:46 dnsmasq-dhcp[3510]: no address range available for DHCP request via wlan0
Nov  8 21:05:00 dnsmasq-dhcp[3510]: no address range available for DHCP request via wlan0
Nov  8 21:11:15 dnsmasq-dhcp[3510]: no address range available for DHCP request via wlan0
Nov  8 21:11:15 dnsmasq-dhcp[3510]: RTR-SOLICIT(wlan0) 86:a7:c5:5b:74:cb
Nov  8 21:11:16 dnsmasq-dhcp[3510]: no address range available for DHCP request via wlan0
Nov  8 21:11:17 dnsmasq-dhcp[3510]: no address range available for DHCP request via wlan0
Nov  8 21:11:21 dnsmasq-dhcp[3510]: no address range available for DHCP request via wlan0
Nov  8 21:11:29 dnsmasq-dhcp[3510]: no address range available for DHCP request via wlan0

# Reset of router

Nov  8 21:20:18 dnsmasq-dhcp[3510]: DHCP packet received on wlan0 which has no address
[..]
Nov  8 21:21:10 dnsmasq-dhcp[3510]: 7167144 sent size: 18 option: 39 FQDN  raspberrypi.lan
Nov  8 21:21:11 dnsmasq[3510]: query[A] ocfconnect-shard-eu02-euwest1.samsungiotcloud.com from 192.168.0.24

In earlier tests, DNS activity only continued when the DLAN adapters were plugged in, which led me to the second assumption. What speaks against my theory, though, is that there is activity in the log once I try to reconnect a device to the network, so somehow the pi is still communicating with the router.

I've just checked the file, all settings were commented out, so I set it up like this:

interface wlan0
static ip_address=192.168.0.10/24
static routers=192.168.0.1
static domain_name_servers=192.168.0.10

I will run another test with unplugged DLAN and the new settings this evening.

I have always accessed the UI via pihole's IP directly (as I also did with the router and DLAN config pages), so I am pretty sure that "nslookup pi.hole" would not give any results. Also, from my understanding, this shows that my device is disconnected not only from the pihole but also from the router after the "breakdown".
Apart from the DLAN, I cannot imagine that any other connected device would have DHCP functionality (PCs, smartphones, TV, LED controller, smartwatch, robot cleaner).

Your recent findings would contradict some of your previous observations, but that may be attributed to swapping your DLAN access points in and out for testing. (Makes it harder for me to follow, but I understand you are doing so to pinpoint your issue :wink: ).

Let's stick with the current facts, bringing us closer to a solution:

Lack of static definitions would indeed have your Pi-hole request a new DHCP lease after its router-sourced lease expires, and it would be your Pi-hole's DHCP server answering it.
Since its DHCP pool ranges from .50 to .251, your RPi wouldn't be able to acquire its old .10 (unless you've set up a corresponding lease reservation in Pi-hole's DHCP).

All devices having requested a DHCP lease before that IP switch-over would now continue to talk to the retired .10 address until the lease expires.
Yet all devices requesting a DHCP lease from Pi-hole's DHCP server after that IP switch-over would receive the correct new DNS server IP.
Still, you don't want Pi-hole to switch IP addresses in general.

I am confident that your static definitions in /etc/dhcpcd.conf will bring you back to normal, at least as long as you leave out your DLAN access points. Bring them back in once you've verified it's working as expected without them. That way, we'd know that any remaining issues had to be caused by them.

If you stick with those static definitions, I'd change the `domain_name_servers`:
static domain_name_servers=8.8.8.8

This will allow your RPi (and only that) to resolve DNS through Google if resolution through Pi-hole would fail for some reason, so you would still be able to access online resources, e.g. to download and apply updates on your RPi.
Change 8.8.8.8 to your liking.

Hi Bucking_Horn,
good news! The system has been running since 19:00, and so far everything is looking fine. There was brief trouble an hour after I switched pihole on, when I couldn't connect devices to the router anymore (my mobile didn't even show the SSID although it was directly next to it), but I could solve this by reactivating the DLAN (maybe the mirror had just lost connection to the router and connected to the DLAN AP, I didn't check that).
So far it seems, that my devices connect via pihole and not via the DLAN's DHCP, but I will keep an eye on that. So I guess the DHCP issue is solved so far, so I will close this thread.
I still have no idea, why the static IP was not set up during the installation process, but I am glad that with your help, I could fix this and the configuration of the wlan-port as well!
Thanks a lot for your help!

Glad it's working now. :slight_smile:

I'll try to summarise our efforts to solve this for more casual readers.
Note that Pi-hole never actually froze at any time, but was fully operational all along.

When you first activated Pi-hole's DHCP server, it was unable to hand out any DHCP leases:

*** [ DIAGNOSING ]: Pi-hole log
-rw-r--r-- 1 pihole pihole 617804 Nov  1 13:13 /var/log/pihole.log
   -----head of pihole.log------
   dnsmasq-dhcp[675]: no address range available for DHCP request via wlan0

That was caused by Pi-hole being configured for your RPi's eth0 instead of wlan0.
We fixed that by running

pihole -r

and choosing Reconfigure.

That allowed switching to Pi-hole's DHCP server.
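
For readers who want to check for the same interface mismatch, here is a small sketch. It assumes a Pi-hole release that stores its settings as `PIHOLE_INTERFACE=...` in /etc/pihole/setupVars.conf (other releases may keep this elsewhere); `pihole_interface` is a made-up helper name.

```shell
# Sketch: print the interface Pi-hole was configured for.
# Assumes a setupVars.conf containing a PIHOLE_INTERFACE=<name> line;
# pihole_interface is a hypothetical helper name.
pihole_interface() {
    sed -n 's/^PIHOLE_INTERFACE=//p' "${1:-/etc/pihole/setupVars.conf}"
}
```

Compare the result with the interface that actually carries your RPi's address, e.g. as listed by `ip -4 addr show`.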

While it was now able to hand out DHCP leases, access to Pi-hole's UI started failing at about the same time your router's last DHCP leases expired (one hour). There were hints that this was only happening for devices connected through a wifi access point early on.

In the end, that was caused by your Pi-hole machine's router-sourced IPv4 address (.10) expiring, and since Pi-hole's DHCP had no lease reservation for your RPi, it was acquiring a new IP that none of the clients knew about.

We fixed that by providing a static network interface definition in /etc/dhcpcd.conf:

interface wlan0
static ip_address=192.168.0.10/24
static routers=192.168.0.1
static domain_name_servers=8.8.8.8

If you observe any further anomalies with regard to your additional DLAN network equipment, please consider opening a new topic (you are then also welcome to use the German-language help category, Deutschsprachige Hilfe :wink: ).
