Pi-hole is crashing my ethernet network

I could use some help with the following situation where I have no idea where to look or how to isolate this further.

Config:

Hardware:

  • Raspberry Pi 4GB
  • Kingston A400 - Internal SSD - 240 GB
  • Netgear R7000 with custom Firmware Freshtomato2021.2 K26ARM USB AIO-64K
  • Several PCs connected via cable to the router/switches

Issue:
For the past couple of weeks, my complete Ethernet network has been crashing. That means every device connected via cable loses its connection to the router (no IP). When I check a wireless device, it is still connected to my router and can reach the internet.

This happens every couple of days and I have no way to trigger it faster.

When this happens and I power off my RPi, my Ethernet network comes back straight away. So it's fair to assume that the Pi is the culprit.

Isolation/troubleshoot steps:

  • Unplugged the RPi: the network comes back and devices receive an IP again.
  • I turned off Pi-hole for a couple of weeks and the issue didn't come back.
  • I checked the logs but could not find anything that might be causing this. I don't rule out that I'm overlooking something.
  • When I generate a Docker debug log I do see some errors, but I also read somewhere that debug mode wasn't written for Docker and that some errors can be ignored.
  • Deleted all Pi-hole containers and images and started from scratch.

https://tricorder.pi-hole.net/gg0qoqtxdj

Docker-compose file

version: "3"

# More info at https://github.com/pi-hole/docker-pi-hole/ and https://docs.pi-hole.net/
services:
  pihole:
    container_name: pihole
    image: pihole/pihole:latest
    hostname: pihole
    ports:
      - "53:53/tcp"
      - "53:53/udp"
      - "67:67/udp"
      - "80:80/tcp"
      - "443:443/tcp"
    environment:
      TZ: 'Europe/Amsterdam'
      WEBPASSWORD: 'xxxxxx'
    # Volumes store your data between container upgrades
    volumes:
      - './etc-pihole/:/etc/pihole/'
      - './etc-dnsmasq.d/:/etc/dnsmasq.d/'
    # Recommended but not required (DHCP needs NET_ADMIN)
    #   https://github.com/pi-hole/docker-pi-hole#note-on-capabilities
    cap_add:
      - NET_ADMIN
    restart: unless-stopped

Has anyone experienced the same, or could someone point me in the right direction? If it happens again, I'm on the verge of installing the non-Docker version to see if it happens there as well.

Or maybe I should put Pi-hole/Docker on a fixed IP of its own instead of sharing one with the other apps?

I've got the exact same problem.
My token: https://tricorder.pi-hole.net/f4htyokt2z

Except that none of my wireless devices can reach the internet either. I can't even reach my router. Wireless or cable makes no difference; I have to reboot my router to get back online.

I have had this problem since I bought a new Synology and a new router and set them up together on the same day.
Router: Asus Ax-88u with merlin firmware (Because of this issue)
Synology: DS920+
Pihole on docker.

I've tried every guide I could find; nothing seems to work and I don't know why.
Today I reinstalled Pi-hole and left everything at default.
It got a static IP from my router's DHCP, DNS Filter on, no filtering, router mode.
I've got a modem from my ISP, which is in bridge mode.

All three of those are wired.

I even started to wonder whether placing my router on top of my Synology could be the issue.

I've got a VM running on my Synology which has an active VPN connection, but the issue started before I set that up.

I can add that LAN DNS 1 is set to my Synology's static IP, 192.168.1.110, and DNS 2 is blank.
WAN DNS is automatic.
Pi-hole's DNS is set to Cloudflare for both.
Conditional forwarding is disabled.
The first two "never forward...." boxes are ticked; the rest are blank or not ticked.

Except for the wireless/cabled difference, we might have the same issue?

Could be...

When I reinstalled Pi-hole yesterday I basically left it at the defaults. I didn't change a single setting and only have one PC (my main) using Pi-hole.

One thing I remember reading somewhere was about the hostname of the Pi-hole.
I named it Pihole like you did; now, with the reinstall, I left it at the default, pihole-pihole1.
I can't remember exactly what it was, but I think the name had to contain a "." or "-" for some reason.
I thought I had renamed it earlier in the process, but apparently not.

It just dropped again; I couldn't even connect to the Wi-Fi.

This looks more like a hardware issue.
Did you check for defective ethernet cables or switches?

As a DNS server, Pi-hole is not involved in IP address allocation at all.

It could be if you had enabled Pi-hole's and disabled your router's DHCP server.
But if that were the case, powering down your Pi-hole RPi would cause your connected clients to lose their IP addresses over time as their current DHCP leases expire (which doesn't seem to match your observation).

But why didn't this happen while Pi-hole was not running for 2 weeks? The moment I turned Pi-hole on again, the issue came back after ~2 days (which is pretty typical for how long it takes to show up).
I tested different cables before and also a different Pi.

The only link I can see is that when I power off or reboot the RPi, my network comes back instantly. It's like the RPi is flooding my network. When the issue starts, I notice that Pi-hole / Home Assistant / DSMR Reader stop responding. For a few more seconds or minutes I can still use the internet on the wired devices, but then that stops as well.

The normal result when switching off a properly integrated Pi-hole DNS server would be loss of DNS resolution, local and public alike, for all clients in your network as soon as on-device cached domains expire.
This would look like a complete internet failure, which is the opposite of what you describe.

If your network is still operational when you power down your Pi-hole machine, your clients may not use Pi-hole for DNS at all.

Stop responding to what?
Did you try to verify whether DNS is involved?

I know how it sounds and also find it very strange.
But I'm just stating the facts and my findings.

When it happens, all Ethernet devices lose their connection (no IP).
Wireless devices work fine (especially when they don't use the Pi-hole as DNS).

When I power-off the Pi my network comes back straight away.

And most importantly, I ran a test for 2.5 weeks without Pi-hole and my network didn't crash. I turned it back on and within 2 days it crashed again. One time it even crashed a couple of hours after the Pi was started.

I'm just using a Docker setup with a plain install of Pi-hole. Something in my setup, along with Pi-hole or Docker or something else, is causing the issue that I'm experiencing.

When it happens, I notice that the other applications on the Pi-hole machine stop responding. I can't access them via the browser anymore. At that moment my network/internet is still working, but within a couple of seconds to a minute my LAN connection also drops (no IP). It's like something is flooding the network and snowballing until everything stops working.
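
One way to test the flooding theory is to record packet rates on the Pi itself, so there is a timestamped trail on disk after the next outage. The following is only a rough Python sketch under a few assumptions: the Pi runs Linux and exposes interface counters in /proc/net/dev, and the function names and the 10-second interval are made up for illustration.

```python
import time


def parse_net_dev(text):
    """Return {interface: (rx_packets, tx_packets)} from /proc/net/dev content."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 1 is received packets, field 9 is transmitted packets.
        counters[name.strip()] = (int(fields[1]), int(fields[9]))
    return counters


def packet_rates(before, after, seconds):
    """Packets per second as (rx, tx) per interface between two samples."""
    return {
        name: ((after[name][0] - before[name][0]) / seconds,
               (after[name][1] - before[name][1]) / seconds)
        for name in after if name in before
    }


def watch(interval=10, path="/proc/net/dev"):
    """Print packet rates forever; redirect the output to a file on the Pi."""
    with open(path) as handle:
        previous = parse_net_dev(handle.read())
    while True:
        time.sleep(interval)
        with open(path) as handle:
            current = parse_net_dev(handle.read())
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        for name, (rx, tx) in packet_rates(previous, current, interval).items():
            print(stamp, name, f"rx={rx:.0f}/s tx={tx:.0f}/s", flush=True)
        previous = current
```

Calling `watch()` and redirecting its output to a file would show whether the transmit rate on the Pi's Ethernet interface really climbs in the minutes before the wired network drops. If it stays flat, flooding from the Pi becomes less likely.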

So again, I do understand how it sounds and that it doesn't make sense for a DNS application. But I'm not making this up and would love to isolate this further. I've run Pi-hole for a couple of years without any problems. This mostly started when I switched to Docker, but that could be a coincidence.

Is there some way I can turn on debug logging that logs EVERY action Pi-hole performs, every second/minute?

I'm not dismissing your observation.
I'm rather trying to align your description of your observation with standard behaviour.

So far, the only thing your description would suggest is that Pi-hole isn't the sole DNS server for your network (otherwise, you'd lose DNS resolution and thus any meaningful way to access the internet shortly after you've switched off your Pi-hole machine).

If a device or network interface suddenly drops its IP, a hardware related defect is very likely.

I somehow suspect your description may mix up objective observations with subjective speculations, and hence may lack in both precision and completeness.

How do you determine that a device has lost its IP addresses?
What did ip address or ipconfig show on a device you assume has lost its IP?
Were you able to ping your router from that device?
Did you try to ping that device's IP from another machine?

And did you try to verify whether DNS is involved yet?

Correct.
I used to set my router's DNS to the Pi-hole so every device went through it. But for test purposes only my main PC is using it, while the other devices go through the router/ISP DNS.

Okay, but if every device (2nd PC, TV, NAS, media center) loses its connection, I doubt the problem is with these devices. Perhaps the RPi itself is the problem, but I also swapped it for another RPi and saw the same behavior.

Windows shows the connection-lost icon in the systray.
I'm not able to ping other devices or google.com. I can't connect to the RPi through SSH to run some test pings.

This is still too vague.

Windows' NCSI icon does not allow you to conclude that you've lost your IP, and it's known to falsely report no connectivity even when there is.

And as you didn't disclose the exact commands and results you were using, there is no way to judge what you've actually tested for, e.g. as you're testing for loss of IP addresses, you should have pinged IP addresses - did you?

And you still didn't answer

as well as

When it happens again (I expect soon) I can make screenshots. At the moment I'm giving my observations from memory.

It's also safe to assume that I know how ping works :wink:
Yes I pinged my router
ping 192.168.1.1 = time out
ping google.com = time out

Not sure what you mean by this?
I did ping google.com, which timed out.

But again, I'm doing all this from memory. I've been at it for a couple of weeks.

ipconfig showed 169.x, I believe... I can guarantee you it didn't have my 192.168.1.100 IP (main PC).

ping is good for IP related stuff, but it's not adequate for DNS (different protocol, and it resolves names through different means, not just DNS).
Use nslookup or dig to analyse DNS issues.
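
Where nslookup or dig are not at hand, the same kind of check can be approximated with a short Python sketch that sends a single DNS query over UDP port 53 straight to one server and only reports whether that server replied. This is an illustration, not a replacement for dig: the function names and the fixed query id are made up, and it does not parse the answer records.

```python
import socket
import struct

QUERY_ID = 0x1234  # arbitrary id, echoed back by the server in its reply


def build_query(name, query_id=QUERY_ID):
    """Build a minimal DNS query packet asking for the A record of `name`."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    # Question: encoded name, terminating zero, QTYPE=1 (A), QCLASS=1 (IN).
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)


def dns_answers(server, name, timeout=3.0):
    """Return True if `server` replies to a query for `name`, else False."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable networks
            return False
    if len(reply) < 4:
        return False
    # A reply echoes the query id and has the QR (response) bit set.
    reply_id, flags = struct.unpack("!HH", reply[:4])
    return reply_id == QUERY_ID and bool(flags & 0x8000)
```

Running `dns_answers(...)` once against the Pi-hole's address and once against the router's address during an outage would help separate "DNS is down" from "the network is down", which a ping of google.com cannot do on its own.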

169.254.0.0/16 would be a link-local IPv4 address.
Those addresses would commonly only be assigned if a device has failed to acquire an IPv4 address via DHCP or static definitions.
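
To check an address noted down from ipconfig against that range, Python's standard `ipaddress` module is enough. A small sketch (the function name and the labels it returns are just for illustration):

```python
import ipaddress

LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")


def classify_ipv4(address):
    """Label an IPv4 address as link-local, private, or other."""
    ip = ipaddress.ip_address(address)
    # Test link-local first: Python also reports 169.254.x.x as private.
    if ip in LINK_LOCAL:
        return "link-local"
    if ip.is_private:
        return "private"
    return "other"
```

For example, `classify_ipv4("169.254.12.34")` gives "link-local", while the main PC's usual 192.168.1.100 gives "private".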

I already explained that Pi-hole as a DNS server is not involved in IP address allocation at all, but may be so if you're using it for DHCP as well.

So far, you haven't disclosed whether you actually intended to use Pi-hole for DHCP?

Yes, I know nslookup.
But DNS isn't my main concern when all my wired devices have lost their connection while my wireless devices don't have any problem.

I just want to isolate this further. For me, at the moment, all fingers point to Pi-hole. I understand that it sounds weird, but Pi-hole does/can do more on a network level than applications like Home Assistant or DSMR Reader. For instance, conditional forwarding has never worked for me: when I turn it on, my network also crashes after a while (but that's another issue which I don't care about at the moment).

As I said earlier, the problems started when I moved to the Docker setup and followed the Debian guide on how to install Home Assistant Supervised (supported version). Before that I always installed Pi-hole via the bash install script. If it happens again in the coming days (which I have no doubt it will), I might as well try the bash script again instead of Docker.

Pi-hole is involved in exactly one type of network activity:
It answers to DNS requests arriving on port 53 UDP/TCP via IPv4 or IPv6 alike by forwarding them to one of its configured DNS servers as applicable.

If you'd enabled Pi-hole's DHCP server, it may additionally answer DHCP requests.
Only when acting as DHCP server, it could be involved in allocating IP addresses.
You still haven't revealed whether you are using Pi-hole that way (I am mentioning this now for the third time).

I thought you pretty much answered that question yourself.

No, I don't use DHCP on the Pi.
You can also assume that I don't run 2 DHCP servers on my network. I'm not a complete noob and know perfectly well what I'm doing :wink:

But can you explain why my network comes back straight away the moment I power off or reboot the RPi? Or why I didn't experience this problem while Pi-hole (the Docker container) was turned off for 2.5 weeks? And when it happens, it starts with the RPi becoming unreachable, and after a few minutes my complete wired network is down. I prefer to focus on those parts and how to troubleshoot/debug them, instead of convincing you that I know how ping works or that my device doesn't receive an IP. I do appreciate your effort though!

Perhaps I should start a tcpdump on the RPi and let it run, or are there any other ways to get more/better debug logging to see what is happening at the moment it happens? I checked the /var/log/messages log, but there is so much regular 'spam' that it's hard for me to isolate anything.

And I'm not blaming Pi-hole for this issue, because hundreds of thousands of people use it fine. But in my case there is a combination of factors that is causing my issue, and I would really like to get to the bottom of it.
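
To cut through the regular log 'spam', one option is to keep only the lines written in the last few minutes before a known outage time and drop the usual noise. A rough Python sketch, assuming the log lines start with the traditional syslog stamp ("Mon DD HH:MM:SS", no year, English month names); the function names, the 10-minute window and the idea of an ignore list are all assumptions for illustration:

```python
from datetime import datetime, timedelta


def parse_stamp(line, year):
    """Parse a leading 'Mon DD HH:MM:SS' stamp; return None if there is none."""
    try:
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def lines_before(lines, outage, minutes=10, ignore=()):
    """Yield lines logged in the `minutes` before `outage`, skipping noise."""
    start = outage - timedelta(minutes=minutes)
    for line in lines:
        stamp = parse_stamp(line, outage.year)
        if stamp is None or not (start <= stamp <= outage):
            continue
        # Skip lines containing any of the known-noisy words.
        if any(word in line for word in ignore):
            continue
        yield line
```

Feeding it the lines of /var/log/messages together with the time the wired network dropped, and adding whatever turns out to be routine noise to `ignore`, should leave a much shorter list to read through.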

EDIT:

Your topic's title suggests otherwise?
-EDIT END

(I'm not prone to assumptions, that's why I asked how you arrived at your description.
Likewise, I didn't make any assumptions on your network, but provided you with the information how Pi-hole could be involved.
Or, as I've put it before: I tried to align your description of your observation with standard behaviour.)

Pi-hole wouldn't be involved in DHCP leases then.

No, and I don't intend to.

I can only relate as far as Pi-hole is concerned, and to the extent that you reveal your related configuration.

From what you shared, Pi-hole isn't involved.

If your IP addresses indeed switch to 169.254.0.0/16 range after some time, then perhaps your clients weren't able to contact your DHCP server to renew their lease.
In that case, observed outages should somehow correlate with your DHCP lease time, and you should focus on why they are not able to renegotiate a lease with their known DHCP server.
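
To check for such a correlation, it helps to know when a client would try to renew. By the DHCP defaults in RFC 2131, a client starts renewing at 50% of the lease time (T1) and rebinding at 87.5% (T2), though a server may hand out different values. A small Python sketch of that arithmetic (the function name is made up, and the lease start and duration have to be read from the client, e.g. from `ipconfig /all` on Windows):

```python
from datetime import datetime, timedelta


def lease_milestones(obtained, lease_seconds):
    """Return default renewal (T1), rebinding (T2) and expiry times of a lease."""
    return {
        "renew_T1": obtained + timedelta(seconds=lease_seconds * 0.5),
        "rebind_T2": obtained + timedelta(seconds=lease_seconds * 0.875),
        "expires": obtained + timedelta(seconds=lease_seconds),
    }
```

If the outages keep landing near the "expires" time of the wired clients' leases, that would point to failed renewals rather than to DNS.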
But that is just a guess, as good as the next one.

As this doesn't seem to be a Pi-hole issue, you'd increase your chances for an answer by consulting other forums as well.

Yes, explained several times:
Pi-hole turned on = network crashes every couple of days
Pi-hole turned off = no issues for 2.5 weeks straight.

Not every setup is the same. And in my scenario something weird is happening when Pi-hole is turned on.

I probably would have mentioned other error messages like 'IP conflict' etc. But that's another discussion.

Not sure how you can say Pi-hole isn't involved?
modem <-> router <-> rpi with pihole
Main PC connected through DHCP with the router, and the RPi with Pi-hole acting as DNS.

Not relevant. This is not the case.
Otherwise I would have had the same problem while Pi-hole was turned off, which was not the case.

Again...
Pi-hole turned on = crashes every couple of days
Pi-hole turned off = no issues at all.

I appreciate your help, but you have to be more open to the possibility that Pi-hole could be the problem in my setup/scenario. I said it before and do realize that it sounds strange, but until we can explain why this isn't happening when Pi-hole is turned off, we can't exclude Pi-hole yet.

So unless you can help me with actual troubleshooting/isolation steps, instead of pointing the finger away from Pi-hole, I would appreciate it if we could leave this discussion. The goal of troubleshooting is to isolate the problem and narrow it down further. I shouldn't have to be the one explaining on a tech forum that not every setup is the same. What works perfectly in your setup might cause issues in mine. Until we have ruled that out, every card should be on the table.

You do realize that it will be hard to convince other parties that this problem only occurs when pihole is turned on. And with other parties I mean docker or debian.

If Bucking_horn can't help you then you're going to be on your own.

I don't know IF he can help me.
The help I was getting was doubting my findings or judging my ping skills. Or saying the problem is NOT related to pihole. I'm perfectly open to believe the problem is not related to Pihole. But then I expect to exclude pihole with isolation steps (while my isolation steps point to Pihole). Yet he can't explain why in my situation.