Expected Behaviour:
My clients should be able to resolve names like they've always done.
Actual Behaviour:
I suddenly can't use either of my Pi-hole instances (one on a Raspberry Pi, one in a VM on a different host) as DNS on my machines.
I checked the Raspberry Pi and it couldn't nslookup google.com itself; same with the VM.
I tried updating Pi-hole on the Raspberry Pi, but the update didn't complete (this was before I knew it wasn't resolving properly), and now when I try to log in at pi.hole/ it says "for fresh installs use the default password" or something similar to that message. So the update broke it.
I checked the nameserver on the Raspberry Pi (the main Pi-hole instance I use) and /etc/resolv.conf still has nameserver 127.0.0.1. I can't ping anything from any device that uses the Pi-hole instance as DNS, and the Pi-hole machines themselves aren't resolving either.
I'm still debugging but this is urgent. Thanks.
Other info
I can still reach the Docker containers running on my server, which uses the Raspberry Pi as its nameserver. But that server can't ping or nslookup anything. Same with my laptop, which was using the Raspberry Pi as DNS: its network stopped working until I changed the DNS on it.
From the server using rpi (137 is the raspberrypi):
The Pi-hole instance on the VM, which runs on the main server mentioned above, is somehow working as DNS but very poorly: some sites load and some don't. I'm assuming it has something to do with caching? I'm not sure. Oh, I see in my notes now that I left the VM instance with nameserver 1.1.1.1 in case the main one (the rpi) fails. Still, it's working poorly even though it points to 1.1.1.1. Actually, even my laptop's network is poor if I set 1.1.1.1 as my DNS, so I have it set to the default.
I haven't touched any of the configs in a while. I only touched the device now because of this issue. It had been working fine up until either last night or this morning. I've tried rebooting but nothing changed. I thought it might be caused by a recent power outage.
But no, I haven't changed the config. I wanted unbound, and I think I followed this doc to set things up a while ago:
And no, I actually haven't thought about the router DNS settings. So it basically defeats the purpose of Pi-hole a bit if it's using itself and the ISP's DNS, right?
Nov 28 12:30:29 dnsmasq[567]: forwarded 2.debian.pool.ntp.org to 127.0.0.1#5335
Nov 28 12:30:29 dnsmasq[567]: query[AAAA] 2.debian.pool.ntp.org from 127.0.0.1
Nov 28 12:30:29 dnsmasq[567]: forwarded 2.debian.pool.ntp.org to 127.0.0.1#5335
Nov 28 12:30:29 dnsmasq[567]: forwarded 2.debian.pool.ntp.org to 127.0.0.1#5335
Nov 28 12:30:29 dnsmasq[567]: forwarded 2.debian.pool.ntp.org to 127.0.0.1#5335
Nov 28 12:30:29 dnsmasq[567]: reply error is SERVFAIL
Nov 28 12:30:29 dnsmasq[567]: reply error is SERVFAIL
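To see how widespread these failures are, a log like the excerpt above can be summarised with a couple of small shell helpers. This is only a sketch, not part of Pi-hole: the function names `count_servfail` and `list_upstreams` are made up for illustration, and the log path in the usage note is an assumption that depends on the Pi-hole version.

```shell
# Count "reply error is SERVFAIL" lines in a dnsmasq-style log read from stdin.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the exit status clean.
count_servfail() {
    grep -c 'reply error is SERVFAIL' || true
}

# Print each distinct upstream server that queries were forwarded to.
list_upstreams() {
    sed -n 's/.*: forwarded .* to \(.*\)$/\1/p' | sort -u
}
```

For example, `count_servfail < /var/log/pihole/pihole.log` (older installs may use `/var/log/pihole.log`) prints the number of SERVFAIL replies. If `list_upstreams` prints only `127.0.0.1#5335`, as in the excerpt, every forwarded query is going to the local resolver on port 5335, so that resolver is the next thing to look at.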
Check the date/time on the host device and correct as necessary for your local time.
Incorrect time causes problems with the DNSSEC process used to authenticate replies.
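One way to act on this advice, sketched under assumptions: compare the host's clock against a time you trust (a phone, another machine) and see how far apart they are. The helper name `abs_skew` is hypothetical, and the reference timestamp has to come from you, because a host that can't resolve names may not be able to reach an NTP pool either.

```shell
# Print the absolute difference, in seconds, between two Unix timestamps.
abs_skew() {
    if [ "$1" -ge "$2" ]; then
        echo $(( $1 - $2 ))
    else
        echo $(( $2 - $1 ))
    fi
}
```

Usage: `abs_skew "$(date -u +%s)" REFERENCE`, where REFERENCE is the epoch time read from the trusted clock. A difference of a few seconds is unlikely to matter; hours or days would be worth correcting. On systemd-based systems, `timedatectl` also reports whether the system clock is synchronized.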
Hmm, OK. I ran sudo raspi-config and set the date again through the localization settings (I probably could've done it a different way). I had actually thought about that, especially after seeing someone else's issue caused by the time being out of sync. But I just checked my unbound and I think it looks okay; it hasn't changed from before. The unbound service itself looks fine, but the log has a bunch of lines like this at the top and then looks normal towards the end.
/var/log/unbound/unbound.log
unbound[446:0] info: generate keytag query _ta-4f66. NULL IN
Unbound Status
❯ sudo systemctl status unbound
● unbound.service - Unbound DNS server
Loaded: loaded (/lib/systemd/system/unbound.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2023-11-28 15:11:39 AST; 44min ago
Docs: man:unbound(8)
Process: 6004 ExecStartPre=/usr/lib/unbound/package-helper chroot_setup (code=exited, status=0/SUCCESS)
Process: 6007 ExecStartPre=/usr/lib/unbound/package-helper root_trust_anchor_update (code=exited, status=0/SUCCESS)
Main PID: 6010 (unbound)
Tasks: 1 (limit: 1595)
CPU: 1.770s
CGroup: /system.slice/unbound.service
└─6010 /usr/sbin/unbound -d -p
Nov 28 15:11:39 raspberrypi systemd[1]: Starting Unbound DNS server...
Nov 28 15:11:39 raspberrypi systemd[1]: Started Unbound DNS server.
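A service that is "active (running)" can still fail to answer queries, so it may help to ask unbound directly and compare that with asking Pi-hole. The sketch below assumes unbound listens on 127.0.0.1 port 5335, as the dnsmasq log lines suggest; `dig_status` is a made-up helper that only pulls the status field out of dig's header line.

```shell
# Read dig output on stdin and print the response status
# (for example NOERROR, SERVFAIL or NXDOMAIN) from the header line.
dig_status() {
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p'
}
```

Usage: `dig google.com @127.0.0.1 -p 5335 | dig_status` queries unbound directly, while `dig google.com @127.0.0.1 | dig_status` goes through Pi-hole on port 53. If the first prints SERVFAIL, the problem is in or behind unbound; if only the second fails, it is more likely on the Pi-hole side. No output at all means dig got no response header, for example because the query timed out.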
Date on the system after the sudo raspi-config (I think it was the same? I forgot to check before updating):
I was able to ping just now after I posted my reply (after updating my date/time). But sudo apt update doesn't work (failure resolving archive.raspberrypi...)
But that's not an indicator that it's working right?
Then I do an nslookup on another domain like quay.io and that doesn't work. Some pings work fast, some take their time to start, and some don't work at all.
❯ ping yahoo.com
PING yahoo.com (74.6.143.26) 56(84) bytes of data.
64 bytes from 74.6.143.26: icmp_seq=1 ttl=53 time=32.3 ms
64 bytes from media-router-fp74.prod.media.vip.bf1.yahoo.com (74.6.143.26): icmp_seq=2 ttl=53 time=30.7 ms
.....
Update:
I checked my system again just now, and I can suddenly ping and resolve things. I used nslookup domain.in.pihole and it worked. I'm using the Raspberry Pi as DNS on my computer again and it seems to work. There is a bit of a weird lag every now and then; I'm not sure yet what's happening. Here's an updated
But in an updated debug log I see this and one SERVFAIL:
** [ DIAGNOSING ]: Operating system
[i] Distro: Raspbian
[i] Version: 11
[✗] dig return code: 9
[✗] dig response: ;; connection timed out; no servers could be reached
[✗] Error: dig command failed - Unable to check OS
It seems to be operational now, and I'm not sure what actually fixed it. It gradually returned to fully normal. The VM instance was functional again before the Raspberry Pi.
It might have been the date/time update. It might have been that I could update Pi-hole once things were working well enough to let it update? I can't pinpoint a specific thing I did.
After they were functional again, I updated the lists and made sure all was good. It's been a few hours, so hopefully I don't jinx it!