DNS stops working

Expected Behaviour:

DNS should work as long as pihole is running

Actual Behaviour:

I haven't been able to figure out a pattern, but recently, at least once a day, the Pi-hole stops resolving domain names outside of the network. It lasts anywhere from 5 to 30 minutes. I'll poke around and temporarily disable the Pi-hole blocking or change the Upstream DNS Server, and it will start working again. While it's not working, I can't connect to any site outside my internal network. Even when I ssh into the Pi, pinging google.com fails because the name won't resolve. I thought it was caused by having too many Upstream DNS Servers selected, but changing to one didn't fix it for more than a day or so. I'm not sure how to diagnose it or where to look when it's happening.

Debug Token:

qkcrj75ad7

What are the outputs of these commands from the Pi terminal:

echo ">stats" | nc localhost 4711

df -a -h

sudo stat /var/log/pihole.log

sudo stat /etc/pihole/pihole-FTL.db

$ echo ">stats" | nc localhost 4711
domains_being_blocked 1299640
dns_queries_today 79043
ads_blocked_today 6215
ads_percentage_today 7.862809
unique_domains 3267
queries_forwarded 42863
queries_cached 29983
clients_ever_seen 41
unique_clients 41
dns_queries_all_types 79521
reply_NODATA 749
reply_NXDOMAIN 446
reply_CNAME 4896
reply_IP 9270
status enabled
---EOM---

$ df -a -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/root        15G  5.7G  8.1G  42% /
devtmpfs        460M     0  460M   0% /dev
sysfs              0     0     0    - /sys
proc               0     0     0    - /proc
tmpfs           464M     0  464M   0% /dev/shm
devpts             0     0     0    - /dev/pts
tmpfs           464M   18M  446M   4% /run
tmpfs           5.0M  4.0K  5.0M   1% /run/lock
tmpfs           464M     0  464M   0% /sys/fs/cgroup
cgroup             0     0     0    - /sys/fs/cgroup/systemd
cgroup             0     0     0    - /sys/fs/cgroup/cpu,cpuacct
cgroup             0     0     0    - /sys/fs/cgroup/cpuset
cgroup             0     0     0    - /sys/fs/cgroup/freezer
cgroup             0     0     0    - /sys/fs/cgroup/devices
cgroup             0     0     0    - /sys/fs/cgroup/net_cls
cgroup             0     0     0    - /sys/fs/cgroup/blkio
systemd-1          0     0     0    - /proc/sys/fs/binfmt_misc
mqueue             0     0     0    - /dev/mqueue
debugfs            0     0     0    - /sys/kernel/debug
sunrpc             0     0     0    - /run/rpc_pipefs
configfs           0     0     0    - /sys/kernel/config
/dev/mmcblk0p1   41M   22M   19M  54% /boot
tmpfs            93M     0   93M   0% /run/user/1000
gvfsd-fuse         0     0     0    - /run/user/1000/gvfs
fusectl            0     0     0    - /sys/fs/fuse/connections
tmpfs            93M     0   93M   0% /run/user/999


$ sudo stat /var/log/pihole.log
  File: /var/log/pihole.log
  Size: 38673742  	Blocks: 75544      IO Block: 4096   regular file
Device: b302h/45826d	Inode: 130353      Links: 1
Access: (0644/-rw-r--r--)  Uid: (  999/  pihole)   Gid: (  996/  pihole)
Access: 2018-11-01 16:18:02.362030019 -0400
Modify: 2018-11-01 21:12:43.711804804 -0400
Change: 2018-11-01 21:12:43.711804804 -0400
 Birth: -

$ sudo stat /etc/pihole/pihole-FTL.db
  File: /etc/pihole/pihole-FTL.db
  Size: 1333673984	Blocks: 2604840    IO Block: 4096   regular file
Device: b302h/45826d	Inode: 259181      Links: 1
Access: (0644/-rw-r--r--)  Uid: (  999/  pihole)   Gid: (  996/  pihole)
Access: 2018-01-22 14:34:05.279202245 -0500
Modify: 2018-11-01 21:12:00.211694163 -0400
Change: 2018-11-01 21:12:00.211694163 -0400
 Birth: -

When things go wrong, use the tool below to check ICMP/IP connectivity, for example by targeting Google's public DNS service at 8.8.8.8:

pi@noads:~ $ traceroute 8.8.8.8
traceroute to 8.8.8.8 (8.8.8.8), 30 hops max, 60 byte packets
 1  10.0.0.1 (10.0.0.1)  0.534 ms  0.695 ms  0.627 ms
 2  192.168.1.1 (192.168.1.1)  2.389 ms  2.333 ms  2.273 ms
[..]
 7  108.170.241.193 (108.170.241.193)  19.366 ms  19.558 ms 108.170.241.129 (108.170.241.129)  16.575 ms
 8  216.239.51.5 (216.239.51.5)  20.686 ms 216.239.41.225 (216.239.41.225)  19.517 ms 216.239.63.247 (216.239.63.247)  20.694 ms
 9  google-public-dns-a.google.com (8.8.8.8)  16.641 ms  17.005 ms  16.942 ms

Use the tool below on Pi-hole, or on any client PC running Linux, Windows, or macOS, to test DNS lookups, for example by querying Google's public DNS server 8.8.8.8:

pi@noads:~ $ nslookup pi-hole.net 8.8.8.8
Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
Name:   pi-hole.net
Address: 206.189.252.21

You can of course target any IP that you want to test.
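
The two checks above can be combined to tell a network outage apart from a DNS failure. The following is only a minimal sketch: the function names are made up for illustration, `UPSTREAM` and `TEST_NAME` default to the 8.8.8.8 and pi-hole.net values used above, and it assumes an `nslookup` that returns a non-zero exit status when a lookup fails. If ICMP is blocked somewhere on the path, the first verdict can be misleading.

```shell
#!/bin/sh
# Sketch: tell "no network" apart from "DNS broken" on the Pi-hole or a client.
# UPSTREAM and TEST_NAME are assumptions taken from the examples in this thread.
UPSTREAM="${UPSTREAM:-8.8.8.8}"
TEST_NAME="${TEST_NAME:-pi-hole.net}"

# Each probe returns 0 on success and non-zero on failure.
probe_ip()       { ping -c 1 "$UPSTREAM" >/dev/null 2>&1; }
probe_upstream() { nslookup "$TEST_NAME" "$UPSTREAM" >/dev/null 2>&1; }
probe_default()  { nslookup "$TEST_NAME" >/dev/null 2>&1; }

# classify IP_STATUS UPSTREAM_STATUS DEFAULT_STATUS
# Each argument is an exit status: 0 means the probe succeeded.
classify() {
    if [ "$1" -ne 0 ]; then
        echo "no IP connectivity to upstream"
    elif [ "$2" -ne 0 ]; then
        echo "network up, but upstream DNS is not answering"
    elif [ "$3" -ne 0 ]; then
        echo "upstream DNS works, configured resolver is failing"
    else
        echo "all checks passed"
    fi
}

# Run the three probes in order and print a single verdict.
run_checks() {
    probe_ip;       ip=$?
    probe_upstream; up=$?
    probe_default;  def=$?
    classify "$ip" "$up" "$def"
}
```

Calling `run_checks` once while everything works and once during an outage gives two verdicts that can be compared directly.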

This was helpful. I've been trying to figure it out when it happens and I haven't gotten too much closer. I find that if I do an nslookup without a DNS server specified while I'm having the problem, it fails. When I'm not having a problem it works fine. If I specify a DNS Server, it works fine. If I disable/re-enable blocking it works fine. I'd like to figure out what's causing the problem so I can prevent it from happening.

If you don't specify a DNS server, nslookup queries the DNS server(s) configured in the OS.
On Linux and Mac systems, you can see which DNS servers are configured with the command below (10.0.0.2 being my Pi-hole DNS):

xbian@kodi ~ $ cat /etc/resolv.conf
domain dehakkelaar.nl
search dehakkelaar.nl
nameserver 10.0.0.2
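
To query every configured server in turn, the `nameserver` lines can be pulled out of that file. This is a rough sketch for Linux clients only; `list_nameservers` and `query_each` are hypothetical helper names, and `pi.hole` is used as the test name on the assumption that Pi-hole is the resolver being checked.

```shell
#!/bin/sh
# list_nameservers [FILE]: print the address from every "nameserver" line.
# Defaults to /etc/resolv.conf when no file is given.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# query_each [FILE]: run one lookup against each configured server.
query_each() {
    for ns in $(list_nameservers "$@"); do
        echo "--- querying $ns"
        nslookup pi.hole "$ns"
    done
}
```

Running `query_each` with no arguments reads `/etc/resolv.conf` and shows which of the configured servers answer.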

On Windows systems, the command below displays the configured DNS server(s) and more:

C:\>ipconfig /all

Windows IP Configuration

Ethernet adapter Local Area Connection:

   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes
   IPv4 Address. . . . . . . . . . . : 10.0.0.11(Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Lease Obtained. . . . . . . . . . : Thursday, November 22, 2018 19:53:31
   Lease Expires . . . . . . . . . . : Friday, November 23, 2018 19:53:31
   Default Gateway . . . . . . . . . : 10.0.0.1
   DHCP Server . . . . . . . . . . . : 10.0.0.2
   DNS Servers . . . . . . . . . . . : 10.0.0.2

So the question now is: which DNS server(s) are configured on your clients?

And when you are having trouble, what is the outcome of the command below on a client PC, when querying the DNS IP address configured in your OS:

nslookup pi.hole <OS_DNS_SERVER_IP_ADDRESS>

And this one on a client PC:

nslookup pi.hole <PIHOLE_IP_ADDRESS>

And this one on Pi-hole:

nslookup pi.hole localhost

If it's actually Pi-hole's own DNS service that fails with the last one, check the bottom of the logs with:

less /var/log/pihole.log

And:

less /var/log/pihole-FTL.log

Please post any relevant messages here.

Might want to check status as well while at it:

sudo service pihole-FTL status -l

And some general checks for disk space and date/time etc:

df -h

date

free -h

And test the commands first with a working setup so you know what the output should look like!
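
The checks listed above can be gathered into one file, which makes it easier to compare a working baseline with a capture taken during an outage. This is only a sketch: `section` and `collect` are hypothetical helper names, and the 50-line tail length is an arbitrary choice.

```shell
#!/bin/sh
# section TITLE CMD [ARGS...]: print a header, then the command's output
# (stdout and stderr). A failing command is noted instead of aborting.
section() {
    title=$1; shift
    printf '===== %s =====\n' "$title"
    "$@" 2>&1 || printf '(command exited with status %s)\n' "$?"
    printf '\n'
}

# collect: run every check from this thread and print the results in order.
collect() {
    section "date"           date
    section "disk space"     df -h
    section "memory"         free -h
    section "FTL status"     sudo service pihole-FTL status -l
    section "local lookup"   nslookup pi.hole localhost
    section "pihole-FTL.log" tail -n 50 /var/log/pihole-FTL.log
    section "pihole.log"     tail -n 50 /var/log/pihole.log
}
```

For example, `collect > "diag-$(date +%Y%m%d-%H%M%S).txt"` writes one timestamped file per run, so a capture from a good period and one from a bad period can be compared with `diff`.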

I'm using a Mac as a client. Its DNS server shows as the Pi-hole.

> **$ cat /etc/resolv.conf**
> **#**
> **# macOS Notice**
> **#**
> **# This file is not consulted for DNS hostname resolution, address**
> **# resolution, or the DNS query routing mechanism used by most**
> **# processes on this system.**
> **#**
> **# To view the DNS configuration used by this system, use:**
> **# scutil --dns**
> **#**
> **# SEE ALSO**
> **# dns-sd(1), scutil(8)**
> **#**
> **# This file is automatically generated.**
> **#**
> **domain local**
> **nameserver 192.168.99.7**

When I'm having the problem, and sometimes when I'm not, this is what I get on the client. I get the same result on the Pi-hole.

> $ nslookup pi.hole 192.168.99.7
> ;; connection timed out; no servers could be reached

Which is interesting since I can ping the server fine, I'm connected via ssh in another session no problem, and I can open the web interface.

When I run it on the Pi-hole using localhost, I get the same thing. Sometimes it works. Sometimes it times out.

When the problem happens, if I stop and start the blocking, it starts working again. I don't see anything in the logs that looks off. The status shows what I'd expect. I'm at a loss. I'll post some of what I'm seeing below, except for pihole.log: there's so much in it that I'm not sure what to post.

> **$** sudo service pihole-FTL status -l
> **●** pihole-FTL.service - LSB: pihole-FTL daemon
> Loaded: loaded (/etc/init.d/pihole-FTL; generated; vendor preset: enabled)
> Active: **active (exited)** since Sat 2018-11-24 12:02:09 EST; 5min ago
> Docs: man:systemd-sysv-generator(8)
> Process: 364 ExecStop=/etc/init.d/pihole-FTL stop **(code=exited, status=1/FAILURE)**
> Process: 430 ExecStart=/etc/init.d/pihole-FTL start (code=exited, status=0/SUCCESS)
> Nov 24 12:02:07 shutyourpihole systemd[1]: Starting LSB: pihole-FTL daemon...
> Nov 24 12:02:07 shutyourpihole pihole-FTL[430]: Not running
> Nov 24 12:02:08 shutyourpihole su[485]: Successful su for pihole by root
> Nov 24 12:02:08 shutyourpihole su[485]: + ??? root:pihole
> Nov 24 12:02:08 shutyourpihole su[485]: pam_unix(su:session): session opened for user pihole by (uid=0)
> Nov 24 12:02:09 shutyourpihole pihole-FTL[430]: FTL started!
> Nov 24 12:02:09 shutyourpihole systemd[1]: Started LSB: pihole-FTL daemon.
> [2018-11-24 12:02:08.236] ########## FTL started! ##########
> [2018-11-24 12:02:08.236] FTL branch: 
> [2018-11-24 12:02:08.236] FTL version: v4.0
> [2018-11-24 12:02:08.236] FTL commit: 8493df4
> [2018-11-24 12:02:08.236] FTL date: 2018-08-05 13:40:30 -0700
> [2018-11-24 12:02:08.236] FTL user: pihole
> [2018-11-24 12:02:08.236] Starting config file parsing (/etc/pihole/pihole-FTL.conf)
> [2018-11-24 12:02:08.237] SOCKET_LISTENING: only local
> [2018-11-24 12:02:08.237] AAAA_QUERY_ANALYSIS: Show AAAA queries
> [2018-11-24 12:02:08.237] MAXDBDAYS: max age for stored queries is 365 days
> [2018-11-24 12:02:08.237] RESOLVE_IPV6: Resolve IPv6 addresses
> [2018-11-24 12:02:08.237] RESOLVE_IPV4: Resolve IPv4 addresses
> [2018-11-24 12:02:08.237] DBINTERVAL: saving to DB file every minute
> [2018-11-24 12:02:08.237] DBFILE: Using /etc/pihole/pihole-FTL.db
> [2018-11-24 12:02:08.237] MAXLOGAGE: Importing up to 24.0 hours of log data
> [2018-11-24 12:02:08.237] PRIVACYLEVEL: Set to 0
> [2018-11-24 12:02:08.237] IGNORE_LOCALHOST: Show queries from localhost
> [2018-11-24 12:02:08.237] BLOCKINGMODE: Null IPs for blocked domains
> [2018-11-24 12:02:08.237] REGEX_DEBUGMODE: Inactive
> [2018-11-24 12:02:08.237] Finished config file parsing
> [2018-11-24 12:02:08.238] Compiled 4 Regex filters and 180 whitelisted domains in 0.8 msec (0 errors)
> [2018-11-24 12:02:08.239] Database successfully initialized
> [2018-11-24 12:02:08.240] Notice: Increasing queries struct size from 0 to 10000
> [2018-11-24 12:02:08.240] Notice: Increasing domains struct size from 0 to 1000
> [2018-11-24 12:02:08.240] Notice: Increasing clients struct size from 0 to 10
> [2018-11-24 12:02:08.240] New forward server: 4.2.2.1 (0/0)
> [2018-11-24 12:02:08.240] Notice: Increasing forwarded struct size from 0 to 4
> [2018-11-24 12:02:08.240] Notice: Increasing overTime struct size from 0 to 100
> [2018-11-24 12:02:08.240] New forward server: 1.0.0.1 (1/4)
> [2018-11-24 12:02:08.240] New forward server: 4.2.2.2 (2/4)
> [2018-11-24 12:02:08.240] Notice: Increasing clients struct size from 10 to 20
> [2018-11-24 12:02:08.244] Notice: Increasing clients struct size from 20 to 30
> [2018-11-24 12:02:08.249] New forward server: 1.1.1.1 (3/4)
> [2018-11-24 12:02:08.252] Notice: Increasing clients struct size from 30 to 40
> [2018-11-24 12:02:08.329] Notice: Increasing queries struct size from 10000 to 20000
> [2018-11-24 12:02:08.345] Notice: Increasing domains struct size from 1000 to 2000
> [2018-11-24 12:02:08.429] Notice: Increasing queries struct size from 20000 to 30000
> [2018-11-24 12:02:08.555] Notice: Increasing queries struct size from 30000 to 40000
> [2018-11-24 12:02:08.824] Notice: Increasing queries struct size from 40000 to 50000
> [2018-11-24 12:02:09.058] Notice: Increasing queries struct size from 50000 to 60000
> [2018-11-24 12:02:09.067] Notice: Increasing overTime struct size from 100 to 200
> [2018-11-24 12:02:09.161] Notice: Increasing queries struct size from 60000 to 70000
> [2018-11-24 12:02:09.192] Notice: Increasing clients struct size from 40 to 50
> [2018-11-24 12:02:09.233] Notice: Increasing domains struct size from 2000 to 3000
> [2018-11-24 12:02:09.273] Imported 67842 queries from the long-term database
> [2018-11-24 12:02:09.273] -> Total DNS queries: 67842
> [2018-11-24 12:02:09.273] -> Cached DNS queries: 31309
> [2018-11-24 12:02:09.273] -> Forwarded DNS queries: 29497
> [2018-11-24 12:02:09.273] -> Exactly blocked DNS queries: 7036
> [2018-11-24 12:02:09.273] -> Unknown DNS queries: 0
> [2018-11-24 12:02:09.273] -> Unique domains: 2134
> [2018-11-24 12:02:09.273] -> Unique clients: 41
> [2018-11-24 12:02:09.274] -> Known forward destinations: 4
> [2018-11-24 12:02:09.274] Successfully accessed setupVars.conf
> [2018-11-24 12:02:09.280] PID of FTL process: 503
> [2018-11-24 12:02:09.280] Listening on port 4711 for incoming IPv4 telnet connections
> [2018-11-24 12:02:09.281] Listening on port 4711 for incoming IPv6 telnet connections
> [2018-11-24 12:02:09.281] Listening on Unix socket
> [2018-11-24 12:02:09.283] Compiled 4 Regex filters and 180 whitelisted domains in 0.8 msec (0 errors)
> [2018-11-24 12:02:09.284] /etc/pihole/black.list: parsed 0 domains (took 0.0 ms)
> [2018-11-24 12:02:18.828] /etc/pihole/gravity.list: parsed 1313282 domains (took 9543.8 ms)

>               total        used        free      shared  buff/cache   available
> Mem:           927M        288M        131M         19M        506M        566M
> Swap:           99M        768K         99M


> Filesystem      Size  Used Avail Use% Mounted on
> /dev/root        15G  5.8G  8.1G  42% /
> devtmpfs        460M     0  460M   0% /dev
> tmpfs           464M     0  464M   0% /dev/shm
> tmpfs           464M   13M  452M   3% /run
> tmpfs           5.0M  4.0K  5.0M   1% /run/lock
> tmpfs           464M     0  464M   0% /sys/fs/cgroup
> /dev/mmcblk0p1   41M   22M   19M  54% /boot
> tmpfs            93M     0   93M   0% /run/user/1000
> tmpfs            93M     0   93M   0% /run/user/999

> Sat 24 Nov 12:10:14 EST 2018

Please re-run your debug log, upload it and post the token here.

TOKEN: s8puf711yk

I actually ran it earlier, and I think the problem was happening at the time because it failed to upload. I still have a copy of that log too, if there's another way to upload it.

Looking through the log that failed to upload, I noticed the following. I don't know whether it's relevant.

*** [ DIAGNOSING ]: Name resolution (IPv4) using a random blocked domain and a known ad-serving domain
[✗] Failed to resolve computer-3ense8.stream via localhost (127.0.0.1)
[✗] Failed to resolve computer-3ense8.stream via Pi-hole (192.168.99.7)
[✓] doubleclick.com is 172.217.2.78 via a remote, public DNS server (8.8.8.8)

*** [ DIAGNOSING ]: Pi-hole processes
[✗] dnsmasq daemon is inactive
[✓] lighttpd daemon is active
[✓] pihole-FTL daemon is active

One more token please. They expire after 48 hours.

So, I had the same thing happen where it was unable to upload the log. I saved a copy, disabled blocking, enabled blocking, then re-ran the generate debug log, same thing.
[✗] There was an error uploading your debug log.

Saved the log. Tried again.



[✓] Your debug token is: khsvy8v213



This is relevant to the problem. The failed name resolution lines show that Pi-hole is not resolving DNS.

The inactive dnsmasq line does not indicate a problem. With the dnsmasq code embedded in pihole-FTL, dnsmasq does not run as a separate process and will show in the debug log as either failed or inactive.
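
One way to confirm which process actually answers DNS is to look at what is listening on port 53. The sketch below assumes the `ss` tool from iproute2 is installed and that its `-p` output names processes in the usual `users:(("name",pid=...,fd=...))` form; `listener_names` is a hypothetical helper. Given the explanation above, the expected name would be `pihole-FTL` rather than `dnsmasq`, but that is an assumption to verify on your own system.

```shell
#!/bin/sh
# listener_names: read `ss -p` style output on stdin and print the unique
# process names found (the first process listed for each socket).
listener_names() {
    grep -o 'users:(("[^"]*"' | sed 's/.*("//; s/"$//' | sort -u
}

# port53_listeners: list the processes listening on TCP or UDP port 53.
# Needs root so that ss can show process names.
port53_listeners() {
    sudo ss -tulnp 'sport = :53' | listener_names
}
```

Running `port53_listeners` during an outage also shows whether anything is still bound to port 53 at all.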

This debug log shows that Pi-hole is working properly again (resolving DNS queries from its LAN IP and its loopback address):

*** [ DIAGNOSING ]: Name resolution (IPv4) using a random blocked domain and a known ad-serving domain
[✓] a4rvp9.sprt.headcustomtennis.com is 0.0.0.0 via localhost (127.0.0.1)
[✓] a4rvp9.sprt.headcustomtennis.com is 0.0.0.0 via Pi-hole (192.168.99.7)
[✓] doubleclick.com is 172.217.3.78 via a remote, public DNS server (8.8.8.8)

Right. I’m trying to figure out what’s causing it to intermittently stop resolving. When it stops, cycling blocking will usually make it start working again. If I leave blocking off, there’s never a problem. I’m about ready to make a backup of the SD card and start over to see if that fixes it.

How are you cycling the blocking?

If you do a rebuild, I would put in a fresh SD card.

To cycle the blocking I’m using the web interface. I click disable permanently. Then Enable.

The disable script restarts FTL, so that appears to be what is fixing the problem.

But what might be causing it?

I don't know. Intermittent problems can be hard to diagnose and fix. Your log looks normal until that one section shows it isn't resolving.

Next time it does this, open your /var/log/pihole-FTL.log and /var/log/pihole.log and look through them for anything that might show a problem.
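
Because the failures are intermittent, it may help to record when resolution stops and to save the end of both logs at that moment. The following is a rough sketch, not a tested monitoring tool: the function names, the `LOG_DIR` location, the 60-second interval, and the 200-line tail are all arbitrary assumptions, and it relies on `nslookup` returning a non-zero exit status on failure.

```shell
#!/bin/sh
LOG_DIR="${LOG_DIR:-$HOME/pihole-watch}"   # hypothetical output directory
INTERVAL="${INTERVAL:-60}"                 # seconds between checks

# resolver_ok: succeeds when Pi-hole answers a lookup on the loopback address.
resolver_ok() { nslookup pi.hole 127.0.0.1 >/dev/null 2>&1; }

# transition PREV CUR: print a timestamped line only when the state changes.
transition() {
    [ "$1" = "$2" ] && return 0
    printf '%s state changed: %s -> %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$1" "$2"
}

# snapshot DEST FILE...: save the last 200 lines of each file into DEST.
snapshot() {
    dest=$1; shift
    mkdir -p "$dest"
    for f in "$@"; do
        tail -n 200 "$f" > "$dest/$(basename "$f").tail" 2>/dev/null
    done
}

# watch_resolver: poll forever, log state changes, snapshot logs on failure.
watch_resolver() {
    mkdir -p "$LOG_DIR"
    prev=unknown
    while :; do
        if resolver_ok; then cur=ok; else cur=failing; fi
        transition "$prev" "$cur" | tee -a "$LOG_DIR/events.log"
        if [ "$cur" = failing ] && [ "$prev" != failing ]; then
            snapshot "$LOG_DIR/$(date +%Y%m%d-%H%M%S)" \
                /var/log/pihole.log /var/log/pihole-FTL.log
        fi
        prev=$cur
        sleep "$INTERVAL"
    done
}
```

Adding a `watch_resolver` call at the end of the script and leaving it running in a spare ssh session on the Pi would produce an `events.log` with the start and end of each outage, plus one directory of log tails per failure to look through afterwards.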