Unstable DNS resolving, Dnsmasq failure on Ubuntu client

Thanks for the improvement to disable logging.
Furthermore, I found out it was a performance issue. I had mplayer running as well. When I killed it, there were no more problems with DNS. When I started mplayer again, the issues returned.
Thanks for thinking with me.

@Eelco / @DL6ER Interesting, I had exactly the same problem and wanted to ask the same question :slight_smile: I had no idea what the reason for this weird behaviour was; I thought it was a problem with my FritzBox or the macOS beta. Last weekend I decided to do a hard reset of my FritzBox and a fresh install of Pi-hole, and voila, the problem was gone.

Well, I thought my problem was gone. But it is back. Very weird behaviour: pinging the server name (kit) several times results in "unknown host". Then a ping to its IP address gives a positive result directly. Then a ping to the server name gives a positive result as well. But then it fails again. Sometimes it doesn't even work again after a positive ping to the IP address. I am thinking about reinstalling my Pi-hole as well.

Yeah, I think it is totally unrelated whether you ping the IP or not. Please log in and provide the output of

uptime

for us. How many cores does your server have?

I will monitor my network; I hope the problem does not come back here too. @Eelco, you describe exactly what I have seen here before. @DL6ER, my Pi-hole is running in an ESXi 6 VM with Ubuntu Server 16.10 (before that it was 16.04 or 14.04, I am not sure), and I have assigned two CPU cores.

Edit: I forgot to say that I often saw the IP 192.0.53.53 in the response when the problem occurred.

I have reinstalled my Raspberry Pi with a freshly downloaded image, installed Pi-hole and added my servers to /etc/hosts:

pi@kit:~ $ cat /etc/hosts
127.0.0.1	localhost
::1		localhost ip6-localhost ip6-loopback
ff02::1		ip6-allnodes
ff02::2		ip6-allrouters

127.0.1.1	kit

192.168.1.1     router.aesset            
192.168.1.9     win8.aesset
192.168.1.15    kit.aesset
192.168.1.103   nas3.aesset 
192.168.1.105   tank.aesset
192.168.1.106   bambu.aesset
192.168.1.120   eelco.aesset
192.168.1.121   bas.aesset
192.168.1.150   sonos.aesset

And still the problem occurs. And you're right, it is not related to the ping. It is just a symptom; see the following log, where the ping succeeds, then fails, then succeeds again:

$ ping kit
PING kit.aesset (192.168.1.15) 56(84) bytes of data.
64 bytes from kit.aesset (192.168.1.15): icmp_seq=1 ttl=64 time=0.397 ms
64 bytes from kit.aesset (192.168.1.15): icmp_seq=2 ttl=64 time=0.481 ms
^C
--- kit.aesset ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.397/0.439/0.481/0.042 ms
$ ping kit
ping: unknown host kit
$ ping kit.aesset
ping: unknown host kit.aesset
$ ping kit.aesset
ping: unknown host kit.aesset
$ ping kit
ping: unknown host kit
$ ping sonos
ping: unknown host sonos
$ ping kit.aesset
ping: unknown host kit.aesset
$ ping 192.168.1.15
PING 192.168.1.15 (192.168.1.15) 56(84) bytes of data.
64 bytes from 192.168.1.15: icmp_seq=1 ttl=64 time=0.444 ms
64 bytes from 192.168.1.15: icmp_seq=2 ttl=64 time=0.494 ms
^C
--- 192.168.1.15 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.444/0.469/0.494/0.025 ms
$ ping kit
PING kit.aesset (192.168.1.15) 56(84) bytes of data.
64 bytes from kit.aesset (192.168.1.15): icmp_seq=1 ttl=64 time=0.437 ms
64 bytes from kit.aesset (192.168.1.15): icmp_seq=2 ttl=64 time=0.554 ms

Output of uptime:
pi@kit:~ $ uptime
12:28:25 up 1:41, 2 users, load average: 0,00, 0,01, 0,00

Okay, so this means that your server is not (at least not computationally) overloaded.

Can you run the following command on the device you also used for doing the pings?

dig kit
dig kit.aesset

Maybe multiple times and look if there is a difference.

Running dig kit several times gives results like this (no change over time):

$ dig kit

; <<>> DiG 9.9.5-3ubuntu0.13-Ubuntu <<>> kit
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5228
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;kit.				IN	A

;; ANSWER SECTION:
kit.			300	IN	A	127.0.1.1

;; Query time: 1 msec
;; SERVER: 127.0.1.1#53(127.0.1.1)
;; WHEN: Tue Mar 14 13:45:45 CET 2017
;; MSG SIZE  rcvd: 48

For dig kit.aesset:

$ dig kit.aesset

; <<>> DiG 9.9.5-3ubuntu0.13-Ubuntu <<>> kit.aesset
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 61089
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;kit.aesset.			IN	A

;; ANSWER SECTION:
kit.aesset.		300	IN	A	192.168.1.15

;; Query time: 2 msec
;; SERVER: 127.0.1.1#53(127.0.1.1)
;; WHEN: Tue Mar 14 13:47:24 CET 2017
;; MSG SIZE  rcvd: 55

But after waiting a while until ping kit fails, dig kit gives the following:

$ dig kit

; <<>> DiG 9.9.5-3ubuntu0.13-Ubuntu <<>> kit
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 40065
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;kit.				IN	A

**;; AUTHORITY SECTION:**
**.			2935	IN	SOA	a.root-servers.net. nstld.verisign-grs.com. 2017031400 1800 900 604800 86400**

;; Query time: 18 msec
;; SERVER: 127.0.1.1#53(127.0.1.1)
;; WHEN: Tue Mar 14 13:49:46 CET 2017
;; MSG SIZE  rcvd: 96

Aha!

What operating system is this computer running on? It is not exclusively using the Pi-hole but might also ask other upstream servers from time to time which then answer with NXDOMAIN, i.e. unknown host.
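The working and failing dig runs above differ in the status field of the header (NOERROR versus NXDOMAIN). A minimal sketch for pulling out that field so repeated queries can be tallied; the function name `dig_status` and the loop parameters are illustrative and not from this thread:

```shell
# Hypothetical helper: print the "status:" value from dig output read on stdin.
dig_status() {
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p'
}

# Example use (needs dig and a reachable resolver, so it is left commented out):
# for i in $(seq 1 20); do dig kit | dig_status; sleep 1; done | sort | uniq -c
```

If the tally shows a mix of NOERROR and NXDOMAIN for the same name, that would be consistent with the queries not always being answered by the same server.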

Are you using NetworkManager?

Ubuntu 14.04. And yes NetworkManager is running, by default as I understand. Is that the cause? Why and how, then?

~$ ps -ef|grep dnsmasq
nobody    2227   997  0 08:12 ?        00:00:00 /usr/sbin/dnsmasq --no-resolv --keep-in-foreground --no-hosts --bind-interfaces --pid-file=/run/sendsigs.omit.d/network-manager.dnsmasq.pid --listen-address=127.0.1.1 --conf-file=/var/run/NetworkManager/dnsmasq.conf --cache-size=0 --proxy-dnssec --enable-dbus=org.freedesktop.NetworkManager.dnsmasq --conf-dir=/etc/NetworkManager/dnsmasq.d
eelco    18315 14896  0 14:03 pts/16   00:00:00 grep --color=auto dnsmasq

In addition to the above: after several attempts on my Android phone and iPad, it seems that http://kit/admin does not work but http://kit.aesset/admin does. I don't know if this confirms your idea about the cause.

@DL6ER, just to let you know that so far it looks very stable today. I didn't change anything, except that I ran the Pi-hole update. Thanks so far, and I'll let you know if things change. I'm still curious about your thoughts when you had the Aha moment.

Unfortunately, still not very stable. See below.

$ ping kit
ping: unknown host kit
$ ping kit.aesset
ping: unknown host kit.aesset
$ ping kit.aesset
ping: unknown host kit.aesset
$ ping kit.aesset
ping: unknown host kit.aesset
$ ping kit.aesset
ping: unknown host kit.aesset
$ ping kit.aesset
ping: unknown host kit.aesset
$ ping kit
PING kit.aesset (192.168.1.15) 56(84) bytes of data.
64 bytes from kit (192.168.1.15): icmp_seq=1 ttl=64 time=0.456 ms
64 bytes from kit (192.168.1.15): icmp_seq=2 ttl=64 time=0.568 ms
64 bytes from kit (192.168.1.15): icmp_seq=3 ttl=64 time=0.504 ms

And after that it failed again, and after several attempts it responded again. In the browser I even got an error page.

Does anyone have a suggestion as to the cause of this? I have no other problems on my network.

That is why we (internally) often call NetworkManager "NetworkMangler" instead...

Please try the following:

Edit /etc/NetworkManager/NetworkManager.conf and comment out the dnsmasq line so that it reads:

# dns=dnsmasq

Then restart NetworkManager:

sudo restart network-manager
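The manual edit above can also be scripted. This is only a sketch: the function name `disable_nm_dnsmasq` is made up for illustration, and it assumes the active line starts with `dns=dnsmasq` (optionally indented), as in a default Ubuntu 14.04 setup.

```shell
# Hypothetical helper: comment out an active "dns=dnsmasq" line in the given file.
# sed keeps a backup copy next to the file with a .bak suffix.
disable_nm_dnsmasq() {
    sed -i.bak 's/^[[:space:]]*dns=dnsmasq/#&/' "$1"
}

# Example use on the real file (needs root, so it is left commented out):
# sudo sed -i.bak 's/^[[:space:]]*dns=dnsmasq/#&/' /etc/NetworkManager/NetworkManager.conf
# sudo restart network-manager
```

Lines that are already commented out are left alone, so running it twice does no harm.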

Thanks a lot @DL6ER! I found out that another client (a MacBook) had a very stable DNS service, so it had to be something local on my Ubuntu client. So far, it looks like disabling dnsmasq in NetworkManager solved the issue.

Can you maybe tell me what the consequences of disabling dnsmasq locally are?

You won't have local caching. However, this is no issue at all, since the caching will now be done for you by the Pi-hole. Even on a Raspberry Pi version 1 the answer should arrive at your client within at most 5 milliseconds, so I highly doubt you will ever be able to notice the difference.

It is just not good to have no DNS caching at all, which is why Ubuntu enables it by default (since there isn't a Pi-hole in every household).


For a while I was happy, but today I was hit out of nowhere by the same problem again: local domain names are no longer resolved. A ping gives me the following output:

ping borg
PING borg.fritz.box (127.0.53.53): 56 data bytes 

My client runs macOS 10.12.4 (I have also tested it with Arch on the same client, with the same result). I have changed nothing in my configuration. It is pretty weird.

EDIT: After a FritzBox firmware update it works again - the question is: for how long?

Might I ask this question after nearly one week?

Sure @DL6ER, it seems I made a very stupid mistake: I had set up a second upstream DNS server, which led to this weird, inconsistent behaviour :flushed:.
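One quick way to spot an extra DNS server from the client side, assuming it is handed out to the client and therefore shows up in a resolv.conf-style file (this may not hold when a local forwarder such as the NetworkManager dnsmasq sits in between, or on macOS). The function name `list_nameservers` is illustrative only:

```shell
# Hypothetical helper: print each nameserver address from a resolv.conf-style file.
# Defaults to /etc/resolv.conf when no file is given.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Example use: more than one distinct address means the client may not
# always ask the Pi-hole.
# list_nameservers | sort -u
```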