Pihole on ubuntu 18.04 with unbound fails after some time

I installed pihole with the script and followed the documentation to get unbound running. It does work but after some time I get a timeout when I try to reach websites. A restart of unbound does help. What I did so far was changing the nameserver in /etc/netplan/01-netcfg.yaml. With and without these changes I get this behaviour. Any hints what I am missing?

This is what I get when I start unbound:

Aug 21 10:07:17 user package-helper[28528]: /var/lib/unbound/root.key has content
Aug 21 10:07:17 user unbound[28557]: [1566374837] unbound[28557:0] warning: so-rcvbuf 1048576 was not granted. Got 425984. To fix: start with root permissions(linux) or sysctl bigger net.core.rmem_max(linux) or kern.ipc.maxsockbuf(bsd) values.
Aug 21 10:07:17 user unbound: [28557:0] info: start of service (unbound 1.6.7).

Debug Token:

Weirdly enough I get:

[âś—]  There was an error uploading your debug log.
   * Please try again or contact the Pi-hole team for assistance.
   * A local copy of the debug log can be found at: /var/log/pihole_debug.log

This will temporarily reset the nameserver on the Pi to bypass Pi-Hole DNS.

sudo nano /etc/resolv.conf

edit nameserver 127.0.0.1 to nameserver 9.9.9.9 or your preferred third party DNS service, save and exit

Run pihole -d and upload the debug log

Tried that as well with the same outcome.

Has this install ever worked reliably? You can also post the log manually as the error explains from /var/log/pihole_debug.log

It has. When I use it with google's DNS (chosen in the web interface) I don't get timeouts. When switching to unbound after some time resolving doesn't work.

I will have a look at the debug.log - I don't feel to comfortable to just upload it.

You can PM it to me.

1 Like

What command are you using to start unbound? You have unresolved errors showing.

What are the outputs of:

unbound -h

sudo service unbound status

You can increase the verbosity of unbound to log more details - this may help you with your troubleshooting. In /etc/unbound/unbound.conf.d/pi-hole.conf, change verbosity to a higher number as described in the unbound documentation and restart unbound with sudo service unbound restart

1 Like

systemctl start unbound

unbound -h
usage:  unbound [options]
        start unbound daemon DNS resolver.
-h      this help
-c file config file to read instead of /etc/unbound/unbound.conf
        file format is described in unbound.conf(5).
-d      do not fork into the background.
-v      verbose (more times to increase verbosity)
Version 1.6.7
linked libs: libevent 2.1.8-stable (it uses epoll), OpenSSL 1.1.1  11 Sep 2018
linked modules: dns64 python subnetcache respip validator iterator
BSD licensed, see LICENSE in source package for details.
Report bugs to unbound-bugs@nlnetlabs.nl

and

systemctl status unbound
â—Ź unbound.service - Unbound DNS server
   Loaded: loaded (/lib/systemd/system/unbound.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2019-08-21 12:15:48 CEST; 2h 7min ago
     Docs: man:unbound(8)
 Main PID: 21707 (unbound)
    Tasks: 1 (limit: 4915)
   CGroup: /system.slice/unbound.service
           └─21707 /usr/sbin/unbound -d

Aug 21 12:15:47 homeserver systemd[1]: Starting Unbound DNS server...
Aug 21 12:15:48 homeserver package-helper[21688]: /var/lib/unbound/root.key has content
Aug 21 12:15:48 homeserver package-helper[21688]: success: the anchor is ok
Aug 21 12:15:48 homeserver unbound[21707]: [1566382548] unbound[21707:0] warning: so-rcvbuf 1048576 was not granted. Got 425984. To fix: st
Aug 21 12:15:48 homeserver unbound[21707]: [21707:0] info: start of service (unbound 1.6.7).
Aug 21 12:15:48 homeserver systemd[1]: Started Unbound DNS server.

I changed the verbosity to 3 for now and will observe.

Your debug log shows that your Pi-Hole DNS queries are going to Google and not unbound:

    PIHOLE_DNS_1=8.8.8.8
    PIHOLE_DNS_2=8.8.4.4

When you shift it back to unbound, also disable DNSSEC in Pi-Hole as well, since unbound is performing that function.

Ah yes - sorry. I sent you the wrong debug. I will wait until unbound needs a restart and report back.

So it happened again. Here are the last lines of unbound syslog:

Aug 21 16:36:57 homeserver unbound: [10403:0] info: finishing processing for twitch.map.fastly.net. AAAA IN
Aug 21 16:36:57 homeserver unbound: [10403:0] debug: validator[module 1] operate: extstate:module_wait_module event:module_event_moddone
Aug 21 16:36:57 homeserver unbound: [10403:0] info: validator operate: query twitch.map.fastly.net. AAAA IN
Aug 21 16:36:57 homeserver unbound: [10403:0] debug: subnet[module 0] operate: extstate:module_wait_module event:module_event_moddone
Aug 21 16:36:57 homeserver unbound: [10403:0] info: subnet operate: query twitch.map.fastly.net. AAAA IN
Aug 21 16:36:57 homeserver unbound: [10403:0] debug: cache memory msg=521496 rrset=1585202 infra=432495 val=221762 subnet=74488

Nothing special here. It just doesn't log anything after it although there should be requests.

I sent you my pihole_debug.log @jfb.

Given your problems with this install of unbound, I would increase verbosity to 5 and see if there is more detail.

What is driving the restarts of unbound? Does it just fail to resolve, does it crash on error?

It is not failing.

# systemctl status unbound
â—Ź unbound.service - Unbound DNS server
   Loaded: loaded (/lib/systemd/system/unbound.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2019-08-21 14:26:52 CEST; 2h 17min ago
     Docs: man:unbound(8)
  Process: 10378 ExecStartPre=/usr/lib/unbound/package-helper root_trust_anchor_update (code=exited, status=0/SUCCESS)
  Process: 10366 ExecStartPre=/usr/lib/unbound/package-helper chroot_setup (code=exited, status=0/SUCCESS)
 Main PID: 10403 (unbound)
    Tasks: 1 (limit: 4915)
   CGroup: /system.slice/unbound.service
           └─10403 /usr/sbin/unbound -d

Aug 21 16:42:12 homeserver unbound[10403]: [10403:0] debug: subnet[module 0] operate: extstate:module_wait_module event:module_event_moddon
Aug 21 16:42:12 homeserver unbound[10403]: [10403:0] info: subnet operate: query org. DNSKEY IN
Aug 21 16:42:12 homeserver unbound[10403]: [10403:0] info: validated DNSKEY org. DNSKEY IN
Aug 21 16:42:12 homeserver unbound[10403]: [10403:0] debug: validator[module 1] operate: extstate:module_wait_subquery event:module_event_p
Aug 21 16:42:12 homeserver unbound[10403]: [10403:0] info: validator operate: query textsecure-service.whispersystems.org. A IN
Aug 21 16:42:12 homeserver unbound[10403]: [10403:0] info: NSEC3s for the referral proved no DS.
Aug 21 16:42:12 homeserver unbound[10403]: [10403:0] info: Verified that unsigned response is INSECURE
Aug 21 16:42:12 homeserver unbound[10403]: [10403:0] debug: subnet[module 0] operate: extstate:module_wait_module event:module_event_moddon
Aug 21 16:42:12 homeserver unbound[10403]: [10403:0] info: subnet operate: query textsecure-service.whispersystems.org. A IN
Aug 21 16:42:12 homeserver unbound[10403]: [10403:0] debug: cache memory msg=521740 rrset=1585436 infra=432804 val=221762 subnet=74488

It will help if I restart unbound though. I will do so with the increased log level.

Perhaps I don't understand the problem. When you get timeouts trying to reach websites, is the problem that the client is not getting an IP back from unbound/Pi-Hole, or is the IP received and then there is a delay for the client to be able to load that IP? These are two different things.

After some period of time I can't access any website:

$ dig github.com

; <<>> DiG 9.11.5-P1-1ubuntu2.5-Ubuntu <<>> github.com
;; global options: +cmd
;; connection timed out; no servers could be reached

This is what I get on a client. The only thing that helps is to restart unbound service although the service is still running and doesn't show any errors. It is madly weird. I suspected systemd-resolv to be the cause of problems but I couldn't anything that was confirming my theory.

The first place I would look is the connection between the client and Pi-Hole. The only traffic going to unbound in your configuration is traffic from Pi-Hole.

/var/log/pihole.log should show all the transaction history for a DNS request. That is the query from the client to Pi-Hole, the forward of the query from Pi-Hole to 127.0.0.1 (unbound), the reply from unbound.

Example: dig cnn.com from my client at IP 135 and the following transactions appear in the log:

Aug 21 09:56:05 dnsmasq[12972]: query[A] cnn.com from 192.168.0.135
Aug 21 09:56:05 dnsmasq[12972]: forwarded cnn.com to 127.0.0.1
Aug 21 09:56:05 dnsmasq[12972]: reply cnn.com is 151.101.193.67
Aug 21 09:56:05 dnsmasq[12972]: reply cnn.com is 151.101.1.67
Aug 21 09:56:05 dnsmasq[12972]: reply cnn.com is 151.101.129.67
Aug 21 09:56:05 dnsmasq[12972]: reply cnn.com is 151.101.65.67

I will check pihole.log once the problems occurs again because I didn't write down the time it happened. I think that request still got forwarded to 127.0.0.1 which would support the theory that something weird is happening with unbound (restarting seems to help).

When you run this command from the client, you should see a matching query string in the pihole log. You want to verify that the DNS request is reaching the Pi-Hole.

Okay, here we go again. This is the output on the client:

$ dig github.com

; <<>> DiG 9.11.5-P1-1ubuntu2.5-Ubuntu <<>> github.com
;; global options: +cmd
;; connection timed out; no servers could be reached

Pi-hole is forwarding the request made from the client:

Aug 21 18:50:32 dnsmasq[11131]: query[A] github.com from 192.168.178.110
Aug 21 18:50:32 dnsmasq[11131]: forwarded github.com to 127.0.0.1

even on the host I can't reach unbound anymore:

dig pi-hole.net @127.0.0.1 -p 5353

; <<>> DiG 9.11.3-1ubuntu1.8-Ubuntu <<>> pi-hole.net @127.0.0.1 -p 5353
;; global options: +cmd
;; connection timed out; no servers could be reached

Here you can find the last entries of unbound. I think I need to head to the unbound mailing list.

You might be right. One easy thing to do would be to uninstall unbound and reinstall it. I was having some problems with an unbound install yesterday and that was the solution. In my case, the root key wasn't playing well.

1 Like