Unbound frequent restarts

I know this is an old issue, but I am currently an issue where unbound on the pihole restarts every few hours due to the router advertisments. I tried setting a static IPV6 address for the pihole on pfsense but that didn't seem to help.

Did anyone have luck resolving this?

May 27 03:19:58 raspberrypi dhcpcd[462]: eth0: part of Router Advertisement expired
May 27 03:19:59 raspberrypi unbound: [536:0] info: service stopped (unbound 1.9.0).
May 27 03:19:59 raspberrypi unbound: [536:0] info: start of service (unbound 1.9.0).
May 27 03:19:59 raspberrypi unbound: [536:0] info: service stopped (unbound 1.9.0).
May 27 03:19:59 raspberrypi unbound: [536:0] info: start of service (unbound 1.9.0).

Unfortunately, I couldn't resolve it. I therefore uninstalled unbound from my PiHole Rpi4 and set it up on a Rpi Zero W. Works well.

What does your unbound configuration look like?

sudo grep -v '#\|^$' -R /etc/unbound/unbound.conf*
/etc/unbound/unbound.conf:include: "/etc/unbound/unbound.conf.d/*.conf"
/etc/unbound/unbound.conf.d/pi-hole.conf:server:
/etc/unbound/unbound.conf.d/pi-hole.conf:    verbosity: 0
/etc/unbound/unbound.conf.d/pi-hole.conf:    interface: 127.0.0.1
/etc/unbound/unbound.conf.d/pi-hole.conf:    port: 5335
/etc/unbound/unbound.conf.d/pi-hole.conf:    do-ip4: yes
/etc/unbound/unbound.conf.d/pi-hole.conf:    do-udp: yes
/etc/unbound/unbound.conf.d/pi-hole.conf:    do-tcp: yes
/etc/unbound/unbound.conf.d/pi-hole.conf:    do-ip6: no
/etc/unbound/unbound.conf.d/pi-hole.conf:    prefer-ip6: no
/etc/unbound/unbound.conf.d/pi-hole.conf:    harden-glue: yes
/etc/unbound/unbound.conf.d/pi-hole.conf:    harden-dnssec-stripped: yes
/etc/unbound/unbound.conf.d/pi-hole.conf:    use-caps-for-id: no
/etc/unbound/unbound.conf.d/pi-hole.conf:    edns-buffer-size: 1472
/etc/unbound/unbound.conf.d/pi-hole.conf:    prefetch: yes
/etc/unbound/unbound.conf.d/pi-hole.conf:    num-threads: 1
/etc/unbound/unbound.conf.d/pi-hole.conf:    so-rcvbuf: 1m
/etc/unbound/unbound.conf.d/pi-hole.conf:    private-address: 192.168.0.0/16
/etc/unbound/unbound.conf.d/pi-hole.conf:    private-address: 169.254.0.0/16
/etc/unbound/unbound.conf.d/pi-hole.conf:    private-address: 172.16.0.0/12
/etc/unbound/unbound.conf.d/pi-hole.conf:    private-address: 10.0.0.0/8
/etc/unbound/unbound.conf.d/pi-hole.conf:    private-address: fd00::/8
/etc/unbound/unbound.conf.d/pi-hole.conf:    private-address: fe80::/10
/etc/unbound/unbound.conf.d/pi-hole.conf:    remote-control:
/etc/unbound/unbound.conf.d/pi-hole.conf:    control-enable: yes
/etc/unbound/unbound.conf.d/root-auto-trust-anchor-file.conf:server:
/etc/unbound/unbound.conf.d/root-auto-trust-anchor-file.conf:    auto-trust-anchor-file: "/var/lib/unbound/root.key"
/etc/unbound/unbound.conf.d/qname-minimisation.conf:server:
/etc/unbound/unbound.conf.d/qname-minimisation.conf:    qname-minimisation: yes

Here is an example of what I see in the system log:

May 27 14:05:58 raspberrypi dhcpcd[463]: eth0: part of Router Advertisement expired
May 27 14:05:58 raspberrypi unbound: [525:0] info: service stopped (unbound 1.9.0).
May 27 14:05:58 raspberrypi unbound: [525:0] info: start of service (unbound 1.9.0).
May 27 14:05:58 raspberrypi unbound: [525:0] info: service stopped (unbound 1.9.0).
May 27 14:05:58 raspberrypi unbound: [525:0] info: start of service (unbound 1.9.0).

Thank you btw!

Don't thank me too soon. :wink:
I am unable to reproduce your issue in my network.

My hope was that your config would look similar to doncarajo's, which binds to IPv4 and IPv6 localhost alike, in an attempt to use their observation of Turning off "Enable IPv6 support (SLAAC + RA)" solving their issue to support a theory that eliminating IPv6 from unbound's config could also avoid the issue.

However, your configuration is already limited to IPv4's 127.0.0.1 interface, so that theory of mine isn't working.

I don't think this is a Pi-hole issue:
unbound stops and restarts gracefully, so somethng seems to send a regular kill -HUP to unbound, presumably on change of an IPv6 RA - probably an IPv6 prefix change. Both such a change of a prefix as well as unbound reloading to adopt to such a change seem reasonable at first glance (but then I vaguely recall doncarajo mentioning that he's got a fixed IPv6 prefix).
It would seem logical to attempt to influence the frequency of RAs, which may be configurable via your router, but may also depend on a client's configuration or even on your ISP, in which case it would be out of your control.

This is just me speculating, however.

This will likely be quite hard to analyse, especially if we fail to reproduce it.

You could start by trying to find out which process is signalling unbound to stop and restart.
Depending on how that turns out, we may quickly reach the point where we cannot provide further meaningful assistance from a Pi-hole perspective.
It may be advantageous to consider to consult other support channels as well, specifically unbound's mailing list, since that is the software that exihibits unwanted behaviour.

1 Like

Okay, thank you. I appreciate your help regardless.

Is it possible for that kill -HUP command to come from the gateway on my network (pfsense)?

No, that definitely has to be a process running on the same machine.

Likely candidates would be any networking tools, including your DHCP client, and I wouldn't preclude Pi-hole itself just yet, though I am unaware it would do such a thing. I'm checking with development to be safe.

EDIT:
Just to complete the picture:
Do you use Pi-hole's DHCP server (like doncarajo does)?

1 Like

No, DHCP is managed by pfsense.

Not sure if this is relevant but I disabled the DNS resolver completely on PFsense and hand out the pihole IPs to clients on my network.

That makes Pi-hole's involvement even less likely, since it is only your router/pfsense that would trigger RAs in your network then.

1 Like

Yeah, it is very confusing. I tried increasing the lease time for IPV6 clients and setting a static IPV6 address for the pihole (on the pfsense box) to see of that would help. So far the static address did not. Waiting to see if unbound restarts after changing the lease time.

Would reformatting my sd card and redoing the entire pihole help with this, or is it like you said something caused by the router

Have some pointers.
Check the systemd journals around the time of restarting:

journalctl -u unbound

Check kernel ring buffer for OOM occurrences:

dmesg -T | grep -i oom

And check the kernel ring buffer in general for errors/fails etc:

dmesg -T | less

It's not caused by your router.
Your router is just triggering the RAs (as it should, and can be expected to).
Whatever is causing unbound to reload may in turn be triggered by processing such an RA (which is currently yet another assumption).

There's currently no reason to assume you are suffering from incomplete or misconfigured software packages, so I guess a complete reinstallation wouldn't change anything.

1 Like

Results from journalctl -u unbound

-- Logs begin at Thu 2021-05-27 16:53:31 BST, end at Thu 2021-05-27 18:30:01 BST. --
May 27 16:53:48 raspberrypi systemd[1]: Starting Unbound DNS server...
May 27 16:53:50 raspberrypi package-helper[496]: /var/lib/unbound/root.key has content
May 27 16:53:50 raspberrypi package-helper[496]: success: the anchor is ok
May 27 16:53:50 raspberrypi unbound[528]: [528:0] info: start of service (unbound 1.9.0).
May 27 16:53:50 raspberrypi systemd[1]: Started Unbound DNS server.

No results for second command dmesg -T | grep -i oom

dmesg -T | less:

[Thu May 27 16:53:51 2021] SMSC LAN8700 usb-001:003:01: attached PHY driver [SMSC LAN8700] (mii_bus:phy_addr=usb-001:003:01, irq=POLL)
[Thu May 27 16:53:51 2021] smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[Thu May 27 16:53:51 2021] smsc95xx 1-1.1:1.0 eth0: Link is Down
[Thu May 27 16:53:55 2021] smsc95xx 1-1.1:1.0 eth0: Link is Up - 100Mbps/Full - flow control off
[Thu May 27 16:53:55 2021] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[Thu May 27 16:53:57 2021] ICMPv6: process `dhcpcd' is using deprecated sysctl (syscall) net.ipv6.neigh.eth0.retrans_time - use net.ipv6.neigh.eth0.retrans_time_ms instead

Are you sure you run the journalctl command after you noticed unbound being restarted?

Was that link down/up at about the same time as unbound was restarted?

1 Like

My apologies, I jumped the gun a little. I will keep an eye out for the next restart and run those commands again.

@deHakkelaar

1 Like

You can tail the journal logs live:

journalctl -fu unbound

EDIT: and also the kernel ring buffer:

dmesg -Tw

EDIT2: ow that wasnt intentinal profanity towards unbound with the "fu" :smiley:

2 Likes

I am just thinking, should I bump the lease time back down to 7200 seconds (default) on pfsense? I increased it to 24 hours in an attempt the restarts from happening.

Something bout patience ..... :wink:

1 Like

Okay so something happened.

For the results of journalctl -fu unbound

-- Logs begin at Thu 2021-05-27 16:53:31 BST. --
May 27 16:53:48 raspberrypi systemd[1]: Starting Unbound DNS server...
May 27 16:53:50 raspberrypi package-helper[496]: /var/lib/unbound/root.key has content
May 27 16:53:50 raspberrypi package-helper[496]: success: the anchor is ok
May 27 16:53:50 raspberrypi unbound[528]: [528:0] info: start of service (unbound 1.9.0).
May 27 16:53:50 raspberrypi systemd[1]: Started Unbound DNS server.
May 27 19:14:49 raspberrypi unbound[528]: [528:0] info: service stopped (unbound 1.9.0).
May 27 19:14:49 raspberrypi unbound[528]: [528:0] info: start of service (unbound 1.9.0).
May 27 19:14:49 raspberrypi unbound[528]: [528:0] info: service stopped (unbound 1.9.0).
May 27 19:14:49 raspberrypi unbound[528]: [528:0] info: start of service (unbound 1.9.0).

Running two terminal windows that are ssh'd to the pihole. The other terminal instance was running dmesg -Tw, but nothing new it seems:

[Thu May 27 16:53:45 2021] bcm2835-codec bcm2835-codec: Loaded V4L2 isp
[Thu May 27 16:53:47 2021] Under-voltage detected! (0x00050000)
[Thu May 27 16:53:48 2021] random: crng init done
[Thu May 27 16:53:48 2021] random: 7 urandom warning(s) missed due to ratelimiting
[Thu May 27 16:53:48 2021] 8021q: 802.1Q VLAN Support v1.8
[Thu May 27 16:53:49 2021] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[Thu May 27 16:53:49 2021] Adding 102396k swap on /var/swap.  Priority:-2 extents:1 across:102396k SSFS
[Thu May 27 16:53:49 2021] Voltage normalised (0x00000000)
[Thu May 27 16:53:51 2021] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[Thu May 27 16:53:51 2021] SMSC LAN8700 usb-001:003:01: attached PHY driver [SMSC LAN8700] (mii_bus:phy_addr=usb-001:003:01, irq=POLL)
[Thu May 27 16:53:51 2021] smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[Thu May 27 16:53:51 2021] smsc95xx 1-1.1:1.0 eth0: Link is Down
[Thu May 27 16:53:55 2021] smsc95xx 1-1.1:1.0 eth0: Link is Up - 100Mbps/Full - flow control off
[Thu May 27 16:53:55 2021] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[Thu May 27 16:53:57 2021] ICMPv6: process `dhcpcd' is using deprecated sysctl (syscall) net.ipv6.neigh.eth0.retrans_time - use net.ipv6.neigh.eth0.retrans_time_ms instead

Probably not relevant, but my network has two pihole's. The one we are working with now, and another that is also running unbound for redundancy.

The results of running journalctl -fu unbound on pihole-2:

-- Logs begin at Sat 2021-05-22 07:21:25 BST. --
May 27 14:54:33 raspberrypi-2 unbound[431]: [431:0] info: service stopped (unbound 1.9.0).
May 27 14:54:33 raspberrypi-2 unbound[431]: [431:0] info: start of service (unbound 1.9.0).
May 27 16:49:43 raspberrypi-2 unbound[431]: [431:0] info: service stopped (unbound 1.9.0).
May 27 16:49:43 raspberrypi-2 unbound[431]: [431:0] info: start of service (unbound 1.9.0).
May 27 16:49:43 raspberrypi-2 unbound[431]: [431:0] info: service stopped (unbound 1.9.0).
May 27 16:49:43 raspberrypi-2 unbound[431]: [431:0] info: start of service (unbound 1.9.0).
May 27 19:14:49 raspberrypi-2 unbound[431]: [431:0] info: service stopped (unbound 1.9.0).
May 27 19:14:49 raspberrypi-2 unbound[431]: [431:0] info: start of service (unbound 1.9.0).
May 27 19:14:49 raspberrypi-2 unbound[431]: [431:0] info: service stopped (unbound 1.9.0).
May 27 19:14:49 raspberrypi-2 unbound[431]: [431:0] info: start of service (unbound 1.9.0).

Check if something else is happening at the same time that might cascade into what your experiencing:

journalctl