Yes, so here are the timestamps for a few of the recent stoppages:
May 29 12:37:21 raspberrypi-2 unbound[26603]: [26603:0] info: service stopped (unbound 1.9.0).
May 29 12:37:29 raspberrypi-2 unbound[26603]: [26603:0] info: start of service (unbound 1.9.0).
May 29 12:37:29 raspberrypi-2 unbound[26603]: [26603:0] info: service stopped (unbound 1.9.0).
May 29 12:37:37 raspberrypi-2 unbound[26603]: [26603:0] info: start of service (unbound 1.9.0).
And if I stat the files, two of the three were changed:
Continuing to investigate...
resolvconf is a script. which resolvconf returns /usr/sbin/resolvconf
The script has a comment regarding SIGHUP. Below that comment, there is a function that builds a RESTARTCMD, I assume the restart commands are build from the content of /etc/resolvconf.conf and eventually executed. The script also rewrites the files, referred in /etc/resolvconf.conf.
I don't know what OS your running, and if there is a different script for each OS.
can you run sha1sum /usr/sbin/resolvconf (modify the location if required - which resolvconf). For my OS ( Raspberry Pi OS Lite, Release date: May 7th 2021), this returns:
this to confirm we are looking at the same script / possible problem.
I'm considering to comment out the lines, for dnsmasq and unbound, in /etc/resolvconf.conf, this to verify if the unexpected stop events will disappear. I've already have an IPv6check script, running with cron, this to detect IPv6 address changes on the pi, hardly no effort to add sed instructions to modify the files and send a SIGHUP signal (only when the address has really changed).
Would however like the advice of DL6ER and Bucking_Horn...
A bit off-topic (because not targeting unbound) but it makes me wonder if there should be a "reread /etc/hosts but keep cache otherwise" feature in dnsmasq. Maybe I'll suggest this to the mailing list and check what they say.
All my Ipv6 addresses, including temporary addresses are kept (added automatically) in files in a specific directory, processed by dnsmasq, as soon as a file changes by using 'hostsdir=/etc/pfsense'. You can clearly see the new / changed entries are picked up by dnsmasq, without restart, by looking at the logfile.
you could consider moving (v6) the pihole files in a specific directory and use the hostdir= directive to achieve the same for /etc/pihole/local.list and /etc/pihole/custom.list as opposed to using (/etc/dnsmasq.d/01-pihole.conf)
On topic: the unbound 'stopped' messages do not match the time the file(s) (a new IPv6 address added to a file) is changed and dnsmasq reads it. The global dynamic mngtmpaddr, the address in /var/cache/unbound/resolvconf_resolvers.conf, see here, never changes, unless the ISP kicks the modem.
That isn't in my resolvconf, so different versions on different OS, as I already suspected...
Also, would blowing up my pihole and starting over with a different OS like dietpi, would that potentially resolve the issue? Or is modifying the resolveconf.conf file something more likely to help?
Can we safely say that this is something not related to pfsense or pihole at this point? For example, if I were to use opnsense would this issue still exist?
Thanks to everyone for you continued support. The devotion and generosity with time is unmatched in this community.
No, you cant supply the #5335 port in that dhcpcd.conf file.
Either you just supply the localhost address 127.0.0.1 (without the port) which will be pihole-FTL listening on default port 53:
Another restart happened, so I am not sure if this is because I didn't follow through with the rest of the steps or because this service wasn't the one actually causing the restarts.
Sorry, I should probably have clarified that the logs persist, i.e. after you captured the event we were after, you could have stopped auditing your system (and also sorry for returning late).
Just for completeness (click)
You should be able to extract all lines for that event by grepping the audit log for the audit id (following the timestamp):
In your case, that would produce a few lines that would confirm that unbound was the process receiving the HUP signal, so nothing we do not already know.
Still, extracting all the lines pertaining to the same audit id may be of help in other auditing endeavours.
Looks like you've tacked resolvconf as the process responsible for HUPing unbound In the meantime.
There seem to be different versions in different packages around.
In my RaspberryPi OS installation, it resides at /sbin/resolvconf (edit: and there only), but produces the same sha hash.
On my Armbian installation, it doesn't: It seems Armbian uses the resolvconfpackage instead of openresolv.
Maybe that explains why I wasn't able to reproduce jmccamb's observation. I do also distribute a ULA prefix only in my network, though, and I haven't looked into the differences between different resolvconf implementations.
I'm also learning about those differences for the first time.
I think DietPi are basing some of their images on Armbian, so maybe @MichaIng can shine some light on those differences?
I don't know - without knowlegde of the IPv6 trigger event and the resolv.conf content, it may be too early to decide on that.
The one thing we haven't established yet is what IPv6 event exactly is triggering resolvconf. I think DL6ER has made some suggestions earlier to get hold of that.
It may still be the case that such an event warrants an update of resolvconf and the subsequent HUP signal to unbound (though I personally think that is unlikely, as jmccamb is observing that way too often).
@jmccamb, do you recall whether /etc/resolv.conf content changed at all during this analysis?
(Warning: speculations ahead)
If I assume that an IPv6 prefix has changed, then such a change may trigger construction of a new IPv6 address for a DNS server.
If that DNS server was listed in resolv.conf by a different, now expired IPv6 address, the update / reload may be warranted (though I still struggle to understand why unbound would be interested in resolv.conf).
But I somehow suspect that whatever decides on the invocation of unbound.in could be more sophisticated in detecting whether an IPv6 prefix change would also entail a relevant change of a DNS server address (e.g. if resolv.conf never needs to change because it contains only an LLA or ULA).
If that would be true, then that issue should be addressed by resolvconf maintainers. (Again, that is just me speculating here.)
And let me close by saying that all of this is out of scope for Pi-hole.
We don't normally have the time to cover a non-Pi-hole issue in the depth we've done here, so please don't take this as the norm.
On the other hand, this has been categorised accordingly, and given the many voices that joined in (thank you!), I guess you are seeing just that: An exceptional example of Community Help.
/sbin/resolvconf and /usr/sbin/resolvconf are one and the same file on recent Debian and Raspberry Pi OS images, as /sbin is a symlink to /usr/sbin, like /bin and /lib as well.
We always installed the actual resolvconf package, not openresolv on DietPi, so I have no experience with it either. Nowadays we don't even have resolvconf pre-installed, as for simple network including DHCP it is not required. ifupdown and isc-dhcp-client write to /etc/resolv.conf directly when no resolvconf is present.
DL6ER linked the source code of openresolv above which triggers the Unbound HUP/restart, probably we should verify that resolvconf doesn't to that and then find out about the reason, as I currently don't see a reason why a DNS server should restart when the systems upstream DNS is updated . It's too late today but I'll have a look into this tomorrow.
With all due respect for your effort as moderator, greatly appreciated, I must disagree.
One of pi-hole's recommended solutions (upstream) is unbound, see here.
In the past, see here, we have been looking at a high retry count problem. The fact that unbound stops/starts multiple times a day (4 times in the last 5 hours - my environment) may well be one of the reasons for this. We didn't look into to that possibility, because nobody really noticed it at the time.
Thank you all for your help. Shout out to @Bucking_Horn@deHakkelaar .. I would mention everyone but it only allows you to mention two people.
I disabled that resolveconf service and my issue appears to have resolved. If this is an issue with another package I will report it to them too, I am just not sure where. If anyone could be assistance with pointing me in the right direction, that would be awesome.
Edit: I spoke too soon. looks it restarted. Just a lot later than usual.
I am not sure if this is I did it on one pihole instance and not another, but I am going to disable it on both and see what happens. Will report back.