I had been using Pi-hole + Unbound for a couple of weeks without issue when this error suddenly started popping up at random in Firefox (with its DoH disabled) on a bunch of different pages:
It is "fixed" each time by simply refreshing the page once or twice. I was wondering what might be causing it, so I went to my Pi-hole query log and saw entries that look like this:
There is usually a Retried status immediately followed by a BOGUS / SERVFAIL. I pulled the query log for the past seven days and saw over 3,000 Retried entries across several different devices, not just this one.
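A tally like the one described above can also be pulled straight from Pi-hole's long-term query database instead of the web interface. The following is only a minimal sketch: it assumes the database exposes a `queries` table (or view) with `timestamp`, `status` and `client` columns, and that status codes 12 and 13 mean "Retried" and "Retried but ignored". Both assumptions come from the FTL documentation as I remember it and should be checked against the installed version. The function name is made up for illustration.

```python
import sqlite3
import time

# Assumed status codes for "Retried" and "Retried but ignored" queries.
# Verify them against the FTL documentation for your Pi-hole version.
RETRIED_STATUSES = (12, 13)


def count_retried_by_client(conn, days=7, now=None):
    """Return {client: count} of retried queries seen in the last `days` days."""
    now = time.time() if now is None else now
    cutoff = int(now - days * 86400)
    placeholders = ",".join("?" for _ in RETRIED_STATUSES)
    rows = conn.execute(
        "SELECT client, COUNT(*) FROM queries "
        f"WHERE timestamp >= ? AND status IN ({placeholders}) "
        "GROUP BY client",
        (cutoff, *RETRIED_STATUSES),
    ).fetchall()
    return dict(rows)
```

On a typical install the database would be opened read-only with something like `sqlite3.connect("file:/etc/pihole/pihole-FTL.db?mode=ro", uri=True)`; that path is the usual default, not something confirmed in this thread. If the counts are spread evenly across clients, that would suggest the problem is not tied to one device.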
I haven't made any configuration changes in the past few weeks, so I'm not sure why it has suddenly started to fail when trying to resolve pages. The only thing I can think of is that I changed the speed of my AT&T broadband plan about the same time, and maybe they pushed some sort of change to their BGW210 modem I'm forced to use.
After doing some Googling, others who've had lots of Retried and/or SERVFAIL errors said they might have an issue with packet filtering upstream of the Pi-hole on Port 53 or something, but I haven't been able to figure out if that's my problem, or even how to fix it. Any help would be appreciated!
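One rough way to look for port 53 filtering is to send the same DNS query directly to an outside server over both UDP and TCP from the Raspberry Pi and compare the results. The sketch below builds the query by hand with the standard library so nothing extra needs installing. The function names are hypothetical, and it only reads the response code rather than parsing the full answer.

```python
import socket
import struct


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query with the RD bit set; qtype 1 = A, 2 = NS."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = [label for label in name.split(".") if label]
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class IN


def rcode(response):
    """Extract the 4-bit response code (0 = NOERROR, 2 = SERVFAIL)."""
    return response[3] & 0x0F


def _recv_exact(sock, n):
    """Read exactly n bytes from a TCP socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed early")
        buf += chunk
    return buf


def probe_udp(server, name, qtype=1, port=53, timeout=3.0):
    """Send one query over UDP and return the response code."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(build_query(name, qtype), (server, port))
        data, _ = s.recvfrom(4096)
        return rcode(data)


def probe_tcp(server, name, qtype=1, port=53, timeout=3.0):
    """Send one query over TCP (2-byte length prefix) and return the response code."""
    query = build_query(name, qtype)
    with socket.create_connection((server, port), timeout=timeout) as s:
        s.sendall(struct.pack("!H", len(query)) + query)
        length = struct.unpack("!H", _recv_exact(s, 2))[0]
        return rcode(_recv_exact(s, length))
```

For example, `probe_udp("198.41.0.4", "example.com")` and `probe_tcp("198.41.0.4", "example.com")` query a.root-servers.net directly, which is the kind of traffic a recursive unbound sends. Repeated timeouts on one transport but not the other, or on both while ordinary web traffic works, would be a hint (not proof) that something upstream is interfering with port 53.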
Details about my system:
AT&T fiber jack > Arris BGW210 modem/router [in pass-through mode] > AmpliFi Instant Router > Gigabit switch > Raspberry Pi (+ rest of network)
Contents of /etc/unbound/unbound.conf.d/pi-hole.conf:
server:
# If no logfile is specified, syslog is used
# logfile: "/var/log/unbound/unbound.log"
verbosity: 0
interface: 127.0.0.1
port: 5335
do-ip4: yes
do-udp: yes
do-tcp: yes
# May be set to yes if you have IPv6 connectivity
do-ip6: no
# You want to leave this to no unless you have *native* IPv6. With 6to4 and
# Teredo tunnels your web browser should favor IPv4 for the same reasons
prefer-ip6: no
# Use this only when you downloaded the list of primary root servers!
root-hints: "/var/lib/unbound/root.hints"
# Trust glue only if it is within the server's authority
harden-glue: yes
# Require DNSSEC data for trust-anchored zones, if such data is absent, the zone becomes BOGUS
harden-dnssec-stripped: yes
# Don't use Capitalization randomization as it is known to cause DNSSEC issues sometimes
# see https://discourse.pi-hole.net/t/unbound-stubby-or-dnscrypt-proxy/9378 for further details
use-caps-for-id: no
# Reduce EDNS reassembly buffer size.
# Suggested by the unbound man page to reduce fragmentation reassembly problems
edns-buffer-size: 1472
# Perform prefetching of close to expired message cache entries
# This only applies to domains that have been frequently queried
prefetch: yes
# One thread should be sufficient, can be increased on beefy machines. In reality for most users running on small networks or on a single machine, it should be unnecessary to seek performance enhancement by increasing num-threads above 1.
num-threads: 1
# Ensure kernel buffer is large enough to not lose messages in traffic spikes
so-rcvbuf: 1m
# Ensure privacy of local IP ranges
private-address: 192.168.0.0/16
private-address: 169.254.0.0/16
private-address: 172.16.0.0/12
private-address: 10.0.0.0/8
private-address: fd00::/8
private-address: fe80::/10
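The `edns-buffer-size: 1472` value in the config above matches the largest UDP payload that fits in one unfragmented IPv4 packet on a standard 1500-byte Ethernet MTU. If some hop on the path has a smaller MTU, large DNSSEC responses may be fragmented and dropped, which could show up as intermittent failures. Whether that applies to this setup is not established in the thread. The arithmetic, as a small sketch with a made-up helper name:

```python
IPV4_HEADER = 20  # bytes, without IP options
IPV6_HEADER = 40  # bytes, fixed header only
UDP_HEADER = 8    # bytes


def max_udp_payload(mtu, ipv6=False):
    """Largest UDP payload that fits in a single packet for the given MTU."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - UDP_HEADER


print(max_udp_payload(1500))             # standard Ethernet over IPv4
print(max_udp_payload(1492))             # a typical PPPoE link
print(max_udp_payload(1280, ipv6=True))  # IPv6 minimum MTU
```

The last case gives 1232, the value recommended by DNS Flag Day 2020, so trying `edns-buffer-size: 1232` is a low-risk experiment if fragmentation is suspected.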
Ah, I thought BOGUS meant it couldn't find anything and was the reason my browser kept coming up short. I'm still having the issue where pages randomly fail to load, though. Could there be a broader outage elsewhere?
I ran timedatectl status and verified the time/date are correct. I'm not sure what you mean by checking the unbound anchor, but here's the output from cat /var/lib/unbound/root.hints:
; This file holds the information on root name servers needed to
; initialize cache of Internet domain name servers
; (e.g. reference this file in the "cache . <file>"
; configuration file of BIND domain name servers).
;
; This file is made available by InterNIC
; under anonymous FTP as
; file /domain/named.cache
; on server FTP.INTERNIC.NET
; -OR- RS.INTERNIC.NET
;
; last update: June 24, 2021
; related version of root zone: 2021062401
;
; FORMERLY NS.INTERNIC.NET
;
. 3600000 NS A.ROOT-SERVERS.NET.
A.ROOT-SERVERS.NET. 3600000 A 198.41.0.4
A.ROOT-SERVERS.NET. 3600000 AAAA 2001:503:ba3e::2:30
;
; FORMERLY NS1.ISI.EDU
;
. 3600000 NS B.ROOT-SERVERS.NET.
B.ROOT-SERVERS.NET. 3600000 A 199.9.14.201
B.ROOT-SERVERS.NET. 3600000 AAAA 2001:500:200::b
;
; FORMERLY C.PSI.NET
;
. 3600000 NS C.ROOT-SERVERS.NET.
C.ROOT-SERVERS.NET. 3600000 A 192.33.4.12
C.ROOT-SERVERS.NET. 3600000 AAAA 2001:500:2::c
;
; FORMERLY TERP.UMD.EDU
;
. 3600000 NS D.ROOT-SERVERS.NET.
D.ROOT-SERVERS.NET. 3600000 A 199.7.91.13
D.ROOT-SERVERS.NET. 3600000 AAAA 2001:500:2d::d
;
; FORMERLY NS.NASA.GOV
;
. 3600000 NS E.ROOT-SERVERS.NET.
E.ROOT-SERVERS.NET. 3600000 A 192.203.230.10
E.ROOT-SERVERS.NET. 3600000 AAAA 2001:500:a8::e
;
; FORMERLY NS.ISC.ORG
;
. 3600000 NS F.ROOT-SERVERS.NET.
F.ROOT-SERVERS.NET. 3600000 A 192.5.5.241
F.ROOT-SERVERS.NET. 3600000 AAAA 2001:500:2f::f
;
; FORMERLY NS.NIC.DDN.MIL
;
. 3600000 NS G.ROOT-SERVERS.NET.
G.ROOT-SERVERS.NET. 3600000 A 192.112.36.4
G.ROOT-SERVERS.NET. 3600000 AAAA 2001:500:12::d0d
;
; FORMERLY AOS.ARL.ARMY.MIL
;
. 3600000 NS H.ROOT-SERVERS.NET.
H.ROOT-SERVERS.NET. 3600000 A 198.97.190.53
H.ROOT-SERVERS.NET. 3600000 AAAA 2001:500:1::53
;
; FORMERLY NIC.NORDU.NET
;
. 3600000 NS I.ROOT-SERVERS.NET.
I.ROOT-SERVERS.NET. 3600000 A 192.36.148.17
I.ROOT-SERVERS.NET. 3600000 AAAA 2001:7fe::53
;
; OPERATED BY VERISIGN, INC.
;
. 3600000 NS J.ROOT-SERVERS.NET.
J.ROOT-SERVERS.NET. 3600000 A 192.58.128.30
J.ROOT-SERVERS.NET. 3600000 AAAA 2001:503:c27::2:30
;
; OPERATED BY RIPE NCC
;
. 3600000 NS K.ROOT-SERVERS.NET.
K.ROOT-SERVERS.NET. 3600000 A 193.0.14.129
K.ROOT-SERVERS.NET. 3600000 AAAA 2001:7fd::1
;
; OPERATED BY ICANN
;
. 3600000 NS L.ROOT-SERVERS.NET.
L.ROOT-SERVERS.NET. 3600000 A 199.7.83.42
L.ROOT-SERVERS.NET. 3600000 AAAA 2001:500:9f::42
;
; OPERATED BY WIDE
;
. 3600000 NS M.ROOT-SERVERS.NET.
M.ROOT-SERVERS.NET. 3600000 A 202.12.27.33
M.ROOT-SERVERS.NET. 3600000 AAAA 2001:dc3::35
You'd likely be unable to resolve any domain if your anchor was incorrect.
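Note that `root.hints` only lists the addresses of the root name servers. The DNSSEC trust anchor is a separate file, the one unbound is pointed at with `auto-trust-anchor-file`. As a rough sanity check, the sketch below scans the text of such a file for DNSKEY or DS records owned by the root zone. The helper name is hypothetical, and the `/var/lib/unbound/root.key` path in the usage note is the common Debian/Raspberry Pi OS default, assumed rather than confirmed here.

```python
def root_anchor_records(text):
    """Return the root-zone DNSKEY/DS records found in trust-anchor file text."""
    records = []
    for line in text.splitlines():
        # Drop trailing ';' comments, then split the record into fields.
        fields = line.split(";", 1)[0].split()
        if len(fields) >= 4 and fields[0] == "." and (
            "DNSKEY" in fields or "DS" in fields
        ):
            records.append(" ".join(fields))
    return records
```

Something like `root_anchor_records(open("/var/lib/unbound/root.key").read())` should return at least one record on a working install. An empty list, or a missing file, would point at the anchor rather than at anything upstream.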
Assuming that you are just seeing those SERVFAILs for certain domains, your issue seems to be upstream.
SERVFAIL indicates that one of Pi-hole's upstreams (and in turn one of unbound's) returned an error.
Intermittent SERVFAILs are not uncommon, and you'll hardly ever notice them.
Unfortunately, they are somewhat hard to troubleshoot if they persist over a longer period.
They may indicate that a DNS server authoritative for that domain is down, or that something outside of your network is interfering with DNS resolution. See Pi-hole unbound servfail, where an ISP was filtering DNS requests.
That's an interesting point about the ISP. I checked AT&T's ARRIS BGW210's settings and noticed packet filtering was turned on, so I disabled it since I have a downstream router to handle it. I'll keep an eye out and see if this helps going forward. If it doesn't help, I'll turn on the unbound-remote and see if I can get some additional insight there.
I did notice some additional firewall settings, but I'm not certain which, if any, I should change from their defaults: