Pi-hole 6.0.3 on a Pi 4:
Linux pi4 6.1.21-v8+ #1642 SMP PREEMPT Mon Apr 3 17:24:16 BST 2023 aarch64 GNU/Linux
Debian GNU/Linux 11 (bullseye)
Actual Behaviour:
I did the upgrade from v5 yesterday and ran into this problem:
[✓] Deleting existing list cache
[✗] DNS resolution is currently unavailable
[i] Waiting until DNS resolution is available...............................................................
[i] Waiting until DNS resolution is available...............................................................
[i] Waiting until DNS resolution is available...............................................................
[i] Waiting until DNS resolution is available...............................................................
[i] Waiting until DNS resolution is available...............................................................^C
[i] User-abort detected
[✓] Cleaning up stray matter
I ended up having to do a repair install, after which Pi-hole now basically works.
Now, I have a heartbeat service which pings a remote uptime-kuma server every 15 minutes, and since the upgrade to v6 I'm observing intermittent failures.
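For context, the heartbeat itself is trivial; it's roughly a crontab entry like the following (the uptime-kuma host and push token below are placeholders, not my actual values):

*/15 * * * * curl -fsS -m 10 "https://uptime-kuma.example.org/api/push/TOKEN?status=up" >/dev/null 2>&1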
Digging further through other logs, I found this from another cron job that regularly calls another server, not on ddns.net:
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.hnd.bayern.de', port=443): Max retries exceeded with url: xxxxxxxx?addhr=hr_gefahr&vhs_type=std&ensvhs=&member= (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f97ce9460>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))
Everything's back to normal 15 minutes later.
I also checked the load on the Pi-hole Raspberry Pi at those times; all normal.
No, it's not dynamic, and I observe the behavior with other domains as well. It actually fluctuates for these "every 15 minutes" cron jobs: they work for hours, then fail at some arbitrary point in time. In the case of Python scripts, the failure shows up as
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f77a6ef70>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution
(in this case for a domain of a service I don't self-host)
Here's a full trace of a curl request last night at 05:20:00. The request was running on the same box as Pi-hole itself, both directly on the host, so timestamps from curl and Pi-hole are in sync:
# Hostname: pi4
# Date/Time: 2025/02/24 05:20:00 (GMT+1)
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
...
curl: (28) Resolving timed out after 10000 milliseconds
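For reference, the failing call is just a plain curl from cron; a minimal sketch of the shape of that command (the URL path is a placeholder, and I'm assuming the 10-second connect timeout is what produced the 10000 ms resolver timeout above):

curl -sS --connect-timeout 10 -o /dev/null "https://yyyyyyyy.ddns.net/some/path"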
Trace from /var/log/pihole/pihole.log
I yyyyy-ed out a self-hosted service which resolved properly at 05:20:00 and xxxxxx-ed out the service which keeps failing, even though I have now added its IP address to the custom records to eliminate any upstream issues. Note that it apparently took 18 seconds until the curl request issued at 05:20:00 was "registered" at 05:20:18 with query[A] yyyyyyyy.ddns.net from 127.0.0.1, while another request from another cron job on a different box at 05:20:00 was resolved within the same second.
Feb 24 05:20:00 dnsmasq[615232]: query[A] influxdb.xxxxxxxx.ddns.net from 192.168.0.180
Feb 24 05:20:00 dnsmasq[615232]: cached influxdb.xxxxxxxx.ddns.net is x.x.x.x
Feb 24 05:20:00 dnsmasq[615232]: query[AAAA] influxdb.xxxxxxxx.ddns.net from 192.168.0.180
Feb 24 05:20:00 dnsmasq[615232]: cached influxdb.xxxxxxxx.ddns.net is NODATA-IPv6
Feb 24 05:20:00 dnsmasq[615232]: query[A] cs.appsmith.com from 192.168.0.180
Feb 24 05:20:00 dnsmasq[615232]: cached-stale cs.appsmith.com is 3.12.32.188
Feb 24 05:20:00 dnsmasq[615232]: cached-stale cs.appsmith.com is 3.16.134.105
Feb 24 05:20:00 dnsmasq[615232]: forwarded cs.appsmith.com to 8.8.8.8
Feb 24 05:20:00 dnsmasq[615232]: query[AAAA] cs.appsmith.com from 192.168.0.180
Feb 24 05:20:00 dnsmasq[615232]: cached cs.appsmith.com is NODATA-IPv6
Feb 24 05:20:18 dnsmasq[615232]: reply cs.appsmith.com is 3.16.134.105
Feb 24 05:20:18 dnsmasq[615232]: reply cs.appsmith.com is 3.12.32.188
Feb 24 05:20:18 dnsmasq[615232]: query[AAAA] imap.gmail.com from 192.168.0.180
Feb 24 05:20:18 dnsmasq[615232]: cached-stale imap.gmail.com is 2a00:1450:400c:c0a::6d
Feb 24 05:20:18 dnsmasq[615232]: cached-stale imap.gmail.com is 2a00:1450:400c:c0a::6c
Feb 24 05:20:18 dnsmasq[615232]: forwarded imap.gmail.com to 8.8.8.8
Feb 24 05:20:18 dnsmasq[615232]: query[A] imap.gmail.com from 192.168.0.180
Feb 24 05:20:18 dnsmasq[615232]: cached-stale imap.gmail.com is 66.102.1.108
Feb 24 05:20:18 dnsmasq[615232]: cached-stale imap.gmail.com is 66.102.1.109
Feb 24 05:20:18 dnsmasq[615232]: forwarded imap.gmail.com to 8.8.8.8
Feb 24 05:20:18 dnsmasq[615232]: query[A] yyyyyyyy.ddns.net from 127.0.0.1
Feb 24 05:20:18 dnsmasq[615232]: /etc/pihole/hosts/custom.list yyyyyyyy.ddns.net is y.y.y.y
Feb 24 05:20:18 dnsmasq[615232]: query[AAAA] yyyyyyyy.ddns.net from 127.0.0.1
Feb 24 05:20:18 dnsmasq[615232]: cached-stale yyyyyyyy.ddns.net is NODATA-IPv6
Feb 24 05:20:18 dnsmasq[615232]: forwarded yyyyyyyy.ddns.net to 8.8.8.8
Feb 24 05:20:18 dnsmasq[615232]: query[A] influxdb.xxxxxxxx.ddns.net from 192.168.0.180
Feb 24 05:20:18 dnsmasq[615232]: cached influxdb.xxxxxxxx.ddns.net is x.x.x.x
Feb 24 05:20:18 dnsmasq[615232]: query[PTR] 200.0.168.192.in-addr.arpa from 127.0.0.1
Feb 24 05:20:18 dnsmasq[615232]: cached-stale 192.168.0.200 is ccu.fritz.box
Feb 24 05:20:18 dnsmasq[615232]: forwarded 200.0.168.192.in-addr.arpa to 192.168.0.1
Feb 24 05:20:18 dnsmasq[615232]: query[AAAA] imap.gmail.com from 192.168.0.180
Feb 24 05:20:18 dnsmasq[615232]: cached-stale imap.gmail.com is 2a00:1450:400c:c0a::6c
Feb 24 05:20:18 dnsmasq[615232]: cached-stale imap.gmail.com is 2a00:1450:400c:c0a::6d
Feb 24 05:20:18 dnsmasq[615232]: reply 192.168.0.200 is ccu.fritz.box
Feb 24 05:20:18 dnsmasq[615232]: query[A] imap.gmail.com from 192.168.0.180
Feb 24 05:20:18 dnsmasq[615232]: cached-stale imap.gmail.com is 66.102.1.109
Feb 24 05:20:18 dnsmasq[615232]: cached-stale imap.gmail.com is 66.102.1.108
Feb 24 05:20:18 dnsmasq[615232]: query[A] yyyyyyyy.ddns.net from 127.0.0.1
Feb 24 05:20:18 dnsmasq[615232]: /etc/pihole/hosts/custom.list yyyyyyyy.ddns.net is y.y.y.y
Feb 24 05:20:18 dnsmasq[615232]: query[AAAA] yyyyyyyy.ddns.net from 127.0.0.1
Feb 24 05:20:18 dnsmasq[615232]: cached-stale yyyyyyyy.ddns.net is NODATA-IPv6
Feb 24 05:20:18 dnsmasq[615232]: forwarded yyyyyyyy.ddns.net to 127.0.0.1#5053
Feb 24 05:20:18 dnsmasq[615232]: forwarded yyyyyyyy.ddns.net to 8.8.8.8
Feb 24 05:20:18 dnsmasq[615232]: forwarded yyyyyyyy.ddns.net to 8.8.4.4
Feb 24 05:20:18 dnsmasq[615232]: forwarded yyyyyyyy.ddns.net to 208.67.222.222
Feb 24 05:20:18 dnsmasq[615232]: forwarded yyyyyyyy.ddns.net to 208.67.220.220
The Pi 4's beszel graph at that time confirms this was not caused by a maxed-out CPU either.
Your curl output doesn't contain a domain, so it's unclear whether it refers to yyyyy or xxxxx (or maybe some other domain).
In addition:
There seems to be no yyyyy query at 05:20, only xxxxx?
Note that queries for yyyyy as well as for xxxxx have been answered instantly, so what you observe is not a delay in resolution.
As Pi-hole registers requests instantly, it would seem that the process issuing the request did not communicate with Pi-hole any earlier than the logged time.
What's remarkable about your output is that there is a sudden jump from 05:20:00 to 05:20:18.
Just to be sure, you didn't perhaps strip some lines from that output to focus on your issue, did you?
Would there be any activities logged in FTL.log around that time?
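For example, something along these lines should show anything FTL logged around then (assuming the default v6 log location and timestamp format):

sudo grep '2025-02-24 05:2' /var/log/pihole/FTL.log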
To clarify, the request for influxdb.xxxxxxxx.ddns.net was resolved within the same second it was triggered by the respective cron job from another box at 05:20:00.
At the same time, at 05:20:00, a second cron job directly on the Pi-hole host called curl (the trace above); it gave up after 10 seconds, and the respective request was only logged by Pi-hole at 05:20:18.
And no, I did not cut any lines from the log between 05:20:00 and 05:20:18.
Empty, nothing logged at that time in FTL.log.
The two cron jobs I traced here run every 15 minutes; there's never an issue, except roughly once every night (there was a similar pattern around 3 am last night).
Feb 24 10:35:00 dnsmasq[615232]: query[A] influxdb.yyyyyyy.ddns.net from 192.168.0.180
Feb 24 10:35:00 dnsmasq[615232]: cached-stale influxdb.yyyyyyy.ddns.net is y.y.y.y
Feb 24 10:35:00 dnsmasq[615232]: forwarded influxdb.yyyyyyy.ddns.net to 8.8.8.8
Feb 24 10:35:00 dnsmasq[615232]: query[AAAA] influxdb.yyyyyyy.ddns.net from 192.168.0.180
Feb 24 10:35:00 dnsmasq[615232]: cached influxdb.yyyyyyy.ddns.net is NODATA-IPv6
Feb 24 10:35:00 dnsmasq[615232]: query[AAAA] cs.appsmith.com from 192.168.0.180
Feb 24 10:35:00 dnsmasq[615232]: cached cs.appsmith.com is NODATA-IPv6
Feb 24 10:35:00 dnsmasq[615232]: reply influxdb.yyyyyyy.ddns.net is y.y.y.y
Feb 24 10:35:00 dnsmasq[615232]: query[A] cs.appsmith.com from 192.168.0.180
Feb 24 10:35:00 dnsmasq[615232]: cached-stale cs.appsmith.com is 3.16.134.105
Feb 24 10:35:00 dnsmasq[615232]: cached-stale cs.appsmith.com is 3.12.32.188
Feb 24 10:35:00 dnsmasq[615232]: forwarded cs.appsmith.com to 8.8.8.8
Feb 24 10:35:00 dnsmasq[615232]: reply cs.appsmith.com is 3.16.134.105
Feb 24 10:35:00 dnsmasq[615232]: reply cs.appsmith.com is 3.12.32.188
Feb 24 10:35:00 dnsmasq[615232]: query[A] xxxxxx.ddns.net from 127.0.0.1
Feb 24 10:35:00 dnsmasq[615232]: /etc/pihole/hosts/custom.list xxxxxx.ddns.net is x.x.x.x
As explained, queries for yyyyy as well as for xxxxx have all been answered within the second they registered, so resolution speed isn't your issue.
The difference is that Pi-hole sees the request arriving 18 seconds after your cron job issued it.
That leaves us with no indication that Pi-hole is involved.
If Pi-hole had been restarting or updating its gravity database at that time, that would have registered in FTL.log.
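(If you want to rule out the nightly gravity run explicitly, its schedule is in the cron file the installer places, e.g.:

cat /etc/cron.d/pihole

but, as said, a gravity run at that time would also have left traces in FTL.log.)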
Then the sudden jump really is remarkable.
I'm not entirely sure, but I think pihole-FTL/dnsmasq defaults to a 10-second timeout, yet we see an upstream answer arriving 18 seconds later (the cs.appsmith.com reply at 05:20:18 after being forwarded at 05:20:00).
Another interesting detail is that queries for yyyyyyyy.ddns.net A and AAAA records from 127.0.0.1 appear twice in the same second:
Feb 24 05:20:18 dnsmasq[615232]: query[A] yyyyyyyy.ddns.net from 127.0.0.1
Feb 24 05:20:18 dnsmasq[615232]: /etc/pihole/hosts/custom.list yyyyyyyy.ddns.net is y.y.y.y
Feb 24 05:20:18 dnsmasq[615232]: query[AAAA] yyyyyyyy.ddns.net from 127.0.0.1
Feb 24 05:20:18 dnsmasq[615232]: cached-stale yyyyyyyy.ddns.net is NODATA-IPv6
(…)
Feb 24 05:20:18 dnsmasq[615232]: query[PTR] 200.0.168.192.in-addr.arpa from 127.0.0.1
Feb 24 05:20:18 dnsmasq[615232]: cached-stale 192.168.0.200 is ccu.fritz.box
(…)
Feb 24 05:20:18 dnsmasq[615232]: query[A] yyyyyyyy.ddns.net from 127.0.0.1
Feb 24 05:20:18 dnsmasq[615232]: /etc/pihole/hosts/custom.list yyyyyyyy.ddns.net is y.y.y.y
Feb 24 05:20:18 dnsmasq[615232]: query[AAAA] yyyyyyyy.ddns.net from 127.0.0.1
Does your cron job perhaps run twice?
Also, your .180 seems pretty active at that time, but it's also silent between 05:20:00 and 05:20:18, which makes me wonder whether your system may have adjusted its clock at that time?
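To check for a clock step, systemd-timesyncd normally logs adjustments to the journal; something like this (adjust the time window as needed) would show them:

sudo journalctl -u systemd-timesyncd --since '2025-02-24 05:15' --until '2025-02-24 05:25'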
I just added a second Pi-hole (v5, started on an older Pi Zero) as well as 8.8.8.8 as additional servers to /etc/systemd/resolved.conf on the .26 Pi-hole:
resolvectl status
Global
Protocols: +LLMNR +mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: foreign
DNS Servers: 127.0.0.1 192.168.0.111 8.8.8.8
Maybe that will make a difference if 127.0.0.1 doesn't respond in time.
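The change itself is just the DNS= line in /etc/systemd/resolved.conf, roughly like this:

[Resolve]
DNS=127.0.0.1 192.168.0.111 8.8.8.8

followed by sudo systemctl restart systemd-resolved to apply it.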
So .180 has been sending lookups, repeating them and finally expanding them with the search domain, but didn't get a reply, while the corresponding requests either failed to register in Pi-hole or did so only much later.
And as your cron job doesn't run twice, that suggests that curl also went into repeating its request.
This may suggest that DNS requests may have been queued.
Depending on dns.replyWhenBusy, Pi-hole may silently drop queries while writing to its database, but that write usually takes just milliseconds unless under heavy load, and you'd still see the queries registering, making it rather unlikely that's the case here.
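You can read that setting the same way as any other FTL config key:

sudo pihole-FTL --config dns.replyWhenBusy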
Nevertheless, let's see how many DNS requests Pi-hole had to handle at that time.
What's the output of:
sudo pihole-FTL sqlite3 /etc/pihole/pihole-FTL.db "SELECT count(*) FROM queries \
WHERE timestamp > strftime('%s','2025-02-24 05:21', '-3 minutes', 'utc') \
AND timestamp <= strftime('%s','2025-02-24 05:21', 'utc');"
And also:
sudo pihole-FTL --config database.DBinterval
I was just going to ask what your /etc/resolv.conf on your Pi-hole machine looks like: Perhaps curl (or rather the lookup initiated by it) has been trying to talk to another nameserver from resolv.conf before it opted to ask Pi-hole?
But I see that you've changed it in the meantime.
The comment at the top of resolv.conf suggests it's under the control of resolvconf.
So Pi-hole commits queries from its in-memory database to its long-term database once every 60 seconds (which is the default), and there were 12 queries in the 3 minutes around the time of your observation.
That's far from being enough to have Pi-hole drop requests.
Not with regard to your observation, at least as long as your resolv.conf doesn't point to 127.0.0.53.
But pihole-FTL may fail to start if port 53 is already bound (e.g. during a future reboot), so it's better to disable systemd-resolved's stub resolver.
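If it's not already in place (the installer sometimes drops one in when it detects the stub listener), a sketch of how to add such a drop-in and apply it:

sudo mkdir -p /etc/systemd/resolved.conf.d
printf '[Resolve]\nDNSStubListener=no\n' | sudo tee /etc/systemd/resolved.conf.d/90-pi-hole-disable-stub-listener.conf
sudo systemctl restart systemd-resolved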
/etc/systemd/resolved.conf.d/90-pi-hole-disable-stub-listener.conf does not exist. In fact, there's no /etc/systemd/resolved.conf.d directory at all; there's only /etc/systemd/resolved.conf.
All the changes I applied yesterday didn't make any difference; same problem with that ddns.net domain. I think I'll switch it to another top-level domain to see if that makes a difference.
However, I'm still not convinced my pihole v6 is fully stable:
Feb 25 10:19:10 dnsmasq[615232]: query[A] pool.ntp.org from 127.0.0.1
Feb 25 10:19:10 dnsmasq[615232]: forwarded pool.ntp.org to 8.8.8.8
Feb 25 10:19:10 dnsmasq[615232]: query[AAAA] pool.ntp.org from 127.0.0.1
Feb 25 10:19:10 dnsmasq[615232]: forwarded pool.ntp.org to 8.8.8.8
Feb 25 10:19:10 dnsmasq[615232]: query[A] pool.ntp.org from 127.0.0.1
Feb 25 10:19:10 dnsmasq[615232]: forwarded pool.ntp.org to 8.8.8.8
Feb 25 10:19:10 dnsmasq[615232]: forwarded pool.ntp.org to 8.8.4.4
Feb 25 10:19:10 dnsmasq[615232]: forwarded pool.ntp.org to 208.67.222.222
Feb 25 10:19:10 dnsmasq[615232]: forwarded pool.ntp.org to 208.67.220.220
Feb 25 10:19:10 dnsmasq[615232]: query[AAAA] pool.ntp.org from 127.0.0.1
Feb 25 10:19:10 dnsmasq[615232]: forwarded pool.ntp.org to 8.8.8.8
Feb 25 10:19:10 dnsmasq[615232]: forwarded pool.ntp.org to 8.8.4.4
Feb 25 10:19:10 dnsmasq[615232]: forwarded pool.ntp.org to 208.67.222.222
Feb 25 10:19:10 dnsmasq[615232]: forwarded pool.ntp.org to 208.67.220.220
Feb 25 10:19:10 dnsmasq[615232]: reply pool.ntp.org is 195.201.107.151
Feb 25 10:19:10 dnsmasq[615232]: reply pool.ntp.org is 141.144.246.224
Feb 25 10:19:10 dnsmasq[615232]: reply pool.ntp.org is 185.232.69.65
Feb 25 10:19:10 dnsmasq[615232]: reply pool.ntp.org is 188.245.97.96
Feb 25 10:19:10 dnsmasq[615232]: reply pool.ntp.org is NODATA-IPv6
Just an update/observation here: I've moved the 15-minute cron job that was previously running on the Pi-hole host itself to a different box, where it has now run for two nights without hiccups. So it seems the issue is related to lookups from a client on the Pi-hole host itself, which fail every once in a while.
Those NTP messages can be ignored, as your log demonstrates that resolution succeeded just 4 seconds later.
That may suggest that the cron job has been triggering the very issue it was observing. It's probably related to how curl handled the lookup.
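Since you've already pinned the IP in Pi-hole's custom records, one way to take resolution out of the picture on the curl side would be --resolve, e.g. (host and IP as placeholders):

curl -sS -o /dev/null --resolve yyyyyyyy.ddns.net:443:y.y.y.y "https://yyyyyyyy.ddns.net/"

If that never hangs while the plain call still does, the stall is in the lookup on the Pi-hole host rather than in the connection itself.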
There never would have been?
systemd-resolved's stub listener would usually default to 127.0.0.53:53, which your previous output has also shown.
Perhaps resolv.conf had nameserver 127.0.0.53 in it at some time, rather than nameserver 127.0.0.1?