Help us test FTL v5.8 / dnsmasq v2.85

Original instructions from DL6ER (NOT yet using the test branch):

If you're already on 'update/dnsmasq-v2.85', just run 'pihole -up'

This should give you

pihole -v

  Pi-hole version is v5.2.4 (Latest: v5.2.4)
  AdminLTE version is v5.4 (Latest: v5.4)
  FTL version is update/dnsmasq-v2.85 vDev-b1deb8c (Latest: v5.7)

pihole-FTL -vv

****************************** FTL **********************************
Version:         vDev-b1deb8c

****************************** dnsmasq ******************************
Version:         pi-hole-2.85
Compile options: IPv6 GNU-getopt no-DBus no-UBus no-i18n IDN DHCP DHCPv6 Lua TFTP no-conntrack ipset auth cryptohash DNSSEC loop-detect inotify dumpfile

****************************** SQLite3 ******************************
Version:         3.35.2
Compile options: COMPILER=gcc-6.3.0 20170516 DEFAULT_FOREIGN_KEYS OMIT_DEPRECATED OMIT_LOAD_EXTENSION OMIT_PROGRESS_CALLBACK THREADSAFE=1
******************************** LUA ********************************
Lua 5.4.1  Copyright (C) 1994-2020 Lua.org, PUC-Rio

Yes, I also read this in the discussion group (what changed his mind?)

Anyway, the retry count is back to an acceptable level, and the status 14 (Already forwarded, not forwarding again) count came alive (in previous tests on this branch it was always zero).

The most important result: there appears to be no impact on the user experience. In the end, that is what counts...

today: 04/07/2021 -> 1617746400
total # of queries today: 3049
status  count   unique  description
0       0       0       Unknown status
1       276     28      Domain contained in gravity database
2       2295    422     Forwarded
3       56      17      Known, replied to from cache
4       326     22      Domain matched by a regex blacklist filter
5       0       0       Domain contained in exact blacklist
6       0       0       By upstream server (known blocking page IP address)
7       0       0       By upstream server (0.0.0.0 or ::)
8       0       0       By upstream server (NXDOMAIN with RA bit unset)
9       40      2       Domain contained in gravity database (CNAME)
10      0       0       Domain matched by a regex blacklist filter (CNAME)
11      0       0       Domain contained in exact blacklist (CNAME)
12      15      10      Retried query
13      0       0       Retried but ignored query (DNSSEC)
14      41      22      Already forwarded, not forwarding again

Thank you

Works amazingly well here, too. I also noticed a small reduction in memory usage in /dev/shm :+1:

It was

small-hours musing

It is very funny how similar his conclusion is to the reasoning DL6ER gave in his very first post to the list.

The code DL6ER proposed here:

       /* If we don't want to retry just now, drop this query right after
          we added it to the list above */
       if (difftime(now, forward->time) < daemon->retry_timeout)
         return 0;

The code added by Simon:

       /* closely spaced identical queries cannot be a try and a retry, so
          it's safe to wait for the reply from the first without
          forwarding the second. */
       if (difftime(now, forward->time) < 2)
         return 0;

Can you tell a difference (other than the now hard-coded time)? :slight_smile:

What concerns me a bit is that he used @DL6ER's code 1:1 (the practical realization of it) and didn't credit his work. Not a very nice move.

We are very friendly with Simon and there are things and conversations that happen that you may not be privy to. Please keep in mind that you don't know who is reading these forums, there are far more anonymous users here.

pi@pihole:~ $ pihole checkout ftl update/dnsmasq-v2.85
Please note that changing branches severely alters your Pi-hole subsystems
Features that work on the master branch, may not on a development branch
This feature is NOT supported unless a Pi-hole developer explicitly asks!
Have you read and understood this? [y/N] y
[✓] Branch update/dnsmasq-v2.85 exists
[✓] Downloading and Installing FTL
[✓] Restarting pihole-FTL service...
[✓] Enabling pihole-FTL service to start on reboot...
pi@pihole:~ $ pihole -up
[i] Checking for updates...
[i] Pi-hole Core: up to date
[i] Web Interface: up to date
[i] FTL: up to date
[i] Warning: You are using FTL from a custom branch (update/dnsmasq-v2.85) and might be missing future releases.
[✓] Everything is up to date!
pi@pihole:~ $ dnsmasq
-bash: dnsmasq: command not found

I tried to install this dev branch, but how can I check which version of dnsmasq is used?

Run
pihole-FTL -vv


Just to let you know: no abnormal things detected (user experience, pihole dashboard, query log). The status 14 count is high, caused by the usual suspects such as alt1-mtalk.google.com (android mail notification check), but there are no noticeable consequences.

today: 04/07/2021 -> 1617746400
total # of queries today: 9315
status  count   unique  description
0       0       0       Unknown status
1       769     39      Domain contained in gravity database
2       6107    497     Forwarded
3       102     21      Known, replied to from cache
4       384     23      Domain matched by a regex blacklist filter
5       0       0       Domain contained in exact blacklist
6       0       0       By upstream server (known blocking page IP address)
7       0       0       By upstream server (0.0.0.0 or ::)
8       0       0       By upstream server (NXDOMAIN with RA bit unset)
9       95      2       Domain contained in gravity database (CNAME)
10      0       0       Domain matched by a regex blacklist filter (CNAME)
11      0       0       Domain contained in exact blacklist (CNAME)
12      32      20      Retried query
13      0       0       Retried but ignored query (DNSSEC)
14      1826    37      Already forwarded, not forwarding again

I updated to this branch
and worked with it all day, no issues.
Everything works fine and Pi-hole is just as reliable as always.
If I encounter any issues before the next release I will report them in this thread.

Thank you for the great tool!

This is quite a high number. It seems to be something like 20 % of all queries!
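
The estimate can be checked directly against the posted table. A minimal sketch, assuming a POSIX shell with awk; `status_share` is a hypothetical helper name used only for illustration, not a Pi-hole command:

```shell
# Hypothetical helper (not part of Pi-hole): share of one status in the total.
status_share() {
    # $1 = count for one status, $2 = total number of queries
    LC_ALL=C awk -v c="$1" -v t="$2" 'BEGIN {
        if (t <= 0) { print "n/a"; exit }   # avoid dividing by zero
        printf "%.2f%%\n", 100 * c / t
    }'
}

status_share 1826 9315   # status 14 in the 9315-query table
status_share 41 3049     # status 14 in the earlier 3049-query table
```

With the numbers posted in this thread, status 14 comes out at roughly 19.6 % of 9315 queries, versus about 1.3 % in the earlier 3049-query sample.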

Compare this to my Pi-hole

$ sqlite3 /etc/pihole/pihole-FTL.db --header --column "SELECT status, count(*) "absolute", printf('%.2f%%',(100.0*count(*)/(SELECT count (*) FROM queries WHERE timestamp > strftime('%s','now','-24 hours')))) "relative" FROM queries WHERE timestamp > strftime('%s','now','-24 hours') group by status order by status asc;"
status      absolute    relative  
----------  ----------  ----------
1           1527        11.12%    
2           5616        40.89%    
3           6487        47.23%    
9           7           0.05%     
12          20          0.15%     
14          77          0.56%

where the in-progress queries are only some 0.6%.

If you have the time, it'd be interesting if you could check one or two of them, compare what pihole.log says* and if that matches the database. It should, but checking can never hurt.


*) I know it doesn't say much but you could check if the query came in multiple times but was forwarded only once. Also, pihole-FTL.log may contain something useful if you still have query debugging enabled.
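
One way to run that check is to count, per domain, how many queries came in and how many were forwarded. This is only a sketch: `count_query_forward` is a hypothetical helper, and it assumes the extra log format (with a query ID and client address/port in front of the action), where whitespace-separated field 7 is the action and field 8 the domain. With a different log format the field numbers would need adjusting.

```shell
# Hypothetical helper: count incoming queries vs. forwards for one domain.
# Assumption: extra log format, e.g.
#   Apr  8 17:58:22 dnsmasq[18854]: 14786 192.168.2.228/60196 query[A] example.com from 192.168.2.228
# so field 7 is the action (query[A], forwarded, reply, ...) and field 8 the domain.
count_query_forward() {
    # $1 = domain, $2 = log file (defaults to the path used in this thread)
    awk -v d="$1" '
        $8 == d && $7 ~ /^query\[/   { q++ }   # query arrived from a client
        $8 == d && $7 == "forwarded" { f++ }   # query was sent upstream
        END { printf "queries=%d forwarded=%d\n", q, f }
    ' "${2:-/var/log/pihole/pihole.log}"
}

# Usage (domain is a placeholder):
#   count_query_forward example.com
```

If the query count is clearly higher than the forwarded count and the domain is neither cached nor blocked, the difference is a candidate for status 14 entries in the database.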

Some results are in a PM, but I think (hope) I found a reason why...

Remember (from previous discussions): I'm on Windows 10 20H2, with IPv4 (fixed address) and IPv6 (temporary IPv6 address, the default out of the MS box). The IPv6 address changes every day, or after a reboot/restart. The fact that the IPv6 address changes isn't really a problem; the fact that the Windows 10 system has both IPv4 and IPv6 (GUA) appears to be the problem (I think).

Look at this very short list of queries, registered in the pihole.log today.

cat /var/log/pihole/pihole.log | grep www.whotracks.me
Apr  8 17:58:22 dnsmasq[18854]: 14786 192.168.2.228/60196 query[A] www.whotracks.me from 192.168.2.228
Apr  8 17:58:22 dnsmasq[18854]: 14786 192.168.2.228/60196 forwarded www.whotracks.me to fdaa:bbcc:ddee:2::5552
Apr  8 17:58:22 dnsmasq[18854]: 14787 192.168.2.228/59656 query[AAAA] www.whotracks.me from 192.168.2.228
Apr  8 17:58:22 dnsmasq[18854]: 14787 192.168.2.228/59656 forwarded www.whotracks.me to fdaa:bbcc:ddee:2::5552
Apr  8 17:58:22 dnsmasq[18854]: 14788 2a02:1810:4d02:6903:7dff:b06a:aed4:9194/59656 query[AAAA] www.whotracks.me from 2a02:1810:4d02:6903:7dff:b06a:aed4:9194
Apr  8 17:58:22 dnsmasq[18854]: 14789 2a02:1810:4d02:6903:7dff:b06a:aed4:9194/60196 query[A] www.whotracks.me from 2a02:1810:4d02:6903:7dff:b06a:aed4:9194
Apr  8 17:58:22 dnsmasq[18854]: 14786 192.168.2.228/60196 reply www.whotracks.me is 13.226.159.6
Apr  8 17:58:22 dnsmasq[18854]: 14786 192.168.2.228/60196 reply www.whotracks.me is 13.226.159.11
Apr  8 17:58:22 dnsmasq[18854]: 14786 192.168.2.228/60196 reply www.whotracks.me is 13.226.159.31
Apr  8 17:58:22 dnsmasq[18854]: 14786 192.168.2.228/60196 reply www.whotracks.me is 13.226.159.75
Apr  8 17:58:22 dnsmasq[18854]: 14787 192.168.2.228/59656 reply www.whotracks.me is 2600:9000:21d7:200:10:8b76:f140:93a1
Apr  8 17:58:22 dnsmasq[18854]: 14787 192.168.2.228/59656 reply www.whotracks.me is 2600:9000:21d7:9800:10:8b76:f140:93a1
Apr  8 17:58:22 dnsmasq[18854]: 14787 192.168.2.228/59656 reply www.whotracks.me is 2600:9000:21d7:2c00:10:8b76:f140:93a1
Apr  8 17:58:22 dnsmasq[18854]: 14787 192.168.2.228/59656 reply www.whotracks.me is 2600:9000:21d7:6000:10:8b76:f140:93a1
Apr  8 17:58:22 dnsmasq[18854]: 14787 192.168.2.228/59656 reply www.whotracks.me is 2600:9000:21d7:3600:10:8b76:f140:93a1
Apr  8 17:58:22 dnsmasq[18854]: 14787 192.168.2.228/59656 reply www.whotracks.me is 2600:9000:21d7:8c00:10:8b76:f140:93a1
Apr  8 17:58:22 dnsmasq[18854]: 14787 192.168.2.228/59656 reply www.whotracks.me is 2600:9000:21d7:a600:10:8b76:f140:93a1
Apr  8 17:58:22 dnsmasq[18854]: 14787 192.168.2.228/59656 reply www.whotracks.me is 2600:9000:21d7:b800:10:8b76:f140:93a1

The system does an A and an AAAA query (first four (4) lines), using the IPv4 address; these are forwarded.
Then, in the next two (2) lines, there is an A and an AAAA query using the IPv6 address, apparently NOT forwarded, but using the same source ports as the IPv4 queries did.
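
To pick such lines out of a longer log, the queries that never got a "forwarded" line with the same log ID can be listed together with their client address. A rough sketch, assuming the extra log format shown above (field 5 = log ID, field 7 = action, field 8 = domain, field 10 = client); `unforwarded_queries` is a hypothetical helper. Note that queries answered from cache or blocked by gravity also have no "forwarded" line, so the output is a list of candidates, not a list of status 14 entries.

```shell
# Hypothetical helper: list queries for a domain that have no "forwarded"
# line with the same log ID. Assumes the extra log format from this thread.
unforwarded_queries() {
    # $1 = domain, $2 = log file (defaults to the path used in this thread)
    awk -v d="$1" '
        $8 != d { next }                        # only look at the requested domain
        $7 ~ /^query\[/ {                       # remember each query in log order
            order[++n] = $5; qtype[$5] = $7; client[$5] = $10
        }
        $7 == "forwarded" { fwd[$5] = 1 }       # this log ID was sent upstream
        END {
            for (i = 1; i <= n; i++) {
                id = order[i]
                if (!(id in fwd)) print id, qtype[id], client[id]
            }
        }
    ' "${2:-/var/log/pihole/pihole.log}"
}

# Usage:
#   unforwarded_queries www.whotracks.me
```

Each output line holds the log ID, the query type and the client address, which makes it easy to see whether the unforwarded queries came from a different address than the forwarded ones.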

The database entries:

If I read everything correctly in the dnsmasq discussion group, the source port match causes dnsmasq to mark them as retries.

If my analysis is correct (NOT sure) this might be a Windows problem, and I have NO idea how to fix this...

I hope another Windows 10 user reads this, performs the same tests, and hopefully confirms my diagnosis...

No, because

The port doesn't matter at all, and we don't know if it had the same ID, but we know for sure that it is not the same IP address for the client (IPv4 vs. IPv6) --> it is a repeated query, not a retry. In that case the behavior of dnsmasq/pihole-FTL is correct, and my patch sent to Simon is actually saving you a whole lot of traffic. Much more than I assumed.

And your analysis is spot on: It is a Windows issue here. It should not send the same queries once per IPv4 and once per IPv6. This is needlessly doubled traffic. If you had only this one client (and it always behaved this way), 50% of all queries would be unnecessary, because IPv4 queries are in no way better than IPv6 queries (or vice versa).

A lot of internet docs say Windows (and most other operating systems) prefers IPv6 over IPv4. Yet, when looking at the pihole log, most of the DNS queries from this machine are IPv4 queries; only some, like the above example, do an IPv4 query immediately followed by an IPv6 query.

The IPv4 DNS server is learned from the IPv4 DHCP server (e.g. the pihole IPv4 address). The IPv6 DNS server address is learned using discovery. This IPv6 address is configured statically on the pfsense box (ends up in resolv.conf of the pfsense box); as soon as the client gets the IPv6 address (NOT from a DHCP server, but via the track interface option from pfsense), it also picks up the IPv6 DNS server address, using discovery.

Nothing I can do about this, I can't remove the IPv6 DNS server from the pfsense configuration (the pfsense web interface becomes really sluggish, if I do), I can't change the address into something else, because the client would pick up this address and thus bypass pihole.

Anyway, since the additional, unnecessary IPv6 queries are all local LAN only, and dnsmasq now prevents these queries from being forwarded, the user experience isn't really affected, the cause is now known, all I need to figure out is a solution...

Awaiting pihole-FTL v5.8, with final release of dnsmasq v2.85 (released today).

Again, thanks for your time, patience, effort to make pihole a better tool.

This seems like a bug. Is there something obvious differentiating the two behaviors? Like IPv4-only always from Chrome, IPv4+follow-up-IPv6 from the system for other apps? May help to narrow down who needs fixing.

When you say this, you mean they take it from Router Advertisement (RA) packets sent out in your network? Or is there something else going on?

This very much looks like a bug in pfsense. Can you do us the favor and report it to them so they can fix it?

What if this is an invalid address? What happens? And what if you set this to ::1 ?

Maybe wait a few days. The bugs in 2.83 which led to 2.84 (and now 2.85) were only discovered after the official dnsmasq release was out.

I concur. :+1:

  • Using Edge Chromium, which hasn't got any configurable DNS settings as far as I know, apart from clearing the browser DNS cache. Changing network adapter priorities doesn't solve anything; I got stuck (for the moment) on network provider order (error: failed to get network providers).
  • RA
  • NOT a bug, found the requirement a long time ago on the netgate forum (system update status and packet manager take a very long time to load).
  • invalid and / or ::1 results in a timeout.
  • wait a few days? I have been testing v2.85 for the last week. Wait, what would be the point, sit back until other users get into trouble with the final release? Trying to help the developers and community here (mutual benefit)...

So one particular configuration got tested (extensively). I did another test. This does not sound convincing.

Exactly. Wait for problems to get reported to the main dnsmasq mailing list. If you monitor this (I know you do), you see that two bugs were reported (and fixed) since the release. The code is always in motion.

Found another culprit, causing 'status 14' to be high.
This time (today), the majority of the errors are caused by an nvidia shield pro (android tv) device.

partial result, illustrating the problem, the actual grep result is much longer.

cat /var/log/pihole/pihole.log | grep cdn-0.nflximg.com
Apr  9 12:55:54 dnsmasq[18854]: 23434 192.168.2.240/53924 query[A] cdn-0.nflximg.com from 192.168.2.240
Apr  9 12:55:54 dnsmasq[18854]: 23434 192.168.2.240/53924 forwarded cdn-0.nflximg.com to fdaa:bbcc:ddee:2::5552
Apr  9 12:55:54 dnsmasq[18854]: 23435 192.168.2.240/57834 query[A] cdn-0.nflximg.com from 192.168.2.240
Apr  9 12:55:54 dnsmasq[18854]: 23434 192.168.2.240/53924 reply cdn-0.nflximg.com is <CNAME>
Apr  9 12:55:54 dnsmasq[18854]: 23436 192.168.2.240/53924 query[AAAA] cdn-0.nflximg.com from 192.168.2.240
Apr  9 12:55:54 dnsmasq[18854]: 23436 192.168.2.240/53924 forwarded cdn-0.nflximg.com to fdaa:bbcc:ddee:2::5552
Apr  9 12:55:54 dnsmasq[18854]: 23437 192.168.2.240/57834 query[AAAA] cdn-0.nflximg.com from 192.168.2.240
Apr  9 12:55:54 dnsmasq[18854]: 23436 192.168.2.240/53924 reply cdn-0.nflximg.com is <CNAME>
Apr  9 12:56:25 dnsmasq[18854]: 23481 192.168.2.240/53924 query[A] cdn-0.nflximg.com from 192.168.2.240
Apr  9 12:56:25 dnsmasq[18854]: 23481 192.168.2.240/53924 forwarded cdn-0.nflximg.com to fdaa:bbcc:ddee:2::5552
Apr  9 12:56:25 dnsmasq[18854]: 23482 192.168.2.240/57834 query[A] cdn-0.nflximg.com from 192.168.2.240
Apr  9 12:56:25 dnsmasq[18854]: 23481 192.168.2.240/53924 reply cdn-0.nflximg.com is <CNAME>
Apr  9 12:56:25 dnsmasq[18854]: 23483 192.168.2.240/53924 query[AAAA] cdn-0.nflximg.com from 192.168.2.240
Apr  9 12:56:25 dnsmasq[18854]: 23483 192.168.2.240/53924 forwarded cdn-0.nflximg.com to fdaa:bbcc:ddee:2::5552
Apr  9 12:56:25 dnsmasq[18854]: 23484 192.168.2.240/57834 query[AAAA] cdn-0.nflximg.com from 192.168.2.240
Apr  9 12:56:25 dnsmasq[18854]: 23483 192.168.2.240/53924 reply cdn-0.nflximg.com is <CNAME>

database entries (for Apr 9 12:55:54 entries):

Although the device is IPv6 capable, I didn't enable IPv6 on this device. However, just like a chromecast, the nvidia shield (or the app?) appears to ignore the fact it shouldn't make AAAA queries if IPv6 isn't enabled. Added (just now) the device to the 'allowAqueriesOnly' group; this will trigger the regex blacklist '.*;querytype=!A' (will everything still work? will find out in a few days, works perfectly for a chromecast). This should eliminate some of the 'status 14' entries (caused by AAAA queries). Nothing I can really do about the repeated A queries (device is fully patched)...

Thanks for all the testing and detailed feedback! I think we're good to release soon. Final testing is already going on behind the scenes.
