Whitelist always wins

jpgpi250 · January 15, 2023, 9:31am

"Whitelist always wins" has always been the primary rule for handling domains, however...

Since one of the latest versions (109.0.1518.49) of Microsoft Edge (Chromium), the browser sends a lot of HTTPS queries. I block these requests and have done so since this query type was first mentioned here. Apple has been using these queries for a while now, and blocking them has never caused me any problems (even an OS upgrade, Monterey over Big Sur, succeeds). I'm using:

.*;querytype=HTTPS;reply=nodata

Since the Edge update I have noticed a massive number of HTTPS queries. So far, the block has had no negative impact on the browser experience.

todays HTTPS count:

I looked into the data returned by these queries and noticed that it can contain a CNAME and/or IP address information.

The problem: I want to block all HTTPS queries as long as there is no negative impact on the browser experience. I don't expect that to happen any time soon, since Firefox (and possibly other browsers) doesn't use HTTPS queries at all.

For my banking app (as an example), I need to whitelist some domains to be able to accept the initial cookie warning and log in. Pi-hole sees the whitelist entry and therefore forwards all queries for this domain, including the HTTPS queries, so the browser receives the additional information that I wanted to block.

This is not a minor problem. To ensure that HTTPS queries do not get an answer, I applied a Suricata rule rejecting all HTTPS queries on the WAN. Despite the Pi-hole regex, I got a lot of reject alerts. I finally figured out this was due to the Pi-hole whitelist entries and had to suppress the alert in Suricata (suppress means the rule is still applied, but no alert entry is logged).

Possible solution (@DL6ER feasible?): ~~a domainlist entry type that takes precedence over whitelist entries, e.g. evaluated before whitelist entries are checked?~~ The solution proposed by yubiuser looks like a better approach. It would be great if querytype= allowed specifying multiple query types.

yubiuser · January 15, 2023, 9:58am

You could try removing the general whitelist entry and adding a regex whitelist entry for that specific domain that applies only when the query type is not HTTPS.

jpgpi250 · January 15, 2023, 10:42am

This works (example) and looks like a good starting point for a solution:

assets.adobedtm.com;querytype=!HTTPS

It keeps the option open to whitelist specific HTTPS queries. Unfortunately, I also want to (and already do) block all ANY and SVCB queries. It doesn't appear to be possible to specify multiple query types (I looked at the docs). It may be a feature that requires a specific syntax which I don't know.
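
The effect of the typed whitelist entry can be illustrated with a small Python toy model. This is only a sketch of the precedence described in this thread (whitelist checked first, then blacklist regexes), not FTL's actual code. The `Rule` class and the `parse` and `decide` functions are hypothetical names, exact whitelist entries are modelled as plain regexes for simplicity, and only a single query type with an optional leading `!` is handled.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    pattern: str                      # regex part before the first ';'
    querytype: Optional[str] = None   # "HTTPS", "!HTTPS", or None for any type

    def matches(self, domain: str, qtype: str) -> bool:
        if re.search(self.pattern, domain) is None:
            return False
        if self.querytype is None:
            return True
        if self.querytype.startswith("!"):
            return qtype != self.querytype[1:]
        return qtype == self.querytype


def parse(entry: str) -> Rule:
    """Split 'regex;key=value;...' and keep only the querytype option."""
    pattern, *options = entry.split(";")
    querytype = None
    for option in options:
        key, _, value = option.partition("=")
        if key == "querytype":
            querytype = value
    return Rule(pattern, querytype)


def decide(domain: str, qtype: str, whitelist: List[Rule], blacklist: List[Rule]) -> str:
    # The whitelist is consulted first, so a matching entry always wins.
    if any(rule.matches(domain, qtype) for rule in whitelist):
        return "forwarded"
    if any(rule.matches(domain, qtype) for rule in blacklist):
        return "blocked"
    return "forwarded"


blacklist = [parse(".*;querytype=HTTPS;reply=nodata")]
plain = [parse("assets.adobedtm.com")]
typed = [parse("assets.adobedtm.com;querytype=!HTTPS")]

print(decide("assets.adobedtm.com", "HTTPS", plain, blacklist))  # forwarded
print(decide("assets.adobedtm.com", "HTTPS", typed, blacklist))  # blocked
print(decide("assets.adobedtm.com", "A", typed, blacklist))      # forwarded
```

With the plain whitelist entry, the HTTPS query never reaches the blacklist regex. With the `querytype=!HTTPS` entry, the whitelist no longer matches HTTPS queries, so the blacklist regex gets its chance.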

DL6ER · January 15, 2023, 11:42am

I arrived here the very second you made your edit (I saw the text changing), so that was good timing. Allowing multiple query types to be specified would blow up the regex data structure quite significantly, but I will check and come back here.

DL6ER · January 15, 2023, 12:22pm

Okay, we now have a custom branch

new/regex_multiple_query_types

which supports a comma-separated list of query types, like

abcabc;querytype=HTTPS,SVCB,TXT

Note that combinations with inversions may do strange things, e.g. !HTTPS,HTTPS will obviously match all possible query types, so it is not very useful even if possible. Consider this an alpha feature: I did some limited, but by no means extensive, testing, so any feedback is, as always, appreciated.

The CI will need a few moments until the binaries are ready.

jpgpi250 · January 15, 2023, 3:23pm

The web interface doesn't allow entering this, so I had to edit existing SQL entries (and stop/start FTL).

It doesn't do what I expected. I entered (maybe this isn't a correct entry?):
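
For readers who want to try the same route, here is a hedged Python sketch of such an SQL edit. It runs against an in-memory stand-in table rather than the real database. The table name `domainlist` and the columns `type`, `domain` and `enabled` are assumptions about the gravity database layout (the real table has more columns), and `set_regex` is a hypothetical helper. Type 2 is the regex whitelist, as mentioned in this post.

```python
import sqlite3


def set_regex(conn: sqlite3.Connection, old: str, new: str, list_type: int = 2) -> int:
    """Replace the text of one existing entry; returns the number of rows changed."""
    cur = conn.execute(
        "UPDATE domainlist SET domain = ? WHERE domain = ? AND type = ?",
        (new, old, list_type),
    )
    conn.commit()
    return cur.rowcount


# Stand-in for the real database file (assumption: simplified schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE domainlist ("
    "id INTEGER PRIMARY KEY, type INTEGER, domain TEXT, enabled INTEGER DEFAULT 1)"
)
conn.execute(
    "INSERT INTO domainlist (type, domain) VALUES (2, 'assets.adobedtm.com;querytype=!HTTPS')"
)

changed = set_regex(
    conn,
    "assets.adobedtm.com;querytype=!HTTPS",
    "assets.adobedtm.com;querytype=!ANY,!HTTPS,!SVCB",
)
print(changed)  # 1
```

Restricting the update to one `type` avoids touching a blacklist entry that happens to have the same text. FTL still has to be restarted afterwards, as noted above.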

assets.adobedtm.com;querytype=!ANY,!HTTPS,!SVCB

The type is 2 (regex whitelist, according to the docs).
The goal is to ~~block all ANY, HTTPS and SVCB queries, allow everything else for this domain.~~ whitelist the domain except when the query type is ANY, HTTPS or SVCB. In that case, the respective blacklist regexes will be applied (these are existing entries):

.*;querytype=ANY;reply=refused
.*;querytype=HTTPS;reply=nodata
.*;querytype=SVCB;reply=nodata

but
A -> 0.0.0.0
ANY -> valid result (CNAME) -> no error
HTTPS -> valid result (CNAME) -> no error
SVCB -> no result (SVCB record doesn't exist?) -> no error

Also, the FTL log has several entries like the following. Unfortunately, it doesn't indicate which regex each hint is about:

[2023-01-15 15:59:45.487 28711/T28727]     Hint: This regex matches only specific query types:
[2023-01-15 15:59:45.488 28711/T28727]       - AAAA
[2023-01-15 15:59:45.488 28711/T28727]       - ANY
[2023-01-15 15:59:45.488 28711/T28727]       - SRV
[2023-01-15 15:59:45.489 28711/T28727]       - SOA
[2023-01-15 15:59:45.489 28711/T28727]       - PTR
[2023-01-15 15:59:45.490 28711/T28727]       - TXT
[2023-01-15 15:59:45.490 28711/T28727]       - NAPTR
[2023-01-15 15:59:45.490 28711/T28727]       - MX
[2023-01-15 15:59:45.490 28711/T28727]       - DS
[2023-01-15 15:59:45.491 28711/T28727]       - RRSIG
[2023-01-15 15:59:45.491 28711/T28727]       - DNSKEY
[2023-01-15 15:59:45.491 28711/T28727]       - NS
[2023-01-15 15:59:45.492 28711/T28727]       - OTHER
[2023-01-15 15:59:45.492 28711/T28727]       - SVCB
[2023-01-15 15:59:45.492 28711/T28727]       - HTTPS
[2023-01-15 15:59:45.493 28711/T28727]     Hint: This regex matches only specific query types:
[2023-01-15 15:59:45.493 28711/T28727]       - ANY
[2023-01-15 15:59:45.494 28711/T28727]     Hint: This regex matches only specific query types:
[2023-01-15 15:59:45.494 28711/T28727]       - HTTPS
[2023-01-15 15:59:45.496 28711/T28727]     Hint: This regex matches only specific query types:
[2023-01-15 15:59:45.497 28711/T28727]       - PTR
[2023-01-15 15:59:45.498 28711/T28727]     Hint: This regex matches only specific query types:
[2023-01-15 15:59:45.498 28711/T28727]       - ANY
[2023-01-15 15:59:45.498 28711/T28727]       - HTTPS

DL6ER · January 15, 2023, 8:04pm

It should be possible, I was able to add them exactly here, too:

But going the SQL path is obviously okay, too.

DEBUG_REGEX=true should be helpful here, this seems to have slipped through into non-debug mode.

What do you want this to do? You are saying: Everything except ANY and everything except HTTPS and everything except SVCB. This means: Everything.

I see you want to say: Everything except the three. This isn't possible at the moment but you can go the inverse way and say that something should be active for A,AAAA,TXT and whatever you want more.
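
The reasoning above can be made concrete with a short Python sketch. It assumes that, in this alpha branch, each list item is evaluated on its own and the results are OR-ed together, which is consistent with the description here but is not taken from the FTL source. `ALL_TYPES` collects the type names seen in this thread's log output plus A, and `union_semantics` is a hypothetical name.

```python
ALL_TYPES = {
    "A", "AAAA", "ANY", "SRV", "SOA", "PTR", "TXT", "NAPTR",
    "MX", "DS", "RRSIG", "DNSKEY", "NS", "OTHER", "SVCB", "HTTPS",
}


def union_semantics(spec: str) -> set:
    """Model: every item contributes its own set, and the sets are united."""
    matched = set()
    for item in spec.split(","):
        if item.startswith("!"):
            # "everything except X" already covers all other types
            matched |= ALL_TYPES - {item[1:]}
        else:
            matched.add(item)
    return matched


print(union_semantics("!ANY,!HTTPS,!SVCB") == ALL_TYPES)  # True
print(union_semantics("!HTTPS,HTTPS") == ALL_TYPES)       # True
print(sorted(union_semantics("A,AAAA,TXT")))              # ['A', 'AAAA', 'TXT']
```

Under this model, `!ANY` alone already includes HTTPS and SVCB, and `!HTTPS` adds ANY back, so the combined entry matches every query type.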

jpgpi250:

In that case, the respective blacklist regexes will be applied (these are existing entries):
.*;querytype=ANY;reply=refused
.*;querytype=HTTPS;reply=nodata
.*;querytype=SVCB;reply=nodata
but
A -> 0.0.0.0
ANY -> valid result (CNAME) -> no error
HTTPS -> valid result (CNAME) -> no error
SVCB -> no result (SVCB record doesn't exist?) -> no error

Can you check with DEBUG_QUERIES=true, possibly also DEBUG_REGEX=true what the lines in FTL.log corresponding to the queries you mentioned here are? We're currently busy with testing + preparing the next release of Pi-hole but I should have some more time over the next days.

jpgpi250 · January 15, 2023, 9:07pm

regex_multiple_query_types.zip (12.4 KB)

But this is exactly what is needed to ensure everything gets resolved except the specific query types, as opposed to specifying all the allowed types, which would make a whitelist regex very long. Example whitelist regex:

 ^(.+\.)?(facebook|fb(cdn|sbx)?|tfbnw)\.[^.]+$;querytype=A,AAAA,SRV,SOA,PTR,TXT,NAPTR,MX,DS,RRSIG,DNSKEY,NS,OTHER

A quick check of the dnsmasq man page doesn't reveal any possibilities.
Is something like assets.adobedtm.com;querytype=!(ANY,HTTPS,SVCB) an option?

DL6ER · January 28, 2023, 2:33pm

Sorry for the long delay, there was so much v6.0 stuff going on, that this slipped my attention in between.

I just pushed a new commit implementing something like

jpgpi250:

assets.adobedtm.com;querytype=!(ANY,HTTPS,SVCB)

just without the extra (). If the first character is an !, the list is used as "everything except". If there is no ! (a simple list like A,AAAA), the usual interpretation of "only these" is used. This was considerably easier to implement and is fully backwards compatible with what we have now.

The

jpgpi250:

[2023-01-15 15:59:45.487 28711/T28727]     Hint: This regex matches only specific query types:
[2023-01-15 15:59:45.488 28711/T28727]       - AAAA
[2023-01-15 15:59:45.488 28711/T28727]       - ANY
[2023-01-15 15:59:45.488 28711/T28727]       - SRV
[2023-01-15 15:59:45.489 28711/T28727]       - SOA

lines will only be shown in regex debug mode.

I also already added automatic tests for

;querytype=ANY,SRV,HTTPS - matching ONLY ANY, SRV, and HTTPS type queries, and
;querytype=!ANY,SRV,HTTPS - matching everything EXCEPT ANY, SRV, and HTTPS type queries.
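
A minimal Python sketch of the rule as described in this post: a leading `!` inverts the whole comma-separated list, otherwise the list means "only these". The function name `querytypes` and the `ALL_TYPES` set (type names taken from this thread's log output plus A) are illustrative assumptions. How the real implementation treats unknown type names or a `!` on a later item is not described here and is not modelled.

```python
ALL_TYPES = {
    "A", "AAAA", "ANY", "SRV", "SOA", "PTR", "TXT", "NAPTR",
    "MX", "DS", "RRSIG", "DNSKEY", "NS", "OTHER", "SVCB", "HTTPS",
}


def querytypes(spec: str) -> set:
    """Return the set of query types a ';querytype=' value would match."""
    inverted = spec.startswith("!")
    listed = set((spec[1:] if inverted else spec).split(","))
    # Leading '!': everything except the list. Otherwise: only the list.
    return ALL_TYPES - listed if inverted else listed


print(sorted(querytypes("ANY,SRV,HTTPS")))    # ['ANY', 'HTTPS', 'SRV']
print("A" in querytypes("!ANY,SRV,HTTPS"))    # True
print("SRV" in querytypes("!ANY,SRV,HTTPS"))  # False
```

Because only the first character is inspected, existing single-type entries such as `querytype=!HTTPS` keep their meaning, which matches the backwards compatibility mentioned above.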

Please test away. The binaries already finished compiling while I have been writing this reply.

edit Related pull request (draft):

https://github.com/pi-hole/FTL/pull/1527

jpgpi250 · January 29, 2023, 9:53am

pihole checkout ftl new/regex_multiple_query_types

Tried various combinations, see below. player.h-cdn.com is blocked by gravity; www.sevenforums.com isn't (no matches).

changed some entries (enabled/disabled) to check the result, some pihole log examples:

Jan 29 09:31:56 dnsmasq[13776]: 315 192.168.2.227/56846 query[A] download.microsoft.com from 192.168.2.227
Jan 29 09:31:56 dnsmasq[13776]: 315 192.168.2.227/56846 forwarded download.microsoft.com to 127.10.10.2#5552

Jan 29 09:33:07 dnsmasq[13776]: 326 127.0.0.1/53490 query[HTTPS] download.microsoft.com from 127.0.0.1
Jan 29 09:33:07 dnsmasq[13776]: 326 127.0.0.1/53490 regex blacklisted download.microsoft.com is NODATA

Jan 29 09:33:55 dnsmasq[13776]: 430 127.0.0.1/47141 query[SVCB] download.microsoft.com from 127.0.0.1
Jan 29 09:33:55 dnsmasq[13776]: 430 127.0.0.1/47141 regex blacklisted download.microsoft.com is NODATA

Jan 29 09:34:14 dnsmasq[13776]: 431 127.0.0.1/44947 query[SRV] download.microsoft.com from 127.0.0.1
Jan 29 09:34:14 dnsmasq[13776]: 431 127.0.0.1/44947 forwarded download.microsoft.com to 127.10.10.2#5552

Jan 29 09:33:19 dnsmasq[16039]: 329 127.0.0.1/59917 query[ANY] download.microsoft.com from 127.0.0.1
Jan 29 09:33:19 dnsmasq[16039]: 329 127.0.0.1/59917 regex blacklisted download.microsoft.com is NODATA


Jan 29 09:55:21 dnsmasq[13776]: 805 127.0.0.1/37386 query[A] www.sevenforums.com from 127.0.0.1
Jan 29 09:55:21 dnsmasq[13776]: 805 127.0.0.1/37386 forwarded www.sevenforums.com to 127.10.10.2#5552

Jan 29 09:55:41 dnsmasq[20733]: 807 127.0.0.1/56873 query[ANY] www.sevenforums.com from 127.0.0.1
Jan 29 09:55:41 dnsmasq[20733]: 807 127.0.0.1/56873 regex blacklisted www.sevenforums.com is NODATA

Jan 29 09:56:15 dnsmasq[13776]: 909 127.0.0.1/38449 query[HTTPS] www.sevenforums.com from 127.0.0.1
Jan 29 09:56:15 dnsmasq[13776]: 909 127.0.0.1/38449 regex blacklisted www.sevenforums.com is NODATA

Everything appears to be working, but it takes a lot of evaluation to verify that an answer is correct. For example:
It may be very confusing why query type A is allowed (using the regex player.h-cdn.com;querytype=!ANY,SVCB,HTTPS) because pihole -q player.h-cdn.com only returns:

Match found in https://www.github.developerdan.com/hosts/lists/ads-and-tracking-extended.txt:
   player.h-cdn.com

In order to find the reason why the A query is allowed, you need to run pihole -q player.h-cdn.com AND pihole-FTL regex-test player.h-cdn.com

The solution implies there will be no exact whitelist entries, only regex whitelist entries, when attempting to block a specific query type.

a whitelist example (domain is blocked due to gravity entry):

pihole -q ab.tweakers.nl

 Match found in https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts:
   ab.tweakers.nl

result (blocked A, allowed AAAA, allowed SRV)

Jan 29 10:42:55 dnsmasq[13776]: 1263 127.0.0.1/60238 query[A] ab.tweakers.nl from 127.0.0.1
Jan 29 10:42:55 dnsmasq[13776]: 1263 127.0.0.1/60238 gravity blocked ab.tweakers.nl is 0.0.0.0

Jan 29 10:43:07 dnsmasq[13776]: 1265 127.0.0.1/50251 query[AAAA] ab.tweakers.nl from 127.0.0.1
Jan 29 10:43:07 dnsmasq[13776]: 1265 127.0.0.1/50251 forwarded ab.tweakers.nl to 127.10.10.2#5552

Jan 29 10:50:30 dnsmasq[13776]: 1301 127.0.0.1/43596 query[SRV] ab.tweakers.nl from 127.0.0.1
Jan 29 10:50:30 dnsmasq[13776]: 1301 127.0.0.1/43596 forwarded ab.tweakers.nl to 127.10.10.2#5552
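
Since checking results by eye takes a lot of evaluation, a small Python helper can tabulate the outcome per query type. This is only a sketch: the `summarize` function is hypothetical, it assumes the log line layout seen in the excerpts of this post (a query ID and client/port column after the `dnsmasq[pid]:` prefix), and it only recognises the three outcomes that appear here.

```python
import re

QUERY = re.compile(r"dnsmasq\[\d+\]: (\d+) \S+ query\[(\w+)\] (\S+) from")
REPLY = re.compile(
    r"dnsmasq\[\d+\]: (\d+) \S+ (forwarded|regex blacklisted|gravity blocked) (\S+)"
)


def summarize(log_text: str) -> list:
    """Pair each query line with its outcome line via the query ID."""
    pending = {}   # query ID -> (domain, query type)
    results = []
    for line in log_text.splitlines():
        match = QUERY.search(line)
        if match:
            pending[match.group(1)] = (match.group(3), match.group(2))
            continue
        match = REPLY.search(line)
        if match and match.group(1) in pending:
            domain, qtype = pending.pop(match.group(1))
            results.append((domain, qtype, match.group(2)))
    return results


SAMPLE = """\
Jan 29 10:42:55 dnsmasq[13776]: 1263 127.0.0.1/60238 query[A] ab.tweakers.nl from 127.0.0.1
Jan 29 10:42:55 dnsmasq[13776]: 1263 127.0.0.1/60238 gravity blocked ab.tweakers.nl is 0.0.0.0
Jan 29 10:43:07 dnsmasq[13776]: 1265 127.0.0.1/50251 query[AAAA] ab.tweakers.nl from 127.0.0.1
Jan 29 10:43:07 dnsmasq[13776]: 1265 127.0.0.1/50251 forwarded ab.tweakers.nl to 127.10.10.2#5552
Jan 29 10:50:30 dnsmasq[13776]: 1301 127.0.0.1/43596 query[SRV] ab.tweakers.nl from 127.0.0.1
Jan 29 10:50:30 dnsmasq[13776]: 1301 127.0.0.1/43596 forwarded ab.tweakers.nl to 127.10.10.2#5552
"""

for domain, qtype, outcome in summarize(SAMPLE):
    print(f"{domain:16} {qtype:5} {outcome}")
```

Matching on the query ID rather than on line order keeps the pairing correct when queries from several clients are interleaved in the log.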

DL6ER · January 29, 2023, 7:14pm

Thanks for your testing, the PR is now open for review + merge for the next FTL release (which should not take long as dnsmasq v2.89 is expected soon, too).

jpgpi250:

it may be very confusing why query type A is allowed (using the regex player.h-cdn.com;querytype=!ANY,SVCB,HTTPS) because pihole -q player.h-cdn.com only returns:
Match found in https://www.github.developerdan.com/hosts/lists/ads-and-tracking-extended.txt:
   player.h-cdn.com
In order to find the reason why the A query is allowed, you need to run pihole -q player.h-cdn.com AND pihole-FTL regex-test player.h-cdn.com

This is going to be improved with v6.0 because the result of pihole -q will be provided by FTL itself and not by a bash script. FTL knows how to handle the different types and this all will be reflected in the output.

system · February 19, 2023, 7:14pm

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.