Apply Pi-Hole blocking to CNAMEs

unixfox · November 20, 2019, 10:54am

uBlock Origin just introduced a way to detect third-party trackers disguised as first-party scripts: Address 1st-party tracker blocking · Issue #780 · uBlockOrigin/uBlock-issues · GitHub

According to the issue, it is becoming quite common for websites to use this technique to evade adblocker detection. The issue includes a long list of popular websites that are currently using this trick: Address 1st-party tracker blocking · Issue #780 · uBlockOrigin/uBlock-issues · GitHub

Could Pi-hole implement a feature similar to uBlock Origin's, but for domains?

EDIT: This functionality seems to have become quite popular. AdGuard is considering adding a similar feature to their app: Match CNAME records against the blocklists · Issue #1185 · AdguardTeam/AdGuardHome · GitHub. The issue from the uBlock Origin GitHub repo is also currently a top post on /r/privacy: https://old.reddit.com/r/privacy/comments/dyrhg1/to_prevent_thirdparty_trackers_disguised_as/

jfb · November 20, 2019, 1:37pm

Can you summarize what you want Pi-Hole to do that it currently does not? How should this new feature work?

BastienDurel · November 20, 2019, 3:19pm

This was requested in an issue back in 2018: Filtering not performed on canonical names returned by a CNAME record · Issue #2242 · pi-hole/pi-hole · GitHub

jfb · November 20, 2019, 3:34pm

I can't find a feature request that was generated for that item.

BastienDurel · November 20, 2019, 3:56pm

Nor did I

What Pi-hole isn't doing is blocking domains masked by CNAME chains. For example:

bastien@data-bastien:~$ dig f7ds.liberation.fr
;; ANSWER SECTION:
f7ds.liberation.fr.	3369	IN	CNAME	liberation.eulerian.net.
liberation.eulerian.net. 6969	IN	CNAME	atc.eulerian.net.
atc.eulerian.net.	6969	IN	A	109.232.197.179

This request is not blocked, although eulerian.net is in the blocklist:

bastien@data-bastien:~$ dig atc.eulerian.net.
;; ANSWER SECTION:
atc.eulerian.net.	2	IN	A	10.42.2.254

unixfox · November 20, 2019, 4:16pm

Here is an example (the domains are fake and for demonstration purposes only):
The domain adcompany.com is in my blacklist, so it returns the IP of my Pi-Hole if I do a DNS query:

$ host adcompany.com
adcompany.com has address 192.168.1.10

But if I do a DNS query for ad.newspaper.com, it isn't blocked by Pi-hole, even though it's simply an alias (CNAME) for adcompany.com:

$ host ad.newspaper.com
ad.newspaper.com is an alias for adcompany.com.
adcompany.com has address 6.6.6.6

What I would like Pi-hole to do is check whether the queried domain is a CNAME (ad.newspaper.com in the example) and then compare the domain it is aliased to (adcompany.com in the example) against my blacklist. If the alias target is in my blacklist, block the query (by returning the IP of my Pi-hole).
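The requested behaviour can be sketched in a few lines of Python. This is only an illustration of the idea, not how Pi-hole or FTL works: the function names `is_blocked` and `answer_for` are made up for this example, the blacklist is a plain set of exact domain names, and the alias targets are assumed to have already been extracted from the upstream answer.

```python
PIHOLE_IP = "192.168.1.10"  # Pi-hole address taken from the example above


def is_blocked(cname_targets, blacklist):
    """Return True if any alias target in the answer is blacklisted.

    Hypothetical helper: compares names exactly, ignoring case and a
    trailing dot. It does no wildcard or regex matching.
    """
    normalized = {d.rstrip(".").lower() for d in blacklist}
    return any(t.rstrip(".").lower() in normalized for t in cname_targets)


def answer_for(upstream_ip, cname_targets, blacklist):
    """Return the Pi-hole IP when an alias target is blacklisted,
    otherwise pass the upstream address through unchanged."""
    return PIHOLE_IP if is_blocked(cname_targets, blacklist) else upstream_ip


# ad.newspaper.com is an alias for adcompany.com, which is blacklisted.
print(answer_for("6.6.6.6", ["adcompany.com."], {"adcompany.com"}))
# prints 192.168.1.10
```

A query whose answer contains no CNAME has an empty `cname_targets` list, so it passes through with the upstream address.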

jfb · November 20, 2019, 4:21pm

Would this feature request be more correctly titled "continue Pi-Hole blocking all the way through a CNAME chain"?

unixfox · November 20, 2019, 4:30pm

Sure, if you think that's better than the current title.
I'm not very good at choosing a title for a topic, and I understand that some people may not understand the current one correctly.

jfb · November 20, 2019, 4:34pm

I changed the name accordingly. This makes it easier for users to search and clarifies the change request.

drewski · November 20, 2019, 5:11pm

Devices/services/apps without the ability to use browser extensions will be greatly affected once all the advertisers learn the trick. This is needed to stay on par with current functionality.

jfb · November 20, 2019, 5:53pm

In the case presented above (output shortened for clarity), could the user not just block the original domain with a regex?

dig f7ds.liberation.fr
;; ANSWER SECTION:
f7ds.liberation.fr.	3600	IN	CNAME	liberation.eulerian.net.
liberation.eulerian.net. 7200	IN	CNAME	atc.eulerian.net.
atc.eulerian.net.	7200	IN	A	109.232.197.179

DanSchaper · November 20, 2019, 6:09pm

The CNAME is pointing to a bad domain. Why not just add the CNAME to the blocklist/blacklist? Any other re-resolving of things is going to add layers and latency to things.

Check domain against black/block
Resolve domain
Check resolved domain against black/block

I'm not sure offhand if we can even get the intermediate CNAMEs from FTL either.

So something like:

Resolve f7ds.liberation.fr
If f7ds.liberation.fr is CNAME then check CNAME against blacklist
f7ds.liberation.fr is CNAME for liberation.eulerian.net.
If liberation.eulerian.net. is CNAME check against blacklist
liberation.eulerian.net. is CNAME for atc.eulerian.net
If atc.eulerian.net is CNAME check against blacklist
atc.eulerian.net is A.

Versus:

Resolve f7ds.liberation.fr
f7ds.liberation.fr is blacklisted.
Return 0.0.0.0
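To make the comparison concrete, here is a rough Python sketch of the two paths. It is an assumption-laden toy: `RECORDS` is a static table standing in for the upstream resolver (filled with the dig output quoted earlier), `resolve_with_chain_check` is a hypothetical name, and the blacklist is an exact-match set. It only counts blacklist checks and says nothing about real latency inside FTL.

```python
# Static stand-in for upstream DNS, copied from the dig output in this thread.
RECORDS = {
    "f7ds.liberation.fr": ("CNAME", "liberation.eulerian.net"),
    "liberation.eulerian.net": ("CNAME", "atc.eulerian.net"),
    "atc.eulerian.net": ("A", "109.232.197.179"),
}


def resolve_with_chain_check(name, blacklist, max_depth=8):
    """Follow the CNAME chain, checking every name against the blacklist.

    Returns (answer, number_of_blacklist_checks). max_depth is an
    arbitrary guard against CNAME loops.
    """
    checks = 0
    current = name
    for _ in range(max_depth):
        checks += 1
        if current in blacklist:
            return "0.0.0.0", checks
        record_type, value = RECORDS[current]
        if record_type == "A":
            return value, checks
        current = value  # follow the CNAME one hop further
    raise RuntimeError("CNAME chain too long")


# Blocking the final target needs one check per hop in the chain.
print(resolve_with_chain_check("f7ds.liberation.fr", {"atc.eulerian.net"}))
# prints ('0.0.0.0', 3)

# Blacklisting the queried name itself stops at the first check.
print(resolve_with_chain_check("f7ds.liberation.fr", {"f7ds.liberation.fr"}))
# prints ('0.0.0.0', 1)
```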

anon55913113 · November 20, 2019, 7:18pm

Remember that those are wildcards, so you would only have to look at eulerian.net in that example.

These are separate domains, and the check could be triggered only once a CNAME is returned, so normal blocking keeps its speed.

The only slowdown occurs when a CNAME is returned by the upstream server.
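One way to read "wildcard" here is a suffix match: a name counts as blocked if it equals a blocked zone or is a subdomain of one. The Python sketch below illustrates that reading only. `matches_wildcard` and `blocked_zones` are hypothetical names, and this is not Pi-hole's regex engine.

```python
def matches_wildcard(name, blocked_zones):
    """Return True if name equals a blocked zone or is a subdomain of one.

    Matching is done label by label, so "noteulerian.net" does not
    match the zone "eulerian.net".
    """
    labels = name.rstrip(".").lower().split(".")
    suffixes = (".".join(labels[i:]) for i in range(len(labels)))
    return any(suffix in blocked_zones for suffix in suffixes)


blocked_zones = {"eulerian.net"}

print(matches_wildcard("liberation.eulerian.net.", blocked_zones))  # prints True
print(matches_wildcard("atc.eulerian.net.", blocked_zones))         # prints True
print(matches_wildcard("f7ds.liberation.fr.", blocked_zones))       # prints False
```

With a check like this, a single entry for eulerian.net would cover every customer-specific subdomain that shows up as a CNAME target.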

DanSchaper · November 20, 2019, 7:21pm

Where are wildcards being used?

anon55913113 · November 20, 2019, 7:24pm

These companies provide services to clients, so if pi-hole.net were to track us, the domain would be pi-hole.eulerian.net, hence the wildcard.

DanSchaper · November 20, 2019, 7:28pm

I'm not getting you. eulerian.net is now a regex? That adds even more complexity as regex has to be checked on every step.

The issue I see is that we can add code that increases complexity and potential for breakage. It would increase memory consumption as we would have to store the initial query target and then the intermediate targets with pointers back to the initial target to relate the queries all to each other. If there's a CNAME pointing to a CNAME (as in the example case) then we have a stack of queries that need to be kept in memory and linked with pointers to each other.
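The bookkeeping described above could look roughly like the following Python sketch. It is purely illustrative and does not reflect FTL's actual data structures: `QueryRecord` and its fields are invented for this example. Each intermediate target keeps a pointer to the query that led to it, so a block decision made deep in the chain can be traced back to the name the client originally asked for.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueryRecord:
    """One name seen while resolving a query (hypothetical structure)."""
    name: str
    parent: Optional["QueryRecord"] = None  # query that pointed here via CNAME

    def origin(self):
        """Walk the parent pointers back to the client's original query."""
        node = self
        while node.parent is not None:
            node = node.parent
        return node

    def depth(self):
        """Number of records kept in memory for this chain so far."""
        count = 1
        node = self
        while node.parent is not None:
            node = node.parent
            count += 1
        return count


# The chain from the example: every hop adds one more record to keep.
first = QueryRecord("f7ds.liberation.fr")
second = QueryRecord("liberation.eulerian.net", parent=first)
third = QueryRecord("atc.eulerian.net", parent=second)

print(third.origin().name)  # prints f7ds.liberation.fr
print(third.depth())        # prints 3
```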

Or you can add the target to your blacklist. (Chances are that the list maintainers are going to add them anyways...)

anon55913113 · November 20, 2019, 7:35pm

Wildcard has become a regex in Pi-hole but in normal language the meaning is still the same.

DanSchaper · November 20, 2019, 7:35pm

I still don't follow.

anon55913113 · November 20, 2019, 7:42pm

I was afraid of that.

First you have to understand what is happening. You type in www.pi-hole.net and the CNAME sends you to pi-hole.eulerian.net because you are using their services.

We don't want to end up with eulerian.net so we jump ship and can't visit www.pi-hole.net anymore.

I can't see that www.pi-hole.net is transferring to eulerian, and neither can Pi-hole itself. The domain entered is www.pi-hole.net.

DanSchaper · November 20, 2019, 7:47pm

That's not what is happening here. Entire websites are not CNAMEd to another domain. If www.pi-hole.net was actually pi-hole.eulerian.net then the two are fully equivalent and you should not see anything from www.pi-hole.net anyways.