How do I block ads on YouTube?

Yes.

Very cool. Thanks!

We know from past experience that dnsmasq can handle over a million host entries. Google's reach is vast, so who knows how many fingerprints they use.

I was thinking about this script last night, and if you really want to try it, you can put the brace expansion into an array so the domains don't have to be worked out as a community effort.

# Brace expanded array of Google Video fingerprints from sn-aaaaaaaa to sn-99999999
fingerprints=($(echo sn-{{a..z},{0..9}}{{a..z},{0..9}}{{a..z},{0..9}}{{a..z},{0..9}}{{a..z},{0..9}}{{a..z},{0..9}}{{a..z},{0..9}}{{a..z},{0..9}}))
# For each fingerprint,
for ((i = 0; i < "${#fingerprints[@]}"; i++)); do
  # Generate the list based on https://discourse.pi-hole.net/t/how-do-i-block-ads-on-youtube/253/11?u=jacob.salmela
  echo r{1..20}---${fingerprints[$i]}.googlevideo.com
  echo r{1..20}.${fingerprints[$i]}.googlevideo.com
done

Oh, and just a tip: don't try this on an older Pi model. I was experimenting and it locked up my Pi for a good while. Even Ctrl+C couldn't cancel it :smile:
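
For a rough idea of why the expansion locks up a Pi, here is a back-of-the-envelope count of what the script tries to generate. It is only a sketch: the variable names are made up for illustration, and it assumes the 8-position, 36-character fingerprint space used in the script above.

```bash
#!/usr/bin/env bash
# Each fingerprint position is one of 26 letters or 10 digits.
alphabet_size=$((26 + 10))
# The brace expansion above covers 8 positions.
fingerprint_length=8
# Bash arithmetic is 64-bit, so 36**8 does not overflow.
fingerprint_count=$((alphabet_size ** fingerprint_length))
# Two echo lines per fingerprint, each expanding r{1..20}.
hostnames_per_fingerprint=$((2 * 20))
total_hostnames=$((fingerprint_count * hostnames_per_fingerprint))
echo "fingerprints: ${fingerprint_count}"
echo "hostnames:    ${total_hostnames}"
```

That is roughly 2.8 trillion fingerprints before the r1 to r20 prefixes are even applied, and the array has to hold all of them in memory at once.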

A few additional observations on my end:

sn- is indeed always the start.
The signature that follows is either 8 characters long,
or it is 7 characters long followed by a - and then either 4 letters or 2 letters and 1 number.

sn-a1b2c3d4
sn-a1b2c3d-abcd
sn-a1b2c4d-ab1

There seems to be no preference for any particular characters, except that letters show up more often than numbers simply because there are more of them.
From what I can tell, there is no realistic way Pi-hole could handle a list containing all of these possible entries, so it might not be possible this way unless we can use regular expressions or something similar in these lists.
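
To make the observed patterns concrete, here is a sketch of what a regular expression for them could look like in bash. The helper name is hypothetical, and the pattern assumes lowercase letters and digits only, the letters-then-number order from the example above, and the r<number> prefix joined by "---" or "." as in the generator script. It only tests a string; it does not plug into Pi-hole or dnsmasq.

```bash
#!/usr/bin/env bash
# Fingerprint shapes observed in this thread (assumed, not confirmed by Google):
#   sn- + 8 characters
#   sn- + 7 characters + "-" + 4 letters
#   sn- + 7 characters + "-" + 2 letters and 1 digit
fingerprint='sn-([a-z0-9]{8}|[a-z0-9]{7}-([a-z]{4}|[a-z]{2}[0-9]))'
pattern="^r[0-9]+(---|\.)${fingerprint}\.googlevideo\.com\$"

# Hypothetical helper: succeeds if the hostname matches the observed patterns.
matches_fingerprint_host() {
  [[ $1 =~ $pattern ]]
}

for host in r3---sn-a1b2c3d4.googlevideo.com manifest.googlevideo.com; do
  if matches_fingerprint_host "$host"; then
    echo "$host looks like a fingerprinted host"
  else
    echo "$host does not match"
  fi
done
```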

But I hope we can get this done, because I really don't like ads! xD

@jacob.salmela Can you humor me with this idea:

Create a script to generate the entire array on a more powerful machine (the one from above seems to be a good starting point). Then have a function that pings each of those domains. If a domain responds, the script writes it to a separate list. If it does not respond, the script drops it.

The idea is to narrow the list down to the possible ad-serving domains that actually respond, as opposed to having millions of dud domains on the Pi.

Once the list is filtered, upload it to the Pi and blacklist it.

I want to actually do this, but I don't know how to create shell scripts efficiently. I apologize if I sound bossy and demanding :disappointed:.

@Excelerate246 We have an implementation for wildcard blocking already available that might make it into the next release:

You could maybe find a simple wildcard routine that can do what you want with one single entry?... Just thinking aloud.

The PR @DL6ER referenced should address this issue. But it would still be fun to try @Excelerate246's idea to generate the list and then ping each one to see if it's alive. It may be inaccurate if they are blocking ICMP though, but I'm sure you could get a good idea. It actually wouldn't be that difficult. You just need another for or while loop that goes through the array and pings each one (or use the same one), and then use an if else to see if it's pingable.
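
A minimal sketch of that loop, assuming one candidate hostname per line on stdin. The function names are hypothetical, and the ping flags assume Linux iputils (-c 1 sends one request, -W 1 waits at most one second). The probe is passed in by name so it can be swapped for something else if ICMP turns out to be blocked.

```bash
#!/usr/bin/env bash
# Default probe: a single ICMP echo request, output discarded.
probe_host() {
  ping -c 1 -W 1 "$1" > /dev/null 2>&1
}

# Read hostnames from stdin and print only those the probe succeeds on.
# The optional first argument is the name of the probe command to use.
filter_alive() {
  local probe="${1:-probe_host}" host
  while IFS= read -r host; do
    # Skip empty lines.
    [[ -z $host ]] && continue
    # Redirect stdin so the probe cannot swallow the rest of the list.
    if "$probe" "$host" < /dev/null; then
      echo "$host"
    fi
  done
}
```

Something like `filter_alive < candidates.txt > alive.txt` would then do the filtering; both file names are placeholders.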

Again, might be more trouble than it's worth since wildcarding may arrive soon, but nonetheless, I'm sure @jacob.salmela would think it is fun to write the script.

Unfortunately, the new wildcards don't seem to work for this: if you wildcard googlevideo.com, it simply won't load any videos anymore.

I'm guessing the googlevideo.com domain is used for the videos themselves, and the fingerprint subdomains are used for the ads. We might still be able to use the script to get the fingerprints and then wildcard the rest of the subdomains.

This is, of course, assuming @Excelerate246's findings are true, but I think it's a step in the right direction.

From what I can tell, the Pi could never handle that many entries.

I also found that YouTube connects to 3 things on the "googlevideo.com" domain:

manifest.googlevideo.com
redirector.googlevideo.com
and then the ones with the fingerprint.

But the whitelist does not seem to work on wildcard entries, so I cannot exclude those entries to test.

Yes. Due to the way wildcards work it is not possible to whitelist any domains (the wildcard will always take preference in the DNS server backend!).

Would there not be any way to add a regular expression to the lists? This would make it very easy to add any kind of crap Google seems to try here.

No, the way DNS works is the limiter here. Let's oversimplify things a bit, so it gets more obvious why wildcard blacklisting does what it is doing: Let's assume we want to visit awesome.domain.de.

  1. The root servers will be asked if they know .de and will give us the address of a server that knows all domains registered under .de

  2. The .de server will know the host name and will return the address of the server that manages domain.de

  3. This server will eventually be asked for the address of awesome.domain.de and will give us the final address to which we will connect.

You see, DNS works from right to left. If we now wildcard block domain.de, none of the above steps will happen. Instead, the Pi-hole will immediately answer with its own IP (regardless of the subdomain).

I know that this might be inconvenient, but rest assured that we had some sleepless nights scratching our heads over how to make it better, and there seems to be no better way with the DNS resolver dnsmasq, which we are using.
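
The right-to-left walk and the effect of a wildcard block can be sketched in a few lines of bash. This only models the simplified description above, not what dnsmasq does internally, and both function names are made up for illustration.

```bash
#!/usr/bin/env bash
# Print the zones that get consulted for a name, starting from the TLD.
delegation_chain() {
  local name="$1" suffix="" i
  local -a labels
  # Split the name into its dot-separated labels.
  IFS='.' read -r -a labels <<< "$name"
  # Walk the labels from right to left, growing the suffix each step.
  for ((i = ${#labels[@]} - 1; i >= 0; i--)); do
    suffix="${labels[i]}${suffix:+.${suffix}}"
    echo "$suffix"
  done
}

# Succeeds if the host is the blocked domain itself or any subdomain of it.
is_wildcard_blocked() {
  local host="$1" blocked="$2"
  [[ $host == "$blocked" || $host == *".$blocked" ]]
}

delegation_chain awesome.domain.de
if is_wildcard_blocked awesome.domain.de domain.de; then
  echo "awesome.domain.de is answered locally, no lookup happens"
fi
```

Because the check only looks at the right-hand end of the name, everything to the left of domain.de is irrelevant, which is why a single wildcard entry catches every subdomain.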

Thanks for the information.

Using this post: https://discourse.pi-hole.net/t/how-do-i-add-wildcard-sites-to-the-blacklist/337

I was able to whitelist the mentioned domains, and the tail log shows the queries going through, but there is still no video.
Also, with this method the query log still shows them as piholed, even though the tail log shows them going through.
This leads me to believe that this won't work.

Well, Yes and No. Let me explain:

Yes, indeed, this method works. However, we did not implement it in the way you added it now, since (due to the right-to-left nature) this is also a wildcard whitelisting.

Say you wildcard blocklist domain.de and wildcard whitelist something.domain.de. While e.g. trash.domain.de will still be blocked, some.other.ad.something.domain.de will be permitted. This might be unexpected.
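
The behaviour in that example amounts to "the most specific matching rule wins". Here is a small bash model of it, using the two rules from the example. The arrays and function names are hypothetical, and this is not how dnsmasq is implemented; it only reproduces the outcome described above.

```bash
#!/usr/bin/env bash
# Rule tables taken from the example above, for illustration only.
wildcard_block=("domain.de")
wildcard_allow=("something.domain.de")

# Print the length of the longest rule that equals the host or is a
# parent domain of it; print 0 if no rule matches.
longest_match() {
  local host="$1" rule best=0
  shift
  for rule in "$@"; do
    if [[ $host == "$rule" || $host == *".$rule" ]] && (( ${#rule} > best )); then
      best=${#rule}
    fi
  done
  echo "$best"
}

# Print "blocked" only if a block rule is more specific than any allow rule.
decide() {
  local host="$1" block allow
  block=$(longest_match "$host" "${wildcard_block[@]}")
  allow=$(longest_match "$host" "${wildcard_allow[@]}")
  if (( block > allow )); then echo "blocked"; else echo "allowed"; fi
}

decide trash.domain.de
decide some.other.ad.something.domain.de
```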

No, it shows it because it matches the wildcard blacklist filter. The wildcard whitelisting is not (officially) supported and hence the filter does not know about it.

This could indeed cause problems, but for this test it whitelisted what I needed, so the test is still valid.
It blocked all calls to any of the "fingerprinted" domains and let the 2 aforementioned domains through, and there was still no video. This means that YouTube sends not only its ads via those domains, but also the content. So blacklisting them, with a wildcard or otherwise, is not our solution for blocking these ads.

This is a greatly appreciated result!

Isn't it possible to make the name itself accept wildcards? Then we wouldn't have to wildcard a whole domain.
What I mean is this:
The current situation is that we wildcard blacklist example.domain, so everything under that domain is blacklisted.

What I hope we can achieve is this: some*.example.domain, where the * is the wildcard, so that domains like the following are blacklisted:
-somestart.example.domain
-something.example.domain
-someinsert.example.domain

but helliamallowed.example.domain is allowed and not blacklisted?

I hope the point I'm trying to make is clear enough. I think it would have to be implemented as a sort of mask, like an IPv4 mask: everything that matches the mask is blacklisted and everything that doesn't is allowed through.
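
To illustrate the kind of matching meant here, this is how such a mask behaves as a plain shell glob. It is only a sketch of the idea with a made-up function name, not something Pi-hole or dnsmasq offers.

```bash
#!/usr/bin/env bash
# Succeeds if the host matches the mask, where * stands for any characters.
matches_mask() {
  local host="$1" mask="$2"
  # Leaving $mask unquoted makes bash treat it as a glob pattern.
  [[ $host == $mask ]]
}

mask='some*.example.domain'
for host in somestart.example.domain something.example.domain helliamallowed.example.domain; do
  if matches_mask "$host" "$mask"; then
    echo "$host: blacklisted"
  else
    echo "$host: allowed"
  fi
done
```

One thing to keep in mind is that a glob * also matches dots, so some.deeper.example.domain would hit this mask as well.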

Yes, I understand your request and would be really happy to do this. However, you have to keep in mind that we use the DNS resolver dnsmasq in the backend. This service is not capable of doing what you suggested. Therefore, we cannot offer this to our users. Even if we found the time to implement this in dnsmasq (which is close to impossible with the current workload of us developers), we would still have to wait quite a long time until the new version of dnsmasq shipped in the distributions' package systems.

I understand, yes. I don't know much about dnsmasq, so that's too bad :frowning:

Thanks for replying and taking the time to explain. And thanks for developing this, it's awesome :slight_smile:

I have checked https://dnsdumpster.com and it shows a few domains related to googlevideo.com. Do they change each day?

Like I posted before, the content AND the ads come from the same domains.
So blocking these domains is not a good idea.
You will end up blocking both the ads and what you want to see.

Sadly, this is not the solution to YouTube ads. I'm still looking for more, but I'm pretty sure we can't block YouTube ads by domain.