YouTube script seems to be working very well

(copy of my original Reddit post here)

After a day and a half or so of running, the auto-populated list is at 122 hostnames, and I can count the ads I have seen on one hand. Only one since this morning, and YouTube has been running all day.
You may well see ads appear, but if they are from a new googlevideo hostname, it will be added to the list when you run the script again. Mine is cron'd to run every few minutes and has been working well.

[original post]
A year and a half ago I made a post which showed how I was able to block, or at least reduce, ads on YouTube. Many people commented that they had decent, if mixed, results.

Quarantine time has my mind wandering a bit from my work and I thought to revisit this. After poring over hundreds of megabytes of tcpdumps I found that name lookups are done on some of the googlevideo hostnames immediately before an ad runs. The returned IPs were often different from those in earlier lookups.

So on a hunch I wrote this crap script this morning more as a "what if?" and have been running it all day. It seems to be working well; two hours of Peppa Pig as a test and no ads. (mind numbing...)

It's a very short script and up on Gitlab so others can mess around with it and see how it works for them. More of a proof-of-concept before I do anything further.

Running random scripts as root that some guy links to is not a good idea, so only run it if you've examined it yourself and understand the risks.

What to do:

EDIT: see README.md for install/uninstall directions.

What it does:

  1. Checks for a file in your /etc/dnsmasq.d folder that will configure dnsmasq to add a new host file for use. Creates it if it does not exist.
  2. greps out any "-.googlevideo.com" hostnames from your Pi-hole logs.
  3. Adds the IP from $forceIP and the hostname from #2 in the new hosts file.
  4. Sorts it and removes dupes.
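
For anyone who wants to see the four steps in one place before reading the real script, here is a minimal sketch. It is not the script from Gitlab: the function name, the argument order, and the drop-in file name are my own assumptions, and it uses plain grep instead of zgrep so it can be tried on an uncompressed copy of the log. The grep pattern and the awk field ($6) are the ones quoted later in this thread.

```shell
#!/bin/sh
# Sketch only: update_youtube_hosts and its parameters are illustrative
# names, not taken from the real script. Try it on copies, not live files.
update_youtube_hosts() {
    force_ip=$1      # IP to hand out for matched hostnames ($forceIP)
    log_file=$2      # uncompressed Pi-hole (dnsmasq) query log
    hosts_file=$3    # extra hosts file, e.g. /etc/hosts.youtube
    dnsmasq_conf=$4  # drop-in under /etc/dnsmasq.d (file name is an assumption)

    # 1. Create the dnsmasq drop-in if it does not exist yet.
    #    addn-hosts tells dnsmasq to read an additional hosts file.
    if [ ! -f "$dnsmasq_conf" ]; then
        echo "addn-hosts=$hosts_file" > "$dnsmasq_conf"
    fi
    touch "$hosts_file"

    # 2. + 3. Pull googlevideo hostnames out of IPv4 reply lines and
    #    append them as "<force_ip> <hostname>" entries.
    grep -e 'reply.*-.*\.googlevideo.*\..*\..*\..*' "$log_file" \
        | awk -v fIP="$force_ip" '{ print fIP, $6 }' >> "$hosts_file"

    # 4. Sort and drop duplicates in place.
    sort -u "$hosts_file" -o "$hosts_file"
}

# Example call (paths are placeholders):
# update_youtube_hosts "$forceIP" /var/log/pihole.log /etc/hosts.youtube /etc/dnsmasq.d/99-youtube.conf
```

Taking every path as an argument is only there so the sketch can be pointed at a scratch directory first.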

Should probably have it auto-update $forceIP and change the hosts file accordingly. May do that if people report back with good results.
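
If $forceIP ever does change, the existing entries would need their IP column rewritten. A rough sketch of just that part, assuming the hosts file has one "IP hostname" pair per line; rewrite_force_ip is a made-up name, and how the new IP would be chosen is left open here.

```shell
#!/bin/sh
# Sketch only: rewrite_force_ip is a hypothetical helper, not part of the script.
rewrite_force_ip() {
    new_ip=$1        # the new value of $forceIP
    hosts_file=$2    # e.g. /etc/hosts.youtube
    tmp=$(mktemp) || return 1

    # Replace the first field (the IP), keep the hostname, drop blank lines.
    # Writing back with cat keeps the original file's owner and permissions.
    awk -v ip="$new_ip" 'NF >= 2 { print ip, $2 }' "$hosts_file" \
        | sort -u > "$tmp" && cat "$tmp" > "$hosts_file"
    rm -f "$tmp"
}

# Example call (placeholder values):
# rewrite_force_ip 203.0.113.7 /etc/hosts.youtube
```

Pi-hole would still have to re-read the file afterwards before the new IP is served.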

My Pi-hole uses the local hosts files in lookups. I don't recall if that is stock practice or something I added after-the-fact.

Again: Running random scripts as root that some guy links to is not a good idea, so only run it if you've examined it yourself and understand the risks.

Good luck and please report back results.



Have you tested the behavior of portable devices (phones/tablets)?

What about appletv/roku/embedded players on tvs/consoles?

I've been looking into this specifically because of ads in the official YouTube apps on several AppleTV and iOS devices. iOS Safari doesn't get them with AdGuard.

What seems to be happening is the name lookup that appears immediately before an ad returns a new IP that seems primed to fire the ad as soon as the client hits it. I've gone over network dumps until my eyes were near bleeding and observed it first hand with YouTube streaming as I watched live traffic from an AppleTV. An in-video ad has always been preceded by one of these lookups.

When the new hostname is pulled from the logs, any future lookup gets the IP defined in the script. This means that an ad will sometimes slip through once per hostname, when it is first resolved. Each device also seems to start on a similar "ad curve": the frequency of ads starts near normal, then drops to very close to zero.

After nearly two days of testing on my office "test AppleTV", I put our daughter's AppleTV on the setup. She said she had "few ads" that went down to nothing.
Before I set her up there were 122 hosts in the hosts.youtube file. This morning there are 126 and she was the only one watching. I'm guessing her "few" ads came from the new hosts.

In the Reddit post I linked to in the original message people are reporting good results.


I’ll run it on my network too as Peppa is a big thing here also ...


Would it make sense to share the found hostnames?

Doubtful, they seem random(ish). I went over weeks of *.googlevideo.com names and couldn't come up with any meaningful naming convention. Not saying there isn't one, just that nothing stands out for me.

OK, implemented in my network with a shared list between 3 devices. Purged everything and now it's a matter of time ... I'll let you know in a couple of days how it looks on my side.

Be sure to schedule it to update. Mine runs every minute, it's just a blip of CPU.

It only took a few hours to see the difference.

I must say, I am pleased (and surprised) with the way this works.

Especially when (my) 3 Pi-hole instances gather all the data in one file (whoever catches a hostname that's not "recorded" yet adds it to the centralized list), and all of them use that as the reference hosts.youtube file.
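
How the three instances exchange their files isn't spelled out here, so the following is only a sketch of the merge step, assuming each instance's list is already reachable as a local file (for example over a shared mount). merge_hosts and the paths in the example call are made-up names.

```shell
#!/bin/sh
# Sketch only: merge_hosts is a hypothetical helper. It assumes every input
# file exists and holds "IP hostname" lines using the same forced IP.
merge_hosts() {
    out=$1; shift    # centralized list to (re)write; may also be an input
    tmp=$(mktemp) || return 1

    # Concatenate all inputs, sort, and keep one copy of each line.
    sort -u "$@" > "$tmp" && cat "$tmp" > "$out"
    rm -f "$tmp"
}

# Example call (placeholder paths):
# merge_hosts /mnt/shared/hosts.youtube /mnt/shared/pi1.hosts /mnt/shared/pi2.hosts /mnt/shared/pi3.hosts
```

Going through a temporary file means the centralized list can be one of the inputs without being truncated mid-read.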

Very simple and logical :slight_smile:

Bravo and Thank you for sharing !


Let me ask again: do we really have to execute pihole restartdns only once after the first script execution? I guess it's also necessary to regularly run pihole restartdns to re-load the changed file (/etc/hosts.youtube).

Something like that. After any change to an additional hosts file, FTL needs to be signaled to re-read it. The flat files are read and stored in memory at startup; there is no watch on the file to update the in-memory store.

That's really interesting, I thought updated hosts files were read on the fly, the way a standard resolver reads a hosts file. It's working so well for us that I never bothered to check. But, yeah, I added 1.2.3.4 foo.bar.com to hosts.youtube and it didn't resolve until I restarted the DNS.

The script doesn't execute restartdns, even at first launch. In the latest tweak it tells the user to do so at first launch.

Guess the script could "pihole restartdns" on its own, but I really wanted to leave that up to the user.

edit: I should ask: would there be any problems with doing a "pihole restartdns" with every update if I cron it for */15 minutes? Maybe I'll just do that with another cron job at midnight or something.

No one was as surprised as I was when I tried it. But the fact that there was a lookup and a new IP coming in immediately before an in-video ad kept gnawing at me. That's why I tried it out. I left it running for a solid day+ with a few of us binging on YouTube before I cleaned up what I wrote and shared it.

Glad it works for you!

It would fully restart everything every 15 minutes. Cache would be dumped and you could race things. Better option would be to just reload the lists.

  restartdns          Full restart Pi-hole subsystems
                        Add 'reload' to update the lists and flush the cache without restarting the DNS server
                        Add 'reload-lists' to only update the lists WITHOUT flushing the cache or restarting the DNS server

Perfect, thank you! I added that with kudos for ya. :wink:

Some suggestions:

You can reload every minute if you want, but I wouldn't suggest it. Check to see if the workFile has any changes and only then call the reload.

Edit: Moved -o to correct place.

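# Start from the current list, append new matches, then compare.
cp "$ytHosts" "$workFile"
# $piLogs is left unquoted in case it holds a glob or several paths.
zgrep -e "reply.*-.*\.googlevideo.*\..*\..*\..*" $piLogs \
    | awk -v fIP="$forceIP" '{ print fIP, $6 }' >> "$workFile"
sort -u "$workFile" -o "$workFile"
# Only replace the list and reload when something actually changed.
if ! cmp "$workFile" "$ytHosts"; then
    mv "$workFile" "$ytHosts"
    /usr/local/bin/pihole restartdns reload-lists
fi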

Nicer check, thanks!

The script only sets an IPv4 address; what if IPv6 is also active?

nslookup r3.sn-4g5ednsy.googlevideo.com

Server: 127.0.0.1
Address: 127.0.0.1#53

Non-authoritative answer:
Name: r3.sn-4g5ednsy.googlevideo.com
Address: 74.125.173.136
Name: r3.sn-4g5ednsy.googlevideo.com
Address: 2a00:1450:4001::8

Nothing happens beyond its intended use of adding customized IPv4 hosts. I'm currently blocking outgoing IPv6 on the VLANs that are home to our players/embedded/mobile devices, so it wasn't a concern when testing the idea out.


I really like the script and wanted to share my modifications, which report the number of added domains and the new total.

if ! cmp "$workFile" "$ytHosts"; then
    # Count entries before and after to report how many were added.
    LINES_OLD=$(wc -l < "$ytHosts")
    LINES_NEW=$(wc -l < "$workFile")
    DIFF=$((LINES_NEW - LINES_OLD))
    mv "$workFile" "$ytHosts"
    /usr/local/bin/pihole restartdns reload-lists
    echo "Added $DIFF domains, new total $LINES_NEW."
fi

I also modified the cron entry to send the output to syslog:

 */5 * * * * /home/nanopi/youtube 2>&1 | /usr/bin/logger -t pihole_youtube_blocker 

syslog

May  1 20:45:01 localhost CRON[30144]: (root) CMD (/home/nanopi/youtube 2>&1 | /usr/bin/logger -t pihole_youtube_blocker)
May  1 20:45:01 localhost pihole_youtube_blocker: /tmp/tmp.UcCGz2xV5T /etc/hosts.youtube differ: char 493, line 11
May  1 20:45:01 localhost pihole_youtube_blocker: Added 4 domains, new total 44.
May  1 20:50:01 localhost CRON[30437]: (root) CMD (/home/nanopi/youtube 2>&1 | /usr/bin/logger -t pihole_youtube_blocker)
May  1 20:50:01 localhost pihole_youtube_blocker: /tmp/tmp.FgKl1SibrR /etc/hosts.youtube differ: char 309, line 7
May  1 20:50:02 localhost pihole_youtube_blocker: Added 2 domains, new total 46.
May  1 20:55:01 localhost CRON[30644]: (root) CMD (/home/nanopi/youtube 2>&1 | /usr/bin/logger -t pihole_youtube_blocker)
