The page https://api.hackertarget.com/hostsearch/?q=googlevideo.com did generate some of the
fingerprints of googlevideo.com, but it didn't include the rxxxsnxxxx.googlevideo.com one that my phone received, so all the ads just slipped through. Can you help me with this?
I had to change the column AWK selects to filter the "r5---sn-5hnednlr.googlevideo.com" look-alike lines from the Pi-hole log. Column 6 seems to feed my device IP addresses to the youtube-ads-list.txt instead.
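One way to sidestep the column question is to pull the hostname out by pattern instead of by position. This is only a minimal sketch, assuming dnsmasq-style log lines and the r<N>---sn-<id>.googlevideo.com naming seen in this thread. The function name extract_gv_hosts and the sample log lines are made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: read log lines on stdin, print each matching hostname once.
extract_gv_hosts() {
    grep -oE 'r[0-9]+---sn-[a-z0-9-]+\.googlevideo\.com' | sort -u
}

# Made-up sample lines in a dnsmasq-like layout (your log may look different).
printf '%s\n' \
    'Jan  1 00:00:01 dnsmasq[123]: query[A] r5---sn-5hnednlr.googlevideo.com from 192.168.1.20' \
    'Jan  1 00:00:02 dnsmasq[123]: reply r5---sn-5hnednlr.googlevideo.com is 203.0.113.7' \
    'Jan  1 00:00:03 dnsmasq[123]: query[A] redirector.googlevideo.com from 192.168.1.20' \
    | extract_gv_hosts
```

Because the match is on the hostname itself, it does not matter which column the name ends up in, and IP addresses never reach the list.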
Based on the info in this thread I've put together a simple script. I don't use the YouTube app that often, so I'm not 100% sure this works perfectly. It also appends the URLs with the r00---sn-xxxxxx.googlevideo.com pattern, which are not in the hackertarget list.
Place it somewhere convenient, add execute permissions (chmod +x filter-youtube-domains.sh) and add it to your cron jobs.
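For the cron part, here is a rough sketch of what a daily entry could look like. The install path and the 04:00 schedule are assumptions for illustration only; use whatever location and time suit you.

```sh
#!/bin/sh
# Hypothetical install path; change it to wherever you saved the script.
SCRIPT=/usr/local/bin/filter-youtube-domains.sh

# Crontab entry that would run the script once a day at 04:00.
CRON_LINE="0 4 * * * $SCRIPT"

# Print the entry so it can be pasted into `crontab -e`.
printf '%s\n' "$CRON_LINE"
```

Since the script writes under /etc/pihole, the entry probably belongs in root's crontab.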
Edit:
I still notice some ads (although fewer than I used to). I'm not sure yet whether the domains provided by hackertarget are incomplete or whether I need to run my cron job more often (now it runs every 24h). I did find a longer list on Wolfram Alpha (click on subdomains) but they don't seem to have an easy way to get those in plain text.
However, I had to make some modifications, some of them suggested in this thread, to make it work, at least partially.
This is what I did:
#!/bin/sh
# This script will fetch the Googlevideo ad domains and append them to the Pi-hole block list.
# Run this script daily with a cron job (don't forget to chmod +x)
# More info here: https://discourse.pi-hole.net/t/how-do-i-block-ads-on-youtube/253/136
# File to store the YT ad domains
FILE=/etc/pihole/youtube.hosts
# Wolfram Alpha AppID
APPID=Your-AppID
# Fetch the list of domains, remove the ip's and save them
curl 'https://api.hackertarget.com/hostsearch/?q=googlevideo.com' \
| awk -F, 'NR>1{print $1}' \
| grep -vE "redirector|manifest" > $FILE
# Replace r*.sn*.googlevideo.com URLs to r*---sn-*.googlevideo.com
# and add those to the list too
curl "http://api.wolframalpha.com/v2/query?input=googlevideo.com&appid=${APPID}&format=plaintext&podstate=WebSiteStatisticsPod:InternetData__Subdomains&podstate=WebSiteStatisticsPod:InternetData__Subdomains_More" \
| grep -Po "r\d+---sn-.+.googlevideo.com" >> $FILE
# Scan log file for previously accessed domains
grep "^r*.googlevideo\.com" /var/log/pihole*.log \
| awk '{print $8}' \
| grep -vE "redirector|manifest" \
| sort | uniq >> $FILE
# Add to Pi-hole adlists if it's not there already (-q: quiet, -F: match the path literally)
if ! grep -qF "file://$FILE" /etc/pihole/adlists.list; then echo "file://$FILE" >> /etc/pihole/adlists.list; fi
I had to wrap the Wolfram Alpha URL in double quotes (") because single quotes (') broke the URL and made it dysfunctional. It had something to do with the & characters in the middle of the URL.
As others suggested, I changed grep -Eo "r\d±–sn-.+.googlevideo.com" to grep -Po "r\d+---sn-.+.googlevideo.com" (the \d shorthand needs grep's Perl mode, -P).
I added the variable APPID so it's easier to add the Wolfram Alpha AppID.
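A likely explanation for the quoting problem, offered as a guess rather than a diagnosis: inside single quotes the shell expands nothing, so ${APPID} would be sent to the server literally, while double quotes expand the variable and still keep & from being treated as a shell operator. The shortened query string below is only for illustration.

```sh
#!/bin/sh
APPID=Your-AppID   # placeholder, same as in the script

# Single quotes: nothing is expanded, so ${APPID} stays literal.
single='appid=${APPID}&format=plaintext'
# Double quotes: ${APPID} is expanded, and & is still protected from the shell.
double="appid=${APPID}&format=plaintext"

printf 'single: %s\n' "$single"
printf 'double: %s\n' "$double"
```

If that is what happened, the & characters were not the culprit. The variable introduced at the same time was.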
Up to that point everything is OK, and $FILE seems to get more populated with each step. However, I can't make the last step work: if I run it on the CLI like below, test.txt is empty.
So I guess the last part is not collecting urls from the log. Besides, I don't see any of the ones that are not blocked in my log in the final file.
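One possible reason the log scan finds nothing is the leading ^ in the pattern ^r*.googlevideo\.com: it ties the match to the start of the line, but a log line normally starts with a timestamp, not with the hostname. A small sketch on a made-up dnsmasq-style line (your real log may look different) compares the anchored pattern with an unanchored one:

```sh
#!/bin/sh
# Made-up sample line in a dnsmasq-like layout; real logs may differ.
line='Jan  1 00:00:01 dnsmasq[123]: query[A] r5---sn-5hnednlr.googlevideo.com from 192.168.1.20'

# grep -c prints the number of matching lines (and exits non-zero on 0 matches).
anchored=$(printf '%s\n' "$line" | grep -c '^r*.googlevideo\.com' || true)
unanchored=$(printf '%s\n' "$line" | grep -c 'r*\.googlevideo\.com' || true)

printf 'anchored matches:   %s\n' "$anchored"
printf 'unanchored matches: %s\n' "$unanchored"
```

Note that r* means "zero or more r", so the unanchored pattern effectively matches any line containing .googlevideo.com.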
On top of all of that, I still see YouTube ads, and I've noticed that there are other kinds of .googlevideo.com subdomains with an ixhs suffix. I don't know if this means something.
I'll leave here the final script that worked for me after @Chipster's suggestion.
#!/bin/sh
# This script will fetch the Googlevideo ad domains and append them to the Pi-hole block list.
# Run this script daily with a cron job (don't forget to chmod +x)
# More info here: https://discourse.pi-hole.net/t/how-do-i-block-ads-on-youtube/253/136
# File to store the YT ad domains
FILE=/etc/pihole/youtube.hosts
# Wolfram Alpha AppID
APPID=Your-AppID
# Fetch the list of domains, remove the ip's and save them
curl 'https://api.hackertarget.com/hostsearch/?q=googlevideo.com' \
| awk -F, 'NR>1{print $1}' \
| grep -vE "redirector|manifest" > $FILE
# Extract the r*---sn-*.googlevideo.com hostnames from the Wolfram Alpha result
# and add those to the list too
curl "http://api.wolframalpha.com/v2/query?input=googlevideo.com&appid=${APPID}&format=plaintext&podstate=WebSiteStatisticsPod:InternetData__Subdomains&podstate=WebSiteStatisticsPod:InternetData__Subdomains_More" \
| grep -Po "r\d+---sn-.+\.googlevideo\.com" >> "$FILE"
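# Scan log file for previously accessed domains.
# Note: r* matches zero or more "r", so this keeps every line containing ".googlevideo.com".
# The awk column may need adjusting to wherever the hostname sits in your log format.
grep "r*\.googlevideo\.com" /var/log/pihole*.log \
| awk '{print $8}' \
| grep -vE "redirector|manifest" \
| sort -u >> "$FILE"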
# Add to Pi-hole adlists if it's not there already (-q: quiet, -F: match the path literally)
if ! grep -qF "file://$FILE" /etc/pihole/adlists.list; then echo "file://$FILE" >> /etc/pihole/adlists.list; fi
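The "add it only if it's not there already" step can also be written as a small reusable function. This is just a sketch: the name add_once is made up, and the demo writes to a temporary file instead of the real /etc/pihole/adlists.list.

```sh
#!/bin/sh
# Hypothetical helper: append ENTRY to LIST only if that exact line is missing.
add_once() {
    list=$1
    entry=$2
    # -q: no output, -x: whole-line match, -F: fixed string (dots are not wildcards)
    if ! grep -qxF -- "$entry" "$list" 2>/dev/null; then
        printf '%s\n' "$entry" >> "$list"
    fi
}

# Demo against a temporary file rather than the real adlists file.
tmp=$(mktemp)
add_once "$tmp" "file:///etc/pihole/youtube.hosts"
add_once "$tmp" "file:///etc/pihole/youtube.hosts"   # second call should change nothing
wc -l < "$tmp"
rm -f "$tmp"
```

Matching the whole line as a fixed string avoids false positives from similar paths and keeps grep from echoing the match into the cron output.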
I don't know, but I have the feeling that this is going to end up blocking the whole of YouTube.
All the googlevideo.com queries are r*.googlevideo.com. So sometimes I have to reload if I want to watch a video so it changes the domain (or something like that).