Filter Blocklist on Update

I completely understand why regex whitelisting isn’t going to happen.

However, how about regex based filtering of the blocklists as they are installed or updated?

This would slow down the update process, but would also mean that the per-query whitelist check (assuming that is how it is implemented) could potentially be eliminated entirely.

Whitelisted domains are subtracted from gravity during a gravity update (making whitelisted domains “gravity proof”). The check to see if each requested domain is blocked is done from the gravity list first, which is a very fast process, since gravity is held in memory and the search algorithm is very efficient.

Please explain how you envision your feature request working and what advantage would result.

Thank you for the clarification; I misunderstood how the whitelist works and, thus, my request is far simpler. Let me step back and rephrase without implementation assumptions.

When loading a blocklist, I would like a filter that effectively “whitelists” various domains in the blocklist by simply filtering them out of the blocklist.

My current workflow:

  • update gravity
  • go to /etc/pihole as root
  • rm whitelist.txt

Then:

grep -e "\.apple\.com$" list.0.dbl.oisd.nl.domains >> whitelist.txt
grep -e "\.icloud\.com$" list.0.dbl.oisd.nl.domains >> whitelist.txt
grep -e "\.me\.com$" list.0.dbl.oisd.nl.domains >> whitelist.txt

Then: “cat >> whitelist.txt” and paste a list of one-offs that I keep in a notes file.

When using more than one blocklist, I’d grep on list.* then sort | uniq the whitelist to eliminate dupes.
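
The steps above could be rolled into one script. This is only a minimal sketch: build_whitelist is a hypothetical helper, not something Pi-hole ships, and oneoffs.txt is an assumed name for the notes file of one-off domains. It assumes the downloaded lists sit in one directory as list.*.domains, as in the commands above.

```shell
#!/bin/sh
# Sketch only: build_whitelist is a hypothetical helper, not part of Pi-hole.
# Usage: build_whitelist DIR ROOT_DOMAIN...
# Rebuilds DIR/whitelist.txt from every subdomain of the given root domains
# found in DIR/list.*.domains, plus optional one-offs.
build_whitelist() {
    dir=$1
    shift
    tmp=$(mktemp) || return 1
    for root in "$@"; do
        # Escape the dots so "apple.com" cannot match e.g. "appleXcom".
        escaped=$(printf '%s\n' "$root" | sed 's/\./\\./g')
        # -h suppresses file name prefixes when several lists are searched.
        grep -h -e "\.${escaped}\$" "$dir"/list.*.domains >> "$tmp"
    done
    # Assumed file name: one-off domains kept by hand, one per line.
    if [ -f "$dir/oneoffs.txt" ]; then
        cat "$dir/oneoffs.txt" >> "$tmp"
    fi
    # sort -u replaces the separate "sort | uniq" step.
    sort -u "$tmp" > "$dir/whitelist.txt"
    rm -f "$tmp"
}
```

After a gravity update this would be called as, for example, build_whitelist /etc/pihole apple.com icloud.com me.com. Like the grep commands above, it only picks up subdomains (names ending in ".apple.com"), not the bare root domain itself.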

I see what you are doing, but why are you doing it, and why should Pi-Hole include this functionality? You are subscribing to blocklists that block things you don’t want blocked and you want Pi-Hole to help you clean up those lists locally?

Why should Pi-Hole have this code built in when you can easily do it yourself with your demonstrated code?

The blocklist has 1.2M domains I do want blocked and is well curated. That I specifically want to unblock Apple’s domains isn’t really something I’d expect a blocklist curator to agree with, and I suspect there are others out there who, say, want to blanket-whitelist all of some other company’s domains.

As far as whether or not it is warranted as a feature in pi-hole? That depends on how many people find it valuable, I suppose. If I’m the only one, then-- sure-- doesn’t make sense. If lots of people want it and it isn’t far removed from core functionality (doesn’t seem to be to me), seems like a reasonable feature to add.

Even just a “run this script when the gravity is updated” feature would be useful. I.e. if I click “update gravity” in the WebUI, I’d like my whitelist to be updated automatically based on the regular expressions, too.

Thanks for the explanation. The feature request will be open for community votes, then evaluated and prioritized.

Here’s the rub on this - when a user has problems with a gravity update and requests help on this or other forums, even with a debug log we don’t see the other script, which can easily cause problems. So, this makes troubleshooting and technical support more difficult.

Thank you!

The maintenance issue is definitely something my solution did not consider.

As an alternative, much lower-tech solution, being able to provide a list of root domains that are filtered out of the blacklists would work, too. In my case, I could provide a list of “apple.com”, “iCloud.com” and “me.com” and be done with it; no regular expressions to screw up, etc.
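
For illustration, the root-domain idea could be sketched as a plain filter over a blocklist. This is not how Pi-hole works internally; filter_roots and the roots file are hypothetical names. It assumes one domain per line in both the roots file and the blocklist, and it compares names case-insensitively so that “iCloud.com” matches “icloud.com”.

```shell
#!/bin/sh
# Sketch only: filter_roots is a hypothetical helper, not part of Pi-hole.
# Usage: filter_roots ROOTS_FILE < blocklist > filtered_blocklist
# Drops every blocklist entry that equals, or is a subdomain of, a root
# domain listed in ROOTS_FILE. No regular expressions are involved.
filter_roots() {
    awk -v rf="$1" '
        BEGIN {
            # Load the root domains into a lookup table.
            while ((getline line < rf) > 0)
                if (line != "") roots[tolower(line)] = 1
        }
        {
            d = tolower($0)
            keep = 1
            # Check the full name, then each parent domain in turn.
            while (1) {
                if (d in roots) { keep = 0; break }
                i = index(d, ".")
                if (i == 0) break
                d = substr(d, i + 1)
            }
            if (keep) print
        }
    '
}
```

Because the match is on whole labels, “notapple.com” stays in the blocklist while “apple.com” and “a.apple.com” are removed.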

I’m not sure I get this. We’re already working on supporting regex whitelist entries, they should be doing exactly what you’re asking for.

I’m behind the times. I need to read Github more often.

Oh! Never mind, then!

That’ll solve exactly what I want.

I had taken a prior comment to mean that regexes in the whitelist were a non-starter, so I tried to come up with an alternative, not realizing that the whitelist effectively filters the various blacklists to generate gravity.

Are you OK with closing this feature request in favor of the open GitHub issue for regex whitelisting?