Wildcard and regex support for whitelisting

Hello developers!

I’d like the ability to use wildcard (and regex) on the whitelist.

The reason: our office is pretty much “all in” on Microsoft 365, SharePoint, etc. Basically, “all Microsoft, all the time”. Yet several of the blacklists I subscribe to block various Microsoft servers, causing random issues such as login failures. While I understand this may not be everyone’s cup of tea, it is what it is for us, and it would be great if we could wildcard-whitelist .microsoft.com once and for all.

Conversely, we’re mainly Mac users at home, and I see entries like ocsp.apple.com, as well as other Apple servers, on some blocklists. I have no clue why anyone would think it a good idea to block an OCSP server! But same thing: at home I’d like to whitelist everything under .apple.com.


In the short term, you could reduce your false positives with fewer block lists. One of the problems with public block lists is that you have no control over the content, and what the list maintainer wants to block may not be what you want to block (as in the case of ocsp.apple.com, which I have also had to whitelist).

You might try just a few lists (or no lists) and set up some regex filters that you control. Here are some examples that will knock down a lot of the adware, metrics, etc.


Hey jfb,

Just a quick shout-out thanks for this list of regex filters! I’ve never really been able to wrap my head around regex, so this list is great! Thanks much!

Someone should consider adding this list as “examples” on the regex section of the documentation! :wink:


I won’t take credit for these. They are on github, but for the life of me I can’t recall where. When I find the link, I’ll post it.

Edit - thanks @msatter, that was the link.

Regex whitelisting would be a really nice feature. Right now it is possible to set .* as a blacklist regex and deny everything, but it is really cumbersome to then add every subdomain that should be allowed instead of just adding, for example, *.microsoft.com.

I have already been using regex whitelisting for months. I scripted it myself, and I move domains that have been on the whitelist for a long time to the regex whitelist.

It removes whitelisted domains from the gravity list. It does not overrule blacklist, regex and wildcard entries; for that you use the normal whitelist.

I’m not sure I’m following what you’re saying, but the web GUI has ‘exact’, ‘wildcard’ (which is also a regex) and ‘regexp’ for blacklisting, but only ‘exact’ for whitelisting. What I would like, and what I’m guessing this feature request is about, is to have regex for whitelisting as well.
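To make the wildcard/regex relationship concrete, here is a minimal sketch of how a wildcard such as *.microsoft.com could be written as a regex. The pattern `(^|\.)microsoft\.com$` and the helper name `matches_wildcard` are assumptions for illustration; the exact form Pi-hole generates for its wildcard entries may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the domain is microsoft.com or any subdomain of it.
matches_wildcard() {
  local domain=$1
  # Assumed wildcard-style regex: start of string or a literal dot, then the domain.
  local regex='(^|\.)microsoft\.com$'
  [[ $domain =~ $regex ]]
}

for d in microsoft.com login.microsoft.com notmicrosoft.com microsoft.com.evil.example; do
  if matches_wildcard "$d"; then
    echo "$d: match"
  else
    echo "$d: no match"
  fi
done
```

Only the first two names match. The anchoring at both ends is what keeps look-alikes such as notmicrosoft.com from slipping through.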

Thanks for the fantastic software, to any developer who’s reading.

I scripted it myself; it is not part of the official development of Pi-hole.

Ah OK, that’s nice. Care to share? Or even better, submit it to the project?

We tried to submit it to the project, but there was not much interest. I am fine with that, and Pi-hole now consumes half the memory it did when I wrote the script. Motti also made a different implementation, as you can read earlier in the thread.

It would be nice if there would be a hook implemented to run it during the import.


# Use /tmp for the scratch file if it exists; otherwise fall back to /etc/pihole
tmpdirPiHole="/etc/pihole"
[[ -d "/tmp" ]] && tmpdirPiHole="/tmp"

# Build a cleaning file from regex.white: skip blank lines and comments, turn every
# remaining regex into an "awk '!/regex/' |" stage, and end the pipeline with "cat"
awk -v q="'" 'NF && !/^#/ { print "awk " q "!/" $0 "/" q " |" } END { print "cat" }' /etc/pihole/regex.white > "$tmpdirPiHole/gravity.regex.clean"

# Filter whitelisted domains by running gravity.regex.clean on gravity.list
cat /etc/pihole/gravity.list | source "$tmpdirPiHole/gravity.regex.clean" > /etc/pihole/gravity.list.tmp

mv -f /etc/pihole/gravity.list.tmp /etc/pihole/gravity.list
rm -f "$tmpdirPiHole/gravity.regex.clean"

# automatic reload
#pihole restartdns reload

So in /etc/pihole/regex.white you put the regex lines of domains that should not appear in gravity.list and you can reload gravity.list by running: pihole restartdns reload
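For anyone who wants to try the idea without touching /etc/pihole, here is a small self-contained sketch. It uses `grep -v -E -f` rather than the generated awk pipeline from the script, and all file contents below are made-up sample data, so treat it as an illustration of the filtering step only.

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so the real Pi-hole files are untouched.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Hypothetical sample standing in for regex.white (comments and blank lines allowed).
cat > "$workdir/regex.white" <<'EOF'
# allow everything under apple.com
(^|\.)apple\.com$

(^|\.)microsoft\.com$
EOF

# Hypothetical sample standing in for gravity.list (one domain per line).
printf '%s\n' ads.example.net ocsp.apple.com login.microsoft.com tracker.example.org \
  > "$workdir/gravity.list"

# filter_gravity REGEX_FILE LIST_FILE
# Strips comments and blank lines from REGEX_FILE, then prints every domain in
# LIST_FILE that matches none of the remaining extended regexes.
filter_gravity() {
  grep -vE '^(#|[[:space:]]*$)' "$1" > "$workdir/patterns" || true
  grep -vEf "$workdir/patterns" "$2" || true
}

filter_gravity "$workdir/regex.white" "$workdir/gravity.list"
```

With the sample data, only ads.example.net and tracker.example.org remain in the output.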

I would (still) welcome wildcard whitelisting incorporated into Pi-hole.

We are a Microsoft 365 Business shop (for better or worse), and literally just last week I had to whitelist yet another Microsoft subdomain after my users couldn’t log in to Skype for Business because some blocklist decided it needed blocking. Grrrr…


For what it’s worth, Microsoft publishes its domains and IP addresses at https://docs.microsoft.com/en-us/office365/enterprise/urls-and-ip-address-ranges, including machine-readable JSON linked from there.
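Since that list contains wildcard entries such as *.outlook.office.com, a small helper could turn each host name into a regex line. This is only a sketch: it assumes the host names have already been extracted from the JSON, one per call, and the function name `wildcard_to_regex` is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert one host name, optionally with a leading "*.",
# into an anchored extended regex.
wildcard_to_regex() {
  local host=$1 prefix='^'
  if [[ $host == '*.'* ]]; then
    host=${host#'*.'}       # drop the leading "*."
    prefix='(^|\.)'         # matches the bare domain as well as any subdomain
  fi
  host=${host//./\\.}       # escape every dot so it is matched literally
  printf '%s%s$\n' "$prefix" "$host"
}

wildcard_to_regex '*.outlook.office.com'
wildcard_to_regex 'login.microsoftonline.com'
```

Note that the `(^|\.)` prefix also matches the bare domain itself, which is slightly broader than a strict `*.` wildcard.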


Wow, this is tremendously helpful! Thank you so much!

Me too. It very much surprised me it wasn’t there, while pihole allowed me to put
in whitelist.txt, it didn’t error out or anything.

Is there an easy way to reload dnsmasq from this script from outside of a Docker container? Otherwise I would have to reload the list manually every time it updates, and that’s not going to work.
Also, it would be nice if Pi-hole had an ‘execute after’ option for such scripts: as you write, a hook during import, or something that runs after all updates, just before the reload.

I can’t help you with Docker. That hook would indeed be welcome; however, it won’t happen soon, if ever, because of the risk that inexperienced users could make a mess. Support would then be difficult: the results are changed, and if support does not know this, troubleshooting becomes time-consuming.

A possible solution was brought forward by me, but it is still a no-go.

Revisiting the original request: Whitelist regex support.

It is technically possible, however, I will tell you why I don’t think it is a good idea:

Regex filter evaluation is - always - a sequential (and hence slow) task. You have to try all of them until you know that none of them matched. This is exactly why we split the blacklist into an “exact” and a “regex” component. The “exact” component is loaded into cache and can be replied to with close to no delay at all. Walking the chain of regex filters is, however, much slower.
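The difference between the two components can be sketched in a few lines of bash. This is not how Pi-hole’s resolver is implemented; the entries and the function name `is_blocked` are invented, and the point is only that an exact entry is one keyed lookup while regex filters have to be tried one after another.

```shell
#!/usr/bin/env bash
# Exact entries: a single keyed lookup, regardless of list size (needs bash 4+).
declare -A exact_block=( [ads.example.net]=1 [tracker.example.org]=1 )

# Regex entries: tried in order until one matches or the list is exhausted.
regex_block=( '(^|\.)doubleclick\.net$' '^metrics\.' '^ad[0-9]+\.' )

# Prints which component decided ("exact", "regex" or "none").
is_blocked() {
  local domain=$1 re
  if [[ -n ${exact_block[$domain]:-} ]]; then
    echo "exact"
    return 0
  fi
  for re in "${regex_block[@]}"; do
    if [[ $domain =~ $re ]]; then
      echo "regex"
      return 0
    fi
  done
  echo "none"
  return 1
}

is_blocked ads.example.net
is_blocked ad12.example.com
is_blocked example.com || true
```

A domain that matches nothing is the worst case: every regex in the list has to be evaluated before the answer is known.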

The implementation could be made in two ways:

  1. Only use regex-based whitelist - very bad performance if you have many whitelisted domains
    This is to be avoided, as Pi-hole v5.0 is about to introduce support for massive whitelists, using an implementation strategy that will still give the result of a query with a typical delay of < 4 msec even if you have millions of domains on the whitelist.
  2. Add a regex-based whitelist next to the already existing whitelist - increase in complexity for the users.
    This is to be avoided as well as it would introduce a severe slowdown of the blocked domains preparation (AKA “gravity”). Instead of only excluding the whitelisted domains (which is very efficient), we’d need to evaluate all whitelist regex filters against any of the (possibly up to millions of) domains on the blocking lists. This would result in a catastrophic slowdown, maybe causing gravity to take hours instead of tens of seconds on Raspberry Pi devices. This is unacceptable.
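To illustrate the second point, the sketch below contrasts excluding exact whitelist entries with the worst-case work needed for regex entries. It uses plain text files and grep purely for illustration; the file names and sample data are assumptions, and this is not how gravity itself is implemented.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Made-up sample data.
printf '%s\n' ads.example.net ocsp.apple.com tracker.example.org > "$workdir/blocklist"
printf '%s\n' ocsp.apple.com > "$workdir/whitelist.exact"
printf '%s\n' '(^|\.)apple\.com$' '(^|\.)microsoft\.com$' > "$workdir/whitelist.regex"

# Exact exclusion: fixed strings (-F) compared against whole lines (-x), inverted (-v).
exclude_exact() {
  grep -vxFf "$1" "$2"
}

# Worst case for regex exclusion: every filter is tried against every domain.
worst_case_regex_checks() {
  local filters domains
  filters=$(wc -l < "$1")
  domains=$(wc -l < "$2")
  echo $(( filters * domains ))
}

exclude_exact "$workdir/whitelist.exact" "$workdir/blocklist"
worst_case_regex_checks "$workdir/whitelist.regex" "$workdir/blocklist"
```

With 2 filters and 3 domains the worst case is only 6 evaluations, but the product grows quickly: 100 filters against a million blocked domains would already be 100 million regex evaluations.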

I am using a different kind of whitelisting now for Pi-hole. I use a file named regex.allow which uses regex lines. This file is only used on import of the blocklist.

It removes any domain in the blocklist that matches a regex in the allow file.

Whitelisting keeps its original function, which is temporarily giving access to sites when needed.
If a domain has been on the whitelist for a long time, I move it to the allow list.

My strategy is not to have anything in the blocklist that is already on other lists. Outdated entries (NXDOMAIN) are also removed.
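The de-duplication part can be sketched with standard tools. The file names and the function `only_in_mine` are hypothetical, and the NXDOMAIN clean-up is left out because it needs live DNS lookups.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Made-up sample data: my own list and the combined content of the other lists.
printf '%s\n' tracker.example.org ads.example.net ads.example.net metrics.example.com \
  > "$workdir/mine.list"
printf '%s\n' ads.example.net other.example.org > "$workdir/others.list"

# Print domains from the first file that do not appear in the second.
# comm needs sorted input, so both files are sorted and de-duplicated first;
# LC_ALL=C keeps the sort order consistent between sort and comm.
only_in_mine() {
  LC_ALL=C comm -23 <(LC_ALL=C sort -u "$1") <(LC_ALL=C sort -u "$2")
}

only_in_mine "$workdir/mine.list" "$workdir/others.list"
```

Here ads.example.net is dropped because the other lists already carry it, leaving metrics.example.com and tracker.example.org.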

A complete import takes less than a minute if the blocklists download without long waits. It uses default Linux tools, which are extremely efficient.

I disagree that regexes are not efficient. That is the case if you compare one regex with one exact entry, but one regex could replace thousands of single entries. About 20 entries seems to be the break-even point.