Use @@ to whitelist in ABP style adlists

You use the term "support" quite loosely. We support a single and specific element that appears in ABP lists, and nothing more.

"To be precise, it adds support for the following domain matching syntax defined here at the time of opening this PR:"

With "here" being: Adblock Plus filters explained

Initially Pi-hole didn't support ABP-style lists at all. Currently ABP style is only partially supported.

Maybe this can change in the future, if this Feature Request receives enough votes and somebody decides to code the necessary changes.

We are not against your request, but Feature Requests take time to be done even when they receive a lot of votes.

My previous suggestion to create a list using Pi-hole hosts format was just an answer to your comment:

I would rather see my users support pihole

... what lists are you maintaining and are any of your lists in our native hosts format?

Maintaining specials/Blocklisten at master · RPiList/specials · GitHub for my userbase at https://www.youtube.com/watch?v=bDKxCr7bOMs

I converted almost all lists to the new ABP-style format.
My lists had almost 70 million domains... the new ABP-style brought it down to 38 million.

You use the term "support" quite loosely.

:slight_smile: Yes...but since this is the only ABP-style rule that makes a lot of sense it was a good addition. I know it took pihole 4 years to add it.

So since pihole basically "understands"... this one rule could be easily extended with another rule... the @@ before the || to achieve it.

Here is a script which will let you add your own whitelists of domains in hosts format.

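#!/bin/bash
# Usage: ./whitelist FILE   (one domain per line, plain or hosts format)

while IFS= read -r domain; do
  # Keep only the last whitespace-separated field, dropping any leading IP address
  domain=$(printf '%s\n' "$domain" | awk '{print $NF}')
  # Skip blank lines
  [ -n "$domain" ] || continue
  pihole -w -nr --comment "Added from $1" "$domain"
done < "$1"
# Reload once, after all entries have been added
pihole restartdns reload-lists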

Call it something like whitelist, put it in your pihole home directory and make it executable (chmod +x whitelist). Run the command with a whitelist of domains in hosts or list format. For example (showing the three common formats):

$ cat mywhitelist.txt
domain1.test
www.domain2.test
0.0.0.0 domain3.test
0.0.0.0 cdn01.domain4.test
127.0.0.1 domain5.test
127.0.0.1 safe.media.domain6.test
$ ./whitelist mywhitelist.txt 
  [i] Adding domain1.test to the whitelist...
  [i] Adding www.domain2.test to the whitelist...
  [i] Adding domain3.test to the whitelist...
  [i] Adding cdn01.domain4.test to the whitelist...
  [i] Adding domain5.test to the whitelist...
  [i] Adding safe.media.domain6.test to the whitelist...
  [✓] Reloading DNS lists

Or below one-liner:

I don't know the technical implications of this and what kind of performance issues this would generate (either in ingesting lists or in lookups for domain queries in operation) but what you are asking makes sense to me.

What the ABP format does is effectively a wildcard block and that does need to have a mechanism for fine tuning. I don't know how many parent domains being blocked would require child domains to be allowed, and any examples of that would be helpful.

I think it comes down to the user experience. I'm not happy with telling users that they need to manually tune lists because it's "The Right Way". I am aware of the 'danger' of a rogue whitelist and I've used that example as a reason to not implement non-local whitelisting in the past but I'm not sure if that belief is helpful or harmful for users.

We had worked on subscribed whitelists (at that time still without ABP format support) some years ago but it was dropped after some longer discussions because of several concerns, partially mentioned above.

Let's have another look at this now, this time with support for @@||a.b.com^ in addition to exact domains. To support this, I think we should add a way to mark an adlist as either "blocking" or "allowing".

  • Blocking adlists can contain either exact domains or ABP-style entries (||a.b.com^).
  • Allowing adlists can contain either exact domains or ABP-style entries (@@||a.b.com^).
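To make the three entry types concrete, here is a minimal bash sketch that classifies a single list line. The function name classify_entry and the simplified domain pattern are my own assumptions for illustration, this is not how gravity parses lists. Anything carrying options or paths falls through to "ignored".

```bash
#!/bin/bash
# Hypothetical helper: classify one adlist line as an exact domain,
# an ABP-style block (||domain^) or an ABP-style exception (@@||domain^).
classify_entry() {
  local entry=$1 kind domain
  case $entry in
    '@@||'*'^') kind=allow-wildcard; domain=${entry#'@@||'}; domain=${domain%'^'} ;;
    '||'*'^')   kind=block-wildcard; domain=${entry#'||'};   domain=${domain%'^'} ;;
    *)          kind=exact;          domain=$entry ;;
  esac
  # Simplified validity check (assumption): at least two dot-separated labels
  if [[ $domain =~ ^[A-Za-z0-9_-]+(\.[A-Za-z0-9_-]+)+$ ]]; then
    echo "$kind $domain"
  else
    echo "ignored"
  fi
}

for line in 'a.b.com' '||a.b.com^' '@@||a.b.com^' '||a.b.com^$third-party'; do
  printf '%-24s -> %s\n' "$line" "$(classify_entry "$line")"
done
```

What happens when an allow-wildcard entry turns up in a list marked as blocking is exactly the open point in Question 1.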

Maybe we should call them "subscribed allowlists" rather than "allow adlists". I'm open to any suggestions.

Maybe this means we want to rename "adlists" to "subscribed blocklists", too?


Question 1:
If a blocking adlist has an @@||a.b.c^ entry, should it automatically be added to the allowing list (even if the adlist is meant to be blocking)? I think yes, as this eases management of blocklists.

Question 2:
Should subscribed allowlists be allowed to override only gravity domains? What I mean here is: Should your subscribed allowlist be able to override your locally defined blocked domains? I don't think they should. They should only be able to undo overblocking caused by subscribed adlists.

I think it should. For sure there will be all kinds of mixed-style lists out in the wild, and we need to handle them gracefully.


I agree.

Question 1:
If a blocking adlist has an @@||a.b.c^ entry, should it automatically be added to the allowing list (even if the adlist is meant to be blocking)? I think yes, as this eases management of blocklists.

I imagine that this would mean a lot of code tweaking to make it work.

It would be easier to just add an import function for whitelist domains (even without ABP-style support). This should be doable without tweaking the inner workings of pihole... just the webgui. It would be the fastest baby step.

Since a whitelist entry outweighs any blocklist domain and abp-style domain, my feature request would have been met.

Transferring the @@||domain.com^ entry in a blocklist to the internal pihole whitelist would be an extra perk. But you would have to track all @@-whitelistings. When a user searches for a domain, the search would have to show which blocklist the whitelisting came from. Sounds like a lot of work.

If Pihole has importable dedicated whitelists, there is no need for the @@-ABP-style any more. This might address security concerns of hidden unblock domains carried in blocklists. I don't have that concern but it could be addressed this way.

Question 2:
Should your subscribed allowlist be able to override your locally defined blocked domains? I don't think they should. They should only be able to undo overblocking caused by subscribed adlists.

Sounds complicated to code, since this would lead to a two-class system of whitelist entries. But I see what you are trying to do. Since a user can now neutralize any blocked domain with a whitelist entry, he might want to be able to neutralize a whitelisted domain as well.

Your pyramid of rights would be:

  1. local whitelists (trumps all)
  2. local blocked domains
  3. imported whitelists (only trumps imported blocklists)
  4. imported blocklists.
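As a rough illustration of that ordering, a lookup could simply check the four levels from top to bottom and stop at the first hit. The sketch below is bash with made-up example domains and exact matching only. The variable and function names are assumptions of mine, not anything that exists in Pi-hole.

```bash
#!/bin/bash
# Hypothetical example data, one space-separated list per level.
local_allow="ok.example"
local_block="ads.example ok.example"
imported_allow="ads.example cdn.example"
imported_block="cdn.example tracker.example"

# in_list DOMAIN "space separated list": succeeds if DOMAIN is in the list
in_list() {
  case " $2 " in *" $1 "*) return 0 ;; *) return 1 ;; esac
}

# verdict DOMAIN: walk the four levels in priority order, first hit wins
verdict() {
  if   in_list "$1" "$local_allow";    then echo "allowed (local whitelist)"
  elif in_list "$1" "$local_block";    then echo "blocked (local blocked domain)"
  elif in_list "$1" "$imported_allow"; then echo "allowed (imported whitelist)"
  elif in_list "$1" "$imported_block"; then echo "blocked (imported blocklist)"
  else echo "allowed (not listed)"
  fi
}

for d in ok.example ads.example cdn.example tracker.example other.example; do
  printf '%-16s %s\n' "$d" "$(verdict "$d")"
done
```

Note how ads.example stays blocked: the imported whitelist entry sits below the local block, so it can only undo blocks that come from imported blocklists (cdn.example).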

Greetings

There is a sort of tension between Pi-hole and ABP which I think is worth pondering.

In terms of their advertised functions they are aligned. They are presented as ad-blockers. People deploy them because they want to see fewer adverts. They both work great and complement each other.

In terms of their operation they are not aligned. Pi-hole is working at the network level providing DNS services to all clients and their assorted web and non-web applications. ABP is working at the application level in web browsers only, with full access to the URLs and HTML elements.

Pi-hole blacklists of domains in hosts format make sense for a DNS server whose job is processing domains. ABP filter lists in ABP format make sense for a web application whose job is processing URLs and HTML elements.

This means there are people who share interest in the common function of blocking adverts and who generously take time to curate domain host lists for Pi-hole and ABP filter lists for ABP. Those lists are aligned to the respective product. For example Pi-hole lists will include entries for things like smart TVs and IoT devices, which makes no sense in an ABP list. Conversely ABP lists may include URL elements and attribute and CSS selectors which make no sense in a Pi-hole list.

For sure there is some value in using ABP domain filters to inform Pi-hole, but any nuance around the reason why certain filters are present is lost. An example could be if a domain and all subdomains are blocked (Pi-hole understands that) but there are certain key ABP Allowlist Resources which are whitelisted (Pi-hole does not see web traffic).

Without that nuance it may not be valid to interpret the ABP filter as a domain block, and so the need to whitelist certain domains, using more ABP filters, becomes a bit of a hack to fix what ended up being broken by doing so. The opening post of this thread describes exactly that scenario ("But that's my dilemma...").

My fear is that Pi-hole gets sucked into the rabbit hole of interpreting ABP filter lists which are intended for managing URLs and HTML elements, and trying to map the list curator's blacklist and whitelist filter inclusion decisions onto domain names.

This post isn't intended to disparage anything in here, it's just my own observations of how Pi-hole interacts with web-based blockers over the years. It's definitely worth exploring and testing out.

No, actually not too much. Implementing this is not too much effort. What is more important is (a) ease of use and (b) future extensibility and overall maintainability.


Thanks, we indeed need to be clear what we should do and how. So far we support only the wildcard blocking syntax ||block.me.and.children.com^. Exactly this and nothing else. If there are any extra options - like block only specific paths on this domain - we ignore this rule.

I do see how this can lead to using lists maybe only partially. However, there should be no misinterpretations and we are not over-blocking anything. It still seems better than doing nothing. This is also the only way to configure wildcards via gravity.

To be more specific, we are looking at adding "subscribed allowlists" here which can contain either

  • exact domain
  • an ABP exception rule like @@||example.com^$document, with document as the only option allowed

Even though this is again only partial support, it should avoid misunderstandings and make the most of what a network-wide DNS-based blocker can do.


Question 3:

Assume

||c.d.e^
@@||b.c.d.e^
||a.b.c.d.e^
  • c.d.e blocked,
  • x.c.d.e blocked,
  • x.x.x.x.c.d.e blocked,
  • b.c.d.e okay,
  • x.x.x.b.c.d.e okay, but
  • a.b.c.d.e blocked or okay?
  • x.x.x.a.b.c.d.e blocked or okay?

You see what I'm after here: Can an ABP exception rule be overridden by a more specific block rule? I did not see a clear statement about this in the ABP filters description. It is mostly a performance question: Can we return immediately once we have found @@||b.c.d.e^, or do we have to go on and chase further down the rabbit hole until we (maybe) find an even more specific rule (be it blocking or an exception)?
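To show where the cost comes from, here is a small bash sketch of one possible answer, "the most specific rule wins". It walks the suffixes of the queried domain from most to least specific and stops at the first rule it finds. This is only an illustration of that one interpretation with the three rules from the example. The helper names are made up and it says nothing about how FTL actually looks things up.

```bash
#!/bin/bash
block_rules="c.d.e a.b.c.d.e"   # ||c.d.e^ and ||a.b.c.d.e^
allow_rules="b.c.d.e"           # @@||b.c.d.e^

in_list() { case " $2 " in *" $1 "*) return 0 ;; *) return 1 ;; esac; }

# suffixes DOMAIN: print DOMAIN and each parent, most specific first
suffixes() {
  local d=$1
  while :; do
    echo "$d"
    [[ $d == *.* ]] || break
    d=${d#*.}
  done
}

# most_specific_verdict DOMAIN: the first rule met while walking up decides
most_specific_verdict() {
  local s
  while IFS= read -r s; do
    if in_list "$s" "$allow_rules"; then echo "okay (@@||$s^)"; return; fi
    if in_list "$s" "$block_rules"; then echo "blocked (||$s^)"; return; fi
  done < <(suffixes "$1")
  echo "okay (no rule)"
}

for d in c.d.e x.c.d.e b.c.d.e x.x.x.b.c.d.e a.b.c.d.e x.x.x.a.b.c.d.e; do
  printf '%-18s %s\n' "$d" "$(most_specific_verdict "$d")"
done
```

Under this reading a.b.c.d.e ends up blocked by the more specific rule. If exceptions should always win instead, the allow rules would have to be checked against every suffix before any block rule is considered, which is the extra work the question is about.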

Indeed, the filter workflow appears to not be documented to that level. Perhaps code on github will reveal how it behaves?

In terms of the use case presented in the opening post

the blocks would be treated as wildcards, as they are now

||c.d.e^ equivalent to regex blacklist ^.*c\.d\.e$

and exceptions would be treated as exact (representing the "it" that is being whitelisted mentioned in the use case). That also eliminates the uncertainty in the last couple of examples you gave.

@@||b.c.d.e^ equivalent to exact whitelist b.c.d.e (or regex whitelist ^b\.c\.d\.e$)
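A quick way to try these equivalences is grep -E, which I'm assuming here is close enough to Pi-hole's regex flavour for such simple patterns. One caveat worth noting: the wildcard regex as written above also matches names that merely end in the same characters, such as notc.d.e. A variant anchored on a label boundary, (^|\.)c\.d\.e$, avoids that. The helper name matches is made up for this sketch.

```bash
#!/bin/bash
# matches REGEX DOMAIN: succeeds if DOMAIN matches the extended regex
matches() { printf '%s\n' "$2" | grep -Eq "$1"; }

loose='^.*c\.d\.e$'        # regex as written in the post above
anchored='(^|\.)c\.d\.e$'  # same idea, but only at a label boundary
exact='^b\.c\.d\.e$'       # exact whitelist for b.c.d.e

for d in c.d.e x.c.d.e notc.d.e b.c.d.e x.b.c.d.e; do
  for name in loose anchored exact; do
    if matches "${!name}" "$d"; then r=match; else r=-; fi
    printf '%-10s %-9s %s\n' "$d" "$name" "$r"
  done
done
```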

But I agree that's a single use case interpretation and perhaps the filter list maintainer intended for the latter to represent whitelisting all subdomains of b.c.d.e in order to be actually useful in his curated list, and it's not clear if that's how ABP interprets that syntax.

Treating as above would result in:

||c.d.e^       rule 1
@@||b.c.d.e^   rule 2
||a.b.c.d.e^   rule 3
  • c.d.e blocked by rule 1
  • x.c.d.e blocked by rule 1
  • x.x.x.x.c.d.e blocked by rule 1
  • b.c.d.e permitted by rule 2 which is a whitelist and therefore takes priority
  • x.x.x.b.c.d.e blocked by rule 1 since rule 2 is exact
  • a.b.c.d.e blocked by rule 1 or rule 3 (whichever is found in gravity first) since rule 2 is exact
  • x.x.x.a.b.c.d.e blocked by rule 1 or rule 3 (whichever is found in gravity first) since rule 2 is exact

Yeah, it's wildcard following the link @jfb posted above:

(I'm referring to the "structured the same as for blocking rules" part)

Maybe then whitelist the entire thing and give it priority as with current whitelists.

||c.d.e^       rule 1
@@||b.c.d.e^   rule 2
||a.b.c.d.e^   rule 3
  • c.d.e blocked by rule 1
  • x.c.d.e blocked by rule 1
  • x.x.x.x.c.d.e blocked by rule 1
  • b.c.d.e permitted by rule 2
  • x.x.x.b.c.d.e permitted by rule 2 since whitelists have priority and this aligns with the intent of that ABP "domain plus all subdomains" expression
  • a.b.c.d.e permitted by rule 2 which takes priority over rule 1 or rule 3
  • x.x.x.a.b.c.d.e permitted by rule 2 which takes priority over rule 1 or rule 3
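The table above can be reproduced with a few lines of bash. This is only a sketch of the "wildcard exception with priority" interpretation, using the three rules from the example. The function names are made up for illustration.

```bash
#!/bin/bash
block_rules="c.d.e a.b.c.d.e"   # rule 1 and rule 3
allow_rules="b.c.d.e"           # rule 2

# matches_any DOMAIN "rules": DOMAIN equals a rule or is a subdomain of one
matches_any() {
  local rule
  for rule in $2; do
    [[ $1 == "$rule" || $1 == *."$rule" ]] && return 0
  done
  return 1
}

# verdict DOMAIN: exceptions are checked first and always win
verdict() {
  if   matches_any "$1" "$allow_rules"; then echo "permitted"
  elif matches_any "$1" "$block_rules"; then echo "blocked"
  else echo "permitted"
  fi
}

for d in c.d.e x.c.d.e x.x.x.x.c.d.e b.c.d.e x.x.x.b.c.d.e a.b.c.d.e x.x.x.a.b.c.d.e; do
  printf '%-18s %s\n' "$d" "$(verdict "$d")"
done
```

Because the exception is checked first, the lookup can return as soon as rule 2 matches, without looking for a more specific block rule.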

Any unexpected behaviour can be sorted out using Tools > Search Adlists to work out what is blocking or allowing what.

Personally I'm not a fan of importing ABP wildcard blocks or importing any whitelists at all, even less so for whitelists with wildcard overrides, at the DNS level. It feels like giving too much trust to unknown third-parties at that layer of my network, since lists intended for Web usage on individual computers are now controlling all usage lower down on every computer and device.

How does that really differ from trusting unknown third-parties to create the lists of domains to block? You're doing that at the DNS level for all computers and devices?

What is the reasoning behind the current paradigm of allowing bulk lists of domains to block, but requiring manual intervention for tuning those lists? Is it our (Pi-hole's) need to tell users how they deploy Pi-hole and under what specific use-cases we deem it acceptable? Or should Pi-hole be a tool just like every other Linux utility that is truly just a tool, and it's up to the users to deploy and use it as they see fit?

Pi-hole should be a generally set-and-forget application. The more manual intervention required means the less functional utility provided.

That's not the case though. We are not designing or advocating that users download ABP lists. We are advocating and designing a way for list maintainers to create lists using the ABP format for specific entry types. Any format that is non-hosts will have its own DSL or specific syntax and will fall in the same category.

Any entry that does not conform to the specific and detailed examples of what is accepted is ignored. Any entry that has modifiers or URL only items is ignored. This is really no different than parsing a list that is hosts formatted but has items that are not hosts, we ignore those items as well.

I think the important thing here is that if we were to go with something like this... anyone is free to use or not use the functionality as they so please.

We all have different threat models, none of them wrong (well, unless you totally YOLO it and put all of your devices in a DMZ with everything wide open to the world...)

Where I have concern is that we have never allowed third parties (list maintainers, public list posters, etc.) to introduce specific whitelist entries into Pi-hole.

List maintainers are free to not include specific domains into their blacklist, but this doesn't explicitly whitelist those domains in Pi-hole. It just stops their specific list from blocking them. If a user manually blocks the domain or the domain appears on a different list, it is blocked.

When we allow third parties to introduce whitelist entries (how we handle these is TBD), this may eliminate the user from being the final say in what is whitelisted on their individual Pi-hole.

As it currently stands in Pi-hole, if a domain is whitelisted, a local (the only option as of now) whitelist entry will override all subsequent attempts to block that domain (whitelist trumps all). If a user says "domain xyz is always allowed on my network" and deliberately whitelists the domain, that is one thing. If a public list maintainer (with either good or bad intentions) makes this decision, that's another thing.

A public adlist (blocklist) as of now can only prevent things from working - cannot do anything bad. To make websites or apps work may require Pi-hole user input (a conscious decision to allow the domain). With a public whitelist (or whitelisted domains or ranges of domains embedded in a blocklist) imported into Pi-hole, it is no longer within the control of the individual user. A third party is inserting whitelist entries into an individual Pi-hole, with the potential to do bad.

If whitelist entries are embedded into an adlist along with the blacklist entries, the user will likely not be aware of the details. Nobody is going to look through 50,000 lines of text before they subscribe to an adlist.
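For what it's worth, a user who does want to check does not have to read the whole file. A plain grep over a downloaded copy shows how many exception entries a list carries and where they are. The sample list below is made up for illustration, and the helper names are my own.

```bash
#!/bin/bash
# Hypothetical sample list, standing in for a downloaded adlist
list=$(mktemp)
cat > "$list" <<'EOF'
! Title: hypothetical sample list
||ads.example^
||tracker.example^
@@||cdn.tracker.example^
@@||login.ads.example^
EOF

# count_exceptions FILE: number of lines starting with the ABP exception marker
count_exceptions() { grep -c '^@@' "$1"; }
# show_exceptions FILE: the exception lines with their line numbers
show_exceptions() { grep -n '^@@' "$1"; }

echo "exception entries: $(count_exceptions "$list")"
show_exceptions "$list"
rm -f "$list"
```

That still relies on the user running such a check at all, and repeating it on every list update, which is the real concern here.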

This is where we should tread carefully.

Would this include a user-configurable option to never download ABP-style whitelist entries?

In terms of their advertised functions they are aligned. They are presented as ad-blockers

I beg to differ, because Pihole has morphed into so much more, where ABP can't even begin to compete. In fact the guys at Eyeo-Software messed up so badly that (almost) nobody is using the original ABP anymore. With their "accepted ads" policy they made a perfect example of @jfb's fear that a bad player slips in a few @@ whitelistings and torpedoes other lists.

The only thing that survived the original ABP is the formatting style we refer to as ABP-style. The reason for that formatting style is, as you already said, that a browser extension (BE) in the form of an ad blocker can see the complete URL. So a BE can block access to a single file on a webserver. That makes it the perfect tool if you just want to block a banner file.

But Pihole (loaded with the external blocklist) can block access to malware files (on malware-domains), block phishing-attacks, block domain-squatting. In conjunction with the user-group feature, it can protect minors from visiting porn sites or other radical stuff parents don't want them to visit. It can protect a network from users using p2p-streaming websites and get sued...

You could say the power of pihole derives from the blocklists provided.

Adopting the ABP-style (even if only rule No. 2) is a valuable weapon in extending those protections. This is because attackers frequently change their subdomains to avoid blocking. Blocking every future subdomain will make it very expensive for these guys. Their domains get burned on first detection.

That's what we all have now. There is not a single adblocker extension that can even remotely do the same. And I haven't even started on the same possibilities for every IoT device, smartphone, tablet, Kindle and smart TV. Try installing an adblocker on a China-chatting smart plug. Even if you could... what for? The user's problem is the built-in tracking features, which no adblocker can catch.

So adopting a feature that ABP introduced has basically nothing to do with ABP. It enables list maintainers to fine-tune their lists. And users might be able to use other lists (from other products) that use the same syntax.

But on the negative side: users who operate a simple network proxy can no longer use our lists, because the proxy software (at the moment) can't understand the new ABP-style format. For them it comes to an end unless the list provider offers two variants.
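For illustration, a second variant could in principle be derived from the ABP-style list with a small filter like the one below. This is only a sketch with a made-up function name. It keeps plain ||domain^ block entries and drops everything else, and it necessarily loses the "all subdomains" meaning, since hosts format can only express exact names. That loss is part of why the ABP-style lists are so much smaller.

```bash
#!/bin/bash
# abp_to_hosts: read ABP-style lines on stdin, print hosts-format lines
# for plain ||domain^ entries only (exceptions and rules with options are dropped)
abp_to_hosts() {
  sed -n 's/^||\([^^$/]*\)\^$/0.0.0.0 \1/p'
}

printf '%s\n' \
  '! hypothetical sample list' \
  '||ads.example^' \
  '@@||ok.ads.example^' \
  '||x.example^$third-party' \
  '||tracker.example^' | abp_to_hosts
```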