Use @@ to whitelist in ABP style adlists

B_Trug · June 27, 2023, 2:00am

Hi,

Since Pihole understands ABP-style domains like ||malware.com^, I would like to request an extension of that rule: make @@||malware.com^ an importable whitelist entry.

This would solve many of the problems Pihole currently has because it lacks a whitelist import feature. For example, you can easily import blocklists from lists on the internet, but you can't import whitelists the same way.

Why do we need this? With the new ABP-style domains we can block lots (or all) of the subdomains of a bad domain. There are bad domains that have hundreds of subdomains, now easily blocked. But this brings the problem of overblocking, since there could be a few "good" subdomains in that domain. So this "good" player approaches the list maintainer asking for his domain to be unblocked. In my view he has a right to be unblocked. But that's my dilemma. I can't just unblock his domain, since his domain is ABP-style blocked. Deleting the whole ABP rule would open the gate for all the bad players again.

So with this solution in mind, I could place
||server-with-content-and-lots-of-ads.com^
and
@@||good-stuff.server-with-content-and-lots-of-ads.com^
to whitelist it.
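
A minimal sketch of how these two rules could interact at the DNS level. It assumes that an exception rule always wins over a block rule and that both forms cover the named domain plus all of its subdomains. The function names covers and verdict are made up for illustration; this is not Pi-hole or AdGuard Home code.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# covers <rule-domain> <queried-domain>
# True if the query is the rule's domain itself or any subdomain of it.
covers() {
    case "$2" in
        "$1" | *."$1") return 0 ;;
        *) return 1 ;;
    esac
}

# verdict <queried-domain> <rule>...
# Assumption: a matching exception (@@||...^) beats any matching block (||...^).
verdict() {
    query=$1
    shift
    result=allowed
    for rule in "$@"; do
        case "$rule" in
            '@@||'*'^')
                d=${rule#'@@||'}
                d=${d%'^'}
                if covers "$d" "$query"; then
                    echo allowed
                    return
                fi ;;
            '||'*'^')
                d=${rule#'||'}
                d=${d%'^'}
                if covers "$d" "$query"; then
                    result=blocked
                fi ;;
        esac
    done
    echo "$result"
}

# The two rules from the post, checked against two queries.
verdict ads.server-with-content-and-lots-of-ads.com \
    '||server-with-content-and-lots-of-ads.com^' \
    '@@||good-stuff.server-with-content-and-lots-of-ads.com^'
verdict good-stuff.server-with-content-and-lots-of-ads.com \
    '||server-with-content-and-lots-of-ads.com^' \
    '@@||good-stuff.server-with-content-and-lots-of-ads.com^'
```

The first call prints blocked and the second prints allowed, which is the behaviour the request is asking for.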

And... the most important point is, that adguard-home already has this feature. So I guess it should be doable.

Thx and greetings.

jfb · June 27, 2023, 2:34am

Yes, you can. Whitelist always takes precedence over blacklist in Pi-hole. The order of precedence is:

Exact Whitelist
Regex Whitelist
Exact Blacklist
Blocklist domains (AKA gravity)
Regex Blacklist

Once one of the conditions (starting from top and working to bottom) is met, the process stops.

Let's use your domains as an example:

server-with-content-and-lots-of-ads.com appears in your blacklist (as either a regex or ABP block).

Whitelist the domain good-stuff.server-with-content-and-lots-of-ads.com locally in Pi-hole and this domain will never be blocked, regardless of whether it appears in a local blacklist, gravity list or a regex blacklist.
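
The order of precedence and the example above can be sketched as a first-match-wins chain. This is only an illustration: the five shell variables stand in for Pi-hole's lists, gravity is reduced to exact matches, and classify, in_exact and in_regex are hypothetical helpers, not Pi-hole commands.

```shell
#!/bin/sh
# Illustrative stand-ins for the five lists (one entry per line).
EXACT_WHITELIST='good-stuff.server-with-content-and-lots-of-ads.com'
REGEX_WHITELIST=''
EXACT_BLACKLIST=''
GRAVITY='tracker.example'
REGEX_BLACKLIST='(^|\.)server-with-content-and-lots-of-ads\.com$'

# in_exact <list> <domain>: true if <domain> is a whole line of <list>.
in_exact() { printf '%s\n' "$1" | grep -Fxq -- "$2"; }

# in_regex <list> <domain>: true if any regex in <list> matches <domain>.
in_regex() {
    while IFS= read -r re; do
        [ -n "$re" ] || continue
        printf '%s\n' "$2" | grep -Eq -- "$re" && return 0
    done <<EOF
$1
EOF
    return 1
}

# classify <domain>: walk the lists top to bottom and stop at the first hit.
classify() {
    if   in_exact "$EXACT_WHITELIST" "$1"; then echo "allowed (exact whitelist)"
    elif in_regex "$REGEX_WHITELIST" "$1"; then echo "allowed (regex whitelist)"
    elif in_exact "$EXACT_BLACKLIST" "$1"; then echo "blocked (exact blacklist)"
    elif in_exact "$GRAVITY" "$1";         then echo "blocked (gravity)"
    elif in_regex "$REGEX_BLACKLIST" "$1"; then echo "blocked (regex blacklist)"
    else echo "allowed (no match)"
    fi
}

classify good-stuff.server-with-content-and-lots-of-ads.com
classify ads.server-with-content-and-lots-of-ads.com
```

The whitelisted subdomain never reaches the regex blacklist check, because the chain stops at the exact whitelist.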

B_Trug · June 27, 2023, 3:03am

Yes, you can. Whitelist always takes precedence over blacklist.

The point I tried to make is that the possibility of getting the whitelisting to the userbase is not there. But as a list maintainer I need the ability to push the whitelisting via a list to my users.

It just doesn't work to say to every user "yeah... our list is overblocking because of abp-style... and to correct it you have to edit the whitelist by hand"...

Because he will say to me: "or you will add @@||domain.com^ to your list and I will switch to Adguard Home".

I would rather see my users support pihole...

rdwebdesign · June 27, 2023, 3:07am

Pi-hole actually uses a different list format.

You can create host lists for Pi-hole.

jfb · June 27, 2023, 3:10am

Perhaps this is a problem that should be fixed at the list maintainer end, not changing our software.

Here is the problem I see with using public whitelists, particularly in this format.

A user downloads various public blacklists and whitelists and incorporates them into Pi-hole (assuming that Pi-hole is changed to import public whitelists).

A domain that the user doesn't want to be able to load (but some list maintainer thinks they should load) appears on a public whitelist.

What does the user do at that point, other than to find out which whitelist has the domain and drop that whitelist completely? The whitelist entry will override any blacklist entries.

The significant (in my opinion) advantage of our current setup is that only the user can whitelist domains and this must be done locally. The individual user (and only that user) has final control over which domains are absolutely allowed.

What is to stop a person (advertiser, scammer, person with malevolent intent, etc) from publishing a public whitelist which (among a few thousand entries) includes specific domains they want to be allowed on all Pi-holes?

B_Trug · June 27, 2023, 3:19am

Perhaps this is a problem that should be fixed at the list maintainer end

I would do it, if this were an option.

A domain that the user doesn't want to be able to load (but some list maintainer thinks they should load) appears on a public whitelist.

Yes, that is a possibility. A bad list maintainer could try to torpedo other blocklists. But the community is open enough to single those bad players out. And of course the whitelisted entries would appear listed in pihole, the same way it displays block listed entries already.

So the question remains whether a good function should go unimplemented just because a bad player could misuse it.

jfb · June 27, 2023, 3:21am

Exactly the case. As I noted, our current setup gives the end user of Pi-hole the ultimate and final control over what is whitelisted.

Out of curiosity, what lists are you maintaining and are any of your lists in our native hosts format?

B_Trug · June 27, 2023, 3:21am

No, it is not. Pihole has supported ABP-style since March 2023.

jfb · June 27, 2023, 3:23am

You use the term "support" quite loosely. We support a single and specific element that appears in ABP lists, and nothing more.

github.com/pi-hole/FTL

Add support for Adblock Plus domain lists

development ← new/adb_style_blocking

opened 08:21PM - 15 Feb 23 UTC

DL6ER

+183 -32

# What does this implement/fix?

This PR implements support for AdBlock Plus (ABP)-style domain lists in Pi-hole. To be precise, it adds support for the following domain matching syntax defined here (https://adblockplus.org/filter-cheatsheet#blocking2) at the time of opening this PR: [image]

We do *not* implement any other features such as *exception rules* as they are typically beyond what a DNS server can do (path information, for instance, is simply not available).

It should be noted that this new feature is not for free but the rather complex syntax means it comes at some computational costs (= delays in DNS replies if you are on low-end hardware). To mitigate this drawback, ABP-style matching is only enabled when FTL actually detects such domains in the `gravity` table. This shouldn't be the case for the vast majority of users using "normal" HOSTS-style or simple one-domain-per-line adlists as sources for Pi-hole.

"To be precise, it adds support for the following domain matching syntax defined here at the time of opening this PR:"

With "here" being: Adblock Plus filters explained

rdwebdesign · June 27, 2023, 3:29am

Initially, Pi-hole didn't support ABP-style lists at all. Currently, ABP style is only partially supported.

Maybe this can change in the future, if this Feature Request receives enough votes and somebody decides to code the necessary changes.

We are not against your request, but Feature Requests take time to get done, even when they receive a lot of votes.

My previous suggestion to create a list using Pi-hole hosts format was just an answer to your comment:

I would rather see my users support pihole

B_Trug · June 27, 2023, 3:29am

... what lists are you maintaining and are any of your lists in our native hosts format?

Maintaining specials/Blocklisten at master (RPiList/specials on GitHub) for my userbase at https://www.youtube.com/watch?v=bDKxCr7bOMs

I converted almost all lists to the new ABP-style format.
My lists had almost 70 million domains... the new ABP-style brought it down to 38 million.

B_Trug · June 27, 2023, 3:33am

You use the term "support" quite loosely.

Yes...but since this is the only ABP-style rule that makes a lot of sense it was a good addition. I know it took pihole 4 years to add it.

So since pihole basically "understands"... this one rule could be easily extended with another rule... the @@ before the || to achieve it.

chrislph · June 27, 2023, 4:41am

Here is a script which will let you add your own whitelists of domains in hosts format.

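#!/bin/bash
# Usage: ./whitelist <file>   (one domain per line, plain or hosts format)

while IFS= read -r line; do
  line=${line%%#*}                          # drop comments
  domain=$(awk '{print $NF}' <<< "$line")   # last field, so a leading IP is skipped
  [ -n "$domain" ] || continue              # skip blank lines
  pihole -w -nr --comment "Added from $1" "$domain"
done < "$1"
pihole restartdns reload-lists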

Call it, e.g., whitelist, put it in your pihole home directory and make it executable (chmod +x whitelist). Run the command with a whitelist of domains in hosts or list format. For example (showing the three common formats):

$ cat mywhitelist.txt
domain1.test
www.domain2.test
0.0.0.0 domain3.test
0.0.0.0 cdn01.domain4.test
127.0.0.1 domain5.test
127.0.0.1 safe.media.domain6.test

$ ./whitelist mywhitelist.txt 
  [i] Adding domain1.test to the whitelist...
  [i] Adding www.domain2.test to the whitelist...
  [i] Adding domain3.test to the whitelist...
  [i] Adding cdn01.domain4.test to the whitelist...
  [i] Adding domain5.test to the whitelist...
  [i] Adding safe.media.domain6.test to the whitelist...
  [✓] Reloading DNS lists

deHakkelaar · June 27, 2023, 5:43am

Or below one-liner:

DanSchaper · June 27, 2023, 6:14pm

I don't know the technical implications of this and what kind of performance issues this would generate (either in ingesting lists or in lookups for domain queries in operation) but what you are asking makes sense to me.

What the ABP format does is effectively a wildcard block and that does need to have a mechanism for fine tuning. I don't know how many parent domains being blocked would require child domains to be allowed, and any examples of that would be helpful.

I think it comes down to the user experience. I'm not happy with telling users that they need to manually tune lists because it's "The Right Way". I am aware of the 'danger' of a rogue whitelist and I've used that example as a reason to not implement non-local whitelisting in the past but I'm not sure if that belief is helpful or harmful for users.

DL6ER · June 27, 2023, 7:17pm

We had worked on subscribed whitelists (at that time still without ABP format support) some years ago but it was dropped after some longer discussions because of several concerns, partially mentioned above.

Let's have another look at this now, this time additionally with support for @@||a.b.com^ in addition to exact domains. To support this, I think we should add a possibility to say that an adlist is meant as "blocking" or "allowing".

Blocking adlists can contain either exact domains or ABP-style entries (||a.b.com^).
Allowing adlists can contain either exact domains or ABP-style entries (@@||a.b.com^).

maybe we should call them "subscribed allowlists" rather than "allow adlists" - I'm open for any suggestions

maybe this means we want to rename "adlists" to "subscribed blocklists", too?

Question 1:
If a blocking adlist has an @@||a.b.c^ should it automatically be added to the allowing list (even if the adlist is meant to be blocking)? I think yes, as this eases management of blocklists.

Question 2:
Should subscribed allowlists be allowed to only overwrite gravity domains? What I mean here is: Should your subscribed allowlist be able to overwrite your locally defined blocked domains? I don't think they should. They should only be able to undo overblocking caused by subscribed adlists.

yubiuser · June 27, 2023, 7:22pm

I think it should. For sure there will be all kinds of mixed-style lists out in the wild - we need to handle this gracefully.

I agree.

B_Trug · June 27, 2023, 11:14pm

Question 1:
If a blocking adlist has an @@||a.b.c^ should it automatically be added to the allowing list (even if the adlist is meant to be blocking)? I think yes, as this eases management of blocklists.

I imagine that this would mean a lot of code tweaking to make it work.

It would be easier to just add the import function for whitelist domains (even without ABP-style). This should be doable without tweaking the inner workings of pihole... just the webgui. Would be the fastest baby-step.

Since a whitelist entry outweighs any blocklist domain and abp-style domain, my feature request would have been met.

Transferring the @@||domain.com^ entry in a blocklist to the internal pihole whitelist would be an extra perk. But you would have to track all @@-whitelistings. When a user searches for a domain, your search would have to report which blocklist this whitelisting came from. Sounds like a lot of work.

If Pihole has importable dedicated whitelists, there is no need for the @@-ABP-style any more. This might address security concerns of hidden unblock domains carried in blocklists. I don't have that concern but it could be addressed this way.

Question 2:
Should your subscribed allowlist be able to overwrite your locally defined blocked domains? I don't think they should. They should only be able to undo overblocking caused by subscribed adlists.

Sounds complicated to code, since this would lead to a two-class whitelist-entry system. But I see what you are trying to do. Since a user can now neutralize any blocked domain with a whitelist entry, he might want to be able to neutralize a whitelisted domain as well.

Your pyramid of rights would be:

local whitelists (trumps all)
local blocked domains
imported whitelists (only trumps imported blocklists)
imported blocklists.
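
The four tiers above could be resolved like this. A sketch only, assuming exact-match lists held in shell variables with made-up .test entries; listed and resolve are hypothetical names and nothing here reflects how Pi-hole actually stores or queries its lists.

```shell
#!/bin/sh
# Illustrative data, one domain per line.
LOCAL_ALLOW='login.shop.test'
LOCAL_BLOCK='cdn.shop.test'
IMPORTED_ALLOW='cdn.shop.test
img.shop.test'
IMPORTED_BLOCK='img.shop.test
login.shop.test
ads.shop.test'

# listed <list> <domain>: true if <domain> is a whole line of <list>.
listed() { printf '%s\n' "$1" | grep -Fxq -- "$2"; }

# resolve <domain>: the highest tier that lists the domain decides.
resolve() {
    if   listed "$LOCAL_ALLOW" "$1";    then echo "allowed (local whitelist)"
    elif listed "$LOCAL_BLOCK" "$1";    then echo "blocked (local blocklist)"
    elif listed "$IMPORTED_ALLOW" "$1"; then echo "allowed (imported whitelist)"
    elif listed "$IMPORTED_BLOCK" "$1"; then echo "blocked (imported blocklist)"
    else echo "allowed (not listed)"
    fi
}

for d in login.shop.test cdn.shop.test img.shop.test ads.shop.test; do
    printf '%s: %s\n' "$d" "$(resolve "$d")"
done
```

Here cdn.shop.test stays blocked although an imported whitelist contains it, because the local block sits one tier higher. img.shop.test is unblocked, because the imported whitelist only has to beat an imported blocklist.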

Greetings

chrislph · June 28, 2023, 3:51am

There is a sort of tension between Pi-hole and ABP which I think is worth pondering on.

In terms of their advertised functions they are aligned. They are presented as ad-blockers. People deploy them because they want to see fewer adverts. They both work great and complement each other.

In terms of their operation they are not aligned. Pi-hole is working at the network level providing DNS services to all clients and their assorted web and non-web applications. ABP is working at the application level in web browsers only, with full access to the URLs and HTML elements.

Pi-hole blacklists of domains in hosts format make sense for a DNS server whose job is processing domains. ABP filter lists in ABP format make sense for a web application whose job is processing URLs and HTML elements.

This means there are people who share interest in the common function of blocking adverts and who generously take time to curate domain host lists for Pi-hole and ABP filter lists for ABP. Those lists are aligned to the respective product. For example Pi-hole lists will include entries for things like smart TVs and IoT devices, which makes no sense in an ABP list. Conversely ABP lists may include URL elements and attribute and CSS selectors which make no sense in a Pi-hole list.

For sure there is some value in using ABP domain filters to inform Pi-hole, but any nuance around the reason why certain filters are present is lost. An example could be if a domain and all subdomains are blocked (Pi-hole understands that) but there are certain key ABP Allowlist Resources which are whitelisted (Pi-hole does not see web traffic).

Without that nuance it may not be valid to interpret the ABP filter as a domain block, and so the need to whitelist certain domains, using more ABP filters, becomes a bit of a hack to fix what ended up being broken by doing so. The opening post of this thread describes exactly that scenario ("But that's my dilemma...").

My fear is that Pi-hole gets sucked into the rabbit hole of interpreting ABP filter lists which are intended for managing URLs and HTML elements, and trying to map the list curator's blacklist and whitelist filter inclusion decisions onto domain names.

This post isn't intended to disparage anything in here, it's just my own observations of how Pi-hole interacts with web-based blockers over the years. It's definitely worth exploring and testing out.

DL6ER · June 28, 2023, 4:30am

No, actually not too much. Implementing this is not too much effort; what is more important is (a) ease of use and (b) future extensibility and overall maintainability.

Thanks, we indeed need to be clear what we should do and how. So far we support only the wildcard blocking syntax ||block.me.and.children.com^. Exactly this and nothing else. If there are any extra options - like block only specific paths on this domain - we ignore this rule.

I do see how this can lead to using lists maybe only partially, however, there should be no misinterpretations and we are not over-blocking anything. Still seems better than doing nothing. This is also the only way to configure wildcards per gravity.

To be more specific, we are looking at adding "subscribed allowlists" here which can contain either

exact domain
ABP exception rule like @@||example.com^$document only with the option document
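
A rough sketch of how entries in such a subscribed allowlist might be filtered. It assumes that the bare form @@||example.com^ mentioned earlier in the thread is accepted alongside the $document form, and that every other option causes the rule to be ignored. The name parse_allow_entry and the labels "exact" and "wildcard" are invented for this example and do not come from FTL.

```shell
#!/bin/sh
# parse_allow_entry <line>
# Prints "exact <domain>" or "wildcard <domain>", or returns 1 for anything
# this sketch does not understand (other options, paths, empty lines).
parse_allow_entry() {
    entry=$1
    kind=exact
    domain=$entry
    case "$entry" in
        '@@||'*)
            kind=wildcard
            domain=${entry#'@@||'}
            case "$domain" in
                *'^$document') domain=${domain%'^$document'} ;;
                *'^')          domain=${domain%'^'} ;;
                *)             return 1 ;;   # any other option: ignore the rule
            esac ;;
    esac
    # Accept only plain host name characters; this rejects paths and leftovers.
    case "$domain" in
        '' | *[!A-Za-z0-9.-]*) return 1 ;;
    esac
    echo "$kind $domain"
}

for line in 'example.com' '@@||example.com^$document' '@@||example.com^$image'; do
    parse_allow_entry "$line" || echo "ignored: $line"
done
```

Ignoring unknown options rather than guessing at them mirrors what the post above says about the blocking side: a rule with extra options is simply skipped.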

Even though this is, once again, only partial support, it should avoid misunderstandings and make the most of what a network-wide DNS-based blocker can do.

Question 3:

Assume

||c.d.e^
@@||b.c.d.e^
||a.b.c.d.e^

c.d.e blocked,
x.c.d.e blocked,
x.x.x.x.c.d.e blocked,
b.c.d.e okay,
x.x.x.b.c.d.e okay, but
a.b.c.d.e blocked or okay?
x.x.x.a.b.c.d.e blocked or okay?

You see what I'm after here: Can an ABP exception rule be overwritten by a more specific block rule? I did not see a clear statement about this in the ABP filters description. It is mostly a performance question: Can we return immediately once we have found @@||b.c.d.e^ or do we have to go on and further chase down the rabbit hole until we (maybe) find an even more specific rule (may it be either blocking or an exception)?
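
To make the two possible answers concrete, here is a small sketch that evaluates the three rules above under both readings. Neither reading is claimed to be what ABP specifies or what FTL does; exception_first and most_specific are hypothetical names, and the lookup walks from the queried name up through its parent domains.

```shell
#!/bin/sh
# The three rules from Question 3, one per line.
RULES='||c.d.e^
@@||b.c.d.e^
||a.b.c.d.e^'

# has_rule <rule>: true if the exact rule text is in RULES.
has_rule() { printf '%s\n' "$RULES" | grep -Fxq -- "$1"; }

# Reading 1: any exception covering the name wins, however specific a block is.
exception_first() {
    name=$1
    hit=okay
    while :; do
        if has_rule "@@||${name}^"; then echo okay; return; fi
        if has_rule "||${name}^"; then hit=blocked; fi
        case "$name" in
            *.*) name=${name#*.} ;;   # strip the leftmost label
            *)   break ;;
        esac
    done
    echo "$hit"
}

# Reading 2: the most specific rule covering the name wins, block or exception.
most_specific() {
    name=$1
    while :; do
        if has_rule "@@||${name}^"; then echo okay; return; fi
        if has_rule "||${name}^"; then echo blocked; return; fi
        case "$name" in
            *.*) name=${name#*.} ;;
            *)   break ;;
        esac
    done
    echo okay
}

for q in c.d.e x.c.d.e b.c.d.e x.x.x.b.c.d.e a.b.c.d.e x.x.x.a.b.c.d.e; do
    printf '%-18s exception-first=%-8s most-specific=%s\n' \
        "$q" "$(exception_first "$q")" "$(most_specific "$q")"
done
```

The two readings agree on the first four names and differ only for a.b.c.d.e and its subdomains: okay under the first reading, blocked under the second.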