Feature request: Checkbox to convert list to abp format

PiHole supports adblock rules, for example ||...^, but not many list maintainers provide their lists in this format. For instance, https://dl.red.flag.domains/red.flag.domains.txt. While I did contact this maintainer, and they were surprised to learn PiHole supports this format and said they would provide a list in this format at a future date, there are plenty of other useful lists that don't support this format and maybe never will for various reasons.

I think it would be a cool idea if PiHole had something like a checkbox or switch that says: when downloading this adlist, also convert it to adblock format.

Thanks

Take a look at the announcement of Pi-hole support for ABP lists.

Pi-hole only supports the "Blocking by domain name" syntax you mention, which blocks domains plus their wildcard subdomains, while ABP itself can support other syntaxes relevant to browser-based filtering. So Pi-hole will block domains from hosts lists (such as the list you mention), and domains (and their wildcard subdomains) from ABP lists where there are entries in that single supported format.

Going the other way doesn't really work, since hosts entries would have to be turned into something that ABP interprets as a non-wildcard domain based entry, but ABP doesn't have such an interpretation since it works at the browser level. Turning them into wildcard entries means too much would be blocked by ABP, and turning them into exact address entries means too little would be blocked by ABP when a web-based pathname is present.

It's not Pi-hole's place really to try and second-guess this or curate such lists. That said, if you decide on which compromise you want to make, you can turn your hosts lists into ABP lists using a bash script. When run against a hosts list it would write out the desired ABP list.

I understand, however I don’t understand what second guessing and curating have to do with this.

Asking pihole to prepend ‘||’ and append ‘^’ to each valid domain line isn’t a whole lot to do. Yes, I could run a bash script, but I’d rather have this functionality natively, considering some of us manage several piholes for family, friends and a business or so. Besides, it’s a checkbox that says convert the file to Adblock rules. It shouldn’t be on by default, and heck, go put some logic in, like if it’s a hosts file then don’t process.

I’ll keep pestering list maintainers who are not aware pihole supports these rules but bringing this functionality to pihole certainly gives a lot more power to pihole without the fuss of having to maintain more things such as bash scripts and cron jobs on multiple computers

Take a list like the one you mentioned in your first post. Say it has the domain in it:

example.com

Pi-hole will ingest that into Gravity and that entry will block the exact domain example.com. Using this new feature Pi-hole would also write that to a new list in ABP format as:

||example.com^

and this list could be used by an ABP installation. But that new entry doesn't mean the same thing. That new entry will block example.com and all subdomains of example.com, which is not what the original list intended.

So maybe instead, Pi-hole writes it to an ABP list as:

|http://example.com/|

But that new entry only blocks that root domain, not paths following that domain, so is largely useless for blocking in web browsing.

In other words, Pi-hole interprets ABP entries of that first form as wildcard domains to block. But taking an exact domain from a hosts list and writing that back out as an ABP entry means Pi-hole has to make a decision as to what you want that exact domain entry to represent in your ABP list.

Is it a wildcard domain, which will catch all subdomains, which is not what the original exact domain entry was for, or is it an exact address entry, which is largely useless for web usage, which is where ABP is going to be running? Or is it some other syntax supported by ABP?

That's what I meant about second-guessing – at the DNS level the exact domain entry simply means "make this domain unavailable at the network level". Pi-hole doesn't care about access protocols like HTTP or URL pathnames. But ABP does care about those things, hence the richer syntax in ABP lists. There may be an ABP syntax which results in the same effect in the browser as if the whole domain, but not subdomains, were blocked.

If you do want to do this, a bash script can be written which will read each host list. For each line which is a domain, it is written to a new file with the required syntax added around it to make it into the kind of ABP operation you're looking for. This could be automated and run using cron, with the resulting lists pushed somewhere like Dropbox or a private web page for your family and business users to add to their ABP lists. Pi-hole updates the lists every weekend so running this on Sunday evening would have it ready for the coming working week.
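As a rough illustration of the conversion described above, here is a minimal sketch in Python rather than bash. The function names, the domain check and the file paths are my own assumptions for the example and are not anything Pi-hole ships. It keeps lines that are a bare domain (optionally preceded by 0.0.0.0 or 127.0.0.1), wraps each one as ||domain^, and skips everything else, including lines that are already in ABP format.

```python
import re

# Loose check for a plain domain name: dot-separated labels of letters,
# digits and hyphens, ending in an alphabetic TLD. Illustrative only; this
# is NOT Pi-hole's own validation and it ignores internationalised TLDs.
DOMAIN_RE = re.compile(
    r"^(?=.{1,253}$)(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$"
)


def to_abp_lines(lines):
    """Yield ||domain^ for every input line that holds a plain domain."""
    for raw in lines:
        # Drop trailing '#' comments and surrounding whitespace.
        text = raw.split("#", 1)[0].strip().lower()
        if not text:
            continue
        fields = text.split()
        # Accept "domain" or "0.0.0.0 domain" / "127.0.0.1 domain".
        if len(fields) == 2 and fields[0] in ("0.0.0.0", "127.0.0.1"):
            candidate = fields[1]
        elif len(fields) == 1:
            candidate = fields[0]
        else:
            continue
        if DOMAIN_RE.match(candidate):
            yield f"||{candidate}^"


def convert_file(src_path, dst_path):
    """Read a domain list from src_path and write the ABP form to dst_path."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for rule in to_abp_lines(src):
            dst.write(rule + "\n")
```

Something like convert_file("red.flag.domains.txt", "red.flag.domains.abp.txt") (both paths hypothetical) could then be run from cron after the source list has been downloaded.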

What else would it be running against anyway other than a hosts file? If it's an ABP file then use it directly in ABP as is.

While I respect your reply I think this is becoming more complicated than it should be, or maybe I suck at explaining.

The URL in my example is a list of domains that are extremely suspicious and most likely malicious. I very much want to block the root domain and all its subdomains. Think of it like this… if the attacker controls the root domain and it’s malicious, then there’s good reason to assume every subdomain is suspicious or malicious too. You are correct that a bash script could work, but I don’t want to manage bash scripts and cron jobs on top of pihole instances. I simply want to tell pihole: please explicitly treat this list of only domains as Adblock rules, and add the appropriate syntax by prepending and appending it to each line.

So basically…
-take list of malicious root domains
-turn into Adblock rules
-block root domain and all subdomains
-no bash no cron no problem

I see what you mean now. We had our wires crossed. I thought you were describing a feature whereby Pi-hole reads a hosts format list as normal, and, as part of its Gravity processing, uses it to create a new list in ABP format which can be used with ABP somewhere else.

I think you are mixing things.

Pi-hole never converts lists, it only reads them (and removes invalid entries).
Pi-hole is able to read two types of entries and they are treated differently:

  • host entries like example.com (this will intentionally block ONLY this domain);
  • ABP entries like ||example.com^ (this will intentionally block this domain and every subdomain).

Note:
Each list can have host entries, ABP entries or a mix of these entries.
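To make the difference between the two entry types concrete, here is a small Python model of the matching behaviour described above. It is only a sketch of the semantics, not how Pi-hole implements blocking internally, and the function and parameter names are made up for the example.

```python
def is_blocked(query, exact_domains, abp_domains):
    """Return True if `query` is blocked by either kind of entry.

    exact_domains: domains taken from host entries (exact match only).
    abp_domains:   domains taken from ||domain^ entries (domain + subdomains).
    """
    name = query.lower().rstrip(".")
    # Host entry: only the exact name is blocked.
    if name in exact_domains:
        return True
    # ABP entry: the domain itself and anything beneath it is blocked,
    # so test the name and each of its parent domains.
    labels = name.split(".")
    for i in range(len(labels)):
        if ".".join(labels[i:]) in abp_domains:
            return True
    return False
```

With exact_domains = {"example.com"} and abp_domains = {"example.org"}, www.example.com is not blocked but www.example.org is.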

As chrislph said, adding ||...^ to every entry in a list will change its meaning.

Your request is to force some lists to be converted to ABP entries, but Pi-hole uses the lists as they are. Creating/converting lists is a job for list developers. They should decide which domains and subdomains should be added or excluded.

You can ask for list maintainers to add this format.

I understand…
Hosts files
Adblock rules
And how pihole never converts lists.

I’m simply asking for a feature so I can do away with managing bash scripts and asking maintainers to provide this format.

I don’t want to
Convert all lists
Convert hosts files
Convert already Adblock rule lists

I want to convert generic lists that contain root domains that are marked as malicious or whatever, so I can block the entire domain, because otherwise adding the list to pihole won’t block anything but the domain listed. This was the whole point of adding Adblock rule support to pihole recently, because pihole would not block the entire domain unless you wanted to manage a ton of regexes.

If a list contains a domain in hosts format then the list maintainer wants to block that exact domain. If a list contains an entry in ABP domain format then the list maintainer wants to block that domain plus all its subdomains. The determination as to what degree to block a domain and/or its subdomains is made per-domain at the list level by the list maintainers. The risk vs usability is already baked into the list as a result.

If you think about what's in a hosts list it's lots of subdomains already. If a wildcard is applied to those then it would be referencing wildcard subdomains of exact subdomains which were already listed by a list maintainer for being those exact subdomains. If, instead, they are reduced to some root domain, that's not what the list maintainer intended when they added those exact subdomains of that root domain, and that will almost certainly lead to a lot of false-positives.

For example one exact AWS entry assumed as a wildcard for a root AWS domain could take out all AWS services available for that region, or else would be blocking non-existent subdomains of the specific service. This is best handled at source by the list maintainers.
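The over-blocking risk can be sketched in a few lines of Python. The hostnames below are placeholders standing in for a cloud provider, and naive_root is a deliberately simplistic, hypothetical helper (it just keeps the last two labels), so treat this as an illustration of the problem rather than a recommended way to find root domains.

```python
def naive_root(domain):
    """Keep only the last two labels of a domain.

    Deliberately naive: it ignores multi-label public suffixes such as
    co.uk, which is one more reason automatic reduction is risky.
    """
    return ".".join(domain.lower().rstrip(".").split(".")[-2:])


def covered_by_wildcard(query, wildcard_domain):
    """True if ||wildcard_domain^ would cover `query`."""
    q = query.lower().rstrip(".")
    return q == wildcard_domain or q.endswith("." + wildcard_domain)


# Placeholder names, not real services.
listed = "bad-bucket.storage.cloudhost.example"   # exact entry from a hosts list
unrelated = "console.cloudhost.example"           # legitimate name, same provider

# Wildcarding the listed name as-is only adds subdomains of that one host.
print(covered_by_wildcard(unrelated, listed))              # False
# Reducing it to a root domain first takes the whole provider with it.
print(covered_by_wildcard(unrelated, naive_root(listed)))  # True
```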

Due to the need to carefully switch the feature on and off to catch the correct root domains, that's why I'm suggesting that a script aimed at a specific hosts list to partially convert it, or even creating your own ABP domain-style list to support your specific needs, is likely to give more usable and precise results.

I really don’t understand why we keep talking about hosts files. I’ve never mentioned them other than saying I don’t want to convert them, and my example list in the very first post is not a hosts file.

I get the whole argument about the risks you’re bringing up but risks were already there before this feature request. For instance

It was reported May 18th and wasn’t fixed till July 1st.

Here’s another

https://nocdn.threat-list.com/1/domains.txt

It literally blocks Amazon.com with the rule ||Amazon.com^

So maybe we shouldn’t trust list maintainers anymore.

The whole point of this feature request is to take the middle man out, the bash script and the cron job. That’s it.

The assumptions of the risks are present with or without the feature request.

The type of list you reference in your opening post is a list in so-called "hosts format": a list of domain names, one per line, sometimes preceded by 0.0.0.0 or 127.0.0.1. Pi-hole reads the domains from these lists during a Gravity update.

Your feature request is wondering about a way to convert these lists into ABP wildcard domain format during processing. Pi-hole can also understand lists which have domains listed in this format and they are treated as wildcard domains.

The points made in response are pointing out that such a conversion involves making assumptions about the root domains to use, which will result in over-blocking (I gave an AWS example), or else blocking possible subdomains of known subdomains, which likely offers no value.

This would be a per list feature set to default off.

Any bad rules only affect that user not basically the entire pihole community.

I don’t have to manage scripts and cron jobs or create new scripts

Bad rules exist before this request and have broken the internet before and unfortunately it will happen again.

If everyone assumes I don’t understand what /etc/hosts is and how it works, or what hosts files look like, or the difference between an Adblock rule and a hosts file rule, or technically a line that contains a domain but doesn’t look or act like a hosts file yet gets treated as if it were one, then that’s fine.

If pihole only wants to ingest lists then that’s fine. I see huge value and convenience in a feature like this.

But at this point I don’t think anyone’s on point and or sees value so looks like I’ll go the scripting way.

I would say state your case (which you have) and answer the questions that come up, then see how many votes it gets. I'm willing to bet if a lot of votes are cast, they'll look at the suggestion.

This would be a good feature/option: being able to convert specific list(s) to wildcard. As mentioned, the default option is off, but it allows the user to override and convert to wildcard if they would like. Risk/outcome is completely with the end user. Worst case, toggle it off again. Just as you can enable/disable on a per-list basis, you could also convert to wildcard on a per-list basis. Generally you wouldn't use the toggle, but on certain specific list(s) you would now have the option. As in the example, if I have a specific list of known malicious domains, maybe I would like to block all subdomains as well, and this would allow me to easily convert that specific list, and automatically convert it thereafter whenever that list updates.