Adding blocklist urls to the Gravity DB from the command-line

Moderator note:

This post contains scripts, or links to scripts, that are untested by the Pi-hole team. We encourage users to be creative, but we are unable to support any potential problems caused by running these scripts


Hi all,
I am currently working on a pihole based project on very limited ARM SOC-based hardware, so I have chosen to not install the web frontend. I needed a way to bulk add blocklist URLs from the command-line to the adlist table in the Gravity database. I believe this was easily done in v4 by adding the URL to a newline in /etc/pihole/adlists.list, however, since v5 this no longer works. So I have re-implemented this using a Python script. The script checks that each URL in adlists.list is current (ie. returns a HTTP 200 status) and is not already in the adlist table of the Gravity db. It then adds the URL to the adlist table in the gravity db. After that it's a quick pihole -g to update Gravity and all should be well.

The script requires Python >3.6 (as it uses f-strings) and pip3 install requests sqlalchemy. Copy and paste the script below into your fav text editor and save it into /etc/pihole as the filename_of_you_choice.py. Put your blocklist URLs into adlists.list in the same directory, one URL per line. Now run the script python filename_of_your_choice.py. Note that the URLs need to point to the blocklist file, so for lists on Github it needs to be the "raw" github URL eg. https://raw.githubusercontent.com/blocklistproject/Lists/master/ads.txt

The script is pretty quick and dirty with minimal error checking and is missing the ability to delete URLs from the db that are in the adlist db table but absent for the adlists.list file. I need a script to add an allowlist URL and replicate the `pihole -g' behavior for it. I also need to host this on github. I'll get to these in the next week or so and update this post when I'm done.

The script is as follows:

from sqlalchemy import create_engine, Column, Integer, String, Boolean, exists
from sqlalchemy.orm import session, sessionmaker
from sqlalchemy.ext.declarative import declarative_base
from time import time
from requests import head

AD_LISTS_FILE = "adlists.list"


engine = create_engine(
    "sqlite:///gravity.db",
    echo=False,
    future=True,
)

Session = sessionmaker(bind=engine)
session = Session()

Base = declarative_base()


class Adlist_repo(Base):
    __tablename__ = "adlist"

    id = Column(Integer, primary_key=True)
    address = Column(String(100))
    enabled = Column(Boolean)
    date_added = Column(Integer)
    comment = String(100)


def url_exists(url):
    response = head(url.strip())
    if response.status_code < 400:
        return True
    else:
        return False


def url_in_db(session, Adlist_repo, url):
    return session.query(exists().where(Adlist_repo.address == url.strip())).scalar()


def add_to_db(session, Adlist_repo, url):
    adlist = Adlist_repo(
        address=url.strip(),
        enabled=True,
        date_added=int(time()),
    )
    session.add(adlist)
    session.commit()


def main():
    with open(AD_LISTS_FILE) as ad_lists_file:
        for ad_list in ad_lists_file.readlines():
            if url_exists(ad_list) and not url_in_db(session, Adlist_repo, ad_list):
                print(f"Adding URL to DB: {ad_list}")
                add_to_db(session, Adlist_repo, ad_list)
            elif not url_exists(ad_list):
                print(f"URL doesn't exist: {ad_list}")
            elif url_in_db(session, Adlist_repo, ad_list):
                print(f"URL already in the database: {ad_list}")


if __name__ == "__main__":
    main()

Hope that helps someone.

Regards,
Russell

PS. I release this Python Script under the MIT license.

Copyright 2021 Russell Armstrong

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

This one line SQL command will add a single adlist, without the bulk add or the duplicate checking. It doesn't require any additional software.

sudo sqlite3 /etc/pihole/gravity.db "INSERT INTO adlist (address, enabled, comment) VALUES ('https://domain.com/blocklist.txt', 1, 'comment');"

Pihole even includes and exposes a Sql shell which further reduce the needed software.

I'm not great with SQL and the solution I needed had to be scripted and deployable via Ansible. The code isn't perfect but it scratched my itch, ymmv. The code is under a permissive MIT license so people can take it and do whatever they want.

Regard,
Russell

A simple Bash Script

#!/bin/bash

#Get all URLs from file and add them to the gravity database
while IFS= read -r url; do
sqlite3 /etc/pihole/gravity.db "INSERT INTO adlist (address, enabled, comment) VALUES ('$url', 1, 'FROM ADLIST');
"
done < /etc/pihole/adlists.list

Comments can be changed

Edit:
if you place the script within the working directory of pihole container you can do the following.
docker exec -it pihole chmod +x /etc/pihole/bash_script.sh
docker exec -it pihole chmod 777 /etc/pihole/adlists.list
docker exec -it pihole ./etc/pihole/bash_script.sh