Add / remove regex blacklists on a cron schedule

Moderator note: This post contains scripts, or links to scripts, that are untested by the Pi-hole team. We encourage users to be creative, but we are unable to support any potential problems caused by running these scripts.


My aim was to avoid sqlite and stick to cron and pihole.

#!/bin/bash
#
# this script blocks and unblocks domains from a regex list
# it is called from cron. The /etc/crontab might look like this:
#   2  3    * * 2,4  root /etc/pihole/scheduled-actions -b
#  29 17    * * 2,4  root /etc/pihole/scheduled-actions -d
#

##### Beginning of configuration

# LIST - a list of domain regular expressions
# add the domain regexes in here, one per line, with single quote marks around them
declare -a LIST=(
  '*.twitter.com'
  'twitter.com'
)

# PHC - where to find the pihole command
PHC='/usr/local/bin/pihole'

##### end of configuration

print_usage() {
  >&2 echo "$0 {-b|-d}"
  >&2 echo "  use -d to delete (unblock) and -b to block"
}

print_config_err() {
  >&2 echo "File '$PHC' was not found or is not executable"
  >&2 echo "$0 has not been configured well."
  >&2 echo "Help it by doing these two things:"
  >&2 echo "  1. find pihole by executing this code:"
  >&2 echo "    sudo find / -name pihole -type f -executable"
  >&2 echo "  2. use that answer to edit the configuration contained in $0"
}

# make sure the configured pihole command exists and can be executed
if [[ -f "$PHC" && -x "$(realpath "$PHC")" ]]
then
  echo "Starting pihole to modify the regex blacklist"
else
  print_config_err
  exit 1
fi

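# PHARGS - the pihole arguments selected by -b or -d, one array element each
PHARGS=()

while getopts 'bd' flag; do
  case "${flag}" in
    d) PHARGS=(--regex --delmode) ;;
    b) PHARGS=(--regex) ;;
    *) print_usage
       exit 1 ;;
  esac
done

# refuse to run when neither -b nor -d was given
if [[ ${#PHARGS[@]} -eq 0 ]]; then
  print_usage
  exit 1
fi

for wc in "${LIST[@]}"; do
  # quote everything so the shell neither splits nor glob-expands the regex
  "$PHC" "${PHARGS[@]}" "$wc"
done

"$PHC" restartdns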



Just a small hint: the two entries in your list, '*.twitter.com' and 'twitter.com', are effectively one and the same in the regex world, and the first one is actually a typo.

The second will match twitter.com anywhere in a domain. So it will match

twitter.com
anything.twitter.com
any.thing.more.twitter.com

but also

some.twitter.com.different.net

This happens because you are not using any anchoring at all (^ $).
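
To see the effect of the missing anchors, here is a minimal sketch that runs the unanchored pattern against the domains listed above. It uses grep -E only as a stand-in for Pi-hole's own regex engine (an assumption, although such simple patterns should behave the same), and matches is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch only: grep -E stands in for Pi-hole's regex engine (assumption).

# matches PATTERN DOMAIN - hypothetical helper, succeeds if DOMAIN matches PATTERN
matches() {
  printf '%s\n' "$2" | grep -Eq -- "$1"
}

pattern='twitter.com'   # no ^ or $ anchors

for domain in \
  twitter.com \
  anything.twitter.com \
  any.thing.more.twitter.com \
  some.twitter.com.different.net
do
  if matches "$pattern" "$domain"; then
    echo "match:    $domain"
  else
    echo "no match: $domain"
  fi
done
```

All four domains are reported as matches, including the unrelated some.twitter.com.different.net.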

I said the first one is a typo because this is not how regular expressions work: * is a multiplier for whatever is in front of it. In your case there is nothing in front of the multiplier, so it is simply ignored, and your first rule effectively becomes

.twitter.com

Furthermore . is the wildcard character matching exactly one character. So this rule would also match atwitter.com. If you want to match only dots, you need to escape the wildcard character: \.
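
A quick way to check the difference between the wildcard and the escaped dot is the sketch below. Again, grep -E is just a stand-in for Pi-hole's regex engine (an assumption), and matches is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch only: grep -E stands in for Pi-hole's regex engine (assumption).

# matches PATTERN DOMAIN - hypothetical helper, succeeds if DOMAIN matches PATTERN
matches() {
  printf '%s\n' "$2" | grep -Eq -- "$1"
}

for pattern in '.twitter.com' '\.twitter\.com'; do
  for domain in atwitter.com www.twitter.com; do
    if matches "$pattern" "$domain"; then
      echo "$pattern matches $domain"
    else
      echo "$pattern does not match $domain"
    fi
  done
done
```

With the unescaped dot both domains match; with the escaped dot only www.twitter.com does.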


You see, the second rule you added already covers the first (redundant) one, but it may also have unintended side effects (by also matching some.twitter.com.different.net). If you really want to block twitter.com and all of its subdomains, but nothing else containing twitter.com, you should use a regex like

(\.|^)twitter\.com$

where \. matches a literal dot, ^ marks the beginning of the domain, ( | ) is an alternation, and $ anchors to the end of the domain. More info can also be found here: Pi-hole regex Tutorial.
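
The anchored regex can be tried out before it goes into the list. This is a minimal sketch that uses grep -E as a stand-in for Pi-hole's regex engine (an assumption); matches is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch only: grep -E stands in for Pi-hole's regex engine (assumption).

# matches PATTERN DOMAIN - hypothetical helper, succeeds if DOMAIN matches PATTERN
matches() {
  printf '%s\n' "$2" | grep -Eq -- "$1"
}

pattern='(\.|^)twitter\.com$'

for domain in \
  twitter.com \
  anything.twitter.com \
  any.thing.more.twitter.com \
  some.twitter.com.different.net \
  atwitter.com
do
  if matches "$pattern" "$domain"; then
    echo "blocked:     $domain"
  else
    echo "not blocked: $domain"
  fi
done
```

Only twitter.com and its subdomains are reported as blocked; some.twitter.com.different.net and atwitter.com are not.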


Thanks!
I completely ignored the items in the list and was only interested in the bashing.
Tutorial duly visited.