RegEx engine improvements

I just added a "quiet" regex-test mode which can directly be used within pihole -q

pihole-FTL -q regex-test "fbcdn.net"

results in

    (\.|^)fbcdn\.net$ matches (regex blacklist, DB ID 83)

(the matching regex is in bold when this is a terminal)

Does this solve the problem for query.sh?

pihole-FTL -q regex-test facebook.com
    ^(.+\.)?(facebook|fb(cdn|sbx)?|tfbnw)\.[^.]+$ matches (regex blacklist, DB ID 59)
    ^(.+\.)?(facebook|fb(cdn|sbx)?|tfbnw)\.[^.]+$ matches (regex whitelist, DB ID 93)

only shows the first match

pihole-FTL -q regex-test facebook.com "^(facbok\.cm){+3}$"
    ^(facbok\.cm){+3}$ matches

validates the domain name against a given regex, but the output still doesn't blend into the output of pihole -q (because of the extra word "matches").

In order to use pihole-FTL in pihole -q, I think it's best to use the exit codes (see example in earlier post) and let the pihole-FTL -q option simply suppress all output, which would eliminate the need for redirection (>/dev/null 2>&1). This would be identical to, for example, grep -q (-q, --quiet, --silent: suppress all normal output).
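For anyone not familiar with the grep -q convention referred to here, a minimal sketch follows. It uses grep itself as a stand-in matcher; domain_matches is a hypothetical helper name, not something that exists in pihole or pihole-FTL, and grep -E does not do approximative matching.

```shell
#!/bin/sh
# Illustration only: grep stands in for the matcher here.
# domain_matches is a hypothetical helper, not part of pihole or pihole-FTL.
domain_matches() {
    # $1 = domain, $2 = extended regex
    # -q suppresses all normal output; the result is carried by the
    # exit status alone (0 = match, 1 = no match, >1 = error).
    printf '%s\n' "$1" | grep -qE -- "$2"
}

# No redirection is needed, the caller decides what gets printed.
if domain_matches "fbcdn.net" '(\.|^)fbcdn\.net$'; then
    printf '%s\n' "match"
fi
```

The point is only that the caller branches on the exit status and prints whatever it likes, so nothing has to be thrown away with >/dev/null 2>&1.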

Doing it like this still pushes the regex through bash, where it may be open to exploitation. I'd prefer a solution where FTL reads directly from the database to avoid this.

Blend in as comes as soon as available? Is this really a needed feature?

I agree on this.

No, it doesn't.


pihole-FTL -q regex-test "aaabbbccc"
    aaa matches (regex blacklist, DB ID 90)
    bbb matches (regex blacklist, DB ID 91)
    ccc matches (regex blacklist, DB ID 92)

Validating something against a single user-provided regex was never possible in pihole -q. So I don't really see what you want to say here.

I assume pihole -q in version 5.2 will also list regexes with approximative matching. For that to happen, /opt/pihole/query.sh needs to be modified, since the pihole v5.0 version doesn't display these. All I'm trying to achieve is to ensure the output of pihole -q is properly (visually) formatted (eye candy).

When using this (the pihole-FTL -q solution) in /opt/pihole/query.sh

"regex" ) 
    for list in ${lists}; do
        if [[ "${domain}" =~ ${list} ]]; then
            printf "%b\n" "${list}";
        else
            /usr/bin/pihole-FTL -q regex-test "${domain}" "${list}"
        fi
    done;;

the output looks like this (extra indentation before the regex with Approximative matching):

image

When using this (using the pihole-FTL exit code) in /opt/pihole/query.sh

"regex" ) 
    for list in ${lists}; do
        if [[ "${domain}" =~ ${list} ]]; then
            printf "%b\n" "${list}";
        else
            if /usr/bin/pihole-FTL regex-test "${domain}" "${list}" >/dev/null 2>&1; then
                printf "%b\n" "${list}";
            fi
        fi
    done;;

the output is properly aligned:

image

If the -q option of pihole-FTL behaved like the -q option of grep, the second example could do without >/dev/null 2>&1.

Enough said on the subject, do as you please, I don't want to waste any more of your time.

Okay, I'm convinced. Please be forgiving with me when it appears I'm not picking up an idea. It doesn't always mean I don't like it. There is just too much stuff going on at the same time...

I pushed a change that should make -q really quiet. I intentionally left one exception in there: regex errors, where the error message is still logged on the terminal.

You may even want to leave the bash =~ completely out of this:

"regex" ) 
    for list in ${lists}; do
        if /usr/bin/pihole-FTL -q regex-test "${domain}" "${list}"; then
            printf "%b\n" "${list}";
        fi
    done;;

to ensure we don't get false-negatives. The regex-test should be fairly fast.
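One thing to watch in a loop of this shape is that an unquoted ${lists} undergoes word splitting and pathname expansion, so a regex containing * or spaces may not reach the matcher intact. Below is a minimal sketch of reading the regexes line by line instead. It assumes the regexes arrive one per line on stdin (I have not checked how query.sh actually builds ${lists}), and regex_test and print_matching_regexes are hypothetical names; regex_test stands in for pihole-FTL -q regex-test so the sketch runs without FTL.

```shell
#!/bin/sh
# regex_test is a hypothetical stand-in for
#   pihole-FTL -q regex-test "${domain}" "${regex}"
# so that this sketch runs without FTL installed. grep -E does not
# support approximative matching.
regex_test() {
    printf '%s\n' "$1" | grep -qE -- "$2" 2>/dev/null
}

# print_matching_regexes DOMAIN
# Reads one regex per line from stdin and prints those matching DOMAIN.
print_matching_regexes() {
    domain="$1"
    # IFS= and -r keep whitespace and backslashes in the regex intact;
    # the quoted expansions below prevent word splitting and globbing.
    while IFS= read -r regex; do
        [ -n "${regex}" ] || continue
        if regex_test "${domain}" "${regex}"; then
            printf '%s\n' "${regex}"
        fi
    done
}

printf '%s\n' 'aaa' 'bbb' '^x.*y$' | print_matching_regexes "aaabbbccc"
```

With the real binary, the body of regex_test would simply be the pihole-FTL call from the snippet above.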

Thanks for the suggestion, I should have seen this myself. Always good to have an expert opinion...
Your latest solution works like a charm, the resulting output is perfect.

Thanks again for your time and effort

So when can you release this? :slight_smile:

(Yes, I'm just kidding)

New feature: Apply regex only to a specific query type.

Example:

abc;querytype=AAAA

will block

dig AAAA abc

but not

dig A abc

This is still experimental. Tests would be appreciated.


Installed FTL (pihole checkout ftl new/tre-regex).
It apparently works for A and AAAA...

  • could you give an example of why/when you want to do this (would be useful)?
  • doesn't appear to work for other querytypes (NS)?
  • web interface turns (example) (.|^)google.com$;querytype=A into (.|^)google.com$;querytype=a (querytype is always saved in lowercase)

We have seen requests for blocking specific query types. This seems to be the simplest realization. I do not expect extensive usage. You typically want one for all.

True, support for further query types has just been added in

https://github.com/pi-hole/FTL/pull/819

This code has not yet reached the regex branch.

Yes. The query types are intentionally recognized case-insensitive to compensate for this.
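To make the syntax concrete, here is a minimal sketch of how an entry such as abc;querytype=AAAA could be split into its regex and its query type, with the type normalized so that a and A compare equal. This is not FTL's actual parser; regex_part and querytype_of are hypothetical helpers, and the sketch assumes the regex itself contains no ; character.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; this is not FTL's actual parser.
# Assumption: the regex itself does not contain a ';'.

# regex_part ENTRY: prints everything before the first ';'
regex_part() {
    printf '%s\n' "${1%%;*}"
}

# querytype_of ENTRY: prints the upper-cased query type,
# or an empty line when the entry has no ;querytype= option.
querytype_of() {
    case "$1" in
        *";querytype="*)
            type="${1#*;querytype=}"    # drop everything up to the keyword
            type="${type%%;*}"          # drop any further ;option
            printf '%s\n' "${type}" | tr '[:lower:]' '[:upper:]'
            ;;
        *)
            printf '\n'
            ;;
    esac
}

regex_part   '(\.|^)google\.com$;querytype=a'
querytype_of '(\.|^)google\.com$;querytype=a'
```

Because the type is upper-cased before comparison, it makes no difference that the web interface stores it in lowercase.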

I assume you refer to this topic (blocking specific query types).

I have been thinking about this and found that a feature is missing in pihole-FTL to turn this into a valid business case. I'll try to explain.

I'm already using the database schema that allows duplicate entries in the domainlist table, so entering an identical whitelist/blacklist entry is possible. This works great if you want to block something for all clients (default group), but allow access for some clients (example used in earlier conversations: allowfacebook group).

This method cannot be used with a regex like .*;querytype=AAAA, because as a whitelist regex it would allow all AAAA queries for certain clients (whitelist always wins). In order to use the above regex, it needs to be a blacklist regex entry targeting specific clients.

Now comes the dilemma. If I want to apply this regex to all but some clients (use it as a blacklist regex), I need to create a group with all the clients except the ones I want to allow making AAAA queries. This list can be very large and probably will not be effective (new clients aren't members of this group).

Most firewalls have a solution for this dilemma, simply specify the clients (IPs) you want to be unaffected by the rule, and invert the selection. The result is all clients (IPs) except the ones listed. It looks like this:

and the result is this:

For pihole, this would mean you assign a limited number of clients (IPs) to a group and invert the selection, thus effectively targeting all clients except the ones listed.

The above regex example would then target the ! AllowAAAAqueries group, making it a lot more effective in an environment where new clients come and go on a regular basis.

Something to consider while you're making all these great changes to pihole-FTL?

Immediately started working, will report anomalies

Blocking AAAA for all clients and allowing it only for some seems a legit use case (at least theoretically).

You should do the inverse: Create a group with the clients that should not match and add them in there. Then add the new regex only to Default which covers also the new clients.

Even though I tend to disagree with your conclusion above, I do agree that it is useful to be able to invert a regular expression altogether. Hence, I added the new keyword ;invert.
For instance,

^abc$;querytype=AAAA;invert

will not block abc with type AAAA (but everything else) for the clients attached to it.

This has the same effect as inverting the client selection, however, it is even somewhat more powerful.

Something wrong???

I installed the latest pihole-FTL version (pihole checkout ftl new/tre-regex) and the client name doesn't appear to be resolved anymore, only the IP is shown...

I haven't touched any code there, maybe you just have to wait a little longer. If they do not appear, check your /var/log/pihole.log to see whether there are corresponding PTR requests with answers.

Correct, the problem corrected itself.

Sorry...

I entered the blacklist regex

^abc$;querytype=a;invert

I expected a correct answer for the A record and no answer for AAAA (or anything else). However, these are the dig results (what am I missing?):

pi@raspberrypi:~ $ dig A abc.com

; <<>> DiG 9.11.5-P4-5.1+deb10u1-Raspbian <<>> A abc.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 57238
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;abc.com.                       IN      A

;; ANSWER SECTION:
abc.com.                2       IN      A       0.0.0.0

;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Jul 07 23:40:09 CEST 2020
;; MSG SIZE  rcvd: 41

pi@raspberrypi:~ $ dig AAAA abc.com

; <<>> DiG 9.11.5-P4-5.1+deb10u1-Raspbian <<>> AAAA abc.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12030
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1472
;; QUESTION SECTION:
;abc.com.                       IN      AAAA

;; AUTHORITY SECTION:
abc.com.                873     IN      SOA     ns-318.awsdns-39.com. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 86400

;; Query time: 1 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Jul 07 23:40:13 CEST 2020
;; MSG SIZE  rcvd: 114

If I disable the regex entry, I get all correct answers.

Sorry for wasting your time (you wanted feedback)...

This is awesome, can we have a subcategory for Feature Requests only about additional Regex features like this one? :slight_smile:

Your regex says

^abc$;querytype=a;invert

however your test is

dig A abc.com

^abc$ does not match abc.com.

The invert makes it match, however.
So the result you see is expected:

  • Block all A queries which match ^abc$ + invert = don't block A queries which match ^abc$ (but everything else because this is not-matching without invert!)
  • Don't do anything for any other type (you requested the regex to be valid only for A queries)
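The two rules above can be written down as a small decision function. This is only a model of the behaviour as described in this thread, not FTL code: would_block is a hypothetical helper, grep -E stands in for FTL's regex engine, and the query type is expected in upper case.

```shell
#!/bin/sh
# would_block is a hypothetical helper modelling the behaviour described
# in this thread; grep -E stands in for FTL's regex engine.
# Usage: would_block DOMAIN QTYPE REGEX [ONLY_TYPE] [invert]
# Returns 0 when the entry would block the query, 1 otherwise.
would_block() {
    domain="$1"; qtype="$2"; regex="$3"; only_type="$4"; invert="$5"

    # A regex restricted to one query type does nothing for other types.
    if [ -n "${only_type}" ] && [ "${qtype}" != "${only_type}" ]; then
        return 1
    fi

    if printf '%s\n' "${domain}" | grep -qE -- "${regex}"; then
        matched="yes"
    else
        matched="no"
    fi

    if [ "${invert}" = "invert" ]; then
        # invert flips the regex result: block exactly when it did NOT match
        [ "${matched}" = "no" ]
    else
        [ "${matched}" = "yes" ]
    fi
}

# ^abc$;querytype=a;invert against the queries from the dig test
would_block "abc.com" "A"    '^abc$' "A" invert && echo "A abc.com: blocked"
would_block "abc.com" "AAAA" '^abc$' "A" invert || echo "AAAA abc.com: not affected"
would_block "abc"     "A"    '^abc$' "A" invert || echo "A abc: not blocked"
```

Note that invert acts on the regex match only; the query type restriction is checked first and is never inverted.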

What the?...Sure he wants feedback. Even misunderstandings are important IMO as seeing users having issues understanding things may help writing the documentation in the end.

I assumed the invert option only affected the querytype, but after doing some more tests, I must conclude it works as follows (correct me if I'm wrong).

regex from this topic

.*;querytype=A;invert
  • pihole-FTL looks at the result of pihole-FTL regex-test "google.be" ".*;querytype=A", result: .*;querytype=A matches
  • pihole-FTL then inverts the result.

There appears to be no way to achieve what I wanted (hoped for) with a single regex: apply all rules (gravity, regex, ...) but only allow A queries for a specific device.

I totally misinterpreted the invert function, my mistake...