RegEx engine improvements

Ah, I see now where the warning came from. A simple logic error without consequences for the actual function. I pushed another update.

You can add DEBUG_REGEX=true to /etc/pihole/pihole-FTL.conf to get even more details about the compilation and execution of regex. Based on this data, it'll be easier to see why one device is getting an unexpected reply.

the regex errors are gone!

in the pihole-FTL log, with DEBUG_REGEX=true:

[2021-04-18 16:06:31.457 4370/T4374] Regex blacklist: Enabling regex with DB ID 158 for client 192.168.2.240

in the database

158	3	.*;querytype=!A;reply=NODATA	1	1618749334	1618749361
159	3	.*;querytype=ANY;reply=NODATA	1	1618749350	1618749350
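
For readers following along: the stored entries above appear to consist of a plain regex followed by semicolon-separated key=value options. Below is a minimal Python sketch of that reading. It is not FTL's parser. The names RegexEntry and parse_regex_entry are hypothetical, only the querytype and reply keys seen in this thread are handled, and the sketch naively assumes the regex part itself contains no semicolon.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RegexEntry:
    pattern: str                      # the regex itself, e.g. ".*"
    querytype: Optional[str] = None   # e.g. "A" or "ANY"
    invert_querytype: bool = False    # True when written as "!A"
    reply: Optional[str] = None       # e.g. "NODATA" or "NXDOMAIN"


def parse_regex_entry(entry: str) -> RegexEntry:
    """Split 'regex;key=value;...' into its parts (illustration only)."""
    pattern, *options = entry.split(";")
    parsed = RegexEntry(pattern=pattern)
    for option in options:
        key, sep, value = option.partition("=")
        if not sep:
            raise ValueError(f"option without '=': {option!r}")
        if key == "querytype":
            # A leading "!" is read here as "every type except this one".
            parsed.invert_querytype = value.startswith("!")
            parsed.querytype = value[1:] if parsed.invert_querytype else value
        elif key == "reply":
            parsed.reply = value
        else:
            raise ValueError(f"unknown option: {key!r}")
    return parsed
```

With the first row above, parse_regex_entry(".*;querytype=!A;reply=NODATA") yields the pattern ".*", the query type "A" with the inversion flag set, and the reply "NODATA".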

in the pihole log I find this:

Apr 18 16:07:47 dnsmasq[4370]: 29 192.168.2.240/37655 query[AAAA] mtalk.google.com from 192.168.2.240
Apr 18 16:07:47 dnsmasq[4370]: 29 192.168.2.240/37655 cached mtalk.google.com is regex blacklisted

in the pihole-FTL log, I find:

[2021-04-18 16:07:47.805 4370M] Regex blacklist (49, DB ID 158) >> MATCH: "mtalk.google.com" vs. ".*;querytype=!A;reply=NODATA"

in order to allow you to verify, pihole-FTL log attached.

pihole-FTL.zip (3.4 KB)

this time the reply is N/A (should this not be NODATA?)

Yes. Please also add DEBUG_QUERIES=true so we get more details in pihole-FTL.log and try again. The log will grow substantially. We only need the relevant lines from the query we're looking at here.

DEBUG_REGEX off (#), DEBUG_QUERIES=true

pihole log:

Apr 18 17:22:48 dnsmasq[18613]: 20 192.168.2.240/64738 query[AAAA] mtalk.google.com from 192.168.2.240
Apr 18 17:22:48 dnsmasq[18613]: 20 192.168.2.240/64738 cached mtalk.google.com is regex blacklisted

pihole-FTL log:

[2021-04-18 17:22:48.393 18613M] **** new UDP query[AAAA] query "mtalk.google.com" from eth0:192.168.2.240 (ID 20, FTL 10330, /root/project/src/dnsmasq/forward.c:1592)
[2021-04-18 17:22:48.393 18613M] mtalk.google.com is not known
[2021-04-18 17:22:48.394 18613M] Reply is 1
[2021-04-18 17:22:48.395 18613M] Set reply to NODATA (1)
[2021-04-18 17:22:48.395 18613M] Blocking mtalk.google.com as mtalk.google.com is regex blacklisted
[2021-04-18 17:22:48.395 18613M] Forcing next reply to 1
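
Since the log grows a lot with DEBUG_QUERIES=true, a small helper can cut it down to the lines of one query. This is a rough Python sketch, assuming every query starts with a "**** new" line that quotes the domain, as in the excerpt above. The function name lines_for_query is made up for illustration, and the sketch does not handle lines from other queries or threads being interleaved.

```python
from typing import Iterable, List


def lines_for_query(log_lines: Iterable[str], domain: str) -> List[str]:
    """Return each '**** new' line naming `domain` plus the lines after it,
    up to (not including) the next '**** new' line."""
    wanted: List[str] = []
    capturing = False
    needle = f'"{domain}"'  # the domain appears in double quotes in the log
    for line in log_lines:
        if "**** new " in line:
            # A new query starts; capture it only if it names our domain.
            capturing = needle in line
        if capturing:
            wanted.append(line.rstrip("\n"))
    return wanted
```

Called with an open log file and "mtalk.google.com", it would return blocks like the six lines shown above and skip everything else.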

It seems to work for me just fine, I'll have to try more to reproduce this, apparently.


dig AAAA aaaaa

Do you see similar things for NXDOMAIN?

regex deleted, new regex .*;querytype=!A;reply=NXDOMAIN

pihole log:

Apr 18 17:53:36 dnsmasq[25796]: 22 192.168.2.240/56410 query[AAAA] mtalk.google.com from 192.168.2.240
Apr 18 17:53:36 dnsmasq[25796]: 22 192.168.2.240/56410 cached mtalk.google.com is regex blacklisted

pihole-FTL log:

[2021-04-18 17:53:36.885 25796M] **** new UDP query[AAAA] query "mtalk.google.com" from eth0:192.168.2.240 (ID 22, FTL 11312, /root/project/src/dnsmasq/forward.c:1592)
[2021-04-18 17:53:36.885 25796M] mtalk.google.com is not known
[2021-04-18 17:53:36.886 25796M] Reply is 2
[2021-04-18 17:53:36.887 25796M] Set reply to NXDOMAIN (2)
[2021-04-18 17:53:36.887 25796M] Blocking mtalk.google.com as mtalk.google.com is regex blacklisted
[2021-04-18 17:53:36.888 25796M] Forcing next reply to 2

changed back to NODATA.

On the Windows desktop, the dig replies do actually register as NODATA. It is the device on 192.168.2.240 for which the query log always displays N/A.

I tested this now using dig and DEBUG_QUERIES=true. Both the dig command on the desktop and the device (192.168.2.240, a black box with no access possible, so its queries are initiated automatically) use UDP. The output in the log is identical. Is the device asking for something more, or something else, than a simple dig command does?

Attached is a packet capture, filtered on port 53 and IP 192.168.2.240. You'll notice some DNS requests to 8.8.8.8, but the firewall redirects these to Pi-hole.

packetcapture.zip (6.4 KB)

JUST NOTICED, one reply IP, the other NODATA

Just asking, since I don't know the code: could this have something to do with "Forcing next reply to 2", so that the wrong reply is "forced"?
/edit

edit2
removed obsolete screenshots, caused by my mistake (wrong group assignment)
/edit2

Who knows. The capture is a good idea. Could you generate the same from the Windows machine that behaves as expected? Comparing the two would make it a lot easier for me. Both the question and the reply look perfectly fine (and indeed NODATA).


So I tried to reproduce your setup here, correct me if my assumptions are wrong anywhere:

  1. Created a new group
  2. Assigned one client exclusively to this group
  3. Added your regex only for this new group

When I run dig AAAA google.com from localhost, it is not blocked (expected, as localhost is not a member of the group created for the regex above). If I query the same domain from 127.0.0.2, it is indeed blocked:

Does the link in your

Really go to the regex only meant for the other group?

I'm so sorry. I recreated the regex to use NODATA (after the NXDOMAIN test) but forgot to assign it to the correct group. I've fixed that, and AAAA queries are back to normal on that device (with IPv6). Again, sorry.

On the desktop (IPv4 only, client assigned to the regex via its group, with an OK result), I started a capture, executed dig, and stopped the capture.

packetcapture (2).zip (358 Bytes)

edit
Despite my stupid mistake, did you notice that in the last screenshot of that post, one query is answered with an IP and the other with NODATA?
/edit

edit2
Even on the desktop, which otherwise behaves as it should, there are some inconsistencies, so the behavior is actually a little bit random.


/edit2

edit3
started capture for the desktop, repeated dig until there was a different reply (3 digs required)

packetcapture (3).zip (516 Bytes)

ignore the queries for ctldl.windowsupdate.com, different machine...


/edit3

Any progress on this OR dead end?

FYI
Just noticed that a typo in this regex completely kills DNS resolution. For example, .*;querytype=!!A (a typo) is accepted by the web interface (success), but FTL notices the problem and generates a WARNING message in the FTL log:

[2021-05-02 21:37:50.291 30445/T30449] REGEX WARNING: Invalid regex blacklist filter ".*;querytype=!!A": Unknown querytype

While the incorrect regex is enabled, all queries (dig) are answered with 0.0.0.0.
Disabling or deleting the incorrect regex restores DNS resolution.
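
To illustrate the kind of check that would catch this typo before the entry is saved, here is a hedged Python sketch. It is not what the web interface or FTL does. The function name querytype_is_valid is invented, and the set of types is only an illustrative subset of DNS record types, not FTL's actual list.

```python
# Illustrative subset of DNS record types; NOT the list FTL actually accepts.
KNOWN_QUERY_TYPES = {"A", "AAAA", "ANY", "MX", "NS", "PTR", "SOA", "SRV", "TXT"}


def querytype_is_valid(value: str) -> bool:
    """Return True if `value` is a known type, optionally prefixed by one '!'."""
    # Strip at most one leading "!" (the inversion marker).
    name = value[1:] if value.startswith("!") else value
    return name in KNOWN_QUERY_TYPES
```

Under these assumptions, "!A" and "ANY" pass, while "!!A" is rejected because "!A" remains after the single inversion marker is stripped.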

My vote is dead end. It seems to be just too much. I can see additional selectors like querytype being very useful, but specifying the result in the regex itself seems just very wrong. Asking the regex experts @PromoFaux @DanSchaper for their opinion as well.

We should be cautious not to add more and more features if we want the Pi-hole ecosystem to stay fast and easy to maintain. Only this will ensure the project can live on in the future.

hahahaha, I don't know where anyone got the impression I was a Regex expert. I really shouldn't touch it...


I wasn't aware that it is that bad. I know you are the badass SQL master so I presumptively extended this to other strange languages (like regex) as well.

I rebased this branch on the current beta branch.

testing (ran pihole checkout ftl new/regex_replytype). first test result here, due to topic relevance.

edit
thus tested reply=ip, dig result:

; <<>> DiG 9.16.4 <<>> fuk01.ps4.update.playstation.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 19237
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;fuk01.ps4.update.playstation.net. IN   A

;; ANSWER SECTION:
fuk01.ps4.update.playstation.net. 2 IN  A       192.168.2.57

;; Query time: 4 msec
;; SERVER: 192.168.2.57#53(192.168.2.57)
;; WHEN: Fri Sep 10 09:55:10 Romance Daylight Time 2021
;; MSG SIZE  rcvd: 77

tested reply=nodata, dig result:

; <<>> DiG 9.16.4 <<>> fuk01.ps4.update.playstation.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 14311
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;fuk01.ps4.update.playstation.net. IN   A

;; Query time: 6 msec
;; SERVER: 192.168.2.57#53(192.168.2.57)
;; WHEN: Fri Sep 10 10:35:18 Romance Daylight Time 2021
;; MSG SIZE  rcvd: 61

tested reply=nxdomain, dig result:

; <<>> DiG 9.16.4 <<>> fuk01.ps4.update.playstation.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 28488
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;fuk01.ps4.update.playstation.net. IN   A

;; Query time: 5 msec
;; SERVER: 192.168.2.57#53(192.168.2.57)
;; WHEN: Fri Sep 10 10:41:36 Romance Daylight Time 2021
;; MSG SIZE  rcvd: 61

all modes tested and working, replies as expected.
/edit
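
For anyone repeating these tests, the three outcomes can be told apart from the dig header alone: the status field and the ANSWER count. The following Python sketch does that. The function name classify_dig_reply is made up for illustration, and it only looks at the two header fields, so a normal unblocked answer is indistinguishable from reply=ip here.

```python
import re


def classify_dig_reply(dig_output: str) -> str:
    """Classify dig output as 'nxdomain', 'nodata', 'answer' or 'other'."""
    status = re.search(r"status: (\w+)", dig_output)
    answers = re.search(r"ANSWER: (\d+)", dig_output)
    if status is None or answers is None:
        raise ValueError("no dig header found in the given text")
    if status.group(1) == "NXDOMAIN":
        return "nxdomain"
    if status.group(1) == "NOERROR":
        # NOERROR with an empty answer section is what NODATA looks like.
        return "nodata" if int(answers.group(1)) == 0 else "answer"
    return "other"
```

Fed the three outputs above, it would report "answer" for the reply=ip test, "nodata" for reply=nodata and "nxdomain" for reply=nxdomain.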

edit2
with (\.|^)ps4\.update\.playstation\.net$;reply=ip
dig AAAA fuk01.ps4.update.playstation.net
; <<>> DiG 9.16.4 <<>> AAAA fuk01.ps4.update.playstation.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 51782
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;fuk01.ps4.update.playstation.net. IN AAAA

;; ANSWER SECTION:
fuk01.ps4.update.playstation.net. 2 IN AAAA 2a02:xxxx:yyyy:zzzz:9435:92d8:531d:b64a

;; Query time: 5 msec
;; SERVER: 192.168.2.57#53(192.168.2.57)
;; WHEN: Fri Sep 10 14:32:23 Romance Daylight Time 2021
;; MSG SIZE rcvd: 89

so this also produces a correct response.
/edit2

Thanks for your testing, the feature PR can be found here:


I should mention there is another option:

;reply=none

The new feature has already been reviewed and approved. It'll be part of the next Pi-hole release and is already part of the currently running beta.

Documentation update still outstanding.

Has been implemented in

