Can't get group blocking to work

Nonstandard:

Everything is stock except I added a self-signed certificate to lighttpd. This should make no difference to DNS lookups, though.

Expected Behaviour:

I added two wildcards to a group blacklist called 'doomscrolling', and added a client to that same list. I expect that that client would not be able to look up hosts matching the wildcards.

Actual Behaviour:

Adding a client to the 'doomscrolling' group seems to instead completely exempt the client from any adblocking. I have tried a few variants:

  • putting the client in just the default group and the two doomscrolling regexes only in that group: this works as expected

  • putting the doomscrolling regexes in both groups and the client in the default group: this also works as expected

  • putting the doomscrolling regexes in both groups and the client in both groups: this also works as expected

  • putting the doomscrolling regexes only in the doomscrolling group and the client in both doomscrolling and default groups: this is the part that does not work. The client is able to look up hosts in the doomscrolling group

  • putting the doomscrolling regexes in the global scope (that is, outside of any particular group, even the default one): this also does nothing. I'm not sure if it's supposed to work or not, but it doesn't.

It feels like the client's assignment to the doomscrolling group just isn't taking effect at all, or totally skips all adblocking for that client. I know I've got the mac address right, and have deleted and recreated the entry a few times just to make sure. I'm using dig directly against the pi so it's not a host DNS caching thing.

Debug Token:

https://tricorder.pi-hole.net/RGLdkQaD/

According to your debug log, your doomscrolling group (group id 3) comprises the following regex only:

*** [ DIAGNOSING ]: Domainlist (0/1 = exact white-/blacklist, 2/3 = regex white-/blacklist)
 id  type  enabled  group_ids  domain               date_added           date_modified
 --  ----  -------  ---------  -------------------  -------------------  -------------------
 10     3        1  3          (\.|^)reddit\.com$   2021-11-28 07:23:23  2021-11-28 07:33:07
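
For reference, the regex in that entry can be tried outside Pi-hole with `grep -E`. This is only a sketch: it assumes POSIX extended regex matching behaves like Pi-hole's regex engine for this particular pattern, and the helper name `matches_regex` and the sample hostnames are made up for illustration.

```shell
#!/bin/sh
# Pattern copied from the domainlist entry above.
pattern='(\.|^)reddit\.com$'

# matches_regex HOST: exit status 0 if HOST matches the pattern.
# (Hypothetical helper, not part of Pi-hole.)
matches_regex() {
    printf '%s\n' "$1" | grep -Eq "$pattern"
}

# Illustrative hostnames only.
for host in reddit.com www.reddit.com notreddit.com reddit.com.example.org; do
    if matches_regex "$host"; then
        echo "$host: matches"
    else
        echo "$host: no match"
    fi
done
```

Under those assumptions, reddit.com and www.reddit.com match, while notreddit.com and reddit.com.example.org do not, so the pattern itself covers the domain and all of its subdomains.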

Run from your doomscrolling grouped laptop, please share the output of

nslookup pi.hole
dig reddit.com @192.168.1.96

This is due to your group configuration. None of your blocklists are applied to any group other than the default group, and the only regex blacklist entry applied to the doomscrolling group is for Reddit.

If you want the rest of the adblocking applied to the doomscrolling group, you will need to assign your adlists to that group as well as the default group.

I removed all the relevant config, added it back in, and re-ran diagnostics. Same behavior.
New debug token: https://tricorder.pi-hole.net/2jbUIJFg/

Here's the output you wanted:

eosborne@air words % nslookup pi.hole
Server:		192.168.1.96
Address:	192.168.1.96#53

Name:	pi.hole
Address: 192.168.1.96
eosborne@air words % dig @pi.hole www.reddit.com

; <<>> DiG 9.10.6 <<>> @pi.hole www.reddit.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 8452
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;www.reddit.com.			IN	A

;; ANSWER SECTION:
www.reddit.com.		273	IN	CNAME	reddit.map.fastly.net.
reddit.map.fastly.net.	4	IN	A	151.101.117.140

;; Query time: 4 msec
;; SERVER: 192.168.1.96#53(192.168.1.96)
;; WHEN: Sun Nov 28 17:07:30 EST 2021
;; MSG SIZE  rcvd: 94

Clearly *.reddit.com isn't blocked.

This time I put *.reddit.com in both default and doomscrolling:

*** [ DIAGNOSING ]: Domainlist (0/1 = exact white-/blacklist, 2/3 = regex white-/blacklist)
 id     type  enabled  group_ids     domain                                                                                                date_added           date_modified        comment
 -----  ----  -------  ------------  ----------------------------------------------------------------------------------------------------  -------------------  -------------------  --------------------------------------------------
 1        2         1  0             (\.|^)newrelic\.com$                                                                                  2021-05-08 07:36:11  2021-11-28 17:05:46
 12        3        1  3             (\.|^)reddit\.com$                                                                                    2021-11-28 07:38:25  2021-11-28 17:05:52
 13        3        1  3             (\.|^)fefoo\.com$                                                                                     2021-11-28 07:38:36  2021-11-28 17:05:56

As far as I can tell I have configured things correctly.

  • The doomscrolling group exists:

My laptop (MAC ends in e0:79, aka air.local) is in it:


The two wildcard blocks I want are in the doomscrolling group:

and yet it doesn't work.

This is the same behavior I get if I put my laptop only in the doomscrolling group.

Adding a client to the 'doomscrolling' group seems to instead completely exempt the client from any adblocking.

This was me not explaining things well. I realize that if I just put two domains in a group and tie my laptop to that group that I will only block two domains. My point was that it's not blocking those two domains when I do that. I replied to another response in this thread; I added my laptop to two groups (default and doomscrolling) and adblocking works for stuff in the default group but not doomscrolling.

(You've changed your configuration, and you didn't run the exact command that I asked for, both of which make it potentially harder for us to help you analyse your issue.
At the same time, I don't think that matters much so far in this case, but please try to stick with the commands as supplied, and to keep your configuration stable if possible. :wink: )

No.
Your output shows it is still only group 3 (doomscrolling).

I concur that your dig result shows that www.reddit.com wasn't blocked.
This could suggest that Pi-hole isn't able to correctly identify the requesting client.

Could you provide the corresponding log entries from /var/log/pihole.log?
Run from your Pi-hole host machine, the following statement may help with finding them:

grep www.reddit.com /var/log/pihole.log

EDIT: This should allow us to see whether the DNS request is processed by Pi-hole at all, and if so, which client IP address has issued that request.

Also, please provide the output for the following commands, run from your doomscrolling grouped laptop:

ip address
sudo arp -a

EDIT:
The first command should show us whether there are other interfaces which your client may employ to submit the request, and what MACs and IPs are associated with those.
The second command should reveal whether Pi-hole and your laptop are on the same link, as your laptop's MAC may be hidden behind some layer 3 network equipment, and probably whether your client would employ MAC address randomisation.

Sorry about changing the config. I've left it alone now.

I just tried a lookup, here's the log:

    pi@pi:~ $ grep www.reddit.com /var/log/pihole.log
    Nov 29 08:01:39 dnsmasq[26777]: query[A] www.reddit.com from 192.168.1.163
    Nov 29 08:01:39 dnsmasq[26777]: forwarded www.reddit.com to 127.0.0.1
    Nov 29 08:01:39 dnsmasq[26777]: reply www.reddit.com is <CNAME>
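
A rough sketch of how log lines like the three above could be condensed to "which client asked for what". It assumes the `query[TYPE] NAME from IP` layout seen in this log; `summarize_queries` is a made-up helper name, not a Pi-hole command.

```shell
#!/bin/sh
# summarize_queries: read pihole.log-style lines on stdin and print
# "<client IP> <query type> <domain>" for every query line.
# (Hypothetical helper; assumes the field layout shown in the log above.)
summarize_queries() {
    awk '$5 ~ /^query\[/ && $7 == "from" {
        qtype = $5
        sub(/^query\[/, "", qtype)   # strip the leading "query["
        sub(/\]$/, "", qtype)        # strip the trailing "]"
        print $8, qtype, $6
    }'
}

# Demo on the three lines from the log above.
summarize_queries <<'EOF'
Nov 29 08:01:39 dnsmasq[26777]: query[A] www.reddit.com from 192.168.1.163
Nov 29 08:01:39 dnsmasq[26777]: forwarded www.reddit.com to 127.0.0.1
Nov 29 08:01:39 dnsmasq[26777]: reply www.reddit.com is <CNAME>
EOF
# prints: 192.168.1.163 A www.reddit.com
```

On the Pi itself this could be fed from the real log, for example `grep www.reddit.com /var/log/pihole.log | summarize_queries`.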

My laptop is a mac so ip address isn't a thing, but:

	eosborne@air ~ % ifconfig en0
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
	options=6463<RXCSUM,TXCSUM,TSO4,TSO6,CHANNEL_IO,PARTIAL_CSUM,ZEROINVERT_CSUM>
	ether 50:ed:3c:30:e0:79
	inet6 fe80::cdd:9857:8e8c:242a%en0 prefixlen 64 secured scopeid 0xb
	inet 192.168.1.163 netmask 0xffffff00 broadcast 192.168.1.255
	nd6 options=201<PERFORMNUD,DAD>
	media: autoselect
	status: active

You can see that the IP address is 192.168.1.163 and the MAC ends in e0:79, which I think is what you were looking for from the arp output.

For completeness' sake, here's arp output for that IP address from both my mac and the pihole machine:

mac:

eosborne@air ~ % arp -a | grep 163
<stdin>
air.home.notcom.com (192.168.1.163) at 50:ed:3c:30:e0:79 on en0 ifscope permanent [ethernet]

pi.hole:

    pi@pi:~ $ arp -a | grep 163
    air.home.notcom.com (192.168.1.163) at 50:ed:3c:30:e0:79 [ether] on eth0

and there are no other 192.168.1.* interfaces on my mac that this query could be going out:

    eosborne@air ~ % ifconfig | egrep '192|^[a-z]'
    lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
    gif0: flags=8010<POINTOPOINT,MULTICAST> mtu 1280
    stf0: flags=0<> mtu 1280
    anpi0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    anpi1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    en3: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    en4: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    en1: flags=8963<UP,BROADCAST,SMART,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
    en2: flags=8963<UP,BROADCAST,SMART,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
    ap1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        inet 192.168.1.163 netmask 0xffffff00 broadcast 192.168.1.255
    awdl0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
    llw0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    bridge0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    utun0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1380
    utun1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 2000
    utun2: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1000

Almost. :wink:
The main intention of arp was to see if your laptop has a same-link connection to your Pi-hole machine (I've also edited my above post to clarify the intention of each command).

This seems to be the case, so Pi-hole should be able to identify your client via its :79 MAC address.
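
The same-link check can be sketched as a small comparison of the two `arp -a` outputs posted earlier. This assumes the `name (ip) at mac ...` layout seen in those outputs; `mac_for_ip` is a hypothetical helper, and macOS may print octets without leading zeros, which this sketch does not normalize.

```shell
#!/bin/sh
# mac_for_ip IP: read `arp -a` output on stdin and print the MAC
# address listed for IP. (Hypothetical helper; assumes the
# "name (ip) at mac ..." layout shown in this thread.)
mac_for_ip() {
    awk -v ip="($1)" '$2 == ip && $3 == "at" { print $4 }'
}

# The two arp lines posted above, one from the Pi and one from the laptop.
pi_line='air.home.notcom.com (192.168.1.163) at 50:ed:3c:30:e0:79 [ether] on eth0'
laptop_line='air.home.notcom.com (192.168.1.163) at 50:ed:3c:30:e0:79 on en0 ifscope permanent [ethernet]'

pi_mac=$(printf '%s\n' "$pi_line" | mac_for_ip 192.168.1.163)
laptop_mac=$(printf '%s\n' "$laptop_line" | mac_for_ip 192.168.1.163)

if [ -n "$pi_mac" ] && [ "$pi_mac" = "$laptop_mac" ]; then
    echo "same MAC seen on both sides: $pi_mac"
else
    echo "MAC mismatch or no ARP entry"
fi
```

If the Pi saw a router's MAC instead of the laptop's, the two values would differ, which would point at layer 3 equipment between the two machines.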

Your debug log suggests you've correctly set up and enabled your doomscrolling group (3), your reddit regex is applied to that group and enabled, and your :79 MAC client is also correctly configured for your groups 3 as well as 0 (Default), and there are also no whitelist entries that would take precedence.

Yet the laptop's .163 IP associated with that MAC is registered in Pi-hole's log, and the log shows the query wasn't blocked.

I cannot explain that behaviour, unless...
Your laptop's ifconfig output and your debug log would hint that you may be employing some kind of VPN software.

If so, is your laptop perhaps configured to connect to your Pi-hole via a VPN tunnel, despite being a same-link neighbour of your Pi-hole machine?
If that is indeed the case and intended, you could check whether identifying your client by IP address instead of MAC solves your issue.

I cannot explain that behaviour, unless...
Your laptop's ifconfig output and your debug log would hint that you may be employing some kind of VPN software.

I do have a VPN client on my laptop but it's not enabled.
Just to remove all possible doubt, I added another MAC address to the doomscrolling group (but not the Default group). This new MAC (00:11:32:28:16:E8) is a Synology NAS with a single network interface, no VPN software, nothing. Same problem:

    eric@DiskStation:~$ nslookup www.reddit.com 192.168.1.96
    Server:        192.168.1.96
    Address:    192.168.1.96#53

    Non-authoritative answer:
    www.reddit.com    canonical name = reddit.map.fastly.net.
    Name:    reddit.map.fastly.net
    Address: 151.101.117.140

I then figured I'd remove my laptop mac from the doomscrolling group and play around with adding things by IP. When I removed my laptop mac, that went fine. When I tried to add my laptop IP, the gui reported no group entries whatsoever!

This isn't that big a deal because as far as I can tell groups never worked, and the non-group (global?) behavior still works.

I then tried to add my laptop by finding it in the dropdown box and the GUI reported that it successfully added the MAC associated with it, but the list of configured clients still reports zero.

I have to guess that this smells like a corrupted database or something, but now I'm out of my element.

So....last night out of the blue my pi stopped responding to DNS queries altogether. I had made no changes, and it wasn't working for any clients - phones, tablets, whatever. I've been running pihole for a few years now with no issues, which makes me think it's got something to do with all the recent experiments I've been doing with groups.

I had to recover quickly so I just rebooted the pi and it started working again. All the group stuff still doesn't work, though. It's in this state where it doesn't show any clients, but if I try to re-add my MAC address (the one I've been using to test all along), I get this error:

So my money's still on a corrupt DB or something out of sync between the UI and whatever's behind it. I've stopped using groups for now and hope that'll keep the pi from going deaf like it did last night. If it happens again I'll get pihole -d, which I failed to do last night.

I agree that something is definitely off in your Pi-hole installation.

If (file system) corruption is happening on your device, your debug logs so far wouldn't suggest it affected your database (your group assignments were correctly reflected in the retrieved database information) nor any other parts of your system.

Just a thought: Has that been running on the same SD card since?
Maybe it's starting to wear out.

As I'm out of ideas why your Pi-hole misbehaves with regard to group management, I've also contacted development to take a closer look.

That may be able to explain the UNIQUE constraint violation reported in your most recent screenshot.
But such a mismatch wouldn't have an effect on Pi-hole's filtering behaviour, as long as the database would still be holding the correct information.

Could you please provide a new debug token?

Just a thought: Has that been running on the same SD card since?
Maybe it's starting to wear out.

It's on SSD; this was a fresh install about a month ago.

Could you please provide a new debug token?

https://tricorder.pi-hole.net/43cjijCM/

Thanks!

Now, that debug log has:

*** [ DIAGNOSING ]: contents of /var/log/lighttpd

-rw-r--r-- 1 www-data www-data 30K Nov 30 07:52 /var/log/lighttpd/error.log

   -----tail of error.log------
   2021-11-30 05:32:02: (mod_openssl.c.1746) SSL: 1 error:1422E0EA:SSL routines:final_server_name:callback failed 
   2021-11-30 06:07:12: (mod_fastcgi.c.421) FastCGI-stderr: PHP Warning:  SQLite3Stmt::execute(): Unable to execute statement: database disk image is malformed in /var/www/html/admin/scripts/pi-hole/php/groups.php on line 201

You could move that malformed db out of the way, and try to recover from it by running the following series of commands:

sudo service pihole-FTL stop
sudo mv /etc/pihole/gravity.db ~/gravity.malformed.db
sudo sqlite3 ~/gravity.malformed.db ".recover" | sudo sqlite3 /etc/pihole/gravity.db
sudo service pihole-FTL start

If .recover fails for any reason, you'd have to remove your gravity.db and recreate it from scratch:

sudo rm /etc/pihole/gravity.db
pihole -g -r

Warning:
You may lose (when .recovering) or will lose (when removing) all of your Group assignments, including blocklist configurations.

You may try to export and reimport your settings via Pi-hole's Teleporter UI, but of course, that may not produce the expected results when your database is already affected adversely.

I decided overkill was a valid form of kill and went straight to pihole -g -r. Good news is that I'm no longer getting errors when adding my laptop to the group database. Bad news is that my block for *.reddit.com still doesn't work.

New debug log is: https://tricorder.pi-hole.net/DL2WgcJO/

thanks!

From the terminal of device "air", the only device in the doomscrolling group, what is the output of the following:

nslookup reddit.com

Then, from the Pi terminal, what is the output of

tail -n100 /var/log/pihole.log | grep reddit.com

Edit: then repeat these two commands for the domain fefoo.com

    eosborne@air ~ % nslookup reddit.com
    Server:     192.168.1.96
    Address:    192.168.1.96#53

    Non-authoritative answer:
    Name:   reddit.com
    Address: 151.101.65.140
    Name:   reddit.com
    Address: 151.101.1.140
    Name:   reddit.com
    Address: 151.101.129.140
    Name:   reddit.com
    Address: 151.101.193.140

    eosborne@air ~ % nslookup fefoo.com
    Server:     192.168.1.96
    Address:    192.168.1.96#53

    Non-authoritative answer:
    Name:   fefoo.com
    Address: 69.163.217.207

and

    pi@pi:/var/log $ tail -n100 /var/log/pihole.log | egrep "fefoo.com|reddit.com"
    Nov 30 15:12:48 dnsmasq[14448]: query[A] reddit.com from 192.168.1.163
    Nov 30 15:12:48 dnsmasq[14448]: forwarded reddit.com to 127.0.0.1
    Nov 30 15:12:48 dnsmasq[14448]: reply reddit.com is 151.101.65.140
    Nov 30 15:12:48 dnsmasq[14448]: reply reddit.com is 151.101.1.140
    Nov 30 15:12:48 dnsmasq[14448]: reply reddit.com is 151.101.129.140
    Nov 30 15:12:48 dnsmasq[14448]: reply reddit.com is 151.101.193.140
    Nov 30 15:12:56 dnsmasq[14448]: query[A] fefoo.com from 192.168.1.163
    Nov 30 15:12:56 dnsmasq[14448]: forwarded fefoo.com to 127.0.0.1
    Nov 30 15:12:56 dnsmasq[14448]: reply fefoo.com is 69.163.217.207

In your most recent debug log, laptop is part of group "Default", no other groups exist. But I guess this is expected as you started afresh.

I don't think there was anything wrong with your configuration before, we just have to find out why your client isn't correctly identified by its MAC address. Due to technical limitations of the underlying protocols (UDP or TCP), your Pi-hole can only ever see the IP address of the devices sending queries. We implement MAC-based clients by trying to derive a MAC from the IP address using the ARP knowledge we have in our database.

Please try the following steps:

  1. Check the database
    1.1. Check if the database returns the expected IP addresses for the MAC address of your client:

    sqlite3 /etc/pihole/pihole-FTL.db "SELECT ip FROM network_addresses JOIN network ON network_addresses.network_id = network.id WHERE network.hwaddr = '50:ed:3c:30:e0:79';"
    

    (you may have to change the MAC at the end of the command above)

    If this returns addresses, please check if the current addresses of this machine are all included here. There may be outdated addresses included, this is harmless.

    1.2. Also check the reverse (IP to MAC address) relation using

    sqlite3 /etc/pihole/pihole-FTL.db "SELECT hwaddr FROM network WHERE id = (SELECT network_id FROM network_addresses WHERE ip = '192.168.1.163' GROUP BY ip HAVING max(lastSeen));"
    

    (you may have to change the IP address in the command above)

  2. Change the client to be identified by its IP rather than its MAC just to check if this would fix it

  3. Enable ARP debugging in FTL. For this, please add

    DEBUG_CLIENTS=true
    

    to the file /etc/pihole/pihole-FTL.conf (create if it does not exist) and run

    pihole restartdns
    

    Note that debugging has no influence on the behavior of your Pi-hole other than that your log file will grow over time.
    The log file /var/log/pihole-FTL.log should now contain a lot of useful information about the clients that are active in your network. Try searching the file for both the IP address of your device and its MAC address. We are looking for lines like

    Querying gravity database for client with IP <IP> ...
    

    followed by either

    --> Found record for <IP> in the client table (group ID ...)
    

    or

    --> No record for <IP> in the client table
    Querying gravity database for MAC address of <IP>...
    

    and then, hopefully,

    --> Querying client table for <MAC>
    

    and

    --> Found record for <MAC> in the client table (group ID ...)
    

    If this is not what we're seeing, we can play with a few more debug options.
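
A minimal sketch of how the log search in step 3 could be automated. It relies only on the message templates quoted above; the sample lines in the demo are built from those templates with the address values from this thread filled in, not copied from a real log, and `classify_client` is a hypothetical helper name.

```shell
#!/bin/sh
# classify_client IP MAC: read pihole-FTL.log lines on stdin and report
# how (or whether) the client was found in the client table.
# (Hypothetical helper; matches the message templates quoted above.)
classify_client() {
    ip=$1
    mac=$2
    log=$(cat)
    if printf '%s\n' "$log" | grep -Fq "Found record for $ip in the client table"; then
        echo "identified by IP"
    elif printf '%s\n' "$log" | grep -Fiq "Found record for $mac in the client table"; then
        echo "identified by MAC"
    else
        echo "not identified"
    fi
}

# Sample input assembled from the templates above (illustrative only).
classify_client 192.168.1.163 50:ed:3c:30:e0:79 <<'EOF'
Querying gravity database for client with IP 192.168.1.163...
--> No record for 192.168.1.163 in the client table
Querying gravity database for MAC address of 192.168.1.163...
--> Querying client table for 50:ed:3c:30:e0:79
--> Found record for 50:ed:3c:30:e0:79 in the client table (group ID ...)
EOF
# prints: identified by MAC
```

Against the real file this could be run as `classify_client 192.168.1.163 50:ed:3c:30:e0:79 < /var/log/pihole-FTL.log`; a result of "not identified" would suggest the IP-to-MAC step is where things go wrong.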

1.1:

    pi@pi:~ $ arp -a | grep 163
    air.home.notcom.com (192.168.1.163) at 50:ed:3c:30:e0:79 [ether] on eth0
    pi@pi:~ $ sqlite3 /etc/pihole/pihole-FTL.db "SELECT ip FROM network_addresses JOIN network ON network_addresses.network_id = network.id WHERE network.hwaddr = '50:ed:3c:30:e0:79';"
    192.168.1.236
    192.168.1.163

1.2:

    pi@pi:~ $ sqlite3 /etc/pihole/pihole-FTL.db "SELECT hwaddr FROM network WHERE id = (SELECT network_id FROM network_addresses WHERE ip = '192.168.1.163' GROUP BY ip HAVING max(lastSeen));"
    Error: database disk image is malformed

so...that's interesting and I'll stop here for now.
This is after a pihole -g -r. Anything else you want me to try?

This corruption is likely the reason. FTL cannot perform the required IP-to-MAC lookup and assumes your client belongs to the default group (so that it is not left without any blocking at all).

pihole -g -r repairs only the gravity database (/etc/pihole/gravity.db). The FTL database (/etc/pihole/pihole-FTL.db) is often too large to be repairable. You can try these instructions

If it fails, you'll have to start afresh. If you don't mind losing the query history, you can just remove the database and restart FTL (this will work instantaneously):

sudo service pihole-FTL stop
sudo rm /etc/pihole/pihole-FTL.db
sudo service pihole-FTL start
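
To confirm which of the two databases is actually affected, SQLite's built-in `PRAGMA integrity_check` can be run against each file. The following is a sketch under assumptions: `check_db` is a made-up helper name, the `sqlite3` command-line tool must be installed, and the file may need to be read with elevated privileges.

```shell
#!/bin/sh
# check_db FILE: run SQLite's integrity check on FILE.
# Returns 0 if SQLite reports "ok", 1 if it reports problems,
# 2 if the file is not readable or the sqlite3 CLI is missing.
# (Hypothetical helper, not part of Pi-hole.)
check_db() {
    db=$1
    [ -r "$db" ] || { echo "$db: not readable"; return 2; }
    command -v sqlite3 >/dev/null 2>&1 || { echo "sqlite3 not found"; return 2; }
    result=$(sqlite3 "$db" "PRAGMA integrity_check;" 2>&1)
    if [ "$result" = "ok" ]; then
        echo "$db: ok"
    else
        echo "$db: $result"
        return 1
    fi
}
```

For example, `check_db /etc/pihole/gravity.db` and `check_db /etc/pihole/pihole-FTL.db` would show whether one, both, or neither file reports damage. On a large FTL database the check may take a while.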

I don't mind blowing away the search history so I just did the stop/rm/start thing.

After the restart and a few queries from my laptop (some successful, some for blackholed addresses) I get

    pi@pi:~ $ sqlite3 /etc/pihole/pihole-FTL.db "SELECT ip FROM network_addresses JOIN network ON network_addresses.network_id = network.id WHERE network.hwaddr = '50:ed:3c:30:e0:79';"
    192.168.1.163

    pi@pi:~ $ sqlite3 /etc/pihole/pihole-FTL.db "SELECT hwaddr FROM network WHERE id = (SELECT network_id FROM network_addresses WHERE ip = '192.168.1.163' GROUP BY ip HAVING max(lastSeen));"
    50:ed:3c:30:e0:79
    pi@pi:~ $

The first command is fine. Nothing from the second at all, which is better than the error I got last time but a little surprising. Shouldn't I see some activity from that MAC if things are working right?

In a couple of hours I'll recreate my group setup the way it was when I first reported the issue and report back here.

thanks!