Unable to select correct client - by name in advanced filter

Running * Core vDev (development-v6, ca9eaf96)

  • FTL vDev (development-v6, c82676b1)
  • Web interface vDev (development-v6, 103bfedf)

I am running current Beta 6 and have recently started using my ISP router DCHP server. The router adds a suffix of fritz.box to my clients which I can see in the client log. When I go to the Advanced Filter, Client (by name) I can see my devices listed (minus the fritz.box suffix) and when I select a client from the list and refresh the Queries list no entries are returned for that client.
Grateful for any suggestions, my network knowledge is limited.

The client names should update within at most a few hours after you have changed this. Please note that the clients drop-down field may not contain all your clients if you have really many. How long ago did you switch to your router's DHCP?

Which "client log"? The router's? Or the Pi-hole Query Log?

Note that you are not limited to the clients shown in the dropdown menu. You can also enter clients manually and confirm them with Enter.

Manually entering Desktop-LAN.fritz.box in client (by name) works correctly, thanks.
PC ipconfig /all shows:- DNS Suffix Search List. . . . . . : fritz.box

What is the optimal suggestion we want to use for the dropdown boxes?

Currently, the code simply does
SELECT DISTINCT(ip) FROM client_by_id LIMIT 30
and
SELECT DISTINCT(name) FROM client_by_id LIMIT 30

If now - as in your case - all clients change names and you have sufficient of them, they will never show up because they are at the "bottom" of the table.

We could come up with something like

SELECT COUNT(*) cnt,ci.name FROM query_storage qs
       JOIN client_by_id ci ON ci.id = qs.client
       WHERE timestamp > unixepoch() - 24*3600
       GROUP BY ci.name
       ORDER BY cnt DESC

which would give you the most active clients by name within the last 24 hours BUT this query obviously runs much slower (> 20x slower in my limited testing) so I'm not really convinced here.

Pinging the database expert @PromoFaux for any suggestions but I guess there isn't much we can do here - adding such a rolling-window count directly to the table (so we can sort without joining and counting through thousands of queries) is not trivial, either.
Reversing the table output won't work either - we simply don't know where the most active clients are.

I don't really have any underpowered hardware to test this on (currently set up anyway) but on my NAS box where v6 is running (probably an unfair test!) I yield the following results:

dev-v6:/# time pihole-FTL sqlite3 /etc/pihole/pihole-FTL.db "SELECT DISTINCT(IP) FROM client_by_id LIMIT 30" > /dev/null

real    0m0.003s
user    0m0.002s
sys     0m0.001s
dev-v6:/# time pihole-FTL sqlite3 /etc/pihole/pihole-FTL.db "SELECT COUNT(1) cnt, ci.name FROM query_storage qs JOIN client_by_id ci ON ci.id = qs.client WHERE timestamp > unixepoch() - 24*3600 GROUP BY ci.name ORDER BY cnt DESC LIMIT 30" > /dev/null

real    0m0.018s
user    0m0.016s
sys     0m0.001s

How many rows are in your client_by_id table? For me it's a few thousands from devices doing MAC randomization (until I disabled that). Also I'm at around 48.000 queries within the past 24 hours.

Marginally quicker here... though let me doublecheck the logic of the query...

dev-v6:/# time pihole-FTL sqlite3 /etc/pihole/pihole-FTL.db "WITH CTE AS (Select COUNT(*) cnt, client FROM query_storage where timestamp > unixepoch() - 24 * 2600 GROUP BY client ORDER BY cnt DESC LIMIT 30) SELECT CTE.cnt, ci.name FROM CTE JOIN client_by_id ci ON CTE.client = ci.id" > /dev/null

real    0m0.008s
user    0m0.002s
sys     0m0.006s

Only 72

This was a fluke, maybe. Subsequent runs are more like 0m0.037s. Which makes not a lot of sense because it should in theory be a little quicker

Edit:

It seems the most costly part of the query is actually querying the query_storage table - despite limiting to past 24 hours. The join to client_by_id barely adds any overhead

Obviously it could be dealt with by a FTL startup - but does FTL have the capability of running timed checks?

I'm thinking we could have a special facts table that is populated with the most active clients of the last 24 hours every hour or so, and then the the query need only be select name from special_table - which would likely be faster in user-land than crunching the numbers on the fly.

Doing it on the in-memory database seems to be fair in times of time used. I implemented it using the sub-query you proposed for all four lookups (top domains, top clients IP and names, top upstream servers):

2024-06-25 08:53:10.199 CEST [238208/T238226] DEBUG_API: Reading top domains from database
2024-06-25 08:53:10.205 CEST [238208/T238226] DEBUG_API: Read 30 domains from database

2024-06-25 08:53:10.205 CEST [238208/T238226] DEBUG_API: Reading top client IPs from database
2024-06-25 08:53:10.210 CEST [238208/T238226] DEBUG_API: Read 30 client IPs from database

2024-06-25 08:53:10.210 CEST [238208/T238226] DEBUG_API: Reading top client names from database
2024-06-25 08:53:10.215 CEST [238208/T238226] DEBUG_API: Read 30 client names from database

2024-06-25 08:53:10.215 CEST [238208/T238226] DEBUG_API: Reading top upstreams from database
2024-06-25 08:53:10.220 CEST [238208/T238226] DEBUG_API: Read 4 upstreams from database

Check out the millisecond timestamps in the debug log above. It is slightly less than optimal as we are actually doing the sorting for clients twice to get once names and once IPs. edit: We can simply fill the two array with one query in the latest commit of the new branch tweak/top_suggests.

@Steve45 @PromoFaux I'd be awesome if you could try if my proposal works and now shows the most relevant clients and domains for your particular network in the dropdown menus after you ran

pihole checkout ftl tweak/top_suggests

Have run on my Pi-hole and all looks good with current device names showing in drop down list. Thanks for progressing this :slight_smile:

1 Like