This is a good idea; however, I immediately have two objections. Maybe you can convince me that they are minor. This is not at all to play down your idea.
The new field has no index on it. That means such a lookup requires reading the entire table from disk in a full table scan. This can easily take minutes. You may argue that we can add an index on the newly added column; however, this is also not optimal, as the database has no way of knowing that the column is "new" (as in all rows have a NULL column), so at least the index creation would again require a full pass over the entire database.
Such a lookup will only show things which happened in the past. Queries which would be blocked by CNAME inspection, but which we haven't seen so far, would not be shown.
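To illustrate the first objection, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory table that borrows the column names of the queries table. The index name idx_additional_info and the sample rows are made up for the example; this is not Pi-hole's actual schema setup. It only asks SQLite how it would run the lookup, with and without an index.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the human-readable detail strings of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the detail text

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE queries (id INTEGER PRIMARY KEY, timestamp INTEGER, "
    "type INTEGER, status INTEGER, domain TEXT, client TEXT, "
    "forward TEXT, additional_info TEXT)"
)
# Hypothetical history rows; additional_info stays NULL, as in an old database.
conn.executemany(
    "INSERT INTO queries (timestamp, type, status, domain, client) "
    "VALUES (?, 1, 2, ?, ?)",
    [(1594668331 + i, f"host{i}.example", "192.168.2.223") for i in range(1000)],
)

lookup = "SELECT * FROM queries WHERE additional_info = ?"
target = ("gstaticadssl.l.google.com",)

plan_without_index = query_plan(conn, lookup, target)

# Hypothetical index name. Building it still has to read every existing row once.
conn.execute("CREATE INDEX idx_additional_info ON queries (additional_info)")
plan_with_index = query_plan(conn, lookup, target)

print(plan_without_index)
print(plan_with_index)
```

Without the index the plan reports a scan of the whole table; with it, the plan reports a search using the index. The sketch does not measure the one-time cost of creating the index on a large on-disk database, which is the concern raised above.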
I prepared a new branch new/cname_inspection_logging for this feature request.
The CNAME details are now, in addition to the already existing display in the Query Log, also stored in the database:
sqlite> SELECT * FROM queries WHERE additional_info NOT NULL;
id |timestamp |type |status |domain |client |forward |additional_info
8747483|1594668331|1 |11 |fonts.gstatic.com|192.168.2.223| |gstaticadssl.l.google.com
and also in pihole.log:
Jul 13 21:25:31 dnsmasq: query[A] fonts.gstatic.com from 192.168.2.223
Jul 13 21:25:31 dnsmasq: forwarded fonts.gstatic.com to 127.0.0.1
Jul 13 21:25:31 dnsmasq: reply fonts.gstatic.com is <CNAME>
Jul 13 21:25:31 dnsmasq: reply gstaticadssl.l.google.com is blocked during CNAME inspection
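For anyone who wants to pull this information out of pihole.log with a script, a rough sketch follows. It assumes the line format shown above (optionally with a process ID after "dnsmasq") and, as a simplification, that queries are not interleaved, so the most recent query line is the one that triggered the block. The function name cname_blocks is made up for this example.

```python
import re

# Patterns modelled on the log lines shown above; "[pid]" after dnsmasq is optional.
QUERY = re.compile(r"dnsmasq(?:\[\d+\])?: query\[([^\]]+)\] (\S+) from (\S+)$")
BLOCKED = re.compile(
    r"dnsmasq(?:\[\d+\])?: reply (\S+) is blocked during CNAME inspection$"
)

def cname_blocks(lines):
    """Pair each CNAME-inspection block with the most recent query line.

    Simplifying assumption: queries are not interleaved in the log.
    """
    last_query = None
    results = []
    for line in lines:
        line = line.rstrip()
        match = QUERY.search(line)
        if match:
            last_query = {
                "type": match.group(1),
                "domain": match.group(2),
                "client": match.group(3),
            }
            continue
        match = BLOCKED.search(line)
        if match and last_query is not None:
            results.append({**last_query, "blocked_cname": match.group(1)})
    return results

sample = """\
Jul 13 21:25:31 dnsmasq: query[A] fonts.gstatic.com from 192.168.2.223
Jul 13 21:25:31 dnsmasq: forwarded fonts.gstatic.com to 127.0.0.1
Jul 13 21:25:31 dnsmasq: reply fonts.gstatic.com is <CNAME>
Jul 13 21:25:31 dnsmasq: reply gstaticadssl.l.google.com is blocked during CNAME inspection
""".splitlines()

found = cname_blocks(sample)
print(found)
```

On a busy resolver the "last query seen" assumption will not always hold, so the additional_info column in the database is the more reliable source.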
What is not done so far is restoring CNAME information from the database into Pi-hole's memory after an FTL restart. I will have to think a bit about how to do this (it may be that the domain initially causing the block is not known to FTL during history parsing because it was seen more than 24 hours ago).
I will not be testing the new branch yet. I want to complete my test on .*;querytype=!A first, and this requires a lot of time (different devices). I'm sure, however, that there are sufficient interested users to test this, as it possibly resolves a problem for many.
Would it be possible to let pihole -q -exact <domain> perform the equivalent of a dig to that domain? This would expose a possible deep CNAME match and make the information available to show in the result of pihole -q. I assume pihole-FTL can retrieve DNS information without using an external command (dig)?
I'm struggling to see the need for any addition to pihole -q for this one. If you see a domain blocked by CNAME on the query log, surely you would query the deep CNAME match, and not the actual requested domain?
I thought the issue here was that the information about the deep CNAME match was not persisted across reboots/restarts of pihole-FTL?
I'm not sure what there is to be gained from adding complexity to a function whose job it is to check the downloaded lists for matches of input...
Edit: though reading the request I see this is about not visiting the web interface to see this information. Original point still stands though, it's increased complexity for very little gain
It will be, though, which makes the pihole -q addition moot. As @DL6ER mentioned above, there's no real performant way to add this to pihole -q, and adding dig-like functionality begins to creep away from the scope of the script... which is, quite simply, to determine whether or not a domain is included on one of the blocklists, or is affected by regex rules. We would have to check each time pihole -q is run "is this domain a CNAME for some other domain that may or may not be on the blocklists?"
But like I said, this information will be available already on the query log (yes, I know it isn't right now after a restart)
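To make the extra work concrete, here is a rough sketch of what such a check would involve: following the CNAME chain of the requested domain and testing every name along the way against the blocklist. The dict cname_of is a hypothetical stand-in for real DNS lookups, and none of this reflects how pihole -q or FTL is actually implemented.

```python
def first_blocked_in_chain(domain, cname_of, blocklist, max_depth=16):
    """Follow CNAME records starting at `domain`; return the first blocked name.

    `cname_of` is a hypothetical dict (name -> CNAME target) standing in for
    real DNS lookups. Returns None if no name in the chain is blocked.
    """
    seen = set()
    current = domain
    for _ in range(max_depth):
        if current in blocklist:
            return current
        seen.add(current)
        target = cname_of.get(current)
        if target is None or target in seen:  # end of chain, or a CNAME loop
            return None
        current = target
    return None

cname_of = {"fonts.gstatic.com": "gstaticadssl.l.google.com"}
blocklist = {"gstaticadssl.l.google.com"}

hit = first_blocked_in_chain("fonts.gstatic.com", cname_of, blocklist)
print(hit)
```

In a real tool every step of that loop would be a live DNS lookup, which is why it goes beyond a plain list search.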
The problem is that the actual domains blocked by deep CNAME inspection were reported in the dashboard "Top Blocked Domains", and on click the query log is empty. This was (so far) because the query string (containing the actual domain) didn't match any domain.
Edit: more accuracy about which domain is shown on dashboard.
Yes, and my point is that this is wrong. Pi-hole should not show the actually blocked domain (because the user never tried to access this page directly) but rather the originally requested CNAME. Hence, in my world, showing this domain is wrong and the CNAME should be shown instead.
So, instead of settingsfd-geo.trafficmanager.net as top entry, there should be settings.data.microsoft.com
The link will then also work as expected and this is honestly how I think it would be best and likely how the devs wanted to have it (but then some bug came in their way).
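The two ways of counting can be compared in a small sketch. Each record below is a made-up pair of (requested domain, blocked CNAME target or None); the function top_blocked and the ads.example entry are hypothetical and only illustrate the difference, not how the dashboard computes its list.

```python
from collections import Counter

def top_blocked(records, by_requested=True):
    """Count blocked queries per domain.

    Each record is (requested_domain, blocked_cname_or_None). `by_requested`
    selects which of the two names a CNAME-blocked query is counted under.
    """
    counts = Counter()
    for requested, blocked_cname in records:
        if by_requested or blocked_cname is None:
            counts[requested] += 1
        else:
            counts[blocked_cname] += 1
    return counts.most_common()

records = [
    ("settings.data.microsoft.com", "settingsfd-geo.trafficmanager.net"),
    ("settings.data.microsoft.com", "settingsfd-geo.trafficmanager.net"),
    ("ads.example", None),  # hypothetical directly blocked domain
]

print(top_blocked(records, by_requested=True))
print(top_blocked(records, by_requested=False))
```

Counting by the requested domain gives an entry whose name also appears in the query log, so a link built from it finds matching queries.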
Thanks for your more detailed explanation. I get your point now.
I'm not sure if I prefer 'What you see is what you meant' over 'What you see is what really happened', but (most?) users might be irritated by the latter. Fixing the query link would be one solution to reduce the confusion; the other would be changing the domain displayed on the dashboard, as you suggest.
I added storing the regex ID responsible for a regex-block in the additional_info column when it is available (in the case of regex CNAME blocking, the domain is stored instead). So a restart will now also preserve regex links:
The CNAME inspection routine was a bit confusing with all the CNAME path terminology. I pushed a refactor which makes it clear which domain records are parents and which are children, at any degree of relationship.
I'm gonna be honest, the topic was kinda done for me since jfb told me that the CNAME blocking is done at Pi-hole level while the log is at dnsmasq level. Now I see, 2 months later, what dimensions the FR has taken on... help.
@DL6ER's description of why an implementation in pihole -q or pihole -t is either not possible or would end up modifying even more of dnsmasq looks reasonable. After all, the current wave of CNAME-related questions probably comes from the release of Deep CNAME Inspection in Pi-hole v5.0, and it is unlikely to hit anyone at all once that first wave is over. It is probably not worth sacrificing the integrity of the code if adding the FR would create a mess in dnsmasq, or if adding the feature would add confusion to the script scope that @PromoFaux mentioned before.
I am honestly sorry that PromoFaux got misled halfway in into thinking that the initial FR was to keep the CNAME-blocked domains stored after an FTL restart. That was not the intention of the FR, though it probably can't hurt to have it included if it doesn't cause any issues with database/SD card writes or similar.
additional_info as a column name could probably add confusion at some point, but I haven't come up with something better myself so far...
The developers have found a way to do it. Meanwhile they optimized (bugfixed) some edge case in the CNAME routine (note: I'm guessing from the commit messages) and added restoring of some status-specific quantities, like the actually blocked domain and the regex ID, in this new additional_info field.
Overall, this was all very useful in the end: pihole -t got the information, the long-term interface is going to get it added (@DL6ER will do this?), and pihole -q is out of scope right now. I think this is the correct summary.
Well, yes and no. Yes as in there are no full human-readable sentences in there, but you don't usually expect this in a database, do you? So if people don't know what to do with this extra field, then they can simply ignore it and nothing bad will happen.