Display blocked "Deep CNAME" domains in Pi-hole tail & query

This is a good idea; however, I immediately have two objections. Maybe you can convince me that they are minor. This is not at all meant to play down your idea.

  1. Performance
    The new field has no index on it. That means such a lookup has to read the entire table from disk in a full table scan, which can easily take minutes. You may argue that we can add an index on the newly added column. However, this is not ideal either: the database has no way of knowing that the column is "new" (as in, all rows have NULL in it), so at least the index creation would again require a full pass over the entire database.
  2. Reliability
    Such a lookup will only show things which happened in the past. Queries which would be blocked by CNAME inspection, but which we have not seen so far, would not be shown.
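
The performance objection can be illustrated with a small sketch. This is not Pi-hole code: it builds an in-memory SQLite table using the column names shown in this thread, and the index name `idx_additional_info` and the filler rows are assumptions made for the example. It compares the query plan for a lookup on `additional_info` before and after an index exists.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

# Stand-in for the long-term database; column names follow the thread.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE queries (id INTEGER PRIMARY KEY, timestamp INTEGER, "
    "type INTEGER, status INTEGER, domain TEXT, client TEXT, "
    "forward TEXT, additional_info TEXT)"
)
# Hypothetical filler rows; additional_info stays NULL, like old rows would.
conn.executemany(
    "INSERT INTO queries (timestamp, type, status, domain) VALUES (?, 1, 2, ?)",
    [(1594668331 + i, f"host{i}.example.com") for i in range(1000)],
)

lookup = "SELECT id, domain FROM queries WHERE additional_info = ?"
arg = ("gstaticadssl.l.google.com",)

plan_before = query_plan(conn, lookup, arg)
# Index name is an assumption; building it has to visit every existing row.
conn.execute("CREATE INDEX idx_additional_info ON queries (additional_info)")
plan_after = query_plan(conn, lookup, arg)

print("without index:", plan_before)
print("with index:   ", plan_after)
```

The exact wording of the plan differs between SQLite versions, but without the index the plan is a scan of the whole table, and with it the plan names the index. Creating the index still reads every existing row once, which is the cost described above.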

I prepared a new branch new/cname_inspection_logging for this feature request.

In addition to the already existing display in the Query Log, the CNAME details are now also stored in the database:

sqlite> SELECT * FROM queries WHERE additional_info NOT NULL;
id     |timestamp |type |status |domain           |client       |forward |additional_info
8747483|1594668331|1    |11     |fonts.gstatic.com||        |gstaticadssl.l.google.com
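
For readers who would rather script this lookup than use the `sqlite>` prompt, here is a minimal sketch. It runs against an in-memory table filled with the row shown above plus one made-up row without extra information; the helper name `cname_blocked_rows` is hypothetical, and the client and forward fields are left NULL because they are not shown in the output above. On a real installation one would connect to the FTL long-term database file instead.

```python
import sqlite3

def cname_blocked_rows(conn):
    """Return (timestamp, domain, additional_info) for rows carrying extra info."""
    cur = conn.execute(
        "SELECT timestamp, domain, additional_info FROM queries "
        "WHERE additional_info IS NOT NULL ORDER BY timestamp"
    )
    return cur.fetchall()

# In-memory stand-in for the database; column names follow the output above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE queries (id INTEGER PRIMARY KEY, timestamp INTEGER, "
    "type INTEGER, status INTEGER, domain TEXT, client TEXT, "
    "forward TEXT, additional_info TEXT)"
)
# The row from the thread (client and forward are not shown there).
conn.execute(
    "INSERT INTO queries VALUES (8747483, 1594668331, 1, 11, "
    "'fonts.gstatic.com', NULL, NULL, 'gstaticadssl.l.google.com')"
)
# A made-up row without additional_info, to show that it is filtered out.
conn.execute(
    "INSERT INTO queries VALUES (8747484, 1594668332, 1, 2, "
    "'example.com', NULL, NULL, NULL)"
)

for ts, domain, info in cname_blocked_rows(conn):
    print(f"{ts} {domain} -> {info}")
```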

and also in pihole.log:

Jul 13 21:25:31 dnsmasq[23773]: query[A] fonts.gstatic.com from
Jul 13 21:25:31 dnsmasq[23773]: forwarded fonts.gstatic.com to
Jul 13 21:25:31 dnsmasq[23773]: reply fonts.gstatic.com is <CNAME>
Jul 13 21:25:31 dnsmasq[23773]: reply gstaticadssl.l.google.com is blocked during CNAME inspection
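
As a rough illustration of how this new log line could be picked out of pihole.log, the sketch below matches the "is blocked during CNAME inspection" wording exactly as it appears in the excerpt above. The regular expression and the function name `blocked_cname_targets` are assumptions for this example; the real log format may vary between versions.

```python
import re

# Matches only the "blocked during CNAME inspection" reply line shown above.
CNAME_BLOCK = re.compile(
    r"^(?P<when>\w{3}\s+\d+\s[\d:]{8}) dnsmasq\[\d+\]: "
    r"reply (?P<domain>\S+) is blocked during CNAME inspection$"
)

def blocked_cname_targets(lines):
    """Return (timestamp, domain) pairs for CNAME-inspection block lines."""
    hits = []
    for line in lines:
        match = CNAME_BLOCK.match(line.strip())
        if match:
            hits.append((match.group("when"), match.group("domain")))
    return hits

# The four log lines from the excerpt above, copied as they appear there.
sample = [
    "Jul 13 21:25:31 dnsmasq[23773]: query[A] fonts.gstatic.com from",
    "Jul 13 21:25:31 dnsmasq[23773]: forwarded fonts.gstatic.com to",
    "Jul 13 21:25:31 dnsmasq[23773]: reply fonts.gstatic.com is <CNAME>",
    "Jul 13 21:25:31 dnsmasq[23773]: reply gstaticadssl.l.google.com "
    "is blocked during CNAME inspection",
]

print(blocked_cname_targets(sample))
```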

What is not done so far is restoring CNAME information from the database into Pi-hole's memory after an FTL restart. I will have to think a bit about how to do this (it may be that the domain initially causing the block is not known to FTL during history parsing because it was seen more than 24 hours ago).

does that branch also include the modifications for

I'm still testing these; aborting my test for another test doesn't look smart.

To keep pull requests manageable and reviewable, I separate new features from each other, so

  • new/tre-regex: No, there is still an open PR for this
  • tweak/improve_unique_domains: Yes, because it is already contained in development

I will not be testing the new branch yet; I want to complete my test on .*;querytype=!A first, which requires a lot of time (different devices). I'm sure, however, that there are enough interested users to test this, as it possibly resolves a problem for many.

Would it be possible to let pihole -q -exact <domain> perform the equivalent of a dig to that domain? This would expose a possible deep CNAME match and make the information available in the result of pihole -q. I assume pihole-FTL can retrieve DNS information without using an external command (dig)?

I'm struggling to see the need for any addition to pihole -q for this one. If you see a domain blocked by CNAME on the query log, surely you would query the deep CNAME match, and not the actual requested domain?

I thought the issue here was that the information about the deep CNAME match was not persisted across reboots/restarts of pihole-FTL?

I'm not sure what there is to be gained from adding complexity to a function whose job it is to check the downloaded lists for matches of input... :man_shrugging:

Edit: though reading the request I see this is about not visiting the web interface to see this information. Original point still stands though, it's increased complexity for very little gain


You may recall I already mentioned that this information is no longer available after you restart pihole-FTL, as you also mentioned.

query.sh will already need a change, due to pihole checkout ftl new/tre-regex, see here.

Adding the deep CNAME inspection info is the original feature request...

This is now done as well.

No offense here, appreciating all the great work the developers are doing, just trying to draw the complete picture...

Another reason for having the info available in pihole -q:
The deep CNAME inspection info is not available when looking at the Long Term Data / Query Log.

It will be, though, which makes the pihole -q addition moot. As @DL6ER mentioned above, there's no real performant way to add this to pihole -q, and adding dig-like functionality begins to creep away from the scope of the script... which is, quite simply, to determine whether or not a domain is included on one of the blocklists, or is affected by regex rules. We would have to check each time pihole -q is run "is this domain a CNAME for some other domain that may or may not be on the blocklists?"
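
To make the extra work concrete, here is a minimal sketch of the check described above. It is not how pihole -q or FTL work: the function name `first_blocked_in_chain`, the injected `cname_of` lookup, and the depth limit are all assumptions, and the DNS lookup is stubbed with a dictionary holding the one CNAME relationship mentioned earlier in this thread. In a real script, every call would need a live DNS lookup per hop.

```python
def first_blocked_in_chain(domain, cname_of, blocked, max_depth=16):
    """Follow CNAME records from `domain`; return the first blocked name or None.

    cname_of: callable returning the CNAME target of a name, or None.
    blocked:  set of blocked domain names (stand-in for the blocklists).
    """
    seen = set()
    current = domain
    # Stop on end of chain, on a loop, or when the chain gets too long.
    while current is not None and current not in seen and len(seen) < max_depth:
        if current in blocked:
            return current
        seen.add(current)
        current = cname_of(current)
    return None

# Stubbed DNS data taken from the example earlier in the thread.
records = {"fonts.gstatic.com": "gstaticadssl.l.google.com"}
blocked = {"gstaticadssl.l.google.com"}

print(first_blocked_in_chain("fonts.gstatic.com", records.get, blocked))
print(first_blocked_in_chain("example.org", records.get, blocked))
```

Even in this stubbed form, the answer depends on DNS data at the time of the call rather than on the blocklists alone, which is the scope concern raised above.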

But like I said, this information will be available already on the query log (yes, I know it isn't right now after a restart)

Linking for reference, as it is related to CNAME

This issue ticket should be transferred to the FTL repo. I looked through the CNAME code and tried to find the bug but couldn't.

I think the ticket is fine in AdminLTE.

The problem is that the actual domains blocked by deep CNAME inspection were reported in the dashboard "Top Blocked Domains", and on click the query log is empty. This was (so far) because the query string (containing the actual domain) didn't match any domain.

Edit: more accuracy about which domain is shown on dashboard.

No. The dashboard shows 1:1 what it receives through the API from FTL. Wrong display on the dashboard means wrong data given by FTL.

The dashboard is fine. It reports the actual blocked domain. The link to the query log (query string) does not return any result (so far) for this domain.

Yes, and my point is that this is wrong. Pi-hole should not show the actually blocked domain (because the user never tried to access this page directly) but rather the originally requested CNAME. Hence, in my world, showing this domain is wrong and the CNAME should be shown instead.

So, instead of settingsfd-geo.trafficmanager.net as top entry, there should be settings.data.microsoft.com
The link will then also work as expected and this is honestly how I think it would be best and likely how the devs wanted to have it (but then some bug came in their way).


Thanks for your more detailed explanation. I get your point now.

I'm not sure if I prefer 'What you see is what you meant' over 'What you see is what really happened', but (most?) users might be irritated by the latter. Fixing the query link would be one solution to reduce the confusion; the other would be changing the domain displayed on the dashboard, as you suggest.

Let's see what the devs intended.

I added storing the regex ID responsible for a regex block in the additional_info column when it is available (in the case of regex CNAME blocking, the domain is stored instead). So a restart will now also preserve regex links.

Concerning the other issue,

I guess they intended what @Coro said above :wink:

The CNAME inspection routine was a bit confusing with all the CNAME path terminology. I pushed a refactor which makes it clear which domain records are parents and which are children of any degree of relationship.

I'm gonna be honest, the topic was kinda done for me since jfb told me that the CNAME blocking is done at Pi-hole level while the log is at dnsmasq level. Now, 2 months later, I see which dimensions the FR has taken on... help.

@DL6ER's description of why an implementation in pihole -q or pihole -t is either not possible or would end up modifying even more of dnsmasq looks reasonable. After all, the current wave of CNAME-related questions probably comes from the release of Deep CNAME Inspection in Pi-hole v5.0, and it is unlikely to hit anyone at all, especially after that first wave is over. It is probably not worth sacrificing the integrity of the code if adding the FR would create a mess in dnsmasq, or if adding the feature would blur the script scope that @PromoFaux mentioned before.

I am honestly sorry that PromoFaux got misled halfway in into thinking that the initial FR was to keep the CNAME-blocked domains stored after an FTL restart. That was not the intention of the FR, though it probably can't hurt to have it included if it doesn't cause any issues with database/SD card writes or similar.

additional_info as a column name could probably cause confusion at some point, but I haven't come up with something better myself so far...

The developers have found a way to do it. Meanwhile they fixed some edge case in the CNAME routine (note: I'm guessing from the commit messages) and added restoring of some status-specific quantities, like the actually blocked domain and the regex ID, in this new additional_info field.

Overall, this was all very useful in the end: pihole -t got the information, the long-term interface is going to get it added (@DL6ER will do this?), and pihole -q is out of scope right now. I think this is the correct summary.

Well, yes and no. Yes as in there are no full human-readable sentences in there, but you don't usually expect this in a database, do you? So if people don't know what to do with this extra field, then they can simply ignore it and nothing bad will happen.

Added now in

pihole checkout web new/cname_inspection_logging