Option to ignore domains from appearing in the Query Log

I agree. I don't mind as much if they are still stored in the logs under the hood, but it would be nice to be able to filter items out on the Query Log in the web client. Bonus points if you can make the block stats do the same.

I see, but unfortunately there is a (significant!) performance penalty connected to this feature, which is the sole reason we have not added it yet. Assume we have 15 clients on a typical system with 20,000 queries within 24 hours. Then the filter would have to be applied 15 * 20,000 = 300,000 times, as we would have to check each and every query against each and every domain individually to decide whether to show it or not.

There's no need to check each domain individually (excluding wildcard functionality). Searching a Hashset of excluded domains would be trivially fast, assuming users don't put 50k domains in.
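
A minimal sketch of the hash-set idea, in Python for brevity. The `EXCLUDED_DOMAINS` set, the `visible_queries` helper, and the query record layout are all made up for illustration; none of them are Pi-hole names.

```python
# Hypothetical exclusion list; these names do not come from Pi-hole.
EXCLUDED_DOMAINS = frozenset({"baidu.com", "doubleclick.net"})


def visible_queries(queries, excluded=EXCLUDED_DOMAINS):
    """Return the queries whose domain is not in the exclusion set."""
    # Set membership is O(1) on average, so the whole pass is O(n).
    return [q for q in queries if q["domain"] not in excluded]


queries = [
    {"client": "192.168.1.10", "domain": "example.com"},
    {"client": "192.168.1.23", "domain": "baidu.com"},
    {"client": "192.168.1.10", "domain": "doubleclick.net"},
    {"client": "192.168.1.42", "domain": "pi-hole.net"},
]

for q in visible_queries(queries):
    print(q["client"], q["domain"])
```

Each query costs one set lookup no matter how many domains are excluded, so the cost does not multiply with the size of the exclusion list. Exact matches only; wildcards would need something else.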

Another option would be to maintain a separate "filtered" bit and set it appropriately when the request is serviced. If the user changes the filters, reprocess all requests and reset the bits appropriately. This would remove the need to do the filtering on the web client.
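
Roughly what the "filtered" bit could look like, sketched in Python. The `Query` and `QueryLog` classes and the `filtered` field are hypothetical and only illustrate the idea; this is not how FTL stores queries.

```python
from dataclasses import dataclass


@dataclass
class Query:
    domain: str
    filtered: bool = False  # hypothetical per-query "hidden" flag


class QueryLog:
    def __init__(self, excluded):
        self.excluded = set(excluded)
        self.queries = []

    def record(self, domain):
        """Store a query and set its flag once, at service time."""
        self.queries.append(Query(domain, domain in self.excluded))

    def set_excluded(self, excluded):
        """Replace the filter list and recompute every stored flag."""
        self.excluded = set(excluded)
        for q in self.queries:
            q.filtered = q.domain in self.excluded

    def visible(self):
        """Return the domains that are not flagged as filtered."""
        return [q.domain for q in self.queries if not q.filtered]


log = QueryLog(excluded={"baidu.com"})
for domain in ["example.com", "baidu.com", "doubleclick.net"]:
    log.record(domain)
print(log.visible())

# Changing the filters triggers one full reprocessing pass.
log.set_excluded({"doubleclick.net"})
print(log.visible())
```

The lookup happens once per query when it is recorded, plus one full pass whenever the filter list changes, so displaying the log only has to read a flag.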

The issue is not how long it takes to figure out whether one query should be filtered out; it's that we have to check each query. It takes O(1) to check a hash set (theoretically), but since you do that n times, it takes O(n).

As @DL6ER noted, we are solving these filtering questions in the API, where it is easier to use things like hash sets.

Yes, the additional filter check would add O(n) to request processing, but that process must already be at least O(n). You would be adding 1 to the multiplier.

Or is the problem that Hashsets aren't available at the level where the web interface would be doing the filtering?

The web interface does not do the filtering; that currently happens in FTL. Because FTL is in C, it does not have many nice things like hash maps or hash sets built in. The API is in Rust, which does have those features as part of the standard library. For FTL to do this level of filtering, it would either need to implement a hash set/map (complex, not fun) or use a slower approach with an array (the fastest reasonable approach would be binary search over a sorted array, O(n log m) to filter all n queries against m excluded domains). There can be hundreds of thousands of queries, so we want to keep filtering performant.
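
To make the array approach concrete, here is a sketch of filtering with binary search over a sorted list. It is written in Python purely as an illustration (FTL itself is C, where the standard library's `qsort` and `bsearch` give the same sort-then-search pattern). The `is_excluded` helper and the sample domains are assumptions, not FTL code.

```python
from bisect import bisect_left


def is_excluded(sorted_domains, domain):
    """Binary search a sorted list: O(log m) comparisons per lookup."""
    i = bisect_left(sorted_domains, domain)
    return i < len(sorted_domains) and sorted_domains[i] == domain


# Hypothetical exclusion list, sorted once up front.
excluded = sorted(["doubleclick.net", "baidu.com", "google-analytics.com"])

queries = ["example.com", "baidu.com", "pi-hole.net", "doubleclick.net"]
shown = [d for d in queries if not is_excluded(excluded, d)]
print(shown)
```

With m excluded domains each lookup needs about log2(m) string comparisons, so a list of a few dozen entries adds only a handful of comparisons per query.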

Any news here?
Or is it already implemented? How can I exclude domains from the query log (to see the other domains better)?

There are two complementary requests regarding not logging certain resolutions to the queries table: this one about ignoring certain domains and another about ignoring certain clients.

Pi-hole already knows how to very efficiently decide what to do with requests based on a set of rules. It would be amazing to be able to also define what it should do with logging based on a similar set of rules (e.g. client's group, domain, decision to block or allow, etc.). That would cover both requests very nicely.
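
One way such logging rules might be expressed, sketched in Python. Everything here (`LogRule`, `should_log`, the field names, and the first-match-wins order) is an assumption made for illustration and not an existing Pi-hole feature.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogRule:
    # None means "match anything" for that field.
    domain: Optional[str] = None
    client_group: Optional[str] = None
    status: Optional[str] = None  # e.g. "blocked" or "allowed"
    log: bool = True

    def matches(self, domain, client_group, status):
        return (
            self.domain in (None, domain)
            and self.client_group in (None, client_group)
            and self.status in (None, status)
        )


def should_log(rules, domain, client_group, status):
    """First matching rule wins; queries are logged by default."""
    for rule in rules:
        if rule.matches(domain, client_group, status):
            return rule.log
    return True


rules = [
    LogRule(domain="baidu.com", status="blocked", log=False),
    LogRule(client_group="iot", log=False),
]

print(should_log(rules, "baidu.com", "default", "blocked"))  # False
print(should_log(rules, "baidu.com", "default", "allowed"))  # True
print(should_log(rules, "example.com", "iot", "allowed"))    # False
```

A single rule table like this would handle "ignore this domain" and "ignore this client group" with the same mechanism.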

In my particular situation I have a Chinese IoT device that tries to access baidu.com every couple of seconds. It is always blocked, but that does not stop it from trying, and the FTL database grows by several hundred thousand records every week. Deleting them manually and truncating statistics is very boring, to say the least...

Would love to see this feature! When my Samsung TV is on, it hits its ad network, which I've blocked, but it would be nice to not see my query logs always full of the same thing.

New to discussion but it feels like the devs misunderstood the request. I believe OP wanted the ability to simply not see certain domains in the query log screen in the web interface. This would be in the Rust code as I understand, not in the C code. It literally would be a line of code such as [if domain I'm about to write to the screen is not in this list then ...] if that makes sense. At least this is what I want to be able to do.

There are some domains that I don't even care to log, e.g. doubleclick, googleadservices, or google-analytics ... Yep, everybody and their dog uses these on their websites, and even my dog's fitness tracker tries to send analytics. Logging these queries is just wasting space and causing unnecessary disk writes. There are other threads where this is made more clear.

Just adding my vote... some polling services query so often that the log gets cluttered with useless items, which makes it difficult to find the actually useful info you're looking for :confused:

Perhaps a different approach would be to:

  1. add the ability to exclude domains/clients via the query log filter
  2. add an optional setting in the settings page to automatically exclude certain domains/clients, so when opening the query log the filter would automatically be set to exclude these domains
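
The two steps above could fit together roughly like this. It is a Python sketch only; the settings keys and the `default_filter` and `apply_filter` helpers are hypothetical and do not correspond to existing Pi-hole options.

```python
# Hypothetical saved settings; the key names are made up for this sketch.
settings = {
    "exclude_domains": ["baidu.com"],
    "exclude_clients": ["192.168.1.50"],
}


def default_filter(settings):
    """Build the filter the query log starts with when the page opens."""
    return {
        "exclude_domains": set(settings.get("exclude_domains", [])),
        "exclude_clients": set(settings.get("exclude_clients", [])),
    }


def apply_filter(queries, flt):
    """Drop queries whose domain or client is excluded by the filter."""
    return [
        q for q in queries
        if q["domain"] not in flt["exclude_domains"]
        and q["client"] not in flt["exclude_clients"]
    ]


queries = [
    {"client": "192.168.1.10", "domain": "example.com"},
    {"client": "192.168.1.10", "domain": "baidu.com"},
    {"client": "192.168.1.50", "domain": "pi-hole.net"},
]

flt = default_filter(settings)
print(apply_filter(queries, flt))

# The user can still clear an exclusion for the current view only.
flt["exclude_clients"].clear()
print(apply_filter(queries, flt))
```

Because the defaults are copied into the view's filter, changing the filter on the page does not touch the saved settings, and nothing is dropped from the underlying log.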

Already 4 years old; I wonder why this gets so little attention :frowning:
Actually this is a bug, as the current state hurts the usability of the query log and wastes resources. It should be possible to completely drop domains from any logging.

Comparing an incoming domain against yet another list will have a performance impact on Pi-hole, as this is obviously something that has to happen after the query is received but before anything is logged.

How is this a bug, which is generally defined as an unintended behavior in software?

Is our code producing unintended results?

Have you looked at the source of the repetitive queries? Experimented with lower rate limits?

As this would be intended for just a small number of domains, it could be implemented in another way that does not affect resources.

Yes, you're right, it is not a bug per se, but it has a bad impact on the usability of the query log.

It is especially my Hue bridge that sends so many requests for the blocked domain. I have found no proper solution. The only way I have read about is to disable the portal, which also disables the update functionality of the bridge.

And which way would that be? Hints to how you see the implementation would help us see the same.

I am no coder but I want to suggest a possible solution.

I think we do not need another list. Why not just use the domain list we have?
We just put a specific special character string right at the beginning of the domain.
So for example !#adserver.evil
The "!#" indicates that this domain will be blocked, but completely silently.
This could also be extended easily for other options.
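
A rough Python sketch of how the "!#" prefix could be parsed and acted on. The `parse_entry`, `load_blocklist`, and `handle_query` helpers are hypothetical names used only to illustrate the suggestion; this is not how Pi-hole reads its domain list.

```python
SILENT_PREFIX = "!#"  # marker proposed in the post above


def parse_entry(entry):
    """Split a domain-list entry into (domain, silent)."""
    entry = entry.strip()
    if entry.startswith(SILENT_PREFIX):
        return entry[len(SILENT_PREFIX):], True
    return entry, False


def load_blocklist(lines):
    """Return a dict mapping each blocked domain to its silent flag."""
    return dict(parse_entry(line) for line in lines if line.strip())


blocklist = load_blocklist(["!#adserver.evil", "tracker.example"])


def handle_query(domain, log):
    """Block listed domains, but only log the ones not marked silent."""
    blocked = domain in blocklist
    if not (blocked and blocklist[domain]):
        log.append((domain, "blocked" if blocked else "allowed"))
    return blocked


log = []
for d in ["adserver.evil", "tracker.example", "example.com"]:
    handle_query(d, log)
print(log)
```

The silent flag is resolved once when the list is loaded, so at query time it is a single dictionary lookup on the domain that is being checked for blocking anyway.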

What do you think, would this work? @DanSchaper
