Queries over time graph inaccurate

Hey PiHole devs,

My graph is inaccurate. We're sometimes seeing 229% of requests blocked. I know that PiHole is awesome and advertising on the internet is bad, but 229% efficiency seems a smidge high. I've already done -f and -r. Just putting this out there in case anyone else has seen the same.

Run pihole -d for a debug token. Does this happen every day?

Token: uof2njxwze

Every day since the 2.10.1 update.
I haven't updated to the most recent 2.11.1 - currently sitting on 2.11.

Also, love the transparent debug collection. Nice work.

We haven't accounted for historical data in our debug process, but based on the information we did get, your Pi-hole seems to be functioning as it should.

I think we'd probably need to get a sample of the log from the time period where the blocked queries are higher than the legit ones. We might be able to get this information in a similar manner as the debug log. Let me know if this works for you.

If it were just me, I couldn't care less that you see my frequent visits to catcenterfolds.com and theuglybugball.com. However there are too many people using the network that may have a concern over privacy. Given the unpredictability of blocked being higher than actual, that'd be too much user data to hand over to internet strangers. :slight_smile:

I'm all ears if there are any other ideas though.

"Too many users" != open to the public. We just have a large family.

Looks like you're not the only one seeing this! Personally, I've not seen it, but in any case, I've opened it up in our issue tracker.

Okay, I redesigned Pi-hole's algorithms, so I'll take care of this and assist you.

First: What is happening here?
If you have more blocked than total, that simply means that there are more blocked lines which look like

Jan  5 00:16:19 dnsmasq[28022]: /etc/pihole/gravity.list www.googleadservices.com is 1.2.3.4

than query lines which look like:

Jan  5 00:16:19 dnsmasq[28022]: query[A] www.googleadservices.com from 1.2.3.5

Usually that should not happen, because there should be only one answer for each query, but for you it seems to happen nevertheless. We understand your privacy concerns and respect them. I'll compile a list of things you could do to help me isolate the issue (I would do this myself if I had access to the log):

  • Can you search yourself for lines where multiple gravity.list answers follow a query line?

  • Could you also check if there are (many) gravity.list lines which answer query lines that are neither

    • query[A], nor

    • query[AAAA]?

  • Also, could you send me your /etc/pihole/gravity.list (via private message) so I can check for duplicate entries? I think this could be a possible explanation. It does not contain any private information about the users in your network, so that should be fine.
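For anyone who would rather run these checks locally than share a log, here is a minimal sketch in Python. It is not part of Pi-hole: the function name `scan_log` and both regular expressions are assumptions modeled only on the two sample log lines quoted above, and it attributes each gravity.list answer to the most recent query for the same domain, which can misattribute answers when queries interleave.

```python
import re
from collections import Counter

# Patterns modeled on the two sample lines from this thread (assumption:
# other dnsmasq log layouts are not handled).
QUERY_RE = re.compile(r"dnsmasq\[\d+\]: query\[(\w+)\] (\S+) from (\S+)")
BLOCK_RE = re.compile(r"dnsmasq\[\d+\]: /etc/pihole/gravity\.list (\S+) is (\S+)")


def scan_log(lines):
    """Count query lines and gravity.list answers in an iterable of log lines."""
    stats = {
        "queries": 0,                        # lines containing query[TYPE]
        "blocked": 0,                        # gravity.list answer lines
        "multi_answer_queries": 0,           # queries followed by 2+ answers
        "blocked_by_query_type": Counter(),  # answers grouped by query type
    }
    last_type = None       # type of the most recent query line
    last_domain = None     # domain of the most recent query line
    answers_for_last = 0   # gravity.list answers seen for that query

    for line in lines:
        query = QUERY_RE.search(line)
        if query:
            stats["queries"] += 1
            last_type, last_domain = query.group(1), query.group(2)
            answers_for_last = 0
            continue

        block = BLOCK_RE.search(line)
        if block:
            stats["blocked"] += 1
            # Simplification: only match against the latest query line.
            if block.group(1) == last_domain:
                answers_for_last += 1
                stats["blocked_by_query_type"][last_type] += 1
                if answers_for_last == 2:
                    stats["multi_answer_queries"] += 1
    return stats
```

Calling `scan_log(open("pihole.log"))` on a local copy of the log returns the counts. A nonzero `multi_answer_queries`, or keys other than `A` and `AAAA` in `blocked_by_query_type`, would point at the first two checks in the list.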

Sure thing. I'll keep an eye on it and dig in after work today. Attached image of this morning's view below.

Note: I removed the custom adlists.list and reverted to default. One of the subscribed lists was blocking goo.gl and bit.ly shortlink services.

If you want, you can upload some of your data to our server and we can take a look. Just give us the token that is returned after you pipe your info into netcat:

| nc tricorder.pi-hole.net 9999

Anything uploaded is destroyed after 24 hours and we have the server locked down like it's 29th century technology.

Haha! When you put it that way... I'll toss a bunch of stuff over after this incredibly monotonous day at work.

I tried sending the full log file, but didn't receive a token. Perhaps I did it incorrectly?

nc tricorder.pi-hole.com 9999 < pihole.log
Timestamp: 01/05/17 19:49:29 (PST)

It may just be the way it's configured; I'm not sure offhand (working from memory, on site at the moment). Try cat pihole.log | nc tricorder.pi-hole.net 9999, or try your input redirection again, but remember the domain is pi-hole.net, not pi-hole.com.

Well there's the problem. Was copy / pasting the address from the earlier comment.

Token: tx5mfbi5dq

TimeStamps of interest:
07:00-07:09, 08:10 - 08:19, 12:00-12:30, and 20:10-20:19

[quote="jacob.salmela, post:10, topic:1131"]
pi-hole.com
[/quote]

You should know better!!

Actually, it was Dan:

For shame!


I had a look at your data and was able to identify what is going on...

In my rewrite of the algorithms I got confused by the variable names that were already present:

return [$byTimeDomains,$byTimeAds];

Hence, I programmed it so that byTimeAds = #blocked and byTimeDomains = #total - #blocked. However, the graph expects byTimeDomains = #total. That explains why the blue line can be above the green line: it always will be whenever more than 50% of all requests are blocked, because then #blocked > #total - #blocked. The percentage shown is wrong due to the very same problem.

It will be fixed in the next release:

https://github.com/pi-hole/AdminLTE/pull/337
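To make the arithmetic concrete, here is a small Python sketch of the mix-up described in this post. The function names and the counts are hypothetical, and the percentage formula (blocked divided by whatever the graph treats as the total) is an assumption, since the dashboard's actual code is not quoted in this thread.

```python
def buggy_series(total, blocked):
    """Return the two values the way the rewritten code did, per the post above."""
    by_time_domains = total - blocked  # the graph expected this to be total
    by_time_ads = blocked
    return by_time_domains, by_time_ads


def shown_percentage(by_time_domains, by_time_ads):
    """Assumed display formula: blocked as a share of the 'total' series."""
    return 100.0 * by_time_ads / by_time_domains


# Hypothetical counts chosen for illustration, not taken from the uploaded log.
total, blocked = 329, 229
domains, ads = buggy_series(total, blocked)

print(shown_percentage(domains, ads))      # 229.0 (what the graph would show)
print(round(100.0 * blocked / total, 1))   # 69.6  (the real blocked share)
```

Under these assumptions, a reading of 229% corresponds to roughly 70% of requests actually being blocked, and any interval where the real share exceeds 50% puts the blocked line above the "total" line.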

Ahaha I didn't catch it when copy/pasting!

Thanks! Just found this when about to post on the same thing. :slight_smile:
