Push notifications triggered by failure states

jrucker · April 27, 2021, 11:18pm

I have seen a few feature requests for notifications, but I'm hoping a more focused approach will help get some traction.
Notifications are needed for one main reason: unexpected behavior. Additional notification types can come as follow-up features, but basic notifications on unexpected errors are something that most services should have.

An example: I had no idea that several of the entries in my adlist were throwing HTTP errors during scheduled gravity updates because the maintainers had taken their lists down. Gravity continued doing its thing, moving past the failed requests, which is expected, but there is no indication to the user that something is wrong.

One of the responses I saw in another feature request was to build something that makes API requests. An error state needs to trigger a push notification, not wait for a pull. Constantly polling the API for an error or failure state does nothing but generate a tremendous amount of unnecessary traffic.

My personal preference is Discord or Slack, but email is probably more applicable to the majority of users. Most services I've used that implement notifications provide multiple options.
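As a rough illustration of what a push to Discord or Slack involves, here is a minimal sketch using only the Python standard library. It is not part of Pi-hole; the function names and the User-Agent string are hypothetical, and the webhook URL is something you would create yourself in Slack or Discord. Slack incoming webhooks read the `text` field and Discord webhooks read the `content` field.

```python
import json
import urllib.request


def build_payload(service: str, message: str) -> bytes:
    """Encode a message in the JSON shape the chosen webhook type expects."""
    # Slack incoming webhooks read "text"; Discord webhooks read "content".
    key = {"slack": "text", "discord": "content"}[service]
    return json.dumps({key: message}).encode("utf-8")


def push(webhook_url: str, service: str, message: str) -> int:
    """POST the message to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=build_payload(service, message),
        headers={
            "Content-Type": "application/json",
            # Hypothetical client name; some services reject the default one.
            "User-Agent": "pihole-notify-sketch",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

The request is only sent when a failure is detected, so nothing polls in the background.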

Building out a notification feature that can then be expanded later by adding "categories" of opt-in notifications would probably be a good approach. I'm sure there are additional failure states that I'm not aware of that should trigger a notification.

DanSchaper · April 27, 2021, 11:28pm

It's an interesting idea. I don't see it happening any time soon because we just don't have the resources to be able to do it.

jfb · April 27, 2021, 11:57pm

Related to this example, the output of the weekly unattended gravity update is stored in a log for your review at any time until the next unattended gravity update.

/var/log/pihole_updateGravity.log
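If you would rather not read that log by hand, a small script can pick out the failures. This is a minimal sketch, assuming that failed downloads are logged with the text "List download failed" (the message quoted later in this thread); the function name is hypothetical.

```python
from pathlib import Path

LOG_PATH = Path("/var/log/pihole_updateGravity.log")
# Assumption: failed downloads are logged with this text.
FAILURE_MARKER = "List download failed"


def failed_lines(log_text: str, marker: str = FAILURE_MARKER) -> list[str]:
    """Return the stripped log lines that contain the failure marker."""
    return [line.strip() for line in log_text.splitlines() if marker in line]


if __name__ == "__main__" and LOG_PATH.exists():
    # Print every failure line from the last unattended gravity update.
    for line in failed_lines(LOG_PATH.read_text(errors="replace")):
        print(line)
```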

jrucker · April 28, 2021, 12:29am

This is a good example of why notifications triggered by failures are needed. A user shouldn't need to dig through logs to find out if everything is running smoothly.

jfb · April 28, 2021, 12:37am

Logs are meant to be read, which is why the log is retained.

We provide a line by line output in the manual gravity update, and few people actually look at that output either.

DanSchaper · April 28, 2021, 12:46am

http://pi.hole/admin/groups-adlists.php Look for the icon on the left. Click it if it's not a green check.

There's also an error notification system in the admin interface that will display a yellow yield sign if there are issues you need to resolve.

jrucker · April 28, 2021, 12:55am

I think you guys are missing the point I'm making. You spend a lot of time looking at the GUI and logs because you spend a lot of time working on the app. The overwhelming majority of users don't do that, and shouldn't need to. Pi-hole should (and does) work very well without any user intervention. There should be no reason for the user to dig through the GUI or logs looking for problems, because it "just works".

A notification triggered by an unexpected event is what is needed for any user who doesn't actively monitor Pi-hole (which is almost all of them).

DanSchaper · April 28, 2021, 12:58am

I understand your point and I've replied to that already.

We're making you aware of other options that you can make use of in the meantime. I don't expect any kind of push notification system to be possible in the next 6 months at least.

DanSchaper · April 28, 2021, 1:00am

Pi-hole is a DNS server. Your network doesn't work with a non-functioning DNS server. Putting in a few minutes a day, or even a week, isn't too much to ask of you for maintaining software that is free to you.

Do you have a system that updates the software packages and the operating system? Does it tell you on Discord when you need to reboot to apply a new kernel?

jrucker · April 28, 2021, 1:05am

I understand. I just wanted to address the options you provided, to make it clear that I don't want this request to be dismissed completely with "go look at the logs".

Yes, Ansible

Yes.

DanSchaper · April 28, 2021, 1:08am

Which is a push system that reads logs (or command output) to see what actions to take. And it could be set up to look at the logs we write, determine whether there's a condition that needs an alert, and then alert you to it?

Edit: https://www.middlewareinventory.com/blog/ansible-search-string-file-check-if-string-exists/

jrucker · April 28, 2021, 1:20am

Yup, there are lots of ways you could do it with an external tool. Splunk would probably be a more reasonable way than Ansible, since that's kinda what it's for.

I'm aware there are alternative options. The goal of this feature request is to integrate the functionality into Pi-hole, as many services already do, so users don't have to roll their own solution.

yubiuser · April 28, 2021, 6:38am

An intermediate solution could be the integration of gravity warnings/errors into the existing diagnosis system Pi-hole already has.

The amount of work to do this is likely very small compared to the implementation of a whole push notification system.

Of course the user still has to visit the GUI (or check the relevant database table) to see the error, but it is only one place they have to look at.

jpgpi250 · April 28, 2021, 7:40am

I really don't know when this changed, and unfortunately it's undocumented, but it's possible that failed requests could be detected using a simple SQL query on the adlist table of the gravity database. There is a field, status, that might indicate a failed download (not sure).

To test this, I added a list entry for a nonexistent file (https://ligatus.com/blocklist.txt) and ran pihole -g.
The status field value for this list is 4, and the message on screen (when running pihole -g) is "List download failed: no cached list available".
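A query like that could look roughly as follows. This is a minimal sketch, not my actual script: the database path `/etc/pihole/gravity.db` and the `address` column name are assumptions about a default install, and the function name is hypothetical. Only status 4 is treated as a failure by default, since that is the value observed in the test above; other values can be passed in.

```python
import os
import sqlite3

# Assumption: default location of the gravity database.
GRAVITY_DB = "/etc/pihole/gravity.db"


def failed_adlists(connection, failed_statuses=(4,)):
    """Return (address, status) rows whose status is in failed_statuses."""
    placeholders = ",".join("?" for _ in failed_statuses)
    query = (
        "SELECT address, status FROM adlist "
        f"WHERE status IN ({placeholders}) ORDER BY address"
    )
    return connection.execute(query, tuple(failed_statuses)).fetchall()


if __name__ == "__main__" and os.path.exists(GRAVITY_DB):
    # Open read-only so the check can never modify the database.
    connection = sqlite3.connect(f"file:{GRAVITY_DB}?mode=ro", uri=True)
    try:
        for address, status in failed_adlists(connection):
            print(f"status {status}: {address}")
    finally:
        connection.close()
```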

I use several scripts and cron jobs to query the databases, running these when it suits my needs. The result, if alarming, is mailed to me, so I know it's happening, and I can take the appropriate action, if needed.

There are several events I wish to be notified about. The information is almost always available in the databases; what is missing is a script (SQL query) to detect the event and a cron job to report it at an appropriate time.
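The reporting half can be sketched like this, assuming a mail server is listening on localhost port 25 (adjust for your own setup). The function names and the shape of the `failures` list, pairs of (address, status), are hypothetical.

```python
import smtplib
from email.message import EmailMessage


def build_report(failures, sender, recipient):
    """Build an email listing each failed adlist as 'status N: address'."""
    message = EmailMessage()
    message["Subject"] = f"Pi-hole: {len(failures)} adlist download failure(s)"
    message["From"] = sender
    message["To"] = recipient
    body = "\n".join(f"status {status}: {address}" for address, status in failures)
    message.set_content(body)
    return message


def send_report(message, host="localhost", port=25):
    """Hand the message to an SMTP server (assumed to be local)."""
    with smtplib.SMTP(host, port, timeout=10) as smtp:
        smtp.send_message(message)
```

A cron job would run the detection script and call `send_report` only when the list of failures is not empty, so no mail is sent when everything is fine.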

To increase the number of detectable events, I recently submitted a feature request to keep the old database available. I've already written the code and submitted it to DL6ER for review.
I'm currently running this modified gravity.sh and query.sh. Among other things, it lets me get a list of domains that have been blocked for the first time since the latest gravity run, which makes troubleshooting unexpected blocks a lot easier.

I admit this is a workaround: you need to write the scripts and schedule the cron jobs (or execute the scripts for an immediate evaluation). But it is an improvement over not knowing that things are going wrong, as you indicated.

edit
looking at the code in gravity.sh, there are 5 status codes
0: list has been added, gravity hasn't run yet, thus no status
1: No checksum available, create one for comparing on the next run
2: The list changed upstream, we need to update the checksum
3: List download failed: using previously cached list
4: List download failed: no cached list available
/edit
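For use in a script, the list above can be written as a lookup table. This is a minimal sketch; the descriptions are copied from the list above rather than from any official documentation, and treating 3 and 4 as the failure states is my own reading of them.

```python
# Status descriptions as listed above.
STATUS_MEANINGS = {
    0: "list has been added, gravity hasn't run yet",
    1: "no checksum available, one is created for the next run",
    2: "list changed upstream, checksum updated",
    3: "list download failed: using previously cached list",
    4: "list download failed: no cached list available",
}

# Assumption: only the two "download failed" states warrant an alert.
FAILURE_STATUSES = {3, 4}


def describe(status: int) -> str:
    """Return a one-line, human-readable summary of an adlist status."""
    if status not in STATUS_MEANINGS:
        return f"[UNKNOWN] {status}: not in the list above"
    severity = "FAILURE" if status in FAILURE_STATUSES else "ok"
    return f"[{severity}] {status}: {STATUS_MEANINGS[status]}"
```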

edit2
It looks like there is a small bug (or problem) in gravity:
I created two nonexistent list entries (htttps://ligatus.com/test.txt and http://localhost/test.txt).
The status for the first (htttps://ligatus.com/test.txt) is correct (4).
However, http://localhost/test.txt reports status 1 even though the file does not exist; the default webserver page ("Did you mean to go to the admin panel?") is downloaded and evaluated instead.

This implies that failures for files served from the local machine via a URL cannot be detected. A workaround is to use the syntax file:///var/www/html/test.txt.

/edit2
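Since the status field cannot be trusted for such entries, a script could at least flag them. This is a minimal sketch; the function name and the set of hostnames treated as local are assumptions, and other names that resolve to the same machine are not covered.

```python
from urllib.parse import urlsplit

# Assumption: these hostnames all refer to the local machine.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def served_from_local_webserver(url: str) -> bool:
    """True if the adlist URL is fetched over HTTP(S) from the local machine."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and parts.hostname in LOCAL_HOSTS
```

Entries for which this returns True are candidates for the file:// syntax.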

Bucking_Horn · May 4, 2021, 6:30am

4 posts were split to a new topic: Include severity in certain logs

system · October 31, 2021, 6:31am

This topic was automatically closed 180 days after the last reply. New replies are no longer allowed.