I've noticed that (most of) my adlists get the status `Status: Retrieval successful` instead of `No changes detected` when running `pihole -g`, even when running it twice within a minute.
Looking at the code, I see that the "Status" is based on the HTTP response. The code even states that some servers don't provide the necessary header.
Since Pi-hole saves all downloaded lists locally, I request that it additionally compare the old local list with the newly downloaded one to decide whether "Status" should be "changed" or "not changed".
This could be extended to locally saved adlists, which at the moment never get "Status: No changes detected".
This would allow gravity's output to be improved by separating it into two lines for each adlist:
Status: Retrieval successful
Status: (No) Changes detected
Furthermore, this would make it possible to reliably determine whether the adlist content has changed, even when the server doesn't provide the necessary HTTP response header.
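A minimal sketch of the comparison requested above, assuming the previous copy of a list and the fresh download are both available as files. The function name `adlist_change_status` and its two path arguments are hypothetical and not taken from gravity's actual code:

```shell
# Hypothetical helper (not part of gravity.sh): report whether a freshly
# downloaded adlist differs from the copy saved by the previous run.
#   $1 = previously saved list, $2 = newly downloaded list
adlist_change_status() {
    local saved="$1" downloaded="$2"

    if [ ! -f "${saved}" ]; then
        # No earlier copy exists, so any content counts as a change
        echo "Changes detected"
    elif cmp -s "${saved}" "${downloaded}"; then
        # cmp -s prints nothing and exits 0 only if both files are byte-identical
        echo "No changes detected"
    else
        echo "Changes detected"
    fi
}
```

Because the check only looks at the file contents, it would work the same for lists served without the relevant header and for locally saved adlists.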
This is also a prerequisite for feature requests like
Hi,
What I've found over time is that my blocklist (currently 33 individual URLs) is growing to be quite large.
Is it possible to include a "Last modified" header field (from the website where the content is being downloaded from) in the list so that I can potentially remove blocklists that are not being maintained by their respective owners?
Perhaps add the feature to sort by last modified date so it's easier to quickly identify non-maintained block lists and remove them if applicable.
T…
and
Since v5.0 pihole displays a mouseover tooltip for each adlist in group management containing information about "Last modified" (field date_modified in gravity.db).
From a user's perspective I would expect this date corresponds to the date the content of the adlist has changed. Instead, it shows when the database entry for that adlist changed.
This can easily be seen by enabling/disabling an adlist: the "Last modified" date will change.
[Screenshot from 2020-07-05 09-21-07]
[Screenshot from…
DL6ER
December 29, 2020, 6:47am
I was just working on this one when I saw that
looks like this on my Pi-hole:
[i] Target: https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
[✓] Status: No changes detected
[i] Analyzed 58750 domains
Can you confirm the server raw.githubusercontent.com is still not sending the header for you?
I will likely still add the check you requested here; however, I will need some new test lists now.
Edit: @yubiuser, can you recommend one for me for testing? I picked a few random ones (not githubusercontent) from https://v.firebog.net/hosts/lists.php?type=tick but all of them seem to be able to deliver the header just fine:
Strange that you get a different response.
From today:
[i] Target: https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
[✓] Status: Retrieval successful
[i] Analyzed 58750 domains
nanopi@nanopi:~$ curl -I https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 1788565
Content-Type: text/plain; charset=utf-8
Cache-Control: max-age=300
Content-Security-Policy: default-src 'none'; style-src 'unsafe-inline'; sandbox
ETag: "6cfff029d57c53731a9f0b797682f676ed6370af44a99686f04bf242300c052e"
Strict-Transport-Security: max-age=31536000
X-Content-Type-Options: nosniff
X-Frame-Options: deny
X-XSS-Protection: 1; mode=block
Via: 1.1 varnish (Varnish/6.0), 1.1 varnish
X-GitHub-Request-Id: 53EA:DE6F:2FEE7DA:3274584:5FEB14E2
Accept-Ranges: bytes
Date: Tue, 29 Dec 2020 11:47:03 GMT
X-Served-By: cache-fra19131-FRA
X-Cache: HIT, HIT
X-Cache-Hits: 1, 3
X-Timer: S1609242423.314208,VS0,VE0
Vary: Authorization,Accept-Encoding, Accept-Encoding
Access-Control-Allow-Origin: *
X-Fastly-Request-ID: d2099b4af6e09adef7b0c58b92075c4bcc918bc1
Expires: Tue, 29 Dec 2020 11:52:03 GMT
Source-Age: 241
Others do send the header:
nanopi@nanopi:~$ curl -I http://dehakkelaar.nl/lists/cryptojacking_campaign.list.txt
HTTP/1.1 200 OK
Date: Tue, 29 Dec 2020 11:52:14 GMT
Server: Apache/2.4.25 (Debian)
Last-Modified: Tue, 29 Dec 2020 04:47:06 GMT
ETag: "38a3-5b79315298b55"
Accept-Ranges: bytes
Content-Length: 14499
Vary: Accept-Encoding
Content-Type: text/plain
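The difference between the two responses above is the presence of a `Last-Modified` line. As a rough sketch, a check like the following could be used to test a list for it; the function name `has_last_modified` is hypothetical, and it only inspects whatever header text is piped into it:

```shell
# Hypothetical helper: reads HTTP response headers on stdin (for example the
# output of `curl -sI <url>`) and succeeds if a Last-Modified header is present.
has_last_modified() {
    # Header names are case-insensitive, hence -i; -q suppresses the output
    grep -qi '^last-modified:'
}
```

It could be used as `curl -sI <url> | has_last_modified && echo "header present"`, with the URL of the list to test.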
I enabled all adlists I have available and ran gravity twice. I have only one adlist (next to githubusercontent) that does not send the header:
https://gitlab.com/ZeroDot1/CoinBlockerLists/raw/master/hosts
That's oc to remind me I have to create another way to pull the domains from the Excel sheet.
EDIT: No Last-Modified header either when I curl that one.
PromoFaux
Split this topic
December 30, 2020, 4:45pm
4 posts were split to a new topic: Question about a list
DL6ER
December 29, 2020, 7:41pm
Don't ask me what happened here, I didn't look at the headers this morning. Just tested this now again and it is back to its usual self:
[i] Target: https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
[✓] Status: Retrieval successful
[i] Analyzed 58750 domains
This feature request is getting implemented by
pi-hole:development
← pi-hole:new/gravity_adlist_infos
opened 07:10AM - 28 Dec 20 UTC
**By submitting this pull request, I confirm the following:**
- [X] I have read and understood the [contributors guide](https://github.com/pi-hole/pi-hole/blob/master/CONTRIBUTING.md), as well as this entire template.
- [X] I have made only one major change in my proposed changes.
- [X] I have commented my proposed changes within the code.
- [X] I have tested my proposed changes, and have included unit tests where possible.
- [X] I am willing to help maintain this change if there are issues with it later.
- [X] I give this submission freely and claim no ownership.
- [X] It is compatible with the [EUPL 1.2 license](https://opensource.org/licenses/EUPL-1.1)
- [X] I have squashed any insignificant commits. ([`git rebase`](http://gitready.com/advanced/2009/02/10/squashing-commits-with-rebase.html))
---
**What does this PR aim to accomplish?:**
Store more gravity details in gravity.db adlist table for user display on the web interface
**How does this PR accomplish the above?:**
Store status of downloaded list (downloaded, using cache, some error,…) and number of (in-)valid domains on this list in the gravity database. This updates the gravity database to version 14.
**What documentation changes (if any) are needed to support this PR?:**
Gravity database schema changes, this needs to be accounted for in https://docs.pi-hole.net/database/gravity/groups/ in a follow-up PR.
This has been implemented.