Please follow the template below; it will help us to help you!
If you are experiencing issues with a Pi-hole install that has non-standard elements (e.g. you are using nginx instead of lighttpd, or some other aspect of your install is customised), please use the Community Help category.
Expected Behaviour:
Fresh install of Pi-hole on Raspbian OS 64-bit today, 1-16-23
Actual Behaviour:
The Steven Black List is reporting 15 invalid domains
Yeah, good one, maintaining a dict (or something) to filter flagged lines is a pretty good idea.
I was thinking, alternatively, you could pick a number, say 10^3 or 10^4, beyond which you would simply say, for example, "168,235 domains added", without regard for the relatively minuscule number of lines discarded.
Because a batch larger than, say, 10^3 or 10^4 domains is a bulk upload, where the <0.1% failure rate is of little or no concern. A bulk addition is categorically different from a small list, where discarded lines are more interesting, and even actionable. With large bulk uploads, nothing's really actionable.
Detailed error reporting could be a flag or an enum, with default off beyond the threshold.
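To make the threshold idea a bit more concrete, here is a minimal Python sketch. It is not how gravity is written; the names `BULK_THRESHOLD`, `DetailLevel`, `should_show_details` and `summarize` are made up for illustration, and the cut-off of 10^4 is just one of the numbers floated above.

```python
from enum import Enum
from typing import List

# Hypothetical cut-off: lists larger than this are treated as bulk uploads.
BULK_THRESHOLD = 10**4


class DetailLevel(Enum):
    AUTO = "auto"      # list discarded lines only for small lists
    ALWAYS = "always"  # always list discarded lines
    NEVER = "never"    # never list discarded lines


def should_show_details(total_lines: int, level: DetailLevel = DetailLevel.AUTO) -> bool:
    """Decide whether discarded lines should be listed for a list of this size."""
    if level is DetailLevel.ALWAYS:
        return True
    if level is DetailLevel.NEVER:
        return False
    # AUTO: detailed reporting defaults to off beyond the threshold.
    return total_lines <= BULK_THRESHOLD


def summarize(total_lines: int, discarded: List[str],
              level: DetailLevel = DetailLevel.AUTO) -> str:
    """Return a short summary, with the discarded lines only when wanted."""
    added = total_lines - len(discarded)
    lines = [f"{added:,} domains added"]
    if discarded and should_show_details(total_lines, level):
        lines.append(f"{len(discarded)} lines discarded:")
        lines.extend(f"  - {entry}" for entry in discarded)
    return "\n".join(lines)
```

With this sketch a small list still shows what was dropped, while a bulk list only prints the count unless `DetailLevel.ALWAYS` is passed explicitly.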
If I imported an adlist and didn't know why hostnames would be ignored, I would read "ignored lines" or "unused lines" the same way: I would wonder what I did wrong and what I need to do to fix it, and would then post somewhere to ask. The exclamation mark also makes it seem dramatic, as if action should be taken.
[i] List contains 8 entries comprising 3 usable domains and 5 non-domain entries.
Sample of non-domain entries:
[...]
I think the exclamation mark will be gone for cases like this.
This is not an error. The code is ignoring some lines because they are not used by Pi-hole. No action is needed in this case.
Hello, I was just heading here to enquire about this issue after the latest Pi-hole update. I noticed the exclamation marks next to the lists on the web interface, and when I then updated gravity via ssh, Steven's list popped up with an invalid entry report.
It's good of Steven to chase this down and confirm that his lists are not the issue here, props.
On the plus side, the gravity update did report an actual invalid entry from the blackbook list I use, which I have reported to that list's maintainer.
I found the output from pihole -g quite alarming, as it is not clear whether Pi-hole is still blocking the remaining domains on the list or whether it didn't load the list at all after detecting "invalid" entries.
I'd like to suggest adding a simple console line edit to confirm that the valid entries on the list have been imported - purely for the sake of clarity for end-users so they know what Pi-Hole is doing with the lists when a rejection occurs, for example:
Yep, I saw that in the web GUI; my comment was about the console output when running pihole -g:
[i] Target: https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
[✓] Status: Retrieval successful
[i] Analyzed 168834 domains, 15 domains invalid!
Sample of invalid domains:
- 0.0.0.0
- broadcasthost
- fe
- ff
[i] List stayed unchanged
It does state "List stayed unchanged", but I guess the use of "invalid" here set alarm bells ringing for me and plenty of other people, hence my suggestion about changing the wording used for this feature in the console output.
Another angle I am exploring is to still have the information, but hide it behind a --verbose switch. It may yet be useful information to an end user.
I'm just trying to find a useful compromise mainly.
Changes would be along the lines of (wording subject to change):
[i] Target: https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
[✓] Status: Retrieval successful
[i] Analyzed 168834 entries, and imported 168819 domains
[i] 15 entries unusable by Pi-hole
[i] This is not an error
Sample of unusable entries:
- 0.0.0.0
- broadcasthost
- fe
- ff
- ip6-allhosts
[i] List stayed unchanged
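For anyone who wants to play with the wording, the proposed summary could be sketched in Python roughly like this. This is only an illustration of the idea of hiding the sample behind a verbose switch; `format_report`, `verbose` and `sample_size` are hypothetical names, not anything taken from the gravity script.

```python
from typing import List


def format_report(analyzed: int, unusable: List[str], verbose: bool = False,
                  sample_size: int = 5) -> List[str]:
    """Build console lines in the style of the proposed wording."""
    imported = analyzed - len(unusable)
    lines = [f"[i] Analyzed {analyzed} entries, and imported {imported} domains"]
    if unusable:
        lines.append(f"[i] {len(unusable)} entries unusable by Pi-hole")
        lines.append("[i] This is not an error")
        # The sample is extra detail, so it is only shown on request.
        if verbose:
            lines.append("Sample of unusable entries:")
            lines.extend(f"- {entry}" for entry in unusable[:sample_size])
    return lines
```

Called with `verbose=False` it prints only the counts and the reassurance line; with `verbose=True` it adds up to `sample_size` example entries.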
I do like the output in this PR, as it is much clearer what is happening with the lists and it filters out the false-positives using the dict.
If you were looking for something more descriptive to insert for the sample of unusable entries, perhaps inserting a small bracketed section with examples before the colon to avoid adding extra lines, for example:
[i] Target: https://raw.githubusercontent.com/jankytay/pihole-troubleshooting/master/issue5008.txt
[✓] Status: Retrieval successful
[i] Imported 3 domains, 5 entries unusable by Pi-hole
Sample of unusable entries [regex, wildcards, data after TLD, etc]:
- (\.|^)example6\.com$
- (^|\.)example5\.com
- (^|\.)example8\.com;reply=none
- *.example3.com
- ^example77?
It's a little concise but in general I think people would get the gist of why those entries may be unusable without adding a wall of text into the console.
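As a rough illustration of how entries like the ones in that sample could be sorted into the bracketed categories, here is a small Python sketch. It is deliberately crude and is not Pi-hole's actual validation: `classify`, `DOMAIN_RE` and `REGEX_CHARS` are hypothetical names, and the domain pattern is a simplification (for example, it ignores punycode TLDs).

```python
import re

# One DNS label: letters, digits and inner hyphens (simplified).
LABEL = r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"
# Simplified domain check: one or more labels followed by an alphabetic TLD.
DOMAIN_RE = re.compile(rf"^(?:{LABEL}\.)+[a-z]{{2,}}$")
# Characters that suggest the entry is a regular expression.
REGEX_CHARS = set("^$()|\\?[]+{}")


def classify(entry: str) -> str:
    """Return 'domain', 'wildcard', 'regex' or 'other' for a list entry."""
    entry = entry.strip().lower()
    if entry.startswith("*."):
        return "wildcard"
    if any(ch in REGEX_CHARS for ch in entry):
        return "regex"
    if DOMAIN_RE.match(entry):
        return "domain"
    return "other"
```

Under these assumptions, `*.example3.com` comes back as `wildcard`, `^example77?` as `regex`, and plain hosts-file leftovers such as `0.0.0.0` or `fe` as `other`.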
As a positive for this feature, the maintainers of the blackbook list I use, from which Pi-hole reported an unusable domain, have already updated their list following the issue I created in their repo documenting the problem.