Gravity update doesn't work for lists when their domain is on a blocklist

ryrun · August 3, 2019, 8:23pm

Expected Behaviour:

Successfully download a list from a blocked domain.

Actual Behaviour:

It fails every time. It only works if I whitelist the domain first.

Here is example output from a gravity update:

  [i] Target: s3.amazonaws.com (simple_malvertising.txt)
  [✗] Status: s3.amazonaws.com is blocked by . Using DNS on 8.8.8.8 to download https://s3.amazonaws.com/lists.disconnect.me/simple_malvertising.txt
  [✗] Status: https://s3.amazonaws.com/lists.disconnect.me/simple_malvertising.txt (304000)
  [✗] List download failed: using previously cached list

Did I miss something, or is this maybe broken?
Pi-hole Version: v4.3.1

Speedy70 · August 3, 2019, 8:54pm

Hello

I'm not a specialist, but I would say that Pi-hole is doing exactly what it was told to do: block DNS queries for the domains contained in the blocklists.
Use "Query Lists" in the web interface to find out which blocklist contains the domain s3.amazonaws.com, or whether a regex is responsible for the blocking. I think you can also find this out with "Tail Pi-hole.log" in the admin interface of your Pi-hole.

Then you can decide whether to whitelist the domain or remove the responsible blocklist.

Regards

jfb · August 3, 2019, 8:56pm

Pi-hole is working as intended. Line 2 of your output shows that Pi-hole recognizes that the domain is blocked, so it bypasses itself and tries to load the list using your specified upstream DNS server directly.

The list would not load from the requested server.

Just curious - why do you have s3.amazonaws.com blocked?

ryrun · August 3, 2019, 10:36pm

I'm just using several lists I found here in this forum. Are you sure that this is intended? Pi-hole detects the DNS block, which is correct, and it's using 8.8.8.8 to resolve s3.amazonaws.com. So getting the IP and then downloading the file shouldn't be a problem, right? Or maybe I'm just misinterpreting this message.

DanSchaper · August 3, 2019, 10:41pm

Pi-hole sees that the list source is blocked and then uses 8.8.8.8 to resolve the IP of the target list. That all appears to be functioning as it should. What appears to be happening is that either the target URL is gone or something else is preventing you from downloading that URL.

A quick check would be to set the nameserver in /etc/resolv.conf to 8.8.8.8 on the Pi-hole and then try to curl that file. If that fails, you should get a reason for the failed download. Just set /etc/resolv.conf back to what it was previously when you are done.
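The same check can also be sketched without editing /etc/resolv.conf, by asking 8.8.8.8 directly and handing the address to curl. This is only a rough sketch: the function names are made up for illustration, it assumes dig and curl are installed, and it assumes that taking the first IPv4-looking line of the dig output is good enough.

```shell
# Hypothetical helpers for illustration; not part of Pi-hole.

# Print a curl --resolve value in the form HOST:PORT:ADDRESS.
build_resolve_arg() {
    printf '%s:%s:%s\n' "$1" "$2" "$3"
}

# Fetch a URL while resolving its host through a chosen DNS server.
# Usage: fetch_via_dns DNS_SERVER HOST URL
fetch_via_dns() {
    local dns="$1" host="$2" url="$3" addr
    # dig +short can print CNAME targets as well as addresses,
    # so keep only the first line that looks like an IPv4 address.
    addr="$(dig "@${dns}" +short "$host" A | grep -E -m1 '^[0-9]+(\.[0-9]+){3}$')"
    if [ -z "$addr" ]; then
        echo "no IPv4 address returned for $host" >&2
        return 1
    fi
    # Discard the body and print only the HTTP status code.
    curl -s -S -o /dev/null -w '%{http_code}\n' \
        --resolve "$(build_resolve_arg "$host" 443 "$addr")" "$url"
}
```

It could be called as fetch_via_dns 8.8.8.8 s3.amazonaws.com https://s3.amazonaws.com/lists.disconnect.me/simple_malvertising.txt, which leaves the system resolver configuration untouched.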

ryrun · August 3, 2019, 10:49pm

It works via ssh/console:

pi@raspberrypi:~ $ cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
# nameserver 127.0.0.1
nameserver 8.8.8.8
pi@raspberrypi:~ $ curl https://s3.amazonaws.com/lists.disconnect.me/simple_malvertising.txt -v -s > /dev/null
...
...
{ [5 bytes data]
< HTTP/1.1 200 OK
< x-amz-id-2: mN14KgY58xfeu58D7t/29mFUHBgD1pczcstlLHnnOTZpR2YIXYO0CXaQ85VRPVM/vkZMYuQU/ds=
< x-amz-request-id: E7B8D1382377E7EC
< Date: Sat, 03 Aug 2019 22:48:08 GMT
< Last-Modified: Sat, 03 Aug 2019 22:21:25 GMT
< ETag: "1e776c1ad54b664a18f97f81580ebc91"
< Accept-Ranges: bytes
< Content-Type: application/octet-stream
< Content-Length: 44181
< Server: AmazonS3
<
{ [5 bytes data]
* Connection #0 to host s3.amazonaws.com left intact

DanSchaper · August 3, 2019, 10:52pm

Reset /etc/resolv.conf back to localhost and then try pihole -g again to see if it was a temporary error that is fixed now.

ryrun · August 3, 2019, 10:54pm

Ok, nothing changed, same problem:

pi@raspberrypi:~ $ cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 127.0.0.1
pi@raspberrypi:~ $ pihole -g
  [i] Pi-hole blocking is enabled
  [i] Neutrino emissions detected...
  [✓] Pulling blocklist source list into range

  [i] Target: raw.githubusercontent.com (hosts)
  [✓] Status: Retrieval successful

  [i] Target: mirror1.malwaredomains.com (justdomains)
  [✓] Status: No changes detected

  [i] Target: sysctl.org (hosts)
  [✓] Status: No changes detected

  [i] Target: zeustracker.abuse.ch (blocklist.php?download=domainblocklist)
  [✓] Status: No changes detected

  [i] Target: s3.amazonaws.com (simple_tracking.txt)
  [✗] Status: s3.amazonaws.com is blocked by . Using DNS on 8.8.8.8 to download https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt
  [✗] Status: https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt (200000)
  [✗] List download failed: using previously cached list

DanSchaper · August 3, 2019, 11:03pm

Do you have s3.amazonaws.com as a manual blacklist item? Can you get the results of pihole -q --exact s3.amazonaws.com?

ryrun · August 3, 2019, 11:08pm

Nope, it's a partial match in one of my lists:

  [i] Over 100 results found for s3.amazonaws.com
        This can be overridden using the -all option

I got this after using your command:

pi@raspberrypi:~ $ pihole -q --exact s3.amazonaws.com
  [i] No exact results found for -s3.amazonaws.com within the block lists

I'm currently checking the gravity.sh source to see how the download should work when the domain is blocked.

DanSchaper · August 3, 2019, 11:11pm

Do you have any regex blocks added? You can run sudo bash /opt/pihole/gravity.sh to see the output, looking for a line that has httpCode as something not 200 or 304.
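To make the "not 200 or 304" check concrete, here is a small sketch of how an httpCode string could be sorted into buckets. It is a hypothetical helper, not the actual logic in gravity.sh, and the bucket names are invented for illustration.

```shell
# Hypothetical helper: classify the string curl prints for -w '%{http_code}'.
classify_http_code() {
    case "$1" in
        200) echo "downloaded" ;;              # list retrieved
        304) echo "unchanged" ;;               # not modified since the cached copy
        000) echo "no-response" ;;             # curl got no HTTP response at all
        [0-9][0-9][0-9]) echo "error" ;;       # any other three-digit status
        *) echo "malformed" ;;                 # not a three-digit status code
    esac
}

classify_http_code 200      # downloaded
classify_http_code 304      # unchanged
classify_http_code 200000   # malformed
```

A value such as 200000 falls through to the last branch because the patterns have to match the whole string, not just its first three digits.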

DanSchaper · August 3, 2019, 11:12pm

Sorry, try pihole -q -exact s3.amazonaws.com and note only one dash before the word exact.

ryrun · August 3, 2019, 11:18pm

I made a dirty change to the gravity.sh script to get the curl command:

  [i] Target: s3.amazonaws.com (simple_tracking.txt)
  [✗] Status: s3.amazonaws.com is blocked by . Using DNS on 8.8.8.8 to download https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt
  [i] Status: Pending...
curl -s -L --resolve s3.amazonaws.com:443:s3-1.amazonaws.com. 52.216.133.53 -z /etc/pihole/list.4.s3.amazonaws.com.domains -w %{http_code} -A Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt -o /tmp/tmp.T5wHOS76TO.phgpb

Is this right for resolving via 8.8.8.8? As far as I can see, nothing else is executed before this curl request is fired.
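For comparison, curl documents the --resolve value as HOST:PORT:ADDRESS in a single argument. The sketch below checks a string against that shape. It is a simplified, hypothetical check that only accepts one IPv4 address, whereas curl itself also accepts other forms such as IPv6 addresses.

```shell
# Hypothetical check: is the value a single HOST:PORT:IPv4 token?
is_resolve_arg() {
    printf '%s\n' "$1" | grep -Eq '^[^:[:space:]]+:[0-9]+:[0-9]+(\.[0-9]+){3}$'
}

# A value of the documented shape.
is_resolve_arg 's3.amazonaws.com:443:52.216.133.53' && echo "well-formed"

# The value as it appears in the command above.
is_resolve_arg 's3.amazonaws.com:443:s3-1.amazonaws.com. 52.216.133.53' || echo "not well-formed"
```

In the command above, the address slot holds s3-1.amazonaws.com. and the IP follows as a separate word, so an unquoted shell expansion would hand that IP to curl as an extra argument.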

Found an exact match:

pi@raspberrypi:~ $ pihole -q -exact s3.amazonaws.com
 Exact match for s3.amazonaws.com found in:
   list.13.v.firebog.net.domains

DanSchaper · August 3, 2019, 11:23pm

What was the result/response from the curl? If you run sudo bash /opt/pihole/gravity.sh you'll get the full display with all of the result variables and string checks to see why the script is showing a failure.

Blocking the entire s3 infrastructure is overkill but you can find the exact name of the list blocking via:

pihole -q -exact -adlist s3.amazonaws.com

ryrun · August 3, 2019, 11:31pm

The output is the same:

pi@raspberrypi:~ $ sudo bash /opt/pihole/gravity.sh
  [i] Pi-hole blocking is enabled
  [i] Neutrino emissions detected...
  [✓] Pulling blocklist source list into range

  [i] Target: raw.githubusercontent.com (hosts)
  [✓] Status: Retrieval successful

  [i] Target: mirror1.malwaredomains.com (justdomains)
  [✓] Status: No changes detected

  [i] Target: sysctl.org (hosts)
  [✓] Status: No changes detected

  [i] Target: zeustracker.abuse.ch (blocklist.php?download=domainblocklist)
  [✓] Status: No changes detected

  [i] Target: s3.amazonaws.com (simple_tracking.txt)
  [✗] Status: s3.amazonaws.com is blocked by . Using DNS on 8.8.8.8 to download https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt
  [✗] Status: https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt (200000)
  [✗] List download failed: using previously cached list

  [i] Target: s3.amazonaws.com (simple_ad.txt)
  [i] Status: Pending...^C

  [i] User-abort detected
  [✓] Cleaning up stray matter
  [✓] DNS service is running
  [✓] Pi-hole blocking is Enabled
pi@raspberrypi:~ $ pihole -g
  [i] Pi-hole blocking is enabled
  [i] Neutrino emissions detected...
  [✓] Pulling blocklist source list into range

  [i] Target: raw.githubusercontent.com (hosts)
  [✓] Status: Retrieval successful

  [i] Target: mirror1.malwaredomains.com (justdomains)
  [✓] Status: No changes detected

  [i] Target: sysctl.org (hosts)
  [✓] Status: No changes detected

  [i] Target: zeustracker.abuse.ch (blocklist.php?download=domainblocklist)
  [✓] Status: No changes detected

  [i] Target: s3.amazonaws.com (simple_tracking.txt)
  [✗] Status: s3.amazonaws.com is blocked by . Using DNS on 8.8.8.8 to download https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt
  [✗] Status: https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt (200000)
  [✗] List download failed: using previously cached list

  [i] Target: s3.amazonaws.com (simple_ad.txt)
  [✗] Status: s3.amazonaws.com is blocked by . Using DNS on 8.8.8.8 to download https://s3.amazonaws.com/lists.disconnect.me/simple_ad.txt
  [i] Status: Pending...^C

  [i] User-abort detected
  [✓] Cleaning up stray matter
  [✓] DNS service is running
  [✓] Pi-hole blocking is Enabled

Here is the result of the curl command. I've added single quotes around the user-agent string, which were missing from my echo output. It's strange that there is HTML in the tmp file.

pi@raspberrypi:~ $ curl -s -L --resolve s3.amazonaws.com:443:s3-1.amazonaws.com. 52.216.133.53 -z /etc/pihole/list.4.s3.amazonaws.com.domains -w %{http_code} -A 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36' https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt -o /tmp/tmp.T5wHOS76TO.phgpb
200000pi@raspberrypi:~ $ head -n3 /tmp/tmp.T5wHOS76TO.phgpb
<!DOCTYPE html>
<html class="no-js aws-lng-en_US" lang="en-US" data-static-assets="https://a0.awsstatic.com" data-js-version="1.0.294" data-css-version="1.0.295">
 <head>
pi@raspberrypi:~ $ head -n10 /tmp/tmp.T5wHOS76TO.phgpb
<!DOCTYPE html>
<html class="no-js aws-lng-en_US" lang="en-US" data-static-assets="https://a0.awsstatic.com" data-js-version="1.0.294" data-css-version="1.0.295">
 <head>
  <meta http-equiv="content-type" content="text/html; charset=UTF-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <link rel="dns-prefetch" href="https://a0.awsstatic.com" />
  <link rel="dns-prefetch" href="//d0.awsstatic.com" />
  <link rel="dns-prefetch" href="//d1.awsstatic.com" />
  <title>Cloud Object Storage | Store &amp; Retrieve Data Anywhere | Amazon Simple Storage Service</title>
  <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1" />

DanSchaper · August 3, 2019, 11:35pm

Okay, then let's try verbose output:

sudo bash -x /opt/pihole/gravity.sh

DanSchaper · August 3, 2019, 11:38pm

Can you tell me exactly what list is blocking s3 so that I can try to duplicate the situation here?

ryrun · August 3, 2019, 11:39pm

Yep, sorry, I missed this:

pi@raspberrypi:~ $ pihole -q -exact -adlist s3.amazonaws.com
 Exact match for s3.amazonaws.com found in:
   https://v.firebog.net/hosts/Kowabit.txt

ryrun · August 3, 2019, 11:41pm

Verbose output:

++ pihole -q -adlist s3.amazonaws.com
++ head -n1
++ awk -F 'Match found in ' '{print $2}'
+ bad_list=
+ echo -e '\r  [✗] Status: s3.amazonaws.com is blocked by . Using DNS on 8.8.8.8 to download https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt'
  [✗] Status: s3.amazonaws.com is blocked by . Using DNS on 8.8.8.8 to download https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt
+ echo -ne '  [i] Status: Pending...'
  [i] Status: Pending...+ cmd_ext='--resolve s3.amazonaws.com:443:s3-1.amazonaws.com.
52.216.138.149 '
++ curl -s -L --resolve s3.amazonaws.com:443:s3-1.amazonaws.com. 52.216.138.149 -z /etc/pihole/list.4.s3.amazonaws.com.domains -w '%{http_code}' -A 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36' https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt -o /tmp/tmp.6aqmRvxwx2.phgpb
+ httpCode=200000
+ case $url in
+ case "${httpCode}" in
+ echo -e '\r  [✗] Status: https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt (200000)'
  [✗] Status: https://s3.amazonaws.com/lists.disconnect.me/simple_tracking.txt (200000)
+ [[ '' == true ]]
+ [[ -r /etc/pihole/list.4.s3.amazonaws.com.domains ]]
+ echo -e '  [✗] List download failed: using previously cached list'
  [✗] List download failed: using previously cached list
+ echo ''

httpCode=200000 seems wrong to me here.

DanSchaper · August 3, 2019, 11:42pm

Okay, that's the problem, there's no such http code. Checking on that.
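One possible reading of the trace above, offered as a guess rather than a confirmed diagnosis: the cmd_ext value spans two lines (s3-1.amazonaws.com. and 52.216.138.149), which looks like a CNAME target followed by an address. When such a value is expanded unquoted, the shell splits it into three words, so curl would see the IP as an extra argument and could treat it as a second URL. Since -w '%{http_code}' is printed once per transfer, a 200 followed by a 000 would then read as 200000. The sketch below only demonstrates the word splitting and the concatenation; count_args is a made-up helper, and this is not the fix the Pi-hole developers applied.

```shell
# Hypothetical helper: print how many arguments it received.
count_args() {
    echo "$#"
}

# The value from the trace, including the embedded newline and trailing space.
cmd_ext='--resolve s3.amazonaws.com:443:s3-1.amazonaws.com.
52.216.138.149 '

# Unquoted on purpose: the shell splits on spaces and newlines.
count_args $cmd_ext

# What a value holding only an address after the port would split into.
clean_ext='--resolve s3.amazonaws.com:443:52.216.138.149'
count_args $clean_ext

# Two status codes printed back to back, with no separator between them.
printf '%s%s\n' 200 000
```

The first call prints 3 and the second prints 2, and the final line prints 200000. The same arithmetic would account for the 304000 in the first post.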