OpenDNS - "reply error is REFUSED"

Hello,

Starting today, for no known reason, the OpenDNS upstream servers stopped working with Pi-hole.
My logs are full of:

Mar 15 22:13:13 dnsmasq[1174]: forwarded [xp.itunes-apple.com.akadns.net] to 208.67.220.220
Mar 15 22:13:13 dnsmasq[1174]: reply error is REFUSED

As soon as I change the upstream DNS server to another one, everything works fine.
So the problem seems related to OpenDNS. But if I use the OpenDNS resolver directly from my laptop, it works fine. The issue really is between Pi-hole and OpenDNS.

Mystery, mystery. Any ideas?
Thanks

[Full disclosure: I'm an employee of Cisco and my team works on the OpenDNS resolvers. Views expressed here are my own, though informed by my history and exposure to the product.]

Sounds like you may be hitting a rate limit. We only use REFUSED in a few cases, primarily when rate limits have been violated, but also if the domain in question appears to be part of certain attack patterns (I don't think that's the case here). Do you have an OpenDNS account, or are you just pointing to OpenDNS without an account? Accounts have higher rate limits (as we have a means to contact you if things are going sideways).

support@opendns.com is always an option too, but I doubt it's anything specific to Pi-hole that's causing it.

To get a feeling for the amount of queries, please post the output of

echo ">stats >quit" | nc localhost 4711

Additionally, please upload a debug log and post just the token that is generated after the log is uploaded by running the following command from the Pi-hole host terminal:

pihole -d

or do it through the Web interface:

Tools > Generate Debug Log

Hello,

Many thanks, guys. I'm a Cisco employee as well, so I probably have a high rate limit.
Having said that, I do not see any error messages in my OpenDNS/Umbrella console.

Here is the output of the requested command. Does it look too high?

domains_being_blocked 82717
dns_queries_today 44559
ads_blocked_today 1379
ads_percentage_today 3.094773
unique_domains 2542
queries_forwarded 35244
queries_cached 7757
clients_ever_seen 2
unique_clients 2
dns_queries_all_types 44559
reply_NODATA 95
reply_NXDOMAIN 198
reply_CNAME 838
reply_IP 1668
privacy_level 0
status enabled

And the output of the debug log:
(I had to configure 1.1.1.1 for the upstream DNS server, as a temporary workaround)

https://tricorder.pi-hole.net/b5ry5y0pfq
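
As a rough sanity check on the numbers above, the counters can be turned into average rates. The sketch below assumes the "key value" layout shown in the stats output and that the counters cover a 24-hour window; stats_summary is a hypothetical helper name, not part of Pi-hole.

```shell
# Summarise Pi-hole ">stats" output read from stdin.
# Assumption: one "key value" pair per line, counters cover 24 hours (86400 s).
stats_summary() {
  awk '
    $1 == "dns_queries_today" { total = $2 }
    $1 == "queries_forwarded" { fwd = $2 }
    END {
      if (total > 0) {
        printf "avg_queries_per_sec %.2f\n", total / 86400
        printf "avg_forwarded_per_sec %.2f\n", fwd / 86400
        printf "forwarded_percent %.1f\n", 100 * fwd / total
      }
    }
  '
}
```

It can be fed directly from the stats command, e.g. echo ">stats >quit" | nc localhost 4711 | stats_summary. With the figures posted here, 44559 queries per day is roughly 0.5 queries per second on average. An average says nothing about bursts, though.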

Unless all that volume is happening in a short amount of time (check your OpenDNS or Pi-hole dashboard), no. That volume should be fine; I'm pushing significantly more traffic than that through :slight_smile:

Something to try, if you are still seeing REFUSED, is a TXT lookup of debug.opendns.com (dig txt debug.opendns.com or nslookup -type=txt debug.opendns.com). Make sure originid is not 0. Otherwise, I'd suggest opening a ticket. If you are a Cisco employee, there may be some additional contacts you can reach out to.

A couple of us who work on the resolvers run Pi-hole at home (we even modified FTL for a hackathon project to embed the device ID). I'll shoot you a PM.
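
The originid check can be scripted so it runs from the Pi-hole host against the same upstream that returns REFUSED. This is only a sketch: it assumes dig is installed and that the TXT answer contains a record of the form "originid <number>"; opendns_originid and parse_originid are hypothetical helper names, and 208.67.220.220 is simply the upstream seen in the log excerpt at the top of this thread.

```shell
# Extract the originid value from `dig +short txt` output on stdin.
# Assumption: the answer includes a quoted record like "originid 12345".
parse_originid() {
  tr -d '"' | awk '$1 == "originid" { print $2 }'
}

# Ask the given resolver (default: 208.67.220.220) for its debug TXT records
# and print only the originid it reports for this client.
opendns_originid() {
  dig +short txt debug.opendns.com "@${1:-208.67.220.220}" | parse_originid
}
```

Running opendns_originid on the Pi-hole host and again on a machine where OpenDNS works makes it easy to compare the two values. A result of 0 on the Pi-hole host is the case worth raising in a ticket.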

The stats cover the last 24 hours, but of course the queries could all have arrived in a very short burst, with only a few generated afterwards.
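
One way to tell a burst from a steady trickle is to count forwarded queries and REFUSED replies per minute in the dnsmasq log. The sketch below assumes the syslog-style line prefix shown in the first post ("Mar 15 22:13:13 dnsmasq[1174]: ..."); refused_per_minute is a hypothetical helper name.

```shell
# Count forwarded queries and REFUSED replies per minute in dnsmasq log lines.
# Reads the files given as arguments, or stdin if none are given.
# Assumption: fields are "Mon DD HH:MM:SS dnsmasq[pid]: message...".
refused_per_minute() {
  awk '
    { minute = $1 " " $2 " " substr($3, 1, 5) }          # e.g. "Mar 15 22:13"
    $5 == "forwarded"        { fwd[minute]++; seen[minute] = 1 }
    /reply error is REFUSED/ { ref[minute]++; seen[minute] = 1 }
    END {
      for (m in seen)
        printf "%s forwarded=%d refused=%d\n", m, fwd[m], ref[m]
    }
  ' "$@" | sort
}
```

For example, refused_per_minute /var/log/pihole.log, where the log path is an assumption and may differ on your install (a Docker container, for instance). A few minutes with very high forwarded counts just before the REFUSED replies start would point towards a rate limit.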


The debug log doesn't indicate a reason for the REFUSED. The only things I noticed are a mismatch between the current and the configured IP for eth0


*** [ DIAGNOSING ]: Networking
[✓] IPv4 address(es) bound to the eth0 interface:
   172.17.0.2/16 does not match the IP found

    IPV4_ADDRESS=192.168.1.51

and a failure of the web interface

    REV_SERVER=false
    INSTALL_WEB_SERVER=true
    INSTALL_WEB_INTERFACE=true

*** [ DIAGNOSING ]: Pi-hole processes
[✗] lighttpd daemon is inactive


   2021-03-14 08:41:07: (server.c.1464) server started (lighttpd/1.4.53) 
   2021-03-14 08:41:07: (gw_backend.c.476) unlink /var/run/lighttpd/php.socket-0 after connect failed: Connection refused 
   2021-03-15 08:41:07: (server.c.1464) server started (lighttpd/1.4.53) 
   2021-03-15 08:41:07: (gw_backend.c.476) unlink /var/run/lighttpd/php.socket-0 after connect failed: Connection refused 
   2021-03-15 15:38:28: (server.c.2059) server stopped by UID = 0 PID = 0 
   2021-03-15 21:46:02: (server.c.1464) server started (lighttpd/1.4.53) 
   2021-03-15 22:08:58: (server.c.1464) server started (lighttpd/1.4.53) 
   2021-03-15 22:08:58: (gw_backend.c.476) unlink /var/run/lighttpd/php.socket-0 after connect failed: Connection refused 
   2021-03-15 22:10:01: (server.c.1464) server started (lighttpd/1.4.53) 
   2021-03-15 22:10:01: (gw_backend.c.476) unlink /var/run/lighttpd/php.socket-0 after connect failed: Connection refused 
   2021-03-16 08:40:00: (server.c.2059) server stopped by UID = 0 PID = 0 
   2021-03-16 08:41:14: (server.c.1464) server started (lighttpd/1.4.53) 
   2021-03-17 08:41:08: (server.c.1464) server started (lighttpd/1.4.53) 
   2021-03-17 08:41:08: (gw_backend.c.476) unlink /var/run/lighttpd/php.socket-0 after connect failed: Connection refused 

I confirm that as soon as I switch back to OpenDNS as the upstream server, I get tons of REFUSED in my Pi-hole logs.

And I confirm that the roughly 45,000 requests do not happen in a short amount of time.

Here is the output of the requested command.

Care to share that modification back to the project?

I didn't open a PR as I wasn't sure if there was a desire for it beyond what I was doing for a few of us internally. I didn't do anything in the web UI to support this. Happy to make it an official PR (it likely needs to be rebased, since the last change pushed there is from 2018).

I'm seeing this a lot as well, is there any news from our insiders at Cisco? :slight_smile:

@tresni I see the Pi-hole developers have a good point that this should be sent to the dnsmasq mailing list. Did you already do this? I haven't seen it. Maybe it can still make it into 2.85 if you are fast. It would then already be included in the next FTL release (I feel this is close).

I have not yet. I will try to do it today. I have everything ready; I just need to make sure it applies cleanly against the dnsmasq source and then create a clean patch.

Patch has been submitted.
