Pi-hole says 'DNS Service not running' even though service is functional

Expected Behaviour:

Item 1 - Query Log - when I go to the query log and press 'show all' it should show all.

item 2 - pi-hole UI status banner (top left corner) - should show 'dns is running, temp:, etc'

Actual Behaviour:

item 1 - query log - when I go to the query log and press 'show all' it says 'an unknown error has occurred while loading the data'. Looking in /var/log/lighttpd/error.log shows the following:

2020-07-28 22:15:25: (mod_fastcgi.c.2543) FastCGI-stderr: PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 20480 bytes) in /var/www/html/admin/api_FTL.php on line 319

item 2 - pi-hole UI status banner (top left corner) - shows 'dns is not running' even though the service is most certainly running and operating normally.

netstat -nltup | grep 'Proto\|:53 \|:5053 \|:5353 \|:5335 \|:8953 \|:67 \|:80 \|:471'
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:4711          0.0.0.0:*               LISTEN      20946/pihole-FTL
tcp        0      0 192.168.1.4:80          0.0.0.0:*               LISTEN      17983/lighttpd
tcp        0      0 192.168.1.61:80         0.0.0.0:*               LISTEN      14315/docker-proxy
tcp        0      0 192.168.1.4:53          0.0.0.0:*               LISTEN      20946/pihole-FTL
tcp6       0      0 :::80                   :::*                    LISTEN      17983/lighttpd
udp        0      0 0.0.0.0:5353            0.0.0.0:*                           637/avahi-daemon: r
udp        0      0 192.168.1.4:53          0.0.0.0:*                           20946/pihole-FTL
udp        0      0 192.168.1.60:53         0.0.0.0:*                           14205/docker-proxy
udp6       0      0 :::5353                 :::*                                637/avahi-daemon: r

Debug Token:

https://tricorder.pi-hole.net/y2i099lnod

Querying "All" in Query Log (not Long-Term database!) should not exhaust PHP's memory. You can try to restart lighttpd and see if it helps.

Could be browser cache issue. Did you recently update your pihole installation? Have you tried to clear your browser's cache?

Yes, I have restarted the lighttpd service multiple times, as well as clearing the browser cache and restarting the browser itself (Firefox).

I recently upgraded from v4 to v5 a week or so ago I think? That's literally the only significant change in close to a year - just periodic 'pihole -up' sessions.

You could increase PHP memory size - but I'm still not sure why pihole needs so much for displaying the regular query log.

Yes, but look at the message - 'allowed memory size of 128mb exhausted (tried to allocate 20kb)'. That just seems off. lighttpd itself is only using 66mb of memory in my system as shown by pmap, so if it's only trying to grab 20kb to work with the 'show all' function of the query log, why is it erroring out and saying it exhausted the 128mb allotted for php?

I'm not sure if the memory used by lighttpd is technically even factored in when php is doing its thing, pretty sure it's its own chunk of memory just for php processing. So what else is in php memory to exhaust that 128mb when the 'show all' query log function only needs a small portion?

PHP has its own separate memory, defined by PHP's configuration files. 128mb is all that PHP is allowed with your configuration, and that's a hard limit. You could have 4 TB of RAM and PHP would only use 128mb; any attempt to use more would get that same error.

Edit: Clarification. PHP tried to allocate another 20kb (20480 bytes) for whatever function it needed it for. That allocation put it over the limit of 128mb and the request was denied. You'd need to increase the limit if you want to handle larger datasets.
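
For reference, the two numbers in the log line are in bytes. A quick sketch of the conversion in shell arithmetic (the variable names are just for illustration):

```shell
# Convert the byte counts from the lighttpd error log into friendlier units.
limit_bytes=134217728   # "Allowed memory size" from the log line
request_bytes=20480     # "tried to allocate" from the log line

limit_mib=$(( limit_bytes / 1024 / 1024 ))
request_kib=$(( request_bytes / 1024 ))

echo "limit:   ${limit_mib} MiB"
echo "request: ${request_kib} KiB"
```

So the failing request was only the last small allocation on top of roughly 128 MiB that the script was already holding.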

Cool, thanks. My system has 32gb memory available, so I can probably dial it up a decent bit, but before I change anything, I have a couple questions:

1 - is each launch of php by a process given 128mb or is it 128mb in total? Reason I ask is that I also have a Unifi Controller instance running on this system, could they possibly be colliding in memory use on php or will they both have their own 128mb chunk?

2 - If the answer to 1 is 'shared memory', is there a way to see the way php memory is currently allocated?

I don't know honestly. That's something that would need outside research.

Fair enough. Doubled the PHP memory limit to 256mb, bounced lighttpd and attempted a 'show all' on the query log again. It took a minute (almost literally) to give me the results, but it did it: all 130,000 results of the past 24hrs. Yikes.
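
For anyone finding this later, here is a rough sketch of that kind of change. It assumes GNU sed and an ini file that already has a `memory_limit` line; the helper name `set_memory_limit` is made up for this example, and it is demonstrated on a throwaway file rather than the real php.ini, whose path depends on the installed PHP version.

```shell
# set_memory_limit FILE VALUE
# Rewrites an existing "memory_limit = ..." line in FILE to the given VALUE.
# Assumes GNU sed (for -i without a backup suffix).
set_memory_limit() {
  local file="$1" value="$2"
  sed -i -E "s/^[[:space:]]*memory_limit[[:space:]]*=.*/memory_limit = ${value}/" "$file"
}

# Demonstration on a temporary file so nothing on the system is touched.
tmp_ini="$(mktemp)"
printf '%s\n' '; demo file' 'memory_limit = 128M' > "$tmp_ini"
set_memory_limit "$tmp_ini" 256M
grep '^memory_limit' "$tmp_ini"
```

On a real system you would point it at the php.ini used by the web server's PHP (with sudo) and restart lighttpd afterwards.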

I strongly believe you're not talking about the query log, but the long-term data/query log as you see results from the last 24h and not just the last 100 queries (as regular query log does show).

Out of interest: in what kind of environment are you using pihole? How many clients do you have? Seeing 130,000 queries in 24h is a lot. You might want to further increase the memory limit if you are ever going to query something beyond 24h.

It's a house with 3 tech savvy adults. So 3 smart phones, 3 desktop pc's, a number of laptops (work and personal), smart home integration with segregated IoT network (30-40 devices/sensors?), multiple video game consoles, smart appliances, etc.

Looking at the activity count though - the three most active clients are windows 10 systems (more than 26,000 queries each) that relate to my desktop, my work laptop, and my housemate's work laptop.

And yeah - the basic query log (last 100 entries) has never thrown an error, only the 'show all' which goes back however far that is designed to go. But the 'last 100 entries' has never really shown much beyond almost 'this instant' kind of history - so I've had to hit the 'show all' link at the top of the page fairly frequently.

Ah, my fault. I messed up the drop-down "All" with the close-by "show all".
But then it is clear why it ran out of memory...

:flushed:

Precisely. I have no clue why those three machines need to incessantly hit DNS queries like that over a 24hr period, but you can see why something like pi-hole would be super helpful :wink: I also have a docker lancache setup for things like game downloads, OS patching, etc. Currently it seems like doubling php memory helped and hasn't impacted the system it runs on any, so that's one problem down.

I'm still stumped on why the UI says 'dns service not running' though, it absolutely is otherwise the whole house would break.

Minor bump and update of topic subject since one issue has been resolved (doubled php memory to compensate for 100,000+ queries by my environment of 50+ clients)

'DNS Service not running' is reported in multiple places through pi-hole (gravity update, dashboard/UI in top left corner, etc). But DNS is running and functioning fine.

pi@ph5:~ $ grep -i 'DNS Service not running' -R /var/www/html/admin/
/var/www/html/admin/scripts/pi-hole/php/header.php:                            echo '<a id="status"><i class="fa fa-circle text-red"></i> DNS service not running</a>';
---
pi@ph5:~ $ less /var/www/html/admin/scripts/pi-hole/php/header.php
[..]
                    <p>Status</p>
                        <?php
                        $pistatus = exec('sudo pihole status web');
                        if ($pistatus == "1") {
                            echo '<a id="status"><i class="fa fa-circle text-green-light"></i> Active</a>';
                        } elseif ($pistatus == "0") {
                            echo '<a id="status"><i class="fa fa-circle text-red"></i> Offline</a>';
                        } elseif ($pistatus == "-1") {
                            echo '<a id="status"><i class="fa fa-circle text-red"></i> DNS service not running</a>';
                        } else {
                            echo '<a id="status"><i class="fa fa-circle text-orange"></i> Unknown</a>';
                        }
[..]

pi@ph5:~ $ which pihole
/usr/local/bin/pihole

pi@ph5:~ $ less /usr/local/bin/pihole
[..]
statusFunc() {
  # Determine if service is running on port 53 (Cr: https://superuser.com/a/806331)
  if (echo > /dev/tcp/127.0.0.1/53) >/dev/null 2>&1; then
    if [[ "${1}" != "web" ]]; then
      echo -e "  ${TICK} DNS service is running"
    fi
  else
    case "${1}" in
      "web") echo "-1";;
      *) echo -e "  ${CROSS} DNS service is NOT running";;
    esac
    return 0
  fi
[..]

Looks like it depends on results for below:

pi@ph5:~ $ echo > /dev/tcp/127.0.0.1/53
pi@ph5:~ $

Example with wrong port:

pi@ph5:~ $ echo > /dev/tcp/127.0.0.1/99999
-bash: connect: Connection refused
-bash: /dev/tcp/127.0.0.1/99999: Connection refused
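
The same probe can be wrapped up to compare addresses side by side. This is only a sketch: `port_open` is a made-up helper name, the 2 second timeout is an arbitrary choice, and 192.168.1.4 is the address from the netstat output earlier in the thread. Like `statusFunc`, it only tests TCP.

```shell
# port_open HOST PORT
# Returns 0 if a TCP connection to HOST:PORT succeeds within 2 seconds.
# Uses bash's /dev/tcp redirection, the same mechanism statusFunc relies on.
port_open() {
  local host="$1" port="$2"
  timeout 2 bash -c 'echo > "/dev/tcp/$1/$2"' _ "$host" "$port" >/dev/null 2>&1
}

# Compare loopback (what the status check probes) with the LAN address.
for addr in 127.0.0.1 192.168.1.4; do
  if port_open "$addr" 53; then
    echo "$addr:53 accepts TCP connections"
  else
    echo "$addr:53 does not accept TCP connections"
  fi
done
```

If the LAN address answers and 127.0.0.1 does not, DNS is up but the status check will still report it as not running.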

Is pihole-FTL listening on 127.0.0.1?

pi@ph5:~ $ sudo netstat -nltup | grep 'Proto\|:53 '
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:53              0.0.0.0:*               LISTEN      1663/pihole-FTL
tcp6       0      0 :::53                   :::*                    LISTEN      1663/pihole-FTL
udp        0      0 0.0.0.0:53              0.0.0.0:*                           1663/pihole-FTL
udp6       0      0 :::53                   :::*                                1663/pihole-FTL
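
To check this from saved netstat output rather than by eye, something like the sketch below could work. `covers_loopback` is a hypothetical helper; it only looks at IPv4 TCP lines, since the status probe is a TCP connect to 127.0.0.1, and the two sample lines are copied from the outputs in this thread.

```shell
# covers_loopback PORT
# Reads `netstat -nltu`-style lines on stdin; succeeds if a TCP listener on
# PORT is bound to 127.0.0.1 or to the IPv4 wildcard 0.0.0.0.
covers_loopback() {
  awk -v port="$1" '
    $1 == "tcp" && ($4 == ("127.0.0.1:" port) || $4 == ("0.0.0.0:" port)) { found = 1 }
    END { exit !found }
  '
}

wildcard='tcp        0      0 0.0.0.0:53              0.0.0.0:*               LISTEN      1663/pihole-FTL'
single='tcp        0      0 192.168.1.4:53          0.0.0.0:*               LISTEN      20946/pihole-FTL'

printf '%s\n' "$wildcard" | covers_loopback 53 && echo "wildcard bind: loopback probe can succeed"
printf '%s\n' "$single" | covers_loopback 53 || echo "single-address bind: loopback probe will fail"
```

On a live system the input would come from `sudo netstat -nltu` piped into the function.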

No firewall active?

pi@ph5:~ $ sudo iptables -nL
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

That is definitely the issue - pihole is checking for its DNS on 127.0.0.1 and I'm not running it there. The default config is for pihole to grab all interfaces as shown by the 0.0.0.0:53 listen, but I can't do that because I've got a docker lan-cache process running and the 0.0.0.0 style of pihole-FTL was interfering by grabbing port 53 on all the docker container IPs as well. So I told pi-hole to only bind to 192.168.1.4, which means it's no longer listening on 127.0.0.1.

So the options are either to have pihole-FTL check for itself on the IP:port combos that it's config'd for (seems like a bug/issue that it doesn't already pay attention to its own config to do this), or for me to specify a second bind address and tell it to listen on both 192.168.1.4:53 and 127.0.0.1:53. But given that the UI only has 'bind to everything' or 'bind to only this interface', I'm not sure how to have it function on more than one IP but less than all IPs.

Recommendations?

How did you do that exactly?

Back in 2018 I had a problem with pihole using all interfaces instead of just the single one (Pi-hole DNS runs on all interfaces/addresses rather than just eth0) and someone pointed me to a similar complaint/solution (Make pihole-FTL bind only on certain IPs [v4.0] - #6 by deHakkelaar) - but I haven't tried it on v5 yet. Honestly it didn't occur to me to re-try this solution, any v5 experts know if this will work?

That worked from that point until recently with the v5 upgrade. I'll have to scrub through my history or re-attempt searches to see if I can find the links I've already clicked on, but after the v5 upgrade I stumbled upon this command - pihole -a interface all - which was referenced as a toggle for the UI's 'listen on all/single' function that wasn't quite working right. That was great and everything was hunky dory for a while, then I patched further into v5 and it seems to have borked itself again.

So you recall that you tweaked some dnsmasq settings (and only those?), but do not exactly know which.

Let's take a look at your dnsmasq settings:

grep -nRv '^#\|^$' /etc/dnsmasq.d

That 02-pihole-enp0s31f6-only.conf file has a datestamp of August 2018, aligning directly with the forum post I made that was linked above. If I can specify a 127.0.0.1 in that listen-address line that would likely resolve the issue, perhaps?

/etc/dnsmasq.d/01-pihole.conf:22:addn-hosts=/etc/pihole/custom.list
/etc/dnsmasq.d/01-pihole.conf:25:localise-queries
/etc/dnsmasq.d/01-pihole.conf:28:no-resolv
/etc/dnsmasq.d/01-pihole.conf:32:cache-size=10000
/etc/dnsmasq.d/01-pihole.conf:34:log-queries
/etc/dnsmasq.d/01-pihole.conf:35:log-facility=/var/log/pihole.log
/etc/dnsmasq.d/01-pihole.conf:37:local-ttl=2
/etc/dnsmasq.d/01-pihole.conf:39:log-async
/etc/dnsmasq.d/01-pihole.conf:40:server=192.168.1.60#53
/etc/dnsmasq.d/01-pihole.conf:41:domain-needed
/etc/dnsmasq.d/01-pihole.conf:42:bogus-priv
/etc/dnsmasq.d/01-pihole.conf:43:local-service
/etc/dnsmasq.d/01-pihole.conf:44:server=/use-application-dns.net/
/etc/dnsmasq.d/02-lan.conf:1:addn-hosts=/etc/pihole/lan.list
/etc/dnsmasq.d/02-pihole-enp0s31f6-only.conf:1:listen-address=192.168.1.4
/etc/dnsmasq.d/02-pihole-enp0s31f6-only.conf:2:bind-interfaces
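
One possible way to try the 127.0.0.1 idea, untested on v5: as far as I know dnsmasq accepts more than one `listen-address` line, so a second line for loopback could sit next to the existing one. The helper below (`add_listen_address` is a made-up name) appends such a line only if it is missing, shown here on a temporary copy of the two-line file from the output above rather than on the real config.

```shell
# add_listen_address FILE ADDR
# Appends "listen-address=ADDR" to FILE unless that exact line already exists.
add_listen_address() {
  local file="$1" addr="$2"
  grep -qxF "listen-address=${addr}" "$file" || echo "listen-address=${addr}" >> "$file"
}

# Temporary stand-in for 02-pihole-enp0s31f6-only.conf.
tmp_conf="$(mktemp)"
printf '%s\n' 'listen-address=192.168.1.4' 'bind-interfaces' > "$tmp_conf"

add_listen_address "$tmp_conf" 127.0.0.1
add_listen_address "$tmp_conf" 127.0.0.1   # second call changes nothing
cat "$tmp_conf"
```

After making the equivalent change to the real file, pihole-FTL would need a restart (for example with `pihole restartdns`), and the `/dev/tcp/127.0.0.1/53` probe can be re-run to see whether the status banner should now turn green.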