Pihole "DNS Service Not running" and "FTL offline" after v5 installation

Problem with Beta 5.0:
Pihole "DNS Service Not running" and "FTL offline" no Internet connectivity

I upgraded my Pihole from v4 to v5 following the official guide.

Before this the Pihole was working.

During the install I hit multiple errors. I had to switch back to my router for DNS and DHCP, and I edited /etc/resolv.conf to restore Internet access on the Pihole device (a Debian PC).
I ran through the installation again and it now completes, but with some errors:
/usr/local/bin/pihole: line 118: service: command not found

When I run pihole -r and choose repair, it completes successfully.

Debug Token:
https://tricorder.pi-hole.net/hvk8zxgy0v

I found that the dnsmasq service wasn't installed. I'm not sure why it was removed when the v5 installation failed.

After installing it, the DNS service is running, but the FTL engine still appears offline.

When I run pihole -r it completes but there is an error:
[✗] /usr/local/bin/pihole: line 118: service: command not found

I found that dnsmasq and the FTL engine are both trying to use port 53, and FTL cannot start because the port is already in use. I killed the dnsmasq service and then started FTL, and the errors disappeared from the web interface.

After waiting and then refreshing the web interface, both errors appeared again, likely because dnsmasq wasn't restarted.

Please generate a fresh debug log, upload it and post the token here.

[✓] Your debug token is: https://tricorder.pi-hole.net/rbg2ecaabj

Your debug log shows dnsmasq running on port 53:

[53] is in use by dnsmasq (https://discourse.pi-hole.net/t/hardware-software-requirements/273#ports)
[53] is in use by dnsmasq (https://discourse.pi-hole.net/t/hardware-software-requirements/273#ports)

Let's confirm that first. You will then need to not only stop dnsmasq, but also kill it and likely remove it. Pi-hole has dnsmasq embedded in pihole-FTL, so a separate dnsmasq on the host is not required and causes problems, as you have discovered.

sudo netstat -nltup | grep 'Proto\|:53 \|:5053 \|:5353 \|:5335 \|:8953 \|:67 \|:80 \|:471'

root@SVR:/home/user# sudo netstat -nltup | grep 'Proto\|:53 \|:5053 \|:5353 \|:5335 \|:8953 \|:67 \|:80 \|:471'
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      2616/lighttpd       
tcp        0      0 0.0.0.0:53              0.0.0.0:*               LISTEN      564/dnsmasq         
tcp        0      0 127.0.0.1:5335          0.0.0.0:*               LISTEN      617/unbound         
tcp6       0      0 :::80                   :::*                    LISTEN      2616/lighttpd       
tcp6       0      0 :::53                   :::*                    LISTEN      564/dnsmasq         
udp        0      0 0.0.0.0:5353            0.0.0.0:*                           376/avahi-daemon: r 
udp        0      0 0.0.0.0:53              0.0.0.0:*                           564/dnsmasq         
udp        0      0 127.0.0.1:5335          0.0.0.0:*                           617/unbound         
udp6       0      0 :::5353                 :::*                                376/avahi-daemon: r 
udp6       0      0 :::53                   :::*                                564/dnsmasq
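To reduce a listing like the one above to just the programs bound to port 53, a small filter along these lines may help. This is only a sketch: the function name port53_owners is made up for illustration, and it assumes the column layout of `netstat -nltup` shown in this thread (local address in the fourth column, PID/Program name near the end).

```shell
# port53_owners (hypothetical helper): read `netstat -nltup` output on stdin
# and print the unique program names whose local address ends in ":53".
port53_owners() {
    awk '$4 ~ /:53$/ {
        # Find the "PID/Program" field and strip the leading "PID/".
        for (i = 5; i <= NF; i++) {
            if ($i ~ /^[0-9]+\//) {
                sub(/^[0-9]+\//, "", $i)
                print $i
                break
            }
        }
    }' | sort -u
}
```

Usage would be something like `sudo netstat -nltup | port53_owners`. Once the host dnsmasq is gone and FTL is healthy, the expectation is that only pihole-FTL is listed.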

sudo service dnsmasq stop

sudo systemctl disable dnsmasq

sudo apt-get remove dnsmasq-base

sudo service pihole-FTL start

I have run the commands and it's now back to the state it was in before I installed dnsmasq: DNS service not running and FTL offline.

[✓] Your debug token is: https://tricorder.pi-hole.net/fge2wfyp67

Please note I had to edit /etc/resolv.conf to include a DNS IP so that I could upload the debug log. I'm not sure if this affects the results.

Thanks for your support.
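For the temporary /etc/resolv.conf edit mentioned above, a minimal sketch of an idempotent way to add a resolver is shown here. The function name ensure_nameserver and the address 192.168.1.1 are illustrative assumptions, not taken from this thread; substitute the router or upstream DNS address that applies. On many systems the file is regenerated by the DHCP client or resolvconf, so the change may not persist.

```shell
# ensure_nameserver (hypothetical helper): append "nameserver IP" to a
# resolv.conf-style file unless that exact line is already present.
ensure_nameserver() {
    conf=$1
    ip=$2
    # -x matches the whole line, -F treats the pattern as a fixed string,
    # so the dots in the address are not interpreted as regex wildcards.
    if grep -qxF "nameserver $ip" "$conf" 2>/dev/null; then
        return 0
    fi
    printf 'nameserver %s\n' "$ip" >> "$conf"
}
```

Run as root, the call would look like `ensure_nameserver /etc/resolv.conf 192.168.1.1`, where the address is a placeholder.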

See if any errors show up:

sudo systemctl status --full --no-pager pihole-FTL

journalctl --no-pager -u pihole-FTL | tail -20

root@SVR:/home/user# sudo systemctl status --full --no-pager pihole-FTL
● pihole-FTL.service - LSB: pihole-FTL daemon
   Loaded: loaded (/etc/init.d/pihole-FTL; generated)
   Active: active (exited) since Thu 2020-04-30 07:56:16 BST; 5min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 367 ExecStart=/etc/init.d/pihole-FTL start (code=exited, status=0/SUCCESS)

Apr 30 07:55:51 SVR systemd[1]: Starting LSB: pihole-FTL daemon...
Apr 30 07:55:53 SVR pihole-FTL[367]: Not running
Apr 30 07:55:56 SVR su[425]: (to pihole) root on none
Apr 30 07:56:00 SVR su[425]: pam_unix(su:session): session opened for user pihole by (uid=0)
Apr 30 07:56:16 SVR pihole-FTL[367]: FTL started!
Apr 30 07:56:16 SVR systemd[1]: Started LSB: pihole-FTL daemon.
root@SVR:/home/user# journalctl --no-pager -u pihole-FTL | tail -20
-- Logs begin at Thu 2020-04-30 07:55:43 BST, end at Thu 2020-04-30 08:01:19 BST. --
Apr 30 07:55:51 SVR systemd[1]: Starting LSB: pihole-FTL daemon...
Apr 30 07:55:53 SVR pihole-FTL[367]: Not running
Apr 30 07:55:56 SVR su[425]: (to pihole) root on none
Apr 30 07:56:00 SVR su[425]: pam_unix(su:session): session opened for user pihole by (uid=0)
Apr 30 07:56:16 SVR pihole-FTL[367]: FTL started!
Apr 30 07:56:16 SVR systemd[1]: Started LSB: pihole-FTL daemon.
root@SVR:/home/user# 

I did some further testing and found that FTL starts and then crashes after some time:

root@SVR:/home/user# sudo /etc/init.d/pihole-FTL start
Not running

FTL started!

root@SVR:/home/user# sudo /etc/init.d/pihole-FTL status
[ ok ] pihole-FTL is running
root@SVR:/home/user# sudo /etc/init.d/pihole-FTL status
[ ok ] pihole-FTL is running
root@SVR:/home/user# sudo /etc/init.d/pihole-FTL status
[    ] pihole-FTL is not running
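Rather than re-running the status command by hand as above, the check can be polled until it fails. The following is a rough sketch, not a Pi-hole tool: wait_until_down is a hypothetical helper, and the suggested check `pidof pihole-FTL` assumes pidof is installed and that the process is named pihole-FTL (as the /usr/bin/pihole-FTL path in the logs suggests).

```shell
# wait_until_down (hypothetical helper): run CHECK_CMD every INTERVAL seconds,
# up to MAX_CHECKS times, and report when it first fails.
# Returns 0 if the check failed (service went down), 1 if it never did.
wait_until_down() {
    wud_cmd=$1
    wud_interval=$2
    wud_max=$3
    wud_i=0
    while [ "$wud_i" -lt "$wud_max" ]; do
        if ! sh -c "$wud_cmd" >/dev/null 2>&1; then
            echo "down after $wud_i successful checks"
            return 0
        fi
        wud_i=$((wud_i + 1))
        sleep "$wud_interval"
    done
    echo "still up after $wud_max checks"
    return 1
}
```

For example, `wait_until_down 'pidof pihole-FTL' 5 120` would poll every 5 seconds for up to 10 minutes, which gives a rough idea of how long FTL survives after a start.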

Can you have a look in /var/log/pihole-FTL.log to see why FTL stopped?
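One way to pull the relevant part out of that log is to print each crash banner together with the lines that precede it. This is a sketch under the assumption that grep supports the -B (before-context) option, as GNU grep on Debian does; the function name context_before_crash is made up for illustration, and the "FTL crashed!" marker is the banner text FTL writes to its log.

```shell
# context_before_crash (hypothetical helper): print every "FTL crashed!" line
# in LOGFILE together with the LINES lines before it (default 20).
context_before_crash() {
    cbc_log=$1
    cbc_lines=${2:-20}
    grep -B "$cbc_lines" 'FTL crashed!' "$cbc_log"
}
```

Called as `context_before_crash /var/log/pihole-FTL.log 40`, it shows what FTL logged immediately before each crash, which is usually the most useful part to post.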

It complains that gravity.db doesn't exist. It's a long log, but there's also a segfault, which I believe is caused by reading outside allocated memory:


[2020-04-30 09:51:34.403 2666] !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[2020-04-30 09:51:34.403 2666] ---------------------------->  FTL crashed!  <----------------------------
[2020-04-30 09:51:34.403 2666] !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[2020-04-30 09:51:34.403 2666] Please report a bug at https://github.com/pi-hole/FTL/issues
[2020-04-30 09:51:34.403 2666] and include in your report already the following details:
[2020-04-30 09:51:34.403 2666] FTL has been running for 13 seconds
[2020-04-30 09:51:34.403 2666] FTL branch: release/v5.0
[2020-04-30 09:51:34.403 2666] FTL version: vDev-5b0cfb5
[2020-04-30 09:51:34.403 2666] FTL commit: 5b0cfb5
[2020-04-30 09:51:34.403 2666] FTL date: 2020-04-28 21:05:08 +0200
[2020-04-30 09:51:34.403 2666] FTL user: started as pihole, ended as pihole
[2020-04-30 09:51:34.403 2666] Compiled for x86_64 (compiled on CI) using gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
[2020-04-30 09:51:34.403 2666] Received signal: Segmentation fault
[2020-04-30 09:51:34.403 2666]      at address: 0x10
[2020-04-30 09:51:34.403 2666]      with code: SEGV_MAPERR (Address not mapped to object)
[2020-04-30 09:51:34.403 2666] Backtrace:
[2020-04-30 09:51:34.404 2666] B[0000]: 0x560471605689, /usr/bin/pihole-FTL(+0x31689) [0x560471605689]
[2020-04-30 09:51:34.404 2666] B[0001]: 0x7f540044a730, /lib/x86_64-linux-gnu/libpthread.so.0(+0x12730) [0x7f540044a730]
[2020-04-30 09:51:34.404 2666] B[0002]: 0x5604715fe155, /usr/bin/pihole-FTL(in_whitelist+0xf5) [0x5604715fe155]
[2020-04-30 09:51:34.404 2666] B[0003]: 0x560471608e49, /usr/bin/pihole-FTL(+0x34e49) [0x560471608e49]
[2020-04-30 09:51:34.404 2666] B[0004]: 0x56047160a467, /usr/bin/pihole-FTL(_FTL_new_query+0x707) [0x56047160a467]
[2020-04-30 09:51:34.404 2666] B[0005]: 0x56047162410d, /usr/bin/pihole-FTL(receive_query+0xa6d) [0x56047162410d]
[2020-04-30 09:51:34.404 2666] B[0006]: 0x56047163a8ab, /usr/bin/pihole-FTL(+0x668ab) [0x56047163a8ab]
[2020-04-30 09:51:34.404 2666] B[0007]: 0x56047163c71c, /usr/bin/pihole-FTL(main_dnsmasq+0x129c) [0x56047163c71c]
[2020-04-30 09:51:34.404 2666] B[0008]: 0x5604715f88ac, /usr/bin/pihole-FTL(main+0xdc) [0x5604715f88ac]
[2020-04-30 09:51:34.404 2666] B[0009]: 0x7f540029b09b, /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb) [0x7f540029b09b]
[2020-04-30 09:51:34.404 2666] B[0010]: 0x5604715f8a1a, /usr/bin/pihole-FTL(_start+0x2a) [0x5604715f8a1a]
[2020-04-30 09:51:34.404 2666] ------ Listing content of directory /dev/shm ------
[2020-04-30 09:51:34.404 2666] File Mode User:Group  Filesize Filename
[2020-04-30 09:51:34.404 2666] rwxrwxrwx root:root 260 .
[2020-04-30 09:51:34.404 2666] rwxr-xr-x root:root 3K ..
[2020-04-30 09:51:34.405 2666] rw------- pihole:pihole 4K FTL-per-client-regex
[2020-04-30 09:51:34.405 2666] rw------- pihole:pihole 4K FTL-dns-cache
[2020-04-30 09:51:34.405 2666] rw------- pihole:pihole 12K FTL-overTime
[2020-04-30 09:51:34.405 2666] rw------- pihole:pihole 262K FTL-queries
[2020-04-30 09:51:34.405 2666] rw------- pihole:pihole 4K FTL-upstreams
[2020-04-30 09:51:34.405 2666] rw------- pihole:pihole 20K FTL-clients
[2020-04-30 09:51:34.405 2666] rw------- pihole:pihole 98K FTL-domains
[2020-04-30 09:51:34.405 2666] rw------- pihole:pihole 4K FTL-strings
[2020-04-30 09:51:34.406 2666] rw------- pihole:pihole 12 FTL-settings
[2020-04-30 09:51:34.406 2666] rw------- pihole:pihole 124 FTL-counters
[2020-04-30 09:51:34.406 2666] rw------- pihole:pihole 48 FTL-lock
[2020-04-30 09:51:34.406 2666] ---------------------------------------------------
[2020-04-30 09:51:34.406 2666] Thank you for helping us to improve our FTL engine!
[2020-04-30 09:51:34.406 2666] FTL terminated!

I updated Pihole with pihole -up and the core updated.

After restarting, the issue is similar: DNS and FTL are offline. When I start FTL manually (sudo /etc/init.d/pihole-FTL start), it shows as active for a while but then crashes again with a different error:

Apr 30 11:17:53 dnsmasq[560]: read /etc/pihole/gravity.list - 1 addresses
Apr 30 11:17:53 dnsmasq[560]: query[SOA] local from 127.0.0.1
Apr 30 11:33:34 dnsmasq[2148]: started, version pi-hole-2.81 cachesize 10000
Apr 30 11:33:34 dnsmasq[2148]: compile time options: IPv6 GNU-getopt no-DBus no-UBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth DNSSEC loop-detect inotify dumpfile
Apr 30 11:33:34 dnsmasq[2148]: using only locally-known addresses for domain use-application-dns.net
Apr 30 11:33:34 dnsmasq[2148]: using nameserver 208.67.220.220#53
Apr 30 11:33:34 dnsmasq[2148]: using nameserver 208.67.222.222#53
Apr 30 11:33:34 dnsmasq[2148]: read /etc/hosts - 5 addresses
Apr 30 11:33:34 dnsmasq[2148]: read /etc/pihole/local.list - 2 addresses
Apr 30 11:33:34 dnsmasq[2148]: failed to load names from /etc/pihole/black.list: No such file or directory
Apr 30 11:33:34 dnsmasq[2148]: bad address at /etc/pihole/gravity.list line 2

I decided to uninstall Pihole because I've been unable to fix the issue.

Neither of these is a fatal error, and they were not the reason for the crash you've seen.

That is unfortunate, as it leaves us unable to track down the bug. Note that you were running a beta version of the code, and we explicitly said:

Please do not run this if you are not comfortable with digging into any issues that may arise. That said, we would like to have some support in making sure we have every imaginable configuration covered before release. Pi-hole can already do so much, it is almost impossible to test all features ourselves properly.

And, again, please use the “Beta 5.0” Category on our Discourse Forum to discuss the beta/report any findings. We’ll be there to give help and update the beta quickly in case you find any errors.

By "quickly", however, we obviously cannot mean within a few hours, as we are all volunteers.

One last thing: do you still have the log? What were the lines above the crash report?

You said it complains that gravity.db doesn't exist. What was this error precisely, and did it occur immediately above the crash report?

I still have the log, I will try and recover it from the machine soon.

I had some other issues with the machine when uninstalling Pihole. When I ran apt-get autoremove, it removed core packages and I lost most functionality. I'm looking to reimage it once I've got the log.

From memory, the log basically said that gravity.db doesn't exist or can't be opened, and yes, this did happen just before the log posted above.

There were around 82k lines where it was trying to read from the list and stating that each entry was invalid.

I had this too: everything worked, but the FTL and DNS services would randomly stop after a period of time.

I fixed it by running a command to check out the latest v5 FTL (I forget the exact command; it's in another topic somewhere), and since then FTL and DNS have been rock solid.

EDIT: found it. It's:

pihole checkout ftl release/v5.0

Since running this, my FTL hasn't crashed.

Unfortunately I wasn't able to recover the log file.

After the uninstall, I turned off the PC and it wouldn't turn on again. I spent most of the weekend diagnosing it and found the PSU was faulty. I replaced it, but when I started Debian I couldn't run startx, and it would have been a lot of extra work to recover the log file.

I've since installed Ubuntu Server and installed the Pihole v5 beta following the same procedure, this time on a clean install, and it's working fine so far. I think the issue was likely caused by some old settings conflicting during the upgrade.

Thanks for the software and also the help, I appreciate it
