FTL failed to start since last update

My FTL service has been failing ever since this update.
Ubuntu Server 20.4.1
"Failed to start LSB: pihole-FTL daemon"

https://tricorder.pi-hole.net/teqjteu0zj

Please generate a debug log with pihole -d and post the token here.

https://tricorder.pi-hole.net/teqjteu0zj

If it matters, I am also using unbound, all setup with the directions provided by Pi-Hole.net
This all has worked fine for a long time, only the last few days have sucked.
I tried a full reinstall of the server and pi-hole, and that seemed to fix it last night, but this morning it was not working again. I was strongly considering switching off Ubuntu server, but im not sure any other options would be better.

Please run

pihole restartdns

and post the output of

sudo service pihole-FTL status

as well as

cat /var/log/pihole-FTL.log

I am reasonably certain that failure at this point is Unbound, not pi-hole. I tried a complete reinstall and when I got to the part of installing unbound, I get this during the installation:

==========Screen Output==============
Created symlink /etc/systemd/system/multi-user.target.wants/unbound.service → /lib/systemd/system/unbound.service.
Created symlink /etc/systemd/system/unbound.service.wants/unbound-resolvconf.service → /lib/systemd/system/unbound-resolvconf.service
.
Job for unbound.service failed because the control process exited with error code.
See "systemctl status unbound.service" and "journalctl -xe" for details.
Job for unbound.service failed because the control process exited with error code.
See "systemctl status unbound.service" and "journalctl -xe" for details.
invoke-rc.d: initscript unbound, action "start" failed.
● unbound.service - Unbound DNS server

  • Loaded: loaded (/lib/systemd/system/unbound.service; enabled; vendor preset: enabled)*
    
  • Active: activating (auto-restart) (Result: exit-code) since Mon 2021-01-25 20:37:44 UTC; 7ms ago*
    
  •   Docs: man:unbound(8)*
    
  • Process: 2369 ExecStartPre=/usr/lib/unbound/package-helper chroot_setup (code=exited, status=0/SUCCESS)*
  • Process: 2372 ExecStartPre=/usr/lib/unbound/package-helper root_trust_anchor_update (code=exited, status=0/SUCCESS)*
  • Process: 2376 ExecStart=/usr/sbin/unbound -d $DAEMON_OPTS (code=exited, status=1/FAILURE)*
  • Main PID: 2376 (code=exited, status=1/FAILURE)*
    Processing triggers for systemd (245.4-4ubuntu3.4) ...
    Processing triggers for man-db (2.9.1-1) ...
    Processing triggers for libc-bin (2.31-0ubuntu9.1) ...
    ==========End Screen Output==============

I can continue the configuration and it will work briefly, and then FTL will fail again.

I am going to file a bug report for unbound.

I posted below previously:

[1611608043] unbound[2716:0] notice: Start of unbound 1.9.4.
[1611608043] unbound[2716:0] debug: increased limit(open files) from 1024 to 4152
[1611608043] unbound[2716:0] debug: creating udp6 socket ::1 53
[1611608043] unbound[2716:0] debug: creating tcp6 socket ::1 53
[1611608043] unbound[2716:0] error: can't bind socket: Address already in use for ::1 port 53 (len 28)
[1611608043] unbound[2716:0] fatal error: could not open ports

Pi-hole is already running on port 53:

pi@ph5:~ $ sudo netstat -nltup | grep 'Proto\|:53 \|:5335 '
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:53              0.0.0.0:*               LISTEN      2207/pihole-FTL
tcp        0      0 127.0.0.1:5335          0.0.0.0:*               LISTEN      16839/unbound
tcp6       0      0 :::53                   :::*                    LISTEN      2207/pihole-FTL
udp        0      0 0.0.0.0:53              0.0.0.0:*                           2207/pihole-FTL
udp        0      0 127.0.0.1:5335          0.0.0.0:*                           16839/unbound
udp6       0      0 :::53                   :::*                                2207/pihole-FTL

You probably made a configuration mistake somewhere.
Follow below guide exactly and in particular the "port: 5335" configuration bit:

https://docs.pi-hole.net/guides/dns/unbound/

So, I have done a complete reinstall, checked all the config, and its working for now, however I expect this to fail again tomorrow. I got it working last night and it failed this morning. Once it fails, I will continue the other troubleshooting steps with pi-hole as previously recommended.

Well this morning, you most likely lost that port: 5335 directive somehow.
Thats not a thing thats expected to happen on its own.
Keep an eye out on below one if outputs any:

sudo zgrep -i voltage /var/log/syslog*

So, I am checking "cat /var/log/pihole-FTL.log" and its growing rapidly and massively:

[2021-01-25 21:17:24.889 880M] Resizing "FTL-queries" from 473956352 to (7409664 * 64) == 474218496 (/dev/shm: 474.3MB used, 8.3GB total)
[2021-01-25 21:17:25.355 880M] Resizing "FTL-queries" from 474218496 to (7413760 * 64) == 474480640 (/dev/shm: 474.6MB used, 8.3GB total)
[2021-01-25 21:17:25.822 880M] Resizing "FTL-queries" from 474480640 to (7417856 * 64) == 474742784 (/dev/shm: 474.8MB used, 8.3GB total)
[2021-01-25 21:17:26.295 880M] Resizing "FTL-queries" from 474742784 to (7421952 * 64) == 475004928 (/dev/shm: 475.1MB used, 8.3GB total)
[2021-01-25 21:17:26.822 880M] Resizing "FTL-queries" from 475004928 to (7426048 * 64) == 475267072 (/dev/shm: 475.3MB used, 8.3GB total)
[2021-01-25 21:17:27.289 880M] Resizing "FTL-queries" from 475267072 to (7430144 * 64) == 475529216 (/dev/shm: 475.6MB used, 8.3GB total)
[2021-01-25 21:17:27.759 880M] Resizing "FTL-queries" from 475529216 to (7434240 * 64) == 475791360 (/dev/shm: 475.9MB used, 8.3GB total)
[2021-01-25 21:17:28.227 880M] Resizing "FTL-queries" from 475791360 to (7438336 * 64) == 476053504 (/dev/shm: 476.1MB used, 8.3GB total)
[2021-01-25 21:17:28.702 880M] Resizing "FTL-queries" from 476053504 to (7442432 * 64) == 476315648 (/dev/shm: 476.4MB used, 8.3GB total)
[2021-01-25 21:17:29.176 880M] Resizing "FTL-queries" from 476315648 to (7446528 * 64) == 476577792 (/dev/shm: 476.7MB used, 8.3GB total)
[2021-01-25 21:17:29.649 880M] Resizing "FTL-queries" from 476577792 to (7450624 * 64) == 476839936 (/dev/shm: 476.9MB used, 8.3GB total)
[2021-01-25 21:17:30.119 880M] Resizing "FTL-queries" from 476839936 to (7454720 * 64) == 477102080 (/dev/shm: 477.2MB used, 8.3GB total)
[2021-01-25 21:17:30.590 880M] Resizing "FTL-queries" from 477102080 to (7458816 * 64) == 477364224 (/dev/shm: 477.4MB used, 8.3GB total)
[2021-01-25 21:17:31.063 880M] Resizing "FTL-queries" from 477364224 to (7462912 * 64) == 477626368 (/dev/shm: 477.7MB used, 8.3GB total)
[2021-01-25 21:17:31.539 880M] Resizing "FTL-queries" from 477626368 to (7467008 * 64) == 477888512 (/dev/shm: 478.0MB used, 8.3GB total)
[2021-01-25 21:17:32.009 880M] Resizing "FTL-queries" from 477888512 to (7471104 * 64) == 478150656 (/dev/shm: 478.2MB used, 8.3GB total)
[2021-01-25 21:17:32.485 880M] Resizing "FTL-queries" from 478150656 to (7475200 * 64) == 478412800 (/dev/shm: 478.5MB used, 8.3GB total)
[2021-01-25 21:17:32.968 880M] Resizing "FTL-queries" from 478412800 to (7479296 * 64) == 478674944 (/dev/shm: 478.8MB used, 8.3GB total)
[2021-01-25 21:17:33.426 880M] Resizing "FTL-queries" from 478674944 to (7483392 * 64) == 478937088 (/dev/shm: 479.0MB used, 8.3GB total)

Could this be why it worked after a reinstall and failed in the morning?

output is blank/non-existent

● pihole-FTL.service - LSB: pihole-FTL daemon
Loaded: loaded (/etc/init.d/pihole-FTL; generated)
Active: active (exited) since Mon 2021-01-25 21:03:51 UTC; 16min ago
Docs: man:systemd-sysv-generator(8)
Process: 815 ExecStart=/etc/init.d/pihole-FTL start (code=exited, status=0/SUCCESS)

Jan 25 21:03:50 paganpihole systemd[1]: Starting LSB: pihole-FTL daemon...
Jan 25 21:03:50 paganpihole pihole-FTL[815]: Not running
Jan 25 21:03:50 paganpihole su[869]: (to pihole) root on none
Jan 25 21:03:50 paganpihole su[869]: pam_unix(su:session): session opened for user pihole by (uid=0)
Jan 25 21:03:51 paganpihole su[869]: pam_unix(su:session): session closed for user pihole
Jan 25 21:03:51 paganpihole systemd[1]: Started LSB: pihole-FTL daemon.

Again, just FYI, its up and running for the moment after another clean reinstall.

You most likely have created a DNS forwarding loop somewhere causing high numbers of queries because queries got trapped in a loop.
Do you have conditional forwarding (CF) enabled in Pi-hole ?
Try disable this one for diagnosing.
And tell us what IP is configured in the router WAN/Internet DNS settings, is it that of Pi-hole thus potentially creating a loop if CF is active ?

EDIT: Try default all your router DNS settings and do exactly below to distribute the Pi-hole IP as single DNS server to all your clients through DHCP:

https://docs.pi-hole.net/main/post-install/

Okay, CF was active and causing a forwarding loop. I disabled that, restarted, and now FTL has failed again.

journalctl -u pihole-FTL | tail -20

?

EDIT: One more:

pihole-FTL --test

Okay, here is what I have done so far;

  1. I have turned off CF
  2. cleared the DB
  3. restarted all services (working).
    I also made one change to my ASUS router:
    Excessive requests for dns.msftncsi.com - #5 by DanSchaper

I know this is probably not a good approach for the ASUS router, and I may have to reverse it later, but we shall see for now. Everything is functional, and we shall see if it stays up and running.

Warning if want to use the Asus router to advertise the Pi-hole IP through DHCP:

So, this is no longer an issue for me, it has not failed again since yesterday. I have no idea why it failed in the first place or why I had to complete a full reinstall twice to get it working and stable.

Because the trapped forever looping queries caused by the CF option put the host under heavy strain.
Memory and other resources run out and if no memory, processes cant function like intended/designed.