Pihole stopped serving DNS requests

Please follow the below template, it will help us to help you!

Expected Behaviour:

pihole services DNS reqests

Actual Behaviour:

First users connecting got no DNS requests answered.
Hard booted the Pihole server to restore service.
I have no expectation of locating the issue, but noticed this in the pihole-FTL log.

[2020-01-10 09:34:06.293 679] Resizing "/FTL-strings" from 65536 to 69632
[2020-01-10 09:34:06.293 679] FATAL: realloc_shm(): Failed to open shared memory
 object "/FTL-strings": Permission denied
[2020-01-10 12:09:27.060 684] Using log file /var/log/pihole-FTL.log
[2020-01-10 12:09:27.062 684] ########## FTL started! ##########

That silence from 09:34 until 12:09 when I reloaded the pihole server looks ominous.

Does that error give any hint as to the cause.?

Debug Token:

https://tricorder.pi-hole.net/wwr26xrhzo

Maybe...
If you do you run that on an RPi, you'll regularly see time jumps in your logs. RPis are missing a proper RTC, the fake hw-clock will load a last known timestamp from a file on boot and use that until time is synced from the network again.

Likely, you were suffering an outage from somewhere shortly after 9:34 until you rebooted at 12:09. Without further data, you can't be sure about this. Your observed time gap may or may not be the result of such a time-sync update.

As to the error, a bit of a wide shot, but since I've seen you experimenting with dnsmasq and pihole-FTL in other topics on this forum:
If you'd run pihole-FTL under a different user previously, and if that run somehow not terminated correctly, it could leave shared memory locks that Pi-hole's user is not able to remove.

Getting rid of those locks by running the following command on your Pi-hole machine could help you fixing this:

sudo rm /dev/shm/FTL*

Restart pihole-FTL after that.

Graph on the restarted pihole server agrees with the logfile.

Local network has a Pi based tier 1 NTP service. Namepi is synced to it. Holds time to within 100 nanoseconds or so.

Namepi has been up and running for some time, probably ten days or so, with no issues up til now.

namepi is dedicated to providing name service, so other than housekeeping, there should be nothing competing with pihole-FTL, other than the other two name servers upstream of pihole-FTL.

As far as I know any scheduled activities such as backup run between 02:00 and 05:00.

pihole-FTL usually runs as user pihole. Can't prove it was before the reboot, but it certainly is now. pihole user was set up by the install script I guess, no idea its properties.

netadmin@namepi:/var/log $ ps aux | grep pihole-FTL 
pihole    1690  0.2  1.8  72876 18172 ?        Sl   12:22   0:08 /usr/bin/pihole-FTL
netadmin  4952  0.0  0.0   4368   560 pts/0    S+   13:28   0:00 grep --color=auto pihole-FTL
netadmin@namepi:/var/log $

Doubt any of the original shm config config survived the hard boot it got just after 12 :slight_smile:

I'm pretty sure that the issue was pihole-FTL locking up, thinking back, before the reboot the admin display did not have any entries in the boxes across the top and the graphs did not display.

Unless that permission issue in the log means something to somebody, I guess this will stay a mystery.

Harry

Apologies - my time-sync remark was aimed at explaining the time gap you observe, not trying to suggest a cause for your outage.
(I am going to rephrase that sentence above).

Maybe a developer could add his insights if a mere failed FTL-String resize could bring Pi-hole to a complete halt.

You are quite probably right about that :wink:

Looks pretty fatal :slight_smile:

Yep, if FTL can't write to the shared memory where everything is stored then bad things happen.

This could be a one time glitch due to a power problem or other interruption, or it could be an early sign that you are having system or hardware problems.

All those are possibilities. Sort of doubtful it's a power glitch in the incoming supply, there are at least six other Pi's on the same feed.

PSU is a standard Pi foundation unit, as are all the other pi's.

Only thing that comes to mind is that that Pi is using a 16g microSD. Possibly space issue?

If it happens again, is there any data I can capture before I reload it? Any shm diagnostics I can pre load so I can get more data if I see another incident.

Harry

netadmin@namepi:/dev/shm $ ls -al
total 1236
drwxrwxrwt  2 root   root       220 Jan 10 12:22 .
drwxr-xr-x 14 root   root      3420 Jan 10 12:09 ..
-rw-------  1 pihole pihole  323584 Jan 10 12:22 FTL-clients
-rw-------  1 pihole pihole     108 Jan 10 12:22 FTL-counters
-rw-------  1 pihole pihole   65536 Jan 10 12:54 FTL-domains
-rw-------  1 pihole pihole   12288 Jan 10 12:22 FTL-forwarded
-rw-------  1 pihole pihole      28 Jan 10 19:09 FTL-lock
-rw-------  1 pihole pihole   53248 Jan 10 12:22 FTL-overTime
-rw-------  1 pihole pihole 1179648 Jan 10 19:09 FTL-queries
-rw-------  1 pihole pihole      12 Jan 10 12:22 FTL-settings
-rw-------  1 pihole pihole   28672 Jan 10 12:58 FTL-strings
netadmin@namepi:/dev/shm $

Looks reasonable to my view

Harry

The files were regenerated when you restarted the FTL daemon.

Looks like you are running with FTL in debug mode, you can enable SHM debugging as well at the expense of more verbosity in logging.

https://docs.pi-hole.net/ftldns/configfile/

netadmin@namepi:/etc/pihole $ cat pihole-FTL.conf 
PRIVACYLEVEL=0
RESOLVE_IPV6=no
AAAA_QUERY_ANALYSIS=no
DEBUG_NETWORKING=true
DEBUG_SHMEM=true
netadmin@namepi:
Jan 11 11:53:01 namepi pihole-FTL[15244]: FTL started!
Jan 11 11:53:01 namepi systemd[1]: Started LSB: pihole-FTL daemon.
netadmin@namepi:
*** [ DIAGNOSING ]: contents of /dev/shm
-rw------- 1 pihole pihole 323584 Jan 11 11:53 /dev/shm/FTL-clients
-rw------- 1 pihole pihole 108 Jan 11 11:53 /dev/shm/FTL-counters
-rw------- 1 pihole pihole 65536 Jan 11 11:53 /dev/shm/FTL-domains
-rw------- 1 pihole pihole 12288 Jan 11 11:53 /dev/shm/FTL-forwarded
-rw------- 1 pihole pihole 28 Jan 11 11:53 /dev/shm/FTL-lock
-rw------- 1 pihole pihole 53248 Jan 11 11:53 /dev/shm/FTL-overTime
-rw------- 1 pihole pihole 1179648 Jan 11 11:55 /dev/shm/FTL-queries
-rw------- 1 pihole pihole 12 Jan 11 11:53 /dev/shm/FTL-settings
-rw------- 1 pihole pihole 24576 Jan 11 11:53 /dev/shm/FTL-strings

Harry

You need to watch the FTL log:

-rw-r--r-- 1 pihole pihole 13980 Jan 10 12:38 /var/log/pihole-FTL.log
   -----head of pihole-FTL.log------
   [2020-01-10 09:34:06.293 679] Resizing "/FTL-strings" from 65536 to 69632
   [2020-01-10 09:34:06.293 679] FATAL: realloc_shm(): Failed to open shared memory object "/FTL-strings": Permission denied
   [2020-01-10 12:09:27.060 684] Using log file /var/log/pihole-FTL.log
   [2020-01-10 12:09:27.062 684] ########## FTL started! ##########
   [2020-01-10 12:09:27.062 684] FTL branch: master
   [2020-01-10 12:09:27.062 684] FTL version: v4.3.1
   [2020-01-10 12:09:27.062 684] FTL commit: b60d63f
   [2020-01-10 12:09:27.062 684] FTL date: 2019-05-25 21:37:26 +0200
   [2020-01-10 12:09:27.062 684] FTL user: pihole
   [2020-01-10 12:09:27.062 684] Starting config file parsing (/etc/pihole/pihole-FTL.conf)
   [2020-01-10 12:09:27.062 684]    SOCKET_LISTENING: only local
   [2020-01-10 12:09:27.062 684]    AAAA_QUERY_ANALYSIS: Hide AAAA queries
   [2020-01-10 12:09:27.062 684]    MAXDBDAYS: max age for stored queries is 365 days
   [2020-01-10 12:09:27.062 684]    RESOLVE_IPV6: Don't resolve IPv6 addresses
   [2020-01-10 12:09:27.062 684]    RESOLVE_IPV4: Resolve IPv4 addresses
   [2020-01-10 12:09:27.062 684]    DBINTERVAL: saving to DB file every minute
   [2020-01-10 12:09:27.062 684]    DBFILE: Using /etc/pihole/pihole-FTL.db
   [2020-01-10 12:09:27.063 684]    MAXLOGAGE: Importing up to 24.0 hours of log data
   [2020-01-10 12:09:27.063 684]    PRIVACYLEVEL: Set to 0
   [2020-01-10 12:09:27.063 684]    IGNORE_LOCALHOST: Show queries from localhost
   [2020-01-10 12:09:27.063 684]    BLOCKINGMODE: Null IPs for blocked domains
   [2020-01-10 12:09:27.063 684]    ANALYZE_ONLY_A_AND_AAAA: Disabled. Analyzing all queries
   [2020-01-10 12:09:27.063 684]    DBIMPORT: Importing history from database
   [2020-01-10 12:09:27.063 684]    PIDFILE: Using /var/run/pihole-FTL.pid
   [2020-01-10 12:09:27.063 684]    PORTFILE: Using /var/run/pihole-FTL.port
   [2020-01-10 12:09:27.063 684]    SOCKETFILE: Using /var/run/pihole/FTL.sock
   [2020-01-10 12:09:27.063 684]    WHITELISTFILE: Using /etc/pihole/whitelist.txt
   [2020-01-10 12:09:27.063 684]    BLACKLISTFILE: Using /etc/pihole/black.list
   [2020-01-10 12:09:27.063 684]    GRAVITYFILE: Using /etc/pihole/gravity.list
   [2020-01-10 12:09:27.063 684]    REGEXLISTFILE: Using /etc/pihole/regex.list
   [2020-01-10 12:09:27.063 684]    SETUPVARSFILE: Using /etc/pihole/setupVars.conf
   [2020-01-10 12:09:27.063 684]    AUDITLISTFILE: Using /etc/pihole/auditlog.list
   [2020-01-10 12:09:27.063 684]    MACVENDORDB: Using /etc/pihole/macvendor.db
   [2020-01-10 12:09:27.063 684]    PARSE_ARP_CACHE: Active
   [2020-01-10 12:09:27.064 684] ************************

   -----tail of pihole-FTL.log------
   [2020-01-10 12:22:37.686 1688] Successfully accessed setupVars.conf
   [2020-01-10 12:22:37.691 1690] PID of FTL process: 1690
   [2020-01-10 12:22:37.692 1690] Listening on port 4711 for incoming IPv4 telnet connections
   [2020-01-10 12:22:37.692 1690] Interface lo is IPv4
   [2020-01-10 12:22:37.692 1690] Interface eth0 is IPv4
   [2020-01-10 12:22:37.692 1690] Interface dummy0 is IPv4
   [2020-01-10 12:22:37.692 1690] Interface dummy1 is IPv4
   [2020-01-10 12:22:37.692 1690] Interface lo is IPv4
   [2020-01-10 12:22:37.692 1690] Interface eth0 is IPv4
   [2020-01-10 12:22:37.692 1690] Interface dummy0 is IPv4
   [2020-01-10 12:22:37.692 1690] Interface dummy1 is IPv4
   [2020-01-10 12:22:37.692 1690] Found 8 IPv4 and 0 IPv6 capable interfaces
   [2020-01-10 12:22:37.693 1690] Listening on Unix socket
   [2020-01-10 12:22:37.694 1690] Received SIGHUP, reloading cache
   [2020-01-10 12:22:37.694 1690] Blocking status is enabled
   [2020-01-10 12:22:37.694 1690] INFO: No whitelist file found
   [2020-01-10 12:22:37.694 1690] Compiled 0 Regex filters and 0 whitelisted domains in 0.1 msec (0 errors)
   [2020-01-10 12:22:37.694 1690] ************************
   [2020-01-10 12:22:37.694 1690] * Debugging enabled    *
   [2020-01-10 12:22:37.694 1690] * DEBUG_DATABASE   NO  *
   [2020-01-10 12:22:37.694 1690] * DEBUG_NETWORKING YES *
   [2020-01-10 12:22:37.694 1690] * DEBUG_LOCKS      NO  *
   [2020-01-10 12:22:37.694 1690] * DEBUG_QUERIES    NO  *
   [2020-01-10 12:22:37.694 1690] * DEBUG_FLAGS      NO  *
   [2020-01-10 12:22:37.694 1690] * DEBUG_SHMEM      NO  *
   [2020-01-10 12:22:37.694 1690] * DEBUG_GC         NO  *
   [2020-01-10 12:22:37.695 1690] * DEBUG_ARP        NO  *
   [2020-01-10 12:22:37.695 1690] * DEBUG_REGEX      NO  *
   [2020-01-10 12:22:37.695 1690] * DEBUG_API        NO  *
   [2020-01-10 12:22:37.695 1690] * DEBUG_OVERTIME   NO  *
   [2020-01-10 12:22:37.695 1690] * DEBUG_EXTBLOCKED NO  *
   [2020-01-10 12:22:37.695 1690] * DEBUG_CAPS       NO  *
   [2020-01-10 12:22:37.695 1690] ************************
   [2020-01-10 12:22:39.050 1690] /etc/pihole/gravity.list: parsed 113801 domains (took 1355.0 ms)
   [2020-01-10 12:38:10.697 1690] Resizing "/FTL-queries" from 983040 to 1179648
1 Like

I have remote syslog on this server, was looking through my syslog server and came across syslog entries covering this crunch. Not sure if they are of any use in tracking this, but possibly, so posted here:

The initial log is for a DHCP refresh. Think the next logs are chrontab driven pihole-FTL activiies.

Jan 10 09:26:59 namepi systemd-timesyncd[315]: Network configuration changed, trying to establish connection.
Jan 10 09:26:59 namepi systemd-timesyncd[315]: Synchronized to time server 192.168.101.120:123 (192.168.101.120).
Jan 10 09:30:01 namepi CRON[18667]: pam_unix(cron:session): session opened for user root by (uid=0)
Jan 10 09:30:01 namepi CRON[18671]: (root) CMD (   PATH="$PATH:/usr/local/bin/" pihole updatechecker local)
Jan 10 09:30:02 namepi CRON[18667]: pam_unix(cron:session): session closed for user root
Jan 10 09:34:06 namepi systemd-logind[377]: Removed session c2.
Jan 10 09:34:06 namepi systemd[1]: Stopping User Manager for UID 999...
Jan 10 09:34:06 namepi systemd[668]: Stopped target Default.
Jan 10 09:34:06 namepi systemd[668]: Stopped target Basic System.
Jan 10 09:34:06 namepi systemd[668]: Stopped target Paths.
Jan 10 09:34:06 namepi systemd[668]: Stopped target Timers.
Jan 10 09:34:06 namepi systemd[668]: Stopped target Sockets.
Jan 10 09:34:06 namepi systemd[668]: Closed GnuPG cryptographic agent (access for web browsers).
Jan 10 09:34:06 namepi systemd[668]: Closed GnuPG cryptographic agent (ssh-agent emulation).
Jan 10 09:34:06 namepi systemd[668]: Closed D-Bus User Message Bus Socket.
Jan 10 09:34:06 namepi systemd[668]: Closed GnuPG cryptographic agent and passphrase cache.
Jan 10 09:34:06 namepi systemd[668]: Closed GnuPG cryptographic agent and passphrase cache (restricted).
Jan 10 09:34:06 namepi systemd[668]: Reached target Shutdown.
Jan 10 09:34:06 namepi systemd[668]: Starting Exit the Session...
Jan 10 09:34:06 namepi systemd[668]: Received SIGRTMIN+24 from PID 18691 (kill).
Jan 10 09:34:06 namepi systemd: pam_unix(systemd-user:session): session closed for user pihole
Jan 10 09:34:06 namepi systemd[1]: Stopped User Manager for UID 999.
Jan 10 09:34:06 namepi systemd[1]: Removed slice User Slice of pihole.
Jan 10 09:39:01 namepi CRON[18701]: pam_unix(cron:session): session opened for user root by (uid=0)
Jan 10 09:39:01 namepi CRON[18705]: (root) CMD (  [ -x /usr/lib/php/sessionclean ] && if [ ! -d /run/systemd/system ]; then /usr/lib/php/sessionclean; fi)
Jan 10 09:39:01 namepi CRON[18701]: pam_unix(cron:session): session closed for user root
Jan 10 09:39:25 namepi systemd[1]: Starting Clean php session files...
Jan 10 09:39:25 namepi systemd[1]: Started Clean php session files.
Jan 10 09:40:01 namepi CRON[18754]: pam_unix(cron:session): session opened for user root by (uid=0)
Jan 10 09:40:01 namepi CRON[18758]: (root) CMD (   PATH="$PATH:/usr/local/bin/" pihole updatechecker local)
Jan 10 09:40:01 namepi CRON[18754]: pam_unix(cron:session): session closed for user root
Jan 10 09:50:01 namepi CRON[18785]: pam_unix(cron:session): session opened for user root by (uid=0)
Jan 10 09:50:01 namepi CRON[18789]: (root) CMD (   PATH="$PATH:/usr/local/bin/" pihole updatechecker local)
Jan 10 09:50:01 namepi CRON[18785]: pam_unix(cron:session): session closed for user root

Harry

As fate would have it, I have just experienced an outage on my RPi Zero today myself. I initiated a reboot at about 10:45.

My syslog does seem quite different from yours, but interestingly enough, they share a line :

Jan 14 09:21:03 piname rngd[519]: stats: HRNG source speed: (min=92.649; avg=460.828; max=837.209)Kibits/s
Jan 14 09:21:03 piname rngd[519]: stats: FIPS tests speed: (min=730.359; avg=4001.054; max=6353.692)Kibit/s
Jan 14 09:21:03 piname rngd[519]: stats: Lowest ready-buffers level: 2
Jan 14 09:21:03 piname rngd[519]: stats: Entropy starvations: 0
Jan 14 09:21:03 piname rngd[519]: stats: Time spent starving for entropy: (min=0; avg=0.000; max=0)us
=> Jan 14 09:30:01 piname CRON[17740]: (root) CMD (   PATH="$PATH:/usr/local/bin/" pihole updatechecker local)
Jan 14 09:17:16 piname fake-hwclock[88]: Tue 14 Jan 08:17:02 UTC 2020
Jan 14 09:17:16 piname systemd-fsck[103]: e2fsck 1.43.4 (31-Jan-2017)
Jan 14 09:17:16 piname systemd-fsck[103]: rootfs: clean, 42526/221792 files, 342204/939008 blocks
Jan 14 09:17:16 piname systemd[1]: Started Create Static Device Nodes in /dev.
Jan 14 09:17:16 piname kernel: [    0.000000] Booting Linux on physical CPU 0x0
(...)
Jan 14 10:47:56 piname systemd[637]: Time has been changed

It would seem that Pi-hole stopped working somewhere after 09:30 after intiating
pihole updatechecker local.

Are there any further logs I could take a look at?

/var/log/pihole-FTL.log is where I found the "Fatal" log.

Harry

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.