FTL-queries outgrowing /dev/shm

foolishtacos · November 14, 2024, 4:16am

Expected Behaviour:

Query log should not be allowed to outgrow /dev/shm.

Actual Behaviour:

Query log is allowed to outgrow /dev/shm, leading to an FTL crash loop.

Debug Token:

FTL-queries resize errors just before crash:

WARNING: RAM shortage (/dev/shm) ahead: 95% is used (/dev/shm: 63.9MB used, 67.1MB total, FTL uses 63.8MB)
[2024-11-13 19:02:59.374 2642787M] add_message(type=7, message=/dev/shm) - SQL error step DELETE: database is locked
[2024-11-13 19:02:59.374 2642787M] Error while trying to close database: database is locked
[2024-11-13 19:02:59.388 2642787M] Resizing "FTL-queries" from 63537152 to (1138688 * 56) == 63766528 (/dev/shm: 64.1MB used, 67.1MB total, FTL uses 64.1MB)

I don't have a DNS loop, and I haven't changed any Pi-hole or DNS-related config in months, if not years. This morning the error above started repeating, each time followed by a crash. After I flushed the logs for the last 24 hours, the issue went away.

I think what happened is that my internet service went down for maintenance overnight, and during the outage all my devices must have gone into a rage and filled up the log space. Unfortunately, I didn't think to grab stats or logs before I flushed them.

Knowing that being offline does spike the DNS queries on my network, it got me thinking: should the management of FTL logs handle the out-of-space scenario somehow, so this doesn't repeat whenever the WAN is down for long enough? Given the option, I would opt for dropping the earliest entries when /dev/shm runs out of space; writing to disk might be bad if a loop were the cause.

rdwebdesign · November 14, 2024, 5:23am

Note:

Logs are stored in /var/log/pihole, not in /dev/shm.
Other files are stored in /dev/shm.
Also, /dev/shm is not on disk. It is actually held in RAM.
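
This can be checked on a given machine by asking for the filesystem type behind the path. A minimal sketch, assuming a Linux system with GNU coreutils `stat`; `fs_type_of` is a made-up helper name, not a Pi-hole command:

```shell
#!/bin/sh
# fs_type_of is a hypothetical helper, not part of Pi-hole.
# Prints the filesystem type backing a path (assumes GNU coreutils stat).
fs_type_of() {
    stat -f -c %T "$1"
}

# On most Linux systems this reports a RAM-backed type such as tmpfs.
if [ -d /dev/shm ]; then
    fs_type_of /dev/shm
fi

# Size and current usage of the mount, in human-readable units.
df -h /dev/shm 2>/dev/null || true
```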

foolishtacos · November 14, 2024, 7:54am

Yes, I should have been more precise. I assume /dev/shm/FTL-queries is some sort of representation of queries that have occurred. In any event, it's outgrowing /dev/shm and causing a crash loop. Is my concern about this happening due to external factors reasonable, and do you have thoughts as to mitigation?

Bucking_Horn · November 14, 2024, 8:24am

In-memory usage is directly proportional to the number of DNS requests Pi-hole has processed over the course of the last 24 hours.
An overall value of 64MB is quite heavy: my own Pi-hole shows a total between 2MB and 4MB at 5,000 to 10,000 DNS requests a day, of which FTL-queries claims 500KB to 1MB.

This would suggest that you are seeing an excessive amount of DNS requests.
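
To put rough numbers on that, the resize message from the first post can be turned into query counts. This is only a sketch: it assumes each query occupies 56 bytes, as the `(1138688 * 56)` term in that message suggests, and that the 67.1MB total reported for /dev/shm is 64 MiB. `slots_for_bytes` is a made-up helper name.

```shell
#!/bin/sh
# Assumption: 56 bytes per query slot, read off the "(1138688 * 56)" term
# in the resize message quoted in the first post.
BYTES_PER_QUERY=56

# slots_for_bytes is a hypothetical helper: whole query slots in N bytes.
slots_for_bytes() {
    echo $(( $1 / BYTES_PER_QUERY ))
}

# Size FTL tried to grow FTL-queries to just before the crash.
echo "slots at crash: $(slots_for_bytes 63766528)"
# prints: slots at crash: 1138688

# Step between the old and new size in the same log line.
echo "slots added per resize: $(( (63766528 - 63537152) / BYTES_PER_QUERY ))"
# prints: slots added per resize: 4096

# Upper bound if FTL-queries had a 64 MiB /dev/shm entirely to itself.
echo "slots in 64 MiB: $(slots_for_bytes $(( 64 * 1024 * 1024 )))"
# prints: slots in 64 MiB: 1198372
```

Under those assumptions, FTL had room for about 1.1 million queries when it crashed, which is two orders of magnitude above the 5,000 to 10,000 requests a day mentioned above.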

As Pi-hole is on the receiving end, and you've ruled out a DNS loop, you'd have to find and tune down the offending client issuing those requests.

Usually, the following commands could help with that:

echo ">stats >quit" | nc localhost 4711

echo ">top-clients >quit" | nc localhost 4711

But your debug log suggests that your long-term database had no data for the past 24 hours:

*** [ DIAGNOSING ]: contents of /var/log/pihole

   -----tail of FTL.log------
   [2024-11-13 19:21:09.117 2643969M] Imported 0 queries from the long-term database
   [2024-11-13 19:21:09.117 2643969M]  -> Total DNS queries: 0
   [2024-11-13 19:21:09.117 2643969M]  -> Cached DNS queries: 0
   [2024-11-13 19:21:09.117 2643969M]  -> Forwarded DNS queries: 0
   [2024-11-13 19:21:09.117 2643969M]  -> Blocked DNS queries: 0
   [2024-11-13 19:21:09.117 2643969M]  -> Unknown DNS queries: 0
   [2024-11-13 19:21:09.117 2643969M]  -> Unique domains: 0
   [2024-11-13 19:21:09.117 2643969M]  -> Unique clients: 0
   [2024-11-13 19:21:09.117 2643969M]  -> Known forward destinations: 0

Similarly, your current shm consumption looks normal:

*** [ DIAGNOSING ]: contents of /dev/shm
total 744K
-rw------- 1 pihole pihole  84K Nov 13 19:24 FTL-clients
-rw------- 1 pihole pihole  248 Nov 13 19:21 FTL-counters
-rw------- 1 pihole pihole  12K Nov 13 19:50 FTL-dns-cache
-rw------- 1 pihole pihole  12K Nov 13 19:39 FTL-domains
-rw------- 1 pihole pihole   88 Nov 13 19:21 FTL-lock
-rw------- 1 pihole pihole 8.0K Nov 13 19:21 FTL-overTime
-rw------- 1 pihole pihole 4.0K Nov 13 19:21 FTL-per-client-regex
-rw------- 1 pihole pihole 224K Nov 13 20:04 FTL-queries
-rw------- 1 pihole pihole   16 Nov 13 19:21 FTL-settings
-rw------- 1 pihole pihole  80K Nov 13 19:57 FTL-strings
-rw------- 1 pihole pihole 308K Nov 13 19:21 FTL-upstreams

This means you'd have to wait for the issue to reoccur before running those commands.
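
One way to avoid missing that window is to run a small check periodically and save the output of the two commands once /dev/shm starts filling up. The following is a rough sketch, not an official Pi-hole tool: the 80% threshold, the output directory and all function names are arbitrary choices for illustration, and it assumes `nc` is installed and the telnet API on port 4711 answers as in the commands above.

```shell
#!/bin/sh
# Hypothetical helper script, not part of Pi-hole. Meant to be run
# periodically (e.g. from cron) so that client statistics are captured
# while /dev/shm is filling up, before FTL crashes.

THRESHOLD=${THRESHOLD:-80}          # percent of /dev/shm in use (arbitrary)
OUTDIR=${OUTDIR:-/tmp/ftl-capture}  # where snapshots are written (assumed path)

# Extract the "Capacity" column (without the % sign) from `df -P` output
# read on stdin: a header line, then one line for the mount.
parse_df_percent() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Current usage of /dev/shm in percent, or empty if df fails.
shm_used_percent() {
    df -P /dev/shm 2>/dev/null | parse_df_percent
}

# Run the two commands from this thread and save their output.
capture_stats() {
    mkdir -p "$OUTDIR" || return 1
    stamp=$(date +%Y%m%d-%H%M%S)
    echo ">stats >quit"       | nc localhost 4711 > "$OUTDIR/stats-$stamp.txt"
    echo ">top-clients >quit" | nc localhost 4711 > "$OUTDIR/top-clients-$stamp.txt"
}

# Only capture once usage has reached the threshold.
used=$(shm_used_percent)
if [ -n "$used" ] && [ "$used" -ge "$THRESHOLD" ]; then
    capture_stats
fi
```

Scheduled every few minutes, this would leave timestamped snapshots in the output directory that can be inspected after the next crash. The snapshots are written outside /dev/shm on purpose, so the capture does not add to the shortage.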

system · December 5, 2024, 8:25am

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.