I recently got a 512 MB RAM VPS (CentOS) to run Pi-hole on. It installed with no issues, however a block list of 3M domains resulted in a tremendous amount of RAM being used:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
pihole 2638 12.4 63.6 553272 333820 ? Sl 13:50 0:09 /usr/bin/pihole-FTL
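For anyone reading the numbers above: procps `ps` reports VSZ and RSS in KiB, not bytes. A minimal sketch that converts the line above to MiB, assuming the standard `ps aux` column order (the helper name `ps_memory_mib` is made up for illustration):

```python
def ps_memory_mib(ps_line):
    """Return (VSZ, RSS) in MiB from one `ps aux`-style line.

    Assumes the column order USER PID %CPU %MEM VSZ RSS ...,
    with VSZ and RSS given in KiB as procps reports them.
    """
    fields = ps_line.split()
    vsz_kib, rss_kib = int(fields[4]), int(fields[5])
    return vsz_kib / 1024, rss_kib / 1024


line = "pihole 2638 12.4 63.6 553272 333820 ? Sl 13:50 0:09 /usr/bin/pihole-FTL"
vsz_mib, rss_mib = ps_memory_mib(line)
print(f"VSZ {vsz_mib:.0f} MiB, RSS {rss_mib:.0f} MiB")
```

RSS is the figure that reflects physical RAM actually held by the process; VSZ also counts address space that was reserved but never touched.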
I've looked through the forums, and noticed in "High CPU, High RAM, not working, dnsmasq issues? - Solved" that the behavior is mentioned as "expected", due to duplicate cache entries. I tried the NXDOMAIN change, however that only dropped the usage by about 0.6% (I already have IPv6 disabled). Is this RAM usage solely related to FTL creating an internal cache of ALL hosts/domains from the blacklists? If so, this unfortunately makes no sense to me, especially when running on a VPS, since the DNS request rates are low enough to use SQLite/Mongo and avoid filling RAM. Is there any way to change the FTL "backend storage" to perhaps SQLite (or some other sane storage), instead of storing in RAM?
P.S. I've been a pi-hole fan for a while, and this is my first VPS install. I certainly appreciate all the effort that has been put into it, and can donate dev time if needed (Perl/Bash mainly)
Without knowing what each number represents in that output, it is hard to tell how much RAM FTL is using. From a guess, it looks like maybe 553272, which probably represents the number of bytes, so only half a megabyte. Half a megabyte is nothing to worry about, especially when you have 1,000 times that amount available. Again, this is guesswork as you have not provided enough information to know how much memory FTL is actually using.
As for whether FTL can use a database instead of RAM for the cache, the TL;DR answer is no. Basically, FTL injects the blocked domains into dnsmasq's cache, which is not controlled by FTL and is stored in memory. Changing how the cache works would require a lot of changes and would make updating the dnsmasq version very hard. In addition, the performance of the cache would suffer greatly, as it would have to talk to a database instead of doing a straightforward in-memory lookup.
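To make the trade-off concrete, here is a minimal sketch of what a database-backed block list lookup could look like. This is not how FTL or dnsmasq work; the table name `blocked` and both helper functions are hypothetical, and the demo uses an in-memory SQLite database only so that it runs anywhere (an on-disk file path would be used to actually keep the list out of RAM).

```python
import sqlite3


def build_blocklist(conn, domains):
    """Create the (hypothetical) `blocked` table and load domains into it."""
    conn.execute("CREATE TABLE IF NOT EXISTS blocked (domain TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT OR IGNORE INTO blocked VALUES (?)", ((d,) for d in domains)
    )
    conn.commit()


def is_blocked(conn, domain):
    """Indexed point lookup: one query per DNS request."""
    row = conn.execute(
        "SELECT 1 FROM blocked WHERE domain = ?", (domain,)
    ).fetchone()
    return row is not None


# ":memory:" keeps the demo self-contained; pass a file path for on-disk storage.
conn = sqlite3.connect(":memory:")
build_blocklist(conn, ["ads.example.com", "tracker.example.net"])
print(is_blocked(conn, "ads.example.com"))
print(is_blocked(conn, "example.org"))
```

Each lookup here costs a query against the primary-key index rather than a hash lookup in process memory, which is the performance cost being described; whether that cost matters depends on the query rate.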
Sorry about that, I forgot to add the 'ps' header. I updated it now. That was a regular ps output, and VSZ was at 540MB while RSS was at 325MB (both are represented in kB). After reducing the lists down some more:
Thanks for the quick rundown on why it doesn't support an alternate cache storage engine. Sounds like a cache hack with no interoperability.
However, I strongly disagree in terms of cache performance degradation. Redis, for example, would probably work best since it can manage both RAM and persistent cache limits, and can easily handle 70000+ queries/second on a single core. I highly doubt anyone goes over 70k DNS queries/second using Pi-hole as-is, to the point where that cap would even make a difference. If that were the case, they likely have good enough hardware to handle the load, and Redis performance would match their needs as much as resources allow.
Personally, I use my Pi-hole VPS for home use and for securing my kids' phones using DoT (nginx reverse proxy to Pi-hole, which also runs the web UI). The average home user (and I'm sure others using Pi-hole on servers with large lists) would gladly trade off a few nanoseconds while Redis (or whatever cache storage backend) is called for a host lookup, versus bloating the RAM as-is. All I'm saying is it can be made a whole lot more lightweight by just changing the cache storage method, and it's certainly something that should be considered for future releases (along with out-of-the-box support for serving DoT).
That only applies to cache that can be freed without issues... I don't need other services (like VPN) OOM'ing due to RAM bloat. If the sole purpose of this instance was to run FTL with 3-4M domains, then yeah, I could see how having them all loaded in RAM could come in handy. The problem is that this usage doesn't follow standard malloc practices, and the unused memory is never released. For all I know I could have 1M domains that never get queried, just taking up RAM for no reason, versus keeping 1M frequently queried domains in memory, which would rightfully take up space.
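The idea of keeping only frequently queried domains in memory can be sketched as a bounded least-recently-used cache in front of a slower store. This is an illustration only, not something FTL or dnsmasq provide; the class `BoundedLookupCache` and the stand-in `slow_lookup` function are hypothetical.

```python
from collections import OrderedDict


class BoundedLookupCache:
    """Keep at most `max_entries` lookup results, evicting the least recently used."""

    def __init__(self, backend_lookup, max_entries):
        self._lookup = backend_lookup   # slower store, e.g. a database query
        self._max = max_entries
        self._entries = OrderedDict()   # domain -> bool, oldest first

    def is_blocked(self, domain):
        if domain in self._entries:
            self._entries.move_to_end(domain)   # mark as recently used
            return self._entries[domain]
        result = self._lookup(domain)
        self._entries[domain] = result
        if len(self._entries) > self._max:
            self._entries.popitem(last=False)   # drop the least recently used
        return result

    def __contains__(self, domain):
        return domain in self._entries

    def __len__(self):
        return len(self._entries)


blocklist = {"ads.example.com", "tracker.example.net"}
backend_calls = []


def slow_lookup(domain):
    """Stand-in for the slow backend; records each call it receives."""
    backend_calls.append(domain)
    return domain in blocklist


cache = BoundedLookupCache(slow_lookup, max_entries=2)
cache.is_blocked("ads.example.com")       # miss, goes to the backend
cache.is_blocked("ads.example.com")       # hit, served from memory
cache.is_blocked("example.org")           # miss
cache.is_blocked("tracker.example.net")   # miss, evicts ads.example.com
```

Memory use is then capped by `max_entries` rather than by the size of the block list, at the cost of a backend round trip on every miss.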
Since a change like that would involve the dnsmasq base code, I think you will have more success asking the author and developers of dnsmasq why or why not to use cache. If they agree that cache is something of value, then it will be included with FTL. Personally, I think it's complex enough as it is without adding in even more complexity like a Redis engine or memcached.