If I compare memory stats before and after restart of FTL I see this:
pi@pi5:~$ free
               total        used        free      shared  buff/cache   available
Mem:         8128200     5624908      216192       15332     2498632     2503292
Swap:        2097148         512     2096636
pi@pi5:~$ sudo service pihole-FTL stop
pi@pi5:~$ free
               total        used        free      shared  buff/cache   available
Mem:         8128200     2335220     3510496        3300     2481988     5792980
Swap:        2097148         512     2096636
pi@pi5:~$ sudo service pihole-FTL start
pi@pi5:~$ free
               total        used        free      shared  buff/cache   available
Mem:         8128200     2444984     3381772       12648     2510500     5683216
Swap:        2097148         512     2096636
pi@pi5:~$
So over 3 GB is used by FTL. The thing is, it keeps growing over the course of several days. I haven't let it run long enough to see what happens, so I don't know whether it will bring the system down. This behavior doesn't happen on the development branch.
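For reference, the "over 3 GB" figure can be reproduced from the two "used" values in the free output above. This is only a small sketch: the helper name used_delta_gib is made up for illustration, and it assumes the numbers are KiB, which is what plain free prints by default.

```shell
# Difference in the "used" column between two `free` snapshots.
# used_delta_gib is a hypothetical helper, not part of Pi-hole or procps.
# $1 = used before stopping FTL, $2 = used after stopping FTL (both in KiB).
used_delta_gib() {
    LC_ALL=C awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.2f\n", (before - after) / 1024 / 1024 }'
}

# Values copied from the transcript above.
used_delta_gib 5624908 2335220   # prints 3.14
```

Note that this is the drop in system-wide "used" memory, so it only approximates what FTL itself was holding.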
The extra processes are probably dedicated TCP workers. Please run something like
grep 2076641 /var/log/pihole/pihole.log
to see if there is anything related in the log file. They should terminate themselves after a short timeout; I will try to reproduce why they don't. The memory htop claims they are using isn't actually used. Linux uses a method called copy-on-write (COW), which ensures that additional processes (that are copies of another) do not need to duplicate the memory when they are forked.
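One way to see the effect of COW without relying on htop or btop is to compare RSS with PSS for each process. RSS counts shared pages in full for every process, while PSS divides each shared page among the processes sharing it. The sketch below assumes a kernel that provides /proc/<pid>/smaps_rollup (Linux 4.14 or newer); the helper name rss_pss_kb is made up for illustration.

```shell
# Print "RSS_kB PSS_kB" from a smaps_rollup-style file.
# rss_pss_kb is a hypothetical helper name.
rss_pss_kb() {
    awk '$1 == "Rss:" { rss = $2 }
         $1 == "Pss:" { pss = $2 }
         END { print rss + 0, pss + 0 }' "$1"
}

# Possible usage (reading another user's process usually needs root):
#   for pid in $(pgrep -x pihole-FTL); do
#       printf '%s: ' "$pid"; sudo cat "/proc/$pid/smaps_rollup" > /tmp/rollup
#       rss_pss_kb /tmp/rollup
#   done
```

If the forks really share most of their pages with the parent, each of them should show a PSS far below its RSS.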
Your htop and btop screenshots are in conflict with each other. Were they taken at the same time?
I checked the related code changes again and found nothing that would justify any difference with regard to development. Could you create two new screenshots at (roughly) the same time from htop and btop so I can compare them?
I restarted Pi-hole a short time ago, so the memory issue has not had time to develop yet. Would you like screenshots now, or should I wait until the memory consumption grows?
When we look at htop now, we see 1.7% for the entire FTL process, which is about 1.7% * 7.75 GB ≈ 132 MB. This agrees well with the memory btop claims each of the pihole-FTL processes is using. This suggests btop is incorrectly handling the COW principle I talked about above. It seems we have found a btop bug.
Also, have a look at the sum of used memory at the systemd process (2.3G) and then at the total used memory which is more than 10% less. Another indication that something isn't right here. Yes, I know, you may say btop's MemB is RSS and - as such - inaccurate, however, "used" memory also includes memory used by the kernel, modules and, e.g. shared memory. Hence, the real difference between the shown 1.99 GB and the sum of the memory used by all the processes under systemd will in reality be even larger.
Just out of curiosity: Which version of btop are you running? My local version is v1.2.13 and I do not seem affected by this at first glance (compare 1 and 2, especially the total sum of systemd at the top of 2):
I have a regex that redirects clients to Pi-hole for any NTP DNS requests, because it seems many IOT devices don't respect some DHCP settings such as DNS and NTP. I will turn that off as well so clients can still get the time.
NTP server disabled and FTL restarted. Total system memory consumption is down from 3.1 to 1.7 GB according to htop. I'll watch and see if those processes show up again.
Okay, thanks for the feedback. I still cannot really explain what is happening here, but it seems there is some strange issue with the forks not terminating as they should. When they end up in some zombie-state and the overall FTL process moves on, even COW eventually causes memory to be wasted.
Please update to get my latest changes to the custom branch. I simply removed the forking altogether because it shouldn't really be needed. The expected output of pihole-FTL --hash on this branch after the update is a76e7918.
Please re-enable the server so we can see if the constant memory eating is now absent or if this has created additional trouble of any kind (I don't expect any but be prepared for the unexpected):
Some background and observations (prior to updating)...
I have two RPis on my network running Pi-hole in a failover setup.
My router DHCP setup includes NTP addresses for both Pi-holes.
I use a regex to rewrite DNS NTP requests back to the active Pi-hole to force IOT devices to use local NTP services provided by Pi-hole.
I have one IOT client that "spams" the NTP server at least every 20 seconds.
The backup Pi-hole was running the tweak/ntp_errors branch with little traffic going to it, and only one zombie process was present after ~10-12 hours.
Next, I changed the regex on the primary to send NTP requests to the backup and changed the backup to the dev branch. Now it was receiving all network NTP traffic. No zombie processes were noted in 12 hours.
I switched the backup to the tweak/ntp_errors branch and observed two zombie processes in 10-12 hours.
It looks like the problem has something to do with load and many NTP requests.
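To make counts like these easier to compare between branches, the leftover processes can be tallied by state from ps output. This is a rough sketch: count_states is a made-up helper name, and it assumes the lingering forks keep the command name pihole-FTL. A true zombie shows state Z, while a fork that is merely stuck would more likely show S.

```shell
# Count processes named $1 per state letter, reading
# `ps -eo pid,ppid,stat,comm` style rows on stdin.
# count_states is a hypothetical helper name.
count_states() {
    awk -v name="$1" '
        $4 == name { n[substr($3, 1, 1)]++ }
        END { for (s in n) print s, n[s] }
    ' | sort
}

# Possible usage:
#   ps -eo pid,ppid,stat,comm | count_states pihole-FTL
```

Running it every few hours on both Pi-holes would show whether the number of leftover processes tracks the NTP request load.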