Pi-hole v6 on low-spec hardware

Good morning

I've been running Pi-hole v5 for quite some time now on old Pogoplug hardware. This hardware has an ARM CPU with 2 cores and only (!) 128 MB of memory. Ever since I switched to v6, people in my household have started complaining about "slow Internet", which boils down to slower DNS, so this weekend I started the plug running v5 again to see how it compares.

After running for a few days I see this in htop:

    1[|||                                                                                                                                1.9%]   Tasks: 28, 26 thr; 1 running
    2[||||||||                                                                                                                           5.1%]   Load average: 0.00 0.00 0.00
  Mem[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||46.5M/117M]   Uptime: 11 days, 16:42:47
  Swp[|||                                                                                                                         26.5M/1.50G]   Time: 07:22:22
    Hostname: pogo02

On the other plug, after fresh boot, same hardware, I can see this:

    1[||||||||||                                                                                                                           6.5%] Tasks: 16, 19 thr, 50 kthr; 1 running
    2[|||                                                                                                                                  1.9%] Load average: 0.48 0.50 0.37
Mem[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||76.9M/113M] Uptime: 00:12:30
Swp[|||||||||||                                                                                                                    103M/1.50G] Time: 07:39:40
Hostname: pogo01

I am guessing memory usage is the problem here, causing performance degradation over time. Obviously FTL is now doing a lot more than it used to, since it serves a whole new API as well as the admin pages, which causes higher CPU/memory usage.
I realise this is a good thing and the way forward.
I would still prefer not to retire my old hardware, though, and I was wondering how to make FTL run as light as possible, maybe by somehow disabling the API, the admin interface, ...
Thanks.

Is there even an operating system that still receives security updates for the Pogoplug?

As unpleasant as this reply might be: this may simply be too little nowadays. It's not only features that are evolving quickly, hardware is too. While Pi-hole still commits to running on low-performance hardware, 128 MB may just not be enough anymore.

I guess the major increase in memory usage comes from the in-memory database, but please also provide the output of

ls -lh /dev/shm/FTL-*

Yes, these Pogoplugs are still "supported", well, kind of. There is a community:

that is working on getting Debian or any other Linux distro on exotic hardware, such as these plugs.
Anyway, I have two of them. The other one is still running Pi-hole v5; this one was running an NFS server and has now been converted to run Pi-hole v6.
I was going to provide the output of the ls command you asked for, but Pi-hole crashed again last night (please see the other thread). I could find this in the logs, though:

2024-09-16 22:56:39.614 CEST [1129/T1141] INFO: ------ Listing content of directory /dev/shm ------
2024-09-16 22:56:39.615 CEST [1129/T1141] INFO: File Mode User:Group      Size  Filename
2024-09-16 22:56:39.624 CEST [1129/T1141] INFO: rwxrwxrwx root:root       300   .
2024-09-16 22:56:39.625 CEST [1129/T1141] INFO: rwxr-xr-x root:root         3K  ..
2024-09-16 22:56:39.626 CEST [1129/T1141] INFO: rw------- pihole:pihole   560K  FTL-fifo-log
2024-09-16 22:56:39.627 CEST [1129/T1141] INFO: rw------- pihole:pihole     4K  FTL-per-client-regex
2024-09-16 22:56:39.628 CEST [1129/T1141] INFO: rw------- pihole:pihole   229K  FTL-dns-cache
2024-09-16 22:56:39.629 CEST [1129/T1141] INFO: rw------- pihole:pihole    12K  FTL-overTime
2024-09-16 22:56:39.630 CEST [1129/T1141] INFO: rw------- pihole:pihole     2M  FTL-queries
2024-09-16 22:56:39.631 CEST [1129/T1141] INFO: rw------- pihole:pihole    12K  FTL-upstreams
2024-09-16 22:56:39.633 CEST [1129/T1141] INFO: rw------- pihole:pihole   168K  FTL-clients
2024-09-16 22:56:39.633 CEST [1129/T1141] INFO: rw------- pihole:pihole   152K  FTL-domains
2024-09-16 22:56:39.634 CEST [1129/T1141] INFO: rw------- pihole:pihole   123K  FTL-strings
2024-09-16 22:56:39.636 CEST [1129/T1141] INFO: rw------- pihole:pihole   140   FTL-settings
2024-09-16 22:56:39.637 CEST [1129/T1141] INFO: rw------- pihole:pihole   296   FTL-counters
2024-09-16 22:56:39.638 CEST [1129/T1141] INFO: rw------- pihole:pihole    56   FTL-lock
2024-09-16 22:56:39.638 CEST [1129/T1141] INFO: rw-r--r-- root:root         0   .tmpfs
2024-09-16 22:56:39.639 CEST [1129/T1141] INFO: ---------------------------------------------------

Okay, this all looks fine (except for the crash, but let's keep that separate in the other topic). Please set debug.database = true on your Pi-hole v6 instance, e.g. by running

sudo pihole-FTL --config debug.database true

and then check the log file for lines like

mem database size: 1.25 MB (12345 queries)

These lines should appear in /var/log/pihole/FTL.log almost instantaneously, given that new queries arrive on this Pi-hole.
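If it helps, something like the following should surface those lines quickly (just a convenience sketch, assuming the default log location):

    # show the most recent matching lines
    grep "mem database size" /var/log/pihole/FTL.log | tail -n 5
    # or follow the log live while new queries arrive
    tail -f /var/log/pihole/FTL.log | grep --line-buffered "mem database size"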

I enabled logging on the freshly compiled binary and I indeed see log statements appearing while surfing the web. I do, however, see the following delays:

2024-09-17 20:17:00.067 CEST [1112/T1128] DEBUG_DATABASE: Accessing in-memory database
2024-09-17 20:17:01.144 CEST [1112/T1128] DEBUG_DATABASE: Accessing in-memory database
2024-09-17 20:17:01.370 CEST [1112/T1128] DEBUG_DATABASE: Exported 1 rows to disk.domain_by_id
2024-09-17 20:17:01.371 CEST [1112/T1128] DEBUG_DATABASE: Exported 0 rows to disk.client_by_id
2024-09-17 20:17:01.374 CEST [1112/T1128] DEBUG_DATABASE: Exported 0 rows to disk.forward_by_id
2024-09-17 20:17:01.377 CEST [1112/T1128] DEBUG_DATABASE: Exported 0 rows to disk.addinfo_by_id
2024-09-17 20:17:01.378 CEST [1112/T1128] DEBUG_DATABASE: Exported 2 rows to disk.sqlite_sequence
2024-09-17 20:17:11.635 CEST [1112/T1128] DEBUG_DATABASE: Opening FTL database in export_queries_to_disk() (/root/FTL/src/database/query-table.c:696)
2024-09-17 20:17:11.732 CEST [1112/T1128] DEBUG_DATABASE: dbquery: "INSERT OR REPLACE INTO ftl (id, value) VALUES ( 1, 1726597015.960639 );"

So, as you can see, there are a few seconds between the Exported 2 rows to disk.sqlite_sequence and the Opening FTL database statements. And indeed, I can see some delay when surfing. Does this make sense? Is this observation correct? If so, does it mean this is due to slow I/O?

Partially, but keep in mind the debug output is not designed to be understandable without reading the corresponding source code; it is designed to be brief and to not unnecessarily clutter the log file.

What is happening here is that after

data is indeed written to the disk and

is the next thing happening thereafter. It means that writing the queries really took this long. The lines you observed should have been immediately followed by

Exported 12345 rows for disk.query_storage (took 1234.567 ms, last SQLite ID 12341234)

which should aid understanding.

Indeed:

2024-09-17 20:26:10.135 CEST [1112/T1128] DEBUG_DATABASE: Exported 0 rows for disk.query_storage (took 10067.4 ms, last SQLite ID 3074833)

I found this; writing takes a while.

Oh, this is really slow for writing so few rows. It takes less than 1 ms for a couple of dozen rows on my Pi-hole, which runs on an older microserver (x86, though), also with an SSD. Writing queries to disk should not block DNS operation at all; however, it may be that your system is so busy at that moment that it stops responding overall.

If you enable

sudo pihole-FTL --config debug.queries true

you should be able to see the effects of database writing on the query processing.
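A simple way to correlate the two debug categories is to filter both out of the log (a sketch, assuming the default log path and the DEBUG_QUERIES/DEBUG_DATABASE tags shown above):

    # recent query-handling and database debug lines, interleaved in time order
    grep -E "DEBUG_(QUERIES|DATABASE)" /var/log/pihole/FTL.log | tail -n 100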

Found this in the logs:

2024-09-18 07:13:11.402 CEST [1112/T1128] DEBUG_DATABASE: Exported 0 rows for disk.query_storage (took 11306.1 ms, last SQLite ID 3085793)
2024-09-18 07:14:10.772 CEST [1112/T1128] DEBUG_DATABASE: Exported 137 rows for disk.query_storage (took 10733.2 ms, last SQLite ID 3086067)
2024-09-18 07:15:10.231 CEST [1112/T1128] DEBUG_DATABASE: Exported 0 rows for disk.query_storage (took 10203.5 ms, last SQLite ID 3085930)
2024-09-18 07:16:10.995 CEST [1112/T1128] DEBUG_DATABASE: Exported 130 rows for disk.query_storage (took 10952.1 ms, last SQLite ID 3086190)
2024-09-18 07:17:11.633 CEST [1112/T1128] DEBUG_DATABASE: Exported 0 rows for disk.query_storage (took 11549.4 ms, last SQLite ID 3086060)
2024-09-18 07:18:10.659 CEST [1112/T1128] DEBUG_DATABASE: Exported 119 rows for disk.query_storage (took 10606.7 ms, last SQLite ID 3086298)
2024-09-18 07:20:01.379 CEST [4717/T4723] DEBUG_DATABASE: Exported 43 rows for disk.query_storage (took 1287.0 ms, last SQLite ID 3086265)
2024-09-18 07:21:16.523 CEST [4717/T4723] DEBUG_DATABASE: Exported 27 rows for disk.query_storage (took 16436.9 ms, last SQLite ID 3086319)
2024-09-18 07:22:12.730 CEST [4717/T4723] DEBUG_DATABASE: Exported 63 rows for disk.query_storage (took 12636.9 ms, last SQLite ID 3086445)
2024-09-18 07:23:09.326 CEST [4717/T4723] DEBUG_DATABASE: Exported 0 rows for disk.query_storage (took 9183.6 ms, last SQLite ID 3086382)
2024-09-18 07:24:11.520 CEST [4717/T4723] DEBUG_DATABASE: Exported 117 rows for disk.query_storage (took 11439.4 ms, last SQLite ID 3086616)
2024-09-18 07:25:10.457 CEST [4717/T4723] DEBUG_DATABASE: Exported 0 rows for disk.query_storage (took 10345.9 ms, last SQLite ID 3086499)
2024-09-18 07:26:10.240 CEST [4717/T4723] DEBUG_DATABASE: Exported 72 rows for disk.query_storage (took 10214.0 ms, last SQLite ID 3086643)
2024-09-18 07:27:09.809 CEST [4717/T4723] DEBUG_DATABASE: Exported 0 rows for disk.query_storage (took 9755.1 ms, last SQLite ID 3086571)
2024-09-18 07:28:11.028 CEST [4717/T4723] DEBUG_DATABASE: Exported 48 rows for disk.query_storage (took 10960.1 ms, last SQLite ID 3086667)
2024-09-18 07:29:10.692 CEST [4717/T4723] DEBUG_DATABASE: Exported 9 rows for disk.query_storage (took 10649.1 ms, last SQLite ID 3086685)
2024-09-18 07:30:10.600 CEST [4717/T4723] DEBUG_DATABASE: Exported 154 rows for disk.query_storage (took 10511.2 ms, last SQLite ID 3086993)
2024-09-18 07:31:10.212 CEST [4717/T4723] DEBUG_DATABASE: Exported 0 rows for disk.query_storage (took 10086.7 ms, last SQLite ID 3086839)
2024-09-18 07:32:11.055 CEST [4717/T4723] DEBUG_DATABASE: Exported 141 rows for disk.query_storage (took 11035.9 ms, last SQLite ID 3087121)

So this is slow almost all the time. I was wondering how this compares to v5, because I don't experience any slowness on the other machine with the same specs. Would it help to increase the privacy level, so that not everything is logged? I don't really care about historical data anyway ...

Trying to do some performance benchmarking ...
Tried to measure the write speed of the SSD. This is on the plug running Pi-hole v5:

root@pogo01:~# dd if=/dev/zero of=/root/tempfile bs=1M count=512 conv=fdatasync
512+0 records in
512+0 records out
536870912 bytes (537 MB, 512 MiB) copied, 72.093 s, 7.4 MB/s

This is on the Pi-hole v6 plug:

root@pogo02:~# dd if=/dev/zero of=/root/tempfile bs=1M count=512 conv=fdatasync
512+0 records in
512+0 records out
536870912 bytes (537 MB, 512 MiB) copied, 46.1867 s, 11.6 MB/s

I think we can conclude that pogo01 has a much slower SSD (although it is the same type) and that this is causing all my problems ...

Hmmm, I upgraded to a newer kernel on the plug that is running v6, and now I get this:

root@pogo01:~# dd if=/dev/zero of=/root/tempfile bs=1M count=512 conv=fdatasync
512+0 records in
512+0 records out
536870912 bytes (537 MB, 512 MiB) copied, 37.7295 s, 14.2 MB/s
root@pogo01:~# dd if=/dev/zero of=/root/tempfile bs=1M count=512 conv=fdatasync
512+0 records in
512+0 records out
536870912 bytes (537 MB, 512 MiB) copied, 36.0702 s, 14.9 MB/s

Either way, this might still be too slow to run Pi-hole v6 ... not sure.
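Maybe I should also test small synchronous writes, which are closer to what SQLite actually does than the large sequential blocks above; something like this (just a sketch):

    # large sequential blocks (what I measured above)
    dd if=/dev/zero of=/root/tempfile bs=1M count=512 conv=fdatasync
    # many small synchronous writes, closer to SQLite's access pattern
    dd if=/dev/zero of=/root/tempfile bs=4k count=1000 oflag=dsync
    rm /root/tempfile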

We should first collect some more information, like

Thank you for your continued help and support.
I enabled both debug flags. Are there any specifics I need to look out for? Do you want to take a look at the logs?

Anyway ... after setting the privacy level to:

Hide domains and clients: Display and store all domains as hidden and all clients as 0.0.0.0

Not a lot is logged to the database anymore, and memory/CPU consumption has dropped dramatically. So for low-spec hardware running v6, this is the way to go, I think.
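For reference, I set this from the command line; if I'm not mistaken, the corresponding v6 setting is misc.privacylevel (please treat the exact key as an assumption and double-check it in /etc/pihole/pihole.toml):

    # assumption: misc.privacylevel is the v6 key for the privacy level; 2 = hide domains and clients
    sudo pihole-FTL --config misc.privacylevel 2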

If reducing database writes helps, your issue may be as much about database handling (perhaps I/O contention) as about memory.

But looking at your SSD speed test results:
Your maximum of ~15 MB/s would match my SD card's write speed, whereas your Pogoplug's SATA I port (nominally 1.5 Gbps) should be capable of well above 100 MB/s. Your current results are even lower than the transfer speed of its USB 2 port (nominally 480 Mbps).

Are you sure you are testing your SSD? Where does /root/tempfile reside?

That said, even your 15 MB/s should have no impact on Pi-hole, as Pi-hole would not have to write data at anywhere near that rate, at least not unless you had a lot of very busy clients and raised your rate limit. At the default 1,000 queries per minute, and assuming 100 bytes stored per query (my personal database averages around 60), that would still amount to just 100,000 bytes to write; or, conversely, each of those 1,000 queries would have to weigh around 15,700 bytes to push your 15 MB/s boundary.
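As a rough sanity check of those numbers (assumptions: ~100 bytes per stored query, 1,000 queries per minute, and the export itself taking about one second):

    # bytes written per export at ~100 B for each of 1,000 queries
    echo $(( 1000 * 100 ))         # 100000, i.e. ~0.1 MB
    # bytes each of those 1,000 queries would need to occupy to keep a ~15.7 MB/s disk busy for a full second
    echo $(( 15700000 / 1000 ))    # 15700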

Also, as sqlite3 depends on the filesystem correctly locking files, perhaps file locking isn't working correctly on your pogoplug.
Are you perhaps sharing your SSD volume via NFS?

If your issue is related to your database, then - instead of reducing the privacy level - disabling Pi-hole's query database altogether (by setting DBFILE=) could also be worth a try.

Of course, if you did that straight away, you'd deprive yourself of further analysis using debug.queries, so you'd probably want to try it only after that analysis is complete.
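DBFILE= is the v5-style name; on v6 the same thing should be reachable through the files.database option (treat the exact key as an assumption and verify it in /etc/pihole/pihole.toml):

    # assumption: files.database is the v6 key for the long-term query database; an empty value disables it
    sudo pihole-FTL --config files.database ""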

  • /root/tempfile was on the SSD
  • since the plug has a limited amount of memory (128 MB), it has swap space configured, which is also being used
  • the swap space is on the same disk, so (not an expert here), since v6 uses more memory than v5, it will start swapping, which means disk I/O goes to both swap and the database on the same SSD; I guess this is causing the performance degradation? (a quick way to check this is shown after the dmesg excerpt below)
  • about the speeds: it is indeed on the internal SATA port, but I see this in dmesg:
[    9.628026] ata1.00: configured for UDMA/133
[    9.633011] sata_oxnas: resetting SATA core
[    9.677383] oxnas-pcie 47c00000.pcie: link down
[    9.683000] oxnas-pcie 47c00000.pcie: PCI host bridge to bus 0000:00
[    9.690086] pci_bus 0000:00: root bus resource [mem 0x48000000-0x49ffffff]
[    9.697697] pci_bus 0000:00: root bus resource [mem 0x4a000000-0x4bdfffff pref]
[    9.705681] pci_bus 0000:00: root bus resource [io  0x0000-0xfffff]
[    9.712660] pci_bus 0000:00: root bus resource [bus 00-7f]
[    9.719680] PCI: bus0: Fast back to back transfers enabled
[    9.726709] clk: Disabling unused clocks
[   10.417448] ata1.00: limiting speed to UDMA/100:PIO4
[   11.357432] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[   11.364981] ata1.00: configured for UDMA/100
[   11.370004] sata_oxnas: resetting SATA core
[   12.147445] ata1.00: limiting speed to UDMA/33:PIO4
[   13.087439] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[   13.094917] ata1.00: configured for UDMA/33
[   13.099855] sata_oxnas: resetting SATA core
[   14.817436] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[   14.824929] ata1.00: configured for UDMA/33
[   14.829864] ata1: EH pending after 5 tries, giving up
[   14.836357] scsi 0:0:0:0: Direct-Access     ATA      SanDisk SSD U100 2.01 PQ: 0 ANSI: 5
[   14.847651] sd 0:0:0:0: [sda] 31277232 512-byte logical blocks: (16.0 GB/14.9 GiB)
[   14.856064] sd 0:0:0:0: [sda] Write Protect is off
[   14.861642] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[   14.861877] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   14.875368]  sda: sda1 sda2

(Could the fallback to UDMA/33 be caused by using a 40-wire instead of an 80-wire SATA cable?)

Even UDMA/33 should give you transfer rates of around 30 MB/s.

But as said, Pi-hole runs fine on SD card systems, not nearly exhausting even their more limited bandwidth.

Does your pogoplug host anything else besides Pi-hole?
If not, did you try running without any swap space?
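Disabling swap temporarily is enough for a test, for example:

    # turn off all swap for the running system (undo with swapon -a or by rebooting)
    sudo swapoff -a
    # verify
    free -h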

It is running a cloudflare tunnel for this -> cloudflared (DoH) - Pi-hole documentation
I will try running with swap off and query logs enabled ... let's see.

EDIT: well what do you know ... this seems to work :partying_face:

Sorry, I was traveling without a chance to check on Pi-hole matters until today.

But I see you already found a solution along the way. My question was aimed in the direction of "do you really see DNS resolution stopping when database writing happens, or is it just (very) slow?".

Raw bandwidth (as in MB/s) probably isn't the concern here, as database insertions on the scale your previous logs suggested would amount to only a few bytes. Using empty slots in the database requires some out-of-order I/O operations, but this is typically not an issue with modern SSDs. Hence, I was very much assuming your device is just busy with "something".