When updating large lists, use GNU parallel for sed

The update only uses one core, mostly at 100%, for some time.
This is just an example; I have no idea where the relevant code is.

cat bigfile.txt | parallel --pipe sed s^old^new^g

https://huboqiang.cn/2016/05/09/multiCore
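
For anyone who wants to try the idea outside of Pi-hole first, here is a minimal sketch. It assumes GNU parallel is installed (it skips the parallel step otherwise). The sample file, the `old`/`new` patterns, and the variable names are placeholders for illustration, not the commands gravity actually runs.

```shell
#!/bin/sh
# Sketch only: bigfile.txt and the old/new patterns are placeholders,
# not the commands Pi-hole's gravity script actually runs.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Build a 200000-line sample list to stand in for a large blocklist.
awk 'BEGIN { for (i = 1; i <= 200000; i++) print "old" i ".example.com" }' \
    > "$workdir/bigfile.txt"

# Baseline: one sed process on one core.
sed 's^old^new^g' "$workdir/bigfile.txt" > "$workdir/serial.txt"
serial_lines=$(wc -l < "$workdir/serial.txt" | tr -d ' ')
echo "serial lines: $serial_lines"

# Only try the parallel variant if GNU parallel (not the moreutils tool) is present.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # --pipe splits stdin into blocks at line boundaries and feeds each block
    # to its own sed process; --keep-order emits results in input order.
    parallel --pipe --keep-order "sed 's^old^new^g'" \
        < "$workdir/bigfile.txt" > "$workdir/parallel.txt"
    if cmp -s "$workdir/serial.txt" "$workdir/parallel.txt"; then
        result="identical"
    else
        result="different"
    fi
else
    result="skipped"
fi
echo "parallel output: $result"
```

Prefixing each of the two sed invocations with `time` on the target hardware is the quickest way to see whether splitting the input pays off there. Note that `--pipe` only helps when the per-line work is heavy enough to outweigh the cost of splitting and reassembling the stream.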

How would such a change affect the availability of Pi-hole during the gravity update process? Pi-hole processes gravity in the background, and only when the new gravity database is being swapped for the old one is Pi-hole offline (and that is a very short duration).

I don't understand the use case here.

FTL runs at nice -10. Would it keep running while the sed operation runs in parallel?
Updating my gravity takes approx. 45 minutes. I can time it again on a 3B.

I opened two terminals.
The gravity update takes 30.5 minutes.
Elapsed time:
The lists download 5 min
SED 4 cores 6 min
SED 1 core 10 min
Grep runs 11 min
SQLite 30.5 min
Done

The problem is not with the gravity update process. You are downloading and processing over 10 million domains, and that's always going to take a while. A Pi 3B doesn't have a particularly powerful CPU.

I've tried parallel with this line

__
Results:

Standard

real	1m20,742s
user	1m2,293s
sys	0m10,478s

With parallel

real	1m23,278s
user	1m13,053s
sys	0m16,753s


There might be some special parallel option to improve speed, but without tweaking it is actually slower. When considering that analyzing the lists (which are usually relatively small) is fast compared to the subsequent indexing of the gravity database, it's not worth the effort.
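
To make the comparison explicit: `real` is wall-clock time, while `user` plus `sys` is the CPU time summed over all cores. A small sketch of the arithmetic, using only the six numbers posted above converted to seconds (the variable names are mine):

```shell
#!/bin/sh
# Times copied from the two runs above, converted to seconds.
# LC_ALL=C forces a dot as the decimal separator in awk's output.
summary=$(LC_ALL=C awk 'BEGIN {
    std_real = 80.742; std_cpu = 62.293 + 10.478   # standard: user + sys
    par_real = 83.278; par_cpu = 73.053 + 16.753   # parallel: user + sys
    printf "standard: real %.3f s, cpu %.3f s\n", std_real, std_cpu
    printf "parallel: real %.3f s, cpu %.3f s\n", par_real, par_cpu
    printf "parallel overhead: real +%.3f s, cpu +%.3f s\n", par_real - std_real, par_cpu - std_cpu
}')
printf '%s\n' "$summary"
```

In this run, parallel spent roughly 17 seconds more CPU time in total and still finished about 2.5 seconds later. CPU time exceeding wall-clock time in the parallel run suggests more than one core was busy at times, but the extra work of splitting and merging the stream appears to have cancelled out any gain.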
