Faulty regex consumes all memory and causes issues with pihole -q

Hahahaha, I know. I'm going to do some more tests, but with the help you provided I'm a lot further along.

Okay, new changes up. I do like the while loop, but it's a bit easier to read with a loop over the unquoted variable.
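For anyone following along, here is a minimal sketch of what a for-loop over the unquoted variable can look like in bash. It is not the code from the actual change: the function name `match_regexps` and the sample list are made up for illustration, and it assumes the regexps arrive as one newline-separated string.

```shell
#!/usr/bin/env bash

# Hypothetical helper, not the actual Pi-hole code: print each regex from a
# newline-separated list that matches the given domain.
match_regexps() {
    local domain="$1" regex_list="$2" regex
    local IFS=$'\n'   # split the unquoted list on newlines only
    set -f            # no globbing, so * and ? in a regex are not expanded
    for regex in ${regex_list}; do
        # The right-hand side is unquoted so bash treats it as a regex.
        if [[ "${domain}" =~ ${regex} ]]; then
            # %s prints the regex verbatim.
            printf '%s\n' "${regex}"
        fi
    done
    set +f
}

# Sample list for illustration; the third entry is invented.
regexps='^analytics?[_.-]
^(.+[_.-])?ad[sxv]?[0-9]*[_.-]
^doubleclick[_.-]'

match_regexps 'analytics.adserver.ads.test.com' "${regexps}"
```

The `set -f` and the newline-only `IFS` are what make looping over the unquoted variable safe here: without them a `*` in a regex could be expanded against files in the current directory.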

Tip: use %b instead of %s for anything that may contain values you can't see, and doubly so for regex values. :slight_smile:

Hey, if it works, I will happily concede for a simple for-loop.

Looks to work OK for me, boss!

mmotti@ubuntu-server:~$ pihole -q analytics.adserver.ads.test.com
 Match found in regex blacklist
   ^(.+[_.-])?ad[sxv]?[0-9]*[_.-]
   ^(.+[_.-])?adse?rv(er?|ice)?s?[0-9]*[_.-]
   ^analytics?[_.-]

That's with the crappy (\.|^)*\.services\.generalmagic\.com$ in my regex blacklist too.

I only have 125k domains though so would be useful for others to test with a higher domain count.

Can you go to the blacklist (or whitelist) display on the web interface and enter *.adserver.ads.test.com as a wildcard type? That's the key for this fix.

It adds as (\.|^)*\.adserver\.ads\.test\.com$, which obviously won't match the bare adserver.ads.test.com because of the leading \.
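To make it easier to see how a regex like that comes about, here is a rough sketch. It assumes the wildcard type simply escapes the dots and wraps the input in `(\.|^)` and `$`; that is a guess based on the string quoted above, not the web interface's real code, and `wildcard_to_regex` is a hypothetical name.

```shell
#!/usr/bin/env bash

# Hypothetical conversion, assumed for illustration only: escape the dots,
# then wrap the input in the (\.|^) prefix and $ suffix seen in the thread.
wildcard_to_regex() {
    local input="$1" escaped
    escaped="$(printf '%s' "${input}" | sed 's/\./\\./g')"
    printf '(\\.|^)%s$\n' "${escaped}"
}

wildcard_to_regex '*.adserver.ads.test.com'   # (\.|^)*\.adserver\.ads\.test\.com$
wildcard_to_regex 'adserver.ads.test.com'     # (\.|^)adserver\.ads\.test\.com$
```

Under that assumption, the `*.` typed by the user is carried straight into the regex, which is where the stray `*` and the mandatory leading `\.` come from.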

mmotti@ubuntu-server:~$ pihole -q adserver.ads.test.com
 Match found in regex blacklist
   ^(.+[_.-])?ad[sxv]?[0-9]*[_.-]
   ^(.+[_.-])?adse?rv(er?|ice)?s?[0-9]*[_.-]

If I manually adjust the shitty regex to an equally shitty regex (\.|^)*adserver\.ads\.test\.com$:

mmotti@ubuntu-server:~$ pihole -q adserver.ads.test.com
 Match found in regex blacklist
   ^(.+[_.-])?ad[sxv]?[0-9]*[_.-]
   ^(.+[_.-])?adse?rv(er?|ice)?s?[0-9]*[_.-]
   (\.|^)*adserver\.ads\.test\.com$

But, it doesn't crap out like awk did. So overall, for now, it looks like a win.

Edit edit edit edit:

I'm back on release v5.0 with the awk still in place and actually even with these shitty regexps, I'm not experiencing any crashing at all lol.

That should match tryme.adserver.ads.test.com if it's doing things as I intended.

Ah! two secs

Edit:

You are correct

mmotti@ubuntu-server:~$ pihole -q tryme.adserver.ads.test.com
 Match found in regex blacklist
   ^(.+[_.-])?ad[sxv]?[0-9]*[_.-]
   ^(.+[_.-])?adse?rv(er?|ice)?s?[0-9]*[_.-]
   (\.|^)*\.adserver\.ads\.test\.com$
   (\.|^)*adserver\.ads\.test\.com$

But please do not encourage users to use these examples :rofl:

Just a thought, too. It might be worth testing the performance when looping through thousands of regexps, as no doubt some people will have a bazillion wildcards.

When you have to check x amount of wildcards against millions of domains, it could become troublesome.

Might be worth adding a warning notice that it could take time to run the query if the regexps exceed a certain amount, or better yet look to run the query and get the result straight from FTL?
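A warning like that could be as simple as the sketch below. The function name `warn_if_many_regexps` and the default threshold of 1000 are placeholders I made up; nobody in this thread has proposed a specific number, and it again assumes a newline-separated list of regexps.

```shell
#!/usr/bin/env bash

# Hypothetical helper: warn on stderr when a newline-separated regex list
# has more entries than a threshold. The default of 1000 is an arbitrary
# placeholder, not a number taken from the thread.
warn_if_many_regexps() {
    local regex_list="$1" threshold="${2:-1000}" count
    # grep -c . counts non-empty lines; || true guards against its
    # non-zero exit status when the count is 0.
    count="$(printf '%s\n' "${regex_list}" | grep -c . || true)"
    if (( count > threshold )); then
        printf 'Warning: %d regexps to check, this query may take a while.\n' "${count}" >&2
    fi
}

# Three regexps against a threshold of 2, so the warning is printed.
warn_if_many_regexps $'^ads?[_.-]\n^analytics?[_.-]\n^track' 2
```

Writing the notice to stderr keeps it out of the way of anything that parses the query output itself.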

This fix will work for now ofc, but thinking long term, communicating directly with FTL would remove the need to loop through everything in bash.

nanopi@nanopi:~$ pihole -q tryme.adserver.ads.test.com
 Match found in regex blacklist
   (\.|^)*\.adserver\.ads\.test\.com$

Agree, my main concern for moving to shell regex checking was the performance issue. This fix gets us in the door and working without having to wait to redo the web interface or other more drastic changes. Future options that work better are good to have.

I think you found a solution to the issue.

This is post #100 ... one reason more to set it to solved :wink:

Fixed in Malformed wildcard blocking doesn't crash awk. by dschaper · Pull Request #3186 · pi-hole/pi-hole · GitHub