Regex Chatter

I'm NOT a regex expert!
@mmotti uses this in his regex list:

^(www[0-9]*\.)?xn--

which translates into:

your regex

(\.|^)xn--.*\..*$

translates into

with the example from here, using ‘xn--c1yn36f’

mmotti regex101 result: match, 8 steps
your result: NO match
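
For anyone who wants to reproduce that outside regex101, here is a minimal Python sketch. Python's `re` module is only a stand-in and may not behave exactly like the engine Pi-hole uses, and the variable names are mine. It only checks match / no match for the bare label and does not try to reproduce the step counts.

```python
import re

# The two patterns quoted above, unchanged.
MMOTTI = re.compile(r"^(www[0-9]*\.)?xn--")
OTHER = re.compile(r"(\.|^)xn--.*\..*$")

sample = "xn--c1yn36f"  # bare punycode label, no TLD

# search() mimics a partial match anywhere in the string.
mmotti_hit = MMOTTI.search(sample) is not None
other_hit = OTHER.search(sample) is not None

print("mmotti:", mmotti_hit)  # True: the label starts with xn--
print("other: ", other_hit)   # False: the pattern needs a literal dot after xn--
```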

I wonder which one is more efficient and recommended?

That example has no TLD, and to be in Pi-hole's database a domain should have a TLD.

The question then is whether we want to match domains without a TLD, or is that a bridge too far?

`(\.|^)xn--` is then the most recognizable.

It couldn’t match, because (\.|^)xn--.*\..*$ says that there must be at least one dot after xn--c1yn36f.
To have a regex closer to an FQDN without specifying the domain exactly, I prefer (\.|^)xn--.*\..+$
Then a dot and at least one more character are needed, like xn--c1yn36f.c
and so on…
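
A quick way to see the difference between the `.*` and `.+` endings, as a Python sketch. Python's `re` is just a convenient stand-in and not necessarily the engine Pi-hole uses, and the trailing-dot sample is one I made up to show the edge case.

```python
import re

# Same prefix, different endings: ".*" allows nothing after the dot,
# ".+" requires at least one character after it.
STAR = re.compile(r"(\.|^)xn--.*\..*$")
PLUS = re.compile(r"(\.|^)xn--.*\..+$")

samples = ["xn--c1yn36f", "xn--c1yn36f.", "xn--c1yn36f.c"]

# Map each sample to (matches STAR, matches PLUS).
table = {
    s: (STAR.search(s) is not None, PLUS.search(s) is not None)
    for s in samples
}

for s, (star_hit, plus_hit) in table.items():
    print(f"{s!r:18} star={star_hit} plus={plus_hit}")
```

Only the sample with a trailing dot and nothing after it separates the two patterns. Note that `.+` accepts any character after the dot, not just letters.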
I’m very new to regex too, but I already love it. It’s great what you can do with it.

Why are you guys looking to specify a regex for the whole FQDN, including the TLD? You only need a partial match for it to block, and the extra part could be adding steps / processing where it isn't necessary.
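
To illustrate the partial-match point, a small Python sketch (a stand-in for Pi-hole's own matching, with hypothetical domain names that are not from any real list). For full domains that already carry a TLD, the short prefix-only pattern and the longer FQDN-style pattern agree.

```python
import re

SHORT = re.compile(r"(\.|^)xn--")          # prefix only
LONG = re.compile(r"(\.|^)xn--.*\..+$")    # prefix plus "dot and something"

# Hypothetical full domains, made up for this example.
domains = ["xn--c1yn36f.com", "ads.xn--c1yn36f.com", "example.com"]

# Map each domain to (matches SHORT, matches LONG).
results = {
    d: (SHORT.search(d) is not None, LONG.search(d) is not None)
    for d in domains
}

for d, (short_hit, long_hit) in results.items():
    print(f"{d:22} short={short_hit} long={long_hit}")
```

In this sketch both patterns flag the two punycode domains and leave example.com alone, so the tail of the longer pattern only matters for names without a TLD.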

Right,

It could!
While learning about regex I tested all my filter “creations” to see what I was doing and how I could get the best results, with an eye on both processing and clarity. I found that these two goals often, and unexpectedly, conflict. Sometimes a more precise specification helps save processing power, and sometimes it does not. If it does not, I still prefer precision.

That’s cool :slight_smile:

Just as a warning, this is the reason my regex isn’t open to all subdomains: Google Play issues

Just in case you guys happen to encounter similar problems. I never nailed down the exact cause (e.g. whether it only affects users outside of the UK, etc.), but I had to tweak it for my list due to the number of people it could have potentially caused issues for.
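
For reference, a rough Python sketch of what "not open to all subdomains" means in practice. The hostnames are hypothetical and only there to show the scope of each pattern, and Python's `re` is a stand-in for Pi-hole's engine.

```python
import re

MMOTTI = re.compile(r"^(www[0-9]*\.)?xn--")  # bare domain or www/wwwN prefix only
OPEN = re.compile(r"(\.|^)xn--")             # any label that starts with xn--

# Hypothetical hostnames, made up for this example.
hosts = [
    "xn--c1yn36f.com",
    "www.xn--c1yn36f.com",
    "www2.xn--c1yn36f.com",
    "play.xn--c1yn36f.com",
]

# Map each hostname to (matches MMOTTI, matches OPEN).
scope = {
    h: (MMOTTI.search(h) is not None, OPEN.search(h) is not None)
    for h in hosts
}

for h, (mmotti_hit, open_hit) in scope.items():
    print(f"{h:24} mmotti={mmotti_hit} open={open_hit}")
```

The last host is the only one where the two differ: the anchored pattern lets it through, while the open pattern would block it.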