Netgate Discussion Forum

    REGEX blocking

    Scheduled Pinned Locked Moved pfBlockerNG
    48 Posts 8 Posters 7.6k Views
    • J
      jc1976
      last edited by

      Is it possible to use regex blocking with a list, like one of the feeds in DNSBL or IP?

      I know I can activate regex blocking, but then I have to add all the entries manually.

      I came across this link, "https://ftprivacy.cloud/#regex-blocklists/#the-main-general-blocklist", where the regex blocklists are just long lists, like the feeds in pfBlockerNG.

      From the configuration in pfBlockerNG, it doesn't seem like I can paste a URL into the text box and have it pull all those entries down.

      Can I use that URL in DNSBL so that it imports the entries like any other DNSBL feed?

      Thanks

      • GertjanG
        Gertjan @jc1976
        last edited by

        @jc1976 said in REGEX blocking:

        Is it possible to use regex blocking with a list? Like one of the lists in dnsbl or ip?

        You can enter a single item or regular expression, or many of them, like a list.
        Regex is special. Do not approach it without a drugstore nearby. Stockpile your coffee beans.

        Example :

        ^(.+[_.-])?adse?rv(er?|ice)?s?[0-9]*[_.-] #test RGX1
        ^(.+[_.-])?telemetry[_.-] #test RGX2
        ^ad([sxv]?[0-9]*|system)[_.-]([^.[:space:]]+\.){1,}|[_.-]ad([sxv]?[0-9]*|system)[_.-] #test RGX3
        ^adim(age|g)s?[0-9]*[_.-] #test RGX4
        ^adtrack(er|ing)?[0-9]*[_.-] #test RGX5
        ^advert(s|is(ing|ements?))?[0-9]*[_.-] #test RGX6
        ^aff(iliat(es?|ion))?[_.-] #test RGX7
        ^analytics?[_.-] #test RGX8
        ^banners?[_.-] #test RGX9
        ^beacons?[0-9]*[_.-] #test RGX10
        ^count(ers?)?[0-9]*[_.-] #test RGX11
        ^pixels?[-.] #test RGX12
        ^stat(s|istics)?[0-9]*[_.-] #test RGX13
        ^stat(s|istics)?[0-9]*[_.-] #test RGX14
        

        where

        #test RGX14
        

        is a unique comment; that's why I included a number in each comment.

        For the syntax : welcome to regex.

        No "help me" PM's please. Use the forum, the community will thank you.
        Edit : and where are the logs ??

        • U
          Uglybrian
          last edited by

          Hi, here is a regex list that I use with pfBlockerNG, even though it's written for Pi-hole:
          https://github.com/mmotti/pihole-regex/blob/master/regex.list
          For Facebook: https://github.com/mmotti/pihole-regex/tree/master/social

          • GertjanG
            Gertjan @Uglybrian
            last edited by

            @uglybrian said in REGEX blocking:

            https://github.com/mmotti/pihole-regex/blob/master/regex.list

            👍

            And lol, line 10 :

            Due to the restrictive nature of these regexps, you may encounter a small number of false positives

            Look at the regular expressions:
            A hostname that starts with 'counter' (counter.example.org, say) will get flagged.
            Worse: if a hostname starts with 'stat' or 'stats' => exit !!

            That's not 'restrictive', that's a sledgehammer approach ;)
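
To make the false-positive risk concrete, here is a minimal Python sketch that runs two of the expressions quoted earlier in this thread against a few hostnames. The hostnames are made up for illustration, and `first_match` is a hypothetical helper, not part of pfBlockerNG.

```python
import re

# Two expressions quoted earlier in this thread.
PATTERNS = {
    "count": re.compile(r"^count(ers?)?[0-9]*[_.-]"),
    "stat": re.compile(r"^stat(s|istics)?[0-9]*[_.-]"),
}

# Made-up hostnames, for illustration only.
HOSTNAMES = [
    "counter.example.org",
    "stats.example.org",
    "country.example.org",
    "static.example.org",
    "www.example.org",
]


def first_match(hostname):
    """Return the name of the first pattern that matches, or None."""
    for name, pattern in PATTERNS.items():
        if pattern.search(hostname):
            return name
    return None


for host in HOSTNAMES:
    print(f"{host:22} -> {first_match(host)}")
```

Because the expressions are anchored with ^, only hostnames that begin with the keyword are hit, but that still includes any legitimate service that happens to live at counter.* or stats.*.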

            • U
              Uglybrian
              last edited by

              All my hammers are big

              Screenshot from 2022-10-20 09-19-48.png

              • J
                jc1976 @Uglybrian
                last edited by

                @uglybrian

                Yes, that's what I'm going for. Rereading my description, I'm not sure I was very clear.

                To clarify (you probably already understood where I was going, but I'll continue anyway):

                That Pi-hole URL points to a GitHub list that people can test and update as time goes on, just like the feeds in pfBlockerNG.

                In Pi-hole, the user merely pastes that URL somewhere (I've never used Pi-hole, so I don't know where), and Pi-hole automatically fetches the file and imports its contents, as often and at whatever hour is specified.

                Is there a way to just enter the URL of the list you provided, so that it auto-updates with a cron event, just like any of the other feeds in pfBlockerNG?

                The regex module in pfBlockerNG works like this: you tick the box to activate it, then paste a list into the text box below. You are not pasting a link to a list that pfBlockerNG will fetch and import, as it does with the other feeds.

                • U
                  Uglybrian
                  last edited by

                  Hi JC, I am not entirely sure you can do it the way you would like. I have tried in the past to set up regex on the IP side of pfBlockerNG (in the Format drop-down you can select regex). For the source I entered the URL of the list, but the aliases it created had 12 IP addresses that never got a hit for me.
                  So I went to the DNSBL side and found the Python Regex List. This is where I copied and pasted the list (see pic). After I did the save and update, I started to get hits. I just check the list periodically to see if there are any updates.
                  regex list from 2022-10-20 12-03-45.png
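
Since the list has to be pasted by hand, a small helper can at least prepare it. The sketch below is an illustration under assumptions, not a pfBlockerNG feature: `fetch_text` and `clean_regex_list` are hypothetical names, and the list is assumed to be plain text with one expression per line and comment lines starting with #. It drops comments and blank lines and sets aside anything Python's `re` module refuses to compile.

```python
import re
import urllib.request


def fetch_text(url, timeout=10):
    """Download a text file (hypothetical helper; not called in this sketch)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def clean_regex_list(text):
    """Return (kept, rejected) lists of expressions from a raw list."""
    kept, rejected = [], []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and full-line comments.
        if not line or line.startswith("#"):
            continue
        try:
            re.compile(line)
        except re.error:
            rejected.append(line)
        else:
            kept.append(line)
    return kept, rejected


# Small made-up sample in the assumed format.
SAMPLE = """\
# comment line
^analytics?[_.-]

^banners?[_.-]
^broken(
"""

kept, rejected = clean_regex_list(SAMPLE)
print("\n".join(kept))
```

The printed lines are what would go into the Python Regex List box; running this from cron would still not update pfBlockerNG by itself.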

                  • GertjanG
                    Gertjan @jc1976
                    last edited by

                    @jc1976 said in REGEX blocking:

                    Is there a way to just 'input' the url to that list which you provided, where it auto updates with a cron event, just like any of the other lists in pfblockerng?

                    I guess the regex feature is still in its early stage: it was added because 'why not'.
                    Regex is very powerful (many false positives ahead), though. Its usage should be limited: not many lines.

                    • Cool_CoronaC
                      Cool_Corona
                      last edited by

                      So in short, it is very easy to block specific domains this way?

                      • GertjanG
                        Gertjan @Cool_Corona
                        last edited by

                        @cool_corona

                        yes.

                        • Cool_CoronaC
                          Cool_Corona @Gertjan
                          last edited by

                          @gertjan Why is there no package that translates into the regex language?

                          Like, type in facebook.com -> translated to regex, and voilà. Then you don't have to worry about ASNs and IP addresses.

                          • GertjanG
                            Gertjan @Cool_Corona
                            last edited by

                            @cool_corona said in REGEX blocking:

                            to translate the regex language??

                            Translate what to what?
                            If you're new to regex and have never used command-line utilities like 'grep', 'awk' and 'sed', then I understand ... a whole new world just opened up to you ;)

                            As I said above, IMHO, the pfblockerng-devel Python regex functionality was added because it needed just one function (line) and it is soooooooooooo powerful ***:

                            def pfb_regex_match(q_name):
                                global regexDB
                            
                                if q_name:
                                    for k,r in regexDB.items():
                                        if r.search(q_name):
                                            return k
                                return False
                            

                            where q_name is the hostname to be matched.
                            regexDB.items() iterates over the regular expressions (a key and a compiled pattern for each). You can find that list here: /var/unbound/pfb_unbound.ini (bottom part of the file, the [REGEX] section).

                            *** There will be a lot of 'shot in the foot' situations, as real power needs to be managed.
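
To see that function in action outside of Unbound, here is a self-contained sketch. The function body is the one quoted above. How pfBlockerNG actually fills regexDB is not shown in this thread, so the dictionary below (keys RGX2 and RGX8, patterns copied from the earlier example list) is built by hand purely as an assumption.

```python
import re

# Hand-built stand-in for pfBlockerNG's regexDB (assumption):
# key = a label for the expression, value = the compiled pattern.
regexDB = {
    "RGX2": re.compile(r"^(.+[_.-])?telemetry[_.-]"),
    "RGX8": re.compile(r"^analytics?[_.-]"),
}


def pfb_regex_match(q_name):
    global regexDB

    if q_name:
        for k, r in regexDB.items():
            if r.search(q_name):
                return k  # label of the first expression that matches
    return False


print(pfb_regex_match("analytics.example.org"))
print(pfb_regex_match("eu.telemetry.example.org"))
print(pfb_regex_match("www.example.org"))
```

Note that q_name is the name being queried at runtime. It is an input to the function, not a place to type the domain you want blocked; the blocking rules live in the list.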

                            • Cool_CoronaC
                              Cool_Corona @Gertjan
                              last edited by Cool_Corona

                              @gertjan Yes, and that's why a GUI would be great for noobs like me 😆

                              Should I use

                              def pfb_regex_match(**facebook.com**):
                                  global regexDB
                              
                                  if q_name:
                                      for k,r in regexDB.items():
                                          if r.search(q_name):
                                              return k
                                  return False
                              
                              

                              And then facebook is blocked?

                              • GertjanG
                                Gertjan @Cool_Corona
                                last edited by

                                @cool_corona said in REGEX blocking:

                                And then facebook is blocked?

                                Nope.
                                You should add "facebook.com" to the regex list.

                                A Facebook example is present in the list @Uglybrian has shown.
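
As for "translating" a domain into a regex: no such tool is mentioned in this thread, but the idea fits in a few lines of Python. The helper name `domain_to_regex` is hypothetical, and the sketch assumes the expression is applied with a search against the bare hostname, as in the function quoted above. An unescaped dot in a regex matches any character, which is why the helper escapes it.

```python
import re


def domain_to_regex(domain):
    """Build an expression matching the domain and all of its subdomains."""
    return r"(^|\.)" + re.escape(domain) + "$"


pattern = re.compile(domain_to_regex("facebook.com"))

for host in ("facebook.com", "www.facebook.com",
             "notfacebook.com", "facebook.com.example.org"):
    print(f"{host:26} -> {bool(pattern.search(host))}")

# A plain "facebook.com" entry is looser: the dot matches any character
# and nothing anchors the expression to the end of the name.
loose = re.compile("facebook.com")
print(bool(loose.search("facebookxcom.example.org")))
```

The anchored form only hits the domain itself and its subdomains, while the plain entry also hits unrelated names that merely contain something resembling it.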

                                • Cool_CoronaC
                                  Cool_Corona @Gertjan
                                  last edited by

                                  @gertjan Thanks :)

                                  I can't find Regex under the DNSBL tab, despite using Unbound Python mode....

                                  • GertjanG
                                    Gertjan @Cool_Corona
                                    last edited by

                                    @cool_corona

                                    41a5cc5f-667f-4fe5-85b5-61661149df1f-image.png

                                    and then :

                                    bfdd633c-6255-4d37-949d-e20ea91e2c9e-image.png

                                    and then the list becomes available :

                                    b82ff1ed-2601-426e-99ba-da94a786504f-image.png

                                    • Cool_CoronaC
                                      Cool_Corona @Gertjan
                                      last edited by

                                      @gertjan Got it 👍 Thanks man!

                                      • Cool_CoronaC
                                        Cool_Corona @Gertjan
                                        last edited by Cool_Corona

                                        @gertjan Sometimes it works... sometimes it doesn't.

                                        Kind of annoying....

                                        
                                        ^ad([sxv]?[0-9]*|system)[_.-]([^.[:space:]]+\.){1,}|[_.-]ad([sxv]?[0-9]*|system)[_.-]
                                        ^(.+[_.-])?adse?rv(er?|ice)?s?[0-9]*[_.-]
                                        ^(.+[_.-])?telemetry[_.-]
                                        ^adim(age|g)s?[0-9]*[_.-]
                                        ^adtrack(er|ing)?[0-9]*[_.-]
                                        ^advert(s|is(ing|ements?))?[0-9]*[_.-]
                                        ^aff(iliat(es?|ion))?[_.-]
                                        ^analytics?[_.-]
                                        ^banners?[_.-]
                                        ^beacons?[0-9]*[_.-]
                                        ^count(ers?)?[0-9]*[_.-]
                                        ^mads\.
                                        ^pixels?[-.]
                                        ^stat(s|istics)?[0-9]*[_.-]
                                        # ^(.+[_.-])?(facebook|fb(cdn|sbx)?|tfbnw)\.[^.]+$
                                        

                                        The Facebook block only works sometimes, and it makes DNS horribly slow. (It's intentional that I commented it out in the current config.)

                                        • GertjanG
                                          Gertjan @Cool_Corona
                                          last edited by

                                          @cool_corona

                                          I'm not sure you can comment it out with a #. I advise you to remove it.
                                          This is regex, not some script ;)
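
Whether pfBlockerNG strips such a line before compiling it is not established in this thread. Assuming the line reaches Python's `re` module verbatim, the sketch below shows what would happen: without the `re.VERBOSE` flag, # is an ordinary character, and the ^ that follows it can no longer match, so the expression would simply never match a hostname. The hostnames are made up for illustration.

```python
import re

# The Facebook expression from the list above, active and "commented out".
ACTIVE = r"^(.+[_.-])?(facebook|fb(cdn|sbx)?|tfbnw)\.[^.]+$"
COMMENTED = "# " + ACTIVE  # the line as pasted, with a leading '# '

active = re.compile(ACTIVE)
commented = re.compile(COMMENTED)

for host in ("www.facebook.com", "static.xx.fbcdn.net", "www.example.org"):
    print(f"{host:22} active={bool(active.search(host))} "
          f"commented={bool(commented.search(host))}")
```

Under that assumption the commented line is dead weight rather than a comment: it is still evaluated for every query and never hits, so removing it is the cleaner option.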

                                          Btw:
                                          See line 1451 in /var/unbound/pfb_unbound.py:
                                          as long as there are items on the regex list, the pfb_regex_match(q_name) function is called.
                                          The main advantage of regex is that it is 'blindingly' fast (way faster than what PHP could have done; Python is far superior). When I activated that rule, my browser instantaneously showed

                                          9b3ccba4-ad33-4d29-9ef0-6f077a8277bd-image.png

                                          when I entered www.facebook.com in the URL bar.

                                          The DNSBL Python logs :

                                          8127dec6-8456-47d0-a98a-6b911135a931-image.png

                                          Regex blocking is as fast as it can get, way faster than the parsing needed for the main DNSBL list, compiled from all your DNSBL feeds.

                                          • GertjanG
                                            Gertjan @Gertjan
                                            last edited by

                                            WTF ....

                                            I added a 'facebook' regex above so I could collect some DNSBL log lines - see image above.

                                            I also saw :

                                            2611c22e-aae9-4881-a83b-96b3da600c98-image.png

                                            and, as 'facebook' is in the hostname, all looks fine.

                                            I removed the facebook regex, and reloaded pfBlockerng.

                                            Still, WhatsApp didn't work (on my phone).
                                            And wtf, when I deactivated Wi-Fi on my phone, WhatsApp still "doesn't work".

                                            Facebook did it again: they have managed to shut themselves, WhatsApp this time, out of the Internet. Probably on a planetary level 👍

                                            Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.