Netgate Discussion Forum

    Any regex for URL and domains for sanitizing blacklists like shallalist etc?

    Cache/Proxy
      guru_meditation

      Every now and then I update a few categories of shallalist's blocklist and add them as Target Categories in SquidGuard, which then complains that some entries are not URLs, for example:

      "1.2.3.4/xy" is not an URL
      "1.2.3.4:88/xy" is not an URL

      I then go and delete or correct these entries manually.

      Does anyone have a regex for use with sed or awk to do that job automatically?

      For example,
      find * -type f -exec sed -i -r '/([0-9]{1,3}\.){3}[0-9]{1,3}/d' {} \;
      deletes lines containing IP addresses, and

      find * -type f -exec sed -i -r ':a;N;$!ba;s/\n/ /g' {} \;
      replaces line breaks with spaces.
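
      As a starting point, here is a minimal sketch of a narrower filter. It assumes the entries SquidGuard rejects are the ones that begin with an IPv4 address, optionally followed by a port, as in the two messages above. The function name strip_ip_urls and the sample entries are made up for illustration, and the pattern may well need adjusting for a real list.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of a list file that do NOT start with
# an IPv4 address, optionally followed by :port, and then a "/" or end of line.
strip_ip_urls() {
    grep -Ev '^([0-9]{1,3}\.){3}[0-9]{1,3}(:[0-9]+)?(/|$)' "$1"
}

# Small demonstration on a throwaway file with made-up entries.
tmp=$(mktemp)
printf '%s\n' '1.2.3.4/xy' '1.2.3.4:88/xy' 'example.com/ads' 'ads.example.net' > "$tmp"
cleaned=$(strip_ip_urls "$tmp")
rm -f "$tmp"

# Only the two hostname-based entries should remain.
printf '%s\n' "$cleaned"
```

      Anchoring the pattern at the start of the line keeps entries that merely contain something that looks like an IP address further along in the path. To clean a real file, the output could be written to a temporary file and moved over the original once it has been checked.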

      Copyright 2025 Rubicon Communications LLC (Netgate). All rights reserved.