Netgate Discussion Forum

Any regex for URL and domains for sanitizing blacklists like shallalist etc?

    guru_meditation
    last edited by Mar 1, 2016, 12:34 AM

    Every now and then I update a few categories of shallalist's block list and add them as Target Categories in SquidGuard, which then complains that some entries are not URLs, for example:

    "1.2.3.4/xy" is not an URL
    "1.2.3.4:88/xy" is not an URL

    I then delete or correct these entries by hand.

    Does anyone have a regex for use with sed or awk that would do this job automatically?

    For example,
    find * -type f -exec sed -i -r '/([0-9]{1,3}\.){3}[0-9]{1,3}/d' {} \;
    deletes every line that contains an IP address, and

    find * -type f -exec sed -i -r ':a;N;$!ba;s/\n/ /g' {} \;
    replaces line breaks with spaces.
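
    A narrower variant of the first command could look like the sketch below. It assumes the entries to drop are exactly those whose host part is a dotted-quad IPv4 address, optionally followed by a port, as in the two SquidGuard messages quoted above. The function names sanitize_list and sanitize_file are made up for illustration. Anchoring the pattern at the start of the line should leave entries such as example.com/1.2.3.4/page untouched, and reading from stdin avoids sed -i, whose syntax differs between GNU and BSD sed.

```shell
#!/bin/sh

# Hypothetical helper: read a blacklist on stdin, write the cleaned list to stdout.
sanitize_list() {
    # Delete lines whose host is an IPv4 address, optionally followed by
    # ":port", then either the end of the line or "/path".
    # \#...# is sed's alternate address delimiter, so "/" needs no escaping.
    sed -E '\#^([0-9]{1,3}\.){3}[0-9]{1,3}(:[0-9]+)?(/.*)?$#d'
}

# Hypothetical helper: clean one file in place via a temporary file.
sanitize_file() {
    tmp="$1.tmp.$$"
    sanitize_list < "$1" > "$tmp" && mv "$tmp" "$1"
}
```

    With those definitions loaded in the shell, something like sanitize_file urls for a single file, or a loop over the output of find, would apply it. Whether the IP-based entries should be dropped or rewritten depends on what SquidGuard accepts, so this only covers the delete case.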
