Can you block an HTTP request for a specific page?



  • I rebuilt my web server, and started the site fresh this past week.

    Already, in my access logs, I see that I'm getting hammered by the "googlebot." Now, I don't mind google indexing my site, I think it's a good thing. But this GET request looks a bit strange to me:

    66.249.71.46|Wed 17 Oct 2012 17:25:09 -0500|200|3400||GET /perspective?page=0%2C2%2C6%2C6%2C6%2C5%2C1%2C2%2C1 HTTP/1.1|Host: www.jonesfamily.us|Connection: keep-alive|Accept: /|From: googlebot(at)googlebot.com|User-Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)|Accept-Encoding: gzip,deflate

    I use Drupal, and have never had a page with that name.

    It started from one IP address (different from the one above; I verified that both are Googlebot IPs). My initial assumption was that it was a malfunction, because it was hitting the server every 3-5 seconds. I blocked the first IP. Then today, I found it again from the IP above. In every case, the request is for /perspective?page=<slightly changing string of gibberish like the one shown above>

    Is there a way to block HTTP requests on pfSense? Say, all requests for "/perspective"?

    Thanks,
    Ron
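
    To gauge how hard a given path is being hit, the access log can be tallied per client IP. The following is a minimal Python sketch, assuming every log line uses the same pipe-delimited layout as the sample above (client IP in the first field, request line in the sixth). The function names `request_path` and `count_hits` are hypothetical, chosen only for this illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def request_path(log_line):
    """Return the URL path from one pipe-delimited log line, or None.

    Assumed layout (taken from the sample line in the post):
    ip|timestamp|status|size||request line|header|header|...
    """
    fields = log_line.split("|")
    if len(fields) < 6:
        return None
    # The request line looks like: "GET /perspective?page=... HTTP/1.1"
    parts = fields[5].split(" ")
    if len(parts) != 3:
        return None
    # Drop the query string so only the path is compared.
    return urlsplit(parts[1]).path


def count_hits(lines, path="/perspective"):
    """Count requests for `path`, grouped by client IP (first field)."""
    hits = Counter()
    for line in lines:
        if request_path(line) == path:
            hits[line.split("|", 1)[0]] += 1
    return hits
```

    Calling `count_hits(open("access.log"))` (the file name is an assumption) returns a `Counter`, so `.most_common(5)` lists the busiest client IPs for that path.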


  • Rebel Alliance Developer Netgate

    It isn't possible to do something like that without running the HTTP traffic through a proxy or a layer 7 filter that inspects the actual payload, which can be a bit cumbersome on the firewall for just this. Snort might be able to catch it, if it's really bad.

    Or you could just block it in your web server software, most should have a mechanism for denying certain requests.
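
    As an illustration of denying a request at the application layer rather than on the firewall, here is a minimal sketch written as Python WSGI middleware. It is not how Drupal or any particular web server does it; the function name `deny_path` and the choice of a 403 response are assumptions made for the example. Most web servers offer an equivalent rule in their own configuration syntax.

```python
def deny_path(app, blocked_prefix="/perspective"):
    """Wrap a WSGI app so requests for `blocked_prefix` get a 403.

    The query string is not part of PATH_INFO, so every variation of
    /perspective?page=... is matched by the path alone.
    """
    def middleware(environ, start_response):
        path = environ.get("PATH_INFO", "")
        # Match the exact path or anything below it, but not /perspectives.
        if path == blocked_prefix or path.startswith(blocked_prefix + "/"):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everything else goes to the wrapped application untouched.
        return app(environ, start_response)

    return middleware
```

    Rejecting the request this early keeps it from reaching the application, which limits the cost of each hit, although the server still has to accept the connection.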



  • @jimp:

    Thanks, I was afraid of that. Since I'm using pfSense on a Soekris board with a CF card, Snort probably wouldn't be a good option.

    After additional research, I discovered that while Googlebot is the one pounding my site, it is probably following a link containing an SQL injection string from some site(s) that publish exploits and vulnerabilities.

    In addition to pfSense, on this new web server I have switched from Apache to the Hiawatha web server, which seems to do a good job of preventing damage. However, my main concern was the potential for DoS. So, for the moment, I've blocked the Googlebot IP. We'll unblock it in a few weeks to see if the storm has passed.

    regards,

    Jones

