Lightsquid Server Aggregation
-
Hey guys, just thought I'd share this little tidbit I found on the lightsquid forums. Are you sick of looking at your lightsquid reports and seeing every server in a farm listed separately for services such as Facebook and YouTube? I was, so I did some digging through the lightsquid code. If you open /usr/local/www/lightsquid/lightparser.pl for editing, about halfway down you will see this:
#simplified some common banner system & counters
$url=$Lurl;
$url =~ s/([a-z]+:\/\/)??.*\.(spylog\.com)/$1www.$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(yimg\.com)/$1www.$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(adriver\.ru)/$1www.$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(bannerbank\.ru)/$1www.$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(mail\.ru)/$1www.$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(adnet\.ru)/$1www.$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(rapidshare\.de)/$1www.$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(rapidshare\.com)/$1www.$2/o;
This is the code that simplifies some addresses. You can add your own servers to this list, which cut some of my reports in half. I have provided some of my own additions below as an example. I made a few modifications as well: I removed $1www. from the replacement, so the report now shows facebook.com instead of www.facebook.com, which is more of a cosmetic change than anything. I also replaced .com with an asterisk. This helped me as a Canadian user: regional servers such as google.ca are now simplified as well, without any additional entries.
$url =~ s/([a-z]+:\/\/)??.*\.(facebook\.*)/$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(youtube\.*)/$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(msn\.*)/$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(fbcdn\.*)/$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(ytimg\.*)/$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(hotmail\.*)/$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(live\.*)/$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(yahoo\.*)/$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(google\.*)/$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(s-msn\.*)/$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(advertising\.*)/$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(atdmt\.*)/$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(doubleclick\.*)/$2/o;
$url =~ s/([a-z]+:\/\/)??.*\.(twitter\.*)/$2/o;
Edit 1: The only issue I have found with this is that it doesn't quite like URLs that contain another server name in the path, such as "http://www.google.com/reader/api/0/stream/contents/user/-/state/com.google/starred?". However, I have seen a very low incidence of this.
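To see what one of these rules does, and why the URL in Edit 1 trips it up, here is a small standalone Perl sketch. It is not part of lightparser.pl: the subs `aggregate` and `aggregate_host_only` and the first two sample URLs are made up for illustration. The second pattern is only one possible way to keep the match inside the host name, and it has not been tried against real lightsquid logs.

```perl
use strict;
use warnings;

# Same shape as the rules above: the greedy ".*" runs to the LAST
# ".google" anywhere in the URL, including the path.
sub aggregate {
    my ($url) = @_;
    $url =~ s/([a-z]+:\/\/)??.*\.(google\.*)/$2/;
    return $url;
}

# Hypothetical variant: "[^\/]*" cannot cross a slash, so only the
# host name is searched. Note it requires the dot after "google".
sub aggregate_host_only {
    my ($url) = @_;
    $url =~ s/^(?:[a-z]+:\/\/)?[^\/]*\.(google\.)/$1/;
    return $url;
}

my @samples = (
    'http://mail.google.com/inbox',
    'http://www.google.ca/search?q=x',
    'http://www.google.com/reader/api/0/stream/contents/user/-/state/com.google/starred?',
);

for my $u (@samples) {
    print "$u\n";
    print "  as above:  ", aggregate($u), "\n";
    print "  host only: ", aggregate_host_only($u), "\n";
}
```

With the rule as written, the first two URLs collapse to google.com/inbox and google.ca/search?q=x, but the third comes out as google/starred?, which is the misbehaviour described in Edit 1. The host-only variant leaves the path alone and gives google.com/reader/... for it.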
Edit 2: Also, if you are running RC1 and update to the latest snapshot, these changes must be redone, as the modifications are in the core script, not a config file.
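Since the edits have to be redone after an update, it may help to keep the names in one list and build the substitutions in a loop, so there is only one short block to paste back in. A rough sketch, shown as a standalone script: the `@farms` array and the `simplify_url` sub are hypothetical names and do not exist in lightparser.pl, and the generated patterns have the same shape (and the same Edit 1 limitation) as the hand-written rules.

```perl
use strict;
use warnings;

# Hypothetical list: one entry per server farm to collapse.
my @farms = qw(facebook youtube fbcdn ytimg google s-msn);

# Precompile one pattern per name, same shape as the hand-written rules.
# quotemeta escapes characters such as "-" so they match literally.
my @rules = map {
    my $name = quotemeta $_;
    qr/([a-z]+:\/\/)??.*\.($name\.*)/;
} @farms;

# Apply every rule in turn, as the block of s/// lines does.
sub simplify_url {
    my ($url) = @_;
    $url =~ s/$_/$2/ for @rules;
    return $url;
}

print simplify_url('http://static.ak.fbcdn.net/x.png'), "\n";
print simplify_url('http://i3.ytimg.com/vi/abc/default.jpg'), "\n";
print simplify_url('http://www.example.org/'), "\n";
```

Adding a service is then a one-word change to the list, and URLs that match none of the names (the last sample) pass through untouched.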
Hope someone finds this helpful. :)
- Marc