[PLUG] Extract Google referrals from Apache logs

Paul Heinlein heinlein at madboa.com
Mon Mar 14 18:28:38 UTC 2005


I wanted a fast, easy way to summarize which Google queries are 
landing people on which pages. The script below is the result. It 
expects an Apache access log in combined format on STDIN.

My Perl skills are a bit rusty these days, so feel free to suggest 
improvements.

--Paul Heinlein <heinlein at madboa.com>

#!/usr/bin/perl -w

use strict;
use CGI;

while ( <> ) {
   chomp;
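   # parse log entries; skip lines that don't match, so stale
   # captures from an earlier line are never reused
   next unless m/^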
     (\S+)\s+           # remote address or hostname
     (\S+)\s+(\S+)\s+   # identd logname and authenticated user
     \[([^\]]+)\]\s+    # date and time
     "([^"]+)"\s+       # request, parsed later
     (\d+)\s+(\S+)\s+   # http status code and number of bytes sent
     "([^"]+)"\s+       # referring url
     "([^"]+)"          # user agent
     $
   /x;

   # assign matches
   my ($addr, $ident, $user, $time, $req, $status, $bytes, $ref, $agent) =
      ($1, $2, $3, $4, $5, $6, $7, $8, $9);

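   # only look at Google referrers, and hand CGI.pm just the query
   # string; given the whole URL it would read everything up to the
   # first '=' as a parameter name and miss a leading q=
   next unless $ref =~ m{^https?://[^/]*\bgoogle\.[^/]+/[^?]*\?(.+)};
   my $q = CGI->new($1);
   my $query = $q->param('q') or next;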

   # pull the requested path out of the request line; skip the entry
   # if the request is malformed
   my ($page) = $req =~ m/^\S+\s+(\S+)/ or next;

   # report findings
   print "Search query:  ", $query, "\n";
   print "Page accessed: ", $page, "\n\n";

}
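
As one possible improvement, the output could be tallied into a single count per query/page pair instead of one record per hit. The sketch below is only an illustration of that idea, not part of the original script: the helper names url_decode and google_query, the %count hash, and the @sample data are all made up for the example, and the query string is parsed by hand rather than with CGI.pm.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decode one URL-encoded component: '+' means space, %XX is one byte.
sub url_decode {
    my ($s) = @_;
    $s =~ tr/+/ /;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
}

# Return the decoded 'q' parameter of a referrer URL, or undef when
# the URL has no query string or no non-empty 'q' parameter.
# (Hypothetical helper, named for this example only.)
sub google_query {
    my ($ref) = @_;
    my ($qs) = $ref =~ m/\?(.*)$/ or return;
    for my $pair ( split /[&;]/, $qs ) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && $name eq 'q';
        return url_decode($value) if defined $value && length $value;
    }
    return;
}

# Made-up (referrer, page) pairs standing in for parsed log entries.
my @sample = (
    [ 'http://www.google.com/search?q=perl+regex&hl=en', '/perl.html' ],
    [ 'http://www.google.com/search?hl=en&q=perl+regex', '/perl.html' ],
    [ 'http://www.example.com/links.html',               '/index.html' ],
);

# Count how often each query lands on each page.
my %count;
for my $hit (@sample) {
    my ($ref, $page) = @$hit;
    my $query = google_query($ref);
    next unless defined $query;
    $count{"$query => $page"}++;
}

# Most frequent pairs first, ties broken alphabetically.
for my $key ( sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count ) {
    printf "%5d  %s\n", $count{$key}, $key;
}
```

In the real script, the loop over @sample would be replaced by the existing while loop, with the increment taking the place of the two print statements and the sorted report moved after the loop.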


