[PLUG-TALK] Web Server Logs, Marketing, and Privacy

Keith Lofstrom keithl at kl-ic.com
Mon May 16 15:25:44 PDT 2005

Warning: this posting could lead to strong emotions and flameage!

I am working on a startup.  At this stage, our biggest problem is to
estimate the number of potential clients.  I pointed out to my marketing
guy how easily this could be done.  We could provide a collection of
interesting content on a website, attractive to our potential clients,
and then digest the information that the web server leaves in the log
files, and use that to determine which companies and divisions to
research and then cold-call.

He was horrified;  he had never considered what a trail he leaves
behind him as he surfs the net.  The marketing guy is a highly
ethical fellow (yay!) and is very concerned for customer privacy.
I explained that most web servers log the IP address of the
requesting host, the page, the time, the browser used, and the
referring page.  All this log info is analyzed in depth by some
websites.  You can tell a lot about a user from the order and timing
of pages viewed, and what company he/she is surfing from - unless it
is a dynamic IP address that changes very frequently.  
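
To make the log fields concrete, here is a minimal sketch of pulling
them out of one log line.  It assumes Apache's "combined" log format;
the sample line, the addresses, and the names COMBINED and parse_line
are made up for illustration, not taken from any real server.

```python
import re

# Assumed layout: Apache "combined" format, i.e.
# host ident user [time] "request" status size "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of the logged fields, or None if the line does not match."""
    m = COMBINED.match(line)
    return m.groupdict() if m else None

# Hypothetical sample line (documentation-range IP, example.com referrer).
SAMPLE = ('192.0.2.17 - - [16/May/2005:15:25:44 -0700] '
          '"GET /whitepapers/index.html HTTP/1.1" 200 5120 '
          '"http://www.example.com/search?q=asic" '
          '"Mozilla/5.0 (X11; Linux i686)"')

entry = parse_line(SAMPLE)
for field in ("host", "time", "request", "referer", "agent"):
    print(field, "=", entry[field])
```

A reverse DNS or whois lookup on the host field is what would tie a
visit to a company;  that step is left out here.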

This may be news for a lot of you folks, too.  Those pr0n sites you
are visiting may know who you are and what your kinks are;  perhaps 
they can even use those visits as exculpatory evidence if you ever
take them to court for spamming you with pr0n.  Watch where you surf...

Meanwhile, anybody with a sniffer anywhere along the path knows where
you are going and when.  Even if the website completely discards the
server log info, Uncle Carnivore knows what you are doing, and is
looking at ALL your surfing, not just one site's worth.

The whole issue of privacy during browsing, or alternately extracting
marketing information during browsing, deserves some discussion.  My
feeling is that if you explain your policy, reduce the data to anonymity,
and give the user a clickybox to opt out, then it is borderline OK to
collect the data in return for useful content.  OTOH, the worst sites
collect all of the data and analyze the hell out of it, and who wants
to be evil like them?  On the gripping hand, much of this data could be
blocked on the way out of the browser, or by scrubbing through third
party proxies;  but nobody seems to bother. 
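
One possible way to "reduce the data to anonymity" is to truncate the
client address before it is stored or analyzed.  The sketch below is
only an illustration of that idea:  the function name anonymize_ip and
the /24 and /48 prefix lengths are my assumptions, and truncation
merely coarsens the address.  It can still point at a company's
network, so it is not true anonymity.

```python
import ipaddress

def anonymize_ip(addr, v4_prefix=24, v6_prefix=48):
    """Zero the low-order bits of an address (prefix lengths are assumptions)."""
    ip = ipaddress.ip_address(addr)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False lets us build the enclosing network from a host address.
    net = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(net.network_address)

# Documentation-range addresses, used here only as examples.
print(anonymize_ip("192.0.2.17"))
print(anonymize_ip("2001:db8:1234:5678::1"))
```

Discarding the address entirely, or keeping only per-day totals, would
be stronger than this.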

Nobody seems to turn off their Apache logs, so nearly all of you
running websites are passively collecting this info (a very few
concerned web-hosts scrub it).  At best this is "security through
lethargy".  At worst, it is acting as an unpaid spy, and in the
future your web logs may be subpoenaed or merely hacked out of your
site.  Some sites even have publicly available web server logs!

So how valuable is web surfing privacy, and what are you willing to do,
or to give up, in order to protect it for yourself or others?


(for more info, see
http://www.theregister.co.uk/2003/04/06/the_trails_left_in_web/print.html )

Keith Lofstrom          keithl at keithl.com         Voice (503)-520-1993
KLIC --- Keith Lofstrom Integrated Circuits --- "Your Ideas in Silicon"
Design Contracting in Bipolar and CMOS - Analog, Digital, and Scan ICs
