[PLUG] wget and politeness

Keith Lofstrom keithl at kl-ic.com
Fri Dec 24 23:49:27 UTC 2004


A question about politeness and wget.

I have been spending quite a bit of time away from the net, tending to
my mother in the hospital and elsewhere.  I am looking at twiki as a
possible replacement for kwiki, and did a "wget -rk" of the twiki.org
site so I could peruse it offline.  Now, wget -r respects robots.txt,
so it isn't downloading anything the site does not want a site scraper
to see.
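
As a rough illustration of what "respects robots.txt" means in practice,
here is a minimal sketch using Python's standard urllib.robotparser.  It
is not wget's own implementation, and the robots.txt rules and URLs below
are made up for the example; they are not the real twiki.org file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
# It is NOT the actual file served by twiki.org.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""


def build_parser(robots_text):
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A recursive fetcher that honors robots.txt asks this question before
# requesting each URL and skips the ones that are disallowed.
allowed = parser.can_fetch("Wget", "http://twiki.org/index.html")
blocked = parser.can_fetch("Wget", "http://twiki.org/cgi-bin/view/Main")

print("index.html allowed:", allowed)
print("cgi-bin page allowed:", blocked)
```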

The wget ran slowly - it was still going 2 days and 41MB later (I have
300KBps cable, so this is surprising).  But the big annoyance was that
it stopped before wget got to the -k part (link conversion should be
redesigned to happen as it goes along, but that is a different gripe).
When I browse to the site manually, I get:

> You are black listed at the TWiki web site due to excessive access or
> suspicious activities. Please contact site administrator
> peter.thoeny at attglobalSTOPSPAM.net if you got on the list by mistake.
> Black listed IP addresses will be submitted to major blacklist databases.

So, the question: is there some wget etiquette that I don't know about?

Keith

-- 
Keith Lofstrom          keithl at keithl.com         Voice (503)-520-1993
KLIC --- Keith Lofstrom Integrated Circuits --- "Your Ideas in Silicon"
Design Contracting in Bipolar and CMOS - Analog, Digital, and Scan ICs


