[PLUG] wget and politeness
Keith Lofstrom
keithl at kl-ic.com
Fri Dec 24 23:49:27 UTC 2004
A question about politeness and wget.
I have been spending quite a bit of time away from the net, tending to
my mother in the hospital and elsewhere. I am looking at twiki as a
possible replacement for kwiki, and did a "wget -rk" of the twiki.org
site so I could peruse it offline. Now, wget -r respects robots.txt,
so it isn't downloading anything the site does not want a site scraper
to see.
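
For concreteness, the invocation described above can be sketched as
below. The URL scheme and the variable names are assumptions made for
illustration. The second command adds wget's throttling options, which
were not part of the original run and are shown only as one possible
gentler variant. The script prints the commands instead of running
them.

```shell
#!/bin/sh
# Assumed target URL; the post only says "the twiki.org site".
url="http://twiki.org/"

# What the post describes: -r recurses through links, -k converts links
# for offline viewing (wget does the conversion after the download ends).
original="wget -r -k $url"

# Hypothetical gentler variant, not from the post:
#   --wait=2          pause 2 seconds between retrievals
#   --random-wait     vary that pause so requests are less regular
#   --limit-rate=50k  cap download bandwidth at about 50 KB/s
polite="wget -r -k --wait=2 --random-wait --limit-rate=50k $url"

# Print rather than execute, so nothing is fetched until reviewed.
echo "$original"
echo "$polite"
```

Whether any particular delay or rate cap would have satisfied this
site's blacklist rules is unknown; the values here are placeholders.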
The wget went slowly: it was still going 2 days and 41MB later (I have
300KBps cable, so this is surprising). But the big annoyance was that
it stopped before wget got to the -k part (it should be redesigned to
do this as it goes along, but that is a different gripe). When I
browse to the site manually, I get:
> You are black listed at the TWiki web site due to excessive access or
> suspicious activities. Please contact site administrator
> peter.thoeny at attglobalSTOPSPAM.net if you got on the list by mistake.
> Black listed IP addresses will be submitted to major blacklist databases.
So, the question: is there some wget etiquette that I don't know about?
Keith
--
Keith Lofstrom keithl at keithl.com Voice (503)-520-1993
KLIC --- Keith Lofstrom Integrated Circuits --- "Your Ideas in Silicon"
Design Contracting in Bipolar and CMOS - Analog, Digital, and Scan ICs