[PLUG] replacing multiple instances of text
Jeme A Brelin
jeme at brelin.net
Wed Oct 1 22:59:02 UTC 2003
On Tue, 30 Sep 2003, alex wrote:
> I've searched high and low but have not been able to find a
> solution to this problem. For the background, I run a photo web
> server (around 600 pages or so and growing) at my home for a track club
> here in the Portland area. They recently acquired their own domain name
> and I need to replace all the homepage links in the HTML with the
> updated address.
>
> ex. replace www.foobar.com/RCTC/ with www.RCTC.com/
>
> My question is..... is there an easy way to do this other than
> the obvious highlight and pasting of text?
>
> I'm using MDL9.1 and Apache.
>
> Any help will be greatly appreciated.
The Perl ways are probably how I'd do it, or I'd use my copy of sed,
which includes a friend's patch so that it takes a Perl-like -i switch
(edit files in place).
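If neither Perl nor a patched sed is handy, the same in-place replacement can be sketched in Python. This is only a rough sketch, not anything from the thread: the function name replace_in_tree and the "*.html" glob are my own assumptions, and the two strings are taken from the example in the question. It works on raw bytes so it does not have to guess each page's encoding.

```python
from pathlib import Path

# Strings from the example in the question.
OLD = b"www.foobar.com/RCTC/"
NEW = b"www.RCTC.com/"


def replace_in_tree(root, old=OLD, new=NEW, pattern="*.html"):
    """Rewrite matching files under root in place; return the paths that changed.

    Hypothetical helper: the name and the default glob are assumptions.
    """
    changed = []
    for path in sorted(Path(root).rglob(pattern)):
        data = path.read_bytes()
        if old in data:
            # Replace every occurrence and write the file back in place.
            path.write_bytes(data.replace(old, new))
            changed.append(path)
    return changed
```

Run it against a copy of the document root first, for example replace_in_tree("site-copy") where "site-copy" is a placeholder directory name, and check the returned list before touching the live pages.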
Have you considered just sucking down the pages with wget using the -k
option (--convert-links)?
From `man wget`:
    -k
    --convert-links
        After the download is complete, convert the links in the
        document to make them suitable for local viewing. This
        affects not only the visible hyperlinks, but any part of
        the document that links to external content, such as
        embedded images, links to style sheets, hyperlinks to
        non-HTML content, etc.

        Each link will be changed in one of the two ways:

        o   The links to files that have been downloaded by Wget
            will be changed to refer to the file they point to as
            a relative link.

            Example: if the downloaded file /foo/doc.html links to
            /bar/img.gif, also downloaded, then the link in
            doc.html will be modified to point to ../bar/img.gif.
            This kind of transformation works reliably for
            arbitrary combinations of directories.

        o   The links to files that have not been downloaded by
            Wget will be changed to include host name and absolute
            path of the location they point to.

            Example: if the downloaded file /foo/doc.html links to
            /bar/img.gif (or to ../bar/img.gif), then the link in
            doc.html will be modified to point to
            http://hostname/bar/img.gif.

        Because of this, local browsing works reliably: if a linked
        file was downloaded, the link will refer to its local name;
        if it was not downloaded, the link will refer to its full
        Internet address rather than presenting a broken link. The
        fact that the former links are converted to relative links
        ensures that you can move the downloaded hierarchy to
        another directory.

        Note that only at the end of the download can Wget know
        which links have been downloaded. Because of that, the work
        done by -k will be performed at the end of all the
        downloads.
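To make the two rules in that excerpt concrete, here is a small Python sketch of the decision they describe. It is an illustration only, not how Wget actually implements -k: the function name convert_link, the set of downloaded paths, and the "hostname" default are all assumptions of mine, and it only handles site-absolute and relative paths.

```python
import posixpath


def convert_link(doc_path, link, downloaded, host="hostname"):
    """Rewrite one link found in the document at doc_path.

    Hypothetical illustration of the two rules in the man page excerpt;
    `downloaded` is a set of site-absolute paths that were fetched.
    """
    doc_dir = posixpath.dirname(doc_path)
    # Resolve the link to a site-absolute path. An absolute link such as
    # /bar/img.gif replaces doc_dir; a relative one is joined to it.
    target = posixpath.normpath(posixpath.join(doc_dir, link))
    if target in downloaded:
        # Rule 1: downloaded files are referenced by a relative link.
        return posixpath.relpath(target, doc_dir)
    # Rule 2: everything else gets the host name and absolute path.
    return "http://" + host + target


# The two examples from the man page:
print(convert_link("/foo/doc.html", "/bar/img.gif", {"/bar/img.gif"}))
# prints ../bar/img.gif
print(convert_link("/foo/doc.html", "../bar/img.gif", set()))
# prints http://hostname/bar/img.gif
```

Because the first rule produces relative links, a mirrored tree keeps working wherever it is moved, which is why mirroring with -k is a plausible alternative to a text replacement for the pages that link to each other.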
J.
--
-----------------
Jeme A Brelin
jeme at brelin.net
-----------------
[cc] counter-copyright
http://www.openlaw.org