[PLUG] replacing multiple instances of text

Jeme A Brelin jeme at brelin.net
Wed Oct 1 22:59:02 UTC 2003


On Tue, 30 Sep 2003, alex wrote:
>         I've searched high and low but have not been able to find a
> solution to this problem. For the background, I run a photo web
> server (around 600 pages or so and growing) at my home for a track club
> here in the Portland area. They recently acquired their own domain name
> and I need to replace all the homepage links in the HTML with the
> updated address.
>
> ex. replace www.foobar.com/RCTC/ with www.RCTC.com/
>
>         My question is: is there an easy way to do this other than
> the obvious highlighting and pasting of text?
>
>         I'm using MDL9.1 and Apache.
>
>         Any help will be greatly appreciated.

The Perl approaches are probably what I'd use myself, or else my copy of
sed, which includes a friend's patch so that it takes a perl-like -i
switch (edit the files in place).
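For the sake of illustration, here is a minimal sketch of the same
in-place replacement idea, written in Python rather than perl or sed.
The two addresses come from the original question; the function name
replace_in_tree and the list of file suffixes are my own assumptions,
not anything from your setup. It does a plain string replacement (no
regular expressions) and works on raw bytes so line endings and odd
characters are left alone. Try it on a copy of the tree first.

```python
import os

# Addresses taken from the original question.
OLD = b"www.foobar.com/RCTC/"
NEW = b"www.RCTC.com/"


def replace_in_tree(root, old=OLD, new=NEW, suffixes=(".html", ".htm")):
    """Rewrite matching files under root in place; return the changed paths.

    replace_in_tree and suffixes are hypothetical names for this sketch.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Only touch files that look like HTML pages.
            if not name.lower().endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                data = fh.read()
            # Skip files that do not mention the old address at all.
            if old not in data:
                continue
            with open(path, "wb") as fh:
                fh.write(data.replace(old, new))
            changed.append(path)
    return sorted(changed)
```

Calling replace_in_tree("/path/to/copy/of/site") would return the list
of files it rewrote, which is handy for a quick sanity check before you
touch the live pages.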

Have you considered just sucking down the pages with wget using the -k
option (--convert-links)?

From `man wget`:
       -k
       --convert-links
           After the download is complete, convert the links in
           the document to make them suitable for local viewing.
           This affects not only the visible hyperlinks, but any
           part of the document that links to external content,
           such as embedded images, links to style sheets,
           hyperlinks to non-HTML content, etc.

           Each link will be changed in one of the two ways:

           o   The links to files that have been downloaded by
               Wget will be changed to refer to the file they
               point to as a relative link.

               Example: if the downloaded file /foo/doc.html
               links to /bar/img.gif, also downloaded, then the
               link in doc.html will be modified to point to
               ../bar/img.gif.  This kind of transformation works
               reliably for arbitrary combinations of directories.

           o   The links to files that have not been downloaded
               by Wget will be changed to include host name and
               absolute path of the location they point to.

               Example: if the downloaded file /foo/doc.html
               links to /bar/img.gif (or to ../bar/img.gif), then
               the link in doc.html will be modified to point to
               http://hostname/bar/img.gif.

           Because of this, local browsing works reliably: if a
           linked file was downloaded, the link will refer to its
           local name; if it was not downloaded, the link will
           refer to its full Internet address rather than
           presenting a broken link.  The fact that the former links
           are converted to relative links ensures that you can
           move the downloaded hierarchy to another directory.

           Note that only at the end of the download can Wget
           know which links have been downloaded.  Because of
           that, the work done by -k will be performed at the end
           of all the downloads.
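To make the two rules in that excerpt concrete, here is a small Python
sketch that reproduces the man page's own examples. It is only an
illustration of the rule as quoted, not how wget is actually
implemented; the function name convert_link, the "downloaded" set and
the placeholder host "hostname" are assumptions made for the example.

```python
import posixpath


def convert_link(doc_path, link, downloaded, host="hostname"):
    """Rewrite one link the way the quoted -k description says.

    doc_path   -- absolute path of the page containing the link
    link       -- the link as written in that page (absolute or relative)
    downloaded -- set of absolute paths that were fetched
    host       -- placeholder host name, as in the man page example
    """
    doc_dir = posixpath.dirname(doc_path)
    # Resolve the link against the directory of the containing page.
    target = posixpath.normpath(posixpath.join(doc_dir, link))
    if target in downloaded:
        # Rule 1: downloaded files are referred to by a relative link.
        return posixpath.relpath(target, start=doc_dir)
    # Rule 2: everything else gets the host name and absolute path.
    return "http://" + host + target


# The two examples from the man page excerpt:
print(convert_link("/foo/doc.html", "/bar/img.gif", {"/bar/img.gif"}))
print(convert_link("/foo/doc.html", "../bar/img.gif", set()))
```

The first call prints ../bar/img.gif and the second prints
http://hostname/bar/img.gif, matching the two examples in the excerpt.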

J.
-- 
   -----------------
     Jeme A Brelin
    jeme at brelin.net
   -----------------
 [cc] counter-copyright
 http://www.openlaw.org



