[PLUG] Parsing HTML with Perl
Paul Heinlein
heinlein at madboa.com
Wed Jun 30 15:20:03 UTC 2004
On Wed, 30 Jun 2004, Shahms King wrote:
> Have you tried using HTML::Parser? It should be included with
> Fedora/RedHat and will probably work a lot better than just using
> regular expressions.
HTML::Parser is the best easy solution.
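For what it's worth, here is a minimal sketch of the HTML::Parser approach, assuming the goal is to pull links out of a page (the original question may have been after something else). The extract_links helper and the sample markup are made up for illustration; the handler setup uses the module's version 3 API.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper: return one hashref (href, text) per <a href="..."> element.
sub extract_links {
    my ($html) = @_;
    my (@links, $current);

    my $parser = HTML::Parser->new(
        api_version => 3,
        # Start tags: begin collecting when we see an <a> that has an href.
        start_h => [ sub {
            my ($tag, $attr) = @_;
            return unless $tag eq 'a' && defined $attr->{href};
            $current = { href => $attr->{href}, text => '' };
        }, 'tagname, attr' ],
        # Text: append entity-decoded text while inside a link.
        text_h => [ sub {
            my ($text) = @_;
            $current->{text} .= $text if $current;
        }, 'dtext' ],
        # End tags: a closing </a> finishes the current link.
        end_h => [ sub {
            my ($tag) = @_;
            return unless $tag eq 'a' && $current;
            push @links, $current;
            undef $current;
        }, 'tagname' ],
    );

    $parser->parse($html);
    $parser->eof;    # signal end of input so buffered text is flushed
    return @links;
}

# Made-up sample input, not taken from the original thread.
my $sample = '<p>See <a href="http://tidy.sourceforge.net/">HTML <b>Tidy</b></a>.</p>';
printf "%s => %s\n", $_->{href}, $_->{text} for extract_links($sample);
```

Because the parser is event-driven, it copes with unclosed tags and odd capitalization that tend to trip up a regular expression.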
An alternative would be to run the HTML through tidy[1] and use a
full-fledged XML parser to grab your data, which would probably allow
you a bit more flexibility than HTML::Parser.
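A rough sketch of the second half of that idea, assuming the page has already been converted to XHTML with something like "tidy -quiet -asxml -numeric page.html" and that XML::LibXML is the XML parser (any full XML parser would do; this is just one choice). The links_from_xhtml helper and the sample document are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Hypothetical helper: takes XHTML text (assumed to be tidy's output)
# and returns one hashref (href, text) per link.
sub links_from_xhtml {
    my ($xhtml) = @_;

    my $doc = XML::LibXML->load_xml(
        string       => $xhtml,
        load_ext_dtd => 0,    # do not try to fetch the XHTML DTD
        no_network   => 1,
    );

    # XHTML elements live in a namespace, so XPath needs a prefix for it.
    my $xpc = XML::LibXML::XPathContext->new($doc);
    $xpc->registerNs(x => 'http://www.w3.org/1999/xhtml');

    return map { +{ href => $_->getAttribute('href'), text => $_->textContent } }
           $xpc->findnodes('//x:a[@href]');
}

# Made-up sample standing in for tidy's output.
my $xhtml = <<'XHTML';
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Sample</title></head>
  <body><p>See <a href="http://tidy.sourceforge.net/">HTML <b>Tidy</b></a>.</p></body>
</html>
XHTML

printf "%s => %s\n", $_->{href}, $_->{text} for links_from_xhtml($xhtml);
```

The extra flexibility comes from XPath: changing the query string is enough to select table cells, headings, or elements with a particular attribute, with no new handler code.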
--Paul Heinlein <heinlein at madboa.com>
[1] http://tidy.sourceforge.net/