[PLUG] Perl, regex and xslt
Michael Ewan
mhewan1 at comcast.net
Fri Jun 6 14:17:02 UTC 2008
Since these are defined as Unicode code points, is there a Unicode module
for Perl? I would start there.
Or, how about this approach...
# convert the # to a zero, so "#xE000" becomes "0xE000"
$value =~ s/\#/0/;
# Perl does not treat the string "0xE000" as hex in numeric context (it
# numifies to 0), so convert it with hex() first; then a simple range
# test works...
next if (hex($value) >= 0xE000 and hex($value) <= 0xFFFD);
Otherwise you'll need to use the RE quantifiers to give an explicit repeat
count:
$value =~ s/\#x[0-9A-F]{2,4}//ig;
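If the goal is to drop the offending characters themselves rather than the
literal "#x..." text, a negated character class built from the Char ranges
in the XML spec may be enough. This is only a sketch: the helper name
strip_invalid_xml_chars is made up for illustration, and it assumes $value
holds decoded characters rather than raw UTF-8 bytes.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): delete every character that
# falls outside the XML 1.0 Char production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# Assumes the input is a string of decoded characters, not raw UTF-8 bytes.
sub strip_invalid_xml_chars {
    my ($value) = @_;
    $value =~ s/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]//g;
    return $value;
}

# NUL, 0x01, 0x08 and 0x0B are removed; tab and newline are kept.
my $dirty = "ok\x00\x01\x08\ttab\x0Bend\n";
my $clean = strip_invalid_xml_chars($dirty);
print $clean;
```

For just the 0-8 case asked about, a plain range such as
s/[\x00-\x08]//g does the same job on a smaller scale.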
Sean Whitney wrote:
> I have some perl code that generates xml and with the help of xslt
> generates xhtml.
>
> I've found that some of my source material contains characters that xslt
> doesn't like.
>
> If I'm reading this right, this webpage seems to stipulate what
> characters are ok in xslt and what are not.
>
> http://www.w3.org/TR/REC-xml/#NT-Char
>
> #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
>
> What I am trying to figure out is how to write a series of perl
> substitution lines to ensure that the output is ok for xslt.
>
> something like
>
> $value =~ s/\x0//g
>
> This gets rid of the first null character, but is there a way to
> stipulate 0-8?
>
> If there's a better way of doing it I'm all eyes.....
>
>
>