[PLUG] Perl, regex and xslt

Michael Ewan mhewan1 at comcast.net
Fri Jun 6 14:17:02 UTC 2008


Since these are defined as Unicode code points, is there a Unicode module 
for Perl? I would start there. 
Or, how about this approach...

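# convert the leading # to a zero, so "#xE000" becomes "0xE000"
$value =~ s/\#/0/; 
# Perl does not treat the string "0xE000" as hex in a numeric context 
# (it evaluates to 0), so convert it with hex() before the range test
my $code = hex($value);
next if ($code >= 0xE000 and $code <= 0xFFFD);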
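
If $value holds the actual text rather than a "#xNNNN" token, the same 
kind of range test can be applied to each character with ord(). A minimal 
sketch, assuming the input is already a decoded Perl character string 
(not raw UTF-8 bytes); the names is_xml_char and strip_non_xml are made 
up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: true if the code point is allowed by the XML 1.0
# Char production (#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] |
# [#x10000-#x10FFFF]).
sub is_xml_char {
    my ($cp) = @_;
    return 1 if $cp == 0x9 or $cp == 0xA or $cp == 0xD;
    return 1 if $cp >= 0x20    and $cp <= 0xD7FF;
    return 1 if $cp >= 0xE000  and $cp <= 0xFFFD;
    return 1 if $cp >= 0x10000 and $cp <= 0x10FFFF;
    return 0;
}

# Hypothetical helper: keep only the characters that pass the range test.
sub strip_non_xml {
    my ($text) = @_;
    return join '', grep { is_xml_char(ord $_) } split //, $text;
}
```

Something like $value = strip_non_xml($value); before writing the XML 
would then drop the offending characters.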

Otherwise you'll need to use the RE quantifiers to give an explicit repeat 
count. This one removes a literal "#x" followed by two to four hex digits:
$value =~ s/\#x[0-9A-F]{2,4}//ig;
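
For the original question (dropping the characters themselves, such as 
0 through 8), a range inside a character class may be closer to what is 
wanted. A sketch built from the Char production quoted below, assuming 
$value is a decoded character string; the sub names clean_for_xml and 
strip_0_to_8 are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: delete every character outside the XML 1.0 Char ranges.
sub clean_for_xml {
    my ($value) = @_;
    $value =~ s/[^\x09\x0A\x0D\x20-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]//g;
    return $value;
}

# Narrower version covering only the 0-8 range asked about.
sub strip_0_to_8 {
    my ($value) = @_;
    $value =~ s/[\x00-\x08]//g;
    return $value;
}
```

The negated class keeps tab, newline and carriage return while removing 
the other control characters in one pass.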



Sean Whitney wrote:
> I have some perl code that generates xml and with the help of xslt 
> generates xhtml.
>
> I've found that some of my source material contains characters that xslt 
> doesn't like.
>
> If I'm reading this right, this webpage seems to stipulate what 
> characters are ok in xslt and what are not.
>
> http://www.w3.org/TR/REC-xml/#NT-Char
>
> #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
>
> What I am trying to figure out is how to write a series of perl 
> substitution lines to ensure that the output is ok for xslt.
>
> something like
>
> $value =~ s/\x0//g
>
> This gets rid of the first null character, but is there a way to 
> stipulate 0-8?
>
> If there's a better way of doing it I'm all eyes.....
>
>
>   



