[PLUG] getting a file's character set?

Randal L. Schwartz merlyn at stonehenge.com
Wed Jun 25 14:44:48 UTC 2008


>>>>> "Brent" == Brent Rieck <bsr at spek.org> writes:

Brent> It's not a problem when you stay within the 7-bit ASCII limits, but when
Brent> they start throwing in trademark, registration, curly quote, em dash,
Brent> en dash, ellipsis, etc. characters, it makes the XML parser we're
Brent> using puke.

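Very likely, those files are just Latin-1 (or its Windows-1252 superset,
which is where curly quotes, em and en dashes, the ellipsis, and the
trademark sign actually live), whereas the default for XML is (sanely)
UTF-8.  Just add an XML declaration naming the real encoding as the
first line, such as <?xml version="1.0" encoding="ISO-8859-1"?>, and
try again.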
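
For what it's worth, here is a rough sketch of that fix in Python, under
a few assumptions: the file has no XML declaration yet, anything that
fails to decode as UTF-8 is treated as the fallback encoding, and the
parser is the standard library's ElementTree.  The function name
declare_encoding and its fallback parameter are made up for
illustration; they are not from any library.

```python
import xml.etree.ElementTree as ET


def declare_encoding(raw: bytes, fallback: str = "ISO-8859-1") -> bytes:
    """Prepend an XML declaration when the bytes are not valid UTF-8.

    Hypothetical helper: `fallback` is a guess at the real encoding,
    not something detected from the data.
    """
    if raw.lstrip().startswith(b"<?xml"):
        return raw  # already carries a declaration; leave it alone
    try:
        raw.decode("utf-8")
        return raw  # valid UTF-8, which is the XML default anyway
    except UnicodeDecodeError:
        header = f'<?xml version="1.0" encoding="{fallback}"?>\n'
        return header.encode("ascii") + raw


# Latin-1 bytes: 0xE9 is e-acute, 0xAE is the registered sign.
sample = b"<doc>Caf\xe9 \xae</doc>"
fixed = declare_encoding(sample)
root = ET.fromstring(fixed)
```

Without the declaration, the parser reads the sample as UTF-8 and
rejects it; with it, root.text comes back as the intended text.  If the
files really came from Windows, a different fallback may be the better
guess, provided the parser in use accepts that encoding name.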

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn at stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion



