This is the mail archive of the xsl-list@mulberrytech.com mailing list.
Re: Character encoding problem
- To: xsl-list at lists dot mulberrytech dot com
- Subject: Re: [xsl] Character encoding problem
- From: Uche Ogbuji <uche dot ogbuji at fourthought dot com>
- Date: Thu, 24 May 2001 16:56:04 -0600
- Reply-To: xsl-list at lists dot mulberrytech dot com
Could you zip up and send these files? Cut and paste isn't bringing over the
UTF-16 properly.
Thanks.
--Uche
> Hi, folks--
>
> I'm developing a simple XSLT transformation for selecting languages
> (English or Japanese) on a bilingual website. It takes a source XHTML
> document with paired headings in English and Japanese, e.g.:
>
> <p xml:lang="en">
> [ some stuff in English ]
> </p>
> <p xml:lang="ja">
> [ same content in Japanese ]
> </p>
>
> ... and outputs everything in the selected language plus any content
> that has no language specified. At least that's the theory. I've tried
> processing it w/ (full) Saxon and 4XSLT's command line interfaces, but
> keep getting errors:
>
> Saxon:
> $ saxon main.html i18n.xsl currentLanguage=en
> Transform failed: =US-ASCII
>
> The above 'saxon' is a simple shell script I wrote just to
> save typing. It just invokes 'java com.icl.saxon.Whatever
> [<args>]'.
>
> 4XSLT:
> $ 4xslt -DcurrentLanguage=en main.html i18n.xsl
> [ long stack trace ]
> TypeError: argument(2) to filter() must be a sequence type
>
> The 4XSLT error looks like a possible bug, but the Saxon output is
> just plain puzzling. Where is 'US-ASCII' coming from? I edit the
> source in EUC-JP, then convert it to UTF-8 or UTF-16 (same results
> either way) using iconv.
>
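The conversion step described above can be mirrored in a few lines of Python to inspect the bytes that come out. This is only a sketch standing in for iconv, not a reproduction of it: the sample string, the variable names, and the helper `decodes_cleanly` are made up for illustration. It also shows why the encoding named in the XML declaration needs to agree with the bytes actually written; if a file is converted to UTF-8 while its declaration still says UTF-16, a parser may well reject it.

```python
# Sketch: re-encode EUC-JP text as UTF-8 and UTF-16, roughly what
# `iconv -f EUC-JP -t UTF-8` (or `-t UTF-16`) does.
# The sample text and all names here are illustrative assumptions.
sample = "ようこそ"  # "Welcome"

euc_bytes = sample.encode("euc_jp")    # bytes as saved by the editor
text = euc_bytes.decode("euc_jp")      # decode using the source encoding
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16")    # Python's "utf-16" codec prepends a BOM


def decodes_cleanly(data: bytes, encoding: str) -> bool:
    """Return True if `data` can be decoded under `encoding` without error."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# EUC-JP bytes are not valid UTF-8, so a mislabelled file fails early.
print(decodes_cleanly(utf8_bytes, "utf-8"))
print(decodes_cleanly(euc_bytes, "utf-8"))
```

A clean decode does not prove the label is right (some byte sequences decode under several encodings), so this is a sanity check rather than a detector.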
> So, can anybody give me a clue? Any leads would be much appreciated.
>
> Matt Gushee
>
>
> ---- i18n.xsl ---------------------------------------------
>
> <?xml version="1.0"?>
> <!-- None of the commentings-out made any difference -->
> <xsl:stylesheet version="1.0"
>     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>
>   <xsl:param name="currentLanguage" select="'en'"/>
>
>   <xsl:variable name="charEncoding">
>     <xsl:choose>
>       <xsl:when test="$currentLanguage='en'">iso-8859-1</xsl:when>
>       <xsl:when test="$currentLanguage='ja'">euc-jp</xsl:when>
>       <xsl:otherwise>utf-8</xsl:otherwise>
>     </xsl:choose>
>   </xsl:variable>
>
>   <xsl:output method="html" encoding="$charEncoding"/>
>
>   <xsl:template match="/">
>     <xsl:apply-templates/>
>   </xsl:template>
>
>   <!-- <xsl:template match="*[lang($currentLanguage) or not(@xml:lang)]"> -->
>   <xsl:template match="*[lang($currentLanguage)]">
>     <xsl:copy>
>       <!-- <xsl:for-each select="@*[name() != 'id']"> -->
>       <xsl:for-each select="@*">
>         <xsl:copy/>
>       </xsl:for-each>
>       <xsl:apply-templates/>
>     </xsl:copy>
>   </xsl:template>
>
> </xsl:stylesheet>
>
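The selection rule the stylesheet is aiming for (keep content in the chosen language, plus content with no language specified) can be sketched outside XSLT to pin down the intended output. The Python below is a simplified stand-in, not the stylesheet's actual behavior: the function name `keep_language` and the sample markup are hypothetical, it compares each element's own `xml:lang` attribute by exact match, and it ignores the inheritance and subtag matching that XPath's `lang()` performs.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def keep_language(elem, lang):
    """Copy `elem`, dropping descendants whose own xml:lang differs from `lang`.

    Elements without an xml:lang attribute are kept. The element passed in
    is always copied; only its descendants are filtered.
    """
    copy = ET.Element(elem.tag, dict(elem.attrib))
    copy.text = elem.text
    copy.tail = elem.tail
    for child in elem:
        child_lang = child.get(XML_LANG)
        if child_lang is None or child_lang == lang:
            copy.append(keep_language(child, lang))
    return copy


# Illustrative input modelled on the paired headings in main.html.
doc = ET.fromstring(
    "<body>"
    '<h1 xml:lang="en">Welcome</h1>'
    '<h1 xml:lang="ja">ようこそ</h1>'
    "<hr/>"
    "</body>"
)
english = keep_language(doc, "en")
print([child.tag for child in english])
```

Comparing this against what a processor produces may help separate problems in the selection logic from problems in the encoding handling.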
>
> --- main.html [pre-conversion: euc-jp encoding] --------------
>
> <?xml version="1.0" encoding="UTF-16"?>
> <!--
> <!DOCTYPE html PUBLIC
> "-//W3C//DTD XHTML 1.1//EN"
> "/usr/local/share/xml/xhtml/xhtml11.dtd"
> >
> -->
> <html xmlns="http://www.w3.org/1999/xhtml"
> version="-//W3C//DTD XHTML 1.1//EN"
> xml:lang="en">
> <head>
> <title>Welcome</title>
> </head>
>
> <body xml:lang="en">
> <h1 xml:lang="en">Welcome</h1>
> <h1 xml:lang="ja">ようこそ</h1>
> <hr xmlns="http://www.w3.org/1999/xhtml"/>
> <p xml:lang="en">
> The Kaiwa Club is an informal group for people who want to practice
> Japanese conversation. We welcome members at all levels of
> proficiency.
> </p>
> <p xml:lang="ja">
> 会話倶楽部は日本語の会話を練習したい人のためのインフォーマルなグループで
> ございます。レベルはかかわらず、新しい会員を大歓迎しております。
> </p>
> </body>
> </html>
>
> XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list