This is the mail archive of the mailing list .


Character encoding problem

Hi, folks--

I'm developing a simple XSLT transformation for selecting languages
(English or Japanese) on a bilingual website. It takes a source XHTML
document with paired headings in English and Japanese, e.g.:

	<p xml:lang="en">
	  [ some stuff in English ]
	</p>
	<p xml:lang="ja">
	  [ same content in Japanese ]
	</p>

... and outputs everything in the selected language plus any content
that has no language specified. At least that's the theory. I've tried
processing it with the command-line interfaces of Saxon (the full
version) and 4XSLT, but I keep getting errors:

	$ saxon main.html i18n.xsl currentLanguage=en
	Transform failed: =US-ASCII

	The above 'saxon' is a simple shell script I wrote to save
	typing. It just invokes 'java com.icl.saxon.Whatever'.

	$ 4xslt -DcurrentLanguage=en main.html i18n.xsl
	[ long stack trace ]
	TypeError: argument(2) to filter() must be a sequence type

The 4XSLT error looks like a possible bug, but the Saxon output is
just plain puzzling. Where is 'US-ASCII' coming from? I edit the
source in EUC-JP, then convert it to UTF-8 or UTF-16 (same results
either way) using iconv.
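
One thing worth ruling out with a conversion like this: iconv changes only the bytes, so the encoding named in the XML declaration has to be kept in agreement with the converted file by hand. The following is a minimal sketch in Python of a conversion that rewrites the declaration as part of the same step, assuming a UTF-8 target. The names declared_encoding and convert and the one-line sample document are hypothetical, and whether any of this is related to the 'US-ASCII' message is not established here.

```python
import re

# Matches the encoding name in a leading XML declaration.
# Assumption: the bytes are ASCII-compatible (EUC-JP, UTF-8), not UTF-16.
DECL_RE = re.compile(rb"""^<\?xml[^>]*?encoding=["']([A-Za-z0-9._-]+)["']""")


def declared_encoding(data: bytes):
    """Return the encoding named in the XML declaration, or None if absent."""
    match = DECL_RE.match(data)
    return match.group(1).decode("ascii") if match else None


def convert(data: bytes, src: str, dst: str = "UTF-8") -> bytes:
    """Decode from src, point the declaration at dst, and re-encode as dst."""
    text = data.decode(src)
    text = re.sub(
        r"""(<\?xml[^>]*?encoding=["'])[^"']+""",
        lambda m: m.group(1) + dst,
        text,
        count=1,
    )
    return text.encode(dst)


# Hypothetical stand-in for the EUC-JP source file.
original = '<?xml version="1.0" encoding="EUC-JP"?>\n<p>ようこそ</p>'.encode("euc_jp")
converted = convert(original, "euc_jp")
print(declared_encoding(original), "->", declared_encoding(converted))
```

Comparing declared_encoding() on the file actually handed to the processor against the iconv target is a quick way to confirm that the two agree.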

So, can anybody give me a clue? Any leads would be much appreciated.

Matt Gushee

---- i18n.xsl ---------------------------------------------

<?xml version="1.0"?>
<!-- None of the commentings-out made any difference -->
<xsl:stylesheet version="1.0"

  <xsl:param name="currentLanguage" select="'en'"/>

  <xsl:variable name="charEncoding">
    <xsl:choose>
      <xsl:when test="$currentLanguage='en'">iso-8859-1</xsl:when>
      <xsl:when test="$currentLanguage='ja'">euc-jp</xsl:when>
    </xsl:choose>
  </xsl:variable>

  <xsl:output method="html" encoding="$charEncoding"/>

  <xsl:template match="/">

  <!-- <xsl:template match="*[lang($currentLanguage) or not(@xml:lang)]"> -->
  <xsl:template match="*[lang($currentLanguage)]">
      <!-- <xsl:for-each select="@*[name() != 'id']"> -->
      <xsl:for-each select="@*">

--- main.html [pre-conversion: euc-jp encoding] --------------

<?xml version="1.0" encoding="UTF-16"?>
  "-//W3C//DTD XHTML 1.1//EN"
<html xmlns=""
  version="-//W3C//DTD XHTML 1.1//EN"

  <body xml:lang="en">
    <h1 xml:lang="en">Welcome</h1>
    <h1 xml:lang="ja">ようこそ</h1>
    <hr xmlns=""/>
    <p xml:lang="en">
The Kaiwa Club is an informal group for people who want to practice
Japanese conversation. We welcome members at all levels of
    <p xml:lang="ja">
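
For comparison, the selection rule described in the message (keep elements in the chosen language, plus elements with no xml:lang of their own) can be sketched outside XSLT. This is a minimal Python sketch using the standard library's ElementTree, not the stylesheet's method. The names select_language and demo are hypothetical, SAMPLE is a simplified stand-in for main.html, and the language test is an exact string match rather than the prefix-matching, inheriting semantics of XPath's lang().

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined "xml:" prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical, simplified stand-in for main.html.
SAMPLE = """<html>
  <body>
    <h1 xml:lang="en">Welcome</h1>
    <h1 xml:lang="ja">ようこそ</h1>
    <hr/>
    <p xml:lang="en">English text</p>
    <p xml:lang="ja">日本語のテキスト</p>
  </body>
</html>"""


def select_language(root, lang):
    """Remove, in place, every descendant whose own xml:lang differs from lang.

    Elements without an xml:lang attribute are kept. Simplification: this is
    an exact comparison, so "en-US" would not count as "en".
    """
    for parent in list(root.iter()):
        for child in list(parent):
            child_lang = child.get(XML_LANG)
            if child_lang is not None and child_lang != lang:
                parent.remove(child)
    return root


def demo(lang):
    """Return (tag, text) pairs for the content elements that survive."""
    root = select_language(ET.fromstring(SAMPLE), lang)
    return [(el.tag, el.text) for el in root.iter() if el.tag in ("h1", "hr", "p")]


print(demo("en"))
print(demo("ja"))
```

Under this rule an element is judged only by its own attribute. In the posted main.html the body element itself carries xml:lang="en", so selecting 'ja' with this sketch would drop the whole body; SAMPLE leaves that attribute off for that reason.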

 XSL-List info and archive:
