This is the mail archive of the xsl-list@mulberrytech.com mailing list.



Re: XML apparently cannot be used for general text markup: whitespace gripe


Hi Chad,

> For example, can anyone get this simple document into HTML without
> either removing required spaces or adding inappropriate spaces?
>
>   <?xml version="1.0"?>
>   <book>
>      <par>
>       Is his name really <first>John</first>      <last>Doe</last>?
>     </par>
>   </book>
>
>  Either you will end up with:
>     "Is his name really JohnDoe?"
>   which is wrong, or:

This occurs when an XSLT processor strips whitespace-only text nodes
(such as the text between the 'first' and 'last' elements) from the
node tree prior to transformation.

Most XSLT processors do not strip whitespace-only text nodes by
default, but MSXML is the exception. To stop MSXML from stripping
whitespace-only text nodes, you have to call it from script and
explicitly set the preserveWhiteSpace property on the DOMDocument
object to 'true' before loading the source document.

Another way around it is to add an xml:space attribute with the value
'preserve' to the surrounding element, so that whitespace-only text
nodes are preserved:

 <par xml:space="preserve">
   Is his name really <first>John</first>      <last>Doe</last>?
 </par>

>     "Is his name really John Doe ?"
>   which is also wrong.

If whitespace-only text nodes are preserved, I don't believe that you
actually get that in the HTML. I think that the HTML that you get is:

  <p>
    Is his name really John      Doe?
  </p>
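
A minimal stylesheet along these lines would give that kind of
result. This is only a sketch: the mapping of 'par' to 'p' is assumed
from the example above, and it relies on the processor having kept the
whitespace-only text node between 'first' and 'last'.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- indent="no" so the serializer does not add whitespace of its own -->
  <xsl:output method="html" indent="no" />

  <!-- Assumed mapping: each 'par' becomes an HTML 'p'. -->
  <xsl:template match="par">
    <p>
      <!-- The built-in templates copy every text node through,
           including the whitespace-only one between 'first' and
           'last', provided it is still in the node tree. -->
      <xsl:apply-templates />
    </p>
  </xsl:template>

</xsl:stylesheet>
```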

However, HTML display semantics are that a run of consecutive
whitespace characters within text is displayed as a single space. So
what you see on the screen is:

  Is his name really John Doe?

Preventing that happening is a matter of creating the correct HTML to
get the display that you want, which usually means including
non-breaking spaces rather than normal spaces:

  <p>
    Is his name really John&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Doe?
  </p>

Which will display as:

  Is his name really John      Doe?

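You can generate this from XSLT, of course, by replacing spaces with
the non-breaking space character. Bear in mind that translate()
replaces every space in the string, including the ordinary single
spaces between words and any indentation. For example: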

<xsl:template match="par">
  <p>
    <xsl:value-of select="translate(., ' ', '&#160;')" />
  </p>
</xsl:template>

Or if you're using a push method, you can include a template that
matches text nodes and substitutes spaces in their content as follows:

<xsl:template match="text()">
  <xsl:value-of select="translate(., ' ', '&#160;')" />
</xsl:template>
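
Put together as a complete push-style stylesheet, that might look
something like the sketch below. The 'par' to 'p' mapping is assumed
from the earlier example, and the match pattern is narrowed to
'par//text()' (a variation on the template above, not a requirement)
so that only text inside a paragraph is rewritten and the indentation
between 'book' and 'par' is left alone.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- indent="no" so the serializer does not add whitespace of its own -->
  <xsl:output method="html" indent="no" />

  <!-- Assumed mapping: each 'par' becomes an HTML 'p'. -->
  <xsl:template match="par">
    <p>
      <xsl:apply-templates />
    </p>
  </xsl:template>

  <!-- Every text node at any depth inside a 'par', including the
       whitespace-only one between 'first' and 'last', has each space
       replaced by a non-breaking space (U+00A0). -->
  <xsl:template match="par//text()">
    <xsl:value-of select="translate(., ' ', '&#160;')" />
  </xsl:template>

</xsl:stylesheet>
```

Because 'first' and 'last' are reached through xsl:apply-templates
rather than flattened by xsl:value-of, you can still add templates for
them later (to emphasise the name, say) without touching the
whitespace handling.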

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

