This is the mail archive of the docbook@lists.oasis-open.org mailing list for the DocBook project.



Re: XML to XML and entities


/ FredaAnces@aol.com was heard to say:
| I am doing an XML to XML transformation.

[With my moderator hat on, I want to remind everyone that the appropriate
DocBook list for stylesheet and other application-related questions is
docbook-apps@lists.oasis-open.org.]

| If I put the following into my XSL file, entities such as &mdash; create 
| gibberish characters in the new XML file:
| 
|                  <xsl:output method="xml" />

I think on closer inspection you'll find that they aren't gibberish;
they're the UTF-8 representations of the Unicode characters that those
entities represent. For example, an mdash is Unicode character 8212
(U+2014). The only way to represent that in UTF-8 is with a multi-byte
sequence of octets, three of them in this case. That sequence, when
viewed in a tool that does not understand UTF-8, appears as three
upper-ASCII (high-bit) characters.
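
To make the mdash case concrete, here is a minimal Python sketch (not
part of the original exchange). It encodes character 8212 as UTF-8 and
then misreads the bytes one at a time. Windows-1252 is only an assumed
stand-in for whatever the viewing tool believes the encoding to be.

```python
# The mdash is Unicode code point 8212 (U+2014).
em_dash = chr(8212)

# UTF-8 needs three octets for any code point from U+0800 to U+FFFF.
utf8_bytes = em_dash.encode("utf-8")

# A tool that assumes one byte per character sees three separate
# characters. cp1252 is an assumption made for this illustration.
misread = utf8_bytes.decode("cp1252")

# Decoding the same octets as UTF-8 recovers the single mdash.
roundtrip = utf8_bytes.decode("utf-8")

print([hex(b) for b in utf8_bytes])
print(len(misread), len(roundtrip))
```

The octets in the file are correct either way; only the viewer's
interpretation of them differs.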

| If I replace the output method with "html" the entities work fine but the 
| processing instructions no longer have the correct format.  For example - I 
| get the following line without the ? for the closing tag:
| 
|                 <?xml:stylesheet type="text/xsl" href="ae_toc.xsl">

Right. When you asked for HTML, you told the processor to output HTML,
which is in ISO-Latin1, so entities have to be used for special
characters, and PIs have the SGML form (closed by ">" rather than "?>").

| Any ideas?    Thanks very much, Freda

I'm not sure what you want. The first form, in UTF-8, should be
understandable to any XML processor. The second form isn't XML.
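
As a rough check of that claim, the following Python sketch feeds an XML
parser a document with the mdash stored as raw UTF-8 octets, and another
with the numeric character reference &#8212;. The parser here is
Python's standard xml.etree.ElementTree, and the <para> element and its
text are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with the mdash written directly as UTF-8 octets.
utf8_doc = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<para>wait\u2014what</para>"
).encode("utf-8")

# The same hypothetical document using a numeric character reference.
charref_doc = b'<?xml version="1.0"?><para>wait&#8212;what</para>'

utf8_text = ET.fromstring(utf8_doc).text
charref_text = ET.fromstring(charref_doc).text

# Both forms should yield the same string containing U+2014.
print(utf8_text == charref_text)
```

Either form is the same character to an XML processor; the difference
only shows up when the file is viewed as raw bytes.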

                                        Be seeing you,
                                          norm

-- 
Norman Walsh <ndw@nwalsh.com>      | Any idiot can face a crisis; it's
http://www.oasis-open.org/docbook/ | this day-to-day living that wears
Chair, DocBook Technical Committee | you out.--Anton Chekhov
