This is the mail archive of the docbook-apps@lists.oasis-open.org mailing list.



Re: Character encoding problems in text files included with Saxon extensions


/ Jirka Kosek <jirka@kosek.cz> was heard to say:
| iso-8859-2 and windows-1250 rather than UTF-8. The current
| implementation incorrectly interprets non-ASCII characters because
| their codes are different in single-byte encodings and in UTF-8.

Ah. This is clearly a bug on my part.

| DataInputStream. InputStreamReader automatically converts content of
| file from system encoding to Java Unicode characters. 

Ok, I'll switch to InputStreamReader asap.
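
For the record, here is a minimal sketch of the kind of change being
discussed. It is not the actual extension code; the class and method
names (EncodedTextReader, readAll) are made up for illustration. It
decodes bytes through an InputStreamReader, falling back to the system
default encoding when no charset name is given:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical helper, not part of the DocBook Saxon extensions.
class EncodedTextReader {

    // Reads the whole stream as text. A null charsetName means
    // "use the platform default encoding".
    static String readAll(InputStream in, String charsetName) {
        Charset cs = (charsetName == null)
                ? Charset.defaultCharset()
                : Charset.forName(charsetName);
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, cs))) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 0xE8 is LATIN SMALL LETTER C WITH CARON (U+010D) in ISO-8859-2.
        byte[] bytes = { (byte) 0xE8 };
        String text = readAll(new ByteArrayInputStream(bytes), "ISO-8859-2");
        System.out.println(Integer.toHexString(text.charAt(0)));
    }
}
```

Reading byte by byte and casting each byte to a char would treat the
same byte as U+00E8 instead, which is the misinterpretation Jirka
describes.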

| In addition to the default use of the system encoding, we could provide
| some mechanism for specifying the encoding of the included file.
| InputStreamReader is able to convert files from many encodings, so adding
| some attribute, notation or parameter to DocBook source would be quite
| easy. E.g.
| 
| <inlinegraphic format="linespecific"
| fileref="example_with_russian_comments.java;charset=iso-8859-5"/>

Yikes. Maybe it's time to promote the linespecific hack to a proper
text-include element. Or maybe it's time to support XInclude. Or
something.
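
If the ";charset=" suffix on fileref were adopted as proposed, splitting
it off could look roughly like the sketch below. This is only an
illustration of the proposal, not an agreed syntax or existing code, and
the FilerefCharset class and its parse method are hypothetical names:

```java
// Hypothetical helper illustrating the proposed "path;charset=name"
// convention; not part of any DocBook distribution.
class FilerefCharset {
    private static final String MARKER = ";charset=";

    final String path;     // fileref with the charset suffix removed
    final String charset;  // null when no suffix was given

    private FilerefCharset(String path, String charset) {
        this.path = path;
        this.charset = charset;
    }

    // Splits "file.java;charset=iso-8859-5" into path and charset name.
    static FilerefCharset parse(String fileref) {
        int i = fileref.lastIndexOf(MARKER);
        if (i < 0) {
            return new FilerefCharset(fileref, null);
        }
        String name = fileref.substring(i + MARKER.length());
        return new FilerefCharset(fileref.substring(0, i),
                                  name.isEmpty() ? null : name);
    }

    public static void main(String[] args) {
        FilerefCharset f = parse(
            "example_with_russian_comments.java;charset=iso-8859-5");
        System.out.println(f.path + " / " + f.charset);
    }
}
```

A null charset would then mean "fall back to the system encoding", which
matches the default behaviour suggested in the quoted message.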

| For now, I'm using modified version of Norm's extension. Of course, it

Uh, if you wanna contribute the fixes back to the main distribution, it's
not like I'd turn them down :-)

| So what is your opinion?

You found a bug. Thanks!

                                        Be seeing you,
                                          norm

-- 
Norman Walsh <ndw@nwalsh.com>      | The Future is something which
http://www.oasis-open.org/docbook/ | everyone reaches at the rate of
Chair, DocBook Technical Committee | sixty minutes an hour, whatever he
                                   | does, whoever he is.--C. S. Lewis
