This is the mail archive of the cygwin mailing list for the Cygwin project.



KOI8


One fairly important character encoding not yet supported by Cygwin
1.7 is KOI8. Well, two actually, because there are slightly different
versions for Russian and Ukrainian: KOI8-R and KOI8-U, aka Windows
codepages 20866 and 21866. Apparently they're de facto standards for
Unix machines and the Internet in the former Soviet Union. (Windows uses
CP1251, whereas ISO-8859-5 (Cyrillic) never caught on.)
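
To see how slight the difference between the two versions is, here is
a minimal sketch that compares them byte by byte. It assumes Python's
built-in "koi8_r" and "koi8_u" codecs as a stand-in for the tables;
the function name koi8_differences is made up for illustration and is
not part of Cygwin or newlib.

```python
def koi8_differences():
    """Map each byte that decodes differently to its (KOI8-R, KOI8-U) pair.

    Assumption: Python's bundled koi8_r/koi8_u codecs are used here purely
    for illustration; they are not the tables from the patch.
    """
    diffs = {}
    for b in range(0x80, 0x100):
        raw = bytes([b])
        r = raw.decode("koi8_r")
        u = raw.decode("koi8_u")
        if r != u:
            diffs[b] = (r, u)
    return diffs


if __name__ == "__main__":
    for b, (r, u) in sorted(koi8_differences().items()):
        print(f"0x{b:02X}: KOI8-R {r!r}  KOI8-U {u!r}")
```

The positions that differ are where KOI8-U puts Ukrainian letters in
place of some of KOI8-R's box-drawing characters.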

Cygwin's Midnight Commander actually uses KOI8 if the locale is set to
"ru" or "uk", even if a charset is specified explicitly, e.g.
"ru.CP1251". Hence you get gibberish where a helpful hint in the
user's language should be. (Of course that's primarily a shortcoming
in mc.)
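
The gibberish is easy to reproduce. The sketch below assumes the
program emits KOI8-R bytes while the terminal decodes them as CP1251;
the helper name misread is hypothetical, and Python's codecs stand in
for whatever mc and the terminal actually do.

```python
def misread(text, actual="koi8_r", assumed="cp1251"):
    """Encode text in one charset and decode the bytes as another.

    Assumption: this only models the mismatch described above (output
    produced in KOI8-R, displayed as CP1251); it is not mc's code.
    """
    return text.encode(actual).decode(assumed)


if __name__ == "__main__":
    # A Russian greeting, written as KOI8-R but displayed as CP1251.
    print(misread("Привет"))
```

Every byte still maps to a Cyrillic letter in CP1251, just the wrong
one, so the result looks like text but reads as nonsense.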

Anyway, to help support them, the attached patch adds the KOI8
charsets to newlib's Unicode conversion and ctype tables. I took the
conversion tables from iconv and adapted the ctype tables from the
CP1251 version. Since KOI8 has printable characters in the C1 range
from 0x80 to 0x9F, it seems easiest to treat them as Windows
codepages.
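
As a rough check of the C1 point, the following sketch lists the
Unicode general categories that bytes 0x80 to 0x9F decode to. It
assumes Python's bundled codecs rather than the newlib tables, and the
function name c1_categories is made up for this example.

```python
import unicodedata


def c1_categories(codec):
    """Return the Unicode general category of each byte 0x80..0x9F.

    Assumption: Python's codec tables are used for illustration only.
    errors="replace" turns any undefined byte into U+FFFD instead of
    raising an exception.
    """
    text = bytes(range(0x80, 0xA0)).decode(codec, errors="replace")
    return [unicodedata.category(ch) for ch in text]


if __name__ == "__main__":
    for codec in ("koi8_r", "koi8_u", "cp1251", "iso8859_5"):
        cats = c1_categories(codec)
        print(codec, "control chars:", cats.count("Cc"), "of", len(cats))
```

In this check ISO-8859-5 yields only control characters ("Cc") in that
range, whereas KOI8-R and CP1251 yield none.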

To complete support, "KOI8-R" and "KOI8-U" would need to be recognised
in _setlocale_r and mapped to codepages 20866 and 21866.
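
The mapping itself is small. Here is a hedged sketch of the lookup in
Python rather than C; it is not the _setlocale_r code, and the names
KOI8_CODEPAGES and codepage_for_locale are hypothetical. Only the two
codepage numbers come from the text above.

```python
# Charset names and codepage numbers as given above.
KOI8_CODEPAGES = {"KOI8-R": 20866, "KOI8-U": 21866}


def codepage_for_locale(locale_name):
    """Return the codepage for a locale's KOI8 charset, or None.

    Assumption: the locale string looks like "lang[_TERRITORY].charset"
    with an optional "@modifier"; real locale parsing may be stricter.
    """
    _, sep, charset = locale_name.partition(".")
    if not sep:
        return None  # no explicit charset given
    charset = charset.split("@", 1)[0]
    return KOI8_CODEPAGES.get(charset.upper())


if __name__ == "__main__":
    for name in ("ru_RU.KOI8-R", "uk_UA.KOI8-U", "ru.CP1251", "ru"):
        print(name, "->", codepage_for_locale(name))
```

Matching the charset name case-insensitively is a choice made for this
sketch, not something the patch specifies.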

Andy

Attachment: koi8.patch
Description: Binary data

