This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.

case conversion with Turkish locale

From: Baris Metin <baris at uludag dot org dot tr>
To: libc-alpha at sources dot redhat dot com
Cc: cekirdek at uludag dot org dot tr
Date: Fri, 22 Oct 2004 15:06:20 +0300
Subject: case conversion with Turkish locale

Hello,

Most of the GNU (and also non-GNU free) programs which depend on case
conversion are problematic in tr_TR and tr_TR.UTF-8 locales.

The problem is simple but a big pain for us. In Turkish, the upper-case
version of i is "I with dot above" (0130;LATIN CAPITAL LETTER I WITH DOT
ABOVE), so in UTF-8 a single-byte character is converted to a multi-byte
character. The same happens when lowercasing I: the lower-case version of
I in Turkish is "i without the dot above" (0131;LATIN SMALL LETTER
DOTLESS I).

Most programs assume that the byte count stays constant and apply the
conversion in place on the original string, so the string is corrupted
or, at best, the result is not the correct conversion.
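
To make the byte-count issue concrete, here is a minimal sketch in
Python. It does not use glibc or any locale machinery: the tables
TR_UPPER and TR_LOWER and the helpers tr_upper, tr_lower and
naive_upper_in_place are hypothetical names made up for this
illustration, and the tables cover only the two Turkish i/I pairs
mentioned above. It only shows that the UTF-8 length of a string can
change under Turkish case conversion, and what a fixed-size,
byte-by-byte conversion produces instead.

```python
# Hypothetical Turkish case pairs for illustration only;
# these tables are not taken from glibc's locale data.
TR_UPPER = {"i": "\u0130", "\u0131": "I"}   # i -> I with dot above, dotless i -> I
TR_LOWER = {"I": "\u0131", "\u0130": "i"}   # I -> dotless i, I with dot above -> i


def tr_upper(text):
    """Uppercase using the Turkish pairs first, default rules otherwise."""
    return "".join(TR_UPPER.get(ch, ch.upper()) for ch in text)


def tr_lower(text):
    """Lowercase using the Turkish pairs first, default rules otherwise."""
    return "".join(TR_LOWER.get(ch, ch.lower()) for ch in text)


def naive_upper_in_place(data):
    """Mimic a program that uppercases ASCII bytes inside a same-size buffer."""
    buf = bytearray(data)
    for k, b in enumerate(buf):
        if 0x61 <= b <= 0x7A:        # ASCII 'a'..'z'
            buf[k] = b - 0x20
    return bytes(buf)


word = "iki"                                  # Turkish for "two"
src = word.encode("utf-8")
dst = tr_upper(word).encode("utf-8")

# The converted string needs more bytes than the original buffer holds.
print(len(src), len(dst))                     # 3 5

# The fixed-size conversion keeps the length but yields plain "IKI",
# which is not the Turkish upper-case form.
print(naive_upper_in_place(src))              # b'IKI'
```

The point of the sketch is that the output buffer has to be sized from
the converted text, not from the input; a program that writes the result
back over the input cannot represent the Turkish result in UTF-8.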

We are currently trying to find the problematic programs and patch them.
Gawk, grep, coreutils, vim, emacs and some others are affected by the
problem.

What I want to ask is: is this the only solution for us? If so, would it
be possible to add a caution to the glibc documentation so that
developers are aware of the issue?

best regards,
-- 
Baris Metin
http://www.metin.org

Follow-Ups:
- Re: case conversion with Turkish locale
  - From: Roland McGrath