This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



Re: iconv_open behaviour on EILSEQ


On Sun, 5 May 2002 00:58:54 -0700 (PDT), Paul Eggert wrote:

>> Date: Sun, 5 May 2002 00:53:04 -0400 (EDT)
>> From: Jungshik Shin <jshin@mailaps.org>
>> 
>> What is it supposed to do when it encounters a *valid*
>> byte sequence in the specified source codeset which cannot be converted
>> to the specified target codeset.
>
>POSIX 1003.1-2001 says it "shall perform an implementation-defined
>conversion on this character."

... which does not happen in the iconv() conversion of glibc 2.2.4 (SuSE
7.3).

Example: When converting Unicode to locale-specific data (ISO 8859-1,
for instance), conversion stops with a return value of -1 and errno set
to EILSEQ if the Unicode character in question cannot be represented in
the locale-specific codeset (try some Cyrillic input).
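The behaviour described above can be checked with a small sketch like the following. The helper name try_convert() is hypothetical (it is not part of glibc), and the codeset names passed to it are assumed to be ones the local iconv implementation knows. It only reports the errno value that iconv() leaves behind.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of glibc): converts `in` from codeset `from`
   to codeset `to`. Returns 0 if the whole input was converted, the errno
   value set by iconv() if the conversion stopped early, or -1 if
   iconv_open() itself failed. */
int try_convert(const char *from, const char *to, const char *in)
{
    assert(from != NULL && to != NULL && in != NULL);

    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return -1;

    char outbuf[64];
    char *inp = (char *)in;          /* iconv() takes a non-const pointer */
    char *outp = outbuf;
    size_t inleft = strlen(in);
    size_t outleft = sizeof outbuf;

    errno = 0;
    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    int err = (rc == (size_t)-1) ? errno : 0;

    iconv_close(cd);
    return err;
}
```

Calling try_convert("UTF-8", "ISO-8859-1", s) with a string s that contains a Cyrillic letter is the case described above. With glibc this is expected to report EILSEQ, while plain ASCII input converts cleanly.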

Comparing the current iconv() behaviour with how Microsoft implemented
and documented WideCharToMultiByte()


http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_2bj9.asp

it seems that the iconv() behaviour is not in line with POSIX 1003.1-2001.

Working around this problem is relatively easy if the source codeset of
the iconv conversion is the same as the codeset of the current process
locale and the choice of source and destination codesets is fairly
constrained. In that case, on EILSEQ simply skip the number of input
bytes reported by mblen() and write a substitute character, say '?', to
the output stream.
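A minimal sketch of that workaround might look as follows. The function name convert_lossy() is hypothetical. It assumes a stateless conversion, and it assumes the source codeset of the descriptor matches the codeset of the current locale, so that mblen() measures the offending character correctly. Falling back to skipping a single byte when mblen() fails is an additional assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: converts `in` through `cd` and writes a
   NUL-terminated result to `out`. Every character that iconv() rejects
   with EILSEQ is skipped and replaced by '?'. Returns 0 on success and
   -1 on any other error (for example E2BIG or EINVAL). */
int convert_lossy(iconv_t cd, const char *in, char *out, size_t outsize)
{
    assert(in != NULL && out != NULL && outsize > 0);

    char *inp = (char *)in;            /* iconv() takes a non-const pointer */
    char *outp = out;
    size_t inleft = strlen(in);
    size_t outleft = outsize - 1;      /* reserve room for the NUL */

    while (inleft > 0) {
        if (iconv(cd, &inp, &inleft, &outp, &outleft) != (size_t)-1)
            break;                     /* all remaining input was converted */
        if (errno != EILSEQ || outleft == 0)
            return -1;

        (void)mblen(NULL, 0);          /* reset mblen()'s shift state */
        int n = mblen(inp, inleft);    /* length in the locale's codeset */
        if (n <= 0)
            n = 1;                     /* assumption: skip one byte instead */

        inp += n;
        inleft -= (size_t)n;
        *outp++ = '?';                 /* substitute character */
        outleft--;
    }
    *outp = '\0';
    return 0;
}
```

A caller would normally run setlocale(LC_CTYPE, "") first and open the descriptor with the locale's codeset as the source. If the locale codeset does not match the source codeset, mblen() may misjudge the character length and more than one '?' can be written per skipped character.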

