This is the mail archive of the glibc-bugs@sources.redhat.com mailing list for the glibc project.



[Bug libc/1124] New: iconv incorrectly converts cp1255


When converting CP1255 to UTF-8, the last letter is missing from the output
buffer if inbytesleft is passed without counting the terminating null.

Given a five-letter CP1255 string, the first four letters are converted
correctly, but the last letter is missing (even though inbytesleft is zero
after the iconv call).

The problem does not happen if inbytesleft is 6, counting the terminating null
of inbuf.

I've attached code that reproduces the case.

Doing the same conversion with CP1253, giving 5 as the length, produces the
correct string, so I assume the problem is specific to CP1255.

Simply change CP1255 to CP1253 and watch the last letter appear...

Reproduction code:
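#include <stdio.h>
#include <iconv.h>

/* Print a null-terminated byte string as space-separated hex values. */
void tohex(const char *str) {
    const char *ch;
    for (ch = str ; *ch ; ch++) {
        printf("%02x ", *ch & 0xff);
    }
    printf("\n");
}

int main(void) {
    // A five-letter hebrew word (cp1255) - null terminated
    char buf[] = "\xf4\xe5\xf8\xe5\xed";

    // Zero-filled so that tohex() stops at the end of the converted output
    char outbuf[255] = { 0 };

    char *cp1255 = buf;
    char *utf8 = outbuf;

    // Both counters are "bytes left" values: iconv decrements them as it goes.
    // 5 deliberately excludes the terminating null of buf.
    size_t bytes_read = 5, bytes_written = sizeof(outbuf);
    iconv_t the_iconv = iconv_open("UTF8", "CP1255");
    if (the_iconv == (iconv_t)-1) {
        perror("iconv_open");
        return 1;
    }
    iconv(the_iconv, &cp1255, &bytes_read, &utf8, &bytes_written);

    printf("buf: ");
    tohex(buf);

    // size_t must be printed with %zu, not %d
    printf("bytes_read: %zu\n", bytes_read);
    printf("bytes_written: %zu\n", bytes_written);
    printf("outbuf: ");
    tohex(outbuf);

    iconv_close(the_iconv);
    return 0;
}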
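
For comparison, here is a minimal sketch of a variant that flushes the
converter after the main call. It rests on an assumption that this report does
not confirm: that the CP1255 converter holds back the last character because it
might combine with a following one. It relies on the POSIX rule that calling
iconv with a NULL inbuf and a non-NULL outbuf resets the conversion state and
writes any pending output. The helper name cp1255_to_utf8 is hypothetical.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the original report): convert `inlen`
 * bytes of CP1255 text to UTF-8, then flush the converter.
 * Returns the number of bytes written to `out`, or (size_t)-1 on error.
 */
size_t cp1255_to_utf8(const char *in, size_t inlen, char *out, size_t outcap)
{
    assert(in != NULL && out != NULL);

    iconv_t cd = iconv_open("UTF-8", "CP1255");
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    /* iconv takes char **; the input bytes themselves are not modified. */
    char *inptr = (char *)in;
    char *outptr = out;
    size_t inleft = inlen;
    size_t outleft = outcap;

    size_t rc = iconv(cd, &inptr, &inleft, &outptr, &outleft);

    /* Assumption: a final character may still be buffered in the converter.
     * A call with a NULL input asks iconv to return to the initial state
     * and emit whatever it is still holding. */
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &outptr, &outleft);

    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;
    return outcap - outleft;
}
```

Called with the five bytes f4 e5 f8 e5 ed and a length of 5, this should
return 10 if the assumption holds, with the output ending in d7 9d (the UTF-8
encoding of the final letter).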

-- 
           Summary: iconv incorrectly converts cp1255
           Product: glibc
           Version: 2.3.5
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libc
        AssignedTo: gotom at debian dot or dot jp
        ReportedBy: z9u2k at bezeqint dot net
                CC: glibc-bugs at sources dot redhat dot com


http://sources.redhat.com/bugzilla/show_bug.cgi?id=1124


