This is the mail archive of the
libc-alpha@sources.redhat.com
mailing list for the glibc project.
gcc ignores locale (no UTF-8 source code supported)
- To: libc-alpha at sources dot redhat dot com
- Subject: gcc ignores locale (no UTF-8 source code supported)
- From: Markus Kuhn <Markus dot Kuhn at cl dot cam dot ac dot uk>
- Date: Fri, 22 Sep 2000 12:31:21 +0100
If I compile a program (source code encoded in ISO 8859-1) that contains
the line
wprintf(L"Schöne Grüße!\n");
with
LANG=en_GB.ISO-8859-1 gcc -W -Wall -O widetest.c -o widetest
then the produced binary correctly contains the UCS-4 encoded wide
string
000005c0 53 00 00 00 63 00 00 00 68 00 00 00 f6 00 00 00 S...c...h...ö...
000005d0 6e 00 00 00 65 00 00 00 20 00 00 00 47 00 00 00 n...e... ...G...
000005e0 72 00 00 00 fc 00 00 00 df 00 00 00 65 00 00 00 r...ü...ß...e...
000005f0 21 00 00 00 0a 00 00 00 00 00 00 00 00 00 00 00 !...............
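For anyone who wants to look at the same layout without digging through the binary, here is a rough sketch that dumps the in-memory bytes of a wide string. The helper name hex_dump_wide is made up for this illustration. The literal uses universal character names so that the result does not depend on the encoding of the source file, and the four-bytes-per-character little-endian layout shown above is an assumption that holds on i386 Linux with glibc but not necessarily elsewhere.

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: writes the in-memory bytes of the first `count`
   wide characters of `s` into `out` as space-separated hex pairs.
   Returns the number of bytes actually dumped (fewer than requested
   if `out` is too small). */
static size_t hex_dump_wide(const wchar_t *s, size_t count,
                            char *out, size_t out_len)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t nbytes = count * sizeof(wchar_t);
    size_t pos = 0;
    size_t i;

    assert(out != NULL && out_len > 0);
    out[0] = '\0';

    /* Each byte needs at most 3 characters plus the terminating NUL. */
    for (i = 0; i < nbytes && pos + 4 <= out_len; i++) {
        pos += (size_t)snprintf(out + pos, out_len - pos,
                                i ? " %02x" : "%02x", p[i]);
    }
    return i;
}
```

Calling it on L"Sch\u00f6ne Gr\u00fc\u00dfe!\n" with a sufficiently large buffer and printing the result should give the same byte sequence as the dump above on a platform where wchar_t is a 32-bit little-endian UCS-4 value.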
However, if I accidentally work in a UTF-8 locale and compile the
ISO 8859-1 source code with
LANG=en_GB.UTF-8 gcc -W -Wall -O widetest.c -o widetest
then no warning message is issued and the resulting binary still
contains the result of the above ISO 8859-1 -> UCS-4 translation.
It seems that gcc ignores the locale and does not use glibc's multi-byte
decoding functions to read in wide-string literals. :-(
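To illustrate what I would have expected, here is a minimal sketch of locale-aware decoding with the standard setlocale() and mbrtowc() functions. This is not what gcc does, and the helper name decode_in_locale is invented for the example; it only shows how the C library's multi-byte decoder reacts to the same bytes under different locales.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: decodes bytes[0..n) according to the LC_CTYPE
   encoding of the named locale. Note that it changes the process-wide
   LC_CTYPE setting as a side effect.
   Returns the number of wide characters stored in `out`,
   -1 if the bytes are invalid or truncated in that encoding,
   -2 if the locale is not available on this system. */
static long decode_in_locale(const char *locale_name,
                             const char *bytes, size_t n,
                             wchar_t *out, size_t out_len)
{
    mbstate_t state;
    size_t used = 0;
    long count = 0;

    assert(bytes != NULL && out != NULL);

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -2;

    memset(&state, 0, sizeof state);   /* initial conversion state */

    while (used < n && (size_t)count < out_len) {
        size_t r = mbrtowc(&out[count], bytes + used, n - used, &state);
        if (r == (size_t)-1 || r == (size_t)-2)
            return -1;                 /* EILSEQ or incomplete sequence */
        if (r == 0)
            r = 1;                     /* a NUL character consumes one byte */
        used += r;
        count++;
    }
    return count;
}
```

Fed the ISO 8859-1 bytes of "Schöne" under a UTF-8 locale such as en_GB.UTF-8 (assuming it is installed), the decoder stops with an error at the byte 0xf6, because that byte is not followed by a UTF-8 continuation byte. A compiler that read wide-string literals this way could at least issue a diagnostic at that point instead of silently assuming ISO 8859-1.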
$ gcc -v
Reading specs from /usr/lib/gcc-lib/i486-suse-linux/2.95.2/specs
gcc version 2.95.2 19991024 (release)
Markus
--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>