This is the mail archive of the cygwin-xfree mailing list for the Cygwin XFree86 project.



Re: X11R7.5 and C.UTF-8


Linda Walsh wrote:
> C.UTF_8 doesn't exist.

You're wrong. Please read the whole of this thread -- and the last two
months' worth of cygwin-developers.

> mintty is broken.

No, it isn't.  It just doesn't work the way *you* expect it to.

> Might want to try 'Console' instead of using mintty.  Not perfect either,
> but fewer compatibility problems that I've noticed.
> 
> Examples of valid LANG values:
>   C, ca_FR, en_US, fr_FR, it_IT, nl_NL, wa_BE@euro
> 
> You can't have "C" and "UTF-8", because C means no encoding (default).

No, it doesn't.  "C" means "POSIX" and is defined here:
http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap07.html
Note how all the glyphs are defined in terms of character NAMES, not
hexadecimal values?  That's because "C", all by itself, just doesn't
SPECIFY any encoding.  You're still allowed to HAVE one -- in fact, you
ALWAYS have one.

On most systems, that has historically been plain 7-bit ASCII; many
others used EBCDIC and were not considered in violation of the POSIX
"C" locale specification.  Now, many systems are starting to use UTF-8
by default, even in the "C" locale.
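
If you want to see which encoding your own system picks for plain "C",
you can simply ask the C library.  Below is a minimal sketch in Python,
assuming a Unix-like platform where locale.nl_langinfo() is available;
the helper name codeset_for is made up for illustration.  The answer
depends on the platform, so no particular output is promised.

```python
import locale

def codeset_for(locale_name):
    """Return the codeset the C library reports for locale_name,
    or None if the system does not support that locale."""
    saved = locale.setlocale(locale.LC_CTYPE)  # query only, changes nothing
    try:
        locale.setlocale(locale.LC_CTYPE, locale_name)
        return locale.nl_langinfo(locale.CODESET)
    except locale.Error:
        return None
    finally:
        # Put the previous LC_CTYPE setting back.
        locale.setlocale(locale.LC_CTYPE, saved)

if __name__ == "__main__":
    for name in ("C", "POSIX", "C.UTF-8"):
        print(name, "->", codeset_for(name))
```

Whatever it prints for "C" is the encoding you "ALWAYS have", even
though the locale name itself never mentioned one.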

The "C"/"POSIX" locale (without an additional .ENCODING suffix) is
encoding-AGNOSTIC, that's all.  So, you're allowed to add an .ENCODING
suffix to force a specific encoding if you like, without violating
POSIX.  (And your system is also allowed, in that case, to IGNORE that
.ENCODING suffix and still be POSIX-compliant IIUC, so it's rather a
hole in the spec IMO.)
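
To make the "suffix" point concrete: locale names conventionally follow
the pattern language[_territory][.codeset][@modifier], so "C.UTF-8" is
just the language part "C" with a codeset part "UTF-8" attached.  Here
is a rough sketch of pulling such a name apart; the function name
split_locale_name is hypothetical, and this only splits the string, it
says nothing about whether a given system honors the result.

```python
def split_locale_name(name):
    """Split 'language[_territory][.codeset][@modifier]' into four parts.

    Parts that are absent from the name come back as None.
    """
    base, _, modifier = name.partition("@")
    base, _, codeset = base.partition(".")
    language, _, territory = base.partition("_")
    return language, territory or None, codeset or None, modifier or None

if __name__ == "__main__":
    for name in ("C", "C.UTF-8", "en_US", "wa_BE@euro"):
        print(name, "->", split_locale_name(name))
```

Note that plain "C" comes back with no codeset at all, which is exactly
the encoding-agnostic case described above.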

> UTF-8 IS an encoding, so they are mutually exclusive.  I don't
> know under what circumstances "C" might imply UTF-8.

Whenever the platform decides to use UTF-8 as its default encoding,
which is perfectly acceptable according to POSIX.  Cygwin 1.7 has
decided to do that.  So, on Cygwin 1.7, "C" implies .UTF-8.  X11R7.5
doesn't yet know that without outside help (e.g. explicitly setting
$LANG to "C.UTF-8" by default, so that XWin "knows" about the new
default behavior).

>  If the definition
> of "C" changes?  It might be easier than changing "c" (as used in physics).
> 
> My understanding of locale issues is also limited and subject to change or
> re-education...

Uhm, yeah.

--
Chuck

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://x.cygwin.com/docs/
FAQ:                   http://x.cygwin.com/docs/faq/

