This is the mail archive of the libc-locales@sourceware.org mailing list for the GNU libc locales project.



Re: [PATCH] en_CA, es_AR, es_ES: Define yesstr and nostr.


On Mon, Apr 08, 2013 at 01:48:57AM +0200, Keld Simonsen wrote:
> On Mon, Apr 08, 2013 at 01:23:59AM +0200, Petr Baudis wrote:
> >   Hi!
> > 
> > On Mon, Apr 08, 2013 at 01:14:51AM +0200, Keld Simonsen wrote:
> > > On Sun, Apr 07, 2013 at 11:02:06PM +0200, Petr Baudis wrote:
> > > >   (Though I'm not particularly fond of having the ASCII contents of the
> > > > datapoint sequence repeated in the comment, as all data duplication adds
> > > > a potential for inconsistencies. Ideally, we would just actually write
> > > > the characters right in the values instead of the codepoints; I didn't
> > > > find any technical reason why to insist on the <U...> syntax for all
> > > > characters. But then again, I'm personally unlikely to gather the
> > > > momentum to do such a change, mainly to verify that it really is 100%
> > > > safe.)
> > > 
> > > The locales are character set independent, so they will run with utf-8, iso-8859-1, iso-8859-15
> > > and even EBCDIC. They are written in ASCII only, to better the portability between systems with
> > > different character sets.
> > 
> >   But it's 2013. I claim that portability of locale source files to
> > EBCDIC is totally irrelevant in glibc and whoever cares should bear the
> > burden of writing the conversion tools.
> 
> No, it is not. We are discussing EBCDIC in the Austin group.
> Anyway, we still need to be character set independent. Then the EBCDIC support comes for free.

  I don't understand: why can't we assume an ASCII-compatible (or, if you
will, POSIX portable character set compatible) charset in the locale
definition files?

  If we never ever write anything but <...> stuff in the key values,
why shouldn't we be able to define how non-<...> stuff is interpreted?
(Actually, in lineparser.c, we already explicitly state that "We assume
here that every character which stands for itself is encoded using ISO
8859-1.")
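
  To illustrate what such a rule could look like, here is a rough Python
sketch. It is not the actual lineparser.c logic; decode_value is a
hypothetical helper, and accepting four to eight hex digits inside <U...>
is an assumption of the sketch. Symbols of the form <UXXXX> become code
points, and every other byte is read as ISO 8859-1:

```python
import re

# Symbolic names such as <U0079>; the 4-to-8 hex digit range is an
# assumption of this sketch, not taken from the glibc sources.
_UCS_SYMBOL = re.compile(rb"<U([0-9A-Fa-f]{4,8})>")


def decode_value(raw: bytes) -> str:
    """Decode a locale value mixing <U...> symbols and literal bytes.

    Literal bytes (anything outside a <U...> symbol) are interpreted as
    ISO 8859-1, mirroring the assumption quoted from lineparser.c.
    """
    pieces = []
    pos = 0
    for match in _UCS_SYMBOL.finditer(raw):
        # Literal text between the previous symbol and this one.
        pieces.append(raw[pos:match.start()].decode("iso-8859-1"))
        # The symbol itself names a single Unicode code point.
        pieces.append(chr(int(match.group(1), 16)))
        pos = match.end()
    # Trailing literal text after the last symbol.
    pieces.append(raw[pos:].decode("iso-8859-1"))
    return "".join(pieces)


if __name__ == "__main__":
    # Both spellings describe the same string under this rule.
    print(decode_value(b"<U0079><U0065><U0073>"))
    print(decode_value(b"yes"))
```

  Under such a rule the codepoint spelling and the plain ASCII spelling of
a value decode to the same string, so no comment repeating the ASCII
contents would be needed.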

> >   I don't think it would be a big fuss if we just UTF8-encoded locale
> > files, but even if we only embrace the ASCII (!) and substitute 7bit
> > codepoint markups with the actual ASCII characters, that would be a
> > huge practical step forward already.
> 
> We should not just do UTF-8, that would be a major mistake.
> We have embedded systems, we have UTF-16, we have 8-bit systems, EBCDIC.

  But what does that have to do with locale definition files? It can be
localedef's job to deal with this.
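
  As a sketch of that separation (Python again, purely illustrative;
encode_for_charset is a hypothetical name and the list of target charsets
is just an example), the value read from the source file is plain text,
and the target encoding is chosen only when compiling:

```python
def encode_for_charset(text: str, charset: str) -> bytes:
    """Encode an already-decoded locale value for one target charset.

    This stands in for the conversion step a compiler such as localedef
    would perform; it is not how localedef is actually implemented.
    """
    return text.encode(charset)


if __name__ == "__main__":
    yesstr = "yes"  # the value as written in the source file
    # cp037 is one of the EBCDIC code pages shipped with Python.
    for charset in ("utf-8", "iso-8859-15", "utf-16-le", "cp037"):
        print(charset, encode_for_charset(yesstr, charset).hex())
```

  The same source value yields different bytes per target, so how the
source file itself is encoded does not limit which charsets the compiled
locale can serve.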

-- 
				Petr "Pasky" Baudis
	For every complex problem there is an answer that is clear,
	simple, and wrong.  -- H. L. Mencken
