This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] [BZ 14510] Fix LC_NUMERIC for various es_* locales


On Wed, Aug 22, 2012 at 2:01 PM, Jeff Law <law@redhat.com> wrote:
> Back in November 2011, Uli checked in a large change which affected the
> LC_NUMERIC settings of various es_* locales.  This change didn't reference
> any supporting documentation.
>
> It's now being reported that various es_* locales have the wrong LC_NUMERIC
> settings for the decimal mark and thousands separator.

Thank you for working through this!

We've had a couple of attempts to fix this, but most stalled out, and
we need more volunteers to help in this area :-(

Unfortunately your first change to es_MX is wrong.

It underscores three somewhat secret beliefs I've had about locales:

(a) that without a native language reader to interpret the standards
we are bound to continue making mistakes

and

(b) that we need to reach out to governments by email and ask for
official documents or rulings

and

(c) that sometimes nobody cares what their government says and they
write it whichever way they want.

> First I compared the es_* locales to CLDR for LC_NUMERIC settings.  This
> turned up several differences (es_DO, es_GT, es_HN, es_MX, es_NI, es_PA,
> es_PE, es_PR, es_SV).
>
> For each of those locales I then went in search of documents, preferably
> government documents which would show usage of the decimal mark and
> thousands separator.
>
>
> Mexico:
> http://www.economia.gob.mx/files/diagnostico_economia_mexicana.pdf

Aurelien Jarno raised this issue for Mexico here:
http://sourceware.org/ml/libc-alpha/2012-06/msg00073.html

The same comments apply to your patch.

>diff --git a/localedata/locales/es_MX b/localedata/locales/es_MX
>index 7a1cccc..13fa1a0 100644
>--- a/localedata/locales/es_MX
>+++ b/localedata/locales/es_MX
>@@ -78,7 +78,9 @@ n_sign_posn          1
> END LC_MONETARY
>
> LC_NUMERIC
>-copy "es_ES"
>+decimal_point        "<U002C>"

This is wrong.

The official Mexican standard, with amendment, says "comma or dot" for
the decimal sign, and common practice is "dot". Therefore decimal_point
should be "<U002E>".

>+thousands_sep        "<U002E>"

This is wrong.

The thousands_sep value of "<U002E>" is a full stop and is not valid
according to the normative Mexican standard.

Page 57 of [1] states that the thousands separator must be a "small
space" (pequeño espacio), and must never be a comma, point, or other
symbol. The `thin space' <U+2009> probably serves this purpose best.
Unfortunately, as you can see in the ensuing discussion, every other
standard uses <U+0020> (normal space) instead of <U+2009>. Nobody knows
why, or what would happen if you used the non-ASCII <U+2009>. We assume
it would get transliterated to <U+0020> where necessary, but we might
hit a few bugs.

I'm happy to accept a patch that uses either <U+0020> or <U+2009>;
both are forward progress on this issue for es_MX.
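
To make the ASCII concern concrete, here is a minimal Python sketch. It
is not glibc code: `to_ascii_fallback` is a hypothetical helper standing
in for whatever transliteration would actually happen, and the sample
number is made up. It only shows that a string grouped with <U+2009>
cannot be encoded as ASCII, while the same string grouped with <U+0020>
can.

```python
THIN_SPACE = "\u2009"    # candidate thousands_sep for es_MX
NORMAL_SPACE = "\u0020"  # separator used by the other standards mentioned above


def to_ascii_fallback(text):
    """Hypothetical fallback: swap the thin space for a normal space."""
    return text.replace(THIN_SPACE, NORMAL_SPACE)


# Made-up sample value, grouped by threes with the thin space.
sample = "1" + THIN_SPACE + "234" + THIN_SPACE + "567.89"

try:
    sample.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    # U+2009 is outside ASCII, so an ASCII-only charmap cannot represent it.
    ascii_ok = False

fallback = to_ascii_fallback(sample)
print(ascii_ok, repr(fallback))
```

Whether glibc really falls back this way for every charmap is exactly
the part we have not verified.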

>+grouping             3;3
> END LC_NUMERIC
>
> LC_TIME
>---

[1] "Secretary of the Economy - Normative Mexican Standard - General
system of units of measure" (translated title provided by me)
http://www2.ine.gob.mx/publicaciones/download/008scfi.pdf

> Dominican Republic:
> http://www.bancentral.gov.do/noticias/avisos/aviso2010-06-25.pdf

I don't think this is sufficiently normative.

I've sent the Dominican Republic Ministry of Industry and Commerce an
email asking for clarification.

You are on the CC to the email.

I will update libc-alpha when I get a response back.

>diff --git a/localedata/locales/es_DO b/localedata/locales/es_DO
>index 7cf54cf..4753ecf 100644
>--- a/localedata/locales/es_DO
>+++ b/localedata/locales/es_DO
>@@ -78,7 +78,9 @@ n_sign_posn          1
> END LC_MONETARY
>
> LC_NUMERIC
>-copy "es_ES"
>+decimal_point        "<U002E>"
>+thousands_sep        "<U002C>"
>+grouping             3;3
> END LC_NUMERIC
>
> LC_TIME
>---

This translates to "1,000.00", which seems sensible.
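
As a rough check of that reading, here is a small Python sketch of how
the three LC_NUMERIC fields combine. It is an illustration only, not how
glibc formats numbers: `format_number` is a hypothetical helper, and it
assumes that the last grouping size repeats for the remaining digits,
as POSIX describes for localeconv().

```python
def format_number(value, decimal_point, thousands_sep, grouping, frac_digits=2):
    """Format a number using LC_NUMERIC-style fields.

    grouping lists group sizes from the right; the last size is
    assumed to repeat for the remaining digits.
    """
    text = f"{abs(value):.{frac_digits}f}"
    integer_part, _, fraction = text.partition(".")
    sizes = list(grouping)
    idx = 0
    groups = []
    # Peel groups off the right-hand end of the integer digits.
    while len(integer_part) > sizes[idx]:
        groups.append(integer_part[-sizes[idx]:])
        integer_part = integer_part[:-sizes[idx]]
        if idx + 1 < len(sizes):
            idx += 1
    groups.append(integer_part)
    result = thousands_sep.join(reversed(groups))
    if fraction:
        result += decimal_point + fraction
    return ("-" if value < 0 else "") + result


# Values from the es_DO hunk: decimal_point <U002E>, thousands_sep <U002C>, grouping 3;3.
print(format_number(1000, ".", ",", [3, 3]))
```

With the es_DO values this prints 1,000.00.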

> We can get grouping from this document from the Guatemala Government. Once
> we know grouping uses ',', then the decimal mark must be '.'.
> http://www.ine.gob.gt/np/enei/ENEI2011.htm

I've sent the Guatemalan Government Department of Statistics an email
asking for clarification.

You're on the CC.

I'll update libc-alpha when I get a response back.

I'll tackle Honduras, Nicaragua, Panama, Puerto Rico, and El Salvador
after dinner.

> Honduras:
> http://www.ine.gob.hn/drupal/node/175
> http://archivo.laprensa.hn/Negocios/Ediciones/2011/02/07/Noticias/Tasa-de-desempleo-de-Honduras-subio-a-44
>
> Nicaragua:
> http://www.bcn.gob.ni/estadisticas/economicas_anuales/nicaragua_en_cifras/2010/Nicaragua_en_cifras2010.pdf
>
> Panama:
> http://www.mef.gob.pa/portal/2011-Comunicados/2011-DISMINUYESUSTANCIALMENTEELDESEMPLEOENPANAMA.html
>
> Puerto Rico:
> http://www.periodicolaperla.com/index.php?option=com_content&view=article&id=3606:en-puerto-rico-la-tasa-de-empleo-cae-al-nivel-mas-bajo-en-la-historia&catid=93:analisis-economico&Itemid=300
>
> El Salvador:
> http://www.minec.gob.sv/index.php?option=com_content&view=article&catid=1:noticias-ciudadano&id=1567:encuesta&Itemid=77
>
> All the above referenced documents show a decimal mark as '.' and the
> thousands separator as ',', which indicate glibc's localedata is wrong.
>
>
> Interestingly enough, Peru which was supposed to use '.' as the decimal
> separator and ',' as the thousands separator according to CLDR seems to do
> the opposite according to these government inflation and labor reports:
>
> http://www.bcrp.gob.pe/docs/Publicaciones/Reporte-Inflacion/2010/marzo/Reporte-de-Inflacion-Marzo-2010.pdf
> http://www.inei.gob.pe/biblioineipub/bancopub/Est/Lib0909/libro.pdf
>
> Thus es_PE is correct as-is.
>
>
> This patch fixes es_DO, es_GT, es_HN, es_MX, es_NI, es_PA, es_PR and es_SV.

Cheers,
Carlos.

