This is the mail archive of the libc-locales@sourceware.org mailing list for the GNU libc locales project.



[Bug localedata/13237] New: country_name field of LC_ADDRESS


http://sourceware.org/bugzilla/show_bug.cgi?id=13237

             Bug #: 13237
           Summary: country_name field of LC_ADDRESS
           Product: glibc
           Version: 2.14
            Status: NEW
          Severity: normal
          Priority: P2
         Component: localedata
        AssignedTo: libc-locales@sources.redhat.com
        ReportedBy: cjlhomeaddress@gmail.com
    Classification: Unclassified


Created attachment 5954
  --> http://sourceware.org/bugzilla/attachment.cgi?id=5954
Summary of glibc country_name field entries.

I have analysed how the country_name field of LC_ADDRESS is populated across
the glibc locales.  The findings are concerning: the field should hold the name
of the country in the language of the locale, and both of those pieces of
information are inherent in the locale name.


There are 279 locales (excluding the deprecated iw_IL).

Of those 279, only 84 locales have populated country_name fields:

 84 populated
 43 empty (not readily determined)
152 empty, but easily determined by look-up in ISO-3166 L10n files
279 total
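
A tally like the one above could be reproduced with a small script.  What
follows is only a minimal sketch, not the method actually used for this
report.  It assumes the locale source marks the section with "LC_ADDRESS" and
"END LC_ADDRESS" lines, that country_name sits on a single line, and that
commented-out lines start with "%".  The function name address_country_name is
hypothetical.

```python
import re


def address_country_name(locale_source):
    """Inspect the LC_ADDRESS section of one locale source file.

    Returns ("value", text) for a country_name line (text may be empty),
    ("copy", locale) when the section copies another locale, or
    ("missing", None) when neither is found.
    Assumption: continuation lines are not handled in this sketch.
    """
    in_address = False
    for line in locale_source.splitlines():
        stripped = line.strip()
        if stripped == "LC_ADDRESS":
            in_address = True
        elif stripped == "END LC_ADDRESS":
            break
        elif in_address:
            copied = re.match(r'copy\s+"([^"]+)"', stripped)
            if copied:
                return ("copy", copied.group(1))
            field = re.match(r'country_name\s+"(.*)"', stripped)
            if field:
                return ("value", field.group(1))
    return ("missing", None)


def is_populated(locale_source):
    """Count a locale as populated only if country_name is non-empty."""
    kind, text = address_country_name(locale_source)
    return kind == "value" and bool(text)
```

Locales that report "copy" need a second look at the copied locale, which is
how a case such as ur_IN would surface.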


Of the 84 populated country_name fields:

37 can be confirmed from ISO-3166 L10n files.

31 cannot be confirmed from ISO-3166 L10n files (not necessarily a problem).

16 have obvious encoding errors or require review and / or correction.


Examples of errors:

km_KH encodes Lao characters spelling Laos, not Khmer characters spelling
Cambodia.

bg_BG, ku_TR, mk_MK, mn_MN and tr_TR encode the English name, not the name in
the native language/script.

bo_CN and bo_IN are coded as FIXME; these should be commented out.

dz_BT is coded as BHU.

ur_IN uses "copy hi_IN", thus encoding the name of India in Hindi, not in
Urdu.

en_US encodes USA (not United States).
es_US encodes USA (not Estados Unidos).

Others include conflicts with ISO-3166 entries that require clarification.
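
Script mismatches of the km_KH kind can be flagged mechanically.  The sketch
below assumes the field value uses the <UXXXX> notation and takes the first
word of each character's Unicode name as a rough script label.  Both helper
names are hypothetical, and the sample values are illustrative rather than
copied from any locale file.

```python
import re
import unicodedata


def decode_ucs(value):
    """Turn <UXXXX> escapes into the characters they denote."""
    return re.sub(
        r"<U([0-9A-Fa-f]{4,8})>",
        lambda m: chr(int(m.group(1), 16)),
        value,
    )


def scripts_used(text):
    """Return rough script labels (first word of the Unicode name)
    for every letter or combining mark in text."""
    scripts = set()
    for ch in text:
        if ch.isalpha() or unicodedata.category(ch).startswith("M"):
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split(" ")[0])
    return scripts


# Illustrative values only: three Lao code points, then the Latin "USA".
lao_sample = decode_ucs("<U0EA5><U0EB2><U0EA7>")
latin_sample = decode_ucs("<U0055><U0053><U0041>")
print(scripts_used(lao_sample))
print(scripts_used(latin_sample))
```

A Khmer locale whose country_name yields a label other than KHMER, or a
Cyrillic-script locale that yields LATIN, would be a candidate for review.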

Some consideration should be given to correcting the obvious errors and making
the easily confirmed additions so that the LC_ADDRESS country_name field is
more usefully populated with the country name of the locale in the language of
the locale.


The first column of the attached spreadsheet contains links to 2xlibre.net
locale files (purely for convenience).  This data was recently refreshed from
the 2.14 release.

All details were checked against the original sources at:
http://sourceware.org/git/?p=glibc.git;a=tree;f=localedata/locales;h=aa17c365ce474cfb9c7dab92b623bfb5a8786208;hb=HEAD

The key columns are the "Action" (suggested) and the "Corrected country_name"
column.  The entries in the "Evidence ISO-3166" column link directly to the
relevant location within the PO files.


