This is the mail archive of the xconq7@sourceware.cygnus.com mailing list for the Xconq project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: internationalization




On 28 Apr 1998, Massimo Campostrini wrote:

> Stan Shebs <shebs@cygnus.com> writes:
> 
> > Character sets I'm not too experienced with - what does the code have
> > to do differently?  Presumably any a-z tests are bogus, what else?
> 
> I don't know the Mac's approach to non-ASCII characters.  In X, you
> can pick a font containing all the characters your language needs
> (including characters with diacritics, ligatures etc.).  The
> ISO-8859-1 standard (256 characters, the first 128 coinciding with
...
> A minimalist approach in an X-only Xconq would be to select a charset
> (per language?), then code all the strings (in relevant .c and .g
> files) in the given charset.  This would mean that we can't mix
> languages at will.  And I guess it's not portable.
> 
> To really tackle the problem, we must
> 0) choose a character set large enough to display all the desired languages;
> 1) choose an internal representation for the character set;
> 2) implement character display on every front-end.
> 
> An ultimate solution would be adopting Unicode (ISO-10646).  The
> internal representation could be UTF-8, which is compatible with
> ASCII.  Unicode support is getting widespread; we can hope that
> Unicode software and fonts will be available on most systems Real Soon
> Now.
> 
> Any Unicode expert out there?  Are there good free X unicode fonts?  Mac?
Well, I am *not* a Unicode expert at all, but you can find lots of them
working on Omega (a Unicode-capable TeX implementation), and they have
decent-looking printable fonts if you need them.

But using complete Unicode fonts (I hear there are some available for X,
and surely commercial platforms like Mac and Windows claim to have
something in that field) would be very expensive. In X that would
translate into font buffering costs rocketing up to the skies.

In fact, we do not need complete Unicode fonts, just the encoding. We can
do the rest with "virtual fonts" (see http://www.ens.fr/omega/ for
references): mappings into real fonts, which can be existing fonts, whether
or not they are close to Unicode pages. (ISO-8859-1 is a lot like Unicode
page 1, but a virtual font does not care; only whoever has to write the
mapping does. No problem for Mac encodings here.)
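
To make the idea concrete, here is a minimal sketch in C of what such a
mapping could look like. Everything in it is an assumption made for
illustration (the struct vf_range layout, the vf_lookup name, the font
indices); it is not taken from Omega, Emacs or Xconq. Each table entry
maps a run of Unicode code points onto glyph positions in one real 8-bit
font; the ISO-8859-2 positions shown are the ones for the Slovenian
letters with carons.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical virtual-font entry: a run of Unicode code points that
   lives at consecutive glyph positions in one real 8-bit font. */
struct vf_range {
    unsigned long first;      /* first code point of the run */
    unsigned long last;       /* last code point of the run (inclusive) */
    int font_id;              /* which real font holds the glyphs */
    unsigned int glyph_base;  /* glyph position of `first` in that font */
};

/* Assumed font numbering: 0 = an ISO-8859-1 font, 1 = an ISO-8859-2 font. */
static const struct vf_range vf_table[] = {
    { 0x0020, 0x007E, 0, 0x20 },  /* ASCII, taken from the Latin-1 font */
    { 0x00A0, 0x00FF, 0, 0xA0 },  /* Latin-1 matches Unicode page 0 here */
    { 0x010C, 0x010C, 1, 0xC8 },  /* C with caron */
    { 0x010D, 0x010D, 1, 0xE8 },  /* c with caron */
    { 0x0160, 0x0160, 1, 0xA9 },  /* S with caron */
    { 0x0161, 0x0161, 1, 0xB9 },  /* s with caron */
    { 0x017D, 0x017D, 1, 0xAE },  /* Z with caron */
    { 0x017E, 0x017E, 1, 0xBE }   /* z with caron */
};

/* Look up a code point. On success store the real font and the glyph
   position and return 1; return 0 if no real font covers the code point. */
int vf_lookup(unsigned long cp, int *font_id, unsigned int *glyph)
{
    size_t i;

    assert(font_id != NULL && glyph != NULL);
    for (i = 0; i < sizeof vf_table / sizeof vf_table[0]; i++) {
        if (cp >= vf_table[i].first && cp <= vf_table[i].last) {
            *font_id = vf_table[i].font_id;
            *glyph = vf_table[i].glyph_base
                     + (unsigned int)(cp - vf_table[i].first);
            return 1;
        }
    }
    return 0;
}
```

A front end would call vf_lookup once per character and draw the returned
glyph with the returned font. Supporting a Mac encoding would then mean
writing another table rather than touching the callers.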

This is essentially the same mechanism Emacs 20 is using to combine multiple
character sets, but they did not have enough bits for Unicode and
so have less elegant solutions for now.

The nature of war is such that there are usually several languages
involved, but luckily rarely all of them, so you would usually have only 1
to 3 fonts loaded at a time. If they were all the same size (see
http://www.biz.net.pl/english/x-fonts/index.html for an example of
ISO-8859-2 X11 fonts matching the ISO-8859-1 fonts of the X11 distribution
in size), it would not even complicate the display that much. That would
only happen when complex scripts get implemented :)
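
As a rough illustration of the "1 to 3 fonts" point, the sketch below
decodes UTF-8 text and reports which real fonts a string would need, so
that a front end could load only those. The names (utf8_decode,
fonts_needed, the FONT_* flags) are hypothetical, and the code point
ranges are a deliberate simplification: ISO-8859-2 does not cover all of
Latin Extended-A, and only sequences of up to three bytes are decoded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags, one per real font a front end might load. */
enum {
    FONT_LATIN1   = 1,  /* U+0000..U+00FF */
    FONT_LATIN2   = 2,  /* U+0100..U+017F (simplified) */
    FONT_CYRILLIC = 4,  /* U+0400..U+04FF */
    FONT_UNKNOWN  = 8   /* malformed input or no font assigned */
};

/* Decode one UTF-8 sequence of up to three bytes from a NUL-terminated
   string. Stores the code point in *cp and returns the number of bytes
   consumed, or 0 for malformed, overlong or longer sequences. */
size_t utf8_decode(const unsigned char *s, unsigned long *cp)
{
    assert(s != NULL && cp != NULL);
    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    }
    if ((s[0] & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80) {
        *cp = ((unsigned long)(s[0] & 0x1F) << 6) | (s[1] & 0x3F);
        return *cp >= 0x80 ? 2 : 0;
    }
    if ((s[0] & 0xF0) == 0xE0 && (s[1] & 0xC0) == 0x80
        && (s[2] & 0xC0) == 0x80) {
        *cp = ((unsigned long)(s[0] & 0x0F) << 12)
              | ((unsigned long)(s[1] & 0x3F) << 6)
              | (s[2] & 0x3F);
        return *cp >= 0x800 ? 3 : 0;
    }
    return 0;
}

/* Return the set of FONT_* flags needed to display a UTF-8 string. */
unsigned int fonts_needed(const char *text)
{
    const unsigned char *s = (const unsigned char *)text;
    unsigned int mask = 0;

    assert(text != NULL);
    while (*s != '\0') {
        unsigned long cp;
        size_t n = utf8_decode(s, &cp);

        if (n == 0) {            /* skip one bad byte and keep going */
            mask |= FONT_UNKNOWN;
            s++;
            continue;
        }
        if (cp <= 0xFF)
            mask |= FONT_LATIN1;
        else if (cp <= 0x17F)
            mask |= FONT_LATIN2;
        else if (cp >= 0x400 && cp <= 0x4FF)
            mask |= FONT_CYRILLIC;
        else
            mask |= FONT_UNKNOWN;
        s += n;
    }
    return mask;
}
```

A game with, say, an English, a Slovenian and a Russian side would end up
with three flags set and so three fonts loaded, while a pure ASCII
scenario would need only the first.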

If this looks complicated, that is because it is. But thinking this
through before any work is started will be less complicated than
implementing every encoding separately.

> 
> Jan Javorsek informed me (did he cc to the list?) that Slovenian has
> declensions and the dual case, like classical Greek; once Slovenian is
> supported, I think that all Indo-European languages can be generated.

Well, I did not post this to the list, and since it has now been posted
anyway, I have to mention that Slovenian is not *really* much more
difficult than any other Slavic language, but it does have some exotic
features (and, of course, lacks others, such as the vocative).

I would of course be more than happy to present it as a test case, but
considering our pacifistic history, we might have trouble finding good
historical examples :)

Jan Javorsek




