This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



Re: [Translation-i18n] xtensa message pluralization


On Wed, Nov 08, 2017 at 05:52:59PM +0000, Paul.Koning@dell.com wrote:
> The reason for the "entire sentences" rule is that a lot of languages adjust word forms in fairly complex ways, depending not just on the number (singular/plural, etc.) but also on other considerations.  If you have two sentence fragments, in English you can typically just concatenate them and be ok.  In lots of other languages it's not that simple; the correct way to phrase part 2 may depend on what is in part 1.  A sentence is a grammatical unit, and can be translated in isolation without running into these issues.  But half a sentence cannot, at least not necessarily.

Understood, and answers my question, thanks.  Using concat is not right.

> Given the limitations of the gettext machinery, if you want clean translation there are certain message constructs you have to avoid.  It appears that messages with more than one to-be-pluralized element are such an example, since there isn't an "nngettext" to give you the correct message for the plural based on more than one value.

Avoiding such messages isn't going to happen.  The translation project
is going to be faced with sentences that really do need two or more
pluralized nouns for the sense to be conveyed naturally in English.
Avoiding two plurals in one sentence will mean loss of information
(e.g. dropping "bytes" from a quantity) or stilted, contrived sentences.

To recap, the sentence we are talking about here is:
	"format '%s' allows %d slots, but there are %d opcodes"

Bruno suggested the best solution was to break the sentence at the
conjunction "but", which is of course a natural place to break a
sentence into phrases.  (It's how I broke the sentence at first too,
when considering the reordering issue.)  The code to do that would be:

      char *phrase1, *phrase2;
      int slots = xtensa_format_num_slots (xtensa_default_isa, vinsn->format);

      if (asprintf (&phrase1, ngettext ("format '%s' allows %d slot,",
					"format '%s' allows %d slots,",
					slots),
		    xtensa_format_name (xtensa_default_isa, vinsn->format),
		    slots) == -1
	  || asprintf (&phrase2, ngettext ("there is %d opcode",
					   "there are %d opcodes",
					   vinsn->num_slots),
		       vinsn->num_slots) == -1)
	as_fatal ("%s", xstrerror (errno));

      as_bad (_("%s but %s"), phrase1, phrase2);
      free (phrase1);
      free (phrase2);

This would give a translator the following to work with:

msgid "format '%s' allows %d slot,"
msgid_plural "format '%s' allows %d slots,"
msgstr[0] ""
msgstr[1] ""

msgid "there is %d opcode"
msgid_plural "there are %d opcodes"
msgstr[0] ""
msgstr[1] ""

msgid "%s but %s"
msgstr ""

The patch I posted gives:

msgid "allows %d slot"
msgid_plural "allows %d slots"
msgstr[0] ""
msgstr[1] ""

msgid "there is %d opcode"
msgid_plural "there are %d opcodes"
msgstr[0] ""
msgstr[1] ""

msgid "format '%s' %s, but %s"
msgstr ""
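
For comparison with the asprintf code above, here is a minimal sketch of
code that would yield these three msgids.  It is not the posted patch:
the helper names are hypothetical, snprintf into fixed-size buffers
stands in for asprintf, and gettext is called directly where binutils
code would use _().  It assumes a libintl.h that provides gettext and
ngettext, as glibc does.

```c
#include <assert.h>
#include <libintl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not the posted patch.  Builds the message in BUF
   and returns 0 on success, -1 on error or truncation.  */
int
format_slots_message (char *buf, size_t size, const char *format_name,
                      int slots, int opcodes)
{
  char slots_phrase[128], opcodes_phrase[128];
  int n;

  /* Each quantity selects its own plural form independently.  */
  snprintf (slots_phrase, sizeof slots_phrase,
            ngettext ("allows %d slot", "allows %d slots", slots), slots);
  snprintf (opcodes_phrase, sizeof opcodes_phrase,
            ngettext ("there is %d opcode", "there are %d opcodes", opcodes),
            opcodes);

  /* All three components are separate arguments, so a translation of
     this msgid may place them in any order.  */
  n = snprintf (buf, size, gettext ("format '%s' %s, but %s"),
                format_name, slots_phrase, opcodes_phrase);
  return (n < 0 || (size_t) n >= size) ? -1 : 0;
}

/* Usage with no message catalog loaded, so the English msgids are used.
   The format name "F" is a placeholder.  */
void
demo_slots_message (void)
{
  char buf[256];
  int rc = format_slots_message (buf, sizeof buf, "F", 1, 2);

  assert (rc == 0);
  assert (strcmp (buf, "format 'F' allows 1 slot, but there are 2 opcodes") == 0);
}
```
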

Note that in both cases a translator does in fact have access to the
entire sentence, but with some restrictions.  In both cases the
"slots" phrase translation can't depend on the quantity in the
"opcodes" phrase translation, and vice versa.  Bruno's suggestion has
a further restriction in that the translation for "format" must be
adjacent to the "slots" translation.  So, abbreviating the format,
slots and opcodes components as F, S and O, a translator could arrange
to emit FSO, SFO, OFS or OSF, but not FOS or SOF.
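
The reordering available under Bruno's scheme can be sketched with
POSIX positional arguments.  This is only an illustration: the helper
names and the format name 'F' are hypothetical, and the swapped format
string is a made-up stand-in for a translator's msgstr, not a real
translation.  It assumes a printf family that supports "%N$s", as
glibc's does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: join two already-translated phrases using FMT,
   which stands in for whatever msgstr a translator supplies for
   "%s but %s".  Returns 0 on success, -1 on error or truncation.  */
int
join_phrases (char *buf, size_t size, const char *fmt,
              const char *phrase1, const char *phrase2)
{
  int n = snprintf (buf, size, fmt, phrase1, phrase2);
  return (n < 0 || (size_t) n >= size) ? -1 : 0;
}

/* Usage: the same two phrases, joined in source order and then swapped.  */
void
demo_reorder (void)
{
  char buf[128];
  /* F and S are baked into one phrase, so they always travel together.  */
  const char *fs = "format 'F' allows 2 slots,";
  const char *o = "there are 3 opcodes";
  int rc;

  /* Source order: FSO.  */
  rc = join_phrases (buf, sizeof buf, "%s but %s", fs, o);
  assert (rc == 0);
  assert (strcmp (buf, "format 'F' allows 2 slots, but there are 3 opcodes") == 0);

  /* Positional arguments put the opcodes phrase first: OFS.  */
  rc = join_phrases (buf, sizeof buf, "%2$s but %1$s", fs, o);
  assert (rc == 0);
  assert (strcmp (buf, "there are 3 opcodes but format 'F' allows 2 slots,") == 0);
}
```

No format string for the joining message can place the opcodes phrase
between F and S, because both live inside the first argument.
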

The patch I posted allows all the ordering possibilities, but the
translation for "format" can't depend on the "slots" quantity, and a
translator has a little more difficulty in piecing together the
sentence by just looking at the .pot file.

There is also the issue of other messages that may share the
"%s but %s" construction in the future.  If such messages exist, they
would all get the same translation, since gettext maps identical msgids
to a single msgstr.  That is another complication for anyone wanting to
reorder phrases, and a reason why it may be better to put "format" with
"but".

I think that covers all the issues I've considered.  I'm not a
linguist, and besides English I know only a little German.  So I'm
quite happy to take advice, Bruno.  My only reasons for extending this
thread were to check whether you had considered everything, and to
make other binutils developers aware of the problems they cause!  I
know there is room for a lot of improvement in the binutils source
regarding translation, not least being the fact that ld's einfo
function doesn't allow reordering.

-- 
Alan Modra
Australia Development Lab, IBM

