[PATCH v4 0/3] C++20 P0482R6 and C2X N2653: char8_t, mbrtoc8(), and c8rtomb()

Tom Honermann tom@honermann.net
Wed Jul 6 03:27:45 GMT 2022


On 7/4/22 3:08 PM, Adhemerval Zanella wrote:
>
>> On 30 Jun 2022, at 09:52, Tom Honermann via Libc-alpha <libc-alpha@sourceware.org> wrote:
>>
>> This series of patches provides the following:
>> - A fix for BZ #25744 [1].
>> - Implementations of the mbrtoc8 and c8rtomb functions adopted for
>>   C++20 via WG21 P0482R6 [2] and for C2X via WG14 N2653 [3].
>> - A char8_t typedef as adopted for C2X via WG14 N2653 [3].
>>
>> The fix for BZ #25744 [1] is included in this patch series because the tests
>> for mbrtoc8 and c8rtomb depend on it for exercising the special case where a
>> pair of Unicode code points is converted to/from a single double byte sequence.
>> Such conversion cases exist for Big5-HKSCS.
>>
>> N2653 was adopted by WG14 for C2X and wording is present in the N2912 revision
>> of the C2X working draft [4].  This patch series enables the new declarations
>> in C2X mode and when _GNU_SOURCE is defined.
>>
>> This patch series addresses feedback provided in response to the prior
>> submission series starting at [5].  All feedback was addressed except as noted
>> below:
>> - I removed the previously proposed wcsmbs/test-char8-type.c test as redundant
>>   per feedback. I prototyped extending the c++-types.data test as suggested,
>>   but that would have introduced a C++20 dependency.  The test demonstrated that
>>   the char16_t and char32_t types were mapped to 'Ds' and 'Di' as expected, but
>>   char8_t got mapped to 'h' rather than 'Du'.  It might be worth extending this
>>   test in the future if/when C++20 support becomes a minimal requirement.
>> - I did not switch the iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c test to use
>>   TEST_COMPARE in favor of the existing custom error handling; the current
>>   error messages are customized to provide a more detailed explanation of the
>>   error, so removing that seemed like an arguable regression.  My changes retain
>>   consistency with the existing code.  I think it is reasonable to switch to
>>   TEST_COMPARE, but I think that should be done separately from this change and
>>   separately motivated.
>> - I did not change the c8rtomb() and mbrtoc8() implementations to use thread
>>   local storage when PS is null; the implementations continue to use static
>>   storage consistent with c16rtomb(), mbrtoc16(), c32rtomb(), and mbrtoc32().
>>   I think it is reasonable to switch all of these functions to use thread
>>   local storage, but I think such a change should be done consistently across
>>   all of them and should be separately motivated.
>>
>> Thank you to Joseph Myers, Carlos O'Donell, Adhemerval Zanella, and Florian
>> Weimer for prior reviews of this patch series.
>>
>> Tested on Linux x86_64.
>>
>> [1]: Bug 25744
>>      "mbrtowc with Big5-HKSCS returns 2 instead of 1 when consuming the
>>      second byte of certain double byte characters"
>>      https://sourceware.org/bugzilla/show_bug.cgi?id=25744
>>
>> [2]: WG21 P0482R6
>>      "char8_t: A type for UTF-8 characters and strings (Revision 6)"
>>      https://wg21.link/p0482r6
>>
>> [3]: WG14 N2653
>>      "char8_t: A type for UTF-8 characters and strings (Revision 1)"
>>      http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2653.htm
>>
>> [4]: WG14 N2912
>>      "Programming languages — C working draft — June 8, 2022"
>>      https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf
>>
>> [5]: libc-alpha mailing list
>>      "[PATCH 0/3]: C++20 P0482R6 and C2X N2653: support for char8_t, mbrtoc8(), and c8rtomb()."
>>      https://sourceware.org/pipermail/libc-alpha/2022-February/136723.html
> Hi Tom,
>
> Patches look good in general, just some really minor style issues I
> found in last review.  I have fixed them and if you are ok I can
> install them, I have create a branch so you can check [1].

Excellent, thank you! I verified that your branch differs from the 
patches I posted just by the style issues mentioned in your review and 
the additional removal of the (empty) 
sysdeps/mach/hurd/libhurduser.abilist and 
sysdeps/mach/libmachuser.abilist files (which I presume to be 
intentional). Looks good to me! I very much appreciate you making the 
additional style corrections!

Tom.

>
> [1] https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/azanella/mbrtoc8-c8rtomb
>


More information about the Libc-alpha mailing list