This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.
Re: Bug in mbsrtowcs?
On Feb 13 15:54, Jeff Johnston wrote:
> Corinna Vinschen wrote:
>> On Feb 13 14:36, Jeff Johnston wrote:
>>
>>> Corinna Vinschen wrote:
>>>
>>>> Hang on. If _mbrtowc_r encounters an incomplete MB char then it does
>>>> not form an invalid character so there's no reason to return with -1 and
>>>> set errno to EILSEQ. However, it also doesn't form a *valid* character,
>>>> it's just incomplete. Thus it must be the start of the last character
>>>> at the end of the input string.
>>>>
>>>>
>>> This code is there because it means that the character has redundant
>>> shift state. From mbrtowc:
>>>
>>> (size_t)-2
>>> If the next n bytes contribute to an incomplete but potentially
>>> valid character, and all n bytes have been processed (no value is
>>> stored). When n has at least the value of the {MB_CUR_MAX} macro,
>>> this case can only occur if s points at a sequence of redundant
>>> shift sequences (for implementations with state-dependent encodings).
>>>
>>> In our case, n is MB_CUR_MAX, so it must be a redundant shift sequence.
>>> The state is stored, so if we advance the src pointer, it should
>>> continue where it left off.
>>>
>>
>> Uhh, that was what I was missing, now I understand. However, it's still
>> at the end of the string so we can return from the function at this
>> point.
>>
>>
> No, it isn't at the end of the string. It has successfully processed all
> MB_CUR_MAX bytes we asked it to, without completing the character. The
> reason is that there are stupid shift states added (e.g. Out In Out, which
> eats up extra bytes without doing anything, so if we only ask it to
> process MB_CUR_MAX bytes we don't see the end of the last shift sequence).
> End of string will cause a different error. Does that explain it better?
But that's not the case for _mbsnrtowcs_r. It always calls _mbrtowc_r
with the current value of nms. Thus, if the return value is -2, we
can skip nms bytes *and* return, because nms is 0 afterwards anyway.
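
To make the argument concrete, here is a rough sketch of the loop shape I have in mind. This is not the actual newlib source: the name sketch_mbsnrtowcs is made up, it uses the standard mbrtowc rather than _mbrtowc_r, and it assumes dst is non-NULL to keep things short. The point is only that each call is handed all remaining nms bytes, so a return value of (size_t)-2 means all of them have been absorbed into the state.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical sketch, not newlib's code: convert at most nms bytes from
   *src into at most len wide characters stored in dst (assumed non-NULL). */
size_t sketch_mbsnrtowcs(wchar_t *dst, const char **src, size_t nms,
                         size_t len, mbstate_t *ps)
{
    assert(dst != NULL && src != NULL && *src != NULL && ps != NULL);

    const char *s = *src;
    size_t count = 0;

    while (count < len && nms > 0) {
        wchar_t wc;
        /* Hand over everything that is left, not just MB_CUR_MAX bytes. */
        size_t bytes = mbrtowc(&wc, s, nms, ps);

        if (bytes == (size_t)-1) {
            /* Invalid sequence; mbrtowc has set errno to EILSEQ. */
            *src = s;
            return (size_t)-1;
        }
        if (bytes == (size_t)-2) {
            /* All nms bytes were consumed into *ps without completing a
               character. Skip them and stop: nothing is left to convert. */
            s += nms;
            break;
        }
        if (bytes == 0) {
            /* Terminating null character reached. */
            dst[count] = L'\0';
            *src = NULL;
            return count;
        }
        dst[count++] = wc;
        s += bytes;
        nms -= bytes;
    }

    *src = s;
    return count;
}
```

In a UTF-8 locale, a buffer that ends in the middle of a multibyte character should take the (size_t)-2 branch, and a later call with the same ps and the remaining bytes can then complete the character.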
Corinna
--
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat