This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: PATCH: New 64 bit wcscmp implementation
- From: "Joseph S. Myers" <joseph at codesourcery dot com>
- To: Dmitrieva Liubov <liubov dot dmitrieva at gmail dot com>
- Cc: libc-alpha at sourceware dot org
- Date: Mon, 29 Aug 2011 15:15:59 +0000 (UTC)
- Subject: Re: PATCH: New 64 bit wcscmp implementation
- References: <CAHjhQ92-2K42N+jrB3nnqUaLoRixjb_F_4GskRr=azD+7BtvHg@mail.gmail.com>
On Mon, 29 Aug 2011, Dmitrieva Liubov wrote:
> Hello,
>
> I've optimized the wcscmp function (the new implementation is written
> in assembly language); it performs better than the current version on
> all the machines with SSE2 support that I have.
> It improves performance by up to 3.6X.
Do your versions avoid the bug in the present version, where the
comparisons are done in an unsigned type even though, when wchar_t is
signed, they must be done in the signed wchar_t type?
A testcase for this bug for wcscmp is:
#include <stdlib.h>
#include <wchar.h>

wchar_t w0[] = { WCHAR_MIN, 0 };
wchar_t w1[] = { WCHAR_MAX, 0 };

int
main (void)
{
  if (wcscmp (w0, w1) < 0)
    return 0;
  else
    abort ();
}
and for wmemcmp:
#include <stdlib.h>
#include <wchar.h>

wchar_t w0 = WCHAR_MIN;
wchar_t w1 = WCHAR_MAX;

int
main (void)
{
  if (wmemcmp (&w0, &w1, 1) < 0)
    return 0;
  else
    abort ();
}
Both tests fail at present. The relevant standard wording is "Unless
explicitly stated otherwise, the functions described in this subclause
order two wide characters the same way as two integers of the underlying
integer type designated by wchar_t.". And, since there might be some
question over validity of the negative values of wchar_t, C1X adds an
explicit statement "Arguments to the functions in this subclause may point
to arrays containing wchar_t values that do not correspond to members of
the extended character set. Such values shall be processed according to
the specified semantics, except that it is unspecified whether an encoding
error occurs if such a value appears in the format string for a function
in 7.29.2 or 7.29.5 and the specified semantics do not require that value
to be processed by wcrtomb." to make it clear that, as with the narrow
string and memory functions, all wchar_t values are valid without regard
to their semantics in any character set.
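
As a minimal sketch of the ordering rule quoted above (this is not glibc
code, and the names ref_wcscmp and ref_wmemcmp are hypothetical), a portable
C reference could compare the elements directly in wchar_t, with no cast to
an unsigned type and no subtraction that might overflow:

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical reference helper, not part of glibc: compare two wide
   strings, ordering elements as values of wchar_t itself.  */
int
ref_wcscmp (const wchar_t *s1, const wchar_t *s2)
{
  /* Advance while the elements match and the terminator is not reached.  */
  while (*s1 == *s2 && *s1 != L'\0')
    {
      s1++;
      s2++;
    }

  /* The relational operators act on wchar_t (after the usual promotions),
     so a signed wchar_t is compared as signed.  Yields -1, 0 or 1.  */
  return (*s1 > *s2) - (*s1 < *s2);
}

/* Hypothetical reference helper, not part of glibc: compare the first N
   wide characters of two arrays with the same ordering rule.  */
int
ref_wmemcmp (const wchar_t *s1, const wchar_t *s2, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (s1[i] != s2[i])
      return (s1[i] > s2[i]) - (s1[i] < s2[i]);

  return 0;
}
```

With the arrays from the wcscmp testcase above, ref_wcscmp (w0, w1) should
be negative whether wchar_t is signed or unsigned, which is the result the
testcases expect from the library functions.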
--
Joseph S. Myers
joseph@codesourcery.com