This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



Re: should mbrtowc(&wc, "", 1, &ps) set wc?


Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>:

> That sounds indeed like a glibc 2.2 bug then.

And I see you've reported it to the appropriate authorities. Thanks.

I've noticed another thing, that someone might want to call a bug:

In a UTF-8 locale, mbrtowc(&wc, "A", (size_t)(-1), &ps) returns 1, but
mbrtowc(&wc, "\302\240", (size_t)(-1), &ps) returns -1. So does
mbrtowc(&wc, s, (char *)0 - s, &ps), where s is "\302\240", but
mbrtowc(&wc, s, (char *)(-1) - s, &ps) returns 2.
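
The arithmetic behind the three length arguments can be sketched as follows. The helper name end_address is hypothetical, and the sketch assumes size_t and uintptr_t have the same width, as they do on i386-linux. Whether glibc actually forms such an end address internally is an assumption, not something established in this message; the sketch only shows where s + n would land for each n.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the address that s + n would denote if the sum is
   computed with wrapping unsigned arithmetic.
   Assumes size_t and uintptr_t have the same width. */
static uintptr_t end_address(const char *s, size_t n)
{
  return (uintptr_t)s + (uintptr_t)n;
}
```

With n = (size_t)(-1) the sum wraps to one byte below s, with n = (char *)0 - s it wraps to address 0, and only with n = (char *)(-1) - s does it stay above s, at the highest address.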

I don't know whether strings are allowed to wrap around the address
space, but the behaviour of mbrtowc is inconsistent in this respect
and might confuse some program.
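
A program that wants to stay clear of the question can pass a length that never extends past the string's terminator. The following is a minimal sketch under that assumption; decode_first is a hypothetical helper, not part of glibc.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: decode the first character of the NUL-terminated
   string s, bounding the length by the string itself. */
static size_t decode_first(wchar_t *pwc, const char *s)
{
  mbstate_t ps;
  size_t n = strlen(s) + 1;   /* include the terminating NUL so "" yields 0 */

  memset(&ps, 0, sizeof(ps)); /* start in the initial conversion state */
  return mbrtowc(pwc, s, n, &ps);
}
```

Called as decode_first(&wc, "A"), this returns 1 and stores 'A' in wc; the result for multibyte input depends on the current locale.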

By the way, http://www-gnats.gnu.org:8080/cgi-bin/wwwgnats.pl,
mentioned in BUGS, doesn't work.

I'm not subscribed to libc-alpha, so please copy replies to me.

Edmund


Program tested on i386-linux:

#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* memset */
#include <wchar.h>

int main(void)
{
  wchar_t wc;
  const char *s = "\302\240";   /* U+00A0 encoded as UTF-8 */
  mbstate_t ps;
  size_t r;

  setlocale(LC_ALL, "");

  /* ASCII character, maximal length */
  memset(&ps, 0, sizeof(ps));
  r = mbrtowc(&wc, "A", (size_t)(-1), &ps);
  printf("%ld %ld\n", (long)r, (long)wc);

  /* two-byte character, maximal length */
  memset(&ps, 0, sizeof(ps));
  r = mbrtowc(&wc, s, (size_t)(-1), &ps);
  printf("%ld %ld\n", (long)r, (long)wc);

  /* length chosen so that s + n wraps to address 0 */
  memset(&ps, 0, sizeof(ps));
  r = mbrtowc(&wc, s, (char *)0 - s, &ps);
  printf("%ld %ld\n", (long)r, (long)wc);

  /* length chosen so that s + n is the highest address */
  memset(&ps, 0, sizeof(ps));
  r = mbrtowc(&wc, s, (char *)(-1) - s, &ps);
  printf("%ld %ld\n", (long)r, (long)wc);

  return 0;
}

Output:

$ LANG=eo ./a.out
1 65
-1 65
-1 65
2 160
