This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH/RFA]: Optimize wctomb/mbtowc functions for speed


On 06/11/09 02:55 PM, Corinna Vinschen wrote:
Hi,

as a result of a discussion on the Cygwin list it came up that calling
mbrtowc on Cygwin is rather slow.  So I started to experiment with it
and it turned out that the major reason for the slowness is apparently
the fact that mbrtowc results in calling four functions:

mbrtowc -> _mbrtowc_r -> _mbtowc_r -> __mbtowc

Since the actual function behind the __mbtowc pointer is functionally
equivalent to what the application expects when calling mbrtowc, I
started to optimize away the intermediate functions.  The surprising
result (to me, at least) is that a loop like this

     const char in[] = "The quick brown fox jumps over the lazy dog";
     for (count = 0; count<  1000000; ++count)
       for (i = 0; i<  size; i += mbclen)
	if ((int)(mbclen = mbrtowc(&wc, in + i, size - i,&mbs))<= 0)
	  break;

is about 45% faster if the function called by the application call
__mbtowc directly rather than via the usual call path.  So I did the
same for wcrtomb with a similar result, 40-45% performance gain.

The below patch is the final result.  The basic idea is to call
__mbtowc and __wctomb instead of _mbtowc_r and _wctomb_r everywhere.
The simplest solution was IMHO to overload _mbtowc_r and _wctomb_r
with macros of the same name in local.h, so the change was basically
to add local.h to the affected source files.  If that's not wanted,
I can replace this with changing the sources to call the __mbtowc and
__wctomb functions directly, rather than via the macros.

Additionally, the functions mbrtowc and wcrtomb are now implemented
independently from _mbrtowc_r and _wcrtomb_r, unless
PREFER_SIZE_OVER_SPEED is defined.

Is that ok to check in?


I'm not in agreement with the last part of this patch which is to add a PREFER_SIZE_OVER_SPEED check for mbrtowc and wcrtomb. Where does one draw the line? Should the set of I/O functions that printf and scanf call be conglomerated into one massive function unless asked not to? It would save time.


Most platforms don't set the SIZE_OVER_SPEED unless they are ridiculously small, so you are adding additional bloat to the majority of platforms out there all to save one particular reentrant function call. There are many others out there as well. I would guess that this one call isn't giving you that much improvement on average compared to the optimization you made with the lower-level calls and also, Cygwin could use a macro trick itself to call all the _r function(s) directly and save time.

Now, that said, I totally agree with the heart of the change which is to replace the calls to _mbtowc_r and _wctomb_r as they are no longer performing any real logic.

I think it would be better to call the functions directly instead of hiding them behind the macros. The effort to do this is just a sed operation and the macro just gets in the way for debugging and understanding the code. It is not a temporary change or one that is governed by a switch which would more justify a macro solution.

-- Jeff J.


Thanks, Corinna


* libc/stdlib/local.h (_mbtowc_r): Define as call to __mbtowc. (_wctomb_r): Define as call to __wctomb. * libc/stdlib/btowc.c: Include local.h. * libc/stdlib/mblen.c: Ditto. * libc/stdlib/mblen_r.c: Ditto. * libc/stdlib/mbrtowc.c: Ditto. (mbrtowc): Implement independently from _mbrtowc_r, unless PREFER_SIZE_OVER_SPEED is defined. * libc/stdlib/mbstowcs_r.c: Include local.h. * libc/stdlib/mbtowc.c: Ditto. * libc/stdlib/mbtowc_r.c: Undefine _mbtowc_r. (__utf8_mbtowc): Drop unnecessary test for ch>= 0. * libc/stdlib/wcrtomb.c: Include local.h. (wcrtomb): Implement independently from _wcrtomb_r, unless PREFER_SIZE_OVER_SPEED is defined. * libc/stdlib/wcsnrtombs.c: Include local.h. (_wcsnrtombs_r): Call _wctomb_r rather than _wcrtomb_r. * libc/stdlib/wcstombs_r.c: Include local.h. * libc/stdlib/wctob.c: Ditto. * libc/stdlib/wctomb.c: Ditto. * libc/stdlib/wctomb_r.c: Undefine _wctomb_r. * libc/stdio/vfprintf.c: Include ../stdlib/local.h. * libc/stdio/vfscanf.c: Ditto.


Index: libc/stdlib/btowc.c =================================================================== RCS file: /cvs/src/src/newlib/libc/stdlib/btowc.c,v retrieving revision 1.2 diff -u -p -r1.2 btowc.c --- libc/stdlib/btowc.c 29 May 2007 21:26:59 -0000 1.2 +++ libc/stdlib/btowc.c 6 Nov 2009 19:50:17 -0000 @@ -3,6 +3,7 @@ #include<stdio.h> #include<reent.h> #include<string.h> +#include "local.h"

  wint_t
  btowc (int c)
Index: libc/stdlib/local.h
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdlib/local.h,v
retrieving revision 1.8
diff -u -p -r1.8 local.h
--- libc/stdlib/local.h	25 Aug 2009 18:47:24 -0000	1.8
+++ libc/stdlib/local.h	6 Nov 2009 19:50:17 -0000
@@ -7,6 +7,12 @@ char *	_EXFUN(_gcvt,(struct _reent *, do

char *__locale_charset(_NOARGS);

+/* Overriding calls to _mbtowc_r and _wctomb_r with direct calls to __mbtowc
+   and __wctomb can speed up applications enormously.  The extra function call
+   for each invocation can take up a lot of time. */
+#define _mbtowc_r(p,w,s,n,ps) __mbtowc((p),(w),(s),(n),__locale_charset(),(ps))
+#define _wctomb_r(p,s,w,ps)   __wctomb((p),(s),(w),__locale_charset(),(ps))
+
  #ifndef __mbstate_t_defined
  #include<wchar.h>
  #endif
Index: libc/stdlib/mblen.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdlib/mblen.c,v
retrieving revision 1.5
diff -u -p -r1.5 mblen.c
--- libc/stdlib/mblen.c	23 Apr 2004 21:44:22 -0000	1.5
+++ libc/stdlib/mblen.c	6 Nov 2009 19:50:17 -0000
@@ -46,6 +46,7 @@ effects vary with the locale.
  #include<newlib.h>
  #include<stdlib.h>
  #include<wchar.h>
+#include "local.h"

  int
  _DEFUN (mblen, (s, n),
Index: libc/stdlib/mblen_r.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdlib/mblen_r.c,v
retrieving revision 1.4
diff -u -p -r1.4 mblen_r.c
--- libc/stdlib/mblen_r.c	23 Apr 2004 21:44:22 -0000	1.4
+++ libc/stdlib/mblen_r.c	6 Nov 2009 19:50:17 -0000
@@ -46,6 +46,7 @@ effects vary with the locale.
  #include<newlib.h>
  #include<stdlib.h>
  #include<wchar.h>
+#include "local.h"

  int
  _DEFUN (_mblen_r, (r, s, n, state),
Index: libc/stdlib/mbrtowc.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdlib/mbrtowc.c,v
retrieving revision 1.5
diff -u -p -r1.5 mbrtowc.c
--- libc/stdlib/mbrtowc.c	23 Apr 2004 21:44:22 -0000	1.5
+++ libc/stdlib/mbrtowc.c	6 Nov 2009 19:50:17 -0000
@@ -5,6 +5,7 @@
  #include<stdio.h>
  #include<errno.h>
  #include<string.h>
+#include "local.h"

  size_t
  _DEFUN (_mbrtowc_r, (ptr, pwc, s, n, ps),
@@ -47,6 +48,32 @@ _DEFUN (mbrtowc, (pwc, s, n, ps),
  	size_t n _AND
  	mbstate_t *ps)
  {
+#ifdef PREFER_SIZE_OVER_SPEED
    return _mbrtowc_r (_REENT, pwc, s, n, ps);
+#else
+  int retval = 0;
+
+#ifdef _MB_CAPABLE
+  if (ps == NULL)
+    {
+      _REENT_CHECK_MISC(_REENT);
+      ps =&(_REENT_MBRTOWC_STATE(_REENT));
+    }
+#endif
+
+  if (s == NULL)
+    retval = _mbtowc_r (_REENT, NULL, "", 1, ps);
+  else
+    retval = _mbtowc_r (_REENT, pwc, s, n, ps);
+
+  if (retval == -1)
+    {
+      ps->__count = 0;
+      _REENT->_errno = EILSEQ;
+      return (size_t)(-1);
+    }
+  else
+    return (size_t)retval;
+#endif
  }
  #endif /* !_REENT_ONLY */
Index: libc/stdlib/mbstowcs_r.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdlib/mbstowcs_r.c,v
retrieving revision 1.4
diff -u -p -r1.4 mbstowcs_r.c
--- libc/stdlib/mbstowcs_r.c	17 Mar 2009 12:16:28 -0000	1.4
+++ libc/stdlib/mbstowcs_r.c	6 Nov 2009 19:50:17 -0000
@@ -1,5 +1,6 @@
  #include<stdlib.h>
  #include<wchar.h>
+#include "local.h"

  size_t
  _DEFUN (_mbstowcs_r, (reent, pwcs, s, n, state),
Index: libc/stdlib/mbtowc.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdlib/mbtowc.c,v
retrieving revision 1.5
diff -u -p -r1.5 mbtowc.c
--- libc/stdlib/mbtowc.c	23 Apr 2004 21:44:22 -0000	1.5
+++ libc/stdlib/mbtowc.c	6 Nov 2009 19:50:17 -0000
@@ -54,6 +54,7 @@ effects vary with the locale.
  #include<newlib.h>
  #include<stdlib.h>
  #include<wchar.h>
+#include "local.h"

  int
  _DEFUN (mbtowc, (pwc, s, n),
Index: libc/stdlib/mbtowc_r.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdlib/mbtowc_r.c,v
retrieving revision 1.17
diff -u -p -r1.17 mbtowc_r.c
--- libc/stdlib/mbtowc_r.c	3 Oct 2009 08:51:07 -0000	1.17
+++ libc/stdlib/mbtowc_r.c	6 Nov 2009 19:50:17 -0000
@@ -7,6 +7,9 @@
  #include<errno.h>
  #include "local.h"

+/* Disable macro definition from local.h. */
+#undef _mbtowc_r
+
  int (*__mbtowc) (struct _reent *, wchar_t *, const char *, size_t,
  		 const char *, mbstate_t *)
  #ifdef __CYGWIN__
@@ -221,7 +224,7 @@ _DEFUN (__utf8_mbtowc, (r, pwc, s, n, ch
        return 0; /* s points to the null character */
      }

-  if (ch>= 0x0&&  ch<= 0x7f)
+  if (ch<= 0x7f)
      {
        /* single-byte sequence */
        state->__count = 0;
Index: libc/stdlib/wcrtomb.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdlib/wcrtomb.c,v
retrieving revision 1.4
diff -u -p -r1.4 wcrtomb.c
--- libc/stdlib/wcrtomb.c	23 Apr 2004 21:44:22 -0000	1.4
+++ libc/stdlib/wcrtomb.c	6 Nov 2009 19:50:17 -0000
@@ -4,6 +4,7 @@
  #include<stdlib.h>
  #include<stdio.h>
  #include<errno.h>
+#include "local.h"

  size_t
  _DEFUN (_wcrtomb_r, (ptr, s, wc, ps),
@@ -45,6 +46,33 @@ _DEFUN (wcrtomb, (s, wc, ps),
  	wchar_t wc _AND
  	mbstate_t *ps)
  {
+#ifdef PREFER_SIZE_OVER_SPEED
    return _wcrtomb_r (_REENT, s, wc, ps);
+#else
+  int retval = 0;
+  char buf[10];
+
+#ifdef _MB_CAPABLE
+  if (ps == NULL)
+    {
+      _REENT_CHECK_MISC(_REENT);
+      ps =&(_REENT_WCRTOMB_STATE(_REENT));
+    }
+#endif
+
+  if (s == NULL)
+    retval = _wctomb_r (_REENT, buf, L'\0', ps);
+  else
+    retval = _wctomb_r (_REENT, s, wc, ps);
+
+  if (retval == -1)
+    {
+      ps->__count = 0;
+      _REENT->_errno = EILSEQ;
+      return (size_t)(-1);
+    }
+  else
+    return (size_t)retval;
+#endif
  }
  #endif /* !_REENT_ONLY */
Index: libc/stdlib/wcsnrtombs.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdlib/wcsnrtombs.c,v
retrieving revision 1.1
diff -u -p -r1.1 wcsnrtombs.c
--- libc/stdlib/wcsnrtombs.c	19 Feb 2009 09:19:42 -0000	1.1
+++ libc/stdlib/wcsnrtombs.c	6 Nov 2009 19:50:17 -0000
@@ -99,6 +99,7 @@ PORTABILITY
  #include<stdlib.h>
  #include<stdio.h>
  #include<errno.h>
+#include "local.h"

  size_t
  _DEFUN (_wcsnrtombs_r, (r, dst, src, nwc, len, ps),
@@ -134,7 +135,7 @@ _DEFUN (_wcsnrtombs_r, (r, dst, src, nwc
      {
        int count = ps->__count;
        wint_t wch = ps->__value.__wch;
-      int bytes = _wcrtomb_r (r, buff, *pwcs, ps);
+      int bytes = _wctomb_r (r, buff, *pwcs, ps);
        if (bytes == -1)
  	{
  	  r->_errno = EILSEQ;
Index: libc/stdlib/wcstombs_r.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdlib/wcstombs_r.c,v
retrieving revision 1.3
diff -u -p -r1.3 wcstombs_r.c
--- libc/stdlib/wcstombs_r.c	23 Oct 2007 19:50:29 -0000	1.3
+++ libc/stdlib/wcstombs_r.c	6 Nov 2009 19:50:17 -0000
@@ -1,5 +1,6 @@
  #include<stdlib.h>
  #include<wchar.h>
+#include "local.h"

  size_t
  _DEFUN (_wcstombs_r, (reent, s, pwcs, n, state),
Index: libc/stdlib/wctob.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdlib/wctob.c,v
retrieving revision 1.3
diff -u -p -r1.3 wctob.c
--- libc/stdlib/wctob.c	29 May 2007 21:26:59 -0000	1.3
+++ libc/stdlib/wctob.c	6 Nov 2009 19:50:18 -0000
@@ -3,6 +3,7 @@
  #include<stdlib.h>
  #include<stdio.h>
  #include<string.h>
+#include "local.h"

  int
  wctob (wint_t c)
Index: libc/stdlib/wctomb.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdlib/wctomb.c,v
retrieving revision 1.4
diff -u -p -r1.4 wctomb.c
--- libc/stdlib/wctomb.c	2 Mar 2009 23:30:59 -0000	1.4
+++ libc/stdlib/wctomb.c	6 Nov 2009 19:50:18 -0000
@@ -49,6 +49,7 @@ effects vary with the locale.
  #include<newlib.h>
  #include<stdlib.h>
  #include<errno.h>
+#include "local.h"

  int
  _DEFUN (wctomb, (s, wchar),
Index: libc/stdlib/wctomb_r.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdlib/wctomb_r.c,v
retrieving revision 1.16
diff -u -p -r1.16 wctomb_r.c
--- libc/stdlib/wctomb_r.c	3 Oct 2009 08:51:07 -0000	1.16
+++ libc/stdlib/wctomb_r.c	6 Nov 2009 19:50:18 -0000
@@ -6,6 +6,9 @@
  #include "mbctype.h"
  #include "local.h"

+/* Disable macro definition from local.h. */
+#undef _wctomb_r
+
  int (*__wctomb) (struct _reent *, char *, wchar_t, const char *charset,
  		 mbstate_t *)
  #ifdef __CYGWIN__
Index: libc/stdio/vfprintf.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdio/vfprintf.c,v
retrieving revision 1.75
diff -u -p -r1.75 vfprintf.c
--- libc/stdio/vfprintf.c	16 Jun 2009 17:44:20 -0000	1.75
+++ libc/stdio/vfprintf.c	6 Nov 2009 19:50:18 -0000
@@ -159,6 +159,7 @@ static char *rcsid = "$Id: vfprintf.c,v
  #include<sys/lock.h>
  #include<stdarg.h>
  #include "local.h"
+#include "../stdlib/local.h"
  #include "fvwrite.h"
  #include "vfieeefp.h"

Index: libc/stdio/vfscanf.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/stdio/vfscanf.c,v
retrieving revision 1.47
diff -u -p -r1.47 vfscanf.c
--- libc/stdio/vfscanf.c	11 Mar 2009 11:53:22 -0000	1.47
+++ libc/stdio/vfscanf.c	6 Nov 2009 19:50:18 -0000
@@ -122,6 +122,7 @@ Supporting OS subroutines required:
  #include<stdarg.h>
  #include<errno.h>
  #include "local.h"
+#include "../stdlib/local.h"

  #ifdef INTEGER_ONLY
  #define VFSCANF vfiscanf




Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]