This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Do not break buffers in fvwrite for unbuffered files


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 10/31/2013 12:22 PM, Corinna Vinschen wrote:
> On Oct 31 10:53, Federico Terraneo wrote:
>> Sorry for the delay, I've been quite busy.
> 
> No worries.
> 
>> Anyway, here's the patch that makes both unbuffered and fully
>> buffered writes faster. It writes in multiples of fp->_bf._size,
>> and clamps at INT_MAX, as requested.
> 
> Thanks for working on this.  I have a few comments inline.
> 
>> Here's the changelog entry:
>> 
>> 2013-10-31  Terraneo Federico  <...>
>> 
>> * libc/stdio/fvwrite.c: Allow writing in multiples of 
>> fp->_bf._size for fully buffered and unbuffered files, to improve
>> write performance. [...] diff -ruN a/newlib/libc/stdio/fvwrite.c
>> b/newlib/libc/stdio/fvwrite.c --- a/newlib/libc/stdio/fvwrite.c
>> 2013-10-31 01:10:06.110679241 +0100 +++
>> b/newlib/libc/stdio/fvwrite.c	2013-10-31 09:25:04.543332198
>> +0100 @@ -21,6 +21,7 @@ #include <string.h> #include <stdlib.h> 
>> #include <errno.h> +#include <limits.h> #include "local.h" 
>> #include "fvwrite.h"
>> 
>> @@ -89,12 +90,16 @@ if (fp->_flags & __SNBF) { /* -       *
>> Unbuffered: write up to BUFSIZ bytes at a time. +       *
>> Unbuffered: split the buffer in the larger chunk that +       *
>> is both less than INT_MAX, as some legacy code may +       *
>> expect int instead of size_t, and is a multiple of +       *
>> fp->_bf._size. */ do { GETIOV (;); -	  w = fp->_write (ptr,
>> fp->_cookie, p, MIN (len, BUFSIZ)); +	  w = fp->_write (ptr,
>> fp->_cookie, p, +	    ((int)MIN(len, INT_MAX)) / fp->_bf._size *
>> fp->_bf._size);
> 
> What if len is < fp->_bf._size?  AFAICS, this would result in
> writing 0 bytes.

You're right. Thanks for noticing it.

One way to fix it is:

  if (len < fp->_bf._size)
    w = len;
  else
    w = ((int)MIN (len, INT_MAX)) / fp->_bf._size * fp->_bf._size;
  w = fp->_write (ptr, fp->_cookie, p, w);

but maybe we can just do a

  w = fp->_write (ptr, fp->_cookie, p, MIN (len, INT_MAX));

since with unbuffered files there's no real way to avoid writing in
sizes not multiple of fp->_bf._size anyway.

> 
>> if (w <= 0) goto err; p += w; @@ -177,30 +182,24 @@ fp->_p += w; 
>> w = len;		/* but pretend copied all */ } -	  else if (fp->_p >
>> fp->_bf._base && len > w) +	  else if (fp->_p > fp->_bf._base ||
>> len < fp->_bf._size) { -	      /* fill and flush */ +	      /*
>> pass through the buffer */ +	      w = MIN(len, w); COPY (w); -
>> /* fp->_w -= w; *//* unneeded */ +	      fp->_w -= w; fp->_p +=
>> w; -	      if (_fflush_r (ptr, fp)) +	      if(len > w)
>> if(_fflush_r (ptr, fp))
> 
> This looks weird.  Shouldn't that be
> 
> if (len > w && _fflush_r (ptr, fp))
> 
> or, for more clearness
> 
> if (fp->_w == 0 && _fflush_r (ptr, fp))
> 
> ?   Here's why:
> 
> The original test was performed before changing the buffer.  By
> testing len > w, it made sure to fill the buffer.  Now the test is
> performed after changing the buffer, so it can precisely test if
> the buffer is full.  In other words, check for fp->_w == 0.
> Testing len > w is kind of circuitous at this point, isn't it?

I did think about doing a fp->_w == 0, but I noticed that the previous
code when the buffer was empty a write of size fp->_bf._size would
bypass the buffer and write directly. On the other hand, if the buffer
is gradually filled, say the buffer is 1024 bytes and you do two
writes of 512, the buffer is not flushed on the second call, and the
flush is delayed till another write finds the buffer entirely full.

Since I didn't know if this behaviour was desired or not, I've
replicated it with the len > w check. However, for me either of the
two option is good, so let me know which one you prefer and I'll send
you another patch.

> 
>> goto err; } -	  else if (len >= (w = fp->_bf._size)) +	  else { 
>> /* write directly */ +	      w = (int)MIN(len, INT_MAX) /
>> fp->_bf._size * fp->_bf._size;
> 
> Again, what if len < fp->_bf._size?

This can't happen, since there's an

else if (fp->_p > fp->_bf._base || len < fp->_bf._size)

that catches this case and fills the file buffer instead.

> 
>> w = fp->_write (ptr, fp->_cookie, p, w); if (w <= 0) goto err; } 
>> -	  else -	    { -	      /* fill and done */ -	      w = len; -
>> COPY (w); -	      fp->_w -= w; -	      fp->_p += w; -	    } p +=
>> w; len -= w; }
> 
> Other than that, your formatting is missing a couple of spaces.
> Make that "if (...)" rather than "if(...)" or "MIN (...)" rather
> than "MIN(...)".

Will be fixed in the next patch.

> 
> 
> Thanks, Corinna
> 

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJSckUkAAoJECkLFtN5Xr9fjcsH/3XcGxUNVAVLnlRPfWz6JdQB
v5wRUHcMGreVsrVcRo1TtA0awslA5H5zBe8fqGKbe+pRzNwFjNVmb+rCy4nDMAdk
5DARDTPaspMF/sL0J+tMCbWqDQfEVD4tuRm0t/Zxll8Hylxt5J31kAtJennXEKas
kX9qc2yF6tdL/PJ6fqnBd3j7cQUJR1g6HNqhb90RPSsPQvihAiHHrj9tV+PAYILt
E+0cM2gE9K+l96t/c6FRADxpPZB98ZaVyvjJ7JcEX9bhKBheMf9tsrRu4l/AaazX
DaOvqiALi1dh2SDsA6RqJfoF7xvyxvviDHn5HfJ8SJdR8EPhGB3Aqp2sWRMkMHE=
=ym70
-----END PGP SIGNATURE-----


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]