This is the mail archive of the cygwin-developers mailing list for the Cygwin project.



Re: Performance optimization in av::fixup - use buffered IO, not mapped file


On Dec 12 07:04, Eric Blake wrote:
> On 12/12/2012 06:22 AM, Corinna Vinschen wrote:
> > On Dec 12 06:11, Eric Blake wrote:
> >> Eww.  That would be a regression for coreutils, [...]
> > 
> > Really?  How so?
> 
> When using 'cp --sparse=always', coreutils relies on lseek() to create
> sparse files.  Removing this code from cygwin would mean that coreutils
> now has to be rewritten to explicitly ftruncate() instead of lseek() for
> creating sparse files.

On Cygwin only or on Linux as well?
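
For readers following the lseek() versus ftruncate() distinction, here is a
minimal C sketch of the two ways an application can extend a file without
writing the intervening data. The helper names and the allocated_bytes()
check are hypothetical and only for illustration. Whether either call
actually leaves a hole depends on the file system and, on Cygwin, on how
cygwin1.dll handles the seek, which is the point under discussion.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: grow `path` to `size` bytes by seeking past EOF and
   writing only the final byte.  The skipped range may become a hole. */
int extend_with_lseek(const char *path, off_t size)
{
    assert(size > 0);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (lseek(fd, size - 1, SEEK_SET) == (off_t)-1 || write(fd, "", 1) != 1) {
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Hypothetical helper: grow `path` to `size` bytes by setting the length
   explicitly, without writing any data at all. */
int extend_with_ftruncate(const char *path, off_t size)
{
    assert(size > 0);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, size) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Bytes actually allocated on disk (st_blocks counts 512-byte units), or -1
   on error.  A value well below st_size suggests the file is sparse. */
long long allocated_bytes(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_blocks * 512;
}
```

Comparing allocated_bytes() with st_size after each call is one way to see
whether a given platform produced a hole for either variant.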

> >> Why can't we instead use posix_fallocate() as a means of identifying a
> >> file that must not be sparse, and then just patch the compiler to use
> >> posix_fallocate() to never generate a sparse executable (but let all
> >> other sparse files continue to behave as normal)?
> > 
> > posix_fallocate is not allowed to generate sparse files, due to the
> > following restriction:
> > 
> >   "If posix_fallocate() returns successfully, subsequent writes to the
> >   specified file data shall not fail due to the lack of free space on
> >   the file system storage media."
> > 
> > See
> > http://pubs.opengroup.org/onlinepubs/9699919799/functions/posix_fallocate.html
> > 
> > Therefore only ftruncate and lseek potentially generate sparse files.
> > 
> > On second thought, I don't quite understand what you mean by "use
> > posix_fallocate() as a means of identifying a file that must not be
> > sparse".  Can you explain, please?
> 
> Since we know that an executable must NOT be sparse in order to make it
> more efficient with the Windows loader, then gcc should use
> posix_fallocate() to guarantee that the file is NOT sparse, even if it
> happens to issue a sequence of lseek() that would default to making it
> sparse without the fallocate.
> 
> In other words, I'm proposing that we delete nothing from cygwin1.dll,
> and instead fix the problem apps (gcc, emacs unexec) that actually
> create executables, so that the files they create are non-sparse because
> we have proven that they should not be sparse for performance reasons.
> Meanwhile, all non-executable files (such as virtual machine disk
> images, which are typically much bigger than executables, and where
> being sparse really does matter) do not have to jump through extra hoops
> of using ftruncate() when plain lseek() would do to keep them sparse.

Couldn't a devil's advocate also argue that coreutils is wrong?
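
To make Eric's proposal concrete, here is a rough C sketch of what a tool
that writes executables could do: reserve the whole output range with
posix_fallocate() before any lseek()/write() sequence. The function name and
the idea of calling it from a linker-like tool are assumptions for
illustration, not anything gcc or emacs is known to do. The sketch also does
not verify how Cygwin's posix_fallocate() behaves on NTFS.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` and reserve `size` bytes of storage up
   front.  Returns 0 on success or an errno-style code on failure. */
int create_preallocated(const char *path, off_t size)
{
    assert(size > 0);
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0755);
    if (fd < 0)
        return errno;

    /* posix_fallocate() returns the error number directly and does not
       set errno.  On success the range [0, size) is backed by storage. */
    int err = posix_fallocate(fd, 0, size);

    /* A later seek-and-write inside the reserved range lands in space that
       is already allocated, so it cannot introduce a hole there. */
    if (err == 0) {
        if (lseek(fd, size - 1, SEEK_SET) == (off_t)-1
            || write(fd, "", 1) != 1)
            err = errno;
    }

    if (close(fd) != 0 && err == 0)
        err = errno;
    return err;
}
```

The trade-off is the one raised in the thread: the change lives in the few
programs that produce executables, while every other lseek() caller keeps
the ability to create sparse files.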

> Oh, and while I'm thinking about it, it would be nice to copy Linux'
> fallocate(FALLOC_FL_PUNCH_HOLE) for punching holes into already-existing
> files, rather than only being able to create holes by sequentially
> building a file with each new hole possible only as the file size is
> extended.

Hmm, that might be possible by utilising the FSCTL_SET_SPARSE and
FSCTL_SET_ZERO_DATA DeviceIoControl codes.  However, we don't export
fallocate at all right now.  This is a clear case of PHC(*)


Corinna

(*) Patches happily considered.


-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

