This is the mail archive of the cygwin-developers mailing list for the Cygwin project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Performance optimization in av::fixup - use buffered IO, not mapped file


On 12/13/12 1:09 AM, Corinna Vinschen wrote:
> On Dec 12 13:20, Daniel Colascione wrote:
>>(I also haven't checked whether anything in /bin actually ends up
>> sparse. It'd be interesting to see.)

I checked. Nothing in my /bin was sparse.

> cygexec is not such a good solution in the long run.  For 64 bit Cygwin
> we *have* to know if the process to execute is a 32 or 64 bit process
> so we can't quite avoid the hook_or_detect_cygwin test.

Today, I run with /bin mounted cygexec. Will I not be able to do
that once I have mixed 32- and 64-bit Cygwin binaries in PATH?

>> We could also work around the problem by making av::fixup use regular buffered
>> reads instead of mmap. Regular buffered reads don't seem to trigger the cache flush.
> 
> hook_or_detect_cygwin needs a second mapping if the file is big enough,

Right --- it walks various tables.

> so rewriting this code to use ReadFile's isn't that simple.  The
> question here is this.  Assuming we change MapViewOfFile to
> VirtualAlloc

Why VirtualAlloc? For smaller allocations, alloca should suffice,
and for pathologically large ones, we could just use the normal
process heap.

> /ReadFile, will that make things slower?

Not having benchmarked a ReadFile version of av::fixup, it's hard to
make a definitive statement. Still, the ReadFile version of my test
program had no effect on overall execution time. Using ReadFile
instead of a section view shouldn't affect the total number of IOs
either (assuming readahead is disabled by FILE_FLAG_RANDOM_ACCESS).
The performance penalty would be an extra memcpy, which, for the IO
sizes we're talking about, is noise.

This approach still makes me nervous. I'd be worried that 1) using
ReadFile wouldn't actually eliminate the redundant loads in some
cases, and 2) there might be other hidden performance problems with
sparse files. Examples of the latter problem might be cache flushes
caused by programs that mmap sparse data files and antivirus
programs that mmap all executables prior to execution in order to do
some kind of scanning.

> Of course we could use VA/RF only on sparse files since we know if an
> executable is sparse, thanks to the FILE_ATTRIBUTE_SPARSE_FILE attribute.

Good point.

>> Still, I think that creating sparse files only on ftruncate is the right thing
>> to do, at least for existing OS versions. Basically, almost nobody _uses_ sparse
>> files in Windows (except us), so the paths that deal with them are optimized for
>> simplicity and correctness, not performance.
> 
> It is kind of annoying that sparse files are something special in
> Windows, rather then the normal case.
> 
> After a night's sleep I tend to agree with you, though.  I found an old
> discussion in the cygwin archives from 2003 which also handled sparse
> file slowness.  It seems this Windows code hasn't changed a lot since
> then.

I saw that thread as well, but I didn't see any mention of sparse
file caching behavior. The thread didn't seem all that productive
(or pleasant).

> Cygwin is used a lot for development, and it is famed for its slowness.
> If we can notably speed up normal operation by disabling the automatic
> sparse handling in lseek/write, I think we should do it.  For those
> requiring sparseness on NTFS, we can add a "sparse" mount option and
> run the code in lseek/write only if that mount flag is set.  And
> for symmetry we should probbaly do the same in ftruncate.

What about using the automatic sparse handling in lseek/lwrite and
ftruncate only when the file being operated on is already sparse?
Once a file is marked sparse, we've already taken the performance
hit, so we might as well punch holes where appropriate.

As you mentioned above, we can figure that out very cheaply. Tools
that depend on being able to create sparse files (e.g. cp in some
modes) could be patched to turn on the sparse flag as needed without
having to rewrite all the primary lseek/ftruncate logic.

Attachment: signature.asc
Description: OpenPGP digital signature


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]