This is the mail archive of the cygwin@cygwin.com mailing list for the Cygwin project.



Re: 1.3.10 memcmp() bug



>On Tuesday 23 April 2002 23:41, Sami Korhonen wrote:
> > On Tue, 23 Apr 2002, Tim Prince wrote:
> > > On Tuesday 23 April 2002 22:04, Sami Korhonen wrote:
> > > >  I wasn't sure whether I should post about this on the gcc bug report
> > > > list or here. Anyway, it seems that using the -O2 flag with gcc causes
> > > > a huge slowdown in memcmp(). However, I don't see a performance drop
> > > > under linux, so I suppose it is a cygwin issue.
> > > >
> > > > $ gcc memtest.c -O2 -o memtest ; ./memtest.exe
> > > > Amount of memory to scan (mbytes)? 100
> > > > Memory block size (default 1024)? 1024
> > > > Allocating memory
> > > > Testing memory - read (1 byte at time)
> > > > Complete: 889.73MB/sec
> > > > Testing memory - read (4 bytes at time)
> > > > Complete: 3313.07MB/sec
> > > > Freeing memory
> > > >
> > > > $ gcc memtest.c -o memtest ; ./memtest.exe
> > > > Amount of memory to scan (mbytes)? 100
> > > > Memory block size (default 1024)? 1024
> > > > Allocating memory
> > > > Testing memory - read (1 byte at time)
> > > > Complete: 2517.94MB/sec
> > > > Testing memory - read (4 bytes at time)
> > > > Complete: 2933.50MB/sec
> > > > Freeing memory
> > > >
> > > >
> > > > '1 byte at time' is using memcmp() to compare two blocks.
> > >
> > > You leave so many relevant considerations unspecified, that anything I
> > > say must be a stab in the dark.  I assume you have a standard cygwin
> > > installation, where binutils is built to honor only 4-byte alignments,
> > > while recent linux configurations provide for 16-byte alignments.  The
> > > significance of that is different on various CPU families, with code
> > > alignment being quite important on certain CPU's, and data alignment
> > > on others.  Do we assume that you are running on a 486, since you have
> > > not told gcc otherwise?  You may have fallen accidentally into good
> > > alignment in one case and bad in the other.  You might or might not be
> > > using similar versions of gcc in cygwin and linux.  If you would
> > > provide a test case, and mention some hardware parameters, some of the
> > > mystery could be eliminated; for example, we could find out whether
> > > memcmp() is code generated by gcc or from a library.  cygwin is not
> > > generally considered an important target for performance optimization,
> > > as you can see from the alignment considerations and the differences
> > > in the libraries.
> > > --
> > > Tim Prince
> >
> >  Sorry that I wasn't specific enough about my system configuration. I'm
> > running a standard installation of cygwin on x86 (P4) and WinXP. Both
> > tests were run under the same setup; the only difference was the use of
> > the -O2 flag. I find it odd that the performance difference is that huge.
> > The source is available at:
> > http://kotisivu.raketti.net/darkone/memtest/memtest.c
>AFAICT there's no reason this should behave differently on linux or cygwin.
>You're comparing the speed of memcmp() against the speed of comparing ints
>in a loop.  When you don't ask the compiler to in-line memcmp(), you get a
>library function which is written with enough smarts to compare 4 bytes at
>a time.  Various versions of gcc are interpreting the instruction to use
>"optimized" in-line code as a rep cmpsb, which is slower than the newlib
>memcmp() function, even on my P-III.
>P4's, particularly early versions, are notorious for various performance
>glitches when using rep cmpsb on long strings.  gcc isn't smart enough to
>look at the lengths of your strings and second guess your instruction to do
>that, nor does it have a crystal ball to second guess your instruction to
>generate 486 code, even if you were running a version with P4
>optimizations.
>In time critical applications, it can be quite important to learn the
>particular tricks of your compiler and when to choose a separately compiled
>string function, or when to ask for in-line, as well as to acquire a
>library of such functions built for the processor of your choice.  On the
>P4, you would have available 64-bit integer comparisons if you chose to use
>them to speed this up.
>--
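
The quoted suggestion of using 64-bit integer comparisons can be sketched
roughly as follows. This is only an illustration, not code from memtest.c:
the name blocks_equal_64 is made up, and unlike memcmp() it only reports
equal / not equal rather than an ordering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from memtest.c): returns 1 if the two blocks
 * hold the same bytes, 0 otherwise.  It compares 8 bytes at a time where
 * it can, instead of one byte at a time. */
int blocks_equal_64(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;

    assert(pa != NULL && pb != NULL);

    /* Word loop: copy 8 bytes from each block into locals and compare.
     * Going through memcpy() avoids misaligned or aliasing-unsafe loads;
     * an optimizing compiler normally reduces each copy to one load. */
    while (n >= sizeof(uint64_t)) {
        uint64_t wa, wb;
        memcpy(&wa, pa, sizeof wa);
        memcpy(&wb, pb, sizeof wb);
        if (wa != wb)
            return 0;
        pa += sizeof wa;
        pb += sizeof wb;
        n -= sizeof wa;
    }

    /* Tail loop: the remaining 0..7 bytes are compared one at a time. */
    while (n > 0) {
        if (*pa++ != *pb++)
            return 0;
        n--;
    }
    return 1;
}
```

Whether this beats the library memcmp() depends on the CPU, the compiler
version and the flags in use, so it would need to be timed on the target
machine rather than assumed to be faster.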


gcc 3.1 and later are supposed to be 'more' intelligent about such things,
although they aren't brilliant.
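
With an older gcc, one possible workaround is to make sure the call goes to
the separately compiled library routine instead of being expanded in line.
A minimal sketch, assuming nothing about memtest.c itself (the names
memcmp_call and compare_via_library are made up for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Volatile function pointer: the compiler may not assume it still points
 * at memcmp, so it has to emit a real indirect call instead of replacing
 * the call with in-line string instructions. */
static int (*volatile memcmp_call)(const void *, const void *, size_t) = memcmp;

/* Hypothetical wrapper: same result as memcmp(), but always routed
 * through whatever memcmp the C library provides. */
int compare_via_library(const void *a, const void *b, size_t n)
{
    assert(memcmp_call != NULL);
    return memcmp_call(a, b, n);
}
```

gcc also documents a -fno-builtin-memcmp option (the general form is
-fno-builtin-FUNCTION), which should have a similar effect without touching
the source; I have not checked how the 2.95-era compilers handle it.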

Regards,
Gareth




