This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH RFC 2/2 V3] Improve 64bit memset for Corei7 with avx2 instruction


Andreas

memcpy_avx_unaligned shows an even larger improvement over memcpy_ssse3 than over memcpy_sse2_unaligned, thanks.


case     avx         sse2_unaligned  ssse3       AVX vs SSE2  AVX vs SSSE3
200i     146833745   168384142       182668731   1.146767332  1.244051434
g23      1431207341  1557405243      1860107593  1.088175835  1.29967723
166i     350901531   379068674       452970332   1.08027079   1.290875907
cp-decl  370750774   395890196       452511422   1.067806796  1.220527248
c-type   763780824   810806468       932772220   1.061569553  1.221256401
expr2    986698539   1067232192      1287355386  1.081619309  1.304709934
expr     727016829   758953883       848020918   1.043928906  1.166439186
s04      1117900758  1185159528      1419512773  1.060165242  1.269802138
scilab   63309111    66893431        74106087    1.05661618   1.170543794
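
For anyone re-checking the numbers: the two ratio columns appear to be each baseline's raw count divided by the AVX count, so a value above 1.0 means the AVX run had the smaller count. The thread does not state the unit of the raw counts, and the names `results` and `speedup` below are only illustrative. A minimal sketch using three rows copied from the table:

```python
# Raw counts copied from the table in this message: (avx, sse2_unaligned, ssse3).
# The unit of these counts is not stated in the thread.
results = {
    "200i":   (146833745, 168384142, 182668731),
    "expr":   (727016829, 758953883, 848020918),
    "scilab": (63309111, 66893431, 74106087),
}


def speedup(avx, baseline):
    """Return baseline / avx, assuming that is how the ratio columns were derived."""
    return baseline / avx


if __name__ == "__main__":
    print(f"{'case':8s} {'AVX vs SSE2':>12s} {'AVX vs SSSE3':>13s}")
    for case, (avx, sse2, ssse3) in results.items():
        print(f"{case:8s} {speedup(avx, sse2):12.9f} {speedup(avx, ssse3):13.9f}")
```

Under that assumption the recomputed ratios agree with the posted columns to the precision shown.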


2013/7/29, Ling Ma <ling.ma.program@gmail.com>:
> OK, we will send the comparison results soon.
>
> Thanks
> Ling
>
> 2013/7/29, Andreas Jaeger <aj@suse.com>:
>> On 07/29/2013 11:49 AM, Ling Ma wrote:
>>> The attachment shows how to set up cpu2006 gcc.403 to measure
>>> memset and memcpy separately; readme.txt describes the process.
>>> If you run into any problem, please let me know.
>>
>> Ling,
>>
>> You're comparing against memcpy_sse2_unaligned, but wouldn't the selector
>> use __memcpy_ssse3 on current Haswell CPUs? If so, you should compare
>> your new routine against that one.
>>
>> Andreas
>> --
>>  Andreas Jaeger aj@{suse.com,opensuse.org} Twitter/Identica: jaegerandi
>>   SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
>>    GF: Jeff Hawn,Jennifer Guild,Felix Imendörffer,HRB16746 (AG Nürnberg)
>>     GPG fingerprint = 93A3 365E CE47 B889 DF7F  FED1 389A 563C C272 A126
>>
>

