This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: Question about madvise(DONTNEED) in glibc malloc


(4/14/13 10:42 AM), Siddhesh Poyarekar wrote:
> On 14 April 2013 22:07, KOSAKI Motohiro <kosaki.motohiro@gmail.com> wrote:
>> Hi all,
>>
>> We Linux MM folks are now discussing a new memory discarding feature
>> (https://lkml.org/lkml/2013/3/12/105). The motivation is similar to that of MADV_FREE,
>> but it is more efficient. (http://lwn.net/Articles/230799)
>>
>> I played with the ebizzy benchmark a bit, because jemalloc claims to be faster than glibc
>> on it (http://people.freebsd.org/~kris/scaling/ebizzy.html) and the patch author
>> claimed the vrange patch improves that. And I've found that current glibc's MADV_DONTNEED usage
>> is crazy wrong.
>>
>> Please look at the following result. MADV_DONTNEED causes about 5 million minor page faults and
>> decreases transaction performance (records/s) from 73259 to 16833. My machine is a typical
>> laptop: Core i7, 4 CPUs (8 threads), with 2G RAM. On a larger machine, MADV_DONTNEED decreases
>> performance even more.
>>
>>
>> % perf stat ./ebizzy -S 3
>> 16833 records/s
>> real  3.00 s
>> user  6.83 s
>> sys  17.09 s
>>
>>  Performance counter stats for './ebizzy -S 3':
>>
>>       23914.067812 task-clock                #    7.941 CPUs utilized
>>              2,609 context-switches          #    0.109 K/sec
>>                137 CPU-migrations            #    0.006 K/sec
>>          4,803,074 page-faults               #    0.201 M/sec
>>
>> %   MALLOC_DISCARD_HEAP=0  perf stat ./ebizzy -S 3
>> 73259 records/s
>> real  3.00 s
>> user 23.84 s
>> sys   0.05 s
>>
>>
>>  Performance counter stats for './ebizzy -S 3':
>>
>>       23919.162533 task-clock                #    7.945 CPUs utilized
>>              2,533 context-switches          #    0.106 K/sec
>>                 77 CPU-migrations            #    0.003 K/sec
>>              4,256 page-faults               #    0.178 K/sec
> 
> This doesn't prove that glibc use of MADV_DONTNEED is wrong.  What
> this proves is that never giving memory back to the system results in
> crazy fast performance since we reduce syscall overhead.  It doesn't
> justify never returning memory back to the system though.

It does. You need to look at the current heap_trim(), or you won't understand the
current DONTNEED design.

>  extra = (top_size - pad - MINSIZE - 1) & ~(pagesz - 1);
>  if(extra < (long)pagesz)
>    return 0;
>  /* Try to shrink. */
>  if(shrink_heap(heap, extra) != 0)
>    return 0;

heap_trim() only checks whether the extra size is at least one page.
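
To make the check concrete, here is a minimal sketch of the same arithmetic as a
standalone function. trimmable_bytes and its parameters are hypothetical names used
only for illustration (min_size stands in for glibc's MINSIZE); this is not the glibc
code itself.

```c
#include <assert.h>

/* Hypothetical helper mirroring the arithmetic quoted from heap_trim().
   top_size: size of the arena's top chunk in bytes
   pad:      padding to keep at the top (mp_.top_pad in the quoted code)
   min_size: stand-in for glibc's MINSIZE (an assumption, passed in here)
   pagesz:   page size in bytes, assumed to be a power of two
   Returns the number of bytes that would be handed to shrink_heap(),
   or 0 if less than one whole page is free. */
static long trimmable_bytes(long top_size, long pad, long min_size, long pagesz)
{
    /* Round the free space at the top down to a whole number of pages. */
    long extra = (top_size - pad - min_size - 1) & ~(pagesz - 1);

    /* The only gate: at least one page must be free. */
    if (extra < pagesz)
        return 0;
    return extra;
}
```

Assuming 4096-byte pages, no padding and a min_size of 32, a top chunk of 8192 bytes
already yields 4096 trimmable bytes, so a single free page is enough to reach
shrink_heap().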

And to quote the madvise man page:

>       MADV_DONTNEED
>              Do  not  expect  access  in the near future.  (For the time
>              being, the application is finished with the given range, so
>              the  kernel can free resources associated with it.)  Subse-
>              quent accesses of pages in this  range  will  succeed,  but
>              will  result  either  in  re-loading of the memory contents
>              from the underlying mapped file (see mmap(2)) or zero-fill-
>              on-demand pages for mappings without an underlying file.

The current implementation clearly behaves as documented. It's not a bug and we don't
plan to ever change it.
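
A minimal sketch of the documented behavior, assuming Linux and a private anonymous
mapping. dontneed_zero_fills is a hypothetical helper, not part of glibc: it dirties
one page, discards it with MADV_DONTNEED, and checks that the next read sees
zero-filled memory, which means the kernel had to fault in a fresh page.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS and madvise() on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical demo helper (Linux semantics assumed).
   Returns 1 if the page reads back as zero after MADV_DONTNEED,
   0 if the old contents survived, and -1 if setup failed. */
static int dontneed_zero_fills(void)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    if (pagesz <= 0)
        return -1;

    unsigned char *p = mmap(NULL, (size_t)pagesz, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* Touch the page so that it is actually backed by memory. */
    memset(p, 0xAB, (size_t)pagesz);

    /* Tell the kernel the range is no longer needed. */
    if (madvise(p, (size_t)pagesz, MADV_DONTNEED) != 0) {
        munmap(p, (size_t)pagesz);
        return -1;
    }

    /* The mapping is still valid; this access faults in a zero-filled page. */
    int zeroed = (p[0] == 0 && p[pagesz - 1] == 0);

    munmap(p, (size_t)pagesz);
    return zeroed;
}
```

Every discard followed by reuse pays for a page fault and a zeroed page, which is the
kind of cost that shows up in the page-fault counts of the ebizzy runs quoted above.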



> 
>> - MADV_DONTNEED assumes the discarded memory is 99.999% reused, but current glibc's assumption is
>>   clearly the opposite: glibc assumes it is very lightweight when glibc's prediction is not correct.
>>   I have no idea where this mismatch comes from.
>>
>> - HPC folks want an allocator that never returns memory to the OS. They are among the main users
>>   of MALLOC_TRIM_THRESHOLD. However, the current MADV_DONTNEED usage has no knob to disable it.
>>   I couldn't find any good reason for that.
> 
> Because it's yet another knob with a niche use case.  Have you tested
> with MALLOC_TRIM_THRESHOLD_ set to a ridiculously high value - 2GB or
> 4GB? How does it compare?  It ought to give you performance similar to
> the new knob.


_int_free(mstate av, mchunkptr p, int have_lock)
{
(snip)

      if (av == &main_arena) {
#ifndef MORECORE_CANNOT_TRIM
	if ((unsigned long)(chunksize(av->top)) >=
	    (unsigned long)(mp_.trim_threshold))
	  systrim(mp_.top_pad, av);
#endif
      } else {
	/* Always try heap_trim(), even if the top chunk is not
	   large, because the corresponding heap might go away.  */
	heap_info *heap = heap_for_ptr(top(av));

	assert(heap->ar_ptr == av);
	heap_trim(heap, mp_.top_pad);
      }


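MALLOC_TRIM_THRESHOLD_ only applies to the main arena. As the code above shows, every
other arena always calls heap_trim(), regardless of the threshold.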


>> This week, we kernel MM folks plan to discuss this at the Linux MM summit (http://events.linuxfoundation.org/events/lsfmm-summit),
>> so a quick response would be much appreciated, even if it is not accurate.
> 
> Were any glibc developers invited? ;)

Sorry, the MM Summit is invitation-only and is a discussion venue for kernel developers.





