This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: Question about madvise(DONTNEED) in glibc malloc


On Mon, Apr 15, 2013 at 07:27:02AM -0700, KOSAKI Motohiro wrote:
> (4/15/13 5:45 AM), Rich Felker wrote:
> > On Sun, Apr 14, 2013 at 10:01:52PM -0700, KOSAKI Motohiro wrote:
> >> (4/14/13 9:28 PM), Rich Felker wrote:
> >>> On Sun, Apr 14, 2013 at 09:37:16AM -0700, KOSAKI Motohiro wrote:
> >>>> Hi all,
> >>>>
> >>>> Now, we Linux MM folks are discussing a new memory
> >>>> discarding feature (https://lkml.org/lkml/2013/3/12/105). The
> >>>> motivation is similar to MADV_FREE, but more efficient
> >>>> (http://lwn.net/Articles/230799).
> >>>
> >>> If you're following this, do you know if vrange(VRANGE_NOVOLATILE) has
> >>> failure cases? 
> >>
> >> If any pages have already been discarded, vrange(VRANGE_NOVOLATILE) returns 1.
> >> A subsequent memory access may cause a minor page fault and zero-fill the page.
> > 
> > But vrange never fails as long as the mapping is valid?
> 
> Yes, for the current draft patch. But I don't expect the final version
> to never fail either. Several parts are still unimplemented, and every
> memory management syscall can fail. madvise() can fail too.

It doesn't matter that madvise can fail because you don't have to call
madvise again to get back access to the range. If "freeing" the range
with madvise fails, that's just suboptimal behavior, not a nasty
failure case malloc has to handle. But if vrange fails to "get back"
the range, you're stuck with unusable memory, and that's a difficult
case for malloc to work with.
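
A minimal sketch of that asymmetry, using only madvise(MADV_DONTNEED) on
a private anonymous mapping. The helper name release_pages_hint is
hypothetical and is not taken from glibc malloc; the point is only that
a failed hint needs no recovery path, because the range stays mapped and
usable either way.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: tell the kernel that a page-aligned free chunk
   is not needed right now.  Returns 0 if the hint was accepted and -1
   if madvise failed.  In both cases the range stays mapped, so the
   caller can keep using it without any further syscall. */
static int release_pages_hint(void *addr, size_t len)
{
    assert(addr != NULL && len > 0);
    return madvise(addr, len, MADV_DONTNEED) == 0 ? 0 : -1;
}

/* Hypothetical helper: hand the chunk back out.  Nothing has to be
   "restored" first, which is why a madvise failure is harmless here. */
static void *reuse_pages(void *addr)
{
    return addr;
}
```

A caller can ignore the return value of release_pages_hint entirely. A
vrange-style interface that has to be called again before the memory is
touched would not allow that.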

> But vrange() never implicitly reduces the commit charge, even if
> something irregular happens. Does this make sense? If you are interested
> in more corner cases, you need to wait a while because the current patch
> is a rough design and just for discussion.

I think you're saying vrange never subtracts the volatile range from
the program's commit charge. This sounds right.

> >> Does your "commit" refer to virtual address space? If yes, vrange() never
> >> calls mmap or munmap implicitly.
> >> If you are talking about physical pages, they may be discarded and disappear. A vrange()
> >> user must not pass undroppable data.
> > 
> > I mean in the sense of commit charge. I would assume vrange does not
> > adjust the commit charge, but this would be important to know.
> 
> Your assumption is correct.

Good.

> >>> If vrange can fail to "get back" the range,
> >>> this makes it a lot harder to use robustly.
> >>
> >> Any Linux syscall _can_ return ENOMEM if the system really has no memory,
> >> but we've never seen it in practice because the kernel handles memory starvation
> >> cleverly enough.
> > 
> > No, there are plenty of syscalls that never fail. 
> 
> Ah, um, getpid() and similar ones never fail. Yes. I was only talking about memory
> management syscalls.

OK, that's more reasonable. But still I think vrange should never fail
to restore a range to usable state.

> > And having this
> > property is important. For example, if calling vrange would have to
> > split up a range and thus add more ranges, then setting volatile mode
> > could reasonably fail, 
> > but setting non-volatile should just make the
> > entire contained range non-volatile so that no splitting is required
> > if it runs out of memory trying to do the split.
> 
> It depends. E.g., as you know, munmap() can increase the number of maps
> and fail because of the /proc/sys/vm/max_map_count limit. In general, every
> range framework does similar implicit merging (and subsequent
> splitting). To avoid this, you need to guarantee that the neighboring address
> region is never a vrange()ed area.

Yes, I'm quite aware of that -- actually, that's why I suspect vrange
may have a similar issue. The difference is that vrange is just an
optimization. munmap can't "unmap a larger region" to avoid splitting
regions (which is an operation that requires kernel-space allocation),
but vrange could "un-volatile a larger region" to avoid splitting, as
a fallback when the kernel fails to increase the number of ranges.
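
To make the proposed fallback concrete, here is a small userspace model.
It is not the vrange patch and does not call any kernel interface: the
fixed-size table, MAX_RANGES, and all function names are assumptions made
for illustration, with the full table standing in for a failed
kernel-space allocation. Callers are assumed to pass non-overlapping
ranges to mark_volatile.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RANGES 4   /* hypothetical cap; models running out of memory */

struct range { size_t start, end; };          /* half-open [start, end) */
struct vtable { struct range r[MAX_RANGES]; size_t n; };

/* Setting volatile mode may fail: it needs a new table slot. */
static int mark_volatile(struct vtable *t, size_t start, size_t end)
{
    assert(start < end);
    if (t->n == MAX_RANGES)
        return -1;
    t->r[t->n++] = (struct range){ start, end };
    return 0;
}

/* Setting non-volatile mode never fails.  If punching a hole would need
   a slot that is not available, the whole containing range is made
   non-volatile instead of being split. */
static void mark_nonvolatile(struct vtable *t, size_t start, size_t end)
{
    assert(start < end);
    for (size_t i = 0; i < t->n; ) {
        struct range *v = &t->r[i];
        if (v->end <= start || v->start >= end) { i++; continue; }
        int left = v->start < start;    /* volatile part survives on the left  */
        int right = v->end > end;       /* volatile part survives on the right */
        if (left && right) {
            if (t->n < MAX_RANGES) {    /* room to split into two ranges */
                t->r[t->n++] = (struct range){ end, v->end };
                v->end = start;
                i++;
            } else {                    /* fallback: un-volatile a larger region */
                *v = t->r[--t->n];      /* slot i is re-examined next pass */
            }
        } else if (left) {
            v->end = start; i++;
        } else if (right) {
            v->start = end; i++;
        } else {
            *v = t->r[--t->n];          /* fully covered: drop the range */
        }
    }
}

/* Returns nonzero if addr lies inside any volatile range. */
static int is_volatile(const struct vtable *t, size_t addr)
{
    for (size_t i = 0; i < t->n; i++)
        if (addr >= t->r[i].start && addr < t->r[i].end)
            return 1;
    return 0;
}
```

The cost of the fallback is only that some memory which could have stayed
volatile is kept, which is the same kind of "suboptimal but harmless"
outcome as a failed madvise.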

Rich

