This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug libc/11261] New: malloc uses excessive memory for multi-threaded applications


malloc uses excessive memory for multi-threaded applications

The following program demonstrates malloc(3) holding in excess of 600 megabytes 
of system memory even though the program never has more than 100 megabytes 
allocated at any given time.  This results from the use of thread-specific 
"preferred arenas" for memory allocations.

The program starts a number of threads that contend with one another doing 
simple malloc/free pairs, with no net memory allocation.  This contention 
establishes a preferred arena for each thread as a result of USE_ARENAS and 
PER_THREADS.  Once preferred arenas are established, each thread, in turn, 
allocates 100 megabytes and then frees all but 20 kilobytes, for a net memory 
allocation of 200 kilobytes across the 10 threads.  The resulting 
malloc_stats() output shows 600 megabytes of system memory held by the arenas 
that cannot be returned to the system, with only about 200 kilobytes in use.
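
The sizes quoted above follow directly from the constants in memx.c.  The 
sketch below only restates that arithmetic; the function names are 
hypothetical helpers that are not part of the test program, and malloc's own 
per-chunk bookkeeping overhead is ignored.

```c
#include <stddef.h>
#include <stdio.h>

/* Constants copied from memx.c. */
#define NTHREADS  10
#define NALLOCS   10000
#define ALLOCSIZE 10000

/* Bytes one thread requests during its turn (hypothetical helper). */
size_t peak_bytes_per_thread(void)
{
    return (size_t)NALLOCS * ALLOCSIZE;
}

/* Bytes one thread leaves allocated: the two stragglers,
   ps[0] and ps[NALLOCS-1]. */
size_t retained_bytes_per_thread(void)
{
    return (size_t)2 * ALLOCSIZE;
}

/* Net bytes still allocated after every thread has taken its turn. */
size_t net_retained_bytes(void)
{
    return (size_t)NTHREADS * retained_bytes_per_thread();
}

/* Print the three figures for comparison with the prose above. */
void print_workload_summary(void)
{
    printf("peak per thread     = %zu bytes\n", peak_bytes_per_thread());
    printf("retained per thread = %zu bytes\n", retained_bytes_per_thread());
    printf("net retained        = %zu bytes\n", net_retained_bytes());
}
```

Calling print_workload_summary() from a small main() reports 100 megabytes 
requested per thread, 20 kilobytes retained per thread, and 200 kilobytes 
retained overall.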

Over time, fragmentation of the heap can cause excessive paging even though 
actual memory allocation never exceeded system capacity.  With preferred 
arenas used in this way, the memory usage of a multi-threaded program is 
essentially unbounded (or bounded only by the number of threads times the 
actual memory usage).
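
As a rough illustration of that bound, the sketch below multiplies an arena 
count by a peak allocation size and compares system bytes with in-use bytes.  
Both function names are hypothetical, and the model assumes every arena grows 
to the full peak and never shrinks, which is a simplification of what malloc 
actually does.

```c
#include <stddef.h>

/* Upper bound on memory held from the system if each of `narenas`
   arenas grows to `peak_bytes` and cannot shrink (assumed model). */
unsigned long long worst_case_system_bytes(unsigned narenas,
                                           unsigned long long peak_bytes)
{
    return (unsigned long long)narenas * peak_bytes;
}

/* Ratio of bytes held from the system to bytes actually in use.
   Returns 0.0 when nothing is in use, to avoid dividing by zero. */
double overhead_factor(unsigned long long system_bytes,
                       unsigned long long in_use_bytes)
{
    if (in_use_bytes == 0) {
        return 0.0;
    }
    return (double)system_bytes / (double)in_use_bytes;
}
```

With the figures from the run below (6 arenas, 100,000,000 bytes peak), 
worst_case_system_bytes(6, 100000000ULL) gives 600,000,000 bytes, close to the 
601133056 system bytes reported.  overhead_factor(601133056ULL, 212624ULL) 
comes out at roughly 2800, meaning the process holds about 2800 times more 
memory than it has in use.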

The program run and source code are below, along with the glibc version from 
my RHEL5 system.  Thank you for your consideration.

[root@lab2-160 test_heap]# ./memx
creating 10 threads
allowing threads to contend to create preferred arenas
display preferred arenas
Arena 0:
system bytes     =     135168
in use bytes     =       2880
Arena 1:
system bytes     =     135168
in use bytes     =       2224
Arena 2:
system bytes     =     135168
in use bytes     =       2224
Arena 3:
system bytes     =     135168
in use bytes     =       2224
Arena 4:
system bytes     =     135168
in use bytes     =       2224
Arena 5:
system bytes     =     135168
in use bytes     =       2224
Total (incl. mmap):
system bytes     =     811008
in use bytes     =      14000
max mmap regions =          0
max mmap bytes   =          0
allowing threads to allocate 100MB each, sequentially in turn
thread 3 alloc 100MB
thread 3 free 100MB-20kB
thread 5 alloc 100MB
thread 5 free 100MB-20kB
thread 7 alloc 100MB
thread 7 free 100MB-20kB
thread 2 alloc 100MB
thread 2 free 100MB-20kB
thread 0 alloc 100MB
thread 0 free 100MB-20kB
thread 8 alloc 100MB
thread 8 free 100MB-20kB
thread 4 alloc 100MB
thread 4 free 100MB-20kB
thread 6 alloc 100MB
thread 6 free 100MB-20kB
thread 9 alloc 100MB
thread 9 free 100MB-20kB
thread 1 alloc 100MB
thread 1 free 100MB-20kB
Arena 0:
system bytes     =  100253696
in use bytes     =      40928
Arena 1:
system bytes     =  100184064
in use bytes     =      42352
Arena 2:
system bytes     =  100163584
in use bytes     =      22320
Arena 3:
system bytes     =  100163584
in use bytes     =      22320
Arena 4:
system bytes     =  100163584
in use bytes     =      22320
Arena 5:
system bytes     =  100204544
in use bytes     =      62384
Total (incl. mmap):
system bytes     =  601133056
in use bytes     =     212624
max mmap regions =          0
max mmap bytes   =          0
[root@lab2-160 test_heap]# rpm -q glibc
glibc-2.5-42.el5_4.2
glibc-2.5-42.el5_4.2
[root@lab2-160 test_heap]# 

====================================================================

[root@lab2-160 test_heap]# cat memx.c
// ****************************************************************************

#include <stdio.h>
#include <errno.h>
#include <assert.h>
#include <stdlib.h>
#include <time.h>      // nanosleep, struct timespec
#include <malloc.h>    // malloc_stats
#include <pthread.h>
#include <inttypes.h>

#define NTHREADS  10
#define NALLOCS  10000
#define ALLOCSIZE  10000

static volatile int go;
static volatile int die;
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

static void *ps[NALLOCS];  // allocations that are freed in turn by each thread
static void *pps1[NTHREADS];  // straggling allocations to prevent arena free
static void *pps2[NTHREADS];  // straggling allocations to prevent arena free

void
my_sleep(
    int ms
    )
{
    int rv;
    struct timespec ts;
    struct timespec rem;

    ts.tv_sec  = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000;
    for (;;) {
        rv = nanosleep(&ts, &rem);
        if (! rv) {
            break;
        }
        assert(errno == EINTR);
        ts = rem;
    }
}

void *
my_thread(
    void *context
    )
{
    int i;
    int rv;
    void *p;

    // first we spin to get our own arena
    while (go == 0) {
        p = malloc(ALLOCSIZE);
        assert(p);
        if (rand()%20000 == 0) {
            my_sleep(10);
        }
        free(p);
    }

    // then we give main a chance to print stats
    while (go == 1) {
        my_sleep(100);
    }
    assert(go == 2);

    // then one thread at a time, do our big allocs
    rv = pthread_mutex_lock(&mutex);
    assert(! rv);
    printf("thread %d alloc 100MB\n", (int)(intptr_t)context);
    for (i = 0; i < NALLOCS; i++) {
        ps[i] = malloc(ALLOCSIZE);
        assert(ps[i]);
    }
    printf("thread %d free 100MB-20kB\n", (int)(intptr_t)context);
    // N.B. we leave two allocations straggling
    pps1[(int)(intptr_t)context] = ps[0];
    for (i = 1; i < NALLOCS-1; i++) {
        free(ps[i]);
    }
    pps2[(int)(intptr_t)context] = ps[i];
    rv = pthread_mutex_unlock(&mutex);
    assert(! rv);
    return NULL;  // thread functions must return a void *
}

int
main(void)
{
    int i;
    int rv;
    pthread_t thread;

    printf("creating %d threads\n", NTHREADS);
    for (i = 0; i < NTHREADS; i++) {
        rv = pthread_create(&thread, NULL, my_thread, (void *)(intptr_t)i);
        assert(! rv);
        rv = pthread_detach(thread);
        assert(! rv);
    }

    printf("allowing threads to contend to create preferred arenas\n");
    my_sleep(20000);

    printf("display preferred arenas\n");
    go = 1;
    my_sleep(1000);
    malloc_stats();

    printf("allowing threads to allocate 100MB each, sequentially in turn\n");
    go = 2;
    my_sleep(5000);
    malloc_stats();

    // free the stragglers
    for (i = 0; i < NTHREADS; i++) {
        free(pps1[i]);
        free(pps2[i]);
    }

    return 0;
}
[root@lab2-160 test_heap]#

-- 
           Summary: malloc uses excessive memory for multi-threaded
                    applications
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libc
        AssignedTo: drepper at redhat dot com
        ReportedBy: rich at testardi dot com
                CC: glibc-bugs at sources dot redhat dot com


http://sourceware.org/bugzilla/show_bug.cgi?id=11261


