This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug nptl/14076] New: PTHREAD_MUTEX_LOCK() in multiple threads RETURNING EOWNERDEAD


http://sourceware.org/bugzilla/show_bug.cgi?id=14076

             Bug #: 14076
           Summary: PTHREAD_MUTEX_LOCK() in multiple threads RETURNING
                    EOWNERDEAD
           Product: glibc
           Version: 2.14
            Status: NEW
          Severity: critical
          Priority: P2
         Component: nptl
        AssignedTo: unassigned@sourceware.org
        ReportedBy: zhenzhong.duan@oracle.com
                CC: drepper.fsp@gmail.com
    Classification: Unclassified


Created attachment 6400
  --> http://sourceware.org/bugzilla/attachment.cgi?id=6400
code to test concurrency of thread calling pthread_mutex_lock

When a thread that holds a mutex lock dies, multiple other threads that call
pthread_mutex_lock() return EOWNERDEAD.

From the POSIX docs

(http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_09_03)

The implementation shall behave as if at all times there is at most one owner
of any mutex.

A thread that becomes the owner of a mutex is said to have "acquired" the
mutex and the mutex is said to have become "locked"; when a thread gives up
ownership of a mutex it is said to have "released" the mutex and the mutex is
said to have become "unlocked".

Robust mutexes provide a means to enable the implementation to notify other
threads in the event of a process terminating while one of its threads holds
a mutex lock. The next thread that acquires the mutex is notified about the
termination by the return value [EOWNERDEAD] from the locking function. The
notified thread can then attempt to recover the state protected by the mutex,
and if successful mark the state protected by the mutex as consistent by a
call to pthread_mutex_consistent(). If the notified thread is unable to
recover the state, it can declare the state as not recoverable by a call to
pthread_mutex_unlock() without a prior call to pthread_mutex_consistent().
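
As a rough illustration of the recovery protocol quoted above, here is a
minimal sketch. It is not taken from the attached test case; the helper names
(init_robust_mutex, lock_with_recovery, die_holding_lock) are made up for this
example, and it assumes a glibc new enough to provide the POSIX names
pthread_mutexattr_setrobust() and pthread_mutex_consistent() rather than the
older _np variants.

```cpp
#include <cerrno>
#include <pthread.h>

// Thread body: take the lock and return without releasing it, which
// simulates an owner that dies while holding the mutex.
static void *die_holding_lock(void *arg) {
    pthread_mutex_lock(static_cast<pthread_mutex_t *>(arg));
    return nullptr;
}

// Initialize *m as a robust mutex. Returns 0 on success, else an errno value.
int init_robust_mutex(pthread_mutex_t *m) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    if (rc == 0) rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

// Lock *m. If the previous owner died, the caller still acquires the mutex
// (EOWNERDEAD); any repair of the protected state would go where the comment
// is, followed by marking the mutex consistent. On return value 0 the caller
// holds the mutex; *recovered reports whether the EOWNERDEAD path was taken.
int lock_with_recovery(pthread_mutex_t *m, bool *recovered) {
    *recovered = false;
    int rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        // ... repair the state protected by the mutex here ...
        rc = pthread_mutex_consistent(m);
        *recovered = (rc == 0);
    }
    return rc;
}
```

With a single waiter, a thread running die_holding_lock followed by one call
to lock_with_recovery should take the EOWNERDEAD path exactly once. The bug
described here is that several concurrent waiters all take that path.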

To me, this implies that the thread that receives the EOWNERDEAD status has
also "acquired" the mutex (i.e. has locked the mutex). Given this, only one
thread should be able to receive the EOWNERDEAD notification (otherwise,
multiple threads have "acquired" the mutex, which contradicts the POSIX
descriptions above).

Attached test code futexCase1_r1.cpp:
$ g++ -Wall -O3 -m32 -march=i686 futexCase1-r1.cpp -o futexCase1_r1 -lpthread
$ futexCase1_r1
27658: created mutex: 0xf7fdc000
......
27944: pthread_mutex_consistent_np failed: 0xf7fdc000 22 Invalid argument
28032: pthread_mutex_consistent_np failed: 0xf7fdc000 22 Invalid argument
28072: pthread_mutex_consistent_np failed: 0xf7fdc000 22 Invalid argument
27658: Done! lock concurrency: 0, max: 5 

Based on the man-pages, pthread_mutex_consistent_np() should only fail if the
mutex supplied is invalid (not initialized, etc.), or if the mutex is NOT in
an inconsistent state.  Given this, my speculation for the failure is that
multiple pthread_mutex_lock() calls are being allowed to simultaneously
return (incorrectly) with the EOWNERDEAD status, causing some of the
subsequent pthread_mutex_consistent_np() calls to fail because the mutex
state has already been made consistent.
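
The "22 Invalid argument" result can be seen in isolation with a small sketch
like the one below. It is an illustration only, not part of the attached test:
the function name consistent_on_healthy_mutex is hypothetical, it uses the
POSIX name pthread_mutex_consistent() instead of the _np variant, and the
EINVAL result is what I would expect from glibc for a robust mutex that is not
in an inconsistent state.

```cpp
#include <cerrno>
#include <pthread.h>

// Calls pthread_mutex_consistent() on a robust mutex that the caller holds
// but that was never left inconsistent by a dead owner, and returns the
// result of that call.
int consistent_on_healthy_mutex() {
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    pthread_mutex_init(&m, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&m);                 // ordinary acquisition, returns 0
    int rc = pthread_mutex_consistent(&m);  // nothing to make consistent
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return rc;
}
```

If a second thread wrongly receives EOWNERDEAD after the first has already
marked the mutex consistent, its call would be in the same situation as this
one, which would fit the failures in the log above.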

Lastly, from looking at the resulting max-concurrency (5 in this case) we see
that the code protected by the mutex is NOT being single threaded by the
mutex as expected.
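
For readers without the attachment, a "max concurrency" figure of this kind
can be gathered with a pair of atomic counters. The sketch below is an
assumption about how such a measurement might be done, not the code in
futexCase1_r1.cpp; the type name ConcurrencyGauge and its members are invented
for this example.

```cpp
#include <atomic>

// Tracks how many threads are inside a region at once and the highest
// number ever observed. Call enter() right after acquiring the mutex and
// leave() right before releasing it.
struct ConcurrencyGauge {
    std::atomic<int> current{0};
    std::atomic<int> max{0};

    void enter() {
        int now = current.fetch_add(1) + 1;
        int seen = max.load();
        // Raise max to now unless another thread already recorded a
        // value at least as large.
        while (now > seen && !max.compare_exchange_weak(seen, now)) {
        }
    }

    void leave() { current.fetch_sub(1); }
};
```

If the mutex serializes the region correctly, max should never exceed 1; a
value such as 5 means five threads were inside the protected region at the
same time.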

We originally reproduced this bug in 2.5-81.el5_8.2. I also tried with the
newest Fedora 16, and it still reproduces.


