This is the mail archive of the
libc-hacker@sourceware.cygnus.com
mailing list for the glibc project.
A deadlock in linuxthreads.
- To: Xavier.Leroy@inria.fr
- Subject: A deadlock in linuxthreads.
- From: hjl@lucon.org (H.J. Lu)
- Date: Fri, 18 Dec 1998 19:50:22 -0800 (PST)
- Cc: libc-hacker@cygnus.com (GNU C Library), drepper@cygnus.com (Ulrich Drepper)
Hi,
I think I have finally found the deadlock bug triggered by ex6.c on an SMP
machine. The sequence is as follows:
The manager waits in its loop for
a. dead children.
b. requests from children.
Now
1. At some instant, a child exits.
2. The manager wakes up in the loop and finds a dead child. It calls
pthread_reap_children (), which in turn calls pthread_exited (), which
calls __pthread_lock (). In __pthread_lock (), there is
if (oldstatus != 0) suspend(self);
This time "oldstatus" is 1, so suspend(self) is called. The manager now
thinks it has nothing to do and suspends itself. At the same time,
another child sends a REQ_CREATE message to the manager and then calls
suspend(self). Now both the manager and the child have called suspend(self),
and we get a deadlock. Note that you may not see this on a UP machine.
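
The sequence can be replayed in a small single-threaded model. This is only
a sketch under stated assumptions: the types and the functions model_acquire()
and model_replay() are hypothetical names made up for illustration, not the
linuxthreads code. It assumes the dead child's lock is still held when the
manager arrives (the "oldstatus is 1" case), and it only records who has
suspended; there are no real threads, signals, or pipes in it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model types; these are NOT the linuxthreads definitions. */
struct model_thread {
  struct model_thread *next_waiting; /* next thread queued on the same lock */
  int suspended;                     /* 1 once the thread has called suspend() */
};

struct model_lock {
  /* 0: free; 1: held with no waiters; otherwise: address of the newest waiter */
  intptr_t status;
};

/* Take the lock if it is free, otherwise queue `self` on it.
   Returns the previous status; nonzero means the caller has to wait. */
static intptr_t model_acquire(struct model_lock *lock, struct model_thread *self)
{
  intptr_t oldstatus = lock->status;

  assert(self != NULL);
  if (oldstatus == 0) {
    lock->status = 1;
  } else {
    self->next_waiting = (oldstatus == 1) ? NULL : (struct model_thread *) oldstatus;
    lock->status = (intptr_t) self;
  }
  return oldstatus;
}

/* Replay the reported sequence. Returns 1 if the manager and the requesting
   child both end up suspended while the request is still unread. */
static int model_replay(void)
{
  struct model_thread manager = { NULL, 0 };
  struct model_thread child = { NULL, 0 };
  struct model_lock exited_thread_lock = { 1 }; /* assumption: still held */
  int unread_requests = 0;

  /* Step 2: the manager reaps the dead child and tries to lock its descriptor. */
  if (model_acquire(&exited_thread_lock, &manager) != 0)
    manager.suspended = 1;

  /* Meanwhile another child posts REQ_CREATE and waits for the answer. */
  unread_requests++;
  child.suspended = 1;

  return manager.suspended && child.suspended && unread_requests > 0;
}
```

In this model, model_replay() returns 1: both parties are suspended and the
request is never read. A manager that called model_acquire() without
suspending on a nonzero result would still be able to return to its loop.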
Here is a patch.
Thanks.
H.J.
------
Fri Dec 18 19:43:35 1998 H.J. Lu <hjl@gnu.org>
* spinlock.h (__pthread_lock_straight): New prototype.
* spinlock.c (__pthread_lock_straight): New.
(__pthread_lock): Use it.
* manager.c (pthread_exited): Call __pthread_lock_straight
instead of __pthread_lock.
Index: manager.c
===================================================================
RCS file: /home/work/cvs/gnu/glibc/linuxthreads/manager.c,v
retrieving revision 1.1.1.12
diff -u -p -r1.1.1.12 manager.c
--- manager.c 1998/10/31 16:47:03 1.1.1.12
+++ manager.c 1998/12/19 03:35:10
@@ -452,7 +452,7 @@ static void pthread_exited(pid_t pid)
th->p_nextlive->p_prevlive = th->p_prevlive;
th->p_prevlive->p_nextlive = th->p_nextlive;
/* Mark thread as exited, and if detached, free its resources */
- __pthread_lock(th->p_lock, NULL);
+ __pthread_lock_straight (th->p_lock, NULL, NULL);
th->p_exited = 1;
detached = th->p_detached;
__pthread_unlock(th->p_lock);
Index: spinlock.c
===================================================================
RCS file: /home/work/cvs/gnu/glibc/linuxthreads/spinlock.c,v
retrieving revision 1.1.1.8
diff -u -p -r1.1.1.8 spinlock.c
--- spinlock.c 1998/12/15 16:03:12 1.1.1.8
+++ spinlock.c 1998/12/19 03:39:13
@@ -36,8 +36,9 @@
This is safe because there are no concurrent __pthread_unlock
operations -- only the thread that locked the mutex can unlock it. */
-void internal_function __pthread_lock(struct _pthread_fastlock * lock,
- pthread_descr self)
+long internal_function __pthread_lock_straight
+ (struct _pthread_fastlock * lock, pthread_descr self,
+ pthread_descr * new_self)
{
long oldstatus, newstatus;
@@ -54,7 +55,18 @@ void internal_function __pthread_lock(st
THREAD_SETMEM(self, p_nextwaiting, (pthread_descr) oldstatus);
} while(! compare_and_swap(&lock->__status, oldstatus, newstatus,
&lock->__spinlock));
- if (oldstatus != 0) suspend(self);
+ if (new_self)
+ *new_self = self;
+ return oldstatus;
+}
+
+void internal_function __pthread_lock(struct _pthread_fastlock * lock,
+ pthread_descr self)
+{
+ pthread_descr new_self;
+
+ if (__pthread_lock_straight (lock, self, &new_self) != 0)
+ suspend(new_self);
}
void internal_function __pthread_unlock(struct _pthread_fastlock * lock)
Index: spinlock.h
===================================================================
RCS file: /home/work/cvs/gnu/glibc/linuxthreads/spinlock.h,v
retrieving revision 1.1.1.5
diff -u -p -r1.1.1.5 spinlock.h
--- spinlock.h 1998/10/31 16:47:04 1.1.1.5
+++ spinlock.h 1998/12/19 03:35:41
@@ -50,6 +50,9 @@ static inline int compare_and_swap(long
/* Internal locks */
+extern long internal_function __pthread_lock_straight
+ (struct _pthread_fastlock * lock, pthread_descr self,
+ pthread_descr * new_self);
extern void internal_function __pthread_lock(struct _pthread_fastlock * lock,
pthread_descr self);
extern void internal_function __pthread_unlock(struct _pthread_fastlock *lock);