This is the mail archive of the glibc-linux@ricardo.ecn.wfu.edu mailing list for the glibc project.



Re: pthreads problems


On Mon, 13 Dec 1999 kd@flaga.is wrote:

> Date: Mon, 13 Dec 1999 10:58:52 +0000
> From: kd@flaga.is
> Reply-To: glibc-linux@ricardo.ecn.wfu.edu
> To: Andreas Jaeger <aj@suse.de>
> Cc: glibc-linux@ricardo.ecn.wfu.edu
> Subject: Re: pthreads problems
> 
> 
> I have been tracing this bug into the libpthread. The pthread_create()
> blocks on the suspend(); call, namely on the function sigsuspend() (in
> linuxthreads/restart.h).
> 
> The thread is obviously put to sleep, but is never woken up. Who is
> responsible for waking up the thread?

The thread manager is responsible for waking the caller up. The caller of
pthread_create() passes a request to the manager through the manager pipe,
then suspends. The thread manager processes the request, creates the new
thread, stores the result in the calling thread, and then resumes it.
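
Here is a rough sketch of that handshake. It is not the LinuxThreads source,
just an illustration under a few assumptions: a forked child stands in for the
thread manager, SIGUSR1 stands in for the restart signal, and the name
request_and_suspend() is made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t restarted;

static void on_restart(int sig) { (void)sig; restarted = 1; }

/* Hypothetical helper: returns 0 once the "manager" has resumed us,
   -1 on a setup error. */
int request_and_suspend(void)
{
    int fds[2];
    sigset_t block, old, wait_mask;
    struct sigaction sa;
    pid_t manager, self;
    int ok;

    sa.sa_handler = on_restart;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;

    /* Block the restart signal before sending the request, so the
       wakeup cannot arrive before we are ready to wait for it. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &old) == -1)
        return -1;

    restarted = 0;
    if (pipe(fds) == -1)
        return -1;

    manager = fork();
    if (manager == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (manager == 0) {
        /* Stand-in manager: read one request, then resume the requester. */
        pid_t requester;
        close(fds[1]);
        if (read(fds[0], &requester, sizeof requester) == sizeof requester)
            kill(requester, SIGUSR1);
        _exit(0);
    }

    /* Caller: send the request through the pipe ... */
    close(fds[0]);
    self = getpid();
    ok = (write(fds[1], &self, sizeof self) == sizeof self);
    close(fds[1]);

    /* ... then suspend until the restart signal has been delivered.
       If the manager died without replying, this loop would never end. */
    if (ok) {
        wait_mask = old;
        sigdelset(&wait_mask, SIGUSR1);
        while (!restarted)
            sigsuspend(&wait_mask);
    }

    sigprocmask(SIG_SETMASK, &old, NULL);
    waitpid(manager, NULL, 0);
    return ok ? 0 : -1;
}
```

The point to notice is the sigsuspend() loop: nothing but the manager's signal
ends it, which is why a dead manager looks like a hang in pthread_create().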

There are a number of reasons why this might deadlock; for example, the thread
manager may have crashed. There was a bug in 2.1.1 that caused exactly this
when pthread_create() ran out of threads. The application would appear to
deadlock mysteriously, because threads were waiting for a reply from the
defunct manager.

Any bug in LinuxThreads (or your program!) that causes the thread manager to
crash will result in this kind of deadlock.

> The signal that the thread is woken up on is number 32 i.e. SIGRTMIN.
> SIGRTMAX is 63.
> Shouldn't the signal that the thread is woken up with be SIGUSR1 (or so I
> thought from the documentation of linuxthreads)?

The LinuxThreads that is incorporated into glibc will use the real-time
signals if they are available. You can look at the source code to understand
more closely how it works. (You don't have to download all of glibc,
since LinuxThreads is in a separate tarball.)
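
To give the flavour of that choice, here is a small sketch. It is an assumption
about the general shape of such a fallback, not code taken from LinuxThreads,
and pick_restart_signal() is a name invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

/* Hypothetical helper: prefer a real-time signal for restarting
   threads, and fall back to SIGUSR1 when none is available. */
int pick_restart_signal(void)
{
#if defined(SIGRTMIN) && defined(SIGRTMAX)
    /* The real-time range is usable only if it is non-empty. */
    if (SIGRTMIN <= SIGRTMAX)
        return SIGRTMIN;
#endif
    return SIGUSR1;
}
```

Note that the number behind SIGRTMIN varies between systems and library
versions, so it is safer to use the macro than a literal signal number.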
 
> If I pass SIGRTMIN to the process manually (with kill -32 <processid of
> main thread>) the thread is woken up and it runs until it goes to sleep
> again.

It sounds very much like the manager went kaput. This doesn't necessarily point
to a bug in the threads library. It's possible for an application to corrupt
the threading metadata. This is analogous to malloc() or free() crashing
because you have corrupted the heap, not because of a bug in the allocator.

