This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug nptl/11291] New: potential deadlock in sem_*wait and sem_post for MIPS architectures
- From: "mischa dot jonker at viragelogic dot com" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sources dot redhat dot com
- Date: 17 Feb 2010 15:06:52 -0000
- Reply-to: sourceware-bugzilla at sourceware dot org
I've run into the following issue while calling glibc's sem_wait and
sem_post heavily and concurrently from many threads: in some corner
cases, sem_post and sem_wait can enter an endless loop. After digging
into the code, I believe the following code (shown here for sem_post,
nptl/sysdeps/unix/sysv/linux/sem_post.c) is causing the problem:
int
__new_sem_post (sem_t *sem)
{
  struct new_sem *isem = (struct new_sem *) sem;
  __typeof (isem->value) cur;
  do
    {
      cur = isem->value;
      if (isem->value == SEM_VALUE_MAX)
        {
          __set_errno (EOVERFLOW);
          return -1;
        }
    }
  while (atomic_compare_and_exchange_bool_acq (&isem->value, cur + 1, cur));
The problem occurs because gcc optimizes this piece of code and loads
'isem->value' into a register before the do/while loop begins. The
atomic_compare_and_exchange_bool_acq macro then compares 'cur' (from
the register, which is never reloaded inside the loop) with
'isem->value' (from memory, which another thread may have updated).
When multiple threads use the same semaphore and one of them updates
isem->value after the register has been loaded from memory, the
comparison keeps failing and we end up in an endless loop.
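The stale-register scenario described above can be modelled deterministically in a single thread. The sketch below is only an illustration: it uses C11 <stdatomic.h> rather than glibc's internal atomic macros, the function names are hypothetical, and the loops are bounded so the failing case returns instead of spinning forever.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Models the optimized loop: the snapshot was taken once, before the
   loop, and is never refreshed from memory. */
static bool post_with_stale_snapshot(atomic_int *value, int snapshot,
                                     int max_attempts)
{
    for (int i = 0; i < max_attempts; i++) {
        int expected = snapshot;            /* always the old value */
        if (atomic_compare_exchange_strong(value, &expected, snapshot + 1))
            return true;
    }
    return false;                           /* gave up: CAS never matched */
}

/* Models the intended loop: the value is reloaded on every iteration. */
static bool post_with_reload(atomic_int *value, int max_attempts)
{
    for (int i = 0; i < max_attempts; i++) {
        int cur = atomic_load(value);       /* fresh load each time */
        if (atomic_compare_exchange_strong(value, &cur, cur + 1))
            return true;
    }
    return false;
}

/* Simulates "another thread" changing the value between the initial
   load and the first compare-and-exchange. */
void simulate_race(bool *stale_ok, bool *fresh_ok, int *final_value)
{
    atomic_int value = 0;
    int snapshot = atomic_load(&value);     /* the "register" copy */

    atomic_fetch_add(&value, 1);            /* concurrent update, simulated */

    *stale_ok = post_with_stale_snapshot(&value, snapshot, 1000);
    *fresh_ok = post_with_reload(&value, 1000);
    *final_value = atomic_load(&value);
}
```

In this model the stale-snapshot loop exhausts all of its attempts without ever succeeding, while the reloading loop succeeds on its first attempt.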
I was able to fix this (and more issues of the same kind in the other
semaphore functions) by adding a 'volatile' keyword:
int
__new_sem_post (sem_t *sem)
{
  struct new_sem volatile *isem = (struct new_sem *) sem;
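As a rough standalone sketch of the loop shape with a volatile-qualified pointer, the following may help. It is not the glibc implementation: struct demo_sem, DEMO_SEM_VALUE_MAX and demo_sem_post are hypothetical stand-ins, the GCC builtin __sync_bool_compare_and_swap replaces the internal atomic_compare_and_exchange_bool_acq macro, and the futex wake-up of waiters is left out entirely.

```c
#include <errno.h>
#include <limits.h>

/* Hypothetical stand-in for glibc's internal struct new_sem. */
struct demo_sem {
    int value;
};

#define DEMO_SEM_VALUE_MAX INT_MAX   /* assumption, for illustration only */

int demo_sem_post(struct demo_sem *sem)
{
    /* Because the pointee is volatile-qualified, the compiler must
       perform a real load for every read of isem->value. */
    struct demo_sem volatile *isem = sem;
    int cur;

    do {
        cur = isem->value;           /* reloaded on each iteration */
        if (cur == DEMO_SEM_VALUE_MAX) {
            errno = EOVERFLOW;
            return -1;
        }
        /* Note the differences from the glibc macro: the builtin takes
           (ptr, oldval, newval) and returns true on success, hence
           the negation in the loop condition. */
    } while (!__sync_bool_compare_and_swap(&isem->value, cur, cur + 1));

    return 0;
}
```

The overflow check here tests the local copy 'cur' rather than reading isem->value a second time, so the value that is checked is the same one handed to the compare-and-swap.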
===8<===8<===8<=== Additional info
glibc 2.9, but still present in git
kernel Linux 2.6.28.10 Wed Feb 17 13:32:34 CET 2010 mips unknown
$ mipsel-linux-gcc -v
Using built-in specs.
Target: mipsel-linux
Configured with: --with-gnu-ld --enable-shared --enable-target-optspace --enable-languages=c,c++,objc --enable-threads=posix --enable-multilib --enable-c99 --enable-long-long --enable-symvers=gnu --enable-libstdcxx-pch --program-prefix=mipsel-linux- --enable-libssp --disable-bootstrap --enable-libgomp --disable-libmudflap --disable-libunwind-exceptions --enable-libssp --enable-libgomp --disable-libmudflap --enable-__cxa_atexit
Thread model: posix
gcc version 4.2.4
$ mipsel-linux-ld -v
GNU ld (Linux/GNU Binutils) 2.18.50.0.7.20080502
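For anyone trying to reproduce this on a similar toolchain, a stress test of the kind described at the top of this report (many threads calling sem_post and sem_wait on one semaphore) could look roughly like the sketch below. This is not the reporter's test program: the helper names, thread counts and iteration counts are all assumptions. On an affected build the symptom would be a thread spinning so that run_sem_stress never returns.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdlib.h>

struct stress_arg {
    sem_t *sem;
    int iterations;
};

static void *poster(void *p)
{
    struct stress_arg *a = p;
    for (int i = 0; i < a->iterations; i++)
        sem_post(a->sem);
    return NULL;
}

static void *waiter(void *p)
{
    struct stress_arg *a = p;
    for (int i = 0; i < a->iterations; i++)
        while (sem_wait(a->sem) != 0)
            ;                       /* retry if interrupted (EINTR) */
    return NULL;
}

/* Starts `pairs` poster/waiter pairs on one semaphore.  Returns the
   final semaphore value (0 when posts and waits balance), or -1 on a
   setup error. */
int run_sem_stress(int pairs, int iterations)
{
    sem_t sem;
    struct stress_arg arg = { &sem, iterations };
    int total = pairs * 2, started = 0, value = -1;
    pthread_t *threads = calloc((size_t) total, sizeof *threads);

    if (threads == NULL)
        return -1;
    if (sem_init(&sem, 0, 0) != 0) {
        free(threads);
        return -1;
    }

    /* Posters are created before their matching waiters, so a partial
       start never leaves a waiter without enough posts. */
    for (int i = 0; i < total; i++) {
        void *(*fn)(void *) = (i % 2 == 0) ? poster : waiter;
        if (pthread_create(&threads[i], NULL, fn, &arg) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    if (started == total && sem_getvalue(&sem, &value) != 0)
        value = -1;

    sem_destroy(&sem);
    free(threads);
    return value;
}
```

A call such as run_sem_stress(4, 10000) should return 0 when every post is matched by a wait; build with -pthread.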
===8<===8<===8<=== Patch that solved the problem for me:
diff --git a/nptl/sysdeps/unix/sysv/linux/sem_post.c b/nptl/sysdeps/unix/sysv/linux/sem_post.c
index 58b226f..0d4a6d6 100644
--- a/nptl/sysdeps/unix/sysv/linux/sem_post.c
+++ b/nptl/sysdeps/unix/sysv/linux/sem_post.c
@@ -29,7 +29,7 @@
int
__new_sem_post (sem_t *sem)
{
- struct new_sem *isem = (struct new_sem *) sem;
+ struct new_sem volatile *isem = (struct new_sem *) sem;
__typeof (isem->value) cur;
do
@@ -64,7 +64,7 @@ int
attribute_compat_text_section
__old_sem_post (sem_t *sem)
{
- int *futex = (int *) sem;
+ int volatile *futex = (int *) sem;
int nr = atomic_increment_val (futex);
/* We always have to assume it is a shared semaphore. */
diff --git a/nptl/sysdeps/unix/sysv/linux/sem_timedwait.c b/nptl/sysdeps/unix/sysv/linux/sem_timedwait.c
index fdf0d74..a241ad6 100644
--- a/nptl/sysdeps/unix/sysv/linux/sem_timedwait.c
+++ b/nptl/sysdeps/unix/sysv/linux/sem_timedwait.c
@@ -34,7 +34,7 @@ extern void __sem_wait_cleanup (void *arg) attribute_hidden;
int
sem_timedwait (sem_t *sem, const struct timespec *abstime)
{
- struct new_sem *isem = (struct new_sem *) sem;
+ struct new_sem volatile *isem = (struct new_sem *) sem;
int err;
if (atomic_decrement_if_positive (&isem->value) > 0)
diff --git a/nptl/sysdeps/unix/sysv/linux/sem_trywait.c b/nptl/sysdeps/unix/sysv/linux/sem_trywait.c
index f500361..74e1170 100644
--- a/nptl/sysdeps/unix/sysv/linux/sem_trywait.c
+++ b/nptl/sysdeps/unix/sysv/linux/sem_trywait.c
@@ -30,7 +30,7 @@
int
__new_sem_trywait (sem_t *sem)
{
- int *futex = (int *) sem;
+ int volatile *futex = (int *) sem;
int val;
if (*futex > 0)
diff --git a/nptl/sysdeps/unix/sysv/linux/sem_wait.c b/nptl/sysdeps/unix/sysv/linux/sem_wait.c
index 20e2b48..5601e1a 100644
--- a/nptl/sysdeps/unix/sysv/linux/sem_wait.c
+++ b/nptl/sysdeps/unix/sysv/linux/sem_wait.c
@@ -32,7 +32,7 @@ void
attribute_hidden
__sem_wait_cleanup (void *arg)
{
- struct new_sem *isem = (struct new_sem *) arg;
+ struct new_sem volatile *isem = (struct new_sem *) arg;
atomic_decrement (&isem->nwaiters);
}
@@ -41,7 +41,7 @@ __sem_wait_cleanup (void *arg)
int
__new_sem_wait (sem_t *sem)
{
- struct new_sem *isem = (struct new_sem *) sem;
+ struct new_sem volatile *isem = (struct new_sem *) sem;
int err;
if (atomic_decrement_if_positive (&isem->value) > 0)
@@ -90,7 +90,7 @@ int
attribute_compat_text_section
__old_sem_wait (sem_t *sem)
{
- int *futex = (int *) sem;
+ int volatile *futex = (int *) sem;
int err;
do
--
Summary: potential deadlock in sem_*wait and sem_post for MIPS
architectures
Product: glibc
Version: 2.9
Status: NEW
Severity: normal
Priority: P2
Component: nptl
AssignedTo: drepper at redhat dot com
ReportedBy: mischa dot jonker at viragelogic dot com
CC: glibc-bugs at sources dot redhat dot com,
    mischa dot jonker at viragelogic dot com
GCC build triplet: i386-linux-gnu
GCC host triplet: mipsel-linux-gnu
GCC target triplet: mipsel-linux-gnu
http://sourceware.org/bugzilla/show_bug.cgi?id=11291