This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug nptl/2644] Race condition during unwind code after thread cancellation


http://sourceware.org/bugzilla/show_bug.cgi?id=2644

Abdullah Muzahid <prince.cse99 at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
                 CC|                            |prince.cse99 at gmail dot
                   |                            |com
         Resolution|FIXED                       |

--- Comment #12 from Abdullah Muzahid <prince.cse99 at gmail dot com> 2011-07-22 22:52:50 UTC ---
Hi,
I am a PhD student in the CS department at the University of Illinois. Recently
I have been working on memory-model-related bugs in software. While
experimenting with this bug, I found that it is not properly fixed.
pthread_cancel_init() uses libgcc_s_getcfa as a flag. For that to work, two
barriers are needed: one before writing libgcc_s_getcfa and one after reading
it in line 40 (just before returning). The fix adds the first barrier but not
the second. Now consider the following scenario, where Thread 1 is inside
pthread_cancel_init and is actually initializing the pointers. Thread 2 is in
_Unwind_Resume, finds libgcc_s_resume to be NULL, and calls the init function.
     Thread 1                                 Thread 2
libgcc_s_resume = resume;      if(__builtin_expect(libgcc_s_getcfa != NULL,1)) 
...                            ...
atomic_write_barrier();
libgcc_s_getcfa = getcfa;      libgcc_s_resume(exc);

In the PowerPC memory model, reads to different addresses may legally execute
out of order as long as there is no barrier between them. Although Thread 2
issues the instructions in order, the second read (i.e. the read of the
pointer libgcc_s_resume) can execute before the first read of libgcc_s_getcfa.
This is shown here.

     Thread 1                             Thread 2
                                 libgcc_s_resume(exc);
libgcc_s_resume = resume;
...
atomic_write_barrier();
libgcc_s_getcfa = getcfa;            
                                 if(__builtin_expect(libgcc_s_getcfa != NULL,1))

As a result, although the condition of the if statement in Thread 2 evaluates
to true, Thread 2 ends up using a NULL value for libgcc_s_resume, which crashes
the program. So a read barrier is needed after reading libgcc_s_getcfa in the
if statement (i.e. at line 42, before returning from pthread_cancel_init).
This pattern is very similar to double-checked locking (DCL), which also
requires two barriers to work. More on DCL can be found here:
http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
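
To make the two-barrier pairing concrete, here is a minimal sketch using C11
<stdatomic.h> fences. It is not the glibc source: resume_ptr, getcfa_flag,
init_pointers, call_resume and the demo_* functions are hypothetical stand-ins
for the real pointers and pthread_cancel_init, and the release/acquire fences
stand in for the write and read barriers discussed above. Relaxed atomics are
used for the pointers only so that the sketch is free of C11 data races.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef int (*resume_fn)(int);
typedef int (*getcfa_fn)(int);

/* Hypothetical stand-ins for the real unwinder entry points. */
static int demo_resume(int x) { return x + 1; }
static int demo_getcfa(int x) { return x; }

static _Atomic(resume_fn) resume_ptr;   /* payload pointer */
static _Atomic(getcfa_fn) getcfa_flag;  /* written last; doubles as the flag */

/* Writer side (Thread 1 in the diagrams above). */
static void init_pointers(void)
{
    atomic_store_explicit(&resume_ptr, &demo_resume, memory_order_relaxed);
    /* First barrier: the payload store is ordered before the flag store. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&getcfa_flag, &demo_getcfa, memory_order_relaxed);
}

/* Reader side (Thread 2 in the diagrams above). */
static int call_resume(int arg)
{
    if (atomic_load_explicit(&getcfa_flag, memory_order_relaxed) == NULL)
        init_pointers();  /* slow path: this thread initializes */
    /* Second barrier: the flag read is ordered before the payload read,
       so a non-NULL flag implies the payload store is visible. */
    atomic_thread_fence(memory_order_acquire);
    resume_fn fn = atomic_load_explicit(&resume_ptr, memory_order_relaxed);
    assert(fn != NULL);
    return fn(arg);
}
```

Without the acquire fence in call_resume, nothing in this sketch would forbid
the read of resume_ptr from being satisfied before the read of getcfa_flag,
which is the reordering shown in the second diagram.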
Thanks.
-Abdullah Muzahid
 PhD Student
 CS, UIUC


