This is the mail archive of the cygwin mailing list for the Cygwin project.

Re: Intermittent failures retrieving process exit codes


On Dec 21 01:30, Tom Honermann wrote:
> I spent most of the week debugging this issue.  This appears to be a
> defect in Windows.  I can reproduce the issue without Cygwin.  I
> can't rule out other third party kernel mode software possibly
> contributing to the issue.  A simple change to Cygwin works around
> the problem for me.
> 
> I don't know which Windows releases are affected by this.  I've only
> reproduced the problem (outside of Cygwin) with Wow64 processes
> running on 64-bit Windows 7.  I haven't yet tried elsewhere.
> 
> The problem appears to be a race condition involving concurrent
> calls to TerminateProcess() and ExitThread().  The example code
> below minimally mimics the threads created and exit process/thread
> calls that are performed when running Cygwin's false.exe.  The
> primary thread exits the process via TerminateProcess(), as
> pinfo::exit() in winsup/cygwin/pinfo.cc does.  The secondary thread
> exits itself via ExitThread(), as Cygwin's signal processing thread
> function, wait_sig(), in winsup/cygwin/sigproc.cc does.
> 
> When the race condition results in the undesirable outcome, the exit
> code for the process is set to the exit code for the secondary
> thread's call to ExitThread().  I can only speculate at this point,
> but my guess is that the TerminateProcess() code disassociates the
> calling thread from the process before other threads are stopped
> such that ExitThread(), concurrently running in another thread, may
> determine that the calling thread is the last thread of the process
> and overwrite the process exit code.
> 
> The issue also reproduces if ExitProcess() is called in place of
> TerminateProcess().  The test case below only uses
> TerminateProcess() because that is what Cygwin does.
> 
> Source code to reproduce the issue follows.  Again, Cygwin is not
> required to reproduce the problem.  For my own testing, I compiled
> the code using Microsoft's Visual Studio 2010 x86 compiler with the
> command 'cl /Fetest-exit-code.exe test-exit-code.cpp'
> 
> test-exit-code.cpp:

Wow.  Thanks for this testcase.  I could not reproduce the issue on a
single-CPU, single-core setup, but on a dual-core system I reproduced
it almost immediately, twice in a row in under 5 seconds.

> The workaround I implemented within Cygwin was simple and sloppy.  I
> added a call to Sleep(1000) immediately before the call to
> ExitThread() in wait_sig() in winsup/cygwin/sigproc.cc.  Since this
> thread (probably) doesn't exit until the process is exiting anyway,
> the call to Sleep() does not adversely affect shutdown.  The thread
> just gets terminated while in the call to Sleep() instead of exiting
> before the process is terminated or getting terminated while still
> in the call to ExitThread().  A better solution might be to avoid
> the thread exiting at all (so long as it can't get terminated while
> holding critical resources), or to have the process exiting thread
> wait on it.  Neither of these is ideal.  Orderly shutdown of
> multi-threaded processes is really hard to do correctly on Windows.
> 
> Since the exit code for the signal processing thread is not used,
> having the wait_sig() thread (and any other threads that could
> potentially concurrently exit with another thread) exit with a
> special status value such as STATUS_THREAD_IS_TERMINATING
> (0xC000004BL) would enable diagnosis of this issue as any process
> exit code matching this would be a likely indicator that this issue
> was encountered.
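
The diagnostic idea in the last quoted paragraph could be sketched
roughly as below.  This is only an illustration, not code from Cygwin:
the names kSentinelThreadExitCode and looks_like_thread_exit_race are
hypothetical, and the only value taken from the message is
STATUS_THREAD_IS_TERMINATING (0xC000004B).  It assumes the helper
threads exit with that value and that the parent already has the
child's exit code as an unsigned 32-bit number.

```cpp
#include <cstdint>

// Sentinel the helper threads would pass to ExitThread(); this is the
// STATUS_THREAD_IS_TERMINATING value quoted in the message above.
constexpr std::uint32_t kSentinelThreadExitCode = 0xC000004Bu;

// Hypothetical helper: true when a process exit code equals the
// sentinel, which would suggest that a thread's exit code replaced
// the intended process exit code.
constexpr bool looks_like_thread_exit_race(std::uint32_t process_exit_code)
{
    return process_exit_code == kSentinelThreadExitCode;
}
```

A match is only a likely indicator, as the message says, since nothing
stops a program from deliberately exiting with the same value.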

Maybe the signal thread should really not exit by itself, but just
wait until the TerminateThread is called.  Chris?


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
