This is the mail archive of the cygwin mailing list for the Cygwin project.
Possible race in SYSV IPC (semaphores)
- From: "Lavrentiev, Anton (NIH/NLM/NCBI) [C]" <lavr at ncbi dot nlm dot nih dot gov>
- To: "cygwin at cygwin dot com" <cygwin at cygwin dot com>
- Date: Fri, 26 Oct 2012 14:36:42 -0400
- Subject: Possible race in SYSV IPC (semaphores)
Hi,
For now, I can only report the observed (mis)behavior of the SYSV semop() call,
which (on the client side) manifests as the following:
transport_layer_pipes::connect: lost connection to cygserver, error = 2
(this code then does a by-hand adjustment with semctl(SETVAL)).
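To illustrate the kind of by-hand adjustment mentioned above, here is a minimal sketch and not the actual application code. The helper name sem_adjust, the locally named union sem_arg (used instead of "union semun", which some systems define in <sys/sem.h> and others do not) and the IPC_NOWAIT flag are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Locally named union with the conventional semun layout (assumption:
 * named differently to avoid clashing with systems that define semun). */
union sem_arg {
    int              val;
    struct semid_ds *buf;
    unsigned short  *array;
};

/* Hypothetical helper: apply "delta" to semaphore "num" with semop();
 * if that fails, force the semaphore to "fallback" with semctl(SETVAL).
 * Returns 0 if semop() succeeded, 1 if the fallback was applied,
 * and -1 if both calls failed. */
int sem_adjust(int semid, unsigned short num, short delta, int fallback)
{
    struct sembuf op;
    union sem_arg arg;

    assert(fallback >= 0);          /* semaphore values are never negative */

    op.sem_num = num;
    op.sem_op  = delta;
    op.sem_flg = IPC_NOWAIT;        /* fail instead of blocking */
    if (semop(semid, &op, 1) == 0)
        return 0;

    fprintf(stderr, "semop failed, errno = %d\n", errno);

    arg.val = fallback;             /* overwrite the value by hand */
    return semctl(semid, num, SETVAL, arg) == 0 ? 1 : -1;
}
```

Note that SETVAL overwrites the value unconditionally and is not coordinated with other processes using the same semaphore, so a fallback like this can only paper over the failure.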
Note that there is a dedicated cygserver process running for my single-threaded
application.
Looking at the debugging output of cygserver, this is what I see in the log
(around the only time semctl() is logged there):
cygserver: /home/corinna/src/cygwin/cygwin-1.7.15/cygwin-1.7.15-1/src/cygwin-1.7.15/winsup/cygserver/transport_pipes.cc, line 132: Try to create named pipe: \\.\pipe\cygwin-13a7ed34cc1953a9-lpc
cygserver: /home/corinna/src/cygwin/cygwin-1.7.15/cygwin-1.7.15-1/src/cygwin-1.7.15/winsup/cygserver/transport_pipes.cc, line 132: Try to create named pipe: \\.\pipe\cygwin-13a7ed34cc1953a9-lpc
Note the two pipe creation calls but only a single "exit" log line, such as:
cygserver: /home/corinna/src/cygwin/cygwin-1.7.15/cygwin-1.7.15-1/src/cygwin-1.7.15/winsup/cygserver/sem.cc, line 81: leaving (3416)
Cygserver does not stop. (Also, since SIGSYS is set to be ignored in the program,
the program keeps running too, although not always quite successfully once the
semop() failure has occurred.)
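For reference, ignoring SIGSYS can be sketched as below. This is an illustration under assumptions, not the program's actual code: the helper names ignore_sigsys and ipc_unavailable are hypothetical, and the ENOSYS check reflects my understanding that on Cygwin a SYSV IPC call that cannot reach cygserver fails with ENOSYS once SIGSYS is ignored.

```c
#include <errno.h>
#include <signal.h>

/* Hypothetical helper: ignore SIGSYS so that a failing SYSV IPC call
 * returns an error to the caller instead of terminating the process.
 * Returns 0 on success, -1 on failure. */
int ignore_sigsys(void)
{
    return signal(SIGSYS, SIG_IGN) == SIG_ERR ? -1 : 0;
}

/* Hypothetical helper: given the return code of a SYSV IPC call,
 * report whether it failed with ENOSYS (assumption: this is how an
 * unreachable cygserver shows up when SIGSYS is ignored). */
int ipc_unavailable(int rc)
{
    return rc == -1 && errno == ENOSYS;
}
```

With the signal ignored, each semop()/semctl() return value has to be checked explicitly, since nothing else will flag the failure.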
The semaphore operations are very intensive and at times involve arrays of 5
semaphores; also, quite large chunks of shared memory are updated every now
and then.
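To give an idea of what an operation on an array of 5 semaphores looks like, here is a minimal sketch. It is not the real locking logic; the helper name sem_apply_all and the constant NSEMS are assumptions for illustration. The point is that semop() applies all 5 operations atomically: either every one succeeds or none is applied.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

#define NSEMS 5   /* assumption: one set holding 5 semaphores */

/* Hypothetical helper: apply deltas[i] to semaphore i for all NSEMS
 * semaphores of the set in a single semop() call. Returns what
 * semop() returns: 0 on success, -1 on failure (errno is set). */
int sem_apply_all(int semid, const short deltas[NSEMS], short flags)
{
    struct sembuf ops[NSEMS];
    unsigned short i;

    assert(deltas != NULL);

    for (i = 0; i < NSEMS; i++) {
        ops[i].sem_num = i;          /* index within the set */
        ops[i].sem_op  = deltas[i];  /* amount to add or subtract */
        ops[i].sem_flg = flags;      /* e.g. IPC_NOWAIT or SEM_UNDO */
    }
    return semop(semid, ops, NSEMS);
}
```

If one of the 5 operations cannot proceed and IPC_NOWAIT is set, the call fails with EAGAIN and the other 4 semaphores are left untouched.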
I studied the source of cygserver and noticed that pipe_instance (transport_pipes.cc)
is not declared "volatile". This seems strange, because without it the compiler is
free to reorder or optimize accesses to this variable. And that seems rather critical.
Right now what I observe is that SYSV IPC is unreliable, and I have yet to figure
out why; the very same code (the locking logic) has worked on Linux/Solaris/Mac for
years and on thousands (yes, that many) of hosts. With Cygwin the instability can
appear after widely varying run times: from just a few minutes to many hours,
rather randomly.
Any input would be greatly appreciated.
Anton Lavrentiev
Contractor NIH/NLM/NCBI