This is the mail archive of the cygwin-developers@cygwin.com mailing list for the Cygwin project.


Re: Threaded socket hang in 1.3.20


On Tue, Feb 18, 2003 at 05:27:47PM -0500, Jason Tishler wrote:
> The attached C++ testcase demonstrates the problem.  In 1.3.20-1, the
> program hangs in the call to socket() in the second thread:
> 
>     Creating thread for fn1
>     fn1 begin
>     fn1: calling accept()...
>     Creating thread for fn2
>     fn2 begin
>     fn2: calling socket()...
> 
> I'm not sure why connect() fails, because a "telnet localhost 54321"
> works just fine.  I'm probably demonstrating my sockets ignorance.

I looked into this, and it turns out not to be a socket-specific problem
but a deadlock in cygheap:

When accept is called, it creates a new file descriptor by calling

  cygheap_fdnew res_fd;

before calling Winsock's accept().  This in turn takes an exclusive lock
in cygheap_fdnew():

  cygheap_fdnew (int seed_fd = -1, bool lockit = true)
    {
      if (lockit)
	SetResourceLock (LOCK_FD_LIST, WRITE_LOCK | READ_LOCK, "cygheap_fdnew");
      [...]

which is not released until the calling function returns.

Since accept() blocks until a connection is actually made (on blocking
sockets), the lock persists.  The next socket() call also creates a new
file descriptor the same way.  Since the lock is still held, the
creation of the file descriptor hangs in the call to
SetResourceLock().
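
The pattern described above can be sketched with standard C++ threads.
This is only an illustration, not Cygwin's code: fd_list_lock,
FdSlotGuard, blocking_accept and try_socket are hypothetical stand-ins
for LOCK_FD_LIST, cygheap_fdnew, accept() and socket().  The second
caller uses a timed lock attempt so that the sketch reports the
contention instead of hanging.

```cpp
#include <chrono>
#include <functional>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the fd list lock (LOCK_FD_LIST in the post).
static std::timed_mutex fd_list_lock;

// Hypothetical stand-in for cygheap_fdnew: locks on construction and
// keeps the lock until the object goes out of scope.
struct FdSlotGuard {
    FdSlotGuard() { fd_list_lock.lock(); }
    ~FdSlotGuard() { fd_list_lock.unlock(); }
};

// Models accept(): reserves the slot first, then blocks.
void blocking_accept(std::promise<void>& holding, std::future<void> connection) {
    FdSlotGuard res_fd;    // lock taken before the blocking call
    holding.set_value();   // signal that the lock is now held
    connection.wait();     // stands in for the blocking Winsock accept()
}                          // lock released only here

// Models socket(): needs the same lock.  Returns false if the lock could
// not be taken within `wait`; the code described above would hang instead.
bool try_socket(std::chrono::milliseconds wait) {
    if (!fd_list_lock.try_lock_for(wait))
        return false;
    fd_list_lock.unlock();
    return true;
}

// Returns true if socket() was locked out while accept() was blocked.
bool demo_lock_held_across_accept() {
    std::promise<void> holding, connection;
    std::thread t(blocking_accept, std::ref(holding), connection.get_future());
    holding.get_future().wait();   // accept() now holds the lock

    bool blocked = !try_socket(std::chrono::milliseconds(100));

    connection.set_value();        // a peer "connects", accept() returns
    t.join();

    fd_list_lock.lock();           // succeeds now that the guard is gone
    fd_list_lock.unlock();
    return blocked;
}
```

Calling demo_lock_held_across_accept() should return true: while the
first thread sits in its blocking call, nobody else can reserve a
descriptor.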

Looking through our sources, I found some places where cygheap_fdnew
could possibly cause a hang, where the return value isn't tested, or
where the lock is held unnecessarily long because cygheap_fdnew is
called too early.  I've cleaned that up a bit and committed the changes.
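
A rough sketch of the kind of reordering meant here, again with
standard C++ and hypothetical names (fd_list_lock, FdSlotGuard,
late_lock_accept, next_fd and max_fds are not Cygwin identifiers, and
this is not the committed change): the blocking call comes first, the
descriptor is reserved only afterwards, and the result is tested.

```cpp
#include <functional>
#include <future>
#include <mutex>
#include <thread>

static std::mutex fd_list_lock;   // hypothetical stand-in for LOCK_FD_LIST
static int next_fd = 3;           // hypothetical next free descriptor
static const int max_fds = 256;   // hypothetical table size

// Hypothetical stand-in for cygheap_fdnew: fd is -1 if the table is full.
struct FdSlotGuard {
    int fd;
    FdSlotGuard() {
        fd_list_lock.lock();
        fd = (next_fd < max_fds) ? next_fd++ : -1;
    }
    ~FdSlotGuard() { fd_list_lock.unlock(); }
};

// Models a reordered accept(): block first, lock only to record the result.
void late_lock_accept(std::promise<void>& waiting,
                      std::future<void> connection, int& result) {
    waiting.set_value();   // about to block, no lock held
    connection.wait();     // stands in for the blocking Winsock accept()
    FdSlotGuard res_fd;    // lock is held only for this short scope
    result = res_fd.fd;    // caller must test for -1
}

// Returns true if socket() got a descriptor while accept() was blocked
// and accept() got a different, valid descriptor afterwards.
bool demo_late_lock() {
    std::promise<void> waiting, connection;
    int accepted_fd = -1;
    std::thread t(late_lock_accept, std::ref(waiting),
                  connection.get_future(), std::ref(accepted_fd));
    waiting.get_future().wait();   // accept() is blocked now

    int socket_fd;
    {
        FdSlotGuard new_fd;        // models socket(); does not hang
        socket_fd = new_fd.fd;
    }

    connection.set_value();        // a peer "connects"
    t.join();
    return socket_fd >= 0 && accepted_fd >= 0 && socket_fd != accepted_fd;
}
```

The trade-off in this ordering is that a full descriptor table is only
noticed after the blocking call has already returned, so the caller has
to clean up the accepted connection when the reservation fails.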

Now back to the test case.  With these changes the socket() call no
longer hangs, but now connect() is in trouble.  It hangs for a while and
then returns with error 116, Connection timeout.

I must admit that I haven't found the cause so far.  Help in debugging
this is appreciated.

Corinna


-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Developer                                mailto:cygwin at cygwin dot com
Red Hat, Inc.

