This is the mail archive of the
cygwin-xfree@cygwin.com
mailing list for the Cygwin XFree86 project.
More Info on "Can't open display"
- To: cygwin-xfree at cygwin dot com
- Subject: More Info on "Can't open display"
- From: "Christopher Landrieu" <landrieu at hotmail dot com>
- Date: Mon, 18 Jun 2001 17:27:09
Okay... I finally found enough time this past weekend to pursue this a
little further. I started looking through the code and found the handy
XTRANSDEBUG feature. More on that in a bit.
First, I appreciate everyone's recent comments and suggestions. I've gotten
a lot of ideas from the various discussions and thoughts that have been
flying through the list.
On the topic of UNIX domain sockets and Cygwin's implementation of them, I
don't believe that this is causing the Aventail troubles. The requirements
for the transport code to choose the domain socket communication method are
very specific. I found the code in xc/lib/X11/ConnDis.c and in
xc/lib/xtrans/. It seems that domain sockets (AF_UNIX address family) will
only be used in place of TCP sockets when the IP address in $DISPLAY matches
the machine's local IP address. It can also be specified in $DISPLAY by
prepending "unix/".
Just the same, "tcp/" can be prepended to specify that a TCP socket
connection should be used (AF_INET). After finding this, I decided to play
around with it a bit to see if it changed anything. Unfortunately, the
problem remained.
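
To make the "unix/" and "tcp/" prefixes concrete, here is a minimal sketch of
splitting such a prefix off a $DISPLAY-style string. This is not the code from
ConnDis.c or xtrans; the function name split_display_protocol() and its
behaviour are assumptions made purely for illustration.

```c
#include <string.h>

/* Hypothetical helper, NOT the Xlib/xtrans implementation.
 * Copies the text before the first '/' into proto (empty string if the
 * display string has no protocol prefix) and returns a pointer to the
 * remainder, e.g. "127.0.0.1:0". proto_len must be at least 1. */
const char *split_display_protocol(const char *display,
                                   char *proto, size_t proto_len)
{
    const char *slash = strchr(display, '/');
    const char *colon = strchr(display, ':');
    size_t n;

    proto[0] = '\0';

    /* Only treat the text before '/' as a protocol name when the '/'
     * comes before the ':' that introduces the display number. */
    if (slash == NULL || (colon != NULL && slash > colon))
        return display;

    n = (size_t)(slash - display);
    if (n >= proto_len)
        n = proto_len - 1;          /* truncate rather than overflow */
    memcpy(proto, display, n);
    proto[n] = '\0';

    return slash + 1;
}
```

With this sketch, "tcp/127.0.0.1:0" yields the protocol "tcp" and the
remainder "127.0.0.1:0", while a plain "127.0.0.1:0.0" yields an empty
protocol and is returned unchanged.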
Also, the domain sockets topic does not explain why local X clients would
fail to connect to remote displays.
Anyway, after fishing through xc/lib/xtrans a bit more, I found what I
needed to fully enable XTRANSDEBUGing (debugging messages in the transport
code of XFree86). So, I turned it up to five, did "make World", and
installed the DLLs. When I tried xterm, I got a flurry of useful messages
in my command window.
What I found in the debug messages is that it seems as if setting the socket
connection to be non-blocking is what is failing. Here is the debug from a
failed run (with Aventail running):
=== Begin debug listing ===
Xtrans debug:_X11TransOpenCOTSClient(tcp/127.0.0.1:0)
_X11TransOpen(1,tcp/127.0.0.1:0)
_X11TransParseAddress(tcp/127.0.0.1:0)
_X11TransSelectTransport(tcp)
_X11TransSocketOpenCOTSClient(tcp,127.0.0.1,0)
_X11TransSocketSelectFamily(tcp)
_X11TransSocketOpen(1,1)
_X11TransConnect(3,tcp/127.0.0.1:0)
_X11TransParseAddress(tcp/127.0.0.1:0)
_X11TransSocketINETConnect(3,127.0.0.1,0)
_X11TransSocketINETConnect: inet_addr(127.0.0.1) = 100007f
_X11TransSocketINETConnect: sockname.sin_port = 6000
_X11TransSocketINETGetAddr(14ad7108)
_X11TransSocketINETGetPeerAddr(14ad7108)
_X11TransGetPeerAddr(3)
_X11TransConvertAddress(2,16,14ad7198)
_X11TransSetOption(3,2,1)
_X11TransSocketWritev(3,26bf99c,1)
_X11TransSetOption(3,1,1)
_X11TransSocketDisconnect(14ad7108,3)
_X11TransClose(3)
_X11TransSocketINETClose(14ad7108,3)
_X11TransFreeConnInfo(14ad7108)
xterm Xt error: Can't open display: tcp/127.0.0.1:0.0
=== End debug listing ===
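
For anyone reading the two _X11TransSocketINETConnect lines in the listing:
the value 100007f is presumably the network-byte-order address for 127.0.0.1
printed as hex on a little-endian machine, and 6000 is the conventional X11
TCP base port plus the display number (0 here). A small sketch, with
illustrative names that do not come from the XFree86 sources:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>

/* Conventional base port for X11 over TCP; the macro name is illustrative. */
#define X_TCP_BASE_PORT 6000

/* TCP port (host byte order) for a given X display number. */
unsigned x11_tcp_port(unsigned display_number)
{
    return X_TCP_BASE_PORT + display_number;
}

/* Dotted-quad address converted to host byte order.
 * inet_addr() returns the address in network byte order, which is why a
 * raw hex dump of its result for 127.0.0.1 looks byte-swapped on x86. */
uint32_t ipv4_host_order(const char *dotted)
{
    return ntohl(inet_addr(dotted));
}
```

Here x11_tcp_port(0) gives 6000 and ipv4_host_order("127.0.0.1") gives
0x7f000001 regardless of the machine's endianness.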
The _X11TransSetOption(3,1,1) message corresponds to the call in
_XSendClientPrefix() to _X11TransSetOption(dpy->trans_conn,
TRANS_NONBLOCKING, 1). This essentially does either an fcntl() or an
ioctl() on the socket to poll the socket instead of blocking on recv()
calls (when there is no data available yet).
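
As a rough illustration of what that option amounts to, here is a minimal
sketch of the fcntl() variant, plus a check that recv() on an empty
non-blocking socket returns immediately instead of waiting. This is not the
xtrans code; set_nonblocking() and recv_would_block() are names made up for
this example, and a socketpair() stands in for the real X connection.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Set O_NONBLOCK on fd via fcntl(). Returns 0 on success, -1 on failure. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Returns 1 if recv() on an empty non-blocking socket fails right away
 * with EAGAIN/EWOULDBLOCK, 0 if it behaves differently, and -1 if the
 * sockets could not be set up. */
int recv_would_block(void)
{
    int sv[2];
    char buf[8];
    ssize_t n;
    int result;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    if (set_nonblocking(sv[0]) == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* Nothing has been written to sv[1], so there is no data to read. */
    n = recv(sv[0], buf, sizeof buf, 0);
    result = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) ? 1 : 0;

    close(sv[0]);
    close(sv[1]);
    return result;
}
```

If set_nonblocking() itself returns -1 on a given platform, that would be
consistent with the option call being the point of failure.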
The next message after that is the client disconnecting and closing the
socket (so Aventail is not closing the socket). Now of course this does not
necessarily mean that it is that call that is failing. These messages only
display for calls to the functions in xc/lib/xtrans. There may be something
after this call which is failing and then initiating the disconnect and
close.
After seeing this, I decided to check out what happens when the connection
is successful (by putting xterm in as an exception in Aventail). I did this
and got the following:
=== Begin debug listing ===
_X11TransOpenCOTSClient(tcp/127.0.0.1:0)
_X11TransOpen(1,tcp/127.0.0.1:0)
_X11TransParseAddress(tcp/127.0.0.1:0)
_X11TransSelectTransport(tcp)
_X11TransSocketOpenCOTSClient(tcp,127.0.0.1,0)
_X11TransSocketSelectFamily(tcp)
_X11TransSocketOpen(1,1)
_X11TransConnect(3,tcp/127.0.0.1:0)
_X11TransParseAddress(tcp/127.0.0.1:0)
_X11TransSocketINETConnect(3,127.0.0.1,0)
_X11TransSocketINETConnect: inet_addr(127.0.0.1) = 100007f
_X11TransSocketINETConnect: sockname.sin_port = 6000
_X11TransSocketINETGetAddr(14ad7108)
_X11TransSocketINETGetPeerAddr(14ad7108)
_X11TransGetPeerAddr(3)
_X11TransConvertAddress(2,16,14ad7198)
_X11TransSetOption(3,2,1)
_X11TransSocketWritev(3,26bf99c,1)
_X11TransSetOption(3,1,1)
_X11TransSocketRead(3,26bfa54,8)
_X11TransSocketRead(3,26bfa54,8)
_X11TransSocketRead(3,14ad7840,256)
_X11TransSocketWrite(3,14ad3428,64)
_X11TransSocketRead(3,26bfa6c,32)
_X11TransSocketRead(3,26bfa6c,32)
_X11TransSocketRead(3,26bfa6c,32)
_X11TransSocketWrite(3,14ad3428,4)
...
=== End debug listing ===
The messages go on and on from there. As you can see, the next message
after the "_X11TransSetOption(3,1,1)" is a call to _X11TransSocketRead().
This read takes place about 15 lines down in XOpenDisplay() from where
_XSendClientPrefix() returns. There isn't much there between the
_X11TransSetOption(3,1,1) and the call to _X11TransSocketRead(). The only
thing that looked suspicious to me is the LockDisplay() call. I traced
through it and couldn't find anything else that would cause it to abort due
to a network problem. Therefore, by process of elimination, I think it must
be the attempt to set TRANS_NONBLOCKING on the socket.
My next plan of attack is to write a small test program or two to test
the use of that flag. Also, I want to trace through the Cygwin code for
fcntl() and/or ioctl(), whichever one is being used in this case, and find
out how Cygwin translates that to a WinSock call.
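
A test along those lines could look roughly like the sketch below. It tries
both ways of requesting non-blocking mode on a fresh TCP socket and reports
which ones succeed. The function name probe_nonblocking() and the NB_* flags
are invented for this example, and the sketch does not say which of the two
calls xtrans actually uses on Cygwin.

```c
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Result bits (illustrative names). */
#define NB_FCNTL_OK 1   /* fcntl(F_SETFL, O_NONBLOCK) succeeded */
#define NB_IOCTL_OK 2   /* ioctl(FIONBIO) succeeded */

/* Tries each method on its own freshly created AF_INET stream socket.
 * Returns a bitmask of NB_* flags, or -1 if a socket cannot be created. */
int probe_nonblocking(void)
{
    int result = 0;
    int on = 1;
    int fd, flags;

    /* Method 1: fcntl() with O_NONBLOCK. */
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;
    flags = fcntl(fd, F_GETFL, 0);
    if (flags != -1 && fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1)
        result |= NB_FCNTL_OK;
    close(fd);

    /* Method 2: ioctl() with FIONBIO. */
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;
    if (ioctl(fd, FIONBIO, &on) != -1)
        result |= NB_IOCTL_OK;
    close(fd);

    return result;
}
```

Running this once with Aventail active and once with the program listed as
an exception would show whether either call behaves differently in the two
cases.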
I am confident that this will lead us closer to either a bug report to give
to Aventail or a fix (or at least something to discuss) in Cygwin and
XFree86.
Chris