This is the mail archive of the cygwin-xfree@cygwin.com mailing list for the
Cygwin XFree86 project.
Re: XIO error... anyone seen it? [found and fixed]
- To: Alan Hourihane <alanh at fairlite dot demon dot co dot uk>
- Subject: Re: XIO error... anyone seen it? [found and fixed]
- From: Brian Genisio <briangenisio at yahoo dot com>
- Date: Thu, 8 Nov 2001 14:17:49 -0800 (PST)
- Cc: cygwin-xfree at cygwin dot com
Cyg/XF86 friends,
OK... after plenty of debugging, and plenty of goofing around to re-create
the problem, I finally found both the cause AND the fix.
As you recall, I had 2 cygwin-X applications, and both were heavy graphic
hogs. One primarily uses XPutImage, and the other uses XpmCreate* functions.
Both were randomly dying with a fatal error of 105 (ENOBUFS), which indicates
that no buffer space is available. The calls to XPutImage were failing when
they were putting HUGE images to the X server (1280x1024, 5000+ colors, at
32 bits/color).
Well, when I traced the problem back, I found that the XpmCreate* calls were
dying in the XPutImage calls made inside XpmCreate*, so XPutImage is the
common culprit between the two applications.
Ok... time to debug the X11 library :)
XPutImage breaks the images up recursively, to handle the maximum request
length of the Display in PutSubImage (lib/X11/PutImage.c). It then sends the
data out through the Send*image->Data->_XSend (lib/X11/XlibInt.c).
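
To make the splitting step concrete, here is a rough sketch of the idea. It is
NOT the actual PutSubImage code: put_rows, count_chunk and chunk_fn are names
made up for illustration, it only splits by rows (the case where a single row
is too big for one request is left out), and the 24-byte PutImage header size
is taken as an assumption.

```c
#include <stddef.h>

/* Assumed size in bytes of the PutImage request header. */
#define PUT_IMAGE_HEADER_BYTES 24

/* Hypothetical callback: called once per request that would be sent. */
typedef void (*chunk_fn)(unsigned y, unsigned rows, void *ctx);

/* Example callback that just counts the requests. */
static void count_chunk(unsigned y, unsigned rows, void *ctx)
{
    (void)y;
    (void)rows;
    ++*(unsigned *)ctx;
}

/*
 * Recursively split `rows` image rows starting at row `y` into pieces
 * that each fit in one request of at most `max_request_bytes`.
 * Returns 0 on success, -1 if even a single row does not fit.
 */
static int put_rows(unsigned y, unsigned rows, size_t bytes_per_row,
                    size_t max_request_bytes, chunk_fn emit, void *ctx)
{
    size_t available;
    unsigned fit;

    if (rows == 0)
        return 0;
    if (bytes_per_row == 0 || max_request_bytes <= PUT_IMAGE_HEADER_BYTES)
        return -1;
    available = max_request_bytes - PUT_IMAGE_HEADER_BYTES;

    /* Everything fits in one request: send it. */
    if (bytes_per_row * rows <= available) {
        emit(y, rows, ctx);
        return 0;
    }

    /* Otherwise send as many whole rows as fit, then recurse on the rest. */
    fit = (unsigned)(available / bytes_per_row);
    if (fit == 0)
        return -1; /* one row is too wide; width splitting is omitted here */
    if (put_rows(y, fit, bytes_per_row, max_request_bytes, emit, ctx) != 0)
        return -1;
    return put_rows(y + fit, rows - fit, bytes_per_row, max_request_bytes,
                    emit, ctx);
}
```

Assuming a 262144-byte request limit, a 1280x1024 image at 4 bytes per pixel
(5120 bytes per row) goes out 51 rows at a time, which makes 21 PutImage
requests for a single image.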
Within the _XSend function, it does a bunch of buffer manipulation, and
finally calls _X11TransWritev to write it out to the socket. If this call
fails, it checks the macro ETEST(), which is really (errno == EAGAIN ||
errno == EWOULDBLOCK). If either of these is true, we want to try again, so
_XSend calls _XWaitForWritable.
_XWaitForWritable is a function that waits for the socket to be writeable, so
it can successfully call _X11TransWritev the next time through the while loop.
So, if I change the definition of ETEST() to (errno == EAGAIN || errno ==
EWOULDBLOCK || errno == ENOBUFS), my problem is fixed!!!
Why? Because there is no buffer space available on the socket, after 5000+
AllocColor requests. It needs to be cleaned out. _XWaitForWritable waits for
this to happen, and writing out happily works the next time around.
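
For anyone who wants to see the shape of that retry logic outside of Xlib,
here is a minimal sketch using plain writev() and poll(). It is not the
XlibInt.c code: is_retryable stands in for ETEST() (with ENOBUFS added),
wait_writable stands in for _XWaitForWritable, and send_all stands in for the
loop in _XSend. All three names are made up for illustration.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Stand-in for ETEST(): errors that mean "wait and try again". */
static int is_retryable(int err)
{
    return err == EAGAIN || err == EWOULDBLOCK || err == ENOBUFS;
}

/* Stand-in for _XWaitForWritable: block until fd can accept data. */
static int wait_writable(int fd)
{
    struct pollfd p = { .fd = fd, .events = POLLOUT };
    int rc;

    do {
        rc = poll(&p, 1, -1);
    } while (rc < 0 && errno == EINTR);
    return rc < 0 ? -1 : 0;
}

/*
 * Write every byte described by iov[0..iovcnt-1] to fd.
 * Note: modifies the iov array as data is consumed.
 * Returns 0 on success, -1 on a non-retryable error.
 */
static int send_all(int fd, struct iovec *iov, int iovcnt)
{
    while (iovcnt > 0) {
        ssize_t n = writev(fd, iov, iovcnt);
        size_t left;

        if (n < 0) {
            if (errno == EINTR)
                continue;
            if (is_retryable(errno)) {
                /* Buffer is full: wait for it to drain, then retry. */
                if (wait_writable(fd) < 0)
                    return -1;
                continue;
            }
            return -1; /* anything else is fatal */
        }

        /* Skip past the iovecs that were written completely. */
        left = (size_t)n;
        while (iovcnt > 0 && left >= iov->iov_len) {
            left -= iov->iov_len;
            iov++;
            iovcnt--;
        }
        /* Adjust a partially written iovec so the next pass resumes there. */
        if (iovcnt > 0) {
            iov->iov_base = (char *)iov->iov_base + left;
            iov->iov_len -= left;
        }
    }
    return 0;
}
```

Without ENOBUFS in is_retryable, a full socket buffer lands in the "fatal"
branch, which matches what I was seeing in my applications.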
Should we add the following lines to XlibInt.c? :
#ifdef __CYGWIN__
#define ETEST() (errno == EAGAIN || errno == EWOULDBLOCK || errno == ENOBUFS)
#endif
Or, is there a better solution? What do you (Cyg/XF86 developers) think?
Brian
--- Alan Hourihane <alanh@fairlite.demon.co.uk> wrote:
> On Mon, Nov 05, 2001 at 02:23:38PM -0800, Brian Genisio wrote:
> > Yes, I am running cygwin1.dll v 1.3.3.
> >
> > I have been trying to debug the problem, and of course, running in a
> > debugger makes it much harder to find where the problem is. I think the
> > fact that the app runs slower in a debugger makes this buffer overrun
> > problem no longer happen. (Sucks for finding the problem, eh?)
> >
> > In my application where I am having the most troubles, the
> > XpmCreatePixmapFromXpmImage function is where it crashes out. This
> > function is called many times in a row. I have tried a few things, with
> > no improvement:
> >
> > 1. Call XFlush between calls
> > 2. Call XSync between calls
> > 3. Add a delay between calls.
> >
> > This tells me that the problem is probably being caused by the
> > XpmCreatePixmapFromXpmImage itself. Some Pixmaps being loaded are as
> > large as 2.6 MB (1280x960 in size). I hooked my app up to xmond, and
> > found that the following protocol calls were being made (for 59 XPM
> > calls):
> >
> > 1 CreateWindow
> > 4 InternAtom
> > 1 GetProperty
> > 79 CreatePixmap
> > 80 CreateGC
> > 79 FreeGC
> > 154 PutImage
> > 23210 AllocColor
> > 2 QueryExtension
> >
> > So, we see that a LOT of colors are being allocated quickly, and a lot of
> > images are being put up to the X server in PutImage.
> >
> > Unfortunately, I am unable to catch it in the act (with the debugger or
> > Xmon), so I can't tell you exactly where the problem is :(
> >
> > Keep in mind, this is the Application crashing, not the X server.
> >
> Sure. I understand it's the application, and the error is coming back
> from the cygwin kernel. 1.3.4 is out, and maybe worth a try at least.
>
> You could set up some breakpoints for _XDefaultIOError (which is where
> the XIO: error comes from) and do a backtrace in gdb to find out exactly
> where things are coming from. This way the program should run at
> full speed.
>
> What's the platform and how much memory have you got ? what about the
> size of your paging file too ? what's the output of 'uname -a' ?
>
> Alan.