This is the mail archive of the cygwin-xfree@cygwin.com mailing list for the Cygwin XFree86 project.



Re: XIO error... anyone seen it? [found and fixed]


Cyg/XF86 friends,

OK... after plenty of debugging and experimenting to re-create the problem,
I finally found both the cause AND the fix.

As you may recall, I had two Cygwin/X applications, both heavy graphics
hogs.  One primarily uses XPutImage, and the other uses the XpmCreate*
functions.  Both were randomly dying with a fatal error of 105, which
indicates a full buffer.  The XPutImage calls were failing when they put HUGE
images to the X server (1280x1024, 5000+ colors, at 32 bits/color).

When I traced the problem back, I found that the XpmCreate* calls were also
dying in XPutImage, which XpmCreate* calls internally.  That was the common
culprit between the two applications.

Ok... time to debug the X11 library :)

XPutImage breaks the image up recursively in PutSubImage (lib/X11/PutImage.c)
so that each piece fits within the maximum request length of the Display.  It
then sends the data out through Send*Image -> Data -> _XSend
(lib/X11/XlibInt.c).
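
A rough sketch of that kind of splitting, not the actual PutSubImage code.
It assumes the core-protocol limit of 65535 four-byte units per request (no
BIG-REQUESTS extension) and simply halves the image until each piece fits.
The helper name count_requests and the 24-byte header figure are illustrative
assumptions.

```c
#include <assert.h>

/* Assumption: the 16-bit request length counts 4-byte units, so one
 * request carries at most 65535 * 4 bytes without BIG-REQUESTS. */
#define MAX_REQUEST_BYTES    (65535UL * 4UL)
/* Assumption: illustrative size of the PutImage request header. */
#define REQUEST_HEADER_BYTES 24UL

/* Hypothetical helper: count the requests needed for a w x h image when
 * it is halved recursively until each piece fits in a single request. */
static unsigned long count_requests(unsigned long w, unsigned long h,
                                    unsigned long bytes_per_pixel)
{
    unsigned long avail = MAX_REQUEST_BYTES - REQUEST_HEADER_BYTES;
    unsigned long row_bytes = w * bytes_per_pixel;

    assert(w > 0 && h > 0 && bytes_per_pixel > 0);

    if (row_bytes * h <= avail)
        return 1;                    /* fits: one request is enough */

    if (h > 1) {                     /* split into top and bottom halves */
        unsigned long top = h / 2;
        return count_requests(w, top, bytes_per_pixel)
             + count_requests(w, h - top, bytes_per_pixel);
    } else {                         /* one row is too wide: split columns */
        unsigned long left = w / 2;
        return count_requests(left, 1, bytes_per_pixel)
             + count_requests(w - left, 1, bytes_per_pixel);
    }
}
```

Under these assumptions, count_requests(1280, 1024, 4) gives 32 pieces for a
single 5 MB image, so one XPutImage call turns into a long burst of socket
writes.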

Within _XSend, it does a bunch of buffer manipulation, and finally calls
_X11TransWritev to write the data out to the socket.  If this call fails, it
checks the macro ETEST(), which is really (errno == EAGAIN || errno ==
EWOULDBLOCK).  If either of these is true, we want to try again, so _XSend
calls _XWaitForWritable.

_XWaitForWritable is a function that waits for the socket to become writable,
so that _X11TransWritev can succeed the next time through the while loop.
So, if I change the definition of ETEST() to (errno == EAGAIN || errno ==
EWOULDBLOCK || errno == ENOBUFS), my problem is fixed!!!

Why?  Because after 5000+ AllocColor requests there is no buffer space left
on the socket, and it needs to drain.  _XWaitForWritable waits for this to
happen, and the write happily succeeds the next time around.
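
The retry pattern can be sketched outside Xlib like this.  It is a minimal
illustration, not the real _XSend: wait_for_writable and send_all are
hypothetical stand-ins for _XWaitForWritable and the _XSend write loop, and
poll() is assumed to be available for the wait.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* The proposed test: treat ENOBUFS as "try again", like EAGAIN. */
#define ETEST() (errno == EAGAIN || errno == EWOULDBLOCK || errno == ENOBUFS)

/* Hypothetical stand-in for _XWaitForWritable: block until fd is writable. */
static int wait_for_writable(int fd)
{
    struct pollfd pfd;

    pfd.fd = fd;
    pfd.events = POLLOUT;
    pfd.revents = 0;
    for (;;) {
        int r = poll(&pfd, 1, -1);
        if (r > 0)
            return 0;                /* socket has room again */
        if (r < 0 && errno != EINTR)
            return -1;               /* real failure */
    }
}

/* Hypothetical stand-in for the _XSend loop: write all len bytes of buf,
 * waiting and retrying whenever the write fails with a retryable errno. */
static int send_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    assert(buf != NULL);
    while (done < len) {
        struct iovec iov;
        ssize_t n;

        iov.iov_base = (void *)(buf + done);
        iov.iov_len = len - done;
        n = writev(fd, &iov, 1);
        if (n > 0) {
            done += (size_t)n;       /* partial writes just advance */
        } else if (n < 0 && ETEST()) {
            if (wait_for_writable(fd) < 0)
                return -1;
        } else if (n < 0 && errno == EINTR) {
            continue;                /* interrupted: retry immediately */
        } else {
            return -1;               /* anything else is fatal */
        }
    }
    return 0;
}
```

Without ENOBUFS in ETEST(), the same loop would fall into the final branch
and give up on the first full buffer, which matches the fatal error the
applications were hitting.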


Should we add the following lines to XlibInt.c?  :
#ifdef __CYGWIN__
#define ETEST() (errno == EAGAIN || errno == EWOULDBLOCK || errno == ENOBUFS)
#endif

Or, is there a better solution? What do you (Cyg/XF86 developers) think?
Brian




--- Alan Hourihane <alanh@fairlite.demon.co.uk> wrote:
> On Mon, Nov 05, 2001 at 02:23:38PM -0800, Brian Genisio wrote:
> > Yes, I am running cygwin1.dll v 1.3.3.
> > 
> > I have been trying to debug the problem, and of course, running in a
> > debugger makes it much harder to find where the problem is.  I think the
> > fact that the app runs slower in a debugger makes this buffer overrun
> > problem no longer happen.  (Sucks for finding the problem, eh?)
> > 
> > In my application where I am having the most trouble, the
> > XpmCreatePixmapFromXpmImage function is where it crashes out.  This
> > function is called many times in a row.  I have tried a few things, with
> > no improvement:
> > 
> > 1. Call XFlush between calls
> > 2. Call XSync between calls
> > 3. Add a delay between calls.
> > 
> > This tells me that the problem is probably being caused by
> > XpmCreatePixmapFromXpmImage itself.  Some Pixmaps being loaded are as
> > large as 2.6 MB (1280x960 in size).  I hooked my app up to xmond, and
> > found that the following protocol calls were being made (for 59 XPM
> > calls):
> > 
> > 1     CreateWindow
> > 4     InternAtom
> > 1     GetProperty
> > 79    CreatePixmap
> > 80    CreateGC
> > 79    FreeGC
> > 154   PutImage
> > 23210 AllocColor
> > 2     QueryExtension
> > 
> > So, we see that a LOT of colors are being allocated quickly, and a lot of
> > images are being put up to the X server in PutImage.
> > 
> > Unfortunately, I am unable to catch it in the act (with the debugger or
> > xmond), so I can't tell you exactly where the problem is :(
> > 
> > Keep in mind, this is the Application crashing, not the X server.
> > 
> Sure. I understand it's the application, and the error is coming back
> from the cygwin kernel. 1.3.4 is out, and maybe worth a try at least.
> 
> You could set up a breakpoint on _XDefaultIOError (which is where
> the XIO: error comes from) and do a backtrace in gdb to find out exactly
> where things are coming from. This way the program should run at
> full speed.
> 
> What's the platform and how much memory have you got? What about the
> size of your paging file too? What's the output of 'uname -a'?
> 
> Alan.



