This is the mail archive of the cygwin mailing list for the Cygwin project.



Cygwin or openssh socket problem/bug?


Hello,

I don't know whether this is an "openssh" package problem or a Cygwin
internal problem, so I am posting my findings here...

I run ssh-agent in the background to handle my remote authentication
(CVS).

In my ".bashrc"

---- snip .bashrc ---

[ -z "$SSH_AUTH_SOCK" ] && eval `ssh-agent -s`
[ -z "$SSH_AGENT_PID" ] || ssh-add -l >/dev/null 2>&1 || ssh-add

And ".logout"

---- snip .logout -----

kill $SSH_AGENT_PID

---- snip -----

When I update my Cygwin installation using "setup.exe", ssh-agent
occasionally hangs while consuming 100% CPU.
This happens while the Cygwin Setup Post-Install Script runs.
If I renice the "ssh-agent" process, setup (GUI) exits cleanly, but the
process keeps consuming CPU forever.

$ ssh -V
OpenSSH_4.2p1, OpenSSL 0.9.8a 11 Oct 2005

Using my favorite win32 user mode debugger, ollydbg:

---- snip -----

Threads
Ident      Entry      Data block   Last error                    Status   Priority   User time   System time
0000056C   7C810856   7FFDD000     ERROR_SUCCESS (00000000)      Paused   32 + 0      0.0000 s     0.0000 s
00000C4C   00000000   7FFDF000     ERROR_IO_PENDING (000003E5)   Active   32 - 15    18.6250 s    29.1718 s
00000F9C   7C810856   7FFDE000     ERROR_SUCCESS (00000000)      Paused   32 + 0      0.0000 s     0.0000 s

---- snip -----

Thread 0xC4C consumes CPU forever (its priority shows as "32 - 15"
because I reniced it to idle priority).

I stepped through the disassembly using the Cygwin DLL symbols and the
ssh-agent sources (I don't have debug symbols)...

----------------snip ssh-agent.c -----------------------------

http://www.openbsd.org/cgi-bin/cvsweb/~checkout~/src/usr.bin/ssh/ssh-agent.c?rev=1.123&content-type=text/plain

skip:
	new_socket(AUTH_SOCKET, sock);
	if (ac > 0) {
		signal(SIGALRM, check_parent_exists);
		alarm(10);
	}
	idtab_init();
	if (!d_flag)
		signal(SIGINT, SIG_IGN);
	signal(SIGPIPE, SIG_IGN);
	signal(SIGHUP, cleanup_handler);
	signal(SIGTERM, cleanup_handler);
	nalloc = 0;

	while (1) {
		prepare_select(&readsetp, &writesetp, &max_fd, &nalloc);
		if (select(max_fd + 1, readsetp, writesetp, NULL, NULL) < 0) {
			if (errno == EINTR)
				continue;
			fatal("select: %s", strerror(errno));
		}
		after_select(readsetp, writesetp);
	}
	/* NOTREACHED */
}


static void
after_select(fd_set *readset, fd_set *writeset)
{
	struct sockaddr_un sunaddr;
	socklen_t slen;
	char buf[1024];
	int len, sock;
	u_int i;
	uid_t euid;
	gid_t egid;

	for (i = 0; i < sockets_alloc; i++)
		switch (sockets[i].type) {
		case AUTH_UNUSED:
			break;
		case AUTH_SOCKET:
			if (FD_ISSET(sockets[i].fd, readset)) {
				slen = sizeof(sunaddr);
				sock = accept(sockets[i].fd,
				    (struct sockaddr *) &sunaddr, &slen);
				if (sock < 0) {
					error("accept from AUTH_SOCKET: %s",
					    strerror(errno));
					break;
				}

----------------snip ssh-agent.c -----------------------------

0022EE0C   00402F7C  ssh-agen.00402F7C
0022EE10   00000001  |nfds = 1
0022EE14   004754E0  |Readfds = 004754E0
0022EE18   004754F0  |Writefds = 004754F0
0022EE1C   00000000  |Exceptfds = NULL
0022EE20   00000000  \pTimeout = NULL
0022EE24   00350688
0022EE28   0022EF00
0022EE2C   00000764
0022EE30   7C81B808  RETURN to kernel32.7C81B808 from kernel32.7C80250B
0022EE34   0022D238
0022EE38   61133000  ASCII "Cygwin Setup Post-Install Script"

----------------snip -----------------------------

The problem is the following (forever) loop:

----

	while (1) {
		prepare_select(&readsetp, &writesetp, &max_fd, &nalloc);
		if (select(max_fd + 1, readsetp, writesetp, NULL, NULL) < 0) {
			if (errno == EINTR)
				continue;
			fatal("select: %s", strerror(errno));
		}
		after_select(readsetp, writesetp);
	}
----

"int cygwin_select(int, _types_fd_set*, _types_fd_set*, _types_fd_set*,
timeval*)" 

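in ssh-agent's main returns 1 (eax).

"after_select" is then called, which calls "cygwin1.accept()".

"accept" returns -1 (eax), and lasterror/errno is 0x6C.

errno = 0x6C -> #define	ESHUTDOWN	108	/* Cannot send after transport endpoint shutdown */

Since "after_select" only logs the error and breaks, control goes back
to the loop, "select" reports the same descriptor as readable again, and
the cycle repeats without ever blocking.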
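
The quoted "after_select" treats every accept() failure the same way: it logs the error and breaks. A minimal sketch of how the errno could instead be sorted into "worth retrying" and "retrying the same descriptor will not help" follows. The helper name accept_errno_is_transient is hypothetical and is not part of ssh-agent.c or Cygwin; the choice of which values count as transient is an assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper, not taken from ssh-agent.c.
 * Returns 1 if an accept() failure with this errno is usually worth
 * retrying, 0 if retrying on the same descriptor is unlikely to help.
 */
int
accept_errno_is_transient(int err)
{
	assert(err > 0);	/* errno values are positive; 0 means no error */

	switch (err) {
	case EINTR:		/* interrupted by a signal */
	case EAGAIN:		/* nothing pending on a non-blocking socket */
#if defined(EWOULDBLOCK) && EWOULDBLOCK != EAGAIN
	case EWOULDBLOCK:
#endif
	case ECONNABORTED:	/* peer went away before accept() ran */
		return 1;
	default:		/* e.g. ESHUTDOWN, EBADF, ENOTSOCK */
		return 0;
	}
}
```

With such a split, the ESHUTDOWN (108) seen here would fall into the second group, so a caller could stop polling that descriptor rather than going back into select() with it unchanged.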

-----

The main question is: can Cygwin's "select" succeed and the following
"accept" then fail because the descriptor is not (or is no longer) a
usable socket?

Is the problem openssh or cygwin related?
Any thoughts?
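
On the general question: POSIX does not guarantee that accept() succeeds just because select() reported the listening descriptor as readable (for example, a pending connection can be aborted in between). Whether that is what happens in this Cygwin case is not established here. Independent of the cause, a loop like the one quoted can be protected against spinning by counting consecutive accept() failures and dropping the listener after a limit. The sketch below shows only that bookkeeping; struct accept_guard, accept_guard_update and MAX_ACCEPT_FAILURES are hypothetical names, and the limit of 16 is arbitrary. None of this is existing OpenSSH behaviour.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ACCEPT_FAILURES 16	/* arbitrary limit chosen for illustration */

/* Hypothetical bookkeeping for one listening socket. */
struct accept_guard {
	unsigned int consecutive_failures;
};

/*
 * Record the outcome of one accept() call (ok != 0 means success).
 * Returns 1 while the listener should stay in the select() set, and 0
 * once it has failed MAX_ACCEPT_FAILURES times in a row and should be
 * closed or removed by the caller.
 */
int
accept_guard_update(struct accept_guard *g, int ok)
{
	assert(g != NULL);

	if (ok) {
		g->consecutive_failures = 0;	/* any success resets the count */
		return 1;
	}
	if (g->consecutive_failures < MAX_ACCEPT_FAILURES)
		g->consecutive_failures++;
	return g->consecutive_failures < MAX_ACCEPT_FAILURES;
}
```

In a loop shaped like the quoted one, the caller would pass (sock >= 0) after each accept() and, on a 0 return, stop adding that descriptor to the read set so that select() can block again.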

Regards,

Robert Michelsen
--


--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/

