This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



[patch] Close fds before executing a cmd


Hi, all!

	While testing systemtap, I found that the following bug
sometimes occurs:
1) stap does not exit (^C does not work).
2) staprun goes into "uninterruptible sleep" in the delete_module syscall.
3) stapio receives SIGCHLDs and exits successfully (this is correct).
4) stap_XXXXXX.ko cannot be unloaded with the rmmod command.
(I found that this is bug #3718.)

	There are two ways to reproduce this bug:
1) Use the option -c "/etc/init.d/nfs start" with any stap script.
2) Call the stap function system("/etc/init.d/nfs start") in a script.
(Both ways reproduce the bug easily on the platforms that I have,
fc6 on i386 and rhel5_ga on ia64.)

	I traced this bug and eventually found the cause: stapio
does not close its open file descriptors before calling execl() to run
a cmd. The processes started by this cmd can read or write on the fds
inherited from stapio. The control_channel fd is especially dangerous:
if one of these processes reads or writes on it, that process is
communicating with stap_XXXXXX.ko, and many bugs follow. (Does
stap_XXXXXX.ko support communicating with two or more processes at
the same time?)
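
The inheritance described above can be reproduced outside systemtap.
What follows is only a minimal sketch, not stapio code: a pipe stands in
for the control channel, the helper names fd_survives_exec() and
set_cloexec() are hypothetical, and the check assumes a Linux /proc and
a /bin/sh whose "test" sees the shell's own descriptors. It also shows
FD_CLOEXEC as one alternative way to keep a single fd out of the child;
the patch in this mail takes a different route and closes the fds in the
child before execl().

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork, then exec
 *   /bin/sh -c "test -e /proc/self/fd/N"
 * and report whether descriptor N was still open in the exec'd shell.
 * Returns 1 if it was open, 0 if it was not, -1 on error.
 */
static int fd_survives_exec(int fd)
{
	char cmd[64];
	int status;
	pid_t pid;

	snprintf(cmd, sizeof(cmd), "test -e /proc/self/fd/%d", fd);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* same exec form that the patched code path uses */
		execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
		_exit(127); /* exec failed */
	}
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	if (WEXITSTATUS(status) == 127)
		return -1;
	return WEXITSTATUS(status) == 0;
}

/* Mark fd close-on-exec. Returns 0 on success, -1 on error. */
static int set_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}
```

Called on a freshly created pipe end, fd_survives_exec() should report
that the fd is visible to the command; after set_cloexec() on the same
fd it should report that the fd is gone.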

Here is an example, ill_req.c:
----------------------------------------------
#include <unistd.h>
#include <string.h>
#define ILL_REQ 1 //STP_EXIT
int main(int argc, char *argv[])
{
	int i;
	int req_type = ILL_REQ;
	char buf[1024];
	memcpy(buf, &req_type, 4);
	// guess which fd is the control_channel and write to each one
	for (i = 3; i < 1024; i++)
		write(i, buf, 4);
	return 0;
}
----------------------------------------------

$ stap -e 'probe begin{system("./ill_req")}'
	I did not terminate it and there is no exit() in the script,
but stap exits quickly!

This bug can be fixed by the following patch:

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>

--- src/runtime/staprun/mainloop.c	2007-09-15 01:11:12.000000000 +0900
+++ src.new/runtime/staprun/mainloop.c	2007-10-09 13:49:24.000000000 +0900
@@ -48,6 +48,30 @@
	sigaction(SIGQUIT, &a, NULL);
}

+/* close all fds except std fds */
+static int close_all(void)
+{
+	struct dirent *d;
+	DIR *fds_dir = opendir("/proc/self/fd");
+	if (fds_dir == NULL)
+		return -1;
+
+	while ((d = readdir(fds_dir)) != NULL) {
+		char *endptr;
+		long fd = strtol(d->d_name, &endptr, 10);
+		if (d->d_name == endptr || *endptr != '\0')
+			continue; /* skip, it's not an integer */
+		if (fd < 0 || (unsigned long)fd >= (unsigned long)INT_MAX)
+			continue; /* skip, it's not a nonnegative-int */
+		if (fd != STDIN_FILENO && fd != STDOUT_FILENO
+			&& fd != STDERR_FILENO && fd != dirfd(fds_dir))
+		{
+			close((int) fd);
+		}
+	}
+	closedir(fds_dir);
+	return 0;
+}

/* * start_cmd forks the command given on the command line
@@ -87,6 +111,9 @@


		/* commands we fork need to run at normal priority */
		setpriority (PRIO_PROCESS, 0, 0);
+
+		if (close_all())
+			_exit(1);
		
		/* wait here until signaled */
		sigwait(&usrset, &signum);
@@ -112,6 +139,8 @@
		_perr("fork");
	} else if (pid == 0) {
		setpriority (PRIO_PROCESS, 0, 0);
+		if (close_all())
+			_exit(1);
		if (execl("/bin/sh", "sh", "-c", cmd, NULL) < 0)
			perr("%s", cmd);
		_exit(1);
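
For anyone who wants to try the /proc/self/fd loop outside of staprun,
the following is a standalone sketch of the same technique. It is not
part of the patch: the name close_extra_fds() and the return value (a
count of closed fds) are hypothetical, and the header list is an
assumption, since the hunks above do not show which headers mainloop.c
already includes.

```c
#include <dirent.h>
#include <fcntl.h>
#include <limits.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Close every open descriptor except stdin, stdout, stderr and the
 * descriptor used to read /proc/self/fd itself.
 * Returns the number of descriptors closed, or -1 if /proc/self/fd
 * cannot be opened.
 */
static int close_extra_fds(void)
{
	struct dirent *d;
	int closed = 0;
	DIR *dir = opendir("/proc/self/fd");

	if (dir == NULL)
		return -1;

	while ((d = readdir(dir)) != NULL) {
		char *end;
		long fd = strtol(d->d_name, &end, 10);

		if (d->d_name == end || *end != '\0')
			continue; /* "." and "..": not an integer */
		if (fd < 0 || fd >= INT_MAX)
			continue; /* does not fit in a nonnegative int */
		if (fd <= STDERR_FILENO || fd == dirfd(dir))
			continue; /* keep std fds and the directory fd */
		if (close((int)fd) == 0)
			closed++;
	}
	closedir(dir);
	return closed;
}
```

As in the patch, this is meant to run in the child between fork() and
execl(), so that the parent keeps its own descriptors.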



