This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.



Re: [PATCH v2] GDBserver crashes when killing a multi-thread process




On 10/07/14 16:16, Pedro Alves wrote:
+static void
+kill_wait_lwp (struct lwp_info *lwp)
+{
+  struct thread_info *thr = get_lwp_thread (lwp);
+  int pid = ptid_get_pid (ptid_of (thr));
+  int lwpid = ptid_get_lwp (ptid_of (thr));
+  int wstat;
+  int res;
+
+  if (debug_threads)
+    debug_printf ("kwl: killing lwp %d, for pid: %d\n", lwpid, pid);
+
+  do
+    {
+      linux_kill_one_lwp (lwp);
+
+      /* Make sure it died.  Notes:
+
+	 - The loop is most likely unnecessary.
+
+         - We don't use linux_wait_for_event as that could delete lwps
+           while we're iterating over them.  We're not interested in
+           any pending status at this point, only in making sure all
+           wait status on the kernel side are collected until the
+           process is reaped.
+
+	 - We don't use __WALL here as the __WALL emulation relies on
+	   SIGCHLD, and killing a stopped process doesn't generate
+	   one, nor an exit status.
+      */
+      res = my_waitpid (lwpid, &wstat, 0);
+      if (res == -1 && errno == ECHILD)
+	res = my_waitpid (lwpid, &wstat, __WCLONE);
+    } while (res > 0 && WIFSTOPPED (wstat));
+
+  gdb_assert (res > 0);
+}

Hi Pedro,
do you still remember why you added this assert?  It wasn't
mentioned in the mail https://sourceware.org/ml/gdb-patches/2014-07/msg00206.html

I am looking at a GDBserver internal error on x86_64 when I run
gdb.threads/thread-unwindonsignal.exp with GDBserver:

continue
Continuing.
warning: Remote failure reply: E.No unwaited-for children left.
PC register is not available
(gdb) FAIL: gdb.threads/thread-unwindonsignal.exp: continue until exit
Remote debugging from host 127.0.0.1
ptrace(regsets_fetch_inferior_registers) PID=30700: No such process
ptrace(regsets_fetch_inferior_registers) PID=30700: No such process
ptrace(regsets_fetch_inferior_registers) PID=30700: No such process
ptrace(regsets_fetch_inferior_registers) PID=30700: No such process
monitor exit
Killing process(es): 30694
(gdb) /home/yao/SourceCode/gnu/gdb/git/gdb/gdbserver/linux-low.c:1106: A problem internal to GDBserver has been detected.
kill_wait_lwp: Assertion `res > 0' failed.

After your patch https://sourceware.org/ml/gdb-patches/2015-03/msg00597.html
GDBserver started to swallow errors if the LWP is gone.  Then, when
GDBserver kills a nonexistent LWP, the assert is triggered.
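
As a standalone illustration of that failure mode (not GDBserver code;
the helper name below is hypothetical and plain waitpid stands in for
my_waitpid), waiting on a child that has already been reaped returns
-1 with errno set to ECHILD, so a check like `res > 0' cannot hold:

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only.  Fork a child, kill and
   reap it, then wait for it a second time.  Returns the errno of the
   second waitpid, 0 if that call unexpectedly succeeded, or -1 if the
   setup itself failed.  */
static int
errno_of_wait_on_reaped_child (void)
{
  int wstat;
  pid_t child = fork ();

  if (child == -1)
    return -1;

  if (child == 0)
    {
      /* Child: block until the parent kills us.  */
      for (;;)
        pause ();
    }

  /* Parent: kill the child and collect its single wait status.  */
  kill (child, SIGKILL);
  if (waitpid (child, &wstat, 0) != child)
    return -1;

  /* The child is gone now, so this wait has nothing left to report.  */
  errno = 0;
  if (waitpid (child, &wstat, 0) == -1)
    return errno;

  return 0;
}
```

Calling errno_of_wait_on_reaped_child () is expected to yield ECHILD,
which is the situation the assert runs into once the LWP has already
disappeared.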

Why don't we implement kill_wait_lwp like its counterpart in GDB,
linux-nat.c:kill_wait_callback?  We can loop and assert as in the
patch below.  (Note that this patch fixes the internal error, but
the FAIL is still there.)

--
Yao (齐尧)

diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index 7bb9f7f..07d051a 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -1101,9 +1101,9 @@ kill_wait_lwp (struct lwp_info *lwp)
       res = my_waitpid (lwpid, &wstat, 0);
       if (res == -1 && errno == ECHILD)
        res = my_waitpid (lwpid, &wstat, __WCLONE);
-    } while (res > 0 && WIFSTOPPED (wstat));
+    } while (res == lwpid);

-  gdb_assert (res > 0);
+  gdb_assert (res == -1 && errno == ECHILD);
 }

 /* Callback for `find_inferior'.  Kills an lwp of a given process,
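
The loop-until-ECHILD idea in the patch above can be sketched outside
of GDBserver as follows.  This is a minimal sketch under stated
assumptions: kill_and_drain and spawn_paused_child are hypothetical
names, plain kill/waitpid stand in for linux_kill_one_lwp and
my_waitpid, and the __WCLONE retry is left out because the child here
is an ordinary forked process rather than a clone thread.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that blocks until it is killed.
   Returns the child's pid, or -1 if fork failed.  */
static pid_t
spawn_paused_child (void)
{
  pid_t child = fork ();

  if (child == 0)
    {
      for (;;)
        pause ();
    }

  return child;
}

/* Hypothetical helper: kill PID and keep collecting its wait statuses
   until the kernel reports there is nothing left to wait for.
   Returns the number of statuses collected, or -1 if the loop ended
   for any reason other than ECHILD.  */
static int
kill_and_drain (pid_t pid)
{
  int wstat;
  int collected = 0;
  pid_t res;

  do
    {
      /* Errors are ignored on purpose: the process may be gone.  */
      kill (pid, SIGKILL);

      res = waitpid (pid, &wstat, 0);
      if (res == pid)
        collected++;
    }
  while (res == pid);

  /* Same end condition as the assert in the patch above.  */
  return (res == -1 && errno == ECHILD) ? collected : -1;
}
```

With this shape, a process that is already gone is not an error: the
first waitpid fails with ECHILD, the loop ends at once, and the final
check still holds.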

