This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.



Re: [PATCH 2/4][REPOST] Remote Linux ptrace exec events


Hi Luis,
Thanks for the review.

On 5/7/2014 1:01 PM, Luis Machado wrote:
Hi,

On 04/30/2014 04:18 PM, Don Breazeal wrote:
From: Don Breazeal <don_breazeal@mentor.com>

[Reposting to eliminate unexpected attachment type.]

This patch implements support for exec events in gdbserver on linux, in
multiprocess mode (target extended-remote).  Follow-exec-mode and rerun
behave as expected.  Catchpoints for exec are not yet implemented since
it will be easier to implement catchpoints for fork, vfork, and exec all
at the same time.

TESTING
---------
The patch was tested on GNU/Linux x86_64 with --target_board set to
native-gdbserver and native-extended-gdbserver, as well as testing native
GDB.  The test results for native-gdbserver were unchanged.  Thirteen
tests that used to fail for native-extended-gdbserver on Linux pass with
this patch, and the non-ldr-exc-*.exp tests all pass in non-stop mode and
extended-remote.  There are several failures in the new non-ldr-exc-*.exp
tests in non-stop mode with native GDB.

One caveat: when an exec is detected, gdbserver emits a couple of warnings:
     gdbserver: unexpected r_debug version 0
     gdbserver: unexpected r_debug version 0
However, debugging of shared libraries that are loaded by the exec'd
program works just fine.  These messages may be caused by gdbserver
making an attempt to initialize the solib hook before the r_debug
structure has been initialized.  I intend to follow up in a subsequent
patch.

IMPLEMENTATION
----------------
Support for exec events in single-threaded programs was a fairly
straightforward replication of the implementation in native GDB:
    1) Enable exec events via ptrace options.
    2) Add support for handling the exec events to handle_extended_wait
       and linux_wait_for_event_filtered.  Detect the exec event, then
       find and save the pathname of the executable file being exec'd.
    3) Implement an additional "stop reason", "exec", in the RSP stop
       reply packet "T".
Existing GDB code takes care of handling the exec event on the host side
without modification.
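
To make item 3 concrete, here is a minimal sketch of how the start of a
"T" stop reply carrying the "exec" stop reason could be assembled.  It
only covers the "T<sig>exec:<hex pathname>;" prefix that the patch emits;
the register and thread fields of a real reply are omitted.  The names
hexify and build_exec_stop_reply are hypothetical and are not gdbserver
functions, and the signal value 5 used in the usage note is assumed to
correspond to GDB_SIGNAL_TRAP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the patch): write the hex encoding of
   LEN bytes from BIN into HEX, which needs room for 2 * LEN + 1 bytes.  */
static void
hexify (const char *bin, size_t len, char *hex)
{
  static const char digits[] = "0123456789abcdef";
  size_t i;

  for (i = 0; i < len; i++)
    {
      unsigned char byte = (unsigned char) bin[i];

      hex[2 * i] = digits[byte >> 4];
      hex[2 * i + 1] = digits[byte & 0xf];
    }
  hex[2 * len] = '\0';
}

/* Hypothetical helper: build "T<sig>exec:<hex pathname>;" in BUF.
   Returns 0 on success, -1 if BUF (of size BUFSIZE) is too small.  */
int
build_exec_stop_reply (char *buf, size_t bufsize, int sig,
                       const char *pathname)
{
  size_t pathlen = strlen (pathname);
  /* 'T' + two hex digits + "exec:" + hex pathname + ';' + NUL.  */
  size_t needed = 1 + 2 + 5 + 2 * pathlen + 1 + 1;
  int n;

  assert (sig >= 0 && sig <= 0xff);
  if (bufsize < needed)
    return -1;

  n = sprintf (buf, "T%02xexec:", sig);
  hexify (pathname, pathlen, buf + n);
  strcat (buf, ";");
  return 0;
}
```

With sig set to 5 and pathname "/bin/ls", the buffer would hold
"T05exec:2f62696e2f6c73;".  Checking the required size up front avoids
relying on the pathname fitting in a fixed-size scratch buffer.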

Support for exec events in multi-threaded programs required some
additional work, including a couple of significant changes to existing
code.  In a nutshell, the changes are to:
    4) Use the PTRACE_EVENT_EXIT extended event to handle thread exit,
       while not exposing any change in exit handling to the user.  The
       rationale for this is discussed in the "patch 0" email of this
       series.
    5) Recognize when the exec'ing thread has vanished (become the thread
       group leader) in send_sigstop.  Native GDB does this differently.

Regarding items 4 & 5: Recall that when a non-leader thread exec's, all
the other threads are terminated and the exec'ing thread changes its
thread id to that of the old leader (the process id) as part of the exec.
There is no event reported for the "exit" of the exec'ing thread; it
appears to have vanished.  The original thread group leader can't be
reaped until all the other threads have been reaped, and some way of
determining that it has exited is required in order to update the lwp
list (#4 above).  Also, some mechanism for deleting the lwp entry
corresponding to the exec'ing thread is needed (#5 above).

The rationale for #4 is that in my testing I ran into a race condition in
the mechanism that's intended to detect when a thread group leader has
exited, check_zombie_leaders.  The race occurred when there were only two
threads in the program.  In this case the leader thread passes through a
very brief zombie state before being replaced by the exec'ing thread as
the thread group leader.  This state transition is asynchronous, with no
dependency on anything gdbserver does.  Using PTRACE_EVENT_EXIT ensures
that the leader exit will be detected.  I can provide a more detailed
explanation of the race, but I didn't want to be too long-winded here.

Regarding item #5, determining that the exec'ing thread has "vanished":
in native GDB this is done by calling waitpid(PID), and if it returns
ECHILD it means that the thread is gone.  We don't want to use
waitpid(PID) in gdbserver, based on the discussion in
https://www.sourceware.org/ml/gdb-patches/2014-02/msg00828.html.  An
alternative is to send a signal to each thread and look for an ESRCH (No
such process) error.  In all-stop mode this can be done in the normal
course of events, since when gdbserver reports an exec event it stops all
the other threads with a SIGSTOP.  In non-stop mode, when an exec event
has been detected, we can call stop_all_lwps/unstop_all_lwps to
accomplish the same thing.
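
The ESRCH check described above can be illustrated outside gdbserver.
This is only a sketch under stated assumptions, not the patch's code: it
probes a reaped child process with kill and signal 0, whereas the patch
sends SIGSTOP to an individual LWP through kill_lwp.  The names
lwp_has_vanished and spawn_and_reap_child are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if signalling PID fails with ESRCH
   (no such process), 0 otherwise.  Signal 0 performs the existence
   check without delivering a signal.  */
int
lwp_has_vanished (pid_t pid)
{
  assert (pid > 0);
  errno = 0;
  return kill (pid, 0) == -1 && errno == ESRCH;
}

/* Hypothetical helper for demonstration: fork a child that exits at
   once, reap it, and return its pid.  After the reap, the pid no
   longer names a process (ignoring the unlikely case of pid reuse).
   Returns -1 if fork fails.  */
pid_t
spawn_and_reap_child (void)
{
  pid_t pid = fork ();

  if (pid == 0)
    _exit (0);

  if (pid > 0)
    {
      int status;

      /* Retry if the wait is interrupted by a signal.  */
      while (waitpid (pid, &status, 0) == -1 && errno == EINTR)
        ;
    }
  return pid;
}
```

Note that a zombie that has not been reaped still accepts signals, so a
successful kill does not prove the thread is alive; only the ESRCH case
is conclusive, which is the case send_sigstop relies on.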

gdb/
2014-04-02  Don Breazeal  <donb@codesourcery.com>

    * common/linux-ptrace.c (linux_test_for_tracefork)
    [GDBSERVER]: Add exec tracing to ptrace options if OS supports
    it.
    * remote.c (remote_parse_stop_reply): Support new RSP stop
    reply reason 'exec'.

gdbserver/
2014-04-02  Don Breazeal  <donb@codesourcery.com>

    * gdb/gdbserver/linux-low.c (linux_child_pid_to_exec_file): New
    function.
    (handle_extended_wait): Add support for PTRACE_EVENT_EXEC.
    (check_zombie_leaders): Update comment.
    (linux_low_filter_event): Add support for PTRACE_EVENT_EXEC.
    (linux_wait_for_event_filtered): Update comment.
    (extended_event_reported): New function.
    (send_sigstop): Delete lwp on 'No such process' error and
    reset current_inferior.
    * gdb/gdbserver/linux-low.h (struct lwp_info): New member
    'waitstatus'.
    * gdb/gdbserver/remote-utils.c (prepare_resume_reply): Support
    new RSP stop reply reason 'exec'.

---
  gdb/common/linux-ptrace.c    |    7 +-
  gdb/gdbserver/linux-low.c    |  175 ++++++++++++++++++++++++++++++++++++++----
  gdb/gdbserver/linux-low.h    |    5 +
  gdb/gdbserver/remote-utils.c |   28 +++++++-
  gdb/remote.c                 |   27 ++++++-
  5 files changed, 222 insertions(+), 20 deletions(-)

diff --git a/gdb/common/linux-ptrace.c b/gdb/common/linux-ptrace.c
index e3fc705..b137df9 100644
--- a/gdb/common/linux-ptrace.c
+++ b/gdb/common/linux-ptrace.c
@@ -491,8 +491,11 @@ linux_test_for_tracefork (int child_pid)
    if (ret == child_pid && WIFSTOPPED (status)
        && status >> 16 == PTRACE_EVENT_EXIT)
      {
-        /* PTRACE_O_TRACEEXIT is supported.  */
-        current_ptrace_options |= PTRACE_O_TRACEEXIT;
+      /* PTRACE_O_TRACEEXIT is supported.  We use exit events to
+     implement support for exec events.  Since fork events are
+     supported we know exec events are supported, so we enable
+     exec events here.  */
+      current_ptrace_options |= PTRACE_O_TRACEEXIT | PTRACE_O_TRACEEXEC;
      }
  #endif
  }
diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index 90e7b15..5f94490 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -281,6 +281,32 @@ static int linux_event_pipe[2] = { -1, -1 };
  static void send_sigstop (struct lwp_info *lwp);
  static void wait_for_sigstop (void);

+/* Accepts an integer PID.  Returns a string containing the
+   name of the executable file for the child process.
+   Space for the result is malloc'd, caller must free.  */
+
+static char *
+linux_child_pid_to_exec_file (int pid)
+{
+  char *name1, *name2;
+
+  name1 = xmalloc (PATH_MAX);
+  name2 = xmalloc (PATH_MAX);
+  memset (name2, 0, PATH_MAX);
+
+  sprintf (name1, "/proc/%d/exe", pid);
+  if (readlink (name1, name2, PATH_MAX) > 0)
+    {
+      free (name1);
+      return name2;
+    }
+  else
+    {
+      free (name2);
+      return name1;
+    }
+}
+

At one point in time this function existed in gdbserver/linux-low.c but
got deleted by the following commit:

commit be07f1a20c962deb662b93209b4ca91bc8e5cbd8
Author: Pedro Alves <palves@redhat.com>
Date:   Fri Jan 27 19:23:43 2012 +0000

     2012-01-27  Pedro Alves  <palves@redhat.com>

         * linux-low.c (linux_child_pid_to_exec_file): Delete.
         (elf_64_file_p): Make static.
         (linux_pid_exe_is_elf_64_file): New.
         * linux-low.h (linux_child_pid_to_exec_file, elf_64_file_p):
         Delete declarations.
         (linux_pid_exe_is_elf_64_file): Declare.
         * linux-x86-low.c (x86_arch_setup): Use
         linux_pid_exe_is_elf_64_file.

It still exists in linux-nat.c though. I remember I tried to unify those
and move the resulting function to common/, but there may be differences
preventing such a move.

If it can be done, it would be best.

I am working on doing this. It seems like it should be possible. One question: the version of the function in linux-nat.c takes a "struct target_ops *" argument, I assume because it is used in the target_ops function vector. Is it acceptable to include "target.h" in a common file? It will pick up different files for GDB and GDBserver. I'm not sure if that would violate some aspect of the design. common/btrace.c includes it, but it's the only common file that does.


  /* Return non-zero if HEADER is a 64-bit ELF file.  */

  static int
@@ -514,6 +540,19 @@ handle_extended_wait (struct lwp_info *event_child, int *wstatp)
                (PTRACE_TYPE_ARG4) 0);
        return ret;
      }
+  else if (event == PTRACE_EVENT_EXEC)
+    {
+      if (debug_threads)
+    debug_printf ("LHEW: Got exec event from LWP %ld\n",
+              lwpid_of (event_thr));
+
+      event_child->waitstatus.kind = TARGET_WAITKIND_EXECD;
+      event_child->waitstatus.value.execd_pathname
+    = linux_child_pid_to_exec_file (lwpid_of (event_thr));
+
+      /* Report the event.  */
+      return 0;
+    }
    internal_error (__FILE__, __LINE__,
                _("unknown ptrace event %d"), event);
  }
@@ -1376,18 +1415,19 @@ check_zombie_leaders (void)
           program).  In the latter case, we can't waitpid the
           leader's exit status until all other threads are gone.

-         - There are 3 or more threads in the group, and a thread
+         - There are multiple threads in the group, and a thread
           other than the leader exec'd.  On an exec, the Linux
           kernel destroys all other threads (except the execing
           one) in the thread group, and resets the execing thread's
           tid to the tgid.  No exit notification is sent for the
           execing thread -- from the ptracer's perspective, it
           appears as though the execing thread just vanishes.
-         Until we reap all other threads except the leader and the
-         execing thread, the leader will be zombie, and the
-         execing thread will be in `D (disc sleep)'.  As soon as
-         all other threads are reaped, the execing thread changes
-         it's tid to the tgid, and the previous (zombie) leader
+         Until we reap all other threads (if any) except the
+         leader and the execing thread, the leader will be zombie,
+         and the execing thread will be in `D (disc sleep)'.  As
+         soon as all other threads are reaped, or have reported
+         PTRACE_EVENT_EXIT events, the execing thread changes its
+         tid to the tgid, and the previous (zombie) leader
           vanishes, giving place to the "new" leader.  We could try
           distinguishing the exit and exec cases, by waiting once
           more, and seeing if something comes out, but it doesn't
@@ -1395,7 +1435,11 @@ check_zombie_leaders (void)
           we'll re-add the new one once we see the exec event
           (which is just the same as what would happen if the
           previous leader did exit voluntarily before some other
-         thread execs).  */
+         thread execs).
+
+         Note that when PTRACE_EVENT_EXIT is supported, we use
+         that mechanism to detect thread exit, including the
+         exit of zombie leaders.  */

        if (debug_threads)
          fprintf (stderr,
@@ -1791,6 +1835,57 @@ linux_low_filter_event (ptid_t filter_ptid, int lwpid, int *wstatp)

    child = find_lwp_pid (pid_to_ptid (lwpid));

+  /* Check for stop events reported by a process we didn't already
+     know about - anything not already in our LWP list.
+
+     If we're expecting to receive stopped processes after
+     fork, vfork, and clone events, then we'll just add the
+     new one to our list and go back to waiting for the event
+     to be reported - the stopped process might be returned
+     from waitpid before or after the event is.
+
+     But note the case of a non-leader thread exec'ing after the
+     leader having exited, and gone from our lists.  On an exec,
+     the Linux kernel destroys all other threads (except the execing
+     one) in the thread group, and resets the execing thread's tid
+     to the tgid.  No exit notification is sent for the execing
+     thread -- from the ptracer's perspective, it appears as though
+     the execing thread just vanishes.  When they are available, we
+     use exit events (PTRACE_EVENT_EXIT) to detect thread exit
+     reliably.  As soon as all other threads (if any) are reaped or
+     have reported their PTRACE_EVENT_EXIT events, the execing
+     thread changes its tid to the tgid, and the previous (zombie)
+     leader vanishes, giving place to the "new" leader.  The lwp
+     entry for the previous leader is deleted when we handle its
+     exit event, and we re-add the new one here.  */
+
+  if (WIFSTOPPED (wstat) && child == NULL
+      && (WSTOPSIG (wstat) == SIGTRAP && wstat >> 16 == PTRACE_EVENT_EXEC))
+    {
+      ptid_t child_ptid;
+

It would be nice to replace the shift operation with a predicate to test
for PTRACE_EVENT_EXEC, just like you did for PTRACE_EVENT_EXIT.

Will do. Actually, given your previous comment, I intend to move all of these predicates into common/linux-ptrace.[ch].
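
As a rough idea of what such a predicate could look like, here is a
sketch.  The names wstatus_is_exec_event and make_ptrace_event_status are
hypothetical and are not the ones that will land in
common/linux-ptrace.[ch]; the second helper exists only to build sample
status values and assumes the Linux wait status layout (low byte 0x7f for
a stopped task, stop signal in bits 8-15, ptrace event in bits 16 and up).

```c
#include <assert.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

/* Hypothetical predicate: return non-zero if WSTAT, as returned by
   waitpid, reports a PTRACE_EVENT_EXEC stop.  */
int
wstatus_is_exec_event (int wstat)
{
  return (WIFSTOPPED (wstat)
          && WSTOPSIG (wstat) == SIGTRAP
          && (wstat >> 16) == PTRACE_EVENT_EXEC);
}

/* Hypothetical helper for examples and tests: compose the status
   value that waitpid reports for a ptrace event stop of kind EVENT.
   This hard-codes the Linux status encoding.  */
int
make_ptrace_event_status (int event)
{
  assert (event > 0 && event <= 0xffff);
  return (event << 16) | (SIGTRAP << 8) | 0x7f;
}
```

The check in linux_low_filter_event would then read
"child == NULL && wstatus_is_exec_event (wstat)", which keeps the shift
and the event constant in one place.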


+      /* A multi-thread exec after we had seen the leader exiting.  */
+      if (debug_threads)
+    debug_printf ("LLW: Re-adding thread group leader LWP %d.\n",
+              lwpid);
+
+      child_ptid = ptid_build (lwpid, lwpid, 0);
+      child = add_lwp (child_ptid);
+      child->stopped = 1;
+      current_inferior = child->thread;
+
+      if (non_stop && stopping_threads == NOT_STOPPING_THREADS)
+    {
+      /* Make sure we delete the lwp entry for the exec'ing thread,
+         which will have vanished.  We do this by sending a signal
+         to all the other threads in the lwp list, deleting any
+         that are not found.  Note that in all-stop mode this will
+         happen before reporting the event.  */
+      stop_all_lwps (0, child);
+      unstop_all_lwps (0, child);
+    }
+    }
+
    /* If we didn't find a process, one of two things presumably happened:
       - A process we started and then detached from has exited.  Ignore it.
       - A process we are controlling has forked and the new child's stop
@@ -2122,8 +2217,7 @@ linux_wait_for_event_filtered (ptid_t wait_ptid, ptid_t filter_ptid,
       - When a non-leader thread execs, that thread just vanishes
         without reporting an exit (so we'd hang if we waited for it
         explicitly in that case).  The exec event is reported to
-       the TGID pid (although we don't currently enable exec
-       events).  */
+       the TGID pid.  */
        errno = 0;
        ret = my_waitpid (-1, wstatp, options | WNOHANG);

@@ -2520,6 +2614,21 @@ linux_stabilize_threads (void)
      }
  }

+/* Return non-zero if WAITSTATUS reflects an extended linux
+   event.  Otherwise, return 0.  Note that extended EXIT
+   events are fixed up and handled like normal events, so
+   they are not considered here.  */
+
+static int
+extended_event_reported (const struct target_waitstatus *waitstatus)
+{
+
+  if (waitstatus == NULL)
+    return 0;
+
+  return waitstatus->kind == TARGET_WAITKIND_EXECD;
+}
+

Someone mentioned putting ()'s around the condition for better
readability. Though not required, I also agree that it improves things.

Not a requirement, just a suggestion.

I'll do it.


  /* Wait for process, returns status.  */

  static ptid_t
@@ -2883,7 +2992,8 @@ retry:
                 && !bp_explains_trap && !trace_event)
             || (gdb_breakpoint_here (event_child->stop_pc)
               && gdb_condition_true_at_breakpoint (event_child->stop_pc)
-               && gdb_no_commands_at_breakpoint (event_child->stop_pc)));
+               && gdb_no_commands_at_breakpoint (event_child->stop_pc))
+           || extended_event_reported (&event_child->waitstatus));

    run_breakpoint_commands (event_child->stop_pc);

@@ -2905,6 +3015,15 @@ retry:
                paddress (event_child->stop_pc),
                paddress (event_child->step_range_start),
                paddress (event_child->step_range_end));
+      if (debug_threads
+          && extended_event_reported (&event_child->waitstatus))
+        {
+          char *str = target_waitstatus_to_string (ourstatus);
+          debug_printf ("LWP %ld has forked, cloned, vforked or execd"
+                " with waitstatus %s\n",
+                lwpid_of (get_lwp_thread (event_child)), str);
+          xfree (str);
+        }
      }


Maybe we could provide more precise information about the event here
instead of something generic? It may help debugging in the future or if
we ever have a mix of these events happening very close to each other.

It turns out the string from target_waitstatus_to_string contains the precise information. I'll clean this up.


        /* We're not reporting this breakpoint to GDB, so apply the
@@ -3003,7 +3122,19 @@ retry:
      unstop_all_lwps (1, event_child);
      }

-  ourstatus->kind = TARGET_WAITKIND_STOPPED;
+  if (extended_event_reported (&event_child->waitstatus))
+    {
+      /* If the reported event is a fork, vfork or exec, let GDB
+     know.  */
+      ourstatus->kind = event_child->waitstatus.kind;
+      ourstatus->value = event_child->waitstatus.value;
+
+      /* Reset the event child's waitstatus since we handled it
+     already.  */
+      event_child->waitstatus.kind = TARGET_WAITKIND_IGNORE;
+    }
+  else
+    ourstatus->kind = TARGET_WAITKIND_STOPPED;


This chunk of code setting waitstatus.kind to TARGET_WAITKIND_IGNORE may
be a bit confusing/hackish, but we really don't want gdbserver's event
loop to handle the extended events as TARGET_WAITKIND_STOPPED.

gdbserver's event loop could probably use a cleanup. For the time
being, this looks good to me though.

    if (current_inferior->last_resume_kind == resume_stop
        && WSTOPSIG (w) == SIGSTOP)
@@ -3014,13 +3145,14 @@ retry:
        ourstatus->value.sig = GDB_SIGNAL_0;
      }
    else if (current_inferior->last_resume_kind == resume_stop
-       && WSTOPSIG (w) != SIGSTOP)
+       && WSTOPSIG (w) != SIGSTOP
+       && !extended_event_reported (ourstatus))
      {
        /* A thread that has been requested to stop by GDB with vCont;t,
       but, it stopped for other reasons.  */
        ourstatus->value.sig = gdb_signal_from_host (WSTOPSIG (w));
      }
-  else
+  else if (ourstatus->kind == TARGET_WAITKIND_STOPPED)
      {
        ourstatus->value.sig = gdb_signal_from_host (WSTOPSIG (w));
      }
@@ -3126,6 +3258,7 @@ static void
  send_sigstop (struct lwp_info *lwp)
  {
    int pid;
+  int ret;

    pid = lwpid_of (get_lwp_thread (lwp));

@@ -3143,7 +3276,21 @@ send_sigstop (struct lwp_info *lwp)
      debug_printf ("Sending sigstop to lwp %d\n", pid);

    lwp->stop_expected = 1;
-  kill_lwp (pid, SIGSTOP);
+  errno = 0;
+  ret = kill_lwp (pid, SIGSTOP);
+  if (ret == -1 && errno == ESRCH)
+    {
+      /* If the kill fails with "No such process", on GNU/Linux we know
+     that the LWP has vanished - it is not a zombie, it is gone.
+     This is due to a thread other than the thread group leader
+     calling exec.  See comments in linux_low_filter_event regarding
+     PTRACE_EVENT_EXEC.  */
+      delete_lwp (lwp);
+      set_desired_inferior (0);
+
+      if (debug_threads)
+    debug_printf ("send_sigstop: lwp %d has vanished\n", pid);
+    }
  }

  static int
diff --git a/gdb/gdbserver/linux-low.h b/gdb/gdbserver/linux-low.h
index 7459710..7759f01 100644
--- a/gdb/gdbserver/linux-low.h
+++ b/gdb/gdbserver/linux-low.h
@@ -265,6 +265,11 @@ struct lwp_info
    /* When stopped is set, the last wait status recorded for this lwp.  */
    int last_status;

+  /* If WAITSTATUS->KIND != TARGET_WAITKIND_IGNORE, the waitstatus for
+     this LWP's last event.  This may correspond to LAST_STATUS above,
+     or to the current status during event processing.  */
+  struct target_waitstatus waitstatus;
+
    /* When stopped is set, this is where the lwp stopped, with
       decr_pc_after_break already accounted for.  */
    CORE_ADDR stop_pc;
diff --git a/gdb/gdbserver/remote-utils.c b/gdb/gdbserver/remote-utils.c
index 4fcafa0..9ce25dc 100644
--- a/gdb/gdbserver/remote-utils.c
+++ b/gdb/gdbserver/remote-utils.c
@@ -1111,14 +1111,40 @@ prepare_resume_reply (char *buf, ptid_t ptid,
    switch (status->kind)
      {
      case TARGET_WAITKIND_STOPPED:
+    case TARGET_WAITKIND_EXECD:
        {
      struct thread_info *saved_inferior;
      const char **regp;
      struct regcache *regcache;
+    enum gdb_signal signal;
+
+    if (status->kind == TARGET_WAITKIND_EXECD)
+      signal = GDB_SIGNAL_TRAP;
+    else
+      signal = status->value.sig;
+
+    sprintf (buf, "T%02x", signal);

-    sprintf (buf, "T%02x", status->value.sig);
      buf += strlen (buf);

+    if (status->kind == TARGET_WAITKIND_EXECD && multi_process)
+      {
+        const char *event = "exec";
+        char hexified_pathname[PATH_MAX * 2 + 1];
+
+        sprintf (buf, "%s:", event);
+        buf += strlen (buf);
+
+        /* Encode pathname to hexified format.  */
+        bin2hex ((const gdb_byte *) status->value.execd_pathname,
+             hexified_pathname, strlen(status->value.execd_pathname));
+
+        sprintf (buf, "%s;", hexified_pathname);
+        xfree (status->value.execd_pathname);
+        status->value.execd_pathname = NULL;
+        buf += strlen (buf);
+      }
+
      saved_inferior = current_inferior;

      current_inferior = find_thread_ptid (ptid);
diff --git a/gdb/remote.c b/gdb/remote.c
index be8c423..f4412d8 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -5542,11 +5542,13 @@ remote_parse_stop_reply (char *buf, struct stop_reply *event)
           pnum and set p1 to point to the character following it.
           Otherwise p1 points to p.  */

-      /* If this packet is an awatch packet, don't parse the 'a'
-         as a register number.  */
+      /* If this packet has a stop reason string that starts
+         with a character that could be a hex digit, don't parse
+         it as a register number.  */

        if (strncmp (p, "awatch", strlen("awatch")) != 0
-          && strncmp (p, "core", strlen ("core") != 0))
+          && strncmp (p, "core", strlen ("core")) != 0
+          && strncmp (p, "exec", strlen ("exec")) != 0)
          {
            /* Read the ``P'' register number.  */
            pnum = strtol (p, &p_temp, 16);
@@ -5598,6 +5600,25 @@ Packet: '%s'\n"),
            p = unpack_varlen_hex (++p1, &c);
            event->core = c;
          }
+          else if (strncmp (p, "exec", p1 - p) == 0)
+        {
+          ULONGEST pid;
+          char pathname[PATH_MAX];
+
+          p = unpack_varlen_hex (++p1, &pid);
+
+          /* Save the pathname for event reporting and for
+             the next run command. */
+          hex2bin (p1, (gdb_byte *) pathname, (p - p1)/2);
+          /* Add the null terminator.  */
+          pathname[(p - p1)/2] = '\0';
+          /* This is freed during event handling.  */
+          event->ws.value.execd_pathname = xstrdup (pathname);
+          event->ws.kind = TARGET_WAITKIND_EXECD;
+          /* Save a copy of the pathname for the next run command.  */
+          xfree (remote_exec_file);
+          remote_exec_file = xstrdup (pathname);
+        }
            else
          {
            /* Silently skip unknown optional info.  */

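On the GDB side, the "exec" stop reason carries the pathname as a run of
hex digit pairs terminated by ';'.  The following is a minimal sketch of
decoding such a field with an explicit bound on the output buffer.  It is
not the code in remote_parse_stop_reply, which uses unpack_varlen_hex and
hex2bin; hex_value and decode_hex_pathname are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: value of hex digit C, or -1 if C is not one.  */
static int
hex_value (char c)
{
  if (c >= '0' && c <= '9')
    return c - '0';
  if (c >= 'a' && c <= 'f')
    return c - 'a' + 10;
  if (c >= 'A' && c <= 'F')
    return c - 'A' + 10;
  return -1;
}

/* Hypothetical helper: decode hex digit pairs from HEX, stopping at
   ';' or the end of the string, into OUT (of size OUTSIZE) and
   NUL-terminate the result.  Returns the number of bytes decoded, or
   -1 if the input is malformed or does not fit.  */
int
decode_hex_pathname (const char *hex, char *out, size_t outsize)
{
  size_t n = 0;

  assert (out != NULL && outsize > 0);
  while (*hex != '\0' && *hex != ';')
    {
      int hi = hex_value (hex[0]);
      /* Only look at the second digit if the first one was valid.  */
      int lo = hi < 0 ? -1 : hex_value (hex[1]);

      if (hi < 0 || lo < 0 || n + 1 >= outsize)
        return -1;

      out[n++] = (char) (hi * 16 + lo);
      hex += 2;
    }
  out[n] = '\0';
  return (int) n;
}
```

For the field "2f62696e2f6c73;" this yields "/bin/ls" and a length of 7.
Rejecting input that does not fit keeps an overlong pathname from the
remote side from overrunning a PATH_MAX-sized local buffer.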





