This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

RE: [PATCH 1/2] gdbserver: catch fetch registers error


Hello Pedro,

Thanks for the review and the elaborate response.

> The case of trying to read registers from a thread that is
> running is always a client bug.  GDB should know which threads
> it resumed, and checks like the validate_registers_access
> you found should prevent such accesses reaching the server.

That's a general misunderstanding of the design on my part, then.  I typically
don't check if I can/should do something but instead just do it and handle errors
in case it failed.  Like reading registers.

Let me drop this patch and instead call validate_registers_access in btrace before
reading the PC.


> So it seems to me that btrace shouldn't even be trying to be enabled
> on running threads.

It works fine and I didn't see a reason why we would not want to allow it.

Thanks,
Markus.

> -----Original Message-----
> From: Pedro Alves [mailto:palves@redhat.com]
> Sent: Thursday, January 19, 2017 2:22 PM
> To: Metzger, Markus T <markus.t.metzger@intel.com>; gdb-
> patches@sourceware.org
> Cc: Daniel Jacobowitz <drow@false.org>
> Subject: Re: [PATCH 1/2] gdbserver: catch fetch registers error
> 
> On 12/06/2016 03:54 PM, Markus Metzger wrote:
> > When the PTRACE_PEEKUSER ptrace request to read registers fails, gdbserer
> throws
> > an error that is caught in captured_main, where it causes a E01 error packet to
> > be sent and gdbserer to quit (if --once was specified) or the event loop to be
> > re-started (otherwise).
> >
> > We may get such ptrace errors when trying to fetch registers for an exited or
> > running thread.  There are checks in GDB that check those conditions and throw
> > meaningful error messages before we could run into the above ptrace error,
> > e.g. thread.c:validate_registers_access.
> >
> > I ran into a new case and, rather than adding another call to
> > validate_registers_access in GDB, I propose to catch the error already when
> > handling the 'g' packet in gdbserver and reply with an error packet - assuming
> > that gdbserver's internal state is still intact.
> 
> So there are two changes here:
> 
> #1 - don't kill the debug session just because
>      of register reading failure.  I.e., handle it gracefully.
> 
> The case of trying to read registers from a thread that is
> running is always a client bug.  GDB should know which threads
> it resumed, and checks like the validate_registers_access
> you found should prevent such accesses reaching the server.
> I don't object to being defensive in gdbserver if that's
> not too invasive, but if we find such cases in GDB, we should
> fix them.
> 
> However, the case of trying to read from an exited thread can
> well happen, and there's nothing the client can do to prevent it.
> The only thing it can do it handle it gracefully.
> This is because threads can transition from stopped -> exited
> without a "running" state in between.  And this can happen if
> some _other_ thread of the inferior process is resumed and that
> other thread causes the whole process to exit (e.g., calls "exit()")
> just while gdb/gdbserver is accessing registers of the other
> supposedly-stopped thread.  It can also happen when some child
> thread other than the one we're trying to access registers
> from execs, and the thread the client is trying to access registers
> from is not the main thread (the exec makes the kernel kill all
> threads, including the ones that were stopped).
> 
> These sorts of corner cases are exercised by the
> process-dies-while-detaching.exp / process-dies-while-handling-bp.exp
> testcase and maybe others I don't recall their names immediately.
> 
> See also: https://sourceware.org/bugzilla/show_bug.cgi?id=18749
> 
> Note process-dies-while-handling-bp.exp is largely kfailed when
> running against remote, exactly because we don't handle this all
> to well currently, in both gdbserver, and in gdb common code.
> 
> #2 - sending a human readable error message back to gdb.
> 
> >
> > To not replace a meaningful error message with E01, I'm trying to generate a
> > useful error message when the error is detected and the exception is thrown.
> >
> > It would look like this ...
> >
> > gdb) PASS: gdb.btrace/enable-running.exp: continue to breakpoint: cont to 44
> > cont&
> > Continuing.
> > (gdb) PASS: gdb.btrace/enable-running.exp: cont&
> > record btrace
> > warning: Remote failure reply: E.Selected thread is running.
> > warning: Remote failure reply: E.Selected thread is running.
> >
> > ... although in this particular case, I'm going to suppress the warning.
> 
> So it seems to me that btrace shouldn't even be trying to be enabled
> on running threads.
> 
> >
> > To make this look a bit nicer, we could consider stripping the "E." or the
> > entire "Remote failure reply: E." when (re-)throwing the error inside GDB in
> > remote.c.
> 
> Agreed.
> 
> > @@ -5692,7 +5692,24 @@ fetch_register (const struct usrregs_info *usrregs,
> >  		(PTRACE_TYPE_ARG3) (uintptr_t) regaddr, (PTRACE_TYPE_ARG4)
> 0);
> >        regaddr += sizeof (PTRACE_XFER_TYPE);
> >        if (errno != 0)
> > -	error ("reading register %d: %s", regno, strerror (errno));
> > +	{
> > +	  /* ESRCH could mean that the thread is not traced, exited, or is not
> > +	     stopped.  */
> > +	  if (errno == ESRCH)
> > +	    {
> > +	      struct lwp_info *lwp = get_thread_lwp (current_thread);
> > +
> > +	      if (!lwp_is_stopped (lwp))
> > +		error (_("Selected thread is running."));
> > +
> > +	      if (lwp_is_marked_dead (lwp))
> > +		error (_("Selected thread has terminated."));
> 
> The order of the checks seems backwards -- if the lwp is dead,
> lwp_is_stopped is meaningless?  But, if either of these conditions
> is true, and the errors are reached, then it looks like
> an internal gdbserver error to me.
> 
> lwp_is_stopped, only checks the lwp->stopped flag.
> If !lwp->stopped, then this looks like a case of "we should not
> have reached here in the first place".
> 
> Similarly for lwp_is_marked_dead -- if the lwp is known dead,
> then why did we try to ptrace it?  Something higher up in the
> call chain should have errored out earlier, IMO.
> 
> If we want gdbserver to be defensive against GDB mistakes,
> we can look at the request as coming from server.c, to handle a
> g/G packet.  So in that case it's gdbserver's knowledge of
> gdb's perspective of whether the thread is running that counts
> here.  I.e., a 'g' packet that tries to read from a thread
> that gdb resumed before should error out even if the
> thread happens to be momentarily paused due to some internal
> gdbserver event handling.
> 
> That suggests to me that for the "running" case,
> server.c's g/G packet handling should be erroring out before
> calling into the backend, if:
> 
>   if (thread->last_resume_kind != resume_stop
>       || thread->last_status.kind == TARGET_WAITKIND_IGNORE)
> 
> See the linux-low.c:lwp_resumed function, which we could
> move to common code.
> 
> Now, even with that in place, we'd still need to handle
> the unpredictable "stopped -> exited" transitions.  And those
> can only be handled by checking for failure after the fact...
> But with an upfront "!running" check, then we can assume that
> ESRCH means the thread is gone, and thus unconditionally
> throw the
>   error (_("Selected thread has terminated."));
> error.
> 
> Note that what gdbserver thinks of "selected" thread has
> no 1-1 relation with what the user thinks as selected thread,
> so that may be a bit confusing.  E.g., the user may well have thread 2,
> selected, while that error triggers for some other thread because
> gdb decided to temporarily switch to some other thread, internally,
> e.g., to iterate over threads and read their registers.
> 
> 
> > +	    }
> > +
> > +	  /* Report a generic error if we could not determine the exact
> > +	     reason.  */
> > +	  error (_("Could not read register %d: %s."), regno, strerror (errno));
> > +	}
> >      }
> >
> >    if (the_low_target.supply_ptrace_register)
> > diff --git a/gdb/gdbserver/server.c b/gdb/gdbserver/server.c
> > index ef8dd03..3064b4f 100644
> > --- a/gdb/gdbserver/server.c
> > +++ b/gdb/gdbserver/server.c
> > @@ -4132,8 +4132,22 @@ process_serial_event (void)
> >  	    write_enn (own_buf);
> >  	  else
> >  	    {
> > -	      regcache = get_thread_regcache (current_thread, 1);
> > -	      registers_to_string (regcache, own_buf);
> > +	      TRY
> > +		{
> > +		  regcache = get_thread_regcache (current_thread, 1);
> > +		  registers_to_string (regcache, own_buf);
> > +		}
> > +	      CATCH (exception, RETURN_MASK_ALL)
> 
> Why RETURN_MASK_ALL ?
> 
> > +		{
> > +		  const char *message;
> > +
> > +		  message = exception.message;
> > +		  if (message == NULL)
> > +		    message = _("Reading registers failed.");
> > +
> > +		  sprintf (own_buf, "E.%s", message);
> > +		}
> > +	      END_CATCH
> >  	    }
> >  	}
> >        break;
> >
> 
> Thanks,
> Pedro Alves

Intel Deutschland GmbH
Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de
Managing Directors: Christin Eisenschmid, Christian Lamprechter
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]