This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
RE: [PATCH 1/2] gdbserver: catch fetch registers error
- From: "Metzger, Markus T" <markus dot t dot metzger at intel dot com>
- To: Pedro Alves <palves at redhat dot com>, "gdb-patches at sourceware dot org" <gdb-patches at sourceware dot org>
- Cc: Daniel Jacobowitz <drow at false dot org>
- Date: Thu, 19 Jan 2017 14:48:29 +0000
- Subject: RE: [PATCH 1/2] gdbserver: catch fetch registers error
- Authentication-results: sourceware.org; auth=none
- References: <1481039697-17596-1-git-send-email-markus.t.metzger@intel.com> <1481039697-17596-2-git-send-email-markus.t.metzger@intel.com> <07c4f2ec-25b7-7a42-bc57-a64045a65f9e@redhat.com>
Hello Pedro,
Thanks for the review and the elaborate response.
> The case of trying to read registers from a thread that is
> running is always a client bug. GDB should know which threads
> it resumed, and checks like the validate_registers_access
> you found should prevent such accesses reaching the server.
That's a general misunderstanding of the design on my part, then. I typically
don't check if I can/should do something but instead just do it and handle errors
in case it failed. Like reading registers.
Let me drop this patch and instead call validate_registers_access in btrace before
reading the PC.
> So it seems to me that btrace shouldn't even be trying to be enabled
> on running threads.
It works fine and I didn't see a reason why we would not want to allow it.
Thanks,
Markus.
> -----Original Message-----
> From: Pedro Alves [mailto:palves@redhat.com]
> Sent: Thursday, January 19, 2017 2:22 PM
> To: Metzger, Markus T <markus.t.metzger@intel.com>; gdb-
> patches@sourceware.org
> Cc: Daniel Jacobowitz <drow@false.org>
> Subject: Re: [PATCH 1/2] gdbserver: catch fetch registers error
>
> On 12/06/2016 03:54 PM, Markus Metzger wrote:
> > When the PTRACE_PEEKUSER ptrace request to read registers fails, gdbserer
> throws
> > an error that is caught in captured_main, where it causes a E01 error packet to
> > be sent and gdbserer to quit (if --once was specified) or the event loop to be
> > re-started (otherwise).
> >
> > We may get such ptrace errors when trying to fetch registers for an exited or
> > running thread. There are checks in GDB that check those conditions and throw
> > meaningful error messages before we could run into the above ptrace error,
> > e.g. thread.c:validate_registers_access.
> >
> > I ran into a new case and, rather than adding another call to
> > validate_registers_access in GDB, I propose to catch the error already when
> > handling the 'g' packet in gdbserver and reply with an error packet - assuming
> > that gdbserver's internal state is still intact.
>
> So there are two changes here:
>
> #1 - don't kill the debug session just because
> of register reading failure. I.e., handle it gracefully.
>
> The case of trying to read registers from a thread that is
> running is always a client bug. GDB should know which threads
> it resumed, and checks like the validate_registers_access
> you found should prevent such accesses reaching the server.
> I don't object to being defensive in gdbserver if that's
> not too invasive, but if we find such cases in GDB, we should
> fix them.
>
> However, the case of trying to read from an exited thread can
> well happen, and there's nothing the client can do to prevent it.
> The only thing it can do it handle it gracefully.
> This is because threads can transition from stopped -> exited
> without a "running" state in between. And this can happen if
> some _other_ thread of the inferior process is resumed and that
> other thread causes the whole process to exit (e.g., calls "exit()")
> just while gdb/gdbserver is accessing registers of the other
> supposedly-stopped thread. It can also happen when some child
> thread other than the one we're trying to access registers
> from execs, and the thread the client is trying to access registers
> from is not the main thread (the exec makes the kernel kill all
> threads, including the ones that were stopped).
>
> These sorts of corner cases are exercised by the
> process-dies-while-detaching.exp / process-dies-while-handling-bp.exp
> testcase and maybe others I don't recall their names immediately.
>
> See also: https://sourceware.org/bugzilla/show_bug.cgi?id=18749
>
> Note process-dies-while-handling-bp.exp is largely kfailed when
> running against remote, exactly because we don't handle this all
> to well currently, in both gdbserver, and in gdb common code.
>
> #2 - sending a human readable error message back to gdb.
>
> >
> > To not replace a meaningful error message with E01, I'm trying to generate a
> > useful error message when the error is detected and the exception is thrown.
> >
> > It would look like this ...
> >
> > gdb) PASS: gdb.btrace/enable-running.exp: continue to breakpoint: cont to 44
> > cont&
> > Continuing.
> > (gdb) PASS: gdb.btrace/enable-running.exp: cont&
> > record btrace
> > warning: Remote failure reply: E.Selected thread is running.
> > warning: Remote failure reply: E.Selected thread is running.
> >
> > ... although in this particular case, I'm going to suppress the warning.
>
> So it seems to me that btrace shouldn't even be trying to be enabled
> on running threads.
>
> >
> > To make this look a bit nicer, we could consider stripping the "E." or the
> > entire "Remote failure reply: E." when (re-)throwing the error inside GDB in
> > remote.c.
>
> Agreed.
>
> > @@ -5692,7 +5692,24 @@ fetch_register (const struct usrregs_info *usrregs,
> > (PTRACE_TYPE_ARG3) (uintptr_t) regaddr, (PTRACE_TYPE_ARG4)
> 0);
> > regaddr += sizeof (PTRACE_XFER_TYPE);
> > if (errno != 0)
> > - error ("reading register %d: %s", regno, strerror (errno));
> > + {
> > + /* ESRCH could mean that the thread is not traced, exited, or is not
> > + stopped. */
> > + if (errno == ESRCH)
> > + {
> > + struct lwp_info *lwp = get_thread_lwp (current_thread);
> > +
> > + if (!lwp_is_stopped (lwp))
> > + error (_("Selected thread is running."));
> > +
> > + if (lwp_is_marked_dead (lwp))
> > + error (_("Selected thread has terminated."));
>
> The order of the checks seems backwards -- if the lwp is dead,
> lwp_is_stopped is meaningless? But, if either of these conditions
> is true, and the errors are reached, then it looks like
> an internal gdbserver error to me.
>
> lwp_is_stopped, only checks the lwp->stopped flag.
> If !lwp->stopped, then this looks like a case of "we should not
> have reached here in the first place".
>
> Similarly for lwp_is_marked_dead -- if the lwp is known dead,
> then why did we try to ptrace it? Something higher up in the
> call chain should have errored out earlier, IMO.
>
> If we want gdbserver to be defensive against GDB mistakes,
> we can look at the request as coming from server.c, to handle a
> g/G packet. So in that case it's gdbserver's knowledge of
> gdb's perspective of whether the thread is running that counts
> here. I.e., a 'g' packet that tries to read from a thread
> that gdb resumed before should error out even if the
> thread happens to be momentarily paused due to some internal
> gdbserver event handling.
>
> That suggests to me that for the "running" case,
> server.c's g/G packet handling should be erroring out before
> calling into the backend, if:
>
> if (thread->last_resume_kind != resume_stop
> || thread->last_status.kind == TARGET_WAITKIND_IGNORE)
>
> See the linux-low.c:lwp_resumed function, which we could
> move to common code.
>
> Now, even with that in place, we'd still need to handle
> the unpredictable "stopped -> exited" transitions. And those
> can only be handled by checking for failure after the fact...
> But with an upfront "!running" check, then we can assume that
> ESRCH means the thread is gone, and thus unconditionally
> throw the
> error (_("Selected thread has terminated."));
> error.
>
> Note that what gdbserver thinks of "selected" thread has
> no 1-1 relation with what the user thinks as selected thread,
> so that may be a bit confusing. E.g., the user may well have thread 2,
> selected, while that error triggers for some other thread because
> gdb decided to temporarily switch to some other thread, internally,
> e.g., to iterate over threads and read their registers.
>
>
> > + }
> > +
> > + /* Report a generic error if we could not determine the exact
> > + reason. */
> > + error (_("Could not read register %d: %s."), regno, strerror (errno));
> > + }
> > }
> >
> > if (the_low_target.supply_ptrace_register)
> > diff --git a/gdb/gdbserver/server.c b/gdb/gdbserver/server.c
> > index ef8dd03..3064b4f 100644
> > --- a/gdb/gdbserver/server.c
> > +++ b/gdb/gdbserver/server.c
> > @@ -4132,8 +4132,22 @@ process_serial_event (void)
> > write_enn (own_buf);
> > else
> > {
> > - regcache = get_thread_regcache (current_thread, 1);
> > - registers_to_string (regcache, own_buf);
> > + TRY
> > + {
> > + regcache = get_thread_regcache (current_thread, 1);
> > + registers_to_string (regcache, own_buf);
> > + }
> > + CATCH (exception, RETURN_MASK_ALL)
>
> Why RETURN_MASK_ALL ?
>
> > + {
> > + const char *message;
> > +
> > + message = exception.message;
> > + if (message == NULL)
> > + message = _("Reading registers failed.");
> > +
> > + sprintf (own_buf, "E.%s", message);
> > + }
> > + END_CATCH
> > }
> > }
> > break;
> >
>
> Thanks,
> Pedro Alves
Intel Deutschland GmbH
Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de
Managing Directors: Christin Eisenschmid, Christian Lamprechter
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928