This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix problem handling colon in linespec, PR breakpoints/18303


Keith Seitz writes:
 > Hi,
 >
 > Thank you for pointing this out and supplying a patch (and *tests*!).
 >
 > On 01/08/2016 10:44 AM, Don Breazeal wrote:
 >
> > During GDB's parsing of the linespec, when the filename was not found in
 > > the program's symbols, GDB tried to determine if the filename string
> > could be some other valid identifier. Eventually it would get to a point > > where it concluded that the Windows logical volume, or whatever was to the > > left of the colon, was a C++ namespace. When it found only one colon, it
 > > would assert.
 >
 > I have to admit, when I first read this, my reaction was that this is a
 > symbol lookup error. After investigating it a bit, I think it is. [More
 > rationale below.]
 >
> > GDB's linespec grammar allows for several different linespecs that contain > > ':'. The only one that has a number after the colon is filename:linenum.
 >
 > True, but for how long? I don't know. The parser actually does/could (or
 > for some brief time *did*) support offsets just about anywhere, e.g.,
 > foo.c:function:label:3. That support was removed and legacy (buggy)
 > behavior of ignoring the offset was maintained in the parser/sal
 > converter. There is no reason we couldn't support it, though.
 >
 > > There is another possible solution.  After no symbols are found for the
 > > file and we determine that it is a filename:linenum style linespec, we
 > > could just consume the filename token and parse the rest of the
 > > linespec.  That works as well, but there is no advantage to it.
 >
 > And, of course, I came up with an entirely different solution. :-)
 >
 > Essentially what is happening is that the linespec parser is calling
 > lookup_symbol on the user input (e.g., "spaces: and :colons.cc"). That
 > is causing several problems, all which assume the input is well-formed.
 > As you've discovered, that is a pretty poor assumption. Malformed (user)
 > input should not cause an assertion failure IMO.
 >
 > > I created two new tests for this.  One is just gdb.linespec/ls-errs.exp
 > > copied and converted to use C++ instead of C, and to add the Windows
 > > logical drive case.  The other is an MI test to provide a regression
 > > test for the specific case reported in PR 18303.
 >
 > EXCELLENT! Thank you so much for doing this. These tests were
 > outrageously useful while attempting to understand the problem.
 >
 > With that said, I offer *for discussion* this alternative solution,
 > which removes the assumption that input to lookup_symbol is/must be
 > well-formed.
 >
 > [There is one additional related/necessary fixlet snuck in here... The
 > C++ name parser always assumed that ':' was followed by another ':'. Bad
> assumption. So it would return an incorrect result in the malformed case.]
 >
 > WDYT?
 >
 > Keith
 >
 > [apologies on mailer wrapping]
 >
 > Author: Keith Seitz <keiths@redhat.com>
 > Date:   Fri Jan 8 14:00:22 2016 -0800
 >
 > Tolerate malformed input for lookup_symbol-called functions.
 >
 > lookup_symbol is often called with user input. Consequently, any
 > function called from lookup_symbol{,_in_language} should attempt to deal
 > with malformed input gracefully. After all, malformed user input is
 > not a programming/API error.
 >
 > This patch does not attempt to find/correct all instances of this.
 > It only fixes locations in the code that trigger test suite failures.
 >
 >     gdb/ChangeLog
 >
 >     	* cp-namespace.c (cp_lookup_bare_symbol): Do not assert on
 >     	user input.
 >     	(cp_search_static_and_baseclasses): Return null_block_symbol for
 >     	malformed input.
 >     	Remove assertions.
 >     	* cp-support.c (cp_find_first_component_aux): Do not return
 >     	a prefix length for ':' unless the next character is also ':'.
 >
 > diff --git a/gdb/cp-namespace.c b/gdb/cp-namespace.c
 > index 72002d6..aa225fe 100644
 > --- a/gdb/cp-namespace.c
 > +++ b/gdb/cp-namespace.c
 > @@ -166,12 +166,6 @@ cp_lookup_bare_symbol (const struct language_defn
 > *langdef,
 >  {
 >    struct block_symbol sym;
 >
 > -  /* Note: We can't do a simple assert for ':' not being in NAME because
> - ':' may be in the args of a template spec. This isn't intended to be
 > -     a complete test, just cheap and documentary.  */
 > -  if (strchr (name, '<') == NULL && strchr (name, '(') == NULL)
 > -    gdb_assert (strchr (name, ':') == NULL);
 > -

Heya.

The assert is intended to catch (some) violations of this
(from the function comment):

   NAME is guaranteed to not have any scope (no "::") in its name, though
   if for example NAME is a template spec then "::" may appear in the
   argument list.

This is an invariant on what name can (and cannot) be.
IOW, it is wrong to call this function with name == "foo::bar",
handling that is the caller's responsibility.
Thus I think having an assert here is ok and good
(as long as it tests for the correct thing!).

Whether it is ok to call this with name == "c:mumble" is the issue.
[or even "c:::mumble" or whatever else a user could type that
this function's caller isn't expected to handle]
On that I'm kinda ambivalent, but I like having the assert
watch for the stated invariant.

Thoughts?

 >    sym = lookup_symbol_in_static_block (name, block, domain);
 >    if (sym.symbol != NULL)
 >      return sym;
 > @@ -246,10 +240,9 @@ cp_search_static_and_baseclasses (const char *name,
 >    struct block_symbol klass_sym;
 >    struct type *klass_type;
 >
 > -  /* The test here uses <= instead of < because Fortran also uses this,
> - and the module.exp testcase will pass "modmany::" for NAME here. */
 > -  gdb_assert (prefix_len + 2 <= strlen (name));
 > -  gdb_assert (name[prefix_len + 1] == ':');
 > +  /* Check for malformed input.  */
 > +  if (prefix_len + 2 > strlen (name) || name[prefix_len + 1] != ':')
 > +    return null_block_symbol;
 >
 >    /* Find the name of the class and the name of the method, variable,
 > etc.  */
 >
 > diff --git a/gdb/cp-support.c b/gdb/cp-support.c
 > index df127c4..a71c6ad 100644
 > --- a/gdb/cp-support.c
 > +++ b/gdb/cp-support.c
 > @@ -1037,8 +1037,13 @@ cp_find_first_component_aux (const char *name,
 > int permissive)
 >  	      return strlen (name);
 >  	    }
 >  	case '\0':
 > -	case ':':
 >  	  return index;
 > +	case ':':
 > +	  /* ':' marks a component iff the next character is also a ':'.
 > +	     Otherwise it is probably malformed input.  */
 > +	  if (name[index + 1] == ':')
 > +	    return index;
 > +	  break;

What if name[index+2] is also ':'? :-)

[btw, I think "a::::b" is illegal (note four colons),
but I don't know that for sure]

 >  	case 'o':
 >  	  /* Operator names can screw up the recursion.  */
 >  	  if (operator_possible
 >

--
/dje


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]