This is the mail archive of the gdb@sources.redhat.com mailing list for the GDB project.



How I learned to stop worrying and love decode_line_1


I've been spending the week getting to know decode_line_1.  One of the
things that I learned was that decode_line_1 thinks that it's its job
to take apart C++ expressions like A::B::C::x, where A and B are
namespaces, C is a class or a namespace, and x is a variable or a
member or something.

This cleared up some issues for me: e.g. why you don't have to put
single quotes around expressions involving static class members but do
have to put single quotes around expressions involving namespaces.
But, frankly, I was surprised that the function claimed to handle
namespaces at all (since, basically, it doesn't currently); moreover,
there's this lovely comment:

  Some versions of the HP ANSI C++ compiler (as also possibly
  other compilers) generate class/function/member names with
  embedded double-colons if they are inside namespaces. To
  handle this, we loop a few times, considering larger and
  larger prefixes of the string as though they were single
  symbols.  So, if the initially supplied string is
  A::B::C::D::foo, we have to look up "A", then "A::B",
  then "A::B::C", then "A::B::C::D", and finally
  "A::B::C::D::foo" as single, monolithic symbols, because
  A, B, C or D may be namespaces.
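
For concreteness, here is a minimal sketch of the sequence of candidate names that comment describes. It is not GDB code: candidate_prefix is a hypothetical helper, and the real function would hand each candidate to a symbol lookup rather than just producing the string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GDB.  Write into OUT (of size
   OUTLEN) the INDEX-th candidate for NAME, counting from 0: the prefix
   ending just before the INDEX-th "::", or all of NAME once the
   separators run out.  Return 1 on success, 0 if INDEX is past the
   last candidate or OUT is too small.  */
int
candidate_prefix (const char *name, int index, char *out, size_t outlen)
{
  const char *end = name;
  size_t len;
  int i;

  for (i = 0; i <= index; i++)
    {
      const char *sep = strstr (end, "::");

      if (sep == NULL)
        {
          /* No separators left: the whole name is the final
             candidate, and there is nothing after it.  */
          if (i != index)
            return 0;
          end = name + strlen (name);
          break;
        }
      /* Stop before the separator we were asked for; otherwise step
         over it and keep scanning.  */
      end = (i == index) ? sep : sep + 2;
    }

  len = (size_t) (end - name);
  if (len + 1 > outlen)
    return 0;
  memcpy (out, name, len);
  out[len] = '\0';
  return 1;
}
```

Calling it with index 0, 1, 2, ... on "A::B::C::D::foo" until it returns 0 yields "A", "A::B", "A::B::C", "A::B::C::D" and "A::B::C::D::foo", in that order.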

I'm not sure I understand the context here: did/does HP's compiler
generate names for symbols in namespaces that, after demangling, looked
fundamentally different from, say, the way GCC's do?  The above sounds
the same as the way GCC's demangled symbol names work; but it hints at
possible compilers which might somehow generate a symbol 'A' if there
is a namespace 'A'.  Are there any such compilers?  I sure can't
imagine where GDB would currently do anything useful with that
information.

So any insight about the context of that comment would be greatly
appreciated.

I'm certainly glad I discovered this bit of decode_line_1, at any
rate: it suggests that some code that I was planning to add to
lookup_symbol should really go in decode_line_1.  (The only thing
worse than having one gross hack of a parser is having two gross hacks
of a parser that are both trying to do the same thing.)

For what it's worth, I've got decode_line_1 tamed a bit.  The version
of it that I'll check into my branch later today is only 82 lines
long.  (I'll include it after my signature; of course, it calls lots
of other functions that contain various bits of the current version of
the function.) It turns out that, underlying all that mess, there's
actually a function with perfectly reasonable control flow.  In
particular, whoever put in those gotos originally should be given a
bit of a talking-to: there's absolutely no reason whatsoever for the
function to do any jumps like that.  Mind you, my version of the code
still contains all of the myriad special cases in the current version,
but at least it's now pretty obvious when the function might return,
whether all the different uses of the variables 'p' and 'copy' are
really pointing to the same thing or are only there because somebody
decided to reuse variable names, and so forth.

David Carlton
carlton@math.stanford.edu

struct symtabs_and_lines
decode_line_1 (char **argptr, int funfirstline, struct symtab *default_symtab,
	       int default_line, char ***canonical)
{
  /* This is NULL if there are no parens in argptr, or a pointer to
     the closing parenthesis if there are parens.  */
  char *paren_pointer;
  /* If a file name is specified, this is its symtab.  */
  struct symtab *file_symtab = NULL;
  int is_quoted;
  char *saved_arg = *argptr;

  /* Defaults have defaults.  */

  dl1_initialize_defaults (&default_symtab, &default_line);
  
  /* See if arg is *PC */

  if (**argptr == '*')
    return dl1_indirect (argptr);

  /* Set various flags.
   * 'paren_pointer' is important for overload checking, where
   * we allow things like: 
   *     (gdb) break c::f(int)
   */

  dl1_set_flags (*argptr, &is_quoted, &paren_pointer);

  /* Check to see if it's a multipart linespec (with colons or periods).  */
  {
    char *p;
    int is_quote_enclosed;
    
    /* Locate the first half of the linespec, ending in a colon, period,
       or whitespace.  (More or less.)  */

    p = dl1_locate_first_half (argptr, &is_quote_enclosed);

    /* Does it look like there actually were two parts?  */

    if ((p[0] == ':' || p[0] == '.') && paren_pointer == NULL)
      {
	if (is_quoted)
	  *argptr = *argptr + 1;
      
	/* Is it a C++ or Java compound data structure?  */
      
	if (p[0] == '.' || p[1] == ':')
	  return dl1_compound (argptr, funfirstline, canonical,
			       saved_arg, p);

	/* No, the first part is a filename; set file_symtab
	   accordingly.  Also, move argptr past the filename.  */
      
	file_symtab = dl1_handle_filename (argptr, p, is_quote_enclosed);
      }
  }

  /* Check whether arg is all digits (and sign) */

  if (dl1_is_all_digits (*argptr))
    return dl1_all_digits (argptr, default_symtab, default_line,
			   canonical, file_symtab);

  /* Arg token is not digits => try it as a variable name.
     Find the next token (everything up to end or next whitespace).  */

  /* If it starts with $: may be a legitimate variable or routine name
     (e.g. HP-UX millicode routines such as $$dyncall), or it may
     be a history value, or it may be a convenience variable.  */
  
  if (**argptr == '$')
    return dl1_dollar (argptr, funfirstline, default_symtab, canonical,
		       file_symtab);
  
  /* Look up that token as a variable.
     If file specified, use that file's per-file block to start with.  */

  return dl1_variable (argptr, funfirstline, canonical, is_quoted,
		       paren_pointer, file_symtab);
}
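
With the symbol-table work stripped out, the order of the checks above can be sketched as a standalone classifier. Everything here is a simplification for illustration and none of it is GDB code: the enum, classify_linespec, first_half_end and all_digits are hypothetical names, quoting is ignored, a '(' anywhere stands in for paren_pointer, and the rule that a period only counts when no colon follows it is my assumption about what dl1_locate_first_half does.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type; the real function returns symtabs_and_lines.  */
enum linespec_kind
{
  LS_INDIRECT,   /* *PC */
  LS_COMPOUND,   /* A::x or a.b */
  LS_DIGITS,     /* a line number or offset */
  LS_DOLLAR,     /* $name */
  LS_SYMBOL      /* anything else */
};

/* Stand-in for dl1_locate_first_half: stop at the first colon or
   whitespace, or at a period that has no colon after it (assumed).  */
const char *
first_half_end (const char *arg)
{
  const char *p;

  for (p = arg; *p != '\0'; p++)
    {
      if (*p == ':' || *p == ' ' || *p == '\t')
        break;
      if (*p == '.' && strchr (p, ':') == NULL)
        break;
    }
  return p;
}

/* Optional sign, then at least one digit, then end or whitespace.  */
int
all_digits (const char *arg)
{
  const char *q = arg;

  if (*q == '-' || *q == '+')
    q++;
  if (!isdigit ((unsigned char) *q))
    return 0;
  while (isdigit ((unsigned char) *q))
    q++;
  return *q == '\0' || *q == ' ' || *q == '\t';
}

/* Classify ARG in the same order as the function above.  *HAS_FILE is
   set to 1 when a "file:" prefix was skipped first.  */
enum linespec_kind
classify_linespec (const char *arg, int *has_file)
{
  const char *p;
  int has_paren = strchr (arg, '(') != NULL;

  *has_file = 0;
  if (arg[0] == '*')
    return LS_INDIRECT;

  p = first_half_end (arg);
  if ((p[0] == ':' || p[0] == '.') && !has_paren)
    {
      if (p[0] == '.' || p[1] == ':')
        return LS_COMPOUND;
      /* A single colon: the first half was a file name.  */
      *has_file = 1;
      arg = p + 1;
      while (*arg == ' ' || *arg == '\t')
        arg++;
    }

  if (all_digits (arg))
    return LS_DIGITS;
  if (arg[0] == '$')
    return LS_DOLLAR;
  return LS_SYMBOL;
}
```

Under these assumptions "foo.c:42" comes out as LS_DIGITS with *has_file set, "A::B::x" as LS_COMPOUND, and "c::f(int)" as LS_SYMBOL, since the parenthesis check sends it past the compound case, just as paren_pointer does in the real function.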

