This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][gdb/python] Add interface to access minimal_symbols


On 2018-10-04 05:11 PM, Tom de Vries wrote:
> Hi,
> 
> [ Submitted earlier here (
> https://sourceware.org/ml/gdb-patches/2016-02/msg00124.html ). ]
> 
> This patch adds a new gdb.MinSymbol object to export the minimal_symbol
> interface.
> 
> Build and reg-tested on x86_64-linux.
> 
> OK for trunk?
> 
> Thanks,
> - Tom

Hi Tom,

I think I have read (probably from Phil or Tom Tromey) that the intention was to
expose minsyms and full symbols using the same Symbol class, to avoid exposing
the fact that GDB represents symbols in different ways internally.  I don't know
if there was some concrete plans for that or if it was just at the idea stage.

Otherwise, I'd be fine with exposing gdb.MinSyms and documenting that they map to
the binary file format symbols (ELF, PE, mach-O, etc) while gdb.Symbols map to
debug info symbols.  I think it's a reality that will not change any time soon,
and it could be useful for users of the Python API to make the distinction
between the two.

Maybe I'm missing something that has already been discussed that makes exposing
minsyms a bad idea, in that case Phil and Tom are probably going to be able to
shed som light on that.  In the mean time here are a bunch of random
comments/suggestions.

* A general GDB-specific code style comment, every time you compare a pointer
  to know if it's NULL or not, it should be explicitly "ptr == NULL" or "ptr != NULL".
* Please make sure that all functions are documented (with an introductory comment).

> 
> [gdb/python] Add interface to access minimal_symbols
> 
> 2018-09-27  Jeff Mahoney  <jeffm@suse.com>
> 	    Tom de Vries  <tdevries@suse.de>
> 
> 	* Makefile.in (SUBDIR_PYTHON_SRCS): Add python/py-minsymbol.c.
> 	* python/py-minsymbol.c: New file.
> 	* python/py-objfile.c (objfpy_object_to_objfile): New function.
> 	* python/python-internal.h (minsym_object_type)
> 	(gdbpy_lookup_minimal_symbol, objfpy_object_to_objfile):
> 	(gdbpy_initialize_minsymbols): Declare.
> 	* python/python.c (do_start_initialization): Call
> 	gdbpy_initialize_minsymbols.
> 	(python_GdbMethods): Add lookup_minimal_symbol entry.
> 
> 	* python.texi (@node Python API): Add "Minimal Symbols In Python" menu
> 	entry.
> 	(@node Minimal Symbols In Python): New node.
> 
> 	* gdb.python/py-minsymbol.c: New test.
> 	* gdb.python/py-minsymbol.exp: New file.
> 
> ---
>  gdb/Makefile.in                           |   1 +
>  gdb/doc/python.texi                       | 140 +++++++++
>  gdb/python/py-minsymbol.c                 | 492 ++++++++++++++++++++++++++++++
>  gdb/python/py-objfile.c                   |   9 +
>  gdb/python/python-internal.h              |   7 +
>  gdb/python/python.c                       |   5 +
>  gdb/testsuite/gdb.python/py-minsymbol.c   |  38 +++
>  gdb/testsuite/gdb.python/py-minsymbol.exp |  67 ++++
>  8 files changed, 759 insertions(+)
> 
> diff --git a/gdb/Makefile.in b/gdb/Makefile.in
> index 8d780ac758..489dab5ca1 100644
> --- a/gdb/Makefile.in
> +++ b/gdb/Makefile.in
> @@ -377,6 +377,7 @@ SUBDIR_PYTHON_SRCS = \
>  	python/py-instruction.c \
>  	python/py-lazy-string.c \
>  	python/py-linetable.c \
> +	python/py-minsymbol.c \
>  	python/py-newobjfileevent.c \
>  	python/py-objfile.c \
>  	python/py-param.c \
> diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
> index 1035be33f0..d91f6e35c5 100644
> --- a/gdb/doc/python.texi
> +++ b/gdb/doc/python.texi
> @@ -158,6 +158,7 @@ optional arguments while skipping others.  Example:
>  * Frames In Python::            Accessing inferior stack frames from Python.
>  * Blocks In Python::            Accessing blocks from Python.
>  * Symbols In Python::           Python representation of symbols.
> +* Minimal Symbols In Python::   Python representation of minimal symbols.
>  * Symbol Tables In Python::     Python representation of symbol tables.
>  * Line Tables In Python::       Python representation of line tables.
>  * Breakpoints In Python::       Manipulating breakpoints using Python.
> @@ -4878,6 +4879,145 @@ The value does not actually exist in the program.
>  The value's address is a computed location.
>  @end vtable
>  
> +@node Minimal Symbols In Python
> +@subsubsection Python representation of Minimal Symbols.
> +
> +@cindex minsymbols in python
> +@tindex gdb.MinSymbol
> +
> +@value{GDBN} represents every variable, function and type as an
> +entry in a symbol table.  @xref{Symbols, ,Examining the Symbol Table}.
> +Typical symbols like functions, variables, etc are represented by
> +gdb.Symbol objects in Python.  Some symbols are defined with less
> +information associated with them, like linker script variables
> +or assembly labels.  Python represents these minimal symbols in @value{GDBN}
> +with the @code{gdb.MinSymbol} object.

Here for example, I would make it clear that a MinSym == a symbol provided by the
binary file format, and would cite ELF, PE and mach-O as examples.  I think that
would make it more obvious to the reader.

> +
> +The following minimal symbol-related functions are available in the @code{gdb}
> +module:
> +
> +@findex gdb.lookup_minimal_symbol
> +@defun gdb.lookup_minimal_symbol (name @r{[}, sfile@r{]}, objfile@r{[})

The square bracket look wrong.  Since sfile and objfile are keyword arguments,
I guess you want them to look like this?

    gdb.lookup_minimal_symbol (name [, sfile][, objfile])

?

Not in this patch, but I think we should document the default value for keyword
arguments, like the official Python doc does, which would mean something like:

    gdb.lookup_minimal_symbol (name, sfile=None, objfile=None)


> +This function searches for a minimal symbol by name.
> +The search scope can be restricted by the sfile and objfile arguments.
> +
> +@var{name} is the name of the minimal symbol.  It must be a string.
> +The optional @var{sfile} argument restricts the search to the source file
> +in which the minimal symbol was defined.
> +The @var{sfile} argument must be a string.  The optional @var{objfile}
> +restricts the search to the objfile that contains the minimal symbol.
> +The @var{objfile} argument must be a @code{gdb.Objfile} object.

Can you try to split these in some kind of bullet, or at least one parameter
per paragraph?  I think we have a tendency to document parameters in a big
paragraph, which does not make it easy to look up an individual parameter.

I think a format like this makes it really easy to read:

https://nodejs.org/dist/latest-v10.x/docs/api/buffer.html#buffer_class_method_buffer_from_arraybuffer_byteoffset_length

Also, would it be possible to precise how the search is made, especially
when multiple symbols match a search?  If I have a C++ program with this:

void allo(int x)
{
}

void allo(float f)
{
}

then doing gdb.lookup_minimal_symbol('allo') will only return one of them.

It would also be good to mention how the sfile argument is used, does it have
to be an exact match, just the base name, any substring?

> +The result is a @code{gdb.MinSymbol} object or @code{None} if the symbol
> +is not found.
> +@end defun
> +
> +A @code{gdb.MinSymbol} object has the following attributes:
> +
> +@defvar MinSymbol.name
> +The name of the symbol as a string.  This attribute is not writable.
> +@end defvar
> +
> +@defvar MinSymbol.linkage_name
> +The name of the symbol, as used by the linker (i.e., may be mangled).
> +This attribute is not writable.
> +@end defvar
> +
> +@defvar MinSymbol.print_name
> +The name of the symbol in a form suitable for output.  This is either
> +@code{name} or @code{linkage_name}, depending on whether the user
> +asked @value{GDBN} to display demangled or mangled names.
> +@end defvar
> +
> +@defvar MinSymbol.filename
> +The file name of the source file where the minimal symbol is defined.  This
> +value may represent filenames used internally by the compiler rather
> +than a viewable/editable source file.
> +@end defvar
> +
> +@defvar MinSymbol.section
> +The name of the binary section containing this minimal symbol.
> +@end defvar
> +
> +@defvar MinSymbol.is_code
> +@code{True} if the minimal symbol is a function or a method.
> +@end defvar
> +
> +@defvar MinSymbol.is_data
> +@code{True} if the symbol is a variable or other data.
> +@end defvar
> +
> +A @code{gdb.MinSymbol} object has the following methods:
> +
> +@defun MinSymbol.is_valid ()
> +Returns @code{True} if the @code{gdb.MinSymbol} object is valid,
> +@code{False} if not.  A @code{gdb.MinSymbol} object can become invalid if
> +the symbol it refers to does not exist in @value{GDBN} any longer.
> +All other @code{gdb.MinSymbol} methods will throw an exception if it is
> +invalid at the time the method is called.
> +@end defun
> +
> +@defun MinSymbol.value ()
> +Compute the value of the minimal symbol, as a @code{gdb.Value}.  The value
> +returned represents the address of the minimal symbol.  Since minimal symbols
> +represent objects without rich type information, the @code{gdb.Type}
> +associated with the @code{gdb.Value} objects will be limited to whether
> +the minimal symbol describes executable code or data.
> +@end defun
> +
> +The available types for @code{gdb.MinSymbol} are represented
> +as constants in the @code{gdb} module.  They are distinctly separate from the
> +types represented by the @code{gdb.Type} object.

I think the "type" method, that returns one of these, is not documented.

> +int
> +gdbpy_initialize_minsymbols (void)
> +{
> +  if (PyType_Ready (&minsym_object_type) < 0)
> +    return -1;
> +
> +  msympy_objfile_data_key
> +    = register_objfile_data_with_cleanup (NULL, del_objfile_msymbols);
> +
> +  if (PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_UNKNOWN",
> +			       mst_unknown) < 0
> +      || PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_TEXT",
> +				  mst_text) < 0
> +      || PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_TEXT_GNU_IFUNC",
> +				  mst_text_gnu_ifunc) < 0
> +      || PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_SLOT_GOT_PLT",
> +				  mst_slot_got_plt) < 0
> +      || PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_DATA",
> +				  mst_data) < 0
> +      || PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_BSS", mst_bss) < 0
> +      || PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_ABS", mst_abs) < 0
> +      || PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_SOLIB_TRAMPOLINE",
> +				  mst_solib_trampoline) < 0
> +      || PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_FILE_TEXT",
> +				  mst_file_text) < 0
> +      || PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_FILE_DATA",
> +				  mst_file_data) < 0
> +      || PyModule_AddIntConstant (gdb_module, "MINSYMBOL_TYPE_FILE_BSS",
> +				  mst_file_bss) < 0)

I was a bit worried about exposing the internal enum values, but it seems like
we do this in other places (like gdb.COMMAND_* constants).  If scripts use the
labels, they don't really care what the numerical value is.

> +    return -1;
> +
> +  return gdb_pymodule_addobject (gdb_module, "MinSymbol",
> +				 (PyObject *) &minsym_object_type);
> +}
> +
> +
> +
> +static gdb_PyGetSetDef minsym_object_getset[] = {
> +  { "name", msympy_get_name, NULL,
> +    "Name of the minimal symbol, as it appears in the source code.", NULL },
> +  { "linkage_name", msympy_get_linkage_name, NULL,
> +    "Name of the minimal symbol, as used by the linker (i.e., may be mangled).",
> +    NULL },
> +  { "filename", msympy_get_file_name, NULL,
> +    "Name of source file that contains this minimal symbol. Only applies for"
> +    " mst_file_*.",
> +    NULL },
> +  { "print_name", msympy_get_print_name, NULL,
> +    "Name of the minimal symbol in a form suitable for output.\n\
> +This is either name or linkage_name, depending on whether the user asked GDB\n\
> +to display demangled or mangled names.", NULL },
> +  { "section", msympy_get_section, NULL,
> +    "Section that contains this minimal symbol, if any", NULL, },

I would suggest naming this "section_name".  It would leave the "section" property
available so that ff we ever have a Python type to represent binary file sections,
we can use it to return that.

> +  { "type", msympy_get_type, NULL,
> +    "Type that this minimal symbol represents." },
> +  { NULL }  /* Sentinel */
> +};
> +
> +static PyMethodDef minsym_object_methods[] = {
> +  { "is_valid", msympy_is_valid, METH_NOARGS,
> +    "is_valid () -> Boolean.\n\
> +Return true if this minimal symbol is valid, false if not." },
> +  { "is_code", msympy_is_code, METH_NOARGS,
> +    "is_code () -> Boolean.\n\
> +Return true if this minimal symbol represents code." },
> +  { "is_data", msympy_is_data, METH_NOARGS,
> +    "is_data () -> Boolean.\n\
> +Return true if this minimal symbol represents data." },
> +  { "value", msympy_value, METH_VARARGS,
> +    "value ([frame]) -> gdb.Value\n\
> +Return the value of the minimal symbol." },
> +  {NULL}  /* Sentinel */
> +};
> +
> +PyTypeObject minsym_object_type = {
> +  PyVarObject_HEAD_INIT (NULL, 0)
> +  "gdb.MinSymbol",		  /*tp_name*/
> +  sizeof (minsym_object),	  /*tp_basicsize*/
> +  0,				  /*tp_itemsize*/
> +  msympy_dealloc,		  /*tp_dealloc*/
> +  0,				  /*tp_print*/
> +  0,				  /*tp_getattr*/
> +  0,				  /*tp_setattr*/
> +  0,				  /*tp_compare*/
> +  0,				  /*tp_repr*/
> +  0,				  /*tp_as_number*/
> +  0,				  /*tp_as_sequence*/
> +  0,				  /*tp_as_mapping*/
> +  0,				  /*tp_hash */
> +  0,				  /*tp_call*/
> +  msympy_str,			  /*tp_str*/

I would suggest implementing repr instead, as we have done for
gdb.Objfile and others.  Also, I would suggest using the output
style

  <gdb.MinSymbol name=%s>

to be somewhat consistent.  I think it helps when developing to
have the type of the object printed.  Plus, we make sure people
don't rely on the output of repr/str to format the output the
way they want :).

Simon


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]