This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: [PATCH v1 02/13] script language API for GDB: scripting.[ch]
- From: Doug Evans <xdje42 at gmail dot com>
- To: Phil Muldoon <pmuldoon at redhat dot com>
- Cc: gdb-patches at sourceware dot org
- Date: Fri, 06 Dec 2013 09:16:59 -0800
- Subject: Re: [PATCH v1 02/13] script language API for GDB: scripting.[ch]
- Authentication-results: sourceware.org; auth=none
- References: <52a16629 dot 23ee440a dot 55aa dot 31c4 at mx dot google dot com> <52A1C281 dot 7040905 at redhat dot com>
Phil Muldoon <pmuldoon@redhat.com> writes:
> On 06/12/13 05:52, xdje42@gmail.com wrote:
>
>> +struct script_language_defn
>> +{
>> + /* Enum of the scripting language. */
>> + enum script_language language;
>> +
>> + /* The name of the scripting language, lowercase. E.g., python. */
>> + const char *name;
>> +
>> + /* The capitalized name of the scripting language. E.g., Python. */
>> + const char *capitalized_name;
>
> Why do we need a capitalized name and a lowercase name? I am sure
> there is a purpose in mind, but I could not see in the patch the
> reason for both.
For gdb's own scripting language it's "GDB".
For python it's "Python".
>> + /* The file suffix for this scripting language. E.g., ".py". */
>> + const char *suffix;
>
> I am undecided if this should be an array. I have seen .python
> (though .py is far more common). What do you think?
I think we can stick with just one suffix for now,
no need to add the feature at this point.
>> +/* The interface for making calls from GDB to an external scripting
>> + language. This is for non-script-loading related functionality,
>> + like pretty-printing, etc. The reason these are separated out as
>> + GDB's own scripting language makes use of script_language_script_opts,
>> + but it makes no use of these. There is no (current) intention to split
>> + script_language_ops up any further. */
>
>> + /* Called before printing a type. */
>> + void (*start_type_printers) (const struct script_language_defn *,
>> + struct script_type_printers *);
>> +
>> + /* Try to pretty-print TYPE. If successful the pretty-printed type is
>> + returned. Otherwise NULL is returned.
>> + This function has a bit of a funny name, since it actually applies
>> + recognizers, but this seemed clearer given the start_type_printers
>> + and free_type_printers functions. */
>> + char *(*apply_type_printers) (const struct script_language_defn *,
>> + const struct script_type_printers *,
>> + struct type *);
>> +
>> + /* Called after a type has been printed to give the type pretty-printer
>> + mechanism an opportunity to clean up. */
>> + void (*free_type_printers) (const struct script_language_defn *,
>> + struct script_type_printers *);
>> +
>> + /* Try to pretty-print a value of type TYPE located at VALADDR
>> + + EMBEDDED_OFFSET, which came from the inferior at address ADDRESS
>> + + EMBEDDED_OFFSET, onto stdio stream STREAM according to OPTIONS.
>> + VAL is the whole object that came from ADDRESS. VALADDR must point to
>> + the head of VAL's contents buffer.
>> + Returns non-zero if the value was successfully pretty-printed. */
>> + int (*apply_val_pretty_printer)
>> + (const struct script_language_defn *,
>> + struct type *type, const gdb_byte *valaddr,
>> + int embedded_offset, CORE_ADDR address,
>> + struct ui_file *stream, int recurse,
>> + const struct value *val, const struct value_print_options *options,
>> + const struct language_defn *language);
>> +
>> + /* GDB access to the "frame filter" feature.
>> + FRAME is the source frame to start frame-filter invocation. FLAGS is an
>> + integer holding the flags for printing. The following elements of
>> + the FRAME_FILTER_FLAGS enum denotes the make-up of FLAGS:
>> + PRINT_LEVEL is a flag indicating whether to print the frame's
>> + relative level in the output. PRINT_FRAME_INFO is a flag that
>> + indicates whether this function should print the frame
>> + information, PRINT_ARGS is a flag that indicates whether to print
>> + frame arguments, and PRINT_LOCALS, likewise, with frame local
>> + variables. ARGS_TYPE is an enumerator describing the argument
>> + format, OUT is the output stream to print. FRAME_LOW is the
>> + beginning of the slice of frames to print, and FRAME_HIGH is the
>> + upper limit of the frames to count. Returns SCR_BT_ERROR on error,
>> + or SCR_BT_COMPLETED on success. */
>> + enum script_bt_status (*apply_frame_filter)
>> + (const struct script_language_defn *,
>> + struct frame_info *frame, int flags, enum script_frame_args args_type,
>> + struct ui_out *out, int frame_low, int frame_high);
>> +
>> + /* Update values held by the scripting language when OBJFILE is discarded.
>> + New global types must be created for every such value, which must then be
>> + updated to use the new types.
>> + The function typically just iterates over all appropriate values and
>> + calls preserve_one_value for each one.
>> + COPIED_TYPES is used to prevent cycles / duplicates and is passed to
>> + preserve_one_value. */
>> + void (*preserve_values) (const struct script_language_defn *,
>> + struct objfile *objfile, htab_t copied_types);
>> +
>> + /* Return non-zero if there is a stop condition for the breakpoint. */
>> + int (*breakpoint_has_cond) (const struct script_language_defn *,
>> + struct breakpoint *);
>
> These features seem arbitrarily divided? Why no extended prompt
> support here for example?
My intention is that they are not arbitrarily divided.
Any time gdb needs to call into a scripting language, there's an ops method.
Another way of looking at it is: the only things exported from the
scripting language is scriping-priv.h, and ultimately just
script_language_defn.
[Plus the one "is python present" predicate for MI.
More on that in another email.]
Extended prompt support is missing only because of oversight.
Thanks for pointing it out!
> I would have thought that callback would
> have been something all of the languages would want access too. I
> guess I don't understand why some features are included here,
> formally, and some are ad-hoc hooks? I understand that things like
> convenience functions etc are not called via a central dispatch
> (like the above), but I am unsure of having half the features formally
> defined here, and the other half just left silently undeclared.
Lots of features exist without needing an "ops" method.
I was wondering about adding them, but there's no need so I kept it simple.
> It gives me pause on the design. If we are going to have a formalized
> interface to scripting in GDB it seems it would be best to either be
> ad-hoc, or completely formal. Right now it feels in-between?
I think it is quite formal.
All python is intended to export to gdb is struct script_language_defn.
[Plus the one function for MI ... one could certainly do that different:
script_lang_initialized_p (SCRIPT_LANG_PYTHON), which is better anyway.
Thanks!]
> Anyway, I think a lot more documentation is needed regarding the
> expectations of the API and the requirements to the scripting author.
> Are all of these functions mandatory and need to be fully implemented?
> Are NULLs allowed where the author has chosen not to implement that
> feature if they are not mandatory? What about the behavior of NOOP
> functions? (IE, the frame filter functions say for guile just returns
> NO_FILTERS. Will that deny the other scripting language the chance to
> run their frame filters?
Righto.
>> + /* Return non-zero if there is a stop condition for the breakpoint,
>> + and it indicates the program should stop. */
>> + enum scr_bp_stop (*breakpoint_cond_says_stop)
>> + (const struct script_language_defn *, struct breakpoint *);
>
> Why "scr" over script?
script "works for me".
>> +#include "defs.h"
>> +#include "auto-load.h"
>> +#include "breakpoint.h"
>> +#include "scripting.h"
>> +#include "scripting-priv.h"
>> +#include "cli/cli-script.h"
>> +#include "python/python.h"
>> +
>> +/* Iterate over all external scripting languages, regardless of whether the
>> + support has been compiled in or not. */
>> +#define ALL_EXT_SCRIPTING_LANGUAGES(i, slang) \
>> + for (/*int*/ i = 0, slang = external_scripting_languages[0]; \
>> + slang != NULL; \
>> + slang = external_scripting_languages[++i])
>> +
>> +/* Iterate over all external scripting languages that are supported. */
>> +#define ALL_ENABLED_EXT_SCRIPTING_LANGUAGES(i, slang) \
>> + for (/*int*/ i = 0, slang = external_scripting_languages[0]; \
>> + slang != NULL; \
>> + slang = external_scripting_languages[++i]) \
>> + if (slang->ops != NULL)
>
> Tiny nit, and it is a personal thing. If we are now asking for spaces
> between function names and comments, surely these same should apply
> for MACROS that are defined as, and called as, functions? Like I said,
> tiny style issue.
"works for me", I don't have a strong preference.
[Nit: We've always asked for spaces between function names and
comments. It's only now that we enforce that.]
>> +/* Table of all external (non-native) scripting languages. */
>> +
>> +static const struct script_language_defn * const
>> + external_scripting_languages[] =
>> +{
>> + &script_language_python,
>> + NULL
>> +};
>
> Small comment. If NULL is the sentinel value indicating the end of a
> sequence, I normally put a comment saying so. Otherwise it can get
> replaced accidentally with the next entry. Something like: "NULL /*
> Sentinel */" so it is absolutely clear.
Done.
But if we do this here I think it needs to formalized and done
everywhere from now on. I'm happy to add a section to the manual
if people want this.
>> +
>> +/* Accessors for "public" attributes of struct script_language.
>> +
>> + IWBN if we could use slang_foo here and elsewhere, but we can't for fear of
>> + confusing someone into thinking we might be referring to the Slang
>> + programming language. */
>> +
>> +/* Return the "name" field of SLANG. */
>
>> +
>> +const char *
>> +script_lang_name (const struct script_language_defn *slang)
>> +{
>> + return slang->name;
>> +}
>
> I'd go even further and remove all references to slang, and replace
> them with scripting_lang, including parameters and variables. It
> would be more consistent. That's just my choice though, it is really
> up to you.
It's a local variable name, but if it's still confusing I'm
happy to change it. OTOH, what I'm doing know *is* consistent.
In breakpoint.c it's b or bp or bl (for breakpoint_location).
In frame.c it's often just frame or fi instead of frame_info.
(it's also this_frame, next_frame, but my point is a shorter
variable name is consistent - I'm doing the same thing here).
I think it's totally ok to use something shorter for local variable
names as long as it's reasonable and consistent.
sl?
>> +
>> +/* Return the "capitalized_name" field of SLANG. */
>> +
>> +const char *
>> +script_lang_capitalized_name (const struct script_language_defn *slang)
>> +{
>> + return slang->capitalized_name;
>> +}
> I mentioned this earlier. Not sure why we need it. Perhaps instead
> of capitalized name, you really mean the proper noun of the scripting
> language? Still not sure why we need this and the lower-case version.
I want to have something simple to handle gdb vs GDB and python vs Python.
> If terms of upper-case and lower-case, the lower-case can just be
> extracted by conversion? This might be a comment issue, and
> these have real uses. Maybe "generic" and "proper" or
> something like this.
What's "generic" and "proper" might be in the eye of the reader.
It's not critical of course.
>> +/* Return the "suffix" field of SLANG. */
>
> This is going to get really confusing if we ever have slang scripting
> support ;)
It's just a local variable name.
sl?
>> +
>> +/* Return the script "sourcer" function for SLANG.
>> + This is the function that loads and processes a script. */
>
> The more I read the more I think which should be consistent with
> script_* over slang.
I like the consistency of "sl". How about that?
>> +/* Return the scripting language specified by CMD.
>> + Note the difference between this function and
>> + eval_script_from_control_command is that we loop on
>> + ALL_EXT_SCRIPTING_LANGUAGES whereas the latter loops on
>> + ALL_ENABLED_EXT_SCRIPTING_LANGUAGES. */
>> +
>> +static const struct script_language_defn *
>> +script_lang_from_control_command (struct command_line *cmd)
>> +{
>> + int i;
>> + const struct script_language_defn *slang;
>> +
>> + ALL_EXT_SCRIPTING_LANGUAGES (i, slang)
>> + {
>> + if (slang->cli_control_type == cmd->control_type)
>> + return slang;
>> + }
>
> It is hard to work this out as there is only one language supported,
> which is Python. But are you planning to add info scripting_languages
> and other assorted information commands?
I hadn't gotten that far yet, but sounds reasonable to me.
> It might even be that we need to add a priority to the language if
> they exist in mutual exclusion for shared features. (IE, if Python
> has a pretty printer for std::list, and so does guile, which gets to
> run?)
That's intended to be solved by the first-one-wins principle,
and keeping Python first in the list.
>> + struct script_type_printers *printers
>> + = XZALLOC (struct script_type_printers);
>> + int i;
>> + const struct script_language_defn *slang;
>> +
>> + ALL_ENABLED_EXT_SCRIPTING_LANGUAGES (i, slang)
>> + {
>> + if (slang->ops->start_type_printers != NULL)
>> + slang->ops->start_type_printers (slang, printers);
>> + }
>> +
>> + return printers;
>> +}
>
> I think this will run all type printers from all scripting languages.
> If Python has already pretty printed a type does it make sense to run
> it through other scripting languages (as an example)?
Hmmm, this is the "start" method that just initializes things
for the subsequent call to the apply_type_printers method. [Right?]
>> +/* Try to pretty-print a value of type TYPE located at VALADDR
>> + + EMBEDDED_OFFSET, which came from the inferior at address ADDRESS
>> + + EMBEDDED_OFFSET, onto stdio stream STREAM according to OPTIONS.
>> + VAL is the whole object that came from ADDRESS. VALADDR must point to
>> + the head of VAL's contents buffer.
>> + Returns non-zero if the value was successfully pretty-printed. */
>> +
>> +int
>> +apply_val_script_pretty_printer (struct type *type, const gdb_byte *valaddr,
>> + int embedded_offset, CORE_ADDR address,
>> + struct ui_file *stream, int recurse,
>> + const struct value *val,
>> + const struct value_print_options *options,
>> + const struct language_defn *language)
>> +{
>> + int i;
>> + const struct script_language_defn *slang;
>> +
>> + ALL_ENABLED_EXT_SCRIPTING_LANGUAGES (i, slang)
>> + {
>> + if (slang->ops->apply_val_pretty_printer (slang, type, valaddr,
>> + embedded_offset, address,
>> + stream, recurse, val,
>> + options, language))
>> + return 1;
>> + }
>> +
>> + return 0;
>> +}
>
> I think we definitely need a priority system for scripting languages.
> If the user cannot define priority, it would seem chaotic when
> multiple scripting languages are supported that have similar features
> supported. I think it would be ideal if the user could specify which
> scripting language has priority for printing a value where one or more
> printers/frame-filters exist over the other.
There's the basic priority system in place now.
One could give the user control on a per-command basis,
but I don't see a need for it yet.
> I thought at first that the existing priority system in place for
> frame-filters/type and value pretty printers would work. But it
> occurs to me it won't. Each scripting language I presume keeps their
> own list of printers. (I don't see guile, for example, adding to a
> python list). These lists are compartmentalized to each language. So
> the situation right now would be, if guile was listed above python,
> and guile had a pretty printer for foo, and so did Python, there is
> currently no way to use the Python pretty printer. And vice
> versa. And there is not way to get a grand list of pretty printers for
> all implemented scripting languages.
I wasn't intending to have guile ahead of python in the list,
but if it were, for the sake of discussion, one could just disable
the guile pretty-printer (or all of them).
Something like "disable guile pretty-printer .*" or some such.
It's easy enough to get a list of all pretty-printers.
We know how to print the list for each language and we know
all the languages.
> In some distributions pretty/type/frame-filters are distributed via a
> package, where the feature package installs the relevant bits. (IE in
> Fedora libstdc++.rpm installs the libstdc++ pretty printers). So in
> these cases, the user might be unaware, and have no control over what
> printers are installed.
yep.
>> +/* GDB access to the "frame filter" feature.
>> + FRAME is the source frame to start frame-filter invocation. FLAGS is an
>> + integer holding the flags for printing. The following elements of
>> + the FRAME_FILTER_FLAGS enum denotes the make-up of FLAGS:
>> + PRINT_LEVEL is a flag indicating whether to print the frame's
>> + relative level in the output. PRINT_FRAME_INFO is a flag that
>> + indicates whether this function should print the frame
>> + information, PRINT_ARGS is a flag that indicates whether to print
>> + frame arguments, and PRINT_LOCALS, likewise, with frame local
>> + variables. ARGS_TYPE is an enumerator describing the argument
>> + format, OUT is the output stream to print. FRAME_LOW is the
>> + beginning of the slice of frames to print, and FRAME_HIGH is the
>> + upper limit of the frames to count. Returns SCR_BT_ERROR on error,
>> + or SCR_BT_COMPLETED on success. */
>> +
>> +enum script_bt_status
>> +apply_script_frame_filter (struct frame_info *frame, int flags,
>> + enum script_frame_args args_type,
>> + struct ui_out *out,
>> + int frame_low, int frame_high)
>> +{
>> + int i;
>> + const struct script_language_defn *slang;
>> +
>> + ALL_ENABLED_EXT_SCRIPTING_LANGUAGES (i, slang)
>> + {
>> + enum script_bt_status status;
>> +
>> + if (slang->ops->apply_frame_filter == NULL)
>> + continue;
>
> I notice this can be NULL, but it seems type and value printers
> cannot? Why?
Just oversight.
>> + status = slang->ops->apply_frame_filter (slang, frame, flags,
>> + args_type, out,
>> + frame_low, frame_high);
>> + /* We use the filters from the first scripting language that has
>> + applicable filters. */
>> + if (status != SCR_BT_NO_FILTERS)
>> + return status;
>
> Not sure why you use SCR here instead of SCRIPT
Just going with the flow having converted it from PY_BT_NO_FILTERS.
SCRIPT_ is fine with me.
>> +enum script_language
>> + {
>> + SCRIPT_LANG_NONE,
>> + SCRIPT_LANG_GDB,
>> + SCRIPT_LANG_PYTHON
>> + };
>> +
>> +/* Script frame-filter status return values. */
>> +
>> +enum script_bt_status
>> + {
>> + /* Return when an error has occurred in processing frame filters,
>> + or when printing the stack. */
>> + SCR_BT_ERROR = -1,
>
> The enum name is script, but the constants are SCR_*
>
> But in the enum script_language above the constants are SCRIPT. I
> think we should finalize on one name throughout the code.
yeah.
>> +/* Flags to pass to apply_frame_filter. */
>> +
>> +enum frame_filter_flags
>> + {
>
> Why no SCRIPT tag on the name?
Move here from python/python.h as is.
I can certainly add a follow-on patch to add a prefix.
>> + /* Set this flag if frame level is to be printed. */
>> + PRINT_LEVEL = 1,
>> +
>> + /* Set this flag if frame information is to be printed. */
>> + PRINT_FRAME_INFO = 2,
>> +
>> + /* Set this flag if frame arguments are to be printed. */
>> + PRINT_ARGS = 4,
>> +
>> + /* Set this flag if frame locals are to be printed. */
>> + PRINT_LOCALS = 8,
>> + };
>> +
>> +/* A choice of the different frame argument printing strategies that
>> + can occur in different cases of frame filter instantiation. */
>> +
>> +enum script_frame_args
>> + {
>> + /* Print no values for arguments when invoked from the MI. */
>
>> +
>> +/* The possible results of script_language_ops.breakpoint_cond_says_stop. */
>> +
>> +enum scr_bp_stop
>> + {
>
> SCRIPT. And others.
Righto.
>> + /* No "stop" condition is set. */
>> + SCR_BP_STOP_UNSET,