This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCHv2] gdb: Rewrite argument handling for user-defined commands


* Tom Tromey <tom@tromey.com> [2018-09-07 14:36:04 -0600]:

> >>>>> "Andrew" == Andrew Burgess <andrew.burgess@embecosm.com> writes:
> 
> Andrew> This commit changes how quoting works so that the quotes are NOT now
> Andrew> included in the argument passed.  If the user wants to include quotes,
> Andrew> they would now need to use nested quotes, so "\"abc\"" will pass the
> Andrew> argument "abc".
> 
> Andrew> It is also possible to use single quotes, so '"abc"' will also pass
> Andrew> the argument "abc".
> 
> Andrew> As currently there's no documentation for how quoting works in
> Andrew> user-defined commands this commit adds documentation for the new
> Andrew> behaviour.
> 
> Andrew> The big risk with this commit is that this does change how arguments
> Andrew> are passed to user-defined commands, and this might causes issues for
> Andrew> existing users.
> 
> I think this change goes against the compatibility approach I discussed
> in that earlier thread -- it changes the syntax of a command in a way
> that is likely to be used in practice.
> 
> In my opinion, the documentation issue in cases like this is not a
> strong argument in favor of allowing a change.  That's because users
> will often resort to trial-and-error to get gdb to do what they want.
> So unless the documentation is very clear, in practice, and especially
> over long periods of time, behavior is locked to the implementation.
> 
> So, my own inclination is to say no to this patch, though I welcome &
> will listen to other responses.
> 
> I'd accept a patch adding an option to define,

OK, so we could have a switch to define:

   (gdb) define --arg-style=v2 my_command
           ...
         end

Or just a global flag:

   (gdb) set define-arg-style v1 | v2

I think either is fine, and neither is too hard to add on, so the next
hurdle here is:

>                                                 though as I mentioned
> earlier, in a case like that I think it's better to design something
> very good rather than to try to patch things piecemeal; the latter being
> how gdb ended up in this situation in the first place.

Here's my proposal then for how argument quoting might work in a V2
scheme, I'll use '>>' and '<<' to delimit examples where needed:

 1. Arguments are separated by whitespace,

 2. Everything within matched single or double quotes is a single
 argument.

 3. After a quoted region you still need whitespace to split the
 arguments, so >>'abc'def<< will be a single argument, >>abcdef<<.

 4. Within double quotes, a backslash can be used to escape a double
 quotes, so >>"\"this is one\""<< will give >>"this is one"<<.

 5. Within single quotes backslash does NOT escape a single quote, so
 a single quote always ends a quoted block, so >>'this is\'<< will
 pass the argument >>this is\<<.  I'm not 100% committed to this idea,
 and can make this like double quotes if that is preferred.

 6. Outside of single or double quotes, a backslash can be used to
 quote a single or double quotes, so >>ab\'cd<< is >>ab'cd<<.

 7. Other than the cases listed in 4, 5, and 6, a backslash is passed
 through, as is, so >>"\"multi \n line\""<< will pass >>"multi \n line"<<.

I'm happy to build or flesh out these rules more if people have
feedback.

The patch below is just a quick iteration to add a control flag, and
to better follow the rules I laid out above.  The patch would need
tests, and some help text filling in, but I expect I will get feedback
and need to adjust things some more yet, so this is still a work in
progress.

In the patch below I favoured having a global flag to control argument
parsing, over requiring the user to flag every use of 'define'.  Maybe
that's the wrong approach.... again, feedback welcome.

Thanks,
Andrew

---

gdb: Rewrite argument handling for user-defined commands

This commit rewrites argument passing for user-defined commands.  The
rewrite was inspired after this mailing list thread:

    https://sourceware.org/ml/gdb-patches/2018-08/msg00391.html

The summary is that it was felt that in order to pass arguments that
include whitespace, then single or double quotes should be used for
quoting the argument.

The problem is that currently, the quotes are included in the argument
that is passed into the user-defined command, so passing the argument
"1 + 1" will currently litterally pass "1 + 1" (including the quotes)
to GDB, which is no good if what you want to do is to pass an
expression.

This commit changes how quoting works so that the quotes are NOT now
included in the argument passed.  If the user wants to include quotes,
they would now need to use nested quotes, so "\"abc\"" will pass the
argument "abc".

It is also possible to use single quotes, so '"abc"' will also pass
the argument "abc".

As currently there's no documentation for how quoting works in
user-defined commands this commit adds documentation for the new
behaviour.

The big risk with this commit is that this does change how arguments
are passed to user-defined commands, and this might causes issues for
existing users.

gdb/ChangeLog:

	* cli/cli-script.c (user_defined_command_arg_scheme_v1): New
	static variable.
	(user_defined_command_arg_scheme_v2): Likewise.
	(user_defined_command_arg_scheme_names): Likewise.
	(user_defined_command_arg_scheme_string): Likewise.
	(show_user_defined_command_arg_scheme): New static function.
	(user_args::m_command_line): Remove.
	(user_args::m_args): Change type.
	(user_args::parse_command_line_v1): New method, contains the old
	contents of user_args::user_args.
	(user_args::parse_command_line_v2): New method, contains code to
	parse arguments using the new scheme.
	(user_args::user_args): Call one of the parser methods, the old
	content has moved into user_args::parse_command_line_v1.
	(user_args::insert_args): Update to take account of m_args type
	change.
	(_initialize_cli_script): Register new set/show variable.

gdb/doc/ChangeLog:

	* gdb.texinfo (Define): Additional documentation about argument
	syntax.

gdb/testsuite/ChangeLog:

	* gdb.base/commands.exp (user_defined_command_arg_quoting): New
	proc, which is added to the list of procs to call.
	* gdb.base/run.c (global_var): Defined global.
---
 gdb/ChangeLog                       |  20 ++++++
 gdb/cli/cli-script.c                | 136 ++++++++++++++++++++++++++++++++----
 gdb/doc/ChangeLog                   |   5 ++
 gdb/doc/gdb.texinfo                 |  66 +++++++++++++++--
 gdb/testsuite/ChangeLog             |   6 ++
 gdb/testsuite/gdb.base/commands.exp |  50 +++++++++++++
 gdb/testsuite/gdb.base/run.c        |   3 +
 7 files changed, 266 insertions(+), 20 deletions(-)

diff --git a/gdb/cli/cli-script.c b/gdb/cli/cli-script.c
index 8496fb85e6f..edfc6586b6d 100644
--- a/gdb/cli/cli-script.c
+++ b/gdb/cli/cli-script.c
@@ -58,6 +58,31 @@ static int command_nest_depth = 1;
 /* This is to prevent certain commands being printed twice.  */
 static int suppress_next_print_command_trace = 0;
 
+/* Constants used in argument parsing control switch.  */
+static const char user_defined_command_arg_scheme_v1[] = "v1";
+static const char user_defined_command_arg_scheme_v2[] = "v2";
+
+static const char *const user_defined_command_arg_scheme_names[] = {
+  user_defined_command_arg_scheme_v1,
+  user_defined_command_arg_scheme_v2,
+  NULL
+};
+
+/* The control variable for which argument parsing scheme user defined
+   commands use.  */
+static const char *user_defined_command_arg_scheme_string
+	= user_defined_command_arg_scheme_v1;
+
+static void
+show_user_defined_command_arg_scheme (struct ui_file *file, int from_tty,
+                                      struct cmd_list_element *c,
+                                      const char *value)
+{
+  fprintf_filtered (file,
+		    _("Arguments to user defined commands are parsed "
+                      "using scheme \"%s\".\n"), value);
+}
+
 /* Structure for arguments to user defined functions.  */
 
 class user_args
@@ -78,12 +103,13 @@ private:
   user_args (const user_args &) =delete;
   user_args &operator= (const user_args &) =delete;
 
-  /* It is necessary to store a copy of the command line to ensure
-     that the arguments are not overwritten before they are used.  */
-  std::string m_command_line;
+  /* We support two schemes for parsing a command line arguments into a
+     list of arguments.  */
+  void parse_command_line_v1 (const char *line);
+  void parse_command_line_v2 (const char *line);
 
-  /* The arguments.  Each element points inside M_COMMAND_LINE.  */
-  std::vector<gdb::string_view> m_args;
+  /* The arguments.  Parsed from the LINE passed into the constructor.  */
+  std::vector<std::string> m_args;
 };
 
 /* The stack of arguments passed to user defined functions.  We need a
@@ -744,17 +770,27 @@ if_command (const char *arg, int from_tty)
 
 user_args::user_args (const char *command_line)
 {
-  const char *p;
-
   if (command_line == NULL)
     return;
 
-  m_command_line = command_line;
-  p = m_command_line.c_str ();
+  if (user_defined_command_arg_scheme_string
+      == user_defined_command_arg_scheme_v1)
+    parse_command_line_v1 (command_line);
+  else
+    parse_command_line_v2 (command_line);
+}
+
+void
+user_args::parse_command_line_v1 (const char *command_line)
+{
+  gdb_assert (command_line != NULL);
+
+  const char *p = command_line;
 
   while (*p)
     {
-      const char *start_arg;
+      std::string arg;
+
       int squote = 0;
       int dquote = 0;
       int bsquote = 0;
@@ -763,9 +799,6 @@ user_args::user_args (const char *command_line)
       while (*p == ' ' || *p == '\t')
 	p++;
 
-      /* P now points to an argument.  */
-      start_arg = p;
-
       /* Get to the end of this argument.  */
       while (*p)
 	{
@@ -794,11 +827,75 @@ user_args::user_args (const char *command_line)
 		  else if (*p == '"')
 		    dquote = 1;
 		}
+	      arg += *p;
 	      p++;
 	    }
 	}
 
-      m_args.emplace_back (start_arg, p - start_arg);
+      m_args.emplace_back (arg);
+    }
+}
+
+/* The newer v2 scheme for parsing arguments from a command line.  This
+   scheme strips single and double quotes from arguments, thus allowing
+   quoting of expressions.  Within double quotes you can use backslash to
+   escape nested double quotes.  */
+
+void
+user_args::parse_command_line_v2 (const char *command_line)
+{
+  gdb_assert (command_line != NULL);
+
+  for (const char *p = command_line; *p != '\0'; )
+    {
+      std::string arg;
+      bool squote = false;
+      bool dquote = false;
+
+      /* Strip whitespace.  */
+      while (isspace (*p))
+	p++;
+
+      /* Get to the end of this argument.  */
+      while (*p)
+	{
+          /* If we find whitespace and we're not inside a single or double
+             quote then we have found the end of this argument.  */
+          if (isspace (*p) && !(squote || dquote))
+            break;
+          else
+            {
+              /* If we're inside a single quote and we find another single
+                 quote then this is the end of the argument.  */
+              if (*p == '\'' && !dquote)
+                {
+                  ++p;
+                  squote = !squote;
+                  continue;
+                }
+
+              /* If we're inside a double quote and we find another double
+                 quote then this is the end of the argument.  */
+              if (*p == '"' && !squote)
+                {
+                  ++p;
+                  dquote = !dquote;
+                  continue;
+                }
+
+              if (*p == '\\' && !squote)
+                {
+                  if (*(p + 1) == '\''
+                      || *(p + 1) == '"')
+                    ++p;
+                }
+            }
+
+          arg += *p;
+          ++p;
+	}
+
+      m_args.emplace_back (arg);
     }
 }
 
@@ -863,7 +960,7 @@ user_args::insert_args (const char *line) const
 	    error (_("Missing argument %ld in user function."), i);
 	  else
 	    {
-	      new_line.append (m_args[i].data (), m_args[i].length ());
+	      new_line.append (m_args[i]);
 	      line = tmp;
 	    }
 	}
@@ -1618,4 +1715,13 @@ The conditional expression must follow the word `if' and must in turn be\n\
 followed by a new line.  The nested commands must be entered one per line,\n\
 and should be terminated by the word 'else' or `end'.  If an else clause\n\
 is used, the same rules apply to its nested commands as to the first ones."));
+
+  add_setshow_enum_cmd ("user-defined-command-argument-scheme", class_support,
+			user_defined_command_arg_scheme_names,
+                        &user_defined_command_arg_scheme_string, _("\
+Set the argument parsing scheme used by user-defined commands."), _("\
+Show the argument parsing scheme used by user-defined commands."), _("\
+Write something here..."),
+			NULL, show_user_defined_command_arg_scheme,
+			&setlist, &showlist);
 }
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index f2d1155b4db..b159df3b217 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -25245,14 +25245,70 @@
 
 @noindent
 This defines the command @code{adder}, which prints the sum of
-its three arguments.  Note the arguments are text substitutions, so they may
-reference variables, use complex expressions, or even perform inferior
-functions calls.
+its three arguments.
+
+The arguments to user-defined commands are text substitutions, so they
+may reference variables, use complex expressions, or even perform
+inferior functions calls.  Each argument is separated with whitespace,
+so in the previous example three arguments were passed.  The following
+example also passes three arguments, though the arguments are more
+complex:
+
+@smallexample
+adder 10+1 10+2 10+3
+@end smallexample
+
+@noindent
+However, if whitespace were added around the @code{+} characters, then
+9 arguments would be passed, @code{adder} only uses the first 3 of
+these arguments, and the others would be silently ignored, for example:
+
+@smallexample
+adder 10 + 1 10 + 2 10 + 3
+@end smallexample
+
+Causes @value{GDBN} to try and evaluate the following, which is likely
+invalid:
+
+@smallexample
+print 10 + + + 1
+@end smallexample
+
+@noindent
+Arguments can be quoted with double (@code{"}) or single (@code{'})
+quotes.  These quotes are not passed through as part of the argument,
+so the complex arguments from the previous example can be written as:
+
+@smallexample
+adder '10 + 1' '10 + 2' '10 + 3'
+@end smallexample
+
+@noindent
+As the quotes are not passed through, then the previous example causes
+@value{GDBN} to evaluate:
+
+@smallexample
+print 10 + 1 + 10 + 2 + 10 + 3
+@end smallexample
+
+@noindent
+Outside of quotes, a backslash can be used to pass a quote as part of
+an argument, for example, @code{\'} will pass a single quote as an
+argument, and @code{\"} passes a double quote as an argument.
+
+Within double quotes a backslash can be used to pass a literal double
+quote, so @code{"\"abc\""} will pass the argument @code{"abc"}.
+
+Within single quotes a backslash does not escape a single quote, the
+next single quote always ends a single quoted argument, and
+backslashes within a single quoted argument are passed straight
+through, so @code{'abc\'} will pass the argument @code{abc\}.
 
 @cindex argument count in user-defined commands
 @cindex how many arguments (user-defined commands)
-In addition, @code{$argc} may be used to find out how many arguments have
-been passed.
+@noindent
+Within a user-defined command @code{$argc} may be used to find out how
+many arguments have been passed.
 
 @smallexample
 define adder
diff --git a/gdb/testsuite/gdb.base/commands.exp b/gdb/testsuite/gdb.base/commands.exp
index 52a22bb5ddc..91023dae7cc 100644
--- a/gdb/testsuite/gdb.base/commands.exp
+++ b/gdb/testsuite/gdb.base/commands.exp
@@ -1167,6 +1167,55 @@ proc_with_prefix backslash_in_multi_line_command_test {} {
     gdb_test "print 1" "" "run command"
 }
 
+proc_with_prefix user_defined_command_arg_quoting {} {
+
+    gdb_test_no_output "set user-defined-command-argument-scheme v2"
+
+    gdb_test_multiple "define show_args" "define show_args" {
+	-re "End with"  {
+	    pass "define show_args"
+	}
+    }
+
+    # This test should alternate between 0xdeadbeef and 0xfeedface two times.
+    gdb_test \
+	[multi_line_input \
+	     {printf "nargs=%d:", $argc} \
+	     {set $i = 0} \
+	     {while $i < $argc} \
+	     {printf " "} \
+	     {eval "echo '$arg%d'", $i} \
+	     {set $i = $i + 1} \
+	     {end} \
+	     {printf "\n"} \
+	     {end}] \
+	"" \
+	"enter commands"
+
+    gdb_test "show_args 1 2 3" \
+	"nargs=3: '1' '2' '3'"
+
+    gdb_test "show_args 1 '1 + 1' '1 + (1 + 1)'" \
+	"nargs=3: '1' '1 \\+ 1' '1 \\+ \\(1 \\+ 1\\)'"
+
+    gdb_test "show_args '{unsigned long long} &global_var'" \
+	"nargs=1: '{unsigned long long} &global_var'"
+
+    gdb_test "show_args '*((unsigned long long *) &global_var)'" \
+	"nargs=1: '\\*\\(\\(unsigned long long \\*\\) &global_var\\)'"
+
+    gdb_test "show_args '\"' \"'\" \"''\"" \
+	"nargs=3: '\"' ''' ''''"
+
+    gdb_test "show_args \\n a\\'b" \
+	"nargs=2: '\r\n' 'a'b'"
+
+    gdb_test "show_args \"This\\nIs\\nA\\nMulti Line\\nMessage" \
+	"nargs=1: 'This\r\nIs\r\nA\r\nMulti Line\r\nMessage'"
+
+    gdb_test_no_output "set user-defined-command-argument-scheme v1"
+}
+
 gdbvar_simple_if_test
 gdbvar_simple_while_test
 gdbvar_complex_if_while_test
@@ -1182,6 +1231,7 @@ user_defined_command_case_sensitivity
 user_defined_command_args_eval
 user_defined_command_args_stack_test
 user_defined_command_manyargs_test
+user_defined_command_arg_quoting
 watchpoint_command_test
 test_command_prompt_position
 deprecated_command_test
diff --git a/gdb/testsuite/gdb.base/run.c b/gdb/testsuite/gdb.base/run.c
index 614b018260d..d89bad78bb4 100644
--- a/gdb/testsuite/gdb.base/run.c
+++ b/gdb/testsuite/gdb.base/run.c
@@ -8,6 +8,9 @@
 
 #include "../lib/unbuffer_output.c"
 
+/* Used by commands.exp test script.  */
+volatile unsigned long long global_var = 34;
+
 int factorial (int);
 
 int
-- 
2.14.4


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]