This is the mail archive of the gdb@sourceware.cygnus.com mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Proposal: convert function definitions to prototyped form


As many of you know, I'm in the midst of purging the use of ``PARAMS''
in prototyped function declarations from the gdb sources.  After this
activity is concluded, I'd like to begin converting function
definitions whose parameters are declared using the traditional C
style to the ISO C prototyped form.  I.e, I'd like to convert
functions of the form

    int
    foo (a, b, c)
         int a;
         char *b;
         int *c;
    {
      ...
    }

to

    int foo (int a, char *b, int *c)
    {
      ...
    }

To this end, I've been working on a script to automate most of this
conversion.  I've appended a work-in-progress version to the end of
this message and will discuss it in some detail further below.

But first...  it's been pointed out to me that the ``protoize''
program is supposed to do the same thing.  I gave ``protoize'' a whirl
and noticed the following problems:

    1) A substantial amount of initial setup is involved.  Only in
       the simplest cases could you just feed a list of files to
       work on to protoize and expect it to work okay.  In particular,
       for gdb, I found it necessary to do a configure *in the source
       directory* so that xm.h, tm.h, and nm.h are set up.  I also
       found it necessary to build in bfd so that certain header
       files are created.

       The reason all of this is necessary is because protoize invokes
       gcc to learn about the program.  As a consequence, it is also
       important to know the set of -I switches that one normally feeds
       to gcc to do a build.

    2) ``protoize'' fails to convert functions disabled by ifdefs for
       the given platform.  OTOH, on some other platform(s), these
       functions might not be disabled and would be converted.  E.g,
       in breakpoint.c, on my GNU/Linux/x86 machine, protoize failed
       to convert create_longjmp_breakpoint() which is protected by an
       "#ifdef GET_LONGJMP_TARGET".

    3) ``protoize'' screws up on PTR types.  E.g, in breakpoint.c, it
       converted

            static int
            cover_target_enable_exception_callback (arg)
                 PTR arg;

       to

            static int
            cover_target_enable_exception_callback (__builtin_va_list arg)

    4) ``protoize'' does not reformat long argument lists.  The lists
       end up entirely contained on one line.

For more information on protoize, see the "Running protoize" page:

    http://gcc.gnu.org/onlinedocs/gcc_2.html#SEC48

and the "Caveats of using protoize" page:

    http://gcc.gnu.org/onlinedocs/gcc_7.html#SEC135

Two of my goals in creating the scripts for the PARAMS purging
activities was that the scripts should be 1) easy to invoke and 2)
require no hand editing of the output when done.  I.e, you shouldn't
have to edit any of the files that these scripts touch in order to fix
errors so that you can build again.  OTOH, the script may leave
certain portions of the file alone that could possibly have be
converted had the script been a little smarter.  The things that the
script fails to convert *will* have to be fixed later on (in order to
complete the cleanup activity), either by another script, or by hand. 
For the PARAMS purging activity, I have spent a fair amount of time
examining the diffs to make sure that this is indeed the case.  (And
I intend to do the same for the activity in this proposal.)

The reason that it is so important to avoid any hand edits is that we
want people who have local repositories or checked out source trees to
be able to run these conversion scripts against them so that merges
will be less painful.  (Even easy.)

With that said, keeping in mind the problems I noted above, I conclude
that ``protoize'' is not the appropriate tool for us to use to convert
function definitions in the gdb sources to their prototyped form.

Now, I'll tell you about the fix-decls script (see below).

The fix-decls script is run via "fix-decls <dirname>" where <dirname>
is the name of the directory to (recursively) fix.  It only touches
the .c files, with the exception of gnu-regex.c and the testsuite
files.  After it converts a traditional C style declaration to a
prototyped one, it invokes GNU indent (on the function declaration
only) to make sure that our indentation standards are still adhered
to.  There is no gratuitous reformatting of any other code.

Here are a list of things that it won't handle (at the moment,
anyway):

    - declarations with comments in them
    - function pointers
    - array declarations
    - malformed declarations (e.g, look at m88k_skip_prologue in m88k-tdep.c)

Functions with traditional C declarations with any of the above in
them will be left alone.  (And need to be converted by hand at some
point.)  But this isn't so bad.  When I ran the script over the gdb
sources, it converted 5,204 function definitions.  After throwing
out gnu-regex.c (which needs to track glibc), all of the stuff in
the testsuite, and everything in tui (which has ifdefs around each
and every function declaration), there are only 143 cases left to
examine by hand.  I've sampled these and many of them are unhandled
due to having comments following a parameter's type declaration.
If I could figure out what to do about comments, we'd have even
fewer cases to fix up by hand.

Here are some of the more interesting cases that it does handle:

Missing int in declaration (i960-tdep.c):

 CORE_ADDR
-frame_args_address (fi, must_be_correct)
-     struct frame_info *fi;
+frame_args_address (struct frame_info *fi, int must_be_correct)
 {

Comma separated parameter lists (remote-mm.c):

 static void
-init_target_mm (tstart, tend, dstart, dend, entry, ms_size, rs_size, arg_start)
-     ADDR32 tstart, tend, dstart, dend, entry;
-     INT32 ms_size, rs_size;
-     ADDR32 arg_start;
+init_target_mm (ADDR32 tstart, ADDR32 tend, ADDR32 dstart, ADDR32 dend,
+		ADDR32 entry, INT32 ms_size, INT32 rs_size, ADDR32 arg_start)
 {

Comma separated list with differing number of stars on the parameter
names (sparc-tdep.c):

 static branch_type
-isbranch (instruction, addr, target)
-     long instruction;
-     CORE_ADDR addr, *target;
+isbranch (long instruction, CORE_ADDR addr, CORE_ADDR * target)
 {

And another one (command.c):

 struct cmd_list_element *
-lookup_cmd_1 (text, clist, result_list, ignore_help_classes)
-     char **text;
-     struct cmd_list_element *clist, **result_list;
-     int ignore_help_classes;
+lookup_cmd_1 (char **text, struct cmd_list_element *clist,
+	      struct cmd_list_element **result_list, int ignore_help_classes)
 {

Out of order comma separated parameter list - note positions of ``from''
and ``to'' (gdbserver/remote-utils.c):

 void
-decode_M_packet (from, mem_addr_ptr, len_ptr, to)
-     char *from, *to;
-     CORE_ADDR *mem_addr_ptr;
-     unsigned int *len_ptr;
+decode_M_packet (char *from, CORE_ADDR * mem_addr_ptr, unsigned int *len_ptr,
+		 char *to)
 {

Finally, I should note that I've done a test build on GNU/Linux/x86
and had no build problems, nor did I see any regressions when I ran
the testsuite.

The fix-decls script is below.  I'd be interested in finding out if
anyone else has a favorite script for doing this sort of thing.  Other
comments welcome too.

--- fix-decls ---
#!/usr/bin/perl -w

use File::Find;
use FileHandle;
use IPC::Open3;
use English;

my ($root) = @ARGV;

if (!defined($root)) {
    die "Usage: $0 root\n";
}

@ARGV = ();

find(
    sub { 
	if ($_ eq 'testsuite') {
	    $File::Find::prune = 1;
	} elsif (-f && -T && /\.c$/ && $_ ne "gnu-regex.c") {
	    push @ARGV, $File::Find::name;
	}
    },
    $root
);

#$INPLACE_EDIT = '.bak';
$INPLACE_EDIT = '';
undef $/;			# slurp entire files

while (<>) {
    s/
	^			# line start
	( 
	  \w+			# function name
	)
	(
	  \s*			# spaces
	  \(			# left paren
	  \s*			# spaces
	)
	(
	  (?:\w+\s*,\s*)*	# 1 thru N-1 parameter names
	  \w+			# last parameter name
	)
	(
	  \s*			# spaces
	  \)			# right paren
	)
	(
	  (?:
	    [^;{}()]+;		# trad C parameter decl
	  )*
	)
	(
	  \s*			# spaces
	)
	(?=^{)			# lookahead to make sure we see a
				# right curly brace at the beginning
				# of the line
    /
	fix_decl($1, $2, $3, $4, $5, $6);
    /smgex;

    s/
	^			# line start
	( 
	  \w+			# function name
	)
	\s*			# spaces
	\(			# left paren
	\s*			# spaces
	\)			# right paren
	(?=\s*^{)		# lookahead to make sure we see a
				# right curly brace at the beginning
				# of the line
     /$1 (void)/smgx;

    print;
}

sub fix_decl {
    my ($funcname, $lparen, $params, $rparen, $decls, $spaces) = @_;
    my %h = ();
    my ($param, $decl);

    # Define $bailstr to be the original function declaration.  We
    # return it when we see something which doesn't make sense.
    my $bailstr = $funcname . $lparen . $params . $rparen . $decls . $spaces;

    if ($funcname =~ /^(do|while|if)$/) {
	# 'if', 'do', and 'while' are not function names
	# find_overload_match() in valops.c contains an if statement
	# which is confused as a function if we don't have this test.
	return $bailstr;
    }

    foreach $param (split /\s*,\s*/, $params) {
	if (defined $h{$param} || $param eq 'void') {
	    # Bail; either param has already been encountered or
	    # it's void in which case the decl in question is already
	    # ISO C.
	    return $bailstr;
	}
	$h{$param} = "int $param";	# Default
    }

    $decls =~ s/\s*;\Z//;		# remove final semicolon
    			
    foreach $decl (split /\s*;\s*/, $decls) {
	my ($type, $dparams) = 
	    $decl =~ /^				# beginning of string
	              (.*?)			# type
	              (				# dparams...
		        (?:
			  \**			#  stars
			  \s*			#  spaces
			  \w+			#  identifier
			  \s*			#  spaces
			  ,			#  comma
			  \s*			#  spaces
			)*			# any number of the above
			\**			#  stars
			\s*			#  spaces
			\w+			#  identifier
		      )
		      $				# end of string
		     /sx;
	return $bailstr			if !defined $type || !defined $dparams;
	$type =~ s/\A\s+//;		# nuke leading spaces
	$type =~ s/\s+\Z//;		# nuke trailing spaces
	return $bailstr			if $type eq '';
					# Bail if no type
	foreach $param (split /\s*,\s*/, $dparams) {

	    my ($stars, $stripped_param) = 
		$param =~ /(\**)\s*(\w+)/;

	    if (!defined($stripped_param) || !defined $h{$stripped_param}) {
		# Either we couldn't find the parameter or else
		# the parameter wasn't found in the parameter list
		return $bailstr;
	    }
	    $h{$stripped_param} = "$type $param";
	}
    }

    my $newparams = join(', ', map { $h{$_} } split(/\s*,\s*/, $params));

    my $newdecl = reindent("$funcname ($newparams)\n{\n}\n");
    $newdecl =~ s/{\n}//;
    return $newdecl;
}


sub reindent {
    my ($decl, $line_length) = @_;
    $line_length = 80		unless defined $line_length;
    my ($rfh, $wfh, $efh) = (FileHandle->new, FileHandle->new,
					      FileHandle->new);
    my $pid = open3($wfh, $rfh, $efh, "indent -l$line_length");
    $rfh->input_record_separator(undef);
    $efh->input_record_separator(undef);
    $wfh->print($decl);
    $wfh->close();
    my $replacement = <$rfh>;
    $rfh->close();
    my $errstr = <$efh>;
    $efh->close();
    waitpid $pid, 0;
    $replacement =~ s#\n$##;
    if ($errstr ne "") {
	print STDERR "Check $ARGV...\n$errstr\nInput:$decl\nOutput:$replacement\n\n"
    }
    $replacement;
}
--- end fix-decls ---

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]