This is the mail archive of the binutils@sources.redhat.com mailing list for the binutils project.



Extension for addr2line to unwind inline scopes


Sometimes application programmers like to include the ability to show
a call stack traceback to the user when they detect an internal fault.
One way to do this is to walk back up the call stack, calling the
external program addr2line with the PC for each frame.  The man page
for addr2line specifically mentions using addr2line this way:

  In the second, addr2line reads hexadecimal addresses from standard
  input, and prints the file name and line number for each address on
  standard output.  In this mode, addr2line may be used in a pipe to
  convert dynamically chosen addresses.
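
For concreteness, here is a minimal sketch of how an application might
drive addr2line this way.  The helper names are hypothetical, the frame
PCs are assumed to have been collected already by whatever stack walker
the application uses, and the executable path is assumed to need no
shell quoting.  Only the -e and -f options are used.

```c
#define _POSIX_C_SOURCE 200809L   /* for popen/pclose */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "addr2line -e EXE -f" in BUF.
   Returns 0 on success, -1 if BUF is too small.
   Assumption: EXE contains no characters that need shell quoting.  */
int
build_addr2line_command (char *buf, size_t size, const char *exe)
{
  int n;

  assert (buf != NULL && exe != NULL && strlen (exe) > 0);
  n = snprintf (buf, size, "addr2line -e %s -f", exe);
  return (n < 0 || (size_t) n >= size) ? -1 : 0;
}

/* Hypothetical helper: write COUNT frame PCs to OUT, one hexadecimal
   address per line, which is what addr2line reads on standard input.  */
void
write_frame_pcs (FILE *out, const unsigned long *pcs, size_t count)
{
  size_t i;

  for (i = 0; i < count; i++)
    fprintf (out, "%lx\n", pcs[i]);
  fflush (out);
}

/* Pipe the PCs into addr2line.  Its output (function name, then
   file:line, for each address) goes to our own standard output.
   Returns 0 on success, -1 on failure.  */
int
print_traceback (const char *exe, const unsigned long *pcs, size_t count)
{
  char cmd[512];
  FILE *pipe;

  if (build_addr2line_command (cmd, sizeof cmd, exe) != 0)
    return -1;
  pipe = popen (cmd, "w");
  if (pipe == NULL)
    return -1;
  write_frame_pcs (pipe, pcs, count);
  return pclose (pipe) == -1 ? -1 : 0;
}
```

A fault handler would call print_traceback() with the path of the
running executable and the array of PCs it collected, innermost frame
first.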

The problem with this, though, is that inlined functions are missed,
since they share their caller's stack frame.  Consider the example
program:

  inline void bar (void)
  {
    printf ("Called bar");
  }
  
  inline void foo (void)
  {
    bar ();
    printf ("Called foo");
  }
  
  main ()
  {
    foo ();
  }

When compiled with a gcc built from the current development sources,
this produces the following (partial) line table information:

  Special opcode 233: advance Address by 16 to 0x804839c and Line by 4 to 13
  Special opcode 89: advance Address by 6 to 0x80483a2 and Line by 0 to 13
  Advance Line by -10 to 3
  Special opcode 89: advance Address by 6 to 0x80483a8 and Line by 0 to 3
  Special opcode 151: advance Address by 10 to 0x80483b2 and Line by 6 to 9
  Special opcode 221: advance Address by 15 to 0x80483c1 and Line by 6 to 15
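
As a cross-check on the listing above, the address and line advances
can be recomputed from the opcode numbers.  This is only a sketch under
assumptions not stated in this message: that the number printed after
"Special opcode" already has opcode_base subtracted, and that the line
program header uses line_base = -5, line_range = 14 and a minimum
instruction length of 1.  Those values were picked because they
reproduce all four special opcodes shown.

```c
#include <assert.h>

/* ASSUMED line program header values (not given in the message).  */
enum { LINE_BASE = -5, LINE_RANGE = 14, MIN_INSN_LENGTH = 1 };

struct line_advance
{
  unsigned int address;   /* bytes added to the address register */
  int line;               /* signed change to the line register */
};

/* Decode ADJUSTED, a special opcode with opcode_base already
   subtracted, into its address and line advances.  */
struct line_advance
decode_special_opcode (unsigned int adjusted)
{
  struct line_advance adv;

  assert (adjusted <= 255);
  adv.address = (adjusted / LINE_RANGE) * MIN_INSN_LENGTH;
  adv.line = LINE_BASE + (int) (adjusted % LINE_RANGE);
  return adv;
}
```

For example, 233 = 16 * 14 + 9, so the address advances by 16 and the
line by -5 + 9 = 4, matching the first entry.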

I've annotated a disassembly of main() with a line number prefix:

  13  0804839c <main>:
  13   804839c:       55                      push   %ebp
  13   804839d:       89 e5                   mov    %esp,%ebp
  13   804839f:       83 ec 08                sub    $0x8,%esp
  13   80483a2:       83 e4 f0                and    $0xfffffff0,%esp
  13   80483a5:       83 ec 1c                sub    $0x1c,%esp
   3   80483a8:       68 a4 84 04 08          push   $0x80484a4
   3   80483ad:       e8 fe fe ff ff          call   80482b0 <printf@plt>
   9   80483b2:       c7 04 24 af 84 04 08    movl   $0x80484af,(%esp)
   9   80483b9:       e8 f2 fe ff ff          call   80482b0 <printf@plt>
   9   80483be:       83 c4 10                add    $0x10,%esp
  15   80483c1:       c9                      leave  
  15   80483c2:       c3                      ret    
  15   80483c3:       90                      nop    
    
Current released versions of addr2line report misleading info for
address 80483a8:

  $ /usr/bin/addr2line -e x1 -f 80483a8
  main
  /build/sourceware/binutils/i686-pc-linux-gnu/binutils/x1.c:3

A recent fix to addr2line improves that by making the function name
and line consistent:

  $ ./addr2line -e x1 -f 80483a8
  bar
  /build/sourceware/binutils/i686-pc-linux-gnu/binutils/x1.c:3

However, if this were used as part of a frame traceback as described
above, the user would never see that bar was called by foo, which was
called by main.

As an experiment, I've modified bfd_find_nearest_line() so that when
it parses the DWARF info for an inlined function, it remembers a PC
from the caller of that inlined function, and makes it available via a
separate BFD call.  This allows the caller of bfd_find_nearest_line()
to loop until it's "unwound" all the inlined functions.  I also
modified addr2line to do this.  With these modifications in place, the
above example produces:

  $ ./addr2line -e x1 -f 80483a8
  bar
  /build/sourceware/binutils/i686-pc-linux-gnu/binutils/x1.c:3
  foo
  /build/sourceware/binutils/i686-pc-linux-gnu/binutils/x1.c:9
  main
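
To make the loop concrete, here is a self-contained model of it that
does not use BFD at all.  struct scope is a simplified stand-in for
struct funcinfo, and the PC ranges for foo and bar are assumptions
(only consistent with the line table above, not read from the real
DWARF info); main's range is taken from the disassembly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for BFD's struct funcinfo.  */
struct scope
{
  const char *name;
  int nesting_level;        /* 0 = main, 1 = foo in main, 2 = bar in foo */
  int inlined;              /* nonzero for an inlined subroutine */
  unsigned long low, high;  /* half-open PC range [low, high) */
};

/* ASSUMED ranges, innermost scope first (mirrors the prev_func chain).  */
static const struct scope scopes[] = {
  { "bar",  2, 1, 0x80483a8UL, 0x80483b2UL },
  { "foo",  1, 1, 0x80483a8UL, 0x80483c1UL },
  { "main", 0, 0, 0x804839cUL, 0x80483c4UL },
};
#define N_SCOPES (sizeof scopes / sizeof scopes[0])

/* Return the scope with the smallest range containing PC, or NULL.  */
const struct scope *
find_scope (unsigned long pc)
{
  const struct scope *best = NULL;
  size_t i;

  for (i = 0; i < N_SCOPES; i++)
    {
      const struct scope *s = &scopes[i];
      if (pc >= s->low && pc < s->high
	  && (best == NULL || s->high - s->low < best->high - best->low))
	best = s;
    }
  return best;
}

/* For inlined scope S, return a PC that lies in the enclosing scope
   but outside S, or 0 if there is none.  */
unsigned long
inliner_pc (const struct scope *s)
{
  const struct scope *p;

  if (!s->inlined)
    return 0;
  for (p = s + 1; p < scopes + N_SCOPES; p++)
    if (p->nesting_level == s->nesting_level - 1)
      {
	if (p->low < s->low)
	  return s->low - 1;
	if (p->high > s->high)
	  return s->high;
	return 0;
      }
  return 0;
}

/* Store scope names for PC in NAMES, innermost first, following
   inliner PCs until none is left.  Returns the number stored.  */
size_t
unwind_inline_scopes (unsigned long pc, const char **names, size_t max)
{
  size_t n = 0;

  while (pc != 0 && n < max)
    {
      const struct scope *s = find_scope (pc);
      if (s == NULL)
	break;
      names[n++] = s->name;
      pc = inliner_pc (s);
    }
  return n;
}
```

Under these assumed ranges, starting from 0x80483a8 the loop visits
0x80483b2 (in foo) and then 0x80483a7 (in main).  Neither of these is a
call site; they are just addresses that land in the enclosing scope.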

Note that I'm not proposing that any of these changes be adopted "as
is", just presenting them as a possible starting point for discussion.
To adopt these patches, at least the following issues would need to be
resolved:

(1) The _bfd_dwarf2_find_saved_nearest_scope() entry point to fetch
the most recent saved PC would need to have the BFD infrastructure
added to make it a generic BFD call.

(2) The PC that gets saved isn't necessarily the most recent
instruction executed in the caller of the inlined function.  It is
merely one which when fed back into bfd_find_nearest_line() will get
us to the caller of the inlined function.

(3) It would probably be best to add a new option to addr2line that
tells it to produce multiple outputs for a single address that is
found to be inside an inlined function, otherwise existing users of
addr2line might break.

(4) The DWARF info produced by current versions of gcc does not make
use of the DW_AT_ranges attribute to identify discontiguous address
ranges of the inlined subroutine and its caller.  Instead it uses
just DW_AT_low_pc and DW_AT_high_pc, which can produce overlapping
ranges.  I've somewhat hacked around this problem in the
find_nearest_line code.

(5) The DWARF parser in BFD's dwarf2.c is pretty much goal oriented
towards supporting just bfd_find_nearest_line().  It might be best to
reorganize this code to make it easier to add other BFD functions to
extract info from the DWARF bits.  On the other hand, perhaps BFD
really isn't the place for this code anyway.

(6) If we have to add another bfd entry point anyway, perhaps a better
idea is a variant of bfd_find_nearest_line() that returns a PC from
the next higher inlining scope directly.
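
To illustrate the overlapping-range problem from (4): when a caller's
range fully contains the inlined subroutine's range, a lookup that
stops at the first match depends on table order, while picking the
smallest containing range does not.  The sketch below uses a
hypothetical struct pc_range and assumed ranges for foo and bar; it is
not the BFD code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified function-table entry.  */
struct pc_range
{
  const char *name;
  unsigned long low, high;   /* half-open PC range [low, high) */
};

/* Return the first entry whose range contains PC, or NULL.  */
const struct pc_range *
first_match (const struct pc_range *table, size_t n, unsigned long pc)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (pc >= table[i].low && pc < table[i].high)
      return &table[i];
  return NULL;
}

/* Return the entry with the smallest range containing PC, or NULL.  */
const struct pc_range *
smallest_match (const struct pc_range *table, size_t n, unsigned long pc)
{
  const struct pc_range *best = NULL;
  size_t i;

  for (i = 0; i < n; i++)
    if (pc >= table[i].low && pc < table[i].high
	&& (best == NULL
	    || table[i].high - table[i].low < best->high - best->low))
      best = &table[i];
  return best;
}

/* ASSUMED overlapping ranges, deliberately listed outermost first.  */
static const struct pc_range outer_first[] = {
  { "main", 0x804839cUL, 0x80483c4UL },
  { "foo",  0x80483a8UL, 0x80483c1UL },
  { "bar",  0x80483a8UL, 0x80483b2UL },
};
```

With the table in this order, first_match() reports main for 0x80483a8
while smallest_match() reports bar.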

I've attached the patches I'm currently experimenting with.

-Fred

Index: bfd/dwarf2.c
===================================================================
RCS file: /cvs/src/src/bfd/dwarf2.c,v
retrieving revision 1.59
diff -c -p -r1.59 dwarf2.c
*** bfd/dwarf2.c	5 Jan 2005 10:37:05 -0000	1.59
--- bfd/dwarf2.c	17 Jan 2005 15:28:28 -0000
*************** struct dwarf2_debug
*** 116,121 ****
--- 116,126 ----
  
    /* Length of the loaded .debug_str section.  */
    unsigned long dwarf_str_size;
+ 
+   /* If most recent call to bfd_find_nearest_line was given an address
+      in an inlined function, remember the closest address in the
+      inliner function that is not in the inlined function. */
+   bfd_vma inliner_pc;
  };
  
  struct arange
*************** struct line_info_table
*** 715,720 ****
--- 720,727 ----
  struct funcinfo
  {
    struct funcinfo *prev_func;
+   int tag;
+   int nesting_level;
    char* name;
    bfd_vma low;
    bfd_vma high;
*************** lookup_address_in_line_info_table (struc
*** 1297,1303 ****
  
  /* Function table functions.  */
  
! /* If ADDR is within TABLE, set FUNCTIONNAME_PTR, and return TRUE.  */
  
  static bfd_boolean
  lookup_address_in_function_table (struct funcinfo *table,
--- 1304,1313 ----
  
  /* Function table functions.  */
  
! /* If ADDR is within TABLE, set FUNCTIONNAME_PTR, and return TRUE.
!    Note that we need to find the function that has the smallest
!    range that contains ADDR, to handle inlined functions without
!    depending upon them being ordered in TABLE by increasing range. */
  
  static bfd_boolean
  lookup_address_in_function_table (struct funcinfo *table,
*************** lookup_address_in_function_table (struct
*** 1306,1311 ****
--- 1316,1322 ----
  				  const char **functionname_ptr)
  {
    struct funcinfo* each_func;
+   struct funcinfo* best_fit = NULL;
  
    for (each_func = table;
         each_func;
*************** lookup_address_in_function_table (struct
*** 1313,1325 ****
      {
        if (addr >= each_func->low && addr < each_func->high)
  	{
! 	  *functionname_ptr = each_func->name;
! 	  *function_ptr = each_func;
! 	  return TRUE;
  	}
      }
  
!   return FALSE;
  }
  
  static char *
--- 1324,1345 ----
      {
        if (addr >= each_func->low && addr < each_func->high)
  	{
! 	  if (!best_fit ||
! 	      ((each_func->high - each_func->low) < (best_fit->high - best_fit->low)))
! 	    best_fit = each_func;
  	}
      }
  
!   if (best_fit)
!     {
!       *functionname_ptr = best_fit->name;
!       *function_ptr = best_fit;
!       return TRUE;
!     }
!   else
!     {
!       return FALSE;
!     }
  }
  
  static char *
*************** scan_unit_for_functions (struct comp_uni
*** 1401,1406 ****
--- 1421,1428 ----
  	{
  	  bfd_size_type amt = sizeof (struct funcinfo);
  	  func = bfd_zalloc (abfd, amt);
+ 	  func->tag = abbrev->tag;
+ 	  func->nesting_level = nesting_level;
  	  func->prev_func = unit->function_table;
  	  unit->function_table = func;
  	}
*************** comp_unit_contains_address (struct comp_
*** 1631,1636 ****
--- 1653,1696 ----
    return FALSE;
  }
  
+ /* Given inlined function FUNC, find the "closest" address in the
+    inliner function that is not also in the inlined function.  There
+    are several assumptions at work here, including the assumption that
+    we can follow the function chain from this point to move up the
+    scope hierarchy.  This works because of the order of the entries in
+    dwarf and the way the function table is built.  Another assumption
+    is that all the code of an inlined function is contiguous within the
+    boundaries of the low and high PC's.  Another assumption is that
+    the inlined function range of PC's may be a contiguous subset of
+    the inliner function range of PC's, including having either the
+    same low value or the same high value (but of course not both at
+    the same time). */
+ 
+ static bfd_vma
+ lookup_inliner_pc_in_function_table (struct funcinfo *func)
+ {
+   struct funcinfo* each_func;
+ 
+   for (each_func = func;
+        each_func;
+        each_func = each_func->prev_func)
+     {
+       if (each_func->nesting_level == func->nesting_level - 1)
+ 	{
+ 	  /* Note: May want to reverse order of these two tests
+ 	     depending upon whether it is more useful to think of the
+ 	     inlined function as called by the inliner code before or
+ 	     after the inlined function. */
+ 	  if (each_func->low < func->low)
+ 	    return (func->low - 1);
+ 	  if (each_func->high > func->high)
+ 	    return (func->high);
+ 	  return (0);
+ 	}
+     }
+   return (0);
+ }
+ 
  /* If UNIT contains ADDR, set the output parameters to the values for
     the line containing ADDR.  The output parameters, FILENAME_PTR,
     FUNCTIONNAME_PTR, and LINENUMBER_PTR, are pointers to the objects
*************** comp_unit_find_nearest_line (struct comp
*** 1681,1686 ****
--- 1741,1748 ----
    function = NULL;
    func_p = lookup_address_in_function_table (unit->function_table, addr,
  					     &function, functionname_ptr);
+   if (func_p && (function->tag == DW_TAG_inlined_subroutine))
+     stash->inliner_pc = lookup_inliner_pc_in_function_table (function);
    line_p = lookup_address_in_line_info_table (unit->line_table, addr,
  					      function, filename_ptr,
  					      linenumber_ptr);
*************** _bfd_dwarf2_find_nearest_line (bfd *abfd
*** 1838,1843 ****
--- 1900,1907 ----
    if (! stash->info_ptr)
      return FALSE;
  
+   stash->inliner_pc = 0;
+ 
    /* Check the previously read comp. units first.  */
    for (each = stash->all_comp_units; each; each = each->next_unit)
      if (comp_unit_contains_address (each, addr))
*************** _bfd_dwarf2_find_nearest_line (bfd *abfd
*** 1928,1930 ****
--- 1992,2005 ----
  
    return FALSE;
  }
+ 
+ bfd_vma
+ _bfd_dwarf2_find_saved_nearest_scope (bfd *abfd)
+ {
+   struct dwarf2_debug *stash;
+ 
+   stash = elf_tdata(abfd)->dwarf2_find_line_info;
+   if (stash)
+     return (stash->inliner_pc);
+   return (0);
+ }
Index: binutils/addr2line.c
===================================================================
RCS file: /cvs/src/src/binutils/addr2line.c,v
retrieving revision 1.21
diff -c -p -r1.21 addr2line.c
*** binutils/addr2line.c	15 Jun 2004 01:19:13 -0000	1.21
--- binutils/addr2line.c	17 Jan 2005 15:28:29 -0000
*************** find_address_in_section (bfd *abfd, asec
*** 145,176 ****
  				 &filename, &functionname, &line);
  }
  
- /* Read hexadecimal addresses from stdin, translate into
-    file_name:line_number and optionally function name.  */
- 
  static void
! translate_addresses (bfd *abfd)
  {
!   int read_stdin = (naddr == 0);
! 
!   for (;;)
      {
-       if (read_stdin)
- 	{
- 	  char addr_hex[100];
- 
- 	  if (fgets (addr_hex, sizeof addr_hex, stdin) == NULL)
- 	    break;
- 	  pc = bfd_scan_vma (addr_hex, NULL, 16);
- 	}
-       else
- 	{
- 	  if (naddr <= 0)
- 	    break;
- 	  --naddr;
- 	  pc = bfd_scan_vma (*addr++, NULL, 16);
- 	}
- 
        found = FALSE;
        bfd_map_over_sections (abfd, find_address_in_section, NULL);
  
--- 145,155 ----
  				 &filename, &functionname, &line);
  }
  
  static void
! do_user_pc (bfd *abfd)
  {
!   for (; pc ;)
      {
        found = FALSE;
        bfd_map_over_sections (abfd, find_address_in_section, NULL);
  
*************** translate_addresses (bfd *abfd)
*** 215,224 ****
  	}
  
        /* fflush() is essential for using this command as a server
!          child process that reads addresses from a pipe and responds
!          with line number information, processing one address at a
!          time.  */
        fflush (stdout);
      }
  }
  
--- 194,239 ----
  	}
  
        /* fflush() is essential for using this command as a server
! 	 child process that reads addresses from a pipe and responds
! 	 with line number information, processing one address at a
! 	 time.  */
        fflush (stdout);
+ 
+       /* If this was an inlined function, rerun with the inliner PC */
+       if (found)
+ 	{
+ 	  extern bfd_vma _bfd_dwarf2_find_saved_nearest_scope (bfd *abfd);
+ 	  pc = _bfd_dwarf2_find_saved_nearest_scope (abfd);
+ 	}
+     }
+ }
+ 
+ /* Read hexadecimal addresses from stdin, translate into
+    file_name:line_number and optionally function name.  */
+ 
+ static void
+ translate_addresses (bfd *abfd)
+ {
+   int read_stdin = (naddr == 0);
+ 
+   for (;;)
+     {
+       if (read_stdin)
+ 	{
+ 	  char addr_hex[100];
+ 
+ 	  if (fgets (addr_hex, sizeof addr_hex, stdin) == NULL)
+ 	    break;
+ 	  pc = bfd_scan_vma (addr_hex, NULL, 16);
+ 	}
+       else
+ 	{
+ 	  if (naddr <= 0)
+ 	    break;
+ 	  --naddr;
+ 	  pc = bfd_scan_vma (*addr++, NULL, 16);
+ 	}
+       do_user_pc (abfd);
      }
  }
  
