This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.



Re: [PATCH 3/9] Handle TLS variable lookups when using separate debug object files.


On 2/5/19 12:06 PM, Simon Marchi wrote:
> On 2019-02-04 15:02, John Baldwin wrote:
>> On 2/2/19 7:52 AM, Simon Marchi wrote:
>>> On 2019-01-22 13:42, John Baldwin wrote:
>>>> The object files stored in the shared object list are the original
>>>> object files, not the separate debug object files.  However, when
>>>> resolving a TLS variable, the variable is resolved against a separate
>>>> debug object file if it exists.  In this case,
>>>> svr4_fetch_objfile_link_map was failing to find the link map entry
>>>> since the debug object file is not in its internal list, only the
>>>> original object file.
>>>
>>> Does this solve an existing issue, or an issue that would come up with
>>> the following patches?
>>
>> I only noticed while working on these patches, but I believe it is a
>> generic issue.  I tried to reproduce on a Linux box by compiling a small
>> library with separate debug symbols and a program that linked against it
>> and running it under gdb, but TLS variables didn't work for me even without
>> separate debug symbols in my testing. :(
>>
>> $ cat foo.c
>> #include "foo.h"
>>
>> static __thread int foo_id;
>>
>> void
>> set_foo_id(int id)
>> {
>>   foo_id = id;
>> }
>>
>> int
>> get_foo_id(void)
>> {
>>   return foo_id;
>> }
>> $ cat foo.h
>> void set_foo_id(int id);
>> int get_foo_id(void);
>> $ cat main.c
>> #include <stdio.h>
>>
>> #include "foo.h"
>>
>> int
>> main(void)
>> {
>>
>>   set_foo_id(47);
>>   printf("foo id = %d\n", get_foo_id());
>>   return (0);
>> }
>> $ cc -g -fPIC -shared foo.c -o libfoo.so.full
>> $ objcopy --only-keep-debug libfoo.so.full libfoo.so.debug
>> $ objcopy --strip-debug --add-gnu-debuglink=libfoo.so.debug libfoo.so.full libfoo.so
>> $ cc -g main.c -o foo -lfoo -L.
>> $ gdb foo
>> GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
>> ...
>> (gdb) start
>> Temporary breakpoint 1 at 0x7ae: file main.c, line 9.
>> Starting program: /home/john/tls_lib/foo
>>
>> Temporary breakpoint 1, main () at main.c:9
>> 9         set_foo_id(47);
>> (gdb) p foo_id
>> Cannot find thread-local storage for process 3970, shared library libfoo.so:
>> Cannot find thread-local variables on this target
>>
>> Then tried it without separate debug file, but that didn't work either:
>>
>> $ cc -g -fPIC -shared foo.c -o libfoo.so
>> $ gdb foo
>> GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
>> ...
>> (gdb) start
>> Temporary breakpoint 1 at 0x7ae: file main.c, line 9.
>> Starting program: /home/john/tls_lib/foo
>>
>> Temporary breakpoint 1, main () at main.c:9
>> 9         set_foo_id(47);
>> (gdb) p foo_id
>> Cannot find thread-local storage for process 3982, shared library libfoo.so:
>> Cannot find thread-local variables on this target
>> (gdb) n
>> 10        printf("foo id = %d\n", get_foo_id());
>> (gdb)
>> foo id = 47
>> 11        return (0);
>> (gdb) p foo_id
>> Cannot find thread-local storage for process 3982, shared library libfoo.so:
>> Cannot find thread-local variables on this target
>>
>> I would have expected the second case to work and the first to fail and for
>> the patch to fix the first case.
> 
> I did a similar test and hit the same message.  I figured it's because
> you need to build with -pthread.  With -pthread, GDB manages to print
> the value in both cases (with and without separate debug info).  That's
> why I ended up asking you this question, I thought that maybe I just
> hadn't reproduced the issue correctly.
> 
> I have attached my test case (so we don't have to rewrite the same test
> over and over), which you can easily build with and without debug info
> (see the Makefile).  Does it trigger the problem on FreeBSD?  Does it
> work for you in both cases on Linux, like it does for me?
> 
> Now that I take a look, there might be the gdb.threads/tls-shared.exp
> test that does the exact same thing.  But there doesn't seem to be a
> board file to test with separate debug files out of the box...

So, it works fine for me without this patch on FreeBSD.  However, I dug
around a bit more as I had only noticed this before when trying to debug
a core dump for riscv (as FreeBSD/riscv64 core dumps have the tp register
in them).  That is when I get a separate objfile passed to
svr4_fetch_objfile_link_map:

Breakpoint 3, svr4_fetch_objfile_link_map (objfile=0x8038a1600) at ../../gdb/solib-svr4.c:1541
1541      struct svr4_info *info = get_svr4_info ();
(top-gdb) p *objfile
$3 = {next = 0x8038a1480, 
  original_name = 0x8038df010 "/usr/lib/debug//ufs/riscv64/rootfs/lib/libc.so.7.debug", addr_low = 0x0, flags = {
...
(top-gdb) where
#0  svr4_fetch_objfile_link_map (objfile=0x8038a1600)
    at ../../gdb/solib-svr4.c:1541
#1  0x00000000015085e7 in gdbarch_fetch_tls_load_module_address (gdbarch=0x803800010, objfile=0x8038a1600) at ../../gdb/gdbarch.c:3019
#2  0x00000000019ba415 in target_translate_tls_address (objfile=0x8038a1600, 
    offset=0x1800) at ../../gdb/target.c:705
#3  0x00000000016ea2a1 in find_minsym_type_and_address (msymbol=0x803cfcb18, 
    objfile=0x8038a1600, address_p=0x7fffffffd2e8) at ../../gdb/parse.c:500
#4  0x00000000014afeb7 in evaluate_var_msym_value (noside=EVAL_NORMAL, 
    objfile=0x8038a1600, msymbol=0x803cfcb18) at ../../gdb/eval.c:743
#5  0x00000000014b0290 in evaluate_subexp_standard (expect_type=0x0, 
    exp=0x8034f0430, pos=0x7fffffffde34, noside=EVAL_NORMAL)
    at ../../gdb/eval.c:1326
#6  0x00000000012a8562 in evaluate_subexp_c (expect_type=0x0, exp=0x8034f0430, 
    pos=0x7fffffffde34, noside=EVAL_NORMAL) at ../../gdb/c-lang.c:704
#7  0x00000000014ae17b in evaluate_subexp (expect_type=0x0, exp=0x8034f0430, 
    pos=0x7fffffffde34, noside=EVAL_NORMAL) at ../../gdb/eval.c:80
#8  0x00000000014ae48b in evaluate_expression (exp=0x8034f0430)
    at ../../gdb/eval.c:141
#9  0x000000000173184f in print_command_1 (
    exp=0x7fffffffe101 "__je_tsd_initialized", voidprint=1)
    at ../../gdb/printcmd.c:1187
#10 0x000000000172e5ff in print_command (
    exp=0x7fffffffe101 "__je_tsd_initialized", from_tty=1)
    at ../../gdb/printcmd.c:1200
(top-gdb) frame 5
#5  0x00000000014b0290 in evaluate_subexp_standard (expect_type=0x0, 
    exp=0x8034f0430, pos=0x7fffffffde34, noside=EVAL_NORMAL)
    at ../../gdb/eval.c:1326
1326            value *val = evaluate_var_msym_value (noside,
(top-gdb) l
1321        case OP_VAR_MSYM_VALUE:
1322          {
1323            (*pos) += 3;
1324
1325            minimal_symbol *msymbol = exp->elts[pc + 2].msymbol;
1326            value *val = evaluate_var_msym_value (noside,
1327                                                  exp->elts[pc + 1].objfile,
1328                                                  msymbol);
1329
1330            type = value_type (val);

whereas on a live amd64 process this was called from a different place:

(top-gdb) where
#0  svr4_fetch_objfile_link_map (objfile=0x803879380)
    at ../../gdb/solib-svr4.c:1541
#1  0x00000000015085e7 in gdbarch_fetch_tls_load_module_address (gdbarch=0x8039b7010, objfile=0x803879380) at ../../gdb/gdbarch.c:3019
#2  0x00000000019ba3f5 in target_translate_tls_address (objfile=0x803879380, 
    offset=6168) at ../../gdb/target.c:705
#3  0x000000000141f361 in dwarf_evaluate_loc_desc::get_tls_address (
    this=0x7fffffffc578, offset=6168) at ../../gdb/dwarf2loc.c:609
#4  0x00000000014077dd in dwarf_expr_context::execute_stack_op (
    this=0x7fffffffc578, op_ptr=0x803dea6fb "\002S\177\001", 
    op_end=0x803dea6fb "\002S\177\001") at ../../gdb/dwarf2expr.c:1175
#5  0x00000000014056b1 in dwarf_expr_context::eval (this=0x7fffffffc578, 
    addr=0x803dea6f1 "\016\030\030", len=10) at ../../gdb/dwarf2expr.c:301
#6  0x000000000140e99d in dwarf2_evaluate_loc_desc_full (type=0x8040ad930, 
    frame=0x0, data=0x803dea6f1 "\016\030\030", size=10, per_cu=0x8039cb190, 
    subobj_type=0x8040ad930, subobj_byte_offset=0)
    at ../../gdb/dwarf2loc.c:2170
#7  0x000000000140e7d2 in dwarf2_evaluate_loc_desc (type=0x8040ad930, 
    frame=0x0, data=0x803dea6f1 "\016\030\030", size=10, per_cu=0x8039cb190)
    at ../../gdb/dwarf2loc.c:2352
#8  0x0000000001412794 in locexpr_read_variable (symbol=0x80416c930, frame=0x0)
    at ../../gdb/dwarf2loc.c:3503
#9  0x00000000014e3743 in default_read_var_value (var=0x80416c930, 
    var_block=0x8041724e0, frame=0x0) at ../../gdb/findvar.c:610
#10 0x00000000014e4881 in read_var_value (var=0x80416c930, 
    var_block=0x8041724e0, frame=0x0) at ../../gdb/findvar.c:815
#11 0x0000000001a99abe in value_of_variable (var=0x80416c930, b=0x8041724e0) at ../../gdb/valops.c:1292
#12 0x00000000014afd7b in evaluate_var_value (noside=EVAL_NORMAL, blk=0x8041724e0, var=0x80416c930) at ../../gdb/eval.c:721
#13 0x00000000014b0217 in evaluate_subexp_standard (expect_type=0x0, exp=0x803507830, pos=0x7fffffffd504, noside=EVAL_NORMAL) at ../../gdb/eval.c:1312
#14 0x00000000012a8562 in evaluate_subexp_c (expect_type=0x0, exp=0x803507830, pos=0x7fffffffd504, noside=EVAL_NORMAL) at ../../gdb/c-lang.c:704
#15 0x00000000014ae17b in evaluate_subexp (expect_type=0x0, exp=0x803507830, pos=0x7fffffffd504, noside=EVAL_NORMAL) at ../../gdb/eval.c:80
#16 0x00000000014ae48b in evaluate_expression (exp=0x803507830) at ../../gdb/eval.c:141
#17 0x000000000173184f in print_command_1 (exp=0x7fffffffd7d1 "__je_tsd_initialized", voidprint=1) at ../../gdb/printcmd.c:1187
#18 0x000000000172e5ff in print_command (exp=0x7fffffffd7d1 "__je_tsd_initialized", from_tty=1) at ../../gdb/printcmd.c:1200
(top-gdb) frame 13
#13 0x00000000014b0217 in evaluate_subexp_standard (expect_type=0x0, 
    exp=0x803507830, pos=0x7fffffffd504, noside=EVAL_NORMAL)
    at ../../gdb/eval.c:1312
1312                return evaluate_var_value (noside, exp->elts[pc + 1].block, var);
(top-gdb) l
1307            (*pos) += 3;
1308            symbol *var = exp->elts[pc + 2].symbol;
1309            if (TYPE_CODE (SYMBOL_TYPE (var)) == TYPE_CODE_ERROR)
1310              error_unknown_type (SYMBOL_PRINT_NAME (var));
1311            if (noside != EVAL_SKIP)
1312                return evaluate_var_value (noside, exp->elts[pc + 1].block, var);
1313            else
1314              {
1315                /* Return a dummy value of the correct type when skipping, so
1316                   that parent functions know what is to be skipped.  */

So it seems that the OP_VAR_VALUE path calls down into the dwarf bits that
get the "original" objfile to pass to target_translate_tls_address, whereas
the OP_VAR_MSYM_VALUE case ends up using a separate object file.  This might
in fact be due to bugs in the RISCV GCC backend as the TLS symbols for RISCV
don't seem quite right.  I have to cast TLS variables to their types for
example:

(gdb) p __je_tsd_initialized
'__je_tsd_initialized' has unknown type; cast it to its declared type
(gdb) p (bool)__je_tsd_initialized
$2 = 1 '\001'

Also, TLS variables in the main executable did not work for me on
RISCV, only TLS variables in shared libraries, unlike on other architectures
I tested where TLS variables in both the executable and shared libraries
worked.
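
To make the failure mode concrete, here is a minimal standalone sketch of the
lookup described at the top of the thread.  It is only a model: the struct
names, field names and the fetch_link_map_model function are hypothetical
simplifications and not GDB's actual definitions.  It assumes the shared
object list records only original object files, and that a separate debug
object file carries a back pointer to its original.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for an object file and a
   shared object list entry.  These are not GDB's real definitions.  */
struct objfile_model
{
  const char *name;
  /* For a separate debug file, points back at the original object
     file; NULL otherwise.  */
  struct objfile_model *separate_debug_backlink;
};

struct so_entry_model
{
  struct objfile_model *objfile;   /* Always the original object file.  */
  unsigned long link_map_addr;     /* Address of the link map entry.  */
};

/* Return the link map address recorded for OBJFILE, or 0 if OBJFILE is
   not in LIST.  */
unsigned long
fetch_link_map_model (const struct so_entry_model *list, size_t count,
                      struct objfile_model *objfile)
{
  assert (objfile != NULL);

  /* If OBJFILE is a separate debug file, search for the original
     object file instead, since only that one is in LIST.  */
  if (objfile->separate_debug_backlink != NULL)
    objfile = objfile->separate_debug_backlink;

  for (size_t i = 0; i < count; i++)
    if (list[i].objfile == objfile)
      return list[i].link_map_addr;

  return 0;
}
```

In this model, dropping the backlink check makes a lookup with the debug
object file return 0 even though its original is in the list, which
corresponds to the OP_VAR_MSYM_VALUE case above.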

-- 
John Baldwin

                                                                            

