This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [RFC 2/3] Record function descriptor address instead of function address in value

From: Yao Qi <qiyaoltc at gmail dot com>
To: Ulrich Weigand <uweigand at de dot ibm dot com>
Cc: "gdb-patches at sourceware dot org" <gdb-patches at sourceware dot org>
Date: Fri, 28 Oct 2016 17:09:57 +0100
Subject: Re: [RFC 2/3] Record function descriptor address instead of function address in value
Authentication-results: sourceware.org; auth=none
References: <1476442387-17291-3-git-send-email-yao.qi@linaro.org> <20161017155133.A9B8711C257@oc8523832656.ibm.com>

Hi Ulrich,
I did some experiments these days, for all your suggestions, and
comments.  I share the results now...

On Mon, Oct 17, 2016 at 4:51 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
> Yao Qi wrote:
>
>> GCC vs GDB divergence on the meaning of "incr" brings the confusion.
>> We should reduce such divergence as much as we can.  However, this
>> divergence was added in https://sourceware.org/ml/gdb-patches/2001-11/msg00001.html
>> I agree with Jim, but I'd like use function descriptor address in value,
>> which makes the whole expression evaluation look more reasonable and
>> more close to compiler's behavior.
>
> I agree that this is a problem.  However, the reasons Jim mentions in
> that long comment are still valid: it is in general *not possible* to
> convert an arbitrary function address back to a function descriptor ...
>
>> In this patch, I add a new gdbarch method convert_from_func_addr, which
>> converts function address back to function descriptor address or
>> function pointer address.  It is the reversed operation of
>> convert_from_func_ptr_addr.
>
> ... and therefore this function cannot really be implemented on ppc64
> in a fully-general way.  In particular, when running stripped binaries
> or code that is otherwise without symbols (e.g. JITted code etc.), this
> conversion may not be possible.

I don't expect convert_from_func_addr working in general, due to the
reasons you mentioned.  convert_from_func_addr is only used when
symbol (from debug information) is available.  If we want to GDB
handle such complex expression evaluation correctly, debug
information is required.

>
> B.t.w. note that there is already a similar function that attempts this
> conversion (ppc-sysv-tdep.c:convert_code_addr_to_desc_addr), but it is
> currently only used during inferior function calls, so it is not so
> critical if it fails.  (With your change, that function may actually
> no longer be necessary at that place since values should already point
> to function descriptors ...)
>

Yes, convert_from_func_addr is similar to convert_code_addr_to_desc_addr.
I removed convert_code_addr_to_desc_addr, but it breaks ifunc
inferior call in my experiment, because ifunc resolver gets the function
address rather than function descriptor address, we still need
convert_code_addr_to_desc_addr to get function descriptor address.

> Note that this checks for presence of an msymbol at the code address,
> while your code checks for presence of a symbol.  Maybe the best way
> would be to check for either.
>

Now, convert_from_func_addr does whatever
convert_code_addr_to_desc_addr does.

>> We convert function address to function
>> descriptor address when,
>>
>>  - we create value for a function,
>>  - we generate ax code for a function,
>
> Since the conversion is problematic in general, I think the best way
> would be to keep descriptor addresses wherever possible, and only
> convert to code addresses at the last minute when necessary.
>

Yes, I agree.

> Your code already *mostly* does that, with the exception of symbol
> addresses.  I'm wondering whether we shouldn't push the change one
> step further back and already store descriptor addresses in
> SYMBOL_VALUE_ADDRESS.  Note that this is currently already handled
> somewhat inconsistenly on ppc64: msymbols already point to the
> descriptor, and so do symbols read from stabs debug info; only
> symbols read from DWARF debug info actually point to the code
> address.

I tried what you suggested, but failed.  SYMBOL_VALUE_ADDRESS
is used to get variable address, rather than function's address.  We
get function address from block (BLOCK_START), and get block
from symbol (SYMBOL_BLOCK_VALUE).  There is no good way to
store function descriptor address in symbol.

>
>> I don't change the meaning of return value of value_as_address, because
>> it is widely used, so value_as_address still return the function
>> address if type is TYPE_CODE_FUNC or TYPE_CODE_METHOD.
>
> I'm wondering whether this is a good idea; maybe it would be better to
> return descriptor addresses and update those callers that actually need
> code addresses.  This would keep the principle that we should keep
> descriptor addresses as long as possible.  Also, this might actually
> fix more bugs; if you look at e.g. this code in value_equal:
>
>   else if (code1 == TYPE_CODE_PTR && is_int2)
>     return value_as_address (arg1) == (CORE_ADDR) value_as_long (arg2);
>   else if (code2 == TYPE_CODE_PTR && is_int1)
>     return (CORE_ADDR) value_as_long (arg1) == value_as_address (arg2);
>
> this might actually cause invalid evaluation of expressions like
>
>   incr == 0x12345678
>
> (as compared to the native C evaluation).
>

How about doing this in a followup patch as an enhancement?  My
priority is to get this RFC acceptable, and make PPC64/ARM/MIPS
works well.  Propagate descriptor address and bug fixes can be done
later.

>
>> This patch brings several user visible changes, which look more
>> accurate, shown by this table below,
>>
>>  COMMAND           BEFORE                       AFTER
>>  p main            main function address        main function descriptor
>>                                                 address
>>  disass main       disassembly function main    not changed
>>  disass main+4,+4  disassembly 4 bytes from     disassembly 4 bytes from
>>                    function main address + 4    main's function descriptor + 4
>>  x/i main          show one instruction on      show one instruction on main's
>>                    function main address        function descriptor
>>
>> Although the latter looks inconvenient, that is consistent to the
>> meaning on C language level.  Due to these changes, test cases are
>> adjusted accordingly.
>
> Those are a bit annoying.  I think the underlying problem is that operations
> like "disass" or "x/i" really don't work on "values" in the original sense
> but rather on "PC addresses".  Maybe it would make sense to have those
> function enable a special "mode" in the expression evaluator that would
> change the conversion of functions to function pointers to use the code
> address instead of the descriptor address?
>

I think about the special "mode" in the expression evaluation, but I
suspect that there is a case in a single expression, function address
is expected in some symbol, and descriptor address is expected in
some other symbol, like,

int func (int i) { return i;}

(gdb) x/i func + func (1)

the "func" is expected to be function address and the second "func"
is expected to be function descriptor address.  I had a heuristics that
VALUE means function address if we do a pointer add with long type
(valarith.c:value_ptradd).  It works fine.

I've already managed to get all my patches work.  If you agree with
my thoughts above, I'll post my patches next week.

-- 
Yao (齐尧)

Follow-Ups:
- Re: [RFC 2/3] Record function descriptor address instead of function address in value
  - From: Ulrich Weigand

References:
- [RFC 2/3] Record function descriptor address instead of function address in value
  - From: Yao Qi
- Re: [RFC 2/3] Record function descriptor address instead of function address in value
  - From: Ulrich Weigand

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]