This is the mail archive of the
binutils@sourceware.org
mailing list for the binutils project.
Re: Allow pie links to create PLT entries
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: Sriraman Tallam <tmsriram at google dot com>
- Cc: Cary Coutant <ccoutant at google dot com>, binutils <binutils at sourceware dot org>, David Li <davidxl at google dot com>, Ian Lance Taylor <iant at google dot com>
- Date: Thu, 29 Jan 2015 15:13:17 -0800
- Subject: Re: Allow pie links to create PLT entries
- Authentication-results: sourceware.org; auth=none
- References: <CAAs8HmyEG-m74+vcKFzuFTzVB-1cQvp1K_k3Hji=9ZnFci7CtA at mail dot gmail dot com> <CAMe9rOoW6NDcAgTdY1rATCR+ncLd3RaoMyX=hqFU-A6hxBHAUQ at mail dot gmail dot com> <CAAs8HmyLBFgrj70-U8xBuDv00RbESBwznAs6+9Q_tm_1cRoUkA at mail dot gmail dot com> <CAMe9rOqEx8X2444FCZJDbQm=VKniUM0bRNaUuqknQyeOnVj7HA at mail dot gmail dot com> <CAAs8Hmxm4ya74vf6TpJOAYFO3Yn17bDj=wNN40Hr=nC9M7pPiA at mail dot gmail dot com>
On Thu, Jan 29, 2015 at 2:17 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> On Thu, Jan 29, 2015 at 12:17 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Thu, Jan 29, 2015 at 12:08 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>> On Thu, Jan 29, 2015 at 11:48 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Thu, Jan 29, 2015 at 11:00 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>>> Hi,
>>>>>
>>>>> Here is a simple example that fails to link with -pie but which
>>>>> should work just fine without having to use -fPIE.
>>>>>
>>>>> foo.cc
>>>>> ======
>>>>> int extern_func();
>>>>> int main()
>>>>> {
>>>>> extern_func();
>>>>> return 0;
>>>>> }
>>>>>
>>>>> bar.cc
>>>>> =====
>>>>> int extern_func()
>>>>> {
>>>>> return 1;
>>>>> }
>>>>>
>>>>> $ g++ -fPIC -shared bar.cc -o libbar.so
>>>>> $ g++ foo.cc -lbar -pie
>>>>>
>>>>> ld: error: foo.o: requires dynamic R_X86_64_PC32 reloc against
>>>>> '_Z11extern_funcv' which may overflow at runtime; recompile with -fPIC
>>>>>
>>>>> It fails because the linker disallows creating a PLT entry for an
>>>>> R_X86_64_PC32 reloc when it is perfectly fine to do so. Note that I
>>>>> could have recompiled foo.cc with -fPIE or -fPIC, but I still think
>>>>> this can be allowed. With support for copy relocations in pie in gold
>>>>> and with this support, there are fewer cases where we would need
>>>>> -fPIE to get working pie links. This would help us link non-PIE
>>>>> objects into pie executables.
>>>>
>>>> You can't do it for x86 since EBX isn't set up for calling via PLT.
>>>> For x86-64, there should be little difference between PIE
>>>> and non-PIE code.
>>>
>>> True but that little difference is sometimes causing non-trivial
>>> performance penalties. With copy relocations support for PIE added
>>> recently, one big difference causing non-trivial performance penalty
>>> went away. However, there are still differences in the way global
>>> arrays are accessed. For instance,
>>>
>>> uint32 a[] = {1, 2, 3, 4};
>>>
>>> a[i] can be accessed with one insn without -fPIE, whereas with -fPIE,
>>> we need two. One extra to get the 64-bit address of a.
>>>
>>> Without -fPIE:
>>>
>>> movslq 0x1655(%rip),%rax # 401b80 <i>
>>> mov 0x401b30(,%rax,4),%esi # a[i]
>>>

If you link it with -pie, you will have TEXTREL in the executable.
Do you want relocations in text sections in PIE?

>>>
>>> With -fPIE:
>>>
>>> movslq 0x16c5(%rip),%rdx # <i>
>>> lea 0x166e(%rip),%rax # <&a>
>>> mov (%rax,%rdx,4),%esi # a[i]
>>>
>>> I wish we could use just one insn to do the last two in the -fPIE
>>> case, using PC-relative addressing like:
>>> mov 0x166e(%rip, %rdx, 4), %esi
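
For anyone who wants to reproduce the two sequences quoted above, here is a
minimal sketch. The wrapper name load_element and the value of i are
assumptions made for illustration, and the exact instructions you get will
depend on the compiler version and flags. As far as I know, RIP-relative
addressing on x86-64 cannot be combined with an index register, which is
why the -fPIE case needs the separate lea.

```cpp
#include <cstdint>

// Global array and index, as in the example quoted above.
std::uint32_t a[] = {1, 2, 3, 4};
int i = 2;

// Hypothetical wrapper so the a[i] load is easy to find in the disassembly.
std::uint32_t load_element()
{
    return a[i];
}
```

Compiling this once with "g++ -O2 -c" and once with "g++ -O2 -fPIE -c",
then running "objdump -d" on each object, should show the difference in
how a[i] is addressed.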
>>
>> Can you improve GCC codegen for this?
>
> I didn't find an instruction similar to that which I could use. Is there one?
>
>> I implemented an
>> optimization in ld to convert
>>
>> mov foo@GOTPCREL(%rip), %reg
>> to
>> lea foo(%rip), %reg
>>
>> for the locally defined symbol, foo. It improves PIE performance
>> by as much as 10%. You may want to implement it in gold. See
>> elf_x86_64_convert_mov_to_lea for details.
>
> Wow, this is cool! But, with copy relocations support for PIE, I think
> this should be fixed since the compiler can safely assume that the
> global is defined in the executable no matter what. Do you have an
> example where foo@GOTPCREL is still used for globals?
>
> foo.cc
> ---------
> #include <stdio.h>
> extern int a;
> int main()
> {
> printf("%p", (void *)&a);
> }
>
> Before the copy relocations support for PIE was checked in to GCC:
>
> foo.s
> ------
>
> ....
> movq a@GOTPCREL(%rip), %rax
> .....
>
> and after copyrelocs support:
>
> foo.s
> ------
>
> .......
> leaq a(%rip), %rsi
> ......
>
> Did I miss something?
>
>
If you don't have GOTPCREL relocations against locally
defined symbols, this optimization won't apply.
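
To make the conversion concrete, here is a rough sketch of the byte-level
rewrite. It is not the code in elf_x86_64_convert_mov_to_lea; the function
name convert_mov_to_lea and its parameters are made up for illustration.
It assumes the displacement offset points at the 32-bit field the
GOTPCREL relocation applies to, so the opcode sits two bytes before it.
The linker would also have to retarget the relocation at the symbol itself
(a PC-relative reloc instead of GOTPCREL); that part is not shown.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not the binutils function. Rewrites
//   mov foo@GOTPCREL(%rip), %reg   (opcode 0x8b)
// into
//   lea foo(%rip), %reg            (opcode 0x8d)
// in place. disp_offset is the offset of the 32-bit displacement field.
bool convert_mov_to_lea(std::vector<std::uint8_t>& code, std::size_t disp_offset)
{
    // The opcode and ModRM bytes precede the displacement, which is 4 bytes.
    if (disp_offset < 2 || disp_offset + 4 > code.size())
        return false;

    const std::uint8_t opcode = code[disp_offset - 2];
    const std::uint8_t modrm = code[disp_offset - 1];

    // Only a load (0x8b) using RIP-relative addressing
    // (ModRM mod == 00, r/m == 101) qualifies.
    if (opcode != 0x8b || (modrm & 0xc7) != 0x05)
        return false;

    // Same prefix, ModRM and destination register; only the opcode changes.
    code[disp_offset - 2] = 0x8d;
    return true;
}
```

For example, the bytes 48 8b 05 <disp32> (mov into %rax) become
48 8d 05 <disp32> (lea into %rax); anything that is not a RIP-relative
load is left untouched.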
--
H.J.