This is the mail archive of the libc-ports@sources.redhat.com mailing list for the libc-ports project.



Re: [PATCH] ARM: Add pointer guard support.


On 09/25/2013 12:23 PM, Will Newton wrote:
> On 25 September 2013 17:09, Carlos O'Donell <carlos@redhat.com> wrote:
>> On 09/25/2013 05:06 AM, Will Newton wrote:
>>>
>>> Add support for pointer mangling in glibc internal structures in C
>>> and assembler code.
>>>
>>> Tested on armv7 with hard and soft thread pointers.
>>
>> Have you measured the performance versus using the existing
>> global variable?
> 
> No, but I'll put together a patch for that approach and see how it looks.
> 
>> TLS access on ARM is quite slow and it looks to me like it
>> may be faster to use the global variable. Keep in mind that
>> the pointer guard and stack guard do not vary by thread.
> 
> From a back-of-the-envelope calculation, accessing TLS is one cycle
> faster than accessing a global in the best case (e.g. Cortex-A15),
> considerably slower in the common case (50-60 cycles, Cortex-A9) and
> slower still in the worst case (a function call to libgcc and the
> kernel, ARMv4/ARMv5).
> 
> Pointer guard looks to be on slower code paths anyway as compared to
> stack guard, but you may be right that the global variable solution is
> the best way to go.

Thanks for exploring this solution.

>> 32-bit ARM currently uses a global variable, e.g.
>> __pointer_chk_guard; all you need to do to make it work
>> is adjust the definitions of PTR_MANGLE and PTR_DEMANGLE
>> to reference the global symbol.
>>
>> This is the second proposal for ARM (first was [1] for
>> AArch64) to support storing a guard in the TCB, but
>> nobody has responded yet to my question about performance.
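
For illustration, pointing the mangling macros at a process-wide guard
could look roughly like the sketch below. It assumes a plain XOR against
the guard. The names demo_pointer_guard, DEMO_PTR_MANGLE and
DEMO_PTR_DEMANGLE are hypothetical stand-ins, not the glibc definitions,
and the guard constant is arbitrary (a real guard would be randomized at
startup).

```c
#include <stdint.h>

/* Hypothetical stand-in for a process-wide guard such as
   __pointer_chk_guard.  The constant is arbitrary for this sketch.  */
static uintptr_t demo_pointer_guard = (uintptr_t) 0x5bd1e995u;

/* Assumed scheme: XOR the pointer with the global guard.  XOR is its
   own inverse, so demangling applies the same operation again.  */
#define DEMO_PTR_MANGLE(var) \
  ((var) = (__typeof__ (var)) ((uintptr_t) (var) ^ demo_pointer_guard))
#define DEMO_PTR_DEMANGLE(var) DEMO_PTR_MANGLE (var)

static int
demo_target (void)
{
  return 42;
}

/* Mangle a function pointer, check that the stored value differs from
   the real address, then demangle and call through it.  Returns 1 when
   the round trip restores a usable pointer.  */
int
demo_roundtrip (void)
{
  int (*fn) (void) = demo_target;

  DEMO_PTR_MANGLE (fn);
  int changed = (fn != demo_target);

  DEMO_PTR_DEMANGLE (fn);
  return changed && fn == demo_target && fn () == 42;
}
```

Because the guard is an ordinary global here, reading it costs one
ordinary load and needs no thread pointer access.
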
> 
> For AArch64 the equation is different: all AArch64 cores have a TLS
> register, and while it is not general purpose I suspect accessing it
> will be much faster than on the worst-performing 32-bit cores. I don't
> have any numbers, though.

I don't disagree with you, but I'd like to see some due diligence
in testing the two alternatives and reporting back the performance
numbers. You need not implement both; just test the two access methods
using a small test program and report the data.
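
A minimal sketch of such a test program might look like the following.
It is only an assumption about how the comparison could be set up: the
names guard_global, guard_tls, mangle_with_global, mangle_with_tls,
xor_chain, ns_per_access and report are made up for this example, it
relies on the GNU __thread and noinline extensions, and it times a
compiler-generated TLS access rather than glibc's own TCB access.

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Both guards are volatile so the compiler cannot fold the loads away.  */
static volatile uintptr_t guard_global = 0x1234u;
static __thread volatile uintptr_t guard_tls = 0x1234u;

/* noinline keeps each access, including any thread pointer read,
   inside the timed loop instead of being hoisted out of it.  */
__attribute__ ((noinline)) uintptr_t
mangle_with_global (uintptr_t p)
{
  return p ^ guard_global;
}

__attribute__ ((noinline)) uintptr_t
mangle_with_tls (uintptr_t p)
{
  return p ^ guard_tls;
}

/* Apply one access method repeatedly; each step depends on the last.  */
uintptr_t
xor_chain (uintptr_t (*mangle) (uintptr_t), uintptr_t p, unsigned long iters)
{
  for (unsigned long i = 0; i < iters; i++)
    p = mangle (p);
  return p;
}

/* Average CPU time per access in nanoseconds, call overhead included.  */
double
ns_per_access (uintptr_t (*mangle) (uintptr_t), unsigned long iters)
{
  clock_t start = clock ();
  volatile uintptr_t sink = xor_chain (mangle, 0xdeadbeefu, iters);
  clock_t end = clock ();
  (void) sink;
  return (double) (end - start) * 1e9 / CLOCKS_PER_SEC / (double) iters;
}

/* Print the per-access figure for both methods.  */
void
report (unsigned long iters)
{
  printf ("global: %.2f ns/access\n",
          ns_per_access (mangle_with_global, iters));
  printf ("TLS:    %.2f ns/access\n",
          ns_per_access (mangle_with_tls, iters));
}
```

Calling report (100000000UL) from main on the target prints the two
figures. Both paths pay the same call overhead, so the difference
between them is the interesting number. An executable will typically
get a cheaper TLS model than code inside a shared library, so the
result is indicative only.
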

Cheers,
Carlos.


