This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



Re: New .nops directive, to aid Linux alternatives patching?


On Sat, Feb 10, 2018 at 9:22 AM, Andrew Cooper
<andrew.cooper3@citrix.com> wrote:
> On 10/02/18 15:44, H.J. Lu wrote:
>> On Fri, Feb 9, 2018 at 5:29 AM, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>> On 09/02/18 11:55, H.J. Lu wrote:
>>>> On Fri, Feb 9, 2018 at 3:35 AM, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>>>> On 09/02/18 02:22, H.J. Lu wrote:
>>>>>> On Thu, Feb 8, 2018 at 5:14 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>>>> On Thu, Feb 8, 2018 at 4:45 PM, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>>>>>>> On 09/02/2018 00:24, H.J. Lu wrote:
>>>>>>>>> On Thu, Feb 8, 2018 at 3:47 PM, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>>>>>>>>> On 08/02/2018 20:36, H.J. Lu wrote:
>>>>>>>>>>> On Thu, Feb 8, 2018 at 12:33 PM, Andrew Cooper
>>>>>>>>>>> <andrew.cooper3@citrix.com> wrote:
>>>>>>>>>>>> On 08/02/2018 20:28, H.J. Lu wrote:
>>>>>>>>>>>>> On Thu, Feb 8, 2018 at 12:27 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>>>>>>>>>>> On Thu, Feb 8, 2018 at 12:18 PM, Andrew Cooper
>>>>>>>>>>>>>> <andrew.cooper3@citrix.com> wrote:
>>>>>>>>>>>>>>> On 08/02/2018 20:10, H.J. Lu wrote:
>>>>>>>>>>>>>>>> On Thu, Feb 8, 2018 at 11:26 AM, Andrew Cooper
>>>>>>>>>>>>>>>> <andrew.cooper3@citrix.com> wrote:
>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I realise this is a little bit niche, but how feasible would it be to
>>>>>>>>>>>>>>>>> introduce a new .nops directive which takes a size parameter, and
>>>>>>>>>>>>>>>>> outputs long nops covering the number of specified bytes?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sounds to me you want a pseudo NOP instruction:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> pseudo-NOP N
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> which generates a long NOP of N bytes.  Is that correct?  If so,
>>>>>>>>>>>>>>>> what is the range of N?
>>>>>>>>>>>>>>> Currently 255 based on other implementation limits, and I expect that
>>>>>>>>>>>>>>> ought to be long enough for anyone.  There is one existing user for
>>>>>>>>>>>>>>> N=43, and I expect that to grow a bit.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The real answer probably depends on the point at which it is more
>>>>>>>>>>>>>>> efficient to jmp than to waste decode bandwidth decoding nops.  I don't
>>>>>>>>>>>>>>> know the answer, but expect that it isn't larger than 255.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> How about
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> {nop} N
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If N is less than 15 bytes, it generates a long nop.   Otherwise, we use a jump
>>>>>>>>>>>>>> instruction over nops.  Does it work for you?
>>>>>>>>>>>>> N will be limited to 255.
>>>>>>>>>>>> Do you mean up to 255 bytes of adjacent long nops, or still a jump if
>>>>>>>>>>>> over 15 bytes?  For alternatives in the range of 15-30, a jmp is almost
>>>>>>>>>>>> certainly slower than executing through the nops.  The ORM isn't clear
>>>>>>>>>>>> where the split lies, and I expect it is very uarch specific.
>>>>>>>>>>> How about this
>>>>>>>>>>>
>>>>>>>>>>> {nop} N, L
>>>>>>>>>>> {nop} N
>>>>>>>>>>>
>>>>>>>>>>> N is <= 255.  If L is missing, L is 15.
>>>>>>>>>>>
>>>>>>>>>>> If N < L then
>>>>>>>>>>>   Long NOPs up to N bytes
>>>>>>>>>>> else
>>>>>>>>>>>   jmp + long nops up to N bytes.
>>>>>>>>>>> fi
>>>>>>>>>> I'm afraid that I don't think that will be very helpful in that form.
>>>>>>>>>> Are there technical reasons why you don't want to emit more than a
>>>>>>>>>> single 15-byte long nop?
>>>>>>>>>>
>>>>>>>>> Doesn't
>>>>>>>>>
>>>>>>>>> {nop} 28, 40
>>>>>>>>>
>>>>>>>>> generate 2 x 14-byte nops?
>>>>>>>> By the above logic, yes.  I still don't see the value in the L
>>>>>>>> parameter, because I don't expect an average programmer to know how to
>>>>>>>> choose it sensibly.  Then again, a compiler generating code for a
>>>>>>>> specified uarch probably could have some idea of what value to feed in.
>>>>>>>>
>>>>>>>> If the semantics were a little more like:
>>>>>>>>
>>>>>>>> {nop} N => N bytes of nops with no jumps
>>>>>>>> {nop} N, L => as above
>>>>>>>>
>>>>>>>> Then this might be more useful.
>>>>>>>>
>>>>>>>> I expect N will typically be an expression rather than an absolute
>>>>>>>> number, because the use case I've proposed is for filling in a specific,
>>>>>>>> calculated number of bytes.  (In particular, what commonly happens is
>>>>>>>> that memory references in alternatives are the thing which causes the
>>>>>>>> exact length to fluctuate.)  When there is a sensible uarch value for L,
>>>>>>>> that can be fed in, but shouldn't be mandatory.  In particular, if it is
>>>>>>>> unknown, 15 is almost certainly the wrong default for it.
>>>>>>> So, you want
>>>>>>>
>>>>>>> .nop SIZE
>>>>>>>
>>>>>>> and
>>>>>>>
>>>>>>> .jump SIZE
>>>>>>>
>>>>>>> which are similar to '.skip SIZE , FILL'.  But they fill SIZE with nops or
>>>>>>> jmp + nops.
>>>>>>>
>>>>>> Or
>>>>>>
>>>>>> .nop SIZE, JUMP_SIZE
>>>>>>
>>>>>> If SIZE < JUMP_SIZE then
>>>>>>   SIZE of nops.
>>>>>> else
>>>>>>   SIZE of jmp + nops.
>>>>>> fi
>>>>> I'm still not sure why you want the jump functionality in the first
>>>>> place, but yes - this latest option would work.
>>>>>
>>>>> FWIW, jumping over code with alternatives is typically done like:
>>>>>
>>>>> ALTERNATIVE "jmp .L\@_skip", "", FEATURE_X
>>>>> ...
>>>>> .L\@_skip:
>>>>>
>>>>> At which point it is only the 2 or 5 byte jmp which is being
>>>>> dynamically modified.  The converse case is where we begin with 2 or 5
>>>>> bytes of nops, and dynamically insert the jmp.
>>>>>
>>>>> If we're in the line for other related feature requests, how about being
>>>>> able to optionally specify the maximum length of individual nops?  e.g.
>>>>>
>>>>> .nop SIZE [, MAX_NOP = 9 [, JUMP_SIZE = -1]]
>>>> OK, let's go with
>>>>
>>>>  .nop SIZE [, MAX_NOP = 9]
>>>>
>>>> It is easier to implement with 2 arguments.   MAX_NOP must be a constant.
>>> Sounds good to me.
>> Please try users/hjl/nop branch:
>>
>> https://github.com/hjl-tools/binutils-gdb/tree/users/hjl/nop
>
> Oh - thank you!  I was about to ask if there were any pointers to get
> started hacking on binutils.
>
> As for the functionality, there are unfortunately some issues.  Given
> this source:
>
>         .text
> single:
>         nop
>
> pseudo_1:
>         .nop 1
>
> pseudo_8:
>         .nop 8
>
> pseudo_8_4:
>         .nop 8, 4
>
> pseudo_20:
>         .nop 20
>
> I get the following disassembly:
>
> 0000000000000000 <single>:
>    0:    90                       nop
>
> 0000000000000001 <pseudo_1>:
>    1:    66 90                    xchg   %ax,%ax
>
> 0000000000000003 <pseudo_8>:
>    3:    66 0f 1f 84 00 00 00     nopw   0x0(%rax,%rax,1)
>    a:    00 00
>
> 000000000000000c <pseudo_8_4>:
>    c:    90                       nop
>    d:    0f 1f 40 00              nopl   0x0(%rax)
>   11:    0f 1f 40 00              nopl   0x0(%rax)
>
> 0000000000000015 <pseudo_20>:
>   15:    90                       nop
>   16:    66 2e 0f 1f 84 00 00     nopw   %cs:0x0(%rax,%rax,1)
>   1d:    00 00 00
>   20:    66 2e 0f 1f 84 00 00     nopw   %cs:0x0(%rax,%rax,1)
>   27:    00 00 00
>
> The MAX_NOP part looks to be working as intended (including reducing
> below the default of 10), but there appears to be an off-by-one
> somewhere, as one byte too many is emitted in each block.
>
> Furthermore, attempting to use .nop 30 yields:
>
> /tmp/ccI2Eakp.s: Assembler messages:
> /tmp/ccI2Eakp.s: Fatal error: can't write 145268933551616 bytes to
> section .text of nops.o: 'Bad value'

Please try my branch again.  It should be fixed.


-- 
H.J.
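
For readers following the off-by-one report above, here is a minimal Python sketch of what `.nop SIZE [, MAX_NOP]` is expected to produce: a run of nops, each at most MAX_NOP bytes, whose lengths sum to exactly SIZE. It is not the gas implementation. The helper name `split_nops`, the default of 10 (taken from the "default of 10" remark in the quoted message), and the choice to put the short remainder first are all assumptions for illustration. The observed lengths are derived from the symbol addresses in the quoted disassembly.

```python
def split_nops(size, max_nop=10):
    """Return the byte lengths of individual nops covering exactly `size` bytes.

    Hypothetical model only: each nop is at most `max_nop` bytes long, and
    any short remainder nop is placed first (an assumption, not gas policy).
    """
    if size < 0 or max_nop < 1:
        raise ValueError("size must be >= 0 and max_nop must be >= 1")
    full, rest = divmod(size, max_nop)
    return ([rest] if rest else []) + [max_nop] * full


# (SIZE, MAX_NOP) -> bytes actually emitted, read off the symbol addresses
# in the quoted disassembly (e.g. pseudo_8 spans 0x3..0xc, i.e. 9 bytes).
observed = {(1, 10): 2, (8, 10): 9, (8, 4): 9, (20, 10): 21}

for (size, max_nop), seen in observed.items():
    parts = split_nops(size, max_nop)
    print(f".nop {size}, {max_nop}: expected {parts} = {sum(parts)} bytes, "
          f"observed {seen} bytes (excess {seen - sum(parts)})")
```

Under this model every block in the report is one byte longer than requested, which matches the off-by-one described in the quoted message.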

