This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



[RFC] Clean up SSE variable shifts


(1) Instead of having the compiler generate a jump table,
use a computed branch inside inline assembly.
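To illustrate the idea (a sketch only, not the attached patch; assumes x86-64 and GCC-style extended asm), the table of fixed-size psrldq entries can live inside the asm itself, entered with a computed jump:

```c
#include <emmintrin.h>
#include <assert.h>

/* Sketch: variable psrldq via a computed branch into a table of
   16-byte entries generated with .rept, instead of letting the
   compiler emit a switch/jump table.  x86-64, GCC/Clang only.  */
static __m128i
psrldq_var (__m128i value, unsigned int n)
{
  n &= 15;
  __asm__ ("lea 2f(%%rip), %%rcx\n\t"  /* base of the entry table */
           "shl $4, %k1\n\t"           /* n *= 16, the entry stride */
           "add %q1, %%rcx\n\t"
           "jmp *%%rcx\n\t"            /* computed branch */
           ".balign 16\n"
           "2:\n\t"
           ".set i, 0\n\t"
           ".rept 16\n\t"
           ".balign 16\n\t"            /* pad each entry to 16 bytes */
           "psrldq $i, %0\n\t"
           "jmp 3f\n\t"
           ".set i, i+1\n\t"
           ".endr\n"
           "3:"
           : "+x" (value), "+r" (n)
           :
           : "rcx", "cc");
  return value;
}
```

Since the entries here end in a jump back to a local label rather than ret, no call is needed; padding each entry to 16 bytes keeps the index computation a single shift.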

It's tempting to actually share code here, and generate
the table out-of-line with entries like

	psrldq $1, %xmm0
	ret

and use call *%1 in the inline assembly.  Declaring

  register __m128i value __asm__("%xmm0");

would restrict the compiler to the single register
supported by the out-of-line table.  It doesn't look
like this would unduly hamper the compiler in the
places it's used.
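Concretely, the shared table plus pinned-register call might look something like this (a hypothetical sketch: psrldq_table is an invented name, and the sub/add of %rsp is only there because a call issued from inline asm would otherwise clobber the caller's red zone):

```c
#include <emmintrin.h>
#include <assert.h>

/* Sketch (x86-64, GCC extended asm): one out-of-line table of 16
   entries, 8 bytes each ("psrldq $i, %xmm0; ret"), shared by all
   call sites.  The name psrldq_table is invented for illustration. */
__asm__ (".pushsection .text\n\t"
         ".balign 16\n"
         "psrldq_table:\n\t"
         ".set i, 0\n\t"
         ".rept 16\n\t"
         ".balign 8\n\t"          /* 8-byte entries, as in the mail */
         "psrldq $i, %xmm0\n\t"
         "ret\n\t"
         ".set i, i+1\n\t"
         ".endr\n\t"
         ".popsection");

static __m128i
psrldq_call (__m128i v, unsigned int n)
{
  /* Pin the value to %xmm0, the register the table operates on.  */
  register __m128i value __asm__ ("xmm0") = v;
  const unsigned char *entry;
  __asm__ ("lea psrldq_table(%%rip), %0" : "=r" (entry));
  entry += 8 * (n & 15);
  /* Step over the red zone: the call pushes a return address below
     %rsp, which would otherwise smash the caller's scratch area.  */
  __asm__ ("sub $128, %%rsp\n\t"
           "call *%1\n\t"
           "add $128, %%rsp"
           : "+x" (value)
           : "r" (entry)
           : "cc");
  return value;
}
```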

There are currently 5 copies of this jump table in libc.
We'd save 4*8*16 = 512 bytes of code space with the 
out-of-line version.

(2) The two instances of jump tables involving palignr
can be done just as easily by re-reading the data via
an unaligned load.  From a hot cache, that's surely
faster than anything else we could do here.
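For illustration, what palignr computes with an immediate — the 16 bytes at byte offset n within the 32-byte concatenation hi:lo — can be had from a store plus an unaligned re-read.  (Sketch only; in the actual string routines the unaligned load would simply re-read the source buffer at the misaligned address, with no bounce buffer.)

```c
#include <emmintrin.h>
#include <assert.h>

/* Sketch: variable-count palignr replaced by an unaligned load.
   Result = bytes [n, n+15] of the concatenation hi:lo, lo low.  */
static __m128i
palignr_var (__m128i hi, __m128i lo, unsigned int n)
{
  unsigned char buf[32];
  _mm_storeu_si128 ((__m128i *) buf, lo);
  _mm_storeu_si128 ((__m128i *) (buf + 16), hi);
  return _mm_loadu_si128 ((const __m128i *) (buf + (n & 15)));
}
```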

Thoughts?


r~

Attachment: z
Description: Text document

