This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

x86 TCB_ALIGNMENT to 64 for cache-line alignment


[I sent this to libc-hacker first, but it didn't appear.
I think I may have fallen off that list.]

On Intel Atom chips, it's documented in the Intel optimization manual that
there is something like a nine-cycle penalty for every access using a
segment override (i.e. %fs:... and %gs:...) when the segment's base is not
aligned to the size of a cache line.  Our settings for TCB_ALIGNMENT are
less than the size of cache lines on current Atom chips, which is 64.

If nobody has a strenuous objection, I'll put in the following change (that
is, merge the roland/x86-tcb-align branch).  This is simple and costs only
a little per-thread memory bloat (48 bytes).  I'm told there is no reason
to suspect that cache lines will grow larger any year soon.  If we were
concerned either about future larger cache line sizes or about minimizing
the small memory bloat on older chips where cache lines are smaller, we
could pretty easily make this alignment decision be dynamic based on the
existing cpuid detection of cache line sizes.  But to me it doesn't seem
worth the complexity.

Ok?


Thanks,
Roland


nptl/
2011-08-14  Roland McGrath  <roland@hack.frob.com>

	* sysdeps/i386/pthreaddef.h (TCB_ALIGNMENT): Set to 64, optimal on Atom.
	* sysdeps/x86_64/pthreaddef.h (TCB_ALIGNMENT): Likewise.

diff --git a/nptl/sysdeps/i386/pthreaddef.h b/nptl/sysdeps/i386/pthreaddef.h
index 43b771c..1e84066 100644
--- a/nptl/sysdeps/i386/pthreaddef.h
+++ b/nptl/sysdeps/i386/pthreaddef.h
@@ -1,4 +1,4 @@
-/* Copyright (C) 2002, 2003 Free Software Foundation, Inc.
+/* Copyright (C) 2002,2003,2011 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
    Contributed by Ulrich Drepper <drepper@redhat.com>, 2002.
 
@@ -27,8 +27,14 @@
 /* Minimal stack size after allocating thread descriptor and guard size.  */
 #define MINIMAL_REST_STACK	2048
 
-/* Alignment requirement for TCB.  */
-#define TCB_ALIGNMENT		16
+/* Alignment requirement for TCB.
+
+   Some processors such as Intel Atom pay a big penalty on every
+   access using a segment override if that segment's base is not
+   aligned to the size of a cache line.  (See Intel 64 and IA-32
+   Architectures Optimization Reference Manual, section 13.3.3.3,
+   "Segment Base".)  On such machines, a cache line is 64 bytes.  */
+#define TCB_ALIGNMENT		64
 
 
 /* Location of current stack frame.  */
diff --git a/nptl/sysdeps/x86_64/pthreaddef.h b/nptl/sysdeps/x86_64/pthreaddef.h
index 8ec135c..9de4af2 100644
--- a/nptl/sysdeps/x86_64/pthreaddef.h
+++ b/nptl/sysdeps/x86_64/pthreaddef.h
@@ -1,4 +1,4 @@
-/* Copyright (C) 2002, 2003, 2007 Free Software Foundation, Inc.
+/* Copyright (C) 2002,2003,2007,2011 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
    Contributed by Ulrich Drepper <drepper@redhat.com>, 2002.
 
@@ -27,9 +27,17 @@
 /* Minimal stack size after allocating thread descriptor and guard size.  */
 #define MINIMAL_REST_STACK	2048
 
-/* Alignment requirement for TCB.  Need to store post-AVX vector registers
-   in the TCB and we want the storage to be aligned at 32-byte.  */
-#define TCB_ALIGNMENT		32
+/* Alignment requirement for TCB.
+
+   We need to store post-AVX vector registers in the TCB and we want the
+   storage to be aligned to at least 32 bytes.
+
+   Some processors such as Intel Atom pay a big penalty on every
+   access using a segment override if that segment's base is not
+   aligned to the size of a cache line.  (See Intel 64 and IA-32
+   Architectures Optimization Reference Manual, section 13.3.3.3,
+   "Segment Base".)  On such machines, a cache line is 64 bytes.  */
+#define TCB_ALIGNMENT		64
 
 
 /* Location of current stack frame.  The frame pointer is not usable.  */


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]