This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



Re: [PATCH] PPC atomic.h add compare_exchange_val forms


Kevin B. Hendricks writes:

> Yes, I have always wondered the same thing.  In fact, if any writes 
> to the same cache line (but not the same exact reserved address) also 
> clear the reservation, then you could certainly slow things down by 
> accessing items in the structure that might fall in the same cache 
> line as the atomic_t (for example even immediately *before* it in the 
> case of locks)

It's a good thing if the thread reading or writing the same cache line 
is the same thread that just got the lock (the lwarx pulled in the 
lock word along with the following, hopefully related, data). 

It's a bad thing if the data following the lock is unrelated.
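
Whether the data after a lock word ends up in the same line can be
checked from member offsets. The following is only a sketch: the
128-byte line size, both struct layouts, and every name in it are
assumptions made for illustration, and the check is valid only when
the enclosing object starts on a cache line boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache line size; 128 bytes is the PPC64 value used in this thread. */
#define SKETCH_LINE_SIZE 128

static_assert((SKETCH_LINE_SIZE & (SKETCH_LINE_SIZE - 1)) == 0,
              "line size is expected to be a power of two");

/* Hypothetical layout: the protected counter sits right after the lock word. */
struct packed_lock {
    int lock;
    unsigned int count;
};

/* Hypothetical layout: unrelated data is pushed onto the next line. */
struct split_lock {
    int lock;
    long unrelated __attribute__((__aligned__(SKETCH_LINE_SIZE)));
};

/* Nonzero when two member offsets fall in the same line, assuming the
   enclosing object itself starts on a line boundary. */
static int same_line(size_t a, size_t b)
{
    return a / SKETCH_LINE_SIZE == b / SKETCH_LINE_SIZE;
}

/* Lock and related counter: expected to share a line. */
int packed_shares_line(void)
{
    return same_line(offsetof(struct packed_lock, lock),
                     offsetof(struct packed_lock, count));
}

/* Lock and unrelated data: expected to land on different lines. */
int split_shares_line(void)
{
    return same_line(offsetof(struct split_lock, lock),
                     offsetof(struct split_lock, unrelated));
}
```

The aligned member trades memory for isolation: struct split_lock
grows to two full lines so that the unrelated field never shares a
line with the lock.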

The good news is that it's fairly simple to try this (at least for nptl). 
For example, in nptl/sysdeps/unix/sysv/linux/powerpc/bits/pthreadtypes.h 
I can add an __aligned__ attribute to each of the various lock 
structures (pthread_mutex_t, pthread_cond_t, pthread_rwlock_t, 
pthread_barrier_t). 

For example:

typedef union
{
  struct
  {
    int __lock;
    unsigned int __count;
    struct pthread *__owner;
    int __kind;
  } __data;
  char __size[__SIZEOF_PTHREAD_MUTEX_T];
  long int __align  __attribute__((__aligned__(__CACHE_ALIGN_SIZE)));
} pthread_mutex_t;

For PPC64 (__CACHE_ALIGN_SIZE = 128) the result is both a size and an 
alignment of 128 bytes. However:

typedef union
{
  struct
  {
    int __lock;
    unsigned int __count;
    struct pthread *__owner;
    int __kind;
  } __data;
  char __size[__SIZEOF_PTHREAD_MUTEX_T];
  long int __align;
} pthread_mutex_t  __attribute__((__aligned__(__CACHE_ALIGN_SIZE)));

This results in a size of 40 bytes and an alignment of 128 bytes.

The first case is good for avoiding false sharing in static storage, 
while the second case is better for grouping locks and related data within
larger structs. Both cases force cache line alignment within arrays.
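
The difference between the two forms can be reproduced outside glibc.
The sketch below uses stand-in names and constants (128 for the
alignment and 40 for the declared size, both assumed to match a
64-bit target) rather than the real pthreadtypes.h definitions, and
it relies on GCC's handling of the aligned attribute.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_CACHE_ALIGN 128   /* assumed, mirrors __CACHE_ALIGN_SIZE on PPC64 */
#define SKETCH_SIZEOF_MUTEX 40   /* assumed, mirrors __SIZEOF_PTHREAD_MUTEX_T */

struct sketch_thread;            /* opaque stand-in for struct pthread */

/* Case 1: attribute on a member, so the union itself is padded out. */
typedef union {
    struct {
        int lock;
        unsigned int count;
        struct sketch_thread *owner;
        int kind;
    } data;
    char size[SKETCH_SIZEOF_MUTEX];
    long int align __attribute__((__aligned__(SKETCH_CACHE_ALIGN)));
} mutex_member_aligned;

/* Case 2: attribute on the typedef, so only the alignment changes. */
typedef union {
    struct {
        int lock;
        unsigned int count;
        struct sketch_thread *owner;
        int kind;
    } data;
    char size[SKETCH_SIZEOF_MUTEX];
    long int align;
} mutex_type_aligned __attribute__((__aligned__(SKETCH_CACHE_ALIGN)));

static_assert(sizeof(mutex_member_aligned) == SKETCH_CACHE_ALIGN,
              "an aligned member rounds the union size up to the alignment");

size_t member_aligned_size(void)  { return sizeof(mutex_member_aligned); }
size_t member_aligned_align(void) { return __alignof__(mutex_member_aligned); }
size_t type_aligned_size(void)    { return sizeof(mutex_type_aligned); }
size_t type_aligned_align(void)   { return __alignof__(mutex_type_aligned); }
```

Some compiler versions may reject an array whose element type has an
alignment larger than its size, as in case 2, so the array behaviour
is worth checking on the toolchain actually in use.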

The only problem I found with this is that nptl/pthread_mutex_init.c
includes the following assert:


  assert (sizeof (pthread_mutex_t) <= __SIZEOF_PTHREAD_MUTEX_T);

This fails make check for the first case but works fine with the
second. I am not sure what this assert is for, though (pthread_cond_init.c, 
pthread_rwlock_init.c, and pthread_barrier_init.c don't do this).
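
Why the two forms differ under that check can be shown with reduced
stand-ins. This is a sketch only: the macro, the type names, and the
constants 40 and 128 are assumptions for illustration, not the glibc
definitions.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_ABI_SIZE 40   /* assumed stand-in for __SIZEOF_PTHREAD_MUTEX_T */
#define SKETCH_ALIGN 128     /* assumed stand-in for __CACHE_ALIGN_SIZE */

/* Nonzero when a type still fits the declared size, which is the
   condition the quoted assert tests. */
#define FITS_ABI_SIZE(type) (sizeof(type) <= SKETCH_ABI_SIZE)

/* Reduced form of the first case: aligned member inside the union. */
typedef union {
    char size[SKETCH_ABI_SIZE];
    long int align __attribute__((__aligned__(SKETCH_ALIGN)));
} member_form;

/* Reduced form of the second case: aligned attribute on the typedef. */
typedef union {
    char size[SKETCH_ABI_SIZE];
    long int align;
} type_form __attribute__((__aligned__(SKETCH_ALIGN)));

static_assert(!FITS_ABI_SIZE(member_form),
              "member-level alignment pads the union past the declared size");

int member_form_fits(void) { return FITS_ABI_SIZE(member_form); }
int type_form_fits(void)   { return FITS_ABI_SIZE(type_form); }
```

The first form grows to 128 bytes and so exceeds the declared 40,
while the second keeps its size and only raises the alignment.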

