This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH v4] Make bindresvport() function to multithread-safe


On 10/01/2012 05:41 PM, Torvald Riegel wrote:
> Peng, I've seen that you've been using perf already to get the
> performance numbers.  What about comparing cache-misses too?  Perhaps
> that could explain why you're seeing improvements when you make things
> thread-safe by separating state for different threads.
> 

I have added cache-misses to the measurements.

The single-threaded test results are as follows.
Before the patch, running the test program with 500 iterations:

# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_test > /dev/null

 Performance counter stats for './bindresvport_test' (100 runs):

         5,132,667 instructions              #    0.00  insns per cycle          ( +-  0.15% )
             6,954 cache-misses                                                  ( +-  6.73% )

       0.002997813 seconds time elapsed                                          ( +-  0.94% )

# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_test > /dev/null

 Performance counter stats for './bindresvport_test' (100 runs):

         5,138,553 instructions              #    0.00  insns per cycle          ( +-  0.15% )
             6,440 cache-misses                                                  ( +-  6.48% )

       0.003003190 seconds time elapsed                                          ( +-  0.96% )

# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_test > /dev/null

 Performance counter stats for './bindresvport_test' (100 runs):

         5,135,453 instructions              #    0.00  insns per cycle          ( +-  0.15% )
             5,914 cache-misses                                                  ( +-  6.88% )

       0.003010335 seconds time elapsed                                          ( +-  0.98% )

# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_test > /dev/null

 Performance counter stats for './bindresvport_test' (100 runs):

         5,119,885 instructions              #    0.00  insns per cycle          ( +-  0.21% )
             6,367 cache-misses                                                  ( +-  6.40% )

       0.003011351 seconds time elapsed                                          ( +-  1.07% )

# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_test > /dev/null

 Performance counter stats for './bindresvport_test' (100 runs):

         5,117,720 instructions              #    0.00  insns per cycle          ( +-  0.20% )
             5,827 cache-misses                                                  ( +-  7.00% )

       0.002979921 seconds time elapsed                                          ( +-  1.15% )


After the patch, running the test program with 500 iterations:

# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_test > /dev/null

 Performance counter stats for './bindresvport_test' (100 runs):

         5,177,151 instructions              #    0.00  insns per cycle          ( +-  0.19% )
             6,629 cache-misses                                                  ( +-  5.45% )

       0.002982136 seconds time elapsed                                          ( +-  1.04% )

# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_test > /dev/null

 Performance counter stats for './bindresvport_test' (100 runs):

         5,201,042 instructions              #    0.00  insns per cycle          ( +-  0.15% )
             5,871 cache-misses                                                  ( +-  7.03% )

       0.002994608 seconds time elapsed                                          ( +-  1.15% )

# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_test > /dev/null

 Performance counter stats for './bindresvport_test' (100 runs):

         5,208,919 instructions              #    0.00  insns per cycle          ( +-  0.15% )
             5,819 cache-misses                                                  ( +-  6.60% )

       0.003030730 seconds time elapsed                                          ( +-  0.98% )

# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_test > /dev/null

 Performance counter stats for './bindresvport_test' (100 runs):

         5,190,925 instructions              #    0.00  insns per cycle          ( +-  0.20% )
             6,272 cache-misses                                                  ( +-  5.87% )

       0.003025345 seconds time elapsed                                          ( +-  1.04% )

# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_test > /dev/null

 Performance counter stats for './bindresvport_test' (100 runs):

         5,205,173 instructions              #    0.00  insns per cycle          ( +-  0.15% )
             5,515 cache-misses                                                  ( +-  7.45% )

       0.003008636 seconds time elapsed                                          ( +-  1.10% )


After many test runs, single-threaded performance is almost the same
before and after the patch.
Cache misses do not explain any change in performance.
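
To make the comparison concrete, here is a small sketch that averages the
five single-threaded runs above and reports the relative change of the mean.
The helper names (mean, mean_change_pct) and the array names are made up for
this illustration; the numbers are copied from the perf stat output above.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples (n must be non-zero). */
double mean(const double *v, size_t n)
{
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}

/* Relative change of the mean in percent; positive means "after" is larger. */
double mean_change_pct(const double *before, const double *after, size_t n)
{
    double b = mean(before, n);
    return (mean(after, n) - b) / b * 100.0;
}

/* Single-threaded figures from the five runs before and after the patch. */
static const double insns_before[]  = { 5132667, 5138553, 5135453, 5119885, 5117720 };
static const double insns_after[]   = { 5177151, 5201042, 5208919, 5190925, 5205173 };
static const double misses_before[] = { 6954, 6440, 5914, 6367, 5827 };
static const double misses_after[]  = { 6629, 5871, 5819, 6272, 5515 };
```

Calling mean_change_pct(insns_before, insns_after, 5) gives an increase of
roughly 1.3% in instructions, and mean_change_pct(misses_before,
misses_after, 5) gives a decrease of roughly 4.4% in cache-misses, which is
within the +-6% to +-7% variation that perf reports for that counter.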


The multi-threaded test results are as follows.
Before the patch, running the test program with 500 iterations:

# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_mul_test > /dev/null
bindresvport: Address already in use
...

 Performance counter stats for './bindresvport_mul_test' (100 runs):

         7,124,908 instructions              #    0.00  insns per cycle          ( +-  0.66% )
            20,311 cache-misses                                                  ( +-  7.19% )

       0.002589529 seconds time elapsed                                          ( +-  2.94% )

# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_mul_test > /dev/null
bindresvport: Address already in use
...
 Performance counter stats for './bindresvport_mul_test' (100 runs):

         7,117,596 instructions              #    0.00  insns per cycle          ( +-  0.72% )
            19,335 cache-misses                                                  ( +-  6.44% )

       0.002545734 seconds time elapsed                                          ( +-  3.43% )

# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_mul_test > /dev/null
bindresvport: Address already in use
...

 Performance counter stats for './bindresvport_mul_test' (100 runs):

         7,099,283 instructions              #    0.00  insns per cycle          ( +-  0.63% )
            18,450 cache-misses                                                  ( +-  5.19% )

       0.002408456 seconds time elapsed                                          ( +-  2.70% )

# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_mul_test > /dev/null
bindresvport: Address already in use
...

 Performance counter stats for './bindresvport_mul_test' (100 runs):

         7,082,442 instructions              #    0.00  insns per cycle          ( +-  0.62% )
            18,325 cache-misses                                                  ( +-  5.91% )

       0.002463885 seconds time elapsed                                          ( +-  2.20% )

# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_mul_test > /dev/null
bindresvport: Address already in use
...

 Performance counter stats for './bindresvport_mul_test' (100 runs):

         7,092,449 instructions              #    0.00  insns per cycle          ( +-  0.71% )
            20,238 cache-misses                                                  ( +-  5.90% )

       0.002444265 seconds time elapsed                                          ( +-  2.27% )


After the patch, running the test program with 500 iterations:

# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_mul_test > /dev/null
bindresvport: Address already in use
bindresvport: Address already in use
...

 Performance counter stats for './bindresvport_mul_test' (100 runs):

         9,070,761 instructions              #    0.00  insns per cycle          ( +-  0.75% )
            19,888 cache-misses                                                  ( +-  5.56% )

       0.002591574 seconds time elapsed                                          ( +-  2.56% )

# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_mul_test > /dev/null
bindresvport: Address already in use
...

 Performance counter stats for './bindresvport_mul_test' (100 runs):

         9,182,530 instructions              #    0.00  insns per cycle          ( +-  0.72% )
            20,737 cache-misses                                                  ( +-  6.41% )

       0.002552639 seconds time elapsed                                          ( +-  3.04% )

# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_mul_test > /dev/null
bindresvport: Address already in use
...

 Performance counter stats for './bindresvport_mul_test' (100 runs):

         9,059,625 instructions              #    0.00  insns per cycle          ( +-  0.82% )
            20,460 cache-misses                                                  ( +-  5.80% )

       0.002610346 seconds time elapsed                                          ( +-  2.85% )

# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_mul_test > /dev/null
bindresvport: Address already in use
...

 Performance counter stats for './bindresvport_mul_test' (100 runs):

         9,106,997 instructions              #    0.00  insns per cycle          ( +-  0.78% )
            18,848 cache-misses                                                  ( +-  3.09% )

       0.002621884 seconds time elapsed                                          ( +-  2.54% )

# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_mul_test > /dev/null
bindresvport: Address already in use
...

 Performance counter stats for './bindresvport_mul_test' (100 runs):

         8,984,585 instructions              #    0.00  insns per cycle          ( +-  0.80% )
            18,455 cache-misses                                                  ( +-  3.09% )

       0.002436007 seconds time elapsed                                          ( +-  2.88% )



After many test runs, multi-threaded performance goes down a little :(
I think the reason is as follows:
Before the patch, getpid() is called only once in the multi-threaded case.
After the patch, gettid() is called once per thread in the multi-threaded
case, i.e. as many times as there are threads.
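
To illustrate the difference in call counts, here is a rough sketch, not the
patch itself. It assumes the starting port is kept either in one shared
variable seeded from getpid() or in a per-thread variable seeded from
gettid(). All names and the port range below are made up for the example,
and pthread_once() is used only to keep the shared variant of the demo free
of data races.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Illustrative port range; these values are assumptions, not from the patch. */
enum { START_PORT = 600, END_PORT = 1023, N_PORTS = END_PORT - START_PORT + 1 };
enum { MAX_THREADS = 64, CALLS_PER_THREAD = 500 };

/* Counters showing how often each seeding path runs. */
static atomic_int getpid_calls;
static atomic_int gettid_calls;

/* Shared state: one starting port for the whole process, seeded once. */
static int shared_port;
static pthread_once_t shared_once = PTHREAD_ONCE_INIT;

static void seed_shared(void)
{
    atomic_fetch_add(&getpid_calls, 1);
    shared_port = (int)(getpid() % N_PORTS) + START_PORT;
}

static int start_port_shared(void)
{
    pthread_once(&shared_once, seed_shared);
    return shared_port;
}

/* Per-thread state: every thread seeds its own starting port on first use. */
static _Thread_local int thread_port;   /* 0 means "not seeded yet" */

static int start_port_per_thread(void)
{
    if (thread_port == 0) {
        atomic_fetch_add(&gettid_calls, 1);
        pid_t tid = (pid_t)syscall(SYS_gettid);   /* Linux-specific */
        thread_port = (int)(tid % N_PORTS) + START_PORT;
    }
    return thread_port;
}

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < CALLS_PER_THREAD; i++) {
        (void)start_port_shared();
        (void)start_port_per_thread();
    }
    return NULL;
}

/* Starts nthreads workers and waits for them; returns 0 on success. */
int run_workers(int nthreads)
{
    pthread_t tids[MAX_THREADS];
    int started = 0;

    assert(nthreads >= 1 && nthreads <= MAX_THREADS);
    for (; started < nthreads; started++)
        if (pthread_create(&tids[started], NULL, worker, NULL) != 0)
            break;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return started == nthreads ? 0 : -1;
}
```

After run_workers(8), getpid_calls is 1 and gettid_calls is 8: the shared
variant seeds once per process, the per-thread variant once per thread.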

> Also, if possible, I think it's often better to make a hypothesis why
> performance would change this or that way due to a patch, and then try
> to validate this hypothesis with measurements.  If you just look at
> coarse numbers for a certain test case, it might be true that the patch
> makes the particular test case on your particular test machine faster,
> but then you don't know why; it could be the case that this is just an
> outlier, and the patch isn't improving performance in general.
> 

My hypothesis is that performance will stay almost the same after the patch.

Thanks.

-- 
Best Regards,
Peng

