This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH v4] Make bindresvport() function to multithread-safe
- From: Peng Haitao <penght at cn dot fujitsu dot com>
- To: Torvald Riegel <triegel at redhat dot com>
- Cc: "Carlos O'Donell" <carlos at systemhalted dot org>, Roland McGrath <roland at hack dot frob dot com>, Pedro Alves <palves at redhat dot com>, GNU C Library <libc-alpha at sourceware dot org>, Andreas Jaeger <aj at suse dot com>
- Date: Tue, 16 Oct 2012 12:05:14 +0800
- Subject: Re: [PATCH v4] Make bindresvport() function to multithread-safe
- References: <1348823725-18793-1-git-send-email-penght@cn.fujitsu.com> <CAE2sS1hJLkePJXMw8wCXQ48einUnvjPSbjX11LMzCVvT3i3zZg@mail.gmail.com> <5065B74C.8090704@redhat.com> <5065B818.2020908@systemhalted.org> <5065C207.2020709@redhat.com> <CAE2sS1jK77AOr9dUP+1ri_aNxHQxajruxOTF3yydZ7hOJ2wW4A@mail.gmail.com> <5065C886.6020909@redhat.com> <CAE2sS1jL=3PgZJogO1rvL5LQBgt37RWEcYs9hQ-MRSak3JZN1w@mail.gmail.com> <20120928162506.8A2572C074@topped-with-meat.com> <CAE2sS1hkkUBxssFOGc0oghKLc+Syc6KG8-T9o8TOEwaL9dGjoA@mail.gmail.com> <20120928163408.874942C091@topped-with-meat.com> <CAE2sS1iwLkNYRVu8S+shHNdvaX-mND6zYQxtipkXnxTk4OK8bw@mail.gmail.com> <1349084461.3374.5880.camel@triegel.csb>
On 10/01/2012 05:41 PM, Torvald Riegel wrote:
> Peng, I've seen that you've been using perf already to get the
> performance numbers. What about comparing cache-misses too? Perhaps
> that could explain why you're seeing improvements when you make things
> thread-safe by separating state for different threads.
>
I added the cache-misses event.
The single-threaded test results are as follows:
Before the patch, executing the test program 500 times:
# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_test > /dev/null
Performance counter stats for './bindresvport_test' (100 runs):
5,132,667 instructions # 0.00 insns per cycle ( +- 0.15% )
6,954 cache-misses ( +- 6.73% )
0.002997813 seconds time elapsed ( +- 0.94% )
# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_test > /dev/null
Performance counter stats for './bindresvport_test' (100 runs):
5,138,553 instructions # 0.00 insns per cycle ( +- 0.15% )
6,440 cache-misses ( +- 6.48% )
0.003003190 seconds time elapsed ( +- 0.96% )
# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_test > /dev/null
Performance counter stats for './bindresvport_test' (100 runs):
5,135,453 instructions # 0.00 insns per cycle ( +- 0.15% )
5,914 cache-misses ( +- 6.88% )
0.003010335 seconds time elapsed ( +- 0.98% )
# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_test > /dev/null
Performance counter stats for './bindresvport_test' (100 runs):
5,119,885 instructions # 0.00 insns per cycle ( +- 0.21% )
6,367 cache-misses ( +- 6.40% )
0.003011351 seconds time elapsed ( +- 1.07% )
# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_test > /dev/null
Performance counter stats for './bindresvport_test' (100 runs):
5,117,720 instructions # 0.00 insns per cycle ( +- 0.20% )
5,827 cache-misses ( +- 7.00% )
0.002979921 seconds time elapsed ( +- 1.15% )
After the patch, executing the test program 500 times:
# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_test > /dev/null
Performance counter stats for './bindresvport_test' (100 runs):
5,177,151 instructions # 0.00 insns per cycle ( +- 0.19% )
6,629 cache-misses ( +- 5.45% )
0.002982136 seconds time elapsed ( +- 1.04% )
# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_test > /dev/null
Performance counter stats for './bindresvport_test' (100 runs):
5,201,042 instructions # 0.00 insns per cycle ( +- 0.15% )
5,871 cache-misses ( +- 7.03% )
0.002994608 seconds time elapsed ( +- 1.15% )
# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_test > /dev/null
Performance counter stats for './bindresvport_test' (100 runs):
5,208,919 instructions # 0.00 insns per cycle ( +- 0.15% )
5,819 cache-misses ( +- 6.60% )
0.003030730 seconds time elapsed ( +- 0.98% )
# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_test > /dev/null
Performance counter stats for './bindresvport_test' (100 runs):
5,190,925 instructions # 0.00 insns per cycle ( +- 0.20% )
6,272 cache-misses ( +- 5.87% )
0.003025345 seconds time elapsed ( +- 1.04% )
# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_test > /dev/null
Performance counter stats for './bindresvport_test' (100 runs):
5,205,173 instructions # 0.00 insns per cycle ( +- 0.15% )
5,515 cache-misses ( +- 7.45% )
0.003008636 seconds time elapsed ( +- 1.10% )
After testing many times, the single-threaded performance stays almost the same.
The cache-misses numbers cannot explain any change in performance.
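To make the single-threaded comparison concrete, the per-event means of the five runs on each side can be computed directly. This is a small Python sketch, not part of the patch or the test program; the numbers are copied from the perf output above, and `relative_change` is a helper name made up for this example.

```python
from statistics import mean

# Single-threaded perf stat results copied from the runs above.
before = {
    "instructions": [5132667, 5138553, 5135453, 5119885, 5117720],
    "cache-misses": [6954, 6440, 5914, 6367, 5827],
}
after = {
    "instructions": [5177151, 5201042, 5208919, 5190925, 5205173],
    "cache-misses": [6629, 5871, 5819, 6272, 5515],
}


def relative_change(old, new):
    """Percent change of the mean of `new` relative to the mean of `old`."""
    return (mean(new) - mean(old)) / mean(old) * 100.0


for event in before:
    print(
        f"{event}: {mean(before[event]):.1f} -> {mean(after[event]):.1f} "
        f"({relative_change(before[event], after[event]):+.2f}%)"
    )
```

On these numbers the instruction count rises by a little over 1%, while the cache-misses mean moves by less than the +- 5% to 7% that perf reports for each individual run.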
The multi-threaded test results are as follows:
Before the patch, executing the test program 500 times:
# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_mul_test > /dev/null
bindresvport: Address already in use
...
Performance counter stats for './bindresvport_mul_test' (100 runs):
7,124,908 instructions # 0.00 insns per cycle ( +- 0.66% )
20,311 cache-misses ( +- 7.19% )
0.002589529 seconds time elapsed ( +- 2.94% )
# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_mul_test > /dev/null
bindresvport: Address already in use
...
Performance counter stats for './bindresvport_mul_test' (100 runs):
7,117,596 instructions # 0.00 insns per cycle ( +- 0.72% )
19,335 cache-misses ( +- 6.44% )
0.002545734 seconds time elapsed ( +- 3.43% )
# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_mul_test > /dev/null
bindresvport: Address already in use
...
Performance counter stats for './bindresvport_mul_test' (100 runs):
7,099,283 instructions # 0.00 insns per cycle ( +- 0.63% )
18,450 cache-misses ( +- 5.19% )
0.002408456 seconds time elapsed ( +- 2.70% )
# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_mul_test > /dev/null
bindresvport: Address already in use
...
Performance counter stats for './bindresvport_mul_test' (100 runs):
7,082,442 instructions # 0.00 insns per cycle ( +- 0.62% )
18,325 cache-misses ( +- 5.91% )
0.002463885 seconds time elapsed ( +- 2.20% )
# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_mul_test > /dev/null
bindresvport: Address already in use
...
Performance counter stats for './bindresvport_mul_test' (100 runs):
7,092,449 instructions # 0.00 insns per cycle ( +- 0.71% )
20,238 cache-misses ( +- 5.90% )
0.002444265 seconds time elapsed ( +- 2.27% )
After the patch, executing the test program 500 times:
# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_mul_test > /dev/null
bindresvport: Address already in use
bindresvport: Address already in use
...
Performance counter stats for './bindresvport_mul_test' (100 runs):
9,070,761 instructions # 0.00 insns per cycle ( +- 0.75% )
19,888 cache-misses ( +- 5.56% )
0.002591574 seconds time elapsed ( +- 2.56% )
# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_mul_test > /dev/null
bindresvport: Address already in use
...
Performance counter stats for './bindresvport_mul_test' (100 runs):
9,182,530 instructions # 0.00 insns per cycle ( +- 0.72% )
20,737 cache-misses ( +- 6.41% )
0.002552639 seconds time elapsed ( +- 3.04% )
# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_mul_test > /dev/null
bindresvport: Address already in use
...
Performance counter stats for './bindresvport_mul_test' (100 runs):
9,059,625 instructions # 0.00 insns per cycle ( +- 0.82% )
20,460 cache-misses ( +- 5.80% )
0.002610346 seconds time elapsed ( +- 2.85% )
# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_mul_test > /dev/null
bindresvport: Address already in use
...
Performance counter stats for './bindresvport_mul_test' (100 runs):
9,106,997 instructions # 0.00 insns per cycle ( +- 0.78% )
18,848 cache-misses ( +- 3.09% )
0.002621884 seconds time elapsed ( +- 2.54% )
# perf stat -r 100 -e instructions,cache-misses -- ./bindresvport_mul_test > /dev/null
bindresvport: Address already in use
...
Performance counter stats for './bindresvport_mul_test' (100 runs):
8,984,585 instructions # 0.00 insns per cycle ( +- 0.80% )
18,455 cache-misses ( +- 3.09% )
0.002436007 seconds time elapsed ( +- 2.88% )
After testing many times, the multi-threaded performance goes down a little :(
I think the reason is as follows:
Before the patch, getpid() is called only once in the multi-threaded case.
After the patch, gettid() is called once per thread, so the number of calls
equals the number of threads.
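The getpid()/gettid() difference described above can be sketched as follows. This is a Python illustration of the two seeding strategies, not the glibc code or the patch: the class names, the counters, and the STARTPORT/NPORTS values are assumptions made for the example, `threading.get_native_id()` stands in for gettid(), and the locks exist only so that the counters are exact.

```python
import os
import threading

# Illustrative values only; they are not taken from the patch.
STARTPORT = 600
NPORTS = 424


class SharedSeed:
    """Pre-patch style: one process-wide start port, seeded once from getpid()."""

    def __init__(self):
        self.port = 0
        self.seed_calls = 0
        self._lock = threading.Lock()  # only here to keep the counter exact

    def start_port(self):
        with self._lock:
            if self.port == 0:
                self.seed_calls += 1
                self.port = os.getpid() % NPORTS + STARTPORT
            return self.port


class PerThreadSeed:
    """Post-patch style: each thread seeds its own start port from its thread id."""

    def __init__(self):
        self.seed_calls = 0
        self._local = threading.local()
        self._lock = threading.Lock()  # only here to keep the counter exact

    def start_port(self):
        port = getattr(self._local, "port", 0)
        if port == 0:
            with self._lock:
                self.seed_calls += 1
            port = threading.get_native_id() % NPORTS + STARTPORT
            self._local.port = port
        return port


def count_seed_calls(seeder, nthreads, calls_per_thread=500):
    """Run `nthreads` workers and return how often the seed had to be computed."""

    def worker():
        for _ in range(calls_per_thread):
            seeder.start_port()

    threads = [threading.Thread(target=worker) for _ in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seeder.seed_calls


print("shared seed computed:", count_seed_calls(SharedSeed(), 4), "time(s)")
print("per-thread seed computed:", count_seed_calls(PerThreadSeed(), 4), "time(s)")
```

With four worker threads the shared variant computes its seed once and the per-thread variant computes it four times, which is the extra per-thread work the explanation above refers to.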
> Also, if possible, I think it's often better to make a hypothesis why
> performance would change this or that way due to a patch, and then try
> to validate this hypothesis with measurements. If you just look at
> coarse numbers for a certain test case, it might be true that the patch
> makes the particular test case on your particular test machine faster,
> but then you don't know why; it could be the case that this is just an
> outlier, and the patch isn't improving performance in general.
>
My hypothesis is that the performance will stay almost the same after the patch.
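One rough way to check this hypothesis against the multi-threaded numbers is to compare the change in the means with the "+-" spread that perf prints for each run. The sketch below is in Python and is only a crude comparison, not a significance test; the values are copied from the perf output above, and the helper names `mean_change_percent` and `within_reported_spread` are made up for this example.

```python
from statistics import mean

# Multi-threaded results copied from the perf stat runs above.
elapsed_before = [0.002589529, 0.002545734, 0.002408456, 0.002463885, 0.002444265]
elapsed_after = [0.002591574, 0.002552639, 0.002610346, 0.002621884, 0.002436007]
instr_before = [7124908, 7117596, 7099283, 7082442, 7092449]
instr_after = [9070761, 9182530, 9059625, 9106997, 8984585]

# The "+-" percentages perf printed next to each of those values.
elapsed_spread = [2.94, 3.43, 2.70, 2.20, 2.27, 2.56, 3.04, 2.85, 2.54, 2.88]
instr_spread = [0.66, 0.72, 0.63, 0.62, 0.71, 0.75, 0.72, 0.82, 0.78, 0.80]


def mean_change_percent(old, new):
    """Percent change of the mean of `new` relative to the mean of `old`."""
    return (mean(new) - mean(old)) / mean(old) * 100.0


def within_reported_spread(old, new, spreads):
    """True if the change in means is no larger than the largest reported spread."""
    return abs(mean_change_percent(old, new)) <= max(spreads)


for name, old, new, spreads in (
    ("seconds elapsed", elapsed_before, elapsed_after, elapsed_spread),
    ("instructions", instr_before, instr_after, instr_spread),
):
    print(
        f"{name}: {mean_change_percent(old, new):+.2f}% "
        f"(within reported spread: {within_reported_spread(old, new, spreads)})"
    )
```

With these numbers the elapsed-time difference (about 2.9%) is within perf's reported spread, while the increase in instruction count (about 28%) is well outside it.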
Thanks.
--
Best Regards,
Peng