This is the mail archive of the cygwin-developers mailing list for the Cygwin project.
Re: Cygwin multithreading performance
- From: Mark Geisert <mark at maxrnd dot com>
- To: cygwin-developers at cygwin dot com
- Date: Tue, 8 Dec 2015 21:31:20 -0800
- Subject: Re: Cygwin multithreading performance
Corinna Vinschen wrote:
> On Dec 8 02:51, Mark Geisert wrote:
>> (Maybe cygwin-developers is a better list for this? It's pretty obscure.)
>
> Yes, cygwin-developers is fine since it's gory implementation details.
>
>> Here are some mutex lock stats I've been talking about providing. These are
>> from the OP's original testcase 'git repack -a -f' running over a clone of
>> the newlib-cygwin source tree. Run on a 2-core, 4-HT machine under Windows
>> 7 x64. I'm running a slightly modified cygwin1.dll that has 3 one-line mods
>> to thread.cc.
>
> Which I'd like to see a patch of, just to know what you mean.
>
>> I'm considering adding the tools that produced these displays to the
>> cygutils package. I'm unsure if the cygwin1.dll mods I've made locally
>> should be shipped generally; I don't know how much extra CPU they use, if
>> any.
>
> Well, let's have a look. This is open source after all :)

Here's my patchlet against Cygwin 2.3.1-1...
/oss/src/winsup/cygwin diff -u thread.cc.safe thread.cc
--- thread.cc.safe 2015-11-14 03:41:15.000000000 -0800
+++ thread.cc 2015-12-04 03:49:03.463794000 -0800
@@ -1752,6 +1752,7 @@
 int
 pthread_mutex::lock ()
 {
+  int tstamp = strace.microseconds ();
   pthread_t self = ::pthread_self ();
   int result = 0;
@@ -1772,8 +1773,8 @@
       result = EDEADLK;
     }
-  pthread_printf ("mutex %p, self %p, owner %p, lock_counter %d, recursion_counter %u",
-                  this, self, owner, lock_counter, recursion_counter);
+  pthread_printf ("mutex %p, self %p, owner %p, lock_counter %d, recursion_counter %u, tstamp %d, caller %p",
+                  this, self, owner, lock_counter, recursion_counter, tstamp, caller_return_address ());
   return result;
 }
@@ -1801,8 +1802,8 @@
       res = 0;
     }
-  pthread_printf ("mutex %p, owner %p, self %p, lock_counter %d, recursion_counter %u, type %d, res %d",
-                  this, owner, self, lock_counter, recursion_counter, type, res);
+  pthread_printf ("mutex %p, owner %p, self %p, lock_counter %d, recursion_counter %u, type %d, res %d, caller %p",
+                  this, owner, self, lock_counter, recursion_counter, type, res, caller_return_address ());
   return res;
 }
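
For anyone who wants to post-process the extended lock message, here is a minimal sketch of pulling the fields back out of a trace line. It relies only on the format string in the patch above. The names LockRecord and parse_lock_record are hypothetical, and so is the assumption that %p is rendered as hex digits with an optional 0x prefix. Whatever prefix strace puts in front of the message is skipped, not interpreted.

```cpp
#include <cstdio>
#include <cstring>

// Fields carried by the patched pthread_mutex::lock trace message.
// (Hypothetical record type, for illustration only.)
struct LockRecord
{
  unsigned long long mutex, self, owner, caller;
  int lock_counter;
  unsigned recursion_counter;
  int tstamp;
};

// Parse the message part of one trace line. Returns true only if all
// seven fields of the lock message were found. Assumes pointers are
// printed as hex, with or without a leading "0x".
bool
parse_lock_record (const char *line, LockRecord &rec)
{
  // Skip whatever precedes the message; only "mutex ..." is examined.
  const char *msg = std::strstr (line, "mutex ");
  if (!msg)
    return false;
  int n = std::sscanf (msg,
                       "mutex %llx, self %llx, owner %llx, lock_counter %d, "
                       "recursion_counter %u, tstamp %d, caller %llx",
                       &rec.mutex, &rec.self, &rec.owner, &rec.lock_counter,
                       &rec.recursion_counter, &rec.tstamp, &rec.caller);
  return n == 7;
}
```

The unlock message lists owner before self, so this parser rejects it; a second format string with the fields in that order would be needed to handle both.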
The pthread_printf() call modifications don't add any CPU overhead unless
there's a relevant strace in progress, so that should be acceptable. The other
mod, adding the strace.microseconds() call at the start of the function, is
what I was a little concerned about. That call needn't be made unless an strace
is in progress, but I do not know how expensive it is. It appears to be a
QueryPerformanceCounter() call underneath. If that's cheap, the mod is OK as-is.
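
A rough standalone sketch of both options: read the clock only when a trace is active, and measure what one clock read costs. This is not Cygwin code. Tracer, lock_with_optional_timestamp and average_clock_query_ns are hypothetical names, and std::chrono::steady_clock stands in for QueryPerformanceCounter() so the sketch builds anywhere.

```cpp
#include <chrono>
#include <cstdint>
#include <cstdio>

// Hypothetical stand-in for the strace object; not the real Cygwin class.
struct Tracer
{
  bool active = false;       // true only while a trace is attached
  unsigned clock_reads = 0;  // counts clock queries, for illustration

  // Microseconds since an arbitrary epoch, from a monotonic clock.
  std::int64_t
  microseconds ()
  {
    ++clock_reads;
    auto now = std::chrono::steady_clock::now ().time_since_epoch ();
    return std::chrono::duration_cast<std::chrono::microseconds> (now).count ();
  }
};

// Sketch of a lock entry point that reads the clock only when tracing.
int
lock_with_optional_timestamp (Tracer &tracer)
{
  // Skip the clock query entirely when nobody is tracing.
  std::int64_t tstamp = tracer.active ? tracer.microseconds () : 0;

  int result = 0;  // the real lock acquisition would happen here

  if (tracer.active)
    std::printf ("lock: result %d, tstamp %lld\n", result, (long long) tstamp);
  return result;
}

// Rough average cost of one clock query, in nanoseconds.
double
average_clock_query_ns (Tracer &tracer, int iterations)
{
  volatile std::int64_t sink = 0;  // keeps the loop from being elided
  auto start = std::chrono::steady_clock::now ();
  for (int i = 0; i < iterations; ++i)
    sink = sink + tracer.microseconds ();
  auto stop = std::chrono::steady_clock::now ();
  std::chrono::duration<double, std::nano> elapsed = stop - start;
  return elapsed.count () / iterations;
}
```

Calling average_clock_query_ns with a large iteration count gives a ballpark per-call figure. If that figure is negligible next to the cost of the lock itself, the unconditional call is fine; otherwise the guarded form avoids it whenever no trace is running.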
..mark