This is the mail archive of the
newlib@sourceware.org
mailing list for the newlib project.
Re: MMU Off / Strict Alignment
- From: "Jonathan S. Shapiro" <shap at eros-os dot org>
- To: Richard Earnshaw <rearnsha at arm dot com>
- Cc: Christopher Covington <cov at codeaurora dot org>, "newlib at sourceware dot org" <newlib at sourceware dot org>, Marcus Shawcroft <marcus dot shawcroft at linaro dot org>, Matthew Gretton-Dann <matthew dot gretton-dann at linaro dot org>, "linaro-toolchain at lists dot linaro dot org" <linaro-toolchain at lists dot linaro dot org>
- Date: Tue, 17 Dec 2013 21:06:23 -0800
- Subject: Re: MMU Off / Strict Alignment
- Authentication-results: sourceware.org; auth=none
- References: <528CF7F1 dot 5050001 at codeaurora dot org> <CADSXKXqJgD3cq594+NeRk9=QHA1DKh3o7aPjsVYOx5OqT1Y6pw at mail dot gmail dot com> <52AF3E5A dot 4050507 at codeaurora dot org> <52B00D46 dot 6050302 at arm dot com>
At the risk of sticking my nose in, this isn't a startup code issue.
It's a contract issue.
First, I don't buy Richard's argument about memcpy() startup costs and
hard-to-predict branches. We do those tests on essentially every
*other* RISC platform without complaint, and it's very easy to order
those branches so that the currently efficient cases run well. Perhaps
more to the point, I haven't seen anybody put forward quantitative
data that using the MMU for unaligned references is any better than
executing those branches. Speaking as a recovering processor
architect, I'd say that assumption needs to be validated
quantitatively. My guess is that the branches are faster if properly
arranged.
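To make the branch-ordering point concrete, here is a minimal sketch, not newlib's actual memcpy. The names sketch_memcpy and word_t are hypothetical, the 32-bit word size is an assumption, and the may_alias attribute is a GCC/Clang extension. It tests alignment once, takes the aligned path first, and falls back to byte accesses that never depend on hardware support for unaligned references:

```c
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang extension (assumed toolchain): lets word_t accesses
   legally alias objects of any type. */
typedef uint32_t __attribute__((may_alias)) word_t;

/* Hypothetical name for illustration; this is not newlib's memcpy. */
void *sketch_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* One test covers both pointers: any low bit set means at least
       one of them is not word-aligned. */
    if ((((uintptr_t)d | (uintptr_t)s) & (sizeof(word_t) - 1)) == 0) {
        /* Aligned case first, so the currently efficient path stays
           efficient. */
        while (n >= sizeof(word_t)) {
            *(word_t *)d = *(const word_t *)s;
            d += sizeof(word_t);
            s += sizeof(word_t);
            n -= sizeof(word_t);
        }
    }

    /* Unaligned input, or the tail of an aligned copy: byte accesses
       only, which need no unaligned-access support from the hardware. */
    while (n--)
        *d++ = *s++;

    return dst;
}
```

Whether this ordering actually beats relying on unaligned accesses is exactly the thing that would need measuring. The sketch only shows that the test itself is a single OR, AND, and compare.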
Second, this is a contract issue. If newlib intends to support
embedded platforms, then it needs to implement algorithms that are
functionally correct without relying on an MMU. By all means use
simpler or smarter algorithms when an MMU can be assumed to be
available in a given configuration, but provide an algorithm that is
functionally correct when no MMU is available. "Good overall
performance in memcpy" is a fine thing, but it is subject to the
requirement of meeting functional specifications. As Jochen Liedtke
famously put it (read this in a heavy German accent): "Fast, ya. But
correct? (shrug) Eh!"
So: we need a normative statement saying what the contract is. The
rest of the answer will fall out from that.
I do agree with Richard that startup code is special. I've built
deeply embedded runtimes of one form or another for 25 years now, and
I have yet to see a system where optimizing a simplistic byte-wise
memcpy during bootstrap would have made any difference in anything
overall. That said, if the specification of memcpy requires it to
handle incompatibly aligned pointers (and it does), and the contract
for newlib requires it to operate in MMU-less scenarios in a given
configuration (which, at least in some cases, it does), it's
completely legitimate for bootstrap code to call memcpy() and expect
behavior that meets specifications.
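For reference, the "simplistic byte-wise memcpy" mentioned above could look roughly like the following. This is a sketch under stated assumptions: boot_memcpy is a hypothetical name, and the volatile qualifiers are one way (assumed here, not prescribed by anyone in this thread) to keep the compiler from recognizing the loop and widening it or replacing it with a call to the library memcpy:

```c
#include <stddef.h>

/* Hypothetical bootstrap helper; not part of newlib. */
void *boot_memcpy(void *dst, const void *src, size_t n)
{
    /* volatile forces one byte-sized access per iteration, so the
       compiler cannot merge them into wider, possibly unaligned ones. */
    volatile unsigned char *d = dst;
    const volatile unsigned char *s = src;

    while (n--)
        *d++ = *s++;

    return dst;
}
```

A routine like this is functionally correct for any alignment and with the MMU off, which is the property the contract question is about. Its speed is beside the point during bootstrap.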
So what's the contract?
Regards,
Jonathan S. Shapiro