This is the mail archive of the
newlib@sourceware.org
mailing list for the newlib project.
Re: Generic strlen
- From: Joel Sherrill <joel dot sherrill at oarcorp dot com>
- To: Eric Blake <eblake at redhat dot com>
- Cc: "David A. Ramos" <daramos at stanford dot edu>, "newlib at sourceware dot org" <newlib at sourceware dot org>
- Date: Fri, 29 Oct 2010 13:14:15 -0500
- Subject: Re: Generic strlen
- References: <9D4C8B44-FF83-43A1-A891-4AFB116679EA@stanford.edu> <4CCB0C4D.70105@redhat.com>
On 10/29/2010 01:02 PM, Eric Blake wrote:
On 10/29/2010 11:49 AM, David A. Ramos wrote:
Hi newlib maintainers,
Our checking tools (KLEE) keeps complaining about newlib's generic strlen version. It looks like it was patched back in May 2008 to include a speed hack that violates ISO C. It attempts to first word align the pointer, and then read a word at a time to check for a NULL:
libc/string/strlen.c:
73 /* If the string is word-aligned, we can check for the presence of
74 a null in each word-sized block. */
75 aligned_addr = (unsigned long *)str;
76 while (!DETECTNULL (*aligned_addr))
77 aligned_addr++;
Obviously, this can read out of bounds if the memory allocated to that string is less than a word in length. While on most architectures this wouldn't actually cause a segfault, I don't think that's a safe assumption for the generic version of a libc routine. The same patch included an i386 target containing the same algorithm, which may be perfectly acceptable.
Thoughts?
As long as reading beyond the end of a string does not fault, you can't
detect the violation of the standard, so the as-if rule applies. Prove
to me that there is an architecture that can fault on anything less than
a word boundary, and then we'll talk about changing the code. Until
then, this implementation may violate strict C89, but it is by all means
portable to all possible platforms that newlib will ever target. And
while you can indeed detect that you are reading beyond bounds by using
tools such as valgrind, I don't consider that as a valid argument to be
worth changing the code (you should instead be installing valgrind
exceptions to document the fact that we are knowingly reading beyond
bounds, but that the end result of that extraneous read does not affect
program execution because we are not basing any decisions off of that
extra data).
I don't like adding tool specific annotations to
code but I do like adding comments to clarify
code. Something based upon Eric's explanation
above would be a good addition to the method.
If the decision is made not to modify the code,
then these possibilities should be documented
along with the fact that they were recognized
and do not apply.
--
Joel Sherrill, Ph.D. Director of Research& Development
joel.sherrill@OARcorp.com On-Line Applications Research
Ask me about RTEMS: a free RTOS Huntsville AL 35805
Support Available (256) 722-9985