This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]
Fix strtod integer/buffer overflow (bug 14459)

From: "Joseph S. Myers" <joseph at codesourcery dot com>
To: <libc-alpha at sourceware dot org>
Date: Sun, 12 Aug 2012 18:28:11 +0000
Subject: Fix strtod integer/buffer overflow (bug 14459)
Bug 14459, which I found while working on the fix for bug 3479 (strtod
rounding), is integer overflows in strtod (and strtof, strtold,
strtod_l, etc.) on certain input, which can also result in buffer
overflows on the stack.

The essential problem is the use of int for various variables and
internal calculations that may exceed the range of int.  Various such
cases appear if the string passed is over 2GB in size, but the way
exponent limits are calculated mean that minimum input size to trigger
the problems is slightly under a fifth of a GB.

Three variables, dig_no, int_no and lead_zero, count the number of
various elements in the input in memory, and so size_t is the
appropriate type to use for them; this patch does so (with various
associated casts to avoid this introducing signed/unsigned mixed
operations; even if those are safe in a particular case, they are
often confusing, so it seems best to avoid them).  Exponents can be
larger (given an N-digit integer before the exponent, an exponent of
-4N can be appropriate for a hex float).  This patch uses intmax_t to
represent exponents, with plenty of assertions added (the code already
had quite a few) to verify that the values in question will not
overflow those variables or intermediate calculations.  It should not
be possible to trigger these assertions without input around 2^61
bytes or more in size, which I don't think is plausible on present-day
systems; if input of that size becomes relevant in future, on systems
where intmax_t is still 64-bit, it will be necessary to use a more
complicated approach to handle exponents (logically, they are
something like "signed size_t plus a few bits").

The testcase is the example I found that uses the smallest input;
there are plenty of other ways in which the integer overflows could
cause wrong results, if not buffer overflow, for larger input that
seems less reasonable to add to the testsuite.

I don't know how exploitable this buffer overflow is (given a program
passing untrusted input of (nearly) a fifth of a GB or more to
strtod-family functions or callers such as atof and scanf) - at least
the example testcase I add and give in the bug results (for me) in a
segfault from within str_to_mpn, and never gets as far as strtod_l
trying to return to its caller (which would be the point at which an
overwritten return address would be used).  But it seems plausible
that for some programs there would be some way to exploit some
overflow possibility fixed by this patch.

gnulib does not use glibc's strtod (it has a more minimal strtod
wrapper that fixes selected bugs from various systems' strtod,
including a limited subset of the various bugs in glibc that have been
fixed over the years) so there should be no issue of the buggy code
being embedded directly into various packages.

Tested x86 and x86_64.

2012-08-12  Joseph Myers  <joseph@codesourcery.com>

	[BZ #14459]
	* stdlib/strtod_l.c: Include <stdint.h>.
	(round_and_return): Change exponent parameter to type intmax_t.
	Rearrange calculations to avoid internal overflow possibilities.
	(str_to_mpn): Change exponent parameter to type intmax_t *.
	Rearrange calculations to avoid internal overflow possibilities.
	(____STRTOF_INTERNAL): Change exponent variable to type intmax_t.
	Change dig_no, int_no and lead_zero to type size_t.  Rearrange
	calculations and add assertions to avoid internal overflow
	possibilities.  Add casts to avoid signed/unsigned operations.
	* stdlib/tst-strtod-overflow.c: New file.
	* stdlib/Makefile (tests): Add tst-strtod-overflow.

diff --git a/stdlib/Makefile b/stdlib/Makefile
index 10674f2..f94266e 100644
--- a/stdlib/Makefile
+++ b/stdlib/Makefile
@@ -69,7 +69,7 @@ tests		:= tst-strtol tst-strtod testmb testrand testsort testdiv   \
 		   tst-makecontext tst-strtod4 tst-strtod5 tst-qsort2	    \
 		   tst-makecontext2 tst-strtod6 tst-unsetenv1		    \
 		   tst-makecontext3 bug-getcontext bug-fmtmsg1		    \
-		   tst-secure-getenv
+		   tst-secure-getenv tst-strtod-overflow
 tests-static	:= tst-secure-getenv
 
 include ../Makeconfig
diff --git a/stdlib/strtod_l.c b/stdlib/strtod_l.c
index 2166a08..bf0c781 100644
--- a/stdlib/strtod_l.c
+++ b/stdlib/strtod_l.c
@@ -60,6 +60,7 @@ extern unsigned long long int ____strtoull_l_internal (const char *, char **,
 #include <math.h>
 #include <stdlib.h>
 #include <string.h>
+#include <stdint.h>
 
 /* The gmp headers need some configuration frobs.  */
 #define HAVE_ALLOCA 1
@@ -174,19 +175,19 @@ extern const mp_limb_t _tens_in_limb[MAX_DIG_PER_LIMB + 1];
 /* Return a floating point number of the needed type according to the given
    multi-precision number after possible rounding.  */
 static FLOAT
-round_and_return (mp_limb_t *retval, int exponent, int negative,
+round_and_return (mp_limb_t *retval, intmax_t exponent, int negative,
 		  mp_limb_t round_limb, mp_size_t round_bit, int more_bits)
 {
   if (exponent < MIN_EXP - 1)
     {
-      mp_size_t shift = MIN_EXP - 1 - exponent;
-
-      if (shift > MANT_DIG)
+      if (exponent < MIN_EXP - 1 - MANT_DIG)
 	{
 	  __set_errno (ERANGE);
 	  return 0.0;
 	}
 
+      mp_size_t shift = MIN_EXP - 1 - exponent;
+
       more_bits |= (round_limb & ((((mp_limb_t) 1) << round_bit) - 1)) != 0;
       if (shift == MANT_DIG)
 	/* This is a special case to handle the very seldom case where
@@ -233,6 +234,9 @@ round_and_return (mp_limb_t *retval, int exponent, int negative,
       __set_errno (ERANGE);
     }
 
+  if (exponent > MAX_EXP)
+    goto overflow;
+
   if ((round_limb & (((mp_limb_t) 1) << round_bit)) != 0
       && (more_bits || (retval[0] & 1) != 0
 	  || (round_limb & ((((mp_limb_t) 1) << round_bit) - 1)) != 0))
@@ -258,6 +262,7 @@ round_and_return (mp_limb_t *retval, int exponent, int negative,
     }
 
   if (exponent > MAX_EXP)
+  overflow:
     return negative ? -FLOAT_HUGE_VAL : FLOAT_HUGE_VAL;
 
   return MPN2FLOAT (retval, exponent, negative);
@@ -271,7 +276,7 @@ round_and_return (mp_limb_t *retval, int exponent, int negative,
    factor for the resulting number (see code) multiply by it.  */
 static const STRING_TYPE *
 str_to_mpn (const STRING_TYPE *str, int digcnt, mp_limb_t *n, mp_size_t *nsize,
-	    int *exponent
+	    intmax_t *exponent
 #ifndef USE_WIDE_CHAR
 	    , const char *decimal, size_t decimal_len, const char *thousands
 #endif
@@ -335,7 +340,7 @@ str_to_mpn (const STRING_TYPE *str, int digcnt, mp_limb_t *n, mp_size_t *nsize,
     }
   while (--digcnt > 0);
 
-  if (*exponent > 0 && cnt + *exponent <= MAX_DIG_PER_LIMB)
+  if (*exponent > 0 && *exponent <= MAX_DIG_PER_LIMB - cnt)
     {
       low *= _tens_in_limb[*exponent];
       start = _tens_in_limb[cnt + *exponent];
@@ -413,7 +418,7 @@ ____STRTOF_INTERNAL (nptr, endptr, group, loc)
 {
   int negative;			/* The sign of the number.  */
   MPN_VAR (num);		/* MP representation of the number.  */
-  int exponent;			/* Exponent of the number.  */
+  intmax_t exponent;		/* Exponent of the number.  */
 
   /* Numbers starting `0X' or `0x' have to be processed with base 16.  */
   int base = 10;
@@ -435,7 +440,7 @@ ____STRTOF_INTERNAL (nptr, endptr, group, loc)
   /* Points at the character following the integer and fractional digits.  */
   const STRING_TYPE *expp;
   /* Total number of digit and number of digits in integer part.  */
-  int dig_no, int_no, lead_zero;
+  size_t dig_no, int_no, lead_zero;
   /* Contains the last character read.  */
   CHAR_TYPE c;
 
@@ -767,7 +772,7 @@ ____STRTOF_INTERNAL (nptr, endptr, group, loc)
      are all or any is really a fractional digit will be decided
      later.  */
   int_no = dig_no;
-  lead_zero = int_no == 0 ? -1 : 0;
+  lead_zero = int_no == 0 ? (size_t) -1 : 0;
 
   /* Read the fractional digits.  A special case are the 'american
      style' numbers like `16.' i.e. with decimal point but without
@@ -789,12 +794,13 @@ ____STRTOF_INTERNAL (nptr, endptr, group, loc)
 	     (base == 16 && ({ CHAR_TYPE lo = TOLOWER (c);
 			       lo >= L_('a') && lo <= L_('f'); })))
 	{
-	  if (c != L_('0') && lead_zero == -1)
+	  if (c != L_('0') && lead_zero == (size_t) -1)
 	    lead_zero = dig_no - int_no;
 	  ++dig_no;
 	  c = *++cp;
 	}
     }
+  assert (dig_no <= (uintmax_t) INTMAX_MAX);
 
   /* Remember start of exponent (if any).  */
   expp = cp;
@@ -817,24 +823,80 @@ ____STRTOF_INTERNAL (nptr, endptr, group, loc)
 
       if (c >= L_('0') && c <= L_('9'))
 	{
-	  int exp_limit;
+	  intmax_t exp_limit;
 
 	  /* Get the exponent limit. */
 	  if (base == 16)
-	    exp_limit = (exp_negative ?
-			 -MIN_EXP + MANT_DIG + 4 * int_no :
-			 MAX_EXP - 4 * int_no + 4 * lead_zero + 3);
+	    {
+	      if (exp_negative)
+		{
+		  assert (int_no <= (uintmax_t) (INTMAX_MAX
+						 + MIN_EXP - MANT_DIG) / 4);
+		  exp_limit = -MIN_EXP + MANT_DIG + 4 * (intmax_t) int_no;
+		}
+	      else
+		{
+		  if (int_no)
+		    {
+		      assert (lead_zero == 0
+			      && int_no <= (uintmax_t) INTMAX_MAX / 4);
+		      exp_limit = MAX_EXP - 4 * (intmax_t) int_no + 3;
+		    }
+		  else if (lead_zero == (size_t) -1)
+		    {
+		      /* The number is zero and this limit is
+			 arbitrary.  */
+		      exp_limit = MAX_EXP + 3;
+		    }
+		  else
+		    {
+		      assert (lead_zero
+			      <= (uintmax_t) (INTMAX_MAX - MAX_EXP - 3) / 4);
+		      exp_limit = (MAX_EXP
+				   + 4 * (intmax_t) lead_zero
+				   + 3);
+		    }
+		}
+	    }
 	  else
-	    exp_limit = (exp_negative ?
-			 -MIN_10_EXP + MANT_DIG + int_no :
-			 MAX_10_EXP - int_no + lead_zero + 1);
+	    {
+	      if (exp_negative)
+		{
+		  assert (int_no
+			  <= (uintmax_t) (INTMAX_MAX + MIN_10_EXP - MANT_DIG));
+		  exp_limit = -MIN_10_EXP + MANT_DIG + (intmax_t) int_no;
+		}
+	      else
+		{
+		  if (int_no)
+		    {
+		      assert (lead_zero == 0
+			      && int_no <= (uintmax_t) INTMAX_MAX);
+		      exp_limit = MAX_10_EXP - (intmax_t) int_no + 1;
+		    }
+		  else if (lead_zero == (size_t) -1)
+		    {
+		      /* The number is zero and this limit is
+			 arbitrary.  */
+		      exp_limit = MAX_10_EXP + 1;
+		    }
+		  else
+		    {
+		      assert (lead_zero
+			      <= (uintmax_t) (INTMAX_MAX - MAX_10_EXP - 1));
+		      exp_limit = MAX_10_EXP + (intmax_t) lead_zero + 1;
+		    }
+		}
+	    }
+
+	  if (exp_limit < 0)
+	    exp_limit = 0;
 
 	  do
 	    {
-	      exponent *= 10;
-	      exponent += c - L_('0');
-
-	      if (__builtin_expect (exponent > exp_limit, 0))
+	      if (__builtin_expect ((exponent > exp_limit / 10
+				     || (exponent == exp_limit / 10
+					 && c - L_('0') > exp_limit % 10)), 0))
 		/* The exponent is too large/small to represent a valid
 		   number.  */
 		{
@@ -843,7 +905,7 @@ ____STRTOF_INTERNAL (nptr, endptr, group, loc)
 		  /* We have to take care for special situation: a joker
 		     might have written "0.0e100000" which is in fact
 		     zero.  */
-		  if (lead_zero == -1)
+		  if (lead_zero == (size_t) -1)
 		    result = negative ? -0.0 : 0.0;
 		  else
 		    {
@@ -862,6 +924,9 @@ ____STRTOF_INTERNAL (nptr, endptr, group, loc)
 		  /* NOTREACHED */
 		}
 
+	      exponent *= 10;
+	      exponent += c - L_('0');
+
 	      c = *++cp;
 	    }
 	  while (c >= L_('0') && c <= L_('9'));
@@ -930,7 +995,14 @@ ____STRTOF_INTERNAL (nptr, endptr, group, loc)
 	}
 #endif
       startp += lead_zero + decimal_len;
-      exponent -= base == 16 ? 4 * lead_zero : lead_zero;
+      assert (lead_zero <= (base == 16
+			    ? (uintmax_t) INTMAX_MAX / 4
+			    : (uintmax_t) INTMAX_MAX));
+      assert (lead_zero <= (base == 16
+			    ? ((uintmax_t) exponent
+			       - (uintmax_t) INTMAX_MIN) / 4
+			    : ((uintmax_t) exponent - (uintmax_t) INTMAX_MIN)));
+      exponent -= base == 16 ? 4 * (intmax_t) lead_zero : (intmax_t) lead_zero;
       dig_no -= lead_zero;
     }
 
@@ -972,7 +1044,10 @@ ____STRTOF_INTERNAL (nptr, endptr, group, loc)
 	}
 
       /* Adjust the exponent for the bits we are shifting in.  */
-      exponent += bits - 1 + (int_no - 1) * 4;
+      assert (int_no <= (uintmax_t) (exponent < 0
+				     ? (INTMAX_MAX - bits + 1) / 4
+				     : (INTMAX_MAX - exponent - bits + 1) / 4));
+      exponent += bits - 1 + ((intmax_t) int_no - 1) * 4;
 
       while (--dig_no > 0 && idx >= 0)
 	{
@@ -1024,13 +1099,15 @@ ____STRTOF_INTERNAL (nptr, endptr, group, loc)
      really integer digits or belong to the fractional part; i.e. we normalize
      123e-2 to 1.23.  */
   {
-    register int incr = (exponent < 0 ? MAX (-int_no, exponent)
-			 : MIN (dig_no - int_no, exponent));
+    register intmax_t incr = (exponent < 0
+			      ? MAX (-(intmax_t) int_no, exponent)
+			      : MIN ((intmax_t) dig_no - (intmax_t) int_no,
+				     exponent));
     int_no += incr;
     exponent -= incr;
   }
 
-  if (__builtin_expect (int_no + exponent > MAX_10_EXP + 1, 0))
+  if (__builtin_expect (exponent > MAX_10_EXP + 1 - (intmax_t) int_no, 0))
     {
       __set_errno (ERANGE);
       return negative ? -FLOAT_HUGE_VAL : FLOAT_HUGE_VAL;
@@ -1215,7 +1292,7 @@ ____STRTOF_INTERNAL (nptr, endptr, group, loc)
        digits we should have enough bits for the result.  The remaining
        decimal digits give us the information that more bits are following.
        This can be used while rounding.  (Two added as a safety margin.)  */
-    if (dig_no - int_no > (MANT_DIG - bits + 2) / 3 + 2)
+    if ((intmax_t) dig_no > (intmax_t) int_no + (MANT_DIG - bits + 2) / 3 + 2)
       {
 	dig_no = int_no + (MANT_DIG - bits + 2) / 3 + 2;
 	more_bits = 1;
@@ -1223,7 +1300,7 @@ ____STRTOF_INTERNAL (nptr, endptr, group, loc)
     else
       more_bits = 0;
 
-    neg_exp = dig_no - int_no - exponent;
+    neg_exp = (intmax_t) dig_no - (intmax_t) int_no - exponent;
 
     /* Construct the denominator.  */
     densize = 0;
diff --git a/stdlib/tst-strtod-overflow.c b/stdlib/tst-strtod-overflow.c
new file mode 100644
index 0000000..668d55b
--- /dev/null
+++ b/stdlib/tst-strtod-overflow.c
@@ -0,0 +1,48 @@
+/* Test for integer/buffer overflow in strtod.
+   Copyright (C) 2012 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#define EXPONENT "e-2147483649"
+#define SIZE 214748364
+
+static int
+do_test (void)
+{
+  char *p = malloc (1 + SIZE + sizeof (EXPONENT));
+  if (p == NULL)
+    {
+      puts ("malloc failed, cannot test for overflow");
+      return 0;
+    }
+  p[0] = '1';
+  memset (p + 1, '0', SIZE);
+  memcpy (p + 1 + SIZE, EXPONENT, sizeof (EXPONENT));
+  double d = strtod (p, NULL);
+  if (d != 0)
+    {
+      printf ("strtod returned wrong value: %a\n", d);
+      return 1;
+    }
+  return 0;
+}
+
+#define TEST_FUNCTION do_test ()
+#include "../test-skeleton.c"

-- 
Joseph S. Myers
joseph@codesourcery.com
Follow-Ups:
- Ping Re: Fix strtod integer/buffer overflow (bug 14459)
  - From: Joseph S. Myers
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]