This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug stdio/12701] scanf accepts non-matching input


http://sourceware.org/bugzilla/show_bug.cgi?id=12701

paxdiablo <allachan at au1 dot ibm.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |allachan at au1 dot ibm.com

--- Comment #15 from paxdiablo <allachan at au1 dot ibm.com> 2012-11-26 08:26:04 UTC ---
I think this bug report is correct, at least in relation to the '%x/0xz'
sample.

There's a big difference between an input item, which *may* be an initial
subset of a properly scanned directive, and the *properly scanned directive*
itself.

Pushback controls how far you can back up the "input stream pointer" and is the
reason why scanf is usually not used by professionals, who prefer a
fgets/sscanf combo so they can back up to the start of the line themselves.
However, the pushback is only relevant here in that context: when '0x' fails to
match '%x', scanf cannot push back all the way to the '0' because of this
limitation.
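
For anyone unfamiliar with the fgets/sscanf combo mentioned above, here is a
minimal sketch of it. The helper names (parse_line, read_and_parse), the two
formats and the 256-byte buffer are illustrative assumptions, not anything
taken from glibc or from this bug. Because the whole line sits in a buffer, a
failed conversion loses nothing and a second format can be tried from the
start of the line.

```c
#include <stdio.h>

/* Hypothetical helper: try "<hex> <hex>" first, then "<dec>,<dec>".
   Returns 1 or 2 for the format that matched, 0 if neither did. */
int parse_line(const char *line, unsigned int *a, unsigned int *b)
{
    if (sscanf(line, "%x %x", a, b) == 2)
        return 1;
    /* The buffer is untouched, so we simply rescan from its start. */
    if (sscanf(line, "%u,%u", a, b) == 2)
        return 2;
    return 0;
}

/* Hypothetical helper: read one line from fp and parse it.
   Returns -1 on EOF or read error, otherwise parse_line's result. */
int read_and_parse(FILE *fp, unsigned int *a, unsigned int *b)
{
    char line[256];               /* assumed maximum line length */

    if (fgets(line, sizeof line, fp) == NULL)
        return -1;
    return parse_line(line, a, b);
}
```

With plain scanf on a stream, the second attempt would start wherever the
first one gave up, which is exactly the pushback problem.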

The function call sscanf ("a0xz", "%c%x%c") should return 1, not 3.
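
A small sketch for checking that call on a given system. The wrapper name
scan_a0xz is made up for illustration. I have deliberately not put an expected
result in the code: my reading of the standard says 1, this bug report is
about glibc returning 3, and what you see will depend on the C library and
version in use.

```c
#include <stdio.h>

/* Hypothetical wrapper around the call discussed above.
   Returns whatever sscanf returns (the number of items assigned). */
int scan_a0xz(char *c1, unsigned int *val, char *c2)
{
    /* Clear the outputs so unassigned items are easy to spot. */
    *c1 = '\0';
    *val = 0;
    *c2 = '\0';
    return sscanf("a0xz", "%c%x%c", c1, val, c2);
}
```

Printing the return value together with c1, val and c2 shows how far the
library got before it stopped.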

The controlling part of the standard is the bit dealing with the 'x' directive
itself:

=====
Matches an optionally signed hexadecimal integer, whose format is the same as
expected for the subject sequence of the strtoul function with the value 16 for
the base argument.
=====

The strtoul stuff states:

=====
If the value of base is zero, the expected form of the subject sequence is that
of an integer constant as described in 6.4.4.1, optionally preceded by a plus
or minus sign, but not including an integer suffix. If the value of base is
between 2 and 36 (inclusive), the expected form of the subject sequence is a
sequence of letters and digits representing an integer with the radix specified
by base, optionally preceded by a plus or minus sign, but not including an
integer suffix. The letters from a (or A) through z (or Z) are ascribed the
values 10 through 35; only letters and digits whose ascribed values are less
than that of base are permitted. If the value of base is 16, the characters 0x
or 0X may optionally precede the sequence of letters and digits, following the
sign if present.
=====

The controlling part there would be "a sequence of letters and digits
representing an integer" - you may argue that such a sequence may consist of
zero characters but I don't think anyone in their right mind would suggest that
definition represented an integer. In any case, the '0x' string fails on
strtoul:

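    char *x;
    unsigned long rc = 42;           /* sentinel, overwritten below */
    rc = strtoul ("0x", &x, 16);     /* declared in <stdlib.h> */
    printf ("%lu [%s]\n", rc, x);    /* declared in <stdio.h> */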
produces:

    0 [0x]
So even though rc is set to 0, the fact that the pointer points to the first
bad character means that the '0x' itself is not a valid hex number.

Putting in '0x5' as the string gives you:

    5 []
so that the first bad character is the end of the string (i.e., there WERE no
bad characters).
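
The same end-pointer test can be wrapped up as a check that a string is a
complete hex number. This is only a sketch: is_complete_hex is a hypothetical
name, and it ignores overflow (ERANGE) and the fact that strtoul skips leading
white space and accepts a sign.

```c
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value in *out if all of
   s was consumed as a base-16 number, 0 otherwise. */
int is_complete_hex(const char *s, unsigned long *out)
{
    char *end;
    unsigned long v = strtoul(s, &end, 16);

    /* end == s   : nothing was converted at all.
       *end != 0  : some trailing characters were not part of the number. */
    if (end == s || *end != '\0')
        return 0;

    *out = v;
    return 1;
}
```

With this, "0x5" is accepted with the value 5, while "0x" is rejected because
strtoul does not consume the whole string.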


