This is the mail archive of the gdb@sourceware.org mailing list for the GDB project.



Re: [RFC] Signed/unsigned character arrays are not strings


On Sun, Feb 25, 2007 at 08:53:50PM +0100, Jan Kratochvil wrote:
> On Sun, 25 Feb 2007 08:59:41 +0100, mathieu lacage wrote:
> ...
> > I don't know how useful that is to you but a lot of people (the first
> > which comes to my mind is libxml2) decided to use "unsigned char *" to
> > identify utf-8 encoded strings in C.
> 
> Together with the attached RMS's response I became more inclined to revert this
> change and provide only "$xmm"-specific fix instead (probably for the GDB
> int8_t/uint8_t internal types).

There was a lot of discussion about how to treat signed char, unsigned
char, signed char *, et cetera.  There weren't a lot of conclusions,
but several people did not like the new behavior, and then discussion
trailed off.

I don't want to just revert the patch, because the problem that Jan
was fixing (unhelpful display of $xmm registers) is really quite
annoying.  I see these options:

1.  Make vector types special.  Treat arrays of single byte integers
as characters, like before, unless they occur in a vector type.  This
is reasonable, but tricky to implement.

2.  Make two special single byte integer types, with a GDB internal
"not a char" flag set.  Use them for our builtin int8_t and uint8_t.
Use these to build types for vector registers.  Print all other single
byte types from user code as chars or strings.  This is similar to
#1, a little less helpful, but fairly easy.

3.  Treat "char" as a character, but "unsigned char" and "signed char"
as numbers (Jan's patch started down this road and Jim's went a bit
further).  Treat pointers/arrays of char as strings and
pointers/arrays of unsigned or signed char as numbers.  Add a "/s"
flag to the print command that treats single byte types as
characters or strings.

For example:
  char str[] = "hi";
  unsigned char version[] = "6.5";

(gdb) p version
$1 = {54, 46, 53, 0}
(gdb) p/s version
$2 = "6.5"
(gdb) p str
$3 = "hi"

4. Like #3, except that instead of adding a /s modifier, add a "set"
knob.  Of course in this case we get to argue about the default value.
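
To make the rule in #3 and #4 concrete, here is a minimal sketch in C of
the decision they describe.  It is only an illustration: byte_kind,
print_as_string and format_byte_array are hypothetical names, not GDB
internals, and the boolean stands in for either the proposed "/s" flag
(#3) or the value of a "set" knob (#4).  The array length is passed
explicitly, so a terminating NUL is printed when the bytes are shown as
numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of the three single-byte C types (not GDB's type codes). */
enum byte_kind { PLAIN_CHAR, SIGNED_CHAR, UNSIGNED_CHAR };

/* Rule from option #3: plain char is text; signed and unsigned char are
   numbers unless string formatting was requested (the proposed /s). */
static bool print_as_string(enum byte_kind kind, bool want_string)
{
    return kind == PLAIN_CHAR || want_string;
}

/* Format LEN bytes at DATA into OUT (capacity CAP) and return OUT.
   Output is truncated if it does not fit. */
static char *format_byte_array(char *out, size_t cap,
                               const unsigned char *data, size_t len,
                               enum byte_kind kind, bool want_string)
{
    assert(out != NULL && data != NULL);

    if (print_as_string(kind, want_string)) {
        /* Treat the bytes as a C string: stop at the first NUL. */
        size_t n = 0;
        while (n < len && data[n] != 0)
            n++;
        snprintf(out, cap, "\"%.*s\"", (int)n, (const char *)data);
        return out;
    }

    /* Otherwise print every element, including any NUL, as a number. */
    int n = snprintf(out, cap, "{");
    size_t used = (n > 0) ? (size_t)n : 0;
    for (size_t i = 0; i < len && used < cap; i++) {
        int value = (kind == SIGNED_CHAR) ? (int)(signed char)data[i]
                                          : (int)data[i];
        n = snprintf(out + used, cap - used, "%s%d", i ? ", " : "", value);
        if (n < 0)
            break;
        used += (size_t)n;
    }
    if (used < cap)
        snprintf(out + used, cap - used, "}");
    return out;
}
```

With the bytes of "6.5" (four bytes counting the NUL) tagged as
UNSIGNED_CHAR, this yields {54, 46, 53, 0} by default and "6.5" when
string formatting is requested; the bytes of "hi" tagged as PLAIN_CHAR
yield "hi" either way.  Under #4 the only difference is where the
boolean comes from, which is exactly why the default value matters.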


I think it's important that we resolve this open issue before we
release a new version of GDB, so please post which you prefer.  I like
#3 best, followed by #2; #4 is a good compromise but I worry that we
are proliferating knobs that no one ever changes.  I'm interested in
any other suggestions, though I think we've ruled out guessing based
on the type name.

-- 
Daniel Jacobowitz
CodeSourcery

