This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.



RE: [RFC] Make string printing work on NetBSD (iconv issue)


> >>>>> "Paul" == Paul Koning <Paul_Koning@dell.com> writes:
> 
> Paul> The attached patch fixes this by having configure pick a suitable
> Paul> codeset name to use.  "wchar_t" is used if available, otherwise
> Paul> ucs-2 or ucs-4 with the appropriate byte order suffix is used
> Paul> instead.
> 
> This will yield incorrect results unless the chosen intermediate
> charset is actually the one used for wchar_t.

Tom, thanks for your feedback.

Yes, it clearly depends on picking the correct codeset.  If there were a
foolproof way to determine what that codeset is, that would be the best
answer.  I could not find one.  My reasoning is that UCS-2 for a 2-byte
wchar_t and UCS-4 for a 4-byte wchar_t is a likely answer, so while it
may be wrong for some platforms (at least in theory) it will also be
right for some, hopefully for most.  It can't make matters worse,
because any platform whose iconv doesn't know the codeset name "wchar_t"
currently doesn't work at all.
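
To make the fallback concrete, here is a minimal sketch of the selection
rule in C.  It is not the code from the patch: the helper names
guess_wchar_codeset and host_is_big_endian are hypothetical, and the
spellings "UCS-2LE", "UCS-4BE" and so on are the ones GNU libiconv
accepts; another iconv implementation may spell them differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from the patch: guess an iconv
   codeset name for a wchar_t of WCHAR_SIZE bytes.  BIG_ENDIAN_HOST is
   1 for a big-endian host and 0 for a little-endian one.  Returns
   NULL when no guess is available.  */
static const char *
guess_wchar_codeset (size_t wchar_size, int big_endian_host)
{
  assert (big_endian_host == 0 || big_endian_host == 1);

  switch (wchar_size)
    {
    case 2:
      return big_endian_host ? "UCS-2BE" : "UCS-2LE";
    case 4:
      return big_endian_host ? "UCS-4BE" : "UCS-4LE";
    default:
      return NULL;
    }
}

/* Detect the host byte order at run time by looking at the first
   byte of an integer whose value is 1.  */
static int
host_is_big_endian (void)
{
  const unsigned int probe = 1;
  unsigned char first_byte;

  memcpy (&first_byte, &probe, 1);
  return first_byte == 0;
}
```

A caller would try "wchar_t" first and only use
guess_wchar_codeset (sizeof (wchar_t), host_is_big_endian ()) when iconv
rejects that name.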
 
> Note that if this is the case for UCS-4, then your platform headers
> ought to define __STDC_ISO_10646__.  So, you could test that in
> gdb_wchar.h rather than do any configury.

NetBSD clearly is using UCS-4 for wchar_t, but it does not define that
symbol.
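
For reference, the header-only test being suggested would look roughly
like the sketch below.  The enum and function names are made up for
illustration and are not what gdb_wchar.h contains.  On a platform that
does not define the macro, as described here for NetBSD, the check
selects the narrow path even though wchar_t holds UCS-4.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical names, for illustration only.  */
enum intermediate_kind
{
  INTERMEDIATE_WIDE,    /* wchar_t, known to hold ISO 10646 values */
  INTERMEDIATE_NARROW   /* narrow chars in the host charset */
};

/* Choose the intermediate representation purely from what the
   platform headers promise.  */
static enum intermediate_kind
choose_intermediate (void)
{
#ifdef __STDC_ISO_10646__
  return INTERMEDIATE_WIDE;
#else
  return INTERMEDIATE_NARROW;
#endif
}

/* Human-readable description of the choice.  */
static const char *
intermediate_name (enum intermediate_kind kind)
{
  assert (kind == INTERMEDIATE_WIDE || kind == INTERMEDIATE_NARROW);
  return (kind == INTERMEDIATE_WIDE
          ? "wide (ISO 10646 wchar_t)"
          : "narrow (host charset)");
}
```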
 
> Alternatively, it is always safe to fall back to the code that uses
> narrow intermediate characters and host_charset for the intermediate
> encoding.

Yes, but doesn't that mean you can no longer print a wide string
accurately if one occurs in your program?  The string gets mapped to the
intermediate encoding first, and with narrow chars as the intermediate
encoding that translation is lossy.
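
A toy model of the loss I have in mind, assuming a one-byte host
charset of ISO-8859-1.  This is not GDB's conversion code, and the
function name narrow_to_latin1 is made up; it only shows that any code
point above 0xFF has nowhere to go in a one-byte intermediate form.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: squeeze N code points from IN into a one-byte
   intermediate encoding (ISO-8859-1 assumed), writing N bytes to OUT.
   Code points that do not fit are replaced by '?'.  Returns the
   number of characters that were lost.  */
static size_t
narrow_to_latin1 (const unsigned long *in, size_t n, unsigned char *out)
{
  size_t i;
  size_t lost = 0;

  assert (in != NULL && out != NULL);

  for (i = 0; i < n; i++)
    {
      if (in[i] <= 0xFF)
        out[i] = (unsigned char) in[i];
      else
        {
          out[i] = '?';
          lost++;
        }
    }
  return lost;
}
```

Feeding it U+0041, U+00E9, U+03A9 and U+4E2D keeps the first two and
turns the last two into '?', so the original wide string cannot be
recovered from the intermediate form.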
 
> Perhaps this "wchar_t" thing is not the best way for us to go.  Maybe
> better would be to test __STDC_ISO_10646__ and fall back to narrow
> chars in all other cases.

That sounds attractive.  But given that __STDC_ISO_10646__ isn't defined
in NetBSD even though it clearly supports wide chars and knows about
ucs-4, it doesn't seem to be workable.

> Other approaches are available too, but they are generally more work
> than simply using GNU libiconv.

Right, if you use libiconv then the issue goes away, and the patch I
wrote should handle that case cleanly.  I wanted to offer a solution to
people who don't want to install libiconv because they have a functional
iconv in libc, as is the case for NetBSD.

	paul

