This is the mail archive of the gdb@sourceware.org mailing list for the GDB project.
Re: printing wchar_t*
On Friday 14 April 2006 21:10, Eli Zaretskii wrote:
> > > If we want to support wchar_t arrays that store UTF-16, we will need
> > > to add a feature to GDB to convert UTF-16 to the full UCS-4
> > > codepoints, and output those.
> >
> > That's what I mentioned in a reply to Jim -- since the current string
> > printing code operates "one wchar_t at a time", it's not suitable for
> > outputting UTF-16 encoded wchar_t values to the user.
>
> I don't understand: if the wchar_t array holds a UTF-16 encoding, then
> when you receive the entire string, you have a UTF-16 encoding of what
> you want to display, and you yourself said that displaying a UTF-16
> encoded string is easy for you. So where is the problem? Is it only
> that you cannot know the length of the UTF-16 encoded string? Or is
> there something else missing?
For my frontend there is no problem: I can handle UTF-16 myself. However, if
gdb is ever to produce UTF-8 output that a console can display, it has to
handle surrogate pairs itself. Converting the first and second element of a
surrogate pair to UTF-8 individually won't work: neither half is a valid code
point on its own, so the result is not well-formed UTF-8. The pair has to be
combined into a single code point first.
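
To make the surrogate-pair point concrete, here is a minimal C sketch of the
conversion. It is not GDB code: the names utf16_next, utf8_encode and
utf16_to_utf8 are made up for this example, and it assumes the wchar_t data
has already been read into an array of 16-bit units in host byte order.
Replacing unpaired surrogates with U+FFFD is just one possible policy.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a GDB function): decode one code point from
   the UTF-16 buffer S of LEN units, starting at index *I, and advance *I
   past the units consumed.  An unpaired surrogate yields U+FFFD.  */
uint32_t
utf16_next (const uint16_t *s, size_t len, size_t *i)
{
  uint32_t hi = s[(*i)++];

  /* A high surrogate followed by a low surrogate forms one code point.  */
  if (hi >= 0xD800 && hi <= 0xDBFF && *i < len)
    {
      uint32_t lo = s[*i];

      if (lo >= 0xDC00 && lo <= 0xDFFF)
        {
          (*i)++;
          return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
        }
    }

  /* Any surrogate that reaches this point is unpaired.  */
  if (hi >= 0xD800 && hi <= 0xDFFF)
    return 0xFFFD;

  return hi;
}

/* Hypothetical helper: encode code point CP (at most 0x10FFFF) as UTF-8
   into OUT, which must have room for 4 bytes.  Returns the byte count.  */
size_t
utf8_encode (uint32_t cp, unsigned char *out)
{
  if (cp < 0x80)
    {
      out[0] = (unsigned char) cp;
      return 1;
    }
  if (cp < 0x800)
    {
      out[0] = (unsigned char) (0xC0 | (cp >> 6));
      out[1] = (unsigned char) (0x80 | (cp & 0x3F));
      return 2;
    }
  if (cp < 0x10000)
    {
      out[0] = (unsigned char) (0xE0 | (cp >> 12));
      out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
      out[2] = (unsigned char) (0x80 | (cp & 0x3F));
      return 3;
    }
  out[0] = (unsigned char) (0xF0 | (cp >> 18));
  out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
  out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
  out[3] = (unsigned char) (0x80 | (cp & 0x3F));
  return 4;
}

/* Convert LEN UTF-16 units from S to UTF-8 in OUT.  OUT must have room
   for 3 * LEN bytes, the worst case.  Returns the bytes written.  */
size_t
utf16_to_utf8 (const uint16_t *s, size_t len, unsigned char *out)
{
  size_t i = 0, n = 0;

  while (i < len)
    n += utf8_encode (utf16_next (s, len, &i), out + n);
  return n;
}
```

For the pair 0xD834 0xDD1E (U+1D11E), this produces the four bytes
F0 9D 84 9E. Encoding each half separately would instead give
ED A0 B4 ED B4 9E, which is not valid UTF-8.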
- Volodya