This is the mail archive of the cygwin-developers mailing list for the Cygwin project.
Re: Lone surrogates in UTF-8? (was: Re: Console codepage setting via chcp?)
On Sep 28 13:39, Andy Koppe wrote:
> 2009/9/28 Corinna Vinschen:
> >> Oh, and I thought of one more thing that won't roundtrip correctly
> >> from Unix to Windows and back: a high surrogate directly followed by a
> >> low surrogate, because they'll combine into a non-BMP codepoint
> >> represented by a 4-byte sequence. That's near-impossible to happen by
> >> chance though.
> >
> > There is no chance to do that right. But I'm willing to stick to
> > this trade-off since, as you wrote, it's near-impossible that somebody
> > created that filename by chance.
>
> Hmm. But what if Java or Oracle or some other CESU-8 degenerate did
> that on purpose?
>
> Just in case you're not yet completely sick of this, here's how I
> think it could be done:
Nooooo! I *am* completely sick of this. I'm willing to let this slip
until the first complaint about this very issue comes along.
Corinna
--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
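
For illustration, the round-trip problem described in the quoted text can be reproduced with a small sketch. This is not Cygwin's conversion code. The function names unix_to_windows and windows_to_unix are hypothetical, and the sketch assumes that a surrogate in a Unix filename is stored as its own 3-byte UTF-8 sequence and that the Windows-to-Unix direction combines any valid surrogate pair into one code point.

```python
# Hypothetical sketch of the two conversion directions; not Cygwin's code.

def unix_to_windows(name):
    """UTF-8 bytes -> list of UTF-16 code units.

    Assumption: surrogates encoded as 3-byte sequences pass through unchanged.
    """
    units = []
    for ch in name.decode("utf-8", "surrogatepass"):
        cp = ord(ch)
        if cp >= 0x10000:
            # Non-BMP code point: split into a high/low surrogate pair.
            cp -= 0x10000
            units.append(0xD800 | (cp >> 10))
            units.append(0xDC00 | (cp & 0x3FF))
        else:
            units.append(cp)
    return units


def windows_to_unix(units):
    """List of UTF-16 code units -> UTF-8 bytes.

    A high surrogate directly followed by a low surrogate is combined into
    one non-BMP code point; a lone surrogate is kept as a 3-byte sequence.
    """
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if 0xD800 <= u <= 0xDBFF and has_low:
            cp = 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            i += 2
        else:
            cp = u
            i += 1
        out.append(chr(cp).encode("utf-8", "surrogatepass"))
    return b"".join(out)


# A lone high surrogate (U+D83D) survives the round trip.
lone = b"\xed\xa0\xbd"
lone_back = windows_to_unix(unix_to_windows(lone))

# A high surrogate directly followed by a low surrogate (U+D83D, U+DE00),
# each stored as its own 3-byte sequence, does not: the pair is combined.
pair = b"\xed\xa0\xbd\xed\xb8\x80"
pair_back = windows_to_unix(unix_to_windows(pair))

print(lone_back == lone)
print(pair_back == pair, len(pair), len(pair_back))
```

Under these assumptions the 6-byte name comes back as a 4-byte sequence for a single non-BMP code point, which is the case Andy describes. A filename produced by a CESU-8 encoder would take exactly this form.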