This is the mail archive of the cygwin-developers mailing list for the Cygwin project.



Re: Lone surrogates in UTF-8? (was: Re: Console codepage setting via chcp?)


On Sep 28 13:39, Andy Koppe wrote:
> 2009/9/28 Corinna Vinschen:
> >> Oh, and I thought of one more thing that won't roundtrip correctly
> >> from Unix to Windows and back: a high surrogate directly followed by a
> >> low surrogate, because they'll combine into a non-BMP codepoint
> >> represented by a 4-byte sequence. That's near-impossible to happen by
> >> chance though.
> >
> > There is no chance to do that right.  But I'm willing to stick to
> > this trade-off since, as you wrote, it's near-impossible that somebody
> > created that filename by chance.
> 
> Hmm. But what if Java or Oracle or some other CESU-8 degenerate did
> that on purpose?
> 
> Just in case you're not yet completely sick of this, here's how I
> think it could be done:

Nooooo!  I *am* completely sick of this.  I'm willing to let this slip
until the first complaint about this very issue comes along.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
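
The round-trip problem quoted above (a high surrogate directly followed
by a low surrogate) can be illustrated with a minimal sketch. It is not
Cygwin's actual conversion code: the helper name encode_3byte and the
example pair U+D83D / U+DE00 are assumptions chosen only for
illustration. It shows two separately encoded surrogates (3 bytes each,
CESU-8 style) coming back as a single 4-byte sequence once the pair is
combined into a non-BMP codepoint.

```python
# Illustrative only: hypothetical helper, arbitrary example surrogate pair.
HIGH, LOW = 0xD83D, 0xDE00


def encode_3byte(unit: int) -> bytes:
    """Encode one 16-bit value as a generic 3-byte UTF-8-style sequence."""
    return bytes([
        0xE0 | (unit >> 12),
        0x80 | ((unit >> 6) & 0x3F),
        0x80 | (unit & 0x3F),
    ])


# Unix side: each surrogate was stored on its own, 6 bytes in total.
unix_name = encode_3byte(HIGH) + encode_3byte(LOW)

# Windows side: the two sequences become two adjacent UTF-16 code units,
# which are indistinguishable from a genuine surrogate pair.
codepoint = 0x10000 + ((HIGH - 0xD800) << 10) + (LOW - 0xDC00)

# Back to Unix: the pair is now one non-BMP codepoint, encoded in 4 bytes.
roundtripped = chr(codepoint).encode("utf-8")

print(unix_name.hex())     # eda0bdedb880
print(roundtripped.hex())  # f09f9880
```

The byte strings differ, so a filename built this way would not survive
the trip from Unix to Windows and back unchanged.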

