This is the mail archive of the
mailing list for the Cygwin project.
Re: [ANNOUNCEMENT] Updated: dash-0.5.8-3
- From: Thomas Wolff <towo at towo dot net>
- To: cygwin at cygwin dot com
- Date: Mon, 13 Feb 2017 23:03:11 +0100
- Subject: Re: [ANNOUNCEMENT] Updated: dash-0.5.8-3
- Authentication-results: sourceware.org; auth=none
- References: <email@example.com> <firstname.lastname@example.org> <email@example.com> <20170131100402.GB29504@calimero.vinschen.de> <20170131131616.GC29504@calimero.vinschen.de> <firstname.lastname@example.org> <20170131153245.GA8905@calimero.vinschen.de>
On 31.01.2017 at 16:32, Corinna Vinschen wrote:
So the flag is always set initially? Also on Linux? Does it (on Linux)
also have an effect for non-UTF-8 multibyte encodings?
And can't the Cygwin DLL set the flag to match the locale setting in
effect when it is invoked?
I can (and will if appropriate) handle the flag in mintty as needed, but
what if someone calls LC_ALL=.other_encoding dash later within the
terminal session? I guess the more consistent solution would be to
handle this in the cygwin DLL.
On Jan 31 16:01, Houder wrote:
On Tue, 31 Jan 2017 14:16:16, Corinna Vinschen wrote:
I'm not quite sure yet but apparently the problem is in the handling of
VERASE in the termios implementation. In cooked mode it fills a char
buffer with what has been typed. The code doesn't know if the bytes in
the buffer are UTF-8 chars or just random bytes. So VERASE erases
exactly one byte, which means, in case of UTF-8 chars it only erases the
last byte of a multibyte character.
Ok, here's what happens on Linux: The termios code supports a flag
IUTF8. This flag determines if the termios code checks for UTF-8
characters in the input when performing an ERASE. It checks if the
IUTF8 flag is set and if so, it checks in a loop if the just erased byte
is a UTF-8 continuation byte. If so, it erases another byte.
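To make the loop described above concrete, here is a minimal sketch in C. It is not the Linux or Cygwin source; the function names and the flat byte buffer are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* A UTF-8 continuation byte has the bit pattern 10xxxxxx. */
int is_utf8_continuation(unsigned char c)
{
    return (c & 0xC0) == 0x80;
}

/*
 * Hypothetical helper: drop the last character from a cooked-mode
 * input buffer holding `len` bytes and return the new length.
 * `iutf8` stands in for the IUTF8 flag being set.
 */
size_t erase_last_char(const unsigned char *buf, size_t len, int iutf8)
{
    assert(buf != NULL);
    if (len == 0)
        return 0;

    len--;                      /* ERASE always removes one byte */
    if (iutf8) {
        /*
         * buf[len] is the byte just removed. If it was a continuation
         * byte, earlier bytes of the same character are still in the
         * buffer, so remove one more and check again.
         */
        while (len > 0 && is_utf8_continuation(buf[len]))
            len--;
    }
    return len;
}
```

With the flag off, erasing from the three bytes "a" 0xC3 0xA4 leaves two bytes (a dangling lead byte); with the flag on, it leaves only "a".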
Agreed. One byte or more, depending on the "character" ... (which is
not a problem in case of UTF-8 encoding -- continuation bit).
Of course, the terminal driver must receive the characters encoded in UTF-8.
... It's the termios implementation
inside Cygwin. I created a patch introducing the IUTF8 flag as on Linux,
along with a code snippet that tries to remove entire UTF-8 characters
from the input if the IUTF8 flag is set. The flag is now set by default,
since we default to UTF-8 anyway.
Thomas, you may want to check for the IUTF8 flag in upcoming mintty
versions and unset it if the character set configured in the mintty
options dialog is not UTF-8.