This is the mail archive of the cygwin mailing list for the Cygwin project.
On Dec 24 15:36, Xuefer wrote:
> tested with
> $ uname -a
> CYGWIN_NT-6.1 mOo-PC 1.7.27(0.271/5/3) 2013-12-09 11:54 x86_64 Cygwin
>
> run the following code in .bat file, the file should be in GBK
> encoding. as your system should be GBK encoding by default to parse
> the batch file correctly
> or copy paste the code to start->run
> ==[ to get actual wrong output ]
> c:\app\cygwin\bin\env LANG=zh_CN.UTF-8 PATH=/usr/bin bash -c "echo 中文;
> echo 中文 > a.txt; cat a.txt; xxd a.txt; echo please vim a.txt; sh"
> ===============
>
> ==[ actual output ]
> 中 文
> 中 文
> 0000000: 18e4 b8ad 18e6 9687 0a .........
> please vim a.txt
> sh-4.1$
> ===============
> now when you do "vim a.txt", you see
> a.txt
> ^X中^X文

I'm sorry, but I have a hard time testing this. I don't have a system
that allows switching the console to codepage 936, which would be
required to give this a try. Also, the a.bat.txt file you attached to
your mail seems to be broken. The characters in the `echo' commands
seem to consist of four 0x3f hex values, which is probably not what you
wanted. This doesn't look like valid GBK encoding.

I have a hunch what the problem might be, though. When you start the
batch file, you don't have any POSIX environment variable set to tell
Cygwin which codeset you're using. The first process started here is
`env'. When you set LANG, it's env doing this, but it does so only
*after* reading the command line. Env itself will use what is set in
the environment prior to starting env. So when env evaluates the
command line, it assumes that the Cygwin locale is supposed to be set
to "C" or "POSIX", which is ASCII-only per POSIX. In that case, all
non-ASCII chars in the input will be converted to replacement byte
values, starting with ^X (== 0x18), followed by the UTF-8 value of the
input character. That's what you see.

If my hunch is more or less correct, a workaround would be to make sure
the LANG or LC_CTYPE variable is set before calling the first Cygwin
process.
So, please change your bat file to something like this and try again:

  set LC_CTYPE=zh_CN.UTF-8
  c:\app\cygwin\bin\env PATH=/usr/bin bash -c "echo 中文; echo 中文 > a.txt; cat a.txt; xxd a.txt; echo please vim a.txt; sh"

Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
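
For reference, the byte pattern in the quoted xxd dump can be checked with a short Python sketch. It only illustrates the "^X followed by the UTF-8 value" layout described above and is not how Cygwin itself does the conversion. The names `MARKER` and `strip_markers` are made up for this example, and it assumes every 0x18 byte in the data is such a prefix.

```python
# Bytes copied from the xxd dump quoted in the message.
dump = bytes.fromhex("18e4b8ad18e696870a")

MARKER = 0x18  # ^X, the prefix byte visible in the dump


def strip_markers(data: bytes) -> bytes:
    """Drop every 0x18 byte (assumption: each one is a prefix marker)."""
    return bytes(b for b in data if b != MARKER)


cleaned = strip_markers(dump)

# What remains should be plain UTF-8 if the layout is as described.
text = cleaned.decode("utf-8")

print("markers found:", dump.count(MARKER))
print("remaining bytes:", cleaned.hex())
print("decoded text:", repr(text))
```

If the decode succeeds, the file contains the intended characters in UTF-8 with one extra 0x18 byte in front of each, which matches the explanation that the command line was parsed under an ASCII-only locale.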