This is the mail archive of the cygwin mailing list for the Cygwin project.
Re: sed doesn't like LANG= anymore
- From: Andy Koppe <andy dot koppe at gmail dot com>
- To: "cygwin at cygwin dot com" <cygwin at cygwin dot com>
- Date: Thu, 20 May 2010 19:05:17 +0300
- Subject: Re: sed doesn't like LANG= anymore
- References: <20100520123926.GA1432@onderneming10.xs4all.nl>
On Thursday, May 20, 2010, Jurriaan wrote:
> A very long sed script that's been working for ages (back from the 1.5
> age) here has stopped working.
>
> It turned out sed doesn't like some strings anymore when environment
> variable LANG is empty. With LANG=ASCII, there are no problems.
>
> The actual text in the SED command is shown below as spaces, but it's a
> Swedish a with a small o on top of it, like this:
>
> sed -e"s/@a/ a/g;"
>
> where a is character 0xe5.
>
> Running with LANG=ASCII works, with LANG empty I get 'unterminated `s'
> command' from sed (which confused me for a while).
With empty LANG you're using the default UTF-8 encoding, where that
0xe5 byte constitutes an incomplete character. You need to either run
with a LANG setting that fits your script, e.g. C.ISO-8859-1, or
convert your script to UTF-8. I'm puzzled as to why LANG=ASCII would
have worked, since that's not a valid setting.
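
A minimal sketch of both options, not taken from the original script: the file names and the sed expression are made up for illustration, and only the 0xe5 byte comes from the report. In UTF-8, 0xe5 is the lead byte of a three-byte sequence, so a lone 0xe5 followed by "/" is not a complete character.

```shell
# Hypothetical file names; the expression only illustrates a script that
# contains the raw byte 0xe5 (a-ring in ISO-8859-1), written here as octal \345.
printf 's/@a/\345/g\n' > script-latin1.sed

# Option 1 (on Cygwin): declare the script's encoding through the locale.
# Left commented out because C.ISO-8859-1 may not exist on other systems.
#   LANG=C.ISO-8859-1 sed -f script-latin1.sed input.txt

# Option 2: convert the script to UTF-8 once, then run it under the default locale.
iconv -f ISO-8859-1 -t UTF-8 script-latin1.sed > script-utf8.sed

# Inspect the bytes: the single 0xe5 should now be the two bytes c3 a5.
od -An -tx1 script-utf8.sed
```

If the input files are also in ISO-8859-1, they would need the same conversion for option 2, whereas option 1 leaves everything as it is.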
Andy