This is the mail archive of the cygwin mailing list for the Cygwin project.



line endings, file path names (was: Updated: sed-4.1.5-2)


Corinna Vinschen wrote:

[JA wrote:]
Thank you very much for this fix. It will make life easier for all of
us who struggle with a mix of native and Cygwin tools. It is very much
appreciated that as far as line endings are concerned the attitude
taken by Cygwin developers is not "use POSIX line endings".

Sorry, but that's not why I did it. My personal opinion is still strongly on the "use POSIX line endings" side.

Too bad.


I made the fix only so that other mailing lists don't suffer

This is a strange reason for changing sed's functional behaviour, but since I like the outcome I won't complain. :-)

CRLF lineendings are in the top 10 of the worst ideas in the OS
business.

I agree 100%, and I also agree that DOS path names were a horrendous idea, but neither of these questions is at issue here.


and I'm seriously contemplating (for years) to just remove textmode
from Cygwin.

This is where I disagree completely. From "CRLF was a bad idea" it does not follow that "hence we should not support it". This would just be sticking your head in the sand. Bad idea or not, you, or rather a text processing tool like sed, cannot avoid being faced with millions of documents that use CRLF, and a few with Mac line endings too. The realization that it was a bad idea does not make these go away.


The only realistic approach here, and more so with line endings than with the path name issue, is that taken by XML (about which I usually have no good word to say):

2.11 End-of-Line Handling

 XML parsed entities are often stored in computer files which, for editing
 convenience, are organized into lines. These lines are typically separated
 by some combination of the characters carriage-return (#xD) and line-feed
 (#xA).

 To simplify the tasks of applications, the characters passed to an
 application by the XML processor must be as if the XML processor
 normalized all line breaks in external parsed entities (including the
 document entity) on input, before parsing, by translating both the
 two-character sequence #xD #xA and any #xD that is not followed by #xA
 to a single #xA character.
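To make the quoted rule concrete, here is a minimal sketch in Python of the normalization it describes. The function name normalize_line_breaks is made up for illustration; it is not taken from sed, Cygwin, or any XML processor, and it only shows the translation step, not how any real tool implements it.

```python
def normalize_line_breaks(text: str) -> str:
    """Translate CR LF pairs and lone CRs to a single LF.

    Hypothetical helper mirroring the rule quoted from XML 1.0 section 2.11.
    """
    # Handle the two-character sequence first, so that the CR of a CR LF
    # pair is not turned into an extra line break by the second step.
    return text.replace("\r\n", "\n").replace("\r", "\n")


if __name__ == "__main__":
    # One line each with DOS, old Mac, and POSIX line endings.
    sample = "dos line\r\nold mac line\rposix line\n"
    print(repr(normalize_line_breaks(sample)))
```

The order of the two replacements matters: converting lone CRs first would turn every CRLF into two line breaks.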

With respect to "text mode", don't forget that this is also part of the ISO standards for C and C++, although those standards don't go as far as XML does.

Another way to look at the issue: You can always blame the whole mess on those who started the whole CRLF thing, and I'm all on your side, but users of your tools will have to muddle through this mess one way or another. You can make it easier for your users by making the tools tolerate inputs that are affected by the mess that exists in real life, or you can make it difficult. If you take the latter route, people will gravitate toward other tools in the long run. Cygwin has become as popular as it is because it helped get the job done, where the job is dealing with a mixed environment (POSIX-like behaviour in a non-POSIX environment).

Joachim

--
work:     joachima@netacquire.com   (http://www.netacquire.com)
private:  joachim@kraut.ca          (http://www.kraut.ca)
