This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: Strange message from updatedb


On 2/27/07, Igor Peshansky <pechtcha@cs.nyu.edu> wrote:
> On Tue, 27 Feb 2007, Phil Edwards wrote:
> > Quotes and backslashes aren't going to solve the problem, I think.  I
> > looked at updatedb (it's a shell script), and the --prunepaths
> > argument is passed through a sed script which replaces spaces in order
> > to turn it all into a regexp.  There's no way of telling sed to avoid
> > some spaces and translate others.
>
> That's not quite true.

I should have said "no way without modifying updatedb ourselves". Users stuck with the default updatedb have no environment variable or config-file option they can set to work around the hardcoded space character in the sed script. (Changing IFS before calling updatedb doesn't help either; it breaks too many things in the rest of updatedb.)
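
To make the failure concrete, here is a minimal sketch of that kind of space-to-regexp translation. It is a simplified stand-in, not the exact code from updatedb, and the function name prune_to_regex is made up for illustration.

```shell
#!/bin/sh
# Simplified stand-in for the --prunepaths translation step.
# NOT the actual updatedb code; prune_to_regex is a hypothetical name.
prune_to_regex() {
    # Wrap the whole list in \(^ ... $\) and turn every space into
    # $\)\|\(^ so that each word becomes its own anchored alternative.
    printf '%s\n' "$1" |
        sed -e 's,^,\\(^,' -e 's, ,$\\)\\|\\(^,g' -e 's,$,$\\),'
}

# Two paths without spaces: one alternative per path, as intended.
prune_to_regex '/tmp /usr/tmp'
# prints: \(^/tmp$\)\|\(^/usr/tmp$\)

# A path containing a space is cut in half, because sed cannot tell a
# space inside a path from a space between paths.
prune_to_regex '/tmp /cygdrive/c/Program Files'
# prints: \(^/tmp$\)\|\(^/cygdrive/c/Program$\)\|\(^Files$\)
```

Neither of the last two alternatives matches "/cygdrive/c/Program Files", so that directory is never pruned.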

An option to updatedb to set the separator character, like -d in cut(1), would be nice.
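
A rough sketch of what such an option could boil down to, assuming a caller-supplied separator. The function name prune_to_regex_sep and its second argument are hypothetical; nothing like this is part of updatedb as discussed in this thread.

```shell
#!/bin/sh
# Hypothetical: build the prune regexp from a list split on a chosen
# separator instead of on spaces. Not part of updatedb.
prune_to_regex_sep() (
    # The function body is a subshell, so the IFS change and 'set -f'
    # below do not leak into the caller.
    list=$1
    IFS=${2:-:}     # separator character; defaults to a colon
    set -f          # keep the shell from glob-expanding the entries
    regex=
    for p in $list; do
        if [ -n "$regex" ]; then
            regex="$regex"'\|'
        fi
        regex="$regex"'\(^'"$p"'$\)'
    done
    printf '%s\n' "$regex"
)

prune_to_regex_sep '/tmp:/cygdrive/c/Program Files' ':'
# prints: \(^/tmp$\)\|\(^/cygdrive/c/Program Files$\)
```

Confining the IFS change to a subshell sidesteps the "breaks the rest of the script" problem, since only the splitting step sees the new value. As with the space-separated version, entries are used as regexps as-is, so metacharacters such as "." are not escaped.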


> > You used to be able to set the internal PRUNEREGEX variable directly,
> [...]
> So the behavior should be the same, unless the configure
> options differed when the packages were built.  This is something best
> answered by the findutils maintainer...

This file turns out to be purely packaging, used by distros to manage automatic runs of updatedb with cron. (There are actually some bugs reported because the conf file is used "only" by the packaging, and not by updatedb itself, contrary to some expectations.) So this was a red herring; my bad.


> > Most lists of dirs are passed around with colon (or some such)
> > separators to avoid just such problems with paths containing
> > whitespace.  updatedb is still living in the 80's.
>
> Well, it's a matter of convention.  Colons are legal in filenames on Unix,
> as is pretty much any character except for NUL.  However, many tools treat
> colons specially, so it's conventionally used as a separator.  If you have
> to pick a character to use as a path separator, a space is as good as any.
> You'd still need quoting or escape characters to represent the separator.

I'm aware of the restrictions, and I will bet long odds that spaces show up in filenames far more often than colons (or any other punctuation commonly used as a separator in pathname lists) do. I've not changed my opinion that updatedb is behind the times and is needlessly complicating things for admins, but I also don't expect that opinion to change anybody else's mind -- this bug has been reported before, against other distros, and gets rejected because "you shouldn't use spaces in filenames under Linux." How quaint.

I'll probably hack up the local copy of updatedb so that it works for
me <bitter>on filenames containing the standard word separator for
English, oh noes!</bitter>, and the upstream non-cygwin maintainers
can continue to use the Linux version.  This is something I've been
considering for a while anyhow for a totally unrelated project, so
it's not troublesome, just disappointing.


