This is the mail archive of the kawa@sources.redhat.com mailing list for the Kawa project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: volunteer project: srfi 14 (char-sets)

From: Per Bothner <per at bothner dot com>
To: Jim White <jim at pagesmiths dot com>
Cc: kawa at sources dot redhat dot com
Date: Thu, 24 Mar 2005 08:45:37 -0800
Subject: Re: volunteer project: srfi 14 (char-sets)
References: <424251C8.6060901@bothner.com> <4242B53B.5020305@pagesmiths.com>

Jim White wrote:

I intend to take a crack at this, but I don't want to use bitmaps.


Bitmaps are not appropriate - thwy'd need up to 2*20 bits for the
multiple Unicode planes.  That's why I sugested in my sketch
an implementation based on sorted ranges.  Logarithmic search;
linear update including set operations and copy.

Of course they're inefficient for an Ascii-only set that has many
short ranges - but how often do people need that?  If that's an
issue, we can have a third concrete implementation type that uses
bit vectors.  Or we could have a hybrid implemenmtation:  Use one
bit in the starting-index of a pair to indicate that the following
int is a 32-bit mask, instead of a range end.  Then we're only
taking at most twice as much space as bit vector implementation,
though we still have to do binary search to find the correct
(start, mask) pair.  And update operations become more complex.
--
	--Per Bothner
per@bothner.com   http://per.bothner.com/

References:
- volunteer project: srfi 14 (char-sets)
  - From: Per Bothner
- Re: volunteer project: srfi 14 (char-sets)
  - From: Jim White

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]