This is the mail archive of the guile@cygnus.com mailing list for the guile project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: Scheme style auto-resizing hashtable (fwd)


On Mon, 2 Nov 1998, Ian Bicking wrote:

> Just some comments on naming:
> 
> ...
> > 	dictionary-lookup
> 
> Should this maybe be dictionary-ref -- to keep with the vector-ref
> notation?  Not that "ref" is too meaningful, but it is consistent.
> Well, maybe... lookup sounds better, but remembering which
> good-sounding accessor goes with which data type can be hard.

*sigh*, yes I see your point, there's a "list-ref" too.

but I'm still married to "dictionary-lookup".  "dictionary-ref" just
doesn't sound right to me.

> ...
> > 	dictionary-map
> > 	dictionary-foreach
> 
> dictionary-map-pair, dictionary-foreach-pair.  Or
> dictionary-map-value, dictionary-foreach-value.  I'd vote for mapping
> to values as being the default (i.e., dictionary-map maps the values).

possibly.  "dictionary-map-value" would of course be a simple application
of "dictionary-map":

(define dictionary-map-value
  (lambda (value-proc dictionary)
    (let ((pair-proc (lambda (pair) (value-proc (cdr pair)))))
      (dictionary-map pair-proc dictionary))))

I guess "dictionary-map-value" would be very useful to potential users. 
But at some point we need to decide where to draw the line between
often-used procedures that should be defined in the extension and
lesser-used procedures that users should write for themselves.  It's
probably best not to try to anticipate every possible procedure that a
user may ever want. 

> And, as I think about it, dictionary-map-pair is unlikely to be used,
> a rather silly function, I think.  dictionary-foreach-pair is all you
> probably need.

I like "dictionary-map-pair" because it yields, e.g. "dictionary->keys",
"dictionary->values", "dictionary->alist".

> ("pair" or "association"?  Pair is shorter, association is more
> accurate...)

mmm, I dunno.  I'm not married to either.  But I do prefer
"dictionary->alist" over "dictionary->pairs".

> A dictionary-rehash! function, perhaps?

excellent idea, useful for cases where the user knowingly alters the keys
of key-value pairs stored in the dictionary.  (perhaps they do a
"dictionary->alist", and then fool around with the pairs in the alist).

> If dictionary-update! does what I think, it should probably be called
> dictionary-set!

nope, sorry.  Nobody correctly guessed my intentional meaning.  Very bad
job of naming, sorry.

I'll give:

dictionary-insert-alist!
dictionary-insert-dictionary!

and their memory-conscious siblings:

dictionary-consume-alist!       -- delete alist while inserting pairs
dictionary-consume-dictionary!  -- delete buckets while inserting entries

> There should be a "dictionary?"

yes, *smack forehead*, I forgot.
this is okay, but not foolproof:

(define dictionary?
  (lambda (obj)
    (and (pair? obj)
         (let ((header (car obj)))
           (and (vector? header)
		(let ((dictionary-type (vector-ref header 0)))
		  (or (eq? dictionary-type 'dictionary)
		      (eq? dictionary-type 'dictionaryv)
		      (eq? dictionary-type 'dictionaryq))))))))

> It might be nice to have a form like:
> (dictionary '((key1 . value1) (key2 . value2) ...))
> 
> Or maybe:
> (dictionary '(key1 . value1) '(key2 . value2) ...)
> I don't know which is really better.

I would prefer to give the user a chance to set up their dictionary the
way they want it prior to "converting" an alist to a dictionary:

(define foo (make-dictionaryv))       ;; I'm going to use mostly integers
(dictionary-disable! foo 'store-hash) ;; and symbols as keys

(dictionary-insert-alist! foo '((color . "blue") (42 . "foo")))


> dictionary-stats and dictionary-more-stats don't make any sense to me.

umm, they're for people who want to check how well the hasher is handling
their keys...?  *blush*, alright, those procedures were solely for *my*
use.

> If there are a lot of stats, maybe a lot of functions are called for
> (one for each stat).  If collecting the stats is expensive, but you
> get them all at once, then... hmm... I dunno.
>

"stats" would give everything that can be deduced from the info in the
header 

#(dictionary-type number-entries auto-shrink-flag store-hash-flag)

and the length of the vector of buckets.  So, stats may do this:

guile> (dictionary-stats foo)
((dictionary-type . 'dictionaryv) (auto-shrink-enabled . #t)
 (store-hash-enabled . #f) (number-entries . 18) (number-buckets . 8)
 (mean-bucket-size . 2.25))

"more-stats" would actually look at the length of each bucket and then
spit out things like 'number-nonempty-buckets, 'mean-nonempty-bucket-size,
'largest-bucket-size, 'largest-bucket-index, 'bucket-size-quartiles,
'bucket-size-standard-deviation, .... etc.
 
> Why is there a "!" after dictionary-copy! ?  Doesn't it make a
> (non-destructive) copy of a dictionary?

oops, right.

> Maybe there should be a dictionary-append! (is this what you meant by
> dictionary-copy!?)  "append" does imply some sort of ordering, though.

dictionary-insert-alist and friends should provide the functionality you
have in mind here.

> What is dictionary-clear! supposed to do?
> 

basically delete all entries.  It would delete all the buckets while
traversing them, break all the old cons cells, and in general make life
easy for Guile's gc.

> There should be a dictionary-length.

yes, right.

> That's all I can think of at the moment.

thank you VERY much for your input.

	Jay
	jglascoe@jay.giss.nasa.gov