This is the mail archive of the guile@cygnus.com mailing list for the guile project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
On Thu, Jul 23, 1998 at 01:04:05PM -0400, Radey Shouman wrote: > Telford Tendys <telford@triangle.triode.net.au> writes: > > > Well I looked at trying to speed up the garbage collector but > > (to the extent of my understanding) there's nothing much to be improved -- > > [ ... ] > > Guile currently allocates local environments in cons cells, for > typical programs this represents most of the consing done. In the > later versions of SCM there is a small cache of cells used only for > environments; this cache is managed by a copying collector that moves > live cells to the usual Scheme heap. Since most environment cells > become garbage almost immediately this saves the time that would be > used in the sweep phase of a mark/sweep collector and also results in > better locality of reference. OK, see if I understand this right: * you can make the garbage collector faster by giving it less work to do. * a lot of the garbage that needs collecting is objects that are used once and then discarded -- since we know that such objects only have a single reference to them, once that single reference is gone it is possible for these objects to take a short-cut through collection. * If, by chance, such an object collects a second reference then treat it as normal with no short-cut. > Using this copying collector required changes to the evaluator. > However, there is not really that much code that deals directly with > local environments. Environment cacheing appealed to me because > it required only incremental changes to SCM; with some more work > one might also allocate most local bindings directly on a stack. Well I must say that I doubt that I could jump in and make the changes but what you say seems to make sense. Certainly, the binding mechanism is so central to scheme that any improvement would pay big dividends. > A substantial part of the cost of allocating floating point numbers > is in mallocating and freeing each double. This part of the cost > can be reduced greatly without changing much: allocate each double > or complex (just the double, not the cell header) in a dedicated heap > and collect by copying at mark time. I have this working for floats > in SCM, and am working on using it for bignums. Bignums are more > difficult because the arithmetic functions assume that pointers > to big digits remain valid across potential garbage collections. Can you have variable sized cells rather than a cons cell pointing to allocated space? This sort of rolls the two allocation steps into one. Would this give any improvement? I noticed that Bohem's garbage collector has caches for all small sized items and just rounds up the size to the nearest thing that it has cached. Is this usable for scheme? Also note that (on almost every system) malloc stores the size of the block one word back from the start pointer. This means that mallocing lots of little objects has a horrible memory overhead when compared to mallocing one big array of little objects. If we are going to start with caching, can this method be helpful? > > What does scm_cells_allocated do? Can it be removed from this > > macro? I know it seems insignificant in terms of overhead but > > this macro is called LOTS of times. > > It allows the repl to print out the allocation cost of each command, > which is useful if you are trying to minimize that cost. Hmmm, I suspected it was something like that, seems to me that knowing the allocation cost of a command is useful but it is essentially a debugging feature rather than a core language feature. It can be implemented by doing a complete garbage collect first, then doing a scan of all live data to learn its size, then repeating the above after the command is evaluated. The expense of the garbage collects is irrelevant when debugging and the overhead is gone when you are not debugging. - Tel