This is the mail archive of the guile@sources.redhat.com mailing list for the Guile project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: OK, what about some resolution (Re: GUILE's GC - why we strugglingto solve already solved problems?)


Hello together!

When discussing about replacing guile's gc, we should first remember that the
very beginning of this discussion were the repeated complaints about guile
stealing the main function.  It seems to me that this is the main issue,
rather than the quality of guile's current gc.  This problem could be solved
if somebody was willing to extract the corresponding parts of the Boehm gc and
to place them into guile.

There are some counterarguments against going this way: First, it means some
work to extract that code into guile and also to periodically import patches
from Boehm's code into guile.  Second, it is claimed that the work spent with
guile's own gc is lost, since by switching to Boehm's collector we get a gc
with better algorithms than guile's, we would get improved performance and we
would not have to bother about gc optimization any more.

To the first point:
-------------------

It's true that extracting the relevant bits from the Boehm code and tracking
the patches means some work.  However, we must be aware that adapting the
Boehm collector to guile will also require a lot of work and changes to guile
if we want the collector to work reliably and efficiently.  I will give some
examples:

* a garbage collector as Boehm's requires that pointers always look like
  pointers, because only pointers that look like pointers to the collector
  are treated as references to memory regions that have to be kept.  In
  guile, currently pointers do not always look like pointers.  For
  example, in gloc cells there is an offset added, thus obfuscating the
  original pointer.  This may not be a problem, since the resulting value
  still looks like a pointer _inside_ of the memory region.  Thus, in this
  special case it can be assumed that we would be on the safe
  side.  However, a real problem are pointers on systems, where UNICOS is
  defined, as the following excerpt from gc.h shows:

  #ifdef _UNICOS
  #  define SCM2PTR(x) ((SCM_CELLPTR) (SCM_UNPACK (x) >> 3))
  #  define PTR2SCM(x) (SCM_PACK (((scm_bits_t) (x)) << 3))
  #else
  #  define SCM2PTR(x) ((SCM_CELLPTR) (SCM_UNPACK (x)))
  #  define PTR2SCM(x) (SCM_PACK ((scm_bits_t) (x)))
  #endif /* def _UNICOS */

  On these systems, Boehm wouldn't work.  I currently don't know of other
  places within guile where this kind of problem exists.

* guile requires that all of it's cells are 8 byte aligned, because 
  the lower 3 bits of the address are needed for type information.  When
  replacing SCM_NEWCELL with a call to a standard-style malloc, then we
  always have to over-allocate in order to be able to guarantee the 8 byte
  alignment for the actual cell data.  If there is no support for this
  kind of requirements, then with Boehm's collector we will always have an
  memory usage overhead of at least 50% (assuming that the result of
  malloc is 4 byte aligned we would have to allocate 12 instead of 8
  bytes to be able to provide the required alignment.)

* the fact that guile's cell heap and the systems's general malloc heap
  are separated gives us a variety of debugging possibilities.  For
  example, the kind of checks enabled by compiling guile and extension
  code with SCM_DEBUG_CELL_ACCESSES enabled _might_ become difficult to
  provide with a system where there is simply no information about _where_
  in memory cells may occur:  If cell memory is obtained via some malloc
  function that is also used to obtain memory for other purposes, there
  isn't any distinction any more.

* when cells are collected, we need to provide our own finalization code.  
  It may be possible that Boehm provides this feature.

We may be able to handle some of the issues above by providing specialized
type handlers etc. to Boehm's collector and to change for example guile's
type system etc.  But:  This means a lot of work.


To the second point:
--------------------

In general, it should be obvious that a system that can make use of internal
knowledge, will be always superior to a more general system where no such
knowledge is available, given that both systems use similar algorithms.  For
the case of guile's gc compared to the Boehm gc, we can only expect a
performance improvement by using Boehm's collector (if at all) as long as we
keep guile's current gc without further improvements.  As soon as guile's
collector is improved to use better algorithms, it will again have the
advantage of internal knowledge.

Thus, even if there were performance improvements with Boehm's collector
compared to the current guile, switching over to Boehm would in the long run
mean a performance loss compared to a potential improved collector that is
special for guile's way of using memory.  But, I doubt that performance is
_the_ issue with the current discussion, it just started to go that way
although it originated from the main() issue.  If performance was _the_
core problem, it would be more important to get Jost's environment patches
into guile (startup being four times as fast as with guile 1.4.1 !) or to
switch to a different evaluator.

And, there are some possible improvements to guile's gc which are 'just around
the corner', and I am not talking of Godot's brother (the generational gc):

* the separation of the gc bits from the type tags.  This step may bring
  performance improvements due to reduced page faults.  It may also bring
  performance improvements due to the fact that it reduces the pressure on
  guile's type bits, because we might be able to use more efficient type
  encodings in some places.

* explicit mark stack:  This will bring some performance improvement
  during the mark phase.  The change is not very complicated and requires
  only to modify scm_gc_mark.  Greg Harvey's code already contains a
  sample implementation.

* lazy sweeping:  This is suggested by Michael, and I don't have a very
  detailed idea about how this would work, but if I remember him right, it
  is also a not too complicated change.

Now, having said that I don't see the performance aspect in the current
discussion, I have to agree about the point with freeing guile from the need
to code it's own gc.  Or, in other words, it would be nice if there was a gc
that was optimized to the way guile uses memory, that was developed and used
separately from guile, that incorporated good gc algorithms and still would be
pluggable into guile with minimum effort.


Here's a suggestion about it:
-----------------------------

Separating the gc bits from the type tags will allow us to fully separate the
gc subsystem from the rest of guile and put it into a separate gc library that
then may be used also from code other than guile.  The nice thing is, that
this library can be made almost completely independent of the type system that
is used.  This allows everybody to experiment with gc, without the need to
go into guile's internals.  A first draft for such a library already
exists, because I once realised how easily this could be created from
guile's current gc.  Currently Michael is looking over it to give some
comments, as the first version does not cover anything about generational
gc.

Best regards
Dirk


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]