This is the mail archive of the guile@cygnus.com mailing list for the guile project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
The other explanations are perfectly good; but here's mine: Any garbage collector needs to know which objects a given object refers to. If O is an object, let REFS(O) be the set of objects it refers to. So, if O is a pair, then REFS(O) contains exactly (car O) and (cdr O). If O is a closure, then REFS(O) is the variables it has closed over. Clearly, in order to compute REFS(O), you generally need to know the type of O, and its layout. However, sometimes it's very difficult to know the layout of an object. For example, suppose O is a continuation or a thread; in Guile, these objects actually include C stacks, laid out by the C compiler. The C compiler doesn't even let you walk stack frames, let alone tell you which slots are references to Scheme objects. A conservative garbage collector is one which, for at least some datatypes, uses a conservative estimate of REFS. For example, Guile does this for continuations and threads. It treats the stack as a dumb array of bytes, and then scans it for any word which looks like a reference to a Scheme object. It may mistake integers or frame pointers or random binary data for a Scheme object, but it will certainly catch all the valid references. For correctness, this is all the GC needs. Guile is designed in certain ways to make this process work better. First, it only uses conservative scanning on objects where it can't do better. All ordinary Scheme objects are scanned precisely. Secondly, the representation of Scheme objects was chosen to make it efficient to recognize pointers to actual objects. The benefit is convenience; you can write ordinary C code which manipulates Scheme objects, and the collector will just find all your references, and keep the objects alive for you. There are GC'd systems which require you to explicitly record your heap references; this is, effectively, describing the layout of your stack at run time. I've worked extensively on one such system, Emacs. It was an utter pain to maintain; even Richard Stallman couldn't do it, and we had GC-related bugs in every release. Given the reality of human maintainers, conservative GC is more reliable than Emacs's scheme. There is a price, however. Because you're not *sure* whether this word you found is really a pointer to a Scheme object, you can't relocate that Scheme object, because you can't fix up that pointer. It might actually be part of a string, or an integer, or encrypted data. There are GC's out there that pin individual objects and relocate all the rest (e.g., Joel Bartlett's "Mostly Copying" collector). Guile's GC doesn't do that. There is also a risk. Your compiler would be perfectly within its rights to save all pointers on the stack exclusive-or'd with 0xdeadbeef, and only xor them into their natural state after bringing them into registers. More realistically, your compiler might store pointers offset by some constant; there are plausible cases where this can generate pointers outside the object entirely. These `pointer hiding' optimizations do not seem to be a serious problem in practice, although they do mean that conservative GC is `incorrect', in some sense.