This is the mail archive of the guile@sourceware.cygnus.com mailing list for the Guile project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: GC and threads (was Re: generational GC)



> (In case it's not obvious what I mean, here's what I have in mind for
> the case with 1 program thread + 1 GC thread.  Whenever the program
> creates a new cell, it marks it with the current "GC cycle counter" C.
> When (roughly) the program thread thinks that GC is desirable, it
> stores the cycle counter and the top and bottom of the thread stack in
> globals (Cstored = C etc.) that the GC thread can access, increments
> the cycle counter, and signals the GC thread.  The GC thread marks as
> usual, using the program thread's stored stack address range as well
> as its own stack, but ignoring cells with cycle counter greater than
> the counter stored in the program thread's global cycle counter
> variable.  It then sweeps, again ignoring (i.e. not collecting) cells
> with a recent cycle counter (C > Cstored).
> 
> The key point here is that it is impossible for a recent cell to
> reference a non-recent cell unless that non-recent cell was also
> accessible from the thread's stack and global starting points at the
> time that the cycle counter was incremented.

I think what you're describing here is equivalent to allocating new
cells with the mark bit set.  The GC won't sweep up such cells until
end of the next GC cycle.

Here's the way the literature generally discusses this kind of thing.

Most of the GC's we're considering work by starting from a set of root
pointers, and traversing the heap, marking every reachable object.
(Just so you don't assume this is universal, note that
reference-counting collectors don't work that way.)

Every traversal algorithm divides nodes into three classes:
1) Nodes we've visited and dealt with.  Call these "black" nodes.
2) Nodes we've not visited, or seen references to.  Call these "white" 
   nodes.  At the end of the traversal, any nodes that are still white
   are garbage.
3) Nodes we've found pointers to, but haven't yet visited.  Call these
   "grey" nodes.  For example, if you're using a mark stack, the nodes
   on the stack are grey.

At the beginning of the cycle, all nodes are white.  We start coloring
nodes grey as we find references to them.  Each time we visit a node,
we turn it from grey to black, and any white nodes it refers to become
grey.

If you watch the traversal in progress, you'll see a wavefront of grey
cells spreading out from the roots, leaving black cells behind it, and
with white cells ahead of it.  By the end of the cycle, any nodes the
grey wavefront never reached are still white, and will be freed.

Some GC's follow the "tricolor invariant", which states that no black
cell may ever refer directly to a white cell.  Black cells may point
only to grey cells, and other black cells.  Since the grey cells are
yet to be visited, if you preserve this invariant, you're guaranteed
to visit every reachable white cell.

In this terminology, your strategy would be described as "allocate
black."  That is, new nodes are assumed to be live, and are left alone
until the next GC cycle begins, at which point everything is colored
white.

The problem with allocating black is that the new cells can't be
collected until the end of the *next* GC cycle, even though they're
the most likely to die quickly.  So it's more efficient to use some
kind of heuristic to color them grey or white.

For example, suppose we mark the global roots first, and leave
scanning the threads' stacks at the very end.  When a thread allocates
a new cell, we can color the new cell white if the thread's stack
hasn't been scanned yet, or black if it has.  That way, if we can
reach a good portion of the heap through the global roots, we can do
most of our allocation white, and those cells can be reclaimed at the
end of this GC cycle.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]