This is the mail archive of the guile@cygnus.com mailing list for the guile project.



Re: tools to make non-conservative GC feasible.


Chris.Bitmead@misys.com.au writes:

> >Not really. If you're interfacing with arbitrary c code, the c code is
> >already doing its own 'memory management'. If you need to pass it
> >objects from scheme, then you have to convert it to something the
> >program recognizes and handle that, but the actual operation of the c
> >code is something you don't have to care about.
> 
> Well, I could use the same argument in favour of precise gc. If the C
> code is doing its own memory management, who cares about the tiny bit
> of work needed in the scheme wrappers to allow precise gc.
> 
> >The question is, how often does this actually happen in a real
> >program? In order for a dead object to never be collected, there has
> >to be a place on the stack that never changes, or a value that is put
> >on the stack quite frequently, that corresponds to an object that
> >guile knows about. It can and does happen, but it's very unlikely that
> >it will happen with a significant number of objects.
> 
> I agree that the odds of it happening are extremely small. I can just
> picture this program running one day and it just happens on this
> particular day to be doing calculations that just happen to be right
> smack in the address space of the process. Agreed that I will probably
> never live to see this day, but it just bothers me. Would you trust
> your medical equipment to it? Would you trust your nuclear arsenal to it?

I wouldn't trust it in cases where someone could end up dead if
something goes wrong (and I don't trust nuclear arsenals to anything),
but I do trust it for guile, which isn't really targeted towards life
and death situations.
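
To make the failure mode concrete, here is a minimal sketch of conservative
marking. It is not guile's collector: toy_obj, toy_heap, and
toy_conservative_mark are hypothetical names, and the "stack" is just an
array of words. It assumes a flat address space where a pointer converts to
a uintptr_t and back. The point is that an integer that happens to equal a
cell's address keeps that cell alive exactly as a real pointer would.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_HEAP_CELLS 8

/* Hypothetical toy object; a real collector's cells carry much more. */
typedef struct { int marked; } toy_obj;

static toy_obj toy_heap[TOY_HEAP_CELLS];

/* Treat every word as a possible pointer: any word that equals the
 * address of a heap cell marks that cell. Returns the number of cells
 * marked. */
static size_t toy_conservative_mark(const uintptr_t *words, size_t n)
{
    uintptr_t lo = (uintptr_t)&toy_heap[0];
    uintptr_t hi = lo + TOY_HEAP_CELLS * sizeof(toy_obj);
    size_t marked = 0;

    for (size_t i = 0; i < TOY_HEAP_CELLS; i++)
        toy_heap[i].marked = 0;

    for (size_t i = 0; i < n; i++) {
        uintptr_t w = words[i];
        if (w >= lo && w < hi && (w - lo) % sizeof(toy_obj) == 0) {
            toy_obj *o = &toy_heap[(w - lo) / sizeof(toy_obj)];
            if (!o->marked) {
                o->marked = 1;
                marked++;
            }
        }
    }
    return marked;
}

/* Returns 1 if a cell that no real pointer refers to was retained. */
static int toy_false_retention_demo(void)
{
    uintptr_t fake_stack[3];

    fake_stack[0] = (uintptr_t)&toy_heap[2];  /* a genuine reference */
    fake_stack[1] = 0;                        /* not in the heap range */
    /* An "integer result" that happens to land on cell 5's address. */
    fake_stack[2] = (uintptr_t)&toy_heap[0] + 5 * sizeof(toy_obj);

    size_t marked = toy_conservative_mark(fake_stack, 3);
    assert(marked <= TOY_HEAP_CELLS);
    return marked == 2 && toy_heap[5].marked == 1;
}
```

Cell 5 stays marked only for as long as that word sits in the scanned
range, which is why a value has to stay on the stack (or keep reappearing
there) for the object to never be collected.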
 
> >In the file descriptor case, unless I'm missing something, files are
> >expected to be explicitly closed by scheme, so you aren't really
> >wasting fds if it stays live, just the space held by the fd structure
> >(this is unfortunate, because it lessens the reliability of guardians,
> >although I really doubt you'd see a situation where you ran out of
> >fd's because they were all being held up by the stack... you'd need
> >some really bad luck).
> 
> Agreed, very very bad luck. But imagine if you allocated 100 fds for some
> daemon. Then for some reason it did some recursive function that calculated
> a sequence in that very range. I don't think I'll ever see the day,
> but there may be critical applications that you don't want to take the
> chance.

Also, the fd's aren't necessarily going to be in a contiguous space of
memory, which could also help.

> >Even the performance of some widely used algorithms (quicksort, for
> >one, which probably isn't the best example since there are a lot of
> >workarounds to make the bad case almost completely unlikely, but it's
> >the first thing that pops to mind) have such a bad worst case that
> >they could end up executing forever. Generally, it doesn't happen, so
> >we accept the unlikely possibility and move on. This is a case where I
> >think you have to consistently see incorrect behavior in a real
> >program before you chuck a very useful feature that, by all accounts,
> >works quite well.
> 
> Well, at least quicksort will still work correctly, so I think that's
> really a bit different.

Well, it depends on how long you plan on living (a few billion years
should suffice ;).
 
> You are right of course. It just goes against my grain to ignore
> borderline cases.
> 

Hey, it's not like I'm happy & content with the fact that it could
break (and I'm certainly not 'right', because this really deals with
what is considered an acceptable tradeoff; for me, it is; for you, you
have some misgivings). I'm not ignoring the possibility that things
could go wrong, but given both the chances of breakage, and the ease
that the conservative collector affords us, I think that the tradeoff
is worth it (of course, part of this is that the bits of cs that
interest me the most are things like ai, where things are often
hanging by a thread, and heuristic isn't a dirty word ;). In general,
very little memory gets held around longer than it should, and
provided that the value isn't constantly on the stack, it probably
won't hang around any longer than it would if it was in an old
generation of the gc.

I mentioned another reason I prefer the conservative collector to
explicit marking in an email to Jim Blandy yesterday (this deals with
his suggestion that, instead of providing things like
scm_malloc_protected, etc., we just do the work, then call
scm_modified on the cell):

--begin snippage--
Another, probably nitpicky, reason I'm not fussy about that route is
that it adds something that doesn't really fit with what the
programmer expects when they're writing c. This is mostly my own
experience, but I've found that it's a lot easier to write code for a
system or library that, while it might use different names, is
providing exactly the same things I expect to see. Personally, I'd
rather do something like:

SCM_SET(SCM_CDR(x)->some_scheme_val, foo);

than

SCM_CDR(x)->some_scheme_val = foo;
scm_modified(x);

The first way, once you know you have to use SCM_SET, you use SCM_SET
and mumble about that dumbass that required you to use SCM_SET. The
second way, you forget scm_modified in a few places (mumbling about
the dumbass that required you to use scm_modified, of course... you've
gotta mumble about something ;), and end up spending a much more
annoying time trying to figure out how foo became garbage. For some
reason, it's things like this that seem to turn into the annoying
bugs; malloc/free is a prime example (though more involved... it would
be like adding uncertainty into where to call scm_modified), else we
wouldn't even be thinking about gc. 

--end snippage--
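
As a rough illustration of the difference between the two styles in the
snippet, here is a minimal sketch. Everything in it is hypothetical:
toy_cell, toy_modified, and TOY_SET stand in for the idea and are not
guile's SCM_SET or scm_modified, and the "barrier" is only a dirty flag.
It shows that a combined store-and-notify macro cannot skip the
notification, while the two-step form silently can.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy cell; not guile's real cell representation. */
typedef struct toy_cell {
    struct toy_cell *car;
    struct toy_cell *cdr;
    int dirty;  /* set when the cell was written since the last gc */
} toy_cell;

/* Hypothetical notification: tell the collector the cell changed. */
static void toy_modified(toy_cell *c)
{
    c->dirty = 1;
}

/* Store and notify in one step, so the notification cannot be
 * forgotten. `field` is a member name (car or cdr). */
#define TOY_SET(cell, field, value)   \
    do {                              \
        toy_cell *toy_c_ = (cell);    \
        toy_c_->field = (value);      \
        toy_modified(toy_c_);         \
    } while (0)

/* Returns 1 if the macro form recorded the write and the two-step
 * form with the call left out did not. */
static int toy_barrier_demo(void)
{
    toy_cell foo = { NULL, NULL, 0 };
    toy_cell a   = { NULL, NULL, 0 };
    toy_cell b   = { NULL, NULL, 0 };

    TOY_SET(&a, cdr, &foo);   /* store + notify together */

    b.cdr = &foo;             /* two-step form ...                 */
    /* toy_modified(&b); */   /* ... with the second step forgotten */

    assert(a.cdr == &foo && b.cdr == &foo);
    return a.dirty == 1 && b.dirty == 0;
}
```

Both stores succeed and the program looks fine; only the collector's view
of b is stale, which is the kind of bug that surfaces much later as foo
turning into garbage.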

Explicitly notifying the gc of stack values falls into pretty much the
same case (although I'd like to think a bit further on Perry Metzger's
idea of modifying bsd's lint to help out... I'm not fussy about the
necessity of a lint; even less so if the modification work falls to
whoever's doing the gc >;). Even so, I don't think the possibility of
explicit marking should be completely ignored, but right now I'm more
interested in (and inclined to) getting a conservative gengc up and
running.

-- 
Greg