This is the mail archive of the guile@cygnus.com mailing list for the Guile project.
re: mod_guile design problems

To: <guile@cygnus.com>
Subject: re: mod_guile design problems
From: "Will Hartung" <vft750@home.com>
Date: Mon, 9 Aug 1999 07:55:44 -0700



I got this from:

http://sourceware.cygnus.com/ml/guile/1999-06/msg00293.html

regarding making a mod_guile for Apache.

> I just finished reading "Writing Apache Modules with Perl and C",
> and got quite some ideas for mod_guile, but i have some basic problems:

I haven't read this book.

> - I want to make smob's for the request_rec, server_rec and
>   conn_rec structures. Those are actually automagically destroyed
>   by apache when it's done with it. I have no idea how to handle
>   this - mod_guile modules can of course store the smob's
>   wherever they like, and access them even after the real ones
>   have vanished. sadly, apache does *not* provide a clean way to
>   register "cleanup"-functions for these structures.

There are actually two issues here regarding cleanup. To be blunt, in the
current, most popular version of the Apache server, you don't have to worry
too much about cleanup, simply because when ALL of your requests are done,
the child exits. This is because Apache is a forked server model. The only
detail is the KEEP-ALIVE mode where your child keeps handling requests from
the same client. But, in general, if you want to be sloppy, you can pretty
much punt on some cleanup issues and let the OS clean it up for you, as
only the root parent server is a long-lived process.

Of course, you can throw all of this out for the NT port, as it's a threaded
server. In fact, it sounds to me like they're moving Apache to be a
completely threaded server someday.

In reality, I would imagine that the *_recs are allocated from the memory
pool of the request, and therefore implicitly destroyed when the request is
completed.
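
A minimal sketch of what that implies for the smobs, in Python rather than
Scheme or C. It assumes the pool can run a callback when it is destroyed; the
names Pool, register_cleanup and RequestHandle are hypothetical stand-ins, not
Apache's or Guile's API. The wrapper drops its reference when the pool goes
away, so a stored handle fails loudly instead of pointing at freed memory.

```python
class Pool:
    """Toy stand-in for a per-request memory pool (not Apache's real API)."""

    def __init__(self):
        self._cleanups = []

    def register_cleanup(self, func):
        # Remember a callback to run when the pool is destroyed.
        self._cleanups.append(func)

    def destroy(self):
        # Run cleanups in reverse registration order, then forget them.
        while self._cleanups:
            self._cleanups.pop()()


class RequestHandle:
    """Script-side wrapper (the 'smob') around a pool-owned record."""

    def __init__(self, pool, record):
        self._record = record
        # When the pool goes away, drop the reference instead of dangling.
        pool.register_cleanup(self._invalidate)

    def _invalidate(self):
        self._record = None

    def uri(self):
        if self._record is None:
            raise RuntimeError("request record no longer exists")
        return self._record["uri"]


pool = Pool()
handle = RequestHandle(pool, {"uri": "/index.html"})
print(handle.uri())      # valid while the request is alive
pool.destroy()           # request finished; the record is gone
try:
    handle.uri()
except RuntimeError as err:
    print("stale handle:", err)
```

The script can keep the handle wherever it likes; only the access check has
to know whether the underlying record is still there.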

> - Apache uses it's own type of FILE pointers, BUFF's.
>   There seems no way to wrap a BUFF into a guile port and lateron
>   retrieve the BUFF from the port.

You shouldn't have any problem wrapping a port around these BUFFs. The issue
here (to me) is that HTTP requires that the size of the content that is
being sent be known in advance, whether it's being sent in a single pulse
(the size of the whole body) or in "chunked" mode (the size of each chunk).
For files, this is easy, but for dynamic HTML sent in a single pulse, you'll
need to cache the entire thing anyway to know the size of the final result,
and then spit it out all at once. If it were me, I'd create an HTTP-PORT
like thing where I don't have to worry about such things, and I'd give it
the port options and properties that represent the various bits of an HTTP
header. And I'd make sure there was a way to create a sort of pass through
port so you don't have to cache such things as files (though those tend to
be sent as redirects to the server). That way, you just (display ...) to the
port, and all of the HTTP junk is done for you.
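
A rough sketch of such a port, in Python for illustration. The HttpPort class
and its attributes are hypothetical, and a BytesIO stands in for the BUFF. It
buffers everything written to it and only emits the header, including the
size, once the script is done.

```python
import io


class HttpPort:
    """Hypothetical buffering port: collects the body, then sends header + body."""

    def __init__(self, sink):
        self._sink = sink                  # underlying byte stream (BUFF stand-in)
        self._body = io.BytesIO()
        self.headers = {"Content-Type": "text/html"}

    def write(self, text):
        self._body.write(text.encode("utf-8"))

    def close(self):
        body = self._body.getvalue()
        # The size is only known now, after the script has finished writing.
        self.headers["Content-Length"] = str(len(body))
        head = "HTTP/1.0 200 OK\r\n"
        head += "".join(f"{k}: {v}\r\n" for k, v in self.headers.items())
        self._sink.write(head.encode("ascii") + b"\r\n" + body)


sink = io.BytesIO()
port = HttpPort(sink)
port.write("<html><body>")
port.write("hello")
port.write("</body></html>")
port.close()
print(sink.getvalue().decode("utf-8"))
```

A pass-through variant for files would skip the buffer and take the size from
the file itself before copying it to the sink.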

> - Memory pools. I haven't found any use for memory pools in Guile
>   since it's already Garbage Collected. I just need some logic to
>   find an appropriate pool for those functions that require one.
>   (e.g. ap_escape_shell_cmd - i'll copy the result so it can be
>   stored in guile independently of the pool)

Yeah, you can pretty much punt on the memory pools; they're just there as a
benefit to the module developer. You don't NEED to use them. Their primary
use is for the C coders to use them rather than malloc(3) so that they can
worry less about memory leaks. Of course, the C++ guys are pretty much left
out in the cold, but that's pretty much a default state for them anyway.

> More or less unrelated to this, i'm also searching for the new
> Environment specification (i've lost my local copy). I wonder
> wether it'll be possible to mark an environment as "read-only",
> e.g. that a set! will not affect that environment or any of it's parents.

The rest of the things you mention are pretty much trivial compared to the
environment issue. The previous bits are simply glue around the Apache API.

Here are some things that you may want to consider: (note, I made a related
posting about this in comp.lang.scheme, and someone suggested I come here).

First, as mentioned, Apache runs in two different modes. It's forked on UNIX
and threaded on NT. And it seems that for performance reasons, they want to
take Apache towards the threaded model.

The forked model helps keep Apache clean because it doesn't have to worry
about things like memory leaks and such.

In this specific case, you don't really have to worry about a "read only"
environment, because it essentially is a read only environment. None of
your changes will get back to the core scheme image in the parent process.
The flip side of this is that you cannot share any information in Apache
between children, nor can you (easily -- in a documented, "supported"
manner) make changes to the parent. You can step outside of Apache and use
shared memory to facilitate this, but that's left as an exercise to the
reader.
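
The isolation described above can be seen in a few lines of Python (a sketch
for Unix only, since it relies on fork; the environment dictionary is a
hypothetical stand-in for the core scheme image). The child changes its copy
and reports what it sees over a pipe, and the parent's copy stays as it was.

```python
import os

environment = {"greeting": "hello"}   # stands in for the core scheme image


def run_request_in_child():
    """Fork, mutate the environment in the child, report what each side sees."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this change lives only in the child's copy of memory.
        os.close(read_fd)
        environment["greeting"] = "changed by child"
        os.write(write_fd, environment["greeting"].encode("utf-8"))
        os.close(write_fd)
        os._exit(0)
    # Parent: read the child's view, then wait for it to exit.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        child_view = reader.read().decode("utf-8")
    os.waitpid(pid, 0)
    return child_view, environment["greeting"]


child_view, parent_view = run_request_in_child()
print("child saw:  ", child_view)
print("parent sees:", parent_view)
```

The pipe here is the kind of step outside the process that anything shared
between children, or sent back to the parent, would need.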

Another issue to remember about the forked model is that if your scheme
image loads up anything by default (say, a bunch of stuff from SLIB) when
the server starts, then you will not be able to change that core server
unless you restart the server from scratch, or you use some Apache external
technique to communicate to the parent Apache process. Personally, I'd like
to be able to change the server on the fly.

Of course, in the threaded model it is completely flipped around. Everything
is shared. So you need to be concerned about a "read only" environment, and
you need to be concerned with resource sharing (at least managing it). In
the threaded model it sounds kind of nasty because you'll need at some level
a mutex to protect changes to the environment (public and private). And that
sounds like it has a potential to be a wee bit slow to me. Perhaps you only
need the mutex in the "get memory" part of the scheme, and, of course, the
GC.
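
As a sketch of the coarse version of that idea (Python threads standing in
for a threaded server; the names environment, environment_lock and serve are
hypothetical), every change to the shared environment takes one global mutex:

```python
import threading

environment = {"hits": 0}          # shared by every request thread
environment_lock = threading.Lock()


def handle_request(times):
    for _ in range(times):
        # Serialize each read-modify-write of the shared environment.
        with environment_lock:
            environment["hits"] = environment["hits"] + 1


def serve(threads=4, times=1000):
    workers = [threading.Thread(target=handle_request, args=(times,))
               for _ in range(threads)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return environment["hits"]


print(serve())
```

Every update pays for the lock, which is the slowness worried about above; a
finer-grained design would lock only the allocator and the collector.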

One possible way of achieving a private environment is to surround your
scheme script with (let ((global1 global1) ...) ...script...). Use the let
to shadow any globals, and then you only have to worry about routines that
affect the global environment as a side effect.
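
A loose Python analogue of the shadowing trick, using collections.ChainMap as
a stand-in for the let (the names global_env and run_script are hypothetical).
Reads fall through to the globals, while writes land in a private layer.

```python
from collections import ChainMap

global_env = {"counter": 0, "title": "default"}


def run_script(script):
    """Run `script` against a private layer on top of the globals."""
    private_env = ChainMap({}, global_env)   # writes land in the empty front map
    script(private_env)
    return private_env


def script(env):
    env["counter"] = env["counter"] + 1      # reads the global, writes privately
    env["title"] = "request-local"


private = run_script(script)
print(private["counter"], global_env["counter"])
```

As with the let, this only protects the bindings. A script that mutates an
object reachable from a global (appending to a shared list, say) still changes
it for everyone.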

If you don't care about having different Apache children being able to
communicate with each other, or about being able to persist stuff from
request to request in memory rather than to disk, then adding guile to a
forked Apache server shouldn't be too big of a deal. Essentially, I would
come up with a scheme layer to the CGI, have the scheme request handle the
POST and GET stuff, creating an assoc list or whatever, then make the
HTTP-PORT the *current-output-port* and jump to your script from there. That
way you can write your scheme like a CGI program, and test your scheme like
a CGI program. Once you can duplicate that functionality, then I'd tackle
the embedded scheme in HTML project.
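
The shape of that layer, sketched in Python (run_script and hello_script are
hypothetical names, and a StringIO stands in for the HTTP-PORT): parse the
request into a list of pairs, make the port the current output, and call the
script, which just prints.

```python
import contextlib
import io
from urllib.parse import parse_qsl


def run_script(script, query_string):
    """Parse the query, capture everything the script prints, return it."""
    params = parse_qsl(query_string)        # list of (name, value) pairs
    port = io.StringIO()                    # stand-in for the HTTP-PORT
    with contextlib.redirect_stdout(port):
        script(params)
    return port.getvalue()


def hello_script(params):
    # Written like a CGI program: it just prints to standard output.
    name = dict(params).get("name", "world")
    print(f"<p>Hello, {name}!</p>")


print(run_script(hello_script, "name=guile&lang=scheme"), end="")
```

Because the script only sees its parameters and standard output, the same
script can be run from a shell for testing.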

It's not that difficult. It's fairly easy to have Apache call your code, and
once it does, it's done and doesn't really care. You can do a lot with
minimal interface to Apache.

I'm interested in this, so send me e-mail, or bring it back to the list, or
drag the generic high-level issues over to c.l.scheme.

Regards,

Will Hartung
(vft750@home.com)
Follow-Ups:
- Re: mod_guile design problems
  - From: forcer <forcer@mindless.com>