From: William A. Rowe Jr Date: Sat, 22 Sep 2001 15:45:22 +0000 (+0000) Subject: Move API.html over to manual/developer, begin some cleanup. X-Git-Tag: 2.0.26~203 X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=f7c23a7e8b79a9e74f16d0b361babad9b0def53c;p=apache Move API.html over to manual/developer, begin some cleanup. Could a DoxyGen'er please update the guidlines in documenting.html? git-svn-id: https://svn.apache.org/repos/asf/httpd/httpd/trunk@91108 13f79535-47bb-0310-9956-ffa450edef68 --- diff --git a/docs/manual/developer/documenting b/docs/manual/developer/documenting deleted file mode 100644 index 0994b10d78..0000000000 --- a/docs/manual/developer/documenting +++ /dev/null @@ -1,43 +0,0 @@ -Apache 2.0 is using ScanDoc to document the API's and global variables in -the code. This will explain the basics of how to document using Scandoc. - -To start a scandoc block, use /** -To end a scandoc block, use */ - -In the middle of the block, there are multiple tags we can use: - - Description of this functions purpose - @param parameter_name description - @tip Any information the programmer should know - @deffunc function prototype. - -The deffunc is not always necessary. ScanDoc does not have a full parser in -it, so any prototype that use a macro in the return type declaration is too -complex for scandoc. Those functions require a deffunc. - -An example: - -/** - * return the final element of the pathname - * @param pathname The path to get the final element of - * @return the final element of the path - * @tip Examples: - *
- *                 "/foo/bar/gum"   -> "gum"
- *                 "/foo/bar/gum/"  -> ""
- *                 "gum"            -> "gum"
- *                 "wi\\n32\\stuff" -> "stuff"
- * 
- * @deffunc const char * ap_filename_of_pathname(const char *pathname) - */ - -At the top of the header file, we always include - -/** - * @package Name of library header - */ - -ScanDoc uses a new html file for each package. The html files are named: - -Name of library header.html, so try to be concise with your names - diff --git a/docs/manual/developer/documenting.html b/docs/manual/developer/documenting.html new file mode 100644 index 0000000000..8f8163f426 --- /dev/null +++ b/docs/manual/developer/documenting.html @@ -0,0 +1,64 @@ + + + +Documenting Apache 2.0 + + + + + + +

Documenting Apache 2.0

+ +

Apache 2.0 uses DoxyGen to document the APIs and global variables in + the code. This explains the basics of how to document using DoxyGen. + +

To start a documentation block, use /**
+ To end a documentation block, use */

+ +

In the middle of the block, there are multiple tags we can use:

+
+    Description of this function's purpose
+    @param parameter_name description
+    @tip Any information the programmer should know
+    @deffunc function prototype
+

+ +

The @deffunc tag is not always necessary. DoxyGen does not have a full parser + in it, so any prototype that uses a macro in the return type declaration + is too complex for DoxyGen. Those functions require a @deffunc.

+ +
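For instance (an illustrative sketch, not from the committed file; AP_DECLARE is the declaration macro used in the 2.0 headers, and the function shown here is made up), a prototype wrapped in such a macro carries the @deffunc tag:

/**
 * Send a canned greeting to the client
 * @param r The current request
 * @deffunc int ap_send_greeting(request_rec *r)
 */
AP_DECLARE(int) ap_send_greeting(request_rec *r);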

An example (using &gt; rather than >):

+
+/**
+ * return the final element of the pathname
+ * @param pathname The path to get the final element of
+ * @return the final element of the path
+ * @tip Examples:
+ * <pre>
+ *                 "/foo/bar/gum"   -&gt; "gum"
+ *                 "/foo/bar/gum/"  -&gt; ""
+ *                 "gum"            -&gt; "gum"
+ *                 "wi\\n32\\stuff" -&gt; "stuff"
+ * </pre>
+ * @deffunc const char * ap_filename_of_pathname(const char *pathname)
+ */
+
+ +

At the top of the header file, always include:

+
+/**
+ * @package Name of library header
+ */
+
+ +

DoxyGen generates a new HTML file for each package. The HTML files are named + {Name_of_library_header}.html, so try to be concise with your names.

+ + + + diff --git a/docs/manual/developer/index.html b/docs/manual/developer/index.html index 93eb0d9d2f..c20d989ccd 100644 --- a/docs/manual/developer/index.html +++ b/docs/manual/developer/index.html @@ -16,10 +16,22 @@

Developer Documentation for Apache-2.0

-

Apache Hook Functions

-

Converting Apache 1.3 Modules to Apache 2.0

-

Debugging Memory Allocation in APR

-

Apache 1.3 API Notes

+

Many of the documents on these Developer pages are lifted from Apache 1.3's + documentation. While they are all being updated to Apache 2.0, they are + at different stages of progress. Please be patient, and report any + discrepancies or errors on the developer/ pages directly to the + dev@httpd.apache.org mailing list.

+ +

Topics

+
+
Apache 2.0 API Notes
+
Overview of Apache's Application Programming Interface.
+
Apache Hook Functions
+
Porting Apache 1.3 Modules
+
Debugging Memory Allocation
+
Documenting Apache 2.0
+
+ diff --git a/docs/manual/misc/API.html b/docs/manual/misc/API.html deleted file mode 100644 index 496be760c9..0000000000 --- a/docs/manual/misc/API.html +++ /dev/null @@ -1,1161 +0,0 @@ - - -Apache API notes - - - - - -
Warning: -This document has not been updated to take into account changes -made in the 2.0 version of the Apache HTTP Server. Some of the -information may still be relevant, but please use it -with care. -
- -

Apache API notes

- -These are some notes on the Apache API and the data structures you -have to deal with, etc. They are not yet nearly complete, but -hopefully, they will help you get your bearings. Keep in mind that -the API is still subject to change as we gain experience with it. -(See the TODO file for what might be coming). However, -it will be easy to adapt modules to any changes that are made. -(We have more modules to adapt than you do). -

- -A few notes on general pedagogical style here. In the interest of -conciseness, all structure declarations here are incomplete --- the -real ones have more slots that I'm not telling you about. For the -most part, these are reserved to one component of the server core or -another, and should be altered by modules with caution. However, in -some cases, they really are things I just haven't gotten around to -yet. Welcome to the bleeding edge.

- -Finally, here's an outline, to give you some bare idea of what's -coming up, and in what order: - -

- -

Basic concepts.

- -We begin with an overview of the basic concepts behind the -API, and how they are manifested in the code. - -

Handlers, Modules, and Requests

- -Apache breaks down request handling into a series of steps, more or -less the same way the Netscape server API does (although this API has -a few more stages than NetSite does, as hooks for stuff I thought -might be useful in the future). These are: - - - -These phases are handled by looking at each of a succession of -modules, looking to see if each of them has a handler for the -phase, and attempting to invoke it if so. The handler can typically do -one of three things: - - - -Most phases are terminated by the first module that handles them; -however, for logging, `fixups', and non-access authentication -checking, all handlers always run (barring an error). Also, the -response phase is unique in that modules may declare multiple handlers -for it, via a dispatch table keyed on the MIME type of the requested -object. Modules may declare a response-phase handler which can handle -any request, by giving it the key */* (i.e., a -wildcard MIME type specification). However, wildcard handlers are -only invoked if the server has already tried and failed to find a more -specific response handler for the MIME type of the requested object -(either none existed, or they all declined).

- -The handlers themselves are functions of one argument (a -request_rec structure. vide infra), which returns an -integer, as above.
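As a rough sketch (not part of the original notes; the handler name is a placeholder), a phase handler has this general shape:

#include "httpd.h"    /* request_rec, OK, DECLINED, M_GET */

static int example_handler (request_rec *r)
{
    if (r->method_number != M_GET)
        return DECLINED;            /* let some other module handle it */

    /* ... do whatever work this phase requires ... */

    return OK;                      /* we handled it */
}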

- -

A brief tour of a module

- -At this point, we need to explain the structure of a module. Our -candidate will be one of the messier ones, the CGI module --- this -handles both CGI scripts and the ScriptAlias config file -command. It's actually a great deal more complicated than most -modules, but if we're going to have only one example, it might as well -be the one with its fingers in every place.

- -Let's begin with handlers. In order to handle the CGI scripts, the -module declares a response handler for them. Because of -ScriptAlias, it also has handlers for the name -translation phase (to recognize ScriptAliased URIs), the -type-checking phase (any ScriptAliased request is typed -as a CGI script).

- -The module needs to maintain some per (virtual) -server information, namely, the ScriptAliases in effect; -the module structure therefore contains pointers to a function which -builds these structures, and to another which combines two of them (in -case the main server and a virtual server both have -ScriptAliases declared).

- -Finally, this module contains code to handle the -ScriptAlias command itself. This particular module only -declares one command, but there could be more, so modules have -command tables which declare their commands, and describe -where they are permitted, and how they are to be invoked.

- -A final note on the declared types of the arguments of some of these -commands: a pool is a pointer to a resource pool -structure; these are used by the server to keep track of the memory -which has been allocated, files opened, etc., either to service a -particular request, or to handle the process of configuring itself. -That way, when the request is over (or, for the configuration pool, -when the server is restarting), the memory can be freed, and the files -closed, en masse, without anyone having to write explicit code to -track them all down and dispose of them. Also, a -cmd_parms structure contains various information about -the config file being read, and other status information, which is -sometimes of use to the function which processes a config-file command -(such as ScriptAlias). - -With no further ado, the module itself: - -

-/* Declarations of handlers. */
-
-int translate_scriptalias (request_rec *);
-int type_scriptalias (request_rec *);
-int cgi_handler (request_rec *);
-
-/* Subsidiary dispatch table for response-phase handlers, by MIME type */
-
-handler_rec cgi_handlers[] = {
-{ "application/x-httpd-cgi", cgi_handler },
-{ NULL }
-};
-
-/* Declarations of routines to manipulate the module's configuration
- * info.  Note that these are returned, and passed in, as void *'s;
- * the server core keeps track of them, but it doesn't, and can't,
- * know their internal structure.
- */
-
-void *make_cgi_server_config (pool *);
-void *merge_cgi_server_config (pool *, void *, void *);
-
-/* Declarations of routines to handle config-file commands */
-
-extern char *script_alias(cmd_parms *, void *per_dir_config, char *fake,
-                          char *real);
-
-command_rec cgi_cmds[] = {
-{ "ScriptAlias", script_alias, NULL, RSRC_CONF, TAKE2,
-    "a fakename and a realname"},
-{ NULL }
-};
-
-module cgi_module = {
-   STANDARD_MODULE_STUFF,
-   NULL,                     /* initializer */
-   NULL,                     /* dir config creator */
-   NULL,                     /* dir merger --- default is to override */
-   make_cgi_server_config,   /* server config */
-   merge_cgi_server_config,  /* merge server config */
-   cgi_cmds,                 /* command table */
-   cgi_handlers,             /* handlers */
-   translate_scriptalias,    /* filename translation */
-   NULL,                     /* check_user_id */
-   NULL,                     /* check auth */
-   NULL,                     /* check access */
-   type_scriptalias,         /* type_checker */
-   NULL,                     /* fixups */
-   NULL,                     /* logger */
-   NULL                      /* header parser */
-};
-
- -

How handlers work

- -The sole argument to handlers is a request_rec structure. -This structure describes a particular request which has been made to -the server, on behalf of a client. In most cases, each connection to -the client generates only one request_rec structure.

- -

A brief tour of the request_rec

- -The request_rec contains pointers to a resource pool -which will be cleared when the server is finished handling the -request; to structures containing per-server and per-connection -information, and most importantly, information on the request itself.

- -The most important such information is a small set of character -strings describing attributes of the object being requested, including -its URI, filename, content-type and content-encoding (these being filled -in by the translation and type-check handlers which handle the -request, respectively).

- -Other commonly used data items are tables giving the MIME headers on -the client's original request, MIME headers to be sent back with the -response (which modules can add to at will), and environment variables -for any subprocesses which are spawned off in the course of servicing -the request. These tables are manipulated using the -ap_table_get and ap_table_set routines.

-

- Note that the Content-type header value cannot be - set by module content-handlers using the ap_table_*() - routines. Rather, it is set by pointing the content_type - field in the request_rec structure to an appropriate - string. E.g., -
-  r->content_type = "text/html";
- 
-
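As a small illustrative sketch (the handler and header names are invented), a handler might read one of these tables, add to another, and set content_type directly:

static int tag_response (request_rec *r)
{
    const char *agent = ap_table_get (r->headers_in, "User-Agent");

    if (agent != NULL)
        ap_table_set (r->headers_out, "X-Agent-Seen", agent);

    r->content_type = "text/html";   /* set directly, not via the tables */
    return OK;
}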
-Finally, there are pointers to two data structures which, in turn, -point to per-module configuration structures. Specifically, these -hold pointers to the data structures which the module has built to -describe the way it has been configured to operate in a given -directory (via .htaccess files or -<Directory> sections), for private data it has -built in the course of servicing the request (so modules' handlers for -one phase can pass `notes' to their handlers for other phases). There -is another such configuration vector in the server_rec -data structure pointed to by the request_rec, which -contains per (virtual) server configuration data.

- -Here is an abridged declaration, giving the fields most commonly used:

- -

-struct request_rec {
-
-  pool *pool;
-  conn_rec *connection;
-  server_rec *server;
-
-  /* What object is being requested */
-
-  char *uri;
-  char *filename;
-  char *path_info;
-  char *args;           /* QUERY_ARGS, if any */
-  struct stat finfo;    /* Set by server core;
-                         * st_mode set to zero if no such file */
-
-  char *content_type;
-  char *content_encoding;
-
-  /* MIME header environments, in and out.  Also, an array containing
-   * environment variables to be passed to subprocesses, so people can
-   * write modules to add to that environment.
-   *
-   * The difference between headers_out and err_headers_out is that
-   * the latter are printed even on error, and persist across internal
-   * redirects (so the headers printed for ErrorDocument handlers will
-   * have them).
-   */
-
-  table *headers_in;
-  table *headers_out;
-  table *err_headers_out;
-  table *subprocess_env;
-
-  /* Info about the request itself... */
-
-  int header_only;     /* HEAD request, as opposed to GET */
-  char *protocol;      /* Protocol, as given to us, or HTTP/0.9 */
-  char *method;        /* GET, HEAD, POST, etc. */
-  int method_number;   /* M_GET, M_POST, etc. */
-
-  /* Info for logging */
-
-  char *the_request;
-  int bytes_sent;
-
-  /* A flag which modules can set, to indicate that the data being
-   * returned is volatile, and clients should be told not to cache it.
-   */
-
-  int no_cache;
-
-  /* Various other config info which may change with .htaccess files
-   * These are config vectors, with one void* pointer for each module
-   * (the thing pointed to being the module's business).
-   */
-
-  void *per_dir_config;   /* Options set in config files, etc. */
-  void *request_config;   /* Notes on *this* request */
-
-};
-
-
- -

Where request_rec structures come from

- -Most request_rec structures are built by reading an HTTP -request from a client, and filling in the fields. However, there are -a few exceptions: - - - -

Handling requests, declining, and returning error - codes

- -As discussed above, each handler, when invoked to handle a particular -request_rec, has to return an int to -indicate what happened. That can either be - - - -Note that if the error code returned is REDIRECT, then -the module should put a Location in the request's -headers_out, to indicate where the client should be -redirected to.
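For illustration (a sketch; the target URL is only a placeholder), a handler that redirects would do something like this:

static int send_elsewhere (request_rec *r)
{
    /* the Location header tells the client where to go */
    ap_table_set (r->headers_out, "Location", "http://www.example.com/elsewhere");
    return REDIRECT;
}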

- -

Special considerations for response - handlers

- -Handlers for most phases do their work by simply setting a few fields -in the request_rec structure (or, in the case of access -checkers, simply by returning the correct error code). However, -response handlers have to actually send a request back to the client.

- -They should begin by sending an HTTP response header, using the -function ap_send_http_header. (You don't have to do -anything special to skip sending the header for HTTP/0.9 requests; the -function figures out on its own that it shouldn't do anything). If -the request is marked header_only, that's all they should -do; they should return after that, without attempting any further -output.

- -Otherwise, they should produce a request body which responds to the -client as appropriate. The primitives for this are ap_rputc -and ap_rprintf, for internally generated output, and -ap_send_fd, to copy the contents of some FILE * -straight to the client.
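A minimal response handler for generated output might therefore look like the following sketch (not from the original notes; the names are placeholders):

static int hello_handler (request_rec *r)
{
    r->content_type = "text/html";
    ap_send_http_header (r);

    if (r->header_only)              /* HEAD request: send no body */
        return OK;

    ap_rprintf (r, "<html><body><h1>Hello from %s</h1></body></html>\n",
                r->uri);
    return OK;
}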

- -At this point, you should more or less understand the following piece -of code, which is the handler which handles GET requests -which have no more specific handler; it also shows how conditional -GETs can be handled, if it's desirable to do so in a -particular response handler --- ap_set_last_modified checks -against the If-modified-since value supplied by the -client, if any, and returns an appropriate code (which will, if -nonzero, be USE_LOCAL_COPY). No similar considerations apply for -ap_set_content_length, but it returns an error code for -symmetry.

- -

-int default_handler (request_rec *r)
-{
-    int errstatus;
-    FILE *f;
-
-    if (r->method_number != M_GET) return DECLINED;
-    if (r->finfo.st_mode == 0) return NOT_FOUND;
-
-    if ((errstatus = ap_set_content_length (r, r->finfo.st_size))
-	|| (errstatus = ap_set_last_modified (r, r->finfo.st_mtime)))
-        return errstatus;
-
-    f = ap_pfopen (r->pool, r->filename, "r");
-
-    if (f == NULL) {
-        log_reason("file permissions deny server access",
-                   r->filename, r);
-        return FORBIDDEN;
-    }
-
-    register_timeout ("send", r);
-    ap_send_http_header (r);
-
-    if (!r->header_only) send_fd (f, r);
-    ap_pfclose (r->pool, f);
-    return OK;
-}
-
- -Finally, if all of this is too much of a challenge, there are a few -ways out of it. First off, as shown above, a response handler which -has not yet produced any output can simply return an error code, in -which case the server will automatically produce an error response. -Secondly, it can punt to some other handler by invoking -ap_internal_redirect, which is how the internal redirection -machinery discussed above is invoked. A response handler which has -internally redirected should always return OK.

- -(Invoking ap_internal_redirect from handlers which are -not response handlers will lead to serious confusion). - -

Special considerations for authentication - handlers

- -Stuff that should be discussed here in detail: - - - -

Special considerations for logging handlers

- -When a request has internally redirected, there is the question of -what to log. Apache handles this by bundling the entire chain of -redirects into a list of request_rec structures which are -threaded through the r->prev and r->next -pointers. The request_rec which is passed to the logging -handlers in such cases is the one which was originally built for the -initial request from the client; note that the bytes_sent field will -only be correct in the last request in the chain (the one for which a -response was actually sent). - -
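A sketch of a logging handler that allows for this (the handler name is a placeholder) might walk the chain before writing its log entry:

static int my_logger (request_rec *r)
{
    request_rec *last = r;

    while (last->next != NULL)       /* walk to the end of the redirect chain */
        last = last->next;

    /* log r->uri (the original request) together with last->bytes_sent */

    return OK;
}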

Resource allocation and resource pools

-

-One of the problems of writing and designing a server-pool server is -that of preventing leakage, that is, allocating resources (memory, -open files, etc.), without subsequently releasing them. The resource -pool machinery is designed to make it easy to prevent this from -happening, by allowing resources to be allocated in such a way that -they are automatically released when the server is done with -them. -

-

-The way this works is as follows: the memory which is allocated, file -opened, etc., to deal with a particular request are tied to a -resource pool which is allocated for the request. The pool -is a data structure which itself tracks the resources in question. -

-

-When the request has been processed, the pool is cleared. At -that point, all the memory associated with it is released for reuse, -all files associated with it are closed, and any other clean-up -functions which are associated with the pool are run. When this is -over, we can be confident that all the resources tied to the pool have -been released, and that none of them have leaked. -

-

-Server restarts, and allocation of memory and resources for per-server -configuration, are handled in a similar way. There is a -configuration pool, which keeps track of resources which were -allocated while reading the server configuration files, and handling -the commands therein (for instance, the memory that was allocated for -per-server module configuration, log files and other files that were -opened, and so forth). When the server restarts, and has to reread -the configuration files, the configuration pool is cleared, and so the -memory and file descriptors which were taken up by reading them the -last time are made available for reuse. -

-

-It should be noted that use of the pool machinery isn't generally -obligatory, except for situations like logging handlers, where you -really need to register cleanups to make sure that the log file gets -closed when the server restarts (this is most easily done by using the -function ap_pfopen, which also -arranges for the underlying file descriptor to be closed before any -child processes, such as for CGI scripts, are execed), or -in case you are using the timeout machinery (which isn't yet even -documented here). However, there are two benefits to using it: -resources allocated to a pool never leak (even if you allocate a -scratch string, and just forget about it); also, for memory -allocation, ap_palloc is generally faster than -malloc. -

-

-We begin here by describing how memory is allocated to pools, and then -discuss how other resources are tracked by the resource pool -machinery. -

-

Allocation of memory in pools

-

-Memory is allocated to pools by calling the function -ap_palloc, which takes two arguments, one being a pointer to -a resource pool structure, and the other being the amount of memory to -allocate (in chars). Within handlers for handling -requests, the most common way of getting a resource pool structure is -by looking at the pool slot of the relevant -request_rec; hence the repeated appearance of the -following idiom in module code: -

-
-int my_handler(request_rec *r)
-{
-    struct my_structure *foo;
-    ...
-
-    foo = (struct my_structure *)ap_palloc (r->pool, sizeof(struct my_structure));
-}
-
-

-Note that there is no ap_pfree --- -ap_palloced memory is freed only when the associated -resource pool is cleared. This means that ap_palloc does not -have to do as much accounting as malloc(); all it does in -the typical case is to round up the size, bump a pointer, and do a -range check. -

-

-(It also raises the possibility that heavy use of ap_palloc -could cause a server process to grow excessively large. There are -two ways to deal with this, which are dealt with below; briefly, you -can use malloc, and try to be sure that all of the memory -gets explicitly freed, or you can allocate a sub-pool of -the main pool, allocate your memory in the sub-pool, and clear it out -periodically. The latter technique is discussed in the section on -sub-pools below, and is used in the directory-indexing code, in order -to avoid excessive storage allocation when listing directories with -thousands of files). -

-

Allocating initialized memory

-

-There are functions which allocate initialized memory, and are -frequently useful. The function ap_pcalloc has the same -interface as ap_palloc, but clears out the memory it -allocates before it returns it. The function ap_pstrdup -takes a resource pool and a char * as arguments, and -allocates memory for a copy of the string the pointer points to, -returning a pointer to the copy. Finally ap_pstrcat is a -varargs-style function, which takes a pointer to a resource pool, and -at least two char * arguments, the last of which must be -NULL. It allocates enough memory to fit copies of each -of the strings, as a unit; for instance: -

-
-     ap_pstrcat (r->pool, "foo", "/", "bar", NULL);
-
-

-returns a pointer to 8 bytes worth of memory, initialized to -"foo/bar". -
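A short sketch pulling these together (the helper function is hypothetical):

static char *backup_name (request_rec *r)
{
    char *copy   = ap_pstrdup (r->pool, r->filename);          /* private copy     */
    char *backup = ap_pstrcat (r->pool, copy, ".bak", NULL);   /* "<filename>.bak" */

    /* ap_pcalloc (r->pool, n) would similarly hand back zero-filled memory */

    return backup;    /* released automatically when r->pool is cleared */
}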

-

Commonly-used pools in the Apache Web server

-

-A pool is really defined by its lifetime more than anything else. There -are some static pools in http_main which are passed to various -non-http_main functions as arguments at opportune times. Here they are: -

-
-
permanent_pool -
-
-
    -
  • never passed to anything else, this is the ancestor of all pools -
  • -
-
-
pconf -
-
-
    -
  • subpool of permanent_pool -
  • -
  • created at the beginning of a config "cycle"; exists until the - server is terminated or restarts; passed to all config-time - routines, either via cmd->pool, or as the "pool *p" argument on - those which don't take pools -
  • -
  • passed to the module init() functions -
  • -
-
-
ptemp -
-
-
    -
  • sorry I lie, this pool isn't called this currently in 1.3, I - renamed it this in my pthreads development. I'm referring to - the use of ptrans in the parent... contrast this with the later - definition of ptrans in the child. -
  • -
  • subpool of permanent_pool -
  • -
  • created at the beginning of a config "cycle"; exists until the - end of config parsing; passed to config-time routines via - cmd->temp_pool. Somewhat of a "bastard child" because it isn't - available everywhere. Used for temporary scratch space which - may be needed by some config routines but which is deleted at - the end of config. -
  • -
-
-
pchild -
-
-
    -
  • subpool of permanent_pool -
  • -
  • created when a child is spawned (or a thread is created); lives - until that child (thread) is destroyed -
  • -
  • passed to the module child_init functions -
  • -
  • destruction happens right after the child_exit functions are - called... (which may explain why I think child_exit is redundant - and unneeded) -
  • -
-
-
ptrans -
-
-
    -
  • should be a subpool of pchild, but currently is a subpool of - permanent_pool, see above -
  • -
  • cleared by the child before going into the accept() loop to receive - a connection -
  • -
  • used as connection->pool -
  • -
-
-
r->pool -
-
-
    -
  • for the main request this is a subpool of connection->pool; for - subrequests it is a subpool of the parent request's pool. -
  • -
  • exists until the end of the request (i.e., - ap_destroy_sub_req, or - in child_main after process_request has finished) -
  • -
  • note that r itself is allocated from r->pool; i.e., - r->pool is - first created and then r is the first thing palloc()d from it -
  • -
-
-
-

-For almost everything folks do, r->pool is the pool to use. But you -can see how other lifetimes, such as pchild, are useful to some -modules... such as modules that need to open a database connection once -per child, and wish to clean it up when the child dies. -

-

-You can also see how some bugs have manifested themselves, such as setting -connection->user to a value from r->pool -- in this case -connection exists -for the lifetime of ptrans, which is longer than r->pool (especially if -r->pool is a subrequest!). So the correct thing to do is to allocate -from connection->pool. -

-

-And there was another interesting bug in mod_include/mod_cgi. You'll see -in those that they do this test to decide if they should use r->pool -or r->main->pool. In this case the resource that they are registering -for cleanup is a child process. If it were registered in r->pool, -then the code would wait() for the child when the subrequest finishes. -With mod_include this could be any old #include, and the delay can be up -to 3 seconds... and happened quite frequently. Instead the subprocess -is registered in r->main->pool which causes it to be cleaned up when -the entire request is done -- i.e., after the output has been sent to -the client and logging has happened. -
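The test itself is tiny; a sketch of it (the helper name is made up):

/* Register subprocess cleanups here so a subrequest does not have to
 * wait() for the child when it finishes; the main request's pool lives
 * until the whole response (and logging) is done.
 */
static pool *cleanup_pool (request_rec *r)
{
    return (r->main != NULL) ? r->main->pool : r->pool;
}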

-

Tracking open files, etc.

-

-As indicated above, resource pools are also used to track other sorts -of resources besides memory. The most common are open files. The -routine which is typically used for this is ap_pfopen, which -takes a resource pool and two strings as arguments; the strings are -the same as the typical arguments to fopen, e.g., -

-
-     ...
-     FILE *f = ap_pfopen (r->pool, r->filename, "r");
-
-     if (f == NULL) { ... } else { ... }
-
-

-There is also an ap_popenf routine, which parallels the -lower-level open system call. Both of these routines -arrange for the file to be closed when the resource pool in question -is cleared. -

-

-Unlike the case for memory, there are functions to close -files allocated with ap_pfopen, and ap_popenf, -namely ap_pfclose and ap_pclosef. (This is -because, on many systems, the number of files which a single process -can have open is quite limited). It is important to use these -functions to close files allocated with ap_pfopen and -ap_popenf, since to do otherwise could cause fatal errors on -systems such as Linux, which react badly if the same -FILE* is closed more than once. -

-

-(Using the close functions is not mandatory, since the -file will eventually be closed regardless, but you should consider it -in cases where your module is opening, or could open, a lot of files). -

-

Other sorts of resources --- cleanup functions

-
-More text goes here. Describe the cleanup primitives in terms of -which the file stuff is implemented; also, spawn_process.
-

-Pool cleanups live until clear_pool() is called: clear_pool(a) recursively -calls destroy_pool() on all subpools of a; then calls all the cleanups for a; -then releases all the memory for a. destroy_pool(a) calls clear_pool(a) -and then releases the pool structure itself. i.e., clear_pool(a) doesn't -delete a, it just frees up all the resources and you can start using it -again immediately. -
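A sketch of registering a cleanup of your own (ap_register_cleanup and ap_null_cleanup are assumed from the 1.3 headers and are not described in the text above; the database functions are invented):

static void close_my_db (void *data)
{
    /* e.g. shut down a per-child database connection */
}

static void remember_db (pool *p, void *conn)
{
    /* run close_my_db when p is cleared or destroyed */
    ap_register_cleanup (p, conn, close_my_db, ap_null_cleanup);
}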

-

Fine control --- creating and dealing with sub-pools, with a note -on sub-requests

- -On rare occasions, too-free use of ap_palloc() and the -associated primitives may result in undesirably profligate resource -allocation. You can deal with such a case by creating a -sub-pool, allocating within the sub-pool rather than the main -pool, and clearing or destroying the sub-pool, which releases the -resources which were associated with it. (This really is a -rare situation; the only case in which it comes up in the standard -module set is in case of listing directories, and then only with -very large directories. Unnecessary use of the primitives -discussed here can hair up your code quite a bit, with very little -gain).

- -The primitive for creating a sub-pool is ap_make_sub_pool, -which takes another pool (the parent pool) as an argument. When the -main pool is cleared, the sub-pool will be destroyed. The sub-pool -may also be cleared or destroyed at any time, by calling the functions -ap_clear_pool and ap_destroy_pool, respectively. -(The difference is that ap_clear_pool frees resources -associated with the pool, while ap_destroy_pool also -deallocates the pool itself. In the former case, you can allocate new -resources within the pool, and clear it again, and so forth; in the -latter case, it is simply gone).
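The directory-listing idiom mentioned above looks roughly like this (a sketch; the function and loop are made up):

static void list_entries (request_rec *r, int n)
{
    pool *scratch = ap_make_sub_pool (r->pool);
    int i;

    for (i = 0; i < n; ++i) {
        char *line = ap_pstrcat (scratch, "entry ", "goes here", "\n", NULL);

        ap_rprintf (r, "%s", line);  /* send it to the client */
        ap_clear_pool (scratch);     /* throw away this entry's scratch space */
    }

    ap_destroy_pool (scratch);       /* finished with the sub-pool entirely */
}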

- -One final note --- sub-requests have their own resource pools, which -are sub-pools of the resource pool for the main request. The polite -way to reclaim the resources associated with a sub request which you -have allocated (using the ap_sub_req_... functions) -is ap_destroy_sub_req, which frees the resource pool. -Before calling this function, be sure to copy anything that you care -about which might be allocated in the sub-request's resource pool into -someplace a little less volatile (for instance, the filename in its -request_rec structure).

- -(Again, under most circumstances, you shouldn't feel obliged to call -this function; only 2K of memory or so are allocated for a typical sub -request, and it will be freed anyway when the main request pool is -cleared. It is only when you are allocating many, many sub-requests -for a single main request that you should seriously consider the -ap_destroy_... functions). - -

Configuration, commands and the like

- -One of the design goals for this server was to maintain external -compatibility with the NCSA 1.3 server --- that is, to read the same -configuration files, to process all the directives therein correctly, -and in general to be a drop-in replacement for NCSA. On the other -hand, another design goal was to move as much of the server's -functionality into modules which have as little as possible to do with -the monolithic server core. The only way to reconcile these goals is -to move the handling of most commands from the central server into the -modules.

- -However, just giving the modules command tables is not enough to -divorce them completely from the server core. The server has to -remember the commands in order to act on them later. That involves -maintaining data which is private to the modules, and which can be -either per-server, or per-directory. Most things are per-directory, -including in particular access control and authorization information, -but also information on how to determine file types from suffixes, -which can be modified by AddType and -DefaultType directives, and so forth. In general, the -governing philosophy is that anything which can be made -configurable by directory should be; per-server information is -generally used in the standard set of modules for information like -Aliases and Redirects which come into play -before the request is tied to a particular place in the underlying -file system.

- -Another requirement for emulating the NCSA server is being able to -handle the per-directory configuration files, generally called -.htaccess files, though even in the NCSA server they can -contain directives which have nothing at all to do with access -control. Accordingly, after URI -> filename translation, but before -performing any other phase, the server walks down the directory -hierarchy of the underlying filesystem, following the translated -pathname, to read any .htaccess files which might be -present. The information which is read in then has to be -merged with the applicable information from the server's own -config files (either from the <Directory> sections -in access.conf, or from defaults in -srm.conf, which actually behaves for most purposes almost -exactly like <Directory />).

- -Finally, after having served a request which involved reading -.htaccess files, we need to discard the storage allocated -for handling them. That is solved the same way it is solved wherever -else similar problems come up, by tying those structures to the -per-transaction resource pool.

- -

Per-directory configuration structures

- -Let's look at how all of this plays out in mod_mime.c, -which defines the file typing handler which emulates the NCSA server's -behavior of determining file types from suffixes. What we'll be -looking at, here, is the code which implements the -AddType and AddEncoding commands. These -commands can appear in .htaccess files, so they must be -handled in the module's private per-directory data, which, in fact, -consists of two separate tables for MIME types and -encoding information, and is declared as follows: - 
-typedef struct {
-    table *forced_types;      /* Additional AddTyped stuff */
-    table *encoding_types;    /* Added with AddEncoding... */
-} mime_dir_config;
-
- -When the server is reading a configuration file, or -<Directory> section, which includes one of the MIME -module's commands, it needs to create a mime_dir_config -structure, so those commands have something to act on. It does this -by invoking the function it finds in the module's `create per-dir -config slot', with two arguments: the name of the directory to which -this configuration information applies (or NULL for -srm.conf), and a pointer to a resource pool in which the -allocation should happen.

- -(If we are reading a .htaccess file, that resource pool -is the per-request resource pool for the request; otherwise it is a -resource pool which is used for configuration data, and cleared on -restarts. Either way, it is important for the structure being created -to vanish when the pool is cleared, by registering a cleanup on the -pool if necessary).

- -For the MIME module, the per-dir config creation function just -ap_pallocs the structure above, and creates a couple of -tables to fill it. That looks like this: - 

-void *create_mime_dir_config (pool *p, char *dummy)
-{
-    mime_dir_config *new =
-      (mime_dir_config *) ap_palloc (p, sizeof(mime_dir_config));
-
-    new->forced_types = ap_make_table (p, 4);
-    new->encoding_types = ap_make_table (p, 4);
-
-    return new;
-}
-
- -Now, suppose we've just read in a .htaccess file. We -already have the per-directory configuration structure for the next -directory up in the hierarchy. If the .htaccess file we -just read in didn't have any AddType or -AddEncoding commands, its per-directory config structure -for the MIME module is still valid, and we can just use it. -Otherwise, we need to merge the two structures somehow.

- -To do that, the server invokes the module's per-directory config merge -function, if one is present. That function takes three arguments: -the two structures being merged, and a resource pool in which to -allocate the result. For the MIME module, all that needs to be done -is overlay the tables from the new per-directory config structure with -those from the parent: - -

-void *merge_mime_dir_configs (pool *p, void *parent_dirv, void *subdirv)
-{
-    mime_dir_config *parent_dir = (mime_dir_config *)parent_dirv;
-    mime_dir_config *subdir = (mime_dir_config *)subdirv;
-    mime_dir_config *new =
-      (mime_dir_config *)ap_palloc (p, sizeof(mime_dir_config));
-
-    new->forced_types = ap_overlay_tables (p, subdir->forced_types,
-                                        parent_dir->forced_types);
-    new->encoding_types = ap_overlay_tables (p, subdir->encoding_types,
-                                          parent_dir->encoding_types);
-
-    return new;
-}
-
- -As a note --- if there is no per-directory merge function present, the -server will just use the subdirectory's configuration info, and ignore -the parent's. For some modules, that works just fine (e.g., for the -includes module, whose per-directory configuration information -consists solely of the state of the XBITHACK), and for -those modules, you can just not declare one, and leave the -corresponding structure slot in the module itself NULL.

- -

Command handling

- -Now that we have these structures, we need to be able to figure out -how to fill them. That involves processing the actual -AddType and AddEncoding commands. To find -commands, the server looks in the module's command table. -That table contains information on how many arguments the commands -take, and in what formats, where it is permitted, and so forth. That -information is sufficient to allow the server to invoke most -command-handling functions with pre-parsed arguments. Without further -ado, let's look at the AddType command handler, which -looks like this (the AddEncoding command looks basically -the same, and won't be shown here): - -
-char *add_type(cmd_parms *cmd, mime_dir_config *m, char *ct, char *ext)
-{
-    if (*ext == '.') ++ext;
-    ap_table_set (m->forced_types, ext, ct);
-    return NULL;
-}
-
- -This command handler is unusually simple. As you can see, it takes -four arguments, two of which are pre-parsed arguments, the third being -the per-directory configuration structure for the module in question, -and the fourth being a pointer to a cmd_parms structure. -That structure contains a bunch of arguments which are frequently of -use to some, but not all, commands, including a resource pool (from -which memory can be allocated, and to which cleanups should be tied), -and the (virtual) server being configured, from which the module's -per-server configuration data can be obtained if required.

- -Another way in which this particular command handler is unusually -simple is that there are no error conditions which it can encounter. -If there were, it could return an error message instead of -NULL; this causes an error to be printed out on the -server's stderr, followed by a quick exit, if it is in -the main config files; for a .htaccess file, the syntax -error is logged in the server error log (along with an indication of -where it came from), and the request is bounced with a server error -response (HTTP error status, code 500).
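For illustration, here is a hypothetical variant of add_type that does report a configuration error:

char *add_type_checked (cmd_parms *cmd, mime_dir_config *m, char *ct, char *ext)
{
    if (ext == NULL || *ext == '\0')
        return "AddType requires a non-empty file extension";

    if (*ext == '.') ++ext;
    ap_table_set (m->forced_types, ext, ct);
    return NULL;                     /* NULL means "no error" */
}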

- -The MIME module's command table has entries for these commands, which -look like this: - -

-command_rec mime_cmds[] = {
-{ "AddType", add_type, NULL, OR_FILEINFO, TAKE2,
-    "a mime type followed by a file extension" },
-{ "AddEncoding", add_encoding, NULL, OR_FILEINFO, TAKE2,
-    "an encoding (e.g., gzip), followed by a file extension" },
-{ NULL }
-};
-
- -The entries in these tables are: - - - -Finally, having set this all up, we have to use it. This is -ultimately done in the module's handlers, specifically for its -file-typing handler, which looks more or less like this; note that the -per-directory configuration structure is extracted from the -request_rec's per-directory configuration vector by using -the ap_get_module_config function. - -
-int find_ct(request_rec *r)
-{
-    int i;
-    char *fn = ap_pstrdup (r->pool, r->filename);
-    mime_dir_config *conf = (mime_dir_config *)
-             ap_get_module_config(r->per_dir_config, &mime_module);
-    char *type;
-
-    if (S_ISDIR(r->finfo.st_mode)) {
-        r->content_type = DIR_MAGIC_TYPE;
-        return OK;
-    }
-
-    if((i=ap_rind(fn,'.')) < 0) return DECLINED;
-    ++i;
-
-    if ((type = ap_table_get (conf->encoding_types, &fn[i])))
-    {
-        r->content_encoding = type;
-
-        /* go back to previous extension to try to use it as a type */
-
-        fn[i-1] = '\0';
-        if((i=ap_rind(fn,'.')) < 0) return OK;
-        ++i;
-    }
-
-    if ((type = ap_table_get (conf->forced_types, &fn[i])))
-    {
-        r->content_type = type;
-    }
-
-    return OK;
-}
-
-
- -

Side notes --- per-server configuration, virtual - servers, etc.

- -The basic ideas behind per-server module configuration are basically -the same as those for per-directory configuration; there is a creation -function and a merge function, the latter being invoked where a -virtual server has partially overridden the base server configuration, -and a combined structure must be computed. (As with per-directory -configuration, the default if no merge function is specified, and a -module is configured in some virtual server, is that the base -configuration is simply ignored).

- -The only substantial difference is that when a command needs to -configure the per-server private module data, it needs to go to the -cmd_parms data to get at it. Here's an example, from the -alias module, which also indicates how a syntax error can be returned -(note that the per-directory configuration argument to the command -handler is declared as a dummy, since the module doesn't actually have -per-directory config data): - -

-char *add_redirect(cmd_parms *cmd, void *dummy, char *f, char *url)
-{
-    server_rec *s = cmd->server;
-    alias_server_conf *conf = (alias_server_conf *)
-            ap_get_module_config(s->module_config,&alias_module);
-    alias_entry *new = ap_push_array (conf->redirects);
-
-    if (!ap_is_url (url)) return "Redirect to non-URL";
-
-    new->fake = f; new->real = url;
-    return NULL;
-}
-
- - diff --git a/docs/manual/misc/index.html b/docs/manual/misc/index.html index cb4ee732a6..2c31975345 100644 --- a/docs/manual/misc/index.html +++ b/docs/manual/misc/index.html @@ -20,12 +20,6 @@ Apache web server development project.

-
API -
-
Description of Apache's Application Programming Interface. -
How to use XSSI and Negotiation for custom ErrorDocuments