   1 <?xml version="1.0" encoding="UTF-8" ?>
   2 <!DOCTYPE manualpage SYSTEM "../style/manualpage.dtd">
   3 <?xml-stylesheet type="text/xsl" href="../style/manual.en.xsl"?>
   4 <!-- $LastChangedRevision$ -->
   5
   6 <!--
   7  Licensed to the Apache Software Foundation (ASF) under one or more
   8  contributor license agreements.  See the NOTICE file distributed with
   9  this work for additional information regarding copyright ownership.
  10  The ASF licenses this file to You under the Apache License, Version 2.0
  11  (the "License"); you may not use this file except in compliance with
  12  the License.  You may obtain a copy of the License at
  13
  14      http://www.apache.org/licenses/LICENSE-2.0
  15
  16  Unless required by applicable law or agreed to in writing, software
  17  distributed under the License is distributed on an "AS IS" BASIS,
  18  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  19  See the License for the specific language governing permissions and
  20  limitations under the License.
  21 -->
  22
  23 <manualpage metafile="API.xml.meta">
  24 <parentdocument href="./">Developer Documentation</parentdocument>
  25
  26 <title>Apache 1.3 API notes</title>
  27
  28 <summary>
  29     <note type="warning"><title>Warning</title>
  30       <p>This document has not been updated to take into account changes made
  31       in the 2.0 version of the Apache HTTP Server. Some of the information may
  32       still be relevant, but please use it with care.</p>
  33     </note>
  34
  35     <p>These are some notes on the Apache API and the data structures you have
  36     to deal with, <em>etc.</em> They are not yet nearly complete, but hopefully,
  37     they will help you get your bearings. Keep in mind that the API is still
  38     subject to change as we gain experience with it. (See the TODO file for
  39     what <em>might</em> be coming). However, it will be easy to adapt modules
  40     to any changes that are made. (We have more modules to adapt than you
  41     do).</p>
  42
  43     <p>A few notes on general pedagogical style here. In the interest of
  44     conciseness, all structure declarations here are incomplete -- the real
  45     ones have more slots that I'm not telling you about. For the most part,
  46     these are reserved to one component of the server core or another, and
  47     should be altered by modules with caution. However, in some cases, they
  48     really are things I just haven't gotten around to yet. Welcome to the
  49     bleeding edge.</p>
  50
  51     <p>Finally, here's an outline, to give you some bare idea of what's coming
  52     up, and in what order:</p>
  53
  54     <ul>
  55       <li>
  56         <a href="#basics">Basic concepts.</a>
  57
  58         <ul>
  59           <li><a href="#HMR">Handlers, Modules, and
  60           Requests</a></li>
  61
  62           <li><a href="#moduletour">A brief tour of a
  63           module</a></li>
  64         </ul>
  65       </li>
  66
  67       <li>
  68         <a href="#handlers">How handlers work</a>
  69
  70         <ul>
  71           <li><a href="#req_tour">A brief tour of the
  72           <code>request_rec</code></a></li>
  73
  74           <li><a href="#req_orig">Where request_rec structures come
  75           from</a></li>
  76
  77           <li><a href="#req_return">Handling requests, declining,
  78           and returning error codes</a></li>
  79
  80           <li><a href="#resp_handlers">Special considerations for
  81           response handlers</a></li>
  82
  83           <li><a href="#auth_handlers">Special considerations for
  84           authentication handlers</a></li>
  85
  86           <li><a href="#log_handlers">Special considerations for
  87           logging handlers</a></li>
  88         </ul>
  89       </li>
  90
  91       <li><a href="#pools">Resource allocation and resource
  92       pools</a></li>
  93
  94       <li>
  95         <a href="#config">Configuration, commands and the like</a>
  96
  97         <ul>
  98           <li><a href="#per-dir">Per-directory configuration
  99           structures</a></li>
 100
 101           <li><a href="#commands">Command handling</a></li>
 102
 103           <li><a href="#servconf">Side notes --- per-server
 104           configuration, virtual servers, <em>etc</em>.</a></li>
 105         </ul>
 106       </li>
 107     </ul>
 108 </summary>
 109
 110 <section id="basics"><title>Basic concepts</title>
 111     <p>We begin with an overview of the basic concepts behind the API, and how
 112     they are manifested in the code.</p>
 113
 114     <section id="HMR"><title>Handlers, Modules, and Requests</title>
 115       <p>Apache breaks down request handling into a series of steps, more or
 116       less the same way the Netscape server API does (although this API has a
 117       few more stages than NetSite does, as hooks for stuff I thought might be
 118       useful in the future). These are:</p>
 119
 120       <ul>
 121       <li>URI -&gt; Filename translation</li>
 122       <li>Auth ID checking [is the user who they say they are?]</li>
 123       <li>Auth access checking [is the user authorized <em>here</em>?]</li>
 124       <li>Access checking other than auth</li>
 125       <li>Determining MIME type of the object requested</li>
 126       <li>`Fixups' -- there aren't any of these yet, but the phase is intended
 127       as a hook for possible extensions like <directive module="mod_env"
 128       >SetEnv</directive>, which don't really fit well elsewhere.</li>
 129       <li>Actually sending a response back to the client.</li>
 130       <li>Logging the request</li>
 131       </ul>
 132
      <p>These phases are handled by looking at each of a succession of
      <em>modules</em>, looking to see if each of them has a handler for the
      phase, and attempting to invoke it if so. The handler can typically do one
      of three things:</p>
 137
 138       <ul>
 139       <li><em>Handle</em> the request, and indicate that it has done so by
 140       returning the magic constant <code>OK</code>.</li>
 141
 142       <li><em>Decline</em> to handle the request, by returning the magic integer
 143       constant <code>DECLINED</code>. In this case, the server behaves in all
 144       respects as if the handler simply hadn't been there.</li>
 145
 146       <li>Signal an error, by returning one of the HTTP error codes. This
 147       terminates normal handling of the request, although an ErrorDocument may
 148       be invoked to try to mop up, and it will be logged in any case.</li>
 149       </ul>
 150
 151       <p>Most phases are terminated by the first module that handles them;
 152       however, for logging, `fixups', and non-access authentication checking,
 153       all handlers always run (barring an error). Also, the response phase is
 154       unique in that modules may declare multiple handlers for it, via a
 155       dispatch table keyed on the MIME type of the requested object. Modules may
 156       declare a response-phase handler which can handle <em>any</em> request,
 157       by giving it the key <code>*/*</code> (<em>i.e.</em>, a wildcard MIME type
 158       specification). However, wildcard handlers are only invoked if the server
 159       has already tried and failed to find a more specific response handler for
 160       the MIME type of the requested object (either none existed, or they all
 161       declined).</p>
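
      <p>The lookup order for response handlers described above can be
      sketched as follows. This is a toy model rather than the server's own
      code: it uses a single dispatch table instead of one per module, the
      <code>request_rec</code> is cut down to one field, and the names
      <code>try_handlers</code>, <code>invoke_response_handler</code>,
      <code>picky_handler</code> and <code>last_handler</code> are
      hypothetical. The numeric values of <code>OK</code> and
      <code>DECLINED</code> are assumptions made for illustration.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the server's constants; the values are assumed here. */
#define OK        0
#define DECLINED (-1)

/* Cut-down request_rec: only the field this sketch needs. */
typedef struct { const char *content_type; } request_rec;

typedef int (*handler_fn)(request_rec *);
typedef struct { const char *content_type; handler_fn handler; } handler_rec;

/* Records which toy handler accepted the request (illustration only). */
const char *last_handler = "";

int cgi_handler(request_rec *r)     { (void)r; last_handler = "cgi"; return OK; }
int picky_handler(request_rec *r)   { (void)r; return DECLINED; }
int default_handler(request_rec *r) { (void)r; last_handler = "default"; return OK; }

/* One pass over the table. With wildcard == 0 only exact MIME-type
 * matches are tried; with wildcard == 1 only the wildcard entries are. */
static int try_handlers(const handler_rec *tab, request_rec *r, int wildcard)
{
    const handler_rec *h;

    for (h = tab; h->content_type != NULL; ++h) {
        int is_wild = (strcmp(h->content_type, "*/*") == 0);
        int rv;

        if (is_wild != wildcard)
            continue;
        if (!wildcard && strcmp(h->content_type, r->content_type) != 0)
            continue;

        rv = h->handler(r);
        if (rv != DECLINED)
            return rv;              /* handled, or an error code */
    }
    return DECLINED;                /* nothing matched, or all declined */
}

/* Specific handlers first; wildcard handlers only as a fallback. */
int invoke_response_handler(const handler_rec *tab, request_rec *r)
{
    int rv = try_handlers(tab, r, 0);

    if (rv == DECLINED)
        rv = try_handlers(tab, r, 1);
    return rv;
}
```

      <p>In this sketch the position of the wildcard entry in the table does
      not matter: it is consulted only after every exact match has declined or
      no exact match exists.</p>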
 162
      <p>The handlers themselves are functions of one argument (a
      <code>request_rec</code> structure; see below), which return an integer,
      as above.</p>
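
      <p>As a rough illustration of how the three possible return values
      drive a phase, here is a minimal sketch of a phase runner. It is not
      taken from the server source: <code>run_phase</code>, the
      <code>run_all</code> flag, the toy handlers and the cut-down
      <code>request_rec</code> are all hypothetical, and the numeric values
      of the constants are assumptions. A <code>NULL</code> entry models a
      module which has no handler for the phase.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the server's constants; the values are assumed here. */
#define OK         0
#define DECLINED  (-1)
#define FORBIDDEN  403

/* Cut-down request_rec that just counts how many handlers ran. */
typedef struct request_rec { int calls; } request_rec;

typedef int (*phase_fn)(request_rec *);

/* Toy handlers, one for each possible outcome. */
int declines(request_rec *r) { r->calls++; return DECLINED; }
int handles(request_rec *r)  { r->calls++; return OK; }
int forbids(request_rec *r)  { r->calls++; return FORBIDDEN; }

/*
 * Run one phase over the modules' handlers, in order.
 *   run_all == 0: the first handler that returns OK ends the phase.
 *   run_all != 0: every handler runs (as for logging and fixups).
 * In both modes an error code aborts the phase at once.
 */
int run_phase(phase_fn const *handlers, size_t n, request_rec *r, int run_all)
{
    int result = DECLINED;
    size_t i;

    assert(handlers != NULL && r != NULL);

    for (i = 0; i < n; ++i) {
        int rv;

        if (handlers[i] == NULL)
            continue;               /* module has no handler for this phase */

        rv = handlers[i](r);
        if (rv == DECLINED)
            continue;               /* as if the handler hadn't been there */
        if (rv != OK)
            return rv;              /* error code: stop normal handling */

        result = OK;
        if (!run_all)
            break;                  /* first taker terminates the phase */
    }
    return result;
}
```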
 166     </section>
 167
 168     <section id="moduletour"><title>A brief tour of a module</title>
 169       <p>At this point, we need to explain the structure of a module. Our
 170       candidate will be one of the messier ones, the CGI module -- this handles
 171       both CGI scripts and the <directive module="mod_alias"
 172       >ScriptAlias</directive> config file command. It's actually a great deal
 173       more complicated than most modules, but if we're going to have only one
 174       example, it might as well be the one with its fingers in every place.</p>
 175
      <p>Let's begin with handlers. In order to handle the CGI scripts, the
      module declares a response handler for them. Because of <directive
      module="mod_alias">ScriptAlias</directive>, it also has handlers for the
      name translation phase (to recognize <directive module="mod_alias"
      >ScriptAlias</directive>ed URIs) and the type-checking phase (any
      <directive module="mod_alias">ScriptAlias</directive>ed request is typed
      as a CGI script).</p>
 183
      <p>The module needs to maintain some per (virtual) server information,
      namely, the <directive module="mod_alias">ScriptAlias</directive>es in
      effect; the module structure therefore contains pointers to a function
      which builds these structures, and to another which combines two of them
      (in case the main server and a virtual server both have <directive
      module="mod_alias">ScriptAlias</directive>es declared).</p>
 190
 191       <p>Finally, this module contains code to handle the <directive
 192       module="mod_alias">ScriptAlias</directive> command itself. This particular
 193       module only declares one command, but there could be more, so modules have
 194       <em>command tables</em> which declare their commands, and describe where
 195       they are permitted, and how they are to be invoked.</p>
 196
 197       <p>A final note on the declared types of the arguments of some of these
 198       commands: a <code>pool</code> is a pointer to a <em>resource pool</em>
 199       structure; these are used by the server to keep track of the memory which
 200       has been allocated, files opened, <em>etc.</em>, either to service a
 201       particular request, or to handle the process of configuring itself. That
 202       way, when the request is over (or, for the configuration pool, when the
 203       server is restarting), the memory can be freed, and the files closed,
 204       <em>en masse</em>, without anyone having to write explicit code to track
 205       them all down and dispose of them. Also, a <code>cmd_parms</code>
 206       structure contains various information about the config file being read,
 207       and other status information, which is sometimes of use to the function
 208       which processes a config-file command (such as <directive
 209       module="mod_alias">ScriptAlias</directive>). With no further ado, the
 210       module itself:</p>
 211
 212       <example>
 213         /* Declarations of handlers. */<br />
 214         <br />
 215         int translate_scriptalias (request_rec *);<br />
 216         int type_scriptalias (request_rec *);<br />
 217         int cgi_handler (request_rec *);<br />
 218         <br />
 219         /* Subsidiary dispatch table for response-phase <br />
 220         &nbsp;* handlers, by MIME type */<br />
 221         <br />
 222         handler_rec cgi_handlers[] = {<br />
 223         <indent>
 224           { "application/x-httpd-cgi", cgi_handler },<br />
 225           { NULL }<br />
 226         </indent>
 227         };<br />
 228         <br />
 229         /* Declarations of routines to manipulate the <br />
 230         &nbsp;* module's configuration info.  Note that these are<br />
 231         &nbsp;* returned, and passed in, as void *'s; the server<br />
 232         &nbsp;* core keeps track of them, but it doesn't, and can't,<br />
 233         &nbsp;* know their internal structure.<br />
 234         &nbsp;*/<br />
 235         <br />
 236         void *make_cgi_server_config (pool *);<br />
 237         void *merge_cgi_server_config (pool *, void *, void *);<br />
 238         <br />
 239         /* Declarations of routines to handle config-file commands */<br />
 240         <br />
 241         extern char *script_alias(cmd_parms *, void *per_dir_config, char *fake,
 242                                   char *real);<br />
 243         <br />
 244         command_rec cgi_cmds[] = {<br />
 245         <indent>
 246           { "ScriptAlias", script_alias, NULL, RSRC_CONF, TAKE2,<br />
 247           <indent>"a fakename and a realname"},<br /></indent>
 248           { NULL }<br />
 249         </indent>
 250         };<br />
 251         <br />
 252         module cgi_module = {
 253 <pre>  STANDARD_MODULE_STUFF,
 254   NULL,                     /* initializer */
 255   NULL,                     /* dir config creator */
 256   NULL,                     /* dir merger */
 257   make_cgi_server_config,   /* server config */
 258   merge_cgi_server_config,  /* merge server config */
 259   cgi_cmds,                 /* command table */
 260   cgi_handlers,             /* handlers */
 261   translate_scriptalias,    /* filename translation */
 262   NULL,                     /* check_user_id */
 263   NULL,                     /* check auth */
 264   NULL,                     /* check access */
 265   type_scriptalias,         /* type_checker */
 266   NULL,                     /* fixups */
 267   NULL,                     /* logger */
 268   NULL                      /* header parser */
 269 };</pre>
 270       </example>
 271     </section>
 272 </section>
 273
 274 <section id="handlers"><title>How handlers work</title>
 275     <p>The sole argument to handlers is a <code>request_rec</code> structure.
 276     This structure describes a particular request which has been made to the
 277     server, on behalf of a client. In most cases, each connection to the
 278     client generates only one <code>request_rec</code> structure.</p>
 279
 280     <section id="req_tour"><title>A brief tour of the request_rec</title>
 281       <p>The <code>request_rec</code> contains pointers to a resource pool
 282       which will be cleared when the server is finished handling the request;
 283       to structures containing per-server and per-connection information, and
 284       most importantly, information on the request itself.</p>
 285
 286       <p>The most important such information is a small set of character strings
 287       describing attributes of the object being requested, including its URI,
 288       filename, content-type and content-encoding (these being filled in by the
 289       translation and type-check handlers which handle the request,
 290       respectively).</p>
 291
 292       <p>Other commonly used data items are tables giving the MIME headers on
 293       the client's original request, MIME headers to be sent back with the
 294       response (which modules can add to at will), and environment variables for
 295       any subprocesses which are spawned off in the course of servicing the
 296       request. These tables are manipulated using the <code>ap_table_get</code>
 297       and <code>ap_table_set</code> routines.</p>
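
      <p>To give a feel for what "get" and "set" mean for such a table, here
      is a small self-contained model. It is only a sketch under stated
      assumptions, not the server's table implementation: the
      <code>toy_table</code> type and the <code>toy_table_get</code> and
      <code>toy_table_set</code> functions are hypothetical, the table has a
      fixed capacity, and it stores the caller's pointers rather than copying
      strings into a pool. Keys are compared without regard to case, since
      HTTP header names are case-insensitive.</p>

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define TOY_TABLE_MAX 16            /* fixed capacity; an assumption */

typedef struct { const char *key; const char *val; } toy_entry;
typedef struct { toy_entry elts[TOY_TABLE_MAX]; size_t nelts; } toy_table;

/* Case-insensitive string equality. */
static int keys_equal(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        ++a;
        ++b;
    }
    return *a == *b;                /* equal only if both ended together */
}

/* Return the value stored under key, or NULL if there is none. */
const char *toy_table_get(const toy_table *t, const char *key)
{
    size_t i;

    assert(t != NULL && key != NULL);
    for (i = 0; i < t->nelts; ++i)
        if (keys_equal(t->elts[i].key, key))
            return t->elts[i].val;
    return NULL;
}

/* Store val under key, replacing any existing value.
 * Returns 0 on success, -1 if the toy table is full. */
int toy_table_set(toy_table *t, const char *key, const char *val)
{
    size_t i;

    assert(t != NULL && key != NULL);
    for (i = 0; i < t->nelts; ++i) {
        if (keys_equal(t->elts[i].key, key)) {
            t->elts[i].val = val;
            return 0;
        }
    }
    if (t->nelts == TOY_TABLE_MAX)
        return -1;
    t->elts[t->nelts].key = key;
    t->elts[t->nelts].val = val;
    t->nelts++;
    return 0;
}
```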
 298
 299       <note>
 300         <p>Note that the <code>Content-type</code> header value <em>cannot</em>
 301         be set by module content-handlers using the <code>ap_table_*()</code>
 302         routines. Rather, it is set by pointing the <code>content_type</code>
 303         field in the <code>request_rec</code> structure to an appropriate
 304         string. <em>e.g.</em>,</p>
 305         <example>
 306           r-&gt;content_type = "text/html";
 307         </example>
 308       </note>
 309
      <p>Finally, there are pointers to two data structures which, in turn,
      point to per-module configuration structures. One holds pointers to the
      data structures which each module has built to describe the way it has
      been configured to operate in a given directory (via
      <code>.htaccess</code> files or <directive type="section" module="core"
      >Directory</directive> sections); the other holds pointers to private
      data which each module has built in the course of servicing the request
      (so modules' handlers for one phase can pass `notes' to their handlers
      for other phases). There is another such configuration vector in the
      <code>server_rec</code> data structure pointed to by the
      <code>request_rec</code>, which contains per (virtual) server
      configuration data.</p>
 321
 322       <p>Here is an abridged declaration, giving the fields most commonly
 323       used:</p>
 324
 325       <example>
 326         struct request_rec {<br />
 327         <br />
 328         pool *pool;<br />
 329         conn_rec *connection;<br />
 330         server_rec *server;<br />
 331         <br />
 332         /* What object is being requested */<br />
 333         <br />
 334         char *uri;<br />
 335         char *filename;<br />
 336         char *path_info;
 337 <pre>char *args;           /* QUERY_ARGS, if any */
 338 struct stat finfo;    /* Set by server core;
 339                        * st_mode set to zero if no such file */</pre>
 340         char *content_type;<br />
 341         char *content_encoding;<br />
 342         <br />
 343         /* MIME header environments, in and out. Also, <br />
 344         &nbsp;* an array containing environment variables to<br />
 345         &nbsp;* be passed to subprocesses, so people can write<br />
 346         &nbsp;* modules to add to that environment.<br />
 347         &nbsp;*<br />
 348         &nbsp;* The difference between headers_out and <br />
 349         &nbsp;* err_headers_out is that the latter are printed <br />
 350         &nbsp;* even on error, and persist across internal<br />
 351         &nbsp;* redirects (so the headers printed for <br />
 352         &nbsp;* <directive module="core">ErrorDocument</directive> handlers will have
 353          them).<br />
 354         &nbsp;*/<br />
 355          <br />
 356         table *headers_in;<br />
 357         table *headers_out;<br />
 358         table *err_headers_out;<br />
 359         table *subprocess_env;<br />
 360         <br />
 361         /* Info about the request itself... */<br />
 362         <br />
 363 <pre>int header_only;     /* HEAD request, as opposed to GET */
 364 char *protocol;      /* Protocol, as given to us, or HTTP/0.9 */
 365 char *method;        /* GET, HEAD, POST, <em>etc.</em> */
 366 int method_number;   /* M_GET, M_POST, <em>etc.</em> */
 367
 368 </pre>
 369         /* Info for logging */<br />
 370         <br />
 371         char *the_request;<br />
 372         int bytes_sent;<br />
 373         <br />
 374         /* A flag which modules can set, to indicate that<br />
 375         &nbsp;* the data being returned is volatile, and clients<br />
 376         &nbsp;* should be told not to cache it.<br />
 377         &nbsp;*/<br />
 378         <br />
 379         int no_cache;<br />
 380         <br />
 381         /* Various other config info which may change<br />
 382         &nbsp;* with .htaccess files<br />
 383         &nbsp;* These are config vectors, with one void*<br />
 384         &nbsp;* pointer for each module (the thing pointed<br />
 385         &nbsp;* to being the module's business).<br />
 386         &nbsp;*/<br />
 387         <br />
 388 <pre>void *per_dir_config;   /* Options set in config files, <em>etc.</em> */
 389 void *request_config;   /* Notes on *this* request */</pre>
 390         <br />
 391         };
 392       </example>
 393     </section>
 394
 395     <section id="req_orig"><title>Where request_rec structures come from</title>
 396       <p>Most <code>request_rec</code> structures are built by reading an HTTP
 397       request from a client, and filling in the fields. However, there are a
 398       few exceptions:</p>
 399
 400       <ul>
 401       <li>If the request is to an imagemap, a type map (<em>i.e.</em>, a
 402       <code>*.var</code> file), or a CGI script which returned a local
 403       `Location:', then the resource which the user requested is going to be
 404       ultimately located by some URI other than what the client originally
 405       supplied. In this case, the server does an <em>internal redirect</em>,
 406       constructing a new <code>request_rec</code> for the new URI, and
 407       processing it almost exactly as if the client had requested the new URI
 408       directly.</li>
 409
 410       <li>If some handler signaled an error, and an <code>ErrorDocument</code>
 411       is in scope, the same internal redirect machinery comes into play.</li>
 412
 413       <li><p>Finally, a handler occasionally needs to investigate `what would
 414       happen if' some other request were run. For instance, the directory
 415       indexing module needs to know what MIME type would be assigned to a
 416       request for each directory entry, in order to figure out what icon to
 417       use.</p>
 418
 419       <p>Such handlers can construct a <em>sub-request</em>, using the
 420       functions <code>ap_sub_req_lookup_file</code>,
 421       <code>ap_sub_req_lookup_uri</code>, and <code>ap_sub_req_method_uri</code>;
      these construct a new <code>request_rec</code> structure and process it
 423       as you would expect, up to but not including the point of actually sending
 424       a response. (These functions skip over the access checks if the
 425       sub-request is for a file in the same directory as the original
 426       request).</p>
 427
 428       <p>(Server-side includes work by building sub-requests and then actually
 429       invoking the response handler for them, via the function
 430       <code>ap_run_sub_req</code>).</p>
 431       </li>
 432       </ul>
 433     </section>
 434
 435     <section id="req_return"><title>Handling requests, declining, and returning
 436     error codes</title>
 437       <p>As discussed above, each handler, when invoked to handle a particular
 438       <code>request_rec</code>, has to return an <code>int</code> to indicate
 439       what happened. That can either be</p>
 440
 441       <ul>
 442       <li><code>OK</code> -- the request was handled successfully. This may or
 443       may not terminate the phase.</li>
 444
 445       <li><code>DECLINED</code> -- no erroneous condition exists, but the module
 446       declines to handle the phase; the server tries to find another.</li>
 447
 448       <li>an HTTP error code, which aborts handling of the request.</li>
 449       </ul>
 450
 451       <p>Note that if the error code returned is <code>REDIRECT</code>, then
 452       the module should put a <code>Location</code> in the request's
 453       <code>headers_out</code>, to indicate where the client should be
 454       redirected <em>to</em>.</p>
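
      <p>A minimal sketch of a handler which redirects might look like this.
      Everything in it is a stand-in chosen for illustration: the
      <code>request_rec</code> is cut down to two fields, a single
      <code>location</code> slot takes the place of the real
      <code>headers_out</code> table, the handler name
      <code>old_docs_redirect</code> and both URLs are hypothetical, and the
      numeric values of the constants are assumptions.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the server's constants; the values are assumed here. */
#define DECLINED (-1)
#define REDIRECT  302

/* A single slot standing in for the headers_out table. */
typedef struct { const char *location; } toy_headers;

/* Cut-down request_rec with only what this sketch needs. */
typedef struct {
    const char *uri;
    toy_headers headers_out;
} request_rec;

/* Send clients asking for a retired URI to its replacement. */
int old_docs_redirect(request_rec *r)
{
    assert(r != NULL && r->uri != NULL);

    if (strcmp(r->uri, "/old-docs/") != 0)
        return DECLINED;            /* not ours; let another module try */

    /* Say where to go *before* returning the redirect status. */
    r->headers_out.location = "http://www.example.com/docs/";
    return REDIRECT;
}
```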
 455     </section>
 456
 457     <section id="resp_handlers"><title>Special considerations for response
 458     handlers</title>
 459       <p>Handlers for most phases do their work by simply setting a few fields
 460       in the <code>request_rec</code> structure (or, in the case of access
 461       checkers, simply by returning the correct error code). However, response
      handlers have to actually send a response back to the client.</p>
 463
 464       <p>They should begin by sending an HTTP response header, using the
 465       function <code>ap_send_http_header</code>. (You don't have to do anything
 466       special to skip sending the header for HTTP/0.9 requests; the function
 467       figures out on its own that it shouldn't do anything). If the request is
 468       marked <code>header_only</code>, that's all they should do; they should
 469       return after that, without attempting any further output.</p>
 470
      <p>Otherwise, they should produce a response body which responds to the
 472       client as appropriate. The primitives for this are <code>ap_rputc</code>
 473       and <code>ap_rprintf</code>, for internally generated output, and
 474       <code>ap_send_fd</code>, to copy the contents of some <code>FILE *</code>
 475       straight to the client.</p>
 476
 477       <p>At this point, you should more or less understand the following piece
 478       of code, which is the handler which handles <code>GET</code> requests
 479       which have no more specific handler; it also shows how conditional
 480       <code>GET</code>s can be handled, if it's desirable to do so in a
 481       particular response handler -- <code>ap_set_last_modified</code> checks
 482       against the <code>If-modified-since</code> value supplied by the client,
 483       if any, and returns an appropriate code (which will, if nonzero, be
 484       USE_LOCAL_COPY). No similar considerations apply for
 485       <code>ap_set_content_length</code>, but it returns an error code for
 486       symmetry.</p>
 487
 488       <example>
 489         int default_handler (request_rec *r)<br />
 490         {<br />
 491         <indent>
 492           int errstatus;<br />
 493           FILE *f;<br />
 494           <br />
 495           if (r-&gt;method_number != M_GET) return DECLINED;<br />
 496           if (r-&gt;finfo.st_mode == 0) return NOT_FOUND;<br />
 497           <br />
 498           if ((errstatus = ap_set_content_length (r, r-&gt;finfo.st_size))<br />
 499           &nbsp;&nbsp;&nbsp;&nbsp;||
 500              (errstatus = ap_set_last_modified (r, r-&gt;finfo.st_mtime)))<br />
 501           return errstatus;<br />
 502           <br />
          f = ap_pfopen (r-&gt;pool, r-&gt;filename, "r");<br />
 504           <br />
 505           if (f == NULL) {<br />
 506           <indent>
 507             log_reason("file permissions deny server access", r-&gt;filename, r);<br />
 508             return FORBIDDEN;<br />
 509           </indent>
 510           }<br />
 511           <br />
 512           register_timeout ("send", r);<br />
 513           ap_send_http_header (r);<br />
 514           <br />
          if (!r-&gt;header_only) ap_send_fd (f, r);<br />
 516           ap_pfclose (r-&gt;pool, f);<br />
 517           return OK;<br />
 518         </indent>
 519         }
 520       </example>
 521
 522       <p>Finally, if all of this is too much of a challenge, there are a few
 523       ways out of it. First off, as shown above, a response handler which has
 524       not yet produced any output can simply return an error code, in which
 525       case the server will automatically produce an error response. Secondly,
 526       it can punt to some other handler by invoking
 527       <code>ap_internal_redirect</code>, which is how the internal redirection
 528       machinery discussed above is invoked. A response handler which has
 529       internally redirected should always return <code>OK</code>.</p>
 530
 531       <p>(Invoking <code>ap_internal_redirect</code> from handlers which are
 532       <em>not</em> response handlers will lead to serious confusion).</p>
 533     </section>
 534
 535     <section id="auth_handlers"><title>Special considerations for authentication
 536     handlers</title>
 537       <p>Stuff that should be discussed here in detail:</p>
 538
 539       <ul>
 540       <li>Authentication-phase handlers not invoked unless auth is
 541       configured for the directory.</li>
 542
 543       <li>Common auth configuration stored in the core per-dir
 544       configuration; it has accessors <code>ap_auth_type</code>,
 545       <code>ap_auth_name</code>, and <code>ap_requires</code>.</li>
 546
 547       <li>Common routines, to handle the protocol end of things, at
 548       least for HTTP basic authentication
 549       (<code>ap_get_basic_auth_pw</code>, which sets the
 550       <code>connection-&gt;user</code> structure field
 551       automatically, and <code>ap_note_basic_auth_failure</code>,
 552       which arranges for the proper <code>WWW-Authenticate:</code>
 553       header to be sent back).</li>
 554       </ul>
 555     </section>
 556
 557     <section id="log_handlers"><title>Special considerations for logging
 558     handlers</title>
 559       <p>When a request has internally redirected, there is the question of
 560       what to log. Apache handles this by bundling the entire chain of redirects
 561       into a list of <code>request_rec</code> structures which are threaded
 562       through the <code>r-&gt;prev</code> and <code>r-&gt;next</code> pointers.
 563       The <code>request_rec</code> which is passed to the logging handlers in
 564       such cases is the one which was originally built for the initial request
 565       from the client; note that the <code>bytes_sent</code> field will only be
 566       correct in the last request in the chain (the one for which a response was
 567       actually sent).</p>
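
      <p>A logging handler which wants the byte count therefore has to walk
      the chain. The following sketch shows one way to do so; it assumes a
      cut-down <code>request_rec</code> holding only the fields mentioned
      above plus <code>uri</code>, and the helper name
      <code>last_in_chain</code> is hypothetical.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down request_rec: just the chain pointers and two data fields. */
typedef struct request_rec request_rec;
struct request_rec {
    request_rec *prev;      /* request we were redirected from, if any */
    request_rec *next;      /* request we were redirected to, if any */
    const char  *uri;
    int          bytes_sent;
};

/* Follow r->next to the request for which a response was actually sent. */
const request_rec *last_in_chain(const request_rec *r)
{
    assert(r != NULL);
    while (r->next != NULL)
        r = r->next;
    return r;
}
```

      <p>A logger built on this would typically report the <code>uri</code>
      of the request it was handed together with
      <code>last_in_chain(r)-&gt;bytes_sent</code>.</p>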
 568     </section>
 569 </section>
 570
 571 <section id="pools"><title>Resource allocation and resource pools</title>
 572     <p>One of the problems of writing and designing a server-pool server is
 573     that of preventing leakage, that is, allocating resources (memory, open
 574     files, <em>etc.</em>), without subsequently releasing them. The resource
 575     pool machinery is designed to make it easy to prevent this from happening,
    by allowing resources to be allocated in such a way that they are
 577     <em>automatically</em> released when the server is done with them.</p>
 578
    <p>The way this works is as follows: the memory which is allocated, files
    opened, <em>etc.</em>, to deal with a particular request are tied to a
 581     <em>resource pool</em> which is allocated for the request. The pool is a
 582     data structure which itself tracks the resources in question.</p>
 583
 584     <p>When the request has been processed, the pool is <em>cleared</em>. At
 585     that point, all the memory associated with it is released for reuse, all
 586     files associated with it are closed, and any other clean-up functions which
 587     are associated with the pool are run. When this is over, we can be confident
    that all the resources tied to the pool have been released, and that none of
 589     them have leaked.</p>
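
    <p>The idea of tying clean-up functions to a pool and running them all
    when the pool is cleared can be modelled in a few lines. This is a toy
    sketch, not the server's pool code: <code>toy_pool</code>,
    <code>toy_register_cleanup</code>, <code>toy_clear_pool</code> and
    <code>count_release</code> are hypothetical names, and the bookkeeping
    records are obtained with plain <code>malloc</code> to keep the example
    short.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* One registered clean-up: a function and the resource it releases. */
typedef struct cleanup {
    void (*fn)(void *);
    void *data;
    struct cleanup *next;
} cleanup;

/* The toy pool tracks nothing but its list of clean-ups. */
typedef struct { cleanup *cleanups; } toy_pool;

/* Tie a resource to the pool. Returns 0 on success, -1 if out of memory. */
int toy_register_cleanup(toy_pool *p, void (*fn)(void *), void *data)
{
    cleanup *c;

    assert(p != NULL && fn != NULL);
    c = malloc(sizeof *c);
    if (c == NULL)
        return -1;
    c->fn = fn;
    c->data = data;
    c->next = p->cleanups;          /* newest first */
    p->cleanups = c;
    return 0;
}

/* Release everything tied to the pool; the pool can then be reused. */
void toy_clear_pool(toy_pool *p)
{
    assert(p != NULL);
    while (p->cleanups != NULL) {
        cleanup *c = p->cleanups;

        p->cleanups = c->next;
        c->fn(c->data);             /* e.g. close a file, free a buffer */
        free(c);
    }
}

/* Example clean-up: counts how many resources have been released. */
void count_release(void *data) { ++*(int *)data; }
```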
 590
 591     <p>Server restarts, and allocation of memory and resources for per-server
 592     configuration, are handled in a similar way. There is a <em>configuration
 593     pool</em>, which keeps track of resources which were allocated while reading
 594     the server configuration files, and handling the commands therein (for
 595     instance, the memory that was allocated for per-server module configuration,
 596     log files and other files that were opened, and so forth). When the server
 597     restarts, and has to reread the configuration files, the configuration pool
 598     is cleared, and so the memory and file descriptors which were taken up by
 599     reading them the last time are made available for reuse.</p>
 600
    <p>Use of the pool machinery isn't generally obligatory. The exceptions
    are situations like logging handlers, where you really need to register
    cleanups to make sure that the log file gets closed when the server
    restarts, and code that uses the timeout machinery (which isn't yet even
    documented here). For log files, this is most easily done with the
    function <code><a href="#pool-files">ap_pfopen</a></code>, which also
    arranges for the underlying file descriptor to be closed before any child
    processes, such as those for CGI scripts, are <code>exec</code>ed. Even
    where pools are optional, there are two benefits to using them: resources
    allocated to a pool never leak (even if you allocate a scratch string and
    just forget about it), and, for memory allocation,
    <code>ap_palloc</code> is generally faster than <code>malloc</code>.</p>
 613
 614     <p>We begin here by describing how memory is allocated to pools, and then
 615     discuss how other resources are tracked by the resource pool machinery.</p>
 616
 617     <section><title>Allocation of memory in pools</title>
      <p>Memory is allocated to pools by calling the function
      <code>ap_palloc</code>, which takes two arguments: a pointer to a
      resource pool structure, and the amount of memory to allocate (in
      <code>char</code>s). Within request handlers, the most common way of
      getting a resource pool structure is by looking at the
      <code>pool</code> slot of the relevant <code>request_rec</code>; hence
      the repeated appearance of the following idiom in module code:</p>
 625
 626       <example>
 627         int my_handler(request_rec *r)<br />
 628         {<br />
 629         <indent>
 630           struct my_structure *foo;<br />
 631           ...<br />
 632           <br />
          foo = (struct my_structure *)ap_palloc (r-&gt;pool, sizeof(struct my_structure));<br />
 634         </indent>
 635         }
 636       </example>
 637
 638       <p>Note that <em>there is no <code>ap_pfree</code></em> --
 639       <code>ap_palloc</code>ed memory is freed only when the associated resource
 640       pool is cleared. This means that <code>ap_palloc</code> does not have to
 641       do as much accounting as <code>malloc()</code>; all it does in the typical
 642       case is to round up the size, bump a pointer, and do a range check.</p>
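
      <p>To make the "round up the size, bump a pointer, and do a range
      check" description concrete, here is a minimal sketch of a bump
      allocator. It is a toy model, not Apache's implementation: the names
      <code>toy_pool</code>, <code>toy_palloc</code> and friends are made up
      for this illustration, the pool owns a single fixed-size block obtained
      from <code>malloc</code>, and it simply fails when that block is full,
      where a real pool would presumably obtain more memory.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy pool: one fixed block, no growth. All names here are hypothetical. */
typedef struct toy_pool {
    char  *base;   /* start of the block owned by the pool */
    size_t used;   /* offset of the first free byte */
    size_t size;   /* total size of the block */
} toy_pool;

/* Round n up to a multiple of the pointer size (a power of two). */
static size_t toy_round_up(size_t n)
{
    const size_t align = sizeof(void *);
    return (n + align - 1) & ~(align - 1);
}

/* Returns 1 on success, 0 if the block could not be obtained. */
int toy_pool_init(toy_pool *p, size_t size)
{
    p->base = malloc(size);
    p->used = 0;
    p->size = (p->base != NULL) ? size : 0;
    return p->base != NULL;
}

/* Allocation: round up the size, do a range check, bump the offset. */
void *toy_palloc(toy_pool *p, size_t n)
{
    size_t need;
    void *result;

    assert(p->used <= p->size);
    if (n > p->size)
        return NULL;               /* also guards the rounding below */
    need = toy_round_up(n);
    if (need > p->size - p->used)
        return NULL;               /* a real pool would get another block */
    result = p->base + p->used;
    p->used += need;
    return result;
}

/* There is no per-allocation free; clearing resets the whole pool at once. */
void toy_clear_pool(toy_pool *p)
{
    p->used = 0;
}

/* Destroying the pool also gives the block itself back. */
void toy_destroy_pool(toy_pool *p)
{
    free(p->base);
    p->base = NULL;
    p->used = p->size = 0;
}
```

      <p>In this sketch, two one-byte allocations come back exactly one
      pointer-size apart, and after <code>toy_clear_pool</code> the next
      allocation reuses the first address, which is why nothing needs to be
      freed individually.</p>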
 643
      <p>(It also raises the possibility that heavy use of
      <code>ap_palloc</code> could cause a server process to grow excessively
      large. There are two ways to deal with this. You can use
      <code>malloc</code>, and try to be sure that all of the memory gets
      explicitly <code>free</code>d, or you can allocate a sub-pool of the
      main pool, allocate your memory in the sub-pool, and clear it out
      periodically. The latter technique is discussed in the section on
      sub-pools below, and is used in the directory-indexing code, in order
      to avoid excessive storage allocation when listing directories with
      thousands of files.)</p>
 654     </section>
 655
 656     <section><title>Allocating initialized memory</title>
 657       <p>There are functions which allocate initialized memory, and are
 658       frequently useful. The function <code>ap_pcalloc</code> has the same
 659       interface as <code>ap_palloc</code>, but clears out the memory it
 660       allocates before it returns it. The function <code>ap_pstrdup</code>
 661       takes a resource pool and a <code>char *</code> as arguments, and
 662       allocates memory for a copy of the string the pointer points to, returning
 663       a pointer to the copy. Finally <code>ap_pstrcat</code> is a varargs-style
 664       function, which takes a pointer to a resource pool, and at least two
 665       <code>char *</code> arguments, the last of which must be
 666       <code>NULL</code>. It allocates enough memory to fit copies of each of
 667       the strings, as a unit; for instance:</p>
 668
 669       <example>
 670         ap_pstrcat (r-&gt;pool, "foo", "/", "bar", NULL);
 671       </example>
 672
      <p>returns a pointer to 8 bytes of memory (seven characters plus the
      terminating NUL), initialized to <code>"foo/bar"</code>.</p>
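
      <p>The two-pass approach implied here (measure all the strings, allocate
      once, then copy) can be sketched as follows. This is only an
      illustration under stated assumptions: <code>toy_strcat</code> is a
      hypothetical name, and it allocates with <code>malloc</code> rather than
      from a pool, so unlike the real function its result has to be
      <code>free</code>d by the caller.</p>

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a pool-based string concatenation: joins a
 * NULL-terminated list of strings into one freshly allocated buffer.
 * It uses malloc instead of a pool, so the caller must free() the result.
 */
char *toy_strcat(const char *first, ...)
{
    va_list ap;
    const char *s;
    size_t total = 1;            /* room for the terminating NUL */
    char *result, *dst;

    /* First pass: add up the lengths of all the arguments. */
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, const char *))
        total += strlen(s);
    va_end(ap);

    result = malloc(total);
    if (result == NULL)
        return NULL;

    /* Second pass: copy each argument into place. */
    dst = result;
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, const char *)) {
        size_t len = strlen(s);
        memcpy(dst, s, len);
        dst += len;
    }
    va_end(ap);

    assert((size_t)(dst - result) == total - 1);
    *dst = '\0';
    return result;
}
```

      <p>Called as <code>toy_strcat("foo", "/", "bar", (char *)NULL)</code>,
      the sketch allocates 8 bytes and fills them with
      <code>"foo/bar"</code>. The explicit cast on the final
      <code>NULL</code> is a precaution for varargs calls, where the
      terminator must really be passed as a pointer.</p>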
 675     </section>
 676
 677     <section id="pools-used"><title>Commonly-used pools in the Apache Web
 678     server</title>
 679       <p>A pool is really defined by its lifetime more than anything else.
 680       There are some static pools in http_main which are passed to various
 681       non-http_main functions as arguments at opportune times. Here they
 682       are:</p>
 683
 684       <dl>
 685       <dt><code>permanent_pool</code></dt>
 686       <dd>never passed to anything else, this is the ancestor of all pools</dd>
 687
 688       <dt><code>pconf</code></dt>
 689       <dd>
 690         <ul>
 691           <li>subpool of permanent_pool</li>
 692
 693           <li>created at the beginning of a config "cycle"; exists
 694           until the server is terminated or restarts; passed to all
 695           config-time routines, either via cmd-&gt;pool, or as the
 696           "pool *p" argument on those which don't take pools</li>
 697
 698           <li>passed to the module init() functions</li>
 699         </ul>
 700       </dd>
 701
 702       <dt><code>ptemp</code></dt>
 703       <dd>
 704         <ul>
          <li>strictly speaking, this pool doesn't go by this name in 1.3;
          it was renamed to this in the pthreads development work. What is
          meant here is the use of ptrans in the parent; contrast this with
          the later definition of ptrans in the child.</li>
 710
 711           <li>subpool of permanent_pool</li>
 712
 713           <li>created at the beginning of a config "cycle"; exists
 714           until the end of config parsing; passed to config-time
 715           routines <em>via</em> cmd-&gt;temp_pool. Somewhat of a
 716           "bastard child" because it isn't available everywhere.
 717           Used for temporary scratch space which may be needed by
 718           some config routines but which is deleted at the end of
 719           config.</li>
 720         </ul>
 721       </dd>
 722
 723       <dt><code>pchild</code></dt>
 724       <dd>
 725         <ul>
 726           <li>subpool of permanent_pool</li>
 727
 728           <li>created when a child is spawned (or a thread is
 729           created); lives until that child (thread) is
 730           destroyed</li>
 731
 732           <li>passed to the module child_init functions</li>
 733
 734           <li>destruction happens right after the child_exit
 735           functions are called... (which may explain why I think
 736           child_exit is redundant and unneeded)</li>
 737         </ul>
 738       </dd>
 739
 740       <dt><code>ptrans</code></dt>
 741       <dd>
 742         <ul>
 743           <li>should be a subpool of pchild, but currently is a
 744           subpool of permanent_pool, see above</li>
 745
 746           <li>cleared by the child before going into the accept()
 747           loop to receive a connection</li>
 748
 749           <li>used as connection-&gt;pool</li>
 750         </ul>
 751       </dd>
 752
 753       <dt><code>r-&gt;pool</code></dt>
 754       <dd>
 755         <ul>
 756           <li>for the main request this is a subpool of
 757           connection-&gt;pool; for subrequests it is a subpool of
 758           the parent request's pool.</li>
 759
 760           <li>exists until the end of the request (<em>i.e.</em>,
 761           ap_destroy_sub_req, or in child_main after
 762           process_request has finished)</li>
 763
 764           <li>note that r itself is allocated from r-&gt;pool;
 765           <em>i.e.</em>, r-&gt;pool is first created and then r is
 766           the first thing palloc()d from it</li>
 767         </ul>
 768       </dd>
 769       </dl>
 770
 771       <p>For almost everything folks do, <code>r-&gt;pool</code> is the pool to
 772       use. But you can see how other lifetimes, such as pchild, are useful to
 773       some modules... such as modules that need to open a database connection
 774       once per child, and wish to clean it up when the child dies.</p>
 775
      <p>You can also see how some bugs have manifested themselves, such as
      setting <code>connection-&gt;user</code> to a value from
      <code>r-&gt;pool</code> -- in this case the connection exists for the
      lifetime of <code>ptrans</code>, which is longer than that of
      <code>r-&gt;pool</code> (especially if <code>r-&gt;pool</code> belongs
      to a subrequest!). So the correct thing to do is to allocate from
      <code>connection-&gt;pool</code>.</p>
 783
      <p>And there was another interesting bug in <module>mod_include</module>
      / <module>mod_cgi</module>. You'll see that both of them test whether
      to use <code>r-&gt;pool</code> or
      <code>r-&gt;main-&gt;pool</code>. In this case the resource that they are
 788       registering for cleanup is a child process. If it were registered in
 789       <code>r-&gt;pool</code>, then the code would <code>wait()</code> for the
 790       child when the subrequest finishes. With <module>mod_include</module> this
 791       could be any old <code>#include</code>, and the delay can be up to 3
 792       seconds... and happened quite frequently. Instead the subprocess is
 793       registered in <code>r-&gt;main-&gt;pool</code> which causes it to be
 794       cleaned up when the entire request is done -- <em>i.e.</em>, after the
 795       output has been sent to the client and logging has happened.</p>
 796     </section>
 797
 798     <section id="pool-files"><title>Tracking open files, etc.</title>
 799       <p>As indicated above, resource pools are also used to track other sorts
 800       of resources besides memory. The most common are open files. The routine
 801       which is typically used for this is <code>ap_pfopen</code>, which takes a
 802       resource pool and two strings as arguments; the strings are the same as
 803       the typical arguments to <code>fopen</code>, <em>e.g.</em>,</p>
 804
 805       <example>
 806         ...<br />
 807         FILE *f = ap_pfopen (r-&gt;pool, r-&gt;filename, "r");<br />
 808         <br />
 809         if (f == NULL) { ... } else { ... }<br />
 810       </example>
 811
      <p>There is also an <code>ap_popenf</code> routine, which parallels the
 813       lower-level <code>open</code> system call. Both of these routines arrange
 814       for the file to be closed when the resource pool in question is
 815       cleared.</p>
 816
      <p>Unlike the case for memory, there <em>are</em> functions to close files
      allocated with <code>ap_pfopen</code> and <code>ap_popenf</code>, namely
      <code>ap_pfclose</code> and <code>ap_pclosef</code>. (This is because, on
      many systems, the number of files which a single process can have open is
      quite limited). It is important to use these functions to close files
      allocated with <code>ap_pfopen</code> and <code>ap_popenf</code>. If you
      close such a file with plain <code>fclose</code> or <code>close</code>,
      the pool still has its cleanup registered and will try to close the file
      again when the pool is cleared, which could cause fatal errors on systems
      such as Linux, which react badly if the same <code>FILE*</code> is closed
      more than once.</p>
 825
 826       <p>(Using the <code>close</code> functions is not mandatory, since the
 827       file will eventually be closed regardless, but you should consider it in
 828       cases where your module is opening, or could open, a lot of files).</p>
 829     </section>
 830
 831     <section><title>Other sorts of resources -- cleanup functions</title>
 832       <p>More text goes here. Describe the cleanup primitives in terms of
 833       which the file stuff is implemented; also, <code>spawn_process</code>.</p>
 834
 835       <p>Pool cleanups live until <code>clear_pool()</code> is called:
 836       <code>clear_pool(a)</code> recursively calls <code>destroy_pool()</code>
 837       on all subpools of <code>a</code>; then calls all the cleanups for
 838       <code>a</code>; then releases all the memory for <code>a</code>.
      <code>destroy_pool(a)</code> calls <code>clear_pool(a)</code> and then
      releases the pool structure itself. In other words,
      <code>clear_pool(a)</code> doesn't delete <code>a</code>; it just frees
      up all the resources, and you can start using it again immediately.</p>
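
      <p>The ordering described above (subpools first, then the pool's own
      cleanups, with the pool itself surviving a clear) can be sketched with a
      small model. Everything here is hypothetical: the <code>toy_</code>
      names are invented, the bookkeeping records come from
      <code>malloc</code>, and memory blocks are left out entirely, so this
      shows only the control flow, not how the server implements it.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model of the clear/destroy logic; all names are hypothetical. */
typedef struct toy_cleanup {
    void (*fn)(void *);
    void *data;
    struct toy_cleanup *next;
} toy_cleanup;

typedef struct toy_pool {
    struct toy_pool *parent;
    struct toy_pool *first_child;
    struct toy_pool *sibling;
    toy_cleanup *cleanups;
} toy_pool;

/* Create a pool; pass NULL as the parent to create a root pool. */
toy_pool *toy_make_sub_pool(toy_pool *parent)
{
    toy_pool *p = calloc(1, sizeof(*p));
    if (p != NULL && parent != NULL) {
        p->parent = parent;
        p->sibling = parent->first_child;
        parent->first_child = p;
    }
    return p;
}

/* Returns 1 on success, 0 if the record could not be allocated. */
int toy_register_cleanup(toy_pool *p, void (*fn)(void *), void *data)
{
    toy_cleanup *c = malloc(sizeof(*c));
    if (c == NULL)
        return 0;
    c->fn = fn;
    c->data = data;
    c->next = p->cleanups;
    p->cleanups = c;
    return 1;
}

void toy_destroy_pool(toy_pool *p);

/* Destroy all subpools, then run this pool's cleanups; p stays usable. */
void toy_clear_pool(toy_pool *p)
{
    while (p->first_child != NULL)
        toy_destroy_pool(p->first_child);   /* unlinks itself from p */

    while (p->cleanups != NULL) {
        toy_cleanup *c = p->cleanups;
        p->cleanups = c->next;
        c->fn(c->data);
        free(c);
    }
    /* A real pool would also release its memory blocks here. */
}

/* Clear the pool, unlink it from its parent, and free the structure. */
void toy_destroy_pool(toy_pool *p)
{
    toy_clear_pool(p);
    if (p->parent != NULL) {
        toy_pool **link = &p->parent->first_child;
        while (*link != p)
            link = &(*link)->sibling;
        *link = p->sibling;
    }
    free(p);
}

/* Example cleanup: appends its one-character tag to a global log. */
char toy_log[32];

void toy_log_cleanup(void *data)
{
    size_t len = strlen(toy_log);
    if (len + 1 < sizeof(toy_log)) {
        toy_log[len] = *(const char *)data;
        toy_log[len + 1] = '\0';
    }
}
```

      <p>If a root pool registers a cleanup tagged <code>"P"</code> and one of
      its subpools registers one tagged <code>"S"</code>, clearing the root in
      this model runs the subpool's cleanup before the root's own, and the
      root can then accept new cleanups straight away.</p>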
 843     </section>
 844
 845     <section><title>Fine control -- creating and dealing with sub-pools, with
 846     a note on sub-requests</title>
 847       <p>On rare occasions, too-free use of <code>ap_palloc()</code> and the
 848       associated primitives may result in undesirably profligate resource
 849       allocation. You can deal with such a case by creating a <em>sub-pool</em>,
 850       allocating within the sub-pool rather than the main pool, and clearing or
 851       destroying the sub-pool, which releases the resources which were
 852       associated with it. (This really <em>is</em> a rare situation; the only
 853       case in which it comes up in the standard module set is in case of listing
 854       directories, and then only with <em>very</em> large directories.
      Unnecessary use of the primitives discussed here can complicate your code
      quite a bit, with very little gain).</p>
 857
 858       <p>The primitive for creating a sub-pool is <code>ap_make_sub_pool</code>,
 859       which takes another pool (the parent pool) as an argument. When the main
 860       pool is cleared, the sub-pool will be destroyed. The sub-pool may also be
 861       cleared or destroyed at any time, by calling the functions
 862       <code>ap_clear_pool</code> and <code>ap_destroy_pool</code>, respectively.
 863       (The difference is that <code>ap_clear_pool</code> frees resources
 864       associated with the pool, while <code>ap_destroy_pool</code> also
 865       deallocates the pool itself. In the former case, you can allocate new
 866       resources within the pool, and clear it again, and so forth; in the
 867       latter case, it is simply gone).</p>
 868
 869       <p>One final note -- sub-requests have their own resource pools, which are
 870       sub-pools of the resource pool for the main request. The polite way to
      reclaim the resources associated with a sub-request which you have
 872       allocated (using the <code>ap_sub_req_...</code> functions) is
 873       <code>ap_destroy_sub_req</code>, which frees the resource pool. Before
 874       calling this function, be sure to copy anything that you care about which
 875       might be allocated in the sub-request's resource pool into someplace a
 876       little less volatile (for instance, the filename in its
 877       <code>request_rec</code> structure).</p>
 878
      <p>(Again, under most circumstances, you shouldn't feel obliged to call
      this function; only 2K of memory or so is allocated for a typical
      sub-request, and it will be freed anyway when the main request pool is
      cleared. It is only when you are allocating many, many sub-requests for a
      single main request that you should seriously consider the
      <code>ap_destroy_...</code> functions).</p>
 885     </section>
 886 </section>
 887
 888 <section id="config"><title>Configuration, commands and the like</title>
 889     <p>One of the design goals for this server was to maintain external
 890     compatibility with the NCSA 1.3 server --- that is, to read the same
 891     configuration files, to process all the directives therein correctly, and
 892     in general to be a drop-in replacement for NCSA. On the other hand, another
 893     design goal was to move as much of the server's functionality into modules
 894     which have as little as possible to do with the monolithic server core. The
 895     only way to reconcile these goals is to move the handling of most commands
 896     from the central server into the modules.</p>
 897
 898     <p>However, just giving the modules command tables is not enough to divorce
 899     them completely from the server core. The server has to remember the
 900     commands in order to act on them later. That involves maintaining data which
 901     is private to the modules, and which can be either per-server, or
 902     per-directory. Most things are per-directory, including in particular access
 903     control and authorization information, but also information on how to
 904     determine file types from suffixes, which can be modified by
 905     <directive module="mod_mime">AddType</directive> and <directive
 906     module="core">ForceType</directive> directives, and so forth. In general,
 907     the governing philosophy is that anything which <em>can</em> be made
 908     configurable by directory should be; per-server information is generally
 909     used in the standard set of modules for information like
 910     <directive module="mod_alias">Alias</directive>es and <directive
 911     module="mod_alias">Redirect</directive>s which come into play before the
 912     request is tied to a particular place in the underlying file system.</p>
 913
 914     <p>Another requirement for emulating the NCSA server is being able to handle
 915     the per-directory configuration files, generally called
 916     <code>.htaccess</code> files, though even in the NCSA server they can
 917     contain directives which have nothing at all to do with access control.
 918     Accordingly, after URI -&gt; filename translation, but before performing any
 919     other phase, the server walks down the directory hierarchy of the underlying
 920     filesystem, following the translated pathname, to read any
 921     <code>.htaccess</code> files which might be present. The information which
 922     is read in then has to be <em>merged</em> with the applicable information
 923     from the server's own config files (either from the <directive
 924     type="section" module="core">Directory</directive> sections in
 925     <code>access.conf</code>, or from defaults in <code>srm.conf</code>, which
 926     actually behaves for most purposes almost exactly like <code>&lt;Directory
 927     /&gt;</code>).</p>
 928
 929     <p>Finally, after having served a request which involved reading
 930     <code>.htaccess</code> files, we need to discard the storage allocated for
 931     handling them. That is solved the same way it is solved wherever else
 932     similar problems come up, by tying those structures to the per-transaction
 933     resource pool.</p>
 934
 935     <section id="per-dir"><title>Per-directory configuration structures</title>
      <p>Let's look at how all of this plays out in <code>mod_mime.c</code>,
      which defines the file typing handler which emulates the NCSA server's
      behavior of determining file types from suffixes. What we'll be looking
      at, here, is the code which implements the <directive module="mod_mime"
      >AddType</directive> and <directive module="mod_mime"
      >AddEncoding</directive> commands. These commands can appear in
      <code>.htaccess</code> files, so they must be handled in the module's
      private per-directory data, which in fact consists of two separate
      tables for MIME types and encoding information, and is declared as
      follows:</p>
 946
 947       <example>
 948 <pre>typedef struct {
 949     table *forced_types;      /* Additional AddTyped stuff */
 950     table *encoding_types;    /* Added with AddEncoding... */
 951 } mime_dir_config;</pre>
 952       </example>
 953
      <p>When the server is reading a configuration file, or <directive
      type="section" module="core">Directory</directive> section, which includes
      one of the MIME module's commands, it needs to create a
      <code>mime_dir_config</code> structure, so those commands have something
      to act on. It does this by invoking the function it finds in the module's
      "create per-dir config" slot, with two arguments: a pointer to a
      resource pool in which the allocation should happen, and the name of the
      directory to which this configuration information applies (or
      <code>NULL</code> for <code>srm.conf</code>).</p>
 963
 964       <p>(If we are reading a <code>.htaccess</code> file, that resource pool
 965       is the per-request resource pool for the request; otherwise it is a
 966       resource pool which is used for configuration data, and cleared on
 967       restarts. Either way, it is important for the structure being created to
 968       vanish when the pool is cleared, by registering a cleanup on the pool if
 969       necessary).</p>
 970
      <p>For the MIME module, the per-dir config creation function just
      <code>ap_palloc</code>s the structure above, and creates a couple of
      tables to fill it. That looks like this:</p>
 974
 975       <example>
 976         void *create_mime_dir_config (pool *p, char *dummy)<br />
 977         {<br />
 978         <indent>
 979           mime_dir_config *new =<br />
 980           <indent>
 981            (mime_dir_config *) ap_palloc (p, sizeof(mime_dir_config));<br />
 982           </indent>
 983           <br />
 984           new-&gt;forced_types = ap_make_table (p, 4);<br />
 985           new-&gt;encoding_types = ap_make_table (p, 4);<br />
 986           <br />
 987           return new;<br />
 988         </indent>
 989         }
 990       </example>
 991
 992       <p>Now, suppose we've just read in a <code>.htaccess</code> file. We
 993       already have the per-directory configuration structure for the next
 994       directory up in the hierarchy. If the <code>.htaccess</code> file we just
 995       read in didn't have any <directive module="mod_mime">AddType</directive>
 996       or <directive module="mod_mime">AddEncoding</directive> commands, its
 997       per-directory config structure for the MIME module is still valid, and we
 998       can just use it. Otherwise, we need to merge the two structures
 999       somehow.</p>
1000
1001       <p>To do that, the server invokes the module's per-directory config merge
1002       function, if one is present. That function takes three arguments: the two
1003       structures being merged, and a resource pool in which to allocate the
1004       result. For the MIME module, all that needs to be done is overlay the
1005       tables from the new per-directory config structure with those from the
1006       parent:</p>
1007
1008       <example>
1009         void *merge_mime_dir_configs (pool *p, void *parent_dirv, void *subdirv)<br />
1010         {<br />
1011         <indent>
1012           mime_dir_config *parent_dir = (mime_dir_config *)parent_dirv;<br />
1013           mime_dir_config *subdir = (mime_dir_config *)subdirv;<br />
1014           mime_dir_config *new =<br />
1015           <indent>
1016             (mime_dir_config *)ap_palloc (p, sizeof(mime_dir_config));<br />
1017           </indent>
1018           <br />
1019           new-&gt;forced_types = ap_overlay_tables (p, subdir-&gt;forced_types,<br />
1020           <indent>
1021             parent_dir-&gt;forced_types);<br />
1022           </indent>
1023           new-&gt;encoding_types = ap_overlay_tables (p, subdir-&gt;encoding_types,<br />
1024           <indent>
1025             parent_dir-&gt;encoding_types);<br />
1026           </indent>
1027           <br />
1028           return new;<br />
1029         </indent>
1030         }
1031       </example>
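
      <p>To illustrate what "overlaying" means for the merged result, here is
      a small self-contained model. It rests on assumptions that are not
      spelled out in the text: that the merged table holds the subdirectory's
      entries ahead of the parent's, and that a lookup returns the first
      match. The <code>toy_</code> names are hypothetical, keys are compared
      with plain <code>strcmp</code>, and memory comes from
      <code>malloc</code> rather than a pool.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy key/value table; all names are hypothetical, not the server's API. */
typedef struct {
    const char *key;
    const char *val;
} toy_entry;

typedef struct {
    toy_entry *elts;
    size_t     nelts;
} toy_table;

/* Return the value of the first entry whose key matches, or NULL. */
const char *toy_table_get(const toy_table *t, const char *key)
{
    size_t i;

    for (i = 0; i < t->nelts; ++i)
        if (strcmp(t->elts[i].key, key) == 0)
            return t->elts[i].val;
    return NULL;
}

/*
 * Fill result with the overlay's entries followed by the base's.
 * Because lookups stop at the first match, overlay entries win.
 * Returns 1 on success, 0 if memory runs out. The caller is expected
 * to free(result->elts) when done.
 */
int toy_overlay_tables(toy_table *result, const toy_table *overlay,
                       const toy_table *base)
{
    size_t n = overlay->nelts + base->nelts;

    assert(result != NULL);
    result->nelts = 0;
    result->elts = malloc((n > 0 ? n : 1) * sizeof(toy_entry));
    if (result->elts == NULL)
        return 0;

    if (overlay->nelts > 0)
        memcpy(result->elts, overlay->elts,
               overlay->nelts * sizeof(toy_entry));
    if (base->nelts > 0)
        memcpy(result->elts + overlay->nelts, base->elts,
               base->nelts * sizeof(toy_entry));
    result->nelts = n;
    return 1;
}
```

      <p>With a parent table mapping <code>txt</code> and <code>html</code>,
      and a subdirectory table that remaps only <code>txt</code>, a lookup of
      <code>txt</code> in the merged table finds the subdirectory's value,
      while <code>html</code> still falls through to the parent's.</p>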
1032
1033       <p>As a note -- if there is no per-directory merge function present, the
1034       server will just use the subdirectory's configuration info, and ignore
1035       the parent's. For some modules, that works just fine (<em>e.g.</em>, for
1036       the includes module, whose per-directory configuration information
1037       consists solely of the state of the <code>XBITHACK</code>), and for those
1038       modules, you can just not declare one, and leave the corresponding
1039       structure slot in the module itself <code>NULL</code>.</p>
1040     </section>
1041
1042     <section id="commands"><title>Command handling</title>
1043       <p>Now that we have these structures, we need to be able to figure out how
1044       to fill them. That involves processing the actual <directive
1045       module="mod_mime">AddType</directive> and <directive module="mod_mime"
1046       >AddEncoding</directive> commands. To find commands, the server looks in
      the module's command table. That table contains information on how many
      arguments the commands take, and in what formats, where they are permitted,
1049       and so forth. That information is sufficient to allow the server to invoke
1050       most command-handling functions with pre-parsed arguments. Without further
1051       ado, let's look at the <directive module="mod_mime">AddType</directive>
1052       command handler, which looks like this (the <directive module="mod_mime"
1053       >AddEncoding</directive> command looks basically the same, and won't be
1054       shown here):</p>
1055
1056       <example>
1057         char *add_type(cmd_parms *cmd, mime_dir_config *m, char *ct, char *ext)<br />
1058         {<br />
1059         <indent>
1060           if (*ext == '.') ++ext;<br />
1061           ap_table_set (m-&gt;forced_types, ext, ct);<br />
1062           return NULL;<br />
1063         </indent>
1064         }
1065       </example>
1066
      <p>This command handler is unusually simple. As you can see, it takes
      four arguments. The first is a pointer to a <code>cmd_parms</code>
      structure, the second is the per-directory configuration structure for
      the module in question, and the last two are the pre-parsed arguments.
      The <code>cmd_parms</code> structure contains a bunch of arguments which
      are frequently of use to some, but not all, commands, including a
      resource pool (from which memory can be allocated, and to which cleanups
      should be tied), and the (virtual) server being configured, from which
      the module's per-server configuration data can be obtained if
      required.</p>
1076
1077       <p>Another way in which this particular command handler is unusually
1078       simple is that there are no error conditions which it can encounter. If
1079       there were, it could return an error message instead of <code>NULL</code>;
1080       this causes an error to be printed out on the server's
1081       <code>stderr</code>, followed by a quick exit, if it is in the main config
1082       files; for a <code>.htaccess</code> file, the syntax error is logged in
1083       the server error log (along with an indication of where it came from), and
1084       the request is bounced with a server error response (HTTP error status,
1085       code 500).</p>
1086
1087       <p>The MIME module's command table has entries for these commands, which
1088       look like this:</p>
1089
1090       <example>
1091         command_rec mime_cmds[] = {<br />
1092         <indent>
1093           { "AddType", add_type, NULL, OR_FILEINFO, TAKE2,<br />
1094           <indent>"a mime type followed by a file extension" },<br /></indent>
1095           { "AddEncoding", add_encoding, NULL, OR_FILEINFO, TAKE2,<br />
1096           <indent>
1097           "an encoding (<em>e.g.</em>, gzip), followed by a file extension" },<br />
1098           </indent>
1099           { NULL }<br />
1100         </indent>
1101         };
1102       </example>
1103
1104       <p>The entries in these tables are:</p>
1105       <ul>
1106       <li>The name of the command</li>
1107       <li>The function which handles it</li>
      <li>A <code>(void *)</code> pointer, which is passed in the
      <code>cmd_parms</code> structure to the command handler ---
      this is useful in case many similar commands are handled by
      the same function.</li>
1112
1113       <li>A bit mask indicating where the command may appear. There
1114       are mask bits corresponding to each
1115       <code>AllowOverride</code> option, and an additional mask
1116       bit, <code>RSRC_CONF</code>, indicating that the command may
1117       appear in the server's own config files, but <em>not</em> in
1118       any <code>.htaccess</code> file.</li>
1119
      <li>A flag indicating how many arguments the command handler
      wants pre-parsed, and how they should be passed in.
      <code>TAKE2</code> indicates two pre-parsed arguments. Other
      options are <code>TAKE1</code>, which indicates one
      pre-parsed argument; <code>FLAG</code>, which indicates that
      the argument should be <code>On</code> or <code>Off</code>,
      and is passed in as a boolean flag; and <code>RAW_ARGS</code>,
      which causes the server to give the command the raw, unparsed
      arguments (everything but the command name itself). There is
      also <code>ITERATE</code>, which means that the handler looks
      the same as <code>TAKE1</code>, but that if multiple
      arguments are present, it should be called multiple times,
      once per argument; and finally <code>ITERATE2</code>, which
      indicates that the command handler looks like a
      <code>TAKE2</code>, but if more arguments are present, then it
      should be called multiple times, holding the first argument
      constant.</li>
1136
1137       <li>Finally, we have a string which describes the arguments
1138       that should be present. If the arguments in the actual config
1139       file are not as required, this string will be used to help
1140       give a more specific error message. (You can safely leave
1141       this <code>NULL</code>).</li>
1142       </ul>
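
      <p>The way a table like this drives argument handling can be sketched
      with a reduced model. This is not the server's dispatcher: the
      <code>toy_</code> types and functions are invented for illustration,
      only three argument styles are modeled, the <code>(void *)</code> slot
      and the location bit mask are omitted, command names are matched with
      plain <code>strcmp</code>, and <code>ToyList</code> is a made-up
      directive that exists only to exercise the iterating style.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A subset of the argument styles, for illustration only. */
enum toy_args_how { TOY_TAKE1, TOY_TAKE2, TOY_ITERATE };

/* Handlers return NULL on success, or an error message. */
typedef const char *(*toy_handler)(void *config, const char *a, const char *b);

typedef struct {
    const char       *name;      /* name of the command */
    toy_handler       func;      /* function which handles it */
    enum toy_args_how args_how;  /* how the arguments are passed in */
    const char       *errmsg;    /* describes the expected arguments */
} toy_command;

/* The description string is optional, so fall back to a generic message. */
static const char *toy_usage(const toy_command *c)
{
    return c->errmsg != NULL ? c->errmsg : "wrong number of arguments";
}

/* Find the command in a NULL-terminated table and call its handler. */
const char *toy_invoke(const toy_command *cmds, void *config,
                       const char *name, int argc, const char **argv)
{
    const toy_command *c;
    const char *err = NULL;
    int i;

    assert(cmds != NULL && name != NULL);
    for (c = cmds; c->name != NULL; ++c)
        if (strcmp(c->name, name) == 0)
            break;
    if (c->name == NULL)
        return "unknown command";

    switch (c->args_how) {
    case TOY_TAKE1:
        return argc == 1 ? c->func(config, argv[0], NULL) : toy_usage(c);
    case TOY_TAKE2:
        return argc == 2 ? c->func(config, argv[0], argv[1]) : toy_usage(c);
    case TOY_ITERATE:
        if (argc < 1)
            return toy_usage(c);
        for (i = 0; i < argc && err == NULL; ++i)
            err = c->func(config, argv[i], NULL);   /* one call per argument */
        return err;
    }
    return "bad table entry";
}

/* Example handler: counts its invocations in the int that config points to. */
const char *toy_count_call(void *config, const char *a, const char *b)
{
    (void)a;
    (void)b;
    ++*(int *)config;
    return NULL;
}

const toy_command toy_cmds[] = {
    { "AddType", toy_count_call, TOY_TAKE2,
      "a mime type followed by a file extension" },
    { "ToyList", toy_count_call, TOY_ITERATE, "one or more items" },
    { NULL, NULL, TOY_TAKE1, NULL }
};
```

      <p>In this model, invoking <code>AddType</code> with two arguments calls
      the handler once, invoking it with three returns the description string
      without calling the handler, and invoking <code>ToyList</code> with
      three arguments calls the handler three times.</p>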
1143
      <p>Finally, having set this all up, we have to use it. This is ultimately
      done in the module's handlers, specifically in its file-typing handler,
      which looks more or less like this; note that the per-directory
      configuration structure is extracted from the <code>request_rec</code>'s
      per-directory configuration vector by using the
      <code>ap_get_module_config</code> function.</p>
1150
1151       <example>
1152         int find_ct(request_rec *r)<br />
1153         {<br />
1154         <indent>
1155           int i;<br />
1156           char *fn = ap_pstrdup (r-&gt;pool, r-&gt;filename);<br />
1157           mime_dir_config *conf = (mime_dir_config *)<br />
1158           <indent>
1159             ap_get_module_config(r-&gt;per_dir_config, &amp;mime_module);<br />
1160           </indent>
1161           char *type;<br />
1162           <br />
1163           if (S_ISDIR(r-&gt;finfo.st_mode)) {<br />
1164           <indent>
1165             r-&gt;content_type = DIR_MAGIC_TYPE;<br />
1166             return OK;<br />
1167           </indent>
1168           }<br />
1169           <br />
1170           if((i=ap_rind(fn,'.')) &lt; 0) return DECLINED;<br />
1171           ++i;<br />
1172           <br />
1173           if ((type = ap_table_get (conf-&gt;encoding_types, &amp;fn[i])))<br />
1174           {<br />
1175           <indent>
1176             r-&gt;content_encoding = type;<br />
1177             <br />
1178             /* go back to previous extension to try to use it as a type */<br />
1179             fn[i-1] = '\0';<br />
1180             if((i=ap_rind(fn,'.')) &lt; 0) return OK;<br />
1181             ++i;<br />
1182           </indent>
1183           }<br />
1184           <br />
1185           if ((type = ap_table_get (conf-&gt;forced_types, &amp;fn[i])))<br />
1186           {<br />
1187           <indent>
1188             r-&gt;content_type = type;<br />
1189           </indent>
1190           }<br />
1191           <br />
          return OK;<br />
1193         </indent>
1194         }
1195       </example>
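
      <p>The suffix logic of the handler above (try the last suffix as an
      encoding, strip it, then try the previous suffix as a type) can be
      exercised outside the server with a stand-alone sketch. The names
      <code>toy_map</code>, <code>toy_find_ct</code>, <code>toy_types</code>
      and <code>toy_encodings</code> are hypothetical, the tables are fixed
      arrays rather than per-directory configuration, and the standard
      <code>strrchr</code> is used to find the last dot.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy suffix map, terminated by an entry whose ext is NULL. */
typedef struct {
    const char *ext;
    const char *value;
} toy_map;

static const char *toy_lookup(const toy_map *map, const char *ext)
{
    for (; map->ext != NULL; ++map)
        if (strcmp(map->ext, ext) == 0)
            return map->value;
    return NULL;
}

/*
 * fn must be a writable copy of the file name; it may be truncated in
 * place. *type and *encoding are set when a match is found and are left
 * untouched otherwise.
 */
void toy_find_ct(char *fn, const toy_map *types, const toy_map *encodings,
                 const char **type, const char **encoding)
{
    char *dot;
    const char *found;

    assert(fn != NULL && type != NULL && encoding != NULL);
    dot = strrchr(fn, '.');
    if (dot == NULL)
        return;                       /* no suffix at all */

    found = toy_lookup(encodings, dot + 1);
    if (found != NULL) {
        *encoding = found;
        *dot = '\0';                  /* drop the encoding suffix */
        dot = strrchr(fn, '.');       /* go back to the previous suffix */
        if (dot == NULL)
            return;
    }

    found = toy_lookup(types, dot + 1);
    if (found != NULL)
        *type = found;
}

/* Sample data, standing in for what the directives would have stored. */
const toy_map toy_types[] = {
    { "html", "text/html" },
    { "txt",  "text/plain" },
    { NULL, NULL }
};

const toy_map toy_encodings[] = {
    { "gz", "x-gzip" },
    { NULL, NULL }
};
```

      <p>Given a writable copy of <code>index.html.gz</code>, the sketch
      reports the encoding <code>x-gzip</code> and the type
      <code>text/html</code>; for a name with no dot it leaves both results
      untouched.</p>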
1196     </section>
1197
1198     <section id="servconf"><title>Side notes -- per-server configuration,
1199     virtual servers, <em>etc</em>.</title>
      <p>The ideas behind per-server module configuration are basically
      the same as those for per-directory configuration; there is a creation
1202       function and a merge function, the latter being invoked where a virtual
1203       server has partially overridden the base server configuration, and a
1204       combined structure must be computed. (As with per-directory configuration,
1205       the default if no merge function is specified, and a module is configured
1206       in some virtual server, is that the base configuration is simply
1207       ignored).</p>
1208
1209       <p>The only substantial difference is that when a command needs to
1210       configure the per-server private module data, it needs to go to the
1211       <code>cmd_parms</code> data to get at it. Here's an example, from the
1212       alias module, which also indicates how a syntax error can be returned
1213       (note that the per-directory configuration argument to the command
1214       handler is declared as a dummy, since the module doesn't actually have
1215       per-directory config data):</p>
1216
1217       <example>
1218         char *add_redirect(cmd_parms *cmd, void *dummy, char *f, char *url)<br />
1219         {<br />
1220         <indent>
1221           server_rec *s = cmd-&gt;server;<br />
1222           alias_server_conf *conf = (alias_server_conf *)<br />
1223           <indent>
1224             ap_get_module_config(s-&gt;module_config,&amp;alias_module);<br />
1225           </indent>
1226           alias_entry *new = ap_push_array (conf-&gt;redirects);<br />
1227           <br />
1228           if (!ap_is_url (url)) return "Redirect to non-URL";<br />
1229           <br />
1230           new-&gt;fake = f; new-&gt;real = url;<br />
1231           return NULL;<br />
1232         </indent>
1233         }
1234       </example>
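
      <p>The error-return convention (a message string for a syntax error,
      <code>NULL</code> for success) can be tried out in isolation with the
      following sketch. It is a simplified model under stated assumptions:
      <code>toy_is_url</code> is a hypothetical, deliberately crude check that
      only looks for a scheme followed by a colon, the configuration is a
      fixed-size array rather than a pool-allocated one, and, unlike the
      handler above, this version validates the URL before recording
      anything.</p>

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Crude, hypothetical URL test: a non-empty scheme made of letters,
 * digits, '+', '-' or '.', followed by ':'. */
int toy_is_url(const char *u)
{
    const char *p = u;

    while (isalnum((unsigned char)*p) || *p == '+' || *p == '-' || *p == '.')
        ++p;
    return p != u && *p == ':';
}

typedef struct {
    const char *fake;
    const char *real;
} toy_alias_entry;

typedef struct {
    toy_alias_entry redirects[8];
    size_t          nredirects;
} toy_server_conf;

/* Returns NULL on success, or an error message describing the problem.
 * Validation comes first, so a rejected directive leaves conf untouched. */
const char *toy_add_redirect(toy_server_conf *conf, const char *f,
                             const char *url)
{
    assert(conf != NULL && f != NULL && url != NULL);
    if (!toy_is_url(url))
        return "Redirect to non-URL";
    if (conf->nredirects == sizeof(conf->redirects) / sizeof(conf->redirects[0]))
        return "too many redirects";

    conf->redirects[conf->nredirects].fake = f;
    conf->redirects[conf->nredirects].real = url;
    ++conf->nredirects;
    return NULL;
}
```

      <p>A caller treats any non-<code>NULL</code> return as the text of the
      error to report; in this model a target such as
      <code>/not/a/url</code> is rejected and the number of recorded
      redirects stays the same.</p>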
1235     </section>
1236 </section>
1237
1238 </manualpage>