
This document supplements the mod_cache, mod_cache_disk, mod_file_cache and htcacheclean reference documentation. It describes how to use the Apache HTTP Server's caching features to accelerate web and proxy serving, while avoiding common problems and misconfigurations.

Introduction


The Apache HTTP server offers a range of caching features that + are designed to improve the performance of the server in various + ways.

+ +
+
Three-state RFC2616 HTTP caching
+
+ mod_cache + and its provider modules + mod_cache_disk + provide intelligent, HTTP-aware caching. The content itself is stored + in the cache, and mod_cache aims to honor all of the various HTTP + headers and options that control the cacheability of content + as described in + Section + 13 of RFC2616. + mod_cache + is aimed at both simple and complex caching configurations, where + you are dealing with proxied content, dynamic local content or + have a need to speed up access to local files on a potentially + slow disk. +
+ +
Two-state key/value shared object caching
+
+ The shared object cache API (socache) + and its provider modules provide a + server wide key/value based shared object cache. These modules + are designed to cache low level data such as SSL sessions and + authentication credentials. Backends allow the data to be stored + server wide in shared memory, or datacenter wide in a cache such + as memcache or distcache. +
+ +
Specialized file caching
+
+ mod_file_cache + offers the ability to pre-load + files into memory on server startup, and can improve access + times and save file handles on files that are accessed often, + as there is no need to go to disk on each request. +
+
+ +

To get the most from this document, you should be familiar with the basics of HTTP, and have read the Users' Guides to Mapping URLs to the Filesystem and Content negotiation.

Three-state RFC2616 HTTP caching

mod_cache
mod_cache_disk

CacheEnable
CacheDisable
UseCanonicalName
CacheNegotiatedDocs


The HTTP protocol contains built-in support for an in-line caching mechanism described by section 13 of RFC2616, and the mod_cache module can be used to take advantage of this.

+ +

Unlike a simple two-state key/value cache, where the content disappears completely when no longer fresh, an HTTP cache includes a mechanism to retain stale content, and to ask the origin server whether this stale content has changed and, if not, make it fresh again.

+ +

An entry in an HTTP cache exists in one of three states:

+ +
+
Fresh
+
+ If the content is new enough (younger than its freshness + lifetime), it is considered fresh. An + HTTP cache is free to serve fresh content without making any + calls to the origin server at all. +
+
Stale
+
+

If the content is too old (older than its freshness + lifetime), it is considered stale. An + HTTP cache should contact the origin server and check whether + the content is still fresh before serving stale content to a + client. The origin server will either respond with replacement + content if not still valid, or ideally, the origin server will + respond with a code to tell the cache the content is still + fresh, without the need to generate or send the content again. + The content becomes fresh again and the cycle continues.

+ +

The HTTP protocol does allow the cache to serve stale data + under certain circumstances, such as when an attempt to freshen + the data with an origin server has failed with a 5xx error, or + when another request is already in the process of freshening + the given entry. In these cases a Warning header + is added to the response.

+
+
Non Existent
+
If the cache gets full, it reserves the option to delete content from the cache to make space. Content can be deleted at any time, and can be stale or fresh. The htcacheclean tool can be run on a one-off basis, or deployed as a daemon to keep the size of the cache within a given size, or within a given number of inodes. The tool attempts to delete stale content before attempting to delete fresh content.
+
+ +

Full details of how HTTP caching works can be found in + + Section 13 of RFC2616.

+ +
+ Interaction with the Server + +

The mod_cache module hooks into the server in two + possible places depending on the value of the + CacheQuickHandler directive: +

+ +
+
Quick handler phase
+
+

This phase happens very early on during the request processing, + just after the request has been parsed. If the content is + found within the cache, it is served immediately and almost + all request processing is bypassed.

+ +

In this scenario, the cache behaves as if it has been "bolted + on" to the front of the server.

+ +

This mode offers the best performance, as the majority of server processing is bypassed. However, it also bypasses the authentication and authorization phases of server processing, so it should be chosen with care when these are important.

+ +

Requests with an "Authorization" header (for example, HTTP Basic + Authentication) are neither cacheable nor served from the cache + when mod_cache is running in this phase.

+
+
Normal handler phase
+
+

This phase happens late in the request processing, after all + the request phases have completed.

+ +

In this scenario, the cache behaves as if it has been "bolted + on" to the back of the server.

+ +

This mode offers the most flexibility, as the potential exists + for caching to occur at a precisely controlled point in the filter + chain, and cached content can be filtered or personalized before + being sent to the client.

+
+
+ +

If the URL is not found within the cache, mod_cache + will add a filter to the filter stack in order + to record the response to the cache, and then stand down, allowing normal + request processing to continue. If the content is determined to be + cacheable, the content will be saved to the cache for future serving, + otherwise the content will be ignored.

+ +

If the content found within the cache is stale, the + mod_cache module converts the request into a + conditional request. If the origin server responds with + a normal response, the normal response is cached, replacing the content + already cached. If the origin server responds with a 304 Not Modified + response, the content is marked as fresh again, and the cached content + is served by the filter instead of saving it.
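As a minimal sketch of how the handler phase might be chosen, the following configuration enables a disk cache but moves it to the normal handler phase so that authentication and authorization still apply; the cache root path is an illustrative assumption, not a recommendation:

CacheQuickHandler off
CacheEnable disk /
CacheRoot "/var/cache/apache/"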

+
Improving Cache Hits -


When a virtual host is known by one of many different server aliases, + ensuring that UseCanonicalName is + set to On can dramatically improve the ratio of cache hits. + This is because the hostname of the virtual-host serving the content is + used within the cache key. With the setting set to On virtual-hosts with multiple server names or aliases will not produce differently cached entities, and instead content will be cached as per the canonical hostname.
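For example, a virtual host known by several aliases might be configured as follows; the hostnames are illustrative:

<VirtualHost *:80>
    ServerName www.example.com
    ServerAlias example.com www.example.org
    UseCanonicalName On
    CacheEnable disk /
</VirtualHost>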

-

Freshness Lifetime

Well formed content that is intended to be cached should declare an + explicit freshness lifetime with the Cache-Control + header's max-age or s-maxage fields, or + by including an Expires header.
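For instance, mod_headers can be used to declare an explicit freshness lifetime for a given URL space; the path and lifetime below are illustrative assumptions only:

<Location "/static/">
    Header set Cache-Control "max-age=3600, public"
</Location>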

-


+

At the same time, the origin server defined freshness lifetime can + be overridden by a client when the client presents their own + Cache-Control header within the request. In this case, + the lowest freshness lifetime between request and response wins.

+ +

When this freshness lifetime is missing from the request or the + response, a default freshness lifetime is applied. The default + freshness lifetime for cached entities is one hour, however + this can be easily over-ridden by using the CacheDefaultExpire directive.

If a response does not include an Expires header but does include a Last-Modified header, mod_cache can infer a freshness lifetime based on the use of the CacheLastModifiedFactor directive.

-


+

For local content, or for remote content that does not define its own + Expires header, mod_expires may be used to + fine-tune the freshness lifetime by adding max-age and + Expires.
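A sketch of such fine-tuning with mod_expires; the intervals and content type shown are illustrative, not recommendations:

ExpiresActive On
ExpiresDefault "access plus 1 hour"
ExpiresByType image/png "access plus 1 month"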

-


The maximum freshness lifetime may also be controlled by using the CacheMaxExpire.
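Putting the cache-side defaults and limits together, a configuration might look like the following; the values (in seconds, and a factor) are illustrative only:

CacheDefaultExpire 3600
CacheMaxExpire 86400
CacheLastModifiedFactor 0.1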

A Brief Guide to Conditional Requests -


When content expires from the cache and becomes stale, rather than + pass on the original request, httpd will modify the request to make + it conditional instead.

+ +

When an ETag header exists in the original cached + response, mod_cache will add an + If-None-Match header to the request to the origin server. + When a Last-Modified header exists in the original + cached response, mod_cache will add an + If-Modified-Since header to the request to the origin + server. Performing either of these actions makes the request + conditional.
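For illustration, a request freshened with an If-None-Match header and answered with a 304 might look like this; the ETag value is hypothetical:

GET /index.html HTTP/1.1
Host: www.example.com
If-None-Match: "2d-432a5e4a73a80"

HTTP/1.1 304 Not Modified
ETag: "2d-432a5e4a73a80"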

+ +

When a conditional request is received by an origin server, the origin server should check whether the ETag or the Last-Modified parameter has changed, as appropriate for the request. If not, the origin should respond with a terse "304 Not Modified" response. This signals to the cache that the stale content is still fresh and should be used for subsequent requests until the content's new freshness lifetime is reached again.

+ +

If the content has changed, then the content is served as if the + request were not conditional to begin with.

+ +

Conditional requests offer two benefits. Firstly, when making such + a request to the origin server, if the content from the origin + matches the content in the cache, this can be determined easily and + without the overhead of transferring the entire resource.

+ +

Secondly, a well designed origin server will be set up in such a way that conditional requests are significantly cheaper to produce than a full response. For static files, typically all that is involved is a call to stat() or a similar system call, to see if the file has changed in size or modification time. As such, even local content may still be served faster from the cache if it has not changed.

+ +

Origin servers should make every effort to support conditional + requests as is practical, however if conditional requests are not + supported, the origin will respond as if the request was not + conditional, and the cache will respond as if the content had changed + and save the new content to the cache. In this case, the cache will + behave like a simple two state cache, where content is effectively + either fresh or deleted.

What Can be Cached? -


The full definition of which responses can be cached by an HTTP cache is defined in RFC2616 Section 13.4 Response Cacheability, and can be summed up as follows:

1. Caching must be enabled for this URL. See the CacheEnable and CacheDisable directives.
2. If the response has an HTTP status code other than 200, 203, 300, 301 or 410 it must also specify an "Expires" or "Cache-Control" header.
3. The request must be a HTTP GET request.
4. If the response contains an "Authorization:" header, it must also contain an "s-maxage", "must-revalidate" or "public" option in the "Cache-Control:" header, or it won't be cached.
5. If the URL included a query string (e.g. from a HTML form GET method) it will not be cached unless the response specifies an explicit expiration by including an "Expires:" header or the max-age or s-maxage directive of the "Cache-Control:" header.
6. If the response has a status of 200 (OK), the response must also include at least one of the "Etag", "Last-Modified" or the "Expires" headers, or the max-age or s-maxage directive of the "Cache-Control:" header, unless the CacheIgnoreNoLastMod directive has been used to require otherwise.
7. If the response includes the "private" option in a "Cache-Control:" header, it will not be stored unless the CacheStorePrivate directive has been used to require otherwise.
8. Likewise, if the response includes the "no-store" option in a "Cache-Control:" header, it will not be stored unless the CacheStoreNoStore directive has been used.
What Should Not be Cached?


It should be up to the client creating the request, or the origin server constructing the response, to decide whether or not the content should be cacheable by correctly setting the Cache-Control header, and mod_cache should be left alone to honor the wishes of the client or server as appropriate.

    + +

    Content that is time sensitive, or which varies depending on the + particulars of the request that are not covered by HTTP negotiation, + should not be cached. This content should declare itself uncacheable + using the Cache-Control header.
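For example, content under a hypothetical /status/ path could declare itself uncacheable with mod_headers; the path and header values are illustrative:

<Location "/status/">
    Header set Cache-Control "no-store, no-cache, must-revalidate"
</Location>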

    + +

    If content changes often, expressed by a freshness lifetime of minutes + or seconds, the content can still be cached, however it is highly + desirable that the origin server supports + conditional requests correctly to ensure that + full responses do not have to be generated on a regular basis.

    + +

    Content that varies based on client provided request headers can be + cached through intelligent use of the Vary response + header.

    +
    Variable/Negotiated Content -

    If a response with a "Vary" header is received by - mod_cache when requesting content by the backend it - will attempt to handle it intelligently. If possible, - mod_cache will detect the headers attributed in the - "Vary" response in future requests and serve the correct cached - response.

    +

    When the origin server is designed to respond with different content + based on the value of headers in the request, for example to serve + multiple languages at the same URL, HTTP's caching mechanism makes it + possible to cache multiple variants of the same page at the same URL.

    + +

    This is done by the origin server adding a Vary header + to indicate which headers must be taken into account by a cache when + determining whether two variants are different from one another.

If, for example, a response is received with a Vary header such as:

Vary: negotiate,accept-language,accept-charset

    mod_cache will only serve the cached content to requesters with accept-language and accept-charset headers matching those of the original request.

    + +

Multiple variants of the content can be cached side by side; mod_cache uses the Vary header and the corresponding values of the request headers listed by Vary to decide which of many variants to return to the client.

Cache Setup Examples

mod_cache
mod_cache_disk
mod_cache_socache
mod_socache_memcache

CacheEnable
CacheRoot
CacheDirLevels
CacheDirLength
CacheSocache

Caching to Disk

The mod_cache module relies on specific backend store implementations in order to manage the cache, and for caching to disk mod_cache_disk is provided to support this.

+ +

Typically the module will be configured as follows:

+ + +CacheRoot "/var/cache/apache/" +CacheEnable disk / +CacheDirLevels 2 +CacheDirLength 1 + + +

Importantly, as the cached files are locally stored, operating system in-memory caching will typically be applied to their access also. So although the files are stored on disk, if they are frequently accessed it is likely the operating system will ensure that they are actually served from memory.

-

Understanding the Cache-Store


+

To store items in the cache, mod_cache_disk creates a 22 character hash of the URL being requested. This hash incorporates the hostname, protocol, port, path and any CGI arguments to the URL, as well as elements defined by the Vary header, to ensure that multiple URLs do not collide with one another.

-


+

Each character may be any one of 64 different characters, which means that overall there are 64^22 possible hashes. For example, a URL might be hashed to xyTGxSMO2b68mBCykqkp1w. This hash is used as a prefix for the naming of the files specific to that URL within the cache, however first it is split up into directories as per the CacheDirLevels and CacheDirLength directives.

-


+

CacheDirLevels + specifies how many levels of subdirectory there should be, and + CacheDirLength + specifies how many characters should be in each directory. With + the example settings given above, the hash would be turned into + a filename prefix as + /var/cache/apache/x/y/TGxSMO2b68mBCykqkp1w.

+ +

The overall aim of this technique is to reduce the number of subdirectories or files that may be in a particular directory, as most file-systems slow down as this number increases. With a setting of "1" for CacheDirLength there can at most be 64 subdirectories at any particular level. With a setting of 2 there can be 64 * 64 subdirectories, and so on. Unless you have a good reason not to, using a setting of "1" for CacheDirLength is recommended.

+ +

Setting + CacheDirLevels + depends on how many files you anticipate to store in the cache. + With the setting of "2" used in the above example, a grand + total of 4096 subdirectories can ultimately be created. With + 1 million files cached, this works out at roughly 245 cached + URLs per directory.

+ +

Each URL uses at least two files in the cache-store. Typically there is a ".header" file, which includes meta-information about the URL, such as when it is due to expire, and a ".data" file which is a verbatim copy of the content to be served.

+

In the case of content negotiated via the "Vary" header, a ".vary" directory will be created for the URL in question. This directory will have multiple ".data" files corresponding to the differently negotiated content.

Maintaining the Disk Cache

+

The mod_cache_disk module makes no attempt to regulate the amount of disk space used by the cache, although it will gracefully stand down on any disk error and behave as if the cache was never present.

-


+

Instead, provided with httpd is the htcacheclean tool, which allows you to clean the cache periodically. Determining how frequently to run htcacheclean and what target size to use for the cache is somewhat complex and trial and error may be needed to select optimal values.

+ +

htcacheclean has two modes of operation. It can be run as a persistent daemon, or periodically from cron. htcacheclean can take up to an hour or more to process very large (tens of gigabytes) caches and if you are running it from cron it is recommended that you determine how long a typical run takes, to avoid running more than one instance at a time.

+ +

It is also recommended that an appropriate "nice" level is chosen for htcacheclean so that the tool does not cause excessive disk I/O while the server is running.

+ +

+
+ Figure 1: Typical + cache growth / clean sequence.

+ +

Because mod_cache_disk does not itself pay attention to how much space is used, you should ensure that htcacheclean is configured to leave enough "grow room" following a clean.
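One possible invocation, running htcacheclean as a nice'd daemon that prunes the cache every 60 minutes; the cache path and size limit are illustrative assumptions and should be chosen to leave room for growth between runs:

htcacheclean -d60 -n -t -p /var/cache/apache -l 1024M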

+
+ +
+ Caching to memcached + +

Using the mod_cache_socache module, mod_cache can cache data in a variety of implementations (aka "providers"). Using the mod_socache_memcache module, for example, one can specify that memcached is to be used as the backend storage mechanism.

+ +

Typically the module will be configured as follows:

+ + +CacheEnable socache / +CacheSocache memcache:memcd.example.com:11211 + + +

Additional memcached servers can be specified by + appending them to the end of the CacheSocache memcache: + line separated by commas:

+ + +CacheEnable socache / +CacheSocache memcache:mem1.example.com:11211,mem2.example.com:11212 + + +

This format is also used with the other various mod_cache_socache + providers. For example:

+ + +CacheEnable socache / +CacheSocache shmcb:/path/to/datafile(512000) + + + +CacheEnable socache / +CacheSocache dbm:/path/to/datafile + -

General Two-state Key/Value Shared Object Caching

mod_authn_socache
mod_socache_dbm
mod_socache_dc
mod_socache_memcache
mod_socache_shmcb
mod_ssl

AuthnCacheSOCache
SSLSessionCache
SSLStaplingCache


+

The Apache HTTP server offers a low level shared object cache for caching information such as SSL sessions, or authentication credentials, within the socache interface.

+ +

Additional modules are provided for each implementation, offering the + following backends:

+ +
+
mod_socache_dbm
+
DBM based shared object cache.
+
mod_socache_dc
+
Distcache based shared object cache.
+
mod_socache_memcache
+
Memcache based shared object cache.
+
mod_socache_shmcb
+
Shared memory based shared object cache.
+
+ +
+ Caching Authentication Credentials + + + + mod_authn_socache + + + AuthnCacheSOCache + + + +

The mod_authn_socache module allows the result of + authentication to be cached, relieving load on authentication backends.
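A sketch of how this might be combined with file-based Basic authentication; the path, password file, provider choice and timeout are illustrative assumptions:

<Directory "/var/www/private">
    AuthType Basic
    AuthName "Restricted"
    AuthBasicProvider socache file
    AuthUserFile "/usr/local/apache/passwd/passwords"
    AuthnCacheProvideFor file
    AuthnCacheTimeout 300
    Require valid-user
</Directory>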

-
- CacheFile +
-

Caching SSL Sessions

mod_ssl

SSLSessionCache
SSLStaplingCache


The mod_ssl module uses the socache interface + to provide a session cache and a stapling cache.
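A typical sketch using shared-memory (shmcb) backends for both caches; the paths, sizes and timeout shown are illustrative:

SSLSessionCache shmcb:/var/run/ssl_scache(512000)
SSLSessionCacheTimeout 300

SSLUseStapling On
SSLStaplingCache shmcb:/var/run/ssl_stapling(32768)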

-

Specialized File Caching

mod_file_cache

CacheFile
MMapFile


On platforms where a filesystem might be slow, or where file + handles are expensive, the option exists to pre-load files into + memory on startup.

+ +

On systems where opening files is slow, the option exists to + open the file on startup and cache the file handle. These + options can help on systems where access to static files is + slow.

+ +
+ File-Handle Caching + +

The act of opening a file can itself be a source of delay, particularly + on network filesystems. By maintaining a cache of open file descriptors + for commonly served files, httpd can avoid this delay. Currently httpd + provides one implementation of File-Handle Caching.

+ +
+ CacheFile + +

The most basic form of caching present in httpd is the file-handle + caching provided by mod_file_cache. Rather than caching + file-contents, this cache maintains a table of open file descriptors. Files + to be cached in this manner are specified in the configuration file using + the CacheFile + directive.

+ +

The + CacheFile directive + instructs httpd to open the file when it is started and to re-use + this file-handle for all subsequent access to this file.

+ + + CacheFile /usr/local/apache2/htdocs/index.html + + +

If you intend to cache a large number of files in this manner, you + must ensure that your operating system's limit for the number of open + files is set appropriately.

+ +

Although using CacheFile + does not cause the file-contents to be cached per-se, it does mean + that if the file changes while httpd is running these changes will + not be picked up. The file will be consistently served as it was + when httpd was started.

+ +

If the file is removed while httpd is running, it will continue + to maintain an open file descriptor and serve the file as it was when + httpd was started. This usually also means that although the file + will have been deleted, and not show up on the filesystem, extra free + space will not be recovered until httpd is stopped and the file + descriptor closed.

+
-

In-Memory Caching

Serving directly from system memory is universally the fastest method + of serving content. Reading files from a disk controller or, even worse, + from a remote network is orders of magnitude slower. Disk controllers + usually involve physical processes, and network access is limited by + your available bandwidth. Memory access on the other hand can take mere + nano-seconds.

+ +

System memory isn't cheap though, byte for byte it's by far the most + expensive type of storage and it's important to ensure that it is used + efficiently. By caching files in memory you decrease the amount of + memory available on the system. As we'll see, in the case of operating + system caching, this is not so much of an issue, but when using + httpd's own in-memory caching it is important to make sure that you + do not allocate too much memory to a cache. Otherwise the system + will be forced to swap out memory, which will likely degrade + performance.

+ +
+ Operating System Caching + +

Almost all modern operating systems cache file-data in memory managed + directly by the kernel. This is a powerful feature, and for the most + part operating systems get it right. For example, on Linux, let's look at + the difference in the time it takes to read a file for the first time + and the second time;

+ +
 colm@coroebus:~$ time cat testfile > /dev/null
 real    0m0.065s
 user    0m0.000s
colm@coroebus:~$ time cat testfile > /dev/null
 real    0m0.003s
 user    0m0.003s
 sys     0m0.000s
-
- -


Even for this small file, there is a huge difference in the amount + of time it takes to read the file. This is because the kernel has cached + the file contents in memory.

+ +

By ensuring there is "spare" memory on your system, you can ensure + that more and more file-contents will be stored in this cache. This + can be a very efficient means of in-memory caching, and involves no + extra configuration of httpd at all.

+ +

Additionally, because the operating system knows when files are + deleted or modified, it can automatically remove file contents from the + cache when necessary. This is a big advantage over httpd's in-memory + caching which has no way of knowing when a file has changed.

+
+ +

Despite the performance and advantages of automatic operating system + caching there are some circumstances in which in-memory caching may be + better performed by httpd.

+ +
+ MMapFile Caching + +

mod_file_cache provides the + MMapFile directive, which + allows you to have httpd map a static file's contents into memory at + start time (using the mmap system call). httpd will use the in-memory + contents for all subsequent accesses to this file.

+ + + MMapFile /usr/local/apache2/htdocs/index.html + + +

As with the + CacheFile directive, any + changes in these files will not be picked up by httpd after it has + started.

+ +

The MMapFile + directive does not keep track of how much memory it allocates, so + you must ensure not to over-use the directive. Each httpd child + process will replicate this memory, so it is critically important + to ensure that the files mapped are not so large as to cause the + system to swap memory.

+
-

Security Considerations

Authorization and Access Control


+

Using mod_cache in its default state where CacheQuickHandler is set to On is very much like having a caching reverse-proxy bolted to the front of the server. Requests will be served by the caching module unless it determines that the origin server should be queried, just as an external cache would, and this drastically changes the security model of httpd.

-
- Understanding the Cache-Store +

As traversing a filesystem hierarchy to examine potential .htaccess files would be a very expensive operation, partially defeating the point of caching (to speed up requests), mod_cache makes no decision about whether a cached entity is authorised for serving. In other words, if mod_cache has cached some content, it will be served from the cache as long as that content has not expired.

-

To store items in the cache, mod_disk_cache creates - a 22 character hash of the URL being requested. This hash incorporates - the hostname, protocol, port, path and any CGI arguments to the URL, - to ensure that multiple URLs do not collide.

+

If, for example, your configuration permits access to a resource by IP address, you should ensure that this content is not cached. You can do this by using the CacheDisable directive, or mod_expires. Left unchecked, mod_cache (very much like a reverse proxy) would cache the content when served and then serve it to any client, on any IP address.

-


+

When the CacheQuickHandler directive is set to Off, the full set of request processing phases is executed and the security model remains unchanged.

+
-


+
+ Local exploits -


+

As requests to end-users can be served from the cache, the cache itself can become a target for those wishing to deface or interfere with content. It is important to bear in mind that the cache must at all times be writable by the user that httpd is running as. This is in stark contrast to the usually recommended situation of maintaining all content unwritable by the Apache user.

-


+

If the Apache user is compromised, for example through a flaw in a CGI process, it is possible that the cache may be targeted. When using mod_cache_disk, it is relatively easy to insert or modify a cached entity.

-


+

This presents a somewhat elevated risk in comparison to the other types of attack it is possible to make as the Apache user. If you are using mod_cache_disk, you should bear this in mind: ensure you upgrade httpd when security upgrades are announced and run CGI processes as a non-Apache user using suEXEC if possible.

-

Cache Poisoning


+

When running httpd as a caching proxy server, there is also the potential for so-called cache poisoning. Cache Poisoning is a broad term for attacks in which an attacker causes the proxy server to retrieve incorrect (and usually undesirable) content from the origin server.

-

-

+

For example, if the DNS servers used by your system running httpd are vulnerable to DNS cache poisoning, an attacker may be able to control where httpd connects to when requesting content from the origin server. Another example is so-called HTTP request-smuggling attacks.

-


+

This document is not the correct place for an in-depth discussion of HTTP request smuggling (instead, try your favourite search engine); however, it is important to be aware that it is possible to make a series of requests, and to exploit a vulnerability on an origin webserver, such that the attacker can entirely control the content retrieved by the proxy.

Denial of Service / Cachebusting

The Vary mechanism allows multiple variants of the same URL to be cached side by side. Depending on header values provided by the client, the cache will select the correct variant to return to the client. This mechanism can become a problem when an attempt is made to vary on a header that is known to contain a wide range of possible values under normal use, for example the User-Agent header. Depending on the popularity of the particular web site, thousands or millions of duplicate cache entries could be created for the same URL, crowding out other entries in the cache.

+ +

In other cases, there may be a need to change the URL of a particular resource on every request, usually by adding a "cachebuster" string to the URL. If this content is declared cacheable by a server for a significant freshness lifetime, these entries can crowd out legitimate entries in a cache. While mod_cache provides a CacheIgnoreURLSessionIdentifiers directive, this directive should be used with care to ensure that downstream proxy or browser caches aren't subjected to the same denial of service issue.
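Where session identifiers embedded in the URL are the cause, they can be excluded from the cache key; the identifier names below are illustrative examples only:

CacheIgnoreURLSessionIdentifiers jsessionid sid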

+