<p>This document discusses some of the technical details of mod_rewrite
and URL matching.</p>
</div>
-<div id="quickview"><ul id="toc"><li><img alt="" src="../images/down.gif" /> <a href="#Internal">Internal Processing</a></li>
-<li><img alt="" src="../images/down.gif" /> <a href="#InternalAPI">API Phases</a></li>
+<div id="quickview"><ul id="toc"><li><img alt="" src="../images/down.gif" /> <a href="#InternalAPI">API Phases</a></li>
<li><img alt="" src="../images/down.gif" /> <a href="#InternalRuleset">Ruleset Processing</a></li>
</ul><h3>See also</h3><ul class="seealso"><li><a href="../mod/mod_rewrite.html">Module documentation</a></li><li><a href="intro.html">mod_rewrite introduction</a></li><li><a href="remapping.html">Redirection and remapping</a></li><li><a href="access.html">Controlling access</a></li><li><a href="vhosts.html">Virtual hosts</a></li><li><a href="proxy.html">Proxying</a></li><li><a href="rewritemap.html">Using RewriteMap</a></li><li><a href="advanced.html">Advanced techniques</a></li><li><a href="avoid.html">When not to use mod_rewrite</a></li></ul></div>
<div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
<div class="section">
-<h2><a name="Internal" id="Internal">Internal Processing</a></h2>
-
- <p>The internal processing of this module is very complex but
- needs to be explained once even to the average user to avoid
- common mistakes and to let you exploit its full
- functionality.</p>
-</div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
-<div class="section">
<h2><a name="InternalAPI" id="InternalAPI">API Phases</a></h2>
- <p>First you have to understand that when Apache processes a
- HTTP request it does this in phases. A hook for each of these
- phases is provided by the Apache API. Mod_rewrite uses two of
- these hooks: the URL-to-filename translation hook which is
- used after the HTTP request has been read but before any
- authorization starts and the Fixup hook which is triggered
- after the authorization phases and after the per-directory
- config files (<code>.htaccess</code>) have been read, but
- before the content handler is activated.</p>
-
- <p>So, after a request comes in and Apache has determined the
- corresponding server (or virtual server) the rewriting engine
- starts processing of all mod_rewrite directives from the
- per-server configuration in the URL-to-filename phase. A few
- steps later when the final data directories are found, the
- per-directory configuration directives of mod_rewrite are
- triggered in the Fixup phase. In both situations mod_rewrite
- rewrites URLs either to new URLs or to filenames, although
- there is no obvious distinction between them. This is a usage
- of the API which was not intended to be this way when the API
- was designed, but as of Apache 1.x this is the only way
- mod_rewrite can operate. To make this point more clear
- remember the following two points:</p>
-
- <ol>
- <li>Although mod_rewrite rewrites URLs to URLs, URLs to
- filenames and even filenames to filenames, the API
- currently provides only a URL-to-filename hook. In Apache
- 2.0 the two missing hooks will be added to make the
- processing more clear. But this point has no drawbacks for
- the user, it is just a fact which should be remembered:
- Apache does more in the URL-to-filename hook than the API
- intends for it.</li>
-
- <li>
- Unbelievably mod_rewrite provides URL manipulations in
- per-directory context, <em>i.e.</em>, within
- <code>.htaccess</code> files, although these are reached
- a very long time after the URLs have been translated to
- filenames. It has to be this way because
- <code>.htaccess</code> files live in the filesystem, so
- processing has already reached this stage. In other
- words: According to the API phases at this time it is too
- late for any URL manipulations. To overcome this chicken
- and egg problem mod_rewrite uses a trick: When you
- manipulate a URL/filename in per-directory context
- mod_rewrite first rewrites the filename back to its
- corresponding URL (which is usually impossible, but see
- the <code>RewriteBase</code> directive below for the
- trick to achieve this) and then initiates a new internal
- sub-request with the new URL. This restarts processing of
- the API phases.
-
- <p>Again mod_rewrite tries hard to make this complicated
- step totally transparent to the user, but you should
- remember here: While URL manipulations in per-server
- context are really fast and efficient, per-directory
- rewrites are slow and inefficient due to this chicken and
- egg problem. But on the other hand this is the only way
- mod_rewrite can provide (locally restricted) URL
- manipulations to the average user.</p>
- </li>
- </ol>
-
- <p>Don't forget these two points!</p>
+ <p>The Apache HTTP Server handles requests in several phases. At
+ each of these phases, one or more modules may be called upon to
+ handle that portion of the request lifecycle. Phases include
+ URL-to-filename translation, authentication, authorization,
+ content handling, and logging. (This is not an exhaustive list.)</p>
+
+ <p>mod_rewrite acts in two of these phases (or "hooks", as they are
+ sometimes called) to influence how URLs may be rewritten.</p>
+
+ <p>First, it uses the URL-to-filename translation hook, which occurs
+ after the HTTP request has been read, but before any authorization
+ starts. Second, it uses the Fixup hook, which runs after the
+ authorization phases and after per-directory configuration files
+ (<code>.htaccess</code> files) have been read, but before the
+ content handler is called.</p>
+
+ <p>So, after a request comes in and a corresponding server or
+ virtual host has been determined, the rewriting engine starts
+ processing any <code>mod_rewrite</code> directives appearing in the
+ per-server configuration (i.e., in the main server configuration file
+ and <code class="directive"><a href="../mod/core.html#virtualhost">&lt;VirtualHost&gt;</a></code>
+ sections). This happens in the URL-to-filename phase.</p>
+
+ <p>A few steps later, when the final data directories are found,
+ the per-directory configuration directives (<code>.htaccess</code>
+ files and <code class="directive"><a href="../mod/core.html#directory">&lt;Directory&gt;</a></code> blocks) are applied. This
+ happens in the Fixup phase.</p>
+
+ <p>In each of these cases, mod_rewrite rewrites the
+ <code>REQUEST_URI</code> either to a new URI, or to a filename.</p>
+
+ <p>In per-directory context (i.e., within <code>.htaccess</code> files
+ and <code>&lt;Directory&gt;</code> blocks), these rules are applied
+ after a URI has already been translated to a filename. Because of
+ this, mod_rewrite temporarily translates the filename back into a URI
+ by stripping off the directory path before applying the rules. (See the
+ <code class="directive"><a href="../mod/mod_rewrite.html#rewritebase">RewriteBase</a></code> directive to
+ see how you can further manipulate how this is handled.) Then, a new
+ internal subrequest is issued with the new URI. This restarts
+ processing of the API phases.</p>
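+
+ <p>As a hedged illustration of that translation, here is a minimal
+ sketch in which the filesystem path and the URL-path differ. The
+ paths <code>/opt/myapp-1.2.3</code> and <code>/myapp</code>, and the
+ file names, are hypothetical and chosen only for this example. Because
+ the directory is reached through an <code class="directive">Alias</code>,
+ <code class="directive">RewriteBase</code> tells mod_rewrite which URL
+ prefix to put back in front of a relative substitution:</p>
+
+ <div class="example"><pre>
+ # Hypothetical layout: URL-path /myapp is served from /opt/myapp-1.2.3
+ # (access control directives are omitted for brevity)
+ Alias "/myapp" "/opt/myapp-1.2.3"
+
+ &lt;Directory "/opt/myapp-1.2.3"&gt;
+     RewriteEngine On
+     # URL prefix to restore for relative substitutions in this directory
+     RewriteBase "/myapp/"
+     # The pattern sees "index.html", not "/myapp/index.html"
+     RewriteRule "^index\.html$" "welcome.html"
+ &lt;/Directory&gt;
+ </pre></div>
+
+ <p>Under these assumptions, a request for <code>/myapp/index.html</code>
+ should be rewritten to <code>/myapp/welcome.html</code>.</p>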
+
+ <p>Because of this further manipulation of the URI in per-directory
+ context, you'll need to take care to craft your rewrite rules
+ differently in that context. In particular, remember that the
+ leading directory path will be stripped off of the URI that your
+ rewrite rules will see. Consider the examples below for further
+ clarification.</p>
+
+ <table class="bordered">
+
+ <tr>
+ <th>Location of rule</th>
+ <th>Rule</th>
+ </tr>
+
+ <tr>
+ <td>VirtualHost section</td>
+ <td><code>RewriteRule ^/images/(.+)\.jpg /images/$1.gif</code></td>
+ </tr>
+
+ <tr>
+ <td>.htaccess file in document root</td>
+ <td><code>RewriteRule ^images/(.+)\.jpg images/$1.gif</code></td>
+ </tr>
+
+ <tr>
+ <td>.htaccess file in images directory</td>
+ <td><code>RewriteRule ^(.+)\.jpg $1.gif</code></td>
+ </tr>
+
+ </table>
+
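+ <p>The three rows of the table can be sketched as configuration
+ fragments. This is only an illustration: the host name
+ <code>www.example.com</code> and the document root
+ <code>/var/www/html</code> are assumptions, and the
+ <code>.htaccess</code> fragments assume that
+ <code class="directive">AllowOverride</code> permits rewrite
+ directives in those directories.</p>
+
+ <div class="example"><pre>
+ # Per-server context: the pattern is matched against the full
+ # URL-path, including the leading slash.
+ &lt;VirtualHost *:80&gt;
+     ServerName www.example.com
+     DocumentRoot "/var/www/html"
+     RewriteEngine On
+     RewriteRule ^/images/(.+)\.jpg /images/$1.gif
+ &lt;/VirtualHost&gt;
+
+ # File: /var/www/html/.htaccess (document root)
+ # The leading "/" has been stripped before the pattern is matched.
+ RewriteEngine On
+ RewriteRule ^images/(.+)\.jpg images/$1.gif
+
+ # File: /var/www/html/images/.htaccess (images directory)
+ # The leading "/images/" has been stripped before the pattern is matched.
+ RewriteEngine On
+ RewriteRule ^(.+)\.jpg $1.gif
+ </pre></div>
+
+ <p>Only one of the three fragments would normally be used for a given
+ rewrite. They are shown together to contrast what each pattern has to
+ match.</p>
+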
</div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
<div class="section">
<h2><a name="InternalRuleset" id="InternalRuleset">Ruleset Processing</a></h2>