<?xml version='1.0' encoding='UTF-8' ?>
<!DOCTYPE manualpage SYSTEM "../style/manualpage.dtd">
<?xml-stylesheet type="text/xsl" href="../style/manual.en.xsl"?>
<!-- $LastChangedRevision$ -->
<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements. See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->
<manualpage metafile="tech.xml.meta">
<parentdocument href="./">Rewrite</parentdocument>

<title>Apache mod_rewrite Technical Details</title>
<p>This document discusses some of the technical details of mod_rewrite.</p>
<seealso><a href="../mod/mod_rewrite.html">Module documentation</a></seealso>
<seealso><a href="intro.html">mod_rewrite introduction</a></seealso>
<seealso><a href="remapping.html">Redirection and remapping</a></seealso>
<seealso><a href="access.html">Controlling access</a></seealso>
<seealso><a href="vhosts.html">Virtual hosts</a></seealso>
<seealso><a href="proxy.html">Proxying</a></seealso>
<seealso><a href="rewritemap.html">RewriteMap</a></seealso>
<seealso><a href="advanced.html">Advanced techniques and tricks</a></seealso>
<seealso><a href="avoid.html">When not to use mod_rewrite</a></seealso>
<section id="Internal"><title>Internal Processing</title>

<p>The internal processing of this module is very complex, but it
needs to be explained once, even to the average user, to help you
avoid common mistakes and make full use of the module.</p>
<section id="InternalAPI"><title>API Phases</title>

<p>First you have to understand that Apache processes an HTTP
request in phases, and that the Apache API provides a hook for each
of these phases. mod_rewrite uses two of these hooks: the
URL-to-filename translation hook, which runs after the HTTP request
has been read but before any authorization starts, and the Fixup
hook, which is triggered after the authorization phases and after
the per-directory config files (<code>.htaccess</code>) have been
read, but before the content handler is activated.</p>
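
<p>As a rough illustration of the two contexts, the sketches below
show the same kind of rule written once for each. The paths
<code>/old/</code> and <code>/new/</code> are hypothetical and do not
come from this document. The first fragment is assumed to live in the
main server or virtual host configuration, so it is processed in the
URL-to-filename phase.</p>

<highlight language="config">
# Per-server context (main server or virtual host configuration).
# Hypothetical paths; the Pattern sees the URL-path with its leading slash.
RewriteEngine On
RewriteRule "^/old/(.*)$" "/new/$1"
</highlight>

<p>The second fragment is assumed to live in a <code>.htaccess</code>
file in the document root, so it is processed in the Fixup phase.</p>

<highlight language="config">
# Per-directory context (.htaccess in the document root).
# The directory prefix is stripped before matching, so the Pattern
# has no leading slash.
RewriteEngine On
RewriteRule "^old/(.*)$" "/new/$1"
</highlight>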
<p>So, after a request comes in and Apache has determined the
corresponding server (or virtual server), the rewriting engine
starts processing all mod_rewrite directives from the per-server
configuration in the URL-to-filename phase. A few steps later, when
the final data directories are found, the per-directory
configuration directives of mod_rewrite are triggered in the Fixup
phase. In both situations mod_rewrite rewrites URLs either to new
URLs or to filenames, although there is no obvious distinction
between them. The API was not designed to be used this way, but as
of Apache 1.x this is the only way mod_rewrite can operate. To make
this clearer, remember the following two points:</p>
<ol>
<li>Although mod_rewrite rewrites URLs to URLs, URLs to
filenames and even filenames to filenames, the API
currently provides only a URL-to-filename hook. In Apache
2.0 the two missing hooks will be added to make the
processing clearer. This has no drawbacks for the user; it
is just a fact to remember: Apache does more in the
URL-to-filename hook than the API intended.</li>

<li>Unbelievably, mod_rewrite provides URL manipulations in
per-directory context, <em>i.e.</em>, within
<code>.htaccess</code> files, although these are reached
a very long time after the URLs have been translated to
filenames. It has to be this way because
<code>.htaccess</code> files live in the filesystem, so
processing has already reached this stage. In other
words: according to the API phases, at this time it is too
late for any URL manipulations. To overcome this
chicken-and-egg problem, mod_rewrite uses a trick: when you
manipulate a URL/filename in per-directory context,
mod_rewrite first rewrites the filename back to its
corresponding URL (which is usually impossible, but see
the <code>RewriteBase</code> directive below for the
trick to achieve this) and then initiates a new internal
sub-request with the new URL. This restarts processing of
the API phases.

<p>Again, mod_rewrite tries hard to make this complicated
step totally transparent to the user, but you should
remember this: while URL manipulations in per-server
context are fast and efficient, per-directory rewrites are
slow and inefficient because of this chicken-and-egg
problem. On the other hand, this is the only way
mod_rewrite can provide (locally restricted) URL
manipulations to the average user.</p>
</li>
</ol>

<p>Don't forget these two points!</p>
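
<p>The following is a minimal sketch of how <code>RewriteBase</code>
supplies the URL prefix that cannot be derived from the filename
alone. It assumes a hypothetical directory <code>/opt/myapp</code>
that is mapped to the URL-path <code>/myapp</code> with an
<code>Alias</code> in the server configuration; none of these names
come from this document.</p>

<highlight language="config">
# .htaccess in the hypothetical directory /opt/myapp, which is assumed
# to be reachable as /myapp through an Alias in the server configuration.
RewriteEngine On
# The filesystem path says nothing about /myapp, so state the URL prefix.
RewriteBase "/myapp/"
# The relative substitution is resolved against RewriteBase,
# giving /myapp/welcome.html for the internal sub-request.
RewriteRule "^index\.html$" "welcome.html"
</highlight>

<p>Without the <code>RewriteBase</code> line, the relative
substitution would typically be resolved against the physical
directory path, which is generally not a valid URL-path when the
directory is reached through an alias.</p>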
<section id="InternalRuleset"><title>Ruleset Processing</title>

<p>When mod_rewrite is triggered in these two API phases, it
reads the configured rulesets from its configuration
structure (which was created either on startup for
per-server context or during the directory walk of the Apache
kernel for per-directory context). Then the URL rewriting
engine is started with the contained ruleset (one or more
rules together with their conditions). The operation of the
URL rewriting engine itself is exactly the same for both
configuration contexts. Only the final result processing
differs.</p>
<p>The order of rules in the ruleset is important because the
rewriting engine processes them in a special (and not very
obvious) order. The rule is this: the rewriting engine loops
through the ruleset rule by rule (<directive
module="mod_rewrite">RewriteRule</directive> directives), and
when a particular rule matches, it then loops through any
corresponding conditions (<code>RewriteCond</code>
directives). For historical reasons the conditions are written
first, so the control flow is a little long-winded. See
Figure 1 for more details.</p>
<img src="../images/rewrite_rule_flow.png"
    alt="Flow of RewriteRule and RewriteCond matching" /><br />
<dfn>Figure 1:</dfn> The control flow through the rewriting ruleset
<p>As you can see, the URL is first matched against the
<em>Pattern</em> of each rule. If the match fails, mod_rewrite
immediately stops processing this rule and continues with the
next rule. If the <em>Pattern</em> matches, mod_rewrite looks
for corresponding rule conditions. If none are present, it
substitutes the URL with a new value constructed from the
string <em>Substitution</em> and goes on with its rule-looping.
But if conditions exist, it starts an inner loop that processes
them in the order in which they are listed. For conditions the
logic is different: we don't match a pattern against the current
URL. Instead we first create a string <em>TestString</em> by
expanding variables, back-references, map lookups, <em>etc.</em>,
and then we try to match <em>CondPattern</em> against it. If the
pattern doesn't match, the complete set of conditions and the
corresponding rule fail. If the pattern matches, the next
condition is processed, until no more conditions are available.
If all conditions match, processing continues with the
substitution of the URL with <em>Substitution</em>.</p>
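
<p>To make this control flow concrete, here is a small sketch with
hypothetical host names and paths, assumed to sit in per-server
context. Although the conditions are written first, the
<em>Pattern</em> of the rule is tested first, which is why the
back-reference <code>$1</code> is already available in the first
condition.</p>

<highlight language="config">
RewriteEngine On
# Evaluated second: TestString is the text captured by the rule's Pattern.
# The condition holds only if that text does not start with "private/".
RewriteCond "$1" "!^private/"
# Evaluated third, and only if the previous condition matched.
RewriteCond "%{HTTP_HOST}" "^www\.example\.com$" [NC]
# Evaluated first: if the Pattern fails, neither condition is looked at.
# The Substitution is applied only when the Pattern and all conditions match.
RewriteRule "^/docs/(.+)$" "/manual/$1" [L]
</highlight>

<p>In this sketch a request for
<code>/docs/private/notes.html</code> matches the <em>Pattern</em>
but fails the first condition, so the URL is left unchanged and
processing continues with the next rule.</p>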