<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE manualpage SYSTEM "../style/manualpage.dtd">
<?xml-stylesheet type="text/xsl" href="../style/manual.en.xsl"?>
<!-- $LastChangedRevision$ -->
<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements. See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->
<manualpage metafile="advanced.xml.meta">
<parentdocument href="./">Rewrite</parentdocument>
<title>Advanced Techniques with mod_rewrite</title>
<p>This document supplements the <module>mod_rewrite</module>
<a href="../mod/mod_rewrite.html">reference documentation</a>. It provides
a few advanced techniques using mod_rewrite.</p>
<!--
I question whether anything remaining in this document qualifies as
"advanced". It's probably time to take inventory of the examples that we
have in the various docs, and consider a reorg of the stuff in this
document.
-->
<note type="warning">Note that many of these examples won't work unchanged in your
particular server configuration, so it's important that you understand
them, rather than merely cutting and pasting the examples into your
configuration.</note>
<seealso><a href="../mod/mod_rewrite.html">Module documentation</a></seealso>
<seealso><a href="intro.html">mod_rewrite introduction</a></seealso>
<seealso><a href="remapping.html">Redirection and remapping</a></seealso>
<seealso><a href="access.html">Controlling access</a></seealso>
<seealso><a href="vhosts.html">Virtual hosts</a></seealso>
<seealso><a href="proxy.html">Proxying</a></seealso>
<seealso><a href="rewritemap.html">Using RewriteMap</a></seealso>
<!--<seealso><a href="advanced.html">Advanced techniques</a></seealso>-->
<seealso><a href="avoid.html">When not to use mod_rewrite</a></seealso>
<section id="sharding">
<title>URL-based sharding across multiple backends</title>
<p>A common technique for distributing the burden of
server load or storage space is called "sharding".
When using this method, a front-end server will use the
URL to consistently "shard" users or objects to separate
backends.</p>
<p>A mapping is maintained, from users to target servers, in
external map files. They look like:</p>
user1 physical_host_of_user1<br />
user2 physical_host_of_user2<br />
<p>We put this into a <code>map.users-to-hosts</code> file. The
aim is to send a request for <code>/u/user1/anypath</code> to a URL
such as:</p>
http://physical_host_of_user1/u/user1/anypath
<p>Thus not every URL path needs to be valid on every backend physical
host. The following ruleset does this for us with the help of the map
file, assuming that <code>server0</code> is a default server which will be
used if a user has no entry in the map:</p>
<highlight language="config">
RewriteEngine on
RewriteMap users-to-hosts "txt:/path/to/map.users-to-hosts"
RewriteRule "^/u/([^/]+)/?(.*)" "http://${users-to-hosts:$1|server0}/u/$1/$2"
</highlight>
<p>See the <directive module="mod_rewrite">RewriteMap</directive>
documentation for more discussion of the syntax of this directive.</p>
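<p>As a rough sketch of one possible refinement (not part of the original
ruleset), the user name can be lowercased before the lookup so that
<code>/u/User1/</code> and <code>/u/user1/</code> reach the same backend.
The map name <code>lowercase</code> and the explicit <code>[R,L]</code>
flags are assumptions chosen for illustration, and the map file is assumed
to hold lowercase keys.</p>

<highlight language="config">
RewriteEngine on
# Internal map function that converts its key to lowercase
RewriteMap lowercase "int:tolower"
RewriteMap users-to-hosts "txt:/path/to/map.users-to-hosts"

# $1 is the user name captured by the RewriteRule pattern below;
# the condition stores its lowercased form in %1
RewriteCond "${lowercase:$1}" "^(.+)$"
# Look up the backend by %1, falling back to server0, and redirect there
RewriteRule "^/u/([^/]+)/?(.*)" "http://${users-to-hosts:%1|server0}/u/$1/$2" [R,L]
</highlight>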
</section>
<section id="on-the-fly-content">
<title>On-the-fly Content-Regeneration</title>
<dt>Description:</dt>
<p>We wish to dynamically generate content, but store it
statically once it is generated. This rule will check for the
existence of the static file, and if it's not there, generate
it. The static files can be removed periodically, if desired (say,
via cron) and will be regenerated on demand.</p>
<p>This is done via the following ruleset:</p>
<highlight language="config">
# This example is valid in per-directory context only
RewriteCond "%{REQUEST_URI}" !-U
RewriteRule "^(.+)\.html$" "/regenerate_page.cgi" [PT,L]
</highlight>
<p>The <code>-U</code> operator determines whether the test string
(in this case, <code>REQUEST_URI</code>) is a valid URL. It does
this via a subrequest. If this subrequest fails, that is, if the
requested resource doesn't exist, this rule invokes
the CGI program <code>/regenerate_page.cgi</code>, which generates
the requested resource and saves it into the document directory, so
that the next time it is requested, a static copy can be served.</p>
<p>In this way, documents that are infrequently updated can be served in
static form. If documents need to be refreshed, they can be deleted
from the document directory, and they will then be regenerated the
next time they are requested.</p>
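<p>Because <code>-U</code> issues a subrequest each time the condition is
evaluated, a variant that tests the filesystem directly may be cheaper.
The following is a minimal sketch, assuming the generated pages are stored
as plain files under the directory in which the rule is placed. The
<code>page</code> query parameter is a hypothetical way of telling the
script which page was requested and is not part of the original example.</p>

<highlight language="config">
# Per-directory context (e.g. a .htaccess file), as in the ruleset above
RewriteEngine on
# Only rewrite when no regular file exists at the mapped filesystem path
RewriteCond "%{REQUEST_FILENAME}" !-f
# Hand the request to the generator script, passing the page name along
RewriteRule "^(.+)\.html$" "/regenerate_page.cgi?page=$1" [PT,L]
</highlight>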
</section>
<section id="load-balancing">
<title>Load Balancing</title>
<dt>Description:</dt>
<p>We wish to randomly distribute load across several servers
using mod_rewrite.</p>
<p>We'll use <directive
module="mod_rewrite">RewriteMap</directive> and a list of servers
to accomplish this.</p>
<highlight language="config">
RewriteEngine on
RewriteMap lb "rnd:/path/to/serverlist.txt"
RewriteRule "^/(.*)" "http://${lb:servers}/$1" [P,L]
</highlight>
<p><code>serverlist.txt</code> will contain a list of the servers:</p>
## serverlist.txt<br />
servers one.example.com|two.example.com|three.example.com<br />
<p>If you want one particular server to receive more of the load than the
others, list it more than once.</p>
<p>Apache comes with a load-balancing module,
<module>mod_proxy_balancer</module>, which is far more flexible and
featureful than anything you can cobble together using mod_rewrite.</p>
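<p>For comparison, a minimal <module>mod_proxy_balancer</module> sketch
for the same three servers might look like the following. It assumes that
<module>mod_proxy</module>, <module>mod_proxy_http</module>,
<module>mod_proxy_balancer</module> and a scheduler module such as
<module>mod_lbmethod_byrequests</module> are loaded. The balancer name
<code>mycluster</code> and the <code>loadfactor</code> value are
illustrative choices.</p>

<highlight language="config">
&lt;Proxy "balancer://mycluster"&gt;
    # loadfactor=2 gives this member roughly twice the share of requests
    BalancerMember "http://one.example.com" loadfactor=2
    BalancerMember "http://two.example.com"
    BalancerMember "http://three.example.com"
&lt;/Proxy&gt;

# Proxy every request through the balancer
ProxyPass        "/" "balancer://mycluster/"
# Adjust redirect headers coming back from the backends
ProxyPassReverse "/" "balancer://mycluster/"
</highlight>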
</section>
<section id="structuredhomedirs">
<title>Structured Userdirs</title>
<dt>Description:</dt>
<p>Some sites with thousands of users use a
structured homedir layout, <em>i.e.</em> each homedir is in a
subdirectory which begins (for instance) with the first
character of the username. So, <code>/~larry/anypath</code>
is <code>/home/<strong>l</strong>/larry/public_html/anypath</code>
while <code>/~waldo/anypath</code> is
<code>/home/<strong>w</strong>/waldo/public_html/anypath</code>.</p>
<p>We use the following ruleset to expand the tilde URLs
into the above layout.</p>
<highlight language="config">
RewriteEngine on
RewriteRule "^/~(([a-z])[a-z0-9]+)(.*)" "/home/$2/$1/public_html$3"
</highlight>
</section>
<section id="redirectanchors">
<title>Redirecting Anchors</title>
<dt>Description:</dt>
<p>By default, redirecting to an HTML anchor doesn't work,
because mod_rewrite escapes the <code>#</code> character,
turning it into <code>%23</code>. This, in turn, breaks the
redirection.</p>
<p>Use the <code>[NE]</code> flag on the
<code>RewriteRule</code>. NE stands for No Escape.</p>
<dd>This technique will of course also work with other
special characters that mod_rewrite, by default, URL-encodes.</dd>
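<p>A minimal sketch of such a rule follows. The paths
<code>/old-faq</code> and <code>/help.html</code> and the anchor name
<code>faq</code> are hypothetical.</p>

<highlight language="config">
RewriteEngine on
# Without NE the client would be redirected to /help.html%23faq
RewriteRule "^/old-faq$" "/help.html#faq" [NE,R,L]
</highlight>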
</section>
<section id="time-dependent">
<title>Time-Dependent Rewriting</title>
<dt>Description:</dt>
<p>We wish to use mod_rewrite to serve different content based on
the time of day.</p>
<p>Several variables named <code>TIME_xxx</code> are available
for rewrite conditions. In conjunction with the special
lexicographic comparison patterns <code>&lt;STRING</code>,
<code>&gt;STRING</code> and <code>=STRING</code> we can
do time-dependent redirects:</p>
<highlight language="config">
RewriteEngine on
RewriteCond "%{TIME_HOUR}%{TIME_MIN}" &gt;0700
RewriteCond "%{TIME_HOUR}%{TIME_MIN}" &lt;1900
RewriteRule "^foo\.html$" "foo.day.html" [L]
RewriteRule "^foo\.html$" "foo.night.html"
</highlight>
<p>This provides the content of <code>foo.day.html</code>
under the URL <code>foo.html</code> from
<code>07:01-18:59</code> and, at all other times, the
contents of <code>foo.night.html</code>.</p>
<note type="warning"><module>mod_cache</module>, intermediate proxies
and browsers may each cache responses and cause either page to be
shown outside of the configured time window.
<module>mod_expires</module> may be used to control this
effect. You are, of course, much better off simply serving the
content dynamically, and customizing it based on the time of day.</note>
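<p>As a hedged sketch of the <module>mod_expires</module> approach
mentioned above, the two variants can be given a short freshness lifetime
so that caches hold either page only briefly. The one-minute lifetime is
an arbitrary illustrative value, and <module>mod_expires</module> is
assumed to be loaded.</p>

<highlight language="config">
# Match the files actually served after the rewrite
&lt;FilesMatch "^foo\.(day|night)\.html$"&gt;
    ExpiresActive On
    # Let caches keep either variant for at most one minute
    ExpiresDefault "access plus 1 minute"
&lt;/FilesMatch&gt;
</highlight>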
</section>
<section id="setenvvars">
<title>Set Environment Variables Based On URL Parts</title>
<dt>Description:</dt>
<p>At times, we want to maintain some kind of status when we
perform a rewrite. For example, you may want to note that a rewrite
has been performed, so that you can check later whether a
request came via that rewrite. One way to do this is by setting an
environment variable.</p>
<p>Use the [E] flag to set an environment variable.</p>
<highlight language="config">
RewriteEngine on
RewriteRule "^/horse/(.*)" "/pony/$1" [E=rewritten:1]
</highlight>
<p>Later in your ruleset you might check for this environment
variable using a RewriteCond:</p>
<highlight language="config">
RewriteCond "%{ENV:rewritten}" =1
</highlight>
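<p>Putting the two pieces together, a minimal sketch for server or virtual
host context might look like this. The second rule, which refuses direct
requests for <code>/pony/</code>, is a hypothetical use of the variable
and not part of the original example.</p>

<highlight language="config">
RewriteEngine on
# Rewrite /horse/... to /pony/... and record that we did so
RewriteRule "^/horse/(.*)" "/pony/$1" [E=rewritten:1]

# Requests that reach /pony/ without the marker did not come via the
# rewrite above, so answer them with 403 Forbidden
RewriteCond "%{ENV:rewritten}" !=1
RewriteRule "^/pony/" "-" [F]
</highlight>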
<p>Note that environment variables do not survive an external
redirect. You might consider using the [CO] flag to set a