<li><img alt="" src="../images/down.gif" /> <a href="#on-the-fly-content">On-the-fly Content-Regeneration</a></li>
<li><img alt="" src="../images/down.gif" /> <a href="#load-balancing">Load Balancing</a></li>
<li><img alt="" src="../images/down.gif" /> <a href="#autorefresh">Document With Autorefresh</a></li>
+<li><img alt="" src="../images/down.gif" /> <a href="#structuredhomedirs">Structured Userdirs</a></li>
+<li><img alt="" src="../images/down.gif" /> <a href="#redirectanchors">Redirecting Anchors</a></li>
+<li><img alt="" src="../images/down.gif" /> <a href="#time-dependent">Time-Dependent Rewriting</a></li>
+<li><img alt="" src="../images/down.gif" /> <a href="#setenvvars">Set Environment Variables Based On URL Parts</a></li>
</ul><h3>See also</h3><ul class="seealso"><li><a href="../mod/mod_rewrite.html">Module documentation</a></li><li><a href="intro.html">mod_rewrite introduction</a></li><li><a href="remapping.html">Redirection and remapping</a></li><li><a href="access.html">Controlling access</a></li><li><a href="avoid.html">When not to use mod_rewrite</a></li></ul></div>
<div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
<div class="section">
</dd>
</dl>
+</div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
+<div class="section">
+<h2><a name="structuredhomedirs" id="structuredhomedirs">Structured Userdirs</a></h2>
+
+
+
+ <dl>
+ <dt>Description:</dt>
+
+ <dd>
+ <p>Some sites with thousands of users use a
+ structured homedir layout, <em>i.e.</em> each homedir is in a
+ subdirectory which begins (for instance) with the first
+ character of the username. So, <code>/~larry/anypath</code>
+ is <code>/home/<strong>l</strong>/larry/public_html/anypath</code>
+ while <code>/~waldo/anypath</code> is
+ <code>/home/<strong>w</strong>/waldo/public_html/anypath</code>.</p>
+ </dd>
+
+ <dt>Solution:</dt>
+
+ <dd>
+ <p>We use the following ruleset to expand the tilde URLs
+ into the above layout.</p>
+
+<div class="example"><pre>
+RewriteEngine on
+RewriteRule ^/~(<strong>([a-z])</strong>[a-z0-9]+)(.*) /home/<strong>$2</strong>/$1/public_html$3
+</pre></div>
+ </dd>
+ </dl>
+
+</div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
+<div class="section">
+<h2><a name="redirectanchors" id="redirectanchors">Redirecting Anchors</a></h2>
+
+
+
+ <dl>
+ <dt>Description:</dt>
+
+ <dd>
+ <p>By default, redirecting to an HTML anchor doesn't work,
+ because mod_rewrite escapes the <code>#</code> character,
+ turning it into <code>%23</code>. This, in turn, breaks the
+ redirection.</p>
+ </dd>
+
+ <dt>Solution:</dt>
+
+ <dd>
+ <p>Use the <code>[NE]</code> flag on the
+ <code>RewriteRule</code>. NE stands for No Escape.
+ </p>
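+ <p>As a minimal sketch, a rule along the following lines would
+ redirect <code>/anchor/xyz</code> to <code>/bigpage.html#xyz</code>.
+ The <code>/anchor/</code> prefix and the <code>/bigpage.html</code>
+ target are placeholder names chosen for illustration only.</p>
+
+<div class="example"><pre>
+RewriteEngine on
+# /anchor/ and /bigpage.html are hypothetical names
+RewriteRule ^/anchor/(.+) /bigpage.html#$1 [NE,R]
+</pre></div>
+
+ <p>The <code>[R]</code> flag is included because the fragment is
+ interpreted only by the browser, so the redirect has to be an
+ external one.</p>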
+ </dd>
+
+ <dt>Discussion:</dt>
+ <dd>This technique also works with other
+ special characters that mod_rewrite, by default, URL-encodes.</dd>
+ </dl>
+
+</div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
+<div class="section">
+<h2><a name="time-dependent" id="time-dependent">Time-Dependent Rewriting</a></h2>
+
+
+
+ <dl>
+ <dt>Description:</dt>
+
+ <dd>
+ <p>We wish to use mod_rewrite to serve different content based on
+ the time of day.</p>
+ </dd>
+
+ <dt>Solution:</dt>
+
+ <dd>
+ <p>There are a lot of variables named <code>TIME_xxx</code>
+ for rewrite conditions. In conjunction with the special
+ lexicographic comparison patterns <code>&lt;STRING</code>,
+ <code>&gt;STRING</code> and <code>=STRING</code> we can
+ do time-dependent redirects:</p>
+
+<div class="example"><pre>
+RewriteEngine on
+RewriteCond %{TIME_HOUR}%{TIME_MIN} &gt;0700
+RewriteCond %{TIME_HOUR}%{TIME_MIN} &lt;1900
+RewriteRule ^foo\.html$ foo.day.html [L]
+RewriteRule ^foo\.html$ foo.night.html
+</pre></div>
+
+ <p>This serves the content of <code>foo.day.html</code>
+ under the URL <code>foo.html</code> from
+ <code>07:01-18:59</code>, and the content of
+ <code>foo.night.html</code> at all other times.</p>
+
+ <div class="warning"><code class="module"><a href="../mod/mod_cache.html">mod_cache</a></code>, intermediate proxies
+ and browsers may each cache responses and cause either page to be
+ shown outside of the configured time window.
+ <code class="module"><a href="../mod/mod_expires.html">mod_expires</a></code> may be used to control this
+ effect. You are, of course, much better off simply serving the
+ content dynamically, and customizing it based on the time of day.</div>
+
+ </dd>
+ </dl>
+
+</div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
+<div class="section">
+<h2><a name="setenvvars" id="setenvvars">Set Environment Variables Based On URL Parts</a></h2>
+
+
+
+ <dl>
+ <dt>Description:</dt>
+
+ <dd>
+ <p>At times, we want to maintain some kind of status when we
+ perform a rewrite. For example, you want to make a note that
+ you've done that rewrite, so that you can check later to see if a
+ request came via that rewrite. One way to do this is by setting an
+ environment variable.</p>
+ </dd>
+
+ <dt>Solution:</dt>
+
+ <dd>
+ <p>Use the [E] flag to set an environment variable.</p>
+
+<div class="example"><pre>
+RewriteEngine on
+RewriteRule ^/horse/(.*) /pony/$1 [E=<strong>rewritten:1</strong>]
+</pre></div>
+
+ <p>Later in your ruleset you might check for this environment
+ variable using a RewriteCond:</p>
+
+<div class="example"><pre>
+RewriteCond %{ENV:rewritten} =1
+</pre></div>
+
+ </dd>
+ </dl>
+
</div></div>
<div class="bottomlang">
<p><span>Available Languages: </span><a href="../en/rewrite/avoid.html" title="English"> en </a></p>
</section>
+<section id="structuredhomedirs">
+
+ <title>Structured Userdirs</title>
+
+ <dl>
+ <dt>Description:</dt>
+
+ <dd>
+ <p>Some sites with thousands of users use a
+ structured homedir layout, <em>i.e.</em> each homedir is in a
+ subdirectory which begins (for instance) with the first
+ character of the username. So, <code>/~larry/anypath</code>
+ is <code>/home/<strong>l</strong>/larry/public_html/anypath</code>
+ while <code>/~waldo/anypath</code> is
+ <code>/home/<strong>w</strong>/waldo/public_html/anypath</code>.</p>
+ </dd>
+
+ <dt>Solution:</dt>
+
+ <dd>
+ <p>We use the following ruleset to expand the tilde URLs
+ into the above layout.</p>
+
+<example><pre>
+RewriteEngine on
+RewriteRule ^/~(<strong>([a-z])</strong>[a-z0-9]+)(.*) /home/<strong>$2</strong>/$1/public_html$3
+</pre></example>
+ </dd>
+ </dl>
+
+</section>
+
+<section id="redirectanchors">
+
+ <title>Redirecting Anchors</title>
+
+ <dl>
+ <dt>Description:</dt>
+
+ <dd>
+ <p>By default, redirecting to an HTML anchor doesn't work,
+ because mod_rewrite escapes the <code>#</code> character,
+ turning it into <code>%23</code>. This, in turn, breaks the
+ redirection.</p>
+ </dd>
+
+ <dt>Solution:</dt>
+
+ <dd>
+ <p>Use the <code>[NE]</code> flag on the
+ <code>RewriteRule</code>. NE stands for No Escape.
+ </p>
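+ <p>As a minimal sketch, a rule along the following lines would
+ redirect <code>/anchor/xyz</code> to <code>/bigpage.html#xyz</code>.
+ The <code>/anchor/</code> prefix and the <code>/bigpage.html</code>
+ target are placeholder names chosen for illustration only.</p>
+
+<example><pre>
+RewriteEngine on
+# /anchor/ and /bigpage.html are hypothetical names
+RewriteRule ^/anchor/(.+) /bigpage.html#$1 [NE,R]
+</pre></example>
+
+ <p>The <code>[R]</code> flag is included because the fragment is
+ interpreted only by the browser, so the redirect has to be an
+ external one.</p>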
+ </dd>
+
+ <dt>Discussion:</dt>
+ <dd>This technique also works with other
+ special characters that mod_rewrite, by default, URL-encodes.</dd>
+ </dl>
+
+</section>
+
+<section id="time-dependent">
+
+ <title>Time-Dependent Rewriting</title>
+
+ <dl>
+ <dt>Description:</dt>
+
+ <dd>
+ <p>We wish to use mod_rewrite to serve different content based on
+ the time of day.</p>
+ </dd>
+
+ <dt>Solution:</dt>
+
+ <dd>
+ <p>There are a lot of variables named <code>TIME_xxx</code>
+ for rewrite conditions. In conjunction with the special
+ lexicographic comparison patterns <code>&lt;STRING</code>,
+ <code>&gt;STRING</code> and <code>=STRING</code> we can
+ do time-dependent redirects:</p>
+
+<example><pre>
+RewriteEngine on
+RewriteCond %{TIME_HOUR}%{TIME_MIN} &gt;0700
+RewriteCond %{TIME_HOUR}%{TIME_MIN} &lt;1900
+RewriteRule ^foo\.html$ foo.day.html [L]
+RewriteRule ^foo\.html$ foo.night.html
+</pre></example>
+
+ <p>This serves the content of <code>foo.day.html</code>
+ under the URL <code>foo.html</code> from
+ <code>07:01-18:59</code>, and the content of
+ <code>foo.night.html</code> at all other times.</p>
+
+ <note type="warning"><module>mod_cache</module>, intermediate proxies
+ and browsers may each cache responses and cause either page to be
+ shown outside of the configured time window.
+ <module>mod_expires</module> may be used to control this
+ effect. You are, of course, much better off simply serving the
+ content dynamically, and customizing it based on the time of day.</note>
+
+ </dd>
+ </dl>
+
+</section>
+
+<section id="setenvvars">
+
+ <title>Set Environment Variables Based On URL Parts</title>
+
+ <dl>
+ <dt>Description:</dt>
+
+ <dd>
+ <p>At times, we want to maintain some kind of status when we
+ perform a rewrite. For example, you want to make a note that
+ you've done that rewrite, so that you can check later to see if a
+ request came via that rewrite. One way to do this is by setting an
+ environment variable.</p>
+ </dd>
+
+ <dt>Solution:</dt>
+
+ <dd>
+ <p>Use the [E] flag to set an environment variable.</p>
+
+<example><pre>
+RewriteEngine on
+RewriteRule ^/horse/(.*) /pony/$1 [E=<strong>rewritten:1</strong>]
+</pre></example>
+
+ <p>Later in your ruleset you might check for this environment
+ variable using a RewriteCond:</p>
+
+<example><pre>
+RewriteCond %{ENV:rewritten} =1
+</pre></example>
+
+ </dd>
+ </dl>
+
+</section>
+
</manualpage>
<li><img alt="" src="../images/down.gif" /> <a href="#canonicalurl">Canonical URLs</a></li>
<li><img alt="" src="../images/down.gif" /> <a href="#uservhosts">Virtual Hosts Per User</a></li>
<li><img alt="" src="../images/down.gif" /> <a href="#moveddocroot">Moved <code>DocumentRoot</code></a></li>
+<li><img alt="" src="../images/down.gif" /> <a href="#mass-virtual-hosting">Mass Virtual Hosting</a></li>
+<li><img alt="" src="../images/down.gif" /> <a href="#dynamic-proxy">Proxying Content with mod_rewrite</a></li>
</ul><h3>See also</h3><ul class="seealso"><li><a href="../mod/mod_rewrite.html">Module documentation</a></li><li><a href="intro.html">mod_rewrite introduction</a></li><li><a href="access.html">Controlling access</a></li><li><a href="advanced.html">Advanced techniques and tricks</a></li><li><a href="avoid.html">When not to use mod_rewrite</a></li></ul></div>
<div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
<div class="section">
<div class="section">
<h2><a name="canonicalhost" id="canonicalhost">Canonical Hostnames</a></h2>
+
+
<dl>
<dt>Description:</dt>
</dd>
</dl>
+</div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
+<div class="section">
+<h2><a name="mass-virtual-hosting" id="mass-virtual-hosting">Mass Virtual Hosting</a></h2>
+
+
+
+ <dl>
+ <dt>Description:</dt>
+
+ <dd>
+ <p>Mass virtual hosting is one of the more common uses of
+ mod_rewrite. However, it is seldom the best way to handle mass
+ virtual hosting. This topic is discussed at great length in the <a href="../vhosts/mass.html">virtual host documentation</a>.</p>
+ </dd>
+ </dl>
+
+</div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
+<div class="section">
+<h2><a name="dynamic-proxy" id="dynamic-proxy">Proxying Content with mod_rewrite</a></h2>
+
+
+
+ <dl>
+ <dt>Description:</dt>
+
+ <dd>
+ <p>
+ mod_rewrite provides the [P] flag, which allows URLs to be passed,
+ via mod_proxy, to another server. Two examples are given here. In
+ one example, a URL is passed directly to another server, and served
+ as though it were a local URL. In the other example, we proxy
+ missing content to a back-end server.</p>
+ </dd>
+
+ <dt>Solution:</dt>
+
+ <dd>
+ <p>To simply map a URL to another server, we use the [P] flag, as
+ follows:</p>
+
+<div class="example"><pre>
+RewriteEngine on
+RewriteBase /products/
+RewriteRule ^<strong>widget/</strong>(.*)$ <strong>http://product.example.com/widget/</strong>$1 [<strong>P</strong>]
+ProxyPassReverse /products/widget/ http://product.example.com/widget/
+</pre></div>
+
+ <p>In the second example, we proxy the request only if we can't find
+ the resource locally. This can be very useful when you're migrating
+ from one server to another, and you're not sure if all the content
+ has been migrated yet.</p>
+
+<div class="example"><pre>
+RewriteCond %{REQUEST_FILENAME} <strong>!-f</strong>
+RewriteCond %{REQUEST_FILENAME} <strong>!-d</strong>
+RewriteRule ^/(.*) http://<strong>old</strong>.example.com/$1 [<strong>P</strong>]
+ProxyPassReverse / http://old.example.com/
+</pre></div>
+ </dd>
+
+ <dt>Discussion:</dt>
+
+ <dd><p>In each case, we add a <code class="directive"><a href="../mod/mod_proxy.html#proxypassreverse">ProxyPassReverse</a></code> directive to ensure
+ that any redirects issued by the backend are correctly passed on to
+ the client.</p>
+
+ <p>Consider using either <code class="directive"><a href="../mod/mod_proxy.html#proxypass">ProxyPass</a></code> or <code class="directive"><a href="../mod/mod_proxy.html#proxypassmatch">ProxyPassMatch</a></code> whenever possible in
+ preference to mod_rewrite.</p>
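+
+ <p>As a rough sketch of that alternative, the first mapping above
+ could be expressed with <code class="directive"><a href="../mod/mod_proxy.html#proxypass">ProxyPass</a></code>
+ alone, assuming the same example host name and that no other
+ rewriting is needed for these URLs:</p>
+
+<div class="example"><pre>
+# Same example paths and host as the first ruleset above
+ProxyPass /products/widget/ http://product.example.com/widget/
+ProxyPassReverse /products/widget/ http://product.example.com/widget/
+</pre></div>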
+ </dd>
+ </dl>
+
</div></div>
<div class="bottomlang">
<p><span>Available Languages: </span><a href="../en/rewrite/remapping.html" title="English"> en </a></p>
</section>
-<section id="canonicalhost"><title>Canonical Hostnames</title>
+<section id="canonicalhost">
+
+<title>Canonical Hostnames</title>
<dl>
<dt>Description:</dt>
</section>
+<section id="mass-virtual-hosting">
+
+ <title>Mass Virtual Hosting</title>
+
+ <dl>
+ <dt>Description:</dt>
+
+ <dd>
+ <p>Mass virtual hosting is one of the more common uses of
+ mod_rewrite. However, it is seldom the best way to handle mass
+ virtual hosting. This topic is discussed at great length in the <a
+ href="../vhosts/mass.html">virtual host documentation</a>.</p>
+ </dd>
+ </dl>
+
+</section>
+
+<section id="dynamic-proxy">
+
+ <title>Proxying Content with mod_rewrite</title>
+
+ <dl>
+ <dt>Description:</dt>
+
+ <dd>
+ <p>
+ mod_rewrite provides the [P] flag, which allows URLs to be passed,
+ via mod_proxy, to another server. Two examples are given here. In
+ one example, a URL is passed directly to another server, and served
+ as though it were a local URL. In the other example, we proxy
+ missing content to a back-end server.</p>
+ </dd>
+
+ <dt>Solution:</dt>
+
+ <dd>
+ <p>To simply map a URL to another server, we use the [P] flag, as
+ follows:</p>
+
+<example><pre>
+RewriteEngine on
+RewriteBase /products/
+RewriteRule ^<strong>widget/</strong>(.*)$ <strong>http://product.example.com/widget/</strong>$1 [<strong>P</strong>]
+ProxyPassReverse /products/widget/ http://product.example.com/widget/
+</pre></example>
+
+ <p>In the second example, we proxy the request only if we can't find
+ the resource locally. This can be very useful when you're migrating
+ from one server to another, and you're not sure if all the content
+ has been migrated yet.</p>
+
+<example><pre>
+RewriteCond %{REQUEST_FILENAME} <strong>!-f</strong>
+RewriteCond %{REQUEST_FILENAME} <strong>!-d</strong>
+RewriteRule ^/(.*) http://<strong>old</strong>.example.com/$1 [<strong>P</strong>]
+ProxyPassReverse / http://old.example.com/
+</pre></example>
+ </dd>
+
+ <dt>Discussion:</dt>
+
+ <dd><p>In each case, we add a <directive
+ module="mod_proxy">ProxyPassReverse</directive> directive to ensure
+ that any redirects issued by the backend are correctly passed on to
+ the client.</p>
+
+ <p>Consider using either <directive
+ module="mod_proxy">ProxyPass</directive> or <directive
+ module="mod_proxy">ProxyPassMatch</directive> whenever possible in
+ preference to mod_rewrite.</p>
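+
+ <p>As a rough sketch of that alternative, the first mapping above
+ could be expressed with <directive
+ module="mod_proxy">ProxyPass</directive> alone, assuming the same
+ example host name and that no other rewriting is needed for these
+ URLs:</p>
+
+<example><pre>
+# Same example paths and host as the first ruleset above
+ProxyPass /products/widget/ http://product.example.com/widget/
+ProxyPassReverse /products/widget/ http://product.example.com/widget/
+</pre></example>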
+ </dd>
+ </dl>
+
+</section>
+
+
+
</manualpage>
avoids many problems.</div>
</div>
-<div id="quickview"><ul id="toc"><li><img alt="" src="../images/down.gif" /> <a href="#trailingslash">Trailing Slash Problem</a></li>
-<li><img alt="" src="../images/down.gif" /> <a href="#setenvvars">Set Environment Variables According To URL Parts</a></li>
-<li><img alt="" src="../images/down.gif" /> <a href="#redirecthome">Redirect Homedirs For Foreigners</a></li>
-<li><img alt="" src="../images/down.gif" /> <a href="#redirectanchors">Redirecting Anchors</a></li>
-<li><img alt="" src="../images/down.gif" /> <a href="#time-dependent">Time-Dependent Rewriting</a></li>
-<li><img alt="" src="../images/down.gif" /> <a href="#structuredhomedirs">Structured Homedirs</a></li>
-<li><img alt="" src="../images/down.gif" /> <a href="#dynamic-mirror">Dynamic Mirror</a></li>
-<li><img alt="" src="../images/down.gif" /> <a href="#retrieve-missing-data">Retrieve Missing Data from Intranet</a></li>
-<li><img alt="" src="../images/down.gif" /> <a href="#new-mime-type">New MIME-type, New Service</a></li>
-<li><img alt="" src="../images/down.gif" /> <a href="#mass-virtual-hosting">Mass Virtual Hosting</a></li>
-</ul><h3>See also</h3><ul class="seealso"><li><a href="../mod/mod_rewrite.html">Module
+<div id="quickview"><h3>See also</h3><ul class="seealso"><li><a href="../mod/mod_rewrite.html">Module
documentation</a></li><li><a href="intro.html">mod_rewrite
introduction</a></li><li><a href="rewrite_guide_advanced.html">Advanced Rewrite Guide - advanced
useful examples</a></li><li><a href="tech.html">Technical details</a></li></ul></div>
-<div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
-<div class="section">
-<h2><a name="trailingslash" id="trailingslash">Trailing Slash Problem</a></h2>
-
-
-
- <dl>
- <dt>Description:</dt>
-
- <dd><p>The vast majority of "trailing slash" problems can be dealt
- with using the techniques discussed in the <a href="http://httpd.apache.org/docs/misc/FAQ-E.html#set-servername">FAQ
- entry</a>. However, occasionally, there is a need to use mod_rewrite
- to handle a case where a missing trailing slash causes a URL to
- fail. This can happen, for example, after a series of complex
- rewrite rules.</p>
- </dd>
-
- <dt>Solution:</dt>
-
- <dd>
- <p>The solution to this subtle problem is to let the server
- add the trailing slash automatically. To do this
- correctly we have to use an external redirect, so the
- browser correctly requests subsequent images etc. If we
- only did a internal rewrite, this would only work for the
- directory page, but would go wrong when any images are
- included into this page with relative URLs, because the
- browser would request an in-lined object. For instance, a
- request for <code>image.gif</code> in
- <code>/~quux/foo/index.html</code> would become
- <code>/~quux/image.gif</code> without the external
- redirect!</p>
-
- <p>So, to do this trick we write:</p>
-
-<div class="example"><pre>
-RewriteEngine on
-RewriteBase /~quux/
-RewriteRule ^foo<strong>$</strong> foo<strong>/</strong> [<strong>R</strong>]
-</pre></div>
-
- <p>Alternately, you can put the following in a
- top-level <code>.htaccess</code> file in the content directory.
- But note that this creates some processing overhead.</p>
-
-<div class="example"><pre>
-RewriteEngine on
-RewriteBase /~quux/
-RewriteCond %{REQUEST_FILENAME} <strong>-d</strong>
-RewriteRule ^(.+<strong>[^/]</strong>)$ $1<strong>/</strong> [R]
-</pre></div>
- </dd>
- </dl>
-
- </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
-<div class="section">
-<h2><a name="setenvvars" id="setenvvars">Set Environment Variables According To URL Parts</a></h2>
-
-
-
- <dl>
- <dt>Description:</dt>
-
- <dd>
- <p>Perhaps you want to keep status information between
- requests and use the URL to encode it. But you don't want
- to use a CGI wrapper for all pages just to strip out this
- information.</p>
- </dd>
-
- <dt>Solution:</dt>
-
- <dd>
- <p>We use a rewrite rule to strip out the status information
- and remember it via an environment variable which can be
- later dereferenced from within XSSI or CGI. This way a
- URL <code>/foo/S=java/bar/</code> gets translated to
- <code>/foo/bar/</code> and the environment variable named
- <code>STATUS</code> is set to the value "java".</p>
-
-<div class="example"><pre>
-RewriteEngine on
-RewriteRule ^(.*)/<strong>S=([^/]+)</strong>/(.*) $1/$3 [E=<strong>STATUS:$2</strong>]
-</pre></div>
- </dd>
- </dl>
-
- </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
-<div class="section">
-<h2><a name="redirecthome" id="redirecthome">Redirect Homedirs For Foreigners</a></h2>
-
-
-
- <dl>
- <dt>Description:</dt>
-
- <dd>
- <p>We want to redirect homedir URLs to another webserver
- <code>www.somewhere.com</code> when the requesting user
- does not stay in the local domain
- <code>ourdomain.com</code>. This is sometimes used in
- virtual host contexts.</p>
- </dd>
-
- <dt>Solution:</dt>
-
- <dd>
- <p>Just a rewrite condition:</p>
-
-<div class="example"><pre>
-RewriteEngine on
-RewriteCond %{REMOTE_HOST} <strong>!^.+\.ourdomain\.com$</strong>
-RewriteRule ^(/~.+) http://www.somewhere.com/$1 [R,L]
-</pre></div>
- </dd>
- </dl>
-
- </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
-<div class="section">
-<h2><a name="redirectanchors" id="redirectanchors">Redirecting Anchors</a></h2>
-
-
-
- <dl>
- <dt>Description:</dt>
-
- <dd>
- <p>By default, redirecting to an HTML anchor doesn't work,
- because mod_rewrite escapes the <code>#</code> character,
- turning it into <code>%23</code>. This, in turn, breaks the
- redirection.</p>
- </dd>
-
- <dt>Solution:</dt>
-
- <dd>
- <p>Use the <code>[NE]</code> flag on the
- <code>RewriteRule</code>. NE stands for No Escape.
- </p>
- </dd>
- </dl>
-
- </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
-<div class="section">
-<h2><a name="time-dependent" id="time-dependent">Time-Dependent Rewriting</a></h2>
-
-
-
- <dl>
- <dt>Description:</dt>
-
- <dd>
- <p>When tricks like time-dependent content should happen a
- lot of webmasters still use CGI scripts which do for
- instance redirects to specialized pages. How can it be done
- via <code class="module"><a href="../mod/mod_rewrite.html">mod_rewrite</a></code>?</p>
- </dd>
-
- <dt>Solution:</dt>
-
- <dd>
- <p>There are a lot of variables named <code>TIME_xxx</code>
- for rewrite conditions. In conjunction with the special
- lexicographic comparison patterns <code><STRING</code>,
- <code>>STRING</code> and <code>=STRING</code> we can
- do time-dependent redirects:</p>
-
-<div class="example"><pre>
-RewriteEngine on
-RewriteCond %{TIME_HOUR}%{TIME_MIN} >0700
-RewriteCond %{TIME_HOUR}%{TIME_MIN} <1900
-RewriteRule ^foo\.html$ foo.day.html
-RewriteRule ^foo\.html$ foo.night.html
-</pre></div>
-
- <p>This provides the content of <code>foo.day.html</code>
- under the URL <code>foo.html</code> from
- <code>07:01-18:59</code> and at the remaining time the
- contents of <code>foo.night.html</code>. Just a nice
- feature for a homepage...</p>
-
- <div class="warning"><code class="module"><a href="../mod/mod_cache.html">mod_cache</a></code>, intermediate proxies
- and browsers may each cache responses and cause the either page to be
- shown outside of the time-window configured.
- <code class="module"><a href="../mod/mod_expires.html">mod_expires</a></code> may be used to control this
- effect.</div>
-
- </dd>
- </dl>
-
- </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
-<div class="section">
-<h2><a name="structuredhomedirs" id="structuredhomedirs">Structured Homedirs</a></h2>
-
-
-
- <dl>
- <dt>Description:</dt>
-
- <dd>
- <p>Some sites with thousands of users use a
- structured homedir layout, <em>i.e.</em> each homedir is in a
- subdirectory which begins (for instance) with the first
- character of the username. So, <code>/~foo/anypath</code>
- is <code>/home/<strong>f</strong>/foo/.www/anypath</code>
- while <code>/~bar/anypath</code> is
- <code>/home/<strong>b</strong>/bar/.www/anypath</code>.</p>
- </dd>
-
- <dt>Solution:</dt>
-
- <dd>
- <p>We use the following ruleset to expand the tilde URLs
- into the above layout.</p>
-
-<div class="example"><pre>
-RewriteEngine on
-RewriteRule ^/~(<strong>([a-z])</strong>[a-z0-9]+)(.*) /home/<strong>$2</strong>/$1/.www$3
-</pre></div>
- </dd>
- </dl>
-
- </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
-<div class="section">
-<h2><a name="dynamic-mirror" id="dynamic-mirror">Dynamic Mirror</a></h2>
-
-
-
- <dl>
- <dt>Description:</dt>
-
- <dd>
- <p>Assume there are nice web pages on remote hosts we want
- to bring into our namespace. For FTP servers we would use
- the <code>mirror</code> program which actually maintains an
- explicit up-to-date copy of the remote data on the local
- machine. For a web server we could use the program
- <code>webcopy</code> which runs via HTTP. But both
- techniques have a major drawback: The local copy is
- always only as up-to-date as the last time we ran the program. It
- would be much better if the mirror was not a static one we
- have to establish explicitly. Instead we want a dynamic
- mirror with data which gets updated automatically
- as needed on the remote host(s).</p>
- </dd>
-
- <dt>Solution:</dt>
-
- <dd>
- <p>To provide this feature we map the remote web page or even
- the complete remote web area to our namespace by the use
- of the <dfn>Proxy Throughput</dfn> feature
- (flag <code>[P]</code>):</p>
-
-<div class="example"><pre>
-RewriteEngine on
-RewriteBase /~quux/
-RewriteRule ^<strong>hotsheet/</strong>(.*)$ <strong>http://www.tstimpreso.com/hotsheet/</strong>$1 [<strong>P</strong>]
-</pre></div>
-
-<div class="example"><pre>
-RewriteEngine on
-RewriteBase /~quux/
-RewriteRule ^<strong>usa-news\.html</strong>$ <strong>http://www.quux-corp.com/news/index.html</strong> [<strong>P</strong>]
-</pre></div>
- </dd>
- </dl>
-
- </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
-<div class="section">
-<h2><a name="retrieve-missing-data" id="retrieve-missing-data">Retrieve Missing Data from Intranet</a></h2>
-
-
-
- <dl>
- <dt>Description:</dt>
-
- <dd>
- <p>This is a tricky way of virtually running a corporate
- (external) Internet web server
- (<code>www.quux-corp.dom</code>), while actually keeping
- and maintaining its data on an (internal) Intranet web server
- (<code>www2.quux-corp.dom</code>) which is protected by a
- firewall. The trick is that the external web server retrieves
- the requested data on-the-fly from the internal
- one.</p>
- </dd>
-
- <dt>Solution:</dt>
-
- <dd>
- <p>First, we must make sure that our firewall still
- protects the internal web server and only the
- external web server is allowed to retrieve data from it.
- On a packet-filtering firewall, for instance, we could
- configure a firewall ruleset like the following:</p>
-
-<div class="example"><pre>
-<strong>ALLOW</strong> Host www.quux-corp.dom Port >1024 --> Host www2.quux-corp.dom Port <strong>80</strong>
-<strong>DENY</strong> Host * Port * --> Host www2.quux-corp.dom Port <strong>80</strong>
-</pre></div>
-
- <p>Just adjust it to your actual configuration syntax.
- Now we can establish the <code class="module"><a href="../mod/mod_rewrite.html">mod_rewrite</a></code>
- rules which request the missing data in the background
- through the proxy throughput feature:</p>
-
-<div class="example"><pre>
-RewriteRule ^/~([^/]+)/?(.*) /home/$1/.www/$2 [C]
-# REQUEST_FILENAME usage below is correct in this per-server context example
-# because the rule that references REQUEST_FILENAME is chained to a rule that
-# sets REQUEST_FILENAME.
-RewriteCond %{REQUEST_FILENAME} <strong>!-f</strong>
-RewriteCond %{REQUEST_FILENAME} <strong>!-d</strong>
-RewriteRule ^/home/([^/]+)/.www/?(.*) http://<strong>www2</strong>.quux-corp.dom/~$1/pub/$2 [<strong>P</strong>]
-</pre></div>
- </dd>
- </dl>
-
- </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
-<div class="section">
-<h2><a name="new-mime-type" id="new-mime-type">New MIME-type, New Service</a></h2>
-
-
-
- <dl>
- <dt>Description:</dt>
-
- <dd>
- <p>On the net there are many nifty CGI programs. But
- their usage is usually boring, so a lot of webmasters
- don't use them. Even Apache's Action handler feature for
- MIME-types is only appropriate when the CGI programs
- don't need special URLs (actually <code>PATH_INFO</code>
- and <code>QUERY_STRINGS</code>) as their input. First,
- let us configure a new file type with extension
- <code>.scgi</code> (for secure CGI) which will be processed
- by the popular <code>cgiwrap</code> program. The problem
- here is that for instance if we use a Homogeneous URL Layout
- (see above) a file inside the user homedirs might have a URL
- like <code>/u/user/foo/bar.scgi</code>, but
- <code>cgiwrap</code> needs URLs in the form
- <code>/~user/foo/bar.scgi/</code>. The following rule
- solves the problem:</p>
-
-<div class="example"><pre>
-RewriteRule ^/[uge]/<strong>([^/]+)</strong>/\.www/(.+)\.scgi(.*) ...
-... /internal/cgi/user/cgiwrap/~<strong>$1</strong>/$2.scgi$3 [NS,<strong>T=application/x-http-cgi</strong>]
-</pre></div>
-
- <p>Or assume we have some more nifty programs:
- <code>wwwlog</code> (which displays the
- <code>access.log</code> for a URL subtree) and
- <code>wwwidx</code> (which runs Glimpse on a URL
- subtree). We have to provide the URL area to these
- programs so they know which area they are really working with.
- But usually this is complicated, because they may still be
- requested by the alternate URL form, i.e., typically we would
- run the <code>swwidx</code> program from within
- <code>/u/user/foo/</code> via hyperlink to</p>
-
-<div class="example"><pre>
-/internal/cgi/user/swwidx?i=/u/user/foo/
-</pre></div>
-
- <p>which is ugly, because we have to hard-code
- <strong>both</strong> the location of the area
- <strong>and</strong> the location of the CGI inside the
- hyperlink. When we have to reorganize, we spend a
- lot of time changing the various hyperlinks.</p>
- </dd>
-
- <dt>Solution:</dt>
-
- <dd>
- <p>The solution here is to provide a special new URL format
- which automatically leads to the proper CGI invocation.
- We configure the following:</p>
-
-<div class="example"><pre>
-RewriteRule ^/([uge])/([^/]+)(/?.*)/\* /internal/cgi/user/wwwidx?i=/$1/$2$3/
-RewriteRule ^/([uge])/([^/]+)(/?.*):log /internal/cgi/user/wwwlog?f=/$1/$2$3
-</pre></div>
-
- <p>Now the hyperlink to search at
- <code>/u/user/foo/</code> reads only</p>
-
-<div class="example"><pre>
-HREF="*"
-</pre></div>
-
- <p>which internally gets automatically transformed to</p>
-
-<div class="example"><pre>
-/internal/cgi/user/wwwidx?i=/u/user/foo/
-</pre></div>
-
- <p>The same approach leads to an invocation for the
- access log CGI program when the hyperlink
- <code>:log</code> gets used.</p>
- </dd>
- </dl>
-
- </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
-<div class="section">
-<h2><a name="mass-virtual-hosting" id="mass-virtual-hosting">Mass Virtual Hosting</a></h2>
-
-
-
- <dl>
- <dt>Description:</dt>
-
- <dd>
- <p>The <code class="directive"><a href="../mod/core.html#virtualhost"><VirtualHost></a></code> feature of Apache is nice
- and works great when you just have a few dozen
- virtual hosts. But when you are an ISP and have hundreds of
- virtual hosts, this feature is suboptimal.</p>
- </dd>
-
- <dt>Solution:</dt>
-
- <dd>
- <p>To provide this feature we map the remote web page or even
- the complete remote web area to our namespace using the
- <dfn>Proxy Throughput</dfn> feature (flag <code>[P]</code>):</p>
-
-<div class="example"><pre>
-##
-## vhost.map
-##
-www.vhost1.dom:80 /path/to/docroot/vhost1
-www.vhost2.dom:80 /path/to/docroot/vhost2
- :
-www.vhostN.dom:80 /path/to/docroot/vhostN
-</pre></div>
-
-<div class="example"><pre>
-##
-## httpd.conf
-##
- :
-# use the canonical hostname on redirects, etc.
-UseCanonicalName on
-
- :
-# add the virtual host in front of the CLF-format
-CustomLog /path/to/access_log "%{VHOST}e %h %l %u %t \"%r\" %>s %b"
- :
-
-# enable the rewriting engine in the main server
-RewriteEngine on
-
-# define two maps: one for fixing the URL and one which defines
-# the available virtual hosts with their corresponding
-# DocumentRoot.
-RewriteMap lowercase int:tolower
-RewriteMap vhost txt:/path/to/vhost.map
-
-# Now do the actual virtual host mapping
-# via a huge and complicated single rule:
-#
-# 1. make sure we don't map for common locations
-RewriteCond %{REQUEST_URI} !^/commonurl1/.*
-RewriteCond %{REQUEST_URI} !^/commonurl2/.*
- :
-RewriteCond %{REQUEST_URI} !^/commonurlN/.*
-#
-# 2. make sure we have a Host header, because
-# currently our approach only supports
-# virtual hosting through this header
-RewriteCond %{HTTP_HOST} !^$
-#
-# 3. lowercase the hostname
-RewriteCond ${lowercase:%{HTTP_HOST}|NONE} ^(.+)$
-#
-# 4. lookup this hostname in vhost.map and
-# remember it only when it is a path
-# (and not "NONE" from above)
-RewriteCond ${vhost:%1} ^(/.*)$
-#
-# 5. finally we can map the URL to its docroot location
-# and remember the virtual host for logging purposes
-RewriteRule ^/(.*)$ %1/$1 [E=VHOST:${lowercase:%{HTTP_HOST}}]
- :
-</pre></div>
- </dd>
- </dl>
-
- </div></div>
+</div>
<div class="bottomlang">
<p><span>Available Languages: </span><a href="../en/rewrite/rewrite_guide.html" title="English"> en </a> |
<a href="../fr/rewrite/rewrite_guide.html" hreflang="fr" rel="alternate" title="Français"> fr </a></p>
useful examples</a></seealso>
<seealso><a href="tech.html">Technical details</a></seealso>
- <section id="trailingslash">
-
- <title>Trailing Slash Problem</title>
-
- <dl>
- <dt>Description:</dt>
-
- <dd><p>The vast majority of "trailing slash" problems can be dealt
- with using the techniques discussed in the <a
- href="http://httpd.apache.org/docs/misc/FAQ-E.html#set-servername">FAQ
- entry</a>. However, occasionally, there is a need to use mod_rewrite
- to handle a case where a missing trailing slash causes a URL to
- fail. This can happen, for example, after a series of complex
- rewrite rules.</p>
- </dd>
-
- <dt>Solution:</dt>
-
- <dd>
- <p>The solution to this subtle problem is to let the server
- add the trailing slash automatically. To do this
- correctly we have to use an external redirect, so that the
- browser correctly requests subsequent images etc. If we
- only did an internal rewrite, this would only work for the
- directory page, but would go wrong when any images are
- included into this page with relative URLs, because the
- browser would resolve them against the wrong base URL. For
- instance, a request for <code>image.gif</code> in
- <code>/~quux/foo/index.html</code> would become
- <code>/~quux/image.gif</code> without the external
- redirect!</p>
-
- <p>So, to do this trick we write:</p>
-
-<example><pre>
-RewriteEngine on
-RewriteBase /~quux/
-RewriteRule ^foo<strong>$</strong> foo<strong>/</strong> [<strong>R</strong>]
-</pre></example>
-
- <p>Alternatively, you can put the following in a
- top-level <code>.htaccess</code> file in the content directory.
- Note that this creates some processing overhead, because the
- <code>-d</code> test runs for every request that does not end in a slash.</p>
-
-<example><pre>
-RewriteEngine on
-RewriteBase /~quux/
-RewriteCond %{REQUEST_FILENAME} <strong>-d</strong>
-RewriteRule ^(.+<strong>[^/]</strong>)$ $1<strong>/</strong> [R]
-</pre></example>
- </dd>
- </dl>
-
- </section>
-
- <section id="setenvvars">
-
- <title>Set Environment Variables According To URL Parts</title>
-
- <dl>
- <dt>Description:</dt>
-
- <dd>
- <p>Perhaps you want to keep status information between
- requests and use the URL to encode it. But you don't want
- to use a CGI wrapper for all pages just to strip out this
- information.</p>
- </dd>
-
- <dt>Solution:</dt>
-
- <dd>
- <p>We use a rewrite rule to strip out the status information
- and remember it via an environment variable which can be
- later dereferenced from within XSSI or CGI. This way a
- URL <code>/foo/S=java/bar/</code> gets translated to
- <code>/foo/bar/</code> and the environment variable named
- <code>STATUS</code> is set to the value "java".</p>
-
-<example><pre>
-RewriteEngine on
-RewriteRule ^(.*)/<strong>S=([^/]+)</strong>/(.*) $1/$3 [E=<strong>STATUS:$2</strong>]
-</pre></example>
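-
- <p>As an illustration only, an SSI-enabled page could then read the
- value back with the <code>echo</code> element of
- <module>mod_include</module>. This sketch assumes SSI processing is
- enabled for the page; in per-directory context the variable may
- instead appear as <code>REDIRECT_STATUS</code> after the internal
- redirect:</p>
-
-<example><pre>
-&lt;!-- echoes the value captured from the S=... URL segment --&gt;
-&lt;!--#echo var="STATUS" --&gt;
-</pre></example>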
- </dd>
- </dl>
-
- </section>
-
- <section id="redirecthome">
-
- <title>Redirect Homedirs For Foreigners</title>
-
- <dl>
- <dt>Description:</dt>
-
- <dd>
- <p>We want to redirect homedir URLs to another webserver
- <code>www.somewhere.com</code> when the requesting user
- is not in the local domain
- <code>ourdomain.com</code>. This is sometimes used in
- virtual host contexts.</p>
- </dd>
-
- <dt>Solution:</dt>
-
- <dd>
- <p>Just a rewrite condition:</p>
-
-<example><pre>
-RewriteEngine on
-RewriteCond %{REMOTE_HOST} <strong>!^.+\.ourdomain\.com$</strong>
-RewriteRule ^(/~.+) http://www.somewhere.com$1 [R,L]
-</pre></example>
- </dd>
- </dl>
-
- </section>
-
- <section id="redirectanchors">
-
- <title>Redirecting Anchors</title>
-
- <dl>
- <dt>Description:</dt>
-
- <dd>
- <p>By default, redirecting to an HTML anchor doesn't work,
- because mod_rewrite escapes the <code>#</code> character,
- turning it into <code>%23</code>. This, in turn, breaks the
- redirection.</p>
- </dd>
-
- <dt>Solution:</dt>
-
- <dd>
- <p>Use the <code>[NE]</code> flag on the
- <code>RewriteRule</code>. NE stands for No Escape.
- </p>
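-
- <p>As a minimal sketch, assuming a hypothetical old page
- <code>/old/page.html</code> whose content now lives under the
- <code>section2</code> anchor of <code>/new/page.html</code>
- (both paths are placeholders, not taken from this guide),
- a rule along these lines would keep the <code>#</code> intact:</p>
-
-<example><pre>
-RewriteEngine on
-# [NE] stops mod_rewrite from escaping "#" to "%23" in the redirect target
-RewriteRule ^/old/page\.html$ /new/page.html#section2 [NE,R]
-</pre></example>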
- </dd>
- </dl>
-
- </section>
-
- <section id="time-dependent">
-
- <title>Time-Dependent Rewriting</title>
-
- <dl>
- <dt>Description:</dt>
-
- <dd>
- <p>For tricks like time-dependent content, many
- webmasters still use CGI scripts which, for
- instance, redirect to specialized pages. How can this be done
- via <module>mod_rewrite</module>?</p>
- </dd>
-
- <dt>Solution:</dt>
-
- <dd>
- <p>There are a lot of variables named <code>TIME_xxx</code>
- for rewrite conditions. In conjunction with the special
- lexicographic comparison patterns <code>&lt;STRING</code>,
- <code>&gt;STRING</code> and <code>=STRING</code> we can
- do time-dependent redirects:</p>
-
-<example><pre>
-RewriteEngine on
-RewriteCond %{TIME_HOUR}%{TIME_MIN} &gt;0700
-RewriteCond %{TIME_HOUR}%{TIME_MIN} &lt;1900
-RewriteRule ^foo\.html$ foo.day.html
-RewriteRule ^foo\.html$ foo.night.html
-</pre></example>
-
- <p>This provides the content of <code>foo.day.html</code>
- under the URL <code>foo.html</code> from
- <code>07:01-18:59</code> and at the remaining time the
- contents of <code>foo.night.html</code>. Just a nice
- feature for a homepage...</p>
-
- <note type="warning"><module>mod_cache</module>, intermediate proxies
- and browsers may each cache responses and cause either page to be
- shown outside of the configured time window.
- <module>mod_expires</module> may be used to control this
- effect.</note>
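-
- <p>One possible way to do that, sketched here under the assumption
- that <module>mod_expires</module> is loaded and that the two variant
- files are named as in the example above, is to mark both files as
- expiring immediately so that caches revalidate them:</p>
-
-<example><pre>
-# match the rewritten files, since they are what is actually served
-&lt;FilesMatch "^foo\.(day|night)\.html$"&gt;
-    ExpiresActive On
-    ExpiresDefault "access plus 0 seconds"
-&lt;/FilesMatch&gt;
-</pre></example>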
-
- </dd>
- </dl>
-
- </section>
-
- <section id="structuredhomedirs">
-
- <title>Structured Homedirs</title>
-
- <dl>
- <dt>Description:</dt>
-
- <dd>
- <p>Some sites with thousands of users use a
- structured homedir layout, <em>i.e.</em> each homedir is in a
- subdirectory which begins (for instance) with the first
- character of the username. So, <code>/~foo/anypath</code>
- is <code>/home/<strong>f</strong>/foo/.www/anypath</code>
- while <code>/~bar/anypath</code> is
- <code>/home/<strong>b</strong>/bar/.www/anypath</code>.</p>
- </dd>
-
- <dt>Solution:</dt>
-
- <dd>
- <p>We use the following ruleset to expand the tilde URLs
- into the above layout.</p>
-
-<example><pre>
-RewriteEngine on
-RewriteRule ^/~(<strong>([a-z])</strong>[a-z0-9]+)(.*) /home/<strong>$2</strong>/$1/.www$3
-</pre></example>
- </dd>
- </dl>
-
- </section>
-
- <section id="dynamic-mirror">
-
- <title>Dynamic Mirror</title>
-
- <dl>
- <dt>Description:</dt>
-
- <dd>
- <p>Assume there are nice web pages on remote hosts we want
- to bring into our namespace. For FTP servers we would use
- the <code>mirror</code> program which actually maintains an
- explicit up-to-date copy of the remote data on the local
- machine. For a web server we could use the program
- <code>webcopy</code> which runs via HTTP. But both
- techniques have a major drawback: The local copy is
- always only as up-to-date as the last time we ran the program. It
- would be much better if the mirror was not a static one we
- have to establish explicitly. Instead we want a dynamic
- mirror with data which gets updated automatically
- as needed on the remote host(s).</p>
- </dd>
-
- <dt>Solution:</dt>
-
- <dd>
- <p>To provide this feature we map the remote web page or even
- the complete remote web area to our namespace by the use
- of the <dfn>Proxy Throughput</dfn> feature
- (flag <code>[P]</code>):</p>
-
-<example><pre>
-RewriteEngine on
-RewriteBase /~quux/
-RewriteRule ^<strong>hotsheet/</strong>(.*)$ <strong>http://www.tstimpreso.com/hotsheet/</strong>$1 [<strong>P</strong>]
-</pre></example>
-
-<example><pre>
-RewriteEngine on
-RewriteBase /~quux/
-RewriteRule ^<strong>usa-news\.html</strong>$ <strong>http://www.quux-corp.com/news/index.html</strong> [<strong>P</strong>]
-</pre></example>
- </dd>
- </dl>
-
- </section>
-
- <section id="retrieve-missing-data">
-
- <title>Retrieve Missing Data from Intranet</title>
-
- <dl>
- <dt>Description:</dt>
-
- <dd>
- <p>This is a tricky way of virtually running a corporate
- (external) Internet web server
- (<code>www.quux-corp.dom</code>), while actually keeping
- and maintaining its data on an (internal) Intranet web server
- (<code>www2.quux-corp.dom</code>) which is protected by a
- firewall. The trick is that the external web server retrieves
- the requested data on-the-fly from the internal
- one.</p>
- </dd>
-
- <dt>Solution:</dt>
-
- <dd>
- <p>First, we must make sure that our firewall still
- protects the internal web server and only the
- external web server is allowed to retrieve data from it.
- On a packet-filtering firewall, for instance, we could
- configure a firewall ruleset like the following:</p>
-
-<example><pre>
-<strong>ALLOW</strong> Host www.quux-corp.dom Port >1024 --> Host www2.quux-corp.dom Port <strong>80</strong>
-<strong>DENY</strong> Host * Port * --> Host www2.quux-corp.dom Port <strong>80</strong>
-</pre></example>
-
- <p>Just adjust it to your actual configuration syntax.
- Now we can establish the <module>mod_rewrite</module>
- rules which request the missing data in the background
- through the proxy throughput feature:</p>
-
-<example><pre>
-RewriteRule ^/~([^/]+)/?(.*) /home/$1/.www/$2 [C]
-# REQUEST_FILENAME usage below is correct in this per-server context example
-# because the rule that references REQUEST_FILENAME is chained to a rule that
-# sets REQUEST_FILENAME.
-RewriteCond %{REQUEST_FILENAME} <strong>!-f</strong>
-RewriteCond %{REQUEST_FILENAME} <strong>!-d</strong>
-RewriteRule ^/home/([^/]+)/\.www/?(.*) http://<strong>www2</strong>.quux-corp.dom/~$1/pub/$2 [<strong>P</strong>]
-</pre></example>
- </dd>
- </dl>
-
- </section>
-
- <section id="new-mime-type">
-
- <title>New MIME-type, New Service</title>
-
- <dl>
- <dt>Description:</dt>
-
- <dd>
- <p>On the net there are many nifty CGI programs. But
- their usage is usually boring, so a lot of webmasters
- don't use them. Even Apache's Action handler feature for
- MIME-types is only appropriate when the CGI programs
- don't need special URLs (actually <code>PATH_INFO</code>
- and <code>QUERY_STRING</code>) as their input. First,
- let us configure a new file type with extension
- <code>.scgi</code> (for secure CGI) which will be processed
- by the popular <code>cgiwrap</code> program. The problem
- here is that for instance if we use a Homogeneous URL Layout
- (see above) a file inside the user homedirs might have a URL
- like <code>/u/user/foo/bar.scgi</code>, but
- <code>cgiwrap</code> needs URLs in the form
- <code>/~user/foo/bar.scgi/</code>. The following rule
- solves the problem:</p>
-
-<example><pre>
-RewriteRule ^/[uge]/<strong>([^/]+)</strong>/\.www/(.+)\.scgi(.*) ...
-... /internal/cgi/user/cgiwrap/~<strong>$1</strong>/$2.scgi$3 [NS,<strong>T=application/x-http-cgi</strong>]
-</pre></example>
-
- <p>Or assume we have some more nifty programs:
- <code>wwwlog</code> (which displays the
- <code>access.log</code> for a URL subtree) and
- <code>wwwidx</code> (which runs Glimpse on a URL
- subtree). We have to provide the URL area to these
- programs so they know which area they are really working with.
- But usually this is complicated, because they may still be
- requested by the alternate URL form, i.e., typically we would
- run the <code>wwwidx</code> program from within
- <code>/u/user/foo/</code> via hyperlink to</p>
-
-<example><pre>
-/internal/cgi/user/wwwidx?i=/u/user/foo/
-</pre></example>
-
- <p>which is ugly, because we have to hard-code
- <strong>both</strong> the location of the area
- <strong>and</strong> the location of the CGI inside the
- hyperlink. When we have to reorganize, we spend a
- lot of time changing the various hyperlinks.</p>
- </dd>
-
- <dt>Solution:</dt>
-
- <dd>
- <p>The solution here is to provide a special new URL format
- which automatically leads to the proper CGI invocation.
- We configure the following:</p>
-
-<example><pre>
-RewriteRule ^/([uge])/([^/]+)(/?.*)/\* /internal/cgi/user/wwwidx?i=/$1/$2$3/
-RewriteRule ^/([uge])/([^/]+)(/?.*):log /internal/cgi/user/wwwlog?f=/$1/$2$3
-</pre></example>
-
- <p>Now the hyperlink to search at
- <code>/u/user/foo/</code> reads only</p>
-
-<example><pre>
-HREF="*"
-</pre></example>
-
- <p>which internally gets automatically transformed to</p>
-
-<example><pre>
-/internal/cgi/user/wwwidx?i=/u/user/foo/
-</pre></example>
-
- <p>The same approach leads to an invocation for the
- access log CGI program when the hyperlink
- <code>:log</code> gets used.</p>
- </dd>
- </dl>
-
- </section>
-
- <section id="mass-virtual-hosting">
-
- <title>Mass Virtual Hosting</title>
-
- <dl>
- <dt>Description:</dt>
-
- <dd>
- <p>The <directive type="section" module="core"
- >VirtualHost</directive> feature of Apache is nice
- and works great when you just have a few dozen
- virtual hosts. But when you are an ISP and have hundreds of
- virtual hosts, this feature is suboptimal.</p>
- </dd>
-
- <dt>Solution:</dt>
-
- <dd>
- <p>To provide this feature we map each requested hostname to
- its document root using a plain-text lookup table
- (<code>vhost.map</code>) consulted through <code>RewriteMap</code>:</p>
-
-<example><pre>
-##
-## vhost.map
-##
-www.vhost1.dom:80 /path/to/docroot/vhost1
-www.vhost2.dom:80 /path/to/docroot/vhost2
- :
-www.vhostN.dom:80 /path/to/docroot/vhostN
-</pre></example>
-
-<example><pre>
-##
-## httpd.conf
-##
- :
-# use the canonical hostname on redirects, etc.
-UseCanonicalName on
-
- :
-# add the virtual host in front of the CLF-format
-CustomLog /path/to/access_log "%{VHOST}e %h %l %u %t \"%r\" %>s %b"
- :
-
-# enable the rewriting engine in the main server
-RewriteEngine on
-
-# define two maps: one for fixing the URL and one which defines
-# the available virtual hosts with their corresponding
-# DocumentRoot.
-RewriteMap lowercase int:tolower
-RewriteMap vhost txt:/path/to/vhost.map
-
-# Now do the actual virtual host mapping
-# via a huge and complicated single rule:
-#
-# 1. make sure we don't map for common locations
-RewriteCond %{REQUEST_URI} !^/commonurl1/.*
-RewriteCond %{REQUEST_URI} !^/commonurl2/.*
- :
-RewriteCond %{REQUEST_URI} !^/commonurlN/.*
-#
-# 2. make sure we have a Host header, because
-# currently our approach only supports
-# virtual hosting through this header
-RewriteCond %{HTTP_HOST} !^$
-#
-# 3. lowercase the hostname
-RewriteCond ${lowercase:%{HTTP_HOST}|NONE} ^(.+)$
-#
-# 4. lookup this hostname in vhost.map and
-# remember it only when it is a path
-# (and not "NONE" from above)
-RewriteCond ${vhost:%1} ^(/.*)$
-#
-# 5. finally we can map the URL to its docroot location
-# and remember the virtual host for logging purposes
-RewriteRule ^/(.*)$ %1/$1 [E=VHOST:${lowercase:%{HTTP_HOST}}]
- :
-</pre></example>
- </dd>
- </dl>
-
- </section>
-
</manualpage>