From a3a24e2ebd3cbe768210b44dc52a21eab25f245d Mon Sep 17 00:00:00 2001
From: Rich Bowen
Date: Thu, 17 Mar 2005 01:47:45 +0000
Subject: [PATCH] Moved some of the gorey details out of the module reference doc and into the supporting documentation. Might want to move some more, too. Haven't decided yet.

git-svn-id: https://svn.apache.org/repos/asf/httpd/httpd/trunk@157850 13f79535-47bb-0310-9956-ffa450edef68
---
 docs/manual/mod/mod_rewrite.html.en | 188 ++--------------------------
 docs/manual/mod/mod_rewrite.xml     | 183 ++--------------------------
 2 files changed, 21 insertions(+), 350 deletions(-)

diff --git a/docs/manual/mod/mod_rewrite.html.en b/docs/manual/mod/mod_rewrite.html.en
index 6b30269ae5..cf23dd8087 100644
--- a/docs/manual/mod/mod_rewrite.html.en
+++ b/docs/manual/mod/mod_rewrite.html.en
@@ -31,31 +31,6 @@ URLs on the fly
Compatibility:Available in Apache 1.3 and later

Summary

-
-

``The great thing about mod_rewrite is it gives you - all the configurability and flexibility of Sendmail. - The downside to mod_rewrite is that it gives you all - the configurability and flexibility of Sendmail.''

- -

-- Brian Behlendorf
- Apache Group

- -
- -
-

`` Despite the tons of examples and docs, - mod_rewrite is voodoo. Damned cool voodoo, but still - voodoo. ''

- -

-- Brian Moore
- bem@news.cmc.net

- -
- - -

Welcome to mod_rewrite, the Swiss Army Knife of URL - manipulation!

-

This module uses a rule-based rewriting engine (based on a regular-expression parser) to rewrite requested URLs on the fly. It supports an unlimited number of rules and an
@@ -75,20 +50,8 @@ URLs on the fly
sub-processing, external request redirection or even to an internal proxy throughput.

-

But all this functionality and flexibility has its - drawback: complexity. So don't expect to understand this - entire module in just one day.

- -

This module was invented and originally written in April - 1996 and gifted exclusively to the The Apache Group in July 1997 - by

- -

- Ralf S. - Engelschall
- rse@engelschall.com
- www.engelschall.com -

+

Further details, discussion, and examples, are provided in the + detailed mod_rewrite documentation.

Directives

Topics

top
-

Internal Processing

- -

The internal processing of this module is very complex but - needs to be explained once even to the average user to avoid - common mistakes and to let you exploit its full - functionality.

- -

API Phases

- -

First you have to understand that when Apache processes a - HTTP request it does this in phases. A hook for each of these - phases is provided by the Apache API. Mod_rewrite uses two of - these hooks: the URL-to-filename translation hook which is - used after the HTTP request has been read but before any - authorization starts and the Fixup hook which is triggered - after the authorization phases and after the per-directory - config files (.htaccess) have been read, but - before the content handler is activated.

- -

So, after a request comes in and Apache has determined the - corresponding server (or virtual server) the rewriting engine - starts processing of all mod_rewrite directives from the - per-server configuration in the URL-to-filename phase. A few - steps later when the final data directories are found, the - per-directory configuration directives of mod_rewrite are - triggered in the Fixup phase. In both situations mod_rewrite - rewrites URLs either to new URLs or to filenames, although - there is no obvious distinction between them. This is a usage - of the API which was not intended to be this way when the API - was designed, but as of Apache 1.x this is the only way - mod_rewrite can operate. To make this point more clear - remember the following two points:

- -
    -
  1. Although mod_rewrite rewrites URLs to URLs, URLs to - filenames and even filenames to filenames, the API - currently provides only a URL-to-filename hook. In Apache - 2.0 the two missing hooks will be added to make the - processing more clear. But this point has no drawbacks for - the user, it is just a fact which should be remembered: - Apache does more in the URL-to-filename hook than the API - intends for it.
  2. - -
  3. - Unbelievably mod_rewrite provides URL manipulations in - per-directory context, i.e., within - .htaccess files, although these are reached - a very long time after the URLs have been translated to - filenames. It has to be this way because - .htaccess files live in the filesystem, so - processing has already reached this stage. In other - words: According to the API phases at this time it is too - late for any URL manipulations. To overcome this chicken - and egg problem mod_rewrite uses a trick: When you - manipulate a URL/filename in per-directory context - mod_rewrite first rewrites the filename back to its - corresponding URL (which is usually impossible, but see - the RewriteBase directive below for the - trick to achieve this) and then initiates a new internal - sub-request with the new URL. This restarts processing of - the API phases. - -

    Again mod_rewrite tries hard to make this complicated - step totally transparent to the user, but you should - remember here: While URL manipulations in per-server - context are really fast and efficient, per-directory - rewrites are slow and inefficient due to this chicken and - egg problem. But on the other hand this is the only way - mod_rewrite can provide (locally restricted) URL - manipulations to the average user.

    -
  4. -
- -

Don't forget these two points!

- - -

Ruleset Processing

- -

Now when mod_rewrite is triggered in these two API phases, it - reads the configured rulesets from its configuration - structure (which itself was either created on startup for - per-server context or during the directory walk of the Apache - kernel for per-directory context). Then the URL rewriting - engine is started with the contained ruleset (one or more - rules together with their conditions). The operation of the - URL rewriting engine itself is exactly the same for both - configuration contexts. Only the final result processing is - different.

- -

The order of rules in the ruleset is important because the - rewriting engine processes them in a special (and not very - obvious) order. The rule is this: The rewriting engine loops - through the ruleset rule by rule (RewriteRule directives) and - when a particular rule matches it optionally loops through - existing corresponding conditions (RewriteCond - directives). For historical reasons the conditions are given - first, and so the control flow is a little bit long-winded. See - Figure 1 for more details.

-

- [Needs graphics capability to display]
- Figure 1:The control flow through the rewriting ruleset -

-

As you can see, first the URL is matched against the - Pattern of each rule. When it fails mod_rewrite - immediately stops processing this rule and continues with the - next rule. If the Pattern matches, mod_rewrite looks - for corresponding rule conditions. If none are present, it - just substitutes the URL with a new value which is - constructed from the string Substitution and goes on - with its rule-looping. But if conditions exist, it starts an - inner loop for processing them in the order that they are - listed. For conditions the logic is different: we don't match - a pattern against the current URL. Instead we first create a - string TestString by expanding variables, - back-references, map lookups, etc. and then we try - to match CondPattern against it. If the pattern - doesn't match, the complete set of conditions and the - corresponding rule fails. If the pattern matches, then the - next condition is processed until no more conditions are - available. If all conditions match, processing is continued - with the substitution of the URL with - Substitution.

- - - -

Quoting Special Characters

+

Quoting Special Characters

As of Apache 1.3.20, special characters in TestString and Substitution strings can be
@@ -245,9 +84,9 @@ URLs on the fly
dollar-sign character in a Substitution string by using '\$'; this keeps mod_rewrite from trying to treat it as a backreference.

- - -

Regex Back-Reference Availability

+
top
+
+

Regex Back-Reference Availability

One important thing here has to be remembered: Whenever you use parentheses in Pattern or in one of the
@@ -267,7 +106,6 @@ URLs on the fly
reading the following documentation of the available directives.

-
top

Environment Variables

@@ -297,11 +135,11 @@ SCRIPT_URI=http://en1.engelschall.com/u/rse/

Practical Solutions

-

We also have an URL - Rewriting Guide available, which provides a collection of - practical solutions for URL-based problems. There you can - find real-life rulesets and additional information about - mod_rewrite.

+

For numerous examples of common, and not-so-common, uses for + mod_rewrite, see the Rewrite + Guide, and the Advanced Rewrite + Guide documents.

+
top

RewriteBase Directive

diff --git a/docs/manual/mod/mod_rewrite.xml b/docs/manual/mod/mod_rewrite.xml
index 5d9c1aaa4c..82cacece6e 100644
--- a/docs/manual/mod/mod_rewrite.xml
+++ b/docs/manual/mod/mod_rewrite.xml
@@ -33,31 +33,6 @@ URLs on the fly
Available in Apache 1.3 and later
-
-

``The great thing about mod_rewrite is it gives you - all the configurability and flexibility of Sendmail. - The downside to mod_rewrite is that it gives you all - the configurability and flexibility of Sendmail.''

- -

-- Brian Behlendorf
- Apache Group

- -
- -
-

`` Despite the tons of examples and docs, - mod_rewrite is voodoo. Damned cool voodoo, but still - voodoo. ''

- -

-- Brian Moore
- bem@news.cmc.net

- -
- - -

Welcome to mod_rewrite, the Swiss Army Knife of URL - manipulation!

-

This module uses a rule-based rewriting engine (based on a regular-expression parser) to rewrite requested URLs on the fly. It supports an unlimited number of rules and an
@@ -77,151 +52,10 @@ URLs on the fly
sub-processing, external request redirection or even to an internal proxy throughput.

-

But all this functionality and flexibility has its - drawback: complexity. So don't expect to understand this - entire module in just one day.

- -

This module was invented and originally written in April - 1996 and gifted exclusively to the The Apache Group in July 1997 - by

- -

- Ralf S. - Engelschall
- rse@engelschall.com
- www.engelschall.com -

+

Further details, discussion, and examples, are provided in the + detailed mod_rewrite documentation.

-
Internal Processing - -

The internal processing of this module is very complex but - needs to be explained once even to the average user to avoid - common mistakes and to let you exploit its full - functionality.

- -
API Phases - -

First you have to understand that when Apache processes a - HTTP request it does this in phases. A hook for each of these - phases is provided by the Apache API. Mod_rewrite uses two of - these hooks: the URL-to-filename translation hook which is - used after the HTTP request has been read but before any - authorization starts and the Fixup hook which is triggered - after the authorization phases and after the per-directory - config files (.htaccess) have been read, but - before the content handler is activated.

- -

So, after a request comes in and Apache has determined the - corresponding server (or virtual server) the rewriting engine - starts processing of all mod_rewrite directives from the - per-server configuration in the URL-to-filename phase. A few - steps later when the final data directories are found, the - per-directory configuration directives of mod_rewrite are - triggered in the Fixup phase. In both situations mod_rewrite - rewrites URLs either to new URLs or to filenames, although - there is no obvious distinction between them. This is a usage - of the API which was not intended to be this way when the API - was designed, but as of Apache 1.x this is the only way - mod_rewrite can operate. To make this point more clear - remember the following two points:

- -
    -
  1. Although mod_rewrite rewrites URLs to URLs, URLs to - filenames and even filenames to filenames, the API - currently provides only a URL-to-filename hook. In Apache - 2.0 the two missing hooks will be added to make the - processing more clear. But this point has no drawbacks for - the user, it is just a fact which should be remembered: - Apache does more in the URL-to-filename hook than the API - intends for it.
  2. - -
  3. - Unbelievably mod_rewrite provides URL manipulations in - per-directory context, i.e., within - .htaccess files, although these are reached - a very long time after the URLs have been translated to - filenames. It has to be this way because - .htaccess files live in the filesystem, so - processing has already reached this stage. In other - words: According to the API phases at this time it is too - late for any URL manipulations. To overcome this chicken - and egg problem mod_rewrite uses a trick: When you - manipulate a URL/filename in per-directory context - mod_rewrite first rewrites the filename back to its - corresponding URL (which is usually impossible, but see - the RewriteBase directive below for the - trick to achieve this) and then initiates a new internal - sub-request with the new URL. This restarts processing of - the API phases. - -

    Again mod_rewrite tries hard to make this complicated - step totally transparent to the user, but you should - remember here: While URL manipulations in per-server - context are really fast and efficient, per-directory - rewrites are slow and inefficient due to this chicken and - egg problem. But on the other hand this is the only way - mod_rewrite can provide (locally restricted) URL - manipulations to the average user.

    -
  4. -
- -

Don't forget these two points!

-
- -
Ruleset Processing - -

Now when mod_rewrite is triggered in these two API phases, it - reads the configured rulesets from its configuration - structure (which itself was either created on startup for - per-server context or during the directory walk of the Apache - kernel for per-directory context). Then the URL rewriting - engine is started with the contained ruleset (one or more - rules together with their conditions). The operation of the - URL rewriting engine itself is exactly the same for both - configuration contexts. Only the final result processing is - different.

- -

The order of rules in the ruleset is important because the - rewriting engine processes them in a special (and not very - obvious) order. The rule is this: The rewriting engine loops - through the ruleset rule by rule (RewriteRule directives) and - when a particular rule matches it optionally loops through - existing corresponding conditions (RewriteCond - directives). For historical reasons the conditions are given - first, and so the control flow is a little bit long-winded. See - Figure 1 for more details.

-

- [Needs graphics capability to display]
- Figure 1:The control flow through the rewriting ruleset -

-

As you can see, first the URL is matched against the - Pattern of each rule. When it fails mod_rewrite - immediately stops processing this rule and continues with the - next rule. If the Pattern matches, mod_rewrite looks - for corresponding rule conditions. If none are present, it - just substitutes the URL with a new value which is - constructed from the string Substitution and goes on - with its rule-looping. But if conditions exist, it starts an - inner loop for processing them in the order that they are - listed. For conditions the logic is different: we don't match - a pattern against the current URL. Instead we first create a - string TestString by expanding variables, - back-references, map lookups, etc. and then we try - to match CondPattern against it. If the pattern - doesn't match, the complete set of conditions and the - corresponding rule fails. If the pattern matches, then the - next condition is processed until no more conditions are - available. If all conditions match, processing is continued - with the substitution of the URL with - Substitution.

- -
-
Quoting Special Characters

As of Apache 1.3.20, special characters in
@@ -256,7 +90,6 @@ URLs on the fly
directives.

-
Environment Variables
@@ -287,13 +120,13 @@ SCRIPT_URI=http://en1.engelschall.com/u/rse/
Practical Solutions
-

We also have an URL - Rewriting Guide available, which provides a collection of - practical solutions for URL-based problems. There you can - find real-life rulesets and additional information about - mod_rewrite.

-
+

For numerous examples of common, and not-so-common, uses for + mod_rewrite, see the Rewrite + Guide, and the Advanced Rewrite + Guide documents.

+
RewriteEngine
-- 
2.40.0