<!DOCTYPE modulesynopsis SYSTEM "../style/modulesynopsis.dtd">
<?xml-stylesheet type="text/xsl" href="../style/manual.en.xsl"?>
<!-- $Revision: 1.7 $ -->

<!--
 Copyright 2004 The Apache Software Foundation

 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->

<modulesynopsis metafile="mod_filter.xml.meta">

<name>mod_filter</name>
<description>Context-sensitive smart filter configuration module</description>
<status>Experimental</status>
<sourcefile>mod_filter.c</sourcefile>
<identifier>filter_module</identifier>
<compatibility>Version 2.1 and higher</compatibility>

<summary>
<p>This module enables smart, context-sensitive configuration of
output content filters. For example, Apache can be configured to
process different content types through different filters, even
when the content type is not known in advance (e.g. in a proxy).</p>

<p><module>mod_filter</module> works by introducing indirection into
the filter chain. Instead of inserting filters in the chain, we insert
a filter harness which in turn dispatches conditionally
to a filter provider. Any content filter may be used as a provider
to <module>mod_filter</module>; no change to existing filter modules is
required (although it may be possible to simplify them).</p>
</summary>

<section id="smart"><title>Smart Filtering</title>
<p>In the traditional filtering model, filters are inserted unconditionally
using <directive module="mod_mime">AddOutputFilter</directive> and family.
Each filter then needs to determine whether to run, and server
administrators have little flexibility to configure the chain
dynamically.</p>

<p><module>mod_filter</module>, by contrast, gives server administrators a
great deal of flexibility in configuring the filter chain. In fact,
filters can be inserted based on any request header, response header
or environment variable. This generalises the limited flexibility offered
by <directive module="core">AddOutputFilterByType</directive>, and fixes
it to work correctly with dynamic content, regardless of the
content generator. The ability to dispatch based on environment
variables offers the full flexibility of configuration with
<module>mod_rewrite</module> to anyone who needs it.</p>
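
<p>As an illustration of dispatching on an environment variable, the
following sketch lets <module>mod_rewrite</module> set a variable and then
declares a smart filter that dispatches on it. The variable name
<code>small_screen</code>, the filter name <code>shrink</code> and the
provider name <code>downsample_filter</code> are hypothetical; the provider
is assumed to be registered by some loaded module.</p>

<example>
RewriteEngine On<br/>
RewriteCond %{HTTP_USER_AGENT} Mobile<br/>
RewriteRule .* - [E=small_screen:1]<br/>
<br/>
# Dispatch on the environment variable set above<br/>
FilterDeclare shrink env=small_screen<br/>
# Exact match: run the provider only when the value is "1"<br/>
FilterProvider shrink downsample_filter 1<br/>
FilterChain shrink
</example>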
</section>

<section id="terms"><title>Filter Declarations, Providers and Chains</title>
<p><img src="../images/mod_filter_old.gif" width="160" height="310"
alt="[This image displays the traditional filter model]"/><br />
<dfn>Figure 1:</dfn> The traditional filter model</p>

<p>In the traditional model, output filters are a simple chain
from the content generator (handler) to the client. This works well
provided the filter chain can be correctly configured, but presents
problems when the filters need to be configured dynamically based on
the outcome of the handler.</p>

<p><img src="../images/mod_filter_new.gif" width="423" height="331"
alt="[This image shows the mod_filter model]"/><br />
<dfn>Figure 2:</dfn> The <module>mod_filter</module> model</p>

<p><module>mod_filter</module> works by introducing indirection into
the filter chain. Instead of inserting filters in the chain, we insert
a filter harness which in turn dispatches conditionally
to a filter provider. Any content filter may be used as a provider
to <module>mod_filter</module>; no change to existing filter modules
is required (although it may be possible to simplify them). There can be
multiple providers for one filter, but no more than one provider will
run for any single request.</p>

<p>A filter chain comprises any number of instances of the filter
harness, each of which may have any number of providers. A special
case is that of a single provider with unconditional dispatch: this
is equivalent to inserting the provider filter directly into the chain.</p>
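
<p>The special case of a single provider with unconditional dispatch might
look like the sketch below. The filter name <code>ssi</code> is arbitrary,
and <code>INCLUDES</code> is assumed to be registered by a loaded module
such as <module>mod_include</module>. The effect should be roughly that of
inserting the <code>INCLUDES</code> filter directly into the chain.</p>

<example>
FilterDeclare ssi Content-Type<br/>
# "*" matches unconditionally, whatever the Content-Type<br/>
FilterProvider ssi INCLUDES *<br/>
FilterChain ssi
</example>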
</section>

<section id="config"><title>Configuring the Chain</title>
<p>There are three stages to configuring a filter chain with
<module>mod_filter</module>. For details of the directives, see below.</p>

<dl>
<dt>Declare Filters</dt>
<dd>The <directive module="mod_filter">FilterDeclare</directive> directive
declares a filter, assigning it a name and a dispatch criterion.</dd>

<dt>Register Providers</dt>
<dd>The <directive module="mod_filter">FilterProvider</directive>
directive registers a provider with a filter. The filter must have
been declared with
<directive module="mod_filter">FilterDeclare</directive>. The provider
must have been registered with <code>ap_register_output_filter</code>
by some module. The final argument to
<directive module="mod_filter">FilterProvider</directive> is a match
string that is checked against the filter's dispatch criterion to
determine whether to run this provider.</dd>

<dt>Configure the Chain</dt>
<dd>The above directives build components of a smart filter chain,
but do not configure it to run. The
<directive module="mod_filter">FilterChain</directive> directive builds
a filter chain from the smart filters declared, offering the flexibility
to insert filters at the beginning or end of the chain, remove a filter,
or clear the chain.</dd>
</dl>
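
<p>The three stages might be combined as in the following sketch, which
also uses <directive module="mod_filter">FilterChain</directive> to remove
the filter again for one location. The path <code>/raw</code> is
hypothetical, and the sketch assumes that the chain configured at server
level is inherited by the enclosed section.</p>

<example>
# 1. Declare the smart filter and its dispatch criterion<br/>
FilterDeclare SSI Content-Type<br/>
# 2. Register a provider and the match that selects it<br/>
FilterProvider SSI INCLUDES $text/html<br/>
# 3. Configure the chain so that the filter actually runs<br/>
FilterChain SSI<br/>
<br/>
&lt;Location /raw&gt;<br/>
# Remove the smart filter from the chain for this location<br/>
FilterChain -SSI<br/>
&lt;/Location&gt;
</example>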
</section>

<section id="examples"><title>Examples</title>
<dl>
<dt>Server side Includes (SSI)</dt>
<dd>A simple case of using <module>mod_filter</module> in place of
<directive module="core">AddOutputFilterByType</directive>:
<example>
FilterDeclare SSI Content-Type<br/>
FilterProvider SSI INCLUDES $text/html<br/>
FilterChain SSI
</example>
</dd>

<dt>Server side Includes (SSI)</dt>
<dd>The same as the above but dispatching on handler (classic
SSI behaviour; .shtml files get processed).
<example>
FilterDeclare SSI Handler<br/>
FilterProvider SSI INCLUDES server-parsed<br/>
FilterChain SSI
</example>
</dd>

<dt>Emulating mod_gzip with mod_deflate</dt>
<dd>Insert INFLATE filter only if "gzip" is NOT in the
Accept-Encoding header.
<example>
FilterDeclare gzip req=Accept-Encoding<br/>
FilterProvider gzip inflate !$gzip<br/>
FilterChain gzip
</example>
</dd>

<dt>Image Downsampling</dt>
<dd>Suppose we want to downsample all web images, and have filters
for GIF, JPEG and PNG.
<example>
FilterDeclare unpack Content-Type<br/>
FilterProvider unpack jpeg_unpack $image/jpeg<br/>
FilterProvider unpack gif_unpack $image/gif<br/>
FilterProvider unpack png_unpack $image/png<br/>
<br/>
FilterDeclare downsample Content-Type<br/>
FilterProvider downsample downsample_filter $image<br/>
FilterProtocol downsample "change=yes"<br/>
<br/>
FilterDeclare repack Content-Type<br/>
FilterProvider repack jpeg_pack $image/jpeg<br/>
FilterProvider repack gif_pack $image/gif<br/>
FilterProvider repack png_pack $image/png<br/>
&lt;Location /image-filter&gt;<br/>
FilterChain unpack downsample repack<br/>
&lt;/Location&gt;
</example>
</dd>
</dl>
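
<p>As a further sketch, dispatching on a request header can also be used
in the positive sense: here a compression filter runs only when the client
lists "gzip" in its Accept-Encoding header. The filter name
<code>compress</code> is arbitrary, and the example assumes that
<module>mod_deflate</module> is loaded and registers an output filter
named <code>DEFLATE</code>.</p>

<example>
FilterDeclare compress req=Accept-Encoding<br/>
# Substring match against the request header value<br/>
FilterProvider compress DEFLATE $gzip<br/>
FilterChain compress
</example>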
</section>

<section id="protocol"><title>Protocol Handling</title>
<p>Historically, each filter is responsible for ensuring that whatever
changes it makes are correctly represented in the HTTP response headers,
and that it does not run when it would make an illegal change. This
imposes a burden on filter authors to re-implement some common
functionality in every filter:</p>

<ul>
<li>Many filters will change the content, invalidating existing content
tags, checksums, hashes, and lengths.</li>

<li>Filters that require an entire, unbroken response as input need to
ensure they don't get byteranges from a backend.</li>

<li>Filters that transform output need to ensure they don't
violate a <code>Cache-Control: no-transform</code> header.</li>

<li>Filters may make responses uncacheable.</li>
</ul>

<p><module>mod_filter</module> aims to offer generic handling of these
details of filter implementation, reducing the complexity required of
content filter modules. This is work in progress; the
<directive module="mod_filter">FilterProtocol</directive> directive
implements some of this functionality, but there are no API calls yet.</p>

<p>At the same time, <module>mod_filter</module> should not interfere
with a filter that wants to handle all aspects of the protocol. By
default (i.e. in the absence of any
<directive module="mod_filter">FilterProtocol</directive> directives),
<module>mod_filter</module> will leave the headers untouched.</p>
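
<p>As a sketch of how these concerns might be expressed in configuration,
the following applies
<directive module="mod_filter">FilterProtocol</directive> flags to the
filters of the image downsampling example. The filter and provider names
are those illustrative names, not filters shipped with the server.</p>

<example>
# Applies to whichever provider the "downsample" filter runs:<br/>
# the content, and possibly its length, is changed<br/>
FilterProtocol downsample "change=yes"<br/>
# Applies only to the jpeg_unpack provider of the "unpack" filter:<br/>
# complete input is required, so byteranges cannot be handled<br/>
FilterProtocol unpack jpeg_unpack "byteranges=no"
</example>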
</section>

<directivesynopsis>
<name>FilterDeclare</name>
<description>Declare a smart filter</description>
<syntax>FilterDeclare <var>filter-name</var> [req|resp|env]=<var>dispatch</var>
<var>[type]</var></syntax>
<contextlist><context>server config</context><context>virtual host</context>
<context>directory</context><context>.htaccess</context></contextlist>
<override>Options</override>

<usage>
<p>This directive declares an output filter together with a
header or environment variable that will determine runtime
configuration. The first argument is a <var>filter-name</var>
for use in <directive module="mod_filter">FilterProvider</directive>,
<directive module="mod_filter">FilterChain</directive> and
<directive module="mod_filter">FilterProtocol</directive> directives.</p>

<p>The second is a string with optional <code>req=</code>,
<code>resp=</code> or <code>env=</code> prefix causing it
to dispatch on (respectively) the request header, response
header, or environment variable named. In the absence of a
prefix, it defaults to a response header. A special case is the
word <code>handler</code>, which causes <module>mod_filter</module>
to dispatch on the content handler.</p>

<p>The final (optional) argument
is the type of filter, and takes values of <code>ap_filter_type</code>,
namely <code>RESOURCE</code> (the default), <code>CONTENT_SET</code>,
<code>PROTOCOL</code>, <code>TRANSCODE</code>, <code>CONNECTION</code>
or <code>NETWORK</code>.</p>
</usage>
</directivesynopsis>

<directivesynopsis>
<name>FilterProvider</name>
<description>Register a content filter</description>
<syntax>FilterProvider <var>filter-name</var> <var>provider-name</var>
<var>match</var></syntax>
<contextlist><context>server config</context><context>virtual host</context>
<context>directory</context><context>.htaccess</context></contextlist>
<override>Options</override>

<usage>
<p>This directive registers a <em>provider</em> for the smart filter.
The provider will be called if and only if the <var>match</var> declared
here matches the value of the header or environment variable declared
as <var>dispatch</var> in the
<directive module="mod_filter">FilterDeclare</directive> directive that
declared <var>filter-name</var>.</p>

<p><var>filter-name</var> must have been declared with
<directive module="mod_filter">FilterDeclare</directive>.
<var>provider-name</var> must have been registered by loading
a module that registers the name with
<code>ap_register_output_filter</code>.</p>

<p>The <var>match</var> argument specifies a match that will be applied to
the filter's <var>dispatch</var> criterion. The <var>match</var> may be
a string match (exact match or substring), a regexp, an integer comparison
(greater than, less than or equal), or unconditional. The first characters
of the <var>match</var> argument determine this:</p>

<p><strong>First</strong>, if the first character is an exclamation mark
(<code>!</code>), this reverses the rule, so the provider will be used
if and only if the match <em>fails</em>.</p>

<p><strong>Second</strong>, it interprets the first character excluding
any leading <code>!</code> as follows:</p>

<table style="zebra" border="yes">
<tr><th>Character</th><th>Description</th></tr>
<tr><td><em>(none)</em></td><td>exact match</td></tr>
<tr><td><code>$</code></td><td>substring match</td></tr>
<tr><td><code>/</code></td><td>regexp match</td></tr>
<tr><td><code>=</code></td><td>integer equality</td></tr>
<tr><td><code>&lt;</code></td><td>integer less-than</td></tr>
<tr><td><code>&gt;</code></td><td>integer greater-than</td></tr>
<tr><td><code>*</code></td><td>Unconditional match</td></tr>
</table>
</usage>
</directivesynopsis>

<directivesynopsis>
<name>FilterChain</name>
<description>Configure the filter chain</description>
<syntax>FilterChain [+=-@!]<var>filter-name</var> <var>...</var></syntax>
<contextlist><context>server config</context><context>virtual host</context>
<context>directory</context><context>.htaccess</context></contextlist>
<override>Options</override>

<usage>
<p>This configures an actual filter chain, from declared filters.
<directive>FilterChain</directive> takes any number of arguments,
each optionally preceded with a single-character control that
determines what to do:</p>

<dl>
<dt><code>+<var>filter-name</var></code></dt>
<dd>Add <var>filter-name</var> to the end of the filter chain</dd>

<dt><code>@<var>filter-name</var></code></dt>
<dd>Insert <var>filter-name</var> at the start of the filter chain</dd>

<dt><code>-<var>filter-name</var></code></dt>
<dd>Remove <var>filter-name</var> from the filter chain</dd>

<dt><code>=<var>filter-name</var></code></dt>
<dd>Empty the filter chain and insert <var>filter-name</var></dd>

<dt><code>!</code></dt>
<dd>Empty the filter chain</dd>

<dt><code><var>filter-name</var></code></dt>
<dd>Equivalent to <code>+<var>filter-name</var></code></dd>
</dl>
</usage>
</directivesynopsis>

<directivesynopsis>
<name>FilterProtocol</name>
<description>Deal with correct HTTP protocol handling</description>
<syntax>FilterProtocol <var>filter-name</var> [<var>provider-name</var>]
<var>proto-flags</var></syntax>
<contextlist><context>server config</context><context>virtual host</context>
<context>directory</context><context>.htaccess</context></contextlist>
<override>Options</override>

<usage>
<p>This directs <module>mod_filter</module> to ensure that the
filter doesn't run when it shouldn't, and that the HTTP response
headers are set correctly, taking into account the effects of the
filter.</p>

<p>There are two forms of this directive. With three arguments, it
applies specifically to a <var>filter-name</var> and a
<var>provider-name</var> for that filter.
With two arguments it applies to a <var>filter-name</var> whenever the
filter runs <em>any</em> provider.</p>

<p><var>proto-flags</var> is one or more of</p>

<dl>
<dt><code>change=yes</code></dt>
<dd>The filter changes the content, including possibly the content
length.</dd>

<dt><code>change=1:1</code></dt>
<dd>The filter changes the content, but will not change the content
length.</dd>

<dt><code>byteranges=no</code></dt>
<dd>The filter cannot work on byteranges and requires complete input</dd>

<dt><code>proxy=no</code></dt>
<dd>The filter should not run in a proxy context</dd>

<dt><code>proxy=transform</code></dt>
<dd>The filter transforms the response in a manner incompatible with
the HTTP <code>Cache-Control: no-transform</code> header.</dd>

<dt><code>cache=no</code></dt>
<dd>The filter renders the output uncacheable (e.g. by introducing
randomised content changes)</dd>
</dl>
</usage>
</directivesynopsis>

<directivesynopsis>
<name>FilterTrace</name>
<description>Get debug/diagnostic information from
<module>mod_filter</module></description>
<syntax>FilterTrace <var>filter-name</var> <var>level</var></syntax>
<contextlist><context>server config</context><context>virtual host</context>
<context>directory</context></contextlist>

<usage>
<p>This directive generates debug information from
<module>mod_filter</module>.
It is designed to help test and debug providers (filter modules), although
it may also help with <module>mod_filter</module> itself.</p>

<p>The debug output depends on the <var>level</var> set:</p>
<dl>
<dt><code>0</code> (default)</dt>
<dd>No debug information is generated.</dd>

<dt><code>1</code></dt>
<dd><module>mod_filter</module> will record buckets and brigades
passing through the filter to the error log, before the provider has
processed them. This is similar to the information generated by
<a href="http://apache.webthing.com/mod_diagnostics/">mod_diagnostics</a>.
</dd>

<dt><code>2</code> (not yet implemented)</dt>
<dd>Will dump the full data passing through to a tempfile before the
provider. <strong>For single-user debug only</strong>; this will not
support concurrent hits.</dd>
</dl>
</usage>
</directivesynopsis>

</modulesynopsis>