<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head><!--
        XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
              This file is generated from xml source: DO NOT EDIT
        XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
      -->
8 <title>Guide to writing output filters - Apache HTTP Server</title>
9 <link href="../style/css/manual.css" rel="stylesheet" media="all" type="text/css" title="Main stylesheet" />
10 <link href="../style/css/manual-loose-100pc.css" rel="alternate stylesheet" media="all" type="text/css" title="No Sidebar - Default font size" />
11 <link href="../style/css/manual-print.css" rel="stylesheet" media="print" type="text/css" /><link rel="stylesheet" type="text/css" href="../style/css/prettify.css" />
<script src="../style/scripts/prettify.js" type="text/javascript"></script>
15 <link href="../images/favicon.ico" rel="shortcut icon" /></head>
16 <body id="manual-page"><div id="page-header">
17 <p class="menu"><a href="../mod/">Modules</a> | <a href="../mod/directives.html">Directives</a> | <a href="http://wiki.apache.org/httpd/FAQ">FAQ</a> | <a href="../glossary.html">Glossary</a> | <a href="../sitemap.html">Sitemap</a></p>
18 <p class="apache">Apache HTTP Server Version 2.5</p>
19 <img alt="" src="../images/feather.gif" /></div>
<div class="up"><a href="./"><img title="&lt;-" alt="&lt;-" src="../images/left.gif" /></a></div>
<div id="path">
<a href="http://www.apache.org/">Apache</a> &gt; <a href="http://httpd.apache.org/">HTTP Server</a> &gt; <a href="http://httpd.apache.org/docs/">Documentation</a> &gt; <a href="../">Version 2.5</a> &gt; <a href="./">Developer Documentation</a></div><div id="page-content"><div id="preamble"><h1>Guide to writing output filters</h1>
24 <p><span>Available Languages: </span><a href="../en/developer/output-filters.html" title="English"> en </a></p>
27 <p>There are a number of common pitfalls encountered when writing
28 output filters; this page aims to document best practice for
29 authors of new or existing filters.</p>
31 <p>This document is applicable to both version 2.0 and version 2.2
32 of the Apache HTTP Server; it specifically targets
33 <code>RESOURCE</code>-level or <code>CONTENT_SET</code>-level
filters, though some advice is generic to all types of filter.</p>
36 <div id="quickview"><ul id="toc"><li><img alt="" src="../images/down.gif" /> <a href="#basics">Filters and bucket brigades</a></li>
37 <li><img alt="" src="../images/down.gif" /> <a href="#invocation">Filter invocation</a></li>
38 <li><img alt="" src="../images/down.gif" /> <a href="#brigade">Brigade structure</a></li>
39 <li><img alt="" src="../images/down.gif" /> <a href="#buckets">Processing buckets</a></li>
40 <li><img alt="" src="../images/down.gif" /> <a href="#filtering">Filtering brigades</a></li>
41 <li><img alt="" src="../images/down.gif" /> <a href="#state">Maintaining state</a></li>
42 <li><img alt="" src="../images/down.gif" /> <a href="#buffer">Buffering buckets</a></li>
43 <li><img alt="" src="../images/down.gif" /> <a href="#nonblock">Non-blocking bucket reads</a></li>
44 <li><img alt="" src="../images/down.gif" /> <a href="#rules">Ten rules for output filters</a></li>
</ul></div>
46 <div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
48 <h2><a name="basics" id="basics">Filters and bucket brigades</a></h2>
51 <p>Each time a filter is invoked, it is passed a <em>bucket
52 brigade</em>, containing a sequence of <em>buckets</em> which
53 represent both data content and metadata. Every bucket has a
54 <em>bucket type</em>; a number of bucket types are defined and
55 used by the <code>httpd</code> core modules (and the
56 <code>apr-util</code> library which provides the bucket brigade
57 interface), but modules are free to define their own types.</p>
59 <div class="note">Output filters must be prepared to process
60 buckets of non-standard types; with a few exceptions, a filter
61 need not care about the types of buckets being filtered.</div>
63 <p>A filter can tell whether a bucket represents either data or
64 metadata using the <code>APR_BUCKET_IS_METADATA</code> macro.
65 Generally, all metadata buckets should be passed down the filter
66 chain by an output filter. Filters may transform, delete, and
67 insert data buckets as appropriate.</p>
69 <p>There are two metadata bucket types which all filters must pay
70 attention to: the <code>EOS</code> bucket type, and the
71 <code>FLUSH</code> bucket type. An <code>EOS</code> bucket
72 indicates that the end of the response has been reached and no
73 further buckets need be processed. A <code>FLUSH</code> bucket
74 indicates that the filter should flush any buffered buckets (if
75 applicable) down the filter chain immediately.</p>
77 <div class="note"><code>FLUSH</code> buckets are sent when the
78 content generator (or an upstream filter) knows that there may be
79 a delay before more content can be sent. By passing
80 <code>FLUSH</code> buckets down the filter chain immediately,
81 filters ensure that the client is not kept waiting for pending
82 data longer than necessary.</div>
84 <p>Filters can create <code>FLUSH</code> buckets and pass these
85 down the filter chain if desired. Generating <code>FLUSH</code>
86 buckets unnecessarily, or too frequently, can harm network
87 utilisation since it may force large numbers of small packets to
88 be sent, rather than a small number of larger packets. The
89 section on <a href="#nonblock">Non-blocking bucket reads</a>
90 covers a case where filters are encouraged to generate
91 <code>FLUSH</code> buckets.</p>
93 <div class="example"><h3>Example bucket brigade</h3><p><code>
94 HEAP FLUSH FILE EOS</code></p></div>
96 <p>This shows a bucket brigade which may be passed to a filter; it
97 contains two metadata buckets (<code>FLUSH</code> and
98 <code>EOS</code>), and two data buckets (<code>HEAP</code> and
99 <code>FILE</code>).</p>
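<p>The classification described above can be illustrated with a small,
self-contained sketch. It does not use APR: the <code>toy_</code> names
below are hypothetical stand-ins, with <code>toy_is_metadata</code>
playing the role of <code>APR_BUCKET_IS_METADATA</code>, and the brigade
is modelled as a plain array.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for bucket types; not the APR definitions. */
enum toy_type { TOY_HEAP, TOY_FILE, TOY_FLUSH, TOY_EOS };

/* Toy analogue of APR_BUCKET_IS_METADATA: FLUSH and EOS carry no data. */
static int toy_is_metadata(enum toy_type t)
{
    return t == TOY_FLUSH || t == TOY_EOS;
}

/* Count the metadata buckets in an array-backed "brigade". */
static size_t toy_count_metadata(const enum toy_type *bb, size_t n)
{
    size_t i, count = 0;

    assert(bb != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (toy_is_metadata(bb[i])) {
            count++;
        }
    }
    return count;
}

/* Index of the first EOS bucket, or n if the brigade has none.
 * Buckets after this index need no further processing. */
static size_t toy_find_eos(const enum toy_type *bb, size_t n)
{
    size_t i;

    assert(bb != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (bb[i] == TOY_EOS) {
            return i;
        }
    }
    return n;
}
```

<p>For the example brigade <code>HEAP FLUSH FILE EOS</code>, this model
counts two metadata buckets and finds the <code>EOS</code> bucket in the
last position.</p>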
101 </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
102 <div class="section">
103 <h2><a name="invocation" id="invocation">Filter invocation</a></h2>
106 <p>For any given request, an output filter might be invoked only
107 once and be given a single brigade representing the entire response.
108 It is also possible that the number of times a filter is invoked
109 for a single response is proportional to the size of the content
110 being filtered, with the filter being passed a brigade containing
a single bucket each time. Filters must operate correctly in
either case.</p>
114 <div class="warning">An output filter which allocates long-lived
115 memory every time it is invoked may consume memory proportional to
116 response size. Output filters which need to allocate memory
117 should do so once per response; see <a href="#state">Maintaining
118 state</a> below.</div>
120 <p>An output filter can distinguish the final invocation for a
121 given response by the presence of an <code>EOS</code> bucket in
the brigade. Any buckets in the brigade after an <code>EOS</code> should be
ignored.</p>
125 <p>An output filter should never pass an empty brigade down the
126 filter chain. To be defensive, filters should be prepared to
127 accept an empty brigade, and should return success without passing
128 this brigade on down the filter chain. The handling of an empty
129 brigade should have no side effects (such as changing any state
130 private to the filter).</p>
<div class="example"><h3>How to handle an empty brigade</h3><pre class="prettyprint lang-c">
apr_status_t dummy_filter(ap_filter_t *f, apr_bucket_brigade *bb)
{
    if (APR_BRIGADE_EMPTY(bb)) {
        return APR_SUCCESS;
    }
    ...
</pre></div>
142 </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
143 <div class="section">
144 <h2><a name="brigade" id="brigade">Brigade structure</a></h2>
147 <p>A bucket brigade is a doubly-linked list of buckets. The list
148 is terminated (at both ends) by a <em>sentinel</em> which can be
149 distinguished from a normal bucket by comparing it with the
150 pointer returned by <code>APR_BRIGADE_SENTINEL</code>. The list
sentinel is in fact not a valid bucket structure; any attempt to
call normal bucket functions (such as
<code>apr_bucket_read</code>) on the sentinel has undefined
behaviour and will typically crash the process.</p>
156 <p>There are a variety of functions and macros for traversing and
157 manipulating bucket brigades; see the <a href="http://apr.apache.org/docs/apr-util/trunk/group___a_p_r___util___bucket___brigades.html">apr_bucket.h</a>
158 header for complete coverage. Commonly used macros include:</p>
<dl>
<dt><code>APR_BRIGADE_FIRST(bb)</code></dt>
<dd>returns the first bucket in brigade bb</dd>

<dt><code>APR_BRIGADE_LAST(bb)</code></dt>
<dd>returns the last bucket in brigade bb</dd>

<dt><code>APR_BUCKET_NEXT(e)</code></dt>
<dd>gives the next bucket after bucket e</dd>

<dt><code>APR_BUCKET_PREV(e)</code></dt>
<dd>gives the bucket before bucket e</dd>
</dl>
175 <p>The <code>apr_bucket_brigade</code> structure itself is
176 allocated out of a pool, so if a filter creates a new brigade, it
177 must ensure that memory use is correctly bounded. A filter which
178 allocates a new brigade out of the request pool
179 (<code>r->pool</code>) on every invocation, for example, will fall
180 foul of the <a href="#invocation">warning above</a> concerning
181 memory use. Such a filter should instead create a brigade on the
182 first invocation per request, and store that brigade in its <a href="#state">state structure</a>.</p>
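<p>The sentinel-terminated list described above can be modelled with a
minimal sketch. The <code>toy_</code> types and <code>TOY_</code> macros
are hypothetical and only mirror the shape of the APR macros; in
particular, this toy embeds the sentinel as a real list node, whereas the
APR sentinel is not a valid bucket structure.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types; the real APR ring macros differ in detail. */
struct toy_bucket {
    struct toy_bucket *next;
    struct toy_bucket *prev;
    int id;
};

struct toy_brigade {
    struct toy_bucket sentinel;   /* list head; never holds data */
};

#define TOY_SENTINEL(bb)  (&(bb)->sentinel)
#define TOY_FIRST(bb)     ((bb)->sentinel.next)
#define TOY_LAST(bb)      ((bb)->sentinel.prev)
#define TOY_NEXT(e)       ((e)->next)
#define TOY_PREV(e)       ((e)->prev)
#define TOY_EMPTY(bb)     (TOY_FIRST(bb) == TOY_SENTINEL(bb))

/* An empty brigade is a sentinel that points at itself in both directions. */
static void toy_brigade_init(struct toy_brigade *bb)
{
    assert(bb != NULL);
    bb->sentinel.next = &bb->sentinel;
    bb->sentinel.prev = &bb->sentinel;
    bb->sentinel.id = -1;
}

/* Append e at the tail, i.e. just before the sentinel. */
static void toy_insert_tail(struct toy_brigade *bb, struct toy_bucket *e)
{
    assert(bb != NULL && e != NULL);
    e->prev = bb->sentinel.prev;
    e->next = &bb->sentinel;
    bb->sentinel.prev->next = e;
    bb->sentinel.prev = e;
}

/* Walk forwards, stopping at the sentinel rather than at NULL. */
static int toy_sum_ids(struct toy_brigade *bb)
{
    struct toy_bucket *e;
    int sum = 0;

    for (e = TOY_FIRST(bb); e != TOY_SENTINEL(bb); e = TOY_NEXT(e)) {
        sum += e->id;
    }
    return sum;
}
```

<p>The loop condition compares against the sentinel, never against
<code>NULL</code>; the same termination test appears in the filter loops
later in this document.</p>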
<div class="warning"><p>It is rarely advisable to use
<code>apr_brigade_destroy</code> to "destroy" a brigade. Use it only
when you know for certain that the brigade will never be used
again, and sparingly even then. The
188 memory used by the brigade structure will not be released by
189 calling this function (since it comes from a pool), but the
190 associated pool cleanup is unregistered. Using
191 <code>apr_brigade_destroy</code> can in fact cause memory leaks;
192 if a "destroyed" brigade contains buckets when its
193 containing pool is destroyed, those buckets will <em>not</em> be
194 immediately destroyed.</p>
196 <p>In general, filters should use <code>apr_brigade_cleanup</code>
197 in preference to <code>apr_brigade_destroy</code>.</p></div>
199 </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
200 <div class="section">
201 <h2><a name="buckets" id="buckets">Processing buckets</a></h2>
205 <p>When dealing with non-metadata buckets, it is important to
206 understand that the "<code>apr_bucket *</code>" object is an
207 abstract <em>representation</em> of data:</p>
<ul>
<li>The amount of data represented by the bucket may or may not
have a determinate length; for a bucket which represents data of
indeterminate length, the <code>->length</code> field is set to
the value <code>(apr_size_t)-1</code>. For example, buckets of
the <code>PIPE</code> bucket type have an indeterminate length;
they represent the output from a pipe.</li>

<li>The data represented by a bucket may or may not be mapped
into memory. The <code>FILE</code> bucket type, for example,
represents data stored in a file on disk.</li>
</ul>
222 <p>Filters read the data from a bucket using the
223 <code>apr_bucket_read</code> function. When this function is
224 invoked, the bucket may <em>morph</em> into a different bucket
225 type, and may also insert a new bucket into the bucket brigade.
This must happen for buckets which represent data not mapped into
memory.</p>
<p>To give an example, consider a bucket brigade containing a
single <code>FILE</code> bucket representing an entire file, 24
kilobytes in size:</p>
233 <div class="example"><p><code>FILE(0K-24K)</code></p></div>
235 <p>When this bucket is read, it will read a block of data from the
236 file, morph into a <code>HEAP</code> bucket to represent that
237 data, and return the data to the caller. It also inserts a new
238 <code>FILE</code> bucket representing the remainder of the file;
after the <code>apr_bucket_read</code> call, the brigade looks
like:</p>
242 <div class="example"><p><code>HEAP(8K) FILE(8K-24K)</code></p></div>
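<p>The morphing behaviour can be sketched with a toy model that does not
use APR. All names here are hypothetical, the brigade is a fixed-size
array, and the 8K block size is an assumption chosen to match the example
above; the real read size is decided by the bucket implementation.</p>

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BLOCK   (8 * 1024)   /* assumed read size, matching the 8K above */
#define TOY_MAX     8            /* capacity of the toy brigade */

enum toy_type { TOY_FILE, TOY_HEAP };

struct toy_bucket {
    enum toy_type type;
    size_t start;    /* offset of the first byte represented */
    size_t length;   /* number of bytes represented */
};

struct toy_brigade {
    struct toy_bucket b[TOY_MAX];
    size_t n;        /* number of buckets in use */
};

/* "Read" bucket i.  A FILE bucket morphs into a HEAP bucket holding at most
 * TOY_BLOCK bytes; any remainder becomes a new FILE bucket inserted directly
 * after it.  Returns the number of bytes bucket i now holds in memory. */
static size_t toy_read(struct toy_brigade *bb, size_t i)
{
    struct toy_bucket *e;
    size_t j;

    assert(bb != NULL && i < bb->n);
    e = &bb->b[i];
    if (e->type == TOY_FILE) {
        if (e->length > TOY_BLOCK) {
            assert(bb->n < TOY_MAX);
            /* Shift later buckets up to make room for the remainder. */
            for (j = bb->n; j > i + 1; j--) {
                bb->b[j] = bb->b[j - 1];
            }
            bb->b[i + 1].type = TOY_FILE;
            bb->b[i + 1].start = e->start + TOY_BLOCK;
            bb->b[i + 1].length = e->length - TOY_BLOCK;
            bb->n++;
            e->length = TOY_BLOCK;
        }
        e->type = TOY_HEAP;
    }
    return e->length;
}
```

<p>Starting from a single <code>FILE(0K-24K)</code> bucket, one call to
<code>toy_read</code> leaves the model holding <code>HEAP(8K)
FILE(8K-24K)</code>; reading each new bucket in turn ends with three
<code>HEAP</code> buckets.</p>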
244 </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
245 <div class="section">
246 <h2><a name="filtering" id="filtering">Filtering brigades</a></h2>
249 <p>The basic function of any output filter will be to iterate
250 through the passed-in brigade and transform (or simply examine)
251 the content in some manner. The implementation of the iteration
252 loop is critical to producing a well-behaved output filter.</p>
<p>Consider an example which loops through the entire brigade as
follows:</p>
<div class="example"><h3>Bad output filter -- do not imitate!</h3><pre class="prettyprint lang-c">
apr_bucket *e = APR_BRIGADE_FIRST(bb);
const char *data;
apr_size_t length;

while (e != APR_BRIGADE_SENTINEL(bb)) {
    /* Each read may morph e into a HEAP bucket that stays in bb. */
    apr_bucket_read(e, &amp;data, &amp;length, APR_BLOCK_READ);
    e = APR_BUCKET_NEXT(e);
}

return ap_pass_brigade(f->next, bb);
</pre></div>
272 <p>The above implementation would consume memory proportional to
273 content size. If passed a <code>FILE</code> bucket, for example,
274 the entire file contents would be read into memory as each
275 <code>apr_bucket_read</code> call morphed a <code>FILE</code>
276 bucket into a <code>HEAP</code> bucket.</p>
<p>In contrast, the implementation below will consume a fixed
amount of memory to filter any brigade. It needs a temporary
brigade, which must be allocated only once per response; see the <a href="#state">Maintaining state</a> section.</p>
<div class="example"><h3>Better output filter</h3><pre class="prettyprint lang-c">
apr_bucket *e;
const char *data;
apr_size_t length;
apr_status_t rv;

while ((e = APR_BRIGADE_FIRST(bb)) != APR_BRIGADE_SENTINEL(bb)) {
    rv = apr_bucket_read(e, &amp;data, &amp;length, APR_BLOCK_READ);
    if (rv != APR_SUCCESS) return rv;
    /* Remove bucket e from bb. */
    APR_BUCKET_REMOVE(e);
    /* Insert it into temporary brigade. */
    APR_BRIGADE_INSERT_HEAD(tmpbb, e);
    /* Pass brigade downstream. */
    rv = ap_pass_brigade(f->next, tmpbb);
    if (rv != APR_SUCCESS) return rv;
    apr_brigade_cleanup(tmpbb);
}
</pre></div>
302 </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
303 <div class="section">
304 <h2><a name="state" id="state">Maintaining state</a></h2>
308 <p>A filter which needs to maintain state over multiple
309 invocations per response can use the <code>->ctx</code> field of
310 its <code>ap_filter_t</code> structure. It is typical to store a
311 temporary brigade in such a structure, to avoid having to allocate
312 a new brigade per invocation as described in the <a href="#brigade">Brigade structure</a> section.</p>
<div class="example"><h3>Example code to maintain filter state</h3><pre class="prettyprint lang-c">
struct dummy_state {
    apr_bucket_brigade *tmpbb;
    int filter_state;
    ...
};

apr_status_t dummy_filter(ap_filter_t *f, apr_bucket_brigade *bb)
{
    struct dummy_state *state = f->ctx;

    if (state == NULL) {
        /* First invocation for this response: initialise state structure. */
        f->ctx = state = apr_palloc(f->r->pool, sizeof *state);
        state->tmpbb = apr_brigade_create(f->r->pool, f->c->bucket_alloc);
        state->filter_state = ...;
    }
    ...
</pre></div>
341 </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
342 <div class="section">
343 <h2><a name="buffer" id="buffer">Buffering buckets</a></h2>
346 <p>If a filter decides to store buckets beyond the duration of a
347 single filter function invocation (for example storing them in its
348 <code>->ctx</code> state structure), those buckets must be <em>set
349 aside</em>. This is necessary because some bucket types provide
350 buckets which represent temporary resources (such as stack memory)
351 which will fall out of scope as soon as the filter chain completes
352 processing the brigade.</p>
<p>To set aside a bucket, call the <code>apr_bucket_setaside</code>
function. Not all bucket types can be set aside, but
if the call succeeds, the bucket will have morphed to ensure it has a
lifetime at least as long as the pool given as an argument to the
<code>apr_bucket_setaside</code> function.</p>
360 <p>Alternatively, the <code>ap_save_brigade</code> function can be
361 used, which will move all the buckets into a separate brigade
362 containing buckets with a lifetime as long as the given pool
363 argument. This function must be used with care, taking into
364 account the following points:</p>
<ul>
<li>On return, <code>ap_save_brigade</code> guarantees that all
the buckets in the returned brigade will represent data mapped
into memory. If given an input brigade containing, for example,
a <code>PIPE</code> bucket, <code>ap_save_brigade</code> will
consume an arbitrary amount of memory to store the entire output
of a pipe.</li>

<li>When <code>ap_save_brigade</code> reads from buckets which
cannot be set aside, it will always perform blocking reads,
removing the opportunity to use <a href="#nonblock">Non-blocking
bucket reads</a>.</li>

<li>If <code>ap_save_brigade</code> is used without passing a
non-NULL "<code>saveto</code>" (destination) brigade parameter,
the function will create a new brigade, which may cause memory
use to be proportional to content size as described in the <a href="#brigade">Brigade structure</a> section.</li>
</ul>
385 <div class="warning">Filters must ensure that any buffered data is
386 processed and passed down the filter chain during the last
387 invocation for a given response (a brigade containing an EOS
388 bucket). Otherwise such data will be lost.</div>
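<p>The requirement to drain buffered data on the final invocation can be
sketched with a toy model. Nothing below is APR: the <code>toy_</code>
types are hypothetical, buckets are reduced to a type and a byte count,
and the 4096-byte threshold is an arbitrary assumption standing in for
whatever buffering policy a real filter applies.</p>

```c
#include <assert.h>
#include <stddef.h>

enum toy_type { TOY_DATA, TOY_FLUSH, TOY_EOS };

struct toy_bucket {
    enum toy_type type;
    size_t length;            /* payload size; 0 for metadata */
};

/* Hypothetical per-response filter state (what f->ctx would point at). */
struct toy_state {
    size_t buffered;          /* bytes held back by this filter */
    size_t sent;              /* bytes passed downstream so far */
    int eos_sent;             /* non-zero once EOS has gone downstream */
};

#define TOY_THRESHOLD 4096    /* assumed buffer limit for the sketch */

/* One filter invocation over a brigade of n buckets. */
static void toy_filter(struct toy_state *st,
                       const struct toy_bucket *bb, size_t n)
{
    size_t i;

    assert(st != NULL);
    for (i = 0; i < n; i++) {
        switch (bb[i].type) {
        case TOY_DATA:
            st->buffered += bb[i].length;
            if (st->buffered >= TOY_THRESHOLD) {
                st->sent += st->buffered;
                st->buffered = 0;
            }
            break;
        case TOY_FLUSH:
        case TOY_EOS:
            /* Drain everything held back before the metadata bucket. */
            st->sent += st->buffered;
            st->buffered = 0;
            if (bb[i].type == TOY_EOS) {
                st->eos_sent = 1;
                return;       /* ignore anything after EOS */
            }
            break;
        }
    }
}
```

<p>In this model, data smaller than the threshold stays buffered across
invocations, and an <code>EOS</code> (or <code>FLUSH</code>) bucket forces
it downstream so that nothing is left behind when the response ends.</p>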
390 </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
391 <div class="section">
392 <h2><a name="nonblock" id="nonblock">Non-blocking bucket reads</a></h2>
395 <p>The <code>apr_bucket_read</code> function takes an
396 <code>apr_read_type_e</code> argument which determines whether a
397 <em>blocking</em> or <em>non-blocking</em> read will be performed
398 from the data source. A good filter will first attempt to read
399 from every data bucket using a non-blocking read; if that fails
400 with <code>APR_EAGAIN</code>, then send a <code>FLUSH</code>
401 bucket down the filter chain, and retry using a blocking read.</p>
403 <p>This mode of operation ensures that any filters further down the
404 filter chain will flush any buffered buckets if a slow content
405 source is being used.</p>
407 <p>A CGI script is an example of a slow content source which is
408 implemented as a bucket type. <code class="module"><a href="../mod/mod_cgi.html">mod_cgi</a></code> will send
409 <code>PIPE</code> buckets which represent the output from a CGI
410 script; reading from such a bucket will block when waiting for the
411 CGI script to produce more output.</p>
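<p>The control flow of this pattern can be tried out in isolation with a
toy model. The <code>toy_</code> names are hypothetical: the source simply
reports "would block" for a fixed number of non-blocking reads, and a
counter stands in for passing a <code>FLUSH</code> bucket down the filter
chain.</p>

```c
#include <assert.h>
#include <stddef.h>

enum toy_mode { TOY_NONBLOCK, TOY_BLOCK };
enum toy_status { TOY_SUCCESS, TOY_EAGAIN };

/* Hypothetical slow source: the first `stalls` non-blocking reads find no
 * data ready; a blocking read always waits and then succeeds. */
struct toy_source {
    int stalls;
};

static enum toy_status toy_read(struct toy_source *src, enum toy_mode mode)
{
    assert(src != NULL);
    if (mode == TOY_NONBLOCK && src->stalls > 0) {
        src->stalls--;
        return TOY_EAGAIN;
    }
    return TOY_SUCCESS;
}

/* Read `nbuckets` buckets from src, flushing downstream whenever a
 * non-blocking read would have blocked.  Returns the number of flushes. */
static int toy_filter(struct toy_source *src, int nbuckets)
{
    enum toy_mode mode = TOY_NONBLOCK;
    int flushes = 0;
    int done = 0;

    while (done < nbuckets) {
        enum toy_status rv = toy_read(src, mode);

        if (rv == TOY_EAGAIN && mode == TOY_NONBLOCK) {
            flushes++;            /* stands in for passing a FLUSH bucket */
            mode = TOY_BLOCK;     /* retry the same bucket, blocking */
            continue;
        }
        done++;                   /* bucket processed and passed on */
        mode = TOY_NONBLOCK;      /* next bucket: try non-blocking first */
    }
    return flushes;
}
```

<p>A source that never stalls produces no flushes in this model, so a fast
content generator pays nothing for the extra non-blocking attempt.</p>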
<div class="example"><h3>Example code using non-blocking bucket reads</h3><pre class="prettyprint lang-c">
apr_bucket *e;
const char *data;
apr_size_t length;
apr_read_type_e mode = APR_NONBLOCK_READ;

while ((e = APR_BRIGADE_FIRST(bb)) != APR_BRIGADE_SENTINEL(bb)) {
    apr_status_t rv;

    rv = apr_bucket_read(e, &amp;data, &amp;length, mode);
    if (rv == APR_EAGAIN &amp;&amp; mode == APR_NONBLOCK_READ) {

        /* Pass down a brigade containing a flush bucket: */
        APR_BRIGADE_INSERT_TAIL(tmpbb, apr_bucket_flush_create(...));
        rv = ap_pass_brigade(f->next, tmpbb);
        apr_brigade_cleanup(tmpbb);
        if (rv != APR_SUCCESS) return rv;

        /* Retry, using a blocking read. */
        mode = APR_BLOCK_READ;
        continue;
    } else if (rv != APR_SUCCESS) {
        return rv; /* propagate read errors */
    }

    /* Next time, try a non-blocking read first. */
    mode = APR_NONBLOCK_READ;
    ...
}
</pre></div>
443 </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
444 <div class="section">
445 <h2><a name="rules" id="rules">Ten rules for output filters</a></h2>
<p>In summary, here is a set of rules for all output filters to
follow:</p>
<ol>
<li>Output filters should not pass empty brigades down the filter
chain, but should be tolerant of being passed empty
brigades.</li>

<li>Output filters must pass all metadata buckets down the filter
chain; <code>FLUSH</code> buckets should be respected by passing
any pending or buffered buckets down the filter chain.</li>

<li>Output filters should ignore any buckets following an
<code>EOS</code> bucket.</li>

<li>Output filters must process a fixed amount of data at a
time, to ensure that memory consumption is not proportional to
the size of the content being filtered.</li>

<li>Output filters should be agnostic with respect to bucket
types, and must be able to process buckets of unfamiliar
type.</li>

<li>After calling <code>ap_pass_brigade</code> to pass a brigade
down the filter chain, output filters should call
<code>apr_brigade_cleanup</code> to ensure the brigade is empty
before reusing that brigade structure; output filters should
never use <code>apr_brigade_destroy</code> to "destroy"
brigades.</li>

<li>Output filters must <em>set aside</em> any buckets which are
preserved beyond the duration of the filter function.</li>

<li>Output filters must not ignore the return value of
<code>ap_pass_brigade</code>, and must return appropriate errors
back up the filter chain.</li>

<li>Output filters must only create a fixed number of bucket
brigades for each response, rather than one per invocation.</li>

<li>Output filters should first attempt non-blocking reads from
each data bucket, and send a <code>FLUSH</code> bucket down the
filter chain if the read blocks, before retrying with a blocking
read.</li>
</ol>
<div class="top"><a href="#page-header"><img src="../images/up.gif" alt="top" /></a></div><div id="footer">
516 <p class="apache">Copyright 2012 The Apache Software Foundation.<br />Licensed under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.</p>
517 <p class="menu"><a href="../mod/">Modules</a> | <a href="../mod/directives.html">Directives</a> | <a href="http://wiki.apache.org/httpd/FAQ">FAQ</a> | <a href="../glossary.html">Glossary</a> | <a href="../sitemap.html">Sitemap</a></p></div><script type="text/javascript"><!--//--><![CDATA[//><!--
518 if (typeof(prettyPrint) !== 'undefined') {