<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE manualpage SYSTEM "./style/manualpage.dtd">
<?xml-stylesheet type="text/xsl" href="./style/manual.en.xsl"?>
<!-- $LastChangedRevision$ -->

<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License.  You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->

<manualpage metafile="compliance.xml.meta">

<title>HTTP Protocol Compliance</title>
<p>This document describes how to set a policy that enforces HTTP
protocol compliance, for a given URL space, on the origin servers or
applications behind that URL space.</p>

<p>If you have received an error message from a rejected policy and
need to know what the rejection means and what you might do to fix
the error, each policy is described below.</p>

<seealso><a href="filter.html">Filters</a></seealso>
<title>Enforcing HTTP Protocol Compliance in Apache 2</title>

<module>mod_policy</module>

<directive module="mod_policy">PolicyConditional</directive>
<directive module="mod_policy">PolicyLength</directive>
<directive module="mod_policy">PolicyKeepalive</directive>
<directive module="mod_policy">PolicyType</directive>
<directive module="mod_policy">PolicyVary</directive>
<directive module="mod_policy">PolicyValidation</directive>
<directive module="mod_policy">PolicyNocache</directive>
<directive module="mod_policy">PolicyMaxage</directive>
<directive module="mod_policy">PolicyVersion</directive>
<p>The HTTP protocol follows the <strong>robustness principle</strong>
as described in <a href="http://tools.ietf.org/html/rfc1122">RFC 1122</a>,
which states <strong>"Be liberal in what you accept, and conservative in
what you send"</strong>. As a result of this principle, HTTP clients will
compensate for and recover from incorrect or misconfigured responses, or
responses that are uncacheable.</p>

<p>As a website is scaled up to face ever greater traffic loads,
suboptimal or misconfigured applications or server configurations can
threaten both the stability and scalability of the website, and drive up
the hosting costs associated with it. A website can also grow in
configuration complexity, making it increasingly difficult to
detect and keep track of suboptimally configured URL spaces on a given
server.</p>

<p>Eventually a point is reached where the principle "conservative in
what you send" needs to be enforced by the server administrator.</p>

<p>The <module>mod_policy</module> module provides a set of filters
which can be applied to a server, allowing key features of the HTTP
protocol to be explicitly tested, and non-compliant responses to be
logged as warnings or rejected outright as errors. Each filter can be
applied separately, allowing the administrator to pick and choose which
policies should be enforced depending on the circumstances of their
environment.</p>

<p>The filters might be placed in testing and staging environments for
the benefit of application and website developers, or may be applied
to production servers to protect infrastructure from systems outside
the administrator's direct control.</p>
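
<p>As a rough illustration of the "log or reject" choice described above,
the sketch below applies two policies to a hypothetical <code>/app/</code>
URL space. It assumes the filter names given later in this document, and
assumes that each policy directive takes <code>ignore</code>,
<code>log</code> or <code>enforce</code> as its first argument, as
described in the <module>mod_policy</module> reference. Check that
reference before relying on the exact syntax.</p>

<highlight language="config">
&lt;Location "/app/"&gt;
    # Hypothetical URL space; adjust to your own layout
    SetOutputFilter POLICY_LENGTH;POLICY_VALIDATION
    # Only log a warning when Content-Length is missing
    PolicyLength log
    # Reject responses carrying neither ETag nor Last-Modified
    PolicyValidation enforce
&lt;/Location&gt;
</highlight>

<p>Starting with <code>log</code> in a staging environment and moving to
<code>enforce</code> once the warnings have been resolved is one cautious
way to introduce a policy.</p>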
<img src="images/compliance-reverse-proxy.png" width="666" height="239" alt=
"Enforcing HTTP protocol compliance for an application server"/>

<p>In the above example, an Apache httpd server has been placed between
the application server and the internet at large, and configured to cache
responses from the application server. The <module>mod_policy</module>
filters have been added to enforce support for cacheable content and
conditional requests, ensuring that both <module>mod_cache</module> and
public caches on the internet can efficiently cache the content created
by the RESTful application server.</p>

<img src="images/compliance-static.png" width="469" height="239" alt=
"Enforcing HTTP protocol compliance in a static server"/>

<p>In the simpler example above, a static server serving highly cacheable
content has a set of policies applied to ensure that the server configuration
conforms to a minimum level of compliance.</p>
<section id="policyconditional">
<title>Conditional Request Policy</title>

<module>mod_policy</module>

<directive module="mod_policy">PolicyConditional</directive>

<p>This policy will be rejected if the server does not correctly respond
to a conditional request with the appropriate status code.</p>

<p>Conditional requests form the mechanism by which an HTTP cache makes
stale content fresh again. Particularly for content with short freshness
lifetimes, lack of support for conditional requests can add avoidable load
on the server.</p>

<p>Most specifically, the existence of any of the following headers in the
request makes the request conditional:</p>
<dl>
<dt><code>If-Match</code></dt>
<dd>If the provided ETag in the <code>If-Match</code> header does not match
the ETag of the response, the server should return
<code>412 Precondition Failed</code>. Full details of how to handle an
<code>If-Match</code> header can be found in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.24">
RFC2616 section 14.24</a>.</dd>

<dt><code>If-None-Match</code></dt>
<dd>If the provided ETag in the <code>If-None-Match</code> header matches
the ETag of the response, the server should return either
<code>304 Not Modified</code> for GET/HEAD requests, or
<code>412 Precondition Failed</code> for other methods. Full details of how
to handle an <code>If-None-Match</code> header can be found in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.26">
RFC2616 section 14.26</a>.</dd>

<dt><code>If-Modified-Since</code></dt>
<dd>If the <code>Last-Modified</code> header of the response is not newer
than the date provided in the <code>If-Modified-Since</code> header, the
server should return <code>304 Not Modified</code>. Full details of how to
handle an <code>If-Modified-Since</code> header can be found in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.25">
RFC2616 section 14.25</a>.</dd>

<dt><code>If-Unmodified-Since</code></dt>
<dd>If the <code>Last-Modified</code> header of the response is newer than
the date provided in the <code>If-Unmodified-Since</code> header, the server
should return <code>412 Precondition Failed</code>. Full details of how to
handle an <code>If-Unmodified-Since</code> header can be found in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.28">
RFC2616 section 14.28</a>.</dd>

<dt><code>If-Range</code></dt>
<dd>If the provided ETag or date in the <code>If-Range</code> header matches
the ETag or <code>Last-Modified</code> of the response, and a valid
<code>Range</code> is present, the server should return
<code>206 Partial Content</code>. Full details of how to handle an
<code>If-Range</code> header can be found in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.27">
RFC2616 section 14.27</a>.</dd>
</dl>

<p>If the request was conditional and the response is detected to have
been successful (a 2xx response) where one of the responses above was
expected instead, this policy will be rejected. Responses that indicate a
redirect or a failure of some kind (3xx, 4xx, 5xx) will be ignored by this
policy.</p>
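
<p>A minimal sketch of enabling this policy for a hypothetical
<code>/api/</code> URL space might look as follows. The
<code>enforce</code> argument is assumed from the
<module>mod_policy</module> reference, and the path is illustrative
only.</p>

<highlight language="config">
&lt;Location "/api/"&gt;
    SetOutputFilter POLICY_CONDITIONAL
    # Reject 2xx responses to conditional requests where
    # 304, 412 or 206 was expected instead
    PolicyConditional enforce
&lt;/Location&gt;
</highlight>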
<p>This policy is implemented by the <strong>POLICY_CONDITIONAL</strong>
filter.</p>

</section>
<section id="policylength">
<title>Content-Length Policy</title>

<module>mod_policy</module>

<directive module="mod_policy">PolicyLength</directive>

<p>This policy will be rejected if the server response does not contain
an explicit <code>Content-Length</code> header.</p>

<p>There are a number of ways of determining the length of a response
body, described in full in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4">
RFC2616 section 4.4 Message Length</a>.</p>

<p>When the <code>Content-Length</code> header is present, the size of
the body is declared at the start of the response. If this information
is missing, an HTTP cache might choose to ignore the response, as it
does not know in advance whether the response will fit within the
cache's defined limits.</p>

<p>HTTP/1.1 defines the <code>Transfer-Encoding</code> header as an
alternative to <code>Content-Length</code>, allowing the end of the
response to be indicated to the client without the client having to
know the length beforehand. However, when HTTP/1.0 requests are
processed and no <code>Content-Length</code> is specified, the only
mechanism available to the server to indicate the end of the response
is to drop the connection. In an environment containing load
balancers, this can cause the keepalive mechanism to be bypassed.</p>

<p>If the response is detected to have been successful (a 2xx response),
and has a response body (this excludes <code>204 No Content</code>), and
the <code>Content-Length</code> header is missing, this policy will be
rejected. Responses that indicate a redirect or a failure of some kind
(3xx, 4xx, 5xx) will be ignored by this policy.</p>

<note type="warning">It should be noted that some modules, such as
<module>mod_proxy</module>, add their own <code>Content-Length</code>
header when a response lacking such a header is small enough to be read
in one go. This may cause small responses to pass this policy, while
larger responses may fail for the same URL.</note>

<p>This policy is implemented by the <strong>POLICY_LENGTH</strong>
filter.</p>

</section>
<section id="policytype">
<title>Content-Type Policy</title>

<module>mod_policy</module>

<directive module="mod_policy">PolicyType</directive>

<p>This policy will be rejected if the server response does not contain
an explicit and syntactically correct <code>Content-Type</code> header
that matches the server defined pattern.</p>

<p>The media type of the body is placed in the <code>Content-Type</code>
header, and the format of the header is described in full in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7">
RFC2616 section 3.7 Media Types</a>.</p>

<p>A syntactically valid content type might look as follows:</p>

<example>
Content-Type: text/html; charset=iso-8859-1
</example>

<p>Invalid content types might include:</p>

<example>
Content-Type: foo<br />
</example>

<p>The server administrator has the option to restrict the policy to one
or more specific types, or could specify a general wildcard type such as
<code>*/*</code>.</p>
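
<p>The two options above could be sketched as follows. The argument
layout (an action followed by one or more types) is assumed from the
<module>mod_policy</module> reference, and the listed media types and
path are purely illustrative.</p>

<highlight language="config">
&lt;Location "/api/"&gt;
    SetOutputFilter POLICY_TYPE
    # Accept only these media types; anything else is rejected
    PolicyType enforce text/html application/json
    # Alternatively, accept any syntactically valid type:
    # PolicyType enforce */*
&lt;/Location&gt;
</highlight>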
<p>This policy is implemented by the <strong>POLICY_TYPE</strong>
filter.</p>

</section>
<section id="policykeepalive">
<title>Keepalive Policy</title>

<module>mod_policy</module>

<directive module="mod_policy">PolicyKeepalive</directive>

<p>This policy will be rejected if the server response contains neither
an explicit <code>Content-Length</code> header nor a
<code>Transfer-Encoding</code> of chunked.</p>

<p>There are a number of ways of determining the length of a response
body, described in full in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4">
RFC2616 section 4.4 Message Length</a>.</p>

<p>When the <code>Content-Length</code> header is present, the size of
the body is declared at the start of the response. HTTP/1.1 defines the
<code>Transfer-Encoding</code> header as an alternative to
<code>Content-Length</code>, allowing the end of the response to be
indicated to the client without the client having to know the length
beforehand. In the absence of these two mechanisms, the only way for
a server to indicate the end of the response is to drop the connection.
In an environment containing load balancers, this can cause the keepalive
mechanism to be bypassed.</p>

<p>Most specifically, we follow these rules:</p>

<dl>
<dt>IF</dt>
<dd>we have not marked this connection as errored;</dd>

<dt>and</dt>
<dd>the client isn't expecting 100-continue;</dd>

<dt>and</dt>
<dd>the response status does not require a close;</dd>

<dt>and</dt>
<dd>the response body has a defined length due to the status code
being 304 or 204, the request method being HEAD, already having defined
Content-Length or Transfer-Encoding: chunked, or the request version
being HTTP/1.1 and thus capable of being set as chunked;</dd>

<dt>THEN</dt>
<dd>we support keepalive.</dd>
</dl>

<note type="warning">The server may choose to turn off keepalive for
various reasons, such as an imminent shutdown, or a Connection: close from
the client, or an HTTP/1.0 client request with a response with no
<code>Content-Length</code>, but for our purposes we only care that
keepalive was possible from the application, not that keepalive actually
took place.</note>

<p>It should also be noted that the Apache httpd server includes a filter
that adds chunked encoding to responses without an explicit content
length. This policy catches those cases where this filter is bypassed or
not in effect.</p>

<p>This policy is implemented by the <strong>POLICY_KEEPALIVE</strong>
filter.</p>

</section>
<section id="policymaxage">
<title>Freshness Lifetime / Maxage Policy</title>

<module>mod_policy</module>

<directive module="mod_policy">PolicyMaxage</directive>

<p>This policy will be rejected if the server response does not have
an explicit <strong>freshness lifetime</strong> at least as long
as the server defined limit, or if the freshness lifetime is
calculated based on a heuristic.</p>

<p>Full details of how a freshness lifetime is calculated are described in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.2">
RFC2616 section 13.2 Expiration Model</a>.</p>

<p>During the freshness lifetime, a cache does not need to contact the
origin server at all; it can simply pass the cached content as is back
to the client.</p>

<p>When the freshness lifetime is reached, the cache should contact the
origin server in an effort to check whether the content is still fresh,
and if not, replace the content.</p>

<p>When the freshness lifetime is too short, it can result in excessive
load on the server. In addition, should an outage occur that is as long
or longer than the freshness lifetime, all cached content will become
stale, which could cause a thundering herd of traffic when the
server or network returns.</p>
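
<p>As a sketch, a minimum freshness lifetime of one hour on a
hypothetical <code>/static/</code> URL space might be configured as
follows. The argument layout, and the assumption that the limit is given
in seconds, are taken from the <module>mod_policy</module> reference and
should be checked there.</p>

<highlight language="config">
&lt;Location "/static/"&gt;
    SetOutputFilter POLICY_MAXAGE
    # Reject responses whose explicit freshness lifetime is
    # shorter than 3600 seconds, or is only heuristic
    PolicyMaxage enforce 3600
&lt;/Location&gt;
</highlight>

<p>A response carrying <code>Cache-Control: max-age=86400</code> would be
expected to pass such a policy, while one carrying
<code>Cache-Control: max-age=60</code> would not.</p>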
<p>This policy is implemented by the <strong>POLICY_MAXAGE</strong>
filter.</p>

</section>
<section id="policynocache">
<title>No Cache Policy</title>

<module>mod_policy</module>

<directive module="mod_policy">PolicyNocache</directive>

<p>This policy will be rejected if the server response declares itself
uncacheable using either the <code>Cache-Control</code> or
<code>Pragma</code> headers.</p>

<p>Full details of how content may be declared uncacheable are described in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.1">
RFC2616 section 14.9.1 What is Cacheable</a>, and within the definition
for the <code>Pragma</code> header in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.32">
RFC2616 section 14.32 Pragma</a>.</p>

<p>Most specifically, should any of the following header combinations
exist in the response headers, the response will be rejected:</p>

<ul>
<li><code>Cache-Control: no-cache</code></li>
<li><code>Cache-Control: no-store</code></li>
<li><code>Cache-Control: private</code></li>
<li><code>Pragma: no-cache</code></li>
</ul>

<p>When unexpected, uncacheable content may produce unacceptable levels
of server load, or may incur significant cost. When this policy is enabled,
all server defined uncacheable content will be rejected.</p>
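
<p>Because rejecting every uncacheable response can be disruptive, one
cautious option is to log first. The sketch below assumes the
<code>log</code> action described in the <module>mod_policy</module>
reference, applied to a hypothetical <code>/catalogue/</code> URL
space.</p>

<highlight language="config">
&lt;Location "/catalogue/"&gt;
    SetOutputFilter POLICY_NOCACHE
    # Log, but do not reject, responses that declare
    # themselves uncacheable
    PolicyNocache log
&lt;/Location&gt;
</highlight>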
<p>This policy is implemented by the <strong>POLICY_NOCACHE</strong>
filter.</p>

</section>
<section id="policyvalidation">
<title>Validation Policy</title>

<module>mod_policy</module>

<directive module="mod_policy">PolicyValidation</directive>

<p>This policy will be rejected if the server response does not contain
either a syntactically correct <code>ETag</code> or
<code>Last-Modified</code> header.</p>

<p>The <code>ETag</code> header is described in full in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.19">
RFC2616 section 14.19 ETag</a>, and the <code>Last-Modified</code> header
is described in full in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.29">
RFC2616 section 14.29 Last-Modified</a>.</p>

<p>In addition to being checked for presence, the headers are checked for
syntax.</p>

<p>An <code>ETag</code> that is neither surrounded by quotes nor
declared "weak" by prefixing it with a "W/" will cause the policy to be
rejected. A <code>Last-Modified</code> that cannot be parsed as a valid
date will cause the policy to be rejected.</p>

<p>This policy is implemented by the <strong>POLICY_VALIDATION</strong>
filter.</p>

</section>
<section id="policyvary">
<title>Vary Header Policy</title>

<module>mod_policy</module>

<directive module="mod_policy">PolicyVary</directive>

<p>This policy will be rejected if the server response contains a
<code>Vary</code> header, and that header in turn contains a header
blacklisted by the administrator.</p>

<p>The <code>Vary</code> header is described in full in
<a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.44">
RFC2616 section 14.44 Vary</a>.</p>

<p>Some client provided headers, such as <code>User-Agent</code>,
can contain thousands or millions of combinations of values over a period
of time, and if the response is declared cacheable, a cache might attempt
to cache each of these responses separately, filling up the cache and
crowding out other entries in the cache. In this scenario, if so
configured, the policy will reject the response.</p>
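
<p>The <code>User-Agent</code> scenario above might be configured along
these lines. The argument layout (an action followed by one or more
header names) is assumed from the <module>mod_policy</module> reference,
and the path is hypothetical.</p>

<highlight language="config">
&lt;Location "/"&gt;
    SetOutputFilter POLICY_VARY
    # Reject responses that declare "Vary: User-Agent"
    PolicyVary enforce User-Agent
&lt;/Location&gt;
</highlight>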
<p>This policy is implemented by the <strong>POLICY_VARY</strong>
filter.</p>

</section>
<section id="policyversion">
<title>Protocol Version Policy</title>

<module>mod_policy</module>

<directive module="mod_policy">PolicyVersion</directive>

<p>This policy will be rejected if the client request was made with a
version number lower than the version of HTTP specified.</p>

<p>This policy is typically used with RESTful applications where
control over the type of client is desired. This policy can be used
alongside the <code>POLICY_KEEPALIVE</code> filter to ensure that
HTTP/1.0 clients don't cause keepalive connections to be dropped.</p>
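
<p>A sketch of combining the two policies mentioned above on a
hypothetical <code>/api/</code> URL space follows. The directive
arguments are assumed from the <module>mod_policy</module> reference and
should be verified there.</p>

<highlight language="config">
&lt;Location "/api/"&gt;
    SetOutputFilter POLICY_KEEPALIVE;POLICY_VERSION
    # Reject responses that would prevent keepalive
    PolicyKeepalive enforce
    # Reject requests made with a version older than HTTP/1.1
    PolicyVersion enforce HTTP/1.1
&lt;/Location&gt;
</highlight>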
<p>Possible minimum versions that could be specified are:</p>

<ul><li><code>HTTP/1.1</code></li>
<li><code>HTTP/1.0</code></li>
<li><code>HTTP/0.9</code></li>
</ul>

<p>This policy is implemented by the <strong>POLICY_VERSION</strong>
filter.</p>

</section>

</manualpage>