<?xml version='1.0' encoding='UTF-8' ?>
<!DOCTYPE manualpage SYSTEM "./style/manualpage.dtd">
<?xml-stylesheet type="text/xsl" href="./style/manual.en.xsl"?>
<!-- $LastChangedRevision$ -->

<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License.  You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->
23 <manualpage metafile="content-negotiation.xml.meta">
25 <title>Content Negotiation</title>
29 <p>Apache supports content negotiation as described in
30 the HTTP/1.1 specification. It can choose the best
31 representation of a resource based on the browser-supplied
32 preferences for media type, languages, character set and
33 encoding. It also implements a couple of features to give
34 more intelligent handling of requests from browsers that send
35 incomplete negotiation information.</p>
<p>Content negotiation is provided by the
<module>mod_negotiation</module> module, which is compiled in
by default.</p>
42 <section id="about"><title>About Content Negotiation</title>
<p>A resource may be available in several different
representations. For example, it might be available in
different languages or different media types, or a combination.
One way of selecting the most appropriate choice is to give the
user an index page, and let them select. However, it is often
possible for the server to choose automatically. This works
because browsers can send, as part of each request, information
about what representations they prefer. For example, a browser
could indicate that it would like to see information in French
if possible, and otherwise in English. Browsers indicate their
preferences by headers in the request. To request only French
representations, the browser would send</p>
57 <example>Accept-Language: fr</example>
59 <p>Note that this preference will only be applied when there is
60 a choice of representations and they vary by language.</p>
<p>As an example of a more complex request, this browser has
been configured to accept French and English, but prefer
French, and to accept various media types, preferring HTML over
plain text or other text types, and preferring GIF or JPEG over
other media types, but also allowing any other media type as a
last resort:</p>

<example>
Accept-Language: fr; q=1.0, en; q=0.5<br />
Accept: text/html; q=1.0, text/*; q=0.8, image/gif; q=0.6, image/jpeg; q=0.6, image/*; q=0.5, */*; q=0.1
</example>
<p>Apache supports 'server driven' content negotiation, as
defined in the HTTP/1.1 specification. It fully supports the
<code>Accept</code>, <code>Accept-Language</code>,
<code>Accept-Charset</code> and <code>Accept-Encoding</code>
request headers. Apache also supports 'transparent'
content negotiation, which is an experimental negotiation
protocol defined in RFC 2295 and RFC 2296. It does not offer
support for 'feature negotiation' as defined in these RFCs.</p>
83 <p>A <strong>resource</strong> is a conceptual entity
84 identified by a URI (RFC 2396). An HTTP server like Apache
85 provides access to <strong>representations</strong> of the
86 resource(s) within its namespace, with each representation in
87 the form of a sequence of bytes with a defined media type,
88 character set, encoding, etc. Each resource may be associated
89 with zero, one, or more than one representation at any given
90 time. If multiple representations are available, the resource
91 is referred to as <strong>negotiable</strong> and each of its
92 representations is termed a <strong>variant</strong>. The ways
93 in which the variants for a negotiable resource vary are called
94 the <strong>dimensions</strong> of negotiation.</p>
97 <section id="negotiation"><title>Negotiation in Apache</title>
<p>In order to negotiate a resource, the server needs to be
given information about each of the variants. This is done in
one of two ways:</p>

<ul>
<li>Using a type map (<em>i.e.</em>, a <code>*.var</code>
file) which names the files containing the variants
explicitly, or</li>

<li>Using a 'MultiViews' search, where the server does an
implicit filename pattern match and chooses from among the
results.</li>
</ul>
113 <section id="type-map"><title>Using a type-map file</title>
<p>A type map is a document which is associated with the handler
named <code>type-map</code> (or, for backwards-compatibility with
older Apache configurations, the <glossary>MIME-type</glossary>
<code>application/x-type-map</code>). Note that to use this
feature, you must have a handler set in the configuration that
defines a file suffix as <code>type-map</code>; this is best done
with</p>
123 <example>AddHandler type-map .var</example>
125 <p>in the server configuration file.</p>
127 <p>Type map files should have the same name as the resource
128 which they are describing, and have an entry for each available
129 variant; these entries consist of contiguous HTTP-format header
130 lines. Entries for different variants are separated by blank
131 lines. Blank lines are illegal within an entry. It is
132 conventional to begin a map file with an entry for the combined
133 entity as a whole (although this is not required, and if
134 present will be ignored). An example map file is shown below.
135 This file would be named <code>foo.var</code>, as it describes
136 a resource named <code>foo</code>.</p>
<example>
URI: foo.en.html<br />
Content-type: text/html<br />
Content-language: en<br />
<br />
URI: foo.fr.de.html<br />
Content-type: text/html;charset=iso-8859-2<br />
Content-language: fr, de<br />
</example>
<p>Note also that a type-map file will take precedence over the
filename's extension, even when MultiViews is on. If the
variants have different source qualities, that may be indicated
by the "qs" parameter to the media type, as in this picture
(available as JPEG, GIF, or ASCII-art):</p>
159 Content-type: image/jpeg; qs=0.8<br />
162 Content-type: image/gif; qs=0.5<br />
165 Content-type: text/plain; qs=0.01<br />
<p>qs values can vary in the range 0.000 to 1.000. Note that
any variant with a qs value of 0.000 will never be chosen.
Variants with no 'qs' parameter value are given a qs factor of
1.0. The qs parameter indicates the relative 'quality' of this
variant compared to the other available variants, independent
of the client's capabilities. For example, a JPEG file is
usually of higher source quality than an ASCII file if it is
attempting to represent a photograph. However, if the resource
being represented is original ASCII art, then an ASCII
representation would have a higher source quality than a JPEG
representation. A qs value is therefore specific to a given
variant depending on the nature of the resource it
represents.</p>
<p>The full list of headers recognized is available in the <a
href="mod/mod_negotiation.html#typemaps">mod_negotiation
typemap</a> documentation.</p>
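
<p>As a rough worked illustration of how <code>qs</code> combines
with the client's own preferences, assume a browser requests the
picture described above and sends the <code>Accept</code> header
from the complex request shown earlier. The products below are
plain arithmetic under that assumption, not output captured from
a server:</p>

<example>
Accept: text/html; q=1.0, text/*; q=0.8, image/gif; q=0.6, image/jpeg; q=0.6, image/*; q=0.5, */*; q=0.1<br />
<br />
image/jpeg: q 0.6 x qs 0.8 = 0.48<br />
image/gif: q 0.6 x qs 0.5 = 0.30<br />
text/plain: q 0.8 (matched by text/*) x qs 0.01 = 0.008
</example>

<p>Under these assumptions the JPEG variant has the highest
combined value, so it would be the preferred choice.</p>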
187 <section id="multiviews"><title>Multiviews</title>
<p><code>MultiViews</code> is a per-directory option, meaning it
can be set with an <directive module="core">Options</directive>
directive within a <directive module="core"
type="section">Directory</directive>, <directive module="core"
type="section">Location</directive> or <directive module="core"
type="section">Files</directive> section in
<code>httpd.conf</code>, or (if <directive
module="core">AllowOverride</directive> is properly set) in
<code>.htaccess</code> files. Note that <code>Options All</code>
does not set <code>MultiViews</code>; you have to ask for it by
name.</p>
201 <p>The effect of <code>MultiViews</code> is as follows: if the
202 server receives a request for <code>/some/dir/foo</code>, if
203 <code>/some/dir</code> has <code>MultiViews</code> enabled, and
204 <code>/some/dir/foo</code> does <em>not</em> exist, then the
205 server reads the directory looking for files named foo.*, and
206 effectively fakes up a type map which names all those files,
207 assigning them the same media types and content-encodings it
208 would have if the client had asked for one of them by name. It
209 then chooses the best match to the client's requirements.</p>
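
<p>A minimal sketch of enabling <code>MultiViews</code> for a
single directory might look like the following. The filesystem
path is hypothetical, and the <code>+</code> prefix is used on the
assumption that options inherited from elsewhere should be
kept:</p>

<example>
&lt;Directory "/usr/local/apache2/htdocs/some/dir"&gt;<br />
Options +MultiViews<br />
&lt;/Directory&gt;
</example>

<p>With files such as <code>foo.en.html</code> and
<code>foo.fr.html</code> in that directory, and assuming the
<code>en</code> and <code>fr</code> extensions are mapped to
languages by <code>mod_mime</code>, a request for
<code>/some/dir/foo</code> would be negotiated between them.</p>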
<p><code>MultiViews</code> may also apply to searches for the file
named by the <directive
module="mod_dir">DirectoryIndex</directive> directive, if the
server is trying to index a directory. If the configuration files
specify</p>
216 <example>DirectoryIndex index</example>
<p>then the server will arbitrate between <code>index.html</code>
and <code>index.html3</code> if both are present. If neither
is present, and <code>index.cgi</code> is there, the server
will run it.</p>
222 <p>If one of the files found when reading the directory does not
223 have an extension recognized by <code>mod_mime</code> to designate
224 its Charset, Content-Type, Language, or Encoding, then the result
225 depends on the setting of the <directive
226 module="mod_mime">MultiViewsMatch</directive> directive. This
227 directive determines whether handlers, filters, and other
228 extension types can participate in MultiViews negotiation.</p>
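
<p>For instance, the following setting, shown only as an
illustration, would allow files whose extensions are associated
with handlers or filters to take part in the MultiViews search,
in addition to the extensions that describe negotiated
metadata:</p>

<example>MultiViewsMatch Handlers Filters</example>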
232 <section id="methods"><title>The Negotiation Methods</title>
<p>After Apache has obtained a list of the variants for a given
resource, either from a type-map file or from the filenames in
the directory, it invokes one of two methods to decide on the
'best' variant to return, if any. It is not necessary to know
any of the details of how negotiation actually takes place in
order to use Apache's content negotiation features. However, the
rest of this document explains the methods used for those
interested.</p>
<p>There are two negotiation methods:</p>

<ol>
<li><strong>Server driven negotiation with the Apache
algorithm</strong> is used in the normal case. The Apache
algorithm is explained in more detail below. When this
algorithm is used, Apache can sometimes 'fiddle' the quality
factor of a particular dimension to achieve a better result.
The ways Apache can fiddle quality factors are explained in
more detail below.</li>

<li><strong>Transparent content negotiation</strong> is used
when the browser specifically requests this through the
mechanism defined in RFC 2295. This negotiation method gives
the browser full control over deciding on the 'best' variant;
the result is therefore dependent on the specific algorithms
used by the browser. As part of the transparent negotiation
process, the browser can ask Apache to run the 'remote
variant selection algorithm' defined in RFC 2296.</li>
</ol>
264 <section id="dimensions"><title>Dimensions of Negotiation</title>
<table>
<columnspec><column width=".15"/><column width=".85"/></columnspec>
<tr valign="top">
<th>Dimension</th>
<th>Notes</th>
</tr>

<tr valign="top">
<td>Media Type</td>
<td>Browser indicates preferences with the <code>Accept</code>
header field. Each item can have an associated quality factor.
Variant description can also have a quality factor (the "qs"
parameter).</td>
</tr>

<tr valign="top">
<td>Language</td>
<td>Browser indicates preferences with the
<code>Accept-Language</code> header field. Each item can have
a quality factor. Variants can be associated with none, one or
more than one language.</td>
</tr>

<tr valign="top">
<td>Encoding</td>
<td>Browser indicates preference with the
<code>Accept-Encoding</code> header field. Each item can have
a quality factor.</td>
</tr>

<tr valign="top">
<td>Charset</td>
<td>Browser indicates preference with the
<code>Accept-Charset</code> header field. Each item can have a
quality factor. Variants can indicate a charset as a parameter
of the media type.</td>
</tr>
</table>
311 <section id="algorithm"><title>Apache Negotiation Algorithm</title>
313 <p>Apache can use the following algorithm to select the 'best'
314 variant (if any) to return to the browser. This algorithm is
315 not further configurable. It operates as follows:</p>
<ol>
<li>First, for each dimension of the negotiation, check the
appropriate <em>Accept*</em> header field and assign a
quality to each variant. If the <em>Accept*</em> header for
any dimension implies that this variant is not acceptable,
eliminate it. If no variants remain, go to step 4.</li>

<li>
Select the 'best' variant by a process of elimination. Each
of the following tests is applied in order. Any variants
not selected at each test are eliminated. After each test,
if only one variant remains, select it as the best match
and proceed to step 3. If more than one variant remains,
move on to the next test.

<ol>
<li>Multiply the quality factor from the <code>Accept</code>
header with the quality-of-source factor for this variant's
media type, and select the variants with the highest
value.</li>

<li>Select the variants with the highest language quality
factor.</li>

<li>Select the variants with the best language match,
using either the order of languages in the
<code>Accept-Language</code> header (if present), or else
the order of languages in the <code>LanguagePriority</code>
directive (if present).</li>

<li>Select the variants with the highest 'level' media
parameter (used to give the version of text/html media
types).</li>

<li>Select variants with the best charset media
parameters, as given on the <code>Accept-Charset</code>
header line. Charset ISO-8859-1 is acceptable unless
explicitly excluded. Variants with a <code>text/*</code>
media type but not explicitly associated with a particular
charset are assumed to be in ISO-8859-1.</li>

<li>Select those variants which have associated charset
media parameters that are <em>not</em> ISO-8859-1. If
there are no such variants, select all variants
instead.</li>

<li>Select the variants with the best encoding. If there
are variants with an encoding that is acceptable to the
user-agent, select only these variants. Otherwise, if
there is a mix of encoded and non-encoded variants,
select only the unencoded variants. If either all
variants are encoded or all variants are not encoded,
select all variants.</li>

<li>Select the variants with the smallest content
length.</li>

<li>Select the first variant of those remaining. This
will be either the first listed in the type-map file, or
when variants are read from the directory, the one whose
file name comes first when sorted using ASCII code
order.</li>
</ol>
</li>

<li>The algorithm has now selected one 'best' variant, so
return it as the response. The HTTP response header
<code>Vary</code> is set to indicate the dimensions of
negotiation (browsers and caches can use this information when
caching the resource). End.</li>

<li>To get here means no variant was selected (because none
are acceptable to the browser). Return a 406 status (meaning
"No acceptable representation") with a response body
consisting of an HTML document listing the available
variants. Also set the HTTP <code>Vary</code> header to
indicate the dimensions of variance.</li>
</ol>
<section id="better"><title>Fiddling with Quality
Values</title>
<p>Apache sometimes changes the quality values from what would
be expected by a strict interpretation of the Apache
negotiation algorithm above. This is to get a better result
from the algorithm for browsers which do not send full or
accurate information. Some of the most popular browsers send
<code>Accept</code> header information which would otherwise
result in the selection of the wrong variant in many cases. If a
browser sends full and correct information these fiddles will not
be applied.</p>
411 <section id="wildcards"><title>Media Types and Wildcards</title>
<p>The <code>Accept:</code> request header indicates preferences
for media types. It can also include 'wildcard' media types, such
as "image/*" or "*/*" where the * matches any string. So a request
including:</p>
418 <example>Accept: image/*, */*</example>
420 <p>would indicate that any type starting "image/" is acceptable,
421 as is any other type.
422 Some browsers routinely send wildcards in addition to explicit
423 types they can handle. For example:</p>
<example>
Accept: text/html, text/plain, image/gif, image/jpeg, */*
</example>
428 <p>The intention of this is to indicate that the explicitly listed
429 types are preferred, but if a different representation is
430 available, that is ok too. Using explicit quality values,
431 what the browser really wants is something like:</p>
<example>
Accept: text/html, text/plain, image/gif, image/jpeg, */*; q=0.01
</example>
435 <p>The explicit types have no quality factor, so they default to a
436 preference of 1.0 (the highest). The wildcard */* is given a
437 low preference of 0.01, so other types will only be returned if
438 no variant matches an explicitly listed type.</p>
<p>If the <code>Accept:</code> header contains <em>no</em> q
factors at all, Apache sets the q value of "*/*", if present, to
0.01 to emulate the desired behavior. It also sets the q value of
wildcards of the format "type/*" to 0.02 (so these are preferred
over matches against "*/*"). If any media type on the
<code>Accept:</code> header contains a q factor, these special
values are <em>not</em> applied, so requests from browsers which
send the explicit information to start with work as expected.</p>
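
<p>As an illustration of the adjustment described above, consider
a header that carries no q factors at all (the header itself is a
made-up example):</p>

<example>Accept: text/html, image/*, */*</example>

<p>Following the rules above, it would be evaluated as if the
browser had sent:</p>

<example>Accept: text/html, image/*; q=0.02, */*; q=0.01</example>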
450 <section id="exceptions"><title>Language Negotiation Exceptions</title>
452 <p>New in Apache 2.0, some exceptions have been added to the
453 negotiation algorithm to allow graceful fallback when language
454 negotiation fails to find a match.</p>
<p>When a client requests a page on your server, but the server
cannot find a single page that matches the
<code>Accept-Language</code> sent by
the browser, the server will return either a "No Acceptable
Variant" or "Multiple Choices" response to the client. To avoid
these error messages, it is possible to configure Apache to ignore
the <code>Accept-Language</code> in these cases and provide a
document that does not explicitly match the client's request. The
<directive
module="mod_negotiation">ForceLanguagePriority</directive>
directive can be used to override one or both of these error
messages and substitute the server's judgement in the form of the
<directive module="mod_negotiation">LanguagePriority</directive>
directive.</p>
<p>The server will also attempt to match language-subsets when no
other match can be found. For example, if a client requests
documents with the language <code>en-GB</code> for British
English, the server is not normally allowed by the HTTP/1.1
standard to match that against a document that is marked as simply
<code>en</code>. (Note that it is almost surely a configuration
error to include <code>en-GB</code> and not <code>en</code> in the
<code>Accept-Language</code> header, since it is very unlikely
that a reader understands British English, but doesn't understand
English in general. Unfortunately, many current clients have
default configurations that resemble this.) However, if no other
language match is possible and the server is about to return a "No
Acceptable Variants" error or fall back to the <directive
module="mod_negotiation">LanguagePriority</directive>, the server
will ignore the subset specification and match <code>en-GB</code>
against <code>en</code> documents. Implicitly, Apache will add
the parent language to the client's acceptable language list with
a very low quality value. But note that if the client requests
"en-GB; q=0.9, fr; q=0.8", and the server has documents
designated "en" and "fr", then the "fr" document will be returned.
This is necessary to maintain compliance with the HTTP/1.1
specification and to work effectively with properly configured
clients.</p>
495 <p>In order to support advanced techniques (such as cookies or
496 special URL-paths) to determine the user's preferred language,
497 since Apache 2.0.47 <module>mod_negotiation</module> recognizes
498 the <a href="env.html">environment variable</a>
499 <code>prefer-language</code>. If it exists and contains an
500 appropriate language tag, <module>mod_negotiation</module> will
501 try to select a matching variant. If there's no such variant,
502 the normal negotiation process applies.</p>
<example><title>Example</title>
SetEnvIf Cookie "language=(.+)" prefer-language=$1<br />
Header append Vary cookie
</example>
<section id="extensions"><title>Extensions to Transparent Content
Negotiation</title>
514 <p>Apache extends the transparent content negotiation protocol (RFC
515 2295) as follows. A new <code>{encoding ..}</code> element is used in
516 variant lists to label variants which are available with a specific
517 content-encoding only. The implementation of the RVSA/1.0 algorithm
518 (RFC 2296) is extended to recognize encoded variants in the list, and
519 to use them as candidate variants whenever their encodings are
520 acceptable according to the <code>Accept-Encoding</code> request
521 header. The RVSA/1.0 implementation does not round computed quality
522 factors to 5 decimal places before choosing the best variant.</p>
525 <section id="naming"><title>Note on hyperlinks and naming conventions</title>
<p>If you are using language negotiation you can choose between
different naming conventions, because files can have more than
one extension, and the order of the extensions is normally
irrelevant (see the <a
href="mod/mod_mime.html#multipleext">mod_mime</a> documentation
for details).</p>
534 <p>A typical file has a MIME-type extension (<em>e.g.</em>,
535 <code>html</code>), maybe an encoding extension (<em>e.g.</em>,
536 <code>gz</code>), and of course a language extension
537 (<em>e.g.</em>, <code>en</code>) when we have different
538 language variants of this file.</p>
547 <li>foo.en.html.gz</li>
<p>Here are some more examples of filenames together with valid and
invalid hyperlinks:</p>
553 <table border="1" cellpadding="8" cellspacing="0">
554 <columnspec><column width=".2"/><column width=".2"/>
555 <column width=".2"/></columnspec>
559 <th>Valid hyperlink</th>
561 <th>Invalid hyperlink</th>
565 <td><em>foo.html.en</em></td>
574 <td><em>foo.en.html</em></td>
582 <td><em>foo.html.en.gz</em></td>
592 <td><em>foo.en.html.gz</em></td>
602 <td><em>foo.gz.html.en</em></td>
612 <td><em>foo.html.gz.en</em></td>
<p>Looking at the table above, you will notice that it is always
possible to use the name without any extensions in a hyperlink
(<em>e.g.</em>, <code>foo</code>). The advantage is that you
can hide the actual type of a document or file and can change
it later, <em>e.g.</em>, from <code>html</code> to
<code>shtml</code> or <code>cgi</code> without changing any
hyperlink references.</p>
<p>If you want to continue to use a MIME-type in your
hyperlinks (<em>e.g.</em>, <code>foo.html</code>) the language
extension (including an encoding extension if there is one)
must be on the right-hand side of the MIME-type extension
(<em>e.g.</em>, <code>foo.html.en</code>).</p>
637 <section id="caching"><title>Note on Caching</title>
<p>When a cache stores a representation, it associates it with
the request URL. The next time that URL is requested, the cache
can use the stored representation. But, if the resource is
negotiable at the server, this might result in only the first
requested variant being cached and subsequent cache hits might
return the wrong response. To prevent this, Apache normally
marks all responses that are returned after content negotiation
as non-cacheable by HTTP/1.0 clients. Apache also supports the
HTTP/1.1 protocol features to allow caching of negotiated
responses.</p>
<p>For requests which come from an HTTP/1.0 compliant client
(either a browser or a cache), the directive <directive
module="mod_negotiation">CacheNegotiatedDocs</directive> can be
used to allow caching of responses which were subject to
negotiation. This directive can be given in the server config or
virtual host, and since Apache 2.0 takes a single argument,
<code>On</code> or <code>Off</code> (earlier versions took no
arguments). It has no effect on requests from HTTP/1.1 clients.</p>
658 <p>For HTTP/1.1 clients, Apache sends a <code>Vary</code> HTTP
659 response header to indicate the negotiation dimensions for the
660 response. Caches can use this information to determine whether a
661 subsequent request can be served from the local copy. To
662 encourage a cache to use the local copy regardless of the
663 negotiation dimensions, set the <code>force-no-vary</code> <a
664 href="env.html#special">environment variable</a>.</p>
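
<p>A hedged sketch of setting the variable for one class of
client is shown below. The <code>User-Agent</code> pattern is
hypothetical and stands in for whatever cache or browser is known
to mishandle the <code>Vary</code> header:</p>

<example>SetEnvIf User-Agent "ExampleProxy" force-no-vary</example>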