<?xml version='1.0' encoding='UTF-8' ?>
<!DOCTYPE manualpage SYSTEM "./style/manualpage.dtd">
<?xml-stylesheet type="text/xsl" href="./style/manual.en.xsl"?>
<!-- $LastChangedRevision$ -->

<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements. See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->

<manualpage metafile="content-negotiation.xml.meta">

<title>Content Negotiation</title>

<summary>

    <p>Apache HTTPD supports content negotiation as described in
    the HTTP/1.1 specification. It can choose the best
    representation of a resource based on the browser-supplied
    preferences for media type, languages, character set and
    encoding. It also implements a couple of features to give
    more intelligent handling of requests from browsers that send
    incomplete negotiation information.</p>

    <p>Content negotiation is provided by the
    <module>mod_negotiation</module> module, which is compiled in
    by default.</p>
</summary>

<section id="about"><title>About Content Negotiation</title>

    <p>A resource may be available in several different
    representations. For example, it might be available in
    different languages or different media types, or a combination.
    One way of selecting the most appropriate choice is to give the
    user an index page, and let them select. However, it is often
    possible for the server to choose automatically. This works
    because browsers can send, as part of each request, information
    about what representations they prefer. For example, a browser
    could indicate that it would like to see information in French,
    if possible, else English will do. Browsers indicate their
    preferences by headers in the request. To request only French
    representations, the browser would send</p>

<example>Accept-Language: fr</example>

    <p>Note that this preference will only be applied when there is
    a choice of representations and they vary by language.</p>

    <p>As an example of a more complex request, this browser has
    been configured to accept French and English, but prefer
    French, and to accept various media types, preferring HTML over
    plain text or other text types, and preferring GIF or JPEG over
    other media types, but also allowing any other media type as a
    last resort:</p>

<example>
  Accept-Language: fr; q=1.0, en; q=0.5<br />
  Accept: text/html; q=1.0, text/*; q=0.8, image/gif; q=0.6, image/jpeg; q=0.6, image/*; q=0.5, */*; q=0.1
</example>

    <p>httpd supports 'server driven' content negotiation, as
    defined in the HTTP/1.1 specification. It fully supports the
    <code>Accept</code>, <code>Accept-Language</code>,
    <code>Accept-Charset</code> and <code>Accept-Encoding</code>
    request headers. httpd also supports 'transparent'
    content negotiation, which is an experimental negotiation
    protocol defined in RFC 2295 and RFC 2296. It does not offer
    support for 'feature negotiation' as defined in these RFCs.</p>

    <p>A <strong>resource</strong> is a conceptual entity
    identified by a URI (RFC 2396). An HTTP server like Apache HTTP Server
    provides access to <strong>representations</strong> of the
    resource(s) within its namespace, with each representation in
    the form of a sequence of bytes with a defined media type,
    character set, encoding, etc. Each resource may be associated
    with zero, one, or more than one representation at any given
    time. If multiple representations are available, the resource
    is referred to as <strong>negotiable</strong> and each of its
    representations is termed a <strong>variant</strong>. The ways
    in which the variants for a negotiable resource vary are called
    the <strong>dimensions</strong> of negotiation.</p>
</section>

<section id="negotiation"><title>Negotiation in httpd</title>

    <p>In order to negotiate a resource, the server needs to be
    given information about each of the variants. This is done in
    one of two ways:</p>

    <ul>
      <li>Using a type map (<em>i.e.</em>, a <code>*.var</code>
      file) which names the files containing the variants
      explicitly, or</li>

      <li>Using a 'MultiViews' search, where the server does an
      implicit filename pattern match and chooses from among the
      results.</li>
    </ul>

<section id="type-map"><title>Using a type-map file</title>

    <p>A type map is a document which is associated with the handler
    named <code>type-map</code> (or, for backwards-compatibility with
    older httpd configurations, the <glossary>MIME-type</glossary>
    <code>application/x-type-map</code>). Note that to use this
    feature, you must have a handler set in the configuration that
    defines a file suffix as <code>type-map</code>; this is best done
    with</p>

<highlight language="config">
AddHandler type-map .var
</highlight>

    <p>in the server configuration file.</p>

    <p>Type map files should have the same name as the resource
    which they are describing, followed by the extension
    <code>.var</code>. In the examples shown below, the resource is
    named <code>foo</code>, so the type map file is named
    <code>foo.var</code>.</p>

    <p>This file should have an entry for each available
    variant; these entries consist of contiguous HTTP-format header
    lines. Entries for different variants are separated by blank
    lines. Blank lines are not allowed within an entry. It is
    conventional to begin a map file with an entry for the combined
    entity as a whole (although this is not required, and if
    present will be ignored). An example map file is shown below.</p>

    <p>URIs in this file are relative to the location of the type map
    file. Usually, these files will be located in the same directory as
    the type map file, but this is not required. You may provide
    absolute or relative URIs for any file located on the same server as
    the map file.</p>

<example>
  URI: foo<br />
  <br />
  URI: foo.en.html<br />
  Content-type: text/html<br />
  Content-language: en<br />
  <br />
  URI: foo.fr.de.html<br />
  Content-type: text/html;charset=iso-8859-2<br />
  Content-language: fr, de<br />
</example>

    <p>Note also that a typemap file will take precedence over the
    filename's extension, even when Multiviews is on. If the
    variants have different source qualities, that may be indicated
    by the "qs" parameter to the media type, as in this picture
    (available as JPEG, GIF, or ASCII-art):</p>

<example>
  URI: foo<br />
  <br />
  URI: foo.jpeg<br />
  Content-type: image/jpeg; qs=0.8<br />
  <br />
  URI: foo.gif<br />
  Content-type: image/gif; qs=0.5<br />
  <br />
  URI: foo.txt<br />
  Content-type: text/plain; qs=0.01<br />
</example>

    <p>qs values can vary in the range 0.000 to 1.000. Note that
    any variant with a qs value of 0.000 will never be chosen.
    Variants with no 'qs' parameter value are given a qs factor of
    1.0. The qs parameter indicates the relative 'quality' of this
    variant compared to the other available variants, independent
    of the client's capabilities. For example, a JPEG file is
    usually of higher source quality than an ASCII file if it is
    attempting to represent a photograph. However, if the resource
    being represented is original ASCII art, then an ASCII
    representation would have a higher source quality than a JPEG
    representation. A qs value is therefore specific to a given
    variant depending on the nature of the resource it
    represents.</p>

    <p>The full list of headers recognized is available in the <a
    href="mod/mod_negotiation.html#typemaps">mod_negotiation
    typemap</a> documentation.</p>
</section>

<section id="multiviews"><title>Multiviews</title>

    <p><code>MultiViews</code> is a per-directory option, meaning it
    can be set with an <directive module="core">Options</directive>
    directive within a <directive module="core"
    type="section">Directory</directive>, <directive module="core"
    type="section">Location</directive> or <directive module="core"
    type="section">Files</directive> section in
    <code>httpd.conf</code>, or (if <directive
    module="core">AllowOverride</directive> is properly set) in
    <code>.htaccess</code> files. Note that <code>Options All</code>
    does not set <code>MultiViews</code>; you have to ask for it by
    name.</p>
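
    <p>As a minimal sketch, <code>MultiViews</code> could be enabled
    for a single directory as shown here. The path
    <code>/var/www/docs</code> is a placeholder chosen for
    illustration and does not come from this document; the leading
    <code>+</code> adds the option to whatever is inherited rather
    than replacing it.</p>

<highlight language="config">
# Hypothetical directory; substitute the one that holds your variants
&lt;Directory "/var/www/docs"&gt;
    Options +MultiViews
&lt;/Directory&gt;
</highlight>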

    <p>The effect of <code>MultiViews</code> is as follows: if the
    server receives a request for <code>/some/dir/foo</code>, if
    <code>/some/dir</code> has <code>MultiViews</code> enabled, and
    <code>/some/dir/foo</code> does <em>not</em> exist, then the
    server reads the directory looking for files named
    <code>foo.*</code>, and effectively fakes up a type map which
    names all those files, assigning them the same media types and
    content-encodings it would have if the client had asked for one
    of them by name. It then chooses the best match to the client's
    requirements.</p>

    <p><code>MultiViews</code> may also apply to searches for the file
    named by the <directive
    module="mod_dir">DirectoryIndex</directive> directive, if the
    server is trying to index a directory. If the configuration files
    specify</p>

<highlight language="config">
DirectoryIndex index
</highlight>

    <p>then the server will arbitrate between <code>index.html</code>
    and <code>index.html3</code> if both are present. If neither
    is present, and <code>index.cgi</code> is there, the server
    will run it.</p>

    <p>If one of the files found when reading the directory does not
    have an extension recognized by <code>mod_mime</code> to designate
    its Charset, Content-Type, Language, or Encoding, then the result
    depends on the setting of the <directive
    module="mod_mime">MultiViewsMatch</directive> directive. This
    directive determines whether handlers, filters, and other
    extension types can participate in MultiViews negotiation.</p>
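
    <p>For instance, the following sketch would let files whose
    extensions map only to handlers or filters take part in the
    MultiViews search. The choice of arguments is illustrative; the
    <directive module="mod_mime">MultiViewsMatch</directive>
    reference lists the full set of options and their
    trade-offs.</p>

<highlight language="config">
# Let handler and filter extensions participate in MultiViews matching
MultiViewsMatch Handlers Filters
</highlight>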
</section>
</section>

<section id="methods"><title>The Negotiation Methods</title>

    <p>After httpd has obtained a list of the variants for a given
    resource, either from a type-map file or from the filenames in
    the directory, it invokes one of two methods to decide on the
    'best' variant to return, if any. It is not necessary to know
    any of the details of how negotiation actually takes place in
    order to use httpd's content negotiation features. However, the
    rest of this document explains the methods used for those
    interested.</p>

    <p>There are two negotiation methods:</p>

    <ol>
      <li><strong>Server driven negotiation with the httpd
      algorithm</strong> is used in the normal case. The httpd
      algorithm is explained in more detail below. When this
      algorithm is used, httpd can sometimes 'fiddle' the quality
      factor of a particular dimension to achieve a better result.
      The ways httpd can fiddle quality factors are explained in
      more detail below.</li>

      <li><strong>Transparent content negotiation</strong> is used
      when the browser specifically requests this through the
      mechanism defined in RFC 2295. This negotiation method gives
      the browser full control over deciding on the 'best' variant;
      the result therefore depends on the specific algorithms
      used by the browser. As part of the transparent negotiation
      process, the browser can ask httpd to run the 'remote
      variant selection algorithm' defined in RFC 2296.</li>
    </ol>

<section id="dimensions"><title>Dimensions of Negotiation</title>

    <table>
      <columnspec><column width=".15"/><column width=".85"/></columnspec>
      <tr valign="top">
        <th>Dimension</th>

        <th>Notes</th>
      </tr>

      <tr valign="top">
        <td>Media Type</td>

        <td>Browser indicates preferences with the <code>Accept</code>
        header field. Each item can have an associated quality factor.
        Variant description can also have a quality factor (the "qs"
        parameter).</td>
      </tr>

      <tr valign="top">
        <td>Language</td>

        <td>Browser indicates preferences with the
        <code>Accept-Language</code> header field. Each item can have
        a quality factor. Variants can be associated with none, one or
        more than one language.</td>
      </tr>

      <tr valign="top">
        <td>Encoding</td>

        <td>Browser indicates preference with the
        <code>Accept-Encoding</code> header field. Each item can have
        a quality factor.</td>
      </tr>

      <tr valign="top">
        <td>Charset</td>

        <td>Browser indicates preference with the
        <code>Accept-Charset</code> header field. Each item can have a
        quality factor. Variants can indicate a charset as a parameter
        of the media type.</td>
      </tr>
    </table>
</section>

<section id="algorithm"><title>httpd Negotiation Algorithm</title>

    <p>httpd can use the following algorithm to select the 'best'
    variant (if any) to return to the browser. This algorithm is
    not further configurable. It operates as follows:</p>

    <ol>
      <li>First, for each dimension of the negotiation, check the
      appropriate <em>Accept*</em> header field and assign a
      quality to each variant. If the <em>Accept*</em> header for
      any dimension implies that this variant is not acceptable,
      eliminate it. If no variants remain, go to step 4.</li>

      <li>
        Select the 'best' variant by a process of elimination. Each
        of the following tests is applied in order. Any variants
        not selected at each test are eliminated. After each test,
        if only one variant remains, select it as the best match
        and proceed to step 3. If more than one variant remains,
        move on to the next test.

        <ol>
          <li>Multiply the quality factor from the <code>Accept</code>
          header with the quality-of-source factor for this variant's
          media type, and select the variants with the highest
          value.</li>

          <li>Select the variants with the highest language quality
          factor.</li>

          <li>Select the variants with the best language match,
          using either the order of languages in the
          <code>Accept-Language</code> header (if present), or else
          the order of languages in the <code>LanguagePriority</code>
          directive (if present).</li>

          <li>Select the variants with the highest 'level' media
          parameter (used to give the version of text/html media
          types).</li>

          <li>Select variants with the best charset media
          parameters, as given on the <code>Accept-Charset</code>
          header line. Charset ISO-8859-1 is acceptable unless
          explicitly excluded. Variants with a <code>text/*</code>
          media type but not explicitly associated with a particular
          charset are assumed to be in ISO-8859-1.</li>

          <li>Select those variants which have associated charset
          media parameters that are <em>not</em> ISO-8859-1. If
          there are no such variants, select all variants
          instead.</li>

          <li>Select the variants with the best encoding. If there
          are variants with an encoding that is acceptable to the
          user-agent, select only these variants. Otherwise if
          there is a mix of encoded and non-encoded variants,
          select only the unencoded variants. If either all
          variants are encoded or all variants are not encoded,
          select all variants.</li>

          <li>Select the variants with the smallest content
          length.</li>

          <li>Select the first variant of those remaining. This
          will be either the first listed in the type-map file, or
          when variants are read from the directory, the one whose
          file name comes first when sorted using ASCII code
          order.</li>
        </ol>
      </li>

      <li>The algorithm has now selected one 'best' variant, so
      return it as the response. The HTTP response header
      <code>Vary</code> is set to indicate the dimensions of
      negotiation (browsers and caches can use this information when
      caching the resource). End.</li>

      <li>To get here means no variant was selected (because none
      are acceptable to the browser). Return a 406 status (meaning
      "No acceptable representation") with a response body
      consisting of an HTML document listing the available
      variants. Also set the HTTP <code>Vary</code> header to
      indicate the dimensions of variance.</li>
    </ol>
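
    <p>The language tie-break in the third test of step 2 falls back
    to the <code>LanguagePriority</code> directive when the
    <code>Accept-Language</code> header does not settle the order.
    A minimal sketch, assuming the server holds English, French and
    German variants (the language list is illustrative only):</p>

<highlight language="config">
# Order of preference used when the request does not decide between languages
LanguagePriority en fr de
</highlight>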
</section>
</section>

<section id="better"><title>Fiddling with Quality Values</title>

    <p>httpd sometimes changes the quality values from what would
    be expected by a strict interpretation of the httpd
    negotiation algorithm above. This is to get a better result
    from the algorithm for browsers which do not send full or
    accurate information. Some of the most popular browsers send
    <code>Accept</code> header information which would otherwise
    result in the selection of the wrong variant in many cases. If a
    browser sends full and correct information these fiddles will not
    be applied.</p>

<section id="wildcards"><title>Media Types and Wildcards</title>

    <p>The <code>Accept:</code> request header indicates preferences
    for media types. It can also include 'wildcard' media types, such
    as "image/*" or "*/*" where the * matches any string. So a request
    including:</p>

<example>Accept: image/*, */*</example>

    <p>would indicate that any type starting "image/" is acceptable,
    as is any other type.
    Some browsers routinely send wildcards in addition to explicit
    types they can handle. For example:</p>

<example>
  Accept: text/html, text/plain, image/gif, image/jpeg, */*
</example>
    <p>The intention of this is to indicate that the explicitly listed
    types are preferred, but if a different representation is
    available, that is ok too. Using explicit quality values,
    what the browser really wants is something like:</p>
<example>
  Accept: text/html, text/plain, image/gif, image/jpeg, */*; q=0.01
</example>
    <p>The explicit types have no quality factor, so they default to a
    preference of 1.0 (the highest). The wildcard */* is given a
    low preference of 0.01, so other types will only be returned if
    no variant matches an explicitly listed type.</p>

    <p>If the <code>Accept:</code> header contains <em>no</em> q
    factors at all, httpd sets the q value of "*/*", if present, to
    0.01 to emulate the desired behavior. It also sets the q value of
    wildcards of the format "type/*" to 0.02 (so these are preferred
    over matches against "*/*"). If any media type on the
    <code>Accept:</code> header contains a q factor, these special
    values are <em>not</em> applied, so requests from browsers which
    send the explicit information to start with work as expected.</p>
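
    <p>As a worked illustration (the request and the three variants
    are assumed here purely for the sake of the arithmetic), suppose
    a browser sends the header</p>

<example>Accept: image/*, */*</example>

    <p>with no q factors, and the resource has variants of type
    <code>image/jpeg</code> (qs=0.8), <code>image/gif</code> (qs=0.5)
    and <code>text/plain</code> (qs=0.01). Under the rules just
    described, httpd would treat <code>image/*</code> as q=0.02 and
    <code>*/*</code> as q=0.01, so the first test of the negotiation
    algorithm would compare these products:</p>

<example>
  image/jpeg: 0.02 x 0.8 = 0.016<br />
  image/gif: 0.02 x 0.5 = 0.010<br />
  text/plain: 0.01 x 0.01 = 0.0001
</example>

    <p>and the JPEG variant would be preferred.</p>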
</section>

<section id="exceptions"><title>Language Negotiation Exceptions</title>

    <p>New in httpd 2.0, some exceptions have been added to the
    negotiation algorithm to allow graceful fallback when language
    negotiation fails to find a match.</p>

    <p>When a client requests a page on your server, but the server
    cannot find a single page that matches the
    <code>Accept-Language</code> sent by
    the browser, the server will return either a "No Acceptable
    Variant" or "Multiple Choices" response to the client. To avoid
    these error messages, it is possible to configure httpd to ignore
    the <code>Accept-Language</code> in these cases and provide a
    document that does not explicitly match the client's request. The
    <directive
    module="mod_negotiation">ForceLanguagePriority</directive>
    directive can be used to override one or both of these error
    messages and substitute the server's judgement in the form of the
    <directive module="mod_negotiation">LanguagePriority</directive>
    directive.</p>
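
    <p>A minimal sketch of such a configuration follows. The language
    list is an assumption made for illustration. <code>Prefer</code>
    asks the server to pick one variant instead of answering
    "Multiple Choices", and <code>Fallback</code> asks it to pick one
    instead of answering "No Acceptable Variant"; in both cases the
    choice follows the <code>LanguagePriority</code> order.</p>

<highlight language="config">
# Illustrative language order; adjust to the variants you actually publish
LanguagePriority en fr de
# Resolve both error cases with the order given above
ForceLanguagePriority Prefer Fallback
</highlight>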

    <p>The server will also attempt to match language-subsets when no
    other match can be found. For example, if a client requests
    documents with the language <code>en-GB</code> for British
    English, the server is not normally allowed by the HTTP/1.1
    standard to match that against a document that is marked as simply
    <code>en</code>. (Note that it is almost surely a configuration
    error to include <code>en-GB</code> and not <code>en</code> in the
    <code>Accept-Language</code> header, since it is very unlikely
    that a reader understands British English, but doesn't understand
    English in general. Unfortunately, many current clients have
    default configurations that resemble this.) However, if no other
    language match is possible and the server is about to return a "No
    Acceptable Variants" error or fall back to the <directive
    module="mod_negotiation">LanguagePriority</directive>, the server
    will ignore the subset specification and match <code>en-GB</code>
    against <code>en</code> documents. Implicitly, httpd will add
    the parent language to the client's acceptable language list with
    a very low quality value. But note that if the client requests
    "en-GB; q=0.9, fr; q=0.8", and the server has documents
    designated "en" and "fr", then the "fr" document will be returned.
    This is necessary to maintain compliance with the HTTP/1.1
    specification and to work effectively with properly configured
    clients.</p>

    <p>In order to support advanced techniques (such as cookies or
    special URL-paths) to determine the user's preferred language,
    since httpd 2.0.47 <module>mod_negotiation</module> recognizes
    the <a href="env.html">environment variable</a>
    <code>prefer-language</code>. If it exists and contains an
    appropriate language tag, <module>mod_negotiation</module> will
    try to select a matching variant. If there's no such variant,
    the normal negotiation process applies.</p>

    <example><title>Example</title>
    <highlight language="config">
SetEnvIf Cookie "language=(.+)" prefer-language=$1
Header append Vary cookie
    </highlight>
    </example>
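
    <p>The same variable could also be derived from a URL-path
    prefix. The following is only a sketch: the
    <code>/en/</code>, <code>/fr/</code> and <code>/de/</code>
    prefixes are a hypothetical site layout, and mapping those
    prefixes onto the actual files is assumed to be handled
    elsewhere in the configuration.</p>

<highlight language="config">
# Hypothetical layout: the first path segment names the preferred language
SetEnvIf Request_URI "^/(en|fr|de)/" prefer-language=$1
</highlight>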
</section>
</section>

<section id="extensions"><title>Extensions to Transparent Content
Negotiation</title>

    <p>httpd extends the transparent content negotiation protocol (RFC
    2295) as follows. A new <code>{encoding ..}</code> element is used in
    variant lists to label variants which are available with a specific
    content-encoding only. The implementation of the RVSA/1.0 algorithm
    (RFC 2296) is extended to recognize encoded variants in the list, and
    to use them as candidate variants whenever their encodings are
    acceptable according to the <code>Accept-Encoding</code> request
    header. The RVSA/1.0 implementation does not round computed quality
    factors to 5 decimal places before choosing the best variant.</p>
</section>

<section id="naming"><title>Note on hyperlinks and naming conventions</title>

    <p>If you are using language negotiation you can choose between
    different naming conventions, because files can have more than
    one extension, and the order of the extensions is normally
    irrelevant (see the <a
    href="mod/mod_mime.html#multipleext">mod_mime</a> documentation
    for details).</p>

    <p>A typical file has a MIME-type extension (<em>e.g.</em>,
    <code>html</code>), maybe an encoding extension (<em>e.g.</em>,
    <code>gz</code>), and of course a language extension
    (<em>e.g.</em>, <code>en</code>) when we have different
    language variants of this file.</p>

    <p>Examples:</p>

    <ul>
      <li>foo.en.html</li>

      <li>foo.html.en</li>

      <li>foo.en.html.gz</li>
    </ul>

    <p>Here are some more examples of filenames together with valid and
    invalid hyperlinks:</p>
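
    <table border="1" cellpadding="8" cellspacing="0">
      <columnspec><column width=".2"/><column width=".2"/>
      <column width=".2"/></columnspec>
      <tr>
        <th>Filename</th>

        <th>Valid hyperlink</th>

        <th>Invalid hyperlink</th>
      </tr>

      <tr>
        <td><em>foo.html.en</em></td>

        <td>foo<br />
         foo.html</td>

        <td>-</td>
      </tr>

      <tr>
        <td><em>foo.en.html</em></td>

        <td>foo</td>

        <td>foo.html</td>
      </tr>

      <tr>
        <td><em>foo.html.en.gz</em></td>

        <td>foo<br />
         foo.html</td>

        <td>foo.gz<br />
         foo.html.gz</td>
      </tr>

      <tr>
        <td><em>foo.en.html.gz</em></td>

        <td>foo</td>

        <td>foo.html<br />
         foo.html.gz<br />
         foo.gz</td>
      </tr>

      <tr>
        <td><em>foo.gz.html.en</em></td>

        <td>foo<br />
         foo.gz<br />
         foo.gz.html</td>

        <td>foo.html</td>
      </tr>

      <tr>
        <td><em>foo.html.gz.en</em></td>

        <td>foo<br />
         foo.html<br />
         foo.html.gz</td>

        <td>foo.gz</td>
      </tr>
    </table>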

    <p>Looking at the table above, you will notice that it is always
    possible to use the name without any extensions in a hyperlink
    (<em>e.g.</em>, <code>foo</code>). The advantage is that you
    can hide the actual type of a document or file and can change
    it later, <em>e.g.</em>, from <code>html</code> to
    <code>shtml</code> or <code>cgi</code> without changing any
    hyperlink references.</p>

    <p>If you want to continue to use a MIME-type in your
    hyperlinks (<em>e.g.</em>, <code>foo.html</code>) the language
    extension (including an encoding extension if there is one)
    must be on the right hand side of the MIME-type extension
    (<em>e.g.</em>, <code>foo.html.en</code>).</p>
</section>

<section id="caching"><title>Note on Caching</title>

    <p>When a cache stores a representation, it associates it with
    the request URL. The next time that URL is requested, the cache
    can use the stored representation. But, if the resource is
    negotiable at the server, this might result in only the first
    requested variant being cached and subsequent cache hits might
    return the wrong response. To prevent this, httpd normally
    marks all responses that are returned after content negotiation
    as non-cacheable by HTTP/1.0 clients. httpd also supports the
    HTTP/1.1 protocol features to allow caching of negotiated
    documents.</p>

    <p>For requests which come from an HTTP/1.0 compliant client
    (either a browser or a cache), the directive <directive
    module="mod_negotiation">CacheNegotiatedDocs</directive> can be
    used to allow caching of responses which were subject to
    negotiation. This directive can be given in the server config or
    virtual host, and takes a single argument, <code>On</code> or
    <code>Off</code>. It has no effect on requests
    from HTTP/1.1 clients.</p>
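
    <p>A minimal sketch of enabling this, assuming you do want
    HTTP/1.0 caches to store negotiated responses (in httpd 2.x the
    directive takes an <code>On</code> or <code>Off</code>
    argument):</p>

<highlight language="config">
# Server config or virtual host context
CacheNegotiatedDocs On
</highlight>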

    <p>For HTTP/1.1 clients, httpd sends a <code>Vary</code> HTTP
    response header to indicate the negotiation dimensions for the
    response. Caches can use this information to determine whether a
    subsequent request can be served from the local copy. To
    encourage a cache to use the local copy regardless of the
    negotiation dimensions, set the <code>force-no-vary</code> <a
    href="env.html#special">environment variable</a>.</p>

</section>

</manualpage>