<?xml version='1.0' encoding='UTF-8' ?>
<!DOCTYPE manualpage SYSTEM "./style/manualpage.dtd">
<?xml-stylesheet type="text/xsl" href="./style/manual.en.xsl"?>
<!-- $LastChangedRevision$ -->

<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements. See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->
<manualpage metafile="content-negotiation.xml.meta">

<title>Content Negotiation</title>

<summary>

<p>Apache httpd supports content negotiation as described in
the HTTP/1.1 specification. It can choose the best
representation of a resource based on the browser-supplied
preferences for media type, languages, character set and
encoding. It also implements a couple of features to give
more intelligent handling of requests from browsers that send
incomplete negotiation information.</p>

<p>Content negotiation is provided by the
<module>mod_negotiation</module> module, which is compiled in
by default.</p>
</summary>
<section id="about"><title>About Content Negotiation</title>

<p>A resource may be available in several different
representations. For example, it might be available in
different languages or different media types, or a combination.
One way of selecting the most appropriate choice is to give the
user an index page, and let them select. However, it is often
possible for the server to choose automatically. This works
because browsers can send, as part of each request, information
about what representations they prefer. For example, a browser
could indicate that it would like to see information in French
if possible, and otherwise in English. Browsers indicate their
preferences by headers in the request. To request only French
representations, the browser would send</p>

<example>Accept-Language: fr</example>

<p>Note that this preference will only be applied when there is
a choice of representations and they vary by language.</p>
<p>As an example of a more complex request, this browser has
been configured to accept French and English, but prefer
French, and to accept various media types, preferring HTML over
plain text or other text types, and preferring GIF or JPEG over
other media types, but also allowing any other media type as a
last resort:</p>

<example>
Accept-Language: fr; q=1.0, en; q=0.5<br />
Accept: text/html; q=1.0, text/*; q=0.8, image/gif; q=0.6, image/jpeg; q=0.6, image/*; q=0.5, */*; q=0.1
</example>
<p>httpd supports 'server driven' content negotiation, as
defined in the HTTP/1.1 specification. It fully supports the
<code>Accept</code>, <code>Accept-Language</code>,
<code>Accept-Charset</code> and <code>Accept-Encoding</code>
request headers. httpd also supports 'transparent'
content negotiation, which is an experimental negotiation
protocol defined in RFC 2295 and RFC 2296. It does not offer
support for 'feature negotiation' as defined in these RFCs.</p>

<p>A <strong>resource</strong> is a conceptual entity
identified by a URI (RFC 2396). An HTTP server like Apache HTTP Server
provides access to <strong>representations</strong> of the
resource(s) within its namespace, with each representation in
the form of a sequence of bytes with a defined media type,
character set, encoding, etc. Each resource may be associated
with zero, one, or more than one representation at any given
time. If multiple representations are available, the resource
is referred to as <strong>negotiable</strong> and each of its
representations is termed a <strong>variant</strong>. The ways
in which the variants for a negotiable resource vary are called
the <strong>dimensions</strong> of negotiation.</p>
</section>
<section id="negotiation"><title>Negotiation in httpd</title>

<p>In order to negotiate a resource, the server needs to be
given information about each of the variants. This is done in
one of two ways:</p>

<ul>
<li>Using a type map (<em>i.e.</em>, a <code>*.var</code>
file) which names the files containing the variants
explicitly, or</li>

<li>Using a 'MultiViews' search, where the server does an
implicit filename pattern match and chooses from among the
results.</li>
</ul>
<section id="type-map"><title>Using a type-map file</title>

<p>A type map is a document which is associated with the handler
named <code>type-map</code> (or, for backwards-compatibility with
older httpd configurations, the <glossary>MIME-type</glossary>
<code>application/x-type-map</code>). Note that to use this
feature, you must have a handler set in the configuration that
defines a file suffix as <code>type-map</code>; this is best done
with</p>

<highlight language="config">AddHandler type-map .var</highlight>

<p>in the server configuration file.</p>

<p>Type map files should have the same name as the resource
which they are describing, followed by the extension
<code>.var</code>. In the examples shown below, the resource is
named <code>foo</code>, so the type map file is named
<code>foo.var</code>.</p>
<p>This file should have an entry for each available
variant; these entries consist of contiguous HTTP-format header
lines. Entries for different variants are separated by blank
lines. Blank lines are illegal within an entry. It is
conventional to begin a map file with an entry for the combined
entity as a whole (although this is not required, and if
present will be ignored). An example map file is shown below.</p>

<p>URIs in this file are relative to the location of the type map
file. Usually, these files will be located in the same directory as
the type map file, but this is not required. You may provide
absolute or relative URIs for any file located on the same server as
the map file.</p>

<example>
URI: foo.en.html<br />
Content-type: text/html<br />
Content-language: en<br />
<br />
URI: foo.fr.de.html<br />
Content-type: text/html;charset=iso-8859-2<br />
Content-language: fr, de<br />
</example>
<p>Note also that a type map file will take precedence over the
filename's extension, even when MultiViews is on. If the
variants have different source qualities, that may be indicated
by the "qs" parameter to the media type, as in this picture
(available as JPEG, GIF, or ASCII-art):</p>

<example>
Content-type: image/jpeg; qs=0.8<br />
<br />
Content-type: image/gif; qs=0.5<br />
<br />
Content-type: text/plain; qs=0.01<br />
</example>
<p>qs values can vary in the range 0.000 to 1.000. Note that
any variant with a qs value of 0.000 will never be chosen.
Variants with no 'qs' parameter value are given a qs factor of
1.0. The qs parameter indicates the relative 'quality' of this
variant compared to the other available variants, independent
of the client's capabilities. For example, a JPEG file is
usually of higher source quality than an ASCII file if it is
attempting to represent a photograph. However, if the resource
being represented is original ASCII art, then an ASCII
representation would have a higher source quality than a JPEG
representation. A qs value is therefore specific to a given
variant depending on the nature of the resource it
represents.</p>

<p>The full list of headers recognized is available in the <a
href="mod/mod_negotiation.html#typemaps">mod_negotiation
typemap</a> documentation.</p>
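<p>As a worked illustration of how qs values interact with client
preferences, consider the media-type test of the negotiation
algorithm described later in this document, which multiplies the
client's <code>Accept</code> quality by the variant's qs value.
Assuming a client that sends the <code>Accept</code> header from the
complex request shown earlier (<code>text/*; q=0.8</code>,
<code>image/gif; q=0.6</code>, <code>image/jpeg; q=0.6</code>) and
the qs values of the picture example above, the products would
be:</p>

<example>
image/jpeg: 0.6 x 0.8 = 0.48<br />
image/gif: 0.6 x 0.5 = 0.30<br />
text/plain: 0.8 x 0.01 = 0.008 (matched by text/*)
</example>

<p>Under these assumptions the JPEG variant has the highest product
and would be selected by that test.</p>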
</section>

<section id="multiviews"><title>Multiviews</title>

<p><code>MultiViews</code> is a per-directory option, meaning it
can be set with an <directive module="core">Options</directive>
directive within a <directive module="core"
type="section">Directory</directive>, <directive module="core"
type="section">Location</directive> or <directive module="core"
type="section">Files</directive> section in
<code>httpd.conf</code>, or (if <directive
module="core">AllowOverride</directive> is properly set) in
<code>.htaccess</code> files. Note that <code>Options All</code>
does not set <code>MultiViews</code>; you have to ask for it by
name.</p>

<p>The effect of <code>MultiViews</code> is as follows: if the
server receives a request for <code>/some/dir/foo</code>, if
<code>/some/dir</code> has <code>MultiViews</code> enabled, and
<code>/some/dir/foo</code> does <em>not</em> exist, then the
server reads the directory looking for files named
<code>foo.*</code>, and
effectively fakes up a type map which names all those files,
assigning them the same media types and content-encodings it
would have if the client had asked for one of them by name. It
then chooses the best match to the client's requirements.</p>
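<p>A minimal sketch of enabling <code>MultiViews</code> for one
directory follows. The filesystem path is a hypothetical example,
not taken from this document; substitute the directory that
actually holds the variants.</p>

<example><title>Sketch: enabling MultiViews</title>
<highlight language="config">
# Hypothetical path, used for illustration only
&lt;Directory "/usr/local/apache2/htdocs/some/dir"&gt;
    # Add MultiViews to whatever options are already in effect
    Options +MultiViews
&lt;/Directory&gt;
</highlight>
</example>

<p>The <code>+</code> prefix merges the option into the existing set
rather than replacing it, which is usually the safer choice when
other options are inherited from an enclosing section.</p>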
<p><code>MultiViews</code> may also apply to searches for the file
named by the <directive
module="mod_dir">DirectoryIndex</directive> directive, if the
server is trying to index a directory. If the configuration files
specify</p>

<highlight language="config">DirectoryIndex index</highlight>

<p>then the server will arbitrate between <code>index.html</code>
and <code>index.html3</code> if both are present. If neither
is present, and <code>index.cgi</code> is there, the server
will run it.</p>

<p>If one of the files found when reading the directory does not
have an extension recognized by <code>mod_mime</code> to designate
its Charset, Content-Type, Language, or Encoding, then the result
depends on the setting of the <directive
module="mod_mime">MultiViewsMatch</directive> directive. This
directive determines whether handlers, filters, and other
extension types can participate in MultiViews negotiation.</p>
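<p>As a hedged sketch, the following setting would let files whose
extensions map to handlers or filters take part in the MultiViews
search. It assumes the <code>Handlers</code> and
<code>Filters</code> keywords documented for
<directive module="mod_mime">MultiViewsMatch</directive>; consult
that directive's reference for the default and the other accepted
values before relying on it.</p>

<example><title>Sketch: widening the MultiViews match</title>
<highlight language="config">
# Allow handler and filter extensions to participate in MultiViews
MultiViewsMatch Handlers Filters
</highlight>
</example>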
</section>
</section>

<section id="methods"><title>The Negotiation Methods</title>

<p>After httpd has obtained a list of the variants for a given
resource, either from a type-map file or from the filenames in
the directory, it invokes one of two methods to decide on the
'best' variant to return, if any. It is not necessary to know
any of the details of how negotiation actually takes place in
order to use httpd's content negotiation features. However, the
rest of this document explains the methods used for those
interested.</p>

<p>There are two negotiation methods:</p>

<ol>
<li><strong>Server driven negotiation with the httpd
algorithm</strong> is used in the normal case. The httpd
algorithm is explained in more detail below. When this
algorithm is used, httpd can sometimes 'fiddle' the quality
factor of a particular dimension to achieve a better result.
The ways httpd can fiddle quality factors are explained in
more detail below.</li>

<li><strong>Transparent content negotiation</strong> is used
when the browser specifically requests this through the
mechanism defined in RFC 2295. This negotiation method gives
the browser full control over deciding on the 'best' variant;
the result is therefore dependent on the specific algorithms
used by the browser. As part of the transparent negotiation
process, the browser can ask httpd to run the 'remote
variant selection algorithm' defined in RFC 2296.</li>
</ol>
<section id="dimensions"><title>Dimensions of Negotiation</title>

<table>
<columnspec><column width=".15"/><column width=".85"/></columnspec>
<tr valign="top">
<th>Dimension</th>
<th>Notes</th>
</tr>
<tr valign="top">
<td>Media Type</td>
<td>Browser indicates preferences with the <code>Accept</code>
header field. Each item can have an associated quality factor.
Variant description can also have a quality factor (the "qs"
parameter).</td>
</tr>
<tr valign="top">
<td>Language</td>
<td>Browser indicates preferences with the
<code>Accept-Language</code> header field. Each item can have
a quality factor. Variants can be associated with none, one or
more than one language.</td>
</tr>
<tr valign="top">
<td>Encoding</td>
<td>Browser indicates preference with the
<code>Accept-Encoding</code> header field. Each item can have
a quality factor.</td>
</tr>
<tr valign="top">
<td>Charset</td>
<td>Browser indicates preference with the
<code>Accept-Charset</code> header field. Each item can have a
quality factor. Variants can indicate a charset as a parameter
of the media type.</td>
</tr>
</table>
</section>
<section id="algorithm"><title>httpd Negotiation Algorithm</title>

<p>httpd can use the following algorithm to select the 'best'
variant (if any) to return to the browser. This algorithm is
not further configurable. It operates as follows:</p>

<ol>
<li>First, for each dimension of the negotiation, check the
appropriate <em>Accept*</em> header field and assign a
quality to each variant. If the <em>Accept*</em> header for
any dimension implies that this variant is not acceptable,
eliminate it. If no variants remain, go to step 4.</li>

<li>
Select the 'best' variant by a process of elimination. Each
of the following tests is applied in order. Any variants
not selected at each test are eliminated. After each test,
if only one variant remains, select it as the best match
and proceed to step 3. If more than one variant remains,
move on to the next test.

<ol>
<li>Multiply the quality factor from the <code>Accept</code>
header with the quality-of-source factor for this variant's
media type, and select the variants with the highest
value.</li>

<li>Select the variants with the highest language quality
factor.</li>

<li>Select the variants with the best language match,
using either the order of languages in the
<code>Accept-Language</code> header (if present), or else
the order of languages in the <code>LanguagePriority</code>
directive (if present).</li>

<li>Select the variants with the highest 'level' media
parameter (used to give the version of text/html media
types).</li>

<li>Select variants with the best charset media
parameters, as given on the <code>Accept-Charset</code>
header line. Charset ISO-8859-1 is acceptable unless
explicitly excluded. Variants with a <code>text/*</code>
media type but not explicitly associated with a particular
charset are assumed to be in ISO-8859-1.</li>

<li>Select those variants which have associated charset
media parameters that are <em>not</em> ISO-8859-1. If
there are no such variants, select all variants
instead.</li>

<li>Select the variants with the best encoding. If there
are variants with an encoding that is acceptable to the
user-agent, select only these variants. Otherwise, if
there is a mix of encoded and non-encoded variants,
select only the unencoded variants. If either all
variants are encoded or all variants are not encoded,
select all variants.</li>

<li>Select the variants with the smallest content
length.</li>

<li>Select the first variant of those remaining. This
will be either the first listed in the type-map file, or
when variants are read from the directory, the one whose
file name comes first when sorted using ASCII code
order.</li>
</ol>
</li>

<li>The algorithm has now selected one 'best' variant, so
return it as the response. The HTTP response header
<code>Vary</code> is set to indicate the dimensions of
negotiation (browsers and caches can use this information when
caching the resource). End.</li>

<li>To get here means no variant was selected (because none
are acceptable to the browser). Return a 406 (Not Acceptable)
status, meaning that there is no acceptable representation,
with a response body
consisting of an HTML document listing the available
variants. Also set the HTTP <code>Vary</code> header to
indicate the dimensions of variance.</li>
</ol>
</section>
</section>
<section id="better"><title>Fiddling with Quality
Values</title>

<p>httpd sometimes changes the quality values from what would
be expected by a strict interpretation of the httpd
negotiation algorithm above. This is to get a better result
from the algorithm for browsers which do not send full or
accurate information. Some of the most popular browsers send
<code>Accept</code> header information which would otherwise
result in the selection of the wrong variant in many cases. If a
browser sends full and correct information these fiddles will not
be applied.</p>
<section id="wildcards"><title>Media Types and Wildcards</title>

<p>The <code>Accept:</code> request header indicates preferences
for media types. It can also include 'wildcard' media types, such
as "image/*" or "*/*" where the * matches any string. So a request
including:</p>

<example>Accept: image/*, */*</example>

<p>would indicate that any type starting "image/" is acceptable,
as is any other type.
Some browsers routinely send wildcards in addition to explicit
types they can handle. For example:</p>

<example>
Accept: text/html, text/plain, image/gif, image/jpeg, */*
</example>

<p>The intention of this is to indicate that the explicitly listed
types are preferred, but if a different representation is
available, that is OK too. Using explicit quality values,
what the browser really wants is something like:</p>

<example>
Accept: text/html, text/plain, image/gif, image/jpeg, */*; q=0.01
</example>

<p>The explicit types have no quality factor, so they default to a
preference of 1.0 (the highest). The wildcard */* is given a
low preference of 0.01, so other types will only be returned if
no variant matches an explicitly listed type.</p>

<p>If the <code>Accept:</code> header contains <em>no</em> q
factors at all, httpd sets the q value of "*/*", if present, to
0.01 to emulate the desired behavior. It also sets the q value of
wildcards of the format "type/*" to 0.02 (so these are preferred
over matches against "*/*"). If any media type on the
<code>Accept:</code> header contains a q factor, these special
values are <em>not</em> applied, so requests from browsers which
send the explicit information to start with work as expected.</p>
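<p>As a worked illustration derived only from the rules stated
above, a hypothetical request that carries no q factors at
all, such as:</p>

<example>
Accept: text/html, image/*, */*
</example>

<p>would be treated by httpd as if the browser had sent:</p>

<example>
Accept: text/html; q=1.0, image/*; q=0.02, */*; q=0.01
</example>

<p>Adding a single explicit q factor to any item of the original
header would disable both adjustments, and the wildcards would then
keep their default quality of 1.0.</p>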
</section>

<section id="exceptions"><title>Language Negotiation Exceptions</title>

<p>New in httpd 2.0, some exceptions have been added to the
negotiation algorithm to allow graceful fallback when language
negotiation fails to find a match.</p>
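<p>A minimal sketch of such a fallback configuration is shown here,
using the <directive module="mod_negotiation">LanguagePriority</directive>
and <directive module="mod_negotiation">ForceLanguagePriority</directive>
directives of <module>mod_negotiation</module>. The language list is
an assumption chosen purely for illustration.</p>

<example><title>Sketch: language fallback</title>
<highlight language="config">
# Hypothetical priority order: English first, then French, then German
LanguagePriority en fr de
# Prefer:   pick one variant instead of answering "Multiple Choices"
# Fallback: pick one variant instead of answering "Not Acceptable"
ForceLanguagePriority Prefer Fallback
</highlight>
</example>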
<p>When a client requests a page on your server, but the server
cannot find a single page that matches the
<code>Accept-Language</code> sent by
the browser, the server will return either a "No Acceptable
Variant" or "Multiple Choices" response to the client. To avoid
these error messages, it is possible to configure httpd to ignore
the <code>Accept-Language</code> in these cases and provide a
document that does not explicitly match the client's request. The
<directive
module="mod_negotiation">ForceLanguagePriority</directive>
directive can be used to override one or both of these error
messages and substitute the server's judgement in the form of the
<directive module="mod_negotiation">LanguagePriority</directive>
directive.</p>
<p>The server will also attempt to match language subsets when no
other match can be found. For example, if a client requests
documents with the language <code>en-GB</code> for British
English, the server is not normally allowed by the HTTP/1.1
standard to match that against a document that is marked as simply
<code>en</code>. (Note that it is almost surely a configuration
error to include <code>en-GB</code> and not <code>en</code> in the
<code>Accept-Language</code> header, since it is very unlikely
that a reader understands British English, but doesn't understand
English in general. Unfortunately, many current clients have
default configurations that resemble this.) However, if no other
language match is possible and the server is about to return a "No
Acceptable Variants" error or fall back to the <directive
module="mod_negotiation">LanguagePriority</directive>, the server
will ignore the subset specification and match <code>en-GB</code>
against <code>en</code> documents. Implicitly, httpd will add
the parent language to the client's acceptable language list with
a very low quality value. But note that if the client requests
"en-GB; q=0.9, fr; q=0.8", and the server has documents
designated "en" and "fr", then the "fr" document will be returned.
This is necessary to maintain compliance with the HTTP/1.1
specification and to work effectively with properly configured
clients.</p>
<p>In order to support advanced techniques (such as cookies or
special URL-paths) to determine the user's preferred language,
since httpd 2.0.47 <module>mod_negotiation</module> recognizes
the <a href="env.html">environment variable</a>
<code>prefer-language</code>. If it exists and contains an
appropriate language tag, <module>mod_negotiation</module> will
try to select a matching variant. If there's no such variant,
the normal negotiation process applies.</p>

<example><title>Example</title>
<highlight language="config">
SetEnvIf Cookie "language=(.+)" prefer-language=$1
Header append Vary cookie
</highlight>
</example>
</section>
</section>
<section id="extensions"><title>Extensions to Transparent Content
Negotiation</title>

<p>httpd extends the transparent content negotiation protocol (RFC
2295) as follows. A new <code>{encoding ..}</code> element is used in
variant lists to label variants which are available with a specific
content-encoding only. The implementation of the RVSA/1.0 algorithm
(RFC 2296) is extended to recognize encoded variants in the list, and
to use them as candidate variants whenever their encodings are
acceptable according to the <code>Accept-Encoding</code> request
header. The RVSA/1.0 implementation does not round computed quality
factors to 5 decimal places before choosing the best variant.</p>
</section>
<section id="naming"><title>Note on hyperlinks and naming conventions</title>

<p>If you are using language negotiation you can choose between
different naming conventions, because files can have more than
one extension, and the order of the extensions is normally
irrelevant (see the <a
href="mod/mod_mime.html#multipleext">mod_mime</a> documentation
for details).</p>

<p>A typical file has a MIME-type extension (<em>e.g.</em>,
<code>html</code>), maybe an encoding extension (<em>e.g.</em>,
<code>gz</code>), and of course a language extension
(<em>e.g.</em>, <code>en</code>) when we have different
language variants of this file.</p>

<p>Examples:</p>

<ul>
<li>foo.en.html</li>
<li>foo.html.en</li>
<li>foo.en.html.gz</li>
</ul>

<p>Here are some more examples of filenames together with valid and
invalid hyperlinks:</p>
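<table border="1" cellpadding="8" cellspacing="0">
<columnspec><column width=".2"/><column width=".2"/>
<column width=".2"/></columnspec>
<tr>
<th>Filename</th>
<th>Valid hyperlink</th>
<th>Invalid hyperlink</th>
</tr>
<tr>
<td><em>foo.html.en</em></td>
<td>foo<br />
foo.html</td>
<td>-</td>
</tr>
<tr>
<td><em>foo.en.html</em></td>
<td>foo</td>
<td>foo.html</td>
</tr>
<tr>
<td><em>foo.html.en.gz</em></td>
<td>foo<br />
foo.html</td>
<td>foo.gz<br />
foo.html.gz</td>
</tr>
<tr>
<td><em>foo.en.html.gz</em></td>
<td>foo</td>
<td>foo.html<br />
foo.html.gz<br />
foo.gz</td>
</tr>
<tr>
<td><em>foo.gz.html.en</em></td>
<td>foo<br />
foo.gz<br />
foo.gz.html</td>
<td>foo.html</td>
</tr>
<tr>
<td><em>foo.html.gz.en</em></td>
<td>foo<br />
foo.html<br />
foo.html.gz</td>
<td>foo.gz</td>
</tr>
</table>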
<p>Looking at the table above, you will notice that it is always
possible to use the name without any extensions in a hyperlink
(<em>e.g.</em>, <code>foo</code>). The advantage is that you
can hide the actual type of a document or file and can change
it later, <em>e.g.</em>, from <code>html</code> to
<code>shtml</code> or <code>cgi</code> without changing any
hyperlink references.</p>

<p>If you want to continue to use a MIME-type in your
hyperlinks (<em>e.g.</em>, <code>foo.html</code>) the language
extension (including an encoding extension if there is one)
must be on the right hand side of the MIME-type extension
(<em>e.g.</em>, <code>foo.html.en</code>).</p>
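<p>The extensions used in these file names only carry meaning if
<module>mod_mime</module> knows how to interpret them. A minimal
sketch of the corresponding mappings follows; it assumes they are
not already supplied by your existing configuration or MIME types
file, and the extension choices simply mirror the examples
above.</p>

<example><title>Sketch: extension mappings</title>
<highlight language="config">
# Media type, language and encoding extensions as used above
AddType text/html .html
AddLanguage en .en
AddEncoding x-gzip .gz
</highlight>
</example>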
</section>

<section id="caching"><title>Note on Caching</title>

<p>When a cache stores a representation, it associates it with
the request URL. The next time that URL is requested, the cache
can use the stored representation. But, if the resource is
negotiable at the server, this might result in only the first
requested variant being cached and subsequent cache hits might
return the wrong response. To prevent this, httpd normally
marks all responses that are returned after content negotiation
as non-cacheable by HTTP/1.0 clients. httpd also supports the
HTTP/1.1 protocol features to allow caching of negotiated
documents.</p>

<p>For requests which come from an HTTP/1.0 compliant client
(either a browser or a cache), the directive <directive
module="mod_negotiation">CacheNegotiatedDocs</directive> can be
used to allow caching of responses which were subject to
negotiation. This directive can be given in the server config or
virtual host, and takes a single <code>On</code> or
<code>Off</code> argument. It has no effect on requests
from HTTP/1.1 clients.</p>
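<p>A minimal sketch follows, assuming the <code>On|Off</code>
argument form that the <module>mod_negotiation</module> reference
gives for httpd 2.0 and later:</p>

<example><title>Sketch: allowing HTTP/1.0 caching</title>
<highlight language="config">
# Allow HTTP/1.0 clients and proxies to cache negotiated responses
CacheNegotiatedDocs On
</highlight>
</example>

<p>Enabling this trades correctness for cache efficiency: an
HTTP/1.0 cache has no way to tell variants apart, so it may serve a
stored variant to a client that would have negotiated a different
one.</p>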
<p>For HTTP/1.1 clients, httpd sends a <code>Vary</code> HTTP
response header to indicate the negotiation dimensions for the
response. Caches can use this information to determine whether a
subsequent request can be served from the local copy. To
encourage a cache to use the local copy regardless of the
negotiation dimensions, set the <code>force-no-vary</code> <a
href="env.html#special">environment variable</a>.</p>
</section>

</manualpage>