the hostname is interpolated into the filename, we need to be sure
that the result of interpolation doesn't expose parts of the
filesystem that should be private. This was done by checking the
syntax of the Host: header according to RFC 1123 and RFC 952. However,
many people have broken configurations that violate this syntax
(frequently because they use underscores in their names), and it also
doesn't accommodate the current effort to internationalize the DNS. I
don't think the former is a compelling reason to relax the syntax
checking, but the latter does justify this change.
The only RFC on internationalized DNS at the moment is RFC 2825, which
is an introduction to how difficult the whole thing is; the other
official documentation is a pile of Internet Drafts produced by the
Internationalized Domain Names Working Group of the IETF (with names
starting "draft-ietf-idn-"). However, they have very little to say
about URIs, and the current Internet Draft about internationalized
URIs (draft-masinter-url-i18n-05) has very little to say about
hostnames :-( On the gripping hand there is some useful information at
<http://www.apng.org/idns/> where there is some iDNS testbed work
going on. The basic idea is that although the format of the hostnames
in the DNS itself remains compatible with RFC 1123, the actual
hostname presented to the resolver is in UTF-8, and therefore the
hostname in the URL and Host: header is also in UTF-8.
This change relaxes the checking so that only character sequences that
are significant to the filesystem are rejected, i.e. forward slashes,
backslashes, and sequences of more than one dot.
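
The relaxed rule can be sketched roughly as follows. This is only an
illustration of the check described above, not the code in this change:
the helper name hostname_is_fs_safe is hypothetical, and lowercasing and
stripping of the trailing dot and :port are assumed to happen elsewhere.

```c
/* Hypothetical helper (not part of httpd): returns 1 if the hostname
 * contains no filesystem metacharacters, 0 otherwise.  Everything else,
 * including underscores and UTF-8 bytes, is deliberately accepted.
 */
static int hostname_is_fs_safe(const char *host)
{
    const char *p;

    for (p = host; *p != '\0'; ++p) {
        /* Directory separators could escape the interpolated path. */
        if (*p == '/' || *p == '\\') {
            return 0;
        }
        /* Two or more consecutive dots could form a ".." component. */
        if (p[0] == '.' && p[1] == '.') {
            return 0;
        }
    }
    return 1;
}
```

Under this sketch "my_host.example.com" passes, while "www..example.com"
and "a/b" are rejected.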
PR: 6635
git-svn-id: https://svn.apache.org/repos/asf/httpd/httpd/trunk@86898 13f79535-47bb-0310-9956-ffa450edef68
/* Lowercase and remove any trailing dot and/or :port from the hostname,
* and check that it is sane.
+ *
+ * In most configurations the exact syntax of the hostname isn't
+ * important so strict sanity checking isn't necessary. However, in
+ * mass hosting setups (using mod_vhost_alias or mod_rewrite) where
+ * the hostname is interpolated into the filename, we need to be sure
+ * that the interpolation doesn't expose parts of the filesystem.
+ * We don't do strict RFC 952 / RFC 1123 syntax checking in order
+ * to support iDNS and people who erroneously use underscores.
+ * Instead we just check for filesystem metacharacters: directory
+ * separators / and \ and sequences of more than one dot.
*/
static void fix_hostname(request_rec *r)
{