From 7bd9637c8647f2d497b251ddd1af0735d55d9fc3 Mon Sep 17 00:00:00 2001 From: Martin Kraemer Date: Wed, 12 Nov 1997 13:37:54 +0000 Subject: [PATCH] Citing Lars: Hi, the attachment includes a reworked Apache manual with the new virtual host documentation. As Dean suggested I created a new directory named 'vhosts' and moved the updated vhosts-in-depth etc. documents into the new directory, renamed them and updated all other documents which refered to the old docs (at least I tried to find all documents...). Submitted by: Lars Eilebrecht Reviewed by: Martin Kraemer git-svn-id: https://svn.apache.org/repos/asf/httpd/httpd/trunk@79576 13f79535-47bb-0310-9956-ffa450edef68 --- docs/manual/vhosts/host.html | 172 ++++++++++ docs/manual/vhosts/vhosts-in-depth.html | 396 ++++++++++++++++++++++++ docs/manual/vhosts/virtual-host.html | 204 ++++++++++++ 3 files changed, 772 insertions(+) create mode 100644 docs/manual/vhosts/host.html create mode 100644 docs/manual/vhosts/vhosts-in-depth.html create mode 100644 docs/manual/vhosts/virtual-host.html diff --git a/docs/manual/vhosts/host.html b/docs/manual/vhosts/host.html new file mode 100644 index 0000000000..8c37eaccae --- /dev/null +++ b/docs/manual/vhosts/host.html @@ -0,0 +1,172 @@ + + +Apache non-IP Virtual Hosts + + + + + +

Apache non-IP Virtual Hosts

+ +See Also: +Virtual Host Support + +
+ +

What is a Virtual Host

+ +

The "Virtual Host" refers to the practice of maintaining more than +one server on one machine, as differentiated by their apparent +hostname. For example, it is often desirable for companies sharing a +web server to have their own domains, with web servers accessible as +www.company1.com and www.company2.com, +without requiring the user to know any extra path information.

+ +

Apache was one of the first servers to support virtual hosts right +out of the box, but since the base HTTP (HyperText +Transfer Protocol) standard does not provide any method for the server +to determine the hostname it is being addressed as, Apache's virtual +host support has required a separate IP address for each +server. Documentation on using this approach (which still works very +well) is available. + +

While the approach described above works, with the available IP +address space growing smaller, and the number of domains increasing, +it is not the most elegant solution, and is hard to implement on some +machines. The HTTP/1.1 protocol contains a method for the +server to identify what name it is being addressed as. Apache 1.1 and +later support this approach as well as the traditional +IP-address-per-hostname method.

+ +

The benefits of using the new virtual host support are a practically +unlimited number of servers, ease of configuration and use, and +no need for additional hardware or software. The main disadvantage is +that the user's browser must support this part of the protocol. The +latest versions of many browsers (including Netscape Navigator 2.0 and +later) do, but many browsers, especially older ones, do not. This can +cause problems, although a possible solution is addressed below.

+ +

Using non-IP Virtual Hosts

+ +

Using the new virtual hosts is quite easy, and superficially looks +like the old method. You simply add to one of the Apache configuration +files (most likely httpd.conf or srm.conf) +code similar to the following:

+
+    <VirtualHost www.apache.org>
+    ServerName www.apache.org
+    DocumentRoot /usr/web/apache
+    </VirtualHost>
+
+ +

Of course, any additional directives can (and should) be placed +into the <VirtualHost> section. To make this work, +all that is needed is to make sure that the www.apache.org +DNS entry points to the same IP address as the main +server. Optionally, you could simply use that IP address in the +<VirtualHost> entry.

+ +

Additionally, many servers may wish to be accessible by more than +one name. For example, the Apache server might want to be accessible +as apache.org, or ftp.apache.org, assuming +the IP addresses pointed to the same server. In fact, one might want it +so that all addresses at apache.org were picked up by the +server. This is possible with the ServerAlias +directive, placed inside the <VirtualHost> section. For +example:

+ +
+    ServerAlias apache.org *.apache.org
+
+ +

Note that you can use * and ? as wild-card +characters.

+ +

You also might need ServerAlias if you are serving local users who +do not always include the domain name. For example, if local users are +familiar with typing "www" or "www.physics" then you will need to add +ServerAlias www www.physics. It isn't possible for the +server to know what domain the client uses for their name resolution +because the client doesn't provide that information in the request.
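As a minimal sketch of the ServerAlias case above, assume a hypothetical department server whose full name is www.physics.example.edu. The domain and the DocumentRoot path are illustrative only and are not taken from any real configuration:

```apache
<VirtualHost www.physics.example.edu>
ServerName www.physics.example.edu
# Short names that local users may type without the domain
ServerAlias www www.physics
DocumentRoot /usr/web/physics
</VirtualHost>
```

With a section along these lines, a request carrying Host: www or Host: www.physics should be matched to the same virtual host as the fully qualified name.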

+ +

Security Considerations

+ +Apache allows all virtual hosts to be made accessible via the +Host: header through all IP interfaces, even those which +are configured to use different IP interfaces. For example, suppose the +configuration for www.foo.com contained a virtual host +section for www.bar.com, and www.bar.com was +a separate IP interface, so that +non-Host:-header-supporting browsers can use it, as +before with Apache 1.0. If a request is then made to +www.foo.com and the request includes the header +Host: www.bar.com, a page from www.bar.com +will be sent. + +

+ +This is a security concern if you are controlling access to a +particular server based on IP-layer controls, such as from within a +firewall or router. Let's say www.bar.com in the above +example was instead an intra-net server called +private.foo.com, and the router used by foo.com only let +internal users access private.foo.com. Obviously, +Host: header functionality now allows someone who has +access to www.foo.com to get +private.foo.com, if they send a Host: +private.foo.com header. It is important to note that this +condition exists only if you implement this policy solely at the IP +layer - all security controls used by Apache (e.g., allow, deny from, etc.) are consistently +respected. + +
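To illustrate the last point, here is a rough sketch of restricting private.foo.com from within Apache itself rather than only at the router. The DocumentRoot path and the choice of .foo.com as the permitted domain are assumptions made for this example:

```apache
<VirtualHost private.foo.com>
ServerName private.foo.com
DocumentRoot /usr/web/private
<Directory /usr/web/private>
# Refuse everyone, then admit only clients whose hostname ends in .foo.com
order deny,allow
deny from all
allow from .foo.com
</Directory>
</VirtualHost>
```

Because these rules are evaluated for the virtual host that actually serves the request, they should apply regardless of which IP interface the request arrived on.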

Compatibility with Older Browsers

+ +

As mentioned earlier, a majority of browsers do not send the +required data for the new virtual hosts to work properly. These +browsers will always be sent to the main server's pages. There is a +workaround, albeit a slightly cumbersome one:

+ +

To continue the www.apache.org example (Note: Apache's +web server does not actually function in this manner), we might use the +new ServerPath directive in the www.apache.org virtual host, +for example: + +

+    ServerPath /apache
+
+

What does this mean? It means that a request for any file beginning +with "/apache" will be looked for in the Apache +docs. This means that the pages can be accessed as +http://www.apache.org/apache/ for all browsers, although +new browsers can also access it as +http://www.apache.org/.
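Putting the pieces together, the virtual host section for this example might look roughly as follows. It simply combines the ServerPath line with the www.apache.org section shown earlier in this document, and is a sketch rather than the actual configuration of any server:

```apache
<VirtualHost www.apache.org>
ServerName www.apache.org
DocumentRoot /usr/web/apache
# Requests whose path begins with /apache are served by this
# virtual host even when the browser sends no Host: header
ServerPath /apache
</VirtualHost>
```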

+ +

In order to make this work, put a link on your main server's page +to http://www.apache.org/apache/ (Note: Do not use +http://www.apache.org/ - this would create an endless +loop). Then, in the virtual host's pages, be sure to use either purely +relative links (e.g. "file.html" or +"../icons/image.gif") or links containing the prefacing +/apache/ +(e.g. "http://www.apache.org/apache/file.html" or +"/apache/docs/1.1/index.html").

+ +

This requires a bit of +discipline, but adherence to these guidelines will, for the most part, +ensure that your pages will work with all browsers, new and old. When +a new browser contacts http://www.apache.org/, it will +be taken directly to the Apache pages. Older browsers will be able to +click on the link from the main server, go to +http://www.apache.org/apache/, and then access the +pages.

+ + + + diff --git a/docs/manual/vhosts/vhosts-in-depth.html b/docs/manual/vhosts/vhosts-in-depth.html new file mode 100644 index 0000000000..d2339bff81 --- /dev/null +++ b/docs/manual/vhosts/vhosts-in-depth.html @@ -0,0 +1,396 @@ + + +An In-Depth Discussion of VirtualHost Matching + + + + + +

An In-Depth Discussion of VirtualHost Matching

+ +

This is a very rough document that was probably out of date the moment +it was written. It attempts to explain exactly what the code does when +deciding what virtual host to serve a hit from. It's provided on the +assumption that something is better than nothing. The server version +under discussion is Apache 1.2. + +

If you just want to "make it work" without understanding +how, there's a What Works section at the bottom. + +

Config File Parsing

+ +

There is a main_server which consists of all the definitions appearing +outside of VirtualHost sections. There are virtual servers, +called vhosts, which are defined by +VirtualHost +sections. + +

The directives +Port, +ServerName, +ServerPath, +and +ServerAlias +can appear anywhere within the definition of +a server. However, each appearance overrides the previous appearance +(within that server). + +

The default value of the Port field for main_server +is 80. The main_server has no default ServerName, +ServerPath, or ServerAlias. + +

In the absence of any +Listen +directives, the (final if there +are multiple) Port directive in the main_server indicates +which port httpd will listen on. + +

The Port and ServerName directives for +any server, main or virtual, are used when generating URLs, such as during +redirects. + +

Each address appearing in the VirtualHost directive +can have an optional port. If the port is unspecified it defaults to +the value of the main_server's most recent Port statement. +The special port * indicates a wildcard that matches any port. +Collectively the entire set of addresses (including multiple +A record +results from DNS lookups) are called the vhost's address set. + +
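The three port forms described above can be sketched as follows. The IP addresses (taken from the 192.0.2.0/24 documentation range), the server names and the paths are all hypothetical:

```apache
Port 80
Listen 80
Listen 8080

# No port given: defaults to the main_server's most recent Port (80 here)
<VirtualHost 192.0.2.10>
ServerName www.one.example.com
DocumentRoot /usr/web/one
</VirtualHost>

# Explicit port: this address set contains only 192.0.2.10 port 8080
<VirtualHost 192.0.2.10:8080>
ServerName www.two.example.com
DocumentRoot /usr/web/two
</VirtualHost>

# Wildcard port: matches 192.0.2.11 on any port httpd listens on
<VirtualHost 192.0.2.11:*>
ServerName www.three.example.com
DocumentRoot /usr/web/three
</VirtualHost>
```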

The magic _default_ address has significance during +the matching algorithm. It essentially matches any unspecified address. + +
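A catch-all section using the magic address might look like the sketch below. The DocumentRoot path is an assumption, and the :* suffix relies on the wildcard port described above:

```apache
# Catches connections to any address and port not claimed by another vhost
<VirtualHost _default_:*>
DocumentRoot /usr/web/default
</VirtualHost>
```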

After parsing the VirtualHost directive, the vhost server +is given a default Port equal to the port assigned to the +first name in its VirtualHost directive. The complete +list of names in the VirtualHost directive is treated +just like a ServerAlias (but is not overridden by any +ServerAlias statement). Note that subsequent Port +statements for this vhost will not affect the ports assigned in the +address set. + +

+All vhosts are stored in a list which is in the reverse order that +they appeared in the config file. For example, if the config file is: + +

+    <VirtualHost A>
+    ...
+    </VirtualHost>
+
+    <VirtualHost B>
+    ...
+    </VirtualHost>
+
+    <VirtualHost C>
+    ...
+    </VirtualHost>
+
+ +Then the list will be ordered: main_server, C, B, A. Keep this in mind. + +

+After parsing has completed, the list of servers is scanned, and various +merges and default values are set. In particular: + +

    +
  1. If a vhost has no + ServerAdmin, + ResourceConfig, + AccessConfig, + Timeout, + KeepAliveTimeout, + KeepAlive, + MaxKeepAliveRequests, + or + SendBufferSize + directive then the respective value is + inherited from the main_server. (That is, inherited from whatever + the final setting of that value is in the main_server.) + +
  2. The "lookup defaults" that define the default directory + permissions + for a vhost are merged with those of the main server. This includes + any per-directory configuration information for any module. + +
  3. The per-server configs for each module from the main_server are + merged into the vhost server. +
+ +Essentially, the main_server is treated as "defaults" or a +"base" on +which to build each vhost. But the positioning of these main_server +definitions in the config file is largely irrelevant -- the entire +config of the main_server has been parsed when this final merging occurs. +So even if a main_server definition appears after a vhost definition +it might affect the vhost definition. + +
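A small sketch of this merging behaviour, using the smallco names that appear elsewhere in this manual and hypothetical values for the main_server settings:

```apache
<VirtualHost www.smallco.com>
ServerName www.smallco.com
DocumentRoot /groups/smallco/www
# No ServerAdmin or Timeout here, so both are inherited from the main_server
</VirtualHost>

# These main_server directives appear after the vhost in the file, yet
# their final values are still the ones the vhost inherits
ServerAdmin webmaster@serve.com
Timeout 300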

If the main_server has no ServerName at this point, +then the hostname of the machine that httpd is running on is used +instead. We will call the main_server address set those IP +addresses returned by a DNS lookup on the ServerName of +the main_server. + +

Now a pass is made through the vhosts to fill in any missing +ServerName fields and to classify the vhost as either +an IP-based vhost or a name-based vhost. A vhost is +considered a name-based vhost if any of its address set overlaps the +main_server (the port associated with each address must match the +main_server's Port). Otherwise it is considered an IP-based +vhost. + +
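The classification rule can be illustrated with the following sketch. It assumes, purely for the sake of the example, that www.serve.com resolves to 192.0.2.10; the addresses and paths are hypothetical:

```apache
# main_server: its address set is whatever www.serve.com resolves to
# (assumed here to be 192.0.2.10), on port 80
ServerName www.serve.com
Port 80

# Overlaps the main_server address set on port 80: name-based vhost
<VirtualHost 192.0.2.10>
ServerName www.smallco.com
DocumentRoot /groups/smallco/www
</VirtualHost>

# Address not in the main_server address set: IP-based vhost
<VirtualHost 192.0.2.20>
ServerName www.baygroup.org
DocumentRoot /groups/baygroup/www
</VirtualHost>
```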

For any undefined ServerName fields, a name-based vhost +defaults to the address given first in the VirtualHost +statement defining the vhost. Any vhost that includes the magic +_default_ wildcard is given the same ServerName as +the main_server. Otherwise the vhost (which is necessarily an IP-based +vhost) is given a ServerName based on the result of a reverse +DNS lookup on the first address given in the VirtualHost +statement. + +

+ +

Vhost Matching

+ + +

Apache 1.3 differs from what is documented +here, and documentation still has to be written. + +

+The server determines which vhost to use for a request as follows: + +

find_virtual_server: When the connection is first made +by the client, the local IP address (the IP address to which the client +connected) is looked up in the server list. A vhost is matched if it +is an IP-based vhost, the IP address matches and the port matches +(taking into account wildcards). + +

If no vhosts are matched then the last occurrence, if it appears, +of a _default_ address (which if you recall the ordering of the +server list mentioned above means that this would be the first occurrence +of _default_ in the config file) is matched. + +

In any event, if nothing above has matched, then the main_server is +matched. + +

The vhost resulting from the above search is stored with data +about the connection. We'll call this the connection vhost. +The connection vhost is constant over all requests in a particular TCP/IP +session -- that is, over all requests in a KeepAlive/persistent session. + +

For each request made on the connection the following sequence of +events further determines the actual vhost that will be used to serve +the request. + +

check_fulluri: If the requestURI is an absoluteURI, that +is, it includes http://hostname/, then an attempt is made to +determine if the hostname's address (and optional port) matches that of +the connection vhost. If it does then the hostname portion of the URI +is saved as the request_hostname. If it does not match, then the +URI remains untouched. Note: to achieve this address +comparison, +the hostname supplied goes through a DNS lookup unless it matches the +ServerName or the local IP address of the client's socket. + +

parse_uri: If the URI begins with a protocol +(e.g., http:, ftp:) then the request is +considered a proxy request. Note that even though we may have stripped +an http://hostname/ in the previous step, this could still +be a proxy request. + +

read_request: If the request does not have a hostname +from the earlier step, then any Host: header sent by the +client is used as the request hostname. + +

check_hostalias: If the request now has a hostname, +then an attempt is made to match for this hostname. The first step +of this match is to compare any port, if one was given in the request, +against the Port field of the connection vhost. If there's +a mismatch then the vhost used for the request is the connection vhost. +(This is a bug, see observations.) + +

+If the port matches, then httpd scans the list of vhosts starting with +the next server after the connection vhost. This scan does not +stop if there are any matches, it goes through all possible vhosts, +and in the end uses the last match it found. The comparisons performed +are as follows: + +

+ +

+check_serverpath: If the request has no hostname +(back up a few paragraphs) then a scan similar to the one +in check_hostalias is performed to match any +ServerPath directives given in the vhosts. Note that the +last match is used regardless (again consider the ordering of +the virtual hosts). + +

Observations

+ + + +

What Works

+ +

In addition to the tips on the DNS +Issues page, here are some further tips: + +

+ + + + diff --git a/docs/manual/vhosts/virtual-host.html b/docs/manual/vhosts/virtual-host.html new file mode 100644 index 0000000000..b472a0a073 --- /dev/null +++ b/docs/manual/vhosts/virtual-host.html @@ -0,0 +1,204 @@ + + + +Apache Server Virtual Host Support + + + + + +

Virtual Host Support

+ +See Also: +Non-IP based virtual hosts + +

What are virtual hosts?

+This is the ability of a single machine to be a web server for multiple +domains. For example, an Internet service provider might have a machine +called www.serve.com which provides Web space for several +organizations including, say, smallco and baygroup. +Ordinarily, these groups would be given parts of the Web tree on www.serve.com. +So smallco's home page would have the URL +
+http://www.serve.com/smallco/ +
+and baygroup's home page would have the URL +
+http://www.serve.com/baygroup/ +
+

+For esthetic reasons, however, both organizations would rather their home +pages appeared under their own names rather than that of the service +provider; but they do not want to set up their own Internet links and +servers. +

+Virtual hosts are the solution to this problem. smallco and baygroup would +have their own Internet name registrations, www.smallco.com and +www.baygroup.org respectively. These hostnames would both +correspond to the service provider's machine (www.serve.com). Thus +smallco's home page would now have the URL +

+http://www.smallco.com/ +
+and baygroup's home page would have the URL +
+http://www.baygroup.org/ +
+ +

System requirements

+Due to limitations in the HTTP/1.0 protocol, the web server must have a +different IP address for each virtual host. This can be achieved +by the machine having several physical network connections, or by use +of a virtual interface on some operating systems. + +

How to set up Apache

+There are two ways of configuring Apache to support multiple hosts: +either by running a separate httpd daemon for each hostname, or by running a +single daemon which supports all the virtual hosts. +

+Use multiple daemons when: +

+Use a single daemon when: + + +

Setting up multiple daemons

+Create a separate httpd installation for each virtual host. +For each installation, use the +BindAddress directive in the configuration +file to select which IP address (or virtual host) that daemon services. +e.g. +
BindAddress www.smallco.com
+This hostname can also be given as an IP address. + +
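As a rough sketch of the multiple-daemon setup, each installation would have its own configuration file along the following lines. The ServerRoot path and the User and Group names are assumptions chosen for illustration:

```apache
# httpd.conf for the daemon that serves only www.smallco.com
ServerRoot /usr/local/apache-smallco
BindAddress www.smallco.com
Port 80
ServerName www.smallco.com
DocumentRoot /groups/smallco/www
# Directives such as User and Group cannot go inside a VirtualHost
# section, so a separate daemon is one way to vary them per host
User smallco
Group smallco
```

A second installation for www.baygroup.org would follow the same pattern with its own ServerRoot and BindAddress, and each daemon can be started with httpd -f pointing at its own configuration file.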

Setting up a single daemon

+For this case, a single httpd will service requests for all the virtual hosts. +The VirtualHost directive in the + configuration file is used to set the values of +ServerAdmin, +ServerName, +DocumentRoot, +ErrorLog and +TransferLog configuration +directives to different values for each virtual host. +e.g. +
+<VirtualHost www.smallco.com>
+ServerAdmin webmaster@mail.smallco.com
+DocumentRoot /groups/smallco/www
+ServerName www.smallco.com
+ErrorLog /groups/smallco/logs/error_log
+TransferLog /groups/smallco/logs/access_log
+</VirtualHost>
+
+<VirtualHost www.baygroup.org>
+ServerAdmin webmaster@mail.baygroup.org
+DocumentRoot /groups/baygroup/www
+ServerName www.baygroup.org
+ErrorLog /groups/baygroup/logs/error_log
+TransferLog /groups/baygroup/logs/access_log
+</VirtualHost>
+
+ +These VirtualHost hostnames can also be given as IP addresses. + +

+ +Almost ANY configuration directive can be put +in the VirtualHost directive, with the exception of +ServerType, +User, +Group, +StartServers, +MaxSpareServers, +MinSpareServers, +MaxRequestsPerChild, +BindAddress, +PidFile, +TypesConfig, and +ServerRoot. + +

+ +SECURITY: When specifying where to write log files, be aware +of some security risks which are present if anyone other than the +user that starts Apache has write access to the directory where they +are written. See the security +tips document for details. + +

+ +

File Handle/Resource Limits:

+When using a large number of Virtual Hosts, Apache may run out of available +file descriptors if each Virtual Host specifies different log files. +The total number of file descriptors used by Apache is one for each distinct +error log file, one for every other log file directive, plus 10-20 for +internal use. Unix operating systems limit the number of file descriptors that +may be used by a process; the limit is typically 64, and may usually be +increased up to a large hard-limit. +
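To make the arithmetic concrete: by the rule above, 100 virtual hosts that each name their own ErrorLog and TransferLog would need about 100 + 100 + 10 to 20 descriptors, well over a limit of 64. One way to stay under the limit, sketched below with hypothetical log paths, is to leave the log directives out of the VirtualHost sections so that every vhost writes to the main server's log files:

```apache
# Two log files in total, shared by the main_server and every vhost
ErrorLog logs/error_log
TransferLog logs/access_log

<VirtualHost www.smallco.com>
ServerName www.smallco.com
DocumentRoot /groups/smallco/www
# No ErrorLog or TransferLog here, so no extra descriptors are opened
</VirtualHost>

<VirtualHost www.baygroup.org>
ServerName www.baygroup.org
DocumentRoot /groups/baygroup/www
</VirtualHost>
```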

+Although Apache attempts to increase the limit as required, this +may not work if: +

    +
  1. Your system does not provide the setrlimit() system call. +
  2. The setrlimit(RLIMIT_NOFILE) call does not function on your system + (such as Solaris 2.3) +
  3. The number of file descriptors required exceeds the hard limit. +
  4. Your system imposes other limits on file descriptors, such as a limit +on stdio streams only using file descriptors below 256. (Solaris 2) +
+ +In the event of problems you can: + + +There have been reports that Apache may start running out of resources allocated +for the root process. This will show up as errors in the error log like +"unable to fork". There are two ways you can bump this up: + +
    +
  1. Have a csh script wrapper around httpd which sets the +"rlimit" to some large number, like 512. +
  2. Edit http_main.c to add calls to setrlimit() from main(), along the lines of +
    +        struct rlimit rlp;
    +
    +        rlp.rlim_cur = rlp.rlim_max = 512;
    +        if (setrlimit(RLIMIT_NPROC, &rlp)) {
    +            fprintf(stderr, "setrlimit(RLIMIT_NPROC) failed.\n");
    +            exit(1);
    +        }
    +
    +(thanks to "Aaron Gifford <agifford@InfoWest.COM>" for the patch) +
+ +The latter will probably manifest itself in a later version of Apache. + + + + -- 2.40.0