1 <?xml version="1.0" encoding="ISO-8859-1"?>
2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
3 <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
4 <meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type" />
<!-- This file is generated from xml source: DO NOT EDIT -->
10 <title>Apache Performance Tuning - Apache HTTP Server Version 2.5</title>
11 <link href="../style/css/manual.css" rel="stylesheet" media="all" type="text/css" title="Main stylesheet" />
12 <link href="../style/css/manual-loose-100pc.css" rel="alternate stylesheet" media="all" type="text/css" title="No Sidebar - Default font size" />
13 <link href="../style/css/manual-print.css" rel="stylesheet" media="print" type="text/css" /><link rel="stylesheet" type="text/css" href="../style/css/prettify.css" />
<script src="../style/scripts/prettify.min.js" type="text/javascript"></script>
17 <link href="../images/favicon.ico" rel="shortcut icon" /></head>
18 <body id="manual-page"><div id="page-header">
19 <p class="menu"><a href="../mod/">Modules</a> | <a href="../mod/quickreference.html">Directives</a> | <a href="http://wiki.apache.org/httpd/FAQ">FAQ</a> | <a href="../glossary.html">Glossary</a> | <a href="../sitemap.html">Sitemap</a></p>
20 <p class="apache">Apache HTTP Server Version 2.5</p>
21 <img alt="" src="../images/feather.png" /></div>
22 <div class="up"><a href="./"><img title="<-" alt="<-" src="../images/left.gif" /></a></div>
<div id="path">
<a href="http://www.apache.org/">Apache</a> > <a href="http://httpd.apache.org/">HTTP Server</a> > <a href="http://httpd.apache.org/docs/">Documentation</a> > <a href="../">Version 2.5</a> > <a href="./">Miscellaneous Documentation</a></div><div id="page-content"><div id="preamble"><h1>Apache Performance Tuning</h1>
26 <p><span>Available Languages: </span><a href="../en/misc/perf-tuning.html" title="English"> en </a> |
27 <a href="../fr/misc/perf-tuning.html" hreflang="fr" rel="alternate" title="Français"> fr </a> |
28 <a href="../ko/misc/perf-tuning.html" hreflang="ko" rel="alternate" title="Korean"> ko </a> |
29 <a href="../tr/misc/perf-tuning.html" hreflang="tr" rel="alternate" title="Türkçe"> tr </a></p>
<div class="warning"><h3>Warning</h3>
<p>This document is partially out of date and might be inaccurate.</p>
</div>
37 <p>Apache 2.4 is a general-purpose webserver, designed to
38 provide a balance of flexibility, portability, and performance.
39 Although it has not been designed specifically to set benchmark
40 records, Apache 2.4 is capable of high performance in many
41 real-world situations.</p>
43 <p>This document describes the options that a server administrator
44 can configure to tune the performance of an Apache 2.4 installation.
45 Some of these configuration options enable the httpd to better take
46 advantage of the capabilities of the hardware and OS, while others allow
the administrator to trade functionality for speed.</p>
</div>
50 <div id="quickview"><ul id="toc"><li><img alt="" src="../images/down.gif" /> <a href="#hardware">Hardware and Operating System Issues</a></li>
51 <li><img alt="" src="../images/down.gif" /> <a href="#runtime">Run-Time Configuration Issues</a></li>
52 <li><img alt="" src="../images/down.gif" /> <a href="#compiletime">Compile-Time Configuration Issues</a></li>
53 <li><img alt="" src="../images/down.gif" /> <a href="#trace">Appendix: Detailed Analysis of a Trace</a></li>
54 </ul><h3>See also</h3><ul class="seealso"><li><a href="#comments_section">Comments</a></li></ul></div>
55 <div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
<div class="section">
<h2><a name="hardware" id="hardware">Hardware and Operating System Issues</a><a title="Permanent link" href="#hardware" class="permalink">¶</a></h2>
61 <p>The single biggest hardware issue affecting webserver
62 performance is RAM. A webserver should never ever have to swap,
63 as swapping increases the latency of each request beyond a point
64 that users consider "fast enough". This causes users to hit
65 stop and reload, further increasing the load. You can, and
66 should, control the <code class="directive"><a href="../mod/mpm_common.html#maxrequestworkers">MaxRequestWorkers</a></code> setting so that your server
67 does not spawn so many children that it starts swapping. The procedure
68 for doing this is simple: determine the size of your average Apache
69 process, by looking at your process list via a tool such as
70 <code>top</code>, and divide this into your total available memory,
71 leaving some room for other processes.</p>
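<p>The division described above can be sketched as a small helper. This is
only an illustration: the function name and the idea of passing a reserved
amount for "other processes" are assumptions, and the figures you feed in
still have to come from observing your own server (for example with
<code>top</code>). It models the prefork case, where each worker is one
process.</p>

```c
#include <assert.h>

/*
 * Estimate a MaxRequestWorkers value from memory figures (all in MiB).
 *   total_mb    - physical RAM in the machine
 *   reserved_mb - RAM left for the OS and other processes (your choice)
 *   avg_proc_mb - average size of one httpd child, as seen in top
 * Returns 0 when the inputs leave no room for any worker.
 */
unsigned long estimate_max_request_workers(unsigned long total_mb,
                                           unsigned long reserved_mb,
                                           unsigned long avg_proc_mb)
{
    assert(avg_proc_mb > 0);   /* a process cannot have zero size */

    if (reserved_mb >= total_mb) {
        return 0;
    }
    /* Integer division rounds down, which errs on the side of not swapping. */
    return (total_mb - reserved_mb) / avg_proc_mb;
}
```

<p>With 4096 MiB of RAM, 1024 MiB held back, and children of about 30 MiB
each, the helper suggests a limit of 102 workers. Treat the result as a
starting point and verify under real load that the server does not swap.</p>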
<p>Beyond that the rest is mundane: get a fast enough CPU, a
fast enough network card, and fast enough disks, where "fast
enough" is something that needs to be determined by
experimentation.</p>
<p>Operating system choice is largely a matter of local
concerns. But some guidelines that have proven generally
useful are:</p>
<ul>
<li>
<p>Run the latest stable release and patch level of the
operating system that you choose. Many OS suppliers have
introduced significant performance improvements to their
TCP stacks and thread libraries in recent years.</p>
</li>
<li>
<p>If your OS supports a <code>sendfile(2)</code> system
call, make sure you install the release and/or patches
needed to enable it. (With Linux, for example, this means
using Linux 2.4 or later. For early releases of Solaris 8,
you may need to apply a patch.) On systems where it is
available, <code>sendfile</code> enables Apache to deliver
static content faster and with lower CPU utilization.</p>
</li>
</ul>
101 </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
102 <div class="section">
103 <h2><a name="runtime" id="runtime">Run-Time Configuration Issues</a><a title="Permanent link" href="#runtime" class="permalink">¶</a></h2>
107 <table class="related"><tr><th>Related Modules</th><th>Related Directives</th></tr><tr><td><ul><li><code class="module"><a href="../mod/mod_dir.html">mod_dir</a></code></li><li><code class="module"><a href="../mod/mpm_common.html">mpm_common</a></code></li><li><code class="module"><a href="../mod/mod_status.html">mod_status</a></code></li></ul></td><td><ul><li><code class="directive"><a href="../mod/core.html#allowoverride">AllowOverride</a></code></li><li><code class="directive"><a href="../mod/mod_dir.html#directoryindex">DirectoryIndex</a></code></li><li><code class="directive"><a href="../mod/core.html#hostnamelookups">HostnameLookups</a></code></li><li><code class="directive"><a href="../mod/core.html#enablemmap">EnableMMAP</a></code></li><li><code class="directive"><a href="../mod/core.html#enablesendfile">EnableSendfile</a></code></li><li><code class="directive"><a href="../mod/core.html#keepalivetimeout">KeepAliveTimeout</a></code></li><li><code class="directive"><a href="../mod/prefork.html#maxspareservers">MaxSpareServers</a></code></li><li><code class="directive"><a href="../mod/prefork.html#minspareservers">MinSpareServers</a></code></li><li><code class="directive"><a href="../mod/core.html#options">Options</a></code></li><li><code class="directive"><a href="../mod/mpm_common.html#startservers">StartServers</a></code></li></ul></td></tr></table>
109 <h3><a name="dns" id="dns">HostnameLookups and other DNS considerations</a></h3>
<p>Prior to Apache 1.3, <code class="directive"><a href="../mod/core.html#hostnamelookups">HostnameLookups</a></code> defaulted to <code>On</code>,
which added latency to every request because a DNS lookup had to
complete before the request was finished.
In Apache 2.4 this setting defaults to <code>Off</code>. If you need
to have addresses in your log files resolved to hostnames,
consider post-processing the logs rather than forcing Apache to do the
lookups. Run this post-processing on a machine other than the
production web server, so that it does not adversely affect
server performance.</p>
124 <p>If you use any <code><code class="directive"><a href="../mod/mod_access_compat.html#allow">Allow</a></code> from domain</code> or <code><code class="directive"><a href="../mod/mod_access_compat.html#deny">Deny</a></code> from domain</code>
125 directives (i.e., using a hostname, or a domain name, rather than
126 an IP address) then you will pay for
127 two DNS lookups (a reverse, followed by a forward lookup
128 to make sure that the reverse is not being spoofed). For best
129 performance, whenever it is possible, use IP addresses rather
130 than domain names.</p>
<div class="warning"><h3>Warning:</h3>
<p>Please use the <code class="directive"><a href="../mod/mod_authz_core.html#require">Require</a></code> directive with Apache 2.4;
more info in the related <a href="../upgrading.html">upgrading guide</a>.</p>
</div>
137 <p>Note that it's possible to scope the directives, such as
138 within a <code><Location "/server-status"></code> section.
139 In this case the DNS lookups are only performed on requests
140 matching the criteria. Here's an example which disables lookups
141 except for <code>.html</code> and <code>.cgi</code> files:</p>
<pre class="prettyprint lang-config">HostnameLookups off
<Files ~ "\.(html|cgi)$">
    HostnameLookups on
</Files></pre>
<p>Even so, if you just need DNS names in some CGIs you
could consider doing the reverse lookup (for example with
<code>gethostbyaddr</code>) in the specific CGIs that need it.</p>
154 <h3><a name="symlinks" id="symlinks">FollowSymLinks and SymLinksIfOwnerMatch</a></h3>
158 <p>Wherever in your URL-space you do not have an <code>Options
159 FollowSymLinks</code>, or you do have an <code>Options
160 SymLinksIfOwnerMatch</code>, Apache will need to issue extra
161 system calls to check up on symlinks. (One extra call per
162 filename component.) For example, if you had:</p>
164 <pre class="prettyprint lang-config">DocumentRoot "/www/htdocs"
165 <Directory "/">
166 Options SymLinksIfOwnerMatch
167 </Directory></pre>
170 <p>and a request is made for the URI <code>/index.html</code>,
171 then Apache will perform <code>lstat(2)</code> on
172 <code>/www</code>, <code>/www/htdocs</code>, and
173 <code>/www/htdocs/index.html</code>. The results of these
174 <code>lstats</code> are never cached, so they will occur on
175 every single request. If you really desire the symlinks
176 security checking, you can do something like this:</p>
<pre class="prettyprint lang-config">DocumentRoot "/www/htdocs"
<Directory "/">
    Options FollowSymLinks
</Directory>

<Directory "/www/htdocs">
    Options -FollowSymLinks +SymLinksIfOwnerMatch
</Directory></pre>
188 <p>This at least avoids the extra checks for the
189 <code class="directive"><a href="../mod/core.html#documentroot">DocumentRoot</a></code> path.
190 Note that you'll need to add similar sections if you
191 have any <code class="directive"><a href="../mod/mod_alias.html#alias">Alias</a></code> or
192 <code class="directive"><a href="../mod/mod_rewrite.html#rewriterule">RewriteRule</a></code> paths
193 outside of your document root. For highest performance,
194 and no symlink protection, set <code>FollowSymLinks</code>
195 everywhere, and never set <code>SymLinksIfOwnerMatch</code>.</p>
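<p>To make the per-component cost concrete, the following sketch issues one
<code>lstat(2)</code> for every prefix of a filesystem path, the way the
text above describes for <code>SymLinksIfOwnerMatch</code> scoped at
<code>/</code>. It is not taken from the httpd source; the function name
and the fixed buffer size are assumptions made for the illustration, and
the results of the calls are deliberately ignored.</p>

```c
#include <assert.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Call lstat() once per path component and return how many calls were
 * made. "/www/htdocs/index.html" is checked as "/www", "/www/htdocs"
 * and "/www/htdocs/index.html".
 */
size_t lstat_each_component(const char *path)
{
    char prefix[1024];          /* illustrative limit, not an httpd constant */
    struct stat sb;
    size_t len, i, calls = 0;

    assert(path != NULL);
    len = strlen(path);
    if (len == 0 || len >= sizeof(prefix)) {
        return 0;
    }

    for (i = 1; i <= len; ++i) {
        /* A component ends at a '/' or at the end of the string. */
        if ((path[i] == '/' || path[i] == '\0') && path[i - 1] != '/') {
            memcpy(prefix, path, i);
            prefix[i] = '\0';
            (void) lstat(prefix, &sb);   /* result ignored in this sketch */
            ++calls;
        }
    }
    return calls;
}
```

<p>For <code>/www/htdocs/index.html</code> the function makes three calls,
and a deeper directory layout adds one call per extra level on every
request.</p>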
199 <h3><a name="htaccess" id="htaccess">AllowOverride</a></h3>
<p>Wherever in your URL-space you allow overrides (typically
<code>.htaccess</code> files), Apache will attempt to open
<code>.htaccess</code> for each filename component. For
example,</p>
<pre class="prettyprint lang-config">DocumentRoot "/www/htdocs"
<Directory "/">
    AllowOverride all
</Directory></pre>
<p>and a request is made for the URI <code>/index.html</code>,
then Apache will attempt to open <code>/.htaccess</code>,
<code>/www/.htaccess</code>, and
<code>/www/htdocs/.htaccess</code>. The solutions are similar
to the previous case of <code>Options FollowSymLinks</code>.
For highest performance use <code>AllowOverride None</code>
everywhere in your filesystem.</p>
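<p>The list of files that get probed can be generated mechanically from the
directory that holds the requested file. The sketch below does that for an
absolute directory path. The function name, the output array, and the
<code>HTACCESS_PATH_MAX</code> constant are hypothetical and exist only
for this example.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define HTACCESS_PATH_MAX 256   /* illustrative buffer size */

/*
 * Fill out[] with the .htaccess files that would be looked for when a
 * file in the absolute directory dir is served with overrides allowed
 * from "/" downwards. Returns the number of entries written.
 */
size_t htaccess_candidates(const char *dir,
                           char out[][HTACCESS_PATH_MAX], size_t max_out)
{
    size_t len, i, n = 0;

    assert(dir != NULL);
    len = strlen(dir);
    if (len == 0 || dir[0] != '/') {
        return 0;
    }

    for (i = 0; i < len && n < max_out; ++i) {
        if (i == 0) {
            /* the filesystem root itself */
            snprintf(out[n++], HTACCESS_PATH_MAX, "/.htaccess");
        } else if (dir[i] != '/' &&
                   (dir[i + 1] == '/' || dir[i + 1] == '\0')) {
            /* last character of a component: emit the prefix up to here */
            snprintf(out[n++], HTACCESS_PATH_MAX, "%.*s/.htaccess",
                     (int) (i + 1), dir);
        }
    }
    return n;
}
```

<p>Called with <code>/www/htdocs</code>, it yields the three paths named
above. Each additional directory level adds one more open attempt per
request.</p>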
224 <h3><a name="negotiation" id="negotiation">Negotiation</a></h3>
228 <p>If at all possible, avoid content negotiation if you're
229 really interested in every last ounce of performance. In
230 practice the benefits of negotiation outweigh the performance
231 penalties. There's one case where you can speed up the server.
232 Instead of using a wildcard such as:</p>
234 <pre class="prettyprint lang-config">DirectoryIndex index</pre>
237 <p>Use a complete list of options:</p>
239 <pre class="prettyprint lang-config">DirectoryIndex index.cgi index.pl index.shtml index.html</pre>
242 <p>where you list the most common choice first.</p>
244 <p>Also note that explicitly creating a <code>type-map</code>
245 file provides better performance than using
246 <code>MultiViews</code>, as the necessary information can be
247 determined by reading this single file, rather than having to
248 scan the directory for files.</p>
250 <p>If your site needs content negotiation, consider using
251 <code>type-map</code> files, rather than the <code>Options
252 MultiViews</code> directive to accomplish the negotiation. See the
253 <a href="../content-negotiation.html">Content Negotiation</a>
254 documentation for a full discussion of the methods of negotiation,
255 and instructions for creating <code>type-map</code> files.</p>
259 <h3>Memory-mapping</h3>
263 <p>In situations where Apache 2.x needs to look at the contents
264 of a file being delivered--for example, when doing server-side-include
265 processing--it normally memory-maps the file if the OS supports
266 some form of <code>mmap(2)</code>.</p>
268 <p>On some platforms, this memory-mapping improves performance.
269 However, there are cases where memory-mapping can hurt the performance
270 or even the stability of the httpd:</p>
<ul>
<li>
<p>On some operating systems, <code>mmap</code> does not scale
as well as <code>read(2)</code> when the number of CPUs increases.
On multiprocessor Solaris servers, for example, Apache 2.x sometimes
delivers server-parsed files faster when <code>mmap</code> is disabled.</p>
</li>
<li>
<p>If you memory-map a file located on an NFS-mounted filesystem
and a process on another NFS client machine deletes or truncates
the file, your process may get a bus error the next time it tries
to access the mapped file content.</p>
</li>
</ul>
288 <p>For installations where either of these factors applies, you
289 should use <code>EnableMMAP off</code> to disable the memory-mapping
290 of delivered files. (Note: This directive can be overridden on
291 a per-directory basis.)</p>
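<p>For readers unfamiliar with the two access paths, here is a minimal
sketch that inspects a file's contents once through <code>read(2)</code>
and once through <code>mmap(2)</code>. It is not how httpd is implemented;
the function names are made up for the example, and summing the bytes
merely stands in for "looking at the contents". Note that the mapped
version touches file pages directly, which is where the bus error
mentioned above can arise if the file is truncated underneath it.</p>

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Sum all bytes of a file by reading it in chunks; -1 on error. */
long sum_with_read(const char *path)
{
    unsigned char buf[4096];
    long sum = 0;
    ssize_t n, i;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd == -1) {
        return -1;
    }
    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        for (i = 0; i < n; ++i) {
            sum += buf[i];
        }
    }
    close(fd);
    return n < 0 ? -1 : sum;
}

/* Sum all bytes of a file through a read-only private mapping; -1 on error. */
long sum_with_mmap(const char *path)
{
    struct stat sb;
    long sum = 0;
    off_t i;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd == -1) {
        return -1;
    }
    if (fstat(fd, &sb) == -1) {
        close(fd);
        return -1;
    }
    if (sb.st_size > 0) {       /* mmap() rejects zero-length mappings */
        unsigned char *p = mmap(NULL, (size_t) sb.st_size, PROT_READ,
                                MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) {
            close(fd);
            return -1;
        }
        for (i = 0; i < sb.st_size; ++i) {
            sum += p[i];
        }
        munmap(p, (size_t) sb.st_size);
    }
    close(fd);
    return sum;
}
```

<p>Both functions return the same value for the same file. Which one is
faster depends on the platform, which is why the directive exists.</p>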
<h3>Sendfile</h3>
<p>In situations where Apache 2.x can ignore the contents of the file
300 to be delivered -- for example, when serving static file content --
301 it normally uses the kernel sendfile support for the file if the OS
302 supports the <code>sendfile(2)</code> operation.</p>
304 <p>On most platforms, using sendfile improves performance by eliminating
305 separate read and send mechanics. However, there are cases where using
306 sendfile can harm the stability of the httpd:</p>
<ul>
<li>
<p>Some platforms may have broken sendfile support that the build
system did not detect, especially if the binaries were built on
another box and moved to such a machine with broken sendfile support.</p>
</li>
<li>
<p>With an NFS-mounted filesystem, the kernel may be unable
to reliably serve the network file through its own cache.</p>
</li>
</ul>
320 <p>For installations where either of these factors applies, you
321 should use <code>EnableSendfile off</code> to disable sendfile
322 delivery of file contents. (Note: This directive can be overridden
323 on a per-directory basis.)</p>
327 <h3><a name="process" id="process">Recycle child processes</a></h3>
331 <p><code class="directive"><a href="../mod/mpm_common.html#maxconnectionsperchild">MaxConnectionsPerChild</a></code>
limits the number of connections that a child process can handle during
333 its lifetime (by default set to <code>0</code> - unlimited). This affects all
334 the <a href="../mpm.html#defaults">MPMs</a>, even the ones using threads.
335 For example, each process created by the <code class="module"><a href="../mod/worker.html">worker</a></code> MPM spawns
336 multiple threads that will handle connections, but this does not influence
337 the overall count. It only means that the sum of requests handled by all the
338 threads spawned by a single process will be counted against the
339 <code class="directive"><a href="../mod/mpm_common.html#maxconnectionsperchild">MaxConnectionsPerChild</a></code> value.</p>
341 <p><code class="directive"><a href="../mod/mpm_common.html#maxconnectionsperchild">MaxConnectionsPerChild</a></code> should
342 not have any limit in the optimal use case, since there should not be any
343 reason to force a process kill other than software bugs causing memory leaks
344 or excessive CPU usage.</p>
346 <p>When keep-alives are in use, a process (or a thread spawned by a process)
347 will be kept busy doing nothing but waiting for more requests on the already open
348 connection. The default <code class="directive"><a href="../mod/core.html#keepalivetimeout">KeepAliveTimeout</a></code> of <code>5</code>
349 seconds attempts to minimize this effect. The tradeoff here is
350 between network bandwidth and server resources.</p>
353 </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
354 <div class="section">
355 <h2><a name="compiletime" id="compiletime">Compile-Time Configuration Issues</a><a title="Permanent link" href="#compiletime" class="permalink">¶</a></h2>
359 <h3>Choosing an MPM</h3>
363 <p>Apache 2.x supports pluggable concurrency models, called
364 <a href="../mpm.html">Multi-Processing Modules</a> (MPMs).
365 When building Apache, you must choose an MPM to use. There
366 are platform-specific MPMs for some platforms:
367 <code class="module"><a href="../mod/mpm_netware.html">mpm_netware</a></code>,
368 <code class="module"><a href="../mod/mpmt_os2.html">mpmt_os2</a></code>, and <code class="module"><a href="../mod/mpm_winnt.html">mpm_winnt</a></code>. For
369 general Unix-type systems, there are several MPMs from which
to choose. The choice of MPM can affect the speed and scalability
of the httpd:</p>
<ul>
<li>The <code class="module"><a href="../mod/worker.html">worker</a></code> MPM uses multiple child
processes with many threads each. Each thread handles
one connection at a time. Worker generally is a good
choice for high-traffic servers because it has a smaller
memory footprint than the prefork MPM.</li>

<li>The <code class="module"><a href="../mod/event.html">event</a></code> MPM is threaded like the
Worker MPM, but is designed to allow more requests to be
served simultaneously by passing off some processing work
to supporting threads, freeing up the main threads to work
on new requests.</li>

<li>The <code class="module"><a href="../mod/prefork.html">prefork</a></code> MPM uses multiple child
processes with one thread each. Each process handles
one connection at a time. On many systems, prefork is
comparable in speed to worker, but it uses more memory.
Prefork's threadless design has advantages over worker
in some situations: it can be used with non-thread-safe
third-party modules, and it is easier to debug on platforms
with poor thread debugging support.</li>
</ul>
398 <p>For more information on these and other MPMs, please
399 see the MPM <a href="../mpm.html">documentation</a>.</p>
403 <h3><a name="modules" id="modules">Modules</a></h3>
407 <p>Since memory usage is such an important consideration in
408 performance, you should attempt to eliminate modules that you are
409 not actually using. If you have built the modules as <a href="../dso.html">DSOs</a>, eliminating modules is a simple
410 matter of commenting out the associated <code class="directive"><a href="../mod/mod_so.html#loadmodule">LoadModule</a></code> directive for that module.
411 This allows you to experiment with removing modules and seeing
412 if your site still functions in their absence.</p>
414 <p>If, on the other hand, you have modules statically linked
415 into your Apache binary, you will need to recompile Apache in
416 order to remove unwanted modules.</p>
418 <p>An associated question that arises here is, of course, what
419 modules you need, and which ones you don't. The answer here
420 will, of course, vary from one web site to another. However, the
421 <em>minimal</em> list of modules which you can get by with tends
422 to include <code class="module"><a href="../mod/mod_mime.html">mod_mime</a></code>, <code class="module"><a href="../mod/mod_dir.html">mod_dir</a></code>,
423 and <code class="module"><a href="../mod/mod_log_config.html">mod_log_config</a></code>. <code>mod_log_config</code> is,
424 of course, optional, as you can run a web site without log
425 files. This is, however, not recommended.</p>
429 <h3>Atomic Operations</h3>
433 <p>Some modules, such as <code class="module"><a href="../mod/mod_cache.html">mod_cache</a></code> and
434 recent development builds of the worker MPM, use APR's
435 atomic API. This API provides atomic operations that can
436 be used for lightweight thread synchronization.</p>
438 <p>By default, APR implements these operations using the
439 most efficient mechanism available on each target
440 OS/CPU platform. Many modern CPUs, for example, have
441 an instruction that does an atomic compare-and-swap (CAS)
442 operation in hardware. On some platforms, however, APR
443 defaults to a slower, mutex-based implementation of the
444 atomic API in order to ensure compatibility with older
445 CPU models that lack such instructions. If you are
446 building Apache for one of these platforms, and you plan
447 to run only on newer CPUs, you can select a faster atomic
448 implementation at build time by configuring Apache with
449 the <code>--enable-nonportable-atomics</code> option:</p>
<div class="example"><p><code>
./configure --with-mpm=worker --enable-nonportable-atomics=yes
</code></p></div>
456 <p>The <code>--enable-nonportable-atomics</code> option is
457 relevant for the following platforms:</p>
<ul>
<li>Solaris on SPARC<br />
By default, APR uses mutex-based atomics on Solaris/SPARC.
If you configure with <code>--enable-nonportable-atomics</code>,
however, APR generates code that uses a SPARC v8plus opcode for
fast hardware compare-and-swap. If you configure Apache with
this option, the atomic operations will be more efficient
(allowing for lower CPU utilization and higher concurrency),
but the resulting executable will run only on UltraSPARC
chips.
</li>
<li>Linux on x86<br />
By default, APR uses mutex-based atomics on Linux. If you
configure with <code>--enable-nonportable-atomics</code>,
however, APR generates code that uses a 486 opcode for fast
hardware compare-and-swap. This will result in more efficient
atomic operations, but the resulting executable will run only
on 486 and later chips (and not on 386).
</li>
</ul>
485 <h3>mod_status and ExtendedStatus On</h3>
489 <p>If you include <code class="module"><a href="../mod/mod_status.html">mod_status</a></code> and you also set
490 <code>ExtendedStatus On</code> when building and running
491 Apache, then on every request Apache will perform two calls to
492 <code>gettimeofday(2)</code> (or <code>times(2)</code>
493 depending on your operating system), and (pre-1.3) several
494 extra calls to <code>time(2)</code>. This is all done so that
495 the status report contains timing indications. For highest
performance, set <code>ExtendedStatus off</code> (which is the
default).</p>
501 <h3>accept Serialization - Multiple Sockets</h3>
505 <div class="warning"><h3>Warning:</h3>
506 <p>This section has not been fully updated
507 to take into account changes made in the 2.x version of the
508 Apache HTTP Server. Some of the information may still be
relevant, but please use it with care.</p>
</div>
512 <p>This discusses a shortcoming in the Unix socket API. Suppose
513 your web server uses multiple <code class="directive"><a href="../mod/mpm_common.html#listen">Listen</a></code> statements to listen on either multiple
514 ports or multiple addresses. In order to test each socket
515 to see if a connection is ready, Apache uses
516 <code>select(2)</code>. <code>select(2)</code> indicates that a
517 socket has <em>zero</em> or <em>at least one</em> connection
518 waiting on it. Apache's model includes multiple children, and
519 all the idle ones test for new connections at the same time. A
520 naive implementation looks something like this (these examples
do not match the code, they're contrived for pedagogical
purposes):</p>
<pre class="prettyprint lang-c">for (;;) {
    for (;;) {
        fd_set accept_fds;

        FD_ZERO (&accept_fds);
        for (i = first_socket; i <= last_socket; ++i) {
            FD_SET (i, &accept_fds);
        }
        rc = select (last_socket+1, &accept_fds, NULL, NULL, NULL);
        if (rc < 1) continue;
        new_connection = -1;
        for (i = first_socket; i <= last_socket; ++i) {
            if (FD_ISSET (i, &accept_fds)) {
                new_connection = accept (i, NULL, NULL);
                if (new_connection != -1) break;
            }
        }
        if (new_connection != -1) break;
    }
    process_the(new_connection);
}</pre>
547 <p>But this naive implementation has a serious starvation problem.
548 Recall that multiple children execute this loop at the same
549 time, and so multiple children will block at
550 <code>select</code> when they are in between requests. All
551 those blocked children will awaken and return from
552 <code>select</code> when a single request appears on any socket.
553 (The number of children which awaken varies depending on the
554 operating system and timing issues.) They will all then fall
555 down into the loop and try to <code>accept</code> the
556 connection. But only one will succeed (assuming there's still
557 only one connection ready). The rest will be <em>blocked</em>
558 in <code>accept</code>. This effectively locks those children
559 into serving requests from that one socket and no other
560 sockets, and they'll be stuck there until enough new requests
561 appear on that socket to wake them all up. This starvation
562 problem was first documented in <a href="http://bugs.apache.org/index/full/467">PR#467</a>. There
563 are at least two solutions.</p>
565 <p>One solution is to make the sockets non-blocking. In this
566 case the <code>accept</code> won't block the children, and they
567 will be allowed to continue immediately. But this wastes CPU
568 time. Suppose you have ten idle children in
569 <code>select</code>, and one connection arrives. Then nine of
570 those children will wake up, try to <code>accept</code> the
571 connection, fail, and loop back into <code>select</code>,
572 accomplishing nothing. Meanwhile none of those children are
573 servicing requests that occurred on other sockets until they
574 get back up to the <code>select</code> again. Overall this
575 solution does not seem very fruitful unless you have as many
576 idle CPUs (in a multiprocessor box) as you have idle children
577 (not a very likely situation).</p>
579 <p>Another solution, the one used by Apache, is to serialize
580 entry into the inner loop. The loop looks like this
581 (differences highlighted):</p>
<pre class="prettyprint lang-c">for (;;) {
    <strong>accept_mutex_on ();</strong>
    for (;;) {
        fd_set accept_fds;

        FD_ZERO (&accept_fds);
        for (i = first_socket; i <= last_socket; ++i) {
            FD_SET (i, &accept_fds);
        }
        rc = select (last_socket+1, &accept_fds, NULL, NULL, NULL);
        if (rc < 1) continue;
        new_connection = -1;
        for (i = first_socket; i <= last_socket; ++i) {
            if (FD_ISSET (i, &accept_fds)) {
                new_connection = accept (i, NULL, NULL);
                if (new_connection != -1) break;
            }
        }
        if (new_connection != -1) break;
    }
    <strong>accept_mutex_off ();</strong>
    process_the(new_connection);
}</pre>
608 <p><a id="serialize" name="serialize">The functions</a>
609 <code>accept_mutex_on</code> and <code>accept_mutex_off</code>
610 implement a mutual exclusion semaphore. Only one child can have
611 the mutex at any time. There are several choices for
612 implementing these mutexes. The choice is defined in
613 <code>src/conf.h</code> (pre-1.3) or
<code>src/include/ap_config.h</code> (1.3 or later). Some
architectures do not have any locking choice made; on these
architectures it is unsafe to use multiple
<code class="directive"><a href="../mod/mpm_common.html#listen">Listen</a></code>
directives.</p>
620 <p>The <code class="directive"><a href="../mod/core.html#mutex">Mutex</a></code> directive can
621 be used to change the mutex implementation of the
622 <code>mpm-accept</code> mutex at run-time. Special considerations
for different mutex implementations are documented with that
directive.</p>
626 <p>Another solution that has been considered but never
627 implemented is to partially serialize the loop -- that is, let
628 in a certain number of processes. This would only be of
629 interest on multiprocessor boxes where it's possible that multiple
630 children could run simultaneously, and the serialization
631 actually doesn't take advantage of the full bandwidth. This is
632 a possible area of future investigation, but priority remains
633 low because highly parallel web servers are not the norm.</p>
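<p>As an illustration of what <code>accept_mutex_on</code> and
<code>accept_mutex_off</code> from the serialized loop can look like, here
is a sketch built on <code>fcntl(2)</code> record locks, one of several
possible mechanisms. It assumes a lock file that the parent opens for
writing before forking, so that every child shares the same file. The
<code>accept_mutex_init</code> helper and the file-scope variable are
hypothetical and not part of httpd.</p>

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

static int accept_lock_fd = -1;   /* lock file, opened before forking */

/* Remember the descriptor of the shared lock file; 0 on success. */
int accept_mutex_init(int fd)
{
    assert(fd >= 0);
    accept_lock_fd = fd;
    return 0;
}

/* Apply a whole-file lock of the given type, retrying if interrupted. */
static int accept_mutex_set(short type)
{
    struct flock fl = {0};
    int rc;

    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                 /* 0 means "to end of file" */

    do {
        rc = fcntl(accept_lock_fd, F_SETLKW, &fl);   /* blocks until granted */
    } while (rc == -1 && errno == EINTR);
    return rc;
}

/* Block until this process holds the mutex; 0 on success, -1 on error. */
int accept_mutex_on(void)
{
    return accept_mutex_set(F_WRLCK);
}

/* Release the mutex so another child can enter select(). */
int accept_mutex_off(void)
{
    return accept_mutex_set(F_UNLCK);
}
```

<p>Because these locks are owned per process, a child that exits while
holding the lock releases it automatically. The sketch is not suitable for
serializing threads within one process.</p>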
635 <p>Ideally you should run servers without multiple
636 <code class="directive"><a href="../mod/mpm_common.html#listen">Listen</a></code>
statements if you want the highest performance.
But read on.</p>
642 <h3>accept Serialization - Single Socket</h3>
646 <p>The above is fine and dandy for multiple socket servers, but
647 what about single socket servers? In theory they shouldn't
648 experience any of these same problems because all children can
649 just block in <code>accept(2)</code> until a connection
650 arrives, and no starvation results. In practice this hides
651 almost the same "spinning" behavior discussed above in the
652 non-blocking solution. The way that most TCP stacks are
653 implemented, the kernel actually wakes up all processes blocked
654 in <code>accept</code> when a single connection arrives. One of
655 those processes gets the connection and returns to user-space.
656 The rest spin in the kernel and go back to sleep when they
657 discover there's no connection for them. This spinning is
658 hidden from the user-land code, but it's there nonetheless.
659 This can result in the same load-spiking wasteful behavior
that a non-blocking solution to the multiple sockets case
can.</p>
663 <p>For this reason we have found that many architectures behave
664 more "nicely" if we serialize even the single socket case. So
665 this is actually the default in almost all cases. Crude
experiments under Linux (2.0.30 on a dual Pentium Pro 166
with 128 MB RAM) have shown that the serialization of the single
668 socket case causes less than a 3% decrease in requests per
669 second over unserialized single-socket. But unserialized
670 single-socket showed an extra 100ms latency on each request.
671 This latency is probably a wash on long haul lines, and only an
672 issue on LANs. If you want to override the single socket
673 serialization, you can define
674 <code>SINGLE_LISTEN_UNSERIALIZED_ACCEPT</code>, and then
675 single-socket servers will not serialize at all.</p>
679 <h3>Lingering Close</h3>
683 <p>As discussed in <a href="http://www.ics.uci.edu/pub/ietf/http/draft-ietf-http-connection-00.txt">
684 draft-ietf-http-connection-00.txt</a> section 8, in order for
685 an HTTP server to <strong>reliably</strong> implement the
686 protocol, it needs to shut down each direction of the
687 communication independently. (Recall that a TCP connection is
688 bi-directional. Each half is independent of the other.)</p>
<p>When this feature was added to Apache, it caused a flurry of
problems on various versions of Unix because of shortsighted
TCP/IP implementations. The TCP specification does not state that the
<code>FIN_WAIT_2</code> state has a timeout, but it doesn't prohibit one.
On systems without the timeout, Apache 1.2 leaves many sockets
stuck forever in the <code>FIN_WAIT_2</code> state. In many cases this
can be avoided by simply upgrading to the latest TCP/IP patches
supplied by the vendor. In cases where the vendor has never
released patches (<em>e.g.</em>, SunOS4, although folks with
a source license can patch it themselves), we have decided to
disable this feature.</p>
<p>There are two ways to implement a lingering close. One is the socket
option <code>SO_LINGER</code>. But as fate would have it, this
has never been implemented properly in most TCP/IP stacks. Even
on those stacks with a proper implementation (<em>e.g.</em>,
Linux 2.0.31), this method proves to be more expensive in CPU time
than the next solution.</p>
<p>For the most part, Apache implements this in a function
called <code>lingering_close</code> (in
<code>http_main.c</code>). The function looks roughly like
this:</p>

<pre class="prettyprint lang-c">void lingering_close (int s)
{
    char junk_buffer[2048];
    shutdown (s, 1);                      /* shutdown the sending side */
    signal (SIGALRM, lingering_death);
    alarm (30);
    for (;;) {
        select (s for reading, 2 second timeout);
        if (error) break;
        if (s is ready for reading) {
            if (read (s, junk_buffer, sizeof (junk_buffer)) <= 0) break;
            /* just toss away whatever is here */
        }
    }
    close (s);
}</pre>
<p>This naturally adds some expense at the end of a connection,
but it is required for a reliable implementation. As HTTP/1.1
becomes more prevalent, and all connections are persistent,
this expense will be amortized over more requests. If you want
to play with fire and disable this feature, you can define
<code>NO_LINGCLOSE</code>, but this is not recommended at all.
In particular, as HTTP/1.1 pipelined persistent connections
come into use, <code>lingering_close</code> is an absolute
necessity (and <a href="http://www.w3.org/Protocols/HTTP/Performance/Pipeline.html">
pipelined connections are faster</a>, so you want to support
them).</p>
753 <h3>Scoreboard File</h3>
<p>Apache's parent and children communicate with each other
through something called the scoreboard. Ideally this should be
implemented in shared memory. On the operating systems that
we either have access to, or have been given detailed ports
for, it typically is. The rest
default to using an on-disk file. The on-disk file is not only
slow, but also unreliable (and less featured). Peruse the
<code>src/main/conf.h</code> file for your architecture, and
look for either <code>USE_MMAP_SCOREBOARD</code> or
<code>USE_SHMGET_SCOREBOARD</code>. Defining one of those two
(along with its companion, <code>HAVE_MMAP</code> or
<code>HAVE_SHMGET</code> respectively) enables the supplied
shared memory code. If your system has another type of shared
memory, edit the file <code>src/main/http_main.c</code> and add
the hooks necessary to use it in Apache. (Send us back a patch
too, please.)</p>
774 <div class="note">Historical note: The Linux port of Apache didn't start to
775 use shared memory until version 1.2 of Apache. This oversight
776 resulted in really poor and unreliable behavior of earlier
777 versions of Apache on Linux.</div>
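<p>To illustrate why shared memory suits the scoreboard, here is a
minimal sketch, assuming a system that supports anonymous shared
<code>mmap</code> mappings. The structure layout, the status characters
and the function names are hypothetical and much simpler than Apache's
real scoreboard.</p>

```c
#define _DEFAULT_SOURCE   /* expose MAP_ANONYMOUS on glibc in strict modes */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON   /* older BSD-style spelling */
#endif

#define SLOTS 4

/* One status byte per child: a toy stand-in for a scoreboard slot. */
struct scoreboard {
    volatile char status[SLOTS];
};

/* Map an anonymous shared region that forked children will inherit. */
struct scoreboard *scoreboard_create(void)
{
    void *p = mmap(NULL, sizeof(struct scoreboard), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : (struct scoreboard *)p;
}

/* Fork a child that marks slot 0 busy ('W'), then return what the
 * parent reads from the same slot. Returns '?' on failure. */
char scoreboard_demo(void)
{
    struct scoreboard *sb = scoreboard_create();
    pid_t pid;
    char seen;

    if (sb == NULL)
        return '?';
    sb->status[0] = '_';                 /* idle */
    pid = fork();
    if (pid < 0) {
        munmap((void *)sb, sizeof(*sb));
        return '?';
    }
    if (pid == 0) {
        sb->status[0] = 'W';             /* child writes directly to shared memory */
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    seen = sb->status[0];                /* parent sees the child's write */
    munmap((void *)sb, sizeof(*sb));
    return seen;
}
```

<p>The child's update becomes visible to the parent without any file
I/O, which is the property that an on-disk scoreboard lacks.</p>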
781 <h3>DYNAMIC_MODULE_LIMIT</h3>
785 <p>If you have no intention of using dynamically loaded modules
786 (you probably don't if you're reading this and tuning your
787 server for every last ounce of performance), then you should add
788 <code>-DDYNAMIC_MODULE_LIMIT=0</code> when building your
789 server. This will save RAM that's allocated only for supporting
790 dynamically loaded modules.</p>
794 </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
795 <div class="section">
796 <h2><a name="trace" id="trace">Appendix: Detailed Analysis of a Trace</a><a title="Permanent link" href="#trace" class="permalink">¶</a></h2>
800 <p>Here is a system call trace of Apache 2.0.38 with the worker MPM
801 on Solaris 8. This trace was collected using:</p>
<div class="example"><p><code>
truss -l -p <var>httpd_child_pid</var>
</code></p></div>
807 <p>The <code>-l</code> option tells truss to log the ID of the
808 LWP (lightweight process--Solaris' form of kernel-level thread)
809 that invokes each system call.</p>
811 <p>Other systems may have different system call tracing utilities
812 such as <code>strace</code>, <code>ktrace</code>, or <code>par</code>.
813 They all produce similar output.</p>
<p>In this trace, a client has requested a 10KB static file
from the httpd. Traces of non-static requests or requests
with content negotiation look wildly different (and quite ugly
in some cases).</p>
<div class="example"><pre>/67: accept(3, 0x00200BEC, 0x00200C0C, 1) (sleeping...)
/67: accept(3, 0x00200BEC, 0x00200C0C, 1) = 9</pre></div>
823 <p>In this trace, the listener thread is running within LWP #67.</p>
825 <div class="note">Note the lack of <code>accept(2)</code> serialization. On this
826 particular platform, the worker MPM uses an unserialized accept by
827 default unless it is listening on multiple ports.</div>
<div class="example"><pre>/65: lwp_park(0x00000000, 0) = 0
/67: lwp_unpark(65, 1) = 0</pre></div>
832 <p>Upon accepting the connection, the listener thread wakes up
833 a worker thread to do the request processing. In this trace,
834 the worker thread that handles the request is mapped to LWP #65.</p>
836 <div class="example"><pre>/65: getsockname(9, 0x00200BA4, 0x00200BC4, 1) = 0</pre></div>
838 <p>In order to implement virtual hosts, Apache needs to know
839 the local socket address used to accept the connection. It
840 is possible to eliminate this call in many situations (such
841 as when there are no virtual hosts, or when
842 <code class="directive"><a href="../mod/mpm_common.html#listen">Listen</a></code> directives
843 are used which do not have wildcard addresses). But
844 no effort has yet been made to do these optimizations. </p>
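<p>As a rough illustration of what this call provides, the sketch below
uses <code>getsockname</code> to recover the local port of a socket. It
assumes an IPv4 socket, and the helper name <code>local_port</code> is
hypothetical.</p>

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Return the local port (host byte order) of an IPv4 socket, or -1. */
int local_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;

    memset(&addr, 0, sizeof addr);
    if (getsockname(fd, (struct sockaddr *)&addr, &len) != 0)
        return -1;                 /* not a socket, or another error */
    if (addr.sin_family != AF_INET)
        return -1;                 /* this sketch handles IPv4 only */
    return ntohs(addr.sin_port);
}
```

<p>On an accepted connection the same structure also carries the local
IP address in <code>sin_addr</code>, which is what matters when one
wildcard listener serves several address-based virtual hosts.</p>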
<div class="example"><pre>/65: brk(0x002170E8) = 0
/65: brk(0x002190E8) = 0</pre></div>
849 <p>The <code>brk(2)</code> calls allocate memory from the heap.
850 It is rare to see these in a system call trace, because the httpd
851 uses custom memory allocators (<code>apr_pool</code> and
852 <code>apr_bucket_alloc</code>) for most request processing.
853 In this trace, the httpd has just been started, so it must
854 call <code>malloc(3)</code> to get the blocks of raw memory
855 with which to create the custom memory allocators.</p>
<div class="example"><pre>/65: fcntl(9, F_GETFL, 0x00000000) = 2
/65: fstat64(9, 0xFAF7B818) = 0
/65: getsockopt(9, 65535, 8192, 0xFAF7B918, 0xFAF7B910, 2190656) = 0
/65: fstat64(9, 0xFAF7B818) = 0
/65: getsockopt(9, 65535, 8192, 0xFAF7B918, 0xFAF7B914, 2190656) = 0
/65: setsockopt(9, 65535, 8192, 0xFAF7B918, 4, 2190656) = 0
/65: fcntl(9, F_SETFL, 0x00000082) = 0</pre></div>
865 <p>Next, the worker thread puts the connection to the client (file
866 descriptor 9) in non-blocking mode. The <code>setsockopt(2)</code>
867 and <code>getsockopt(2)</code> calls are a side-effect of how
868 Solaris' libc handles <code>fcntl(2)</code> on sockets.</p>
870 <div class="example"><pre>/65: read(9, " G E T / 1 0 k . h t m".., 8000) = 97</pre></div>
872 <p>The worker thread reads the request from the client.</p>
<div class="example"><pre>/65: stat("/var/httpd/apache/httpd-8999/htdocs/10k.html", 0xFAF7B978) = 0
/65: open("/var/httpd/apache/httpd-8999/htdocs/10k.html", O_RDONLY) = 10</pre></div>
877 <p>This httpd has been configured with <code>Options FollowSymLinks</code>
878 and <code>AllowOverride None</code>. Thus it doesn't need to
879 <code>lstat(2)</code> each directory in the path leading up to the
880 requested file, nor check for <code>.htaccess</code> files.
881 It simply calls <code>stat(2)</code> to verify that the file:
882 1) exists, and 2) is a regular file, not a directory.</p>
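<p>The two checks can be expressed with a single <code>stat</code>
call, as in this small sketch. The function name
<code>is_regular_file</code> is hypothetical and is not taken from the
httpd source.</p>

```c
#include <sys/stat.h>

/* Return 1 if path exists and is a regular file, 0 otherwise. */
int is_regular_file(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return 0;                        /* missing or inaccessible */
    return S_ISREG(st.st_mode) ? 1 : 0;  /* rejects directories, devices, etc. */
}
```

<p>Because <code>stat</code> follows symbolic links, this matches the
<code>FollowSymLinks</code> configuration described above, where no
per-directory <code>lstat</code> is needed.</p>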
884 <div class="example"><pre>/65: sendfilev(0, 9, 0x00200F90, 2, 0xFAF7B53C) = 10269</pre></div>
886 <p>In this example, the httpd is able to send the HTTP response
887 header and the requested file with a single <code>sendfilev(2)</code>
888 system call. Sendfile semantics vary among operating systems. On some other
889 systems, it is necessary to do a <code>write(2)</code> or
890 <code>writev(2)</code> call to send the headers before calling
891 <code>sendfile(2)</code>.</p>
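<p>On systems where the headers must be sent separately, a
<code>writev</code> call can at least gather them from several buffers
into one system call. The following is a sketch under that assumption,
with a hypothetical helper name; it does not reproduce how the httpd
assembles its output.</p>

```c
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Write the status line and the header block with one writev(2) call.
 * Returns the byte count reported by writev, which may be short on a
 * socket, or -1 on error. */
ssize_t send_headers(int fd, const char *status_line, const char *headers)
{
    struct iovec iov[2];

    iov[0].iov_base = (void *)status_line;
    iov[0].iov_len = strlen(status_line);
    iov[1].iov_base = (void *)headers;
    iov[1].iov_len = strlen(headers);
    return writev(fd, iov, 2);
}
```

<p>A caller would follow this with the platform's file-sending call for
the body, and would need to handle a short write by resending the
remainder.</p>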
893 <div class="example"><pre>/65: write(4, " 1 2 7 . 0 . 0 . 1 - ".., 78) = 78</pre></div>
895 <p>This <code>write(2)</code> call records the request in the
896 access log. Note that one thing missing from this trace is a
897 <code>time(2)</code> call. Unlike Apache 1.3, Apache 2.x uses
898 <code>gettimeofday(3)</code> to look up the time. On some operating
899 systems, like Linux or Solaris, <code>gettimeofday</code> has an
900 optimized implementation that doesn't require as much overhead
901 as a typical system call.</p>
<div class="example"><pre>/65: shutdown(9, 1, 1) = 0
/65: poll(0xFAF7B980, 1, 2000) = 1
/65: read(9, 0xFAF7BC20, 512) = 0
/65: close(9) = 0</pre></div>
908 <p>The worker thread does a lingering close of the connection.</p>
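<p>The four calls in this part of the trace map onto a short routine
like the one sketched below. The 2000 ms timeout and the 512-byte
buffer are taken from the trace; the function name is hypothetical, and
the sketch omits the overall time limit that a real server would impose
on a peer that keeps sending data.</p>

```c
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Half-close the socket, discard whatever the peer still sends, then
 * close it. Returns the number of bytes discarded. */
long lingering_close_sketch(int s)
{
    char junk[512];
    long discarded = 0;
    struct pollfd pfd;

    pfd.fd = s;
    pfd.events = POLLIN;
    pfd.revents = 0;

    shutdown(s, SHUT_WR);                 /* send FIN, keep receiving */
    while (poll(&pfd, 1, 2000) > 0) {     /* wait up to 2 seconds for data */
        ssize_t n = read(s, junk, sizeof junk);
        if (n <= 0)
            break;                        /* peer closed, or error */
        discarded += n;                   /* just toss the data away */
    }
    close(s);
    return discarded;
}
```

<p>In the trace, <code>read</code> returns 0 on the first pass, meaning
the client had already closed its side, so the loop ends after one
iteration.</p>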
<div class="example"><pre>/65: close(10) = 0
/65: lwp_park(0x00000000, 0) (sleeping...)</pre></div>
913 <p>Finally the worker thread closes the file that it has just delivered
914 and blocks until the listener assigns it another connection.</p>
916 <div class="example"><pre>/67: accept(3, 0x001FEB74, 0x001FEB94, 1) (sleeping...)</pre></div>
918 <p>Meanwhile, the listener thread is able to accept another connection
919 as soon as it has dispatched this connection to a worker thread (subject
920 to some flow-control logic in the worker MPM that throttles the listener
921 if all the available workers are busy). Though it isn't apparent from
922 this trace, the next <code>accept(2)</code> can (and usually does, under
923 high load conditions) occur in parallel with the worker thread's handling
924 of the just-accepted connection.</p>
<div id="footer">
950 <p class="apache">Copyright 2018 The Apache Software Foundation.<br />Licensed under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.</p>
</div><script type="text/javascript"><!--//--><![CDATA[//><!--
952 if (typeof(prettyPrint) !== 'undefined') {