-<?xml version="1.0" encoding="ISO-8859-1"?>\r
-<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\r
-<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head><!--\r
- XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX\r
- This file is generated from xml source: DO NOT EDIT\r
- XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX\r
- -->\r
-<title>Performance Scaling\r
- - Apache HTTP Server</title>\r
-<link href="../style/css/manual.css" rel="stylesheet" media="all" type="text/css" title="Main stylesheet" />\r
-<link href="../style/css/manual-loose-100pc.css" rel="alternate stylesheet" media="all" type="text/css" title="No Sidebar - Default font size" />\r
-<link href="../style/css/manual-print.css" rel="stylesheet" media="print" type="text/css" />\r
-<link href="../images/favicon.ico" rel="shortcut icon" /></head>\r
-<body id="manual-page"><div id="page-header">\r
-<p class="menu"><a href="../mod/">Modules</a> | <a href="../mod/directives.html">Directives</a> | <a href="../faq/">FAQ</a> | <a href="../glossary.html">Glossary</a> | <a href="../sitemap.html">Sitemap</a></p>\r
-<p class="apache">Apache HTTP Server Version 2.5</p>\r
-<img alt="" src="../images/feather.gif" /></div>\r
-<div class="up"><a href="./"><img title="&lt;-" alt="&lt;-" src="../images/left.gif" /></a></div>\r
-<div id="path">\r
-<a href="http://www.apache.org/">Apache</a> > <a href="http://httpd.apache.org/">HTTP Server</a> > <a href="http://httpd.apache.org/docs/">Documentation</a> > <a href="../">Version 2.5</a></div><div id="page-content"><div id="preamble"><h1>Performance Scaling\r
- </h1>\r
-<div class="toplang">\r
-<p><span>Available Languages: </span><a href="../en/misc/perf-scaling.html" title="English"> en </a></p>\r
-</div>\r
-\r
- \r
- <p>The Performance Tuning page in the Apache 1.3 documentation says: \r
- </p>\r
- <ul>\r
- <li>“Apache is a general webserver, which is designed to be\r
- correct first, and fast\r
- second. Even so, its performance is quite satisfactory. Most\r
- sites have less than 10Mbits of outgoing bandwidth, which\r
- Apache can fill using only a low end Pentium-based\r
- webserver.” \r
- </li>\r
- </ul>\r
- <p>However, this sentence was written a few years ago, and in the\r
- meantime several things have happened. On one hand, web server\r
- hardware has become much faster. On the other hand, many sites are\r
- now allowed much more than ten megabits per second of outgoing\r
- bandwidth. In addition, web applications have become more complex.\r
- The classic brochureware site is alive and well, but the web has\r
- grown up substantially as a computing application platform and\r
- webmasters may find themselves running dynamic content in Perl, PHP\r
- or Java, all of which take a toll on performance. \r
- </p>\r
- <p>Therefore, in spite of strides forward in machine speed and\r
- bandwidth allowances, web server performance and web application\r
- performance remain areas of concern. This document discusses several\r
- aspects of web server performance. \r
- </p>\r
- \r
- </div>\r
-<div id="quickview"><ul id="toc"><li><img alt="" src="../images/down.gif" /> <a href="#What Will and Will Not Be Discussed">What Will and Will Not Be Discussed\r
- </a></li>\r
-<li><img alt="" src="../images/down.gif" /> <a href="#Monitoring Your Server">Monitoring Your Server\r
- </a></li>\r
-<li><img alt="" src="../images/down.gif" /> <a href="#Configuring for Performance">Configuring for Performance\r
- </a></li>\r
-<li><img alt="" src="../images/down.gif" /> <a href="#Caching Content">Caching Content\r
- </a></li>\r
-<li><img alt="" src="../images/down.gif" /> <a href="#Further Considerations">Further Considerations\r
- </a></li>\r
-</ul></div>\r
-<div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>\r
-<div class="section">\r
-<h2><a name="What Will and Will Not Be Discussed" id="What Will and Will Not Be Discussed">What Will and Will Not Be Discussed\r
- </a></h2>\r
- \r
- <p>This document focuses on easily accessible configuration and tuning\r
- options for Apache httpd 2.2 and 2.3 as well as monitoring tools.\r
- Monitoring tools will allow you to observe your web server to\r
- gather information about its performance, or lack thereof.\r
- We'll assume that you don't have an unlimited budget for\r
- server hardware, so the existing infrastructure will have to do the\r
- job. You have no desire to compile your own Apache, or to recompile\r
- the operating system kernel. We do assume, though, that you have\r
- some familiarity with the Apache httpd configuration file. \r
- </p>\r
- \r
- </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>\r
-<div class="section">\r
-<h2><a name="Monitoring Your Server" id="Monitoring Your Server">Monitoring Your Server\r
- </a></h2>\r
- \r
- <p>The first task when sizing or performance-tuning your server is to\r
- find out how your system is currently performing. By monitoring\r
- your server under real-world load, or artificially generated load,\r
- you can extrapolate its behavior under stress, such as when your\r
- site is mentioned on Slashdot. \r
- </p>\r
- \r
- \r
- <h3><a name="Monitoring Tools" id="Monitoring Tools">Monitoring Tools\r
- </a></h3>\r
- \r
- \r
- \r
- <h4><a name="top" id="top">top\r
- </a></h4>\r
- \r
- <p>The <code>top</code> tool ships with Linux and FreeBSD; Solaris\r
- offers <code>prstat</code>. Top collects a number of statistics for the\r
- system and for each running process, then displays them\r
- interactively on your terminal. The data displayed is\r
- refreshed at a regular interval (a few seconds by default) and varies by platform, but\r
- typically includes system load average, number of processes\r
- and their current states, the percentage of CPU time spent\r
- executing user and system code, and the state of the\r
- virtual memory system. The data displayed for each process\r
- is typically configurable and includes its process name and\r
- ID, priority and nice values, memory footprint, and\r
- percentage CPU usage. The following example shows multiple\r
- httpd processes (with MPM worker and event) running on a\r
- Linux (Xen) system: \r
- </p>\r
- \r
- <div class="example"><p><code>\r
- top - 23:10:58 up 71 days, 6:14, 4 users, load average:\r
- 0.25, 0.53, 0.47<br />\r
- Tasks: 163 total, 1 running, 162 sleeping, 0 stopped, \r
- 0 zombie<br />\r
- Cpu(s): 11.6%us, 0.7%sy, 0.0%ni, 87.3%id, 0.4%wa, \r
- 0.0%hi, 0.0%si, 0.0%st<br />\r
- Mem: 2621656k total, 2178684k used, 442972k free, \r
- 100500k buffers<br />\r
- Swap: 4194296k total, 860584k used, 3333712k free, \r
- 1157552k cached<br />\r
- <br />\r
- PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ \r
- COMMAND<br />\r
- 16687 example_ 20 0 1200m 547m 179m S 45 21.4 \r
- 1:09.59 httpd-worker<br />\r
- 15195 www 20 0 441m 33m 2468 S 0 1.3 \r
- 0:41.41 httpd-worker<br />\r
- 1 root 20 0 10312 328 308 S 0 0.0 0:33.17\r
- init<br />\r
- 2 root 15 -5 0 0 0 S 0 0.0 0:00.00\r
- kthreadd<br />\r
- 3 root RT -5 0 0 0 S 0 0.0 0:00.14\r
- migration/0<br />\r
- 4 root 15 -5 0 0 0 S 0 0.0 0:04.58\r
- ksoftirqd/0<br />\r
- 5 root RT -5 0 0 0 S 0 0.0 4:45.89\r
- watchdog/0<br />\r
- 6 root 15 -5 0 0 0 S 0 0.0 1:42.52\r
- events/0<br />\r
- 7 root 15 -5 0 0 0 S 0 0.0 0:00.00\r
- khelper<br />\r
- 19 root 15 -5 0 0 0 S 0 0.0 0:00.00\r
- xenwatch<br />\r
- 20 root 15 -5 0 0 0 S 0 0.0 0:00.00\r
- xenbus<br />\r
- 28 root RT -5 0 0 0 S 0 0.0 0:00.14\r
- migration/1<br />\r
- 29 root 15 -5 0 0 0 S 0 0.0 0:00.20\r
- ksoftirqd/1<br />\r
- 30 root RT -5 0 0 0 S 0 0.0 0:05.96\r
- watchdog/1<br />\r
- 31 root 15 -5 0 0 0 S 0 0.0 1:18.35\r
- events/1<br />\r
- 32 root RT -5 0 0 0 S 0 0.0 0:00.08\r
- migration/2<br />\r
- 33 root 15 -5 0 0 0 S 0 0.0 0:00.18\r
- ksoftirqd/2<br />\r
- 34 root RT -5 0 0 0 S 0 0.0 0:06.00\r
- watchdog/2<br />\r
- 35 root 15 -5 0 0 0 S 0 0.0 1:08.39\r
- events/2<br />\r
- 36 root RT -5 0 0 0 S 0 0.0 0:00.10\r
- migration/3<br />\r
- 37 root 15 -5 0 0 0 S 0 0.0 0:00.16\r
- ksoftirqd/3<br />\r
- 38 root RT -5 0 0 0 S 0 0.0 0:06.08\r
- watchdog/3<br />\r
- 39 root 15 -5 0 0 0 S 0 0.0 1:22.81\r
- events/3<br />\r
- 68 root 15 -5 0 0 0 S 0 0.0 0:06.28\r
- kblockd/0<br />\r
- 69 root 15 -5 0 0 0 S 0 0.0 0:00.04\r
- kblockd/1<br />\r
- 70 root 15 -5 0 0 0 S 0 0.0 0:00.04\r
- kblockd/2\r
- </code></p></div>\r
- \r
- <p>Top is a wonderful tool even though it’s slightly resource\r
- intensive (when running, its own process is usually in the\r
- top ten CPU gluttons). It is indispensable in determining\r
- the size of a running process, which comes in handy when\r
- determining how many server processes you can run on your\r
- machine. How to do this is described in <a href="#Sizing MaxClients">Sizing MaxClients</a>.\r
- Top is, however, an interactive tool, and running it\r
- continuously has few if any advantages. \r
- </p>\r
- \r
- <h4><a name="free" id="free">free\r
- </a></h4>\r
- \r
- <p>This command is only available on Linux. It shows how much\r
- memory and swap space is in use. Linux allocates unused\r
- memory as file system cache. The free command shows usage\r
- both with and without this cache. The free command can be\r
- used to find out how much memory the operating system is\r
- using, as described in the section <a href="#Sizing MaxClients">Sizing MaxClients</a>.\r
- The output of free looks like this: \r
- </p>\r
- \r
- <div class="example"><p><code>\r
- sctemme@brutus:~$ free<br />\r
- total used free shared buffers cached<br />\r
- Mem: 4026028 3901892 124136 0 253144\r
- 841044<br />\r
- -/+ buffers/cache: 2807704 1218324<br />\r
- Swap: 3903784 12540 3891244\r
- </code></p></div>\r
- \r
- \r
- <h4><a name="vmstat" id="vmstat">vmstat\r
- </a></h4>\r
- \r
- <p>This command is available on many Unix platforms. It\r
- displays a large number of operating system metrics. Run\r
- without arguments, it displays a single status line for that\r
- moment. When a numeric argument is added, the status is\r
- redisplayed at designated intervals. For example, <code>\r
- vmstat 5\r
- </code>\r
- causes the information to reappear every five seconds.\r
- Vmstat displays the amount of virtual memory in use, how\r
- much memory is swapped in and out each second, the number\r
- of processes currently running and sleeping, the number of\r
- interrupts and context switches per second and the usage\r
- percentages of the CPU. \r
- </p>\r
- <p>\r
- The following is <code>vmstat\r
- </code>\r
- output of an idle server: \r
- </p>\r
- \r
- \r
- <div class="example"><p><code>\r
- [sctemme@GayDeceiver sctemme]$ vmstat 5 3<br />\r
- procs memory swap io \r
- system cpu<br />\r
- r b w swpd free buff cache si so bi bo in \r
- cs us sy id<br />\r
- 0 0 0 0 186252 6688 37516 0 0 12 5 47 \r
- 311 0 1 99<br />\r
- 0 0 0 0 186244 6696 37516 0 0 0 16 41 \r
- 314 0 0 100<br />\r
- 0 0 0 0 186236 6704 37516 0 0 0 9 44 \r
- 314 0 0 100\r
- </code></p></div>\r
- \r
- <p>And this is the output of a server that is under a load of one\r
- hundred simultaneous connections fetching static content: \r
- </p>\r
- \r
- <div class="example"><p><code>\r
- [sctemme@GayDeceiver sctemme]$ vmstat 5 3<br />\r
- procs memory swap io \r
- system cpu<br />\r
- r b w swpd free buff cache si so bi bo in \r
- cs us sy id<br />\r
- 1 0 1 0 162580 6848 40056 0 0 11 5 150 \r
- 324 1 1 98<br />\r
- 6 0 1 0 163280 6856 40248 0 0 0 66 6384\r
- 1117 42 25 32<br />\r
- 11 0 0 0 162780 6864 40436 0 0 0 61 6309\r
- 1165 33 28 40\r
- </code></p></div>\r
- \r
- <p>The first line gives averages since the last reboot. The\r
- subsequent lines give information for five-second\r
- intervals. The second argument tells vmstat to generate\r
- three reports and then exit. \r
- </p>\r
- \r
- \r
- \r
- <h4><a name="SE Toolkit" id="SE Toolkit">SE Toolkit\r
- </a></h4>\r
- \r
- <p>The SE Toolkit is a system monitoring toolkit for Solaris.\r
- Its programming language is based on the C preprocessor, and it\r
- comes with a number of sample scripts. It can use both the\r
- command line and the GUI to display information. It can\r
- also be programmed to apply rules to the system data. The\r
- example script shown in Figure 2, Zoom.se, shows green,\r
- orange or red indicators when utilization of various parts\r
- of the system rises above certain thresholds. Another\r
- included script, Virtual Adrian, applies performance tuning\r
- metrics. \r
- </p>\r
- <p>The SE Toolkit has drifted around for a while and has had\r
- several owners since its inception. It seems that it has\r
- now found a final home at Sunfreeware.com, where it can be\r
- downloaded at no charge. There is a single package for\r
- Solaris 8, 9 and 10 on SPARC and x86, which includes source\r
- code. SE Toolkit author Richard Pettit has started a new\r
- company, Captive Metrics, that plans to bring to market a\r
- multiplatform monitoring tool built on the same principles\r
- as SE Toolkit, written in Java. \r
- </p>\r
- \r
- \r
- \r
- <h4><a name="DTrace" id="DTrace">DTrace\r
- </a></h4>\r
- \r
- <p>Given that DTrace is available for Solaris, FreeBSD and OS\r
- X, it may be worth exploring. There is also a\r
- mod_dtrace module available for httpd. \r
- </p>\r
- \r
- \r
- \r
- <h4><a name="mod_status" id="mod_status">mod_status\r
- </a></h4>\r
- \r
- <p>The mod_status module gives an overview of the server\r
- performance at a given moment. It generates an HTML page\r
- with, among other things, the number of Apache processes running\r
- and how many bytes each has served, and the CPU load caused\r
- by httpd and the rest of the system. The Apache Software\r
- Foundation uses mod_status on its own <a href="http://apache.org/server-status">web site</a>.\r
- If you put the <code>ExtendedStatus On</code>\r
- directive in your <code>httpd.conf</code>,\r
- the <code>mod_status</code>\r
- page will give you more information at the cost of a little\r
- extra work per request. \r
- </p>\r
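- <p>As a minimal sketch of how the status page might be enabled,\r
- the fragment below assumes that mod_status is loaded and that the\r
- server uses the 2.3/2.4-style <code>Require</code> authorization syntax\r
- (a 2.2 server would use <code>Order</code>, <code>Deny</code> and\r
- <code>Allow</code> instead). The <code>/server-status</code> path and\r
- the restriction to local clients are illustrative choices, not\r
- requirements: \r
- </p>\r
- \r
- <div class="example"><p><code>\r
- # Report extra detail at the cost of a little work per request<br />\r
- ExtendedStatus On<br />\r
- <br />\r
- # Example path; access is restricted because the page reveals server internals<br />\r
- &lt;Location "/server-status"&gt;<br />\r
- &nbsp;&nbsp;SetHandler server-status<br />\r
- &nbsp;&nbsp;Require local<br />\r
- &lt;/Location&gt;\r
- </code></p></div>\r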
- \r
- \r
- \r
- \r
- <h3><a name="Web Server Log Files" id="Web Server Log Files">Web Server Log Files\r
- </a></h3>\r
- \r
- <p>Monitoring and analyzing the log files httpd writes is one of\r
- the most effective ways to keep track of your server health and\r
- performance. Monitoring the error log allows you to detect\r
- error conditions, discover attacks and find performance issues.\r
- Analyzing the access logs tells you how busy your server is,\r
- which resources are the most popular and where your users come\r
- from. Historical log file data can give you invaluable insight\r
- into trends in access to your server, which allows you to\r
- predict when your performance needs will overtake your server\r
- capacity. \r
- </p>\r
- \r
- \r
- <h4><a name="Error Log" id="Error Log">Error Log\r
- </a></h4>\r
- \r
- <p>The error log will contain messages if the server has\r
- reached the maximum number of active processes or the\r
- maximum number of concurrently open files. The error log\r
- also reflects when processes are being spawned at a\r
- higher-than-usual rate in response to a sudden increase in\r
- load. When the server starts, the stderr file descriptor is\r
- redirected to the error logfile, so any error encountered\r
- by httpd after it opens its logfiles will appear in this\r
- log. This makes it good practice to review the error log\r
- frequently. \r
- </p>\r
- <p>Before Apache httpd opens its logfiles, any errors will be\r
- written to the stderr stream. If you start httpd manually,\r
- this error information will appear on your terminal and you\r
- can use it directly to troubleshoot your server. If your\r
- httpd is started by a startup script, the destination of\r
- early error messages depends on the script's design. The <code>\r
- /var/log/messages\r
- </code>\r
- file is usually a good bet. On Windows, early error\r
- messages are written to the Application Event Log, which\r
- can be viewed through the Event Viewer in Administrative\r
- Tools. \r
- </p>\r
- <p>\r
- The Error Log is configured through the <code>ErrorLog\r
- </code>\r
- and <code>LogLevel\r
- </code>\r
- configuration directives. The error log of httpd’s main\r
- server configuration receives the log messages that pertain\r
- to the entire server: startup, shutdown, crashes, excessive\r
- process spawns, etc. The <code>ErrorLog\r
- </code>\r
- directive can also be used in virtual host containers. The\r
- error log of a virtual host receives only log messages\r
- specific to that virtual host, such as authentication\r
- failures and 'File not Found' errors. \r
- </p>\r
- <p>On a server that is visible to the Internet, expect to see a\r
- lot of exploit attempts and worm attacks in the error log. Many\r
- of these will be targeted at server platforms other than\r
- Apache, but the current state of affairs is that\r
- attack scripts just throw everything they have at any open\r
- port, regardless of which server is actually running or\r
- what applications might be installed. You could block these\r
- attempts using a firewall or <a href="http://www.modsecurity.org/">mod_security</a>,\r
- but this falls outside the scope of this discussion. \r
- </p>\r
- <p>\r
- The <code>LogLevel\r
- </code>\r
- directive determines the level of detail included in the\r
- logs. There are eight log levels as described here: \r
- </p>\r
- <table>\r
- <tr>\r
- <td>\r
- <p>\r
- <strong>Level\r
- </strong>\r
- </p>\r
- </td>\r
- <td>\r
- <p>\r
- <strong>Description\r
- </strong>\r
- </p>\r
- </td>\r
- </tr>\r
- <tr>\r
- <td>\r
- <p> emerg \r
- </p>\r
- </td>\r
- <td>\r
- <p> Emergencies - system is unusable. \r
- </p>\r
- </td>\r
- </tr>\r
- <tr>\r
- <td>\r
- <p> alert \r
- </p>\r
- </td>\r
- <td>\r
- <p> Action must be taken immediately. \r
- </p>\r
- </td>\r
- </tr>\r
- <tr>\r
- <td>\r
- <p> crit \r
- </p>\r
- </td>\r
- <td>\r
- <p> Critical Conditions. \r
- </p>\r
- </td>\r
- </tr>\r
- <tr>\r
- <td>\r
- <p> error \r
- </p>\r
- </td>\r
- <td>\r
- <p> Error conditions. \r
- </p>\r
- </td>\r
- </tr>\r
- <tr>\r
- <td>\r
- <p> warn \r
- </p>\r
- </td>\r
- <td>\r
- <p> Warning conditions. \r
- </p>\r
- </td>\r
- </tr>\r
- <tr>\r
- <td>\r
- <p> notice \r
- </p>\r
- </td>\r
- <td>\r
- <p> Normal but significant condition. \r
- </p>\r
- </td>\r
- </tr>\r
- <tr>\r
- <td>\r
- <p> info \r
- </p>\r
- </td>\r
- <td>\r
- <p> Informational. \r
- </p>\r
- </td>\r
- </tr>\r
- <tr>\r
- <td>\r
- <p> debug \r
- </p>\r
- </td>\r
- <td>\r
- <p> Debug-level messages \r
- </p>\r
- </td>\r
- </tr>\r
- </table>\r
- <p>The default log level is warn. A production server should\r
- not be run at the debug level, but increasing the level of detail in\r
- the error log can be useful during troubleshooting.\r
- Starting with 2.3.8, <code>LogLevel</code>\r
- can be specified on a per-module basis: \r
- </p>\r
- \r
- <div class="example"><p><code>\r
- LogLevel debug mod_ssl:warn\r
- </code></p></div>\r
- \r
- <p>\r
- This puts all of the server in debug mode, except for <code>mod_ssl</code>,\r
- which tends to be very noisy. \r
- </p>\r
- \r
- \r
- \r
- <h4><a name="Access Log" id="Access Log">Access Log\r
- </a></h4>\r
- \r
- <p>Apache httpd keeps track of every request it services in its\r
- access log file. In addition to the time and nature of a\r
- request, httpd can log the client IP address, the result of\r
- the request and a host of other information.\r
- The various logging format features are documented in the <a href="http://httpd.apache.org/docs/current/mod/mod_log_config.html#formats">manual</a>.\r
- This file exists by default for the main server and can be\r
- configured per virtual host by using the <code>TransferLog</code>\r
- or <code>CustomLog</code>\r
- configuration directive. \r
- </p>\r
- <p>The access logs can be analyzed with any of several free and\r
- commercially available programs. Popular free analysis\r
- packages include Analog and Webalizer. Log analysis should\r
- be done offline so the web server machine is not burdened\r
- by processing the log files. Most log analysis packages\r
- understand the Common Log Format. The fields in the log\r
- lines are explained in the table that follows the example: \r
- </p>\r
- \r
- \r
- <div class="example"><p><code>\r
- 195.54.228.42 - - [24/Mar/2007:23:05:11 -0400] "GET\r
- /sander/feed/ HTTP/1.1" 200 9747<br />\r
- 64.34.165.214 - - [24/Mar/2007:23:10:11 -0400] "GET\r
- /sander/feed/atom HTTP/1.1" 200 9068<br />\r
- 60.28.164.72 - - [24/Mar/2007:23:11:41 -0400] "GET /\r
- HTTP/1.0" 200 618<br />\r
- 85.140.155.56 - - [24/Mar/2007:23:14:12 -0400] "GET\r
- /sander/2006/09/27/44/ HTTP/1.1" 200 14172<br />\r
- 85.140.155.56 - - [24/Mar/2007:23:14:15 -0400] "GET\r
- /sander/2006/09/21/gore-tax-pollution/ HTTP/1.1" 200 15147<br />\r
- 74.6.72.187 - - [24/Mar/2007:23:18:11 -0400] "GET\r
- /sander/2006/09/27/44/ HTTP/1.0" 200 14172<br />\r
- 74.6.72.229 - - [24/Mar/2007:23:24:22 -0400] "GET\r
- /sander/2006/11/21/os-java/ HTTP/1.0" 200 13457\r
- </code></p></div>\r
- \r
- <table>\r
- <tr>\r
- <td>\r
- <p>\r
- <strong>Field\r
- </strong>\r
- </p>\r
- </td>\r
- <td>\r
- <p>\r
- <strong>Content\r
- </strong>\r
- </p>\r
- </td>\r
- <td>\r
- <p>\r
- <strong>Explanation\r
- </strong>\r
- </p>\r
- </td>\r
- </tr>\r
- <tr>\r
- <td>\r
- <p> Client IP \r
- </p>\r
- </td>\r
- <td>\r
- <p> 195.54.228.42 \r
- </p>\r
- </td>\r
- <td>\r
- <p> IP address where the request originated \r
- </p>\r
- </td>\r
- </tr>\r
- <tr>\r
- <td>\r
- <p> RFC 1413 ident \r
- </p>\r
- </td>\r
- <td>\r
- <p> - \r
- </p>\r
- </td>\r
- <td>\r
- <p> Remote user identity as reported by their\r
- identd \r
- </p>\r
- </td>\r
- </tr>\r
- <tr>\r
- <td>\r
- <p> username \r
- </p>\r
- </td>\r
- <td>\r
- <p> - \r
- </p>\r
- </td>\r
- <td>\r
- <p> Remote username as authenticated by Apache \r
- </p>\r
- </td>\r
- </tr>\r
- <tr>\r
- <td>\r
- <p> timestamp \r
- </p>\r
- </td>\r
- <td>\r
- <p> [24/Mar/2007:23:05:11 -0400] \r
- </p>\r
- </td>\r
- <td>\r
- <p> Date and time of request \r
- </p>\r
- </td>\r
- </tr>\r
- <tr>\r
- <td>\r
- <p> Request \r
- </p>\r
- </td>\r
- <td>\r
- <p> "GET /sander/feed/ HTTP/1.1" \r
- </p>\r
- </td>\r
- <td>\r
- <p> Request line \r
- </p>\r
- </td>\r
- </tr>\r
- <tr>\r
- <td>\r
- <p> Status Code \r
- </p>\r
- </td>\r
- <td>\r
- <p> 200 \r
- </p>\r
- </td>\r
- <td>\r
- <p> Response code \r
- </p>\r
- </td>\r
- </tr>\r
- <tr>\r
- <td>\r
- <p> Content Bytes \r
- </p>\r
- </td>\r
- <td>\r
- <p> 9747 \r
- </p>\r
- </td>\r
- <td>\r
- <p> Bytes transferred w/o headers \r
- </p>\r
- </td>\r
- </tr>\r
- </table>\r
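- \r
- <p>The seven fields above make up the Common Log Format. As a\r
- sketch, the format could be declared and attached to a log file as\r
- follows; the nickname <code>common</code> is conventional and the\r
- <code>logs/access_log</code> path is only an example: \r
- </p>\r
- \r
- <div class="example"><p><code>\r
- # %h client IP, %l ident, %u username, %t timestamp,<br />\r
- # %r request line, %&gt;s final status, %b bytes sent without headers<br />\r
- LogFormat "%h %l %u %t \"%r\" %&gt;s %b" common<br />\r
- CustomLog "logs/access_log" common\r
- </code></p></div>\r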
- \r
- \r
- <h4><a name="Rotating Log Files" id="Rotating Log Files">Rotating Log Files\r
- </a></h4>\r
- \r
- <p>There are several reasons to rotate logfiles. Even though\r
- almost no operating systems still have a hard file size\r
- limit of two gigabytes, log files simply become too\r
- large to handle over time. Additionally, any periodic log\r
- file analysis should not be performed on files to which the\r
- server is actively writing. Periodic logfile rotation helps\r
- keep the analysis job manageable, and allows you to keep a\r
- closer eye on usage trends. \r
- </p>\r
- <p>On Unix systems, you can rotate logfiles simply by giving\r
- the old file a new name using <code>mv</code>. The server will keep\r
- writing to the open file even though it has a new name.\r
- When you send a graceful restart signal to the server, it\r
- will open a new logfile with the configured name. For\r
- example, you could run a script from cron like this: \r
- </p>\r
- \r
- \r
- <div class="example"><p><code>\r
- APACHE=/usr/local/apache2<br />\r
- HTTPD=$APACHE/bin/httpd<br />\r
- mv $APACHE/logs/access_log\r
- $APACHE/logarchive/access_log-$(date +%F)<br />\r
- $HTTPD -k graceful\r
- </code></p></div>\r
- \r
- <p>This approach also works on Windows, just not as smoothly.\r
- While the httpd process on your Windows server will keep\r
- writing to the log file after it has been renamed, the\r
- Windows Service that runs Apache cannot do a graceful\r
- restart. Restarting a Service on Windows means stopping it\r
- and then starting it again. The advantage of a graceful\r
- restart is that the httpd child processes get to complete\r
- responding to their current requests before they exit.\r
- Meanwhile, the httpd server becomes immediately available\r
- again to serve new requests. The stop-start that the\r
- Windows Service has to perform will interrupt any requests\r
- currently in progress, and the server is unavailable until\r
- it is started again. Plan for this when you decide the\r
- timing of your restarts. \r
- </p>\r
- <p>\r
- A second approach is to use piped logs. From the <code>CustomLog</code>,\r
- <code>TransferLog</code>\r
- or <code>ErrorLog</code>\r
- directives you can send the log data into any program using\r
- a pipe character (<code>|</code>).\r
- For instance: \r
- </p>\r
- \r
- <div class="example"><p><code>CustomLog "|/usr/local/apache2/bin/rotatelogs\r
- /var/log/access_log 86400" common\r
- </code></p></div>\r
- \r
- <p>The program on the other end of the pipe will receive the\r
- Apache log data on its stdin stream, and can do with this\r
- data whatever it wants. The rotatelogs program that comes\r
- with Apache seamlessly turns over the log file based on\r
- time elapsed or the amount of data written, and leaves each\r
- old log file with a timestamp suffix appended to its name. This\r
- method for rotating logfiles works well on unix platforms,\r
- but is currently broken on Windows. \r
- </p>\r
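- \r
- <p>For rotation by size rather than by time, rotatelogs accepts a\r
- file size with a unit suffix in place of the interval. The following\r
- sketch reuses the paths from the example above and picks 100M as\r
- an arbitrary threshold: \r
- </p>\r
- \r
- <div class="example"><p><code>\r
- # Start a new log file whenever the current one reaches 100 megabytes<br />\r
- CustomLog "|/usr/local/apache2/bin/rotatelogs /var/log/access_log 100M" common\r
- </code></p></div>\r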
- \r
- \r
- \r
- <h4><a name="Logging and Performance" id="Logging and Performance">Logging and Performance\r
- </a></h4>\r
- \r
- <p>Writing entries to the Apache log files obviously takes some\r
- effort, but the information gathered from the logs is so\r
- valuable that under normal circumstances logging should not\r
- be turned off. For optimal performance, you should put your\r
- disk-based site content on a different physical disk than\r
- the server log files: the access patterns are very\r
- different. Retrieving content from disk is a read operation\r
- in a fairly random pattern, and log files are written to\r
- disk sequentially. \r
- </p>\r
- <p>\r
- Do not run a production server with your error <code>\r
- LogLevel\r
- </code>\r
- set to debug. This log level causes a vast amount of\r
- information to be written to the error log, including, in\r
- the case of SSL access, complete dumps of BIO read and\r
- write operations. The performance implications are\r
- significant: use the default warn level instead. \r
- </p>\r
- <p>If your server has more than one virtual host, you may give\r
- each virtual host a separate access logfile. This makes it\r
- easier to analyze the logfile later. However, if your\r
- server has many virtual hosts, all the open logfiles put a\r
- resource burden on your system, and it may be preferable to\r
- log to a single file. Use the <code>%v</code>\r
- format character at the start of your <code>LogFormat</code>\r
- and, starting with 2.3.8, of your <code>ErrorLogFormat</code>\r
- to make httpd print the hostname of the virtual host that\r
- received the request or the error at the beginning of each\r
- log line. A simple Perl script can split out the log file\r
- after it rotates: one is included with the Apache source\r
- under <code>support/split-logfile</code>.\r
- </p>\r
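- <p>A combined access log for all virtual hosts might be set up along\r
- these lines. This is a sketch: the <code>vcommon</code> nickname and\r
- the log path are made up for illustration, and the format is simply\r
- the Common Log Format with <code>%v</code> prepended: \r
- </p>\r
- \r
- <div class="example"><p><code>\r
- # One access log shared by every virtual host; %v records which host served the request<br />\r
- LogFormat "%v %h %l %u %t \"%r\" %&gt;s %b" vcommon<br />\r
- CustomLog "logs/access_log" vcommon\r
- </code></p></div>\r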
- <p>\r
- You can use the <code>BufferedLogs\r
- </code>\r
- directive to have Apache collect several log lines in\r
- memory before writing them to disk. This might yield better\r
- performance, but could affect the order in which the\r
- server's log is written. \r
- </p>\r
- \r
- \r
- \r
- \r
- <h3><a name="Generating A Test Load" id="Generating A Test Load">Generating A Test Load\r
- </a></h3>\r
- \r
- <p>It is useful to generate a test load to monitor system\r
- performance under realistic operating circumstances. Besides\r
- commercial packages such as LoadRunner,\r
- there are a number of freely available tools to generate a\r
- test load against your web server. \r
- </p>\r
- <ul>\r
- <li>Apache ships with a test program called ab, short for\r
- Apache Bench. It can generate a web server load by\r
- repeatedly asking for the same file in rapid succession.\r
- You can specify a number of concurrent connections and have\r
- the program run for either a given amount of time or a\r
- specified number of requests. \r
- </li>\r
- <li>Another freely available load generator is http_load.\r
- This program works with a URL file and can be compiled with\r
- SSL support. \r
- </li>\r
- <li>The Apache Software Foundation offers a tool named flood.\r
- Flood is a fairly sophisticated program that is\r
- configured through an XML file. \r
- </li>\r
- <li>Finally, JMeter, a Jakarta subproject, is an all-Java\r
- load-testing tool. While early versions of this application\r
- were slow and difficult to use, the current version 2.1.1\r
- seems to be versatile and useful. \r
- </li>\r
- <li>\r
- <p>Projects outside the ASF that have proven to be quite\r
- good: grinder, httperf, tsung, FunkLoad\r
- </p>\r
- </li>\r
- </ul>\r
- <p>When you load-test your web server, please keep in mind that if\r
- that server is in production, the test load may negatively\r
- affect the server’s response. Also, any data traffic you\r
- generate may be charged against your monthly traffic allowance.\r
- </p>\r
- \r
- \r
- \r
- </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>\r
-<div class="section">\r
-<h2><a name="Configuring for Performance" id="Configuring for Performance">Configuring for Performance\r
- </a></h2>\r
- \r
- \r
- \r
- <h3><a name="Apache Configuration" id="Apache Configuration">Apache Configuration\r
- </a></h3>\r
- \r
- <p>The Apache 2.2 httpd is by default a pre-forking web server.\r
- When the server starts, the parent process spawns a number of\r
- child processes that do the actual work of servicing requests.\r
- But Apache httpd 2.0 introduced the concept of the\r
- Multi-Processing Module (MPM). Developers can write MPMs to\r
- suit the process or threading architecture of their specific\r
- operating system. Apache 2 comes with special MPMs for Windows,\r
- OS/2, NetWare and BeOS. On Unix-like platforms, the two most\r
- popular MPMs are Prefork and Worker. The Prefork MPM offers the\r
- same pre-forking process model that Apache 1.3 uses. The Worker\r
- MPM runs a smaller number of child processes, and spawns\r
- multiple request handling threads within each child process. In\r
- 2.3+ MPMs are no longer hard-wired; they too can be exchanged\r
- via <code>LoadModule</code>.\r
- The default MPM in 2.3 is the event MPM. \r
- </p>\r
- <p>The maximum number of workers, be they pre-forked child\r
- processes or threads within a process, is an indication of how\r
- many requests your server can manage concurrently. It is merely\r
- a rough estimate because the kernel can queue connection\r
- attempts for your web server. When your site becomes busy and\r
- the maximum number of workers is running, the machine\r
- doesn't hit a hard limit beyond which clients will be\r
- denied access. However, once requests start backing up, system\r
- performance is likely to degrade. \r
- </p>\r
- \r
- \r
- <h4><a name="MaxClients" id="MaxClients">MaxClients\r
- </a></h4>\r
- \r
- <p>\r
- The <code>MaxClients</code>\r
- directive in your Apache httpd configuration file specifies\r
- the maximum number of workers your server can create. It\r
- has two related directives, <code>MinSpareServers</code>\r
- and <code>MaxSpareServers</code>,\r
- which specify the number of workers Apache keeps waiting\r
- in the wings ready to serve requests. The absolute maximum\r
- number of processes is configurable through the <code>ServerLimit</code>\r
- directive. \r
- </p>\r
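- <p>As a minimal sketch of how these directives fit together for\r
- the prefork MPM, a configuration might read as follows. The\r
- numbers are illustrative assumptions, not recommendations.\r
- </p>\r
- \r
- <div class="example"><p><code>\r
- # Illustrative values only: size them for your own machine<br />\r
- MinSpareServers 5<br />\r
- MaxSpareServers 10<br />\r
- ServerLimit 150<br />\r
- MaxClients 150\r
- </code></p></div>\r
- \r
- <p>With these values the server keeps between 5 and 10 idle\r
- workers on hand and never runs more than 150 workers at once.\r
- </p>\r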
- \r
- \r
- \r
- <h4><a name="Spinning Threads" id="Spinning Threads">Spinning Threads\r
- </a></h4>\r
- \r
- <p>For the prefork MPM, the above directives are all there is\r
- to determining the process limit. However, if you are\r
- running a threaded MPM the situation is a little more\r
- complicated. Threaded MPMs support the\r
- <code>ThreadsPerChild</code> directive. Apache requires that\r
- <code>MaxClients</code> is evenly divisible by\r
- <code>ThreadsPerChild</code>. If you set either directive to a\r
- number that doesn’t meet this requirement, Apache will send a\r
- message of complaint to the error log and adjust the\r
- <code>MaxClients</code> value downwards until it is an\r
- integer multiple of <code>ThreadsPerChild</code>.\r
- </p>\r
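- \r
- <p>For example, a threaded configuration that satisfies this\r
- requirement could look like the following sketch. The values are\r
- assumptions chosen so that the arithmetic works out: 16 child\r
- processes of 25 threads each give 400 workers.\r
- </p>\r
- \r
- <div class="example"><p><code>\r
- # 16 processes x 25 threads = 400 workers (illustrative values)<br />\r
- ServerLimit 16<br />\r
- ThreadsPerChild 25<br />\r
- MaxClients 400\r
- </code></p></div>\r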
- \r
- \r
- \r
- <h4><a name="Sizing MaxClients" id="Sizing MaxClients">Sizing MaxClients\r
- </a></h4>\r
- \r
- <p>Optimally, the maximum number of processes should be set so\r
- that all the memory on your system is used, but no more. If\r
- your system gets so overloaded that it needs to heavily\r
- swap core memory out to disk, performance will degrade\r
- quickly. The formula for determining <code>MaxClients\r
- </code>\r
- is fairly simple: \r
- </p>\r
- \r
- <div class="example"><p><code>\r
- total RAM − RAM for OS − RAM for external programs<br />\r
- MaxClients =\r
- -------------------------------------------------------<br />\r
- RAM per httpd process\r
- </code></p></div>\r
- \r
- <p>The various amounts of memory allocated for the OS, external\r
- programs and the httpd processes are best determined by\r
- observation: use the top and free commands described above\r
- to determine the memory footprint of the OS without the web\r
- server running. You can also determine the footprint of a\r
- typical web server process from top: most top\r
- implementations have a Resident Size (RSS) column and a\r
- Shared Memory column. \r
- </p>\r
- <p>The difference between these two is the amount of memory\r
- per-process. The shared segment really exists only once and\r
- is used for the code and libraries loaded and the dynamic\r
- inter-process tally, or 'scoreboard,' that Apache\r
- keeps. How much memory each process takes for itself\r
- depends heavily on the number and kind of modules you use.\r
- The best approach to use in determining this need is to\r
- generate a typical test load against your web site and see\r
- how large the httpd processes become. \r
- </p>\r
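- <p>As a worked illustration of the formula, assume a hypothetical\r
- machine with 4096 MB of RAM, of which the OS needs 512 MB and\r
- external programs another 512 MB, and httpd processes that each\r
- take about 24 MB of non-shared memory under test load:\r
- </p>\r
- \r
- <div class="example"><p><code>\r
- MaxClients = (4096 MB − 512 MB − 512 MB) / 24 MB = 3072 / 24 = 128\r
- </code></p></div>\r
- \r
- <p>These figures are invented for the sake of the arithmetic;\r
- substitute the values you observe on your own system.\r
- </p>\r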
- <p>The RAM for external programs parameter is intended mostly\r
- for CGI programs and scripts that run outside the web\r
- server process. However, if you have a Java virtual machine\r
- running Tomcat on the same box it will need a significant\r
- amount of memory as well. The above assessment should give\r
- you an idea how far you can push <code>MaxClients</code>,\r
- but it is not an exact science. When in doubt, be\r
- conservative and use a low <code>MaxClients\r
- </code>\r
- value. The Linux kernel will put extra memory to good use\r
- for caching disk access. On Solaris you need enough\r
- available real RAM memory to create any process. If no real\r
- memory is available, httpd will start writing ‘No space\r
- left on device’ messages to the error log and be unable\r
- to fork additional child processes, so a higher <code>\r
- MaxClients\r
- </code>\r
- value may actually be a disadvantage. \r
- </p>\r
- \r
- \r
- \r
- <h4><a name="Selecting your MPM" id="Selecting your MPM">Selecting your MPM\r
- </a></h4>\r
- \r
- <p>The prime reason for selecting a threaded MPM is that\r
- threads consume fewer system resources than processes, and\r
- it takes less effort for the system to switch between\r
- threads. This is more true for some operating systems than\r
- for others. On systems like Solaris and AIX, manipulating\r
- processes is relatively expensive in terms of system\r
- resources. On these systems, running a threaded MPM makes\r
- sense. On Linux, the threading implementation actually uses\r
- one process for each thread. Linux processes are relatively\r
- lightweight, but it means that a threaded MPM offers less\r
- of a performance advantage than in other environments. \r
- </p>\r
- <p>Running a threaded MPM can cause stability problems in some\r
- situations. For instance, should a child process of a\r
- preforked MPM crash, at most one client connection is\r
- affected. However, if a threaded child crashes, all the\r
- threads in that process disappear, which means all the\r
- clients currently being served by that process will see\r
- their connection aborted. Additionally, there may be\r
- so-called "thread-safety" issues, especially with\r
- third-party libraries. In threaded applications, threads\r
- may access the same variables indiscriminately, not knowing\r
- whether a variable may have been changed by another thread.\r
- </p>\r
- <p>This has been a sore point within the PHP community. The PHP\r
- processor heavily relies on third-party libraries and\r
- cannot guarantee that all of these are thread-safe. The\r
- good news is that if you are running Apache on Linux, you\r
- can run PHP in the preforked MPM without fear of losing too\r
- much performance relative to the threaded option. \r
- </p>\r
- \r
- \r
- \r
- <h4><a name="Spinning Locks" id="Spinning Locks">Spinning Locks\r
- </a></h4>\r
- \r
- <p>Apache httpd maintains an inter-process lock around its\r
- network listener. For all practical purposes, this means\r
- that only one httpd child process can receive a request at\r
- any given time. The other processes are either servicing\r
- requests already received or are 'camping out' on\r
- the lock, waiting for the network listener to become\r
- available. This process is best visualized as a revolving\r
- door, with only one process allowed in the door at any\r
- time. On a heavily loaded web server with requests arriving\r
- constantly, the door spins quickly and requests are\r
- accepted at a steady rate. On a lightly loaded web server,\r
- the process that currently "holds" the lock may\r
- have to stay in the door for a while, during which all the\r
- other processes sit idle, waiting to acquire the lock. At\r
- this time, the parent process may decide to terminate some\r
- children based on its <code>MaxSpareServers\r
- </code>\r
- directive. \r
- </p>\r
- \r
- \r
- \r
- <h4><a name="The Thundering Herd" id="The Thundering Herd">The Thundering Herd\r
- </a></h4>\r
- \r
- <p>The function of the 'accept mutex' (as this\r
- inter-process lock is called) is to keep request reception\r
- moving along in an orderly fashion. If the lock is absent,\r
- the server may exhibit the Thundering Herd syndrome. \r
- </p>\r
- <p>Consider an American Football team poised on the line of\r
- scrimmage. If the football players were Apache processes,\r
- all team members would go for the ball simultaneously at\r
- the snap. One process would get it, and all the others\r
- would have to lumber back to the line for the next snap. In\r
- this metaphor, the accept mutex acts as the quarterback,\r
- delivering the connection "ball" to the\r
- appropriate player process. \r
- </p>\r
- <p>Moving this much information around is obviously a lot of\r
- work, and, like a smart person, a smart web server tries to\r
- avoid it whenever possible. Hence the revolving door\r
- construction. In recent years, many operating systems,\r
- including Linux and Solaris, have put code in place to\r
- prevent the Thundering Herd syndrome. Apache recognizes\r
- this and if you run with just one network listener, meaning\r
- one virtual host or just the main server, Apache will\r
- refrain from using an accept mutex. If you run with\r
- multiple listeners (for instance because you have a virtual\r
- host serving SSL requests), it will activate the accept\r
- mutex to avoid internal conflicts. \r
- </p>\r
- <p>\r
- You can manipulate the accept mutex with the <code>\r
- AcceptMutex\r
- </code>\r
- directive. Besides turning the accept mutex off, you can\r
- select the locking mechanism. Common locking mechanisms\r
- include fcntl, System V Semaphores and pthread locking. Not\r
- all are available on every platform, and their availability\r
- also depends on compile-time settings. The various locking\r
- mechanisms may place specific demands on system resources:\r
- manipulate them with care. \r
- </p>\r
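- <p>A minimal sketch of selecting a mechanism explicitly, assuming\r
- that your platform and build support System V semaphores:\r
- </p>\r
- \r
- <div class="example"><p><code>\r
- # Assumes sysvsem is available on this platform and build<br />\r
- AcceptMutex sysvsem\r
- </code></p></div>\r
- \r
- <p>Setting the value to <code>Default</code> instead lets Apache\r
- use the mechanism selected at compile time.\r
- </p>\r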
- <p>There is no compelling reason to disable the accept mutex.\r
- Apache automatically recognizes the single listener\r
- situation described above and knows if it is safe to run\r
- without a mutex on your platform. \r
- </p>\r
- \r
- \r
- \r
- \r
- <h3><a name="Tuning the Operating System" id="Tuning the Operating System">Tuning the Operating System\r
- </a></h3>\r
- \r
- <p>People often look for the 'magic tune-up' that will\r
- make their system perform four times as fast by tweaking just\r
- one little setting. The truth is, present-day UNIX derivatives\r
- are pretty well adjusted straight out of the box and there is\r
- not a lot that needs to be done to make them perform optimally.\r
- However, there are a few things that an administrator can do to\r
- improve performance. \r
- </p>\r
- \r
- \r
- <h4><a name="RAM and Swap Space" id="RAM and Swap Space">RAM and Swap Space\r
- </a></h4>\r
- \r
- <p>The usual mantra regarding RAM is "more is\r
- better". As discussed above, unused RAM is put to good\r
- use as file system cache. The Apache processes get bigger\r
- if you load more modules, especially if you use modules\r
- that generate dynamic page content within the processes,\r
- like PHP and mod_perl. A large configuration file (with many\r
- virtual hosts) also tends to inflate the process footprint.\r
- Having ample RAM allows you to run Apache with more child\r
- processes, which allows the server to process more\r
- concurrent requests. \r
- </p>\r
- <p>While the various platforms treat their virtual memory in\r
- different ways, it is never a good idea to run with less\r
- disk-based swap space than RAM. The virtual memory system\r
- is designed to provide a fallback for RAM, but when you\r
- don't have disk space available and run out of\r
- swappable memory, your machine grinds to a halt. This can\r
- crash your box, requiring a physical reboot for which your\r
- hosting facility may charge you. \r
- </p>\r
- <p>Also, such an outage naturally occurs when you least want\r
- it: when the world has found your website and is beating a\r
- path to your door. If you have enough disk-based swap space\r
- available and the machine gets overloaded, it may get very,\r
- very slow as the system needs to swap memory pages to disk\r
- and back, but when the load decreases the system should\r
- recover. Remember, you still have <code>MaxClients\r
- </code>\r
- to keep things in hand. \r
- </p>\r
- <p>Most unix-like operating systems use designated disk\r
- partitions for swap space. When a system starts up it finds\r
- all swap partitions on the disk(s), by partition type or\r
- because they are listed in the file <code>/etc/fstab</code>,\r
- and automatically enables them. When adding a disk or\r
- installing the operating system, be sure to allocate enough\r
- swap space to accommodate eventual RAM upgrades.\r
- Reassigning disk space on a running system is a cumbersome\r
- process. \r
- </p>\r
- <p>Plan for available hard drive swap space of at least twice\r
- your amount of RAM, perhaps up to four times in situations\r
- with frequent peaking loads. Remember to adjust this\r
- configuration whenever you upgrade RAM on your system. In a\r
- pinch, you can use a regular file as swap space. For\r
- instructions on how to do this, see the manual pages for\r
- the <code>mkswap\r
- </code>\r
- and <code>swapon\r
- </code>\r
- or <code>swap\r
- </code>\r
- programs. \r
- </p>\r
- \r
- \r
- \r
- <h4><a name="ulimit: Files and Processes" id="ulimit: Files and Processes">ulimit: Files and Processes\r
- </a></h4>\r
- \r
- <p>Given a machine with plenty of RAM and processor capacity,\r
- you can run hundreds of Apache processes if necessary. . .\r
- and if your kernel allows it. \r
- </p>\r
- <p>Consider a situation in which several hundred web server\r
- processes are running; if some of these need to spawn CGI\r
- processes, the maximum number of processes would be reached quickly. \r
- </p>\r
- <p>However, you can change this limit with the command \r
- </p>\r
- \r
- <div class="example"><p><code>\r
- ulimit [-H|-S] -u [newvalue]\r
- </code></p></div>\r
- \r
- <p>This must be changed before starting the server, since the\r
- new value will only be available to the current shell and\r
- programs started from it. In newer Linux kernels the\r
- default has been raised to 2048. On FreeBSD, the number\r
- seems to be the rather unusual 513. In the default user\r
- shell on this system, <code>csh</code>, the equivalent is\r
- <code>limit</code>, which works analogously to the Bourne-like\r
- <code>ulimit</code>:\r
- </p>\r
- \r
- <div class="example"><p><code>\r
- limit [-h] maxproc [newvalue]\r
- </code></p></div>\r
- \r
- <p>Similarly, the kernel may limit the number of open files per\r
- process. This is generally not a problem for pre-forked\r
- servers, which just handle one request at a time per\r
- process. Threaded servers, however, serve many requests per\r
- process and much more easily run out of available file\r
- descriptors. You can increase the maximum number of open\r
- files per process by running the \r
- </p>\r
- \r
- <div class="example"><p><code>ulimit -n [newvalue]\r
- </code></p></div>\r
- \r
- <p>command. Once again, this must be done prior to starting\r
- Apache. \r
- </p>\r
- \r
- \r
- \r
- <h4><a name="Setting User Limits on System Startup" id="Setting User Limits on System Startup">Setting User Limits on System Startup\r
- </a></h4>\r
- \r
- <p>Under Linux, you can set the ulimit parameters on bootup by\r
- editing the <code>/etc/security/limits.conf\r
- </code>\r
- file. This file allows you to set soft and hard limits on a\r
- per-user or per-group basis; the file contains commentary\r
- explaining the options. To enable this, make sure that the\r
- file <code>/etc/pam.d/login\r
- </code>\r
- contains the line \r
- </p>\r
- \r
- <div class="example"><p><code>session required /lib/security/pam_limits.so\r
- </code></p></div>\r
- \r
- <p>All items can have a 'soft' and a 'hard'\r
- limit: the first is the default setting and the second the\r
- maximum value for that item. \r
- </p>\r
- <p>\r
- In FreeBSD's <code>/etc/login.conf</code>\r
- these resources can be limited or extended system wide,\r
- analogously to <code>limits.conf</code>.\r
- 'Soft' limits can be specified with <code>-cur</code>\r
- and 'hard' limits with <code>-max</code>.\r
- </p>\r
- <p>Solaris has a similar mechanism for manipulating limit\r
- values at boot time: In <code>/etc/system\r
- </code>\r
- you can set kernel tunables valid for the entire system at\r
- boot time. These are the same tunables that can be set with\r
- the <code>mdb\r
- </code>\r
- kernel debugger during run time. The soft and hard limit\r
- corresponding to ulimit -n can be set via: \r
- </p>\r
- \r
- <div class="example"><p><code>\r
- set rlim_fd_max=65536<br />\r
- set rlim_fd_cur=2048\r
- </code></p></div>\r
- \r
- <p>Solaris calculates the maximum number of allowed processes\r
- per user (<code>maxuprc</code>) based on the total amount of\r
- available memory on the system (<code>maxusers</code>).\r
- You can review the numbers with \r
- </p>\r
- \r
- <div class="example"><p><code>sysdef -i | grep maximum\r
- </code></p></div>\r
- \r
- <p>but it is not recommended to change them. \r
- </p>\r
- \r
- \r
- \r
- <h4><a name="Turn Off Unused Services and Modules" id="Turn Off Unused Services and Modules">Turn Off Unused Services and Modules\r
- </a></h4>\r
- \r
- <p>Many UNIX and Linux distributions come with a slew of\r
- services turned on by default. You probably need few of\r
- them. For example, your web server does not need to be\r
- running sendmail, nor is it likely to be an NFS server,\r
- etc. Turn them off. \r
- </p>\r
- <p>On Red Hat Linux, the chkconfig tool will help you do this\r
- from the command line. On Solaris systems <code>svcs\r
- </code>\r
- and <code>svcadm\r
- </code>\r
- will show which services are enabled and disable them\r
- respectively. \r
- </p>\r
- <p>In a similar fashion, cast a critical eye on the Apache\r
- modules you load. Most binary distributions of Apache\r
- httpd, and pre-installed versions that come with Linux\r
- distributions, have their modules enabled through the <code>\r
- LoadModule\r
- </code>\r
- directive. \r
- </p>\r
- <p>Unused modules may be culled: if you don't rely on\r
- their functionality and configuration directives, you can\r
- turn them off by commenting out the corresponding <code>\r
- LoadModule\r
- </code>\r
- lines. Read the documentation on each module’s\r
- functionality before deciding whether to keep it enabled.\r
- While the performance overhead of an unused module is\r
- small, it's also unnecessary. \r
- </p>\r
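- <p>As an illustration, the fragment below keeps one module and\r
- disables two others. Which modules are safe to disable depends\r
- entirely on your site: the selection here is only an assumption\r
- for the sake of the example, and the module paths may differ in\r
- your installation.\r
- </p>\r
- \r
- <div class="example"><p><code>\r
- LoadModule rewrite_module modules/mod_rewrite.so<br />\r
- # Disabled: not used by this (hypothetical) site<br />\r
- #LoadModule autoindex_module modules/mod_autoindex.so<br />\r
- #LoadModule userdir_module modules/mod_userdir.so\r
- </code></p></div>\r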
- \r
- \r
- \r
- \r
- </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>\r
-<div class="section">\r
-<h2><a name="Caching Content" id="Caching Content">Caching Content\r
- </a></h2>\r
- \r
- <p>Requests for dynamically generated content usually take\r
- significantly more resources than requests for static content.\r
- Static content consists of simple files (pages, images, etc.) on disk\r
- that are very efficiently served. Many operating systems also\r
- automatically cache the contents of frequently accessed files in\r
- memory. \r
- </p>\r
- <p>Processing dynamic requests, on the contrary, can be much more\r
- involved. Running CGI scripts, handing off requests to an external\r
- application server and accessing database content can introduce\r
- significant latency and processing load to a busy web server. Under\r
- many circumstances, performance can be improved by turning popular\r
- dynamic requests into static requests. In this section, two\r
- approaches to this will be discussed. \r
- </p>\r
- \r
- \r
- <h3><a name="Making Popular Pages Static" id="Making Popular Pages Static">Making Popular Pages Static\r
- </a></h3>\r
- \r
- <p>By pre-rendering the response pages for the most popular queries\r
- in your application, you can gain a significant performance\r
- improvement without giving up the flexibility of dynamically\r
- generated content. For instance, if your application is a\r
- flower delivery service, you would probably want to pre-render\r
- your catalog pages for red roses during the weeks leading up to\r
- Valentine's Day. When the user searches for red roses,\r
- they are served the pre-rendered page. Queries for, say, yellow\r
- roses will be generated directly from the database. The\r
- mod_rewrite module included with Apache is a great tool to\r
- implement these substitutions. \r
- </p>\r
- \r
- \r
- <h4><a name="Example: A Statically Rendered Blog" id="Example: A Statically Rendered Blog">Example: A Statically Rendered Blog\r
- </a></h4>\r
- \r
- <p>Blosxom is a lightweight web log package that runs as a CGI.\r
- It is written in Perl and uses plain text files for entry\r
- input. Besides running as CGI, Blosxom can be run from the\r
- command line to pre-render blog pages. Pre-rendering pages\r
- to static HTML can yield a significant performance boost in\r
- the event that large numbers of people actually start\r
- reading your blog. \r
- </p>\r
- <p>To run blosxom for static page generation, edit the CGI\r
- script according to the documentation. Set the $static_dir\r
- variable to the <code>DocumentRoot\r
- </code>\r
- of the web server, and run the script from the command line\r
- as follows: \r
- </p>\r
- \r
- <div class="example"><p><code>$ perl blosxom.cgi -password='whateveryourpassword'\r
- </code></p></div>\r
- \r
- <p>This can be run periodically from Cron, after you upload\r
- content, etc. To make Apache substitute the statically\r
- rendered pages for the dynamic content, we’ll use\r
- mod_rewrite. This module is included with the Apache source\r
- code, but is not compiled by default. It can be built with\r
- the server by passing the option <code>\r
- --enable-rewrite[=shared]\r
- </code>\r
- to the configure command. Many binary distributions of\r
- Apache come with mod_rewrite included. The following is an\r
- example of an Apache virtual host that takes advantage of\r
- pre-rendered blog pages: \r
- </p>\r
- \r
- <div class="example"><p><code>Listen *:8001<br />\r
- <VirtualHost *:8001><br />\r
- <span class="indent">\r
- ServerName blog.sandla.org:8001<br />\r
- ServerAdmin sander@temme.net<br />\r
- DocumentRoot "/home/sctemme/inst/blog/httpd/htdocs"<br />\r
- <Directory\r
- "/home/sctemme/inst/blog/httpd/htdocs"><br />\r
- <span class="indent">\r
- Options +Indexes<br />\r
- Order allow,deny<br />\r
- Allow from all<br />\r
- RewriteEngine on<br />\r
- RewriteCond %{REQUEST_FILENAME} !-f<br />\r
- RewriteCond %{REQUEST_FILENAME} !-d<br />\r
- RewriteRule ^(.*)$ /cgi-bin/blosxom.cgi/$1 [L,QSA]<br />\r
- </span>\r
- </Directory><br />\r
- RewriteLog\r
- /home/sctemme/inst/blog/httpd/logs/rewrite_log<br />\r
- RewriteLogLevel 9<br />\r
- ErrorLog /home/sctemme/inst/blog/httpd/logs/error_log<br />\r
- LogLevel debug<br />\r
- CustomLog /home/sctemme/inst/blog/httpd/logs/access_log\r
- common<br />\r
- ScriptAlias /cgi-bin/ /home/sctemme/inst/blog/bin/<br />\r
- <Directory "/home/sctemme/inst/blog/bin"><br />\r
- <span class="indent">\r
- Options +ExecCGI<br />\r
- Order allow,deny<br />\r
- Allow from all<br />\r
- </span>\r
- </Directory><br />\r
- </span>\r
- </VirtualHost>\r
- </code></p></div>\r
- \r
- <p>\r
- The <code>RewriteCond\r
- </code>\r
- and <code>RewriteRule\r
- </code>\r
- directives say that, if the requested resource does not\r
- exist as a file or a directory, its path is passed to the\r
- Blosxom CGI for rendering. Blosxom uses Path Info to\r
- specify blog entries and index pages, so this means that if\r
- a particular path under Blosxom exists as a static file in\r
- the file system, the file is served instead. Any request\r
- that isn't pre-rendered is served by the CGI. This\r
- means that individual entries, which show the comments, are\r
- always served by the CGI which in turn means that your\r
- comment spam is always visible. This configuration also\r
- hides the Blosxom CGI from the user-visible URL in their\r
- Location bar. mod_rewrite is a fantastically powerful and\r
- versatile module: investigate it to arrive at a\r
- configuration that is best for your situation. \r
- </p>\r
- \r
- \r
- \r
- \r
- <h3><a name="Caching Content With mod_cache" id="Caching Content With mod_cache">Caching Content With mod_cache\r
- </a></h3>\r
- \r
- <p>The mod_cache module provides intelligent caching of HTTP\r
- responses: it is aware of the expiration timing and content\r
- requirements that are part of the HTTP specification. The\r
- mod_cache module caches URL response content. If content sent\r
- to the client is considered cacheable, it is saved to disk.\r
- Subsequent requests for that URL will be served directly from\r
- the cache. The provider module for mod_cache, mod_disk_cache,\r
- determines how the cached content is stored on disk. Most\r
- server systems will have more disk available than memory, and\r
- it's good to note that some operating system kernels cache\r
- frequently accessed disk content transparently in memory, so\r
- replicating this in the server is not very useful. \r
- </p>\r
- <p>To enable efficient content caching and avoid presenting the\r
- user with stale or invalid content, the application that\r
- generates the actual content has to send the correct response\r
- headers. Without headers like <code>ETag:</code>,\r
- <code>Last-Modified:</code> or <code>Expires:</code>,\r
- mod_cache cannot make the right decision on whether to cache\r
- the content, serve it from cache or leave it alone. When\r
- testing content caching, you may find that you need to modify\r
- your application or, if this is impossible, selectively disable\r
- caching for URLs that cause problems. The mod_cache modules are\r
- not compiled by default, but can be enabled by passing the\r
- option <code>--enable-cache[=shared]\r
- </code>\r
- to the configure script. If you use a binary distribution of\r
- Apache httpd, or it came with your port or package collection,\r
- it may have mod_cache already included. \r
- </p>\r
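- <p>The selective disabling mentioned above can be sketched as\r
- follows, assuming a hypothetical <code>/login</code> path whose\r
- responses must never be served from the cache:\r
- </p>\r
- \r
- <div class="example"><p><code>\r
- CacheEnable disk /<br />\r
- # /login is a made-up path for illustration<br />\r
- CacheDisable /login\r
- </code></p></div>\r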
- \r
- \r
- <h4><a name="Example: wiki.apache.org" id="Example: wiki.apache.org">Example: wiki.apache.org\r
- </a></h4>\r
- \r
- <p>\r
- The Apache Software Foundation Wiki is served by\r
- <a href="/httpd/MoinMoin">MoinMoin</a>.\r
- <a href="/httpd/MoinMoin">MoinMoin</a>\r
- is written in Python and runs as a CGI. To date, any\r
- attempts to run it under mod_python have been unsuccessful.\r
- The CGI proved to place an untenably high load on the\r
- server machine, especially when the Wiki was being indexed\r
- by search engines like Google. To lighten the load on the\r
- server machine, the Apache Infrastructure team turned to\r
- mod_cache. It turned out <a href="/httpd/MoinMoin">MoinMoin\r
- </a>\r
- needed a small patch to ensure proper behavior behind the\r
- caching server: certain requests can never be cached and\r
- the corresponding Python modules were patched to send the\r
- proper HTTP response headers. After this modification, the\r
- cache in front of the Wiki was enabled with the following\r
- configuration snippet in <code>httpd.conf</code>:\r
- </p>\r
- \r
- <div class="example"><p><code>\r
- CacheRoot /raid1/cacheroot<br />\r
- CacheEnable disk /<br />\r
- # A page modified 100 minutes ago will expire in 10 minutes<br />\r
- CacheLastModifiedFactor .1<br />\r
- # Always check again after 6 hours<br />\r
- CacheMaxExpire 21600\r
- </code></p></div>\r
- \r
- <p>This configuration will try to cache any and all content\r
- within its virtual host. It will never cache content for\r
- more than six hours (the <code>CacheMaxExpire\r
- </code>\r
- directive). If no <code>Expires:\r
- </code>\r
- header is present in the response, mod_cache will compute\r
- an expiration period from the <code>Last-Modified:\r
- </code>\r
- header. The computation using <code>CacheLastModifiedFactor\r
- </code>\r
- is based on the assumption that if a page was recently\r
- modified, it is likely to change again in the near future\r
- and will have to be re-cached. \r
- </p>\r
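- <p>To make the computation concrete with the settings above: the\r
- expiration period is the time since the last modification\r
- multiplied by the factor, capped by <code>CacheMaxExpire</code>.\r
- The page ages below are illustrative assumptions:\r
- </p>\r
- \r
- <div class="example"><p><code>\r
- modified 100 minutes ago: 100 min × 0.1 = 10 min<br />\r
- modified 5 days ago: 120 h × 0.1 = 12 h, capped to 6 h (21600 s)\r
- </code></p></div>\r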
- <p>\r
- Do note that it can pay off to <em>disable\r
- </em>\r
- the <code>ETag:\r
- </code>\r
- header: For files smaller than 1k the server has to\r
- calculate the checksum (usually MD5) and then send out a <code>\r
- 304 Not Modified\r
- </code>\r
- response, which will waste some CPU and still consume\r
- the same amount of network resources for the transfer (one\r
- TCP packet). For resources larger than 1k it might prove\r
- CPU expensive to calculate the header for each request.\r
- Unfortunately, there is currently no way to cache\r
- these headers. \r
- </p>\r
- <div class="example"><p><code>\r
- <FilesMatch \.(jpe?g|png|gif|js|css|x?html|xml)><br />\r
- <span class="indent">\r
- FilesETag None<br />\r
- </span>\r
- </FilesMatch>\r
- </code></p></div>\r
- \r
- <p>\r
- This will disable the generation of the <code>ETag:\r
- </code>\r
- header for most static resources. The server does not\r
- calculate these headers for dynamic resources. \r
- </p>\r
- \r
- \r
- \r
- \r
- </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>\r
-<div class="section">\r
-<h2><a name="Further Considerations" id="Further Considerations">Further Considerations\r
- </a></h2>\r
- \r
- <p>Armed with the knowledge of how to tune a system to deliver the\r
- desired performance, we will soon discover that <em>one</em>\r
- system might prove a bottleneck. How to make a system fit for\r
- growth, or how to put a number of systems into tune will be\r
- discussed in <a href="/httpd/PerformanceScalingOut">PerformanceScalingOut</a>.\r
- </p>\r
- </div></div>\r
-<div class="bottomlang">\r
-<p><span>Available Languages: </span><a href="../en/misc/perf-scaling.html" title="English"> en </a></p>\r
-</div><div id="footer">\r
-<p class="apache">Copyright 2012 The Apache Software Foundation.<br />Licensed under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.</p>\r
-<p class="menu"><a href="../mod/">Modules</a> | <a href="../mod/directives.html">Directives</a> | <a href="../faq/">FAQ</a> | <a href="../glossary.html">Glossary</a> | <a href="../sitemap.html">Sitemap</a></p></div>\r
+<?xml version="1.0" encoding="ISO-8859-1"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head><!--
+ XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
+ This file is generated from xml source: DO NOT EDIT
+ XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
+ -->
+<title>Performance Scaling
+ - Apache HTTP Server</title>
+<link href="../style/css/manual.css" rel="stylesheet" media="all" type="text/css" title="Main stylesheet" />
+<link href="../style/css/manual-loose-100pc.css" rel="alternate stylesheet" media="all" type="text/css" title="No Sidebar - Default font size" />
+<link href="../style/css/manual-print.css" rel="stylesheet" media="print" type="text/css" />
+<link href="../images/favicon.ico" rel="shortcut icon" /></head>
+<body id="manual-page"><div id="page-header">
+<p class="menu"><a href="../mod/">Modules</a> | <a href="../mod/directives.html">Directives</a> | <a href="../faq/">FAQ</a> | <a href="../glossary.html">Glossary</a> | <a href="../sitemap.html">Sitemap</a></p>
+<p class="apache">Apache HTTP Server Version 2.5</p>
+<img alt="" src="../images/feather.gif" /></div>
+<div class="up"><a href="./"><img title="<-" alt="<-" src="../images/left.gif" /></a></div>
+<div id="path">
+<a href="http://www.apache.org/">Apache</a> > <a href="http://httpd.apache.org/">HTTP Server</a> > <a href="http://httpd.apache.org/docs/">Documentation</a> > <a href="../">Version 2.5</a></div><div id="page-content"><div id="preamble"><h1>Performance Scaling
+ </h1>
+<div class="toplang">
+<p><span>Available Languages: </span><a href="../en/misc/perf-scaling.html" title="English"> en </a></p>
+</div>
+
+
+ <p>The Performance Tuning page in the Apache 1.3 documentation says:
+ </p>
+ <ul>
+ <li>“Apache is a general webserver, which is designed to be
+ correct first, and fast
+ second. Even so, its performance is quite satisfactory. Most
+ sites have less than 10Mbits of outgoing bandwidth, which
+ Apache can fill using only a low end Pentium-based
+ webserver.”
+ </li>
+ </ul>
+ <p>However, this sentence was written a few years ago, and in the
+ meantime several things have happened. On one hand, web server
+ hardware has become much faster. On the other hand, many sites now
+ are allowed much more than ten megabits per second of outgoing
+ bandwidth. In addition, web applications have become more complex.
+ The classic brochureware site is alive and well, but the web has
+ grown up substantially as a computing application platform and
+ webmasters may find themselves running dynamic content in Perl, PHP
+ or Java, all of which take a toll on performance.
+ </p>
+ <p>Therefore, in spite of strides forward in machine speed and
+ bandwidth allowances, web server performance and web application
+ performance remain areas of concern. In this documentation several
+ aspects of web server performance will be discussed.
+ </p>
+
+ </div>
+<div id="quickview"><ul id="toc"><li><img alt="" src="../images/down.gif" /> <a href="#What Will and Will Not Be Discussed">What Will and Will Not Be Discussed
+ </a></li>
+<li><img alt="" src="../images/down.gif" /> <a href="#Monitoring Your Server">Monitoring Your Server
+ </a></li>
+<li><img alt="" src="../images/down.gif" /> <a href="#Configuring for Performance">Configuring for Performance
+ </a></li>
+<li><img alt="" src="../images/down.gif" /> <a href="#Caching Content">Caching Content
+ </a></li>
+<li><img alt="" src="../images/down.gif" /> <a href="#Further Considerations">Further Considerations
+ </a></li>
+</ul></div>
+<div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
+<div class="section">
+<h2><a name="What Will and Will Not Be Discussed" id="What Will and Will Not Be Discussed">What Will and Will Not Be Discussed
+ </a></h2>
+
+ <p>This document focuses on easily accessible configuration and tuning
+ options for Apache httpd 2.2 and 2.3, as well as monitoring tools.
+ Monitoring tools allow you to observe your web server and
+ gather information about its performance, or lack thereof.
+ We assume that you don't have an unlimited budget for
+ server hardware, so the existing infrastructure will have to do the
+ job, and that you have no desire to compile your own Apache or to recompile
+ the operating system kernel. We do assume, though, that you have
+ some familiarity with the Apache httpd configuration file.
+ </p>
+
+ </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
+<div class="section">
+<h2><a name="Monitoring Your Server" id="Monitoring Your Server">Monitoring Your Server
+ </a></h2>
+
+ <p>The first task when sizing or performance-tuning your server is to
+ find out how your system is currently performing. By monitoring
+ your server under real-world load, or artificially generated load,
+ you can extrapolate its behavior under stress, such as when your
+ site is mentioned on Slashdot.
+ </p>
+
+
+ <h3><a name="Monitoring Tools" id="Monitoring Tools">Monitoring Tools
+ </a></h3>
+
+
+
+ <h4><a name="top" id="top">top
+ </a></h4>
+
+ <p>The top tool ships with Linux and FreeBSD; Solaris offers
+ <code>prstat</code> instead. Top collects a number of statistics for the
+ system and for each running process, then displays them
+ interactively on your terminal. The data displayed is
+ refreshed every second and varies by platform, but
+ typically includes system load average, number of processes
+ and their current states, the percentage of CPU time spent
+ executing user and system code, and the state of the
+ virtual memory system. The data displayed for each process
+ is typically configurable and includes its process name and
+ ID, priority and nice values, memory footprint, and
+ percentage CPU usage. The following example shows multiple
+ httpd processes (with MPM worker and event) running on a
+ Linux (Xen) system:
+ </p>
+
+ <div class="example"><p><code>
+ top - 23:10:58 up 71 days, 6:14, 4 users, load average:
+ 0.25, 0.53, 0.47<br />
+ Tasks: 163 total, 1 running, 162 sleeping, 0 stopped,
+ 0 zombie<br />
+ Cpu(s): 11.6%us, 0.7%sy, 0.0%ni, 87.3%id, 0.4%wa,
+ 0.0%hi, 0.0%si, 0.0%st<br />
+ Mem: 2621656k total, 2178684k used, 442972k free,
+ 100500k buffers<br />
+ Swap: 4194296k total, 860584k used, 3333712k free,
+ 1157552k cached<br />
+ <br />
+ PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
+ COMMAND<br />
+ 16687 example_ 20 0 1200m 547m 179m S 45 21.4
+ 1:09.59 httpd-worker<br />
+ 15195 www 20 0 441m 33m 2468 S 0 1.3
+ 0:41.41 httpd-worker<br />
+ 1 root 20 0 10312 328 308 S 0 0.0 0:33.17
+ init<br />
+ 2 root 15 -5 0 0 0 S 0 0.0 0:00.00
+ kthreadd<br />
+ 3 root RT -5 0 0 0 S 0 0.0 0:00.14
+ migration/0<br />
+ 4 root 15 -5 0 0 0 S 0 0.0 0:04.58
+ ksoftirqd/0<br />
+ 5 root RT -5 0 0 0 S 0 0.0 4:45.89
+ watchdog/0<br />
+ 6 root 15 -5 0 0 0 S 0 0.0 1:42.52
+ events/0<br />
+ 7 root 15 -5 0 0 0 S 0 0.0 0:00.00
+ khelper<br />
+ 19 root 15 -5 0 0 0 S 0 0.0 0:00.00
+ xenwatch<br />
+ 20 root 15 -5 0 0 0 S 0 0.0 0:00.00
+ xenbus<br />
+ 28 root RT -5 0 0 0 S 0 0.0 0:00.14
+ migration/1<br />
+ 29 root 15 -5 0 0 0 S 0 0.0 0:00.20
+ ksoftirqd/1<br />
+ 30 root RT -5 0 0 0 S 0 0.0 0:05.96
+ watchdog/1<br />
+ 31 root 15 -5 0 0 0 S 0 0.0 1:18.35
+ events/1<br />
+ 32 root RT -5 0 0 0 S 0 0.0 0:00.08
+ migration/2<br />
+ 33 root 15 -5 0 0 0 S 0 0.0 0:00.18
+ ksoftirqd/2<br />
+ 34 root RT -5 0 0 0 S 0 0.0 0:06.00
+ watchdog/2<br />
+ 35 root 15 -5 0 0 0 S 0 0.0 1:08.39
+ events/2<br />
+ 36 root RT -5 0 0 0 S 0 0.0 0:00.10
+ migration/3<br />
+ 37 root 15 -5 0 0 0 S 0 0.0 0:00.16
+ ksoftirqd/3<br />
+ 38 root RT -5 0 0 0 S 0 0.0 0:06.08
+ watchdog/3<br />
+ 39 root 15 -5 0 0 0 S 0 0.0 1:22.81
+ events/3<br />
+ 68 root 15 -5 0 0 0 S 0 0.0 0:06.28
+ kblockd/0<br />
+ 69 root 15 -5 0 0 0 S 0 0.0 0:00.04
+ kblockd/1<br />
+ 70 root 15 -5 0 0 0 S 0 0.0 0:00.04
+ kblockd/2
+ </code></p></div>
+
+ <p>Top is a wonderful tool even though it’s slightly resource
+ intensive (when running, its own process is usually in the
+ top ten CPU gluttons). It is indispensable in determining
+ the size of a running process, which comes in handy when
+ determining how many server processes you can run on your
+ machine. How to do this is described in
+ <a href="#Sizing MaxClients">Sizing MaxClients</a>.
+ Top is, however, an interactive tool and running it
+ continuously has few if any advantages.
+ </p>
+
+ <h4><a name="free" id="free">free
+ </a></h4>
+
+ <p>This command is only available on Linux. It shows how much
+ memory and swap space is in use. Linux allocates unused
+ memory as file system cache. The free command shows usage
+ both with and without this cache. The free command can be
+ used to find out how much memory the operating system is
+ using, as described in the paragraph
+ <a href="#Sizing MaxClients">Sizing MaxClients</a>.
+ The output of free looks like this:
+ </p>
+
+ <div class="example"><p><code>
+ sctemme@brutus:~$ free<br />
+ total used free shared buffers cached<br />
+ Mem: 4026028 3901892 124136 0 253144
+ 841044<br />
+ -/+ buffers/cache: 2807704 1218324<br />
+ Swap: 3903784 12540 3891244
+ </code></p></div>
+
+
+ <h4><a name="vmstat" id="vmstat">vmstat
+ </a></h4>
+
+ <p>This command is available on many Unix platforms. It
+ displays a large number of operating system metrics. Run
+ without arguments, it displays a status line for that
+ moment. When a numeric argument is added, the status is
+ redisplayed at designated intervals. For example,
+ <code>vmstat 5</code>
+ causes the information to reappear every five seconds.
+ Vmstat displays the amount of virtual memory in use, how
+ much memory is swapped in and out each second, the number
+ of processes currently running and sleeping, the number of
+ interrupts and context switches per second and the usage
+ percentages of the CPU.
+ </p>
+ <p>
+ The following is <code>vmstat
+ </code>
+ output of an idle server:
+ </p>
+
+
+ <div class="example"><p><code>
+ [sctemme@GayDeceiver sctemme]$ vmstat 5 3<br />
+ procs memory swap io
+ system cpu<br />
+ r b w swpd free buff cache si so bi bo in
+ cs us sy id<br />
+ 0 0 0 0 186252 6688 37516 0 0 12 5 47
+ 311 0 1 9<br />
+ 0 0 0 0 186244 6696 37516 0 0 0 16 41
+ 314 0 0 10<br />
+ 0 0 0 0 186236 6704 37516 0 0 0 9 44
+ 314 0 0 100
+ </code></p></div>
+
+ <p>And this is output of a server that is under a load of one
+ hundred simultaneous connections fetching static content:
+ </p>
+
+ <div class="example"><p><code>
+ [sctemme@GayDeceiver sctemme]$ vmstat 5 3<br />
+ procs memory swap io
+ system cpu<br />
+ r b w swpd free buff cache si so bi bo in
+ cs us sy id<br />
+ 1 0 1 0 162580 6848 40056 0 0 11 5 150
+ 324 1 1 98<br />
+ 6 0 1 0 163280 6856 40248 0 0 0 66 6384
+ 1117 42 25 32<br />
+ 11 0 0 0 162780 6864 40436 0 0 0 61 6309
+ 1165 33 28 40
+ </code></p></div>
+
+ <p>The first line gives averages since the last reboot. The
+ subsequent lines give information for five second
+ intervals. The second argument tells vmstat to generate
+ three reports and then exit.
+ </p>
+
+
+
+ <h4><a name="SE Toolkit" id="SE Toolkit">SE Toolkit
+ </a></h4>
+
+ <p>The SE Toolkit is a system monitoring toolkit for Solaris.
+ Its programming language is based on the C preprocessor, and it
+ comes with a number of sample scripts. It can use both the
+ command line and the GUI to display information. It can
+ also be programmed to apply rules to the system data. The
+ example script shown in Figure 2, Zoom.se, shows green,
+ orange or red indicators when utilization of various parts
+ of the system rises above certain thresholds. Another
+ included script, Virtual Adrian, applies performance tuning
+ metrics according to.
+ </p>
+ <p>The SE Toolkit has drifted around for a while and has had
+ several owners since its inception. It seems that it has
+ now found a final home at Sunfreeware.com, where it can be
+ downloaded at no charge. There is a single package for
+ Solaris 8, 9 and 10 on SPARC and x86, which includes source
+ code. SE Toolkit author Richard Pettit has started a new
+ company, Captive Metrics, that plans to bring to market a
+ multiplatform monitoring tool built on the same principles
+ as SE Toolkit, written in Java.
+ </p>
+
+
+
+ <h4><a name="DTrace" id="DTrace">DTrace
+ </a></h4>
+
+ <p>Given that DTrace is available for Solaris, FreeBSD and OS
+ X, it might be worth exploring it. There's also
+ mod_dtrace available for httpd.
+ </p>
+
+
+
+ <h4><a name="mod_status" id="mod_status">mod_status
+ </a></h4>
+
+ <p>The mod_status module gives an overview of the server
+ performance at a given moment. It generates an HTML page
+ with, among other things, the number of Apache processes running
+ and how many bytes each has served, and the CPU load caused
+ by httpd and the rest of the system. The Apache Software
+ Foundation uses mod_status on its own
+ <a href="http://apache.org/server-status">web site</a>.
+ If you put the <code>ExtendedStatus On</code>
+ directive in your <code>httpd.conf</code>,
+ the <code>mod_status</code>
+ page will give you more information at the cost of a little
+ extra work per request.
+ </p>
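+
+ <p>As an illustration, a minimal sketch of enabling the status page
+ might look like the following. It assumes that mod_status is already
+ loaded. The <code>/server-status</code> path and the restriction to
+ local requests are choices made for this example, and the
+ <code>Require local</code> syntax applies to 2.4 and later (2.2 uses
+ <code>Order</code>, <code>Deny</code> and <code>Allow</code> instead):
+ </p>
+
+ <div class="example"><p><code>
+ ExtendedStatus On<br />
+ &lt;Location /server-status&gt;<br />
+ SetHandler server-status<br />
+ Require local<br />
+ &lt;/Location&gt;
+ </code></p></div>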
+
+
+
+
+ <h3><a name="Web Server Log Files" id="Web Server Log Files">Web Server Log Files
+ </a></h3>
+
+ <p>Monitoring and analyzing the log files httpd writes is one of
+ the most effective ways to keep track of your server health and
+ performance. Monitoring the error log allows you to detect
+ error conditions, discover attacks and find performance issues.
+ Analyzing the access logs tells you how busy your server is,
+ which resources are the most popular and where your users come
+ from. Historical log file data can give you invaluable insight
+ into trends in access to your server, which allows you to
+ predict when your performance needs will overtake your server
+ capacity.
+ </p>
+
+
+ <h4><a name="Error Log" id="Error Log">Error Log
+ </a></h4>
+
+ <p>The error log will contain messages if the server has
+ reached the maximum number of active processes or the
+ maximum number of concurrently open files. The error log
+ also reflects when processes are being spawned at a
+ higher-than-usual rate in response to a sudden increase in
+ load. When the server starts, the stderr file descriptor is
+ redirected to the error logfile, so any error encountered
+ by httpd after it opens its logfiles will appear in this
+ log. This makes it good practice to review the error log
+ frequently.
+ </p>
+ <p>Before Apache httpd opens its logfiles, any errors will be
+ written to the stderr stream. If you start httpd manually,
+ this error information will appear on your terminal and you
+ can use it directly to troubleshoot your server. If your
+ httpd is started by a startup script, the destination of
+ early error messages depends on the script's design. The
+ <code>/var/log/messages</code>
+ file is usually a good bet. On Windows, early error
+ messages are written to the Application Event Log, which
+ can be viewed through the Event Viewer in Administrative
+ Tools.
+ </p>
+ <p>
+ The Error Log is configured through the <code>ErrorLog
+ </code>
+ and <code>LogLevel
+ </code>
+ configuration directives. The error log of httpd’s main
+ server configuration receives the log messages that pertain
+ to the entire server: startup, shutdown, crashes, excessive
+ process spawns, etc. The <code>ErrorLog
+ </code>
+ directive can also be used in virtual host containers. The
+ error log of a virtual host receives only log messages
+ specific to that virtual host, such as authentication
+ failures and 'File not Found' errors.
+ </p>
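+
+ <p>A minimal sketch of this split between the main server log and a
+ virtual host log could look as follows. The host name
+ <code>www.example.com</code> and the log file names are placeholders
+ chosen for this example, not values taken from a real configuration:
+ </p>
+
+ <div class="example"><p><code>
+ # Main server: startup, shutdown and other server-wide messages<br />
+ ErrorLog logs/error_log<br />
+ LogLevel warn<br />
+ <br />
+ &lt;VirtualHost *:80&gt;<br />
+ ServerName www.example.com<br />
+ # Messages specific to this virtual host only<br />
+ ErrorLog logs/www.example.com-error_log<br />
+ &lt;/VirtualHost&gt;
+ </code></p></div>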
+ <p>On a server that is visible to the Internet, expect to see a
+ lot of exploit attempts and worm attacks in the error log. Many
+ of these will be targeted at other server platforms
+ instead of Apache, but the current state of affairs is that
+ attack scripts just throw everything they have at any open
+ port, regardless of which server is actually running or
+ what applications might be installed. You could block these
+ attempts using a firewall or
+ <a href="http://www.modsecurity.org/">mod_security</a>,
+ but this falls outside the scope of this discussion.
+ </p>
+ <p>
+ The <code>LogLevel
+ </code>
+ directive determines the level of detail included in the
+ logs. There are eight log levels as described here:
+ </p>
+ <table>
+ <tr>
+ <td>
+ <p>
+ <strong>Level
+ </strong>
+ </p>
+ </td>
+ <td>
+ <p>
+ <strong>Description
+ </strong>
+ </p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p> emerg
+ </p>
+ </td>
+ <td>
+ <p> Emergencies - system is unusable.
+ </p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p> alert
+ </p>
+ </td>
+ <td>
+ <p> Action must be taken immediately.
+ </p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p> crit
+ </p>
+ </td>
+ <td>
+ <p> Critical Conditions.
+ </p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p> error
+ </p>
+ </td>
+ <td>
+ <p> Error conditions.
+ </p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p> warn
+ </p>
+ </td>
+ <td>
+ <p> Warning conditions.
+ </p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p> notice
+ </p>
+ </td>
+ <td>
+ <p> Normal but significant condition.
+ </p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p> info
+ </p>
+ </td>
+ <td>
+ <p> Informational.
+ </p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p> debug
+ </p>
+ </td>
+ <td>
+ <p> Debug-level messages
+ </p>
+ </td>
+ </tr>
+ </table>
+ <p>The default log level is warn. A production server should
+ not be run on debug, but increasing the level of detail in
+ the error log can be useful during troubleshooting.
+ Starting with 2.3.8 <code>LogLevel
+ </code>
+ can be specified on a per module basis:
+ </p>
+
+ <div class="example"><p><code>
+ LogLevel debug mod_ssl:warn
+ </code></p></div>
+
+ <p>
+ This puts all of the server in debug mode, except for
+ <code>mod_ssl</code>, which tends to be very noisy.
+ </p>
+
+
+
+ <h4><a name="Access Log" id="Access Log">Access Log
+ </a></h4>
+
+ <p>Apache httpd keeps track of every request it services in its
+ access log file. In addition to the time and nature of a
+ request, httpd can log the client IP address,
+ the result of the request and a host of other information.
+ The various logging format features are documented in the
+ <a href="http://httpd.apache.org/docs/current/mod/mod_log_config.html#formats">manual</a>.
+ This file exists by default for the main server and can be
+ configured per virtual host by using the <code>TransferLog</code>
+ or <code>CustomLog</code>
+ configuration directive.
+ </p>
+ <p>The access logs can be analyzed with any of several free and
+ commercially available programs. Popular free analysis
+ packages include Analog and Webalizer. Log analysis should
+ be done offline so the web server machine is not burdened
+ by processing the log files. Most log analysis packages
+ understand the Common Log Format. The fields in the following
+ log lines are explained in the table below:
+ </p>
+
+
+ <div class="example"><p><code>
+ 195.54.228.42 - - [24/Mar/2007:23:05:11 -0400] "GET
+ /sander/feed/ HTTP/1.1" 200 9747<br />
+ 64.34.165.214 - - [24/Mar/2007:23:10:11 -0400] "GET
+ /sander/feed/atom HTTP/1.1" 200 9068<br />
+ 60.28.164.72 - - [24/Mar/2007:23:11:41 -0400] "GET /
+ HTTP/1.0" 200 618<br />
+ 85.140.155.56 - - [24/Mar/2007:23:14:12 -0400] "GET
+ /sander/2006/09/27/44/ HTTP/1.1" 200 14172<br />
+ 85.140.155.56 - - [24/Mar/2007:23:14:15 -0400] "GET
+ /sander/2006/09/21/gore-tax-pollution/ HTTP/1.1" 200 15147<br />
+ 74.6.72.187 - - [24/Mar/2007:23:18:11 -0400] "GET
+ /sander/2006/09/27/44/ HTTP/1.0" 200 14172<br />
+ 74.6.72.229 - - [24/Mar/2007:23:24:22 -0400] "GET
+ /sander/2006/11/21/os-java/ HTTP/1.0" 200 13457
+ </code></p></div>
+
+ <table>
+ <tr>
+ <td>
+ <p>
+ <strong>Field
+ </strong>
+ </p>
+ </td>
+ <td>
+ <p>
+ <strong>Content
+ </strong>
+ </p>
+ </td>
+ <td>
+ <p>
+ <strong>Explanation
+ </strong>
+ </p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p> Client IP
+ </p>
+ </td>
+ <td>
+ <p> 195.54.228.42
+ </p>
+ </td>
+ <td>
+ <p> IP address where the request originated
+ </p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p> RFC 1413 ident
+ </p>
+ </td>
+ <td>
+ <p> -
+ </p>
+ </td>
+ <td>
+ <p> Remote user identity as reported by their
+ identd
+ </p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p> username
+ </p>
+ </td>
+ <td>
+ <p> -
+ </p>
+ </td>
+ <td>
+ <p> Remote username as authenticated by Apache
+ </p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p> timestamp
+ </p>
+ </td>
+ <td>
+ <p> [24/Mar/2007:23:05:11 -0400]
+ </p>
+ </td>
+ <td>
+ <p> Date and time of request
+ </p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p> Request
+ </p>
+ </td>
+ <td>
+ <p> "GET /sander/feed/ HTTP/1.1"
+ </p>
+ </td>
+ <td>
+ <p> Request line
+ </p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p> Status Code
+ </p>
+ </td>
+ <td>
+ <p> 200
+ </p>
+ </td>
+ <td>
+ <p> Response code
+ </p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p> Content Bytes
+ </p>
+ </td>
+ <td>
+ <p> 9747
+ </p>
+ </td>
+ <td>
+ <p> Bytes transferred, excluding headers
+ </p>
+ </td>
+ </tr>
+ </table>
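+
+ <p>For reference, log lines in this layout are what the Common Log
+ Format definition below produces. The format string is the one commonly
+ shipped in the default configuration; the nickname
+ <code>common</code> and the file name <code>logs/access_log</code> are
+ conventional but may differ on your installation. The format characters
+ map onto the fields in the table in order: <code>%h</code> (client IP),
+ <code>%l</code> (ident), <code>%u</code> (username), <code>%t</code>
+ (timestamp), <code>%r</code> (request line), <code>%&gt;s</code>
+ (status code) and <code>%b</code> (content bytes).
+ </p>
+
+ <div class="example"><p><code>
+ LogFormat "%h %l %u %t \"%r\" %&gt;s %b" common<br />
+ CustomLog logs/access_log common
+ </code></p></div>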
+
+
+ <h4><a name="Rotating Log Files" id="Rotating Log Files">Rotating Log Files
+ </a></h4>
+
+ <p>There are several reasons to rotate logfiles. Even though
+ almost no operating systems out there have a hard file size
+ limit of two Gigabytes anymore, log files simply become too
+ large to handle over time. Additionally, any periodic log
+ file analysis should not be performed on files to which the
+ server is actively writing. Periodic logfile rotation helps
+ keep the analysis job manageable, and allows you to keep a
+ closer eye on usage trends.
+ </p>
+ <p>On Unix systems, you can simply rotate logfiles by giving
+ the old file a new name using <code>mv</code>. The server will keep
+ writing to the open file even though it has a new name.
+ When you send a graceful restart signal to the server, it
+ will open a new logfile with the configured name. For
+ example, you could run a script from cron like this:
+ </p>
+
+
+ <div class="example"><p><code>
+ APACHE=/usr/local/apache2<br />
+ HTTPD=$APACHE/bin/httpd<br />
+ mv $APACHE/logs/access_log
+ $APACHE/logarchive/access_log-`date +%F`<br />
+ $HTTPD -k graceful
+ </code></p></div>
+
+ <p>This approach also works on Windows, just not as smoothly.
+ While the httpd process on your Windows server will keep
+ writing to the log file after it has been renamed, the
+ Windows Service that runs Apache cannot do a graceful
+ restart. Restarting a Service on Windows means stopping it
+ and then starting it again. The advantage of a graceful
+ restart is that the httpd child processes get to complete
+ responding to their current requests before they exit.
+ Meanwhile, the httpd server becomes immediately available
+ again to serve new requests. The stop-start that the
+ Windows Service has to perform will interrupt any requests
+ currently in progress, and the server is unavailable until
+ it is started again. Plan for this when you decide the
+ timing of your restarts.
+ </p>
+ <p>
+ A second approach is to use piped logs. From the
+ <code>CustomLog</code>, <code>TransferLog</code>
+ or <code>ErrorLog</code>
+ directives you can send the log data into any program using
+ a pipe character (<code>|</code>). For instance:
+ </p>
+
+ <div class="example"><p><code>CustomLog "|/usr/local/apache2/bin/rotatelogs
+ /var/log/access_log 86400" common
+ </code></p></div>
+
+ <p>The program on the other end of the pipe will receive the
+ Apache log data on its stdin stream, and can do with this
+ data whatever it wants. The rotatelogs program that comes
+ with Apache seamlessly turns over the log file based on
+ time elapsed or the amount of data written, and leaves the
+ old log files with a timestamp suffix in their names. This
+ method for rotating logfiles works well on Unix platforms,
+ but is currently broken on Windows.
+ </p>
+
+
+
+ <h4><a name="Logging and Performance" id="Logging and Performance">Logging and Performance
+ </a></h4>
+
+ <p>Writing entries to the Apache log files obviously takes some
+ effort, but the information gathered from the logs is so
+ valuable that under normal circumstances logging should not
+ be turned off. For optimal performance, you should put your
+ disk-based site content on a different physical disk than
+ the server log files: the access patterns are very
+ different. Retrieving content from disk is a read operation
+ in a fairly random pattern, and log files are written to
+ disk sequentially.
+ </p>
+ <p>
+ Do not run a production server with your error <code>
+ LogLevel
+ </code>
+ set to debug. This log level causes a vast amount of
+ information to be written to the error log, including, in
+ the case of SSL access, complete dumps of BIO read and
+ write operations. The performance implications are
+ significant: use the default warn level instead.
+ </p>
+ <p>If your server has more than one virtual host, you may give
+ each virtual host a separate access logfile. This makes it
+ easier to analyze the logfile later. However, if your
+ server has many virtual hosts, all the open logfiles put a
+ resource burden on your system, and it may be preferable to
+ log to a single file. Use the <code>%v</code>
+ format character at the start of your <code>LogFormat</code>
+ and, starting with 2.3.8, of your <code>ErrorLogFormat</code>
+ to make httpd print the hostname of the virtual host that
+ received the request or the error at the beginning of each
+ log line. A simple Perl script can split out the log file
+ after it rotates: one is included with the Apache source
+ under <code>support/split-logfile</code>.
+ </p>
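+
+ <p>A sketch of such a combined access log, assuming the Common Log
+ Format fields with the virtual host name prepended, might look like
+ this. The nickname <code>vhost_common</code> and the log file name are
+ invented for this example:
+ </p>
+
+ <div class="example"><p><code>
+ # %v puts the virtual host name first on every line<br />
+ LogFormat "%v %h %l %u %t \"%r\" %&gt;s %b" vhost_common<br />
+ CustomLog logs/access_log vhost_common
+ </code></p></div>
+
+ <p>Because the host name is the first field on each line, a script can
+ later split the combined file into one file per virtual host.
+ </p>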
+ <p>
+ You can use the <code>BufferedLogs
+ </code>
+ directive to have Apache collect several log lines in
+ memory before writing them to disk. This might yield better
+ performance, but could affect the order in which the
+ server's log is written.
+ </p>
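+
+ <p>As a sketch, buffering is switched on with a single directive in
+ the main server configuration. Whether it helps depends on your
+ workload, so treat this as something to measure rather than a
+ recommended default:
+ </p>
+
+ <div class="example"><p><code>
+ BufferedLogs On
+ </code></p></div>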
+
+
+
+
+ <h3><a name="Generating A Test Load" id="Generating A Test Load">Generating A Test Load
+ </a></h3>
+
+ <p>It is useful to generate a test load to monitor system
+ performance under realistic operating circumstances. Besides
+ commercial packages such as LoadRunner,
+ there are a number of freely available tools to generate a
+ test load against your web server.
+ </p>
+ <ul>
+ <li>Apache ships with a test program called ab, short for
+ Apache Bench. It can generate a web server load by
+ repeatedly asking for the same file in rapid succession.
+ You can specify a number of concurrent connections and have
+ the program run for either a given amount of time or a
+ specified number of requests.
+ </li>
+ <li>Another freely available load generator is http_load.
+ This program works with a URL file and can be compiled with
+ SSL support.
+ </li>
+ <li>The Apache Software Foundation offers a tool named flood.
+ Flood is a fairly sophisticated program that is
+ configured through an XML file.
+ </li>
+ <li>Finally, JMeter, a Jakarta subproject, is an all-Java
+ load-testing tool. While early versions of this application
+ were slow and difficult to use, the current version 2.1.1
+ seems to be versatile and useful.
+ </li>
+ <li>
+ <p>Projects outside the ASF that have proven to be quite
+ good: grinder, httperf, tsung, FunkLoad
+ </p>
+ </li>
+ </ul>
+ <p>When you load-test your web server, please keep in mind that if
+ that server is in production, the test load may negatively
+ affect the server’s response. Also, any data traffic you
+ generate may be charged against your monthly traffic allowance.
+ </p>
+
+
+
+ </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
+<div class="section">
+<h2><a name="Configuring for Performance" id="Configuring for Performance">Configuring for Performance
+ </a></h2>
+
+
+
+ <h3><a name="Apache Configuration" id="Apache Configuration">Apache Configuration
+ </a></h3>
+
+ <p>The Apache 2.2 httpd is by default a pre-forking web server.
+ When the server starts, the parent process spawns a number of
+ child processes that do the actual work of servicing requests.
+ But Apache httpd 2.0 introduced the concept of the
+ Multi-Processing Module (MPM). Developers can write MPMs to
+ suit the process or threading architecture of their specific
+ operating system. Apache 2 comes with special MPMs for Windows,
+ OS/2, Netware and BeOS. On Unix-like platforms, the two most
+ popular MPMs are Prefork and Worker. The Prefork MPM offers the
+ same pre-forking process model that Apache 1.3 uses. The Worker
+ MPM runs a smaller number of child processes, and spawns
+ multiple request handling threads within each child process. In
+ 2.3+, MPMs are no longer hard-wired. They too can be exchanged
+ via <code>LoadModule</code>.
+ The default MPM in 2.3 is the event MPM.
+ </p>
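+
+ <p>On a 2.3+ server whose MPMs were built as loadable modules,
+ selecting the MPM could look like the sketch below. The module file
+ name and the <code>modules/</code> directory are assumptions about the
+ installation layout, and only one MPM may be loaded at a time:
+ </p>
+
+ <div class="example"><p><code>
+ LoadModule mpm_event_module modules/mod_mpm_event.so
+ </code></p></div>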
+ <p>The maximum number of workers, be they pre-forked child
+ processes or threads within a process, is an indication of how
+ many requests your server can manage concurrently. It is merely
+ a rough estimate because the kernel can queue connection
+ attempts for your web server. When your site becomes busy and
+ the maximum number of workers is running, the machine
+ doesn't hit a hard limit beyond which clients will be
+ denied access. However, once requests start backing up, system
+ performance is likely to degrade.
+ </p>
+
+
+ <h4><a name="MaxClients" id="MaxClients">MaxClients
+ </a></h4>
+
+ <p>
+ The <code>MaxClients</code>
+ directive in your Apache httpd configuration file specifies
+ the maximum number of workers your server can create. It
+ has two related directives, <code>MinSpareServers</code>
+ and <code>MaxSpareServers</code>,
+ which specify the number of workers Apache keeps waiting
+ in the wings ready to serve requests. The absolute maximum
+ number of processes is configurable through the
+ <code>ServerLimit</code>
+ directive.
+ </p>
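+
+ <p>For the prefork MPM, these directives might be combined as in the
+ following sketch. The numbers are illustrative starting points only,
+ not recommendations for any particular machine:
+ </p>
+
+ <div class="example"><p><code>
+ &lt;IfModule mpm_prefork_module&gt;<br />
+ # Idle child processes kept ready for new requests<br />
+ MinSpareServers 5<br />
+ MaxSpareServers 10<br />
+ # Upper bound that MaxClients may not exceed<br />
+ ServerLimit 256<br />
+ # Maximum number of simultaneous workers<br />
+ MaxClients 256<br />
+ &lt;/IfModule&gt;
+ </code></p></div>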
+
+
+
+ <h4><a name="Spinning Threads" id="Spinning Threads">Spinning Threads
+ </a></h4>
+
+ <p>For the prefork MPM, the above directives are all there is
+ to determining the process limit. However, if you are
+ running a threaded MPM the situation is a little more
+ complicated. Threaded MPMs support the
+ <code>ThreadsPerChild</code>
+ directive. Apache requires that <code>MaxClients</code>
+ is evenly divisible by <code>ThreadsPerChild</code>.
+ If you set either directive to a number that doesn’t
+ meet this requirement, Apache will send a message of
+ complaint to the error log and adjust the
+ <code>ThreadsPerChild</code>
+ value downwards until it is an even factor of
+ <code>MaxClients</code>.
+ </p>
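+
+ <p>A sketch of a worker MPM configuration that satisfies this
+ requirement is shown below. The values are illustrative only. Here 400
+ workers divided by 25 threads per child gives exactly 16 child
+ processes, which matches the <code>ServerLimit</code>:
+ </p>
+
+ <div class="example"><p><code>
+ &lt;IfModule mpm_worker_module&gt;<br />
+ # Maximum number of child processes<br />
+ ServerLimit 16<br />
+ # Request handling threads in each child process<br />
+ ThreadsPerChild 25<br />
+ # 16 processes x 25 threads = 400 workers<br />
+ MaxClients 400<br />
+ &lt;/IfModule&gt;
+ </code></p></div>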
+
+
+
+ <h4><a name="Sizing MaxClients" id="Sizing MaxClients">Sizing MaxClients
+ </a></h4>
+
+ <p>Optimally, the maximum number of processes should be set so
+ that all the memory on your system is used, but no more. If
+ your system gets so overloaded that it needs to heavily
+ swap core memory out to disk, performance will degrade
+ quickly. The formula for determining <code>MaxClients
+ </code>
+ is fairly simple:
+ </p>
+
+ <div class="example"><p><code>
+ total RAM − RAM for OS − RAM for external programs<br />
+ MaxClients =
+ -------------------------------------------------------<br />
+ RAM per httpd process
+ </code></p></div>
+
+ <p>The various amounts of memory allocated for the OS, external
+ programs and the httpd processes are best determined by
+ observation: use the top and free commands described above
+ to determine the memory footprint of the OS without the web
+ server running. You can also determine the footprint of a
+ typical web server process from top: most top
+ implementations have a Resident Size (RSS) column and a
+ Shared Memory column.
+ </p>
+ <p>The difference between these two is the amount of memory
+ per-process. The shared segment really exists only once and
+ is used for the code and libraries loaded and the dynamic
+ inter-process tally, or 'scoreboard,' that Apache
+ keeps. How much memory each process takes for itself
+ depends heavily on the number and kind of modules you use.
+ The best approach to use in determining this need is to
+ generate a typical test load against your web site and see
+ how large the httpd processes become.
+ </p>
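+
+ <p>As a worked illustration of the formula, using figures that are
+ purely hypothetical and not measurements: suppose a machine has 4096 MB
+ of RAM, the operating system needs about 512 MB, external programs need
+ about 512 MB, and each httpd process takes roughly 30 MB for itself.
+ </p>
+
+ <div class="example"><p><code>
+ MaxClients = (4096 − 512 − 512) / 30 = 3072 / 30 = 102.4
+ </code></p></div>
+
+ <p>Rounding down gives a value of 102. Your own numbers will differ,
+ so measure them under a typical test load before settling on a value.
+ </p>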
+ <p>The RAM for external programs parameter is intended mostly
+ for CGI programs and scripts that run outside the web
+ server process. However, if you have a Java virtual machine
+ running Tomcat on the same box it will need a significant
+ amount of memory as well. The above assessment should give
+ you an idea how far you can push <code>MaxClients</code>,
+ but it is not an exact science. When in doubt, be
+ conservative and use a low <code>MaxClients</code>
+ value. The Linux kernel will put extra memory to good use
+ for caching disk access. On Solaris you need enough
+ available real RAM to create any process. If no real
+ memory is available, httpd will start writing ‘No space
+ left on device’ messages to the error log and be unable
+ to fork additional child processes, so a higher
+ <code>MaxClients</code>
+ value may actually be a disadvantage.
+ </p>
+
+
+
+ <h4><a name="Selecting your MPM" id="Selecting your MPM">Selecting your MPM
+ </a></h4>
+
+ <p>The prime reason for selecting a threaded MPM is that
+ threads consume fewer system resources than processes, and
+ it takes less effort for the system to switch between
+ threads. This is more true for some operating systems than
+ for others. On systems like Solaris and AIX, manipulating
+ processes is relatively expensive in terms of system
+ resources. On these systems, running a threaded MPM makes
+ sense. On Linux, the threading implementation actually uses
+ one process for each thread. Linux processes are relatively
+ lightweight, but it means that a threaded MPM offers less
+ of a performance advantage than in other environments.
+ </p>
+ <p>Running a threaded MPM can cause stability problems in some
+ situations. For instance, should a child process of a
+ preforked MPM crash, at most one client connection is
+ affected. However, if a threaded child crashes, all the
+ threads in that process disappear, which means all the
+ clients currently being served by that process will see
+ their connection aborted. Additionally, there may be
+ so-called "thread-safety" issues, especially with
+ third-party libraries. In threaded applications, threads
+ may access the same variables indiscriminately, not knowing
+ whether a variable may have been changed by another thread.
+ </p>
+ <p>This has been a sore point within the PHP community. The PHP
+ processor heavily relies on third-party libraries and
+ cannot guarantee that all of these are thread-safe. The
+ good news is that if you are running Apache on Linux, you
+ can run PHP in the preforked MPM without fear of losing too
+ much performance relative to the threaded option.
+ </p>
+
+
+
+ <h4><a name="Spinning Locks" id="Spinning Locks">Spinning Locks
+ </a></h4>
+
+ <p>Apache httpd maintains an inter-process lock around its
+ network listener. For all practical purposes, this means
+ that only one httpd child process can receive a request at
+ any given time. The other processes are either servicing
+ requests already received or are 'camping out' on
+ the lock, waiting for the network listener to become
+ available. This process is best visualized as a revolving
+ door, with only one process allowed in the door at any
+ time. On a heavily loaded web server with requests arriving
+ constantly, the door spins quickly and requests are
+ accepted at a steady rate. On a lightly loaded web server,
+ the process that currently "holds" the lock may
+ have to stay in the door for a while, during which all the
+ other processes sit idle, waiting to acquire the lock. At
+ this time, the parent process may decide to terminate some
+ children based on its <code>MaxSpareServers</code>
+ directive.
+ </p>
+
+
+
+ <h4><a name="The Thundering Herd" id="The Thundering Herd">The Thundering Herd
+ </a></h4>
+
+ <p>The function of the 'accept mutex' (as this
+ inter-process lock is called) is to keep request reception
+ moving along in an orderly fashion. If the lock is absent,
+ the server may exhibit the Thundering Herd syndrome.
+ </p>
+ <p>Consider an American Football team poised on the line of
+ scrimmage. If the football players were Apache processes,
+ all team members would go for the ball simultaneously at
+ the snap. One process would get it, and all the others
+ would have to lumber back to the line for the next snap. In
+ this metaphor, the accept mutex acts as the quarterback,
+ delivering the connection "ball" to the
+ appropriate player process.
+ </p>
+ <p>Moving this much information around is obviously a lot of
+ work, and, like a smart person, a smart web server tries to
+ avoid it whenever possible. Hence the revolving door
+ construction. In recent years, many operating systems,
+ including Linux and Solaris, have put code in place to
+ prevent the Thundering Herd syndrome. Apache recognizes
+ this and if you run with just one network listener, meaning
+ one virtual host or just the main server, Apache will
+ refrain from using an accept mutex. If you run with
+ multiple listeners (for instance because you have a virtual
+ host serving SSL requests), it will activate the accept
+ mutex to avoid internal conflicts.
+ </p>
+ <p>
+ You can manipulate the accept mutex with the
+ <code>AcceptMutex</code>
+ directive. Besides turning the accept mutex off, you can
+ select the locking mechanism. Common locking mechanisms
+ include fcntl, System V Semaphores and pthread locking. Not
+ all are available on every platform, and their availability
+ also depends on compile-time settings. The various locking
+ mechanisms may place specific demands on system resources:
+ manipulate them with care.
+ </p>
+ <p>There is no compelling reason to disable the accept mutex.
+ Apache automatically recognizes the single listener
+ situation described above and knows if it is safe to run
+ without mutex on your platform.
+ </p>
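+ <p>As a minimal sketch, assuming a platform on which the
+ System V semaphore mechanism was compiled in, a specific
+ locking mechanism could be selected as shown here. In
+ httpd 2.4 and later the <code>Mutex</code> directive takes over
+ the role of <code>AcceptMutex</code>, so the two lines are
+ alternatives for different server versions and should not be
+ combined in one configuration.
+ </p>
+
+ <div class="example"><p><code>
+ # httpd 2.2 and earlier<br />
+ AcceptMutex sysvsem<br />
+ # httpd 2.4 and later<br />
+ Mutex sysvsem mpm-accept
+ </code></p></div>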
+
+
+
+
+ <h3><a name="Tuning the Operating System" id="Tuning the Operating System">Tuning the Operating System
+ </a></h3>
+
+ <p>People often look for the 'magic tune-up' that will
+ make their system perform four times as fast by tweaking just
+ one little setting. The truth is, present-day UNIX derivatives
+ are pretty well adjusted straight out of the box and there is
+ not a lot that needs to be done to make them perform optimally.
+ However, there are a few things that an administrator can do to
+ improve performance.
+ </p>
+
+
+ <h4><a name="RAM and Swap Space" id="RAM and Swap Space">RAM and Swap Space
+ </a></h4>
+
+ <p>The usual mantra regarding RAM is "more is
+ better". As discussed above, unused RAM is put to good
+ use as file system cache. The Apache processes get bigger
+ if you load more modules, especially if you use modules
+ that generate dynamic page content within the processes,
+ like PHP and mod_perl. A large configuration file, with many
+ virtual hosts, also tends to inflate the process footprint.
+ Having ample RAM allows you to run Apache with more child
+ processes, which allows the server to process more
+ concurrent requests.
+ </p>
+ <p>While the various platforms treat their virtual memory in
+ different ways, it is never a good idea to run with less
+ disk-based swap space than RAM. The virtual memory system
+ is designed to provide a fallback for RAM, but when you
+ don't have disk space available and run out of
+ swappable memory, your machine grinds to a halt. This can
+ crash your box, requiring a physical reboot for which your
+ hosting facility may charge you.
+ </p>
+ <p>Also, such an outage naturally occurs when you least want
+ it: when the world has found your website and is beating a
+ path to your door. If you have enough disk-based swap space
+ available and the machine gets overloaded, it may get very,
+ very slow as the system needs to swap memory pages to disk
+ and back, but when the load decreases the system should
+ recover. Remember, you still have <code>MaxClients</code>
+ to keep things in hand.
+ </p>
+ <p>Most Unix-like operating systems use designated disk
+ partitions for swap space. When a system starts up it finds
+ all swap partitions on the disk(s), by partition type or
+ because they are listed in the file <code>/etc/fstab</code>,
+ and automatically enables them. When adding a disk or
+ installing the operating system, be sure to allocate enough
+ swap space to accommodate eventual RAM upgrades.
+ Reassigning disk space on a running system is a cumbersome
+ process.
+ </p>
+ <p>Plan for available hard drive swap space of at least twice
+ your amount of RAM, perhaps up to four times in situations
+ with frequent peaking loads. Remember to adjust this
+ configuration whenever you upgrade RAM on your system. In a
+ pinch, you can use a regular file as swap space. For
+ instructions on how to do this, see the manual pages for
+ the <code>mkswap</code>
+ and <code>swapon</code>
+ or <code>swap</code>
+ programs.
+ </p>
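+ <p>As a rough sketch for Linux, assuming root privileges, a
+ hypothetical file name <code>/swapfile</code> and an arbitrary
+ size of 1 GB, a swap file could be set up as follows. Other
+ platforms use different tools, so consult the manual pages
+ first.
+ </p>
+
+ <div class="example"><p><code>
+ # create a 1 GB file filled with zeros (name and size are assumptions)<br />
+ dd if=/dev/zero of=/swapfile bs=1M count=1024<br />
+ # restrict access to root<br />
+ chmod 600 /swapfile<br />
+ # write the swap signature, then enable the file as swap<br />
+ mkswap /swapfile<br />
+ swapon /swapfile
+ </code></p></div>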
+
+
+
+ <h4><a name="ulimit: Files and Processes" id="ulimit: Files and Processes">ulimit: Files and Processes
+ </a></h4>
+
+ <p>Given a machine with plenty of RAM and processor capacity,
+ you can run hundreds of Apache processes if necessary,
+ provided your kernel allows it.
+ </p>
+ <p>Consider a situation in which several hundred web server
+ processes are running; if some of these need to spawn CGI
+ processes, the kernel's limit on the number of processes
+ would be reached quickly.
+ </p>
+ <p>However, you can change this limit with the command
+ </p>
+
+ <div class="example"><p><code>
+ ulimit [-H|-S] -u [newvalue]
+ </code></p></div>
+
+ <p>This must be changed before starting the server, since the
+ new value will only be available to the current shell and
+ programs started from it. In newer Linux kernels the
+ default has been raised to 2048. On FreeBSD, the number
+ seems to be the rather unusual 513. In the default user
+ shell on this system, <code>csh</code>,
+ the equivalent is <code>limit</code>,
+ which works analogously to the Bourne-like
+ <code>ulimit</code>:
+ </p>
+
+ <div class="example"><p><code>
+ limit [-h] maxproc [newvalue]
+ </code></p></div>
+
+ <p>Similarly, the kernel may limit the number of open files per
+ process. This is generally not a problem for pre-forked
+ servers, which just handle one request at a time per
+ process. Threaded servers, however, serve many requests per
+ process and much more easily run out of available file
+ descriptors. You can increase the maximum number of open
+ files per process by running the
+ </p>
+
+ <div class="example"><p><code>ulimit -n [newvalue]
+ </code></p></div>
+
+ <p>command. Once again, this must be done prior to starting
+ Apache.
+ </p>
+
+
+
+ <h4><a name="Setting User Limits on System Startup" id="Setting User Limits on System Startup">Setting User Limits on System Startup
+ </a></h4>
+
+ <p>Under Linux, you can set the ulimit parameters on bootup by
+ editing the <code>/etc/security/limits.conf</code>
+ file. This file allows you to set soft and hard limits on a
+ per-user or per-group basis; the file contains commentary
+ explaining the options. To enable this, make sure that the
+ file <code>/etc/pam.d/login</code>
+ contains the line
+ </p>
+
+ <div class="example"><p><code>session required /lib/security/pam_limits.so
+ </code></p></div>
+
+ <p>All items can have a 'soft' and a 'hard'
+ limit: the first is the default setting and the second the
+ maximum value for that item.
+ </p>
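+ <p>As an illustration, assuming the server runs as a
+ hypothetical user named <code>apache</code>, entries in
+ <code>limits.conf</code> might look like the following. The
+ numbers are placeholders, not recommendations.
+ </p>
+
+ <div class="example"><p><code>
+ # columns: domain, type, item, value<br />
+ apache soft nproc 2048<br />
+ apache hard nproc 4096<br />
+ apache soft nofile 4096<br />
+ apache hard nofile 8192
+ </code></p></div>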
+ <p>
+ In FreeBSD's <code>/etc/login.conf</code>
+ these resources can be limited or extended system wide,
+ analogously to <code>limits.conf</code>.
+ 'Soft' limits can be specified with <code>-cur</code>
+ and 'hard' limits with <code>-max</code>.
+ </p>
+ <p>Solaris has a similar mechanism for manipulating limit
+ values at boot time: in <code>/etc/system</code>
+ you can set kernel tunables valid for the entire system at
+ boot time. These are the same tunables that can be set with
+ the <code>mdb</code>
+ kernel debugger during run time. The soft and hard limits
+ corresponding to <code>ulimit -n</code> can be set via:
+ </p>
+
+ <div class="example"><p><code>
+ set rlim_fd_max=65536<br />
+ set rlim_fd_cur=2048
+ </code></p></div>
+
+ <p>Solaris calculates the maximum number of allowed processes
+ per user (<code>maxuprc</code>)
+ based on the total amount of available memory on the system
+ (<code>maxusers</code>). You can review the numbers with
+ </p>
+
+ <div class="example"><p><code>sysdef -i | grep maximum
+ </code></p></div>
+
+ <p>but it is not recommended to change them.
+ </p>
+
+
+
+ <h4><a name="Turn Off Unused Services and Modules" id="Turn Off Unused Services and Modules">Turn Off Unused Services and Modules
+ </a></h4>
+
+ <p>Many UNIX and Linux distributions come with a slew of
+ services turned on by default. You probably need few of
+ them. For example, your web server does not need to be
+ running sendmail, nor is it likely to be an NFS server,
+ etc. Turn them off.
+ </p>
+ <p>On Red Hat Linux, the <code>chkconfig</code> tool will help you do this
+ from the command line. On Solaris systems, <code>svcs</code>
+ shows which services are enabled and <code>svcadm</code>
+ disables them.
+ </p>
+ <p>In a similar fashion, cast a critical eye on the Apache
+ modules you load. Most binary distributions of Apache
+ httpd, and pre-installed versions that come with Linux
+ distributions, have their modules enabled through the
+ <code>LoadModule</code>
+ directive.
+ </p>
+ <p>Unused modules may be culled: if you don't rely on
+ their functionality and configuration directives, you can
+ turn them off by commenting out the corresponding
+ <code>LoadModule</code>
+ lines. Read the documentation on each module’s
+ functionality before deciding whether to keep it enabled.
+ While the performance overhead of an unused module is
+ small, it's also unnecessary.
+ </p>
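+ <p>For example, assuming a server that needs neither
+ automatic directory listings nor per-user directories, the
+ corresponding lines could be commented out as sketched here.
+ The module paths are typical but vary between distributions.
+ </p>
+
+ <div class="example"><p><code>
+ #LoadModule autoindex_module modules/mod_autoindex.so<br />
+ #LoadModule userdir_module modules/mod_userdir.so
+ </code></p></div>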
+
+
+
+
+ </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
+<div class="section">
+<h2><a name="Caching Content" id="Caching Content">Caching Content
+ </a></h2>
+
+ <p>Requests for dynamically generated content usually take
+ significantly more resources than requests for static content.
+ Static content consists of simple files (pages, images, etc.) on disk
+ that are very efficiently served. Many operating systems also
+ automatically cache the contents of frequently accessed files in
+ memory.
+ </p>
+ <p>Processing dynamic requests, on the contrary, can be much more
+ involved. Running CGI scripts, handing off requests to an external
+ application server and accessing database content can introduce
+ significant latency and processing load to a busy web server. Under
+ many circumstances, performance can be improved by turning popular
+ dynamic requests into static requests. In this section, two
+ approaches to this will be discussed.
+ </p>
+
+
+ <h3><a name="Making Popular Pages Static" id="Making Popular Pages Static">Making Popular Pages Static
+ </a></h3>
+
+ <p>By pre-rendering the response pages for the most popular queries
+ in your application, you can gain a significant performance
+ improvement without giving up the flexibility of dynamically
+ generated content. For instance, if your application is a
+ flower delivery service, you would probably want to pre-render
+ your catalog pages for red roses during the weeks leading up to
+ Valentine's Day. When the user searches for red roses,
+ they are served the pre-rendered page. Queries for, say, yellow
+ roses will be generated directly from the database. The
+ mod_rewrite module included with Apache is a great tool to
+ implement these substitutions.
+ </p>
+
+
+ <h4><a name="Example: A Statically Rendered Blog" id="Example: A Statically Rendered Blog">Example: A Statically Rendered Blog
+ </a></h4>
+
+ <p>Blosxom is a lightweight web log package that runs as a CGI.
+ It is written in Perl and uses plain text files for entry
+ input. Besides running as CGI, Blosxom can be run from the
+ command line to pre-render blog pages. Pre-rendering pages
+ to static HTML can yield a significant performance boost in
+ the event that large numbers of people actually start
+ reading your blog.
+ </p>
+ <p>To run Blosxom for static page generation, edit the CGI
+ script according to the documentation. Set the <code>$static_dir</code>
+ variable to the <code>DocumentRoot</code>
+ of the web server, and run the script from the command line
+ as follows:
+ </p>
+
+ <div class="example"><p><code>$ perl blosxom.cgi -password='whateveryourpassword'
+ </code></p></div>
+
+ <p>This can be run periodically from cron, after you upload
+ content, etc. To make Apache substitute the statically
+ rendered pages for the dynamic content, we’ll use
+ mod_rewrite. This module is included with the Apache source
+ code, but is not compiled by default. It can be built with
+ the server by passing the option
+ <code>--enable-rewrite[=shared]</code>
+ to the configure command. Many binary distributions of
+ Apache come with mod_rewrite included. The following is an
+ example of an Apache virtual host that takes advantage of
+ pre-rendered blog pages:
+ </p>
+
+ <div class="example"><p><code>Listen *:8001<br />
+ <VirtualHost *:8001><br />
+ <span class="indent">
+ ServerName blog.sandla.org:8001<br />
+ ServerAdmin sander@temme.net<br />
+ DocumentRoot "/home/sctemme/inst/blog/httpd/htdocs"<br />
+ <Directory
+ "/home/sctemme/inst/blog/httpd/htdocs"><br />
+ <span class="indent">
+ Options +Indexes<br />
+ Order allow,deny<br />
+ Allow from all<br />
+ RewriteEngine on<br />
+ RewriteCond %{REQUEST_FILENAME} !-f<br />
+ RewriteCond %{REQUEST_FILENAME} !-d<br />
+ RewriteRule ^(.*)$ /cgi-bin/blosxom.cgi/$1 [L,QSA]<br />
+ </span>
+ </Directory><br />
+ RewriteLog
+ /home/sctemme/inst/blog/httpd/logs/rewrite_log<br />
+ RewriteLogLevel 9<br />
+ ErrorLog /home/sctemme/inst/blog/httpd/logs/error_log<br />
+ LogLevel debug<br />
+ CustomLog /home/sctemme/inst/blog/httpd/logs/access_log
+ common<br />
+ ScriptAlias /cgi-bin/ /home/sctemme/inst/blog/bin/<br />
+ <Directory "/home/sctemme/inst/blog/bin"><br />
+ <span class="indent">
+ Options +ExecCGI<br />
+ Order allow,deny<br />
+ Allow from all<br />
+ </span>
+ </Directory><br />
+ </span>
+ </VirtualHost>
+ </code></p></div>
+
+ <p>
+ The <code>RewriteCond</code>
+ and <code>RewriteRule</code>
+ directives say that, if the requested resource does not
+ exist as a file or a directory, its path is passed to the
+ Blosxom CGI for rendering. Blosxom uses Path Info to
+ specify blog entries and index pages, so this means that if
+ a particular path under Blosxom exists as a static file in
+ the file system, the file is served instead. Any request
+ that isn't pre-rendered is served by the CGI. This
+ means that individual entries, which show the comments, are
+ always served by the CGI, which in turn means that your
+ comment spam is always visible. This configuration also
+ hides the Blosxom CGI from the user-visible URL in their
+ Location bar. mod_rewrite is a fantastically powerful and
+ versatile module: investigate it to arrive at a
+ configuration that is best for your situation.
+ </p>
+
+
+
+
+ <h3><a name="Caching Content With mod_cache" id="Caching Content With mod_cache">Caching Content With mod_cache
+ </a></h3>
+
+ <p>The mod_cache module provides intelligent caching of HTTP
+ responses: it is aware of the expiration timing and content
+ requirements that are part of the HTTP specification. The
+ mod_cache module caches URL response content. If content sent
+ to the client is considered cacheable, it is saved to disk.
+ Subsequent requests for that URL will be served directly from
+ the cache. The provider module for mod_cache, mod_cache_disk
+ (named mod_disk_cache in httpd 2.2 and earlier),
+ determines how the cached content is stored on disk. Most
+ server systems will have more disk available than memory, and
+ it's good to note that some operating system kernels cache
+ frequently accessed disk content transparently in memory, so
+ replicating this in the server is not very useful.
+ </p>
+ <p>To enable efficient content caching and avoid presenting the
+ user with stale or invalid content, the application that
+ generates the actual content has to send the correct response
+ headers. Without headers like <code>ETag:</code>,
+ <code>Last-Modified:</code>
+ or <code>Expires:</code>,
+ mod_cache cannot make the right decision on whether to cache
+ the content, serve it from cache or leave it alone. When
+ testing content caching, you may find that you need to modify
+ your application or, if this is impossible, selectively disable
+ caching for URLs that cause problems. The mod_cache modules are
+ not compiled by default, but can be enabled by passing the
+ option <code>--enable-cache[=shared]</code>
+ to the configure script. If you use a binary distribution of
+ Apache httpd, or it came with your port or package collection,
+ it may have mod_cache already included.
+ </p>
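+ <p>As a minimal sketch of selectively disabling caching,
+ assuming a hypothetical application path
+ <code>/app/private</code> whose responses must never be served
+ from cache, the <code>CacheDisable</code> directive can exempt
+ that path while the rest of the site stays cached:
+ </p>
+
+ <div class="example"><p><code>
+ CacheEnable disk /<br />
+ # hypothetical path that causes problems when cached<br />
+ CacheDisable /app/private
+ </code></p></div>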
+
+
+ <h4><a name="Example: wiki.apache.org" id="Example: wiki.apache.org">Example: wiki.apache.org
+ </a></h4>
+
+ <p>
+ The Apache Software Foundation Wiki is served by
+ <a href="/httpd/MoinMoin">MoinMoin</a>.
+ <a href="/httpd/MoinMoin">MoinMoin</a>
+ is written in Python and runs as a CGI. To date, any
+ attempts to run it under mod_python have been unsuccessful.
+ The CGI proved to place an untenably high load on the
+ server machine, especially when the Wiki was being indexed
+ by search engines like Google. To lighten the load on the
+ server machine, the Apache Infrastructure team turned to
+ mod_cache. It turned out <a href="/httpd/MoinMoin">MoinMoin</a>
+ needed a small patch to ensure proper behavior behind the
+ caching server: certain requests can never be cached and
+ the corresponding Python modules were patched to send the
+ proper HTTP response headers. After this modification, the
+ cache in front of the Wiki was enabled with the following
+ configuration snippet in <code>httpd.conf</code>:
+ </p>
+
+ <div class="example"><p><code>
+ CacheRoot /raid1/cacheroot<br />
+ CacheEnable disk /<br />
+ # A page modified 100 minutes ago will expire in 10 minutes<br />
+ CacheLastModifiedFactor .1<br />
+ # Always check again after 6 hours<br />
+ CacheMaxExpire 21600
+ </code></p></div>
+
+ <p>This configuration will try to cache any and all content
+ within its virtual host. It will never cache content for
+ more than six hours (the <code>CacheMaxExpire</code>
+ directive). If no <code>Expires:</code>
+ header is present in the response, mod_cache will compute
+ an expiration period from the <code>Last-Modified:</code>
+ header. The computation using <code>CacheLastModifiedFactor</code>
+ is based on the assumption that if a page was recently
+ modified, it is likely to change again in the near future
+ and will have to be re-cached.
+ </p>
+ <p>
+ Do note that it can pay off to <em>disable</em>
+ the <code>ETag:</code>
+ header: for files smaller than 1k the server has to
+ calculate the checksum (usually MD5) and then send out a
+ <code>304 Not Modified</code>
+ response, which wastes some CPU and still uses
+ the same amount of network resources for the transfer (one
+ TCP packet). For resources larger than 1k it might prove
+ CPU expensive to calculate the header for each request.
+ Unfortunately, there is currently no way to cache
+ these headers.
+ </p>
+ <div class="example"><p><code>
+ <FilesMatch \.(jpe?g|png|gif|js|css|x?html|xml)><br />
+ <span class="indent">
+ FileETag None<br />
+ </span>
+ </FilesMatch>
+ </code></p></div>
+
+ <p>
+ This will disable the generation of the <code>ETag:</code>
+ header for most static resources. The server does not
+ calculate these headers for dynamic resources.
+ </p>
+
+
+
+
+ </div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
+<div class="section">
+<h2><a name="Further Considerations" id="Further Considerations">Further Considerations
+ </a></h2>
+
+ <p>Armed with the knowledge of how to tune a system to deliver
+ the desired performance, we will soon discover that <em>one</em>
+ system might prove a bottleneck. How to make a system fit for
+ growth, or how to put a number of systems into tune will be
+ discussed in <a href="/httpd/PerformanceScalingOut">PerformanceScalingOut</a>.
+ </p>
+ </div></div>
+<div class="bottomlang">
+<p><span>Available Languages: </span><a href="../en/misc/perf-scaling.html" title="English"> en </a></p>
+</div><div id="footer">
+<p class="apache">Copyright 2012 The Apache Software Foundation.<br />Licensed under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.</p>
+<p class="menu"><a href="../mod/">Modules</a> | <a href="../mod/directives.html">Directives</a> | <a href="../faq/">FAQ</a> | <a href="../glossary.html">Glossary</a> | <a href="../sitemap.html">Sitemap</a></p></div>
</body></html>
\ No newline at end of file