1 <?xml version="1.0" encoding="UTF-8" ?>
2 <!DOCTYPE manualpage SYSTEM "./style/manualpage.dtd">
3 <?xml-stylesheet type="text/xsl" href="./style/manual.en.xsl"?>
6 <relativepath href="."/>
8 <title>Log Files</title>
11 <p>In order to effectively manage a web server, it is necessary
12 to get feedback about the activity and performance of the
server as well as any problems that may be occurring. The Apache
14 HTTP Server provides very comprehensive and flexible logging
15 capabilities. This document describes how to configure its
logging capabilities, and how to understand what the logs contain.</p>
20 <section id="security">
21 <title>Security Warning</title>
23 <p>Anyone who can write to the directory where Apache is
24 writing a log file can almost certainly gain access to the uid
25 that the server is started as, which is normally root. Do
26 <em>NOT</em> give people write access to the directory the logs
27 are stored in without being aware of the consequences; see the
<a href="misc/security_tips.html">security tips</a> document for details.</p>
31 <p>In addition, log files may contain information supplied
32 directly by the client, without escaping. Therefore, it is
33 possible for malicious clients to insert control-characters in
the log files, so care must be taken in dealing with raw logs.</p>
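<p>As a minimal sketch of one way to handle raw logs with care, the
following Python function makes control characters visible before a
line is displayed or passed on. The function name and the choice of
<code>\xNN</code> escapes are illustrative assumptions and are not
part of Apache httpd.</p>

```python
def escape_control_chars(line):
    """Return line with ASCII control characters rewritten as \\xNN escapes."""
    safe = []
    for ch in line.rstrip("\n"):
        code = ord(ch)
        # Assumption: only C0 controls (0x00-0x1f) and DEL (0x7f) are escaped.
        if code < 0x20 or code == 0x7F:
            safe.append("\\x{:02x}".format(code))
        else:
            safe.append(ch)
    return "".join(safe)


if __name__ == "__main__":
    # A request line carrying a terminal escape sequence (hypothetical data).
    print(escape_control_chars("GET /\x1b[2Jindex.html HTTP/1.0"))
```

<p>Escaping rather than deleting the characters keeps the evidence of
what the client actually sent.</p>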
38 <section id="errorlog">
39 <title>Error Log</title>
<directive module="core">ErrorLog</directive>
<directive module="core">LogLevel</directive>
<p>The server error log, whose name and location are set by the
49 <directive module="core">ErrorLog</directive> directive, is the
50 most important log file. This is the place where Apache httpd
51 will send diagnostic information and record any errors that it
52 encounters in processing requests. It is the first place to
53 look when a problem occurs with starting the server or with the
54 operation of the server, since it will often contain details of
55 what went wrong and how to fix it.</p>
57 <p>The error log is usually written to a file (typically
58 <code>error_log</code> on unix systems and
59 <code>error.log</code> on Windows and OS/2). On unix systems it
60 is also possible to have the server send errors to
<code>syslog</code> or <a href="#piped">pipe them to a program</a>.</p>
64 <p>The format of the error log is relatively free-form and
65 descriptive. But there is certain information that is contained
in most error log entries. For example, here is a typical message.</p>
70 [Wed Oct 11 14:32:52 2000] [error] [client 127.0.0.1]
71 client denied by server configuration:
72 /export/home/live/ap/htdocs/test
75 <p>The first item in the log entry is the date and time of the
76 message. The second entry lists the severity of the error being
77 reported. The <directive module="core">LogLevel</directive>
78 directive is used to control the types of errors that are sent
79 to the error log by restricting the severity level. The third
80 entry gives the IP address of the client that generated the
81 error. Beyond that is the message itself, which in this case
82 indicates that the server has been configured to deny the
83 client access. The server reports the file-system path (as
84 opposed to the web path) of the requested document.</p>
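<p>The fields described above can be pulled apart with a short
script. The following Python sketch assumes the entry appears on a
single line in the log file, in the layout of the example message;
the regular expression and the function name are illustrative
assumptions, and since the format is free-form, other messages may
not match.</p>

```python
import re
from datetime import datetime

# Layout assumed from the sample entry: [time] [severity] [client IP] message.
# The client field is optional because not every message concerns a request.
ERROR_LINE = re.compile(
    r"^\[(?P<time>[^\]]+)\] "
    r"\[(?P<level>[^\]]+)\] "
    r"(?:\[client (?P<client>[^\]]+)\] )?"
    r"(?P<message>.*)$"
)


def parse_error_line(line):
    """Return a dict of fields, or None if the line does not fit the layout."""
    match = ERROR_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    entry = match.groupdict()
    try:
        # English day and month names are assumed (the default C locale).
        entry["time"] = datetime.strptime(entry["time"], "%a %b %d %H:%M:%S %Y")
    except ValueError:
        return None
    return entry
```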
86 <p>A very wide variety of different messages can appear in the
87 error log. Most look similar to the example above. The error
88 log will also contain debugging output from CGI scripts. Any
89 information written to <code>stderr</code> by a CGI script will
90 be copied directly to the error log.</p>
92 <p>It is not possible to customize the error log by adding or
93 removing information. However, error log entries dealing with
94 particular requests have corresponding entries in the <a
95 href="#accesslog">access log</a>. For example, the above example
96 entry corresponds to an access log entry with status code 403.
97 Since it is possible to customize the access log, you can
obtain more information about error conditions using that log file.</p>
101 <p>During testing, it is often useful to continuously monitor
102 the error log for any problems. On unix systems, you can
103 accomplish this using:</p>
110 <section id="accesslog">
111 <title>Access Log</title>
115 <module>mod_log_config</module>
116 <module>mod_setenvif</module>
119 <directive module="mod_log_config">CustomLog</directive>
120 <directive module="mod_log_config">LogFormat</directive>
121 <directive module="mod_setenvif">SetEnvIf</directive>
125 <p>The server access log records all requests processed by the
126 server. The location and content of the access log are
127 controlled by the <directive module="mod_log_config">CustomLog</directive>
128 directive. The <directive module="mod_log_config">LogFormat</directive>
129 directive can be used to simplify the selection of
130 the contents of the logs. This section describes how to configure the server
131 to record information in the access log.</p>
133 <p>Of course, storing the information in the access log is only
134 the start of log management. The next step is to analyze this
135 information to produce useful statistics. Log analysis in
136 general is beyond the scope of this document, and not really
137 part of the job of the web server itself. For more information
138 about this topic, and for applications which perform log
139 analysis, check the <a
140 href="http://dmoz.org/Computers/Software/Internet/Site_Management/Log_analysis/">
141 Open Directory</a> or <a
href="http://dir.yahoo.com/Computers_and_Internet/Software/Internet/World_Wide_Web/Servers/Log_Analysis_Tools/">
Yahoo</a>.</p>
145 <p>Various versions of Apache httpd have used other modules and
146 directives to control access logging, including
147 mod_log_referer, mod_log_agent, and the
148 <code>TransferLog</code> directive. The <code>CustomLog</code>
directive now subsumes the functionality of all the older directives.</p>
152 <p>The format of the access log is highly configurable. The
format is specified using a format string that
looks much like a C-style printf(1) format string. Some
156 examples are presented in the next sections. For a complete
157 list of the possible contents of the format string, see the <a
158 href="mod/mod_log_config.html">mod_log_config
159 documentation</a>.</p>
161 <section id="common">
162 <title>Common Log Format</title>
<p>A typical configuration for the access log might look as follows.</p>
168 LogFormat "%h %l %u %t \"%r\" %>s %b" common<br />
169 CustomLog logs/access_log common
172 <p>This defines the <em>nickname</em> <code>common</code> and
173 associates it with a particular log format string. The format
string consists of percent directives, each of which tells the
175 server to log a particular piece of information. Literal
176 characters may also be placed in the format string and will be
177 copied directly into the log output. The quote character
178 (<code>"</code>) must be escaped by placing a back-slash before
179 it to prevent it from being interpreted as the end of the
180 format string. The format string may also contain the special
181 control characters "<code>\n</code>" for new-line and
182 "<code>\t</code>" for tab.</p>
184 <p>The <code>CustomLog</code> directive sets up a new log file
185 using the defined <em>nickname</em>. The filename for the
186 access log is relative to the <directive
module="core">ServerRoot</directive> unless it begins with a slash.</p>
190 <p>The above configuration will write log entries in a format
191 known as the Common Log Format (CLF). This standard format can
192 be produced by many different web servers and read by many log
193 analysis programs. The log file entries produced in CLF will
194 look something like this:</p>
197 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET
198 /apache_pb.gif HTTP/1.0" 200 2326
201 <p>Each part of this log entry is described below.</p>
204 <dt><code>127.0.0.1</code> (<code>%h</code>)</dt>
206 <dd>This is the IP address of the client (remote host) which
207 made the request to the server. If <directive
208 module="core">HostnameLookups</directive> is
209 set to <code>On</code>, then the server will try to determine
210 the hostname and log it in place of the IP address. However,
211 this configuration is not recommended since it can
212 significantly slow the server. Instead, it is best to use a
213 log post-processor such as <a
214 href="programs/logresolve.html">logresolve</a> to determine
215 the hostnames. The IP address reported here is not
216 necessarily the address of the machine at which the user is
217 sitting. If a proxy server exists between the user and the
218 server, this address will be the address of the proxy, rather
219 than the originating machine.</dd>
221 <dt><code>-</code> (<code>%l</code>)</dt>
223 <dd>The "hyphen" in the output indicates that the requested
224 piece of information is not available. In this case, the
225 information that is not available is the RFC 1413 identity of
the client determined by <code>identd</code> on the client's
227 machine. This information is highly unreliable and should
228 almost never be used except on tightly controlled internal
229 networks. Apache httpd will not even attempt to determine
230 this information unless <directive
231 module="core">IdentityCheck</directive> is set
232 to <code>On</code>.</dd>
234 <dt><code>frank</code> (<code>%u</code>)</dt>
236 <dd>This is the userid of the person requesting the document
237 as determined by HTTP authentication. The same value is
238 typically provided to CGI scripts in the
239 <code>REMOTE_USER</code> environment variable. If the status
240 code for the request (see below) is 401, then this value
241 should not be trusted because the user is not yet
242 authenticated. If the document is not password protected,
this entry will be "<code>-</code>" just like the previous one.</dd>
246 <dt><code>[10/Oct/2000:13:55:36 -0700]</code>
247 (<code>%t</code>)</dt>
The time that the request was received.
The format is:
<code>[day/month/year:hour:minute:second zone]<br />
day = 2*digit<br />
month = 3*letter<br />
year = 4*digit<br />
hour = 2*digit<br />
minute = 2*digit<br />
second = 2*digit<br />
zone = (`+' | `-') 4*digit</code>
263 It is possible to have the time displayed in another format
264 by specifying <code>%{format}t</code> in the log format
265 string, where <code>format</code> is as in
266 <code>strftime(3)</code> from the C standard library.
269 <dt><code>"GET /apache_pb.gif HTTP/1.0"</code>
270 (<code>\"%r\"</code>)</dt>
272 <dd>The request line from the client is given in double
273 quotes. The request line contains a great deal of useful
274 information. First, the method used by the client is
275 <code>GET</code>. Second, the client requested the resource
276 <code>/apache_pb.gif</code>, and third, the client used the
277 protocol <code>HTTP/1.0</code>. It is also possible to log
278 one or more parts of the request line independently. For
279 example, the format string "<code>%m %U%q %H</code>" will log
280 the method, path, query-string, and protocol, resulting in
281 exactly the same output as "<code>%r</code>".</dd>
283 <dt><code>200</code> (<code>%>s</code>)</dt>
285 <dd>This is the status code that the server sends back to the
286 client. This information is very valuable, because it reveals
287 whether the request resulted in a successful response (codes
288 beginning in 2), a redirection (codes beginning in 3), an
289 error caused by the client (codes beginning in 4), or an
290 error in the server (codes beginning in 5). The full list of
291 possible status codes can be found in the <a
292 href="http://www.w3.org/Protocols/rfc2616/rfc2616.txt">HTTP
293 specification</a> (RFC2616 section 10).</dd>
295 <dt><code>2326</code> (<code>%b</code>)</dt>
297 <dd>The last entry indicates the size of the object returned
298 to the client, not including the response headers. If no
299 content was returned to the client, this value will be
300 "<code>-</code>". To log "<code>0</code>" for no content, use
301 <code>%B</code> instead.</dd>
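<p>To tie the fields together, here is a minimal Python sketch that
parses one CLF entry into its parts. It assumes the entry occupies a
single line in the log file (the example above is wrapped only for
display) and that the <code>%u</code> value contains no spaces; the
regular expression and function name are illustrative assumptions
rather than anything shipped with Apache httpd.</p>

```python
import re
from datetime import datetime

# One group per CLF field: %h %l %u %t "%r" %>s %b
CLF_LINE = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>.*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_clf_line(line):
    """Return a dict of CLF fields, or None if the line does not match."""
    match = CLF_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    entry = match.groupdict()
    # [day/month/year:hour:minute:second zone], English month names assumed.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # %b logs "-" when no content was returned; treat that as zero bytes.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry
```

<p>Converting "<code>-</code>" to zero here mirrors what
<code>%B</code> would have logged, which simplifies summing response
sizes later.</p>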
305 <section id="combined">
306 <title>Combined Log Format</title>
308 <p>Another commonly used format string is called the Combined
309 Log Format. It can be used as follows.</p>
312 LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\"
313 \"%{User-agent}i\"" combined<br />
CustomLog logs/access_log combined
317 <p>This format is exactly the same as the Common Log Format,
318 with the addition of two more fields. Each of the additional
319 fields uses the percent-directive
320 <code>%{<em>header</em>}i</code>, where <em>header</em> can be
any HTTP request header. The access log under this format will
look something like:</p>
325 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET
326 /apache_pb.gif HTTP/1.0" 200 2326
"http://www.example.com/start.html" "Mozilla/4.08 [en]
(Win98; I ;Nav)"
331 <p>The additional fields are:</p>
334 <dt><code>"http://www.example.com/start.html"</code>
335 (<code>\"%{Referer}i\"</code>)</dt>
337 <dd>The "Referer" (sic) HTTP request header. This gives the
338 site that the client reports having been referred from. (This
339 should be the page that links to or includes
340 <code>/apache_pb.gif</code>).</dd>
342 <dt><code>"Mozilla/4.08 [en] (Win98; I ;Nav)"</code>
343 (<code>\"%{User-agent}i\"</code>)</dt>
345 <dd>The User-Agent HTTP request header. This is the
identifying information that the client browser reports about itself.</dd>
351 <section id="multiple">
352 <title>Multiple Access Logs</title>
354 <p>Multiple access logs can be created simply by specifying
multiple <directive module="mod_log_config">CustomLog</directive>
356 directives in the configuration
357 file. For example, the following directives will create three
358 access logs. The first contains the basic CLF information,
359 while the second and third contain referer and browser
information. The last two <directive
module="mod_log_config">CustomLog</directive> lines show how
to mimic the effects of the older <code>RefererLog</code>
and <code>AgentLog</code> directives.</p>
366 LogFormat "%h %l %u %t \"%r\" %>s %b" common<br />
367 CustomLog logs/access_log common<br />
368 CustomLog logs/referer_log "%{Referer}i -> %U"<br />
369 CustomLog logs/agent_log "%{User-agent}i"
372 <p>This example also shows that it is not necessary to define a
373 nickname with the <code>LogFormat</code> directive. Instead,
374 the log format can be specified directly in the
<directive module="mod_log_config">CustomLog</directive> directive.</p>
378 <section id="conditional">
379 <title>Conditional Logs</title>
381 <p>There are times when it is convenient to exclude certain
382 entries from the access logs based on characteristics of the
383 client request. This is easily accomplished with the help of <a
384 href="env.html">environment variables</a>. First, an
385 environment variable must be set to indicate that the request
386 meets certain conditions. This is usually accomplished with
387 <directive module="mod_setenvif">SetEnvIf</directive>. Then the
388 <code>env=</code> clause of the <code>CustomLog</code>
389 directive is used to include or exclude requests where the
390 environment variable is set. Some examples:</p>
393 # Mark requests from the loop-back interface<br />
394 SetEnvIf Remote_Addr "127\.0\.0\.1" dontlog<br />
395 # Mark requests for the robots.txt file<br />
396 SetEnvIf Request_URI "^/robots\.txt$" dontlog<br />
397 # Log what remains<br />
398 CustomLog logs/access_log common env=!dontlog
401 <p>As another example, consider logging requests from
English speakers to one log file, and non-English speakers to a
403 different log file.</p>
406 SetEnvIf Accept-Language "en" english<br />
407 CustomLog logs/english_log common env=english<br />
408 CustomLog logs/non_english_log common env=!english
411 <p>Although we have just shown that conditional logging is very
powerful and flexible, it is not the only way to control the
413 contents of the logs. Log files are more useful when they
414 contain a complete record of server activity. It is often
415 easier to simply post-process the log files to remove requests
416 that you do not want to consider.</p>
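<p>As a rough illustration of the post-processing alternative, the
following Python sketch drops the same kinds of entries that the
first conditional example excludes (loop-back requests and requests
for <code>robots.txt</code>) from an already complete log. It
assumes CLF-style lines that start with the client address; the
function names are hypothetical.</p>

```python
import re

# Matches a request line such as "GET /robots.txt HTTP/1.0".
ROBOTS = re.compile(r'"[A-Z]+ /robots\.txt[ ?]')


def keep_line(line):
    """Return False for entries to be ignored during analysis."""
    host = line.split(" ", 1)[0]
    # Exact comparison, unlike the unanchored SetEnvIf pattern above.
    if host == "127.0.0.1":
        return False
    if ROBOTS.search(line):
        return False
    return True


def filter_log(lines):
    """Return only the entries that should be analyzed."""
    return [line for line in lines if keep_line(line)]
```

<p>Because the filtering happens after the fact, the original log
file still holds the complete record of server activity.</p>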
420 <section id="rotation">
<title>Log Rotation</title>
423 <p>On even a moderately busy server, the quantity of
424 information stored in the log files is very large. The access
425 log file typically grows 1 MB or more per 10,000 requests. It
426 will consequently be necessary to periodically rotate the log
427 files by moving or deleting the existing logs. This cannot be
428 done while the server is running, because Apache will continue
429 writing to the old log file as long as it holds the file open.
430 Instead, the server must be <a
431 href="stopping.html">restarted</a> after the log files are
432 moved or deleted so that it will open new log files.</p>
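<p>The growth figure above lends itself to a quick
back-of-the-envelope estimate. The sketch below simply scales the
"1 MB per 10,000 requests" rule of thumb; the traffic figure in the
usage line is a made-up example, and real entries may be larger,
particularly with longer log formats.</p>

```python
def estimated_growth_mb(requests_per_day, mb_per_10k=1.0):
    """Estimate daily access log growth in MB from the rule of thumb."""
    return requests_per_day / 10_000 * mb_per_10k


if __name__ == "__main__":
    # Hypothetical server handling 250,000 requests per day.
    print(estimated_growth_mb(250_000))
```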
434 <p>By using a <em>graceful</em> restart, the server can be
435 instructed to open new log files without losing any existing or
436 pending connections from clients. However, in order to
437 accomplish this, the server must continue to write to the old
438 log files while it finishes serving old requests. It is
439 therefore necessary to wait for some time after the restart
440 before doing any processing on the log files. A typical
441 scenario that simply rotates the logs and compresses the old
442 logs to save space is:</p>
445 mv access_log access_log.old<br />
446 mv error_log error_log.old<br />
447 apachectl graceful<br />
449 gzip access_log.old error_log.old
452 <p>Another way to perform log rotation is using <a
href="#piped">piped logs</a> as discussed in the next section.</p>
458 <title>Piped Logs</title>
460 <p>Apache httpd is capable of writing error and access log
461 files through a pipe to another process, rather than directly
462 to a file. This capability dramatically increases the
463 flexibility of logging, without adding code to the main server.
464 In order to write logs to a pipe, simply replace the filename
465 with the pipe character "<code>|</code>", followed by the name
466 of the executable which should accept log entries on its
467 standard input. Apache will start the piped-log process when
468 the server starts, and will restart it if it crashes while the
469 server is running. (This last feature is why we can refer to
470 this technique as "reliable piped logging".)</p>
472 <p>Piped log processes are spawned by the parent Apache httpd
473 process, and inherit the userid of that process. This means
474 that piped log programs usually run as root. It is therefore
475 very important to keep the programs simple and secure.</p>
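<p>To make the idea concrete, here is a deliberately small sketch of
a piped-log program in Python. It only copies entries from standard
input to a file named on its command line. The script itself and the
way it takes its argument are assumptions made for illustration; it
is not a program distributed with Apache httpd.</p>

```python
import sys


def pump(source, sink):
    """Copy log entries from source to sink, one line at a time."""
    count = 0
    for line in source:
        sink.write(line)
        # Flush after every entry so the file stays current.
        sink.flush()
        count += 1
    return count


if __name__ == "__main__" and len(sys.argv) == 2:
    # The server supplies entries on standard input; argv[1] names the log file.
    with open(sys.argv[1], "a") as log_file:
        pump(sys.stdin, log_file)
```

<p>Keeping the program this small follows the advice above: the less
a process running with the parent's privileges does, the less there
is to go wrong.</p>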
477 <p>Some simple examples using piped logs:</p>
480 # compressed logs<br />
481 CustomLog "|/usr/bin/gzip -c >>
482 /var/log/access_log.gz" common<br />
483 # almost-real-time name resolution<br />
484 CustomLog "|/usr/local/apache/bin/logresolve >>
485 /var/log/access_log" common
488 <p>Notice that quotes are used to enclose the entire command
489 that will be called for the pipe. Although these examples are
for the access log, the same technique can be used for the error log.</p>
493 <p>One important use of piped logs is to allow log rotation
494 without having to restart the server. The Apache HTTP Server
495 includes a simple program called <a
496 href="programs/rotatelogs.html">rotatelogs</a> for this
purpose. For example, to rotate the logs every 24 hours, you can use:</p>
501 CustomLog "|/usr/local/apache/bin/rotatelogs
502 /var/log/access_log 86400" common
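<p>With a plain rotation interval such as the 86400 seconds used
here, rotatelogs appends to the base name the time at which the
current log nominally started, which is always a multiple of the
interval. The Python sketch below reproduces that arithmetic under
the assumption that no strftime-style format appears in the file
name; the function name is hypothetical.</p>

```python
def rotated_name(base, interval, now):
    """Return the file name in use at time `now` (seconds since the epoch)."""
    # Round down to the most recent multiple of the rotation interval.
    start = now - (now % interval)
    return "{}.{}".format(base, start)


if __name__ == "__main__":
    print(rotated_name("/var/log/access_log", 86400, 971273572))
```

<p>Knowing how the suffix is derived makes it easier to schedule jobs
that compress or archive logs once they are complete.</p>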
505 <p>A similar, but much more flexible log rotation program
506 called <a href="http://www.cronolog.org/">cronolog</a>
507 is available at an external site.</p>
509 <p>As with conditional logging, piped logs are a very powerful
510 tool, but they should not be used where a simpler solution like
511 off-line post-processing is available.</p>
514 <section id="virtualhost">
515 <title>Virtual Hosts</title>
517 <p>When running a server with many <a href="vhosts/">virtual
518 hosts</a>, there are several options for dealing with log
519 files. First, it is possible to use logs exactly as in a
520 single-host server. Simply by placing the logging directives
521 outside the <directive module="core"
522 type="section">VirtualHost</directive> sections in the
523 main server context, it is possible to log all requests in the
524 same access log and error log. This technique does not allow
for easy collection of statistics on individual virtual hosts.</p>
528 <p>If <directive module="mod_log_config">CustomLog</directive>
or <directive module="core">ErrorLog</directive>
530 directives are placed inside a
531 <directive module="core" type="section">VirtualHost</directive>
532 section, all requests or errors for that virtual host will be
533 logged only to the specified file. Any virtual host which does
534 not have logging directives will still have its requests sent
535 to the main server logs. This technique is very useful for a
536 small number of virtual hosts, but if the number of hosts is
537 very large, it can be complicated to manage. In addition, it
538 can often create problems with <a
href="vhosts/fd-limits.html">insufficient file descriptors</a>.</p>
542 <p>For the access log, there is a very good compromise. By
543 adding information on the virtual host to the log format
544 string, it is possible to log all hosts to the same log, and
545 later split the log into individual files. For example,
546 consider the following directives.</p>
LogFormat "%v %l %u %t \"%r\" %>s %b" comonvhost<br />
551 CustomLog logs/access_log comonvhost
554 <p>The <code>%v</code> is used to log the name of the virtual
555 host that is serving the request. Then a program like <a
556 href="programs/other.html">split-logfile</a> can be used to
557 post-process the access log in order to split it into one file
558 per virtual host.</p>
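<p>The splitting step can be pictured with a few lines of Python.
This is only a sketch of the idea, not the split-logfile program
itself: it groups entries by the leading <code>%v</code> field in
memory, and both the function name and the decision to lowercase the
host name are assumptions made for illustration.</p>

```python
from collections import defaultdict


def split_by_vhost(lines):
    """Group log entries by their first field, the virtual host name."""
    groups = defaultdict(list)
    for line in lines:
        vhost, _, rest = line.partition(" ")
        if rest:
            # Store the entry without the leading virtual host field.
            groups[vhost.lower()].append(rest)
    return dict(groups)
```

<p>Each resulting group could then be written to its own file and
fed to a log analysis program as if it had come from a single-host
server.</p>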
562 <title>Other Log Files</title>
566 <module>mod_cgi</module>
567 <module>mod_rewrite</module>
<directive module="mpm_common">PidFile</directive>
571 <directive module="mod_rewrite">RewriteLog</directive>
572 <directive module="mod_rewrite">RewriteLogLevel</directive>
573 <directive module="mod_cgi">ScriptLog</directive>
574 <directive module="mod_cgi">ScriptLogLength</directive>
575 <directive module="mod_cgi">ScriptBuffer</directive>
579 <section id="pidfile">
580 <title>PID File</title>
582 <p>On startup, Apache httpd saves the process id of the parent
583 httpd process to the file <code>logs/httpd.pid</code>. This
filename can be changed with the
<directive module="mpm_common">PidFile</directive> directive. The
586 process-id is for use by the administrator in restarting and
587 terminating the daemon by sending signals to the parent
588 process; on Windows, use the -k command line option instead.
589 For more information see the <a href="stopping.html">Stopping
590 and Restarting</a> page.</p>
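<p>As an illustration of how the PID file is typically used on Unix
systems, the following Python sketch reads the process id and
signals the parent process. The default path and the helper names
are assumptions; it also assumes that <code>SIGUSR1</code> requests
a graceful restart, which should be confirmed against the Stopping
and Restarting page for the platform in use.</p>

```python
import os
import signal


def parse_pid(text):
    """Return the process id stored in the contents of a PID file."""
    return int(text.strip())


def read_pid(pid_file="logs/httpd.pid"):
    """Read the parent httpd process id from the PID file."""
    with open(pid_file) as handle:
        return parse_pid(handle.read())


def graceful_restart(pid_file="logs/httpd.pid"):
    """Signal the parent process (Unix only, assumes SIGUSR1 is graceful)."""
    os.kill(read_pid(pid_file), signal.SIGUSR1)
```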
593 <section id="scriptlog">
594 <title>Script Log</title>
596 <p>In order to aid in debugging, the
597 <directive module="mod_cgi">ScriptLog</directive> directive
598 allows you to record the input to and output from CGI scripts.
599 This should only be used in testing - not for live servers.
600 More information is available in the <a
601 href="mod/mod_cgi.html">mod_cgi documentation</a>.</p>
604 <section id="rewritelog">
605 <title>Rewrite Log</title>
607 <p>When using the powerful and complex features of <a
608 href="mod/mod_rewrite.html">mod_rewrite</a>, it is almost
609 always necessary to use the <directive
610 module="mod_rewrite">RewriteLog</directive> to help
611 in debugging. This log file produces a detailed analysis of how
612 the rewriting engine transforms requests. The level of detail
613 is controlled by the <directive
614 module="mod_rewrite">RewriteLogLevel</directive> directive.</p>