From: dgaudet
Date: Wed, 23 Apr 1997 02:40:50 +0000 (+0000)
Subject: This documents graceful restarts, and the caveats associated with them. The
X-Git-Tag: APACHE_1_2b9~17
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=08e269d77e97f479aa9a8380a69bf4bdf5f9c8c3;p=apache

This documents graceful restarts, and the caveats associated with them. The
graceful restart code itself isn't committed yet... and so this doc won't
be linked to until we commit the graceful restart code.

Reviewed by:
Submitted by:
Obtained from:

git-svn-id: https://svn.apache.org/repos/asf/httpd/httpd/trunk@77988 13f79535-47bb-0310-9956-ffa450edef68
---

diff --git a/docs/manual/stopping.html b/docs/manual/stopping.html
new file mode 100644
index 0000000000..898acca3ca
--- /dev/null
+++ b/docs/manual/stopping.html
@@ -0,0 +1,137 @@
+
+
+
+Stopping and Restarting Apache
+
+
+
+
+

Stopping and Restarting Apache

+ +

You will notice many httpd executables running on your system, +but you should not send signals to any of them except the parent, whose +pid is in the PidFile. +There are three signals that you can send the parent: +TERM, HUP, and USR1, which are +described below. + +

To send a signal to the parent you should issue a command such as: +

+    kill -TERM `cat /usr/local/etc/httpd/logs/httpd.pid`
+
+ +You can read about its progress by issuing: + +
+    tail -f /usr/local/etc/httpd/logs/error_log
+
+ +Modify those examples to match your +ServerRoot and +PidFile settings. + +
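The `kill` command above can be wrapped in a small helper so that the pid file path is written in only one place. This is a minimal sketch and not something shipped with Apache: the function name `signal_parent` and the `PIDFILE` variable are made up for illustration, and the default path simply repeats the example path used above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Apache): send one of the three
# supported signals to the parent httpd, whose pid is read from the
# pid file. PIDFILE defaults to the example path used above; adjust it
# to match your ServerRoot and PidFile settings.
PIDFILE=${PIDFILE:-/usr/local/etc/httpd/logs/httpd.pid}

signal_parent() {
    sig=$1
    # Only the three signals described in this document are accepted.
    case $sig in
        TERM|HUP|USR1) ;;
        *) echo "usage: signal_parent TERM|HUP|USR1" >&2; return 2 ;;
    esac
    if [ ! -r "$PIDFILE" ]; then
        echo "cannot read pid file: $PIDFILE" >&2
        return 1
    fi
    # Signal the parent only, never the children.
    kill "-$sig" "$(cat "$PIDFILE")"
}
```

With this loaded, `signal_parent HUP` restarts the server and `signal_parent TERM` stops it. Refusing any other signal name is a design choice of the sketch, not a restriction imposed by Apache.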

TERM Signal: stop now

+ +

Sending the TERM signal to the parent causes it to +immediately attempt to kill off all of its children. It may take it +several seconds to complete killing off its children. Then the +parent itself exits. Any requests in progress are terminated, and no +further requests are served. + +
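Because the parent may need several seconds before it exits, a script that stops the server may want to wait for the parent to disappear before going on. The following is a sketch under that assumption. The helper name `wait_for_exit` is hypothetical, and it relies only on `kill -0`, which sends no signal and merely reports whether the pid can still be signalled.

```shell
#!/bin/sh
# Hypothetical helper: wait until the process with pid $1 is gone,
# checking once per second for at most $2 seconds.
# Returns 0 once the process has exited, 1 if it is still present.
# Note: kill -0 also fails when you lack permission to signal the
# process, so run this as the same user that sends the TERM signal.
wait_for_exit() {
    pid=$1
    tries=$2
    while [ "$tries" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        tries=$((tries - 1))
    done
    # One last check after the final sleep.
    kill -0 "$pid" 2>/dev/null || return 0
    return 1
}
```

A stop script could read the pid from the pid file, send `TERM`, and then call `wait_for_exit "$pid" 30` before, for example, starting a new server. The 30-second limit is an arbitrary illustration.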

HUP Signal: restart now

+ +

Sending the HUP signal to the parent causes it to kill off +its children as with TERM, but the parent does not exit. It +re-reads its configuration files and re-opens any log files. +Then it spawns a new set of children and continues +serving hits. + +

Users of the +status module +will notice that the server statistics are +set to zero when a HUP is sent. + +

USR1 Signal: graceful restart

+ +

Note: prior to release 1.2b9 this code is quite unstable and +shouldn't be used at all. + +

The USR1 signal causes the parent process to advise +the children to exit after their current request (or to exit immediately +if they're not serving anything). The parent re-reads its configuration +files and re-opens its log files. As each child dies off the parent +replaces it with a child from the new generation of the +configuration, which begins serving new requests immediately. + +

This code is designed to always respect the +MaxClients, +MinSpareServers, +and MaxSpareServers settings. +Furthermore, it respects StartServers +in the following manner: if after one second fewer than StartServers new +children have been created, then it creates enough to pick up the slack. +In other words, the code tries both to maintain the number of children +appropriate for the current load on the server and to respect your wishes +with the StartServers parameter. + +

Users of the +status module +will notice that the server statistics +are not set to zero when a USR1 is sent. The code +was written to both minimize the time in which the server is unable to serve +new requests (they will be queued up by the operating system, so they're +not lost in any event) and to respect your tuning parameters. In order +to do this it has to keep the scoreboard used to keep track +of all children across generations. + +

The status module will also use a G to indicate those +children which are still serving requests started before the graceful +restart was given. + +

At present there is no way for a log rotation script using +USR1 to know for certain that all children writing the +pre-restart log have finished. We suggest that you use a suitable delay +after sending the USR1 signal before you do anything with the +old log. For example, if most of your hits take less than 10 minutes to +complete for users on low-bandwidth links, then you could wait 15 minutes +before doing anything with the old log. + +
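The suggested delay can be sketched as a small rotation helper. This is an illustration, not a script shipped with Apache. The function name `rotate_log` and its arguments are hypothetical, and it assumes the common approach of renaming the log before sending `USR1`, so that children of the old generation keep writing to the renamed file while the parent opens a fresh one.

```shell
#!/bin/sh
# Hypothetical rotation helper (not part of Apache).
#   $1 = pid file, $2 = log file, $3 = delay in seconds
# Prints the name of the rotated log once the delay has passed.
rotate_log() {
    pidfile=$1
    log=$2
    delay=$3
    old="$log.$(date +%Y%m%d%H%M%S)"

    # Rename first: children still hold the old file open and keep
    # writing to it under its new name.
    mv "$log" "$old" || return 1

    # Graceful restart: the parent re-opens its log files.
    kill -USR1 "$(cat "$pidfile")" || return 1

    # Give children of the old generation time to finish their requests.
    sleep "$delay"
    echo "$old"
}
```

With the example paths used earlier and the 15 minute delay from the text, a rotation script might run `old=$(rotate_log /usr/local/etc/httpd/logs/httpd.pid /usr/local/etc/httpd/logs/error_log 900) && gzip "$old"`. The delay is still only a guess about how long the slowest request takes, exactly as described above.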

Appendix: signals and race conditions

+ +

Prior to Apache 1.2b9 there were several race conditions +involving the restart and die signals (a race condition is a +time-sensitive problem: if something happens at just +the wrong time, the server won't behave as expected). For those architectures that +have the "right" feature set we have eliminated as many as we can. +However, race conditions still exist on +certain architectures. + +

Architectures that use an on disk ScoreBoardFile have the potential +to lose track of a child during graceful restart (you'll see an ErrorLog message saying something about +a long lost child). The ScoreBoardFile directive explains how +to figure out if your server uses a file, and possibly how to avoid it. +There is also the potential that the scoreboard will be corrupted during +any signalling, but this only has bad effects on graceful restart. + +

NEXT and MACHTEN have small race conditions +which can cause a restart/die signal to be lost, but should not cause the +server to do anything otherwise problematic. + + +

All architectures have a small race condition in each child involving +the second and subsequent requests on a persistent HTTP connection +(KeepAlive). A child may exit after reading the request line but before +reading any of the request headers. There is a fix that was discovered +too late to make 1.2. In theory this isn't an issue, since a KeepAlive +client has to expect these events because of network latencies and +server timeouts. In practice it doesn't seem to affect anything either +-- in a test case the server was restarted twenty times per second and +clients successfully browsed the site without getting broken images or +empty documents. + + + +

diff --git a/docs/manual/stopping.html.en b/docs/manual/stopping.html.en
new file mode 100644
index 0000000000..898acca3ca
--- /dev/null
+++ b/docs/manual/stopping.html.en
@@ -0,0 +1,137 @@
+
+
+
+Stopping and Restarting Apache
+
+
+
+
+

Stopping and Restarting Apache

+ +

You will notice many httpd executables running on your system, +but you should not send signals to any of them except the parent, whose +pid is in the PidFile. +There are three signals that you can send the parent: +TERM, HUP, and USR1, which are +described below. + +

To send a signal to the parent you should issue a command such as: +

+    kill -TERM `cat /usr/local/etc/httpd/logs/httpd.pid`
+
+ +You can read about its progress by issuing: + +
+    tail -f /usr/local/etc/httpd/logs/error_log
+
+ +Modify those examples to match your +ServerRoot and +PidFile settings. + +

TERM Signal: stop now

+ +

Sending the TERM signal to the parent causes it to +immediately attempt to kill off all of its children. It may take it +several seconds to complete killing off its children. Then the +parent itself exits. Any requests in progress are terminated, and no +further requests are served. + +

HUP Signal: restart now

+ +

Sending the HUP signal to the parent causes it to kill off +its children as with TERM, but the parent does not exit. It +re-reads its configuration files and re-opens any log files. +Then it spawns a new set of children and continues +serving hits. + +

Users of the +status module +will notice that the server statistics are +set to zero when a HUP is sent. + +

USR1 Signal: graceful restart

+ +

Note: prior to release 1.2b9 this code is quite unstable and +shouldn't be used at all. + +

The USR1 signal causes the parent process to advise +the children to exit after their current request (or to exit immediately +if they're not serving anything). The parent re-reads its configuration +files and re-opens its log files. As each child dies off the parent +replaces it with a child from the new generation of the +configuration, which begins serving new requests immediately. + +

This code is designed to always respect the +MaxClients, +MinSpareServers, +and MaxSpareServers settings. +Furthermore, it respects StartServers +in the following manner: if after one second fewer than StartServers new +children have been created, then it creates enough to pick up the slack. +In other words, the code tries both to maintain the number of children +appropriate for the current load on the server and to respect your wishes +with the StartServers parameter. + +

Users of the +status module +will notice that the server statistics +are not set to zero when a USR1 is sent. The code +was written to both minimize the time in which the server is unable to serve +new requests (they will be queued up by the operating system, so they're +not lost in any event) and to respect your tuning parameters. In order +to do this it has to keep the scoreboard used to keep track +of all children across generations. + +

The status module will also use a G to indicate those +children which are still serving requests started before the graceful +restart was given. + +

At present there is no way for a log rotation script using +USR1 to know for certain that all children writing the +pre-restart log have finished. We suggest that you use a suitable delay +after sending the USR1 signal before you do anything with the +old log. For example, if most of your hits take less than 10 minutes to +complete for users on low-bandwidth links, then you could wait 15 minutes +before doing anything with the old log. + +

Appendix: signals and race conditions

+ +

Prior to Apache 1.2b9 there were several race conditions +involving the restart and die signals (a race condition is a +time-sensitive problem: if something happens at just +the wrong time, the server won't behave as expected). For those architectures that +have the "right" feature set we have eliminated as many as we can. +However, race conditions still exist on +certain architectures. + +

Architectures that use an on disk ScoreBoardFile have the potential +to lose track of a child during graceful restart (you'll see an ErrorLog message saying something about +a long lost child). The ScoreBoardFile directive explains how +to figure out if your server uses a file, and possibly how to avoid it. +There is also the potential that the scoreboard will be corrupted during +any signalling, but this only has bad effects on graceful restart. + +

NEXT and MACHTEN have small race conditions +which can cause a restart/die signal to be lost, but should not cause the +server to do anything otherwise problematic. + + +

All architectures have a small race condition in each child involving +the second and subsequent requests on a persistent HTTP connection +(KeepAlive). A child may exit after reading the request line but before +reading any of the request headers. There is a fix that was discovered +too late to make 1.2. In theory this isn't an issue, since a KeepAlive +client has to expect these events because of network latencies and +server timeouts. In practice it doesn't seem to affect anything either +-- in a test case the server was restarted twenty times per second and +clients successfully browsed the site without getting broken images or +empty documents. + + + +