
   1 <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
   2 <HTML>
   3 <HEAD>
   4 <TITLE>Connections in FIN_WAIT_2 and Apache</TITLE>
   5 <LINK REV="made" HREF="mailto:marc@apache.org">
   6
   7 </HEAD>
   8
   9 <BODY>
  10 <!--#include virtual="header.html" -->
  11
  12 <H1>Connections in the FIN_WAIT_2 state and Apache</H1>
  13 <OL>
  14 <LI><H2>What is the FIN_WAIT_2 state?</H2>
  15 Starting with the Apache 1.2 betas, people are reporting many more
  16 connections in the FIN_WAIT_2 state (as reported by
  17 <code>netstat</code>) than they saw using older versions.  When the
server closes a TCP connection, it sends a packet with the FIN bit
set to the client, which then responds with a packet with the ACK bit
  20 set.  The client then sends a packet with the FIN bit set to the
  21 server, which responds with an ACK and the connection is closed.  The
  22 state that the connection is in during the period between when the
  23 server gets the ACK from the client and the server gets the FIN from
  24 the client is known as FIN_WAIT_2.  See the <A
  25 HREF="ftp://ds.internic.net/rfc/rfc793.txt">TCP RFC</A> for the
  26 technical details of the state transitions.<P>
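The close sequence described above can be sketched as a small state
machine for the side that closes first (here, the server).  This is only
an illustration: the names <CODE>next_state()</CODE>,
<CODE>APP_CLOSE</CODE>, <CODE>RECV_ACK_OF_FIN</CODE> and
<CODE>RECV_FIN</CODE> are made up for this example, the simultaneous-close
path through CLOSING is left out, and the final state is shown as
TIME_WAIT, which RFC 793 places between FIN_WAIT_2 and the fully closed
connection.<P>

```c
/* States visited by the side that closes first (here, the server). */
enum tcp_state { ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT };

/* Hypothetical event names for this sketch; they are not taken from any TCP stack. */
enum tcp_event { APP_CLOSE, RECV_ACK_OF_FIN, RECV_FIN };

/*
 * Return the state that follows `s` when event `e` arrives.
 * An event that does not apply in the current state leaves it unchanged,
 * which is exactly how a connection gets stuck in FIN_WAIT_2: the
 * client's FIN never arrives, so nothing moves the state forward.
 */
enum tcp_state next_state(enum tcp_state s, enum tcp_event e)
{
    switch (s) {
    case ESTABLISHED:
        return e == APP_CLOSE ? FIN_WAIT_1 : s;        /* server sends its FIN */
    case FIN_WAIT_1:
        return e == RECV_ACK_OF_FIN ? FIN_WAIT_2 : s;  /* client ACKs that FIN */
    case FIN_WAIT_2:
        return e == RECV_FIN ? TIME_WAIT : s;          /* client sends its FIN, server ACKs */
    default:
        return s;
    }
}
```

Feeding the events in the order <CODE>APP_CLOSE</CODE>,
<CODE>RECV_ACK_OF_FIN</CODE>, <CODE>RECV_FIN</CODE> walks the connection
through FIN_WAIT_1 and FIN_WAIT_2; if the last event never arrives, the
sketch stays in FIN_WAIT_2.<P>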
  27
  28 The FIN_WAIT_2 state is somewhat unusual in that there is no timeout
  29 defined in the standard for it.  This means that on many operating
  30 systems, a connection in the FIN_WAIT_2 state will stay around until
  31 the system is rebooted.  If the system does not have a timeout and
  32 too many FIN_WAIT_2 connections build up, it can fill up the space
  33 allocated for storing information about the connections and crash
  34 the kernel.  The connections in FIN_WAIT_2 do not tie up an httpd
  35 process.<P>
  36
  37 <LI><H2>But why does it happen?</H2>
  38
  39 There are several reasons for it happening, and not all of them are
  40 fully understood by the Apache team yet.  What is known follows.<P>
  41
  42 <H3>Buggy clients and persistent connections</H3>
  43
  44 Several clients have a bug which pops up when dealing with
  45 <A HREF="../keepalive.html">persistent connections</A> (aka keepalives).
When the connection is idle and the server closes the connection
(based on the <A HREF="../mod/core.html#keepalivetimeout">
KeepAliveTimeout</A>), these clients do not send a FIN and ACK
back to the server.  This means that the
  50 connection stays in the FIN_WAIT_2 state until one of the following
  51 happens:<P>
  52 <UL>
  53         <LI>The client opens a new connection to the same or a different
  54             site, which causes it to fully close the older connection on
  55             that socket.
  56         <LI>The user exits the client, which on some (most?) clients
            causes the OS to fully shut down the connection.
  58         <LI>The FIN_WAIT_2 times out, on servers that have a timeout
  59             for this state.
  60 </UL><P>
  61 If you are lucky, this means that the buggy client will fully close the
  62 connection and release the resources on your server.  However, there
  63 are some cases where the socket is never fully closed, such as a dialup
  64 client disconnecting from their provider before closing the client.
  65 In addition, a client might sit idle for days without making another
  66 connection, and thus may hold its end of the socket open for days
  67 even though it has no further use for it.
  68 <STRONG>This is a bug in the browser or in its operating system's
  69 TCP implementation.</STRONG>  <P>
  70
  71 The clients on which this problem has been verified to exist:<P>
  72 <UL>
  73         <LI>Mozilla/3.01 (X11; I; FreeBSD 2.1.5-RELEASE i386)
  74         <LI>Mozilla/2.02 (X11; I; FreeBSD 2.1.5-RELEASE i386)
  75         <LI>Mozilla/3.01Gold (X11; I; SunOS 5.5 sun4m)
  76         <LI>MSIE 3.01 on the Macintosh
  77         <LI>MSIE 3.01 on Windows 95
  78 </UL><P>
  79
  80 This does not appear to be a problem on:
  81 <UL>
  82         <LI>Mozilla/3.01 (Win95; I)
  83 </UL>
  84 <P>
  85
  86 It is expected that many other clients have the same problem. What a
  87 client <STRONG>should do</STRONG> is periodically check its open
socket(s) to see if they have been closed by the server, and close its
side of the connection if the server has closed.  This check need only
occur once every few seconds, and may even be detected by an OS signal
  91 on some systems (e.g., Win95 and NT clients have this capability, but
  92 they seem to be ignoring it).<P>
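As a rough sketch of the periodic check described above, a client on a
POSIX-style system could poll each idle socket with a zero timeout and
peek at it; a read of zero bytes means the server has already sent its
FIN.  The function names <CODE>peer_has_closed()</CODE> and
<CODE>close_if_peer_closed()</CODE> are invented for this example and do
not come from any browser or from Apache.<P>

```c
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Return 1 if the peer has closed its sending side, 0 if the connection
 * still looks open, and -1 on error.  The call does not block.
 * If unread data is still queued, this reports 0 until that data is read.
 */
int peer_has_closed(int fd)
{
    struct pollfd pfd;
    char byte;
    int ready;
    ssize_t n;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    ready = poll(&pfd, 1, 0);          /* zero timeout: just sample the socket */
    if (ready < 0)
        return -1;
    if (ready == 0)
        return 0;                      /* nothing pending, still open */

    n = recv(fd, &byte, 1, MSG_PEEK);  /* look without consuming any data */
    if (n == 0)
        return 1;                      /* end of stream: the peer sent its FIN */
    return n < 0 ? -1 : 0;
}

/* Close our side if the peer has closed; return 1 if we closed, else 0. */
int close_if_peer_closed(int fd)
{
    if (peer_has_closed(fd) == 1) {
        close(fd);
        return 1;
    }
    return 0;
}
```

A client would call <CODE>close_if_peer_closed()</CODE> on each idle
persistent connection every few seconds; closing the socket is what
sends the FIN that lets the server leave FIN_WAIT_2.<P>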
  93
  94 Apache <STRONG>cannot</STRONG> avoid these FIN_WAIT_2 states unless it
  95 disables persistent connections for the buggy clients, just
  96 like we recommend doing for Navigator 2.x clients due to other bugs.
  97 However, non-persistent connections increase the total number of
  98 connections needed per client and slow retrieval of an image-laden
web page.  Since non-persistent connections have their own resource
consumption and a short waiting period after each closure, a busy server
 101 may need persistence in order to best serve its clients.<P>
 102
 103 As far as we know, the client-caused FIN_WAIT_2 problem is present for
 104 all servers that support persistent connections, including Apache 1.1.x
 105 and 1.2.<P>
 106
 107 <H3>Something in Apache may be broken</H3>
 108
 109 While the above bug is a problem, it is not the whole problem.
 110 Some users have observed no FIN_WAIT_2 problems with Apache 1.1.x,
 111 but with 1.2b enough connections build up in the FIN_WAIT_2 state to
 112 crash their server.  We have not yet identified why this would occur
 113 and welcome additional test input.<P>
 114
 115 One possible (and most likely) source for additional FIN_WAIT_2 states
 116 is a function called <CODE>lingering_close()</CODE> which was added
 117 between 1.1 and 1.2.  This function is necessary for the proper
 118 handling of persistent connections and any request which includes
 119 content in the message body (e.g., PUTs and POSTs).
 120 What it does is read any data sent by the client for
 121 a certain time after the server closes the connection.  The exact
 122 reasons for doing this are somewhat complicated, but involve what
 123 happens if the client is making a request at the same time the
 124 server sends a response and closes the connection. Without lingering,
 125 the client might be forced to reset its TCP input buffer before it
 126 has a chance to read the server's response, and thus understand why
 127 the connection has closed.
 128 See the <A HREF="#appendix">appendix</A> for more details.<P>
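The general idea of lingering on a socket can be sketched as follows.
This is <STRONG>not</STRONG> the code of
<CODE>lingering_close()</CODE> in Apache; the function name
<CODE>linger_then_close()</CODE>, its return values and its timeout
handling are assumptions made for illustration.  It closes only the
write half, keeps reading and discarding whatever the client sends, and
finally closes the socket once the client closes or the wait times
out.<P>

```c
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: half-close, drain, then close.
 * Returns 0 if the client closed its side first, 1 if the wait timed
 * out, and -1 on error.  timeout_ms bounds each wait for more data,
 * not the total time spent lingering.
 */
int linger_then_close(int fd, int timeout_ms)
{
    char buf[512];
    struct pollfd pfd;
    int result;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Send our FIN but keep the read half open; the result is ignored
     * because the socket is closed below in every case. */
    shutdown(fd, SHUT_WR);

    for (;;) {
        int ready = poll(&pfd, 1, timeout_ms);
        ssize_t n;

        if (ready < 0) { result = -1; break; }
        if (ready == 0) { result = 1; break; }   /* client stayed silent */

        n = recv(fd, buf, sizeof buf, 0);
        if (n == 0) { result = 0; break; }       /* client closed its side */
        if (n < 0) { result = -1; break; }
        /* n > 0: late request data from the client, read and discarded */
    }

    close(fd);
    return result;
}
```

Because the data is read rather than left unread at close time, the
client gets a chance to see the server's response before the connection
goes away.<P>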
 129
 130 We have not yet tracked down the exact reason why
 131 <CODE>lingering_close()</CODE> causes problems.  Its code has been
 132 thoroughly reviewed and extensively updated in 1.2b6.  It is possible
 133 that there is some problem in the BSD TCP stack which is causing the
 134 observed problems.  It is also possible that we fixed it in 1.2b6.
 135 Unfortunately, we have not been able to replicate the problem on our
 136 test servers.<P>
 137
<LI><H2>What can I do about it?</H2>
 139
 140 There are several possible workarounds to the problem, some of
 141 which work better than others.<P>
 142
 143 <H3>Add a timeout for FIN_WAIT_2</H3>
 144
 145 The obvious workaround is to simply have a timeout for the FIN_WAIT_2 state.
 146 This is not specified by the RFC, and could be claimed to be a
 147 violation of the RFC, but it is widely recognized as being necessary.
 148 The following systems are known to have a timeout:
 149 <P>
 150 <UL>
 151         <LI><A HREF="http://www.freebsd.org/">FreeBSD</A> versions starting at 2.0 or possibly earlier.
 152         <LI><A HREF="http://www.netbsd.org/">NetBSD</A> version 1.2(?)
 153         <LI><A HREF="http://www.openbsd.org/">OpenBSD</A> all versions(?)
 154         <LI><A HREF="http://www.bsdi.com/">BSD/OS</A> 2.1, with the
 155             <A HREF="ftp://ftp.bsdi.com/bsdi/patches/patches-2.1/K210-027">
 156             K210-027</A> patch installed.
 157         <LI><A HREF="http://www.sun.com/">Solaris</A> as of around version
 158             2.2.  The timeout can be tuned by using <CODE>ndd</CODE> to
 159             modify <CODE>tcp_fin_wait_2_flush_interval</CODE>, but the
 160             default should be appropriate for most servers and improper
 161             tuning can have negative impacts.
 162         <LI><A HREF="http://www.sco.com/">SCO TCP/IP Release 1.2.1</A>
 163             can be modified to have a timeout by following
 164             <A HREF="http://www.sco.com/cgi-bin/waisgate?WAISdocID=2242622956+0+0+0&WAISaction=retrieve"> SCO's instructions</A>.
 165         <LI><A HREF="http://www.linux.org/">Linux</A> 2.0.x and
 166             earlier(?)
 167         <LI><A HREF="http://www.hp.com/">HP-UX</A> 10.x defaults to
 168             terminating connections in the FIN_WAIT_2 state after the
 169             normal keepalive timeouts.  This does not
 170             refer to the persistent connection or HTTP keepalive
 171             timeouts, but the <CODE>SO_LINGER</CODE> socket option
 172             which is enabled by Apache.  This parameter can be adjusted
 173             by using <CODE>nettune</CODE> to modify parameters such as
 174             <CODE>tcp_keepstart</CODE> and <CODE>tcp_keepstop</CODE>.
 175             In later revisions, there is an explicit timer for
 176             connections in FIN_WAIT_2 that can be modified; contact HP
 177             support for details.
 178         <LI><A HREF="http://www.sgi.com/">SGI IRIX</A> can be patched to
 179             support a timeout.  For IRIX 5.3, 6.2, and 6.3,
 180             use patches 1654, 1703 and 1778 respectively.  If you
 181             have trouble locating these patches, please contact your
 182             SGI support channel for help.
 183         <LI><A HREF="http://www.ncr.com/">NCR's MP RAS Unix</A> 2.xx and
 184             3.xx both have FIN_WAIT_2 timeouts.  In 2.xx it is non-tunable
 185             at 600 seconds, while in 3.xx it defaults to 600 seconds and
 186             is calculated based on the tunable "max keep alive probes"
 187             (default of 8) multiplied by the "keep alive interval" (default
 188             75 seconds).
        <LI><A HREF="http://www.sequent.com">Sequent's ptx/TCP/IP for
 190             DYNIX/ptx</A> has had a FIN_WAIT_2 timeout since around
 191             release 4.1 in mid-1994.
 192 </UL>
 193 <P>
 194 The following systems are known to not have a timeout:
 195 <P>
 196 <UL>
        <LI><A HREF="http://www.sun.com/">SunOS 4.x</A> does not and
            almost certainly never will have one because it is at the
            very end of its development cycle for Sun.  If you have kernel
            source, it should be easy to patch.
 201 </UL>
 202 <P>
 203 There is a
 204 <A HREF="http://www.apache.org/dist/contrib/patches/1.2/fin_wait_2.patch">
 205 patch available</A> for adding a timeout to the FIN_WAIT_2 state; it
 206 was originally intended for BSD/OS, but should be adaptable to most
 207 systems using BSD networking code.  You need kernel source code to be
 208 able to use it.  If you do adapt it to work for any other systems,
 209 please drop me a note at <A HREF="mailto:marc@apache.org">marc@apache.org</A>.
 210 <P>
 211 <H3>Compile without using <CODE>lingering_close()</CODE></H3>
 212
 213 It is possible to compile Apache 1.2 without using the
 214 <CODE>lingering_close()</CODE> function.  This will result in that
 215 section of code being similar to that which was in 1.1.  If you do
 216 this, be aware that it can cause problems with PUTs, POSTs and
 217 persistent connections, especially if the client uses pipelining.
 218 That said, it is no worse than on 1.1, and we understand that keeping your
 219 server running is quite important.<P>
 220
 221 To compile without the <CODE>lingering_close()</CODE> function, add
 222 <CODE>-DNO_LINGCLOSE</CODE> to the end of the
 223 <CODE>EXTRA_CFLAGS</CODE> line in your <CODE>Configuration</CODE> file,
 224 rerun <CODE>Configure</CODE> and rebuild the server.
 225 <P>
 226 <H3>Use <CODE>SO_LINGER</CODE> as an alternative to
 227 <CODE>lingering_close()</CODE></H3>
 228
 229 On most systems, there is an option called <CODE>SO_LINGER</CODE> that
 230 can be set with <CODE>setsockopt(2)</CODE>.  It does something very
 231 similar to <CODE>lingering_close()</CODE>, except that it is broken
 232 on many systems so that it causes far more problems than
<CODE>lingering_close()</CODE>.  On some systems, it could possibly work
 234 better so it may be worth a try if you have no other alternatives. <P>
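For reference, this is roughly how a program sets the option with
<CODE>setsockopt(2)</CODE>.  The wrapper name
<CODE>enable_so_linger()</CODE> is an assumption for this example, and
the sketch does not show how Apache itself sets the option; what the
kernel then does at <CODE>close()</CODE> differs between systems, as
noted above.<P>

```c
#include <sys/socket.h>

/*
 * Hypothetical wrapper: turn on SO_LINGER for `fd` with a limit of
 * `seconds`.  Returns 0 on success and -1 on error, like setsockopt().
 */
int enable_so_linger(int fd, int seconds)
{
    struct linger lg;

    lg.l_onoff = 1;          /* nonzero enables lingering on close() */
    lg.l_linger = seconds;   /* upper bound on how long to linger */

    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof lg);
}
```

The option has to be set before the socket is closed, since it only
changes what <CODE>close()</CODE> does.<P>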
 235
 236 To try it, add <CODE>-DUSE_SO_LINGER -DNO_LINGCLOSE</CODE>  to the end of the
 237 <CODE>EXTRA_CFLAGS</CODE> line in your <CODE>Configuration</CODE>
 238 file, rerun <CODE>Configure</CODE> and rebuild the server.  <P>
 239
 240 <STRONG>NOTE:</STRONG> Attempting to use <CODE>SO_LINGER</CODE> and
 241 <CODE>lingering_close()</CODE> at the same time is very likely to do
 242 very bad things, so don't.<P>
 243
 244 <H3>Increase the amount of memory used for storing connection state</H3>
 245 <DL>
 246 <DT>BSD based networking code:
 247 <DD>BSD stores network data, such as connection states,
 248 in something called an mbuf.  When you get so many connections
 249 that the kernel does not have enough mbufs to put them all in, your
 250 kernel will likely crash.  You can reduce the effects of the problem
 251 by increasing the number of mbufs that are available; this will not
 252 prevent the problem, it will just make the server go longer before
 253 crashing.<P>
 254
 255 The exact way to increase them may depend on your OS; look
 256 for some reference to the number of "mbufs" or "mbuf clusters".  On
many systems, this can be done by adding the line
<CODE>NMBCLUSTERS="n"</CODE> to your kernel config file, where
<CODE>n</CODE> is the number of mbuf clusters you want, and then
rebuilding your kernel.<P>
 261 </DL>
<LI><H2>Feedback</H2>
 263
 264 If you have any information to add to this page, please contact me at
 265 <A HREF="mailto:marc@apache.org">marc@apache.org</A>.<P>
 266
<LI><H2><A NAME="appendix">Appendix</A></H2>
 268 <P>
 269 Below is a message from Roy Fielding, one of the authors of HTTP/1.1.
 270
 271 <H3>Why the lingering close functionality is necessary with HTTP</H3>
 272
 273 The need for a server to linger on a socket after a close is noted a couple
 274 times in the HTTP specs, but not explained.  This explanation is based on
 275 discussions between myself, Henrik Frystyk, Robert S. Thau, Dave Raggett,
 276 and John C. Mallery in the hallways of MIT while I was at W3C.<P>
 277
 278 If a server closes the input side of the connection while the client
 279 is sending data (or is planning to send data), then the server's TCP
 280 stack will signal an RST (reset) back to the client.  Upon
 281 receipt of the RST, the client will flush its own incoming TCP buffer
 282 back to the un-ACKed packet indicated by the RST packet argument.
 283 If the server has sent a message, usually an error response, to the
 284 client just before the close, and the client receives the RST packet
 285 before its application code has read the error message from its incoming
 286 TCP buffer and before the server has received the ACK sent by the client
 287 upon receipt of that buffer, then the RST will flush the error message
 288 before the client application has a chance to see it. The result is
 289 that the client is left thinking that the connection failed for no
 290 apparent reason.<P>
 291
 292 There are two conditions under which this is likely to occur:
 293 <OL>
 294 <LI>sending POST or PUT data without proper authorization
 295 <LI>sending multiple requests before each response (pipelining)
 296     and one of the middle requests resulting in an error or
 297     other break-the-connection result.
 298 </OL>
 299 <P>
 300 The solution in all cases is to send the response, close only the
 301 write half of the connection (what shutdown is supposed to do), and
 302 continue reading on the socket until it is either closed by the
 303 client (signifying it has finally read the response) or a timeout occurs.
 304 That is what the kernel is supposed to do if SO_LINGER is set.
 305 Unfortunately, SO_LINGER has no effect on some systems; on some other
 306 systems, it does not have its own timeout and thus the TCP memory
segments just pile up until the next reboot (planned or not).<P>
 308
 309 Please note that simply removing the linger code will not solve the
 310 problem -- it only moves it to a different and much harder one to detect.
 311 </OL>
 312 <!--#include virtual="footer.html" -->
 313 </BODY>
 314 </HTML>