Add more information on the FIN_WAIT_2 problem.

author Marc Slemko <marc@apache.org>

Tue, 28 Jan 1997 04:23:08 +0000 (04:23 +0000)

committer Marc Slemko <marc@apache.org>

Tue, 28 Jan 1997 04:23:08 +0000 (04:23 +0000)
diff --git a/docs/manual/misc/fin_wait_2.html b/docs/manual/misc/fin_wait_2.html
new file mode 100644
index 0000000..d4aa20e
--- /dev/null
+++ b/docs/manual/misc/fin_wait_2.html
@@ -0,0 +1,268 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
+<HTML>
+<HEAD>
+<TITLE>Connections in FIN_WAIT_2 and Apache</TITLE>
+<LINK REV="made" HREF="mailto:marc@apache.org">
+
+</HEAD>
+
+<BODY>
+<!--#include virtual="header.html" -->
+
+<H1>Connections in the FIN_WAIT_2 state and Apache</H1>
+<OL>
+<H2><LI>What is the FIN_WAIT_2 state?</H2>
+Starting with the Apache 1.2 betas, people are reporting many more
+connections in the FIN_WAIT_2 state (as reported by
+<code>netstat</code>) than they saw using older versions.  When the
+server closes a TCP connection, it sends a packet with the FIN bit
+set to the client, which then responds with a packet with the ACK bit
+set.  The client then sends a packet with the FIN bit set to the
+server, which responds with an ACK and the connection is closed.  The
+state that the connection is in during the period between when the
+server gets the ACK from the client and the server gets the FIN from
+the client is known as FIN_WAIT_2.  See the <A
+HREF="ftp://ds.internic.net/rfc/rfc793.txt">TCP RFC</A> for the 
+technical details of the state transitions.<P>
+
+The FIN_WAIT_2 state is somewhat unusual in that there is no timeout 
+defined in the standard for it.  This means that on many operating
+systems, a connection in the FIN_WAIT_2 state will stay around until
+the system is rebooted.  If the system does not have a timeout and
+too many FIN_WAIT_2 connections build up, it can fill up the space
+allocated for storing information about the connections and crash
+the kernel.  The connections in FIN_WAIT_2 do not tie up a httpd
+process.<P>
+
+<H2><LI>But why does it happen?</H2>
+
+There are several reasons for it happening, and not all of them are
+fully understood by the Apache team yet.  What is known follows.<P>
+
+<H3>Buggy clients and persistent connections</H3>
+
+Several clients have a bug which pops up when dealing with
+<A HREF="../keepalive.html">persistent connections</A> (aka keepalives).
+When the connection is idle and the server closes the connection
+(based on the <A HREF="../mod/core.html#keepalivetimeout">
+KeepAliveTimeout</A>), the client is programmed so that the client does
+not send back a FIN and ACK to the server.  This means that the
+connection stays in the FIN_WAIT_2 state until one of the following
+happens:<P>
+<UL>
+       <LI>The buggy client opens a new connection to the same or a different
+           site, which causes it to fully close the old connection.
+       <LI>The user exits the client which, on some (most?) clients,
+           causes the OS to fully shut down the connection.
+       <LI>The FIN_WAIT_2 times out, on servers that have a timeout
+           for this state.
+</UL><P>
+If you are lucky, this means that the buggy client will fully close the
+connection and release the resources on your server.  However, there
+are many cases, such as a dialup user disconnecting from their
+provider before exiting the client, in which the connection remains open.
+<STRONG>This is a bug in the browser.</STRONG>  <P>
+
+The clients on which this problem has been verified to exist:<P>
+<UL>
+       <LI>Mozilla/3.01 (X11; I; FreeBSD 2.1.5-RELEASE i386)
+       <LI>Mozilla/2.02 (X11; I; FreeBSD 2.1.5-RELEASE i386)
+       <LI>Mozilla/3.01Gold (X11; I; SunOS 5.5 sun4m)
+       <LI>MSIE 3.01 on the Macintosh
+</UL><P>
+
+It is expected that many other clients have the same problem.<P>
+
+Apache can <STRONG>NOT</STRONG> do anything to avoid this other
+than disabling persistent connections for all buggy clients, just
+like we recommend doing for Navigator 2.x clients due to other bugs
+in Navigator 2.x.  As far as we know, this happens with all servers
+that support persistent connections including Apache 1.1.x and
+1.2.<P>
+
+<H3>Something is broken</H3>
+
+While the above bug is a problem, it is not the whole problem.
+There is some other problem involved; some people do not have any
+serious problems on 1.1.x, but with 1.2 enough connections build
+up in the FIN_WAIT_2 state to crash their server.  This is due to
+a function called <CODE>lingering_close()</CODE> which was added
+between 1.1 and 1.2.  This function is necessary for the proper
+handling of PUTs and POSTs to the server as well as persistent
+connections.  What it does is read any data sent by the client for
+a certain time after the server closes the connection.  The exact
+reasons for doing this are somewhat complicated but involve what
+happens if the client is making a request at the same time the
+server closes the connection; without it, the client would get an
+error.  With it the client just gets the closed connection and
+knows to retry.  See the <A HREF="#appendix">appendix</A> for more
+details.<P>
+
+We have not yet tracked down the exact reason why
+<CODE>lingering_close()</CODE> causes problems.  Its code has been
+thoroughly reviewed.  It is possible there is some problem in the BSD
+TCP stack which is causing this.  Unfortunately, we are not able to
+easily replicate the problem on test servers so it is difficult to
+debug.  We are still working on the problem.  <P>
+
+<H2><LI>What can I do about it?</H2>
+
+There are several possible workarounds to the problem, some of
+which work better than others.<P>
+<H3>Add a timeout for FIN_WAIT_2</H3>
+The obvious workaround is to simply have a timeout for the FIN_WAIT_2
+state.  This is not specified by the RFC and could be claimed to be a
+violation of the RFC; however, it is becoming necessary in many cases.
+The following systems are known to have a timeout:
+<P>
+<UL>
+       <LI><A HREF="http://www.freebsd.org/">FreeBSD</A> versions starting at 2.0 or possibly earlier.
+       <LI><A HREF="http://www.netbsd.org/">NetBSD</A> version 1.2(?)
+       <LI><A HREF="http://www.openbsd.org/">OpenBSD</A> all versions(?)
+       <LI><A HREF="http://www.bsdi.com/">BSD/OS</A> 2.1, with the 
+           <A HREF="ftp://ftp.bsdi.com/bsdi/patches/patches-2.1/K210-027">
+           K210-027</A> patch installed.  
+       <LI><A HREF="http://www.sun.com/">Solaris</A> as of around version
+           2.2.  The timeout may need tuning by using <CODE>ndd</CODE> to 
+           modify <CODE>tcp_fin_wait_2_flush_interval</CODE>.
+       <LI><A HREF="http://www.sco.com/">SCO TCP/IP Release 1.2.1</A>
+           can be modified to have a timeout by following
+           <A HREF="http://www.sco.com/cgi-bin/waisgate?WAISdocID=2242622956+0+0+0&WAISaction=retrieve"> SCO's instructions</A>.
+       <LI><A HREF="http://www.linux.org/">Linux</A> 2.0.x and
+           earlier(?)
+</UL>
+<P>
+The following systems are known to not have a timeout:
+<P>
+<UL>
+       <LI><A HREF="http://www.sun.com/">SunOS 4.x</A> does not and
+           almost certainly never will have one because it is at the
+           very end of its development cycle for Sun.  If you have kernel
+           source, it should be easy to patch.
+       <LI><A HREF="http://www.sgi.com/">IRIX</A> does not have a
+           timeout and, according to our information, SGI has stated that
+           they will not add one unless it is specified in the RFC.
+</UL>
+<P>
+There is a 
+<A HREF="http://www.apache.org/dist/contrib/patches/1.2/fin_wait_2.patch">
+patch available</A> for adding a timeout to the FIN_WAIT_2 state; it
+was originally intended for BSD/OS, but should be adaptable to most
+systems using BSD networking code.  You need kernel source code to be
+able to use it.  If you do adapt it to work for any other systems,
+please drop me a note at <A HREF="mailto:marc@apache.org">marc@apache.org</A>.
+<P>
+<H3>Compile without using <CODE>lingering_close()</CODE></H3>
+
+It is possible to compile Apache 1.2 without using the
+<CODE>lingering_close()</CODE> function.  This will result in that
+section of code being similar to that which was in 1.1.  If you do
+this, be aware that it can cause problems with PUTs, POSTs and
+persistent connections, especially if the client uses pipelining.  
+That said, it is no worse than on 1.1 and I assume that keeping your 
+server running is quite important.<P>
+
+To compile without the <CODE>lingering_close()</CODE> function, add
+<CODE>-DNO_LINGCLOSE</CODE> to the end of the
+<CODE>EXTRA_CFLAGS</CODE> line in your <CODE>Configuration</CODE> file,
+rerun <CODE>Configure</CODE> and rebuild the server.
+<P>
+<H3>Use <CODE>SO_LINGER</CODE> as an alternative to
+<CODE>lingering_close()</CODE></H3>
+
+On most systems, there is an option called <CODE>SO_LINGER</CODE> that
+can be set with <CODE>setsockopt(2)</CODE>.  It does something very
+similar to <CODE>lingering_close()</CODE>, except that it is broken
+on many systems so that it causes far more problems than
+<CODE>lingering_close()</CODE>.  On some systems, it could possibly work
+better so it may be worth a try if you have no other alternatives. <P>
+
+To try it, add <CODE>-DUSE_SO_LINGER</CODE>  to the end of the
+<CODE>EXTRA_CFLAGS</CODE> line in your <CODE>Configuration</CODE>
+file, rerun <CODE>Configure</CODE> and rebuild the server.  <P>
+
+<STRONG>NOTE:</STRONG> Attempting to use <CODE>SO_LINGER</CODE> and
+<CODE>lingering_close()</CODE> at the same time is very likely to do
+very bad things, so don't.<P>
+
+<H3>Increase the amount of memory used for storing connection state</H3>
+<DL>
+<DT>BSD based networking code: <DD>BSD stores network data such as connection
+states in something called a mbuf.  When you get so many connections
+that the kernel does not have enough mbufs to put them all in, your
+kernel will likely crash.  You can reduce the effects of the problem
+by increasing the number of mbufs that are available; this will not
+prevent the problem, it will just make the server go longer before
+crashing.<P>
+
+The exact way to increase them may depend on your OS; look
+for some reference to the number of "mbufs" or "mbuf clusters".  On
+many systems, this can be done by adding the line 
+<CODE>NMBCLUSTERS="n"</CODE>, where <CODE>n</CODE> is the number of 
+mbuf clusters you want, to your kernel config file and rebuilding your 
+kernel.<P>
+</DL>
+<H2><LI>Feedback</H2>
+
+If you have any information to add to this page, please contact me at
+<A HREF="mailto:marc@apache.org">marc@apache.org</A>.<P>
+
+<H2><A NAME="appendix"><LI>Appendix</H2>
+<P>
+Below is a message from Roy Fielding that details some of the
+reasons why some type of function that has the functionality of
+<CODE>lingering_close()</CODE> is necessary.
+
+<PRE>
+Date: Tue, 21 Jan 1997 01:15:38 -0800
+From: "Roy T. Fielding" &lt;fielding@liege.ICS.UCI.EDU&gt;
+Subject: Re: lingering_close() 
+
+Sorry, I thought everyone was up to speed on this problem (and I just
+managed to catch up on my apache mail, finally).  This is noted a couple
+times in the HTTP specs, but most of the discussion was between myself,
+Henrik, rst, and Dave Raggett in the hallways of MIT (which is why it
+doesn't appear in our archives).
+
+If a server closes the input side of the connection while the client
+is sending data (or is planning to send data), then the server's TCP
+stack will signal an RST (reset, not Robert) back to the client.  Upon
+receipt of the RST, the client will flush its own incoming TCP buffer
+back to the un-ACKed packet indicated by the RST packet argument.
+If the server has sent a message, usually an error response, to the
+client just before the close, and the client receives the RST packet
+before its application code has read the error message from its incoming
+TCP buffer, then the RST will flush the error message before the client
+application has a chance to see it, and thus the client is left thinking
+that the connection failed for no apparent reason.
+
+There are two conditions under which this is likely to occur:
+  1) sending POST or PUT data without proper authorization
+  2) sending multiple requests before each response (pipelining) 
+     and one of the middle requests resulting in an error or
+     other break-the-connection result.
+
+The solution in all cases is to send the response, close only the
+write half of the connection (what shutdown is supposed to do), and
+continue reading on the socket until it is either closed by the
+client (signifying it has finally read the response) or a timeout occurs.
+That is what the kernel is supposed to do if SO_LINGER is set.
+Unfortunately, SO_LINGER has no effect on some systems; on some other
+systems, it does not have its own timeout and thus the TCP memory
+segments just pile-up until the next reboot (planned or not).
+
+That is why rst coded-up a linger replacement.  As I recall, he said at
+the time that it needed further testing, which we never got around to
+doing.  From the descriptions I have read, it sounds like the lingering
+close code is doing something wrong when it is timed-out, since that
+is what happens if a client does not close its connection.
+
+Please note that simply removing the linger code will not solve the
+problem -- it only moves it to a different and much harder to detect one.
+
+.....Roy
+</PRE>
+</OL>
+<!--#include virtual="footer.html" -->
+</BODY>
+</HTML>
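
Note on the appendix in the new page: the approach Roy Fielding describes (send the response, close only the write half, then keep reading until the client closes or a timeout occurs) can be sketched roughly as below. This is an illustrative sketch only, not the lingering_close() code from Apache 1.2. The function name linger_then_close, the 512-byte scratch buffer, and the per-wait timeout handling are all assumptions made for the example.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not Apache's lingering_close(): send our FIN by
 * closing only the write half, then drain whatever the peer still sends
 * until it closes its side or timeout_sec passes without any data.
 * Returns 0 if the peer closed first, 1 if we gave up on a timeout,
 * and -1 on a socket error.  The descriptor is always closed on return.
 */
int linger_then_close(int fd, int timeout_sec)
{
    char scratch[512];
    int result = 1;

    /* Half-close: the peer sees EOF, but we can still read from it. */
    if (shutdown(fd, SHUT_WR) < 0) {
        close(fd);
        return -1;
    }

    for (;;) {
        fd_set readable;
        struct timeval tv;
        int ready;
        ssize_t n;

        FD_ZERO(&readable);
        FD_SET(fd, &readable);
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;

        ready = select(fd + 1, &readable, NULL, NULL, &tv);
        if (ready < 0 && errno == EINTR)
            continue;                           /* interrupted, wait again */
        if (ready < 0) { result = -1; break; }  /* select() failed */
        if (ready == 0) { result = 1; break; }  /* idle for timeout_sec */

        n = read(fd, scratch, sizeof scratch);
        if (n == 0) { result = 0; break; }      /* peer sent its FIN */
        if (n < 0 && errno != EINTR) { result = -1; break; }
        /* n > 0: late request data, discarded on purpose */
    }

    close(fd);
    return result;
}
```

In this sketch the timeout restarts after every read, so a client that keeps trickling data could hold the socket open indefinitely; a real server would presumably also cap the total time spent lingering.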
diff --git a/docs/manual/platform/perf-bsd44.html b/docs/manual/platform/perf-bsd44.html
index c22c982326cf2ed8ec2df242c57510f51d30c58d..495579eca3ebd4f66754ee6add5343a29b10e4b5 100644
--- a/docs/manual/platform/perf-bsd44.html
+++ b/docs/manual/platform/perf-bsd44.html
@@ -126,9 +126,8 @@ connection ends up in the TIME_WAIT state for several minutes, during
  which time its mbufs are not yet freed. Another reason is that, on server
  timeouts, some connections end up in FIN_WAIT_2 state forever, because
  this state doesn't time out on the server, and the browser never sent
-a final FIN. An example patch for BSDI is available
-<a href="http://www.apache.org/dist/contrib/patches/1.2/fin_wait_2.patch">
-here</a>.
+a final FIN.  For more details see the 
+<A HREF="fin_wait_2.html">FIN_WAIT_2</A> page. 
  
  <p>