From: Remi Gacogne Date: Tue, 1 Dec 2015 13:24:26 +0000 (+0100) Subject: Gracefully handle a reused downstream TCP connection dying on us X-Git-Tag: dnsdist-1.0.0-alpha1~150^2 X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=8b92100f1f563aed5e6567b5257b5520bcb955a8;p=pdns Gracefully handle a reused downstream TCP connection dying on us In dnsdist, we try to reuse TCP connection to Downstream servers as much as possible. However, when sending the size of a new query, we didn't properly handle a connection being closed by the downstream server. Turns out, writing tests actually help finding bugs, who would have thought? --- diff --git a/pdns/dnsdist-tcp.cc b/pdns/dnsdist-tcp.cc index 0508deaa7..739ad5200 100644 --- a/pdns/dnsdist-tcp.cc +++ b/pdns/dnsdist-tcp.cc @@ -268,9 +268,18 @@ void* tcpClientThread(int pipefd) downstream_failures++; goto retry; } - - writen2WithTimeout(dsock, query, qlen, ds->tcpSendTimeout); - + + try { + writen2WithTimeout(dsock, query, qlen, ds->tcpSendTimeout); + } + catch(const runtime_error& e) { + vinfolog("Downstream connection to %s died on us, getting a new one!", ds->getName()); + close(dsock); + sockets[ds->remote]=dsock=setupTCPDownstream(ds->remote); + downstream_failures++; + goto retry; + } + if(!getNonBlockingMsgLen(dsock, &rlen, ds->tcpRecvTimeout)) { vinfolog("Downstream connection to %s died on us phase 2, getting a new one!", ds->getName()); close(dsock);