From 13fe35dbb9d7ccc5febd73daac679bbc7fb75cb1 Mon Sep 17 00:00:00 2001
From: Remi Gacogne
Date: Fri, 26 Feb 2016 17:16:28 +0100
Subject: [PATCH] dnsdist: maxOustanding defaults to 10240. Add 'tuning' to README

---
 pdns/README-dnsdist.md | 87 ++++++++++++++++++++++++++++++++++++++++--
 pdns/dnsdist.cc        |  4 +-
 2 files changed, 85 insertions(+), 6 deletions(-)

diff --git a/pdns/README-dnsdist.md b/pdns/README-dnsdist.md
index fed94ca27..4a77b60fc 100644
--- a/pdns/README-dnsdist.md
+++ b/pdns/README-dnsdist.md
@@ -10,7 +10,7 @@ interface.
 
 Compiling
 ---------
-`dnsdist` depends on boost, Lua or luajit and a pretty recent C++
+`dnsdist` depends on boost, Lua or LuaJIT and a pretty recent C++
 compiler (g++ 4.8 or higher, clang 3.5 or higher). It can optionally use
 libsodium for encrypted communications with its client.
 
@@ -351,6 +351,7 @@ A DNS rule can be:
  * a TCPRule
 
 Some specific actions do not stop the processing when they match, contrary to all other actions:
+
  * Delay
  * Disable Validation
  * Log
@@ -447,6 +448,7 @@ end
 ```
 
 Valid return values for `LuaAction` functions are:
+
  * DNSAction.Allow: let the query pass, skipping other rules
  * DNSAction.Delay: delay the response for the specified milliseconds (UDP-only), continue to the next rule
  * DNSAction.Drop: drop the query
@@ -704,7 +706,6 @@ fe80::/10
 
 Caching
 -------
-
 `dnsdist` implements a simple but effective packet cache, not enabled by default.
 It is enabled per-pool, but the same cache can be shared between several pools.
 The first step is to define a cache, then to assign that cache to the chosen pool,
@@ -721,6 +722,86 @@ and the last one, optional too, is the minimum TTL an entry should have to be co
 for insertion in the cache.
 
 
+Performance tuning
+------------------
+First, a few words about `dnsdist` architecture:
+
+ * Each local bind has its own thread listening for incoming UDP queries
+ * and its own thread listening for incoming TCP connections,
+ dispatching them right away to a pool of threads
+ * Each backend has its own thread listening for UDP responses
+ * A maintenance thread calls the `maintenance()` Lua function, if defined,
+ every second, and is responsible for cleaning the cache
+ * A health check thread checks the backends' availability
+ * A control thread handles console connections
+ * A carbon thread exports statistics to a carbon server if needed
+ * One or more webserver threads handle queries to the internal webserver
+
+The maximum number of threads in the TCP pool is controlled by the
+`setMaxTCPClientThreads()` directive, and defaults to 10. This number can be
+increased to handle a large number of simultaneous TCP connections.
+
+When dispatching UDP queries to backend servers, `dnsdist` keeps track of at
+most `n` outstanding queries for each backend. This number `n` can be tuned by
+the `setMaxUDPOutstanding()` directive, defaulting to 10240, with a maximum
+value of 65535. Large installations are advised to increase the default value
+at the cost of a slightly increased memory usage.
+
+Most of the query processing is done in C++ for maximum performance,
+but some operations are executed in Lua for maximum flexibility:
+
+ * the `blockfilter()` function
+ * rules added by `addLuaAction()`
+ * server selection policies defined via `setServerPolicyLua()` or `newServerPolicy()`
+
+While Lua is fast, its use should be restricted to what is strictly necessary
+in order to achieve maximum performance. It might also be worth considering
+LuaJIT instead of Lua. When Lua inspection is needed, the best course of action
+is to restrict the queries sent to Lua inspection by using `addLuaAction()`
+instead of inspecting all queries in the `blockfilter()` function.
+
+`dnsdist` design choices mean that the processing of UDP queries is done by only
+one thread per local bind. This is great to keep lock contention to a low level,
+but might not be optimal for setups using a lot of processing power, caused for
+example by a large number of complicated rules. To be able to use more CPU cores
+for UDP query processing, it is possible to use the `reuseport` parameter of
+the `addLocal()` and `setLocal()` directives to be able to add several identical
+local binds to `dnsdist`:
+
+```
+addLocal("192.0.2.1:53", true, true)
+addLocal("192.0.2.1:53", true, true)
+addLocal("192.0.2.1:53", true, true)
+addLocal("192.0.2.1:53", true, true)
+```
+
+`dnsdist` will then add four identical local binds as if they were different IPs
+or ports, start four threads to handle incoming queries and let the kernel load
+balance those randomly to the threads, thus using four CPU cores for rules
+processing. Note that this requires SO_REUSEPORT support in the underlying
+operating system (added for example in Linux 3.9).
+Please also be aware that doing so will increase lock contention and might
+therefore not scale linearly. This is especially true for Lua-intensive setups,
+because Lua processing in `dnsdist` is serialized by a unique lock for all
+threads.
+
+Another possibility is to use the reuseport option to run several `dnsdist`
+processes in parallel on the same host, thus avoiding the lock contention issue
+at the cost of having to deal with the fact that the different processes will
+not share information, such as statistics or DDoS offenders.
+
+The UDP threads handling the responses from the backends do not use a lot of CPU,
+but if needed it is also possible to add the same backend several times to the
+`dnsdist` configuration to distribute the load over several responder threads.
+
+```
+newServer({address="192.0.2.127:53", name="Backend1"})
+newServer({address="192.0.2.127:53", name="Backend2"})
+newServer({address="192.0.2.127:53", name="Backend3"})
+newServer({address="192.0.2.127:53", name="Backend4"})
+```
+
+
 Carbon/Graphite/Metronome
 -------------------------
 To emit metrics to Graphite, or any other software supporting the Carbon protocol, use:
@@ -1005,7 +1086,7 @@ instantiate a server with additional parameters
     * `setTCPRecvTimeout(n)`: set the read timeout on TCP connections from the client, in seconds
     * `setTCPSendTimeout(n)`: set the write timeout on TCP connections from the client, in seconds
     * `setMaxTCPClientThreads(n)`: set the maximum of TCP client threads, handling TCP connections
-    * `setMaxUDPOutstanding(n)`: set the maximum number of outstanding UDP queries to a given backend server. This can only be set at configuration time
+    * `setMaxUDPOutstanding(n)`: set the maximum number of outstanding UDP queries to a given backend server. This can only be set at configuration time and defaults to 10240
     * `setCacheCleaningDelay(n)`: set the interval in seconds between two runs of the cache cleaning algorithm, removing expired entries
  * DNSCrypt related:
     * `addDNSCryptBind("127.0.0.1:8443", "provider name", "/path/to/resolver.cert", "/path/to/resolver.key", [false]):` listen to incoming DNSCrypt queries on 127.0.0.1 port 8443, with a provider name of "provider name", using a resolver certificate and associated key stored respectively in the `resolver.cert` and `resolver.key` files. The last optional parameter sets SO_REUSEPORT when available
diff --git a/pdns/dnsdist.cc b/pdns/dnsdist.cc
index c81d6373d..8ceee8d66 100644
--- a/pdns/dnsdist.cc
+++ b/pdns/dnsdist.cc
@@ -59,7 +59,7 @@ using std::thread;
 bool g_verbose;
 
 struct DNSDistStats g_stats;
-uint16_t g_maxOutstanding;
+uint16_t g_maxOutstanding{10240};
 bool g_console;
 bool g_verboseHealthChecks{false};
 
@@ -1309,8 +1309,6 @@ try
       g_cmdLine.remotes.push_back(*p);
   }
 
-  g_maxOutstanding = 1024;
-
   ServerPolicy leastOutstandingPol{"leastOutstanding", leastOutstanding};
 
   g_policy.setState(leastOutstandingPol);
-- 
2.40.0