Azat Khuzhin [Sun, 21 Oct 2018 22:06:48 +0000 (01:06 +0300)]
Disable parallel jobs for the osx (due to CPU time deficit) in travis-ci
As you can see right now linux workers has zero failed tests, while osx
workers has 18 failed tests:
[bufferevent_connect_hostname_emfile FAILED]
[bufferevent_pair_release_lock FAILED]
[bufferevent_timeout FAILED]
[bufferevent_timeout_filter FAILED]
[bufferevent_timeout_pair FAILED]
[common_timeout FAILED]
[del_wait FAILED]
[immediatesignal FAILED]
[loopexit FAILED]
[loopexit_multiple FAILED]
[monotonic_res FAILED]
[no_events FAILED]
[persistent_active_timeout FAILED]
[persistent_timeout_jump FAILED]
[signal_switchbase FAILED]
[signal_while_processing FAILED]
[simpletimeout FAILED]
[usleep FAILED]
And this patch should remove from this list time related failures
(though maybe not all of them).
Azat Khuzhin [Sun, 21 Oct 2018 15:31:01 +0000 (18:31 +0300)]
Simplify bufferevent timeout tests to reduce CPU usage in between start/compare
Between start (setting "started_at") and comparing the time when
timeouts triggered with the start (test_timeval_diff_eq), there is too
much various things that can introduce extra delays and eventually could
fail the test on machine with shortage of CPU.
And this is exactly what happend on:
- travis-ci
- #262
Here is a simple reproducer that I came up with for this issue:
docker run --cpus=0.01 -e LD_LIBRARY_PATH=$PWD/lib -e PATH=/usr/bin:/bin:$PWD/bin -v $PWD:$PWD --rm -it debian:testing regress --no-fork --verbose bufferevent/bufferevent_timeout
Under limited CPU (see reproducer) the test almost always has problems
with that "write_timeout_at" exceed default timeval diff tolerance
(test_timeval_diff_eq() has 50 tolerance), i.e.:
FAIL ../test/regress_bufferevent.c:1040: assert(labs(timeval_msec_diff(((&started_at)), ((&res1.write_timeout_at))) - (100)) <= 50): 101 vs 50
But under some setup write timeout can even not triggered, and the
reason for this is that we write to the bufferevent 1024*1024 bytes, and
hence if evbuffer_write_iovec() will has some delay after writev() and
not send more then one vector at a time [1], it is pretty simple to
trigger, i.e.:
FAIL ../test/regress_bufferevent.c:1040: assert(labs(timeval_msec_diff(((&started_at)), ((&res1.write_timeout_at))) - (100)) <= 50): 1540155888478 vs 50
So this patch just send static small payload for all cases (plus a few
more asserts added).
The outcome of this patch is that all regression tests passed on
travis-ci for linux box [2]. While before it fails almost always [3].
Also reproducer with CPU limiting via docker also survive some
iterations (and strictly speaking it should has less CPU then travis-ci
workers I guess).
Azat Khuzhin [Sun, 21 Oct 2018 00:15:34 +0000 (03:15 +0300)]
Merge branch 'regress-dns-fixes'
* regress-dns-fixes:
Do not rely on getservbyname() for most of the dns regression tests
Turn off dns/getaddrinfo_race_gotresolve by default
Fix an error for debug locking in dns/getaddrinfo_race_gotresolve
Azat Khuzhin [Sun, 21 Oct 2018 00:03:25 +0000 (03:03 +0300)]
Do not rely on getservbyname() for most of the dns regression tests
There is only one test that uses service name getaddrinfo_async, which
manually check whether it works or not, other should not assume that it
is available and works.
There was already an attempt to overcome some possible limitations, like
lack of "http" in /etc/services in d6bafbbeb27ff3943d6f3b6783bcded76384c31e ("test/dns: replace servname
since solaris does not have "http"")
Azat Khuzhin [Sat, 20 Oct 2018 23:50:04 +0000 (02:50 +0300)]
Fix an error for debug locking in dns/getaddrinfo_race_gotresolve
When there is no /etc/services file evdns_getaddrinfo() will fail (with
service="ssh") and hence it will go to then "end" label with locked
rp.lock which in case of debug locking checks will bail with:
[err] ../evthread.c:220: Assertion lock->count == 0 failed in debug_lock_free
So add rp.locked flag, and unlock the lock before freeing it if it is in
locked state.
And here is how you can reproduce the issue:
$ docker run -e LD_LIBRARY_PATH=$PWD/lib -e PATH=/usr/bin:/bin:$PWD/bin -v $PWD:$PWD --rm -it debian:testing regress dns/getaddrinfo_race_gotresolve
(since debian:testing does not have /etc/services)
Jiri Luznicky [Wed, 23 May 2018 13:39:13 +0000 (15:39 +0200)]
Fix missing LIST_HEAD
Despite the presence of 'sys/queue.h' in some stdlib implementations
(i.e. uclibc) 'LIST_HEAD' macro can be missing. This fix defines this
macro in the same manner as was done previously for 'TAILQ_'.
Azat Khuzhin [Wed, 17 Oct 2018 20:21:32 +0000 (23:21 +0300)]
Merge branch 'be-wm-overrun-v2'
* be-wm-overrun-v2:
Fix hangs due to watermarks overruns in bufferevents implementations
test: cover watermarks (with some corner cases) in ssl bufferevent
Azat Khuzhin [Wed, 17 Oct 2018 20:21:17 +0000 (23:21 +0300)]
Fix hangs due to watermarks overruns in bufferevents implementations
Some implementations of bufferevents (for example openssl) can overrun
read high watermark.
And after this if user callback will not drain enough data it will be
suspended (i.e. it will not be runned again anymore).
This is not the expecting behaviour as one may guess, since in this case
the data will never be read. Hence once we detected that the watermark
exceeded (even after calling user callback) we will schedule the
callback again.
This also can be fixed in bufferevent openssl implementation (by
strictly limiting how much data is added to the read buffer according to
read high watermark), but since this data is already available (and in
memory) there is no point in doing so.
avoid warnings with any modern C99 compiler due to implicit function
declaration for pthread_create, as shown by the following :
test/regress_dns.c:2226:2: warning: implicit declaration of function
'pthread_create' is invalid in C99 [-Wimplicit-function-declaration]
THREAD_START(thread[0], race_base_run, &rp);
^
test/regress_thread.h:35:2: note: expanded from macro 'THREAD_START'
pthread_create(&(threadvar), NULL, fn, arg)
^
test/regress_dns.c:2226:2: warning: this function declaration is not a prototype
[-Wstrict-prototypes]
test/regress_thread.h:35:2: note: expanded from macro 'THREAD_START'
pthread_create(&(threadvar), NULL, fn, arg)
^
$ clang --version
Apple LLVM version 9.1.0 (clang-902.0.39.2)
Target: x86_64-apple-darwin17.7.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
Sergey Fionov [Wed, 1 Aug 2018 21:35:28 +0000 (00:35 +0300)]
evdns: fix race condition in evdns_getaddrinfo()
evdns_getaddrinfo() starts two parallel requests for A and AAAA record.
But if request is created from thread different from dns_base's, request of A record is
started immediately and may result in calling free_getaddrinfo_request() from
evdns_getaddrinfo_gotresolve() because `other_req' doesn't exist yet.
After that, request of AAAA record starts and finishes, and evdns_getaddrinfo_gotresolve()
is called again for structure that is already freed.
This commits adds locking into evdns_getaddrinfo() function.
Azat Khuzhin [Tue, 19 Jun 2018 07:15:08 +0000 (10:15 +0300)]
Cleanup __func__ detection
First of all __func__ is not a macro, it is char[] array, so the code
that we had before in cmake, was incorrect, i.e.:
#if defined (__func__)
#define EVENT____func__ __func__
#elif defined(__FUNCTION__)
#define EVENT____func__ __FUNCTION__
#else
#define EVENT____func__ __FILE__
#endif
So just detect do we have __func__/__FUNCTION__ in configure/cmake
before build and define EVENT__HAVE___func__/EVENT__HAVE___FUNCTION__
to use the later to choose which should be used as a __func__ (if it is
not presented).
Azat Khuzhin [Wed, 1 Aug 2018 06:48:42 +0000 (09:48 +0300)]
Merge branch 'official/pr/671' -- README cleanup
* official/pr/671:
Capitalise project names consistently in README.md
Indent configure flag section to make markdown format them as code
Use https for resources that support it
Rewords awkward sentences in README.md
Fix typos in README.md
Azat Khuzhin [Tue, 31 Jul 2018 21:58:02 +0000 (00:58 +0300)]
autotools: include win32 specific headers for socklen_t detection on win32/mingw
The [1] removes EVENT__ prefix, and now if we will incorrectly detect
that "foobar" (or socklen_t in our case) type is not available, but
somewhere later it will be available then we will get next error [2]:
error: two or more data types in declaration specifiers
According to [3]:
- Compile something in Cygwin and you are compiling it for Cygwin.
- Compile something in MinGW and you are compiling it for Windows.
And I can confirm this, since there is _WIN32 defined (according to [4])
And since according to [5] our image in appveyour (Visual Studion 2015)
has mingw (and we use it, not cygwin) we need ws2tcpip.h (over
sys/socket.h -- which does not exist in win32) header to detect
socklen_t existence.
The script make-event-config.sed was mangling all the symbols by
prefixing them with "EVENT__". The problem here is that some
symbols aren't for local consumption within libevent, but rather
influence other system header files (ex: __USE_FILE_OFFSET64 is
used by dozens of header files including <sys/sendfile.h>).
As a workaround, all symbols starting with a capital letter only
(with the exception of STDC_HEADERS which must also be left
untouched) will be mangled.
Future contributors will need to be aware of this distinction.
Nathan French [Mon, 30 Apr 2018 22:13:45 +0000 (18:13 -0400)]
[core] re-order fields in struct event for memory efficiency
The sizeof `struct event` can reduced on both 32 bit and 64 bit systems
by moving the 4 bytes that make up `ev_events` and `ev_res` below `ev_fd`,
before `struct event_base * ev_base;` since our compiler wouldn't dare do
such a thing (it instead will pad twice, whereas it only needs to be padded
once)
Azat Khuzhin [Sun, 22 Apr 2018 22:50:55 +0000 (01:50 +0300)]
Fix CheckFunctionExistsEx() cmake macro on win32
For example under mingw64 it could not detect that strtok_r() exists,
because it checks with:
void *p = func_name;
And for this you need the function to be defined, so just sync our
CheckFunctionExistsEx.c with CheckFunctionExists.c from cmake (and later
we should drop them out) since it does correct things to detech
functions existence.
Also for WIN32 there is -FIwinsock2.h -FIws2tcpip.h, and I guess that is
not works for mingw gcc (since -F in gcc is framework, and in windows
-FI is like -include in gcc). But looks like we do not need them
already (due to fixed CheckFunctionExistsEx()).
Greg Hazel [Mon, 12 Feb 2018 00:28:58 +0000 (16:28 -0800)]
Fix evhttp_connection_get_addr() fox incomming http connections
Install conn_address of the bufferevent on incomping http connections
(even though this is kind of subsytem violation, so let's fix it in a
simplest way and thinkg about long-term solution).
Jesse Fang [Fri, 23 Feb 2018 11:15:12 +0000 (19:15 +0800)]
bufferevent_socket_connect{,_hostname}() missing event callback and use ret code
- When socket() failed in bufferevent_socket_connect() , the event
callback should be called also in
bufferevent_socket_connect_hostname(). eg. when use
bufferevent_socket_connect_hostname() to resolve and connect an IP
address but process have a smaller ulimit open files, socket() fails
always but caller is not notified.
- Make bufferevent_socket_connect()'s behavior more consistent: function
return error then no callback, function return ok then error passed by
event callback.
Not changing anything right now AFAIK. But if for any reason in the
future we end up with two headers with the same name in the source and
build directories, chances are we want to use the one in the build
directory.
It will be generated by autotools, so there is not reason to include it.
And infact this breaks compilation with out-of-tree builds (VPATH),
since, for the quote form of the include directive, headers in the
directory of the file with the #include line have priority over those
named in -I options, the copy of evconfig-private.h from the source
directory had priority over the one in the build directory.
Azat Khuzhin [Sun, 22 Apr 2018 21:26:08 +0000 (00:26 +0300)]
Adopt ignore rules for cmake + ninja
In case we have build directory differs from source directory there will be
bunch of files we should ignore, so just remove leading "/" for some or rules.
And fix others.
"Upon successful completion, the select() function may modify the object
pointed to by the timout argument."
If "struct timeval" pointer is a "static const", it could potentially
be allocated in a RO text segment. The kernel would then try to copy
back the modified value (with the time remaining) into a read-only
address and SEGV.
Signed-off-by: Philip Prindeville <philipp@redfish-solutions.com> Closes: #614
Azat Khuzhin [Tue, 27 Feb 2018 18:12:14 +0000 (21:12 +0300)]
Fix base unlocking in event_del() if event_base_set() runned in another thread
Image next situation:
T1: T2:
event_del_()
lock the event.ev_base.th_base_lock
event_del_nolock_() event_set_base()
unlock the event.ev_base.th_base_lock
In this case we will unlock the wrong base after event_del_nolock_()
returns, and deadlock is likely to happens, since event_base_set() do
not check any mutexes (due to it is possible to do this only if event is
not inserted anywhere).
So event_del_() has to cache the base before removing the event, and
cached base.th_base_lock after.
John Fremlin [Mon, 18 Dec 2017 03:43:00 +0000 (22:43 -0500)]
Immediately stop trying to accept more connections if listener disabled
This is a refined version of the logic previously in #578
The rationale is that the consumer of sockets may wish to temporarily
delay accepting for some reason (e.g. being out of file-descriptors).
The kernel will then queue them up. The kernel queue is bounded and
programs like NodeJS will actually try to quickly accept and then close
(as the current behaviour before this PR).
However, it seems that libevent should allow the user to choose whether
to accept and respond correctly if the listener is disabled.
John Fremlin [Fri, 1 Dec 2017 01:29:32 +0000 (01:29 +0000)]
http: add callback to allow server to decline (and thereby close) incoming connections.
This is important, as otherwise clients can easily exhaust the file
descriptors available on a libevent HTTP server, which can cause
problems in other code which does not handle EMFILE well: for example,
see https://github.com/bitcoin/bitcoin/issues/11368
Fixes: #577
* evconnlistener-do-not-close-client-fd:
listener: cover closing of fd in case evconnlistener_free() called from acceptcb
Revert "Fix potential fd leak in listener_read_cb()"
@kgraefe:
"I believe that this commit is just wrong: if lev->cnt is not 1 after
the callback, new_fd will still never be closed in listener_read_cb().
So in that case it is the responsibility of the user's code to close
the file descriptor (which is fine). But why shouldn't it be in the
other case? And how does the user's code know?"
Andrey Okoshkin [Wed, 29 Nov 2017 08:13:51 +0000 (11:13 +0300)]
Fix generation of LibeventConfig.cmake for the installation tree
'LIBEVENT_INCLUDE_DIRS' is properly initialized in 'LibeventConfig.cmake' as
'LibeventConfig.cmake.in' contains usage of 'LIBEVENT_CMAKE_DIR' and
'EVENT_INSTALL_INCLUDE_DIR' variables but not 'EVENT_CMAKE_DIR' and
'EVENT__INCLUDE_DIRS'.
Related typos are fixed.
ejurgensen [Sun, 5 Nov 2017 11:18:49 +0000 (12:18 +0100)]
Fix incorrect ref to evhttp_get_decoded_uri in http.h
Replaces reference in the http.h include header file to evhttp_get_decoded_uri
with evhttp_uridecode. There is no function called evhttp_get_decoded_uri.
Azat Khuzhin [Sun, 29 Oct 2017 19:53:41 +0000 (22:53 +0300)]
Allow bodies for GET/DELETE/OPTIONS/CONNECT
I checked with nginx, and via it's lua bindings it allows body for all
this methods. Also everybody knows that some of web-servers allows body
for GET even though this is not RFC conformant.
Azat Khuzhin [Sun, 22 Oct 2017 21:13:37 +0000 (00:13 +0300)]
Fix crashing http server when callback do not reply in place
General http callback looks like:
static void http_cb(struct evhttp_request *req, void *arg)
{
evhttp_send_reply(req, HTTP_OK, "Everything is fine", NULL);
}
And they will work fine becuase in this case http will write request
first, and during write preparation it will disable *read callback* (in
evhttp_write_buffer()), but if we don't reply immediately, for example:
static void http_cb(struct evhttp_request *req, void *arg)
{
return;
}
This will leave connection in incorrect state, and if another request
will be written to the same connection libevent will abort with:
[err] ../http.c: illegal connection state 7
Because it thinks that read for now is not possible, since there were no
write.
Fix this by disabling EV_READ entirely. We couldn't just reset callbacks
because this will leave EOF detection, which we don't need, since user
hasn't replied to callback yet.
Azat Khuzhin [Sun, 24 Sep 2017 12:12:13 +0000 (15:12 +0300)]
Remove OpenSSL paragram from README
Because it is mauvais ton to use binaries instead of normal packages
(like apt-get in debian, pacman in arch, and others).
Plus that link was borken and according to [1] OpenSSL do not ship
binaries officially.
And personally I don't think that this is not obvious that you need
openssl libraries to build libevent with it's support, and BTW you need
headers too (of course).
Vincent JARDIN [Mon, 11 Sep 2017 19:56:30 +0000 (21:56 +0200)]
test: fix warning
In function ‘send_a_byte_cb’:
test/regress.c:1853:2: warning: ignoring return value of ‘write’, declared with
attribute warn_unused_result [-Wunused-result]
(void) write(*sockp, "A", 1);