Rich Felker [Fri, 10 Feb 2012 02:24:56 +0000 (21:24 -0500)]
small fix for new pthread cleanup stuff
even if pthread_create/exit code is not linked, run flag needs to be
checked and cleanup function potentially run on pop. thus, move the
code to the module that's always linked when pthread_cleanup_push/pop
is used.
Rich Felker [Thu, 9 Feb 2012 07:33:08 +0000 (02:33 -0500)]
replace bad cancellation cleanup abi with a sane one
the old abi was intended to duplicate glibc's abi at the expense of
being ugly and slow, but it turns out glibc was not even using that abi
except on non-gcc-compatible compilers (which it doesn't even support)
and was instead using an exceptions-in-c/unwind-based approach whose
abi we could not duplicate anyway without nasty dwarf2/unwind
integration.
the new abi is copied from a very old glibc abi, which seems to still
be supported/present in current glibc. it avoids all unwinding,
whether by sjlj or exceptions, and merely maintains a linked list of
cleanup functions to be called from the context of pthread_exit. i've
taken some care to ensure that longjmp out of a cleanup function should
work, even though it is not required to.
this change breaks abi compatibility with programs which were using
pthread cancellation, which is unfortunate, but that's why i'm making
the change now rather than later. considering that most pthread
features have not been usable until recently anyway, i don't see it as
a major issue at this point.
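
as a rough illustration of the linked-list approach described above, here is a minimal sketch. the names (cleanup_node, cleanup_push, cleanup_pop, cleanup_run_all, cleanup_demo) are hypothetical and are not the actual abi; a real implementation would keep the list head per-thread rather than in one global.

```c
#include <stddef.h>

/* hypothetical node layout, for illustration only */
struct cleanup_node {
	void (*fn)(void *);
	void *arg;
	struct cleanup_node *next;
};

/* assumption: one global head; a real libc would keep this per-thread */
static struct cleanup_node *cleanup_head;

void cleanup_push(struct cleanup_node *n, void (*fn)(void *), void *arg)
{
	n->fn = fn;
	n->arg = arg;
	n->next = cleanup_head;
	cleanup_head = n;
}

void cleanup_pop(int run)
{
	struct cleanup_node *n = cleanup_head;
	if (!n) return;
	/* unlink before calling, so the list stays consistent even if
	 * the handler leaves by longjmp */
	cleanup_head = n->next;
	if (run) n->fn(n->arg);
}

/* what thread exit would do: run all pending handlers, newest first */
void cleanup_run_all(void)
{
	while (cleanup_head) cleanup_pop(1);
}

static int order;

static void record(void *arg)
{
	order = order * 10 + *(int *)arg;
}

/* pushes two handlers, runs them all, and returns the order they ran in */
int cleanup_demo(void)
{
	static int one = 1, two = 2;
	struct cleanup_node a, b;
	order = 0;
	cleanup_push(&a, record, &one);
	cleanup_push(&b, record, &two);
	cleanup_run_all();
	return order;
}
```

cleanup_demo() returns 21 here: the handler pushed last runs first. no unwinding is involved, only list manipulation.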
Rich Felker [Wed, 8 Feb 2012 01:31:27 +0000 (20:31 -0500)]
protect against cancellation in dlopen
i'm not sure that it's "correct" for dlopen to block cancellation
when calling constructors for libraries it loads, but it sure seems
like the right thing. in any case, dlopen itself needs cancellation
blocked.
Rich Felker [Tue, 7 Feb 2012 18:10:30 +0000 (13:10 -0500)]
declare basename in string.h when _GNU_SOURCE is defined
note that it still will have the standards-conformant behavior, not
the GNU behavior. but at least this prevents broken code from ending
up with truncated pointers due to implicit declarations...
Rich Felker [Tue, 7 Feb 2012 17:08:27 +0000 (12:08 -0500)]
revert hacks for types of stdint.h integer constant macros
per 7.18.4: Each invocation of one of these macros shall expand to an
integer constant expression suitable for use in #if preprocessing
directives. The type of the expression shall have the same type as
would an expression of the corresponding type converted according to
the integer promotions. The value of the expression shall be that of
the argument.
the key phrase is "converted according to the integer promotions".
thus there is no intent or allowance that the expression have
smaller-than-int types.
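
a small check of the promoted-type reading quoted above, as a sketch. the function names are made up for illustration; the expected sizes assume a conforming stdint.h.

```c
#include <stddef.h>
#include <stdint.h>

/* the macros must be usable in #if, so this has to preprocess cleanly */
#if INT8_C(100) + INT16_C(1000) != 1100
#error "integer constant macros are not usable in #if"
#endif

/* after integer promotion, the 8-bit and 16-bit constants have type int
 * (or unsigned int), never a smaller-than-int type */
size_t int8_c_size(void)   { return sizeof(INT8_C(1)); }
size_t uint16_c_size(void) { return sizeof(UINT16_C(1)); }
```

on a conforming implementation both functions return sizeof(int).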
Rich Felker [Mon, 6 Feb 2012 19:39:09 +0000 (14:39 -0500)]
add support for init/fini (constructors and destructors)
this is mainly in hopes of supporting c++ (not yet possible for other
reasons) but will also help applications/libraries which use (and more
often, abuse) the gcc __attribute__((__constructor__)) feature in "C"
code.
x86_64 and arm versions of the new startup asm are untested and may
have minor problems.
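
for reference, the kind of "C" code this enables looks roughly like the sketch below. it relies on the gcc attribute syntax (also accepted by clang); the function and variable names are hypothetical.

```c
static int initialized;

/* runs before main when the startup code processes constructors */
__attribute__((__constructor__))
static void setup(void)
{
	initialized = 1;
}

/* runs at normal process exit, after main returns or exit is called */
__attribute__((__destructor__))
static void teardown(void)
{
	initialized = 0;
}

int was_initialized(void)
{
	return initialized;
}
```

if constructors are supported, was_initialized() returns 1 when called from main; without that support it would return 0.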
Rich Felker [Fri, 3 Feb 2012 08:16:07 +0000 (03:16 -0500)]
include dummied-out dlopen and dlsym functions for static binaries
these don't work (or do anything at all) but at least make it possible
to static link programs that insist on "having" dynamic loading
support...as long as they don't actually need to use it.
adding real support for dlopen/dlsym with static linking is going to
be significantly more difficult...
Rich Felker [Thu, 2 Feb 2012 05:11:29 +0000 (00:11 -0500)]
make stdio open, read, and write operations cancellation points
it should be noted that only the actual underlying buffer flush and
fill operations are cancellable, not reads from or writes to the
buffer. this behavior is compatible with POSIX, which makes all
cancellation points in stdio optional, and it achieves the goal of
allowing cancellation of a thread that's "stuck" on IO (due to a
non-responsive socket/pipe peer, slow/stuck hardware, etc.) without
imposing any measurable performance cost.
Rich Felker [Tue, 24 Jan 2012 05:22:27 +0000 (00:22 -0500)]
make gcc wrapper support -shared correctly
it was previously attempting to link start files as part of shared
objects. this is definitely wrong and depending on the platform and
linker could range from just adding extraneous junk to introducing
textrels to making linking fail entirely.
Rich Felker [Sun, 22 Jan 2012 22:19:37 +0000 (17:19 -0500)]
fix cancellation failure in single-threaded programs
even a single-threaded program can be cancellable, e.g. if it has called
pthread_cancel(pthread_self()). the correct predicate to check is not
whether multiple threads have been invoked, but whether pthread_self
has been invoked.
Rich Felker [Fri, 20 Jan 2012 16:14:27 +0000 (11:14 -0500)]
fix dynamic linker not to depend on DYNAMIC ptr in 0th entry of GOT
this fixes an issue using gold instead of gnu ld for linking. it also
should eliminate the need of the startup code to even load/pass the
got address to the dynamic linker.
based on patch submitted by sh4rm4 with minor cosmetic changes.
Rich Felker [Thu, 19 Jan 2012 04:28:48 +0000 (23:28 -0500)]
alias basename to glibc name for it, to meet abi goals
note that regardless of the name used, basename is always conformant.
it never takes on the bogus gnu behavior, unlike glibc where basename
is nonconformant when declared manually without including libgen.h.
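
a short sketch of what "conformant" means in practice, using the basename declared in libgen.h. the wrapper name posix_basename_of and the 256-byte buffer are assumptions for the example; the copy is needed because the conformant function is allowed to modify its argument.

```c
#include <libgen.h>
#include <string.h>

/* copies path into a scratch buffer and returns its final component;
 * the result points into the static buffer or at a static string */
const char *posix_basename_of(const char *path)
{
	static char buf[256];
	strncpy(buf, path, sizeof buf - 1);
	buf[sizeof buf - 1] = 0;
	return basename(buf);
}
```

with the conformant behavior, trailing slashes are stripped first, so "/usr/lib/" yields "lib" rather than an empty string.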
Rich Felker [Thu, 17 Nov 2011 04:59:28 +0000 (23:59 -0500)]
fix issue with excessive mremap syscalls on realloc
CHUNK_SIZE macro was defined incorrectly and was shaving off at least one
significant bit in the size of mmapped chunks, resulting in the test
for oldlen==newlen always failing and incurring a syscall. fortunately
i don't think this issue caused any other observable behavior; the
definition worked correctly for all non-mmapped chunks where its
correctness matters more, since their lengths are always multiples of
the alignment.
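
the class of bug can be sketched as follows. this is not the actual macro or chunk layout: SIZE_ALIGN, FLAG_BIT and both helper names are assumptions picked to show how an alignment-based mask loses bits when a stored length is not a multiple of the alignment.

```c
#include <stddef.h>

#define SIZE_ALIGN ((size_t)16) /* hypothetical alignment */
#define FLAG_BIT   ((size_t)1)  /* hypothetical flag kept in bit 0 */

/* only right when every length is a multiple of SIZE_ALIGN */
size_t size_masked_by_align(size_t word)
{
	return word & ~(SIZE_ALIGN - 1);
}

/* strips just the flag bit, so any even length survives intact */
size_t size_masked_by_flag(size_t word)
{
	return word & ~FLAG_BIT;
}
```

for a length of 4088 stored with the flag set, the first helper yields 4080 and the second yields 4088, so a comparison of old and new length based on the first one can never match.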
Rich Felker [Sat, 15 Oct 2011 04:28:49 +0000 (00:28 -0400)]
don't define wchar_t on c++
it's a keyword in c++ (wtf). i'm not sure this is the cleanest
solution; it might be better to avoid ever defining __NEED_wchar_t on
c++. but in any case, this works for now.
Rich Felker [Sat, 15 Oct 2011 03:31:04 +0000 (23:31 -0400)]
add dummy __cxa_finalize
musl's dynamic linker does not support unloading dsos, so there's
nothing for this function to do. adding the symbol in case anything
depends on its presence.
Rich Felker [Mon, 10 Oct 2011 02:51:03 +0000 (22:51 -0400)]
fix F_GETOWN return value handling
the fcntl syscall can return a negative value when the command is
F_GETOWN, and this is not an error code but an actual value. thus we
must special-case it and avoid calling __syscall_ret to set errno.
this fix is better than the glibc fix (using F_GETOWN_EX) which only
works on newer kernels and is more complex.
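
the special case can be sketched like this. ret_to_errno is a hypothetical stand-in for the errno-setting helper (the real one is not reproduced here), and fcntl_result takes the raw syscall return value as a parameter so the logic can be shown without making a syscall. the -4095..-1 error range is the usual linux convention.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>

/* hypothetical helper: map a raw kernel return value to -1 plus errno */
long ret_to_errno(long r)
{
	if (r < 0 && r >= -4095) {
		errno = (int)-r;
		return -1;
	}
	return r;
}

/* for F_GETOWN a negative value is a real result, not an error code,
 * so it is passed through untouched */
long fcntl_result(int cmd, long raw)
{
	if (cmd == F_GETOWN) return raw;
	return ret_to_errno(raw);
}
```

fcntl_result(F_GETOWN, -1234) returns -1234 unchanged, while fcntl_result(F_GETFD, -EBADF) returns -1 with errno set to EBADF.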
Rich Felker [Mon, 3 Oct 2011 04:19:05 +0000 (00:19 -0400)]
simplify robust mutex unlock code path
right now it's questionable whether this change is an improvement or
not, but if we later want to support priority inheritance mutexes, it
will be important to have the code paths unified like this to avoid
major code duplication.
Rich Felker [Mon, 3 Oct 2011 04:09:08 +0000 (00:09 -0400)]
use count=0 instead of 1 for recursive mutex with only one lock reference
this simplifies the code paths slightly, but perhaps what's nicer is
that it makes recursive mutexes fully reentrant, i.e. locking and
unlocking from a signal handler works even if the interrupted code was
in the middle of locking or unlocking.
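
a simplified model of the counting convention, as a sketch only. it is single-threaded, uses no atomics or futex waits, and all names (rmutex, rmutex_lock, rmutex_unlock, rmutex_demo) are hypothetical. it shows the bookkeeping only and does not demonstrate the signal-handler reentrancy.

```c
/* owner 0 means unlocked; count holds only the extra lock references */
struct rmutex {
	int owner;
	int count;
};

int rmutex_lock(struct rmutex *m, int self)
{
	if (m->owner == self) {
		m->count++;
		return 0;
	}
	if (m->owner) return -1; /* a real lock would wait here */
	m->owner = self;         /* first reference leaves count at 0 */
	return 0;
}

int rmutex_unlock(struct rmutex *m, int self)
{
	if (m->owner != self) return -1;
	if (m->count) {
		m->count--;
		return 0;
	}
	m->owner = 0;
	return 0;
}

/* returns 1 if the lock/lock/unlock/unlock sequence behaves as expected */
int rmutex_demo(void)
{
	struct rmutex m = { 0, 0 };
	if (rmutex_lock(&m, 7) || m.count != 0) return 0;
	if (rmutex_lock(&m, 7) || m.count != 1) return 0;
	if (rmutex_unlock(&m, 7) || m.owner != 7 || m.count != 0) return 0;
	if (rmutex_unlock(&m, 7) || m.owner != 0) return 0;
	return 1;
}
```

with this convention, taking the lock the first time and releasing it the last time touch only the owner field, never the count.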
Rich Felker [Sat, 1 Oct 2011 13:11:35 +0000 (09:11 -0400)]
fix failure-to-wake in rwlock unlock
a reader unlocking the lock need only wake one waiter (necessarily a
writer), but a writer unlocking the lock must wake all waiters
(necessarily readers). if it only wakes one, the remainder can remain
blocked indefinitely, or at least until the first reader unlocks (in
which case the whole lock becomes serialized and behaves as a mutex
rather than a read lock).
there is no need to send a wake when the lock count does not hit zero,
but when it does, all waiters must be woken (since all with the same
sign are eligible to obtain the lock).
eliminate the sequence number field and instead use the counter as the
futex. because of the way the lock is held, sequence numbers are
completely useless, and this frees up a field in the barrier structure
to be used as a waiter count for the count futex, which lets us avoid
some syscalls in the best case.
as of now, self-synchronized destruction and unmapping should be fully
safe. before any thread can return from the barrier, all threads in
the barrier have obtained the vm lock, and each holds a shared lock on
the barrier. the barrier memory is not inspected after the shared lock
count reaches 0, nor after the vm lock is released.
it was assuming the result of the condition it was supposed to be
checking for, i.e. that the thread ptr had already been initialized by
pthread_mutex_lock. use the slower call to be safe.
improve/debloat mutex unlock error checking in pthread_cond_wait
we're not required to check this except for error-checking mutexes,
but it doesn't hurt. the new test is actually simpler/lighter, and it
also eliminates the need to later check that pthread_mutex_unlock
succeeds.
when used with error-checking mutexes, pthread_cond_wait is required
to fail with EPERM if the mutex is not locked by the caller.
previously we relied on pthread_mutex_unlock to generate the error,
but this is not valid, since in the case of such invalid usage the
internal state of the cond variable has already been potentially
corrupted (due to access outside the control of the mutex). thus, we
have to check first.
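
the required behavior can be exercised with a sketch like the one below. the function name cond_wait_unlocked is made up for the example, which uses only standard pthread calls; on older systems it may need to be linked with -pthread.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/* waits on a condition variable with an error-checking mutex that the
 * caller never locked, and returns whatever pthread_cond_wait reports */
int cond_wait_unlocked(void)
{
	pthread_mutexattr_t ma;
	pthread_mutex_t m;
	pthread_cond_t c = PTHREAD_COND_INITIALIZER;
	int r;

	if ((r = pthread_mutexattr_init(&ma))) return r;
	r = pthread_mutexattr_settype(&ma, PTHREAD_MUTEX_ERRORCHECK);
	if (!r) r = pthread_mutex_init(&m, &ma);
	pthread_mutexattr_destroy(&ma);
	if (r) return r;

	r = pthread_cond_wait(&c, &m); /* invalid use: mutex is not held */

	pthread_mutex_destroy(&m);
	pthread_cond_destroy(&c);
	return r;
}
```

on an implementation that performs the check, the call fails immediately and the function returns EPERM instead of blocking.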
process-shared barrier support, based on discussion with bdonlan
this implementation is rather heavy-weight, but it's the first
solution i've found that's actually correct. all waiters actually wait
twice at the barrier so that they can synchronize exit, and they hold
a "vm lock" that prevents changes to virtual memory mappings (and
blocks pthread_barrier_destroy) until all waiters are finished
inspecting the barrier.
thus, it is safe for any thread to destroy and/or unmap the barrier's
memory as soon as pthread_barrier_wait returns, without further
synchronization.
fix ctype macros to cast argument to (unsigned) first
issue reported by nsz, but it's actually not just pedantic. the
functions can take input of any arithmetic type, including floating
point, and the behavior needs to be as if the conversion implicit in
the function call took place.
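
to see why the position of the cast matters, compare the two macro shapes below. both are hypothetical spellings written for this example, not copied from the headers. passing 47.5 to the function form would convert it to 47 ('/'), which is not a digit.

```c
/* cast applied after the subtraction: 47.5 - '0' is -0.5, which
 * converts to 0u, so the test wrongly succeeds */
#define ISDIGIT_CAST_LATE(c)  ((unsigned)((c) - '0') < 10)

/* cast applied to the argument first: 47.5 becomes 47u, and 47u - '0'
 * wraps to a huge unsigned value, so the test correctly fails */
#define ISDIGIT_CAST_FIRST(c) ((unsigned)(c) - '0' < 10)
```

for ordinary integer arguments such as '7' the two shapes agree; they differ only for inputs like the floating-point case above.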
another cond var fix: requeue count race condition
lock out new waiters during the broadcast. otherwise the wait count
added to the mutex might be lower than the actual number of waiters
moved, and wakeups may be lost.
this issue could also be solved by temporarily setting the mutex
waiter count higher than any possible real count, then relying on the
kernel to tell us how many waiters were requeued, and updating the
counts afterwards. however the logic is more complex, and i don't
really trust the kernel. the solution here is also nice in that it
replaces some atomic cas loops with simple non-atomic ops under lock.