Rich Felker [Sun, 5 Aug 2012 16:50:26 +0000 (12:50 -0400)]
mips dynamic linker support
not heavily tested, but the basics are working. the basic concept is
that the dynamic linker entry point code invokes a pure-PIC (no global
accesses) C function in reloc.h to perform the early GOT relocations
needed to make the dynamic linker itself functional, then invokes
__dynlink like on other archs. since mips uses some ugly arch-specific
hacks to optimize relocating the GOT (rather than just using the
normal DT_REL[A] tables like on other archs), the dynamic linker has
been modified slightly to support calling arch-specific relocation
code in reloc.h.
most of the actual mips-specific behavior was developed by reading the
output of readelf on libc.so and simple executable files. i could not
find good reference information on which relocation types need to be
supported or their semantics, so it's possible that some legitimate
usage cases will not work yet.
Rich Felker [Sun, 5 Aug 2012 16:32:20 +0000 (12:32 -0400)]
make dynlink.lo depend on reloc.h in makefile
this is mainly a development convenience but will also ensure users
building from latest git always get up-to-date arch-specific dynamic
linker code without having to "make clean".
Rich Felker [Sun, 5 Aug 2012 06:44:32 +0000 (02:44 -0400)]
more dynamic linker internals cleanup
changing the string printed for the dso name is not a regression; the
old code was simply using the wrong dso name (head rather than the dso
currently being relocated). this will be fixed in a later commit.
Rich Felker [Sun, 5 Aug 2012 06:37:14 +0000 (02:37 -0400)]
floating point support for arm setjmp/longjmp
not heavily tested, but at least they don't seem to break anything on
soft float targets with or without coprocessors. they check the auxv
AT_HWCAP flags to determine which coprocessor, if any, is available.
Rich Felker [Fri, 3 Aug 2012 01:02:34 +0000 (21:02 -0400)]
fix argument type error on wcwidth function
since the correct declaration was not visible, and since the
representation of the types wchar_t and wint_t always match, a
compiler would have to go out of its way to make this bug manifest,
but better to fix it anyway.
save AT_HWCAP from auxv for subsequent use in machine-specific code
it's expected that this will be needed/useful only in asm, so I've
given it its own symbol that can be addressed in pc-relative ways from
asm rather than adding a field in the __libc structure which would
require hard-coding the offset wherever it's used.
this seems counter-intuitive since sem_trywait is supposed to just try
once, not wait for the semaphore. however, the retry loop is not a
wait. instead, it's to handle the case where the value changes due to
a simultaneous post or wait from another thread while the semaphore
value remains positive. in such a case, it's absolutely wrong for
sem_trywait to fail with EAGAIN because the semaphore is not busy.
based on patches by orc and Isaac Dunham, with some fixes. sys/io.h
exists and contains prototypes for these functions regardless of
whether the target arch has them; this is a bit unorthodox but I don't
think it will break anything. the function definitions do not exist
unless the appropriate SYS_* syscall number macro is defined, which
should make sure configure scripts looking for these functions don't
find them on other systems.
presently, sys/io.h does not have the inb/outb/etc. port io
macros/functions. I'd be surprised if ioperm/iopl are useful without
them, so they probably need to be added at some point in appropriate
bits/io.h files...
add floating point register saving/restoring to mips setjmp/longjmp
also fix the alignment of jmp_buf to meet the abi. linux always
emulates fpu on mips if it's not present, so enabling this code
unconditionally is "safe" but may be slow. in the long term it may be
preferable to find a way to disable it on soft float builds.
the fields in the mcontext_t are long long (for no good reason) even
on 32-bit mips, so the offset of the instruction pointer (as a word)
varies depending on endianness.
workaround another sendmsg kernel bug on 64-bit machines
the kernel wrongly expects the cmsg length field to be size_t instead
of socklen_t. in order to work around the issue, we have to impose a
length limit and copy to a local buffer. the length limit should be
more than sufficient for any real-world use; these headers are only
used for passing file descriptors and permissions between processes
over unix sockets.
fix several locks that weren't updated right for new futex-based __lock
these could have caused memory corruption due to invalid accesses to
the next field. all should be fixed now; I found the errors with fgrep
-r '__lock(&', which is bogus since the argument should be an array.
after the thread unmaps its own stack/thread structure, the kernel,
performing child tid clear and futex wake, could clobber a new mapping
made at the same location as the just-removed thread's tid field.
disable kernel clearing of child tid to prevent this.
mips clone: don't free stack space used to copy arg
the mips abi reserves stack space equal to the size of the in-register
args for the callee to save the args, if desired. this would cause the
beginning of the thread structure to be clobbered...
the old code worked in qemu app-level emulation, but not on real
kernels where the clone syscall does not copy the register values to
the new thread. save arguments on the new thread stack instead.
initial version of mips (o32) port, based on work by Richard Pennington (rdp)
basically, this version of the code was obtained by starting with
rdp's work from his ellcc source tree, adapting it to musl's build
system and coding style, auditing the bits headers for discrepencies
with kernel definitions or glibc/LSB ABI or large file issues, fixing
up incompatibility with the old binutils from aboriginal linux, and
adding some new special cases to deal with the oddities of sigaction
and pipe syscall interfaces on mips.
at present, minimal test programs work, but some interfaces are broken
or missing. threaded programs probably will not link.
if libc.a is compiled PIC for use in static PIE code, this should not
cause the dynamic linker (which still does not support static-linked
main program) to be built into libc.a.
fix lots of breakage on dlopen, mostly with explicit pathnames
most importantly, the name for such libs was being set from an
uninitialized buffer. also, shortname always had an initial '/'
character, making it useless for looking up already-loaded libraries
by name, and thus causing repeated searches through the library path.
major changes now:
- shortname is the base name for library lookups with no explicit
pathname. it's initially clear for libraries loaded with an explicit
pathname (and for the main program), but will be set if the same
library (detected via inodes match) is later found by a search.
- exact name match is never used to identify libraries loaded with an
explicit pathname. in this case, there's no explicit search, so we
can just stat the file and check for inode match.
previously this was being handled the same as a library-specific,
dependency-order lookup on the next library in the global chain, which
is likely to be utterly meaningless. instead the lookup needs to be in
the global namespace, but omitting the initial portion of the global
library chain up through the calling library.
this option is expensive and only used on old gcc's that lack
-fexcess-precision=standed, but it's not needed on non-i386 archs
where floating point does not have excess precision anyway.
if musl ever supports m68k, i think it will need to be special-cased
too. i'm not aware of any other archs with excess precision.
on arm, the location of the saved-signal-mask flag and mask were off
by one between sigsetjmp and siglongjmp, causing incorrect behavior
restoring the signal mask. this is because the siglongjmp code assumed
an extra slot was in the non-sig jmp_buf for the flag, but arm did not
have this. now, the extra slot is removed for all archs since it was
useless.
also, arm eabi requires jmp_buf to have 8-byte alignment. we achieve
that using long long as the type rather than with non-portable gcc
attribute tags.
no idea why gcc refuses to compile the C code to use a tail call, but
it's best to use asm anyway so we don't have to rely on the quality of
the compiler's optimizations for correct code.
Rich Felker [Fri, 29 Jun 2012 04:56:37 +0000 (00:56 -0400)]
replace old and ugly crypt implementation
the new version is largely the work of Solar Designer, with minor
changes for integration with musl. compared to the old code, text size
is reduced by about 7k, stack space usage by about 70k, and
performance is greatly improved by avoiding expensive calculation of
constant tables on each run.
this version also adds support for extended des-based password hashes,
which allow for unlimited key (password) length and configurable
iteration counts.
i've also published the interface for crypt_r in a new crypt.h header.
especially since this is not a standard interface, i did not feel
compelled to match the glibc abi for the crypt_data structure. the
glibc structure is way too big to allocate on the stack; in fact it's
so big that the first usage may cause the main thread to exceed its
pre-committed stack size of 128k and thus could cause the program to
crash even on systems with overcommit disabled. the only legitimate
use of crypt_data for crypt_r is to store the hash string to return,
so i've reserved 256 bytes, which should be more than sufficient
(longest known password hashes are ~60 characters, and beyond that is
possibly even exceeding some implementations' passwd file field size
limit).
Rich Felker [Mon, 25 Jun 2012 20:06:09 +0000 (16:06 -0400)]
fix arm crti/crtn code
lr must be saved because init/fini-section code from the compiler
clobbers it. this was not a problem when i tested without gcc's
crtbegin/crtend files present, but with them, musl on arm fails to
work (infinite loop in _init).
Rich Felker [Thu, 21 Jun 2012 02:16:47 +0000 (22:16 -0400)]
proper error handling for fcntl F_GETOWN on modern kernels
on old kernels, there's no way to detect errors; we must assume
negative syscall return values are pgrp ids. but if the F_GETOWN_EX
fcntl works, we can get a reliable answer.
nsz [Wed, 20 Jun 2012 21:25:58 +0000 (23:25 +0200)]
math: fix fma bug on x86 (found by Bruno Haible with gnulib)
The long double adjustment was wrong:
The usual check is
mant_bits & 0x7ff == 0x400
before doing a mant_bits++ or mant_bits-- adjustment since
this is the only case when rounding an inexact ld80 into
double can go wrong. (only in nearest rounding mode)
After such a check the ++ and -- is ok (the mantissa will end
in 0x401 or 0x3ff).
fma is a bit different (we need to add 3 numbers with correct
rounding: hi_xy + lo_xy + z so we should survive two roundings
at different places without precision loss)
The adjustment in fma only checks for zero low bits
mant_bits & 0x3ff == 0
this way the adjusted value is correct when rounded to
double or *less* precision.
(this is an important piece in the fma puzzle)
Unfortunately in this case the -- is not a correct adjustment
because mant_bits might underflow so further checks are needed
and this was the source of the bug.
Rich Felker [Wed, 20 Jun 2012 19:22:03 +0000 (15:22 -0400)]
fix broken wcwidth tables
unicode char data has both "W" and "F" wide types and the old table
only included the "W" ones. this omitted U+3000 (ideographic space)
and all the wide-ascii, etc.
Rich Felker [Wed, 20 Jun 2012 18:50:29 +0000 (14:50 -0400)]
avoid cancellation in pclose
at the point pclose might receive and act on cancellation, it has
already invalidated the FILE passed to it. thus, per musl's QOI
guarantees about cancellation and resource allocation/deallocation,
it's not a candidate for cancellation.
if it were required to be a cancellation point by posix, we would have
to switch the order of deallocation, but somehow still close the pipe
in order to trigger the child process to exit. i looked into doing
this, but the logic gets ugly, and i'm not sure the semantics are
conformant, so i'd rather just leave it alone unless there's a need to
change it.
Rich Felker [Wed, 20 Jun 2012 18:39:50 +0000 (14:39 -0400)]
make popen cancellation-safe
close was the only cancellation point called from popen, but it left
popen with major resource leaks if any call to close got cancelled.
the easiest, cheapest fix is just to use a non-cancellable close
function.
Rich Felker [Wed, 20 Jun 2012 16:07:18 +0000 (12:07 -0400)]
make strerror_r behave nicer on failure
if the buffer is too short, at least return a partial string. this is
helpful if the caller is lazy and does not check for failure. care is
taken to avoid writing anything if the buffer length is zero, and to
always null-terminate when the buffer length is non-zero.
Rich Felker [Wed, 20 Jun 2012 13:28:54 +0000 (09:28 -0400)]
fix another oob pointer arithmetic issue in printf floating point
this one could never cause any problems unless the compiler/machine
goes to extra trouble to break oob pointer arithmetic, but it's best
to fix it anyway.
Rich Felker [Wed, 20 Jun 2012 02:44:08 +0000 (22:44 -0400)]
fix localeconv values and implementation
dynamic-allocation of the structure is not valid; it can crash an
application if malloc fails. since localeconv is not specified to have
failure conditions, the object needs to have static storage duration.
need to review whether all the values are right or not still..
Rich Felker [Wed, 20 Jun 2012 02:24:15 +0000 (22:24 -0400)]
fix dummied-out fsync
if we eventually have build options, it might be nice to make an
option to dummy this out again, in case anybody needs a system-wide
disable for disk/ssd-thrashing, etc. that some daemons do when
logging...
Rich Felker [Wed, 20 Jun 2012 01:41:43 +0000 (21:41 -0400)]
fix pointer overflow bug in floating point printf
large precision values could cause out-of-bounds pointer arithmetic in
computing the precision cutoff (used to avoid expensive long-precision
arithmetic when the result will be discarded). per the C standard,
this is undefined behavior. one would expect that it works anyway, and
in fact it did in most real-world cases, but it was randomly
(depending on aslr) crashing in i386 binaries running on x86_64
kernels. this is because linux puts the userspace stack near 4GB
(instead of near 3GB) when the kernel is 64-bit, leading to the
out-of-bounds pointer arithmetic overflowing past the end of address
space and giving a very low pointer value, which then compared lower
than a pointer it should have been higher than.
the new code rearranges the arithmetic so that no overflow can occur.
while this bug could crash printf with memory corruption, it's
unlikely to have security impact in real-world applications since the
ability to provide an extremely large field precision value under
attacker-control is required to trigger the bug.
Rich Felker [Tue, 19 Jun 2012 05:27:26 +0000 (01:27 -0400)]
stdio: handle file position correctly at program exit
for seekable files, posix imposed requirements on the offset of the
underlying open file description after a stream is closed. this was
correctly handled (as a side effect of the unconditional fflush call)
when streams were explicitly closed by fclose, but was not handled
correctly at program exit time, where fflush(0) was being used.
the weak symbol hackery is to pull in __stdio_exit if either of
__toread or __towrite is used, but avoid calling it twice so we don't
have to keep extra state. the new __stdio_exit is a streamlined fflush
variant that avoids performing any unnecessary operations and which
never unlocks the files or open file list, so we can be sure no other
threads write new data to a stream's buffer after it's already
flushed.