Rich Felker [Sat, 11 Aug 2012 22:39:12 +0000 (18:39 -0400)]
remove buggy short-string wcsstr implementation; always use twoway
since this interface is rarely used, it's probably best to lean
towards keeping code size down anyway. one-character needles will
still be found immediately by the initial wcschr call anyway.
Rich Felker [Sat, 11 Aug 2012 03:39:32 +0000 (23:39 -0400)]
minor but worthwhile optimization in printf: avoid expensive strspn
the strspn call was made for every format specifier and end-of-string,
even though the expected return value was 1-2 for normal usage.
replace with simple loop.
Rich Felker [Fri, 10 Aug 2012 19:13:26 +0000 (15:13 -0400)]
use int instead of long for ptrdiff_t on all 32-bit archs
this is needed to match the underlying "ABI" standards. it's not
really an ABI issue since the binary representations are the same, but
having the wrong type can lead to errors when the type arising from a
difference-of-pointers expression does not match the defined type of
ptrdiff_t. most of the problems affect C++, not C.
Rich Felker [Fri, 10 Aug 2012 04:20:00 +0000 (00:20 -0400)]
add blowfish hash support to crypt
there are still some discussions going on about tweaking the code, but
at least thing brings us to the point of having something working in
the repository. hopefully the remaining major hashes (md5,sha) will
follow soon.
Rich Felker [Fri, 10 Aug 2012 02:52:13 +0000 (22:52 -0400)]
fix (hopefully) all hard-coded 8's for kernel sigset_t size
some minor changes to how hard-coded sets for thread-related purposes
are handled were also needed, since the old object sizes were not
necessarily sufficient. things have gotten a bit ugly in this area,
and i think a cleanup is in order at some point, but for now the goal
is just to get the code working on all supported archs including mips,
which was badly broken by linux rejecting syscalls with the wrong
sigset_t size.
Rich Felker [Fri, 10 Aug 2012 00:47:17 +0000 (20:47 -0400)]
make crypt return an unmatchable hash rather than NULL on failure
unfortunately, a large portion of programs which call crypt are not
prepared for its failure and do not check that the return value is
non-null before using it. thus, always "succeeding" but giving an
unmatchable hash is reportedly a better behavior than failing on
error.
it was suggested that we could do this the same way as other
implementations and put the null-to-unmatchable translation in the
wrapper rather than the individual crypt modules like crypt_des, but
when i tried to do it, i found it was making the logic in __crypt_r
for keeping track of which hash type we're working with and whether it
succeeded or failed much more complex, and potentially error-prone.
the way i'm doing it now seems to have essentially zero cost, anyway.
Rich Felker [Sun, 5 Aug 2012 18:12:10 +0000 (14:12 -0400)]
align mips _init/_fini functions
since .init and .fini are not .text, the toolchain does not seem to
align them for code by default. this yields random breakage depending
on the object sizes the linker is dealing with.
Rich Felker [Sun, 5 Aug 2012 16:50:26 +0000 (12:50 -0400)]
mips dynamic linker support
not heavily tested, but the basics are working. the basic concept is
that the dynamic linker entry point code invokes a pure-PIC (no global
accesses) C function in reloc.h to perform the early GOT relocations
needed to make the dynamic linker itself functional, then invokes
__dynlink like on other archs. since mips uses some ugly arch-specific
hacks to optimize relocating the GOT (rather than just using the
normal DT_REL[A] tables like on other archs), the dynamic linker has
been modified slightly to support calling arch-specific relocation
code in reloc.h.
most of the actual mips-specific behavior was developed by reading the
output of readelf on libc.so and simple executable files. i could not
find good reference information on which relocation types need to be
supported or their semantics, so it's possible that some legitimate
usage cases will not work yet.
Rich Felker [Sun, 5 Aug 2012 16:32:20 +0000 (12:32 -0400)]
make dynlink.lo depend on reloc.h in makefile
this is mainly a development convenience but will also ensure users
building from latest git always get up-to-date arch-specific dynamic
linker code without having to "make clean".
Rich Felker [Sun, 5 Aug 2012 06:44:32 +0000 (02:44 -0400)]
more dynamic linker internals cleanup
changing the string printed for the dso name is not a regression; the
old code was simply using the wrong dso name (head rather than the dso
currently being relocated). this will be fixed in a later commit.
Rich Felker [Sun, 5 Aug 2012 06:37:14 +0000 (02:37 -0400)]
floating point support for arm setjmp/longjmp
not heavily tested, but at least they don't seem to break anything on
soft float targets with or without coprocessors. they check the auxv
AT_HWCAP flags to determine which coprocessor, if any, is available.
Rich Felker [Fri, 3 Aug 2012 01:02:34 +0000 (21:02 -0400)]
fix argument type error on wcwidth function
since the correct declaration was not visible, and since the
representation of the types wchar_t and wint_t always match, a
compiler would have to go out of its way to make this bug manifest,
but better to fix it anyway.
save AT_HWCAP from auxv for subsequent use in machine-specific code
it's expected that this will be needed/useful only in asm, so I've
given it its own symbol that can be addressed in pc-relative ways from
asm rather than adding a field in the __libc structure which would
require hard-coding the offset wherever it's used.
this seems counter-intuitive since sem_trywait is supposed to just try
once, not wait for the semaphore. however, the retry loop is not a
wait. instead, it's to handle the case where the value changes due to
a simultaneous post or wait from another thread while the semaphore
value remains positive. in such a case, it's absolutely wrong for
sem_trywait to fail with EAGAIN because the semaphore is not busy.
based on patches by orc and Isaac Dunham, with some fixes. sys/io.h
exists and contains prototypes for these functions regardless of
whether the target arch has them; this is a bit unorthodox but I don't
think it will break anything. the function definitions do not exist
unless the appropriate SYS_* syscall number macro is defined, which
should make sure configure scripts looking for these functions don't
find them on other systems.
presently, sys/io.h does not have the inb/outb/etc. port io
macros/functions. I'd be surprised if ioperm/iopl are useful without
them, so they probably need to be added at some point in appropriate
bits/io.h files...
add floating point register saving/restoring to mips setjmp/longjmp
also fix the alignment of jmp_buf to meet the abi. linux always
emulates fpu on mips if it's not present, so enabling this code
unconditionally is "safe" but may be slow. in the long term it may be
preferable to find a way to disable it on soft float builds.
the fields in the mcontext_t are long long (for no good reason) even
on 32-bit mips, so the offset of the instruction pointer (as a word)
varies depending on endianness.
workaround another sendmsg kernel bug on 64-bit machines
the kernel wrongly expects the cmsg length field to be size_t instead
of socklen_t. in order to work around the issue, we have to impose a
length limit and copy to a local buffer. the length limit should be
more than sufficient for any real-world use; these headers are only
used for passing file descriptors and permissions between processes
over unix sockets.
fix several locks that weren't updated right for new futex-based __lock
these could have caused memory corruption due to invalid accesses to
the next field. all should be fixed now; I found the errors with fgrep
-r '__lock(&', which is bogus since the argument should be an array.
after the thread unmaps its own stack/thread structure, the kernel,
performing child tid clear and futex wake, could clobber a new mapping
made at the same location as the just-removed thread's tid field.
disable kernel clearing of child tid to prevent this.
mips clone: don't free stack space used to copy arg
the mips abi reserves stack space equal to the size of the in-register
args for the callee to save the args, if desired. this would cause the
beginning of the thread structure to be clobbered...
the old code worked in qemu app-level emulation, but not on real
kernels where the clone syscall does not copy the register values to
the new thread. save arguments on the new thread stack instead.
initial version of mips (o32) port, based on work by Richard Pennington (rdp)
basically, this version of the code was obtained by starting with
rdp's work from his ellcc source tree, adapting it to musl's build
system and coding style, auditing the bits headers for discrepencies
with kernel definitions or glibc/LSB ABI or large file issues, fixing
up incompatibility with the old binutils from aboriginal linux, and
adding some new special cases to deal with the oddities of sigaction
and pipe syscall interfaces on mips.
at present, minimal test programs work, but some interfaces are broken
or missing. threaded programs probably will not link.
if libc.a is compiled PIC for use in static PIE code, this should not
cause the dynamic linker (which still does not support static-linked
main program) to be built into libc.a.
fix lots of breakage on dlopen, mostly with explicit pathnames
most importantly, the name for such libs was being set from an
uninitialized buffer. also, shortname always had an initial '/'
character, making it useless for looking up already-loaded libraries
by name, and thus causing repeated searches through the library path.
major changes now:
- shortname is the base name for library lookups with no explicit
pathname. it's initially clear for libraries loaded with an explicit
pathname (and for the main program), but will be set if the same
library (detected via inodes match) is later found by a search.
- exact name match is never used to identify libraries loaded with an
explicit pathname. in this case, there's no explicit search, so we
can just stat the file and check for inode match.
previously this was being handled the same as a library-specific,
dependency-order lookup on the next library in the global chain, which
is likely to be utterly meaningless. instead the lookup needs to be in
the global namespace, but omitting the initial portion of the global
library chain up through the calling library.
this option is expensive and only used on old gcc's that lack
-fexcess-precision=standed, but it's not needed on non-i386 archs
where floating point does not have excess precision anyway.
if musl ever supports m68k, i think it will need to be special-cased
too. i'm not aware of any other archs with excess precision.
on arm, the location of the saved-signal-mask flag and mask were off
by one between sigsetjmp and siglongjmp, causing incorrect behavior
restoring the signal mask. this is because the siglongjmp code assumed
an extra slot was in the non-sig jmp_buf for the flag, but arm did not
have this. now, the extra slot is removed for all archs since it was
useless.
also, arm eabi requires jmp_buf to have 8-byte alignment. we achieve
that using long long as the type rather than with non-portable gcc
attribute tags.
no idea why gcc refuses to compile the C code to use a tail call, but
it's best to use asm anyway so we don't have to rely on the quality of
the compiler's optimizations for correct code.
Rich Felker [Fri, 29 Jun 2012 04:56:37 +0000 (00:56 -0400)]
replace old and ugly crypt implementation
the new version is largely the work of Solar Designer, with minor
changes for integration with musl. compared to the old code, text size
is reduced by about 7k, stack space usage by about 70k, and
performance is greatly improved by avoiding expensive calculation of
constant tables on each run.
this version also adds support for extended des-based password hashes,
which allow for unlimited key (password) length and configurable
iteration counts.
i've also published the interface for crypt_r in a new crypt.h header.
especially since this is not a standard interface, i did not feel
compelled to match the glibc abi for the crypt_data structure. the
glibc structure is way too big to allocate on the stack; in fact it's
so big that the first usage may cause the main thread to exceed its
pre-committed stack size of 128k and thus could cause the program to
crash even on systems with overcommit disabled. the only legitimate
use of crypt_data for crypt_r is to store the hash string to return,
so i've reserved 256 bytes, which should be more than sufficient
(longest known password hashes are ~60 characters, and beyond that is
possibly even exceeding some implementations' passwd file field size
limit).
Rich Felker [Mon, 25 Jun 2012 20:06:09 +0000 (16:06 -0400)]
fix arm crti/crtn code
lr must be saved because init/fini-section code from the compiler
clobbers it. this was not a problem when i tested without gcc's
crtbegin/crtend files present, but with them, musl on arm fails to
work (infinite loop in _init).
Rich Felker [Thu, 21 Jun 2012 02:16:47 +0000 (22:16 -0400)]
proper error handling for fcntl F_GETOWN on modern kernels
on old kernels, there's no way to detect errors; we must assume
negative syscall return values are pgrp ids. but if the F_GETOWN_EX
fcntl works, we can get a reliable answer.
nsz [Wed, 20 Jun 2012 21:25:58 +0000 (23:25 +0200)]
math: fix fma bug on x86 (found by Bruno Haible with gnulib)
The long double adjustment was wrong:
The usual check is
mant_bits & 0x7ff == 0x400
before doing a mant_bits++ or mant_bits-- adjustment since
this is the only case when rounding an inexact ld80 into
double can go wrong. (only in nearest rounding mode)
After such a check the ++ and -- is ok (the mantissa will end
in 0x401 or 0x3ff).
fma is a bit different (we need to add 3 numbers with correct
rounding: hi_xy + lo_xy + z so we should survive two roundings
at different places without precision loss)
The adjustment in fma only checks for zero low bits
mant_bits & 0x3ff == 0
this way the adjusted value is correct when rounded to
double or *less* precision.
(this is an important piece in the fma puzzle)
Unfortunately in this case the -- is not a correct adjustment
because mant_bits might underflow so further checks are needed
and this was the source of the bug.
Rich Felker [Wed, 20 Jun 2012 19:22:03 +0000 (15:22 -0400)]
fix broken wcwidth tables
unicode char data has both "W" and "F" wide types and the old table
only included the "W" ones. this omitted U+3000 (ideographic space)
and all the wide-ascii, etc.