Rich Felker [Sat, 5 Oct 2013 16:00:55 +0000 (12:00 -0400)]
slightly optimize __brk for size
there is no reason to check the return value for setting errno, since
brk never returns errors, only the new value of the brk (which may be
the same as the old, or otherwise differ from the requested brk, on
failure).
it may be beneficial to eventually just eliminate this file and make
the syscalls inline in malloc.c.
fix off-by-one error in getgrnam_r and getgrgid_r, clobbering gr_name
bug report and patch by Michael Forney. the terminating null pointer
at the end of the gr_mem array was overwriting the beginning of the
string data, causing the gr_name member to always be a zero-length
string.
"If wn becomes 0 after processing a chunk of 4, mbsrtowcs currently
continues on, wrapping wn around to -1, causing the rest of the string
to be processed.
This resulted in buffer overruns if there was only space in ws for wn
wide characters."
the original patch submitted added an additional check for !wn after
the loop; to avoid extra branching, I instead just changed the wn>=4
check to wn>=5 to ensure that at least one slot remains after the
word-at-a-time loop runs. this should not slow down the tail
processing on real-world usage, since an extra slot that can't be
processed in the word-at-a-time loop is needed for the null
termination anyway.
Szabolcs Nagy [Fri, 27 Sep 2013 13:55:29 +0000 (13:55 +0000)]
math: fix comparision macros (isless etc) when FLT_EVAL_METHOD!=0
This is a change in ISO C11 annex F (F.10.11p1), comparision macros
can't round their arguments to their semantic type when the evaluation
format has wider range and precision. (ie. they must be consistent with
the builtin relational operators)
fix arm atomic store and generate simpler/less-bloated/faster code
atomic store was lacking a barrier, which was fine for legacy arm with
no real smp and kernel-emulated cas, but unsuitable for more modern
systems. the kernel provides another "kuser" function, at 0xffff0fa0,
which could be used for the barrier, but using that would drop support
for kernels 2.6.12 through 2.6.14 unless an extra conditional were
added to check for barrier availability. just using the barrier in the
kernel cas is easier, and, based on my reading of the assembly code in
the kernel, does not appear to be significantly slower.
at the same time, other atomic operations are adapted to call the
kernel cas function directly rather than using a_cas; due to small
differences in their interface contracts, this makes the generated
code much simpler.
fix potential deadlock bug in libc-internal locking logic
if a multithreaded program became non-multithreaded (i.e. all other
threads exited) while one thread held an internal lock, the remaining
thread would fail to release the lock. the the program then became
multithreaded again at a later time, any further attempts to obtain
the lock would deadlock permanently.
the underlying cause is that the value of libc.threads_minus_1 at
unlock time might not match the value at lock time. one solution would
be returning a flag to the caller indicating whether the lock was
taken and needs to be unlocked, but there is a simpler solution: using
the lock itself as such a flag.
note that this flag is not needed anyway for correctness; if the lock
is not held, the unlock code is harmless. however, the memory
synchronization properties associated with a_store are costly on some
archs, so it's best to avoid executing the unlock code when it is
unnecessary.
fix clobbering of caller's stack in mips __clone function
this was resulting in crashes in posix_spawn on mips, and would have
affected applications calling clone too. since the prototype for
__clone has it as a variadic function, it may not assume that 16($sp)
is writable for use in making the syscall. instead, it needs to
allocate additional stack space, and then adjust the stack pointer
back in both of the code paths for the parent process/thread.
Szabolcs Nagy [Sun, 15 Sep 2013 15:32:07 +0000 (15:32 +0000)]
sys/resource.h: add PRIO_MIN and PRIO_MAX for getpriority and setpriority
These constants are not specified by POSIX, but they are in the reserved
namespace, glibc and bsd systems seem to provide them as well.
(Note that POSIX specifies -NZERO and NZERO-1 to be the limits, but
PRIO_MAX equals NZERO)
Szabolcs Nagy [Sun, 15 Sep 2013 14:53:34 +0000 (14:53 +0000)]
update include/elf.h following glibc changes
the changes were verified using various sources:
linux: include/uapi/linux/elf.h
binutils: include/elf/common.h
glibc: elf/elf.h
sysv gabi: http://www.sco.com/developers/gabi/latest/contents.html
sun linker docs: http://docs.oracle.com/cd/E18752_01/pdf/817-1984.pdf
and platform specific docs
- fixed:
EF_MIPS_* E_MIPS_* e_flags: fixed accoding to glibc and binutils
- added:
ELFOSABI_GNU for EI_OSABI entry: glibc, binutils and sysv gabi
EM_* e_machine values: updated according to linux and glibc
PN_XNUM e_phnum value: from glibc and linux, see oracle docs
NT_* note types: updated according to linux and glibc
DF_1_* flags for DT_FLAGS_1 entry: following glibc and oracle docs
AT_HWCAP2 auxv entry for more hwcap bits accoding to linux and glibc
R_386_SIZE32 relocation according to glibc and binutils
EF_ARM_ABI_FLOAT_* e_flags: added following glibc and binutils
R_AARCH64_* relocs: added following glibc and aarch64 elf specs
R_ARM_* relocs: according to glibc, binutils and arm elf specs
R_X86_64_* relocs: added missing relocs following glibc
- removed:
HWCAP_SPARC_* flags were moved to arch specific header in glibc
R_ARM_SWI24 reloc is marked as obsolete in glibc, not present in binutils
not specified in arm elf spec, R_ARM_TLS_DESC reused its number
see http://www.codesourcery.com/publications/RFC-TLSDESC-ARM.txt
- glibc changes not pulled in:
ELFOSABI_ARM_AEABI (bare-metal system, binutils and glibc disagrees about the name)
R_68K_* relocs for unsupported platform
R_SPARC_* ditto
EF_SH* ditto (e_flags)
EF_S390* ditto (e_flags)
R_390* ditto
R_MN10300* ditto
R_TILE* ditto
CLONE_PARENT is not necessary (CLONE_THREAD provides all the useful
parts of it) and Linux treats CLONE_PARENT as an error in certain
situations, without noticing that it would be a no-op due to
CLONE_THREAD. this error case prevents, for example, use of a
multi-threaded init process and certain usages with containers.
Szabolcs Nagy [Sun, 15 Sep 2013 02:42:29 +0000 (02:42 +0000)]
net/if_arp.h: add missing ARP hardware identifiers from linux uapi headers
the removed ARPHRD_IEEE802154_PHY was only present in the kernel api
in v2.6.31 (by accident), but it got into the glibc headers (in 2009)
and remained there since this header was not updated since then.
Szabolcs Nagy [Sun, 15 Sep 2013 02:00:32 +0000 (02:00 +0000)]
support configurable page size on mips, powerpc and microblaze
PAGE_SIZE was hardcoded to 4096, which is historically what most
systems use, but on several archs it is a kernel config parameter,
user space can only know it at execution time from the aux vector.
PAGE_SIZE and PAGESIZE are not defined on archs where page size is
a runtime parameter, applications should use sysconf(_SC_PAGE_SIZE)
to query it. Internally libc code defines PAGE_SIZE to libc.page_size,
which is set to aux[AT_PAGESZ] in __init_libc and early in __dynlink
as well. (Note that libc.page_size can be accessed without GOT, ie.
before relocations are done)
Some fpathconf settings are hardcoded to 4096, these should be actually
queried from the filesystem using statfs.
unlike other archs, the mips version of clone was not doing anything
to align the stack pointer. this seems to have been the cause for some
SIGBUS crashes that were observed in posix_spawn.
the underlying problem was not incorrect sign extension (fixed in the
previous commit to this file by nsz) but that code that treats "long"
as 32-bit was copied blindly from i386 to x86_64.
now lrintl is identical to llrintl on x86_64, as it should be.
do not use default when dynamic linker fails to open existing path file
if fopen fails for a reason other than ENOENT, we must assume the
intent is that the path file be used. failure may be due to
misconfiguration or intentional resource-exhaustion attack (against
suid programs), in which case falling back to loading libraries from
an unintended path could be dangerous.
Szabolcs Nagy [Fri, 6 Sep 2013 18:35:55 +0000 (18:35 +0000)]
math: remove STRICT_ASSIGN macro
gcc did not always drop excess precision according to c99 at assignments
before version 4.5 even if -std=c99 was requested which caused badly
broken mathematical functions on i386 when FLT_EVAL_METHOD!=0
but STRICT_ASSIGN was not used consistently and it is worked around for
old compilers with -ffloat-store so it is no longer needed
the new convention is to get the compiler respect c99 semantics and when
excess precision is not harmful use float_t or double_t or to specialize
code using FLT_EVAL_METHOD
Szabolcs Nagy [Thu, 5 Sep 2013 18:05:07 +0000 (18:05 +0000)]
math: support invalid ld80 representations in fpclassify
apparently gnulib requires invalid long double representations
to be handled correctly in printf so we classify them according
to how the fpu treats them: bad inf is nan, bad nan is nan,
bad normal is nan and bad subnormal/zero is minimal normal
Szabolcs Nagy [Thu, 5 Sep 2013 12:26:26 +0000 (12:26 +0000)]
math: fix acoshf on negative values
acosh(x) is invalid for x<1, acoshf tried to be clever using
signed comparisions to handle all x<2 the same way, but the
formula was wrong on large negative values.
Szabolcs Nagy [Thu, 5 Sep 2013 10:58:48 +0000 (10:58 +0000)]
math: fix exp2l asm on x86 (raise underflow correctly)
there were two problems:
* omitted underflow on subnormal results: exp2l(-16383.5) was calculated
as sqrt(2)*2^-16384, the last bits of sqrt(2) are zero so the down scaling
does not underflow eventhough the result is in subnormal range
* spurious underflow for subnormal inputs: exp2l(0x1p-16400) was evaluated
as f2xm1(x)+1 and f2xm1 raised underflow (because inexact subnormal result)
the first issue is fixed by raising underflow manually if x is in
(-32768,-16382] and not integer (x-0x1p63+0x1p63 != x)
the second issue is fixed by treating x in (-0x1p64,0x1p64) specially
for these fixes the special case handling was completely rewritten
Szabolcs Nagy [Wed, 4 Sep 2013 15:51:05 +0000 (15:51 +0000)]
math: cbrt cleanup and long double fix
* use float_t and double_t
* cleanup subnormal handling
* bithacks according to the new convention (ldshape for long double
and explicit unions for float and double)
Szabolcs Nagy [Wed, 4 Sep 2013 07:51:11 +0000 (07:51 +0000)]
math: fix underflow in exp*.c and long double handling in exp2l
* don't care about inexact flag
* use double_t and float_t (faster, smaller, more precise on x86)
* exp: underflow when result is zero or subnormal and not -inf
* exp2: underflow when result is zero or subnormal and not exact
* expm1: underflow when result is zero or subnormal
* expl: don't underflow on -inf
* exp2: fix incorrect comment
* expm1: simplify special case handling and overflow properly
* expm1: cleanup final scaling and fix negative left shift ub (twopk)
Szabolcs Nagy [Tue, 3 Sep 2013 18:50:58 +0000 (18:50 +0000)]
math: long double trigonometric cleanup (cosl, sinl, sincosl, tanl)
ld128 support was added to internal kernel functions (__cosl, __sinl,
__tanl, __rem_pio2l) from freebsd (not tested, but should be a good
start for when ld128 arch arrives)
__rem_pio2l had some code cleanup, the freebsd ld128 code seems to
gather the results of a large reduction with precision loss (fixed
the bug but a todo comment was added for later investigation)
the old copyright was removed from the non-kernel wrapper functions
(cosl, sinl, sincosl, tanl) since these are trivial and the interesting
parts and comments had been already rewritten.
Szabolcs Nagy [Tue, 3 Sep 2013 14:37:48 +0000 (14:37 +0000)]
math: rewrite hypot
method: if there is a large difference between the scale of x and y
then the larger magnitude dominates, otherwise reduce x,y so the
argument of sqrt (x*x+y*y) does not overflow or underflow and calculate
the argument precisely using exact multiplication. If the argument
has less error than 1/sqrt(2) ~ 0.7 ulp, then the result has less error
than 1 ulp in nearest rounding mode.
the original fdlibm method was the same, except it used bit hacks
instead of dekker-veltkamp algorithm, which is problematic for long
double where different representations are supported. (the new hypot
and hypotl code should be smaller and faster on 32bit cpu archs with
fast fpu), the new code behaves differently in non-nearest rounding,
but the error should be still less than 2ulps.
* results are exact
* modfl follows truncl (raises inexact flag spuriously now)
* modf and modff only had cosmetic cleanup
* remainder is just a wrapper around remquo now
* using iterative shift+subtract for remquo and fmod
* ld80 and ld128 are supported as well
* faster, smaller, cleaner implementation than the bit hacks of fdlibm
* use arithmetics like y=(double)(x+0x1p52)-0x1p52, which is an integer
neighbor of x in all rounding modes (0<=x<0x1p52) and only use bithacks
when that's faster and smaller (for float it usually is)
* the code assumes standard excess precision handling for casts
* long double code supports both ld80 and ld128
* nearbyint is not changed (it is a wrapper around rint)
Szabolcs Nagy [Mon, 2 Sep 2013 23:33:20 +0000 (23:33 +0000)]
math: ilogb cleanup
* consistent code style
* explicit union instead of typedef for double and float bit access
* turn FENV_ACCESS ON to make 0/0.0f raise invalid flag
* (untested) ld128 version of ilogbl (used by logbl which has ld128 support)
in synccall, ignore the signal before any threads' signal handlers return
this protects against deadlock from spurious signals (e.g. sent by
another process) arriving after the controlling thread releases the
other threads from the sync operation.
fix invalid pointer in synccall (multithread setuid, etc.)
the head pointer was not being reset between calls to synccall, so any
use of this interface more than once would build the linked list
incorrectly, keeping the (now invalid) list nodes from the previous
call.
avoid crash in scanf when invalid %m format is encountered
invalid format strings invoke undefined behavior, so this is not a
conformance issue, but it's nicer for scanf to report the error safely
instead of calling free on a potentially-uninitialized pointer or a
pointer to memory belonging to the caller.
Rich Felker [Sat, 31 Aug 2013 19:50:23 +0000 (15:50 -0400)]
debloat realpath's allocation strategy
rather than allocating a PATH_MAX-sized buffer when the caller does
not provide an output buffer, work first with a PATH_MAX-sized temp
buffer with automatic storage, and either copy it to the caller's
buffer or strdup it on success. this not only avoids massive memory
waste, but also avoids pulling in free (and thus the full malloc
implementation) unnecessarily in static programs.
Rich Felker [Sat, 31 Aug 2013 19:44:58 +0000 (15:44 -0400)]
make realpath use O_PATH when opening the file
this avoids failure if the file is not readable and avoids odd
behavior for device nodes, etc. on old kernels that lack O_PATH, the
old behavior (O_RDONLY) will naturally happen as the fallback.
Rich Felker [Sat, 31 Aug 2013 05:12:00 +0000 (01:12 -0400)]
fix breakage in synccall due to incorrect signal restoration in sigqueue
commit 07827d1a82fb33262f686eda959857f0d28cd8fa seems to have
introduced this issue. sigqueue is called from the synccall core, at
which time, even implementation-internal signals are blocked. however,
pthread_sigmask removes the implementation-internal signals from the
old mask before returning, so that a process which began life with
them blocked will not be able to save a signal mask that has them
blocked, possibly causing them to become re-blocked later. however,
this was causing sigqueue to unblock the implementation-internal
signals during synccall, leading to deadlock.
Rich Felker [Fri, 30 Aug 2013 21:06:17 +0000 (17:06 -0400)]
only expose struct tcphdr under _GNU_SOURCE
the BSD and GNU versions of this structure differ, so exposing it in
the default _BSD_SOURCE profile is possibly problematic. both versions
could be simultaneously supported with anonymous unions if needed in
the future, but for now, just omitting it except under _GNU_SOURCE
should be safe.
Rich Felker [Wed, 28 Aug 2013 09:08:16 +0000 (05:08 -0400)]
remove -Wcast-align from --enable-warnings
I originally added this warning option based on a misunderstanding of
how it works. it does not warn whenever the destination of the cast
has stricter alignment; it only warns in cases where misaligned
dereference could lead to a fault. thus, it's essentially a no-op for
i386, which had me wrongly believing the code was clean for this
warning level. on other archs, numerous diagnostic messages are
produced, and all of them are false-positives, so it's better just not
to use it.
Rich Felker [Wed, 28 Aug 2013 07:34:57 +0000 (03:34 -0400)]
optimized C memcpy
unlike the old C memcpy, this version handles word-at-a-time reads and
writes even for misaligned copies. it does not require that the cpu
support misaligned accesses; instead, it performs bit shifts to
realign the bytes for the destination.
essentially, this is the C version of the ARM assembly language
memcpy. the ideas are all the same, and it should perform well on any
arch with a decent number of general-purpose registers that has a
barrel shift operation. since the barrel shifter is an optional cpu
feature on microblaze, it may be desirable to provide an alternate asm
implementation on microblaze, but otherwise the C code provides a
competitive implementation for "generic risc-y" cpu archs that should
alleviate the urgent need for arch-specific memcpy asm.
Rich Felker [Wed, 28 Aug 2013 04:41:00 +0000 (00:41 -0400)]
stdbool.h should define __bool_true_false_are_defined even for C++
while the incorporation of this requirement from C99 into C++11 was
likely an accident, some software expects it to be defined, and it
doesn't hurt. if the requirement is removed, then presumably
__bool_true_false_are_defined would just be in the implementation
namespace and thus defining it would still be legal.
Rich Felker [Tue, 27 Aug 2013 22:08:29 +0000 (18:08 -0400)]
optimized C memset
this version of memset is optimized both for small and large values of
n, and makes no misaligned writes, so it is usable (and near-optimal)
on all archs. it is capable of filling up to 52 or 56 bytes without
entering a loop and with at most 7 branches, all of which can be fully
predicted if memset is called multiple times with the same size.
it also uses the attribute extension to inform the compiler that it is
violating the aliasing rules, unlike the previous code which simply
assumed it was safe to violate the aliasing rules since translation
unit boundaries hide the violations from the compiler. for non-GNUC
compilers, 100% portable fallback code in the form of a naive loop is
provided. I intend to eventually apply this approach to all of the
string/memory functions which are doing word-at-a-time accesses.
Rich Felker [Tue, 27 Aug 2013 21:33:47 +0000 (17:33 -0400)]
add attribute((may_alias)) checking in configure
this will be needed for upcoming commits to the string/mem functions
to correct their unannounced use of aliasing violations for
word-at-a-time search, fill, and copy operations.
Rich Felker [Sun, 25 Aug 2013 06:02:15 +0000 (02:02 -0400)]
add the %s (seconds since the epoch) format to strftime
this is a nonstandard extension but will be required in the next
version of POSIX, and it's widely used/useful in shell scripts
utilizing the date utility.
Rich Felker [Sat, 24 Aug 2013 16:59:02 +0000 (12:59 -0400)]
fix strftime handling of time zone data
this may need further revision in the future, since POSIX is rather
unclear on the requirements, and is designed around the assumption of
POSIX TZ specifiers which are not sufficiently powerful to represent
real-world timezones (this is why zoneinfo support was added).
the basic issue is that strftime gets the string and numeric offset
for the timezone from the extra fields in struct tm, which are
initialized when calling localtime/gmtime/etc. however, a conforming
application might have created its own struct tm without initializing
these fields, in which case using __tm_zone (a pointer) could crash.
other zoneinfo-based implementations simply check for a null pointer,
but otherwise can still crash of the field contains junk.
simply ignoring __tm_zone and using tzname[] would "work" but would
give incorrect results in time zones with more complex rules. I feel
like this would lower the quality of implementation.
instead, simply validate __tm_zone: unless it points to one of the
zone name strings managed by the timezone system, assume it's invalid.
this commit also fixes several other minor bugs with formatting:
tm_isdst being negative is required to suppress printing of the zone
formats, and %z was using the wrong format specifiers since the type
of val was changed, resulting in bogus output.
Rich Felker [Sat, 24 Aug 2013 03:07:09 +0000 (23:07 -0400)]
fix mishandling of empty or blank TZ environment variable
the empty TZ string was matching equal to the initial value of the
cached TZ name, thus causing do_tzset never to run and never to
initialize the time zone data.
Rich Felker [Fri, 23 Aug 2013 19:51:59 +0000 (15:51 -0400)]
fix bugs in $ORIGIN handling
1. an occurrence of ${ORIGIN} before $ORIGIN would be ignored due to
the strstr logic. (note that rpath contains multiple :-delimited paths
to be searched.)
Rich Felker [Fri, 23 Aug 2013 18:14:47 +0000 (14:14 -0400)]
use AT_EXECFN, if available, for dynamic linker to identify main program
fallback to argv[0] as before. unlike argv[0], AT_EXECFN was a valid
(but possibly relative) pathname for the new program image at the time
the execve syscall was made.
as a special case, ignore AT_EXECFN if it begins with "/proc/", in
order not to give bogus (and possibly harmful) results when fexecve
was used.
Rich Felker [Fri, 23 Aug 2013 15:15:40 +0000 (11:15 -0400)]
add recursive rpath support to dynamic linker
previously, rpath was only honored for direct dependencies. in other
words, if A depends on B and B depends on C, only B's rpath (if any),
not A's rpath, was being searched for C. this limitation made
rpath-based deployment difficult in the presence of multiple levels of
library dependency.
at present, $ORIGIN processing in rpath is still unsupported.
Rich Felker [Fri, 23 Aug 2013 02:36:19 +0000 (22:36 -0400)]
add strftime and wcsftime field widths
at present, since POSIX requires %F to behave as %+4Y-%m-%d and ISO C
requires %F to behave as %Y-%m-%d, the default behavior for %Y has
been changed to match %+4Y. this seems to be the only way to conform
to the requirements of both standards, and it does not affect years
prior to the year 10000. depending on the outcome of interpretations
from the standards bodies, this may be adjusted at some point.
Rich Felker [Thu, 22 Aug 2013 23:44:02 +0000 (19:44 -0400)]
simplify strftime and fix integer overflows
use a long long value so that even with offsets, values cannot
overflow. instead of using different format strings for different
numeric formats, simply use a per-format width and %0*lld for all of
them.
this width specifier is not for use with strftime field widths; that
will be a separate step in the caller.