Dmitry V. Levin [Wed, 12 Oct 2011 19:03:29 +0000 (19:03 +0000)]
Add names for dummy parsers. No code changes
* linux/dummy.h: Add aliases to printargs() for those of dummy parsers
that had no own names before.
* linux/*/syscallent.h: Use these new names instead of printargs.
Dmitry V. Levin [Tue, 11 Oct 2011 17:07:05 +0000 (17:07 +0000)]
Implement decoding of splice, tee and vmsplice(2) syscalls
* io.c (print_loff_t): New function.
(sys_sendfile64): Use it.
(splice_flags): New xlat structure.
(sys_tee, sys_splice, sys_vmsplice): New functions.
* linux/syscall.h (sys_tee, sys_splice, sys_vmsplice): Declare them.
* linux/*/syscallent.h: Use them.
Do post-attach initialization earlier; fix "we ignore SIGSTOP on NOMMU" bug
We set ptrace options when we see post-attach SIGSTOP.
This is wrong: it's better to set them right away on the very first
stop (whichever it will be). It also will make adding SEIZE support easier,
since SEIZE has no post-attach SIGSTOP.
We do it by adding a new bit, TCB_IGNORE_ONE_SIGSTOP, and treating
TCB_STARTUP and TCB_IGNORE_ONE_SIGSTOP as two slightly different things.
* defs.h: Add a new flag bit, TCB_IGNORE_ONE_SIGSTOP.
* process.c (internal_fork): Set TCB_IGNORE_ONE_SIGSTOP on a newly added child.
* strace.c (startup_attach): Set TCB_IGNORE_ONE_SIGSTOP after attach.
Fix a case when "strace -p PID" found PID dead but sone other of its threads
still alive.
(startup_child): Set TCB_IGNORE_ONE_SIGSTOP after attach, _if needed_.
This fixes a bogus case where we can ignore a _real_ SIGSTOP on NOMMU.
(detach): Perform anti-SIGSTOP dance only if TCB_IGNORE_ONE_SIGSTOP is set,
not if TCB_STARTUP is set.
(trace): Set TCB_IGNORE_ONE_SIGSTOP after attach.
Clear TCB_STARTUP and initialize tracee on the very first tracee stop.
Clear TCB_IGNORE_ONE_SIGSTOP when SIGSTOP is seen.
* defs.h: Remove TCB_ATTACH_DONE constant.
* strace.c (startup_attach): Use TCB_STARTUP instead of TCB_ATTACH_DONE
to distinquish attached from not-yet-attached threads.
This fixes logic in detach() which thinks that TCB_STARTUP
means that we are already attached, but did not see SIGSTOP yet.
This also allows to get rid of TCB_ATTACH_DONE flag.
* process.c (internal_fork): Set TCB_STARTUP after attach.
* strace.c (startup_attach): Likewise.
(startup_child): Likewise.
(alloc_tcb): Do not set TCB_STARTUP on tcb allocation - we are
not attached yet.
(trace): Set TCB_STARTUP when we detech an auto-attached child.
After recent change, select(2^31-1, NULL, NULL, NULL)
would make strace exit. This change caps fdsize so that
it is always in [0, 1025*1024], IOW: we will try to allocate at most
1 megabyte, which in practice will almost always work,
unlike malloc(2Gig).
* desc.c (decode_select): Cap fdsize to 1024*1024.
* pathtrace.c (pathtrace_match): Cap fdsize to 1024*1024.
* file.c (sys_getdents): Cap len to 1024*1024.
(sys_getdents64): Cap len to 1024*1024.
* util.c (dumpiov): Refuse to process iov with more than 1024*1024
elements. Don't die on malloc failure.
(dumpstr): Don't die on malloc failure.
Minor tweaks in startup_child(). Logic isn't changed (but code is)
* strace.c (startup_attach): Tweak comment.
(startup_child): Move common code out of ifdef.
Indent nested ifdefs. Tweak comments. Remove two
unnecessary calls to getpid().
Denys Vlasenko [Wed, 31 Aug 2011 13:58:06 +0000 (15:58 +0200)]
Add README-linux-ptrace file
I tried to push this doc to Michael Kerrisk <mtk.manpages@gmail.com>,
but got no reply. To avoid losing the document, let it live
in strace tree for now.
Denys Vlasenko [Wed, 31 Aug 2011 10:26:03 +0000 (12:26 +0200)]
Optimization: eliminate all remaining usages of strcat()
After this change, we don't use strcat() anywhere.
* defs.h: Change sprinttv() return type to char *.
* time.c (sprinttv): Return pointer past last stored char.
* desc.c (decode_select): Change printing logic in order to eliminate
usage of strcat() - use stpcpy(), *outptr++ = ch, sprintf() instead.
Also reduce usage of strlen().
* stream.c (decode_poll): Likewise.
Denys Vlasenko [Wed, 31 Aug 2011 10:22:56 +0000 (12:22 +0200)]
Optimize string_quote() for speed
* util.c (string_quote): Speed up check for terminating NUL.
Replace strintf() with open-coded binary to hex/oct conversions -
we potentially do them for every single byte, need to be fast.
Denys Vlasenko [Tue, 30 Aug 2011 16:53:49 +0000 (18:53 +0200)]
On X86_64 and I386, use PTRACE_GETREGS to fetch all registers
Before this change, registers were read with PTRACE_PEEKUSER
ptrace operation, one per register. This is slower than
fetching them all in one ptrace operation.
* defs.h: include asm/ptrace.h on X86_64 and I386.
* syscall.c: New static variables i386_regs and x86_64_regs.
Remove static eax/rax variables.
(get_scno): Fetch all registers with single PTRACE_GETREGS operation.
(get_syscall_result): Likewise.
(syscall_fixup_on_sysenter): Use PTRACE_GETREGS results in i386/x86_64_regs.
(syscall_enter): Set tcp->u_arg[i] from PTRACE_GETREGS results.
(get_error): Set tcp->u_rval, tcp->u_error from PTRACE_GETREGS results.
Denys Vlasenko [Thu, 25 Aug 2011 08:25:35 +0000 (10:25 +0200)]
Simplify syscall_fixup[_on_sysexit]
* syscall.c (syscall_fixup): Remove checks for entering(tcp).
Remove code which executes if exiting(tcp).
(syscall_fixup_on_sysexit): Remove code which executes
if entering(tcp). Remove checks for exiting(tcp).
Denys Vlasenko [Thu, 25 Aug 2011 08:23:00 +0000 (10:23 +0200)]
Split syscall_fixup into enter/exit pair of functions
* syscall.c: Create syscall_fixup_on_sysexit() which is a copy of
syscall_fixup().
(trace_syscall_exiting): Call syscall_fixup_on_sysexit() instead of
syscall_fixup().
Denys Vlasenko [Wed, 24 Aug 2011 23:27:59 +0000 (01:27 +0200)]
Optimize tabto()
tabto is used in many lines of strace output.
On glibc, tprintf("%*s", col - curcol, "") is noticeably slow
compared to tprintf(" "). Use the latter.
Observed ~15% reduction of time spent in userspace.
* defs.h: Drop extern declaration of acolumn. Make tabto()
take no parameters.
* process.c (sys_exit): Call tabto() with no parameters.
* syscall.c (trace_syscall_exiting): Call tabto() with no parameters.
* strace.c: Make acolumn static, add static char *acolumn_spaces.
(main): Allocate acolumn_spaces as a string of spaces.
(printleader): Call tabto() with no parameters.
(tabto): Use simpler method to print lots of spaces.
Denys Vlasenko [Wed, 24 Aug 2011 23:13:43 +0000 (01:13 +0200)]
Opotimize "scno >= 0 && scno < nsyscalls" check
gcc can't figure out on its own that this check can be done with
single compare, and does two compares. We can help it by casting
scno to unsigned long: ((unsigned long)(scno) < nsyscalls)
* defs.h: New macro SCNO_IN_RANGE(long_var).
* count.c (count_syscall): Use SCNO_IN_RANGE() instead of open-coded check.
* syscall.c (getrval2): Use SCNO_IN_RANGE() instead of open-coded check.
This fixes a bug: missing check for scno < 0 and scno > nsyscalls
instead of scno >= nsyscalls.
(get_scno): Use SCNO_IN_RANGE() instead of open-coded check.
This fixes a bug: scno > nsyscalls instead of scno >= nsyscalls.
(known_scno): Use SCNO_IN_RANGE() instead of open-coded check.
(internal_syscall): Likewise.
(syscall_enter): Likewise.
(trace_syscall_entering): Likewise.
(get_error): Likewise.
(trace_syscall_exiting): Likewise.
Denys Vlasenko [Wed, 24 Aug 2011 15:25:32 +0000 (17:25 +0200)]
Unify per-architecture post-execve SIGTRAP check.
Move post-execve SIGTRAP check from get_scno_on_sysenter
(multitude of places on many architectures) to a single location
in trace_syscall_entering. This loosens the logic for some arches,
since many of them had additional checks such as scno == 0.
However, on non-ancient Linux kernels we should never have post-execve
SIGTRAP in the first place, by virtue of using PTRACE_O_TRACEEXEC.
Denys Vlasenko [Wed, 24 Aug 2011 14:59:23 +0000 (16:59 +0200)]
Speed up x86 by avoiding EAX read on syscall entry
on x86, EAX read on syscall entry is not necessary if we know
that post-execve SIGTRAP is disabled by PTRACE_O_TRACEEXEC ptrace option.
This patch (a) moves EAX retrieval from syscall_fixup
to get_scno_on_sysexit, and (b) perform EAX retrieval in syscall_fixup
only if we are in syscall entry and PTRACE_O_TRACEEXEC option is not on.
* syscall.c (get_scno_on_sysexit): On I386 and X86_64, read eax/rax
which contain syscall return value.
(syscall_fixup): On I386 and X86_64, read eax/rax only on syscall enter
and only if PTRACE_O_TRACEEXEC is not in effect.
Denys Vlasenko [Wed, 24 Aug 2011 14:56:03 +0000 (16:56 +0200)]
Do not read syscall no in get_scno_on_sysexit
* syscall.c (get_scno_on_sysexit): Remove scno retrieval code, since
we don't save it anyway. This is the first real logic change
which should make strace faster: for example, on x64 ORIG_EAX
is no longer read in each syscall exit.
Denys Vlasenko [Wed, 24 Aug 2011 14:47:32 +0000 (16:47 +0200)]
get_scno is an unholy mess, make it less horrible
Currently, get_scno does *much* more than "get syscall no".
It checks for post-execve SIGTRAP. It checks for changes
in personality. It retrieves params on entry and registers on exit.
Worse still, it is different in different architectures: for example,
for AVR32 regs are fetched in get_scno(), while for e.g. I386
it is done in syscall_enter().
Another problem is that get_scno() is called on both syscall entry and
syscall exit, which is stupid: we don't need to know scno on syscall
exit, it is already known from last syscall entry and stored in
tcp->scno! In essence, get_scno() does two completely different things
on syscall entry and on exit, they are just mixed into one bottle, like
shampoo and conditioner.
The following patches will try to improve this situation.
This change duplicates get_scno into identical get_scno_on_sysenter,
get_scno_on_sysexit functions. Call them in syscall enter and syscall
exit, correspondingly.
* defs.h: Rename get_scno to get_scno_on_sysenter; declare it only
if USE_PROCFS.
* strace.c (proc_open): Call get_scno_on_sysenter instead of get_scno.
* syscall.c (get_scno): Split into two (so far identical) functions
get_scno_on_sysenter and get_scno_on_sysexit.
(trace_syscall_entering): Call get_scno_on_sysenter instead of get_scno.
(trace_syscall_exiting): Call get_scno_on_sysexit instead of get_scno.
Dmitry V. Levin [Tue, 23 Aug 2011 16:24:20 +0000 (16:24 +0000)]
Reduce code redundancy in syscall_enter()
* syscall.c [LINUX] (syscall_enter): Move tcp->u_nargs initialization
from arch-specific ifdefs to common code. Always cache tcp->u_nargs in
a local variable and use it in for() loops.
[IA64, AVR32] Rewrite tcp->u_arg[] initialization using a loop.
Denys Vlasenko [Tue, 23 Aug 2011 16:04:25 +0000 (18:04 +0200)]
Define MAX_ARGS to 6 for all Linux arches
* defs.h: Define MAX_ARGS to 6 for all Linux arches.
* linux/ia64/syscallent.h: Change all 8-argument printargs
to MA (MAX_ARGS).
linux/mips/syscallent.h: Change all two 7-argument printargs
to MA (MAX_ARGS).
Denys Vlasenko [Tue, 23 Aug 2011 11:32:38 +0000 (13:32 +0200)]
Cache tcp->u_nargs in a local variable for for() loops
Loops of the form "for (i = 0; i < tcp->u_nargs; i++) ..."
need to fetch tcp->u_nargs from memory on every iteration
if "..." part has a function call (gcc doesn't know that
tcp->u_nargs won't change). This can be sped up
by putting tcp->u_nargs in a local variable, which might
go into a CPU register.
* syscall.c (decode_subcall): Cache tcp->u_nargs in a local variable
as for() loop limit value.
(syscall_enter): Likewise.
Denys Vlasenko [Tue, 23 Aug 2011 11:24:17 +0000 (13:24 +0200)]
Stop using nargs == -1 in syscallent tables
Usage -1 as argument count in syscallent tables
necessitates the check for it, a-la:
if (sysent[tcp->scno].nargs != -1)
tcp->u_nargs = sysent[tcp->scno].nargs;
else
tcp->u_nargs = MAX_ARGS;
which is stupid: we waste cycles checking something which
is constant and known at compile time.
* defs.h: Make struct sysent::nargs unsigned.
* freebsd/i386/syscallent.h: Replace nargs of -1 with MA.
* linux/s390/syscallent.h: Likewise.
* linux/s390x/syscallent.h: Likewise.
* svr4/syscallent.h: Likewise.
* freebsd/syscalls.pl: Likewise in generator script.
* syscallent.sh: Likewise in generator script.
* syscall.c: Add define MA MAX_ARGS / undef MA around includes
of syscallent[N].h.
Denys Vlasenko [Mon, 22 Aug 2011 00:06:35 +0000 (02:06 +0200)]
Fix -z display.
Before this patch, the following:
open("qwerty", O_RDONLY) = -1 ENOENT
write(2, "wc: qwerty: No such file or dire"..., 38) = 38
was shown totally wrongly with -z:
open("qwerty", O_RDONLY) = 38
(yes, that's right, write syscall is lost!)
Now it is shown "less wrongly" as:
open("qwerty", O_RDONLY <unfinished ...>
write(2, "wc: qwerty: No such file or dire"..., 38) = 38
* syscall.c (trace_syscall_exiting): Use common TCB_INSYSCALL clearing
via "goto ret". This fixes totally broken display of -z, but even now
it is not working as intended. Add a comment about that.
(trace_syscall_entering): Use common TCB_INSYSCALL setting
via "goto ret".
Denys Vlasenko [Sun, 21 Aug 2011 16:03:23 +0000 (18:03 +0200)]
Straighten up confused comments/messages about post-execve SIGTRAP handling
* defs.h: Explain TCB_INSYSCALL and TCB_WAITEXECVE bits in detail.
* strace.c (choose_pfd): Use entering/exiting macros instead of direct check
for TCB_INSYSCALL.
* syscall.c (get_scno): Use entering/exiting macros instead of direct check
for TCB_INSYSCALL. Fix comments about post-execve SIGTRAP.
(syscall_fixup): Use entering/exiting instead of direct check
for TCB_INSYSCALL. Add a comment what "not a syscall entry" message
usually means. Change wrong "stray syscall exit" messages into
"not a syscall entry" ones.
Denys Vlasenko [Sun, 21 Aug 2011 15:47:40 +0000 (17:47 +0200)]
count_syscall() always returns 0, optimize it
* defs.h (count_syscall): Change return type from int to void.
* count.c (count_syscall): Change return type from int to void.
* syscall.c (trace_syscall_exiting): Change code around call
to count_syscall accordingly.
Denys Vlasenko [Sun, 21 Aug 2011 15:26:55 +0000 (17:26 +0200)]
Conditionally optimize out unused code
* syscall.c (internal_syscall): Call internal_exec only if
SUNOS4 || (LINUX && TCB_WAITEXECVE).
* process.c (internal_exec): Define this function only if
SUNOS4 || (LINUX && TCB_WAITEXECVE).
(printwaitn): Don't check wordsize if SUPPORTED_PERSONALITIES == 1.
* signal.c (sys_kill): Likewise.
* syscall.c (is_negated_errno): Likewise.
(trace_syscall_exiting): Fold a tprintf into tprintfs which follow it.
Denys Vlasenko [Sat, 20 Aug 2011 00:12:33 +0000 (02:12 +0200)]
Small optimization in signal and ioctl tables
Trivial shuffling of data tables puts them all in one file,
allowing gcc to see their sizes and eliminate variables
which store these sizes.
Surprisingly, in C mode gcc does not optimize out static const int
variables. Help it by using enums instead.
* defs.h: Stop exporting ioctlent{0,1,2}, nioctlents{0,1,2},
signalent{0,1,2}, nsignals{0,1,2}.
* ioctl.c: Remove definitions of ioctlent{,0,1,2} and nioctlents{,0,1,2}.
* signal.c: Remove definitions of signalent{,0,1,2} and nsignals{,0,1,2}.
* syscall.c: Move above definitions to this file. Make them static const
or enums if suitable.
Denys Vlasenko [Fri, 19 Aug 2011 22:03:10 +0000 (00:03 +0200)]
Make needlessly static data local
* syscall.c (get_scno): For POWERPC64 and X86-64, variable currpers
is declared static. But its old data is never used. Convert it
to ordinary local variable.