Petr Pavlu [Wed, 2 Sep 2020 08:24:43 +0000 (10:24 +0200)]
Workaround for iowait being decremented
The iowait value reported by the kernel on NO_HZ systems can decrease
as a result of inaccurate iowait tracking. Time spent waiting on I/O can
first be accounted as iowait and later as idle instead.
Function get_per_cpu_interval() treats iowait going backwards between
two readings as a CPU coming back online and resets the iowait value of
the first reading to 0. If iowait decreased only because of
inaccurate tracking, almost all of the time between the two
readings is then incorrectly reported by sar as being spent in iowait.
The patch updates the code in get_per_cpu_interval() to recognize this
situation. If the iowait value decreased between two readings but the
idle value did not, the code now treats it as a problem with the
iowait reporting and corrects the first value according to the second
reading. Otherwise, the code still treats a decreased iowait as a
CPU coming back online.
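The decision described above can be sketched as follows. This is only an
illustration: struct cpu_sample and fix_iowait_decrease() are hypothetical
names, not the ones used in sysstat, and the real code may handle more
cases (counter overflow, for instance).

```c
#include <stdint.h>

/* Hypothetical, reduced view of one per-CPU reading (not sysstat's struct). */
struct cpu_sample {
	uint64_t idle;    /* cumulative idle time */
	uint64_t iowait;  /* cumulative iowait time */
};

/*
 * Adjust the first (previous) reading when iowait went backwards.
 * Returns 0 if iowait did not decrease, 1 if the decrease is attributed
 * to inaccurate iowait tracking, 2 if it is attributed to the CPU coming
 * back online.
 */
int fix_iowait_decrease(struct cpu_sample *prev, const struct cpu_sample *cur)
{
	if (cur->iowait >= prev->iowait)
		return 0;

	if (cur->idle >= prev->idle) {
		/* idle did not decrease: correct the first value from the second */
		prev->iowait = cur->iowait;
		return 1;
	}

	/* idle decreased as well: treat it as a CPU coming back online */
	prev->iowait = 0;
	return 2;
}
```

With the first value corrected this way, the iowait delta for the interval
becomes 0 instead of nearly the whole interval.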
Sebastien GODARD [Fri, 21 Aug 2020 08:36:47 +0000 (10:36 +0200)]
sar manual page: Update definition for runq-sz metric
The runq-sz metric was defined as the number of tasks waiting for run
time, and calculated as (number of tasks running + waiting for run time)
minus 1 (so as not to count the currently running process).
This was OK on UP machines, but is no longer true on SMP/multi-core
machines. So update the metric's definition: runq-sz is the number of
tasks running or waiting for run time (we still don't count the currently
running process).
sysstat version 12.4.0 final packaging.
Changelog added.
No groundbreaking new features here, but numerous small improvements over
the previous version. Among them:
* All the sysstat commands now display their statistics in color
by default when the output is connected to a terminal,
* sar "pretty-prints" the device names by default, which means
you won't need to use option -p with option -d to display the
device names as they appear in /dev,
* You can tell the sa2 script to wait for a random delay before executing
in order to prevent a massive I/O burst on some systems,
* You can also tell the sa1 script to insert a comment in current daily
datafile saDD on system suspend and resume,
... and more! See the CHANGES file for more details.
This option makes the report easier to read by a human.
Please note that for sar, option -p is now equivalent to this one.
Use this option to display device names or network interface names
on the right of the report instead of the left (which can be
particularly useful for long names).
Also option -h is now equivalent to "--pretty --human".
sar: Device names are now pretty-printed by default
sar now pretty-prints device names by default. It means that they will
be displayed as they appear in /dev (e.g. sda, sdb...) instead of
"devM-m", and you don't need to use option -p for that.
Using the sysstat.ioconf configuration file to determine the device name
based on its major and minor numbers gives a wrong name for xvd* devices
with big minor numbers.
Don't use it any more as there are other ways to find the name (e.g. we
can read the symlink in /sys/dev/block/).
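A minimal sketch of the symlink approach, assuming the link
/sys/dev/block/<major>:<minor> points to a path whose last component is
the device name. The helper names last_component() and
devname_from_sysfs() are made up for this example and are not sysstat
functions.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Return the last component of a link target, e.g. ".../xvda/xvda1" -> "xvda1". */
const char *last_component(const char *target)
{
	const char *slash = strrchr(target, '/');

	return slash ? slash + 1 : target;
}

/*
 * Hypothetical helper: find a device name from its major and minor numbers
 * by reading the symlink in /sys/dev/block/.
 * Returns 0 on success, -1 if the link cannot be read.
 */
int devname_from_sysfs(unsigned int major, unsigned int minor,
		       char *name, size_t len)
{
	char link[64], target[4096];
	ssize_t n;

	snprintf(link, sizeof(link), "/sys/dev/block/%u:%u", major, minor);

	n = readlink(link, target, sizeof(target) - 1);
	if (n < 0)
		return -1;
	target[n] = '\0';	/* readlink() does not terminate the buffer */

	snprintf(name, len, "%s", last_component(target));
	return 0;
}
```

No table of known major numbers is needed here, which is why big minor
numbers are not a problem for this lookup.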
Fix gcc V10 warnings in sysstat 12.0.1 code used for test
The tests/ directory contains an old version of sysstat code (v 12.0.1)
used for non-regression tests.
Compiling this version produces warnings with recent gcc v9/v10. Fix
them, even if this slightly alters the original 12.0.1 code.
Sebastien GODARD [Sat, 20 Jun 2020 08:36:26 +0000 (10:36 +0200)]
Makefile: Remove all reports and data files if requested
When option --enable-clean-sa-dir was used, sysstat removed compressed
reports and data files in /var/log/sa directory only when they had
been compressed using gzip.
Update Makefile to look for other compression programs. Also remove
files using the format saYYYYMMDD or sarYYYYMMDD.
Sebastien GODARD [Sat, 20 Jun 2020 07:40:32 +0000 (09:40 +0200)]
Compress manual pages by default when installed
Compress manual pages (using xz, bzip2 or gzip) by default when they are
installed.
Replace option --enable-compress-manpg with --disable-compress-manpg in
configure script.
Sebastien GODARD [Sat, 20 Jun 2020 07:22:59 +0000 (09:22 +0200)]
configure: Remove obsolete autoconf macros
Don't use AC_HEADER_STDC and AC_HEADER_DIRENT macros in configure.in
script.
GNU autoconf manual says that these macros are obsolete and that new
code should no longer use them.
Sebastien GODARD [Sat, 13 Jun 2020 09:55:28 +0000 (11:55 +0200)]
Display statistics in color by default
Display statistics in color by default when the output is connected to
a terminal. It is no longer necessary to set the S_COLORS environment
variable.
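A common way to test whether output is connected to a terminal is
isatty(). The sketch below only illustrates that test; want_color() is a
hypothetical name and the handling of the S_COLORS variable is left out.

```c
#include <unistd.h>

/* Hypothetical helper: return 1 if fd refers to a terminal, 0 otherwise. */
int want_color(int fd)
{
	return isatty(fd) == 1;
}
```

A caller would typically pass STDOUT_FILENO, so that output redirected to
a file or a pipe stays free of color escape sequences.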
According to the time(2) manual page, the argument passed to time()
system call is obsolescent and should always be NULL in new code.
When this argument is NULL, the call cannot fail.
Tom Hebb [Wed, 3 Jun 2020 18:57:21 +0000 (11:57 -0700)]
Replace index() call with strchr() call
According to glibc documentation[1], "index is another name for strchr;
they are exactly the same. New code should always use strchr." The use
of index() breaks compilation for Android targets, which use Bionic
instead of glibc and don't have index().
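The replacement is mechanical, since both functions take the same
arguments and return the same pointer. A small sketch, using a made-up
helper split_key_value() rather than actual sysstat code:

```c
#include <string.h>

/*
 * Split a "key=value" string in place.
 * Returns a pointer to the value, or NULL if there is no '=' in s.
 */
char *split_key_value(char *s)
{
	char *eq = strchr(s, '=');	/* previously: index(s, '=') */

	if (!eq)
		return NULL;

	*eq = '\0';
	return eq + 1;
}
```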
Sebastien GODARD [Mon, 11 May 2020 13:43:15 +0000 (15:43 +0200)]
sa1: Insert a comment in daily datafile on system suspend/resume
Add a new option ("--sleep") to sa1 so that a comment can be inserted in
current daily datafile on system suspend/resume.
This comment can then be displayed using sar's option -C.
E.g.:
$ sar -C
Linux 5.6.10-300.fc32.x86_64 (localhost.localdomain) 05/11/2020 _x86_64_ (8 CPU)
02:53:05 PM LINUX RESTART (8 CPU)
02:55:01 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
03:00:01 PM     all      1.66      0.00      1.27      0.17      0.00     96.89
03:10:01 PM     all      1.30      0.00      1.38      0.20      0.00     97.12
03:12:22 PM     COM LINUX SLEEP MODE (pre suspend)
03:14:55 PM     COM LINUX SLEEP MODE (post suspend)
03:15:31 PM     all      2.79      0.00      3.46      4.35      0.00     89.40
03:20:01 PM     all      0.66      0.00      0.82      0.13      0.00     98.39
Average:        all      3.50      0.00      3.53      0.78      0.00     92.19
"sa1 --sleep" will be called by sysstat.sleep script installed in
$systemdsleepdir directory if systemd is available.
Sebastien GODARD [Mon, 11 May 2020 08:04:17 +0000 (10:04 +0200)]
sa2: Wait for a random delay before running
Add a new option ("delay_range=") to configure script to tell sa2 script
to wait for a random delay in the indicated range before running.
This delay (expressed in seconds) is aimed at preventing a massive I/O
burst at the same time on VMs sharing the same storage area.
Sebastien GODARD [Sun, 10 May 2020 08:57:53 +0000 (10:57 +0200)]
configure: Add new option "--enable-use-crond"
Add a new option to configuration script to tell it to use the standard
cron daemon (and the SysV standard files in /etc/rc.d/) even if systemd
is installed.
I have added this option because systemd seems currently broken on my
F32 distro.
sysstat version 12.3.3 final packaging.
lsm and spec files updated.
Changelog added.
Year of (C) message updated.
Exciting new features in this version include:
* sar/sadc collect and display Pressure-Stall Information statistics.
These metrics were added during the 4.20 development cycle of the
Linux kernel. They can be displayed with "sar -q {CPU | LOAD | MEM}".
* iostat has gained support for devices managed by drivers in userspace
like spdk (see #257). New flags (-f / +f) have been added so that the user
can specify an alternate location for statistics files.
This version also includes various bug fixes.
Enjoy!
GCC versions 9 and later complain more aggressively, e.g.:
common.c: In function ‘get_wwnid_from_pretty’:
common.c:396:4: warning: ‘strncpy’ offset [275, 4095] from the object at ‘drd’ is out of the bounds of referenced subobject ‘d_name’ with type ‘char[256]’ at offset 19 [-Warray-bounds]
396 | strncpy(wwn_name, drd->d_name, sizeof(wwn_name));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/dirent.h:61,
from common.c:32:
/usr/include/bits/dirent.h:33:10: note: subobject ‘d_name’ declared here
33 | char d_name[256]; /* We must not include limits.h! */
| ^~~~~~
or:
common.c: In function ‘get_persistent_name_path’:
common.c:876:37: warning: ‘snprintf’ output may be truncated before the last format character [-Wformat-truncation=]
876 | snprintf(path, sizeof(path), "%s/%s",
| ^
common.c:876:2: note: ‘snprintf’ output 2 or more bytes (assuming 4097) into a destination of size 4096
876 | snprintf(path, sizeof(path), "%s/%s",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
877 | get_persistent_type_dir(persistent_name_type), name);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
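One possible way to address both kinds of warnings is sketched below. It
is not necessarily what the actual fix does: copy_bounded() and
join_path() are hypothetical helpers. The first one never reads more than
the size of the source field, the second one checks the value returned by
snprintf() so that truncation is handled explicitly. Whether a given
compiler version stays quiet still depends on the call site.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy at most srcmax bytes of src into dst (of size dstlen) and always
 * terminate dst. srcmax is the size of the source field, e.g.
 * sizeof(drd->d_name).
 */
void copy_bounded(char *dst, size_t dstlen, const char *src, size_t srcmax)
{
	size_t n;

	if (dstlen == 0)
		return;

	n = strnlen(src, srcmax);
	if (n >= dstlen)
		n = dstlen - 1;

	memcpy(dst, src, n);
	dst[n] = '\0';
}

/* Build "dir/name" in dst. Returns 0 on success, -1 if it did not fit. */
int join_path(char *dst, size_t dstlen, const char *dir, const char *name)
{
	int n = snprintf(dst, dstlen, "%s/%s", dir, name);

	return (n < 0 || (size_t) n >= dstlen) ? -1 : 0;
}
```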
Add a new flag to iostat that can be used to specify an alternate
directory for device statistics.
Example:
iostat -f /altdir [...]
-> Use <altdir> directory to read device statistics.
iostat +f /altdir [...]
-> Use standard kernel files *and* <altdir> to read device statistics.
<altdir> is a directory containing files with statistics for devices
managed in userspace.
<altdir> may contain:
- a diskstats file whose format is compliant with that located in /proc.
- statistics for individual devices contained in files whose format is
compliant with that of files located in /sys.
In particular, the following files located in <altdir> may be used by
iostat:
<altdir>/block/<device>/stat
<altdir>/block/<device>/<partition>/stat
<partition> files must have an entry in <altdir>/dev/block/ directory,
e.g.:
<altdir>/dev/block/[major]:[minor] --> ../../block/<device>/<partition>
Notes:
1) iostat uses the /proc/diskstats file to read statistics only when
"-p all" has been entered on the command line (read statistics for all
the devices and their partitions).
2) iostat uses the /sys/block/<device>/stat files to read the statistics
for a device (e.g. "iostat sda") and possibly all its partitions
(e.g. "iostat -p sda").
3) iostat uses the link in /sys/dev/block/[major]:[minor] to know where
the stat file is located for a partition that has been entered on the
command line (e.g. "iostat sda3"). The partition must exist in /dev to
get its major and minor numbers.
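The file layout expected under <altdir> can be illustrated with two small
path builders. These are hypothetical helpers written for this note, not
functions taken from iostat:

```c
#include <stdio.h>

/* Build "<altdir>/block/<device>/stat". Returns 0, or -1 if truncated. */
int alt_device_stat_path(char *dst, size_t len,
			 const char *altdir, const char *device)
{
	int n = snprintf(dst, len, "%s/block/%s/stat", altdir, device);

	return (n < 0 || (size_t) n >= len) ? -1 : 0;
}

/* Build "<altdir>/block/<device>/<partition>/stat". Returns 0, or -1 if truncated. */
int alt_partition_stat_path(char *dst, size_t len, const char *altdir,
			    const char *device, const char *partition)
{
	int n = snprintf(dst, len, "%s/block/%s/%s/stat",
			 altdir, device, partition);

	return (n < 0 || (size_t) n >= len) ? -1 : 0;
}
```

For instance, with altdir "/altdir", device "sda" and partition "sda3",
the second helper yields "/altdir/block/sda/sda3/stat".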
Add a new option to be used with "sadf -c" (datafile conversion).
This option enables the user to specify the number of ticks per second
for the machine where the datafile to be converted was created.
E.g.:
sadf -c old_datafile -O hz=250 > new_datafile
sar: Don't display "Inconsistent input data" when no activities are
collected by sadc
When no activities are collected by sadc, sadc writes an error message
("Requested activities not available").
sar used to display a second error message in addition to the previous
one ("Inconsistent input data"). Remove this one.