Sebastien GODARD [Tue, 27 Aug 2013 19:28:33 +0000 (21:28 +0200)]
Fix wrong permissions for data file created by sa1 script
If HISTORY is greater than 28, the sa1 script does not execute the
command "umask 0022" until after the new output file is created,
allowing it to have the wrong permissions if the root umask is not set
to 0022. So move the umask command above that "if" statement.
Reported-by: Peter Schiffer <pschiffe@redhat.com.fake>
Signed-off-by: Sebastien GODARD <sysstat@orange.fr.fake>
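The ordering issue can be illustrated with a minimal shell sketch. This is not the sa1 script itself: the function name create_data_file is an assumption made for the example. The umask is set before the file is created, so the file ends up with mode 644 whatever umask the caller had.

```shell
# Hypothetical helper, not taken from sa1: create an empty data file
# with the umask already set to 0022. The body runs in a subshell so
# the caller's own umask is left untouched.
create_data_file() (
    umask 0022          # must come before the file is created
    : > "$1"            # a new file is created here with mode 644
)
```

Calling create_data_file /tmp/sa01 from a shell whose umask is 0077 would still give a world-readable data file.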
Sebastien GODARD [Thu, 15 Aug 2013 14:20:12 +0000 (16:20 +0200)]
Fix sar log file corruption in odd Feb 28th edge-case
The sar scripts /usr/lib/sa/sa1 and sa2 both normally use logs in
/var/log/sa itself, but if HISTORY is > 28, the scripts use a tree of
log directories under /var/log/sa.
When HISTORY is < 28, then /usr/lib/sa/sa2's action to expunge old logs
which are more than HISTORY days old will mean that "tomorrow's" saXX
file never exists prior to /usr/lib/sa/sa1 creating it on the first pass
of the day.
However, when HISTORY == 28, then on March 1st in a non-leap year, the
log file sa01 will already exist from Feb 1st having not been
pre-expunged by the sa2 script. Similarly for March 2nd through 28th.
So now make sure that the sa1 script deletes the log file
if it is from a previous month. This prevents a log file
from a month ago being re-used "today".
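A rough sketch of such a check is given here. It is not the code used by sa1: the function name is hypothetical, and the test relies on GNU date's -r FILE option, which prints the file's last modification time.

```shell
# Hypothetical helper, not the actual sa1 code: delete FILE if its last
# modification happened in a month other than the current one.
remove_if_from_previous_month() (
    file=$1
    [ -f "$file" ] || exit 0
    # Compare year and month of the file's mtime with today's.
    if [ "$(date -r "$file" +%Y%m)" != "$(date +%Y%m)" ]; then
        rm -f "$file"
    fi
)
```

With HISTORY == 28, running this on /var/log/sa/sa01 on March 1st would remove the file left over from Feb 1st before new data are written.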
Update iostat manual page: Fix a small inaccuracy regarding %util field
iostat manual page used to say that device saturation occurred when the
value of %util was close to 100%. This is true for devices serving
requests serially but not necessarily for devices able to serve multiple
requests simultaneously. So update iostat manual page accordingly.
Use a lightweight static library to compile some sysstat commands
Create two versions of the "librdstats.a" static library: one having all
the functions to read stats and which will be used by sadc, the other
having only a minimal subset of functions used by other commands like
iostat or pidstat.
The result is a much smaller binary file for iostat (size is 29% smaller
than before), pidstat (-27%), mpstat (-40%), nfsiostat (-43%) and
cifsiostat (-43%).
Sebastien GODARD [Sun, 30 Jun 2013 13:07:44 +0000 (15:07 +0200)]
"sar -o" now collects all possible statistics
sar now collects all possible statistics (including partitions ones)
when data are saved into a file with option -o ("sar -o" calls sadc
with option "-S XALL" instead of "-S ALL").
Sebastien GODARD [Wed, 26 Jun 2013 19:53:04 +0000 (21:53 +0200)]
Small fix for "sar -A" in sar manual page
Indicate that filesystems statistics are also included in stats
displayed by sar -A (that is to say: option -F is also set when sar -A
is entered on the command line).
Sebastien GODARD [Sun, 23 Jun 2013 15:07:31 +0000 (17:07 +0200)]
%ifutil: Update sadf output to take into account the new metric
This patch updates the various output formats of sadf (CSV, ppc, JSON,
XML) to take into account the new %ifutil metric added to sar network
devices statistics.
DTD and XSD documents have also been updated accordingly.
Sebastien GODARD [Sun, 23 Jun 2013 14:57:06 +0000 (16:57 +0200)]
%ifutil: Add NIC utilization percentage to sar network stats
This patch adds %ifutil to statistics displayed by sar -n.
%ifutil is a new metric giving the utilization percentage of the
corresponding network interface.
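The commit message does not give the formula. The sketch below assumes the rule described in the sar manual page: for a half-duplex interface the sum of received and transmitted traffic is compared with the interface speed, for a full-duplex one only the busier direction is. The function name, the argument order and the choice of bytes per second and Mb/s as units are assumptions made for the example.

```shell
# Sketch only; function name and argument layout are assumptions.
# Usage: ifutil RX_BYTES_PER_SEC TX_BYTES_PER_SEC SPEED_MBPS full|half
ifutil() {
    awk -v rx="$1" -v tx="$2" -v speed="$3" -v duplex="$4" 'BEGIN {
        if (speed <= 0) { print "0.00"; exit }
        # Full duplex: busier direction only. Half duplex: both
        # directions share the link, so add them up.
        load = (duplex == "full") ? (rx > tx ? rx : tx) : rx + tx
        # bytes/s -> bits/s, then percentage of speed (Mb/s -> bits/s)
        printf "%.2f\n", load * 8 * 100 / (speed * 1000000)
    }'
}
```

For example, 6250000 bytes/s received on a 100 Mb/s full-duplex interface is half of the link capacity.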
Sebastien GODARD [Wed, 12 Jun 2013 19:39:19 +0000 (21:39 +0200)]
Collect filesystems stats only when sadc option "-S XDISK" is used
Make sadc collect filesystems statistics (those displayed by sar option
-F) only when option "-S XDISK" is used.
Filesystems are actually closer to partitions than to disks. So it makes
more sense for sadc to collect them when option "-S XDISK" is used
rather than option "-S DISK".
Update sadc and sar manual pages to reflect the change.
Sebastien GODARD [Tue, 11 Jun 2013 19:46:56 +0000 (21:46 +0200)]
Fix a wrong assertion in sadf manual page
In its EXAMPLE section, sadf manual page says that
sadf -d /var/log/sa/sa21 -- -r -n DEV
extracts memory, swap space and network statistics from the data file.
This is no longer true as swap statistics are now displayed with option
-S and not option -r. So fix the wrong comment.
Tell where interrupts data come from in mpstat manual page
mpstat manual page now tells that interrupts data displayed by
mpstat -I CPU come from /proc/interrupts file, and that interrupts data
displayed by mpstat -I SCPU come from /proc/softirqs file, so that the
meaning of each interrupt can be more easily understood by users.
Type for "intr" attribute was integer in XSD document.
Yet its value can sometimes be "sum" when displaying statistics for the
total number of interrupts received per second (sar -I SUM).
So change its type to string.
Sebastien GODARD [Thu, 30 May 2013 20:26:18 +0000 (22:26 +0200)]
Handle octal codes in filesystems mount point names
sar -F was unable to get statistics about filesystems whose mount points
(as given by /etc/mtab file) had octal codes in their names.
So now sar replaces those octal codes with their corresponding
characters before trying to collect stats about them.
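The replacement can be sketched in shell as follows. sar does this in C, so this is only an illustration and the function name is hypothetical. It covers codes that start with \0, such as \040 (space) or \011 (tab); whether a code such as \134 is expanded depends on the shell's printf.

```shell
# Sketch, not sysstat's C implementation: expand the octal escapes
# found in /etc/mtab mount point names (e.g. \040 for a space).
decode_mount_point() {
    # %b makes printf interpret backslash escapes in its argument;
    # \0NNN is the octal form it understands.
    printf '%b\n' "$1"
}
```

For instance, decode_mount_point '/mnt/my\040disk' prints the mount point with a real space in it, which is the name that has to be used when asking the system for filesystem statistics.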
Sebastien GODARD [Mon, 20 May 2013 15:14:05 +0000 (17:14 +0200)]
Filesystems stats: Display unmounted filesystems in summary list
This patch enables sar -F to display filesystems in its summary list (the
last stats displayed by sar) even if those filesystems have been
unmounted before the end of the report.
Sebastien GODARD [Sun, 19 May 2013 09:18:22 +0000 (11:18 +0200)]
Split rd_stats.c and rd_stats.h files
rd_stats.c file was becoming really big. So remove from it functions
used to count the number of items and put them in a separate file
(count.c).
Functions prototypes go to count.h.
Sebastien GODARD [Sun, 19 May 2013 08:39:16 +0000 (10:39 +0200)]
Fix "memfree" element type in XSD document.
Type of "memfree" element (from memory statistics) was "negativeInteger"
in XSD document. Use "nonNegativeInteger" instead since it can never be
negative.
Also update CHANGES file.
Sebastien GODARD [Sat, 18 May 2013 20:04:27 +0000 (22:04 +0200)]
Filesystems statistics (part 6): XML output format
This patch adds XML output format for filesystems statistics. This
format can be displayed with sadf option -x.
DTD and XML Schema (xsd) documents have also been updated.
seb [Mon, 6 May 2013 19:55:04 +0000 (21:55 +0200)]
Filesystems statistics for sar (part 3): Display statistics
This patch makes sar display filesystems statistics collected by sadc.
No average statistics are calculated here (filesystems can be unmounted,
then mounted again, making average values meaningless). Instead, sar displays again
the list of filesystems.
seb [Sun, 5 May 2013 15:42:03 +0000 (17:42 +0200)]
Filesystems statistics for sar (part 2): Read statistics
This patch reads statistics for mounted filesystems except for
pseudo-filesystems which are ignored.
It also determines the number of filesystems for which stats will be
read.
Oh, and it adds another field to the stats_filesystem structure so that
filesystem name can be saved ;-)
seb [Wed, 1 May 2013 12:22:08 +0000 (14:22 +0200)]
Remove unused constants from header files
Several constants defined in header files were no longer used.
So remove them.
Note that S_F_PER_PROC constant (used in sar code to indicate that
option -P has been entered on the command line) has also been deleted.
We can know that option -P has been used if the CPU bitmap has at
least one bit set.
Added filesystems statistics to sar (part 1): Basic definitions and structures
A new option (-F) has been added to sar. This option tells sar to display
filesystems statistics.
This first patch adds the corresponding structures, constants and the new
functions prototypes.
sar's help and usage messages have also been updated.
Sysstat command options can now be 'collapsed' (grouped) when
not followed by an argument. So it's now possible for example
to enter 'iostat -px 2 5' since no device name is given to
option -p.
This also concerns pidstat option -U: You can now enter for example
'pidstat -wU' to display switching activity for tasks together with
their user name.
seb [Sat, 16 Mar 2013 20:00:37 +0000 (21:00 +0100)]
mpstat now takes into account every interrupt per processor
so that their number adds up to the number displayed for CPU "all".
mpstat used to sum only numerical interrupts (those with names like
"0", "1", etc. and not "LOC", ...). But the number of interrupts
per processor (displayed by mpstat -I SUM -P ALL) doesn't add up
to what is displayed for "all". To fix this, take into account all
interrupts per processor in /proc/interrupts file.
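The idea can be checked by hand with a small awk-based sketch. It is not mpstat's code, and the function name is hypothetical. Every line of /proc/interrupts that has one numeric count per CPU is added up, whether its label is a number or a name such as "LOC".

```shell
# Sketch: sum all interrupt counts per CPU from a /proc/interrupts
# style file (path given as optional argument, for testing).
sum_interrupts_per_cpu() {
    awk '
        NR == 1 { ncpu = NF; next }      # header line: CPU0 CPU1 ...
        NF > ncpu {                      # label + one count per CPU
            for (i = 1; i <= ncpu; i++)
                if ($(i + 1) !~ /^[0-9]+$/) next
            for (i = 1; i <= ncpu; i++)
                total[i] += $(i + 1)
        }
        END {
            for (i = 1; i <= ncpu; i++)
                printf "CPU%d %d\n", i - 1, total[i]
        }
    ' "${1:-/proc/interrupts}"
}
```

Lines with fewer counts than CPUs (such as "ERR:" on a multiprocessor machine) are skipped, since they cannot be attributed to a given processor.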
Message from Shergill, Gurinder <gurinder.shergill@hp.com> 13/03/2013:
I am seeing something odd with mpstat. It shows 0 for all the CPUs except 0 even though there are interrupts going to other CPUs and the number doesn't add up to what is displayed for "all". I can confirm that there is very high level of I/O activity on the system (resulting in about 200k intr/s). I have tried multiple system and seen the same behavior. I have also tried multiple different kernels (3.7.10, 3.8.2, also the distro kernel with RHEL 6.3 & 6.4).
Finally, I downloaded the latest sources for sysstat and built them on one of my systems, but even with that I get the same behavior.
seb [Wed, 13 Mar 2013 14:36:56 +0000 (15:36 +0100)]
Fixed a bug where systemd unit file couldn't be installed
because PKG_PROG_PKG_CONFIG macro wasn't expanded in configure
script.
Mail from Peter Schiffer <pschiffe@redhat.com> 08/03/2013:
I'm writing you regarding a little problem I've noticed when running
./configure script:
./configure: line 3923: PKG_PROG_PKG_CONFIG: command not found
checking for systemctl... /bin/systemctl
./configure: line 3975: --variable=systemdsystemunitdir: command not found
According to Google, you might be missing pkg-config program in your
path while generating configure script from configure.in file.
Because of this, systemd unit file won't be installed while doing make
install.
seb [Sun, 3 Mar 2013 14:46:28 +0000 (15:46 +0100)]
pidstat's option -U updated.
pidstat's option -U can now be followed by a user name.
In this case, only tasks belonging to the specified user are
displayed by pidstat.
pidstat manual page updated.
seb [Sat, 2 Mar 2013 14:30:01 +0000 (15:30 +0100)]
Now use sigaction() instead of signal() for signals handling.
signal() manual page explicitly says to avoid using it for
signal handling, because of portability problems among others.
So now use sigaction() for that.
seb [Sat, 2 Mar 2013 13:34:54 +0000 (14:34 +0100)]
pidstat can now display the username of the tasks being monitored.
A new option (-U) has been added to pidstat: This option is used
to display the real user name of the tasks being monitored instead
of the UID. pidstat manual page has been updated.
seb [Mon, 10 Dec 2012 21:02:12 +0000 (22:02 +0100)]
Changed IPv6 counters (used by sar -n {IP6 | EIP6 }) to
unsigned long long to keep in sync with current kernels.
Keep in sync with recent kernels (3.7rc8 used here): Now use
unsigned long long for SNMP IPv6 statistics.
WARNING: This breaks compatibility with older sar data
files format for IPv6 statistics.
seb [Mon, 10 Dec 2012 20:49:12 +0000 (21:49 +0100)]
Changed IPv4 counters (used by sar -n {IP | EIP }) to
unsigned long long to keep in sync with current kernels.
Keep in sync with recent kernels (3.7rc8 used here): Now use
unsigned long long for SNMP IPv4 statistics.
WARNING: This breaks compatibility with older sar data
files format for IPv4 statistics.
seb [Mon, 10 Dec 2012 20:27:46 +0000 (21:27 +0100)]
Changed network counters (used by sar -n {DEV | EDEV }) to
unsigned long long to keep in sync with current kernels.
Keep in sync with recent kernels (3.7rc8 used here): Now use
unsigned long long for network statistics.
WARNING: This breaks compatibility with older sar data
files format for network statistics.
Mail from Matthew Hall (matthew.hall@ecsc.co.uk) 15/12/2011:
I've spotted an issue with sadc when values in /proc/net/dev are greater
than 4294967295, in that in rd_stats.h all values in
stats_net_dev/stats_net_edev are unsigned long, but it *seems* that for
a while at least (earliest reference I can see is to 2002 [1]), that
values for (rx|tx)_(bytes|packets) are unsigned long long (this is the
format ifconfig from net-tools uses).
I've attached a sample of my /proc/net/dev (for reference) and a patch
for the 9.0.6.1 version of sysstat (also applies against 10.0.2 with
minor fuzz from patch) which converts lu to llu for these counters.
I've not looked much further into it, since this solves my particular
problem, but I expect it's not the 'correct' solution as I can see in
/usr/include/linux/if_link.h that rx_bytes is a '__u32' for x86, and
'__u64' for x86_64 - so it's probably not portable across different
architectures. There's likely some more work to be done to have
different format structs of stats_net_dev depending on arch.
seb [Fri, 30 Nov 2012 21:32:30 +0000 (22:32 +0100)]
Cosmetic fixes in configure script.
Trying to grep for some expressions in configure.in script was
not done properly, resulting in "...: Command not found" message
(hidden since stdout and stderr were redirected to /dev/null).
So change:
if (`grep some_expr some_file >/dev/null 2>&1`); then...
with:
grep some_expr some_file >/dev/null 2>&1
if test $? = 0; then...
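An equivalent way to write such a test is sketched below. Wrapping grep in backticks makes the shell treat whatever the substitution produces as a command, which is fragile. Here grep's exit status is tested directly instead. The function name check_expr is hypothetical and is not part of the configure script.

```shell
# Hypothetical helper: print "yes" if FILE has a line matching EXPR,
# "no" otherwise (including when FILE cannot be read).
check_expr() {
    # grep's exit status drives the "if"; its output is discarded
    # and never executed.
    if grep "$1" "$2" > /dev/null 2>&1; then
        echo yes
    else
        echo no
    fi
}
```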
seb [Fri, 30 Nov 2012 21:09:48 +0000 (22:09 +0100)]
Now install sadc in $prefix/lib64 directory on 64 bit machines
even if $prefix/lib also exists.
$prefix/lib no longer takes precedence over the $prefix/lib64 directory
if the latter exists on 64 bit machines.
CPU is 64 bit if it has the lm (long mode) flag in /proc/cpuinfo.
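A possible way to test for the lm flag is sketched here. This is not necessarily how the configure script does it. The function name is hypothetical, and the optional file argument only exists so the check can be tried on a sample file.

```shell
# Sketch: succeed if the "flags" line of a cpuinfo-style file lists
# the "lm" (long mode) flag as a separate word.
cpu_is_64bit() {
    awk '/^flags/ { for (i = 3; i <= NF; i++) if ($i == "lm") found = 1 }
         END { exit !found }' "${1:-/proc/cpuinfo}"
}
```

The result could then be used to choose between $prefix/lib64 and $prefix/lib, for example with "if cpu_is_64bit; then ...".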
Mail from Wayne Lin <wlin@mvista.com> 14/11/2012:
Subject: why 64 bit sa1 sa2 sadc are in /usr/lib and not in /usr/lib64?
We like the sysstat package, we are just curious why by default when built
on 64 bit system and targeting 64 bit system, the /sa and its content
sa1, sa2, sadc are being put in /usr/lib and not /usr/lib64?
Should we just use sa_lib_dir to configure the redirect?
seb [Fri, 30 Nov 2012 20:06:57 +0000 (21:06 +0100)]
Fixed a bug where sadc didn't collect all its activities when it
had to overwrite an old sysstat data file with some
unknown activity formats.
How to reproduce: Create a data file with sadc from version 9.1.6:
sadc data 1 2
Check that you have all activities (and network ones in particular),
for example using sar from 10.1.2 version:
sar -n DEV -f data
Try to append data with sadc from 10.1.2:
sadc -F data 1 3
Only a few (or no) activities have been collected and saved in data file.
This is because one activity is unknown so sadc overwrites the file. But in
the meantime, activities have been reset (open_ofile() function).
Mail from John Lau <johnlcf@gmail.com> 16/11/2012:
Subject: Sadc create corrupt sa file even with -F option
(See corresponding mail)
seb [Tue, 27 Nov 2012 20:23:03 +0000 (21:23 +0100)]
Option -y added to iostat.
This option tells iostat to not display its first report with statistics
since system boot.
Courtesy Peter Schiffer from RedHat.
Mail from Peter Schiffer 20/11/2012 (pschiffe@redhat.com):
I want to talk to you about one feature of iostat command. When you ran
iostat without any arguments, it prints statistics since boot. This is
OK, and, for example, mpstat is doing the same. However, when you run
let's say iostat 1 5, the first report displays statistics since boot.
This can be a problem in numerous situations. Well, I am not sure when
this behavior is beneficial and usually this first report is removed by
some kind of post-processing. When compared to mpstat, the first report
is skipped and it waits to the next report. And I think this is the
default behavior for all other sysstat tools.
Now, I was wondering about how to make it better. I don't think that it
would be wise to change default behavior - that could break things.
But I was thinking more about some new option which would make iostat
skip the first since boot report.
seb [Mon, 19 Nov 2012 20:48:40 +0000 (21:48 +0100)]
Fixed DTD document.
When computer has run all day without restart, XML output file
from sadf -x contains no boot elements.
So change DTD document accordingly (from <!ELEMENT restarts (boot+)>
to <!ELEMENT restarts (boot*)>).
XSD document is not updated, but its version number changes to remain
consistent with that of DTD document.
Bug reported by Peter Schiffer <pschiffe@redhat.com> 13/11/2012.
Fixed a fatal error when compiled with -Werror=format-security.
This change is a workaround for a fatal error that we get when compiling
sysstat with -Werror=format-security.
Mail from Guillaume Rousse (guillomovitch@gmail.com) 30/07/2012:
Here is a patch that I found in the Mageia package of sysstat, which fixes a printf usage error (fatal with the options -Wformat -Werror=format-security).
[...]
XML DTD document name is now tagged with a version number.
DTD document name now includes a version number.
XML output displayed by sadf -x points at the DTD document
which applies to this specific version.
Mail from Frank Ch Eigler (fche@redhat.com) 26/09/2012:
From some brief testing, it appears as though sadf's xml output format
has changed a few times over time, but the same xml dtd URL is being
emitted: http://pagesperso-orange.fr/sebastien.godard/sysstat.dtd If
indeed the dtd has changed over time, wouldn't it be wise to keep
newer versions tagged with a version number in the URLs, so that
URL-based xml validation would succeed well into the future?
sar -r now tracks the amount of dirty memory (memory waiting to
get written back to disk).
DTD and XSD documents updated.
Sar manual page updated.
Mail from Michael Blakeley (mike@blakeley.com) 28/09/2012:
I've been thinking about patching the sa collectors to track the "Dirty" metric from /proc/meminfo, and sar to report on it. This would be useful for applications where latency is important: having historical data on dirty writeback pages can help trace the kind of problems that can be addressed by tuning vm.dirty_bytes and friends.
In the 10.1 code I see that a few functions and structs already make use of meminfo. What's your philosophy on this? Should dirty-kB be a new struct, or perhaps merge with the existing meminfo_huge struct?
Similarly for reporting, should I focus on a new option ("-D" perhaps?) or try to piggyback on an existing one?
New field added to sar -u: %gnice (time spent
running a niced guest).
sar manual page updated.
DTD and XSD documents updated.
sadf various output (XML, CSV, etc.) updated.
Sysstat service unit file has been added to replace init script
for systems with systemd support.
Mail from Peter Schiffer <pschiffe@redhat.com> 24/08/2012:
I am sending you a patch which adds support for systemd to sysstat. With this patch, configure script detects whether the system uses systemd, and if yes, it installs the unit file and enables the sysstat service. Then, init script is not used.
Systemd has been the default since Fedora 15, and many more distributions are slowly converting to it. For more information about it, see its home page [1]; man pages about unit files and integration into autotools can be found at [2 - 6]. General information about systemd can be found at [7].
Sysstat init script updated to make it more conforming to LSB.
Sysstat init script updated:
* instead of temp file, use /var/run/sysstat.pid file
* check user privileges on start and stop (even though stop does nothing)
* status command works "correctly" (of course it will always return 3 - service is not running, but for the love of the standards..)
Mail from Peter Schiffer (pschiffe@redhat.com) 08/08/2012:
I was looking into sysstats init script, and here are few things I found out:
* init scripts are usually run with root privileges (at least on system start). But then: mktemp command is run under root (line 24 in init script), and if @SU_C_OWNER@ is set and sa1 script fails, then @SU_C_OWNER@ won't have privileges to remove the temp file. example:
Also, in this case, condition on line 31 is useless.
* it looks like the comment starting on line 28 is not true (or, at least, not anymore). I did a little test (which is attached). In one script, there is the command "exit 17", and another script calls the first via exec. Now, just run:
So, you can see two things: the exit code is not lost, and even I call "su user -c" on myself, I need to enter my password. The latter can be confusing if @SU_C_OWNER@ is not me, etc..
So.. what I am trying to show:
* command "service sysstat start" should only be called with root privileges
* we can use exit code of command run under "su foo -c ..."
(...)
Now, the init script should be more conforming to LSB (according to the [1]). I'm also attaching output of our internal LSB init script test.
First, this patch renames sadf option '-T' into '-U', and
sadf option '-t' into '-T'.
It then adds a new option: -t. This option tells sadf to display
the timestamps in the local time of the data file creator
instead of UTC. The same option already exists for sar.
The FAQ is also updated: Tell that options -s and -e are always
expressed in local time.
sadf option -T has been renamed into -U, and option -t has been
renamed into -T. This was made compulsory to add a new option -t
consistent with that of sar.
Make sysstat disk counters consistent with those from latest kernel (3.5).
Changed the type of some disk counters to keep in sync with latest 3.5
kernel.
This breaks the compatibility with older sar data files format for disk
activity.
Mail from Peter Schiffer (pschiffe@redhat.com) 15/02/2012:
I am sending you next patch. I've updated reading /proc/diskstats and
/sys/block/<disk>/stat files in iostat.c and rd_stats.c source files
according to latest kernel (3.2.6).
Problem was, that in case of very high I/O operations, sar -d and iostat
outputted overflowed values:
Various cosmetic changes in manual pages and usage messages displayed by sysstat commands.
Mail from Peter Schiffer (pschiffe@redhat.com) 04/07/2012:
I am sending you 2 patches. They are minor modifications to the
documentation:
In man-usage.patch I tried to unify usage output of programs and man
pages synopsis, so all the usages of utilities are similar. Also, it
fixes one bug in iostat man page where Network Filesystem report was
mentioned and problem in sar man page where some options in synopsis
weren't bold.
In man-asciibetical-order.patch I tried to unify order of option
descriptions in man pages, I used asciibetical order (uppercase before
lowercase) - again, so all man pages can be similar.
These are rather cosmetic changes than fixes, however I think unity is
important for better user experience.
Persistent device names support added to sar and iostat (option -j)
Option -j added to sar and iostat to add support for persistent device
names.
Mail from Peter Schiffer (pschiffe@redhat.com) 22/06/2012:
I need to implement another feature for sysstat and I would like to hear
your opinion.
Pretty device names, such as sda, vda, ... are not persistent and in
some specific situations kernel can assign different names for the same
device between reboots. To prevent this confusion, persistent device
names exists. Those names are in /dev/disk/by-xxx folders... You are
probably aware of this..
So, currently, sar -d -p and iostat are displaying the pretty device
names. I would like to add new option, which would take one argument
specifying the type of persistent name, and then, sar -d -p and iostat
would display the device names in that persistent name, e.g.:
It would work like this: I wouldn't bound the type of name, rather I
would check whether the specified type inserted by user exists like
folder: /dev/disk/by-label (in that particular example). If yes, I would
resolve links in that folder until I found the device I was looking for
and display that name.. It should be straightforward.
My questions are: how would you name the option? And where would you put
the common code for sar and iostat? Also, do you have any comments or
ideas?
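The lookup described in this mail can be sketched as follows. It is only an illustration of the idea, not the code that went into sar and iostat. The function name and its arguments are assumptions, and readlink -f is the GNU coreutils option that resolves a path completely.

```shell
# Sketch of the lookup described in the mail; names are assumptions.
# Usage: persistent_name DIR DEVICE
# Prints the name of the first link in DIR that resolves to DEVICE.
persistent_name() (
    dir=$1
    target=$(readlink -f "$2") || exit 1
    for link in "$dir"/*; do
        [ -L "$link" ] || continue
        # Resolve the link and compare it with the wanted device.
        if [ "$(readlink -f "$link")" = "$target" ]; then
            basename "$link"
            exit 0
        fi
    done
    exit 1      # no link in DIR points at DEVICE
)
```

Something like persistent_name /dev/disk/by-label /dev/sda1 would then print the label under which the partition is known, if there is one.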
sar: Use /sys/dev/block/major:minor links to determine devices real name.
Now use /sys/dev/block/major:minor links to determine devices real name.
This is used as the first option now, before using sysstat.ioconf
configuration file.
Mail from Peter Schiffer (pschiffe@redhat.com) 20/06/2012:
I am sending you a patch which is looking into
/sys/dev/block/major:minor link to determine the device name. This
should work for any device, but I let it as the last option when
determining devname. What do you think?
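For illustration, the same lookup can be done from the shell. This is a sketch and not the patch itself: the function name is hypothetical, and the third argument (sysfs root, default /sys) is there only so that the function can be tried on a fake tree.

```shell
# Sketch: find a device name from its major and minor numbers.
# Usage: devname_from_numbers MAJOR MINOR [SYSFS_ROOT]
devname_from_numbers() (
    link="${3:-/sys}/dev/block/$1:$2"
    [ -L "$link" ] || exit 1
    # The link target ends with the kernel device name,
    # e.g. .../block/sda/sda1
    basename "$(readlink "$link")"
)
```

On a typical system, devname_from_numbers 8 1 would print the name of the block device registered with major 8 and minor 1.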
Sebastien [Sat, 30 Jun 2012 20:32:38 +0000 (22:32 +0200)]
Added option -[0-9]* to sar to show data from that many days ago.
Mail from Don <do1@yandex.ru> 22/06/2012:
Hello Sysstat author(s),
Please add option `-[0-9]+' like `-1', `-2', etc. to sar tool, to show
data of that days ago. That would be handy and useful for everybody.
Example implementation in bash:
function sar() {
    case "$1" in
        -[0-9]*) local OPT="-f /var/log/sa/sa`date +%d --date=${1#-}' day ago'`"; shift;;
    esac
    command sar $OPT "$@"
}
Limitation of this implementation is that -1 option should be first, but
if you implement this type of options in main code you can avoid it and
also many people will benefit. It will be easier than having to remember
the proper date and type `-f /var/log/sa/saXX'.