Sebastien GODARD [Wed, 21 May 2014 19:17:20 +0000 (21:17 +0200)]
Lower HISTORY limit to 25 for scripts to create multiple directories
This patch lowers the HISTORY value limit from 28 to 25. When the HISTORY
value is greater than 25, the sa1 script creates a month-by-month directory
structure in /var/log/sa to save datafiles.
This value guarantees that we will always have a full history and that
no files will be overwritten unintentionally.
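The directory choice described above can be sketched as follows. This is an illustrative model only, not the sa1 script itself: the function name, the constant name, and the YYYYMM naming of the monthly subdirectory are assumptions made for the example.

```python
from datetime import date

HISTORY_FLAT_LIMIT = 25  # limit stated in this commit; the name is hypothetical


def datafile_path(history: int, today: date, base: str = "/var/log/sa") -> str:
    """Return where a daily data file would be saved for a given HISTORY."""
    name = f"sa{today.day:02d}"
    if history > HISTORY_FLAT_LIMIT:
        # Assumed layout: one subdirectory per month, named YYYYMM.
        return f"{base}/{today.year:04d}{today.month:02d}/{name}"
    # Short history: all files live directly in the base directory.
    return f"{base}/{name}"
```

With a flat layout, a file name such as sa21 is reused every month, which is why the limit has to stay low enough for old files to be removed before they are needed again.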
Sebastien GODARD [Fri, 16 May 2014 20:16:37 +0000 (22:16 +0200)]
Don't install crontabs when using systemd timer units
systemd now has timer units that can replace cron jobs.
Such units are installed by sysstat if requested to do so on machines
supporting systemd. In this case a crontab should no longer be
installed, otherwise a redundant line of statistics appears at every
interval of time.
Sebastien GODARD [Wed, 14 May 2014 19:50:54 +0000 (21:50 +0200)]
Prevent sar from appending data to data from the preceding month
Previously, the sar utility set the maximum number of days to be logged
in one month too high. Consequently, data from a month were appended to
data from the preceding month. With this update, the maximum number of
days has been set to 25, and data from a month now correctly replaces
data from the preceding month.
From Red Hat Bugzilla #578929 (Jeff Bastian):
*** BEGIN ***
Although RHEL 4 was modified to set the max history for sar to 26 days,
this value is still too long.
Given that
1. February only has 28 days, and
2. that 'find ... -mtime +26' finds files that were modified 27
or more days ago, and
3. find ignores fractional days and thus has rounding errors, and
4. the DST changes take place within sar's $HISTORY window
(as of 2007, the second Sunday in March[1], or March 14, 2010
this year)
then /usr/lib/sa/sa2 is only deleting files that are 28 days
or older (not 27).
As a result, sar data from March 15 to March 28 was getting appended
to the data from February instead of replacing it. You can see the
files from March 15 to 28 are 2x the size of normal files.
# ls -al
...
-rw-r--r-- 1 root root 813552 Mar 12 23:50 sa12
-rw-r--r-- 1 root root 813552 Mar 13 23:50 sa13
-rw-r--r-- 1 root root 779664 Mar 14 23:50 sa14
-rw-r--r-- 1 root root 1626864 Mar 15 23:50 sa15
-rw-r--r-- 1 root root 1626864 Mar 16 23:50 sa16
-rw-r--r-- 1 root root 1626864 Mar 17 23:50 sa17
...
-rw-r--r-- 1 root root 1626864 Mar 26 23:50 sa26
-rw-r--r-- 1 root root 1626864 Mar 27 23:50 sa27
-rw-r--r-- 1 root root 1626864 Mar 28 23:50 sa28
-rw-r--r-- 1 root root 813552 Mar 29 23:50 sa29
-rw-r--r-- 1 root root 339120 Mar 30 09:50 sa30
...
The max needs to be 25 days to account for DST now.
Attached is a demo script to show how 'find ... -mtime +26 ...' fails to
find files that are 27 days old when dealing with the DST time change on
March 14, 2010.
It touches a series of files, sa10 to sa20, with timestamps from
February 10 to February 20 at 23:53. It then sets the date to March 15
and finds files that are older than 26 days. (And finally restores the
date.)
Stopping ntpd: [ OK ]
Stopping crond: [ OK ]
Mon Mar 15 23:53:00 EDT 2010
-rw-r--r-- 1 root root 0 Feb 15 23:53 /root/timestamp/sa15
-rw-r--r-- 1 root root 0 Feb 14 23:53 /root/timestamp/sa14
-rw-r--r-- 1 root root 0 Feb 13 23:53 /root/timestamp/sa13
-rw-r--r-- 1 root root 0 Feb 12 23:53 /root/timestamp/sa12
-rw-r--r-- 1 root root 0 Feb 11 23:53 /root/timestamp/sa11
-rw-r--r-- 1 root root 0 Feb 10 23:53 /root/timestamp/sa10
Thu Apr 1 12:35:12 EDT 2010
Starting crond: [ OK ]
Starting ntpd: [ OK ]
You can see that the youngest file it found was from February 15, which is 28
days old on March 15. It does not find sa16 which is 27 days (in our
thinking) even though we specified +26. This is due to the rounding
errors from find ignoring fractional parts of days.
*** END ***
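The rounding effect described in the report can be reproduced with a small model. This is a sketch, assuming that 'find -mtime +N' counts age in whole 24-hour periods with the fraction dropped, as the report states; the helper names are hypothetical and fixed UTC offsets stand in for EST and EDT.

```python
from datetime import datetime, timedelta, timezone

EST = timezone(timedelta(hours=-5))  # before the DST change
EDT = timezone(timedelta(hours=-4))  # after the DST change


def find_age_days(mtime: datetime, now: datetime) -> int:
    """Age in whole 24-hour periods; the fractional part is dropped."""
    return int((now - mtime).total_seconds() // 86400)


def matches_mtime_plus(mtime: datetime, now: datetime, n: int) -> bool:
    """Model of 'find -mtime +n': the truncated age must exceed n."""
    return find_age_days(mtime, now) > n


now = datetime(2010, 3, 15, 23, 53, tzinfo=EDT)
sa15 = datetime(2010, 2, 15, 23, 53, tzinfo=EST)  # 28 calendar days earlier
sa16 = datetime(2010, 2, 16, 23, 53, tzinfo=EST)  # 27 calendar days earlier

for label, mtime in (("sa15", sa15), ("sa16", sa16)):
    print(label, find_age_days(mtime, now), matches_mtime_plus(mtime, now, 26))
```

Because the DST change removes one hour, sa16 is only 26 days and 23 hours old, which truncates to 26 and therefore does not satisfy "+26".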
Sebastien GODARD [Sat, 15 Mar 2014 07:39:54 +0000 (08:39 +0100)]
Display number of activities saved in file's header
Update sadf so that it can display the total number of activities and
the number of volatile activities in file.
These numbers can be displayed with sadf -H.
Sebastien GODARD [Fri, 14 Mar 2014 10:08:00 +0000 (11:08 +0100)]
sar: Take into account a change of CPU number in sar datafile (5)
The goal of the next few patches to come is to make it possible for
other activities than A_CPU (displayed by sar -u) to take into account a
change of CPU count in sar data files. These activities must be directly
related to CPU: At the present time these are A_PWR_CPUFREQ (displayed
by sar -m CPU) and A_PWR_WGHFREQ (displayed by sar -m FREQ).
With previous work done, a change from, say 6 to 8 CPU is taken into
account by sar -u:
$ sar -P ALL -f data
Linux 3.9.10-100 (home) 02/08/2014 _x86_64_ (6 CPU)
1) Changes the format of sar data file (this doesn't hurt since 10.3.1
has already changed it. Done again here just in case someone had cloned
from 10.3.1 though this version has not been officially released).
2) Adds a new field in data file's header (sa_vol_act_nr), giving the
number of activities in file taking into account a change of CPU count.
(Such activities, like A_CPU or A_PWR_CPUFREQ are called "volatile
activities" here).
3) When a restart mark is inserted, sadc also writes a few additional
data giving the new number of CPU for each volatile activity in file.
Of course sar has also been updated to read those data and take into
account a change of CPU count for all volatile activities to display.
Tomasz Torcz [Sat, 1 Mar 2014 12:58:03 +0000 (13:58 +0100)]
add systemd timer units replacing cronjobs
Systemd timer units, while preserving functionality of cronjobs,
provide some additional features:
- dependency on cron can be dropped
- the next expiration of each timer unit is clearly visible in "systemctl list-timers"
- unit customization rules provide a cleaner way for the administrator to modify
default parameters, enable, disable and override individual timers
Mike Kazantsev [Mon, 17 Feb 2014 20:01:26 +0000 (02:01 +0600)]
Fix output of sadf -j with file-utc-time present
Commit aea4561 added "file-utc-time" parameter output (if present in the
datafile), but didn't add comma before it in "sadf -j" output, making it
unparseable. Fix that.
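The effect of the missing comma can be illustrated with a short check. The two strings below are hypothetical fragments, not real sadf output; only the "file-utc-time" key is named in the commit message, and the other key and all values are invented for the example.

```python
import json

# Hypothetical header fragments; only "file-utc-time" comes from the commit.
BROKEN = '{"file-date": "2014-02-17" "file-utc-time": "20:01:26"}'
FIXED = '{"file-date": "2014-02-17", "file-utc-time": "20:01:26"}'


def parses(text: str) -> bool:
    """Return True if text is valid JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False
```

Here parses(BROKEN) is False and parses(FIXED) is True, which is the difference a strict JSON consumer would see.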
Be more user-friendly when trying to access a non existent data file
When sar and sadf try to access a non existent standard daily data file,
tell the user to check if data collecting is enabled.
By doing so, the user goes to the docs looking for how to enable the
data collection, rather than looking for help on why the software seems
to be broken.
sadf -H now displays the last CPU count recorded in the data file. This
is the value of the last_cpu_nr field saved in file's header, giving the
number of processors the machine had when the last sample of statistics
was appended to the file.
Move header_size field to the end of the structure, so that other
programs can still tell it is a sysstat sar data file, even though
the format has changed.
Also, to avoid causing format changes, add padding to the file_magic
structure so that when it grows in size other tools can still
handle the existing data.
sar: Take into account a change of CPU number in sar datafile (3)
Update sar so that it now reads the number of CPU associated with a
RESTART record. Reallocate CPU structures accordingly, and display the
number of CPU with the restart mark.
sar: Take into account a change of CPU number in sar datafile (2)
sadc now accepts a change of CPU count when inserting a restart mark. In
this case, sadc sets the new value in file's header (last_cpu_nr field)
then rewrites it.
The new number of CPU is also written after the restart mark record in
file.
Sebastien GODARD [Sun, 26 Jan 2014 16:28:15 +0000 (17:28 +0100)]
sar: Take into account a change of CPU number in sar datafile (1)
On a virtual machine, after a CPU change (e.g. power off machine ->
add 1 cpu -> power on machine), the sar command doesn't display
previous data (from today) from the current saXX file.
This is the first patch aimed at making sar able to take into account
a change of CPU count in its datafiles.
This patch adds a new field into file's header data (last_cpu_nr)
which gives the number of CPU the machine had when the last sample of
data was appended to the file.
Sar data file format is no longer compatible with that from previous
sysstat versions. To be able to add other fields to file's header
in the future without making it incompatible again, also add a
field (header_size) to file's magic data giving the size of file's
header.
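The idea behind header_size can be sketched as follows: a reader parses the fields it knows about, then uses the recorded size to skip the rest of the header. The byte layout, the magic value and the extra field below are hypothetical and do not reflect the real sysstat structures.

```python
import struct

# Hypothetical layout: 16-bit magic number followed by a 32-bit header_size.
MAGIC_FMT = "<HI"
MAGIC_SIZE = struct.calcsize(MAGIC_FMT)


def read_header(blob: bytes, known_fmt: str = "<I"):
    """Parse the fields this reader knows and return them together with
    the offset of the first byte located after the header."""
    magic, header_size = struct.unpack_from(MAGIC_FMT, blob, 0)
    known = struct.unpack_from(known_fmt, blob, MAGIC_SIZE)
    # Skip the whole header, including fields added by newer writers.
    return magic, known, MAGIC_SIZE + header_size


# A "newer" writer stores last_cpu_nr plus one extra 32-bit field.
header = struct.pack("<II", 8, 1234)
blob = struct.pack(MAGIC_FMT, 0x1234, len(header)) + header + b"DATA"
magic, (last_cpu_nr,), data_offset = read_header(blob)
```

An older reader that only knows last_cpu_nr still finds the start of the data, because it relies on header_size rather than on its own idea of the header length.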
Sebastien GODARD [Sun, 26 Jan 2014 08:17:03 +0000 (09:17 +0100)]
Rename nfsiostat command to nfsiostat-sysstat
nfsiostat was added to the sysstat package in 2010, but such a command
has already existed in the nfs-utils package since 2008.
So to avoid confusion, rename nfsiostat to nfsiostat-sysstat and
indicate it is now obsolete.
The nfsiostat command from the sysstat package will be removed in a
future version since we don't need two similar commands.
Sebastien GODARD [Wed, 15 Jan 2014 21:31:33 +0000 (22:31 +0100)]
Add option --enable-copy-only to configure script
Add a new option (--enable-copy-only) to configure script. This
option makes sure that files are only copied when installing sysstat and
that nothing else (like activating a service for distro using systemd)
is done. This may be useful when creating sysstat package.
Sebastien GODARD [Mon, 13 Jan 2014 21:07:02 +0000 (22:07 +0100)]
sadf now displays file creation time
sadf -H, sadf -x and sadf -j now display the creation time of the
datafile. This time was available in the header of the datafile.
Time is displayed in UTC.
Sebastien GODARD [Sun, 12 Jan 2014 07:10:51 +0000 (08:10 +0100)]
Take $DESTDIR variable into account when installing sysstat service used by systemd.
Previous code for installing sysstat using systemd needed rw permissions
to the root directory. Yet some users and distributions want to install
files into $(DESTDIR) before actually installing into the root directory.
So take $DESTDIR variable into account.
pidstat: Display stats since boot time for a list of given processes
pidstat displays statistics since system startup when the interval and
count parameters are not set on the command line (eg. entering "pidstat
-d" will display I/O statistics for all processes that have had I/O
activity since boot time). But pidstat couldn't display those stats when
some PID numbers were entered on the command line (eg. "pidstat -d -p
1234" to display I/O stats since system startup for process 1234).
This patch makes it possible now.
Rearrange options displayed by sar in its short help message
(sar -h): The upper case option should be displayed before its lower
case counterpart to be consistent with options order displayed by sar
usage message or displayed in sar manual page.
pidstat -d now displays -1 for I/O statistics values when the
file containing the statistics data for the corresponding process cannot
be read (permission denied or file non existent).
pidstat stack utilization statistics were not always properly refreshed
Stack utilization statistics displayed by "pidstat -s" were sometimes
not displayed for some processes although values had changed.
This was because stack stats were displayed together with memory stats,
but pidstat was looking only for a variation in memory counters values
to decide whether to display the corresponding process or not.
So now process memory and stack statistics independently.
mpstat/pidstat exit immediately when SIGINT caught during 1st interval
The mpstat and pidstat commands display their average statistics when
they are interrupted by the user with control-C (SIGINT signal).
But when the signal was caught during the first interval of time (ie.
before any statistics had been displayed), trying to display some
average values is irrelevant. So make them exit immediately in this
case.
sar (or sadc) allocates empty records in the data files it creates so
that it can save statistics for devices (disks, network interfaces,
etc.) that may be added to the system after the file was created.
The drawback is that data files take more space on disk than actually
strictly necessary.
Using the "prealloc" variable with configure (before compiling sysstat),
the user can tell how much space he wants to allocate.
This variable will determine the size of data files created by sar/sadc.
The default value is 1, meaning that some empty records will be
allocated.
A value of 0 means that data files will be the smallest possible.
Sebastien GODARD [Tue, 27 Aug 2013 19:28:33 +0000 (21:28 +0200)]
Fix wrong permissions for data file created by sa1 script
If HISTORY is greater than 28, the sa1 script does not execute the
command "umask 0022" until after the new output file is created,
allowing it to have the wrong permissions if the root umask is not set
to 0022. So move the umask command above that "if" statement.
Reported-by: Peter Schiffer <pschiffe@redhat.com.fake>
Signed-off-by: Sebastien GODARD <sysstat@orange.fr.fake>
Sebastien GODARD [Thu, 15 Aug 2013 14:20:12 +0000 (16:20 +0200)]
Fix sar log file corruption in odd Feb 28th edge-case
The sar scripts /usr/lib/sa/sa1 and sa2 both normally use logs in
/var/log/sa itself, but if HISTORY is > 28, the scripts use a tree of
log directories under /var/log/sa.
When HISTORY is < 28, then /usr/lib/sa/sa2's action to expunge old logs
which are more than HISTORY days old will mean that "tomorrow's" saXX
file never exists prior to /usr/lib/sa/sa1 creating it on the first pass
of the day.
However, when HISTORY == 28, then on March 1st in a non-leap year, the
log file sa01 will already exist from Feb 1st having not been
pre-expunged by the sa2 script. Similarly for March 2nd through 28th.
So now make sure that the sa1 script deletes the log file
if it is from a previous month. This way this will
prevent a log file from a month ago being re-used "today".
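The check described above can be sketched as a comparison between the month in which the existing file was last written and the current month. This is an illustrative model, not the sa1 script; the function name is hypothetical.

```python
from datetime import date


def is_stale_datafile(file_mtime: date, today: date) -> bool:
    """True when the existing saDD file was last written in another month."""
    return (file_mtime.year, file_mtime.month) != (today.year, today.month)
```

On March 1st of a non-leap year, an sa01 file last written on February 1st would be reported as stale and could be removed before new data is saved.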
Update iostat manual page: Fix a small inaccuracy regarding %util field
iostat manual page used to say that device saturation occurred when the
value of %util was close to 100%. This is true for devices serving
requests serially but not necessarily for devices able to serve multiple
requests simultaneously. So update iostat manual page accordingly.
Use a lightweight static library to compile some sysstat commands
Create two versions of the "librdstats.a" static library: one having all
the functions to read stats and which will be used by sadc, the other
having only a minimal subset of functions used by other commands like
iostat or pidstat.
The result is a much smaller binary file for iostat (size is 29% smaller
than before), pidstat (-27%), mpstat (-40%), nfsiostat (-43%) and
cifsiostat (-43%).
Sebastien GODARD [Sun, 30 Jun 2013 13:07:44 +0000 (15:07 +0200)]
"sar -o" now collects all possible statistics
sar now collects all possible statistics (including partitions ones)
when data are saved into a file with option -o ("sar -o" calls sadc
with option "-S XALL" instead of "-S ALL").
Sebastien GODARD [Wed, 26 Jun 2013 19:53:04 +0000 (21:53 +0200)]
Small fix for "sar -A" in sar manual page
Indicate that filesystems statistics are also included in stats
displayed by sar -A (that is to say: option -F is also set when sar -A
is entered on the command line).
Sebastien GODARD [Sun, 23 Jun 2013 15:07:31 +0000 (17:07 +0200)]
%ifutil: Update sadf output to take into account the new metric
This patch updates the various output formats of sadf (CSV, ppc, JSON,
XML) to take into account the new %ifutil metric added to sar network
devices statistics.
DTD and XSD documents have also been updated accordingly.
Sebastien GODARD [Sun, 23 Jun 2013 14:57:06 +0000 (16:57 +0200)]
%ifutil: Add NIC utilization percentage to sar network stats
This patch adds %ifutil to statistics displayed by sar -n.
%ifutil is a new metric giving the utilization percentage of the
corresponding network interface.
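The commit message does not give the formula. The sketch below assumes the definition found in the sar manual page: for half-duplex interfaces the sum of the receive and transmit rates is compared with the interface speed, and for full-duplex interfaces the greater of the two is used. The function and parameter names are hypothetical.

```python
def ifutil(rx_bytes_s: float, tx_bytes_s: float,
           speed_mbit: float, full_duplex: bool) -> float:
    """Utilization percentage of a network interface (assumed definition)."""
    if speed_mbit <= 0:
        # Unknown interface speed: no meaningful percentage can be computed.
        return 0.0
    if full_duplex:
        load = max(rx_bytes_s, tx_bytes_s)
    else:
        load = rx_bytes_s + tx_bytes_s
    # Convert bytes/s to bits/s and express as a percentage of the speed.
    return load * 8 * 100 / (speed_mbit * 1_000_000)
```

For example, under these assumptions a full-duplex 100 Mbit/s interface receiving 12,500,000 bytes per second is fully utilized.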
Sebastien GODARD [Wed, 12 Jun 2013 19:39:19 +0000 (21:39 +0200)]
Collect filesystems stats only when sadc option "-S XDISK" is used
Make sadc collect filesystems statistics (those displayed by sar option
-F) only when option "-S XDISK" is used.
Filesystems are actually closer to partitions than to disks. So it makes
more sense for sadc to collect them when option "-S XDISK" is used
rather than option "-S DISK".
Update sadc and sar manual pages to reflect the change.
Sebastien GODARD [Tue, 11 Jun 2013 19:46:56 +0000 (21:46 +0200)]
Fix a wrong assertion in sadf manual page
In its EXAMPLE section, sadf manual page says that
sadf -d /var/log/sa/sa21 -- -r -n DEV
extracts memory, swap space and network statistics from the data file.
This is no longer true as swap statistics are now displayed with option
-S and not option -r. So fix the wrong comment.
Tell where interrupts data come from in mpstat manual page
mpstat manual page now tells that interrupts data displayed by
mpstat -I CPU come from /proc/interrupts file, and that interrupts data
displayed by mpstat -I SCPU come from /proc/softirqs file, so that the
meaning of each interrupt can be more easily understood by users.
Type for "intr" attribute was integer in XSD document.
Yet its value can sometimes be "sum" when displaying statistics for the
total number of interrupts received per second (sar -I SUM).
So change its type to string.
Sebastien GODARD [Thu, 30 May 2013 20:26:18 +0000 (22:26 +0200)]
Handle octal codes in filesystems mount point names
sar -F was unable to get statistics about filesystems whose mount points
(as given by /etc/mtab file) had octal codes in their names.
So now sar replaces those octal codes with their corresponding
characters before trying to collect stats about them.
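The replacement step can be sketched as follows, assuming the escapes take the form of a backslash followed by three octal digits (for example \040 for a space). This is an illustration in Python, not the code used by sar, and the function name is hypothetical.

```python
import re

# A backslash followed by exactly three octal digits, e.g. \040 for a space.
_OCTAL_ESCAPE = re.compile(r"\\([0-7]{3})")


def decode_mount_point(name: str) -> str:
    """Replace each octal escape with the character it encodes."""
    return _OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), name)
```

A mount point recorded as /mnt/my\040disk would then be looked up as "/mnt/my disk".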
Sebastien GODARD [Mon, 20 May 2013 15:14:05 +0000 (17:14 +0200)]
Filesystems stats: Display unmounted filesystems in summary list
This patch enables sar -F to display filesystems in its summary list (the
last stats displayed by sar) even if those filesystems have been
unmounted before the end of the report.
Sebastien GODARD [Sun, 19 May 2013 09:18:22 +0000 (11:18 +0200)]
Split rd_stats.c and rd_stats.h files
rd_stats.c file was becoming really big. So remove from it functions
used to count the number of items and put them in a separate file
(count.c).
Functions prototypes go to count.h.