Sebastien Godard [Sun, 27 Feb 2011 14:45:23 +0000 (15:45 +0100)]
Option -V from sysstat commands now displays the version number on stdout
and returns 0 for the exit code.
Option -V used to display the version number on stderr and return 1 for
the exit code.
This was not the expected behavior, since the command had done everything
it was asked to do.
So change this: display on stdout and return 0.
The same change has been applied to sar's option -h, which displays a
help message.
Mail from Lodewijk Bonebakker <jlbonebakker@gmail.com> (15/02/2011):
Subject: Systat version reporting
Dear Sebastian,
I have a question related to the way you report the version number in sysstat. At the moment, it seems that you write:
sysstat version <versionno>
(C) Sebastien Godard (sysstat <at> orange.fr)
to stderr, and set the error-code to 1.
Given the significant changes between 7/8 and 9, we have a tremendous headache in automatically dealing with the different sar data files, collected during the day on different machines (some which we prefer not to upgrade). Currently in our environment we can work with this way of reporting your version number, but I would like to make a suggestion:
It would greatly help us if the version command returns only "sysstat version <versionno>" to stdout and sets the return code to 0. Our reasoning behind this is that 'sar -V' should print the version number and exit 0, since it has done everything we have asked it to do correctly. A non-zero exit code is then reserved for error-conditions. This way we can efficiently get the version-number, check for proper installation/kernel versioning etc.
Thank you for your time in considering this suggestion,
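The behavior requested in this mail can be sketched as follows. This is only an illustration: EXAMPLE_VERSION, format_version() and print_version() are hypothetical names, not taken from the sysstat sources.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical version string; a real build would supply its own */
#define EXAMPLE_VERSION "9.1.7"

/* Build the version banner; returns the length snprintf() reports */
int format_version(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "sysstat version %s", EXAMPLE_VERSION);
}

/* Print the banner on stdout and return the exit status to use */
int print_version(void)
{
    char buf[64];

    format_version(buf, sizeof(buf));
    puts(buf);      /* stdout, not stderr */
    return 0;       /* success: nothing went wrong */
}
```

A script can then capture the output of the command and treat any non-zero exit status as a real error.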
Added the possibility to extend the number of slots for NFS and
CIFS mount points on the fly.
Mail from Ivana Varekova (varekova@redhat.com) 02/02/2011:
Hello, I'm sending 6 patches - 3 for nfsiostat and 3 for cifsiostat
nfsiostat:
nfsiostat.patch - adds the possibility to extend the number of
slots for nfs mount points during the nfsiostat run
cifsiostat:
cifsiostat.patch - adds the possibility to extend the number of
slots for cifs mount points during the cifsiostat run
See also mail from Masanari Iida (masanari.iida@hp.com) 28/01/2011:
Hello
I have a feedback about nfsiostat behavior.
Description
nfsiostat need to be run _AFTER_ the NFS share is mounted.
Version.
nfsiostat in sysstat 9.1.7
How to reproduce
(1) Run nfsiostat -k 1
(2) Mount one NFS share
(3) Check nfsiostat output.
Expected result.
nfsiostat start to report the NFS share's activities, after I mount it.
Actual result.
nfsiostat not reporting the mounted NFS share, even after it is mounted.
Additional information:
If I mount the NFS share _BEFORE_ i run nfsiostat, nfsiostat reports the
NFS share's information. And also, if I umount the NFS share,
the line disappeared. (This is expected.)
It is because, the environment is using autofs, mount and umount often
happens on the system.
But I was continuously running iostat -kn to collect the statistics, and then I
encountered this symptom. (Also tested with nfsiostat.)
In case of multiple NFS mount points, the symptom is a bit different.
If an additional NFS share is mounted BEFORE this test is done,
the target NFS share appears after it is mounted, and disappears after umount.
But if I do the same thing one more time, it is not displayed any more.
========
To say the truth, the original symptom is happen on sysstat 7.0
on RHEL5 system, using iostat -kn. The environment uses autofs.
I wanted to confirm if the upstream version already fixed this symptom,
so that's why I downloaded the latest tar ball and tried.
But so far, the similar symptom still exist.
If you think this is current known limitation or known bug,
please document it in man page or FAQ page of your web.
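Extending the number of slots on the fly, as described for this change, could look like the sketch below. It is not the actual patch: struct mount_slot, struct slot_table and get_slot() are hypothetical names, and the doubling growth policy is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-mount-point statistics slot */
struct mount_slot {
    char name[128];
    unsigned long long rd_bytes;
    unsigned long long wr_bytes;
};

struct slot_table {
    struct mount_slot *slots;
    size_t used;
    size_t capacity;
};

/*
 * Return the slot for the named mount point, creating it if needed and
 * growing the table when it is full. Returns NULL if memory cannot be
 * allocated, in which case the table is left unchanged.
 */
struct mount_slot *get_slot(struct slot_table *t, const char *name)
{
    size_t i;

    assert(t != NULL && name != NULL);

    for (i = 0; i < t->used; i++) {
        if (strcmp(t->slots[i].name, name) == 0)
            return &t->slots[i];
    }

    if (t->used == t->capacity) {
        size_t new_cap = t->capacity ? t->capacity * 2 : 4;
        struct mount_slot *p = realloc(t->slots, new_cap * sizeof(*p));

        if (p == NULL)
            return NULL;
        t->slots = p;
        t->capacity = new_cap;
    }

    memset(&t->slots[t->used], 0, sizeof(t->slots[0]));
    snprintf(t->slots[t->used].name, sizeof(t->slots[0].name), "%s", name);
    return &t->slots[t->used++];
}
```

Pointers returned earlier become invalid once the table grows, so callers of such a helper should look slots up again after each sample rather than cache them.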
Fix a problem with long NFS and CIFS share names in cifsiostat and
nfsiostat.
Mail from Ivana Varekova (varekova@redhat.com) 02/02/2011
Hello, I'm sending 6 patches - 3 for nfsiostat and 3 for cifsiostat
nfsiostat:
nfsiostat3.patch - fix the problem with long nfs shares names
(...)
cifsiostat:
cifsiostat3.patch - fix the problem with long cifs shares names
(...)
Check calloc() return value in nfsiostat and cifsiostat.
The return value of a calloc() call allocating structures in nfsiostat
and cifsiostat wasn't checked. This call could fail without ever being
noticed.
Mail from Ivana Varekova (varekova@redhat.com) 02/02/2011
Hello, I'm sending 6 patches - 3 for nfsiostat and 3 for cifsiostat
nfsiostat:
nfsiostat2.patch - adds the forgotten test to malloc
(...)
cifsiostat:
cifsiostat2.patch - adds the forgotten test to malloc
(...)
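A checked allocation of the kind this fix introduces can be sketched as below. xcalloc() is a hypothetical helper name, and exiting on failure is an assumption about the policy, not a description of the actual patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Allocate zeroed memory for nmemb elements of the given size.
 * On failure, report the error and terminate instead of returning NULL.
 */
void *xcalloc(size_t nmemb, size_t size)
{
    void *p;

    assert(nmemb > 0 && size > 0);

    p = calloc(nmemb, size);
    if (p == NULL) {
        perror("calloc");
        exit(EXIT_FAILURE);
    }
    return p;
}
```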
Added --debuginfo option to cifsiostat and nfsiostat commands.
Jan Kaluza from Red Hat (jkaluza@redhat.com) contributed the patch adding
option --debuginfo to the cifsiostat and nfsiostat commands.
His mail (06/30/2010):
Hi,
thanks for applying my previous patch in Sysstat 9.1.3 (I'm really proud to be
in the Changelog). I've created another patch which adds --debuginfo option
also for new tools (nfsiostat, cifsiostat) introduced in this release. I think
it could help debugging in some situations. Please feel free to ask any
question about that patch.
Jan Kaluza
cifsiostat and nfsiostat manual pages have also been updated.
By default, the sysstat_panic() function is no longer compiled into the
binaries. It is defined only if --enable-debuginfo has been used with
configure.
Sebastien Godard [Fri, 17 Dec 2010 20:19:53 +0000 (21:19 +0100)]
Added a new field (blocked) to sar -q.
This patch adds a new metric (blocked - number of tasks currently
blocked, waiting for I/O to complete) to sar -q.
Also update sar manual page, and DTD/XSD documents.
Note that this breaks stats_queue structure format.
Sebastien Godard [Sat, 11 Dec 2010 20:49:22 +0000 (21:49 +0100)]
No longer assume that device-mapper major number is 253.
Get the real number from /proc/devices file.
The sar, sadf and iostat commands used to assume that the device-mapper
major number was 253. This is not always the case, so get the real
number from the /proc/devices file.
From Mike Coleman <tutufan@gmail.com> 04/10/2010:
The iostat program seems to assume that the major device number for
devmap will always be 253. This doesn't seem to be an official
number, though, and I have a box where it actually ends up being 252,
which breaks 'iostat -N'.
It looks like this can be determined dynamically by looking at /proc/devices.
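Looking up the major number dynamically could be done along these lines. This is a sketch, not the code that was committed: read_devmap_major() and get_devmap_major() are hypothetical names, and the parser assumes the usual "number name" line layout of /proc/devices.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEVMAP_NAME "device-mapper"

/*
 * Scan a stream in /proc/devices format and return the major number
 * registered for device-mapper, or -1 if it is not listed.
 */
int read_devmap_major(FILE *fp)
{
    char line[128], name[64];
    int major;

    assert(fp != NULL);

    while (fgets(line, sizeof(line), fp) != NULL) {
        /* Section titles and blank lines do not match this pattern */
        if (sscanf(line, "%d %63s", &major, name) == 2 &&
            strcmp(name, DEVMAP_NAME) == 0)
            return major;
    }
    return -1;
}

/* Convenience wrapper reading the real file */
int get_devmap_major(void)
{
    FILE *fp = fopen("/proc/devices", "r");
    int major;

    if (fp == NULL)
        return -1;
    major = read_devmap_major(fp);
    fclose(fp);
    return major;
}
```

Taking a FILE pointer in the parsing function keeps it testable on systems where device-mapper is not loaded.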
Sebastien Godard [Sat, 11 Dec 2010 14:18:19 +0000 (15:18 +0100)]
[Kenichi Okuyama]: Small change to sar manual page.
Kenichi Okuyama <kenichi.okuyama@gmail.com> noticed that a sentence
in sar manual page was a bit confusing:
ip-frag
Number of IP fragments currently in use.
because unless you are familiar with
Linux kernel internals, "IP fragments" are not something that can be IN USE.
This is really the number of elements in the fragment queues, where
each queue element is a "group of fragmented IP packets" which, when
completed, will become a single IP packet.
So replace it with:
ip-frag
Number of IP fragments currently in queue.
Sebastien Godard [Tue, 30 Nov 2010 20:28:35 +0000 (21:28 +0100)]
pidstat: Code cleaned.
A comment in pidstat.c (get_pid_to_display() function) still referred
to option -X, although this option no longer exists as it was
merged with option -C.
Sebastien Godard [Mon, 22 Nov 2010 14:22:01 +0000 (15:22 +0100)]
Fixed bogus CPU statistics output, which happened when
CPU user value from /proc/stat wasn't incremented whereas
CPU guest value was.
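A guard against this kind of overflow might look like the sketch below. It assumes that the user counter in /proc/stat includes guest time, so that guest time has to be subtracted, and that the counters are cumulative; user_percent() is a hypothetical name, not the function used in the sysstat sources.

```c
#include <assert.h>

/*
 * Percentage of the interval spent in user mode, excluding guest time.
 * prev_ and curr_ values are cumulative counters; itv is the length of
 * the interval in the same unit.
 */
double user_percent(unsigned long long prev_user, unsigned long long curr_user,
                    unsigned long long prev_guest, unsigned long long curr_guest,
                    unsigned long long itv)
{
    unsigned long long d_user, d_guest;

    assert(itv > 0);

    d_user = curr_user - prev_user;
    d_guest = curr_guest - prev_guest;

    /* Unsigned subtraction would wrap around here, so report 0 instead */
    if (d_guest > d_user)
        return 0.0;

    return 100.0 * (double)(d_user - d_guest) / (double)itv;
}
```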
From the Fedora Bugzilla database.
Ivana Varekova 2010-10-15 09:05:41 EDT
Description of problem:
The output of sar command is bogus, the value of %usr overflows
Version-Release number of selected component (if applicable):
last upstream (http://sebastien.godard.pagesperso-orange.fr/download.html) -
sysstat-9.1.5
How reproducible:
Steps to Reproduce:
1.# sar -u ALL -P ALL 1 1000
Linux 2.6.32.21-168.fc12.i686 (localhost) _i686_ (2 CPU)
...
02:59:54 PM  CPU          %user  %nice  %system  %iowait  %steal  %idle
02:59:56 PM  all           5.24   0.00     4.52     0.00    0.00  90.24
02:59:56 PM    0  4487176417.15   0.00     7.46     0.00    0.00  85.07
02:59:56 PM    1           3.20   0.00     1.83     0.00    0.00  94.98
....
2.
3.
Actual results:
02:59:56 PM    0  4487176417.15   0.00     7.46     0.00    0.00  85.07
                  ^should be zero
Additional info:
the problem happens if the user value from /proc/stat is not incremented, and
the guest value from /proc/stat for the same cpu is incremented. In this case
sar should output 0 as the %usr value.
(This situation can happen - see the code in kernel: account_guest_time
function) e.g.:
Sebastien Godard [Sun, 14 Nov 2010 12:24:57 +0000 (13:24 +0100)]
Fix segfaults on bogus localtime input.
The return value of the localtime() function (and also of gmtime())
wasn't checked. In some (rare) cases, it can be a NULL pointer,
resulting in a segmentation fault.
The return value is now checked, but no specific action is performed
anyway.
Original patch from Ivana Varekova from Red Hat (04/10/2010):
diff -up sysstat-9.0.6.1/sar.c.pom sysstat-9.0.6.1/sar.c
--- sysstat-9.0.6.1/sar.c.pom 2009-10-17 15:08:21.000000000 +0200
+++ sysstat-9.0.6.1/sar.c 2010-10-04 12:21:13.383442188 +0200
@@ -247,7 +247,7 @@ void reverse_check_act(unsigned int act_
* @curr Index in array for current sample statistics.
***************************************************************************
*/
-void sar_get_record_timestamp_struct(int curr)
+int sar_get_record_timestamp_struct(int curr)
{
struct tm *ltm;
@@ -312,13 +319,17 @@ int check_line_hdr(void)
* @cur_time Timestamp string.
***************************************************************************
*/
-void set_record_timestamp_string(int curr, char *cur_time, int len)
+int set_record_timestamp_string(int curr, char *cur_time, int len)
{
+ int ret;
/* Fill timestamp structure */
- sar_get_record_timestamp_struct(curr);
+ ret = sar_get_record_timestamp_struct(curr);
+ if (ret != 0)
+ return ret;
/* Set cur_time date value */
strftime(cur_time, len, "%X", &rectime);
+ return 0;
}
/*
@@ -407,6 +418,7 @@ int write_stats(int curr, int read_from_
int use_tm_end, int reset, unsigned int act_id)
{
int i;
+ int ret;
unsigned long long itv, g_itv;
static int cross_day = 0;
static __nr_t cpu_nr = -1;
@@ -423,9 +435,14 @@ int write_stats(int curr, int read_from_
}
/* Set previous timestamp */
- set_record_timestamp_string(!curr, timestamp[!curr], 16);
+ ret = set_record_timestamp_string(!curr, timestamp[!curr], 16);
+ if (ret != 0)
+ return ret;
+
/* Set current timestamp */
- set_record_timestamp_string(curr, timestamp[curr], 16);
+ ret = set_record_timestamp_string(curr, timestamp[curr], 16);
+ if (ret != 0)
+ return ret;
/* Check if we are beginning a new day */
if (use_tm_start && record_hdr[!curr].ust_time &&
@@ -569,8 +586,11 @@ int sar_print_special(int curr, int use_
{
char cur_time[26];
int dp = 1;
+ int ret;
- set_record_timestamp_string(curr, cur_time, 26);
+ ret = set_record_timestamp_string(curr, cur_time, 26);
+ if (ret != 0)
+ return ret;
/* The record must be in the interval specified by -s/-e options */
if ((use_tm_start && (datecmp(&rectime, &tm_start) < 0)) ||
@@ -865,7 +885,8 @@ void read_stats_from_file(char from_file
*/
read_file_stat_bunch(act, 0, ifd, file_hdr.sa_nr_act,
file_actlst);
- sar_get_record_timestamp_struct(0);
+ if (sar_get_record_timestamp_struct(0))
+ continue;
}
}
while ((rtype == R_RESTART) || (rtype == R_COMMENT) ||
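Reduced to its core, the check added by this patch follows the pattern sketched here. format_timestamp() is a hypothetical helper written for illustration; it is not part of the patch above.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/*
 * Format a timestamp as local time using the "%X" conversion.
 * Returns 0 on success, or -1 if localtime() cannot convert the value
 * or the buffer is too small.
 */
int format_timestamp(time_t t, char *buf, size_t len)
{
    struct tm *ltm;

    assert(buf != NULL && len > 0);

    ltm = localtime(&t);
    if (ltm == NULL)
        return -1;      /* bogus input: do not dereference */

    if (strftime(buf, len, "%X", ltm) == 0)
        return -1;

    return 0;
}
```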
Sebastien Godard [Fri, 12 Nov 2010 15:44:21 +0000 (16:44 +0100)]
sar now tells sadc to read only the necessary groups of activities.
We noticed that a simple command like "sar 0" had a small delay before
displaying the CPU statistics since system startup. This was because
sar always called sadc with option -S ALL, resulting in all
possible activities being read.
Now, except if sar's option -o is used (in which case all possible
activities will still be read), sar tells sadc to read only the groups
of activities that include those that will be displayed on screen.
Sebastien Godard [Thu, 11 Nov 2010 17:37:31 +0000 (18:37 +0100)]
Small update to sar's manual page.
Sar's manual page says twice that -A option is equivalent to specifying
--bBdq... This led to inconsistencies as one was sometimes not updated
when a new option was added. So replace one of them by "selects all
possible activities".
Updated lsm and spec files.
Updated release date in CHANGES file.
Note that this patch also includes a small fix in the spec file,
where a cron file had still not been moved into its own subdirectory.
.gitignore file has been updated.
This patch also includes a small fix for sar CPU frequency activity
(sar -m CPU). It now takes into account the specific case of machines
where /sys isn't mounted and which don't have an SMP kernel running.
In this case, the number of CPUs is counted using the /proc/stat file, and
this file only has a line with global CPU statistics. The number of
items for CPU frequency activity is then equal to 1, which was
badly handled by the read_cpuinfo() function in rd_stats.c. This patch
fixes that.
Define groups of activities: Each activity now has a new
attribute specifying the group it belongs to (POWER, IPV6, etc.)
Add a new attribute to the activity structure, defining the group
the activity belongs to. This makes the code more generic, as it is
no longer necessary to update sadc.c whenever a new activity is added
to a group.
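The group attribute, and the way it lets sar ask only for the groups it needs, can be pictured with the following sketch. The flag values, struct activity and needed_groups() are hypothetical and only echo the group names mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical group flags; activities in the default set use 0 */
#define G_DEFAULT 0x00
#define G_DISK    0x01
#define G_IPV6    0x02
#define G_POWER   0x04

struct activity {
    const char *name;
    unsigned int group;   /* group the activity belongs to */
    int selected;         /* non-zero if it will be displayed */
};

/* Return the union of the groups needed by the selected activities */
unsigned int needed_groups(const struct activity *act, size_t nr)
{
    unsigned int groups = 0;
    size_t i;

    assert(act != NULL || nr == 0);

    for (i = 0; i < nr; i++) {
        if (act[i].selected)
            groups |= act[i].group;
    }
    return groups;
}
```

Because the group lives in the activity itself, adding an activity to a group only means filling in one more table entry.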
Sebastien Godard [Sun, 24 Oct 2010 15:39:20 +0000 (17:39 +0200)]
Added CPU average clock frequency statistics to sar and sadc.
This patch adds a new option to sar (-m FREQ) that displays
the following field: wghMHz.
For this option to work, the cpufreq-stats driver must be compiled
into the kernel, as we need to read the "time_in_state" file in /sys.
sadc and sadf have also been updated to take into account this new field.
DTD and XSD documents have been updated.
The sar manual page has been updated.
Mail from Zhen Zhang (08/09/2010) <furykerry@gmail.com>
Hi ,
The current stable and development systat collect cpu frequency data
from /proc/cpuinfo, but currently cpuinfo "cpu Mhz" field report the
instant cpu frequency . From a system administrator point of view
however ,the preferred metric is the average cpu frequency at
reporting interval . The average frequency can be obtain from
/sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state.
Will the sysstat switch to average cpu frequency at next development version?
Thanks
Mail from Zhen Zhang (11/09/2010) <furykerry@gmail.com>
I want to measure the average cpu frequency of a machine which
probably is capable of dynamic adjusting frequency (DVFS). The average
of cpu frequency is an important metric to evaluate the power
consumption of a machine, and how hard the machine is working to serve
the requests. DVFS capability is starting to get wide adoption in the
server domain e.g. in recent Xeon.
The /proc/cpuinfo interface only records the instant cpu frequency;
however, the linux kernel or a user frequency governor can adjust the cpu
frequency frequently, e.g. for the default ondemand governor the
frequency is 10ms. Such an interval is way too small compared to the usual
sysstat interval, e.g. 5min. So an accumulated value is needed.
cpufreq-stats is a driver ( It seems had entered into kernel 2.6.11,
and its document is available at kernel 2.6.12,
http://lxr.linux.no/#linux+v2.6.12/Documentation/cpu-freq/cpufreq-stats.txt).
For recent ubuntu , cpufreq-stats and cpufreq driver is built into
kernel. The average frequency can be fetch as follow
each line define a pair of frequency and its accumulated ticks since
reboot. sysstat can sample it and take the difference as the
accumulated ticks at the sampling interval , and calculated weighted
average cpu frequency.
The cpufreq-stats do have some pitfall which is addressed in patch
(https://patchwork.kernel.org/patch/72488/).
Nevertheless its current form is already quite useful, I suggest
sysstat to utilize if available , and fall back to /proc/cpuinfo if
not.
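The computation suggested in this mail amounts to a tick-weighted mean of the frequencies over the sampling interval. The sketch below illustrates it under the assumption that time_in_state reports frequencies in kHz; struct freq_state and weighted_avg_mhz() are hypothetical names, not the ones used in sysstat.

```c
#include <assert.h>
#include <stddef.h>

/* One line of time_in_state: a frequency and its cumulative tick count */
struct freq_state {
    unsigned long freq_khz;
    unsigned long long ticks;
};

/*
 * Weighted average frequency in MHz over an interval, computed from two
 * samples holding the same nr states in the same order.
 * Returns 0.0 if no ticks elapsed between the samples.
 */
double weighted_avg_mhz(const struct freq_state *prev,
                        const struct freq_state *curr, size_t nr)
{
    unsigned long long total = 0, delta;
    double weighted = 0.0;
    size_t i;

    assert(prev != NULL && curr != NULL);

    for (i = 0; i < nr; i++) {
        delta = curr[i].ticks - prev[i].ticks;
        weighted += (double)curr[i].freq_khz * (double)delta;
        total += delta;
    }
    if (total == 0)
        return 0.0;

    return weighted / (double)total / 1000.0;
}
```

For instance, 10 ticks at 1000 MHz and 30 ticks at 2000 MHz give (10 * 1000 + 30 * 2000) / 40 = 1750 MHz.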
Sebastien Godard [Sun, 10 Oct 2010 14:39:28 +0000 (16:39 +0200)]
Added a new magic value for each activity in file.
A format change can now affect only one activity instead of the whole file.
Sadf has also been updated to be able to display activities with
unknown format (sadf -H).
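One way to picture a per-activity magic value is the check sketched below. The structures and activity_format_known() are hypothetical and do not reflect the real layout of sysstat data files.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor stored in the data file for each activity */
struct file_activity {
    unsigned int id;
    unsigned int magic;   /* format version of this activity's data */
};

/* Hypothetical description of an activity known to the reader */
struct known_activity {
    unsigned int id;
    unsigned int magic;
};

/*
 * Return 1 if the reader understands the format of this activity as
 * stored in the file, 0 if the id is unknown or the magic differs.
 */
int activity_format_known(const struct file_activity *fa,
                          const struct known_activity *known, size_t nr)
{
    size_t i;

    assert(fa != NULL && (known != NULL || nr == 0));

    for (i = 0; i < nr; i++) {
        if (known[i].id == fa->id)
            return known[i].magic == fa->magic;
    }
    return 0;
}
```

A reader built this way can skip or flag a single activity whose format changed while still processing the rest of the file.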
Create a new activity (A_HUGE) for hugepages statistics.
Hugepages statistics have been added as an additional output
for memory activity by commit d7ed8d382140e2d709a6753fa44a0acfcba91a7e.
Create a dedicated activity for them (A_HUGE). This is cleaner,
although the drawback is that the /proc/meminfo file will now be read twice.
Added SADC_OPTIONS to sysstat configuration file, and sysstat(5) manual page.
Mail from Ivana Varekova (20/09/2010):
SADC_OPTIONS is now the preferred way to pass args to sadc. It is read by the
sa1 and sa2 shell scripts from the /etc/sysconfig/sysstat configuration file.
Also add sysstat(5) manual page that describes the various environment
variables and their meanings.
Mail from Ivana Varekova (21/09/2010):
Using --disable-man-group option with configure resulted in man_group variable
being used. With --enable-man-group, the variable was ignored, which is the
opposite of what is expected.
This patch fixes that.
Moved manual pages from $prefix/man to $prefix/share/man.
Mail from Ivana Varekova (21/09/2010).
The Linux Filesystem Hierarchy now defines the default location for
manual pages as /usr/share/man instead of /usr/man.
So update sysstat to reflect this change.
See: http://tldp.org/LDP/Linux-Filesystem-Hierarchy/html/Linux-Filesystem-Hierarchy.html#usr
Updated .gitignore file to ignore some more files.
Updated various source headers: (C) 2010 instead of (C) 2009.
Updated sar manual page: sar -A also includes -m ALL.