Sebastien Godard [Fri, 17 Dec 2010 20:19:53 +0000 (21:19 +0100)]
Added a new field (blocked) to sar -q.
This patch adds a new metric (blocked - number of tasks currently
blocked, waiting for I/O to complete) to sar -q.
Also update sar manual page, and DTD/XSD documents.
Note that this breaks stats_queue structure format.
Sebastien Godard [Sat, 11 Dec 2010 20:49:22 +0000 (21:49 +0100)]
No longer assume that device-mapper major number is 253.
Get the real number from /proc/devices file.
The sar, sadf and iostat commands used to assume that device-mapper
major number was 253. This happened to be false sometimes. So get
the real number from the /proc/devices file.
From Mike Coleman <tutufan@gmail.com> 04/10/2010:
The iostat program seems to assume that the major device number for
devmap will always be 253. This doesn't seem to be an official
number, though, and I have a box where it actually ends up being 252,
which breaks 'iostat -N'.
It looks like this can be determined dynamically by looking at /proc/devices.
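The lookup described above can be sketched as follows. This is a minimal illustration, not the code actually committed: the helper names find_major() and get_devmap_major() are hypothetical, and falling back to 253 when the entry is missing is an assumption.

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan a /proc/devices-style stream for a driver name and return its
 * major number, or -1 if the name is not listed.
 * (Hypothetical helper, for illustration only.)
 */
int find_major(FILE *fp, const char *name)
{
    char line[128], dev[64];
    int major;

    while (fgets(line, sizeof(line), fp) != NULL) {
        /* Data lines look like "253 device-mapper"; header and blank
         * lines do not yield two fields and are skipped. */
        if (sscanf(line, "%d %63s", &major, dev) == 2 &&
            strcmp(dev, name) == 0)
            return major;
    }
    return -1;
}

/*
 * Look up device-mapper in the real file. Falling back to 253 when the
 * entry cannot be found is an assumption made for this sketch.
 */
int get_devmap_major(void)
{
    FILE *fp = fopen("/proc/devices", "r");
    int major = -1;

    if (fp != NULL) {
        major = find_major(fp, "device-mapper");
        fclose(fp);
    }
    return (major < 0) ? 253 : major;
}
```

Taking a FILE pointer in find_major() keeps the parsing separate from the file location, so it can be exercised on a box that has no device-mapper entry at all.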
Sebastien Godard [Sat, 11 Dec 2010 14:18:19 +0000 (15:18 +0100)]
[Kenichi Okuyama]: Small change to sar manual page.
Kenichi Okuyama <kenichi.okuyama@gmail.com> noticed that a sentence
in sar manual page was a bit confusing:
ip-frag
Number of IP fragments currently in use.
because unless you are familiar with
Linux kernel internals, "IP fragments" are not something to be IN USE.
This is really the number of elements in the fragment queues, and
each queue element is a "group of fragmented IP packets" which, when
completed, will become a single IP packet.
So replace it with:
ip-frag
Number of IP fragments currently in queue.
Sebastien Godard [Tue, 30 Nov 2010 20:28:35 +0000 (21:28 +0100)]
pidstat: Code cleaned.
A comment in pidstat.c (get_pid_to_display() function) still referred
to option -X, although this option no longer exists as it was
merged with option -C.
Sebastien Godard [Mon, 22 Nov 2010 14:22:01 +0000 (15:22 +0100)]
Fixed bogus CPU statistics output, which happened when
CPU user value from /proc/stat wasn't incremented whereas
CPU guest value was.
From the Fedora Bugzilla database.
Ivana Varekova 2010-10-15 09:05:41 EDT
Description of problem:
The output of sar command is bogus, the value of %usr overflows
Version-Release number of selected component (if applicable):
last upstream (http://sebastien.godard.pagesperso-orange.fr/download.html) -
sysstat-9.1.5
How reproducible:
Steps to Reproduce:
1.# sar -u ALL -P ALL 1 1000
Linux 2.6.32.21-168.fc12.i686 (localhost) _i686_ (2 CPU)
...
02:59:54 PM CPU %user %nice %system %iowait %steal %idle
02:59:56 PM all 5.24 0.00 4.52 0.00 0.00 90.24
02:59:56 PM 0 4487176417.15 0.00 7.46 0.00 0.00 85.07
02:59:56 PM 1 3.20 0.00 1.83 0.00 0.00 94.98
....
2.
3.
Actual results:
02:59:56 PM 0 4487176417.15 0.00 7.46 0.00 0.00 85.07
^should be zero
Additional info:
The problem happens if the user value from /proc/stat is not incremented, and the
guest value from /proc/stat for the same CPU is incremented. In this case sar should
output 0 as the %usr value.
(This situation can happen - see the code in kernel: account_guest_time
function) e.g.:
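The overflow in the report above can be sketched as follows, assuming that the user counter in /proc/stat already includes guest time and that %usr is computed from the difference of (user - guest) between two samples. The function name pct_user_no_guest() and its parameters are hypothetical; this is not the code actually committed.

```c
/*
 * Percentage of the interval spent in user mode, guest time excluded.
 * All counters are cumulative jiffies read from two successive samples;
 * itv is the length of the interval in jiffies.
 * (Hypothetical helper, for illustration only.)
 */
double pct_user_no_guest(unsigned long long user_prev,
                         unsigned long long guest_prev,
                         unsigned long long user_cur,
                         unsigned long long guest_cur,
                         unsigned long long itv)
{
    unsigned long long prev = user_prev - guest_prev;
    unsigned long long cur = user_cur - guest_cur;

    /*
     * If guest was incremented but user was not, cur is smaller than
     * prev and the unsigned subtraction below would wrap around to a
     * huge value. Report 0 instead.
     */
    if (itv == 0 || cur < prev)
        return 0.0;

    return 100.0 * (double)(cur - prev) / (double)itv;
}
```

With user unchanged at 1000 and guest going from 100 to 105, the guard returns 0 rather than a wrapped value such as the 4487176417.15 shown in the report.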
Sebastien Godard [Sun, 14 Nov 2010 12:24:57 +0000 (13:24 +0100)]
Fix segfaults on bogus localtime input.
The return code from localtime() function (and also gmtime() one)
wasn't checked. In some (rare) cases, it can return a NULL pointer
resulting in a segmentation fault.
The return code is now checked, but no specific action is performed
anyway.
Original patch from Ivana Varekova from RedHat (04/10/2010):
diff -up sysstat-9.0.6.1/sar.c.pom sysstat-9.0.6.1/sar.c
--- sysstat-9.0.6.1/sar.c.pom 2009-10-17 15:08:21.000000000 +0200
+++ sysstat-9.0.6.1/sar.c 2010-10-04 12:21:13.383442188 +0200
@@ -247,7 +247,7 @@ void reverse_check_act(unsigned int act_
* @curr Index in array for current sample statistics.
***************************************************************************
*/
-void sar_get_record_timestamp_struct(int curr)
+int sar_get_record_timestamp_struct(int curr)
{
struct tm *ltm;
@@ -312,13 +319,17 @@ int check_line_hdr(void)
* @cur_time Timestamp string.
***************************************************************************
*/
-void set_record_timestamp_string(int curr, char *cur_time, int len)
+int set_record_timestamp_string(int curr, char *cur_time, int len)
{
+ int ret;
/* Fill timestamp structure */
- sar_get_record_timestamp_struct(curr);
+ ret = sar_get_record_timestamp_struct(curr);
+ if (ret != 0)
+ return ret;
/* Set cur_time date value */
strftime(cur_time, len, "%X", &rectime);
+ return 0;
}
/*
@@ -407,6 +418,7 @@ int write_stats(int curr, int read_from_
int use_tm_end, int reset, unsigned int act_id)
{
int i;
+ int ret;
unsigned long long itv, g_itv;
static int cross_day = 0;
static __nr_t cpu_nr = -1;
@@ -423,9 +435,14 @@ int write_stats(int curr, int read_from_
}
/* Set previous timestamp */
- set_record_timestamp_string(!curr, timestamp[!curr], 16);
+ ret = set_record_timestamp_string(!curr, timestamp[!curr], 16);
+ if (ret != 0)
+ return ret;
+
/* Set current timestamp */
- set_record_timestamp_string(curr, timestamp[curr], 16);
+ ret = set_record_timestamp_string(curr, timestamp[curr], 16);
+ if (ret != 0)
+ return ret;
/* Check if we are beginning a new day */
if (use_tm_start && record_hdr[!curr].ust_time &&
@@ -569,8 +586,11 @@ int sar_print_special(int curr, int use_
{
char cur_time[26];
int dp = 1;
+ int ret;
- set_record_timestamp_string(curr, cur_time, 26);
+ ret = set_record_timestamp_string(curr, cur_time, 26);
+ if (ret != 0)
+ return ret;
/* The record must be in the interval specified by -s/-e options */
if ((use_tm_start && (datecmp(&rectime, &tm_start) < 0)) ||
@@ -865,7 +885,8 @@ void read_stats_from_file(char from_file
*/
read_file_stat_bunch(act, 0, ifd, file_hdr.sa_nr_act,
file_actlst);
- sar_get_record_timestamp_struct(0);
+ if (sar_get_record_timestamp_struct(0))
+ continue;
}
}
while ((rtype == R_RESTART) || (rtype == R_COMMENT) ||
Sebastien Godard [Fri, 12 Nov 2010 15:44:21 +0000 (16:44 +0100)]
sar now tells sadc to read only the necessary groups of activities.
We noticed that a simple command like "sar 0" had a small delay before
displaying the CPU statistics since system startup. This was because
in every case sar called sadc with option -S ALL resulting in all
possible activities being read.
Now, except if sar's option -o is used (in which case all possible
activities will still be read), sar tells sadc to read only the groups
of activities that include those that will be displayed on screen.
Sebastien Godard [Thu, 11 Nov 2010 17:37:31 +0000 (18:37 +0100)]
Small update to sar's manual page.
Sar's manual page says twice that -A option is equivalent to specifying
--bBdq... This led to inconsistencies as one was sometimes not updated
when a new option was added. So replace one of them by "selects all
possible activities".
Updated lsm and spec files.
Updated release date in CHANGES file.
Note that this patch also includes a small fix in spec file
where a cron file still hadn't been moved to its own subdirectory.
.gitignore file has been updated.
This patch also includes a small fix for sar CPU frequency activity
(sar -m CPU). It now takes into account a specific case of machines
where /sys isn't mounted and which don't have an SMP kernel running.
In this case, the number of CPU is counted using /proc/stat file, and
this file only has a line with global CPU statistics. The number of
items for CPU frequency activity is then equal to 1, which was
badly handled by read_cpuinfo() function in rd_stats.c. This patch
fixes that.
Define groups of activities: Each activity now has a new
attribute specifying the group it belongs to (POWER, IPV6, etc.)
Add a new attribute to the activity structure, defining the group
the activity belongs to. This makes the code more generic, as it is
no longer necessary to update sadc.c whenever a new activity is added
to a group.
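The idea of a per-activity group attribute can be sketched as follows. The flag names and values, the struct layout and the select_groups() helper are all hypothetical and only illustrate the principle; the real activity structure has many more fields.

```c
#include <stddef.h>

/* Hypothetical group flags; the real names and values may differ. */
#define G_DEFAULT 0x01
#define G_POWER   0x02
#define G_IPV6    0x04

/* Simplified activity descriptor carrying the group it belongs to. */
struct activity {
    const char  *name;
    unsigned int group;      /* exactly one G_* flag */
    int          collected;  /* set when the activity must be read */
};

/*
 * Mark every activity whose group is in the 'groups' bitmask and return
 * how many were selected. No per-activity knowledge is needed here, so
 * adding an activity to a group only means adding a table entry.
 */
size_t select_groups(struct activity *act, size_t n, unsigned int groups)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++) {
        act[i].collected = (act[i].group & groups) != 0;
        if (act[i].collected)
            count++;
    }
    return count;
}
```

Because the selection loop only tests the group field, the collector does not need to be touched when an activity joins an existing group.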
Sebastien Godard [Sun, 24 Oct 2010 15:39:20 +0000 (17:39 +0200)]
Added CPU average clock frequency statistics to sar and sadc.
This patch adds a new option to sar (-m FREQ) that displays
the following field: wghMHz.
For this option to work, the cpufreq-stats driver must be compiled
in the kernel, as we need to read the "time_in_state" file in /sys.
sadc and sadf have also been updated to take into account this new field.
DTD and XSD documents have been updated.
The sar manual page has been updated.
Mail from Zhen Zhang (08/09/2010) <furykerry@gmail.com>
Hi,
The current stable and development sysstat collect CPU frequency data
from /proc/cpuinfo, but currently the cpuinfo "cpu MHz" field reports
the instant CPU frequency. From a system administrator's point of view,
however, the preferred metric is the average CPU frequency over the
reporting interval. The average frequency can be obtained from
/sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state.
Will sysstat switch to average CPU frequency in the next development version?
Thanks
Mail from Zhen Zhang (11/09/2010) <furykerry@gmail.com>
I want to measure the average CPU frequency of a machine which
is probably capable of dynamically adjusting its frequency (DVFS). The
average CPU frequency is an important metric to evaluate the power
consumption of a machine, and how hard the machine is working to serve
the requests. DVFS capability is starting to get wide adoption in the
server domain, e.g. in recent Xeon.
The /proc/cpuinfo interface only records the instant CPU frequency;
however, the Linux kernel or a user frequency governor can adjust the CPU
frequency frequently, e.g. for the default ondemand governor the
interval is 10ms. Such an interval is way too small compared to the usual
sysstat interval, e.g. 5min. So an accumulated value is needed.
cpufreq-stats is a driver (it seems to have entered the kernel in 2.6.11,
and its documentation is available as of kernel 2.6.12,
http://lxr.linux.no/#linux+v2.6.12/Documentation/cpu-freq/cpufreq-stats.txt).
For recent Ubuntu, the cpufreq-stats and cpufreq drivers are built into
the kernel. The average frequency can be fetched as follows:
Each line defines a pair of frequency and its accumulated ticks since
reboot. sysstat can sample it and take the difference as the
accumulated ticks over the sampling interval, and calculate the weighted
average CPU frequency.
The cpufreq-stats driver does have some pitfalls (which are addressed in patch
https://patchwork.kernel.org/patch/72488/).
Nevertheless its current form is already quite useful. I suggest that
sysstat use it if available, and fall back to /proc/cpuinfo if
not.
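The weighted average suggested in the mail above can be sketched as follows. It assumes that two samples of time_in_state have already been parsed into arrays listing the same frequencies in the same order, with frequencies in kHz. The struct and function names are hypothetical, and this is not necessarily how the wghMHz field is computed.

```c
#include <stddef.h>

/* One line of time_in_state: a frequency and its accumulated ticks. */
struct freq_state {
    unsigned long      freq_khz;
    unsigned long long ticks;
};

/*
 * Tick-weighted average frequency (in kHz) between two samples.
 * prev and cur are assumed to hold the same n frequencies in the same
 * order. Returns 0.0 if no ticks elapsed during the interval.
 * (Hypothetical helper, for illustration only.)
 */
double weighted_avg_khz(const struct freq_state *prev,
                        const struct freq_state *cur, size_t n)
{
    unsigned long long total = 0;
    double weighted = 0.0;
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned long long d;

        /* Skip a counter that went backwards (e.g. after a reset). */
        if (cur[i].ticks < prev[i].ticks)
            continue;

        d = cur[i].ticks - prev[i].ticks;
        total += d;
        weighted += (double)cur[i].freq_khz * (double)d;
    }
    return (total != 0) ? weighted / (double)total : 0.0;
}
```

For example, 30 ticks spent at 2000000 kHz and 10 ticks at 1000000 kHz give (30 * 2000000 + 10 * 1000000) / 40 = 1750000 kHz.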
Sebastien Godard [Sun, 10 Oct 2010 14:39:28 +0000 (16:39 +0200)]
Added a new magical value for each activity in file.
A format change can now hit only one activity instead of the whole file.
Sadf has also been updated to be able to display activities with
unknown format (sadf -H).
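The per-activity check can be sketched as follows. The struct layout, the enum and the check_activity() helper are hypothetical; they only illustrate how a mismatch can be confined to a single activity instead of invalidating the whole file.

```c
#include <stddef.h>

/* Identifier and format magic of one activity (simplified). */
struct act_format {
    unsigned int id;
    unsigned int magic;
};

enum act_status {
    ACT_OK,              /* activity known, format matches */
    ACT_FORMAT_CHANGED,  /* activity known, magic differs */
    ACT_UNKNOWN          /* activity id not known at all */
};

/*
 * Compare one activity entry read from a data file against the table of
 * formats this build understands.
 * (Hypothetical helper, for illustration only.)
 */
enum act_status check_activity(const struct act_format *known, size_t n,
                               const struct act_format *from_file)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (known[i].id == from_file->id)
            return (known[i].magic == from_file->magic)
                   ? ACT_OK : ACT_FORMAT_CHANGED;
    }
    return ACT_UNKNOWN;
}
```

A reader could then skip or flag only the activities that do not return ACT_OK and still display the others.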
Create a new activity (A_HUGE) for hugepages statistics.
Hugepages statistics have been added as an additional output
for memory activity by commit d7ed8d382140e2d709a6753fa44a0acfcba91a7e.
Create a dedicated activity for them (A_HUGE). This is much cleaner,
although the drawback is that the /proc/meminfo file will now be read twice.
Added SADC_OPTIONS to sysstat configuration file, and sysstat(5) manual page.
Mail from Ivana Varekova (20/09/2010):
SADC_OPTIONS is now the preferred way to pass args to sadc. It is read by the
sa1 and sa2 shell scripts from the /etc/sysconfig/sysstat configuration file.
Also add sysstat(5) manual page that describes the various environment
variables and their meanings.
Mail from Ivana Varekova (21/09/2010):
Using --disable-man-group option with configure resulted in man_group variable
being used. With --enable-man-group, the variable was ignored, which is the
opposite of what is expected.
This patch fixes that.
Moved manual pages from $prefix/man to $prefix/share/man.
Mail from Ivana Varekova (21/09/2010).
The Linux Filesystem Hierarchy now defines the default location for
manual pages as /usr/share/man instead of /usr/man.
So update sysstat to reflect this change.
See: http://tldp.org/LDP/Linux-Filesystem-Hierarchy/html/Linux-Filesystem-Hierarchy.html#usr
Updated .gitignore file to ignore some more files.
Updated various source headers: (C) 2010 instead of (C) 2009.
Updated sar manual page: sar -A also includes -m ALL.