Sebastien Godard [Sun, 20 Nov 2011 15:29:07 +0000 (16:29 +0100)]
Fixed random crash with iostat when called with
option -N [NOVELL Bug#729130].
Mail from Petr Uzel <petr.uzel@suse.cz> 11/13/2011 11:45 AM:
> > On 11/09/2011 01:34 PM, Petr Uzel wrote:
>> > >attached patch fixes
>> > >https://bugzilla.novell.com/show_bug.cgi?id=729130
>> > >
Hi Sebastien,
As far as I understand (please correct me if I'm wrong), sysstat is
hitting unspecified behavior, which might explain why you can not
reproduce the bug.
Check out transform_devmapname() function:
while ((dp = readdir(dm_dir)) != NULL) {
/* For each file in DEVMAP_DIR */
dm_name points to the memory returned by readdir(), but from 'man
readdir', this memory is not guaranteed to be valid after next call
to readdir or after closedir().
man readdir:
....
The data returned by readdir() may be overwritten by subsequent
calls to readdir() for the same directory stream.
....
man closedir:
....
The closedir() function closes the directory stream
associated with dirp. A successful call to closedir() also
closes the underlying file descriptor associated with dirp.
The directory stream descriptor dirp is not available after
this call.
....
[It is not very clear to me if this also invalidates the dirent
structure]
So the solution is to strncpy the memory before it gets invalidated by
next readdir() or closedir().
Unrelated to this bug:
Could you please add something like make install_nostrip to the
Makefile? By default, the buildsystem uses LDFLAGS = -s, which always
strips the resulting binary. In openSUSE, we have to patch this,
because we need to strip the binaries on our own (to create
sysstat-debug{source,info} packages).
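The readdir() problem described in the mail above can be sketched as follows. This is only an illustration, not the actual sysstat patch: first_entry_name() is a hypothetical helper that copies the entry name into a caller-supplied buffer before the directory stream is advanced or closed.

```c
#include <dirent.h>
#include <string.h>

/*
 * Hypothetical helper (not sysstat's transform_devmapname()):
 * copy the name of the first non-hidden entry of `dir` into `out`.
 * Returns 0 on success, -1 if the directory cannot be opened or
 * contains no such entry.
 */
int first_entry_name(const char *dir, char *out, size_t outlen)
{
    DIR *dp;
    struct dirent *de;
    int found = -1;

    if (outlen == 0 || (dp = opendir(dir)) == NULL)
        return -1;

    while ((de = readdir(dp)) != NULL) {
        if (de->d_name[0] == '.')
            continue;
        /*
         * Copy the name now: the dirent may be overwritten by the
         * next readdir() call and must not be used after closedir().
         */
        strncpy(out, de->d_name, outlen - 1);
        out[outlen - 1] = '\0';
        found = 0;
        break;
    }

    closedir(dp);
    return found;
}
```

The caller owns the buffer, so the returned name stays valid regardless of what happens to the directory stream afterwards.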
Sebastien Godard [Fri, 11 Nov 2011 15:01:37 +0000 (16:01 +0100)]
New output format added to sadf: JSON.
JSON (JavaScript Object Notation) is a lightweight data interchange
format (less verbose than XML). sadf can now display sar's data
in JSON using a new switch (-j).
Sebastien Godard [Wed, 31 Aug 2011 13:11:25 +0000 (15:11 +0200)]
Fixed bugs in sadf XML output and in DTD/XSD documents.
On 08/29/2011 02:27 PM, "Jürgen Heinemann (Undefined)" wrote:
> Hallo Sebastian,
> I have found some bugs with sadf -x command.
> You can see my changes in sysstat-10.0.2.rc1.diff attachment.
> The Doctype Declaration in sadf_misc.c isn't set to "sysstat" rootNode
> and timestamp Element closed with child Elements
> See my Example xslt
>
> sadf -P 0,1 -x > input.xml
> xsltproc --encoding utf-8 --novalid sysstat.xslt input.xml
>
> greets Jürgen
Sebastien Godard [Tue, 16 Aug 2011 12:13:27 +0000 (14:13 +0200)]
Fixed a bug with pidstat, where stats for terminated processes
were still displayed.
On 07/06/2011 08:40 AM, Kei Ishida wrote:
> Hello
>
> I found the following bug of pidstat. I hope this would help.
>
> [Bug description]
> Pidstat displayed pid(s) of dead process(es) every other times
> when multiple pids are specified.
>
> [How reproducible]
> Always. Run pidstat with more than one pid to watch, and kill one of them.
>
> [Proposed patch]
> diff -urNp sysstat-10.0.1/pidstat.c sysstat-10.0.1_/pidstat.c
> --- sysstat-10.0.1/pidstat.c 2011-06-01 22:05:12.000000000 +0900
> +++ sysstat-10.0.1_/pidstat.c 2011-07-06 11:05:46.149880175 +0900
> @@ -1011,6 +1011,20 @@ int get_pid_to_display(int prev, int cur
>
> else if (DISPLAY_PID(pidflag)) {
>
> + unsigned int i;
> + int pid_exists = FALSE;
> +
> + /* See if pid exists in pid_array[] */
> + for (i = 0; i < pid_array_nr; i++) {
> + if ((*pstc)->pid == pid_array[i]) {
> + pid_exists = TRUE;
> + break;
> + }
> + }
> +
> + if (!pid_exists)
> + return 0;
> +
> *pstp = st_pid_list[prev] + p;
> }
>
>
> Regards,
> Kei Ishida
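The membership test at the heart of the proposed patch can be isolated as a small function. This is a sketch only: pid_is_selected() is a hypothetical name, and the array is passed as a parameter instead of using pidstat's global pid_array[] and pid_array_nr.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if `pid` is one of the `nr` PIDs listed in `pids`. */
bool pid_is_selected(unsigned int pid, const unsigned int *pids, size_t nr)
{
    size_t i;

    for (i = 0; i < nr; i++) {
        if (pids[i] == pid)
            return true;
    }
    return false;
}
```

A caller would skip the display of any process for which this returns false, which is what the early "return 0" in the patch does.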
Sebastien Godard [Mon, 15 Aug 2011 14:00:05 +0000 (16:00 +0200)]
Option "-P ON" added to mpstat.
This option tells mpstat to display statistics for online
processors only.
mpstat manual page updated.
On 06/30/2011 06:41 AM, Ananth N Mavinakayanahalli wrote:
a. Consistency with output of top and /proc/cpuinfo, both of which won't
display information of offlined CPUs.
b. On POWER7 for instance, with SMT, when SMT is turned off, 3 of the 4
CPUs are off-lined. Someone using just mpstat will have no idea that
he has just 1 CPU underneath while mpstat says he has 4, without an
indication of which ones are usable and which ones are not.
I think, at the least, the -P switch should be educated with a new
option that displays only online CPU information, if there is a hard
requirement to preserve existing functionality.
Sebastien Godard [Sun, 14 Aug 2011 14:33:55 +0000 (16:33 +0200)]
pidstat manual page updated.
Added the description of field %MEM displayed by pidstat -r.
On 06/29/2011 05:43 AM, Carlos Allegri wrote:
> I'm using pidstat version 9.0.6 for monitoring of a mail server.
> I run: "pidstat -r"
> Then, in the output: "PID minflt/s majflt/s VSZ RSS %MEM Command", I see the column "%MEM" which isn't explained in the manual.
> Could you tell me, which is the meaning of this column?
Sebastien Godard [Sat, 13 Aug 2011 13:58:28 +0000 (15:58 +0200)]
sadf manual page updated.
Use of options -t, -H and -T has changed. So updated sadf manual
page accordingly.
Also fix a wrong statement, saying that options -s and -e are ignored
with option -x.
Sebastien Godard [Sat, 13 Aug 2011 12:44:43 +0000 (14:44 +0200)]
DTD and XSD documents updated.
A new mark ("utc") has been added in the XML output
generated by sadf -x. This mark has a value of 0 or 1 depending
on whether the timestamp is expressed in local time or UTC.
A correction has also been made in the XSD document, where the
description of "comment" messages was erroneous.
Sebastien Godard [Sat, 13 Aug 2011 12:28:19 +0000 (14:28 +0200)]
sadf modified to make it easier to add new output formats.
sadf has been heavily modified to make it easier to add new
output formats. The idea was to use the same architecture pattern
as sar. However, I haven't been able to fully achieve this goal:
the design is still not generic, although things have improved.
Automate translation files handling in Makefile.in.
Mail from Jeroen Roovers <jer@gentoo.org> 03/06/2011:
Subject: [PATCH] automate translation file handling in Makefile.in
Hello,
as maintainer of the sysstat package in the Gentoo Linux repository, I
have been maintaining a patch that makes it easier to instruct our
build/install system to install support for certain languages or indeed
save space on the target system by not installing them.
This patch makes the build system not list all the available
translations as the current Makefile.in does, but finds the language
files on its own and repeats the same two commands in a loop over the
files it finds. Also with this patch, the Makefile.in does not need to
be updated for each new translation any longer.
Sebastien Godard [Sat, 28 May 2011 14:03:55 +0000 (16:03 +0200)]
Fixed XML output displayed by sadf (hugepages statistics were
included in power management ones).
When displaying stats with sar, hugepages utilization statistics
were displayed between voltage inputs statistics and CPU clock
ones. This was not really smart but still OK.
Yet, when displaying XML output with sadf -x, hugepages statistics
were included in the <power-management> section, which is quite bad
in this case. So move hugepage structure just after memory utilization
one in activity.c:act[] array.
Sebastien Godard [Tue, 24 May 2011 11:46:17 +0000 (13:46 +0200)]
sar and pidstat: Check that _("Average") string doesn't exceed
the size of the timestamp buffer.
One could find something like:
strcpy(string, _("Average"));
in pidstat.c and sar.c. Yet, we don't know whether the translation
message for "Average" will fit in target string buffer. Hence we
replaced the previous expression with something like:
strncpy(string, _("Average"), length_of_string_buffer);
string[length_of_string_buffer - 1] = '\0';
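As a self-contained illustration of the bounded copy described above, here is a minimal sketch. copy_label() is a hypothetical wrapper name, not a function from sar.c or pidstat.c.

```c
#include <string.h>

/*
 * Copy `src` into `dst` (of size `len`), truncating if needed and
 * always leaving `dst` NUL-terminated.
 */
void copy_label(char *dst, size_t len, const char *src)
{
    if (len == 0)
        return;
    strncpy(dst, src, len - 1);
    dst[len - 1] = '\0';
}
```

With a 4-byte buffer, a translated label such as "Moyenne:" is cut down to "Moy" instead of overflowing the buffer.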
Sebastien Godard [Tue, 24 May 2011 09:43:25 +0000 (11:43 +0200)]
sar: Decrease column width for sensor device name (temperature,
voltage inputs and fans statistics).
There were several unnecessary spaces between the last column of
statistics and the device name in the report displayed by sar for
temperature, voltage inputs and fans statistics.
These spaces have been removed.
Sebastien Godard [Tue, 24 May 2011 09:26:35 +0000 (11:26 +0200)]
sadf -p now displays the sensor device name for temperature,
voltage inputs and fans statistics.
The render() function was not properly used in rndr_stats.c, in particular
when the DEVICE name was to be displayed by sadf for fans, voltage inputs
and temperature statistics.
A new flag has been added (PT_USESTR) enabling the render() function to
display strings. As a consequence, sadf -d and sadf -p are now able
to display the sensor device name.
The output of sadf -d has also changed: this is no longer "device;FAN;..."
but "FAN;DEVICE;...". The same applies to TEMP and IN statistics.
Option -h added to iostat. This option makes the device utilization
report easier to read with long device names.
Mail from Ivana Varekova (varekova@redhat.com) 02/05/2011
Subject: feature request: Resizing device column in iostat to size of largest lvm device name
Hello,
I'm sending three patches against sysstat-10.0.0:
* sysstat_1.patch
adds -h (human readable) option to iostat tool (just to indent the row after the device name)
[...]
If you need clarification on any of them, please send me an e-mail.
Ivana
From Fedora bugzilla: G. Michael Carter 2011-04-11 17:19:58 EDT
As far as reports go this is rather hard to read. Can we get the Device column
to size based on the longest name?
cifsiostat didn't count open files from the "Posix Open" column in
/proc/fs/cifs/Stats file. This is now fixed.
Mail from Ivana Varekova (varekova@redhat.com) 02/05/2011
Subject: feature request: Resizing device column in iostat to size of largest lvm device name
Hello,
I'm sending three patches against sysstat-10.0.0:
[...]
* sysstat_3.patch
fixes cifsiostat tool which in open files does not count files which are in /proc/fs/cifs/Stats output in column "Posix Open"
If you need clarification on any of them, please send me an e-mail.
Ivana
Close file descriptor in read_uptime() function (file rd_stats.c).
File descriptor was not closed when /proc/uptime file happened to be empty.
Mail from Ivana Varekova (varekova@redhat.com) 02/05/2011:
Subject: feature request: Resizing device column in iostat to size of largest lvm device name
Hello,
I'm sending three patches against sysstat-10.0.0:
[...]
* sysstat_2.patch
fix read_uptime bug - this function does not close the file descriptor which is open in it in the special situation
[...]
If you need clarification on any of them, please send me an e-mail.
Ivana
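The leak pattern fixed here can be sketched with a generic reader. This is not the read_uptime() code from rd_stats.c: read_first_line() and its return codes are hypothetical, and only show the stream being closed on every path, including the empty-file one.

```c
#include <stdio.h>

/*
 * Read the first line of `path` into `line`.
 * Returns 0 on success, -1 if the file cannot be opened,
 * -2 if it is empty or cannot be read.
 */
int read_first_line(const char *path, char *line, int len)
{
    FILE *fp;
    int rc = 0;

    if ((fp = fopen(path, "r")) == NULL)
        return -1;

    if (fgets(line, len, fp) == NULL)
        rc = -2;    /* Empty file: fall through so fp is still closed */

    fclose(fp);
    return rc;
}
```

Using a single exit point after fclose() makes it hard to reintroduce an early return that skips the close.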
Sebastien Godard [Fri, 11 Mar 2011 13:45:05 +0000 (14:45 +0100)]
Fixed several bugs with nfsiostat (and cifsiostat).
Mail from Masanari Iida (masanari.iida@hp.com) 08/02/2011:
Sebastien,
Thanks for the fix.
Before you release the core on end of Feb, I would ask you to test the
code with following 2 scenario.
(1) Mount / Umount while nfsiostat running.
Check points
(a) nfsiostat detect new nfs mount points after nfsiostat started.
The mounted NFS share have to be reported by nfsiostat.
(b) nfsiostat detect nfs mount points which is umounted after nfsiostat started.
No lines reported by nfsiostat after umount NFS share.
This is an original bug scenario that I reported to you in the first e-mail.
(2) nfsiostat not showing incorrect value when NFS mount point re-mount happen.
Following symptom was seen on sysstat 7.0.2 (on RHEL5).
Step to reproduce.
(1) Mount an NFS share
(2) Run iostat (or nfsiostat)
(3) Umount the NFS share
(4) Mount the same NFS share that umounted in step 3.
(5) Check the iostat result for the NFS share.
Actual Result on sysstat 7.0.2.
Very large value for rops/s and wops/s display one time.
These are incorrect values.
Expected result.
The nfsiostat reported correct value for rops/s and wops/s
even when the NFS mount point re-mounted while nfsiostat running.
Regards,
Masanari Iida
Mail from Masanari Iida (masanari.iida@hp.com) 24/02/2011:
Hello,
Thank you for your support.
I have tested your new code.
The bugs that I have reported in previous e-mail are fixed on this version.
So I don't get no more coredump, and nfsiostat detect all mounted and
umounted filesystems while running.
One minor issue still remain here.
On 2nd result of the nfsiostat always include some unknown value.
(These 0 data are correct, since I just mount the NFS share and not doing
any I/O during the test. )
How to reproduce 1
(1) Mount one NFS filesystem.
(2) Run nfsiostat with interval and count options. The count must be 3 or more.
(3) Check the 2nd result from nfsiostat.
How to reproduce 2
(1) Run nfsiostat with interval.
(2) Mount the NFS filesystem while running the nfsiostat.
(3) Check the 2nd result of the just mounted NFS filesystem.
I know that vmstat or iostat case, the first result is a history of data since OS boot.
So I usually ignore the first data and use after the 2nd result.
In NFS automount environment, NFS mount/umount happen at random timing.
So it is hard to remove these sudden large value from data.
Impact.
When a script draw a graph, these sudden large values expand Y scale.
So the normal value may looks smaller than expected in the graph.
Don't link sysstat's commands with sensors library if not needed.
Sebastien,
I'm forwarding a bug report I've got today. The submitter is right. The
only binary that actually needs sensors is sadc. but all the remaining
binaries are linked with the library, and it's not easy to change it.
I was trying to find some way to fix the issue e.g. by using the
--as-needed linker option, but this didn't work well (even after
changing ordering of linking). Finally I came up with the idea of
splitting rd_stats.c into two parts - one that requires sensors and the
other one that doesn't.
I'm attaching a proof of concept patch just to show the idea. It uses
a preprocessor macro to compile a tiny version of rd_stats.c that is then
linked with all binaries except for sadc. I don't think the patch is
ready to be applied by you as is. It would be much better to manually
split the rd_stats.c file into two parts, and then make configure not
add -lsensors to CFLAGS, but put -lsensors in some autoconf variable
that would be added to sadc's LFLAGS. Could you please look at it and
possibly make the appropriate changes for the next version of sysstat?
Thanks,
robert
------ Original message ------
Subject: Bug#612571: sysstat: iostat links libsensors4 with no need
Resent-Date: Wed, 09 Feb 2011 09:18:01 +0000, Wed, 09 Feb 2011
09:18:04 +0000
Resent-From: Mario 'BitKoenig' Holbe <Mario.Holbe@TU-Ilmenau.DE>
Resent-To: debian-bugs-dist@lists.debian.org
Resent-CC: Robert Luberda <robert@debian.org>
Date: Wed, 9 Feb 2011 10:15:31 +0100
From: Mario 'BitKoenig' Holbe <Mario.Holbe@TU-Ilmenau.DE>
Reply-To: Mario 'BitKoenig' Holbe <Mario.Holbe@TU-Ilmenau.DE>,
612571@bugs.debian.org
To: submit@bugs.debian.org
Package: sysstat
Version: 9.1.7-2
Hello,
iostat from sysstat 9.1.7-2 links /usr/lib/libsensors.so.4.
The manual page doesn't reveal any new functions that could explain this
and according to ldd libsensors isn't used:
$ ldd -u /usr/bin/iostat
Unused direct dependencies:
Sebastien Godard [Sun, 27 Feb 2011 15:07:59 +0000 (16:07 +0100)]
iostat incorrectly mapped device-mapper IDs
greater than 256. This is now fixed [DEBIAN Bug#614397].
Mail from Robert Luberda <robert@debian.org> (22/02/2011)
Subject: Fwd: Bug#614397: iostat(from sysstat) doesn't support more
than 256 device-mapper names
Sebastien,
I'm forwarding a bug report I've got yesterday.
I can't reproduce it by myself, as I don't use LVM at all, but I've just
found out that some Debian machines actually do use it, and the device
numbers on them start with 252, so the contents of /dev/mapper look like:
crw-rw---- 1 root root 10, 61 Feb 19 18:43 control
brw-rw---- 1 root disk 252, 0 Feb 19 18:43 vg_$hostname-srv
Regards,
robert
------ Original message ------
Subject: Bug#614397: iostat(from sysstat) doesn't support more than 256
device-mapper names
Resent-Date: Mon, 21 Feb 2011 18:27:01 +0000, Mon, 21 Feb 2011
18:27:04 +0000
Resent-From: Adam Heath <doogie@brainfood.com>
Resent-To: debian-bugs-dist@lists.debian.org
Resent-CC: Robert Luberda <robert@debian.org>
Date: Mon, 21 Feb 2011 12:24:11 -0600
From: Adam Heath <doogie@brainfood.com>
Reply-To: Adam Heath <doogie@brainfood.com>, 614397@bugs.debian.org
To: submit@bugs.debian.org
package: sysstat
severity: minor
version: 9.0.6.1-2
iostat tries to do bit-shifting of device ids; this is a big no-no.
Attached patch at least fixes it for mapping of device-mapper
names(iostat -N).
I would love to have this go into stable-updates(squeeze), but can
understand if it's not the type of change that normally would be allowed.
The circumstances of this bug, cause device 256 to map to 0, 257 to
map to 1, etc.
debian-changes-9.0.6.1-2.1
--- sysstat-9.0.6.1.orig/ioconf.c
+++ sysstat-9.0.6.1/ioconf.c
@@ -500,8 +500,8 @@ char *transform_devmapname(unsigned int
if (stat(filen, &aux) == 0) {
/* Get its minor and major numbers */
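The wrap-around reported above (256 mapping to 0, 257 to 1) can be reproduced with a small sketch. It assumes the old code kept only the low 8 bits of st_rdev as the minor number; the bug report only says that bit-shifting was used, so low_byte_minor() is an illustrative guess, not the original ioconf.c code. major(), minor() and makedev() come from glibc's <sys/sysmacros.h>.

```c
#include <sys/types.h>
#include <sys/sysmacros.h>

/* Assumed old behavior: keep only the low byte of the device ID */
unsigned int low_byte_minor(dev_t dev)
{
    return (unsigned int) (dev & 0xff);
}

/* Portable way: let the C library decode the device ID */
unsigned int real_minor(dev_t dev)
{
    return minor(dev);
}
```

For a device created with makedev(252, 256), real_minor() gives 256 whereas the low-byte version gives 0, which matches the symptom in the bug report.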
Sebastien Godard [Sun, 27 Feb 2011 14:45:23 +0000 (15:45 +0100)]
Option -V from sysstat commands now displays the version number on stdout
and returns 0 for the exit code.
Option -V used to display the version number on stderr and return 1 as
the exit code.
This was not the expected behavior, since the command had done everything it was asked to do properly.
So change this: display on stdout and return 0.
The same change has been applied to sar's option -h, which displays a
help message.
Mail from Lodewijk Bonebakker <jlbonebakker@gmail.com> (15/02/2011):
Subject: Systat version reporting
Dear Sebastian,
I have question related to the way you report the version number in sysstat. At the moment, it seems that you write:
sysstat version <versionno>
(C) Sebastien Godard (sysstat <at> orange.fr)
to stderr, and set the error-code to 1.
Given the significant changes between 7/8 and 9, we have a tremendous headache in automatically dealing with the different sar data files, collected during the day on different machines (some which we prefer not to upgrade). Currently in our environment we can work with this way of reporting your version number, but I would like to make a suggestion:
It would greatly help us if the version command returns only "sysstat version <versionno>" to stdout and sets the return code to 0. Our reasoning behind this is that 'sar -V' should print the version number and exit 0, since it has done everything we have asked it to do correctly. A non-zero exit code is then reserved for error-conditions. This way we can efficiently get the version-number, check for proper installation/kernel versioning etc.
Thank you for your time in considering this suggestion,
Added the possibility to extend the number of slots for NFS and
CIFS mount points on the fly.
Mail from Ivana Varekova (varekova@redhat.com) 02/02/2011:
Hello, I'm sending 6 patches - 3 for nfsiostat and 3 for cifsiostat
nfsiostat:
nfsiostat.patch - adds the possibility to extend the number of
slots for nfs mount points during the nfsiostat run
cifsiostat:
cifsiostat.patch - adds the possibility to extend the number of
slots for cifs mount points during the cifsiostat run
See also mail from Masanari Iida (masanari.iida@hp.com) 28/01/2011:
Hello
I have a feedback about nfsiostat behavior.
Description
nfsiostat need to be run _AFTER_ the NFS share is mounted.
Version.
nfsiostat in sysstat 9.1.7
How to reproduce
(1) Run nfsiostat -k 1
(2) Mount one NFS share
(3) Check nfsiostat output.
Expected result.
nfsiostat start to report the NFS share's activities, after I mount it.
Actual result.
nfsiostat not reporting the mounted NFS share, even after it is mounted.
Additional information:
If I mount the NFS share _BEFORE_ i run nfsiostat, nfsiostat reports the
NFS share's information. And also, if I umount the NFS share,
the line disappeared. (This is expected.)
It is because, the environment is using autofs, mount and umount often
happens on the system.
But I am continuously running iostat -kn to collect the statistics, and I have
encountered this symptom. (And tested with nfsiostat.)
In case of multiple NFS mount points, the symptom is bit different.
If an additional NFS share is mounted BEFORE this test is done,
the target NFS share appeared after mounting it, and disappeared after umount.
But if I do the same thing one more time, it is not display any more.
========
To say the truth, the original symptom is happen on sysstat 7.0
on RHEL5 system, using iostat -kn. The environment uses autofs.
I wanted to confirm if the upstream version already fixed this symptom,
so that's why I downloaded the latest tar ball and tried.
But so far, the similar symptom still exist.
If you think this is current known limitation or known bug,
please document it in man page or FAQ page of your web.
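Growing the slot table on the fly, as the patches above do, usually comes down to a checked realloc(). The sketch below is a generic version of that idea, not the code from nfsiostat.patch or cifsiostat.patch: struct slot_table, slot_table_add() and the doubling policy are all assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

#define SLOT_NAME_LEN 128

/* One slot per mounted share (hypothetical layout) */
struct mount_slot {
    char name[SLOT_NAME_LEN];
};

struct slot_table {
    struct mount_slot *slots;
    size_t used;
    size_t capacity;
};

/*
 * Append a share name, doubling the table when it is full.
 * Returns 0 on success, -1 if memory cannot be allocated
 * (the existing table is left untouched in that case).
 */
int slot_table_add(struct slot_table *t, const char *name)
{
    if (t->used == t->capacity) {
        size_t new_cap = t->capacity ? t->capacity * 2 : 4;
        struct mount_slot *p = realloc(t->slots, new_cap * sizeof(*p));

        if (p == NULL)
            return -1;
        t->slots = p;
        t->capacity = new_cap;
    }

    strncpy(t->slots[t->used].name, name, SLOT_NAME_LEN - 1);
    t->slots[t->used].name[SLOT_NAME_LEN - 1] = '\0';
    t->used++;
    return 0;
}
```

Assigning the realloc() result to a temporary pointer first keeps the old table reachable if the allocation fails.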
Fix a problem with long NFS and CIFS share names in cifsiostat and
nfsiostat.
Mail from Ivana Varekova (varekova@redhat.com) 02/02/2011
Hello, I'm sending 6 patches - 3 for nfsiostat and 3 for cifsiostat
nfsiostat:
nfsiostat3.patch - fix the problem with long nfs shares names
(...)
cifsiostat:
cifsiostat3.patch - fix the problem with long cifs shares names
(...)
Check calloc() return value in nfsiostat and cifsiostat.
A call to calloc() function to allocate structures in nfsiostat and
cifsiostat wasn't checked for its return code. This call could possibly
fail without ever being noticed.
Mail from Ivana Varekova (varekova@redhat.com) 02/02/2011
Hello, I'm sending 6 patches - 3 for nfsiostat and 3 for cifsiostat
nfsiostat:
nfsiostat2.patch - adds the forgotten test to malloc
(...)
cifsiostat:
cifsiostat2.patch - adds the forgotten test to malloc
(...)
Added --debuginfo option to cifsiostat and nfsiostat commands.
Jan Kaluza from Redhat (jkaluza@redhat.com) added option --debuginfo
to cifsiostat and nfsiostat commands.
His mail (06/30/2010):
Hi,
thanks for applying my previous patch in Sysstat 9.1.3 (I'm really proud to be
in the Changelog). I've created another patch which adds --debuginfo option
also for new tools (nfsiostat, cifsiostat) introduced in this release. I think
it could help debugging in some situations. Please feel free to ask any
question about that patch.
Jan Kaluza
cifsiostat and nfsiostat manual pages have also been updated.
By default, sysstat_panic function is no longer included in binary commands.
This function is defined only if --enable-debuginfo has been used with
configure.
Sebastien Godard [Fri, 17 Dec 2010 20:19:53 +0000 (21:19 +0100)]
Added a new field (blocked) to sar -q.
This patch adds a new metric (blocked - number of tasks currently
blocked, waiting for I/O to complete) to sar -q.
Also update sar manual page, and DTD/XSD documents.
Note that this breaks stats_queue structure format.
Sebastien Godard [Sat, 11 Dec 2010 20:49:22 +0000 (21:49 +0100)]
No longer assume that device-mapper major number is 253.
Get the real number from /proc/devices file.
The sar, sadf and iostat commands used to assume that device-mapper
major number was 253. This happened to be false sometimes. So get
the real number from the /proc/devices file.
From Mike Coleman <tutufan@gmail.com> 04/10/2010:
The iostat program seems to assume that the major device number for
devmap will always be 253. This doesn't seem to be an official
number, though, and I have a box where it actually ends up being 252,
which breaks 'iostat -N'.
It looks like this can be determined dynamically by looking at /proc/devices.
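Looking up the major number in /proc/devices can be sketched as below. find_major() is a hypothetical helper, not the function added to sysstat; it takes an already opened stream so that it can be tried on any text using the "<number> <driver name>" line layout of /proc/devices.

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan a stream formatted like /proc/devices and return the major
 * number registered for driver `name`, or -1 if it is not listed.
 */
int find_major(FILE *fp, const char *name)
{
    char line[128], drv[64];
    unsigned int major;

    while (fgets(line, sizeof(line), fp) != NULL) {
        /* Section titles and blank lines don't match and are skipped */
        if (sscanf(line, "%u %63s", &major, drv) == 2 &&
            strcmp(drv, name) == 0)
            return (int) major;
    }
    return -1;
}
```

A caller would open /proc/devices, pass the stream together with "device-mapper", and fall back to a default value if -1 is returned.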
Sebastien Godard [Sat, 11 Dec 2010 14:18:19 +0000 (15:18 +0100)]
[Kenichi Okuyama]: Small change to sar manual page.
Kenichi Okuyama <kenichi.okuyama@gmail.com> noticed that a sentence
in sar manual page was a bit confusing:
ip-frag
Number of IP fragments currently in use.
because unless you are familiar with
Linux Kernel internal, "IP fragments" are not something to be IN USE.
This is really how many elements are there in fragment queues, and
each element of queue is "group of fragmented ip packets", which when
completed, will become single IP packet.
So replace it with:
ip-frag
Number of IP fragments currently in queue.
Sebastien Godard [Tue, 30 Nov 2010 20:28:35 +0000 (21:28 +0100)]
pidstat: Code cleaned.
A comment in pidstat.c (get_pid_to_display() function) still referred
to option -X, although this option no longer exists as it was
merged with option -C.
Sebastien Godard [Mon, 22 Nov 2010 14:22:01 +0000 (15:22 +0100)]
Fixed bogus CPU statistics output, which happened when
CPU user value from /proc/stat wasn't incremented whereas
CPU guest value was.
From the Fedora Bugzilla database.
Ivana Varekova 2010-10-15 09:05:41 EDT
Description of problem:
The output of sar command is bogus, the value of %usr overflows
Version-Release number of selected component (if applicable):
last upstream (http://sebastien.godard.pagesperso-orange.fr/download.html) -
sysstat-9.1.5
How reproducible:
Steps to Reproduce:
1.# sar -u ALL -P ALL 1 1000
Linux 2.6.32.21-168.fc12.i686 (localhost) _i686_ (2 CPU)
...
02:59:54 PM CPU %user %nice %system %iowait %steal %idle
02:59:56 PM all 5.24 0.00 4.52 0.00 0.00 90.24
02:59:56 PM 0 4487176417.15 0.00 7.46 0.00 0.00 85.07
02:59:56 PM 1 3.20 0.00 1.83 0.00 0.00 94.98
....
2.
3.
Actual results:
02:59:56 PM 0 4487176417.15 0.00 7.46 0.00 0.00 85.07
^should be zero
Additional info:
the problem happens if user value from /proc/stat is not incremented, and guest
value from /proc/stat for the same cpu is incremented. In this case sar should
output 0 as an %usr value.
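The overflow can be illustrated with a short sketch. It assumes that %usr is derived from the user counter minus the guest counter, so that a guest-only increment makes the unsigned difference between two samples wrap around; user_pct() and its parameters are hypothetical and do not reproduce the sar code.

```c
/*
 * Percentage of the interval `itv` spent in user mode, guest time
 * excluded. Counters are cumulative values from two samples.
 */
double user_pct(unsigned long long prev_user, unsigned long long prev_guest,
                unsigned long long cur_user, unsigned long long cur_guest,
                unsigned long long itv)
{
    unsigned long long prev = prev_user - prev_guest;
    unsigned long long cur = cur_user - cur_guest;

    /*
     * Guest grew but user didn't: cur < prev. Report 0 instead of
     * letting the unsigned subtraction below wrap around.
     */
    if (itv == 0 || cur < prev)
        return 0.0;

    return (double) (cur - prev) / (double) itv * 100.0;
}
```

Without the guard, cur - prev would become a huge unsigned value, which is the kind of bogus number shown in the report.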
(This situation can happen - see the code in kernel: account_guest_time
function) e.g.:
Sebastien Godard [Sun, 14 Nov 2010 12:24:57 +0000 (13:24 +0100)]
Fix segfaults on bogus localtime input.
The return code from localtime() function (and also gmtime() one)
wasn't checked. In some (rare) cases, it can return a NULL pointer
resulting in a segmentation fault.
The return code is now checked, but no specific action is performed
anyway.
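A minimal sketch of the check, assuming a helper that formats a timestamp with "%X" as sar does. format_time() is a hypothetical name and does not correspond to a function in sar.c.

```c
#include <time.h>

/*
 * Format `t` as a local time string in `buf`.
 * Returns 0 on success, -1 if the time cannot be converted
 * or does not fit in the buffer.
 */
int format_time(time_t t, char *buf, size_t len)
{
    struct tm *ltm = localtime(&t);

    /* localtime() returns NULL when the value can't be represented */
    if (ltm == NULL)
        return -1;

    if (strftime(buf, len, "%X", ltm) == 0)
        return -1;

    return 0;
}
```

Returning an error code leaves it to the caller to decide whether to skip the record or stop, which is the approach taken in the patch below.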
Original patch from Ivana Varekova from RedHat (04/10/2010):
diff -up sysstat-9.0.6.1/sar.c.pom sysstat-9.0.6.1/sar.c
--- sysstat-9.0.6.1/sar.c.pom 2009-10-17 15:08:21.000000000 +0200
+++ sysstat-9.0.6.1/sar.c 2010-10-04 12:21:13.383442188 +0200
@@ -247,7 +247,7 @@ void reverse_check_act(unsigned int act_
* @curr Index in array for current sample statistics.
***************************************************************************
*/
-void sar_get_record_timestamp_struct(int curr)
+int sar_get_record_timestamp_struct(int curr)
{
struct tm *ltm;
@@ -312,13 +319,17 @@ int check_line_hdr(void)
* @cur_time Timestamp string.
***************************************************************************
*/
-void set_record_timestamp_string(int curr, char *cur_time, int len)
+int set_record_timestamp_string(int curr, char *cur_time, int len)
{
+ int ret;
/* Fill timestamp structure */
- sar_get_record_timestamp_struct(curr);
+ ret = sar_get_record_timestamp_struct(curr);
+ if (ret != 0)
+ return ret;
/* Set cur_time date value */
strftime(cur_time, len, "%X", &rectime);
+ return 0;
}
/*
@@ -407,6 +418,7 @@ int write_stats(int curr, int read_from_
int use_tm_end, int reset, unsigned int act_id)
{
int i;
+ int ret;
unsigned long long itv, g_itv;
static int cross_day = 0;
static __nr_t cpu_nr = -1;
@@ -423,9 +435,14 @@ int write_stats(int curr, int read_from_
}
/* Set previous timestamp */
- set_record_timestamp_string(!curr, timestamp[!curr], 16);
+ ret = set_record_timestamp_string(!curr, timestamp[!curr], 16);
+ if (ret != 0)
+ return ret;
+
/* Set current timestamp */
- set_record_timestamp_string(curr, timestamp[curr], 16);
+ ret = set_record_timestamp_string(curr, timestamp[curr], 16);
+ if (ret != 0)
+ return ret;
/* Check if we are beginning a new day */
if (use_tm_start && record_hdr[!curr].ust_time &&
@@ -569,8 +586,11 @@ int sar_print_special(int curr, int use_
{
char cur_time[26];
int dp = 1;
+ int ret;
- set_record_timestamp_string(curr, cur_time, 26);
+ ret = set_record_timestamp_string(curr, cur_time, 26);
+ if (ret != 0)
+ return ret;
/* The record must be in the interval specified by -s/-e options */
if ((use_tm_start && (datecmp(&rectime, &tm_start) < 0)) ||
@@ -865,7 +885,8 @@ void read_stats_from_file(char from_file
*/
read_file_stat_bunch(act, 0, ifd, file_hdr.sa_nr_act,
file_actlst);
- sar_get_record_timestamp_struct(0);
+ if (sar_get_record_timestamp_struct(0))
+ continue;
}
}
while ((rtype == R_RESTART) || (rtype == R_COMMENT) ||
Sebastien Godard [Fri, 12 Nov 2010 15:44:21 +0000 (16:44 +0100)]
sar now tells sadc to read only the necessary groups of activities.
We noticed that a simple command like "sar 0" had a small delay before
displaying the CPU statistics since system startup. This was because
in every case sar called sadc with option -S ALL resulting in all
possible activities being read.
Now, except if sar's option -o is used (in which case all possible
activities will still be read), sar tells sadc to read only the groups
of activities that include those that will be displayed on screen.
Sebastien Godard [Thu, 11 Nov 2010 17:37:31 +0000 (18:37 +0100)]
Small update to sar's manual page.
Sar's manual page says twice that -A option is equivalent to specifying
--bBdq... This led to inconsistencies as one was sometimes not updated
when a new option was added. So replace one of them by "selects all
possible activities".
Updated lsm and spec files.
Updated release date in CHANGES file.
Note that this patch also includes a small fix in spec file
where a cron file had still not been moved into its own subdirectory.
.gitignore file has been updated.
This patch also includes a small fix for sar CPU frequency activity
(sar -m CPU). It now takes into account a specific case of machines
where /sys isn't mounted and which don't have an SMP kernel running.
In this case, the number of CPU is counted using /proc/stat file, and
this file only has a line with global CPU statistics. The number of
items for CPU frequency activity is then equal to 1, which was
badly handled by read_cpuinfo() function in rd_stats.c. This patch
fixes that.
Define groups of activities: Each activity has now a new
attribute specifying the group it belongs to (POWER, IPV6, etc.)
Add a new attribute to the activity structure, defining the group
the activity belongs to. This makes the code more generic, as it is
no longer necessary to update sadc.c whenever a new activity is added
to a group.
Sebastien Godard [Sun, 24 Oct 2010 15:39:20 +0000 (17:39 +0200)]
Added CPU average clock frequency statistics to sar and sadc.
This patch adds a new option to sar (-m FREQ) that displays
the following field: wghMHz.
For this option to work, the cpufreq-stats driver must be compiled
in the kernel, as we need to read the "time_in_state" file in /sys.
sadc and sadf have also been updated to take into account this new field.
DTD and XSD documents have been updated.
The sar manual page has been updated.
Mail from Zhen Zhang (08/09/2010) <furykerry@gmail.com>
Hi ,
The current stable and development sysstat collect cpu frequency data
from /proc/cpuinfo, but currently the cpuinfo "cpu MHz" field reports the
instant cpu frequency. From a system administrator's point of view,
however, the preferred metric is the average cpu frequency over the
reporting interval. The average frequency can be obtained from
/sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state.
Will sysstat switch to average cpu frequency in the next development version?
Thanks
Mail from Zhen Zhang (11/09/2010) <furykerry@gmail.com>
I want to measure the average cpu frequency of a machine which
probably is capable of dynamic adjusting frequency (DVFS). The average
of cpu frequency is an important metric to evaluate the power
consumption of a machine, and how hard the machine is working to serve
the requests. DVFS capability is starting to get wide adoption in the
server domain e.g. in recent Xeon.
The /proc/cpuinfo interface only records the instant cpu frequency;
however, the linux kernel or a user frequency governor can adjust cpu
frequency frequently, e.g. for the default ondemand governor the
frequency is 10ms. Such an interval is way too small compared to the usual
sysstat interval, e.g. 5min. So an accumulated value is needed.
cpufreq-stats is a driver ( It seems had entered into kernel 2.6.11,
and its document is available at kernel 2.6.12,
http://lxr.linux.no/#linux+v2.6.12/Documentation/cpu-freq/cpufreq-stats.txt).
For recent ubuntu , cpufreq-stats and cpufreq driver is built into
kernel. The average frequency can be fetch as follow
each line define a pair of frequency and its accumulated ticks since
reboot. sysstat can sample it and take the difference as the
accumulated ticks at the sampling interval , and calculated weighted
average cpu frequency.
The cpufreq-stats driver does have some pitfalls, which are addressed in the patch at
https://patchwork.kernel.org/patch/72488/.
Nevertheless its current form is already quite useful, I suggest
sysstat to utilize if available , and fall back to /proc/cpuinfo if
not.
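The weighted average suggested in the mail above can be sketched as follows. weighted_avg_mhz() is a hypothetical function, not the sysstat implementation; it assumes frequencies are given in kHz and that each tick counter is the cumulative time spent at the matching frequency, sampled at the start and end of the interval.

```c
#include <stddef.h>

/*
 * Weighted average frequency (in MHz) over one interval.
 * freq_khz[i]   : frequency of state i, in kHz
 * prev_ticks[i] : cumulative ticks in state i at the previous sample
 * cur_ticks[i]  : cumulative ticks in state i at the current sample
 */
double weighted_avg_mhz(const unsigned long *freq_khz,
                        const unsigned long long *prev_ticks,
                        const unsigned long long *cur_ticks, size_t n)
{
    double weighted = 0.0, total = 0.0;
    size_t i;

    for (i = 0; i < n; i++) {
        double dt = (double) (cur_ticks[i] - prev_ticks[i]);

        weighted += (double) freq_khz[i] * dt;
        total += dt;
    }

    if (total == 0.0)
        return 0.0;    /* No ticks elapsed in any state */

    return weighted / total / 1000.0;
}
```

For example, 1 tick at 1000000 kHz and 3 ticks at 2000000 kHz give (1000000 + 3 * 2000000) / 4 = 1750000 kHz, i.e. 1750 MHz.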
Sebastien Godard [Sun, 10 Oct 2010 14:39:28 +0000 (16:39 +0200)]
Added a new magical value for each activity in file.
A format change can now hit only one activity instead of the whole file.
Sadf has also been updated to be able to display activities with
unknown format (sadf -H).
Create a new activity (A_HUGE) for hugepages statistics.
Hugepages statistics have been added as an additional output
for memory activity by commit d7ed8d382140e2d709a6753fa44a0acfcba91a7e.
Create a dedicated activity for them (A_HUGE). This is cleaner,
although the drawback is that the /proc/meminfo file will now be read twice.
Added SADC_OPTIONS to sysstat configuration file, and sysstat(5) manual page.
Mail from Ivana Varekova (20/09/2010):
SADC_OPTIONS is now the preferred way to pass args to sadc. It is read by the
sa1 and sa2 shell scripts from the /etc/sysconfig/sysstat configuration file.
Also add sysstat(5) manual page that describes the various environment
variables and their meanings.
Mail from Ivana Varekova (21/09/2010):
Using --disable-man-group option with configure resulted in man_group variable
being used. With --enable-man-group, the variable was ignored, which is the
opposite of what is expected.
This patch fixes that.
Moved manual pages from $prefix/man to $prefix/share/man.
Mail from Ivana Varekova (21/09/2010).
The Linux Filesystem Hierarchy now defines the default location for
manual pages as /usr/share/man instead of /usr/man.
So update sysstat to reflect this change.
See: http://tldp.org/LDP/Linux-Filesystem-Hierarchy/html/Linux-Filesystem-Hierarchy.html#usr
Updated .gitignore file to ignore some more files.
Updated various source headers: (C) 2010 instead of (C) 2009.
Updated sar manual page: sar -A also includes -m ALL.