seb [Mon, 19 Nov 2012 20:48:40 +0000 (21:48 +0100)]
Fixed DTD document.
When the computer has run all day without a restart, the XML output file
from sadf -x contains no boot elements.
So change DTD document accordingly (from <!ELEMENT restarts (boot+)>
to <!ELEMENT restarts (boot*)>).
XSD document is not updated, but its version number changes to remain
consistent with that of DTD document.
Bug reported by Peter Schiffer <pschiffe@redhat.com> 13/11/2012.
Fixed a fatal error when compiled with -Werror=format-security.
This change is a workaround for a fatal error that we get when compiling
sysstat with -Werror=format-security.
Mail from Guillaume Rousse (guillomovitch@gmail.com) 30/07/2012:
Here is a patch that I found in the Mageia sysstat package, which fixes an incorrect use of printf (fatal with the options -Wformat -Werror=format-security).
[...]
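The patch itself is elided above. As a minimal sketch of the class of error that -Werror=format-security rejects (a non-literal string used as the format argument), and of the usual fix, consider the following. The helper name format_message is hypothetical and is not taken from the sysstat sources.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only.
 * Unsafe form (rejected by -Werror=format-security):
 *     snprintf(buf, len, msg);
 * Safe form: the message is passed as data to a literal "%s" format,
 * so any '%' it contains is copied verbatim instead of being interpreted.
 */
int format_message(char *buf, size_t len, const char *msg)
{
	return snprintf(buf, len, "%s", msg);
}
```

With the safe form, a translated message that happens to contain a conversion specifier is printed as is.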
XML DTD document name is now tagged with a version number.
DTD document name now includes a version number.
XML output displayed by sadf -x points at the DTD document
which applies to this specific version.
Mail from Frank Ch Eigler (fche@redhat.com) 26/09/2012:
From some brief testing, it appears as though sadf's xml output format
has changed a few times over time, but the same xml dtd URL is being
emitted: http://pagesperso-orange.fr/sebastien.godard/sysstat.dtd If
indeed the dtd has changed over time, wouldn't it be wise to keep
newer versions tagged with a version number in the URLs, so that
URL-based xml validation would succeed well into the future?
sar -r now tracks the amount of dirty memory (memory waiting to
get written back to disk).
DTD and XSD documents updated.
Sar manual page updated.
Mail from Michael Blakeley (mike@blakeley.com) 28/09/2012:
I've been thinking about patching the sa collectors to track the "Dirty" metric from /proc/meminfo, and sar to report on it. This would be useful for applications where latency is important: having historical data on dirty writeback pages can help trace the kind of problems that can be addressed by tuning vm.dirty_bytes and friends.
In the 10.1 code I see that a few functions and structs already make use of meminfo. What's your philosophy on this? Should dirty-kB be a new struct, or perhaps merge with the existing meminfo_huge struct?
Similarly for reporting, should I focus on a new option ("-D" perhaps?) or try to piggyback on an existing one?
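As a rough sketch of how the "Dirty" value could be extracted from meminfo-formatted text, assuming the usual "Name: value kB" line layout of /proc/meminfo. The function name parse_dirty_kb is hypothetical; the actual sysstat code may read the file differently.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: scan meminfo-formatted text (e.g. the contents of
 * /proc/meminfo) and return the "Dirty:" value in kB, or -1 if absent.
 */
long parse_dirty_kb(const char *meminfo)
{
	const char *line = meminfo;
	unsigned long kb;

	while (line && *line) {
		/* Only matches when the line starts with "Dirty:" */
		if (sscanf(line, "Dirty: %lu", &kb) == 1)
			return (long) kb;
		line = strchr(line, '\n');
		if (line)
			line++;	/* Skip the newline to reach the next line */
	}
	return -1;
}
```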
New field added to sar -u: %gnice (time spent
running a niced guest).
sar manual page updated.
DTD and XSD documents updated.
Various sadf output formats (XML, CSV, etc.) updated.
Sysstat service unit file has been added to replace init script
for systems with systemd support.
Mail from Peter Schiffer <pschiffe@redhat.com> 24/08/2012:
I am sending you a patch which adds support for systemd to sysstat. With this patch, configure script detects whether the system uses systemd, and if yes, it installs the unit file and enables the sysstat service. Then, init script is not used.
Systemd is default since Fedora 15, and many more distributions are slowly converting to it. For more information about it, see its home page [1], man pages about unit files and integration into autotools can be found at [2 - 6]. General information about systemd can be found at [7].
Sysstat init script updated to make it more conforming to LSB.
Sysstat init script updated:
* instead of temp file, use /var/run/sysstat.pid file
* check user privileges on start and stop (even though stop does nothing)
* status command works "correctly" (of course it will always return 3 - service is not running, but for the love of the standards..)
Mail from Peter Schiffer (pschiffe@redhat.com) 08/08/2012:
I was looking into sysstat's init script, and here are a few things I found out:
* init scripts are usually run with root privileges (at least on system start). But then: mktemp command is run under root (line 24 in init script), and if @SU_C_OWNER@ is set and sa1 script fails, then @SU_C_OWNER@ won't have privileges to remove the temp file. example:
Also, in this case, condition on line 31 is useless.
* it looks like the comment starting on line 28 is not true (or, at least, not anymore). I did a little test (which is attached). In one script, there is command "exit 17", and another script is calling the first via exec. Now, just run:
So, you can see two things: the exit code is not lost, and even if I call "su user -c" on myself, I need to enter my password. The latter can be confusing if @SU_C_OWNER@ is not me, etc..
So.. what I am trying to show:
* command "service sysstat start" should only be called with root privileges
* we can use exit code of command run under "su foo -c ..."
(...)
Now, the init script should be more conforming to LSB (according to the [1]). I'm also attaching output of our internal LSB init script test.
First, this patch renames sadf option '-T' into '-U', and
sadf option '-t' into '-T'.
It then adds a new option: -t. This option tells sadf to display
the timestamps in the local time of the data file creator
instead of UTC. The same option already exists for sar.
The FAQ is also updated to state that options -s and -e are always
expressed in local time.
sadf option -T has been renamed into -U, and option -t has been
renamed into -T. This was necessary in order to add a new option -t
consistent with that of sar.
Make sysstat disk counters consistent with those from latest kernel (3.5).
Changed the type of some disk counters to keep in sync with latest 3.5
kernel.
This breaks the compatibility with older sar data files format for disk
activity.
Mail from Peter Schiffer (pschiffe@redhat.com) 15/02/2012:
I am sending you next patch. I've updated reading /proc/diskstats and
/sys/block/<disk>/stat files in iostat.c and rd_stats.c source files
according to latest kernel (3.2.6).
The problem was that, in case of very high I/O activity, sar -d and iostat
output overflowed values:
Various cosmetic changes in manual pages and usage messages displayed by sysstat commands.
Mail from Peter Schiffer (pschiffe@redhat.com) 04/07/2012:
I am sending you 2 patches. They are minor modifications to the
documentation:
In man-usage.patch I tried to unify usage output of programs and man
pages synopsis, so all the usages of utilities are similar. Also, it
fixes one bug in iostat man page where Network Filesystem report was
mentioned and problem in sar man page where some options in synopsis
weren't bold.
In man-asciibetical-order.patch I tried to unify order of option
descriptions in man pages, I used asciibetical order (uppercase before
lowercase) - again, so all man pages can be similar.
These are rather cosmetic changes than fixes, however I think unity is
important for better user experience.
Persistent device names support added to sar and iostat (option -j)
Option -j added to sar and iostat to add support for persistent device
names.
Mail from Peter Schiffer (pschiffe@redhat.com) 22/06/2012:
I need to implement another feature for sysstat and I would like to hear
your opinion.
Pretty device names, such as sda, vda, ... are not persistent and in
some specific situations kernel can assign different names for the same
device between reboots. To prevent this confusion, persistent device
names exist. Those names are in /dev/disk/by-xxx folders... You are
probably aware of this..
So, currently, sar -d -p and iostat are displaying the pretty device
names. I would like to add new option, which would take one argument
specifying the type of persistent name, and then, sar -d -p and iostat
would display the device names in that persistent name, e.g.:
It would work like this: I wouldn't bound the type of name, rather I
would check whether the specified type inserted by user exists like
folder: /dev/disk/by-label (in that particular example). If yes, I would
resolve links in that folder until I found the device I was looking for
and display that name.. It should be straightforward.
My questions are: how would you name the option? And where would you put
the common code for sar and iostat? Also, do you have any comments or
ideas?
sar: Use /sys/dev/block/major:minor links to determine a device's real name.
Now use /sys/dev/block/major:minor links to determine a device's real name.
This is used as the first option now, before using sysstat.ioconf
configuration file.
Mail from Peter Schiffer (pschiffe@redhat.com) 20/06/2012:
I am sending you a patch which is looking into
/sys/dev/block/major:minor link to determine the device name. This
should work for any device, but I let it as the last option when
determining devname. What do you think?
Sebastien [Sat, 30 Jun 2012 20:32:38 +0000 (22:32 +0200)]
Added option -[0-9]* to sar to show data from that many days ago.
Mail from Don <do1@yandex.ru> 22/06/2012:
Hello Sysstat author(s),
Please add option `-[0-9]+' like `-1', `-2', etc. to sar tool, to show
data of that days ago. That would be handy and useful for everybody.
Example implementation in bash:
function sar() {
    case "$1" in
        -[0-9]*) local OPT="-f /var/log/sa/sa`date +%d --date=${1#-}' day ago'`"; shift;;
    esac
    command sar $OPT "$@"
}
Limitation of this implementation is that -1 option should be first, but
if you implement this type of options in main code you can avoid it and
also many people will benefit. It will be easier than to remember the proper
date and type `-f /var/log/sa/saXX'.
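For comparison with the bash function above, here is a hedged C sketch of the same date arithmetic. The /var/log/sa/saDD layout is taken from the mail; the function name sa_file_days_ago is hypothetical and this is not necessarily how sar implements the option. Subtracting whole 24-hour periods is a simplification that ignores daylight saving transitions.

```c
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper: build the daily data file name for 'days' days
 * before 'now', using the local time zone.
 * Returns 0 on success, -1 on error or if 'buf' is too small.
 */
int sa_file_days_ago(time_t now, int days, char *buf, size_t len)
{
	struct tm tm_buf;
	time_t then = now - (time_t) days * 24 * 60 * 60;

	if (localtime_r(&then, &tm_buf) == NULL)
		return -1;
	if (snprintf(buf, len, "/var/log/sa/sa%02d", tm_buf.tm_mday) >= (int) len)
		return -1;
	return 0;
}
```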
sadc now overwrites its daily data file when it is from a previous month.
When the output file is specified as "-", sadc now overwrites the daily
data file if it is from a previous month. This is useful to prevent data
from several months from being saved in the same file.
Mail from Vitezslav Cizek <vcizek@suse.cz>:
Hi,
/var/log/sa/saXX files don't get overwritten when new month comes.
The new data is appended to the end.
Reproduced with several versions of sysstat.
I browsed the code, but couldn't find any part relevant
to the date checking when opening files.
It works as advertised in manpage with the attached patch.
(Against 10.0.4)
The patch doesn't check whether it operates on the standard file or not.
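The decision described above (overwrite rather than append when the existing daily file dates from another month) can be sketched as follows. This is an assumption-laden illustration: the function name is hypothetical, and the real sadc code may obtain the file's date differently (for example from the data file header).

```c
#include <time.h>

/*
 * Hypothetical check: returns 1 if a daily data file last written at
 * 'file_time' belongs to a different month (or year) than 'now', i.e. the
 * file should be truncated rather than appended to; returns 0 otherwise.
 */
int file_is_from_other_month(const struct tm *file_time, const struct tm *now)
{
	return (file_time->tm_mon != now->tm_mon) ||
	       (file_time->tm_year != now->tm_year);
}
```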
Change time format from HH-MM-SS to HH:MM:SS in the reports
displayed by sadf.
Change the time format to HH:MM:SS to be consistent with XSD
document. This was already the format used a few versions ago
and was changed for an unknown reason.
Mail from Frank Glinka <glinkaf@uni-muenster.de> 24/04/2012:
Your sysstat.xsd specifies the type xs:time for the time elements within the XML. According to the schema specification the format then must be HH:MM:SS for the data but sadf's format is HH-MM-SS. Correspondingly, the element is not valid and not parsed by JAXB.
A maxOccurs indicator has been added for the timestamp element.
This indicator is set to "unbounded", which is required here
since the default value is 1 when not specified.
Mail from Frank Glinka <glinkaf@uni-muenster.de> 24/04/2012:
There seem to be gaps/inconsistencies between the produced XML and the sysstat.xsd that you provide on your website. The xsd does not specify any maximum occurrence indicators, which in that case default to '1'. For example, although the 'timestamp' element occurs multiple times within the 'statistics' element inside the XML, the xsd claims it appears exactly once. In that case, a 'maxOccurs="unbounded"' is required. (This is an issue if I use Java's JAXB and your sysstat.xsd to read & parse the XML file automatically.)
Options -g and -T added to iostat. These options enable the user
to display statistics for groups of devices.
Option -g adds support for device groups statistics to iostat.
Option -T tells iostat to display global stats for groups only, and not
stats for individual devices in those groups.
On 03/09/2012 11:05 AM, Alain Chéreau wrote:
I have a server with many disks and I only wanted to see the
sum of the I/O of the disks used for the same kind of purpose.
I modified iostat to compute the total, sum, or average per group
of devices, and to report the sum under the given name.
I am attaching the modified code that does this.
On 03/13/2012 01:53 PM, Alain Chéreau wrote:
Hello,
a colleague also intends to use it to turn device names into logical names, to make the output easier to read when setting up the titles of the graphs.
Sebastien Godard [Thu, 15 Mar 2012 15:49:36 +0000 (16:49 +0100)]
Set exit code to 0 for sa2 shell script.
Mail from Peter Schiffer (pschiffe@redhat.com) 13/03/2012
Hello,
we have found minor issue in sysstat. Exit code of sa2 doesn't have to be 0 even if everything is OK. This is because the last operation there is precautionary rmdir which may not have anything to do:
$ sudo sh /usr/lib64/sa/sa2 -A
$ echo $?
1
I'm attaching simple patch adding "exit 0" at the end of sa1 and sa2. What do you think?
Sebastien Godard [Fri, 24 Feb 2012 15:04:06 +0000 (16:04 +0100)]
The number of jiffies spent by a CPU in guest mode given by the
corresponding counter in /proc/stat may be slightly different
from that included in the user counter. Take this into account
when calculating current time interval value.
This should be a very rare case, and the difference barely noticeable.
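As a sketch of the kind of guard this implies, assuming the interval computation subtracts the guest counter from the user counter: if guest is slightly larger than user, an unsigned subtraction would wrap around to a huge value. The helper name is hypothetical and the real sysstat computation may differ.

```c
/*
 * Hypothetical helper: user time excluding guest time, in jiffies.
 * The guest counter is expected to be included in the user counter, but
 * the two values may differ slightly, so guest can exceed user.
 * Clamp to 0 instead of letting the unsigned subtraction wrap around.
 */
unsigned long long user_minus_guest(unsigned long long user,
				    unsigned long long guest)
{
	return (guest > user) ? 0 : user - guest;
}
```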
Mail from Peter Schiffer 23/02/2012:
Hello Sebastien,
I've done all my work on sysstat for now, but there were no new patches for you.
However, there is one patch I don't know origin of. I am sending it to you as attachment. It's in Fedora since sysstat 10.0.0, but there's no bug related to it. Did Ivana send it to you already? Do you have it included in the next development version?
Anyways, you can release next sysstat version now, thank you for waiting.
/*
***************************************************************************
- * Remove /dev from path name.
+ * Canonicalize and remove /dev from path name.
*
* IN:
- * @name Device name (may begins with "/dev/")
+ * @name Device name (may begins with "/dev/" or can be a symlink)
*
* RETURNS:
* Device basename.
@@ -390,10 +390,33 @@ int get_win_height(void)
*/
char *device_name(char *name)
{
- if (!strncmp(name, "/dev/", 5))
- return name + 5;
+ char *out;
+ char *resolved_name;
- return name;
+ /* realpath() creates new string, so we need to free it later. */
+ resolved_name = realpath(name, 0);
+
+ /* If path doesn't exists, just copy the input and return it. We have to
+ copy because result of this function is always freed. */
+ if (!resolved_name) {
+ out = (char *)calloc(1, sizeof(name));
+ strncpy(out, name, sizeof(name));
+
+ return out;
+ }
+
+ /* We need to copy the path without "/dev/" prefix because we cannot free
+ 'resolved_name + 5' string. */
+ if (!strncmp(resolved_name, "/dev/", 5)) {
+ out = (char *)calloc(1, sizeof(resolved_name) - 4);
+ strncpy(out, resolved_name + 5, sizeof(resolved_name) - 4);
+
+ free(resolved_name);
+
+ return out;
+ }
+
+ return resolved_name;
}
The configure script has been updated, with the addition of a new
option (--disable-stripping) which tells configure to NOT strip
object files (option "-s" is no longer passed to gcc when linking
the binaries).
From Petr Uzel <petr.uzel@suse.cz> 13/11/2011 11:45 AM
Could you please add something like make install_nostrip to the
Makefile? By default, the buildsystem uses LDFLAGS = -s, which always
strips the resulting binary. In openSUSE, we have to patch this,
because we need to strip the binaries on our own (to create
sysstat-debug{source,info} packages).
Sebastien Godard [Sun, 20 Nov 2011 15:29:07 +0000 (16:29 +0100)]
Fixed random crash with iostat when called with
option -N [NOVELL Bug#729130].
Mail from Petr Uzel <petr.uzel@suse.cz> 11/13/2011 11:45 AM:
> > On 11/09/2011 01:34 PM, Petr Uzel wrote:
>> > >attached patch fixes
>> > >https://bugzilla.novell.com/show_bug.cgi?id=729130
>> > >
Hi Sebastien,
As far as I understand (please correct me if I'm wrong), sysstat is
hitting unspecified behavior, which might explain why you can not
reproduce the bug.
Check out transform_devmapname() function:
while ((dp = readdir(dm_dir)) != NULL) {
/* For each file in DEVMAP_DIR */
dm_name points to the memory returned by readdir(), but from 'man
readdir', this memory is not guaranteed to be valid after next call
to readdir or after closedir().
man readdir:
....
The data returned by readdir() may be overwritten by subsequent
calls to readdir() for the same directory stream.
....
man closedir:
....
The closedir() function closes the directory stream
associated with dirp. A successful call to closedir() also
closes the underlying file descriptor associated with dirp.
The directory stream descriptor dirp is not available after
this call.
....
[It is not very clear to me if this also invalidates the dirent
structure]
So the solution is to strncpy the memory before it gets invalidated by
next readdir() or closedir().
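The fix described above (copy the entry name before the next readdir() or closedir()) might look like the following sketch. The helper name and the matching criterion (a plain name comparison) are placeholders chosen for illustration; the real transform_devmapname() matches entries differently.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: look for an entry called 'wanted' in directory 'dir'.
 * On success, copy its name into 'out' (at most 'len' bytes, always
 * NUL-terminated) and return 0. Return -1 if not found or on error.
 */
int copy_entry_name(const char *dir, const char *wanted, char *out, size_t len)
{
	DIR *d;
	struct dirent *dp;
	int rc = -1;

	if (!len || (d = opendir(dir)) == NULL)
		return -1;

	while ((dp = readdir(d)) != NULL) {
		if (!strcmp(dp->d_name, wanted)) {
			/* Copy now: dp may be overwritten or invalidated later */
			snprintf(out, len, "%s", dp->d_name);
			rc = 0;
			break;
		}
	}

	closedir(d);	/* 'out' remains valid after this call */
	return rc;
}
```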
Unrelated to this bug:
Could you please add something like make install_nostrip to the
Makefile? By default, the buildsystem uses LDFLAGS = -s, which always
strips the resulting binary. In openSUSE, we have to patch this,
because we need to strip the binaries on our own (to create
sysstat-debug{source,info} packages).
Sebastien Godard [Fri, 11 Nov 2011 15:01:37 +0000 (16:01 +0100)]
New output format added to sadf: JSON.
JSON (JavaScript Object Notation) is a lightweight data interchange
format (less verbose than XML). sadf can now display sar's data
in JSON using a new switch (-j).
Sebastien Godard [Wed, 31 Aug 2011 13:11:25 +0000 (15:11 +0200)]
Fixed bugs in sadf XML output and in DTD/XSD documents.
On 08/29/2011 02:27 PM, "Jürgen Heinemann (Undefined)" wrote:
> Hallo Sebastian,
> I have found some bugs with sadf -x command.
> You can see my changes in sysstat-10.0.2.rc1.diff attachment.
> The Doctype Declaration in sadf_misc.c isn't set to "sysstat" rootNode
> and timestamp Element closed with child Elements
> See my Example xslt
>
> sadf -P 0,1 -x > input.xml
> xsltproc --encoding utf-8 --novalid sysstat.xslt input.xml
>
> greets Jürgen
Sebastien Godard [Tue, 16 Aug 2011 12:13:27 +0000 (14:13 +0200)]
Fixed a bug with pidstat, where stats for terminated processes
were still displayed.
On 07/06/2011 08:40 AM, Kei Ishida wrote:
> Hello
>
> I found the following bug in pidstat. I hope this helps.
>
> [Bug description]
> Pidstat displayed pid(s) of dead process(es) every other time
> when multiple pids are specified.
>
> [How reproducible]
> Always. Run pidstat with more than one pid to watch, and kill one of them.
>
> [Proposed patch]
> diff -urNp sysstat-10.0.1/pidstat.c sysstat-10.0.1_/pidstat.c
> --- sysstat-10.0.1/pidstat.c 2011-06-01 22:05:12.000000000 +0900
> +++ sysstat-10.0.1_/pidstat.c 2011-07-06 11:05:46.149880175 +0900
> @@ -1011,6 +1011,20 @@ int get_pid_to_display(int prev, int cur
>
> else if (DISPLAY_PID(pidflag)) {
>
> + unsigned int i;
> + int pid_exists = FALSE;
> +
> + /* See if pid exists in pid_array[] */
> + for (i = 0; i < pid_array_nr; i++) {
> + if ((*pstc)->pid == pid_array[i]) {
> + pid_exists = TRUE;
> + break;
> + }
> + }
> +
> + if (!pid_exists)
> + return 0;
> +
> *pstp = st_pid_list[prev] + p;
> }
>
>
> Regards,
> Kei Ishida
Sebastien Godard [Mon, 15 Aug 2011 14:00:05 +0000 (16:00 +0200)]
Option "-P ON" added to mpstat.
This option tells mpstat to display statistics for online
processors only.
mpstat manual page updated.
On 06/30/2011 06:41 AM, Ananth N Mavinakayanahalli wrote:
a. Consistency with output of top and /proc/cpuinfo, both of which won't
display information of offlined CPUs.
b. On POWER7 for instance, with SMT, when SMT is turned off, 3 of the 4
CPUs are off-lined. Someone using just mpstat will have no idea that
he has just 1 CPU underneath while mpstat says he has 4, without an
indication of which ones are usable and which ones are not.
I think, at the least, the -P switch should be educated with a new
option that displays only online CPU information, if there is a hard
requirement to preserve existing functionality.
Sebastien Godard [Sun, 14 Aug 2011 14:33:55 +0000 (16:33 +0200)]
pidstat manual page updated.
Added the description of field %MEM displayed by pidstat -r.
On 06/29/2011 05:43 AM, Carlos Allegri wrote:
> I'm using pidstat version 9.0.6 for monitoring of a mail server.
> I run: "pidstat -r"
> Then, in the output: "PID minflt/s majflt/s VSZ RSS %MEM Command", I see the column "%MEM" which isn't explained in the manual.
> Could you tell me, which is the meaning of this column?
Sebastien Godard [Sat, 13 Aug 2011 13:58:28 +0000 (15:58 +0200)]
sadf manual page updated.
Use of options -t, -H and -T has changed. So updated sadf manual
page accordingly.
Also fix a wrong statement, saying that options -s and -e are ignored
with option -x.
Sebastien Godard [Sat, 13 Aug 2011 12:44:43 +0000 (14:44 +0200)]
DTD and XSD documents updated.
A new mark ("utc") has been added in the XML output
generated by sadf -x. This mark has a value of 0 or 1 depending
on whether the timestamp is expressed in local time or UTC.
A correction has also been made in the XSD document, where the
description of "comment" messages was erroneous.
Sebastien Godard [Sat, 13 Aug 2011 12:28:19 +0000 (14:28 +0200)]
sadf modified to make it easier to add new output formats.
sadf has been heavily modified to make it easier to add new
output formats. The idea was to follow the same architecture pattern
as sar. However, I haven't been able to fully achieve this goal:
the design is still not generic, although things have improved.
Automate translation files handling in Makefile.in.
Mail from Jeroen Roovers <jer@gentoo.org> 03/06/2011:
Subject: [PATCH] automate translation file handling in Makefile.in
Hello,
as maintainer of the sysstat package in the Gentoo Linux repository, I
have been maintaining a patch that makes it easier to instruct our
build/install system to install support for certain languages or indeed
save space on the target system by not installing them.
This patch makes the build system not list all the available
translations as the current Makefile.in does, but finds the language
files on its own and repeats the same two commands in a loop over the
files it finds. Also with this patch, the Makefile.in does not need to
be updated for each new translation any longer.
Sebastien Godard [Sat, 28 May 2011 14:03:55 +0000 (16:03 +0200)]
Fixed XML output displayed by sadf (hugepages statistics were
included in power management ones).
When displaying stats with sar, hugepages utilization statistics
were displayed between voltage inputs statistics and CPU clock
ones. This was not really smart but still OK.
Yet, when displaying XML output with sadf -x, hugepages statistics
were included in the <power-management> section, which is quite bad
in this case. So move hugepage structure just after memory utilization
one in activity.c:act[] array.
Sebastien Godard [Tue, 24 May 2011 11:46:17 +0000 (13:46 +0200)]
sar and pidstat: Check that _("Average") string doesn't exceed
the size of the timestamp buffer.
One could find something like:
strcpy(string, _("Average"));
in pidstat.c and sar.c. Yet, we don't know whether the translation
message for "Average" will fit in target string buffer. Hence we
replaced the previous expression with something like:
strncpy(string, _("Average"), length_of_string_buffer);
string[length_of_string_buffer - 1] = '\0';
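A self-contained form of this pattern, as a sketch only (the helper name copy_bounded is hypothetical and does not appear in sar.c or pidstat.c):

```c
#include <string.h>

/* Hypothetical helper: bounded copy that always NUL-terminates 'dst'. */
void copy_bounded(char *dst, const char *src, size_t len)
{
	if (!len)
		return;
	strncpy(dst, src, len);
	/* strncpy() does not terminate dst when src has len or more bytes */
	dst[len - 1] = '\0';
}
```

A translation longer than the buffer is then truncated instead of overflowing it.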
Sebastien Godard [Tue, 24 May 2011 09:43:25 +0000 (11:43 +0200)]
sar: Decrease column width for sensor device name (temperature,
voltage inputs and fans statistics).
There were several unnecessary spaces between the last column of
statistics and the device name in the report displayed by sar for
temperature, voltage inputs and fans statistics.
These spaces have been removed.
Sebastien Godard [Tue, 24 May 2011 09:26:35 +0000 (11:26 +0200)]
sadf -p now displays the sensor device name for temperature,
voltage inputs and fans statistics.
The render() function was not properly used in rndr_stats.c, in particular
when the DEVICE name was to be displayed by sadf for fans, voltage inputs
and temperature statistics.
A new flag has been added (PT_USESTR) enabling the render() function to
display strings. As a consequence, sadf -d and sadf -p are now able
to display the sensor device name.
The output of sadf -d has also changed: it is no longer "device;FAN;..."
but "FAN;DEVICE;...". The same applies to TEMP and IN statistics.
Option -h added to iostat. This option makes the device utilization
report easier to read with long device names.
Mail from Ivana Varekova (varekova@redhat.com) 02/05/2011
Subject: feature request: Resizing device column in iostat to size of larges lvm device name
Hello,
I'm sending three patches against sysstat-10.0.0:
* sysstat_1.patch
adds -h (human readable) option to iostat tool (just to indent the row after the device name)
[...]
If you need clarification on any of them, please send me an e-mail.
Ivana
From Fedora bugzilla: G. Michael Carter 2011-04-11 17:19:58 EDT
As far as reports go this is rather hard to read. Can we get the Device column
to size based on the longest name?
cifsiostat didn't count open files from the "Posix Open" column in
/proc/fs/cifs/Stats file. This is now fixed.
Mail from Ivana Varekova (varekova@redhat.com) 02/05/2011
Subject: feature request: Resizing device column in iostat to size of larges lvm device name
Hello,
I'm sending three patches against sysstat-10.0.0:
[...]
* sysstat_3.patch
fixes cifsiostat tool which in open files does not count files which are in /proc/fs/cifs/Stats output in column "Posix Open"
If you need clarification on any of them, please send me an e-mail.
Ivana
Close file descriptor in read_uptime() function (file rd_stats.c).
File descriptor was not closed when /proc/uptime file happened to be empty.
Mail from Ivana Varekova (varekova@redhat.com) 02/05/2011:
Subject: feature request: Resizing device column in iostat to size of larges lvm device name
Hello,
I'm sending three patches against sysstat-10.0.0:
[...]
* sysstat_2.patch
fix read_uptime bug - this function does not close the file descriptor which is open in it in the special situation
[...]
If you need clarification on any of them, please send me an e-mail.
Ivana
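The leak described in this entry (a file opened, found empty, and never closed) is avoided by closing the stream on every path. A minimal sketch, with a hypothetical function name and not the actual read_uptime() code:

```c
#include <stdio.h>

/*
 * Hypothetical reader: fetch the first line of 'path' into 'line'.
 * Returns 0 on success, -1 if the file cannot be opened or is empty.
 * The stream is closed on every path, including the empty-file case.
 */
int read_first_line(const char *path, char *line, int len)
{
	FILE *fp = fopen(path, "r");
	int rc = 0;

	if (fp == NULL)
		return -1;

	if (fgets(line, len, fp) == NULL)
		rc = -1;	/* Empty file or read error */

	fclose(fp);	/* Single exit point: no early return leaks fp */
	return rc;
}
```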
Sebastien Godard [Fri, 11 Mar 2011 13:45:05 +0000 (14:45 +0100)]
Fixed several bugs with nfsiostat (and cifsiostat).
Mail from Masanari Iida (masanari.iida@hp.com) 08/02/2011:
Sebastien,
Thanks for the fix.
Before you release the core on end of Feb, I would ask you to test the
code with following 2 scenario.
(1) Mount / Umount while nfsiostat running.
Check points
(a) nfsiostat detect new nfs mount points after nfsiostat started.
The mounted NFS share have to be reported by nfsiostat.
(b) nfsiostat detect nfs mount points which is umounted after nfsiostat started.
No lines reported by nfsiostat after umount NFS share.
This is an original bug scenario that I reported to you in the first e-mail.
(2) nfsiostat not showing incorrect value when NFS mount point re-mount happen.
Following symptom was seen on sysstat 7.0.2 (on RHEL5).
Step to reproduce.
(1) Mount an NFS share
(2) Run iostat (or nfsiostat)
(3) Umount the NFS share
(4) Mount the same NFS share that umounted in step 3.
(5) Check the iostat result for the NFS share.
Actual Result on sysstat 7.0.2.
Very large value for rops/s and wops/s display one time.
These are incorrect values.
Expected result.
The nfsiostat reported correct value for rops/s and wops/s
even when the NFS mount point re-mounted while nfsiostat running.
Regards,
Masanari Iida
Mail from Masanari Iida (masanari.iida@hp.com) 24/02/2011:
Hello,
Thank you for your support.
I have tested your new code.
The bugs that I have reported in previous e-mail are fixed on this version.
So I no longer get a coredump, and nfsiostat detects all mounted and
umounted filesystems while running.
One minor issue still remain here.
On 2nd result of the nfsiostat always include some unknown value.
(These 0 data are correct, since I just mount the NFS share and not doing
any I/O during the test. )
How to reproduce 1
(1) Mount one NFS filesystem.
(2) Run nfsiostat with interval and count options. The count must be 3 or more.
(3) Check the 2nd result from nfsiostat.
How to reproduce 2
(1) Run nfsiostat with interval.
(2) Mount the NFS filesystem while running the nfsiostat.
(3) Check the 2nd result of the just mounted NFS filesystem.
I know that vmstat or iostat case, the first result is a history of data since OS boot.
So I usually ignore the first data and use after the 2nd result.
In NFS automount environment, NFS mount/umount happen at random timing.
So it is hard to remove these sudden large value from data.
Impact.
When a script draw a graph, these sudden large values expand Y scale.
So the normal value may looks smaller than expected in the graph.
Don't link sysstat's commands with sensors library if not needed.
Sebastien,
I'm forwarding a bug report I've got today. The submitter is right. The
only binary that actually needs sensors is sadc. but all the remaining
binaries are linked with the library, and it's not easy to change it.
I was trying to find some way to fix the issue e.g. by using the
--as-needed linker option, but this didn't work well (even after
changing ordering of linking). Finally I've come up with an idea of
splitting rd_stats.c into two parts - one that requires sensors and the
other one that doesn't.
I'm attaching some proof of concept patch just to show the idea. It uses
preprocessor macro to compile a tiny version of rd_stats.c that is then
linked with all binaries except for sadc. I don't think the patch is
ready to be applied by you as is. It would be much better to manually
split the rd_stats.c file into two parts, and then make configure not
add -lsensors to CFLAGS, but put -lsensors in some autoconf variable
that would be added to sadc's LFLAGS. Could you please look at it and
possibly make the appropriate changes for the next version of sysstat?
Thanks,
robert
------ Original message ------
Subject: Bug#612571: sysstat: iostat links libsensors4 with no need
Resent-Date: Wed, 09 Feb 2011 09:18:01 +0000, Wed, 09 Feb 2011
09:18:04 +0000
Resent-From: Mario 'BitKoenig' Holbe <Mario.Holbe@TU-Ilmenau.DE>
Resent-To: debian-bugs-dist@lists.debian.org
Resent-CC: Robert Luberda <robert@debian.org>
Date: Wed, 9 Feb 2011 10:15:31 +0100
From: Mario 'BitKoenig' Holbe <Mario.Holbe@TU-Ilmenau.DE>
Reply-To: Mario 'BitKoenig' Holbe <Mario.Holbe@TU-Ilmenau.DE>,
612571@bugs.debian.org
To: submit@bugs.debian.org
Package: sysstat
Version: 9.1.7-2
Hello,
iostat from sysstat 9.1.7-2 links /usr/lib/libsensors.so.4.
The manual page doesn't reveal any new functions that could explain this
and according to ldd libsensors isn't used:
$ ldd -u /usr/bin/iostat
Unused direct dependencies:
Sebastien Godard [Sun, 27 Feb 2011 15:07:59 +0000 (16:07 +0100)]
iostat incorrectly mapped device-mapper IDs
greater than 256. This is now fixed [DEBIAN Bug#614397].
Mail from Robert Luberda <robert@debian.org> (22/02/2011)
Subject: Fwd: Bug#614397: iostat(from sysstat) doesn't support more
than 256 device-mapper names
Sebastien,
I'm forwarding a bug report I've got yesterday.
I can't reproduce it by myself, as I don't use LVM at all, but I've just
found out that some Debian machines actually do use it, and the device
numbers on them start with 252, so the contents of /dev/mapper looks like:
crw-rw---- 1 root root 10, 61 Feb 19 18:43 control
brw-rw---- 1 root disk 252, 0 Feb 19 18:43 vg_$hostname-srv
Regards,
robert
------ Original message ------
Subject: Bug#614397: iostat(from sysstat) doesn't support more than 256
device-mapper names
Resent-Date: Mon, 21 Feb 2011 18:27:01 +0000, Mon, 21 Feb 2011
18:27:04 +0000
Resent-From: Adam Heath <doogie@brainfood.com>
Resent-To: debian-bugs-dist@lists.debian.org
Resent-CC: Robert Luberda <robert@debian.org>
Date: Mon, 21 Feb 2011 12:24:11 -0600
From: Adam Heath <doogie@brainfood.com>
Reply-To: Adam Heath <doogie@brainfood.com>, 614397@bugs.debian.org
To: submit@bugs.debian.org
package: sysstat
severity: minor
version: 9.0.6.1-2
iostat tries to do bit-shifting of device ids; this is a big no-no.
Attached patch at least fixes it for mapping of device-mapper
names(iostat -N).
I would love to have this go into stable-updates(squeeze), but can
understand if it's not the type of change that normally would be allowed.
The circumstances of this bug, cause device 256 to map to 0, 257 to
map to 1, etc.
debian-changes-9.0.6.1-2.1
--- sysstat-9.0.6.1.orig/ioconf.c
+++ sysstat-9.0.6.1/ioconf.c
@@ -500,8 +500,8 @@ char *transform_devmapname(unsigned int
if (stat(filen, &aux) == 0) {
/* Get its minor and major numbers */
Sebastien Godard [Sun, 27 Feb 2011 14:45:23 +0000 (15:45 +0100)]
Option -V from sysstat commands now displays the version number on stdout
and returns 0 for the exit code.
Option -V used to display the version number on stderr and return 1 for
the exit code.
This is not the expected behavior, since the command has done everything
it was asked to do properly.
So change this: display on stdout and return 0.
The same change has been applied to sar's option -h, which displays a
help message.
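As a rough illustration of this behavior (not the actual sysstat code), the sketch below writes the version banner to a caller-supplied stream and returns the exit status to use. The function name show_version() and its parameters are assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: print the version banner on `out` and return
 * the exit status. Displaying the version is a successful operation,
 * so EXIT_SUCCESS (0) is returned unless the write itself fails.
 */
int show_version(FILE *out, const char *version)
{
	assert(out != NULL && version != NULL);

	if (fprintf(out, "sysstat version %s\n", version) < 0)
		return EXIT_FAILURE;

	return EXIT_SUCCESS;
}
```

A caller would typically end with something like exit(show_version(stdout, "9.1.7")), so that scripts can read the version on stdout and rely on a zero exit code.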
Mail from Lodewijk Bonebakker <jlbonebakker@gmail.com> (15/02/2011):
Subject: Systat version reporting
Dear Sebastian,
I have question related to the way you report the version number in sysstat. At the moment, it seems that you write:
sysstat version <versionno>
(C) Sebastien Godard (sysstat <at> orange.fr)
to stderr, and set the error-code to 1.
Given the significant changes between 7/8 and 9, we have a tremendous headache in automatically dealing with the different sar data files, collected during the day on different machines (some which we prefer not to upgrade). Currently in our environment we can work with this way of reporting your version number, but I would like to make a suggestion:
It would greatly help us if the version command returns only "sysstat version <versionno>" to stdout and sets the return code to 0. Our reasoning behind this is that 'sar -V' should print the version number and exit 0, since it has done everything we have asked it to do correctly. A non-zero exit code is then reserved for error-conditions. This way we can efficiently get the version-number, check for proper installation/kernel versioning etc.
Thank you for your time in considering this suggestion,
Added the possibility to extend the number of slots for NFS and
CIFS mount points on the fly.
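One common way to extend a table of slots while a program runs is to grow it with realloc(). The sketch below is only an illustration of that technique: struct slot, struct slot_table and grow_slots() are hypothetical and do not reflect the structures used in nfsiostat or cifsiostat.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical slot describing one mount point. */
struct slot {
	char name[64];
	int in_use;
};

/* Hypothetical table of slots that can grow on demand. */
struct slot_table {
	struct slot *slots;
	size_t count;
};

/*
 * Add `extra` zeroed slots at the end of the table.
 * Returns 0 on success, -1 on failure (the table is then left unchanged).
 */
int grow_slots(struct slot_table *t, size_t extra)
{
	struct slot *p;
	size_t new_count;

	assert(t != NULL);
	if (extra == 0)
		return 0;

	new_count = t->count + extra;
	/* Reject arithmetic overflow before computing the byte size */
	if (new_count < t->count || new_count > SIZE_MAX / sizeof(*p))
		return -1;

	p = realloc(t->slots, new_count * sizeof(*p));
	if (p == NULL)
		return -1;

	/* Existing slots are preserved by realloc(); clear only the new ones */
	memset(p + t->count, 0, extra * sizeof(*p));
	t->slots = p;
	t->count = new_count;
	return 0;
}
```

Starting from an empty table ({ NULL, 0 }), grow_slots() can be called each time a new mount point shows up and no free slot is left.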
Mail from Ivana Varekova (varekova@redhat.com) 02/02/2011:
Hello, I'm sending 6 patches - 3 for nfsiostat and 3 for cifsiostat
nfsiostat:
nfsiostat.patch - adds the possibility to extend the number of
slots for nfs mount points during the nfsiostat run
cifsiostat:
cifsiostat.patch - adds the possibility to extend the number of
slots for cifs mount points during the cifsiostat run
See also mail from Masanari Iida (masanari.iida@hp.com) 28/01/2011:
Hello
I have a feedback about nfsiostat behavior.
Description
nfsiostat need to be run _AFTER_ the NFS share is mounted.
Version.
nfsiostat in sysstat 9.1.7
How to reproduce
(1) Run nfsiostat -k 1
(2) Mount one NFS share
(3) Check nfsiostat output.
Expected result.
nfsiostat start to report the NFS share's activities, after I mount it.
Actual result.
nfsiostat not reporting the mounted NFS share, even after it is mounted.
Additional information:
If I mount the NFS share _BEFORE_ i run nfsiostat, nfsiostat reports the
NFS share's information. And also, if I umount the NFS share,
the line disappeared. (This is expected.)
It is because, the environment is using autofs, mount and umount often
happens on the system.
But I continously running the iostat -kn to collect the statics, then I have
encounteded this symptom. (And tested with nfsiostat )
In case of multiple NFS mount points, the symptom is bit different.
If an additional NFS share is mounted BEFORE this test is done,
the target NFS share appeared after mount it. And dissapeared it after umount.
But if I do the same thing one more time, it is not display any more.
========
To say the truth, the original symptom is happen on sysstat 7.0
on RHEL5 system, using iostat -kn. The environment uses autofs.
I wanted to confirm if the upstream version already fixed this symptom,
so that's why I downloaded the latest tar ball and tried.
But so far, the similar symptom still exist.
If you think this is current known limitation or known bug,
please document it in man page or FAQ page of your web.
Fix a problem with long NFS and CIFS share names in cifsiostat and
nfsiostat.
Mail from Ivana Varekova (varekova@redhat.com) 02/02/2011
Hello, I'm sending 6 patches - 3 for nfsiostat and 3 for cifsiostat
nfsiostat:
nfsiostat3.patch - fix the problem with long nfs shares names
(...)
cifsiostat:
cifsiostat3.patch - fix the problem with long cifs shares names
(...)
Check calloc() return value in nfsiostat and cifsiostat.
A call to the calloc() function to allocate structures in nfsiostat and
cifsiostat was not checked for its return value. This call could
fail without ever being noticed.
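The kind of check involved can be sketched as follows. This is a minimal illustration, not the code from the patches: struct mount_stats and alloc_stats() are hypothetical names, and how the caller reacts to a NULL result (for example, exiting) is left open.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical per-mount statistics record; fields are illustrative only. */
struct mount_stats {
	unsigned long long rd_bytes;
	unsigned long long wr_bytes;
};

/*
 * Allocate `count` zero-initialized records.
 * Returns NULL, after reporting the error on stderr, if calloc() fails,
 * so that the failure cannot go unnoticed by the caller.
 */
struct mount_stats *alloc_stats(size_t count)
{
	struct mount_stats *st;

	assert(count > 0);
	st = calloc(count, sizeof(*st));
	if (st == NULL)
		perror("calloc");

	return st;
}
```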
Mail from Ivana Varekova (varekova@redhat.com) 02/02/2011
Hello, I'm sending 6 patches - 3 for nfsiostat and 3 for cifsiostat
nfsiostat:
nfsiostat2.patch - adds the forgotten test to malloc
(...)
cifsiostat:
cifsiostat2.patch - adds the forgotten test to malloc
(...)
Added --debuginfo option to cifsiostat and nfsiostat commands.
Jan Kaluza from Red Hat (jkaluza@redhat.com) added option --debuginfo
to cifsiostat and nfsiostat commands.
His mail (06/30/2010):
Hi,
thanks for applying my previous patch in Sysstat 9.1.3 (I'm really proud to be
in the Changelog). I've created another patch which adds --debuginfo option
also for new tools (nfsiostat, cifsiostat) introduced in this release. I think
it could help debugging in some situations. Please feel free to ask any
question about that patch.
Jan Kaluza
cifsiostat and nfsiostat manual pages have also been updated.
By default, sysstat_panic function is no longer included in binary commands.
This function is defined only if --enable-debuginfo has been used with
configure.
Sebastien Godard [Fri, 17 Dec 2010 20:19:53 +0000 (21:19 +0100)]
Added a new field (blocked) to sar -q.
This patch adds a new metric (blocked - number of tasks currently
blocked, waiting for I/O to complete) to sar -q.
Also update sar manual page, and DTD/XSD documents.
Note that this breaks stats_queue structure format.
Sebastien Godard [Sat, 11 Dec 2010 20:49:22 +0000 (21:49 +0100)]
No longer assume that device-mapper major number is 253.
Get the real number from /proc/devices file.
The sar, sadf and iostat commands used to assume that device-mapper
major number was 253. This happened to be false sometimes. So get
the real number from the /proc/devices file.
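A lookup of this kind can be sketched as below. It assumes the usual /proc/devices layout (a "Block devices:" header followed by "<major> <name>" lines). The function find_block_major() is a hypothetical name for this example and is not the routine used by sar, sadf or iostat.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan a stream formatted like /proc/devices and return the major
 * number registered for block driver `name`, or -1 if it is not found.
 * Only entries listed after the "Block devices:" header are considered.
 */
int find_block_major(FILE *fp, const char *name)
{
	char line[128], dev[64];
	int major_nr, in_block = 0;

	assert(fp != NULL && name != NULL);

	while (fgets(line, sizeof(line), fp) != NULL) {
		if (strncmp(line, "Block devices:", 14) == 0) {
			in_block = 1;
			continue;
		}
		if (in_block &&
		    sscanf(line, "%d %63s", &major_nr, dev) == 2 &&
		    strcmp(dev, name) == 0)
			return major_nr;
	}

	return -1;
}
```

A caller would open /proc/devices with fopen() and pass "device-mapper" as the name, falling back to some default behavior when -1 is returned.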
From Mike Coleman <tutufan@gmail.com> 04/10/2010:
The iostat program seems to assume that the major device number for
devmap will always be 253. This doesn't seem to be an official
number, though, and I have a box where it actually ends up being 252,
which breaks 'iostat -N'.
It looks like this can be determined dynamically by looking at /proc/devices.
Sebastien Godard [Sat, 11 Dec 2010 14:18:19 +0000 (15:18 +0100)]
[Kenichi Okuyama]: Small change to sar manual page.
Kenichi Okuyama <kenichi.okuyama@gmail.com> noticed that a sentence
in sar manual page was a bit confusing:
ip-frag
Number of IP fragments currently in use.
This is confusing because, unless you are familiar with Linux kernel
internals, "IP fragments" are not something that can be "in use".
The value is really the number of elements in the fragment queues.
Each queue element is a group of fragmented IP packets which, once
complete, becomes a single IP packet.
So replace it with:
ip-frag
Number of IP fragments currently in queue.
Sebastien Godard [Tue, 30 Nov 2010 20:28:35 +0000 (21:28 +0100)]
pidstat: Code cleaned.
A comment in pidstat.c (get_pid_to_display() function) still referred
to option -X, although this option no longer exists as it was
merged with option -C.