Apply README changes from Josh Kupershmidt to source file doc/pgBadger.pod, README...

author Darold Gilles <gilles@darold.net>

Sat, 25 Jul 2015 17:31:46 +0000 (19:31 +0200)

committer Darold Gilles <gilles@darold.net>

Sat, 25 Jul 2015 17:31:46 +0000 (19:31 +0200)
diff --git a/doc/pgBadger.pod b/doc/pgBadger.pod

index 47056db0203e5e04f7464ec0bba3eca4aaef2fd3..36532f112aa07120ea4e2192036c12c420f0115e 100644
--- a/doc/pgBadger.pod
+++ b/doc/pgBadger.pod
@@ -13,8 +13,8 @@ Arguments:
      logfile can be a single log file, a list of files, or a shell command
      returning a list of files. If you want to pass log content from stdin
      use - as filename. Note that input from stdin will not work with csvlog.
-    You can also use a file containing a list of log file to parse, see -L
-    command line option.
+    You can also use a file containing a list of log files to parse, see
+    the -L command line option.
  
  Options:
  
@@ -86,7 +86,7 @@ Options:
      -w | --watch-mode      : only report errors just like logwatch could do.
      -x | --extension       : output format. Values: text, html, bin, json or
                               tsung. Default: html
-    -X | --extra-files     : in incremetal mode allow pgbadger to write CSS and
+    -X | --extra-files     : in incremetal mode allow pgBadger to write CSS and
                              JS files in the output directory as separate files.
      -z | --zcat exec_path  : set the full path to the zcat program. Use it if
                               zcat or bzcat or unzip is not in your path.
@@ -123,14 +123,14 @@ Options:
                               You can use this option multiple times.
      --exclude-appname name : exclude entries for the specified application name
                              from report. Example: "pg_dump".
-    --exclude-line regex   : pgbadger will start to exclude any log entry that
+    --exclude-line regex   : pgBadger will start to exclude any log entry that
                              will match the given regex. Can be used multiple
                              time.
      --anonymize            : obscure all literals in queries, useful to hide
                               confidential data.
      --noreport             : prevent pgbadger to create reports in incremental
                              mode.
-    --log-duration         : force pgbadger to associate log entries generated
+    --log-duration         : force pgBadger to associate log entries generated
                               by both log_duration = on and log_statement = 'all'
      --enable-checksum      : used to add a md5 sum under each query report.
  
@@ -190,39 +190,39 @@ Or better, use the auto-generated incremental reports:
  
  will generate a report per day and per week.
  
-In incremental mode, you can also specify the number of week to keep in the
+In incremental mode, you can also specify the number of weeks to keep in
  reports:
  
      /usr/bin/pgbadger --retention 2 -I -q /var/log/postgresql/postgresql.log.1 
         -O /var/www/pg_reports/
  
-If you have a pg_dump at 23:00 and 13:00 each day during half an hour, you can
-use pgbadger as follow to exclude these period from the report:
+If you have a pg_dump at 23:00 and 13:00 each day lasting half an hour,
+you can use pgBadger as follows to exclude those periods from the report:
  
      pgbadger --exclude-time "2013-09-.* (23|13):.*" postgresql.log 
  
-This will help avoid having COPY statements, as generated by pg_dump, on top of
-the list of slowest queries. You can also use --exclude-appname "pg_dump" to
+This will help avoid having COPY statements, as generated by pg_dump, at
+the top of the list of slowest queries. You can also use --exclude-appname
  solve this problem in a simpler way.
  
  
  =head1 DESCRIPTION
  
-pgBadger is a PostgreSQL log analyzer build for speed with fully detailed
+pgBadger is a PostgreSQL log analyzer built for speed with fully
  reports from your PostgreSQL log file. It's a single and small Perl script
-that outperform any other PostgreSQL log analyzer.
+Perl script that outperforms any other PostgreSQL log analyzer.
  
-It is written in pure Perl language and uses a javascript library (flotr2)
-to draw graphs so that you don't need to install any additional Perl modules
-or other packages. Furthermore, this library gives us more features such as
-zooming. pgBadger also uses the Bootstrap javascript library and the FontAwesome
-webfont for better design. Everything is embedded.
+It is written in pure Perl and uses a javascript library (flotr2) to draw
+graphs so that you don't need to install any additional Perl modules or
+other packages. Furthermore, this library gives us more features such
+as zooming. pgBadger also uses the Bootstrap javascript library and
+the FontAwesome webfont for better design. Everything is embedded.
  
  pgBadger is able to autodetect your log file format (syslog, stderr or csvlog).
-It is designed to parse huge log files as well as gzip compressed file. See a
+It is designed to parse huge log files as well as gzip compressed files. See a
  complete list of features below. Supported compressed format are gzip, bzip2
-and xz. For the last one you must have a xz version upper than 5.05 that support
-the --robot option.
+and xz. For the xz format you must have an xz version upper than 5.05 that
+supports the --robot option.
  
  All charts are zoomable and can be saved as PNG images.
  
@@ -232,8 +232,8 @@ report using command line options.
  pgBadger supports any custom format set into the log_line_prefix directive of
  your postgresql.conf file as long as it at least specify the %t and %p patterns.
  
-pgBadger allow parallel processing on a single log file and multiple files
-through the use of the -j option and the number of CPUs as value.
+pgBadger allows parallel processing of a single log file or multiple
+files through the use of the -j option specifying the number of CPUs.
  
  If you want to save system performance you can also use log_duration instead of
  log_min_duration_statement to have reports on duration and number of queries only.
@@ -258,8 +258,8 @@ pgBadger reports everything about your SQL queries:
         Queries generating the most cancellation.
         Queries most cancelled.
  
-The following reports are also available with hourly charts divide by periods of
-five minutes:
+The following reports are also available with hourly charts divided into
+periods of five minutes:
  
         SQL queries statistics.
         Temporary file statistics.
@@ -268,7 +268,7 @@ five minutes:
         Cancelled queries.
         Error events (panic, fatal, error and warning).
  
-There's also some pie reports of distribution about:
+There are also some pie charts about distribution of:
  
         Locks statistics.
         Queries by type (select/insert/update/delete).
@@ -283,23 +283,23 @@ highlighted and beautified automatically.
  
  You can also have incremental reports with one report per day and a cumulative
  report per week. Two multiprocess modes are available to speed up log parsing,
-one using one core per log file, and the second to use multiple core to parse
-a single file. Both modes can be combined.
+one using one core per log file, and the second using multiple cores to parse
+a single file. These modes can be combined.
  
  Histogram granularity can be adjusted using the -A command line option. By default
-they will report the mean of each top queries/error occuring per hour, but you can
+they will report the mean of each top queries/errors occuring per hour, but you can
  specify the granularity down to the minute.
  
  pgBadger can also be used in a central place to parse remote log files using a
-password less SSH connection. This mode can be used with compressed files and
-in mode multiprocess per file (-J) but can not be used with CSV log format.
+passwordless SSH connection. This mode can be used with compressed files and in
+the multiprocess per file mode (-J) but can not be used with the CSV log format.
  
  
  =head1 REQUIREMENT
  
  pgBadger comes as a single Perl script - you do not need anything other than a modern
  Perl distribution. Charts are rendered using a Javascript library so you don't need
-anything. Your browser will do all the work.
+anything other than a web browser. Your browser will do all the work.
  
  If you planned to parse PostgreSQL CSV log files you might need some Perl Modules:
  
@@ -365,10 +365,10 @@ You must first enable SQL query logging to have something to parse:
  
          log_min_duration_statement = 0
  
-Here every statement will be logged, on busy server you may want to increase
-this value to only log queries with a higher duration time. Note that if you
-have log_statement set to 'all' nothing will be logged through directive
-log_min_duration_statement. See next chapter for more information.
+Here every statement will be logged, on a busy server you may want to increase
+this value to only log queries with a longer duration. Note that if you have
+log_statement set to 'all' nothing will be logged through the log_min_duration_statement
+directive. See the next chapter for more information.
  
  With 'stderr' log format, log_line_prefix must be at least:
  
@@ -400,7 +400,7 @@ You need to enable other parameters in postgresql.conf to get more information f
          log_temp_files = 0
         log_autovacuum_min_duration = 0
  
-Do not enable log_statement as their log format will not be parsed by pgBadger.
+Do not enable log_statement as its log format will not be parsed by pgBadger.
  
  Of course your log messages should be in English without locale support:
  
@@ -409,14 +409,14 @@ Of course your log messages should be in English without locale support:
  but this is not only recommended by pgBadger.
  
  Note: the session line [%l-1] is just used to match the default prefix for "stderr".
-The -1 has no real purpose and basically is not used in Pgbadger statistics / graphs.
-You can safely removed them from the log_line_prefix but you will need to set the
---prefix command line option.
+The -1 has no real purpose and basically is not used in pgBadger statistics / graphs.
+You can safely remove them from the log_line_prefix but you will need to set the
+--prefix command line option accordingly.
  
  =head1 log_min_duration_statement, log_duration and log_statement
  
-If you want full statistics reports you must set log_min_duration_statement
-to 0 or more milliseconds.
+If you want the query statistics to include the actual query strings,
+you must set log_min_duration_statement to 0 or more milliseconds.
  
  If you just want to report duration and number of queries and don't want all
  details about queries, set log_min_duration_statement to -1 to disable it and
@@ -434,7 +434,7 @@ set to 'all' nothing will be logged with log_line_prefix.
  To enable parallel processing you just have to use the -j N option where N is
  the number of cores you want to use.
  
-pgbadger will then proceed as follow:
+pgBadger will then proceed as follow:
  
         for each log file
             chunk size = int(file size / N)
@@ -447,17 +447,17 @@ pgbadger will then proceed as follow:
         All binary temporary files generated will then be read and loaded into
         memory to build the html output.
  
-With that method, at start/end of chunks pgbadger may truncate or omit a
-maximum of N queries perl log file which is an insignificant gap if you have
+With that method, at start/end of chunks pgBadger may truncate or omit a
+maximum of N queries per log file which is an insignificant gap if you have
  millions of queries in your log file. The chance that the query that you were
-looking for is loose is near 0, this is why I think this gap is livable. Most
+looking for is lost is near 0, this is why I think this gap is livable. Most
  of the time the query is counted twice but truncated.
  
-When you have lot of small log files and lot of CPUs it is speedier to dedicate
+When you have many small log files and many CPUs it is speedier to dedicate
  one core to one log file at a time. To enable this behavior you have to use
  option -J N instead. With 200 log files of 10MB each the use of the -J option
-start being really interesting with 8 Cores. Using this method you will be sure
-to not loose any queries in the reports.
+starts being really interesting with 8 Cores. Using this method you will be
+sure not to lose any queries in the reports.
  
  He are a benchmarck done on a server with 8 CPUs and a single file of 9.5GB.
  
@@ -466,7 +466,7 @@ He are a benchmarck done on a server with 8 CPUs and a single file of 9.5GB.
            -j   | 1h41m18 | 50m25 | 25m39 | 15m58
            -J   | 1h41m18 | 54m28 | 41m16 | 34m45
  
-With 200 log files of 10MB each and a total og 2GB the results are slightly
+With 200 log files of 10MB each and a total of 2GB the results are slightly
  different:
  
           Option | 1 CPU | 2 CPU | 4 CPU | 8 CPU
@@ -474,49 +474,50 @@ different:
             -j   | 20m15 |  9m56 |  5m20 | 4m20
             -J   | 20m15 |  9m49 |  5m00 | 2m40
  
-So it is recommanded to use -j unless you have hundred of small log file
+So it is recommended to use -j unless you have hundreds of small log files
  and can use at least 8 CPUs.
  
-IMPORTANT: when you are using parallel parsing pgbadger will generate a lot
-of temporary files in the /tmp directory and will remove them at end, so do
-not remove those files unless pgbadger is not running. They are all named
-with the following template tmp_pgbadgerXXXX.bin so they can be easily identified.
+IMPORTANT: when you are using parallel parsing pgBadger will generate a
+lot of temporary files in the /tmp directory and will remove them at the
+end, so do not remove those files unless pgBadger is not running. They are
+all named with the following template tmp_pgbadgerXXXX.bin so they can be
+easily identified.
  
  =head1 INCREMENTAL REPORTS
  
-pgBadger include an automatic incremental report mode using option -I or
+pgBadger includes an automatic incremental report mode using option -I or
  --incremental. When running in this mode, pgBadger will generate one report
  per day and a cumulative report per week. Output is first done in binary
  format into the mandatory output directory (see option -O or --outdir),
  then in HTML format for daily and weekly reports with a main index file.
  
-The main index file will show a dropdown menu per week with a link to the week
-report and links to daily reports of this week.
+The main index file will show a dropdown menu per week with a link to each
+week's report and links to daily reports of each week.
  
-For example, if you run pgBadger as follow based on a daily rotated file:
+For example, if you run pgBadger as follows based on a daily rotated file:
  
      0 4 * * * /usr/bin/pgbadger -I -q /var/log/postgresql/postgresql.log.1 \
          -O /var/www/pg_reports/
  
  you will have all daily and weekly reports for the full running period.
  
-In this mode pgBagder will create an automatic incremental file into the
+In this mode pgBadger will create an automatic incremental file in the
  output directory, so you don't have to use the -l option unless you want
-to change the path of that file. This mean that you can run pgBadger in
-this mode each days on a log file rotated each week, it will not count
+to change the path of that file. This means that you can run pgBadger in
+this mode each day on a log file rotated each week, and it will not count
  the log entries twice.
  
  To save disk space you may want to use the -X or --extra-files command line
  option to force pgBadger to write javascript and css to separate files in
  the output directory. The resources will then be loaded using script and
-link tag.
+link tags.
  
  =head1 BINARY FORMAT
  
  Using the binary format it is possible to create custom incremental and
-cumulative reports. For example, if you want to refresh a pgbadger report
-each hour from a daily PostgreSQl log file, you can proceed by running each
-hour the following commands:
+cumulative reports. For example, if you want to refresh a pgBadger
+report each hour from a daily PostgreSQL log file, you can proceed by
+running each hour the following commands:
  
      pgbadger --last-parsed .pgbadger_last_state_file -o sunday/hourX.bin /var/log/pgsql/postgresql-Sun.log
  
@@ -525,8 +526,9 @@ report from that binary file:
  
      pgbadger sunday/*.bin
  
-Or an other example, if you have one log file per hour and you want a reports to be
-rebuild each time the log file is switched. Proceed as follow:
+Or as another example, if you generate one log file per hour and you want
+reports to be rebuilt each time the log file is rotated, proceed as
+follows:
  
         pgbadger -o day1/hour01.bin /var/log/pgsql/pglog/postgresql-2012-03-23_10.log
         pgbadger -o day1/hour02.bin /var/log/pgsql/pglog/postgresql-2012-03-23_11.log
@@ -538,7 +540,7 @@ is generated, just do the following:
  
         pgbadger -o day1_report.html day1/*.bin
  
-Adjust the commands following your needs.
+Adjust the commands to suit your particular needs.
  
  =head1 JSON FORMAT