From: Darold Gilles
Date: Sat, 25 Jul 2015 17:31:46 +0000 (+0200)
Subject: Apply README changes from Josh Kupershmidt to source file doc/pgBadger.pod, README...
X-Git-Tag: v7.2~23
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=f94bf64ac3aa6d8bacd8a5730705a3c0be31dcfa;p=pgbadger

Apply README changes from Josh Kupershmidt to source file doc/pgBadger.pod,
README is generated using: pod2text doc/pgBadger.pod > README
---

diff --git a/doc/pgBadger.pod b/doc/pgBadger.pod
index 47056db..36532f1 100644
--- a/doc/pgBadger.pod
+++ b/doc/pgBadger.pod
@@ -13,8 +13,8 @@ Arguments:
     logfile can be a single log file, a list of files, or a shell command
     returning a list of files. If you want to pass log content from stdin
     use - as filename. Note that input from stdin will not work with csvlog.
-    You can also use a file containing a list of log file to parse, see -L
-    command line option.
+    You can also use a file containing a list of log files to parse, see
+    the -L command line option.

 Options:

@@ -86,7 +86,7 @@ Options:
     -w | --watch-mode      : only report errors just like logwatch could do.
     -x | --extension       : output format. Values: text, html, bin, json or
                              tsung. Default: html
-    -X | --extra-files     : in incremetal mode allow pgbadger to write CSS and
+    -X | --extra-files     : in incremental mode allow pgBadger to write CSS and
                              JS files in the output directory as separate files.
     -z | --zcat exec_path  : set the full path to the zcat program. Use it if
                              zcat or bzcat or unzip is not in your path.
@@ -123,14 +123,14 @@ Options:
                              You can use this option multiple times.
     --exclude-appname name : exclude entries for the specified application name
                              from report. Example: "pg_dump".
-    --exclude-line regex   : pgbadger will start to exclude any log entry that
+    --exclude-line regex   : pgBadger will start to exclude any log entry that
                              will match the given regex. Can be used multiple times.
     --anonymize            : obscure all literals in queries, useful to hide
                              confidential data.
     --noreport             : prevent pgbadger from creating reports in incremental
                              mode.
-    --log-duration         : force pgbadger to associate log entries generated
+    --log-duration         : force pgBadger to associate log entries generated
                              by both log_duration = on and log_statement = 'all'
     --enable-checksum      : used to add an md5 sum under each query report.
@@ -190,39 +190,39 @@ Or better, use the auto-generated incremental reports:

 will generate a report per day and per week.

-In incremental mode, you can also specify the number of week to keep in the
+In incremental mode, you can also specify the number of weeks to keep in
 reports:

     /usr/bin/pgbadger --retention 2 -I -q /var/log/postgresql/postgresql.log.1 -O /var/www/pg_reports/

-If you have a pg_dump at 23:00 and 13:00 each day during half an hour, you can
-use pgbadger as follow to exclude these period from the report:
+If you have a pg_dump at 23:00 and 13:00 each day lasting half an hour,
+you can use pgBadger as follows to exclude those periods from the report:

     pgbadger --exclude-time "2013-09-.* (23|13):.*" postgresql.log

-This will help avoid having COPY statements, as generated by pg_dump, on top of
-the list of slowest queries. You can also use --exclude-appname "pg_dump" to
-solve this problem in a simpler way.
+This will help avoid having COPY statements, as generated by pg_dump, at
+the top of the list of slowest queries. You can also use --exclude-appname
+"pg_dump" to solve this problem in a simpler way.
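
For instance, a minimal sketch of the simpler alternative (the log file path
here is hypothetical):

    pgbadger --exclude-appname "pg_dump" /var/log/postgresql/postgresql.log
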
=head1 DESCRIPTION

-pgBadger is a PostgreSQL log analyzer build for speed with fully detailed
-reports from your PostgreSQL log file. It's a single and small Perl script
-that outperform any other PostgreSQL log analyzer.
+pgBadger is a PostgreSQL log analyzer built for speed with fully detailed
+reports from your PostgreSQL log file. It's a single and small Perl script
+that outperforms any other PostgreSQL log analyzer.

-It is written in pure Perl language and uses a javascript library (flotr2)
-to draw graphs so that you don't need to install any additional Perl modules
-or other packages. Furthermore, this library gives us more features such as
-zooming. pgBadger also uses the Bootstrap javascript library and the FontAwesome
-webfont for better design. Everything is embedded.
+It is written in pure Perl and uses a javascript library (flotr2) to draw
+graphs so that you don't need to install any additional Perl modules or
+other packages. Furthermore, this library gives us more features such
+as zooming. pgBadger also uses the Bootstrap javascript library and
+the FontAwesome webfont for better design. Everything is embedded.

 pgBadger is able to autodetect your log file format (syslog, stderr or csvlog).
-It is designed to parse huge log files as well as gzip compressed file. See a
+It is designed to parse huge log files as well as gzip compressed files. See a
 complete list of features below. Supported compressed formats are gzip, bzip2
-and xz. For the last one you must have a xz version upper than 5.05 that support
-the --robot option.
+and xz. For the xz format you must have an xz version newer than 5.05 that
+supports the --robot option.

 All charts are zoomable and can be saved as PNG images.

@@ -232,8 +232,8 @@ report using command line options.

 pgBadger supports any custom format set into the log_line_prefix directive of
 your postgresql.conf file as long as it at least specifies the %t and %p patterns.

-pgBadger allow parallel processing on a single log file and multiple files
-through the use of the -j option and the number of CPUs as value.
+pgBadger allows parallel processing of a single log file or multiple
+files through the use of the -j option specifying the number of CPUs.

 If you want to save system performance you can also use log_duration instead of
 log_min_duration_statement to have reports on duration and number of queries only.

@@ -258,8 +258,8 @@ pgBadger reports everything about your SQL queries:

     Queries generating the most cancellation.
     Queries most cancelled.

-The following reports are also available with hourly charts divide by periods of
-five minutes:
+The following reports are also available with hourly charts divided into
+periods of five minutes:

     SQL queries statistics.
     Temporary file statistics.
@@ -268,7 +268,7 @@ five minutes:
     Cancelled queries.
     Error events (panic, fatal, error and warning).

-There's also some pie reports of distribution about:
+There are also some pie charts about distribution of:

     Locks statistics.
     Queries by type (select/insert/update/delete).
@@ -283,23 +283,23 @@ highlighted and beautified automatically.

 You can also have incremental reports with one report per day and a cumulative
 report per week. Two multiprocess modes are available to speed up log parsing,
-one using one core per log file, and the second to use multiple core to parse
-a single file. Both modes can be combined.
+one using one core per log file, and the second using multiple cores to parse
+a single file. These modes can be combined.

 Histogram granularity can be adjusted using the -A command line option. By default
-they will report the mean of each top queries/error occuring per hour, but you can
+they will report the mean of each top queries/errors occurring per hour, but you can
 specify the granularity down to the minute.

 pgBadger can also be used in a central place to parse remote log files using a
-password less SSH connection. This mode can be used with compressed files and
-in mode multiprocess per file (-J) but can not be used with CSV log format.
+passwordless SSH connection. This mode can be used with compressed files and in
+the multiprocess per file mode (-J) but cannot be used with the CSV log format.

 =head1 REQUIREMENT

 pgBadger comes as a single Perl script - you do not need anything other than a modern
 Perl distribution. Charts are rendered using a Javascript library so you don't need
-anything. Your browser will do all the work.
+anything other than a web browser. Your browser will do all the work.

 If you plan to parse PostgreSQL CSV log files you might need some Perl Modules:

@@ -365,10 +365,10 @@ You must first enable SQL query logging to have something to parse:

     log_min_duration_statement = 0

-Here every statement will be logged, on busy server you may want to increase
-this value to only log queries with a higher duration time. Note that if you
-have log_statement set to 'all' nothing will be logged through directive
-log_min_duration_statement. See next chapter for more information.
+Here every statement will be logged; on a busy server you may want to increase
+this value to only log queries with a longer duration. Note that if you have
+log_statement set to 'all' nothing will be logged through the log_min_duration_statement
+directive. See the next chapter for more information.

 With 'stderr' log format, log_line_prefix must be at least:

@@ -400,7 +400,7 @@ You need to enable other parameters in postgresql.conf to get more information f

     log_temp_files = 0
     log_autovacuum_min_duration = 0

-Do not enable log_statement as their log format will not be parsed by pgBadger.
+Do not enable log_statement as its log format will not be parsed by pgBadger.

 Of course your log messages should be in English without locale support:

@@ -409,14 +409,14 @@ Of course your log messages should be in English without locale support:

 but this is not only recommended by pgBadger.

 Note: the session line [%l-1] is just used to match the default prefix for "stderr".
-The -1 has no real purpose and basically is not used in Pgbadger statistics / graphs.
-You can safely removed them from the log_line_prefix but you will need to set the
---prefix command line option.
+The -1 has no real purpose and basically is not used in pgBadger statistics / graphs.
+You can safely remove it from the log_line_prefix but you will need to set the
+--prefix command line option accordingly.

=head1 log_min_duration_statement, log_duration and log_statement

-If you want full statistics reports you must set log_min_duration_statement
-to 0 or more milliseconds.
+If you want the query statistics to include the actual query strings,
+you must set log_min_duration_statement to 0 or more milliseconds.

 If you just want to report duration and number of queries and don't want all
 details about queries, set log_min_duration_statement to -1 to disable it and

@@ -434,7 +434,7 @@ set to 'all' nothing will be logged with log_line_prefix.

 To enable parallel processing you just have to use the -j N option where N is
 the number of cores you want to use.
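
For example, a minimal sketch on an 8-core machine (the log file path is
hypothetical):

    pgbadger -j 8 /var/log/postgresql/postgresql.log
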
-pgbadger will then proceed as follow:
+pgBadger will then proceed as follows:

     for each log file
         chunk size = int(file size / N)

@@ -447,17 +447,17 @@ pgbadger will then proceed as follow:

 All binary temporary files generated will then be read and loaded into
 memory to build the html output.

-With that method, at start/end of chunks pgbadger may truncate or omit a
-maximum of N queries perl log file which is an insignificant gap if you have
+With that method, at start/end of chunks pgBadger may truncate or omit a
+maximum of N queries per log file which is an insignificant gap if you have
 millions of queries in your log file. The chance that the query that you were
-looking for is loose is near 0, this is why I think this gap is livable. Most
+looking for is lost is near 0; this is why I think this gap is livable. Most
 of the time the query is counted twice but truncated.

-When you have lot of small log files and lot of CPUs it is speedier to dedicate
+When you have many small log files and many CPUs it is speedier to dedicate
 one core to one log file at a time. To enable this behavior you have to use
 option -J N instead. With 200 log files of 10MB each the use of the -J option
-start being really interesting with 8 Cores. Using this method you will be sure
-to not loose any queries in the reports.
+starts being really interesting with 8 cores. Using this method you will be
+sure not to lose any queries in the reports.

 Here is a benchmark done on a server with 8 CPUs and a single file of 9.5GB.

     Option  |  1 CPU  | 2 CPU | 4 CPU | 8 CPU
    ---------+---------+-------+-------+------
     -j      | 1h41m18 | 50m25 | 25m39 | 15m58
     -J      | 1h41m18 | 54m28 | 41m16 | 34m45

-With 200 log files of 10MB each and a total og 2GB the results are slightly
+With 200 log files of 10MB each and a total of 2GB the results are slightly
 different:

     Option  |  1 CPU  | 2 CPU | 4 CPU | 8 CPU
@@ -474,49 +474,50 @@ different:
     -j      | 20m15   |  9m56 |  5m20 |  4m20
     -J      | 20m15   |  9m49 |  5m00 |  2m40

-So it is recommanded to use -j unless you have hundred of small log file
+So it is recommended to use -j unless you have hundreds of small log files
 and can use at least 8 CPUs.

-IMPORTANT: when you are using parallel parsing pgbadger will generate a lot
-of temporary files in the /tmp directory and will remove them at end, so do
-not remove those files unless pgbadger is not running. They are all named
-with the following template tmp_pgbadgerXXXX.bin so they can be easily identified.
+IMPORTANT: when you are using parallel parsing pgBadger will generate a
+lot of temporary files in the /tmp directory and will remove them at the
+end, so do not remove those files while pgBadger is running. They are
+all named with the following template tmp_pgbadgerXXXX.bin so they can be
+easily identified.

=head1 INCREMENTAL REPORTS

-pgBadger include an automatic incremental report mode using option -I or
+pgBadger includes an automatic incremental report mode using option -I or
 --incremental. When running in this mode, pgBadger will generate one report
 per day and a cumulative report per week. Output is first done in binary
 format into the mandatory output directory (see option -O or --outdir),
 then in HTML format for daily and weekly reports with a main index file.

-The main index file will show a dropdown menu per week with a link to the week
-report and links to daily reports of this week.
+The main index file will show a dropdown menu per week with a link to each
+week's report and links to daily reports of each week.
-For example, if you run pgBadger as follow based on a daily rotated file:
+For example, if you run pgBadger as follows based on a daily rotated file:

     0 4 * * * /usr/bin/pgbadger -I -q /var/log/postgresql/postgresql.log.1 \
        -O /var/www/pg_reports/

 you will have all daily and weekly reports for the full running period.

-In this mode pgBagder will create an automatic incremental file into the
+In this mode pgBadger will create an automatic incremental file in the
 output directory, so you don't have to use the -l option unless you want
-to change the path of that file. This mean that you can run pgBadger in
-this mode each days on a log file rotated each week, it will not count
+to change the path of that file. This means that you can run pgBadger in
+this mode each day on a log file rotated each week, and it will not count
 the log entries twice.

 To save disk space you may want to use the -X or --extra-files command line
 option to force pgBadger to write javascript and css to separate files in
 the output directory. The resources will then be loaded using script and
-link tag.
+link tags.

=head1 BINARY FORMAT

 Using the binary format it is possible to create custom incremental and
-cumulative reports. For example, if you want to refresh a pgbadger report
-each hour from a daily PostgreSQl log file, you can proceed by running each
-hour the following commands:
+cumulative reports. For example, if you want to refresh a pgBadger
+report each hour from a daily PostgreSQL log file, you can proceed by
+running the following commands each hour:

     pgbadger --last-parsed .pgbadger_last_state_file -o sunday/hourX.bin /var/log/pgsql/postgresql-Sun.log

@@ -525,8 +526,9 @@ report from that binary file:

     pgbadger sunday/*.bin

-Or an other example, if you have one log file per hour and you want a reports to be
-rebuild each time the log file is switched. Proceed as follow:
+Or as another example, if you generate one log file per hour and you want
+reports to be rebuilt each time the log file is rotated, proceed as
+follows:

     pgbadger -o day1/hour01.bin /var/log/pgsql/pglog/postgresql-2012-03-23_10.log
     pgbadger -o day1/hour02.bin /var/log/pgsql/pglog/postgresql-2012-03-23_11.log

@@ -538,7 +540,7 @@ is generated, just do the following:

     pgbadger -o day1_report.html day1/*.bin

-Adjust the commands following your needs.
+Adjust the commands to suit your particular needs.

=head1 JSON FORMAT
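
As a minimal sketch, a JSON report can be produced with the -x option
documented above (the paths here are hypothetical):

    pgbadger -x json -o report.json /var/log/postgresql/postgresql.log

The resulting file should be easy to consume from other tools and languages.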