-d | --dbname database : only report on entries for the given database.
-e | --end datetime : end date/time for the data to be parsed in log.
-f | --format logtype : possible values: syslog,stderr,csv. Default: stderr
- -G | --nograph : disable graphs on HTML output. Enable by default.
+ -G | --nograph : disable graphs on HTML output. Enabled by default.
-h | --help : show this message and exit.
- -i | --ident name : programname used as syslog ident. Default: postgres
+ -i | --ident name : program name used as syslog ident. Default: postgres
-I | --incremental : use incremental mode, reports will be generated by
days in a separate directory, --outdir must be set.
- -j | --jobs number : number of jobs to run on parallel on each log file.
+ -j | --jobs number : number of jobs to run in parallel on each log file.
Default is 1, run as single process.
- -J | --Jobs number : number of log file to parse in parallel. Default
+ -J | --Jobs number : number of log files to parse in parallel. Default
is 1, run as single process.
-l | --last-parsed file: allow incremental log parsing by registering the
last datetime and line parsed. Useful if you want
perl pgbadger --prefix 'user=%u,db=%d,client=%h,appname=%a' \
/pglog/postgresql-2012-08-21*
- Use my 8 CPUs to parse my 10GB file faster, really faster
+ Use my 8 CPUs to parse my 10GB file faster, much faster
perl pgbadger -j 8 /pglog/postgresql-9.1-main.log
will generate a report per day and per week in the given output
directory.
- If you have a pg_dump at 23:00 and 13:00 each day during half an hour,
- you can use pgbadger as follow to exclude these periods from the report:
+ If you have pg_dumps scheduled at 23:00 and 13:00 every day, taking
+ less than an hour each, you can use pgBadger as follows to exclude these
+ periods from the report:
pgbadger --exclude-time "2013-09-.* (23|13):.*" postgresql.log
- This will help to not have all COPY order on top of slowest queries. You
- can also use --exclude-appname "pg_dump" to solve this problem in a more
- simple way.
+ This will avoid having these COPY queries crowd out your other slowest
+ queries. You can also use --exclude-appname "pg_dump" to solve this
+ problem more simply.
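+ For example (the log file path here is illustrative, and this assumes
+ the dump job reports itself as "pg_dump"), the simpler variant is just:
+ pgbadger --exclude-appname "pg_dump" /var/log/postgresql/postgresql.log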
DESCRIPTION
- pgBadger is a PostgreSQL log analyzer build for speed with fully
+ pgBadger is a PostgreSQL log analyzer built for speed with fully
- detailed reports from your PostgreSQL log file. It's a single and small
+ detailed reports from your PostgreSQL log file. It's a small standalone
- Perl script that outperform any other PostgreSQL log analyzer.
+ Perl script that outperforms any other PostgreSQL log analyzer.
- It is written in pure Perl language and uses a javascript library
+ It is written in pure Perl and uses a javascript library
(flotr2) to draw graphs so that you don't need to install any additional
Perl modules or other packages. Furthermore, this library gives us more
features such as zooming. pgBadger also uses the Bootstrap javascript
You can also limit pgBadger to only report errors or remove any part of
the report using command line options.
- pgBadger supports any custom format set into log_line_prefix of your
- postgresql.conf file provide that you use the %t, %p and %l patterns.
+ pgBadger supports any custom format set in the log_line_prefix of your
+ postgresql.conf file provided that you use the %t, %p and %l patterns.
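+ For instance (the exact layout below is only an illustration), a
+ postgresql.conf setting such as:
+ log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,client=%h,appname=%a '
+ can be parsed by passing the same layout to the --prefix option:
+ pgbadger --prefix '%t [%p]: [%l-1] user=%u,db=%d,client=%h,appname=%a' \
+ /pglog/postgresql.log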
- pgBadger allow parallel processing on a single log file and multiple
- files through the use of the -j option and the number of CPUs as value.
+ pgBadger allows parallel processing of a single log file or of multiple
+ log files via the -j option, with the number of CPUs as its value.
If you want to save system performance you can also use log_duration
instead of log_min_duration_statement to have reports on duration and
The most frequent errors.
Histogram of query times.
- The following reports are also available with hourly charts divide by
+ The following reports are also available with hourly charts divided into
periods of five minutes:
SQL queries statistics.
Checkpoints statistics.
Autovacuum and autoanalyze statistics.
- There's also some pie reports of distribution about:
+ There are also some pie charts showing the distribution of:
Locks statistics.
- ueries by type (select/insert/update/delete).
- Distribution of queries type per database/application
+ Queries by type (select/insert/update/delete).
+ Distribution of query types per database/application.
Sessions per database/user/client.
Connections per database/user/client.
Autovacuum and autoanalyze per table.
cumulative report per week.
Histogram granularity can be adjusted using the -A command line option.
- By default they will report the mean of each top queries/error occuring
+ By default they will report the mean of each top query/error occurring
per hour, but you can specify the granularity down to the minute.
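+ For example, assuming the option value is a number of minutes, a
+ per-minute histogram could be requested like this (path illustrative):
+ pgbadger -A 1 /var/log/postgresql/postgresql.log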
REQUIREMENT
format you don't need to install it.
- Compressed log file format is autodetected from the file exension. If
+ Compressed log file format is autodetected from the file extension. If
- pgBadger find a gz extension it will use the zcat utility, with a bz2
+ pgBadger finds a gz extension it will use the zcat utility, with a bz2
extension it will use bzcat and if the file extension is zip then the
unzip utility will be used.
--zcat="C:\tools\unzip -p"
By default pgBadger will use the zcat, bzcat and unzip utilities
- following the file extension. If you use the default autodetection
- compress format you can mixed gz, bz2 or zip files. Specifying a custom
- value to --zcat option will remove this feature of mixed compressed
- format.
+ based on the file extension. If you use the default autodetection
+ of compression format you can mix gz, bz2 and zip files. Specifying a custom
+ value for the --zcat option will disable this automatic detection of
+ mixed compression formats.
- Note that multiprocessing can not be used with compressed files or CSV
- files as well as under Windows platform.
+ Note that multiprocessing cannot be used with compressed files, CSV
+ files, or on the Windows platform.
INSTALLATION
- Download the tarball from github and unpack the archive as follow:
+ Download the tarball from github and unpack the archive as follows:
tar xzf pgbadger-4.x.tar.gz
cd pgbadger-4.x/
To enable parallel processing you just have to use the -j N option where
N is the number of cores you want to use.
- pgbadger will then proceed as follow:
+ pgBadger will then proceed as follows:
for each log file
chunk size = int(file size / N)
look at start/end offsets of these chunks
fork N processes and seek to the start offset of each chunk
- each process will terminate when the parser reach the end offset
+ each process will terminate when the parser reaches the end offset
of its chunk
- each process write stats into a binary temporary file
- wait for all children has terminated
+ each process writes stats into a binary temporary file
+ wait for all child processes to terminate
All binary temporary files generated will then be read and loaded into
memory to build the html output.
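+ As an illustration only (this is not pgBadger's source code), a minimal
+ Perl sketch of this chunk-per-worker strategy could look like the
+ following; the file names are placeholders and the only "statistic"
+ collected is a line count:
+ #!/usr/bin/perl
+ # Sketch of chunked parallel parsing (not pgBadger's implementation):
+ # split one log file into N byte ranges, fork one worker per range,
+ # let each worker write its partial result to a temporary file, then
+ # merge all partial results in the parent.
+ use strict;
+ use warnings;
+ my $logfile = shift // 'postgresql.log';   # illustrative path
+ my $nproc   = shift // 4;                  # the equivalent of -j N
+ my $size    = -s $logfile or die "cannot stat $logfile\n";
+ my $chunk   = int($size / $nproc);
+ my (@pids, @tmpfiles);
+ for my $i (0 .. $nproc - 1) {
+     # start/end byte offsets of this worker's chunk
+     my $start = $i * $chunk;
+     my $end   = ($i == $nproc - 1) ? $size : ($i + 1) * $chunk;
+     my $tmp   = "/tmp/tmp_sketch_$i.bin";  # placeholder temp file name
+     push @tmpfiles, $tmp;
+     my $pid = fork();
+     die "fork failed: $!\n" unless defined $pid;
+     if ($pid == 0) {
+         # child process: parse only the byte range [$start, $end)
+         open my $log, '<', $logfile or die "$logfile: $!\n";
+         seek $log, $start, 0;
+         <$log> if $start > 0;              # skip the partial line at the boundary
+         my $count = 0;
+         while (tell($log) < $end) {
+             defined(my $line = <$log>) or last;
+             $count++;                      # a real parser would accumulate stats from $line
+         }
+         open my $out, '>', $tmp or die "$tmp: $!\n";
+         print {$out} "$count\n";           # write the partial result
+         close $out;
+         exit 0;                            # stop once the end offset is reached
+     }
+     push @pids, $pid;
+ }
+ waitpid $_, 0 for @pids;                   # wait for all child processes to terminate
+ # merge the per-chunk partial results (here just a total line count)
+ my $total = 0;
+ for my $tmp (@tmpfiles) {
+     open my $in, '<', $tmp or next;
+     chomp(my $n = <$in> // 0);
+     $total += $n;
+     close $in;
+     unlink $tmp;
+ }
+ print "parsed ~$total lines of $logfile using $nproc workers\n";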
- With that method, at start/end of chunks pgbadger may truncate or omit a
- maximum of N queries perl log file which is an insignificant gap if you
+ With this method, at the start/end of chunks pgBadger may truncate or omit a
+ maximum of N queries per log file which is an insignificant gap if you
have millions of queries in your log file. The chance that the query
- that you were looking for is loose is near 0, this is why I think this
+ that you were looking for is missing is near 0, which is why I think this
gap is livable. Most of the time the query is counted twice but
truncated.
- When you have lot of small log files and lot of CPUs it is speedier to
+ When you have many small log files and many CPUs it is faster to
dedicate one core to one log file at a time. To enable this behavior you
have to use option -J N instead. With 200 log files of 10MB each the use
- of the -J option start being really interesting with 8 Cores. Using this
- method you will be sure to not loose any queries in the reports.
+ of the -J option starts to pay off with 8 cores. Using this
+ method you will be sure to not lose any queries in the reports.
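+ For example, with 8 cores and many small rotated files, something like
+ the following (paths illustrative) dedicates one core to one file at a
+ time:
+ pgbadger -J 8 -o report.html /pglog/postgresql-2012-08-*.log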
- He are a benchmarck done on a server with 8 CPUs and a single file of
+ Here are benchmarks performed on a server with 8 CPUs and a single file of
9.5GB.
Option | 1 CPU | 2 CPU | 4 CPU | 8 CPU
-j | 1h41m18 | 50m25 | 25m39 | 15m58
-J | 1h41m18 | 54m28 | 41m16 | 34m45
- With 200 log files of 10MB each and a total og 2GB the results are
+ With 200 log files of 10MB each and a total of 2GB the results are
slightly different:
Option | 1 CPU | 2 CPU | 4 CPU | 8 CPU
-j | 20m15 | 9m56 | 5m20 | 4m20
-J | 20m15 | 9m49 | 5m00 | 2m40
- So it is recommanded to use -j unless you have hundred of small log file
+ So it is recommended to use -j unless you have hundreds of small log files
and can use at least 8 CPUs.
- IMPORTANT: when you are using parallel parsing pgbadger will generate a
+ IMPORTANT: when you are using parallel parsing pgBadger will generate a
lot of temporary files in the /tmp directory and will remove them at
- end, so do not remove those files unless pgbadger is not running. They
+ the end, so do not remove those files while pgBadger is still running. They
are all named with the following template tmp_pgbadgerXXXX.bin so they
can be easily identified.
INCREMENTAL REPORTS
- pgBadger include an automatic incremental report mode using option -I or
+ pgBadger includes an automatic incremental report mode using option -I or
--incremental. When running in this mode, pgBadger will generate one
report per day and a cumulative report per week. Output is first done in
binary format into the mandatory output directory (see option -O or
index file.
The main index file will show a dropdown menu per week with a link to
- the week report and links to daily reports of this week.
+ the week's report and links to daily reports of the week.
- For example, if you run pgBadger as follow based on a daily rotated
+ For example, if you run pgBadger as follows based on a daily rotated
file:
0 4 * * * /usr/bin/pgbadger -I -q /var/log/postgresql/postgresql.log.1 \
- In this mode pgBagder will create an automatic incremental file into the
+ In this mode pgBadger will create an automatic incremental file in the
output directory, so you don't have to use the -l option unless you want
- to change the path of that file. This mean that you can run pgBadger in
- this mode each days on a log file rotated each week, it will not count
+ to change the path of that file. This means that you can run pgBadger in
+ this mode each day on a log file rotated each week, and it will not count
the log entries twice.
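+ For example (paths illustrative), a daily run over a weekly rotated log
+ could be as simple as:
+ pgbadger -I -O /var/www/pgbadger /var/log/postgresql/postgresql.log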
BINARY FORMAT
Using the binary format it is possible to create custom incremental and
- cumulative reports. For example, if you want to refresh a pgbadger
- report each hour from a daily PostgreSQl log file, you can proceed by
- running each hour the following commands:
+ cumulative reports. For example, if you want to refresh a pgBadger
+ report each hour from a daily PostgreSQL log file, you can
+ run the following command every hour:
- pgbadder --last-parsed .pgbadger_last_state_file -o sunday/hourX.bin /var/log/pgsql/postgresql-Sun.log
+ pgbadger --last-parsed .pgbadger_last_state_file -o sunday/hourX.bin /var/log/pgsql/postgresql-Sun.log
to generate the incremental data files in binary format. And to generate
- the fresh HTML report from that binary file:
+ a fresh HTML report from that binary file:
- pgbadder sunday/*.bin
+ pgbadger sunday/*.bin
- Or an other example, if you have one log file per hour and you want a
- reports to be rebuild each time the log file is switched. Proceed as
- follow:
+ Or, as another example, if you have one log file per hour and you want
+ reports to be rebuilt each time the log file is switched, proceed as
+ follows:
pgbadger -o day1/hour01.bin /var/log/pgsql/pglog/postgresql-2012-03-23_10.log
pgbadger -o day1/hour02.bin /var/log/pgsql/pglog/postgresql-2012-03-23_11.log
pgbadger -o day1_report.html day1/*.bin
- Adjust the commands following your needs.
+ Adjust the commands to suit your needs.
AUTHORS
pgBadger is an original work from Gilles Darold.