2 $PostgreSQL: pgsql/doc/src/sgml/maintenance.sgml,v 1.33 2004/04/05 03:02:03 momjian Exp $
5 <chapter id="maintenance">
6 <title>Routine Database Maintenance Tasks</title>
8 <indexterm zone="maintenance">
9 <primary>maintenance</primary>
13 There are a few routine maintenance chores that must be performed on
14 a regular basis to keep a <productname>PostgreSQL</productname>
15 server running smoothly. The tasks discussed here are repetitive
16 in nature and can easily be automated using standard Unix tools such
17 as <application>cron</application> scripts. But it is the database
18 administrator's responsibility to set up appropriate scripts, and to
19 check that they execute successfully.
23 One obvious maintenance task is creation of backup copies of the data on a
24 regular schedule. Without a recent backup, you have no chance of recovery
25 after a catastrophe (disk failure, fire, mistakenly dropping a critical
26 table, etc.). The backup and recovery mechanisms available in
27 <productname>PostgreSQL</productname> are discussed at length in
28 <xref linkend="backup">.
32 The other main category of maintenance task is periodic <quote>vacuuming</>
33 of the database. This activity is discussed in
34 <xref linkend="routine-vacuuming">.
38 Something else that might need periodic attention is log file management.
39 This is discussed in <xref linkend="logfile-maintenance">.
43 <productname>PostgreSQL</productname> is low-maintenance compared
44 to some other database management systems. Nonetheless,
45 appropriate attention to these tasks will go far towards ensuring a
46 pleasant and productive experience with the system.
49 <sect1 id="routine-vacuuming">
50 <title>Routine Vacuuming</title>
52 <indexterm zone="routine-vacuuming">
53 <primary>vacuum</primary>
57 <productname>PostgreSQL</productname>'s <command>VACUUM</> command
58 must be run on a regular basis for several reasons:
<simpara>To recover disk space occupied by updated or deleted
rows.</simpara>
67 <simpara>To update data statistics used by the
68 <productname>PostgreSQL</productname> query planner.</simpara>
72 <simpara>To protect against loss of very old data due to
73 <firstterm>transaction ID wraparound</>.</simpara>
77 The frequency and scope of the <command>VACUUM</> operations performed for each of
78 these reasons will vary depending on the needs of each site.
79 Therefore, database administrators must understand these issues and
80 develop an appropriate maintenance strategy. This section concentrates
81 on explaining the high-level issues; for details about command syntax
82 and so on, see the <command>VACUUM</> command reference page.
86 Beginning in <productname>PostgreSQL</productname> 7.2, the standard form
87 of <command>VACUUM</> can run in parallel with normal database operations
88 (selects, inserts, updates, deletes, but not changes to table definitions).
89 Routine vacuuming is therefore not nearly as intrusive as it was in prior
releases, and it is not as critical to try to schedule it at low-usage
times of day.
94 <sect2 id="vacuum-for-space-recovery">
95 <title>Recovering disk space</title>
97 <indexterm zone="vacuum-for-space-recovery">
98 <primary>disk space</primary>
102 In normal <productname>PostgreSQL</productname> operation, an
103 <command>UPDATE</> or <command>DELETE</> of a row does not
104 immediately remove the old version of the row.
105 This approach is necessary to gain the benefits of multiversion
106 concurrency control (see <xref linkend="mvcc">): the row version
107 must not be deleted while it is still potentially visible to other
108 transactions. But eventually, an outdated or deleted row version is no
109 longer of interest to any transaction. The space it occupies must be
110 reclaimed for reuse by new rows, to avoid infinite growth of disk
111 space requirements. This is done by running <command>VACUUM</>.
115 Clearly, a table that receives frequent updates or deletes will need
116 to be vacuumed more often than tables that are seldom updated. It
117 may be useful to set up periodic <application>cron</> tasks that
118 <command>VACUUM</command> only selected tables, skipping tables that are known not to
119 change often. This is only likely to be helpful if you have both
120 large heavily-updated tables and large seldom-updated tables --- the
extra cost of vacuuming a small table isn't enough to be worth
worrying about.
126 There are two variants of the <command>VACUUM</command>
127 command. The first form, known as <quote>lazy vacuum</quote> or
128 just <command>VACUUM</command>, marks expired data in tables and
129 indexes for future reuse; it does <emphasis>not</emphasis> attempt
130 to reclaim the space used by this expired data
131 immediately. Therefore, the table file is not shortened, and any
132 unused space in the file is not returned to the operating
133 system. This variant of <command>VACUUM</command> can be run
134 concurrently with normal database operations.
138 The second form is the <command>VACUUM FULL</command>
139 command. This uses a more aggressive algorithm for reclaiming the
140 space consumed by expired row versions. Any space that is freed by
141 <command>VACUUM FULL</command> is immediately returned to the
142 operating system. Unfortunately, this variant of the
143 <command>VACUUM</command> command acquires an exclusive lock on
144 each table while <command>VACUUM FULL</command> is processing
145 it. Therefore, frequently using <command>VACUUM FULL</command> can
have an extremely negative effect on the performance of concurrent
database queries.
151 The standard form of <command>VACUUM</> is best used with the goal
152 of maintaining a fairly level steady-state usage of disk space. If
153 you need to return disk space to the operating system you can use
154 <command>VACUUM FULL</> --- but what's the point of releasing disk
155 space that will only have to be allocated again soon? Moderately
156 frequent standard <command>VACUUM</> runs are a better approach
157 than infrequent <command>VACUUM FULL</> runs for maintaining
158 heavily-updated tables.
162 Recommended practice for most sites is to schedule a database-wide
163 <command>VACUUM</> once a day at a low-usage time of day,
164 supplemented by more frequent vacuuming of heavily-updated tables
165 if necessary. In fact, some installations with an extremely high
166 rate of data modification <command>VACUUM</command> some tables as
often as once every five minutes. (If you have multiple databases
168 in a cluster, don't forget to <command>VACUUM</command> each one;
169 the program <filename>vacuumdb</> may be helpful.)
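<para>
As an illustration, such a schedule could be driven from
<application>cron</>. The times, database name, and table name in
these entries are only assumptions to be adapted to your site:
<programlisting>
# nightly vacuum of every database in the cluster, at 3:00 AM
0 3 * * *   vacuumdb --all --quiet
# hourly vacuum of one heavily-updated table (hypothetical name)
0 * * * *   psql -c "VACUUM busy_table" mydb
</programlisting>
</para>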
173 <command>VACUUM FULL</> is recommended for cases where you know
174 you have deleted the majority of rows in a table, so that the
175 steady-state size of the table can be shrunk substantially with
176 <command>VACUUM FULL</>'s more aggressive approach. Use plain
177 <command>VACUUM</>, not <command>VACUUM FULL</>, for routine
178 vacuuming for space recovery.
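<para>
For example, after removing most of a table's rows (the table and
column names here are hypothetical), the table's disk footprint can be
shrunk with:
<programlisting>
DELETE FROM event_log WHERE logtime &lt; '2003-01-01';
VACUUM FULL event_log;
</programlisting>
</para>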
182 If you have a table whose contents are deleted on a periodic
183 basis, consider doing it with <command>TRUNCATE</command> rather
184 than using <command>DELETE</command> followed by
185 <command>VACUUM</command>. <command>TRUNCATE</command> removes the
entire content of the table immediately, without requiring a
187 subsequent <command>VACUUM</command> or <command>VACUUM
188 FULL</command> to reclaim the now-unused disk space.
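<para>
For instance, to empty a work table each night (the table name is
hypothetical), this:
<programlisting>
TRUNCATE TABLE daily_staging;
</programlisting>
is preferable to this:
<programlisting>
DELETE FROM daily_staging;
VACUUM FULL daily_staging;
</programlisting>
</para>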
192 <sect2 id="vacuum-for-statistics">
193 <title>Updating planner statistics</title>
195 <indexterm zone="vacuum-for-statistics">
196 <primary>statistics</primary>
197 <secondary>of the planner</secondary>
200 <indexterm zone="vacuum-for-statistics">
201 <primary>ANALYZE</primary>
205 The <productname>PostgreSQL</productname> query planner relies on
206 statistical information about the contents of tables in order to
207 generate good plans for queries. These statistics are gathered by
208 the <command>ANALYZE</> command, which can be invoked by itself or
209 as an optional step in <command>VACUUM</>. It is important to have
210 reasonably accurate statistics, otherwise poor choices of plans may
211 degrade database performance.
215 As with vacuuming for space recovery, frequent updates of statistics
216 are more useful for heavily-updated tables than for seldom-updated
217 ones. But even for a heavily-updated table, there may be no need for
218 statistics updates if the statistical distribution of the data is
219 not changing much. A simple rule of thumb is to think about how much
220 the minimum and maximum values of the columns in the table change.
221 For example, a <type>timestamp</type> column that contains the time
222 of row update will have a constantly-increasing maximum value as
223 rows are added and updated; such a column will probably need more
224 frequent statistics updates than, say, a column containing URLs for
225 pages accessed on a website. The URL column may receive changes just
226 as often, but the statistical distribution of its values probably
227 changes relatively slowly.
231 It is possible to run <command>ANALYZE</> on specific tables and even
232 just specific columns of a table, so the flexibility exists to update some
233 statistics more frequently than others if your application requires it.
234 In practice, however, the usefulness of this feature is doubtful.
235 Beginning in <productname>PostgreSQL</productname> 7.2,
236 <command>ANALYZE</> is a fairly fast operation even on large tables,
237 because it uses a statistical random sampling of the rows of a table
238 rather than reading every single row. So it's probably much simpler
239 to just run it over the whole database every so often.
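<para>
To illustrate the available granularity (the table and column names
here are hypothetical):
<programlisting>
ANALYZE;                        -- entire database
ANALYZE orders;                 -- a single table
ANALYZE orders (order_date);    -- a single column
</programlisting>
</para>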
244 Although per-column tweaking of <command>ANALYZE</> frequency may not be
245 very productive, you may well find it worthwhile to do per-column
246 adjustment of the level of detail of the statistics collected by
247 <command>ANALYZE</>. Columns that are heavily used in <literal>WHERE</> clauses
248 and have highly irregular data distributions may require a finer-grain
data histogram than other columns. See <command>ALTER TABLE SET
STATISTICS</>.
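<para>
For example, assuming a table <literal>orders</> whose
<literal>customer_id</> column has a highly irregular distribution
(both names are hypothetical), the statistics target for that one
column could be raised above the default with:
<programlisting>
ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 100;
</programlisting>
</para>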
255 Recommended practice for most sites is to schedule a database-wide
256 <command>ANALYZE</> once a day at a low-usage time of day; this can
257 usefully be combined with a nightly <command>VACUUM</>. However,
258 sites with relatively slowly changing table statistics may find that
this is overkill, and that less-frequent <command>ANALYZE</> runs
are sufficient.
264 <sect2 id="vacuum-for-wraparound">
265 <title>Preventing transaction ID wraparound failures</title>
267 <indexterm zone="vacuum-for-wraparound">
268 <primary>transaction ID</primary>
269 <secondary>wraparound</secondary>
273 <productname>PostgreSQL</productname>'s MVCC transaction semantics
274 depend on being able to compare transaction ID (<acronym>XID</>)
275 numbers: a row version with an insertion XID greater than the current
276 transaction's XID is <quote>in the future</> and should not be visible
277 to the current transaction. But since transaction IDs have limited size
278 (32 bits at this writing) a cluster that runs for a long time (more
279 than 4 billion transactions) will suffer <firstterm>transaction ID
280 wraparound</>: the XID counter wraps around to zero, and all of a sudden
281 transactions that were in the past appear to be in the future --- which
282 means their outputs become invisible. In short, catastrophic data loss.
(Actually the data is still there, but that's cold comfort if you can't
get at it.)
288 Prior to <productname>PostgreSQL</productname> 7.2, the only defense
289 against XID wraparound was to re-<command>initdb</> at least every 4
290 billion transactions. This of course was not very satisfactory for
291 high-traffic sites, so a better solution has been devised. The new
292 approach allows a server to remain up indefinitely, without
293 <command>initdb</> or any sort of restart. The price is this
294 maintenance requirement: <emphasis>every table in the database must
295 be vacuumed at least once every billion transactions</emphasis>.
299 In practice this isn't an onerous requirement, but since the
300 consequences of failing to meet it can be complete data loss (not
301 just wasted disk space or slow performance), some special provisions
302 have been made to help database administrators keep track of the
303 time since the last <command>VACUUM</>. The remainder of this
304 section gives the details.
308 The new approach to XID comparison distinguishes two special XIDs,
309 numbers 1 and 2 (<literal>BootstrapXID</> and
310 <literal>FrozenXID</>). These two XIDs are always considered older
311 than every normal XID. Normal XIDs (those greater than 2) are
312 compared using modulo-2<superscript>31</> arithmetic. This means
313 that for every normal XID, there are two billion XIDs that are
314 <quote>older</> and two billion that are <quote>newer</>; another
315 way to say it is that the normal XID space is circular with no
316 endpoint. Therefore, once a row version has been created with a particular
317 normal XID, the row version will appear to be <quote>in the past</> for
318 the next two billion transactions, no matter which normal XID we are
319 talking about. If the row version still exists after more than two billion
320 transactions, it will suddenly appear to be in the future. To
321 prevent data loss, old row versions must be reassigned the XID
322 <literal>FrozenXID</> sometime before they reach the
323 two-billion-transactions-old mark. Once they are assigned this
324 special XID, they will appear to be <quote>in the past</> to all
325 normal transactions regardless of wraparound issues, and so such
326 row versions will be good until deleted, no matter how long that is. This
327 reassignment of XID is handled by <command>VACUUM</>.
331 <command>VACUUM</>'s normal policy is to reassign <literal>FrozenXID</>
332 to any row version with a normal XID more than one billion transactions in the
333 past. This policy preserves the original insertion XID until it is not
334 likely to be of interest anymore. (In fact, most row versions will probably
335 live and die without ever being <quote>frozen</>.) With this policy,
336 the maximum safe interval between <command>VACUUM</> runs on any table
337 is exactly one billion transactions: if you wait longer, it's possible
338 that a row version that was not quite old enough to be reassigned last time
339 is now more than two billion transactions old and has wrapped around
340 into the future --- i.e., is lost to you. (Of course, it'll reappear
341 after another two billion transactions, but that's no help.)
345 Since periodic <command>VACUUM</> runs are needed anyway for the reasons
346 described earlier, it's unlikely that any table would not be vacuumed
347 for as long as a billion transactions. But to help administrators ensure
348 this constraint is met, <command>VACUUM</> stores transaction ID
349 statistics in the system table <literal>pg_database</>. In particular,
350 the <literal>datfrozenxid</> column of a database's
351 <literal>pg_database</> row is updated at the completion of any
352 database-wide <command>VACUUM</command> operation (i.e., <command>VACUUM</> that does not
353 name a specific table). The value stored in this field is the freeze
354 cutoff XID that was used by that <command>VACUUM</> command. All normal
355 XIDs older than this cutoff XID are guaranteed to have been replaced by
356 <literal>FrozenXID</> within that database. A convenient way to
357 examine this information is to execute the query
360 SELECT datname, age(datfrozenxid) FROM pg_database;
363 The <literal>age</> column measures the number of transactions from the
364 cutoff XID to the current transaction's XID.
368 With the standard freezing policy, the <literal>age</> column will start
369 at one billion for a freshly-vacuumed database. When the <literal>age</>
370 approaches two billion, the database must be vacuumed again to avoid
371 risk of wraparound failures. Recommended practice is to <command>VACUUM</command> each
372 database at least once every half-a-billion (500 million) transactions,
373 so as to provide plenty of safety margin. To help meet this rule,
374 each database-wide <command>VACUUM</> automatically delivers a warning
375 if there are any <literal>pg_database</> entries showing an
376 <literal>age</> of more than 1.5 billion transactions, for example:
380 WARNING: some databases have not been vacuumed in 1613770184 transactions
381 HINT: Better vacuum them within 533713463 transactions, or you may have a wraparound failure.
387 <command>VACUUM</> with the <command>FREEZE</> option uses a more
388 aggressive freezing policy: row versions are frozen if they are old enough
389 to be considered good by all open transactions. In particular, if a
390 <command>VACUUM FREEZE</> is performed in an otherwise-idle
391 database, it is guaranteed that <emphasis>all</> row versions in that
392 database will be frozen. Hence, as long as the database is not
393 modified in any way, it will not need subsequent vacuuming to avoid
394 transaction ID wraparound problems. This technique is used by
395 <command>initdb</> to prepare the <literal>template0</> database.
396 It should also be used to prepare any user-created databases that
397 are to be marked <literal>datallowconn</> = <literal>false</> in
398 <literal>pg_database</>, since there isn't any convenient way to
399 <command>VACUUM</command> a database that you can't connect to. Note that
400 <command>VACUUM</command>'s automatic warning message about
401 unvacuumed databases will ignore <literal>pg_database</> entries
402 with <literal>datallowconn</> = <literal>false</>, so as to avoid
403 giving false warnings about these databases; therefore it's up to
404 you to ensure that such databases are frozen correctly.
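<para>
A sketch of that procedure (the database name is hypothetical; issue
the commands as a superuser, with <command>VACUUM FREEZE</> run while
connected to the target database and no other sessions active):
<programlisting>
-- freeze every row version in the current database
VACUUM FREEZE;
-- then disallow future connections to it
UPDATE pg_database SET datallowconn = false WHERE datname = 'mymodule_db';
</programlisting>
</para>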
411 <sect1 id="routine-reindex">
412 <title>Routine Reindexing</title>
414 <indexterm zone="routine-reindex">
415 <primary>reindex</primary>
419 In some situations it is worthwhile to rebuild indexes periodically
420 with the <command>REINDEX</> command. (There is also
421 <filename>contrib/reindexdb</> which can reindex an entire database.)
422 However, <productname>PostgreSQL</> 7.4 has substantially reduced the need
423 for this activity compared to earlier releases.
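<para>
For example (the index and table names are hypothetical):
<programlisting>
REINDEX INDEX orders_pkey;     -- rebuild a single index
REINDEX TABLE orders;          -- rebuild all indexes of a table
</programlisting>
</para>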
428 <sect1 id="logfile-maintenance">
429 <title>Log File Maintenance</title>
431 <indexterm zone="logfile-maintenance">
432 <primary>server log</primary>
433 <secondary>log file maintenance</secondary>
437 It is a good idea to save the database server's log output
438 somewhere, rather than just routing it to <filename>/dev/null</>.
439 The log output is invaluable when it comes time to diagnose
440 problems. However, the log output tends to be voluminous
441 (especially at higher debug levels) and you won't want to save it
442 indefinitely. You need to <quote>rotate</> the log files so that
new log files are started and old ones removed after a reasonable
period of time.
448 If you simply direct the <systemitem>stderr</> of the <command>postmaster</command> into a
449 file, the only way to truncate the log file is to stop and restart
450 the <command>postmaster</command>. This may be OK if you are using
451 <productname>PostgreSQL</productname> in a development environment,
452 but few production servers would find this behavior acceptable.
456 The simplest production-grade approach to managing log output is to
457 send it all to <application>syslog</> and let
458 <application>syslog</> deal with file rotation. To do this, set the
configuration parameter <literal>log_destination</> to 'syslog' (to log to
460 <application>syslog</> only) in <filename>postgresql.conf</>. Then
461 you can send a <literal>SIGHUP</literal> signal to the
462 <application>syslog</> daemon whenever you want to force it to
463 start writing a new log file. If you want to automate log
464 rotation, the <application>logrotate</application> program can be
465 configured to work with log files from
466 <application>syslog</application>.
470 On many systems, however, <application>syslog</> is not very reliable,
471 particularly with large log messages; it may truncate or drop messages
just when you need them the most. Also, on <productname>Linux</>,
473 <application>syslog</> will sync each message to disk, yielding poor
474 performance. Use a <literal>-</> at the start of the file name
475 in the <application>syslog</> config file to disable this behavior.
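<para>
As a sketch, assuming the server logs through the default
<literal>LOCAL0</> facility (and with an example log file location),
the relevant configuration fragments might look like this:
<programlisting>
# postgresql.conf
log_destination = 'syslog'

# /etc/syslog.conf --- the leading "-" suppresses the per-message sync
local0.*    -/var/log/postgresql
</programlisting>
</para>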
479 You may find it more useful to pipe the
480 <systemitem>stderr</> of the <command>postmaster</> to some type of
481 log rotation program. If you start the server with
482 <command>pg_ctl</>, then the <systemitem>stderr</> of the <command>postmaster</command>
is already redirected to <systemitem>stdout</>, so you just need a
pipe command, for example:
pg_ctl start | rotatelogs /var/log/pgsql_log 86400
490 The <productname>PostgreSQL</> distribution doesn't include a
491 suitable log rotation program, but there are many available on the
Internet. For example, the <application>rotatelogs</application>
tool included in the <productname>Apache</productname> distribution
can be used with <productname>PostgreSQL</productname>.