granicus.if.org Git - postgresql/blob - doc/src/sgml/ddl.sgml

   1 <!-- $PostgreSQL: pgsql/doc/src/sgml/ddl.sgml,v 1.96 2010/08/23 02:43:25 tgl Exp $ -->
   2
   3 <chapter id="ddl">
   4  <title>Data Definition</title>
   5
   6  <para>
   7   This chapter covers how one creates the database structures that
   8   will hold one's data.  In a relational database, the raw data is
   9   stored in tables, so the majority of this chapter is devoted to
  10   explaining how tables are created and modified and what features are
  11   available to control what data is stored in the tables.
  12   Subsequently, we discuss how tables can be organized into
  13   schemas, and how privileges can be assigned to tables.  Finally,
  14   we will briefly look at other features that affect the data storage,
  15   such as inheritance, views, functions, and triggers.
  16  </para>
  17
  18  <sect1 id="ddl-basics">
  19   <title>Table Basics</title>
  20
  21   <indexterm zone="ddl-basics">
  22    <primary>table</primary>
  23   </indexterm>
  24
  25   <indexterm>
  26    <primary>row</primary>
  27   </indexterm>
  28
  29   <indexterm>
  30    <primary>column</primary>
  31   </indexterm>
  32
  33   <para>
  34    A table in a relational database is much like a table on paper: It
  35    consists of rows and columns.  The number and order of the columns
  36    is fixed, and each column has a name.  The number of rows is
  37    variable &mdash; it reflects how much data is stored at a given moment.
  38    SQL does not make any guarantees about the order of the rows in a
  39    table.  When a table is read, the rows will appear in an unspecified order,
  40    unless sorting is explicitly requested.  This is covered in <xref
  41    linkend="queries">.  Furthermore, SQL does not assign unique
  42    identifiers to rows, so it is possible to have several completely
  43    identical rows in a table.  This is a consequence of the
  44    mathematical model that underlies SQL but is usually not desirable.
  45    Later in this chapter we will see how to deal with this issue.
  46   </para>
  47
  48   <para>
  49    Each column has a data type.  The data type constrains the set of
  50    possible values that can be assigned to a column and assigns
  51    semantics to the data stored in the column so that it can be used
  52    for computations.  For instance, a column declared to be of a
  53    numerical type will not accept arbitrary text strings, and the data
  54    stored in such a column can be used for mathematical computations.
  55    By contrast, a column declared to be of a character string type
  56    will accept almost any kind of data but it does not lend itself to
  57    mathematical calculations, although other operations such as string
  58    concatenation are available.
  59   </para>
  60
  61   <para>
  62    <productname>PostgreSQL</productname> includes a sizable set of
  63    built-in data types that fit many applications.  Users can also
  64    define their own data types.  Most built-in data types have obvious
  65    names and semantics, so we defer a detailed explanation to <xref
  66    linkend="datatype">.  Some of the frequently used data types are
  67    <type>integer</type> for whole numbers, <type>numeric</type> for
  68    possibly fractional numbers, <type>text</type> for character
  69    strings, <type>date</type> for dates, <type>time</type> for
  70    time-of-day values, and <type>timestamp</type> for values
  71    containing both date and time.
  72   </para>
  73
  74   <indexterm>
  75    <primary>table</primary>
  76    <secondary>creating</secondary>
  77   </indexterm>
  78
  79   <para>
  80    To create a table, you use the aptly named <xref
  81    linkend="sql-createtable"> command.
  82    In this command you specify at least a name for the new table, the
  83    names of the columns and the data type of each column.  For
  84    example:
  85 <programlisting>
  86 CREATE TABLE my_first_table (
  87     first_column text,
  88     second_column integer
  89 );
  90 </programlisting>
  91    This creates a table named <literal>my_first_table</literal> with
  92    two columns.  The first column is named
  93    <literal>first_column</literal> and has a data type of
  94    <type>text</type>; the second column has the name
  95    <literal>second_column</literal> and the type <type>integer</type>.
  96    The table and column names follow the identifier syntax explained
  97    in <xref linkend="sql-syntax-identifiers">.  The type names are
  98    usually also identifiers, but there are some exceptions.  Note that the
  99    column list is comma-separated and surrounded by parentheses.
 100   </para>
 101
 102   <para>
 103    Of course, the previous example was heavily contrived.  Normally,
 104    you would give names to your tables and columns that convey what
 105    kind of data they store.  So let's look at a more realistic
 106    example:
 107 <programlisting>
 108 CREATE TABLE products (
 109     product_no integer,
 110     name text,
 111     price numeric
 112 );
 113 </programlisting>
 114    (The <type>numeric</type> type can store fractional components, as
 115    would be typical of monetary amounts.)
 116   </para>
 117
 118   <tip>
 119    <para>
 120     When you create many interrelated tables it is wise to choose a
 121     consistent naming pattern for the tables and columns.  For
 122     instance, there is a choice of using singular or plural nouns for
 123     table names, both of which are favored by some theorist or other.
 124    </para>
 125   </tip>
 126
 127   <para>
 128    There is a limit on how many columns a table can contain.
 129    Depending on the column types, it is between 250 and 1600.
 130    However, defining a table with anywhere near this many columns is
 131    highly unusual and often a questionable design.
 132   </para>
 133
 134   <indexterm>
 135    <primary>table</primary>
 136    <secondary>removing</secondary>
 137   </indexterm>
 138
 139   <para>
 140    If you no longer need a table, you can remove it using the <xref
 141    linkend="sql-droptable"> command.
 142    For example:
 143 <programlisting>
 144 DROP TABLE my_first_table;
 145 DROP TABLE products;
 146 </programlisting>
 147    Attempting to drop a table that does not exist is an error.
 148    Nevertheless, it is common in SQL script files to unconditionally
 149    try to drop each table before creating it, ignoring any error
 150    messages, so that the script works whether or not the table exists.
 151    (If you like, you can use the <literal>DROP TABLE IF EXISTS</> variant
 152    to avoid the error messages, but this is not standard SQL.)
 153   </para>
 154
 155   <para>
 156    If you need to modify a table that already exists, see <xref
 157    linkend="ddl-alter"> later in this chapter.
 158   </para>
 159
 160   <para>
 161    With the tools discussed so far you can create fully functional
 162    tables.  The remainder of this chapter is concerned with adding
 163    features to the table definition to ensure data integrity,
 164    security, or convenience.  If you are eager to fill your tables with
 165    data now you can skip ahead to <xref linkend="dml"> and read the
 166    rest of this chapter later.
 167   </para>
 168  </sect1>
 169
 170  <sect1 id="ddl-default">
 171   <title>Default Values</title>
 172
 173   <indexterm zone="ddl-default">
 174    <primary>default value</primary>
 175   </indexterm>
 176
 177   <para>
 178    A column can be assigned a default value.  When a new row is
 179    created and no values are specified for some of the columns, those
 180    columns will be filled with their respective default values.  A
 181    data manipulation command can also request explicitly that a column
 182    be set to its default value, without having to know what that value is.
 183    (Details about data manipulation commands are in <xref linkend="dml">.)
 184   </para>
 185
 186   <para>
 187    <indexterm><primary>null value</primary><secondary>default value</secondary></indexterm>
 188    If no default value is declared explicitly, the default value is the
 189    null value.  This usually makes sense because a null value can
 190    be considered to represent unknown data.
 191   </para>
 192
 193   <para>
 194    In a table definition, default values are listed after the column
 195    data type.  For example:
 196 <programlisting>
 197 CREATE TABLE products (
 198     product_no integer,
 199     name text,
 200     price numeric <emphasis>DEFAULT 9.99</emphasis>
 201 );
 202 </programlisting>
 203   </para>
 204
 205   <para>
 206    The default value can be an expression, which will be
 207    evaluated whenever the default value is inserted
 208    (<emphasis>not</emphasis> when the table is created).  A common example
 209    is for a <type>timestamp</type> column to have a default of <literal>CURRENT_TIMESTAMP</>,
 210    so that it gets set to the time of row insertion.  Another common
 211    example is generating a <quote>serial number</> for each row.
 212    In <productname>PostgreSQL</productname> this is typically done by
 213    something like:
 214 <programlisting>
 215 CREATE TABLE products (
 216     product_no integer <emphasis>DEFAULT nextval('products_product_no_seq')</emphasis>,
 217     ...
 218 );
 219 </programlisting>
 220    where the <literal>nextval()</> function supplies successive values
 221    from a <firstterm>sequence object</> (see <xref
 222    linkend="functions-sequence">). This arrangement is sufficiently common
 223    that there's a special shorthand for it:
 224 <programlisting>
 225 CREATE TABLE products (
 226     product_no <emphasis>SERIAL</emphasis>,
 227     ...
 228 );
 229 </programlisting>
 230    The <literal>SERIAL</> shorthand is discussed further in <xref
 231    linkend="datatype-serial">.
 232   </para>
 233  </sect1>
 234
 235  <sect1 id="ddl-constraints">
 236   <title>Constraints</title>
 237
 238   <indexterm zone="ddl-constraints">
 239    <primary>constraint</primary>
 240   </indexterm>
 241
 242   <para>
 243    Data types are a way to limit the kind of data that can be stored
 244    in a table.  For many applications, however, the constraint they
 245    provide is too coarse.  For example, a column containing a product
 246    price should probably only accept positive values.  But there is no
 247    standard data type that accepts only positive numbers.  Another issue is
 248    that you might want to constrain column data with respect to other
 249    columns or rows.  For example, in a table containing product
 250    information, there should be only one row for each product number.
 251   </para>
 252
 253   <para>
 254    To that end, SQL allows you to define constraints on columns and
 255    tables.  Constraints give you as much control over the data in your
 256    tables as you wish.  If a user attempts to store data in a column
 257    that would violate a constraint, an error is raised.  This applies
 258    even if the value came from the default value definition.
 259   </para>
 260
 261   <sect2>
 262    <title>Check Constraints</title>
 263
 264    <indexterm>
 265     <primary>check constraint</primary>
 266    </indexterm>
 267
 268    <indexterm>
 269     <primary>constraint</primary>
 270     <secondary>check</secondary>
 271    </indexterm>
 272
 273    <para>
 274     A check constraint is the most generic constraint type.  It allows
 275     you to specify that the value in a certain column must satisfy a
 276     Boolean (truth-value) expression.  For instance, to require positive
 277     product prices, you could use:
 278 <programlisting>
 279 CREATE TABLE products (
 280     product_no integer,
 281     name text,
 282     price numeric <emphasis>CHECK (price &gt; 0)</emphasis>
 283 );
 284 </programlisting>
 285    </para>
 286
 287    <para>
 288     As you see, the constraint definition comes after the data type,
 289     just like default value definitions.  Default values and
 290     constraints can be listed in any order.  A check constraint
 291     consists of the key word <literal>CHECK</literal> followed by an
 292     expression in parentheses.  The check constraint expression should
 293     involve the column thus constrained, otherwise the constraint
 294     would not make too much sense.
 295    </para>
 296
 297    <indexterm>
 298     <primary>constraint</primary>
 299     <secondary>name</secondary>
 300    </indexterm>
 301
 302    <para>
 303     You can also give the constraint a separate name.  This clarifies
 304     error messages and allows you to refer to the constraint when you
 305     need to change it.  The syntax is:
 306 <programlisting>
 307 CREATE TABLE products (
 308     product_no integer,
 309     name text,
 310     price numeric <emphasis>CONSTRAINT positive_price</emphasis> CHECK (price &gt; 0)
 311 );
 312 </programlisting>
 313     So, to specify a named constraint, use the key word
 314     <literal>CONSTRAINT</literal> followed by an identifier followed
 315     by the constraint definition.  (If you don't specify a constraint
 316     name in this way, the system chooses a name for you.)
 317    </para>
 318
 319    <para>
 320     A check constraint can also refer to several columns.  Say you
 321     store a regular price and a discounted price, and you want to
 322     ensure that the discounted price is lower than the regular price:
 323 <programlisting>
 324 CREATE TABLE products (
 325     product_no integer,
 326     name text,
 327     price numeric CHECK (price &gt; 0),
 328     discounted_price numeric CHECK (discounted_price &gt; 0),
 329     <emphasis>CHECK (price &gt; discounted_price)</emphasis>
 330 );
 331 </programlisting>
 332    </para>
 333
 334    <para>
 335     The first two constraints should look familiar.  The third one
 336     uses a new syntax.  It is not attached to a particular column,
 337     instead it appears as a separate item in the comma-separated
 338     column list.  Column definitions and these constraint
 339     definitions can be listed in mixed order.
 340    </para>
 341
 342    <para>
 343     We say that the first two constraints are column constraints, whereas the
 344     third one is a table constraint because it is written separately
 345     from any one column definition.  Column constraints can also be
 346     written as table constraints, while the reverse is not necessarily
 347     possible, since a column constraint is supposed to refer to only the
 348     column it is attached to.  (<productname>PostgreSQL</productname> doesn't
 349     enforce that rule, but you should follow it if you want your table
 350     definitions to work with other database systems.)  The above example could
 351     also be written as:
 352 <programlisting>
 353 CREATE TABLE products (
 354     product_no integer,
 355     name text,
 356     price numeric,
 357     CHECK (price &gt; 0),
 358     discounted_price numeric,
 359     CHECK (discounted_price &gt; 0),
 360     CHECK (price &gt; discounted_price)
 361 );
 362 </programlisting>
 363     or even:
 364 <programlisting>
 365 CREATE TABLE products (
 366     product_no integer,
 367     name text,
 368     price numeric CHECK (price &gt; 0),
 369     discounted_price numeric,
 370     CHECK (discounted_price &gt; 0 AND price &gt; discounted_price)
 371 );
 372 </programlisting>
 373     It's a matter of taste.
 374    </para>
 375
 376    <para>
 377     Names can be assigned to table constraints in the same way as
 378     column constraints:
 379 <programlisting>
 380 CREATE TABLE products (
 381     product_no integer,
 382     name text,
 383     price numeric,
 384     CHECK (price &gt; 0),
 385     discounted_price numeric,
 386     CHECK (discounted_price &gt; 0),
 387     <emphasis>CONSTRAINT valid_discount</> CHECK (price &gt; discounted_price)
 388 );
 389 </programlisting>
 390    </para>
 391
 392    <indexterm>
 393     <primary>null value</primary>
 394     <secondary sortas="check constraints">with check constraints</secondary>
 395    </indexterm>
 396
 397    <para>
 398     It should be noted that a check constraint is satisfied if the
 399     check expression evaluates to true or the null value.  Since most
 400     expressions will evaluate to the null value if any operand is null,
 401     they will not prevent null values in the constrained columns.  To
 402     ensure that a column does not contain null values, the not-null
 403     constraint described in the next section can be used.
 404    </para>
 405   </sect2>
 406
 407   <sect2>
 408    <title>Not-Null Constraints</title>
 409
 410    <indexterm>
 411     <primary>not-null constraint</primary>
 412    </indexterm>
 413
 414    <indexterm>
 415     <primary>constraint</primary>
 416     <secondary>NOT NULL</secondary>
 417    </indexterm>
 418
 419    <para>
 420     A not-null constraint simply specifies that a column must not
 421     assume the null value.  A syntax example:
 422 <programlisting>
 423 CREATE TABLE products (
 424     product_no integer <emphasis>NOT NULL</emphasis>,
 425     name text <emphasis>NOT NULL</emphasis>,
 426     price numeric
 427 );
 428 </programlisting>
 429    </para>
 430
 431    <para>
 432     A not-null constraint is always written as a column constraint.  A
 433     not-null constraint is functionally equivalent to creating a check
 434     constraint <literal>CHECK (<replaceable>column_name</replaceable>
 435     IS NOT NULL)</literal>, but in
 436     <productname>PostgreSQL</productname> creating an explicit
 437     not-null constraint is more efficient.  The drawback is that you
 438     cannot give explicit names to not-null constraints created this
 439     way.
 440    </para>
 441
 442    <para>
 443     Of course, a column can have more than one constraint.  Just write
 444     the constraints one after another:
 445 <programlisting>
 446 CREATE TABLE products (
 447     product_no integer NOT NULL,
 448     name text NOT NULL,
 449     price numeric NOT NULL CHECK (price &gt; 0)
 450 );
 451 </programlisting>
 452     The order doesn't matter.  It does not necessarily determine in which
 453     order the constraints are checked.
 454    </para>
 455
 456    <para>
 457     The <literal>NOT NULL</literal> constraint has an inverse: the
 458     <literal>NULL</literal> constraint.  This does not mean that the
 459     column must be null, which would surely be useless.  Instead, this
 460     simply selects the default behavior that the column might be null.
 461     The <literal>NULL</literal> constraint is not present in the SQL
 462     standard and should not be used in portable applications.  (It was
 463     only added to <productname>PostgreSQL</productname> to be
 464     compatible with some other database systems.)  Some users, however,
 465     like it because it makes it easy to toggle the constraint in a
 466     script file.  For example, you could start with:
 467 <programlisting>
 468 CREATE TABLE products (
 469     product_no integer NULL,
 470     name text NULL,
 471     price numeric NULL
 472 );
 473 </programlisting>
 474     and then insert the <literal>NOT</literal> key word where desired.
 475    </para>
 476
 477    <tip>
 478     <para>
 479      In most database designs the majority of columns should be marked
 480      not null.
 481     </para>
 482    </tip>
 483   </sect2>
 484
 485   <sect2>
 486    <title>Unique Constraints</title>
 487
 488    <indexterm>
 489     <primary>unique constraint</primary>
 490    </indexterm>
 491
 492    <indexterm>
 493     <primary>constraint</primary>
 494     <secondary>unique</secondary>
 495    </indexterm>
 496
 497    <para>
 498     Unique constraints ensure that the data contained in a column or a
 499     group of columns is unique with respect to all the rows in the
 500     table.  The syntax is:
 501 <programlisting>
 502 CREATE TABLE products (
 503     product_no integer <emphasis>UNIQUE</emphasis>,
 504     name text,
 505     price numeric
 506 );
 507 </programlisting>
 508     when written as a column constraint, and:
 509 <programlisting>
 510 CREATE TABLE products (
 511     product_no integer,
 512     name text,
 513     price numeric,
 514     <emphasis>UNIQUE (product_no)</emphasis>
 515 );
 516 </programlisting>
 517     when written as a table constraint.
 518    </para>
 519
 520    <para>
 521     If a unique constraint refers to a group of columns, the columns
 522     are listed separated by commas:
 523 <programlisting>
 524 CREATE TABLE example (
 525     a integer,
 526     b integer,
 527     c integer,
 528     <emphasis>UNIQUE (a, c)</emphasis>
 529 );
 530 </programlisting>
 531     This specifies that the combination of values in the indicated columns
 532     is unique across the whole table, though any one of the columns
 533     need not be (and ordinarily isn't) unique.
 534    </para>
 535
 536    <para>
 537     You can assign your own name for a unique constraint, in the usual way:
 538 <programlisting>
 539 CREATE TABLE products (
 540     product_no integer <emphasis>CONSTRAINT must_be_different</emphasis> UNIQUE,
 541     name text,
 542     price numeric
 543 );
 544 </programlisting>
 545    </para>
 546
 547    <indexterm>
 548     <primary>null value</primary>
 549     <secondary sortas="unique constraints">with unique constraints</secondary>
 550    </indexterm>
 551
 552    <para>
 553     In general, a unique constraint is violated when there is more than
 554     one row in the table where the values of all of the
 555     columns included in the constraint are equal.
 556     However, two null values are not considered equal in this
 557     comparison.  That means even in the presence of a
 558     unique constraint it is possible to store duplicate
 559     rows that contain a null value in at least one of the constrained
 560     columns.  This behavior conforms to the SQL standard, but we have
 561     heard that other SQL databases might not follow this rule.  So be
 562     careful when developing applications that are intended to be
 563     portable.
 564    </para>
 565   </sect2>
 566
 567   <sect2>
 568    <title>Primary Keys</title>
 569
 570    <indexterm>
 571     <primary>primary key</primary>
 572    </indexterm>
 573
 574    <indexterm>
 575     <primary>constraint</primary>
 576     <secondary>primary key</secondary>
 577    </indexterm>
 578
 579    <para>
 580     Technically, a primary key constraint is simply a combination of a
 581     unique constraint and a not-null constraint.  So, the following
 582     two table definitions accept the same data:
 583 <programlisting>
 584 CREATE TABLE products (
 585     product_no integer UNIQUE NOT NULL,
 586     name text,
 587     price numeric
 588 );
 589 </programlisting>
 590
 591 <programlisting>
 592 CREATE TABLE products (
 593     product_no integer <emphasis>PRIMARY KEY</emphasis>,
 594     name text,
 595     price numeric
 596 );
 597 </programlisting>
 598    </para>
 599
 600    <para>
 601     Primary keys can also constrain more than one column; the syntax
 602     is similar to unique constraints:
 603 <programlisting>
 604 CREATE TABLE example (
 605     a integer,
 606     b integer,
 607     c integer,
 608     <emphasis>PRIMARY KEY (a, c)</emphasis>
 609 );
 610 </programlisting>
 611    </para>
 612
 613    <para>
 614     A primary key indicates that a column or group of columns can be
 615     used as a unique identifier for rows in the table.  (This is a
 616     direct consequence of the definition of a primary key.  Note that
 617     a unique constraint does not, by itself, provide a unique identifier
 618     because it does not exclude null values.)  This is useful both for
 619     documentation purposes and for client applications.  For example,
 620     a GUI application that allows modifying row values probably needs
 621     to know the primary key of a table to be able to identify rows
 622     uniquely.
 623    </para>
 624
 625    <para>
 626     A table can have at most one primary key.  (There can be any number
 627     of unique and not-null constraints, which are functionally the same
 628     thing, but only one can be identified as the primary key.)
 629     Relational database theory
 630     dictates that every table must have a primary key.  This rule is
 631     not enforced by <productname>PostgreSQL</productname>, but it is
 632     usually best to follow it.
 633    </para>
 634   </sect2>
 635
 636   <sect2 id="ddl-constraints-fk">
 637    <title>Foreign Keys</title>
 638
 639    <indexterm>
 640     <primary>foreign key</primary>
 641    </indexterm>
 642
 643    <indexterm>
 644     <primary>constraint</primary>
 645     <secondary>foreign key</secondary>
 646    </indexterm>
 647
 648    <indexterm>
 649     <primary>referential integrity</primary>
 650    </indexterm>
 651
 652    <para>
 653     A foreign key constraint specifies that the values in a column (or
 654     a group of columns) must match the values appearing in some row
 655     of another table.
 656     We say this maintains the <firstterm>referential
 657     integrity</firstterm> between two related tables.
 658    </para>
 659
 660    <para>
 661     Say you have the product table that we have used several times already:
 662 <programlisting>
 663 CREATE TABLE products (
 664     product_no integer PRIMARY KEY,
 665     name text,
 666     price numeric
 667 );
 668 </programlisting>
 669     Let's also assume you have a table storing orders of those
 670     products.  We want to ensure that the orders table only contains
 671     orders of products that actually exist.  So we define a foreign
 672     key constraint in the orders table that references the products
 673     table:
 674 <programlisting>
 675 CREATE TABLE orders (
 676     order_id integer PRIMARY KEY,
 677     product_no integer <emphasis>REFERENCES products (product_no)</emphasis>,
 678     quantity integer
 679 );
 680 </programlisting>
 681     Now it is impossible to create orders with
 682     <structfield>product_no</structfield> entries that do not appear in the
 683     products table.
 684    </para>
 685
 686    <para>
 687     We say that in this situation the orders table is the
 688     <firstterm>referencing</firstterm> table and the products table is
 689     the <firstterm>referenced</firstterm> table.  Similarly, there are
 690     referencing and referenced columns.
 691    </para>
 692
 693    <para>
 694     You can also shorten the above command to:
 695 <programlisting>
 696 CREATE TABLE orders (
 697     order_id integer PRIMARY KEY,
 698     product_no integer <emphasis>REFERENCES products</emphasis>,
 699     quantity integer
 700 );
 701 </programlisting>
 702     because in absence of a column list the primary key of the
 703     referenced table is used as the referenced column(s).
 704    </para>
 705
 706    <para>
 707     A foreign key can also constrain and reference a group of columns.
 708     As usual, it then needs to be written in table constraint form.
 709     Here is a contrived syntax example:
 710 <programlisting>
 711 CREATE TABLE t1 (
 712   a integer PRIMARY KEY,
 713   b integer,
 714   c integer,
 715   <emphasis>FOREIGN KEY (b, c) REFERENCES other_table (c1, c2)</emphasis>
 716 );
 717 </programlisting>
 718     Of course, the number and type of the constrained columns need to
 719     match the number and type of the referenced columns.
 720    </para>
 721
 722    <para>
 723     You can assign your own name for a foreign key constraint,
 724     in the usual way.
 725    </para>
 726
 727    <para>
 728     A table can contain more than one foreign key constraint.  This is
 729     used to implement many-to-many relationships between tables.  Say
 730     you have tables about products and orders, but now you want to
 731     allow one order to contain possibly many products (which the
 732     structure above did not allow).  You could use this table structure:
 733 <programlisting>
 734 CREATE TABLE products (
 735     product_no integer PRIMARY KEY,
 736     name text,
 737     price numeric
 738 );
 739
 740 CREATE TABLE orders (
 741     order_id integer PRIMARY KEY,
 742     shipping_address text,
 743     ...
 744 );
 745
 746 CREATE TABLE order_items (
 747     product_no integer REFERENCES products,
 748     order_id integer REFERENCES orders,
 749     quantity integer,
 750     PRIMARY KEY (product_no, order_id)
 751 );
 752 </programlisting>
 753     Notice that the primary key overlaps with the foreign keys in
 754     the last table.
 755    </para>
 756
 757    <indexterm>
 758     <primary>CASCADE</primary>
 759     <secondary>foreign key action</secondary>
 760    </indexterm>
 761
 762    <indexterm>
 763     <primary>RESTRICT</primary>
 764     <secondary>foreign key action</secondary>
 765    </indexterm>
 766
 767    <para>
 768     We know that the foreign keys disallow creation of orders that
 769     do not relate to any products.  But what if a product is removed
 770     after an order is created that references it?  SQL allows you to
 771     handle that as well.  Intuitively, we have a few options:
 772     <itemizedlist spacing="compact">
 773      <listitem><para>Disallow deleting a referenced product</para></listitem>
 774      <listitem><para>Delete the orders as well</para></listitem>
 775      <listitem><para>Something else?</para></listitem>
 776     </itemizedlist>
 777    </para>
 778
 779    <para>
 780     To illustrate this, let's implement the following policy on the
 781     many-to-many relationship example above: when someone wants to
 782     remove a product that is still referenced by an order (via
 783     <literal>order_items</literal>), we disallow it.  If someone
 784     removes an order, the order items are removed as well:
 785 <programlisting>
 786 CREATE TABLE products (
 787     product_no integer PRIMARY KEY,
 788     name text,
 789     price numeric
 790 );
 791
 792 CREATE TABLE orders (
 793     order_id integer PRIMARY KEY,
 794     shipping_address text,
 795     ...
 796 );
 797
 798 CREATE TABLE order_items (
 799     product_no integer REFERENCES products <emphasis>ON DELETE RESTRICT</emphasis>,
 800     order_id integer REFERENCES orders <emphasis>ON DELETE CASCADE</emphasis>,
 801     quantity integer,
 802     PRIMARY KEY (product_no, order_id)
 803 );
 804 </programlisting>
 805    </para>
 806
 807    <para>
 808     Restricting and cascading deletes are the two most common options.
 809     <literal>RESTRICT</literal> prevents deletion of a
 810     referenced row. <literal>NO ACTION</literal> means that if any
 811     referencing rows still exist when the constraint is checked, an error
 812     is raised; this is the default behavior if you do not specify anything.
 813     (The essential difference between these two choices is that
 814     <literal>NO ACTION</literal> allows the check to be deferred until
 815     later in the transaction, whereas <literal>RESTRICT</literal> does not.)
 816     <literal>CASCADE</> specifies that when a referenced row is deleted,
 817     row(s) referencing it should be automatically deleted as well.
 818     There are two other options:
 819     <literal>SET NULL</literal> and <literal>SET DEFAULT</literal>.
 820     These cause the referencing columns to be set to nulls or default
 821     values, respectively, when the referenced row is deleted.
 822     Note that these do not excuse you from observing any constraints.
 823     For example, if an action specifies <literal>SET DEFAULT</literal>
 824     but the default value would not satisfy the foreign key, the
 825     operation will fail.
 826    </para>
 827
 828    <para>
 829     Analogous to <literal>ON DELETE</literal> there is also
 830     <literal>ON UPDATE</literal> which is invoked when a referenced
 831     column is changed (updated).  The possible actions are the same.
 832    </para>
 833
 834    <para>
 835     More information about updating and deleting data is in <xref
 836     linkend="dml">.
 837    </para>
 838
 839    <para>
 840     Finally, we should mention that a foreign key must reference
 841     columns that either are a primary key or form a unique constraint.
 842     If the foreign key references a unique constraint, there are some
 843     additional possibilities regarding how null values are matched.
 844     These are explained in the reference documentation for
 845     <xref linkend="sql-createtable">.
 846    </para>
 847   </sect2>
 848
 849   <sect2 id="ddl-constraints-exclusion">
 850    <title>Exclusion constraints</title>
 851
 852    <indexterm>
 853     <primary>exclusion constraint</primary>
 854    </indexterm>
 855
 856    <indexterm>
 857     <primary>constraint</primary>
 858     <secondary>exclusion</secondary>
 859    </indexterm>
 860
 861    <para>
 862     Exclusion constraints ensure that if any two rows are compared on
 863     the specified columns or expressions using the specified operators,
 864     at least one of these operator comparisons will return false or null.
 865     The syntax is:
 866 <programlisting>
 867 CREATE TABLE circles (
 868     c circle,
 869     EXCLUDE USING gist (c WITH &amp;&amp;)
 870 );
 871 </programlisting>
 872    </para>
 873
 874    <para>
 875     See also <link linkend="SQL-CREATETABLE-EXCLUDE"><command>CREATE
 876     TABLE ... CONSTRAINT ... EXCLUDE</></link> for details.
 877    </para>
 878   </sect2>
 879  </sect1>
 880
 881  <sect1 id="ddl-system-columns">
 882   <title>System Columns</title>
 883
 884   <para>
 885    Every table has several <firstterm>system columns</> that are
 886    implicitly defined by the system.  Therefore, these names cannot be
 887    used as names of user-defined columns.  (Note that these
 888    restrictions are separate from whether the name is a key word or
 889    not; quoting a name will not allow you to escape these
 890    restrictions.)  You do not really need to be concerned about these
 891    columns; just know they exist.
 892   </para>
 893
 894   <indexterm>
 895    <primary>column</primary>
 896    <secondary>system column</secondary>
 897   </indexterm>
 898
 899   <variablelist>
 900    <varlistentry>
 901     <term><structfield>oid</></term>
 902     <listitem>
 903      <para>
 904       <indexterm>
 905        <primary>OID</primary>
 906        <secondary>column</secondary>
 907       </indexterm>
 908       The object identifier (object ID) of a row. This column is only
 909       present if the table was created using <literal>WITH
 910       OIDS</literal>, or if the <xref linkend="guc-default-with-oids">
 911       configuration variable was set at the time. This column is of type
 912       <type>oid</type> (same name as the column); see <xref
 913       linkend="datatype-oid"> for more information about the type.
 914      </para>
 915     </listitem>
 916    </varlistentry>
 917
 918    <varlistentry>
 919     <term><structfield>tableoid</></term>
 920     <listitem>
 921      <indexterm>
 922       <primary>tableoid</primary>
 923      </indexterm>
 924
 925      <para>
 926       The OID of the table containing this row.  This column is
 927       particularly handy for queries that select from inheritance
 928       hierarchies (see <xref linkend="ddl-inherit">), since without it,
 929       it's difficult to tell which individual table a row came from.  The
 930       <structfield>tableoid</structfield> can be joined against the
 931       <structfield>oid</structfield> column of
 932       <structname>pg_class</structname> to obtain the table name.
 933      </para>
 934     </listitem>
 935    </varlistentry>
 936
 937    <varlistentry>
 938     <term><structfield>xmin</></term>
 939     <listitem>
 940      <indexterm>
 941       <primary>xmin</primary>
 942      </indexterm>
 943
 944      <para>
 945       The identity (transaction ID) of the inserting transaction for
 946       this row version.  (A row version is an individual state of a
 947       row; each update of a row creates a new row version for the same
 948       logical row.)
 949      </para>
 950     </listitem>
 951    </varlistentry>
 952
 953    <varlistentry>
 954     <term><structfield>cmin</></term>
 955     <listitem>
 956      <indexterm>
 957       <primary>cmin</primary>
 958      </indexterm>
 959
 960      <para>
 961       The command identifier (starting at zero) within the inserting
 962       transaction.
 963      </para>
 964     </listitem>
 965    </varlistentry>
 966
 967    <varlistentry>
 968     <term><structfield>xmax</></term>
 969     <listitem>
 970      <indexterm>
 971       <primary>xmax</primary>
 972      </indexterm>
 973
 974      <para>
 975       The identity (transaction ID) of the deleting transaction, or
 976       zero for an undeleted row version.  It is possible for this column to
 977       be nonzero in a visible row version. That usually indicates that the
 978       deleting transaction hasn't committed yet, or that an attempted
 979       deletion was rolled back.
 980      </para>
 981     </listitem>
 982    </varlistentry>
 983
 984    <varlistentry>
 985     <term><structfield>cmax</></term>
 986     <listitem>
 987      <indexterm>
 988       <primary>cmax</primary>
 989      </indexterm>
 990
 991      <para>
 992       The command identifier within the deleting transaction, or zero.
 993      </para>
 994     </listitem>
 995    </varlistentry>
 996
 997    <varlistentry>
 998     <term><structfield>ctid</></term>
 999     <listitem>
1000      <indexterm>
1001       <primary>ctid</primary>
1002      </indexterm>
1003
1004      <para>
1005       The physical location of the row version within its table.  Note that
1006       although the <structfield>ctid</structfield> can be used to
1007       locate the row version very quickly, a row's
1008       <structfield>ctid</structfield> will change if it is
1009       updated or moved by <command>VACUUM FULL</>.  Therefore
1010       <structfield>ctid</structfield> is useless as a long-term row
1011       identifier.  The OID, or even better a user-defined serial
1012       number, should be used to identify logical rows.
1013      </para>
1014     </listitem>
1015    </varlistentry>
1016   </variablelist>
1017
1018    <para>
1019     OIDs are 32-bit quantities and are assigned from a single
1020     cluster-wide counter.  In a large or long-lived database, it is
1021     possible for the counter to wrap around.  Hence, it is bad
1022     practice to assume that OIDs are unique, unless you take steps to
1023     ensure that this is the case.  If you need to identify the rows in
1024     a table, using a sequence generator is strongly recommended.
1025     However, OIDs can be used as well, provided that a few additional
1026     precautions are taken:
1027
1028     <itemizedlist>
1029      <listitem>
1030       <para>
1031        A unique constraint should be created on the OID column of each
1032        table for which the OID will be used to identify rows.  When such
1033        a unique constraint (or unique index) exists, the system takes
1034        care not to generate an OID matching an already-existing row.
1035        (Of course, this is only possible if the table contains fewer
1036        than 2<superscript>32</> (4 billion) rows, and in practice the
1037        table size had better be much less than that, or performance
1038        might suffer.)
1039       </para>
1040      </listitem>
1041      <listitem>
1042       <para>
1043        OIDs should never be assumed to be unique across tables; use
1044        the combination of <structfield>tableoid</> and row OID if you
1045        need a database-wide identifier.
1046       </para>
1047      </listitem>
1048      <listitem>
1049       <para>
1050        Of course, the tables in question must be created <literal>WITH
1051        OIDS</literal>.  As of <productname>PostgreSQL</productname> 8.1,
1052        <literal>WITHOUT OIDS</> is the default.
1053       </para>
1054      </listitem>
1055     </itemizedlist>
1056    </para>
1057
1058    <para>
1059     Transaction identifiers are also 32-bit quantities.  In a
1060     long-lived database it is possible for transaction IDs to wrap
1061     around.  This is not a fatal problem given appropriate maintenance
1062     procedures; see <xref linkend="maintenance"> for details.  It is
1063     unwise, however, to depend on the uniqueness of transaction IDs
1064     over the long term (more than one billion transactions).
1065    </para>
1066
1067    <para>
1068     Command identifiers are also 32-bit quantities.  This creates a hard limit
1069     of 2<superscript>32</> (4 billion) <acronym>SQL</acronym> commands
1070     within a single transaction.  In practice this limit is not a
1071     problem &mdash; note that the limit is on the number of
1072     <acronym>SQL</acronym> commands, not the number of rows processed.
1073     Also, as of <productname>PostgreSQL</productname> 8.3, only commands
1074     that actually modify the database contents will consume a command
1075     identifier.
1076    </para>
1077  </sect1>
1078
1079  <sect1 id="ddl-alter">
1080   <title>Modifying Tables</title>
1081
1082   <indexterm zone="ddl-alter">
1083    <primary>table</primary>
1084    <secondary>modifying</secondary>
1085   </indexterm>
1086
1087   <para>
1088    When you create a table and you realize that you made a mistake, or
1089    the requirements of the application change, you can drop the
1090    table and create it again.  But this is not a convenient option if
1091    the table is already filled with data, or if the table is
1092    referenced by other database objects (for instance a foreign key
1093    constraint).  Therefore <productname>PostgreSQL</productname>
1094    provides a family of commands to make modifications to existing
1095    tables.  Note that this is conceptually distinct from altering
1096    the data contained in the table: here we are interested in altering
1097    the definition, or structure, of the table.
1098   </para>
1099
1100   <para>
1101    You can:
1102    <itemizedlist spacing="compact">
1103     <listitem>
1104      <para>Add columns</para>
1105     </listitem>
1106     <listitem>
1107      <para>Remove columns</para>
1108     </listitem>
1109     <listitem>
1110      <para>Add constraints</para>
1111     </listitem>
1112     <listitem>
1113      <para>Remove constraints</para>
1114     </listitem>
1115     <listitem>
1116      <para>Change default values</para>
1117     </listitem>
1118     <listitem>
1119      <para>Change column data types</para>
1120     </listitem>
1121     <listitem>
1122      <para>Rename columns</para>
1123     </listitem>
1124     <listitem>
1125      <para>Rename tables</para>
1126     </listitem>
1127    </itemizedlist>
1128
1129    All these actions are performed using the
1130    <xref linkend="sql-altertable">
1131    command, whose reference page contains details beyond those given
1132    here.
1133   </para>
1134
1135   <sect2>
1136    <title>Adding a Column</title>
1137
1138    <indexterm>
1139     <primary>column</primary>
1140     <secondary>adding</secondary>
1141    </indexterm>
1142
1143    <para>
1144     To add a column, use a command like:
1145 <programlisting>
1146 ALTER TABLE products ADD COLUMN description text;
1147 </programlisting>
1148     The new column is initially filled with whatever default
1149     value is given (null if you don't specify a <literal>DEFAULT</> clause).
1150    </para>
1151
1152    <para>
1153     You can also define constraints on the column at the same time,
1154     using the usual syntax:
1155 <programlisting>
1156 ALTER TABLE products ADD COLUMN description text CHECK (description &lt;&gt; '');
1157 </programlisting>
1158     In fact all the options that can be applied to a column description
1159     in <command>CREATE TABLE</> can be used here.  Keep in mind however
1160     that the default value must satisfy the given constraints, or the
1161     <literal>ADD</> will fail.  Alternatively, you can add
1162     constraints later (see below) after you've filled in the new column
1163     correctly.
1164    </para>
1165
1166   <tip>
1167    <para>
1168     Adding a column with a default requires updating each row of the
1169     table (to store the new column value).  However, if no default is
1170     specified, <productname>PostgreSQL</productname> is able to avoid
1171     the physical update.  So if you intend to fill the column with
1172     mostly nondefault values, it's best to add the column with no default,
1173     insert the correct values using <command>UPDATE</>, and then add any
1174     desired default as described below.
1175    </para>
1176   </tip>
1177   </sect2>
1178
1179   <sect2>
1180    <title>Removing a Column</title>
1181
1182    <indexterm>
1183     <primary>column</primary>
1184     <secondary>removing</secondary>
1185    </indexterm>
1186
1187    <para>
1188     To remove a column, use a command like:
1189 <programlisting>
1190 ALTER TABLE products DROP COLUMN description;
1191 </programlisting>
1192     Whatever data was in the column disappears.  Table constraints involving
1193     the column are dropped, too.  However, if the column is referenced by a
1194     foreign key constraint of another table,
1195     <productname>PostgreSQL</productname> will not silently drop that
1196     constraint.  You can authorize dropping everything that depends on
1197     the column by adding <literal>CASCADE</>:
1198 <programlisting>
1199 ALTER TABLE products DROP COLUMN description CASCADE;
1200 </programlisting>
1201     See <xref linkend="ddl-depend"> for a description of the general
1202     mechanism behind this.
1203    </para>
1204   </sect2>
1205
1206   <sect2>
1207    <title>Adding a Constraint</title>
1208
1209    <indexterm>
1210     <primary>constraint</primary>
1211     <secondary>adding</secondary>
1212    </indexterm>
1213
1214    <para>
1215     To add a constraint, the table constraint syntax is used.  For example:
1216 <programlisting>
1217 ALTER TABLE products ADD CHECK (name &lt;&gt; '');
1218 ALTER TABLE products ADD CONSTRAINT some_name UNIQUE (product_no);
1219 ALTER TABLE products ADD FOREIGN KEY (product_group_id) REFERENCES product_groups;
1220 </programlisting>
1221     To add a not-null constraint, which cannot be written as a table
1222     constraint, use this syntax:
1223 <programlisting>
1224 ALTER TABLE products ALTER COLUMN product_no SET NOT NULL;
1225 </programlisting>
1226    </para>
1227
1228    <para>
1229     The constraint will be checked immediately, so the table data must
1230     satisfy the constraint before it can be added.
1231    </para>
1232   </sect2>
1233
1234   <sect2>
1235    <title>Removing a Constraint</title>
1236
1237    <indexterm>
1238     <primary>constraint</primary>
1239     <secondary>removing</secondary>
1240    </indexterm>
1241
1242    <para>
1243     To remove a constraint you need to know its name.  If you gave it
1244     a name then that's easy.  Otherwise the system assigned a
1245     generated name, which you need to find out.  The
1246     <application>psql</application> command <literal>\d
1247     <replaceable>tablename</replaceable></literal> can be helpful
1248     here; other interfaces might also provide a way to inspect table
1249     details.  Then the command is:
1250 <programlisting>
1251 ALTER TABLE products DROP CONSTRAINT some_name;
1252 </programlisting>
1253     (If you are dealing with a generated constraint name like <literal>$2</>,
1254     don't forget that you'll need to double-quote it to make it a valid
1255     identifier.)
1256    </para>
1257
1258    <para>
1259     As with dropping a column, you need to add <literal>CASCADE</> if you
1260     want to drop a constraint that something else depends on.  An example
1261     is that a foreign key constraint depends on a unique or primary key
1262     constraint on the referenced column(s).
1263    </para>
1264
1265    <para>
1266     This works the same for all constraint types except not-null
1267     constraints. To drop a not null constraint use:
1268 <programlisting>
1269 ALTER TABLE products ALTER COLUMN product_no DROP NOT NULL;
1270 </programlisting>
1271     (Recall that not-null constraints do not have names.)
1272    </para>
1273   </sect2>
1274
1275   <sect2>
1276    <title>Changing a Column's Default Value</title>
1277
1278    <indexterm>
1279     <primary>default value</primary>
1280     <secondary>changing</secondary>
1281    </indexterm>
1282
1283    <para>
1284     To set a new default for a column, use a command like:
1285 <programlisting>
1286 ALTER TABLE products ALTER COLUMN price SET DEFAULT 7.77;
1287 </programlisting>
1288     Note that this doesn't affect any existing rows in the table, it
1289     just changes the default for future <command>INSERT</> commands.
1290    </para>
1291
1292    <para>
1293     To remove any default value, use:
1294 <programlisting>
1295 ALTER TABLE products ALTER COLUMN price DROP DEFAULT;
1296 </programlisting>
1297     This is effectively the same as setting the default to null.
1298     As a consequence, it is not an error
1299     to drop a default where one hadn't been defined, because the
1300     default is implicitly the null value.
1301    </para>
1302   </sect2>
1303
1304   <sect2>
1305    <title>Changing a Column's Data Type</title>
1306
1307    <indexterm>
1308     <primary>column data type</primary>
1309     <secondary>changing</secondary>
1310    </indexterm>
1311
1312    <para>
1313     To convert a column to a different data type, use a command like:
1314 <programlisting>
1315 ALTER TABLE products ALTER COLUMN price TYPE numeric(10,2);
1316 </programlisting>
1317     This will succeed only if each existing entry in the column can be
1318     converted to the new type by an implicit cast.  If a more complex
1319     conversion is needed, you can add a <literal>USING</> clause that
1320     specifies how to compute the new values from the old.
1321    </para>
1322
1323    <para>
1324     <productname>PostgreSQL</> will attempt to convert the column's
1325     default value (if any) to the new type, as well as any constraints
1326     that involve the column.  But these conversions might fail, or might
1327     produce surprising results.  It's often best to drop any constraints
1328     on the column before altering its type, and then add back suitably
1329     modified constraints afterwards.
1330    </para>
1331   </sect2>
1332
1333   <sect2>
1334    <title>Renaming a Column</title>
1335
1336    <indexterm>
1337     <primary>column</primary>
1338     <secondary>renaming</secondary>
1339    </indexterm>
1340
1341    <para>
1342     To rename a column:
1343 <programlisting>
1344 ALTER TABLE products RENAME COLUMN product_no TO product_number;
1345 </programlisting>
1346    </para>
1347   </sect2>
1348
1349   <sect2>
1350    <title>Renaming a Table</title>
1351
1352    <indexterm>
1353     <primary>table</primary>
1354     <secondary>renaming</secondary>
1355    </indexterm>
1356
1357    <para>
1358     To rename a table:
1359 <programlisting>
1360 ALTER TABLE products RENAME TO items;
1361 </programlisting>
1362    </para>
1363   </sect2>
1364  </sect1>
1365
1366  <sect1 id="ddl-priv">
1367   <title>Privileges</title>
1368
1369   <indexterm zone="ddl-priv">
1370    <primary>privilege</primary>
1371   </indexterm>
1372
1373   <indexterm>
1374    <primary>permission</primary>
1375    <see>privilege</see>
1376   </indexterm>
1377
1378   <para>
1379    When you create a database object, you become its owner.  By
1380    default, only the owner of an object can do anything with the
1381    object. In order to allow other users to use it,
1382    <firstterm>privileges</firstterm> must be granted.  (However,
1383    users that have the superuser attribute can always
1384    access any object.)
1385   </para>
1386
1387   <para>
1388    There are several different privileges: <literal>SELECT</>,
1389    <literal>INSERT</>, <literal>UPDATE</>, <literal>DELETE</>,
1390    <literal>TRUNCATE</>, <literal>REFERENCES</>, <literal>TRIGGER</>,
1391    <literal>CREATE</>, <literal>CONNECT</>, <literal>TEMPORARY</>,
1392    <literal>EXECUTE</>, and <literal>USAGE</>.
1393    The privileges applicable to a particular
1394    object vary depending on the object's type (table, function, etc).
1395    For complete information on the different types of privileges
1396    supported by <productname>PostgreSQL</productname>, refer to the
1397    <xref linkend="sql-grant"> reference
1398    page.  The following sections and chapters will also show you how
1399    those privileges are used.
1400   </para>
1401
1402   <para>
1403    The right to modify or destroy an object is always the privilege of
1404    the owner only.
1405   </para>
1406
1407   <note>
1408    <para>
1409     To change the owner of a table, index, sequence, or view, use the
1410     <xref linkend="sql-altertable">
1411     command.  There are corresponding <literal>ALTER</> commands for
1412     other object types.
1413    </para>
1414   </note>
1415
1416   <para>
1417    To assign privileges, the <command>GRANT</command> command is
1418    used. For example, if <literal>joe</literal> is an existing user, and
1419    <literal>accounts</literal> is an existing table, the privilege to
1420    update the table can be granted with:
1421 <programlisting>
1422 GRANT UPDATE ON accounts TO joe;
1423 </programlisting>
1424    Writing <literal>ALL</literal> in place of a specific privilege grants all
1425    privileges that are relevant for the object type.
1426   </para>
1427
1428   <para>
1429    The special <quote>user</quote> name <literal>PUBLIC</literal> can
1430    be used to grant a privilege to every user on the system.  Also,
1431    <quote>group</> roles can be set up to help manage privileges when
1432    there are many users of a database &mdash; for details see
1433    <xref linkend="user-manag">.
1434   </para>
1435
1436   <para>
1437    To revoke a privilege, use the fittingly named
1438    <command>REVOKE</command> command:
1439 <programlisting>
1440 REVOKE ALL ON accounts FROM PUBLIC;
1441 </programlisting>
1442    The special privileges of the object owner (i.e., the right to do
1443    <command>DROP</>, <command>GRANT</>, <command>REVOKE</>, etc.)
1444    are always implicit in being the owner,
1445    and cannot be granted or revoked.  But the object owner can choose
1446    to revoke his own ordinary privileges, for example to make a
1447    table read-only for himself as well as others.
1448   </para>
1449
1450   <para>
1451    Ordinarily, only the object's owner (or a superuser) can grant or
1452    revoke privileges on an object.  However, it is possible to grant a
1453    privilege <quote>with grant option</>, which gives the recipient
1454    the right to grant it in turn to others.  If the grant option is
1455    subsequently revoked then all who received the privilege from that
1456    recipient (directly or through a chain of grants) will lose the
1457    privilege.  For details see the <xref linkend="sql-grant"> and
1458    <xref linkend="sql-revoke"> reference pages.
1459   </para>
1460  </sect1>
1461
1462  <sect1 id="ddl-schemas">
1463   <title>Schemas</title>
1464
1465   <indexterm zone="ddl-schemas">
1466    <primary>schema</primary>
1467   </indexterm>
1468
1469   <para>
1470    A <productname>PostgreSQL</productname> database cluster
1471    contains one or more named databases.  Users and groups of users are
1472    shared across the entire cluster, but no other data is shared across
1473    databases.  Any given client connection to the server can access
1474    only the data in a single database, the one specified in the connection
1475    request.
1476   </para>
1477
1478   <note>
1479    <para>
1480     Users of a cluster do not necessarily have the privilege to access every
1481     database in the cluster.  Sharing of user names means that there
1482     cannot be different users named, say, <literal>joe</> in two databases
1483     in the same cluster; but the system can be configured to allow
1484     <literal>joe</> access to only some of the databases.
1485    </para>
1486   </note>
1487
1488   <para>
1489    A database contains one or more named <firstterm>schemas</>, which
1490    in turn contain tables.  Schemas also contain other kinds of named
1491    objects, including data types, functions, and operators.  The same
1492    object name can be used in different schemas without conflict; for
1493    example, both <literal>schema1</> and <literal>myschema</> can
1494    contain tables named <literal>mytable</>.  Unlike databases,
1495    schemas are not rigidly separated: a user can access objects in any
1496    of the schemas in the database he is connected to, if he has
1497    privileges to do so.
1498   </para>
1499
1500   <para>
1501    There are several reasons why one might want to use schemas:
1502
1503    <itemizedlist>
1504     <listitem>
1505      <para>
1506       To allow many users to use one database without interfering with
1507       each other.
1508      </para>
1509     </listitem>
1510
1511     <listitem>
1512      <para>
1513       To organize database objects into logical groups to make them
1514       more manageable.
1515      </para>
1516     </listitem>
1517
1518     <listitem>
1519      <para>
1520       Third-party applications can be put into separate schemas so
1521       they do not collide with the names of other objects.
1522      </para>
1523     </listitem>
1524    </itemizedlist>
1525
1526    Schemas are analogous to directories at the operating system level,
1527    except that schemas cannot be nested.
1528   </para>
1529
1530   <sect2 id="ddl-schemas-create">
1531    <title>Creating a Schema</title>
1532
1533    <indexterm zone="ddl-schemas-create">
1534     <primary>schema</primary>
1535     <secondary>creating</secondary>
1536    </indexterm>
1537
1538    <para>
1539     To create a schema, use the <xref linkend="sql-createschema">
1540     command.  Give the schema a name
1541     of your choice.  For example:
1542 <programlisting>
1543 CREATE SCHEMA myschema;
1544 </programlisting>
1545    </para>
1546
1547    <indexterm>
1548     <primary>qualified name</primary>
1549    </indexterm>
1550
1551    <indexterm>
1552     <primary>name</primary>
1553     <secondary>qualified</secondary>
1554    </indexterm>
1555
1556    <para>
1557     To create or access objects in a schema, write a
1558     <firstterm>qualified name</> consisting of the schema name and
1559     table name separated by a dot:
1560 <synopsis>
1561 <replaceable>schema</><literal>.</><replaceable>table</>
1562 </synopsis>
1563     This works anywhere a table name is expected, including the table
1564     modification commands and the data access commands discussed in
1565     the following chapters.
1566     (For brevity we will speak of tables only, but the same ideas apply
1567     to other kinds of named objects, such as types and functions.)
1568    </para>
1569
1570    <para>
1571     Actually, the even more general syntax
1572 <synopsis>
1573 <replaceable>database</><literal>.</><replaceable>schema</><literal>.</><replaceable>table</>
1574 </synopsis>
1575     can be used too, but at present this is just for <foreignphrase>pro
1576     forma</> compliance with the SQL standard.  If you write a database name,
1577     it must be the same as the database you are connected to.
1578    </para>
1579
1580    <para>
1581     So to create a table in the new schema, use:
1582 <programlisting>
1583 CREATE TABLE myschema.mytable (
1584  ...
1585 );
1586 </programlisting>
1587    </para>
1588
1589    <indexterm>
1590     <primary>schema</primary>
1591     <secondary>removing</secondary>
1592    </indexterm>
1593
1594    <para>
1595     To drop a schema if it's empty (all objects in it have been
1596     dropped), use:
1597 <programlisting>
1598 DROP SCHEMA myschema;
1599 </programlisting>
1600     To drop a schema including all contained objects, use:
1601 <programlisting>
1602 DROP SCHEMA myschema CASCADE;
1603 </programlisting>
1604     See <xref linkend="ddl-depend"> for a description of the general
1605     mechanism behind this.
1606    </para>
1607
1608    <para>
1609     Often you will want to create a schema owned by someone else
1610     (since this is one of the ways to restrict the activities of your
1611     users to well-defined namespaces).  The syntax for that is:
1612 <programlisting>
1613 CREATE SCHEMA <replaceable>schemaname</replaceable> AUTHORIZATION <replaceable>username</replaceable>;
1614 </programlisting>
1615     You can even omit the schema name, in which case the schema name
1616     will be the same as the user name.  See <xref
1617     linkend="ddl-schemas-patterns"> for how this can be useful.
1618    </para>
1619
1620    <para>
1621     Schema names beginning with <literal>pg_</> are reserved for
1622     system purposes and cannot be created by users.
1623    </para>
1624   </sect2>
1625
1626   <sect2 id="ddl-schemas-public">
1627    <title>The Public Schema</title>
1628
1629    <indexterm zone="ddl-schemas-public">
1630     <primary>schema</primary>
1631     <secondary>public</secondary>
1632    </indexterm>
1633
1634    <para>
1635     In the previous sections we created tables without specifying any
1636     schema names.  By default such tables (and other objects) are
1637     automatically put into a schema named <quote>public</quote>.  Every new
1638     database contains such a schema.  Thus, the following are equivalent:
1639 <programlisting>
1640 CREATE TABLE products ( ... );
1641 </programlisting>
1642     and:
1643 <programlisting>
1644 CREATE TABLE public.products ( ... );
1645 </programlisting>
1646    </para>
1647   </sect2>
1648
1649   <sect2 id="ddl-schemas-path">
1650    <title>The Schema Search Path</title>
1651
1652    <indexterm>
1653     <primary>search path</primary>
1654    </indexterm>
1655
1656    <indexterm>
1657     <primary>unqualified name</primary>
1658    </indexterm>
1659
1660    <indexterm>
1661     <primary>name</primary>
1662     <secondary>unqualified</secondary>
1663    </indexterm>
1664
1665    <para>
1666     Qualified names are tedious to write, and it's often best not to
1667     wire a particular schema name into applications anyway.  Therefore
1668     tables are often referred to by <firstterm>unqualified names</>,
1669     which consist of just the table name.  The system determines which table
1670     is meant by following a <firstterm>search path</>, which is a list
1671     of schemas to look in.  The first matching table in the search path
1672     is taken to be the one wanted.  If there is no match in the search
1673     path, an error is reported, even if matching table names exist
1674     in other schemas in the database.
1675    </para>
1676
1677    <indexterm>
1678     <primary>schema</primary>
1679     <secondary>current</secondary>
1680    </indexterm>
1681
1682    <para>
1683     The first schema named in the search path is called the current schema.
1684     Aside from being the first schema searched, it is also the schema in
1685     which new tables will be created if the <command>CREATE TABLE</>
1686     command does not specify a schema name.
1687    </para>
1688
1689    <indexterm>
1690     <primary>search_path</primary>
1691    </indexterm>
1692
1693    <para>
1694     To show the current search path, use the following command:
1695 <programlisting>
1696 SHOW search_path;
1697 </programlisting>
1698     In the default setup this returns:
1699 <screen>
1700  search_path
1701 --------------
1702  "$user",public
1703 </screen>
1704     The first element specifies that a schema with the same name as
1705     the current user is to be searched.  If no such schema exists,
1706     the entry is ignored.  The second element refers to the
1707     public schema that we have seen already.
1708    </para>
1709
1710    <para>
1711     The first schema in the search path that exists is the default
1712     location for creating new objects.  That is the reason that by
1713     default objects are created in the public schema.  When objects
1714     are referenced in any other context without schema qualification
1715     (table modification, data modification, or query commands) the
1716     search path is traversed until a matching object is found.
1717     Therefore, in the default configuration, any unqualified access
1718     again can only refer to the public schema.
1719    </para>
1720
1721    <para>
1722     To put our new schema in the path, we use:
1723 <programlisting>
1724 SET search_path TO myschema,public;
1725 </programlisting>
1726     (We omit the <literal>$user</literal> here because we have no
1727     immediate need for it.)  And then we can access the table without
1728     schema qualification:
1729 <programlisting>
1730 DROP TABLE mytable;
1731 </programlisting>
1732     Also, since <literal>myschema</literal> is the first element in
1733     the path, new objects would by default be created in it.
1734    </para>
1735
1736    <para>
1737     We could also have written:
1738 <programlisting>
1739 SET search_path TO myschema;
1740 </programlisting>
1741     Then we no longer have access to the public schema without
1742     explicit qualification.  There is nothing special about the public
1743     schema except that it exists by default.  It can be dropped, too.
1744    </para>
1745
1746    <para>
1747     See also <xref linkend="functions-info"> for other ways to manipulate
1748     the schema search path.
1749    </para>
1750
1751    <para>
1752     The search path works in the same way for data type names, function names,
1753     and operator names as it does for table names.  Data type and function
1754     names can be qualified in exactly the same way as table names.  If you
1755     need to write a qualified operator name in an expression, there is a
1756     special provision: you must write
1757 <synopsis>
1758 <literal>OPERATOR(</><replaceable>schema</><literal>.</><replaceable>operator</><literal>)</>
1759 </synopsis>
1760     This is needed to avoid syntactic ambiguity.  An example is:
1761 <programlisting>
1762 SELECT 3 OPERATOR(pg_catalog.+) 4;
1763 </programlisting>
1764     In practice one usually relies on the search path for operators,
1765     so as not to have to write anything so ugly as that.
1766    </para>
1767   </sect2>
1768
1769   <sect2 id="ddl-schemas-priv">
1770    <title>Schemas and Privileges</title>
1771
1772    <indexterm zone="ddl-schemas-priv">
1773     <primary>privilege</primary>
1774     <secondary sortas="schemas">for schemas</secondary>
1775    </indexterm>
1776
1777    <para>
1778     By default, users cannot access any objects in schemas they do not
1779     own.  To allow that, the owner of the schema must grant the
1780     <literal>USAGE</literal> privilege on the schema.  To allow users
1781     to make use of the objects in the schema, additional privileges
1782     might need to be granted, as appropriate for the object.
1783    </para>
1784
1785    <para>
1786     A user can also be allowed to create objects in someone else's
1787     schema.  To allow that, the <literal>CREATE</literal> privilege on
1788     the schema needs to be granted.  Note that by default, everyone
1789     has <literal>CREATE</literal> and <literal>USAGE</literal> privileges on
1790     the schema
1791     <literal>public</literal>.  This allows all users that are able to
1792     connect to a given database to create objects in its
1793     <literal>public</literal> schema.  If you do
1794     not want to allow that, you can revoke that privilege:
1795 <programlisting>
1796 REVOKE CREATE ON SCHEMA public FROM PUBLIC;
1797 </programlisting>
1798     (The first <quote>public</quote> is the schema, the second
1799     <quote>public</quote> means <quote>every user</quote>.  In the
1800     first sense it is an identifier, in the second sense it is a
1801     key word, hence the different capitalization; recall the
1802     guidelines from <xref linkend="sql-syntax-identifiers">.)
1803    </para>
1804   </sect2>
1805
1806   <sect2 id="ddl-schemas-catalog">
1807    <title>The System Catalog Schema</title>
1808
1809    <indexterm zone="ddl-schemas-catalog">
1810     <primary>system catalog</primary>
1811     <secondary>schema</secondary>
1812    </indexterm>
1813
1814    <para>
1815     In addition to <literal>public</> and user-created schemas, each
1816     database contains a <literal>pg_catalog</> schema, which contains
1817     the system tables and all the built-in data types, functions, and
1818     operators.  <literal>pg_catalog</> is always effectively part of
1819     the search path.  If it is not named explicitly in the path then
1820     it is implicitly searched <emphasis>before</> searching the path's
1821     schemas.  This ensures that built-in names will always be
1822     findable.  However, you can explicitly place
1823     <literal>pg_catalog</> at the end of your search path if you
1824     prefer to have user-defined names override built-in names.
1825    </para>
1826
1827    <para>
1828     In <productname>PostgreSQL</productname> versions before 7.3,
1829     table names beginning with <literal>pg_</> were reserved.  This is
1830     no longer true: you can create such a table name if you wish, in
1831     any non-system schema.  However, it's best to continue to avoid
1832     such names, to ensure that you won't suffer a conflict if some
1833     future version defines a system table named the same as your
1834     table.  (With the default search path, an unqualified reference to
1835     your table name would then be resolved as the system table instead.)
1836     System tables will continue to follow the convention of having
1837     names beginning with <literal>pg_</>, so that they will not
1838     conflict with unqualified user-table names so long as users avoid
1839     the <literal>pg_</> prefix.
1840    </para>
1841   </sect2>
1842
1843   <sect2 id="ddl-schemas-patterns">
1844    <title>Usage Patterns</title>
1845
1846    <para>
1847     Schemas can be used to organize your data in many ways.  There are
1848     a few usage patterns that are recommended and are easily supported by
1849     the default configuration:
1850     <itemizedlist>
1851      <listitem>
1852       <para>
1853        If you do not create any schemas then all users access the
1854        public schema implicitly.  This simulates the situation where
1855        schemas are not available at all.  This setup is mainly
1856        recommended when there is only a single user or a few cooperating
1857        users in a database.  This setup also allows smooth transition
1858        from the non-schema-aware world.
1859       </para>
1860      </listitem>
1861
1862      <listitem>
1863       <para>
1864        You can create a schema for each user with the same name as
1865        that user.  Recall that the default search path starts with
1866        <literal>$user</literal>, which resolves to the user name.
1867        Therefore, if each user has a separate schema, they access their
1868        own schemas by default.
1869       </para>
1870
1871       <para>
1872        If you use this setup then you might also want to revoke access
1873        to the public schema (or drop it altogether), so users are
1874        truly constrained to their own schemas.
1875       </para>
1876      </listitem>
1877
1878      <listitem>
1879       <para>
1880        To install shared applications (tables to be used by everyone,
1881        additional functions provided by third parties, etc.), put them
1882        into separate schemas.  Remember to grant appropriate
1883        privileges to allow the other users to access them.  Users can
1884        then refer to these additional objects by qualifying the names
1885        with a schema name, or they can put the additional schemas into
1886        their search path, as they choose.
1887       </para>
1888      </listitem>
1889     </itemizedlist>
1890    </para>
1891   </sect2>
1892
1893   <sect2 id="ddl-schemas-portability">
1894    <title>Portability</title>
1895
1896    <para>
1897     In the SQL standard, the notion of objects in the same schema
1898     being owned by different users does not exist.  Moreover, some
1899     implementations do not allow you to create schemas that have a
1900     different name than their owner.  In fact, the concepts of schema
1901     and user are nearly equivalent in a database system that
1902     implements only the basic schema support specified in the
1903     standard.  Therefore, many users consider qualified names to
1904     really consist of
1905     <literal><replaceable>username</>.<replaceable>tablename</></literal>.
1906     This is how <productname>PostgreSQL</productname> will effectively
1907     behave if you create a per-user schema for every user.
1908    </para>
1909
1910    <para>
1911     Also, there is no concept of a <literal>public</> schema in the
1912     SQL standard.  For maximum conformance to the standard, you should
1913     not use (perhaps even remove) the <literal>public</> schema.
1914    </para>
1915
1916    <para>
1917     Of course, some SQL database systems might not implement schemas
1918     at all, or provide namespace support by allowing (possibly
1919     limited) cross-database access.  If you need to work with those
1920     systems, then maximum portability would be achieved by not using
1921     schemas at all.
1922    </para>
1923   </sect2>
1924  </sect1>
1925
1926  <sect1 id="ddl-inherit">
1927   <title>Inheritance</title>
1928
1929   <indexterm>
1930    <primary>inheritance</primary>
1931   </indexterm>
1932
1933   <indexterm>
1934    <primary>table</primary>
1935    <secondary>inheritance</secondary>
1936   </indexterm>
1937
1938   <para>
1939    <productname>PostgreSQL</productname> implements table inheritance,
1940    which can be a useful tool for database designers.  (SQL:1999 and
1941    later define a type inheritance feature, which differs in many
1942    respects from the features described here.)
1943   </para>
1944
1945   <para>
1946    Let's start with an example: suppose we are trying to build a data
1947    model for cities.  Each state has many cities, but only one
1948    capital. We want to be able to quickly retrieve the capital city
1949    for any particular state. This can be done by creating two tables,
1950    one for state capitals and one for cities that are not
1951    capitals. However, what happens when we want to ask for data about
1952    a city, regardless of whether it is a capital or not? The
1953    inheritance feature can help to resolve this problem. We define the
1954    <structname>capitals</structname> table so that it inherits from
1955    <structname>cities</structname>:
1956
1957 <programlisting>
1958 CREATE TABLE cities (
1959     name            text,
1960     population      float,
1961     altitude        int     -- in feet
1962 );
1963
1964 CREATE TABLE capitals (
1965     state           char(2)
1966 ) INHERITS (cities);
1967 </programlisting>
1968
1969    In this case, the <structname>capitals</> table <firstterm>inherits</>
1970    all the columns of its parent table, <structname>cities</>. State
1971    capitals also have an extra column, <structfield>state</>, that shows
1972    their state.
1973   </para>
1974
1975   <para>
1976    In <productname>PostgreSQL</productname>, a table can inherit from
1977    zero or more other tables, and a query can reference either all
1978    rows of a table or all rows of a table plus all of its descendant tables.
1979    The latter behavior is the default.
1980    For example, the following query finds the names of all cities,
1981    including state capitals, that are located at an altitude over
1982    500 feet:
1983
1984 <programlisting>
1985 SELECT name, altitude
1986     FROM cities
1987     WHERE altitude &gt; 500;
1988 </programlisting>
1989
1990    Given the sample data from the <productname>PostgreSQL</productname>
1991    tutorial (see <xref linkend="tutorial-sql-intro">), this returns:
1992
1993 <programlisting>
1994    name    | altitude
1995 -----------+----------
1996  Las Vegas |     2174
1997  Mariposa  |     1953
1998  Madison   |      845
1999 </programlisting>
2000   </para>
2001
2002   <para>
2003    On the other hand, the following query finds all the cities that
2004    are not state capitals and are situated at an altitude over 500 feet:
2005
2006 <programlisting>
2007 SELECT name, altitude
2008     FROM ONLY cities
2009     WHERE altitude &gt; 500;
2010
2011    name    | altitude
2012 -----------+----------
2013  Las Vegas |     2174
2014  Mariposa  |     1953
2015 </programlisting>
2016   </para>
2017
2018   <para>
2019    Here the <literal>ONLY</literal> keyword indicates that the query
2020    should apply only to <structname>cities</structname>, and not any tables
2021    below <structname>cities</structname> in the inheritance hierarchy.  Many
2022    of the commands that we have already discussed &mdash;
2023    <command>SELECT</command>, <command>UPDATE</command> and
2024    <command>DELETE</command> &mdash; support the
2025    <literal>ONLY</literal> keyword.
2026   </para>
2027
2028   <para>
2029    In some cases you might wish to know which table a particular row
2030    originated from. There is a system column called
2031    <structfield>tableoid</structfield> in each table which can tell you the
2032    originating table:
2033
2034 <programlisting>
2035 SELECT c.tableoid, c.name, c.altitude
2036 FROM cities c
2037 WHERE c.altitude &gt; 500;
2038 </programlisting>
2039
2040    which returns:
2041
2042 <programlisting>
2043  tableoid |   name    | altitude
2044 ----------+-----------+----------
2045    139793 | Las Vegas |     2174
2046    139793 | Mariposa  |     1953
2047    139798 | Madison   |      845
2048 </programlisting>
2049
2050    (If you try to reproduce this example, you will probably get
2051    different numeric OIDs.)  By doing a join with
2052    <structname>pg_class</> you can see the actual table names:
2053
2054 <programlisting>
2055 SELECT p.relname, c.name, c.altitude
2056 FROM cities c, pg_class p
2057 WHERE c.altitude &gt; 500 AND c.tableoid = p.oid;
2058 </programlisting>
2059
2060    which returns:
2061
2062 <programlisting>
2063  relname  |   name    | altitude
2064 ----------+-----------+----------
2065  cities   | Las Vegas |     2174
2066  cities   | Mariposa  |     1953
2067  capitals | Madison   |      845
2068 </programlisting>
2069   </para>
2070
2071   <para>
2072    Inheritance does not automatically propagate data from
2073    <command>INSERT</command> or <command>COPY</command> commands to
2074    other tables in the inheritance hierarchy. In our example, the
2075    following <command>INSERT</command> statement will fail:
2076 <programlisting>
2077 INSERT INTO cities (name, population, altitude, state)
2078 VALUES ('New York', NULL, NULL, 'NY');
2079 </programlisting>
2080    We might hope that the data would somehow be routed to the
2081    <structname>capitals</structname> table, but this does not happen:
2082    <command>INSERT</command> always inserts into exactly the table
2083    specified.  In some cases it is possible to redirect the insertion
2084    using a rule (see <xref linkend="rules">).  However that does not
2085    help for the above case because the <structname>cities</> table
2086    does not contain the column <structfield>state</>, and so the
2087    command will be rejected before the rule can be applied.
2088   </para>
2089
2090   <para>
2091    All check constraints and not-null constraints on a parent table are
2092    automatically inherited by its children.  Other types of constraints
2093    (unique, primary key, and foreign key constraints) are not inherited.
2094   </para>
2095
2096   <para>
2097    A table can inherit from more than one parent table, in which case it has
2098    the union of the columns defined by the parent tables.  Any columns
2099    declared in the child table's definition are added to these.  If the
2100    same column name appears in multiple parent tables, or in both a parent
2101    table and the child's definition, then these columns are <quote>merged</>
2102    so that there is only one such column in the child table.  To be merged,
2103    columns must have the same data types, else an error is raised.  The
2104    merged column will have copies of all the check constraints coming from
2105    any one of the column definitions it came from, and will be marked not-null
2106    if any of them are.
2107   </para>
2108
2109   <para>
2110    Table inheritance is typically established when the child table is
2111    created, using the <literal>INHERITS</> clause of the
2112    <xref linkend="sql-createtable">
2113    statement.
2114    Alternatively, a table which is already defined in a compatible way can
2115    have a new parent relationship added, using the <literal>INHERIT</literal>
2116    variant of <xref linkend="sql-altertable">.
2117    To do this the new child table must already include columns with
2118    the same names and types as the columns of the parent. It must also include
2119    check constraints with the same names and check expressions as those of the
2120    parent. Similarly an inheritance link can be removed from a child using the
2121    <literal>NO INHERIT</literal> variant of <command>ALTER TABLE</>.
2122    Dynamically adding and removing inheritance links like this can be useful
2123    when the inheritance relationship is being used for table
2124    partitioning (see <xref linkend="ddl-partitioning">).
2125   </para>
2126
2127   <para>
2128    One convenient way to create a compatible table that will later be made
2129    a new child is to use the <literal>LIKE</literal> clause in <command>CREATE
2130    TABLE</command>. This creates a new table with the same columns as
2131    the source table. If there are any <literal>CHECK</literal>
2132    constraints defined on the source table, the <literal>INCLUDING
2133    CONSTRAINTS</literal> option to <literal>LIKE</literal> should be
2134    specified, as the new child must have constraints matching the parent
2135    to be considered compatible.
2136   </para>
2137
2138   <para>
2139    A parent table cannot be dropped while any of its children remain. Neither
2140    can columns or check constraints of child tables be dropped or altered
2141    if they are inherited
2142    from any parent tables. If you wish to remove a table and all of its
2143    descendants, one easy way is to drop the parent table with the
2144    <literal>CASCADE</literal> option.
2145   </para>
2146
2147   <para>
2148    <xref linkend="sql-altertable"> will
2149    propagate any changes in column data definitions and check
2150    constraints down the inheritance hierarchy.  Again, dropping
2151    columns that are depended on by other tables is only possible when using
2152    the <literal>CASCADE</literal> option. <command>ALTER
2153    TABLE</command> follows the same rules for duplicate column merging
2154    and rejection that apply during <command>CREATE TABLE</command>.
2155   </para>
2156
2157   <para>
2158    Note how table access permissions are handled.  Querying a parent
2159    table can automatically access data in child tables without further
2160    access privilege checking.  This preserves the appearance that the
2161    data is (also) in the parent table.  Accessing the child tables
2162    directly is, however, not automatically allowed and would require
2163    further privileges to be granted.
2164   </para>
2165
2166  <sect2 id="ddl-inherit-caveats">
2167   <title>Caveats</title>
2168
2169   <para>
2170    Note that not all SQL commands are able to work on
2171    inheritance hierarchies.  Commands that are used for data querying,
2172    data modification, or schema modification
2173    (e.g., <literal>SELECT</literal>, <literal>UPDATE</literal>, <literal>DELETE</literal>,
2174    most variants of <literal>ALTER TABLE</literal>, but
2175    not <literal>INSERT</literal> and <literal>ALTER TABLE ...
2176    RENAME</literal>) typically default to including child tables and
2177    support the <literal>ONLY</literal> notation to exclude them.
2178    Commands that do database maintenance and tuning
2179    (e.g., <literal>REINDEX</literal>, <literal>VACUUM</literal>)
2180    typically only work on individual, physical tables and do no
2181    support recursing over inheritance hierarchies.  The respective
2182    behavior of each individual command is documented in the reference
2183    part (<xref linkend="sql-commands">).
2184   </para>
2185
2186   <para>
2187    A serious limitation of the inheritance feature is that indexes (including
2188    unique constraints) and foreign key constraints only apply to single
2189    tables, not to their inheritance children. This is true on both the
2190    referencing and referenced sides of a foreign key constraint. Thus,
2191    in the terms of the above example:
2192
2193    <itemizedlist>
2194     <listitem>
2195      <para>
2196       If we declared <structname>cities</>.<structfield>name</> to be
2197       <literal>UNIQUE</> or a <literal>PRIMARY KEY</>, this would not stop the
2198       <structname>capitals</> table from having rows with names duplicating
2199       rows in <structname>cities</>.  And those duplicate rows would by
2200       default show up in queries from <structname>cities</>.  In fact, by
2201       default <structname>capitals</> would have no unique constraint at all,
2202       and so could contain multiple rows with the same name.
2203       You could add a unique constraint to <structname>capitals</>, but this
2204       would not prevent duplication compared to <structname>cities</>.
2205      </para>
2206     </listitem>
2207
2208     <listitem>
2209      <para>
2210       Similarly, if we were to specify that
2211       <structname>cities</>.<structfield>name</> <literal>REFERENCES</> some
2212       other table, this constraint would not automatically propagate to
2213       <structname>capitals</>.  In this case you could work around it by
2214       manually adding the same <literal>REFERENCES</> constraint to
2215       <structname>capitals</>.
2216      </para>
2217     </listitem>
2218
2219     <listitem>
2220      <para>
2221       Specifying that another table's column <literal>REFERENCES
2222       cities(name)</> would allow the other table to contain city names, but
2223       not capital names.  There is no good workaround for this case.
2224      </para>
2225     </listitem>
2226    </itemizedlist>
2227
2228    These deficiencies will probably be fixed in some future release,
2229    but in the meantime considerable care is needed in deciding whether
2230    inheritance is useful for your application.
2231   </para>
2232
2233   <note>
2234    <title>Deprecated</title>
2235    <para>
2236      In releases of <productname>PostgreSQL</productname> prior to 7.1, the
2237      default behavior was not to include child tables in queries. This was
2238      found to be error prone and also in violation of the SQL
2239      standard.  You can get the pre-7.1 behavior by turning off the
2240      <xref linkend="guc-sql-inheritance"> configuration
2241      option.
2242    </para>
2243   </note>
2244
2245    </sect2>
2246   </sect1>
2247
2248   <sect1 id="ddl-partitioning">
2249    <title>Partitioning</title>
2250
2251    <indexterm>
2252     <primary>partitioning</primary>
2253    </indexterm>
2254
2255    <indexterm>
2256     <primary>table</primary>
2257     <secondary>partitioning</secondary>
2258    </indexterm>
2259
2260    <para>
2261     <productname>PostgreSQL</productname> supports basic table
2262     partitioning. This section describes why and how to implement
2263     partitioning as part of your database design.
2264    </para>
2265
2266    <sect2 id="ddl-partitioning-overview">
2267      <title>Overview</title>
2268
2269    <para>
2270     Partitioning refers to splitting what is logically one large table
2271     into smaller physical pieces.
2272     Partitioning can provide several benefits:
2273    <itemizedlist>
2274     <listitem>
2275      <para>
2276       Query performance can be improved dramatically in certain situations,
2277       particularly when most of the heavily accessed rows of the table are in a
2278       single partition or a small number of partitions.  The partitioning
2279       substitutes for leading columns of indexes, reducing index size and
2280       making it more likely that the heavily-used parts of the indexes
2281       fit in memory.
2282      </para>
2283     </listitem>
2284
2285     <listitem>
2286      <para>
2287       When queries or updates access a large percentage of a single
2288       partition, performance can be improved by taking advantage
2289       of sequential scan of that partition instead of using an
2290       index and random access reads scattered across the whole table.
2291      </para>
2292     </listitem>
2293
2294     <listitem>
2295      <para>
2296       Bulk loads and deletes can be accomplished by adding or removing
2297       partitions, if that requirement is planned into the partitioning design.
2298       <command>ALTER TABLE</> is far faster than a bulk operation.
2299       It also entirely avoids the <command>VACUUM</command>
2300       overhead caused by a bulk <command>DELETE</>.
2301      </para>
2302     </listitem>
2303
2304     <listitem>
2305      <para>
2306       Seldom-used data can be migrated to cheaper and slower storage media.
2307      </para>
2308     </listitem>
2309    </itemizedlist>
2310
2311     The benefits will normally be worthwhile only when a table would
2312     otherwise be very large. The exact point at which a table will
2313     benefit from partitioning depends on the application, although a
2314     rule of thumb is that the size of the table should exceed the physical
2315     memory of the database server.
2316    </para>
2317
2318    <para>
2319     Currently, <productname>PostgreSQL</productname> supports partitioning
2320     via table inheritance.  Each partition must be created as a child
2321     table of a single parent table.  The parent table itself is normally
2322     empty; it exists just to represent the entire data set.  You should be
2323     familiar with inheritance (see <xref linkend="ddl-inherit">) before
2324     attempting to set up partitioning.
2325    </para>
2326
2327    <para>
2328     The following forms of partitioning can be implemented in
2329     <productname>PostgreSQL</productname>:
2330
2331     <variablelist>
2332      <varlistentry>
2333       <term>Range Partitioning</term>
2334
2335       <listitem>
2336        <para>
2337         The table is partitioned into <quote>ranges</quote> defined
2338         by a key column or set of columns, with no overlap between
2339         the ranges of values assigned to different partitions.  For
2340         example one might partition by date ranges, or by ranges of
2341         identifiers for particular business objects.
2342        </para>
2343       </listitem>
2344      </varlistentry>
2345
2346      <varlistentry>
2347       <term>List Partitioning</term>
2348
2349       <listitem>
2350        <para>
2351         The table is partitioned by explicitly listing which key values
2352         appear in each partition.
2353        </para>
2354       </listitem>
2355      </varlistentry>
2356     </variablelist>
2357    </para>
2358    </sect2>
2359
2360    <sect2 id="ddl-partitioning-implementation">
2361      <title>Implementing Partitioning</title>
2362
2363     <para>
2364      To set up a partitioned table, do the following:
2365      <orderedlist spacing="compact">
2366       <listitem>
2367        <para>
2368         Create the <quote>master</quote> table, from which all of the
2369         partitions will inherit.
2370        </para>
2371        <para>
2372         This table will contain no data.  Do not define any check
2373         constraints on this table, unless you intend them to
2374         be applied equally to all partitions.  There is no point
2375         in defining any indexes or unique constraints on it, either.
2376        </para>
2377       </listitem>
2378
2379       <listitem>
2380        <para>
2381         Create several <quote>child</quote> tables that each inherit from
2382         the master table.  Normally, these tables will not add any columns
2383         to the set inherited from the master.
2384        </para>
2385
2386        <para>
2387         We will refer to the child tables as partitions, though they
2388         are in every way normal <productname>PostgreSQL</> tables.
2389        </para>
2390       </listitem>
2391
2392       <listitem>
2393        <para>
2394         Add table constraints to the partition tables to define the
2395         allowed key values in each partition.
2396        </para>
2397
2398        <para>
2399         Typical examples would be:
2400 <programlisting>
2401 CHECK ( x = 1 )
2402 CHECK ( county IN ( 'Oxfordshire', 'Buckinghamshire', 'Warwickshire' ))
2403 CHECK ( outletID &gt;= 100 AND outletID &lt; 200 )
2404 </programlisting>
2405         Ensure that the constraints guarantee that there is no overlap
2406         between the key values permitted in different partitions.  A common
2407         mistake is to set up range constraints like:
2408 <programlisting>
2409 CHECK ( outletID BETWEEN 100 AND 200 )
2410 CHECK ( outletID BETWEEN 200 AND 300 )
2411 </programlisting>
2412         This is wrong since it is not clear which partition the key value
2413         200 belongs in.
2414        </para>
2415
2416        <para>
2417         Note that there is no difference in
2418         syntax between range and list partitioning; those terms are
2419         descriptive only.
2420        </para>
2421       </listitem>
2422
2423       <listitem>
2424        <para>
2425         For each partition, create an index on the key column(s),
2426         as well as any other indexes you might want.  (The key index is
2427         not strictly necessary, but in most scenarios it is helpful.
2428         If you intend the key values to be unique then you should
2429         always create a unique or primary-key constraint for each
2430         partition.)
2431        </para>
2432       </listitem>
2433
2434       <listitem>
2435        <para>
2436         Optionally, define a trigger or rule to redirect data inserted into
2437         the master table to the appropriate partition.
2438        </para>
2439       </listitem>
2440
2441       <listitem>
2442        <para>
2443         Ensure that the <xref linkend="guc-constraint-exclusion">
2444         configuration parameter is not disabled in
2445         <filename>postgresql.conf</>.
2446         If it is, queries will not be optimized as desired.
2447        </para>
2448       </listitem>
2449
2450      </orderedlist>
2451     </para>
2452
2453     <para>
2454      For example, suppose we are constructing a database for a large
2455      ice cream company. The company measures peak temperatures every
2456      day as well as ice cream sales in each region. Conceptually,
2457      we want a table like:
2458
2459 <programlisting>
2460 CREATE TABLE measurement (
2461     city_id         int not null,
2462     logdate         date not null,
2463     peaktemp        int,
2464     unitsales       int
2465 );
2466 </programlisting>
2467
2468      We know that most queries will access just the last week's, month's or
2469      quarter's data, since the main use of this table will be to prepare
2470      online reports for management.
2471      To reduce the amount of old data that needs to be stored, we
2472      decide to only keep the most recent 3 years worth of data. At the
2473      beginning of each month we will remove the oldest month's data.
2474     </para>
2475
2476     <para>
2477      In this situation we can use partitioning to help us meet all of our
2478      different requirements for the measurements table. Following the
2479      steps outlined above, partitioning can be set up as follows:
2480     </para>
2481
2482     <para>
2483      <orderedlist spacing="compact">
2484       <listitem>
2485        <para>
2486         The master table is the <structname>measurement</> table, declared
2487         exactly as above.
2488        </para>
2489       </listitem>
2490
2491       <listitem>
2492        <para>
2493         Next we create one partition for each active month:
2494
2495 <programlisting>
2496 CREATE TABLE measurement_y2006m02 ( ) INHERITS (measurement);
2497 CREATE TABLE measurement_y2006m03 ( ) INHERITS (measurement);
2498 ...
2499 CREATE TABLE measurement_y2007m11 ( ) INHERITS (measurement);
2500 CREATE TABLE measurement_y2007m12 ( ) INHERITS (measurement);
2501 CREATE TABLE measurement_y2008m01 ( ) INHERITS (measurement);
2502 </programlisting>
2503
2504         Each of the partitions are complete tables in their own right,
2505         but they inherit their definitions from the
2506         <structname>measurement</> table.
2507        </para>
2508
2509        <para>
2510         This solves one of our problems: deleting old data. Each
2511         month, all we will need to do is perform a <command>DROP
2512         TABLE</command> on the oldest child table and create a new
2513         child table for the new month's data.
2514        </para>
2515       </listitem>
2516
2517       <listitem>
2518        <para>
2519         We must provide non-overlapping table constraints.  Rather than
2520         just creating the partition tables as above, the table creation
2521         script should really be:
2522
2523 <programlisting>
2524 CREATE TABLE measurement_y2006m02 (
2525     CHECK ( logdate &gt;= DATE '2006-02-01' AND logdate &lt; DATE '2006-03-01' )
2526 ) INHERITS (measurement);
2527 CREATE TABLE measurement_y2006m03 (
2528     CHECK ( logdate &gt;= DATE '2006-03-01' AND logdate &lt; DATE '2006-04-01' )
2529 ) INHERITS (measurement);
2530 ...
2531 CREATE TABLE measurement_y2007m11 (
2532     CHECK ( logdate &gt;= DATE '2007-11-01' AND logdate &lt; DATE '2007-12-01' )
2533 ) INHERITS (measurement);
2534 CREATE TABLE measurement_y2007m12 (
2535     CHECK ( logdate &gt;= DATE '2007-12-01' AND logdate &lt; DATE '2008-01-01' )
2536 ) INHERITS (measurement);
2537 CREATE TABLE measurement_y2008m01 (
2538     CHECK ( logdate &gt;= DATE '2008-01-01' AND logdate &lt; DATE '2008-02-01' )
2539 ) INHERITS (measurement);
2540 </programlisting>
2541        </para>
2542       </listitem>
2543
2544       <listitem>
2545        <para>
2546         We probably need indexes on the key columns too:
2547
2548 <programlisting>
2549 CREATE INDEX measurement_y2006m02_logdate ON measurement_y2006m02 (logdate);
2550 CREATE INDEX measurement_y2006m03_logdate ON measurement_y2006m03 (logdate);
2551 ...
2552 CREATE INDEX measurement_y2007m11_logdate ON measurement_y2007m11 (logdate);
2553 CREATE INDEX measurement_y2007m12_logdate ON measurement_y2007m12 (logdate);
2554 CREATE INDEX measurement_y2008m01_logdate ON measurement_y2008m01 (logdate);
2555 </programlisting>
2556
2557         We choose not to add further indexes at this time.
2558        </para>
2559       </listitem>
2560
2561       <listitem>
2562        <para>
2563         We want our application to be able to say <literal>INSERT INTO
2564         measurement ...</> and have the data be redirected into the
2565         appropriate partition table.  We can arrange that by attaching
2566         a suitable trigger function to the master table.
2567         If data will be added only to the latest partition, we can
2568         use a very simple trigger function:
2569
2570 <programlisting>
2571 CREATE OR REPLACE FUNCTION measurement_insert_trigger()
2572 RETURNS TRIGGER AS $$
2573 BEGIN
2574     INSERT INTO measurement_y2008m01 VALUES (NEW.*);
2575     RETURN NULL;
2576 END;
2577 $$
2578 LANGUAGE plpgsql;
2579 </programlisting>
2580
2581         After creating the function, we create a trigger which
2582         calls the trigger function:
2583
2584 <programlisting>
2585 CREATE TRIGGER insert_measurement_trigger
2586     BEFORE INSERT ON measurement
2587     FOR EACH ROW EXECUTE PROCEDURE measurement_insert_trigger();
2588 </programlisting>
2589
2590         We must redefine the trigger function each month so that it always
2591         points to the current partition.  The trigger definition does
2592         not need to be updated, however.
2593        </para>
2594
2595        <para>
2596         We might want to insert data and have the server automatically
2597         locate the partition into which the row should be added. We
2598         could do this with a more complex trigger function, for example:
2599
2600 <programlisting>
2601 CREATE OR REPLACE FUNCTION measurement_insert_trigger()
2602 RETURNS TRIGGER AS $$
2603 BEGIN
2604     IF ( NEW.logdate &gt;= DATE '2006-02-01' AND
2605          NEW.logdate &lt; DATE '2006-03-01' ) THEN
2606         INSERT INTO measurement_y2006m02 VALUES (NEW.*);
2607     ELSIF ( NEW.logdate &gt;= DATE '2006-03-01' AND
2608             NEW.logdate &lt; DATE '2006-04-01' ) THEN
2609         INSERT INTO measurement_y2006m03 VALUES (NEW.*);
2610     ...
2611     ELSIF ( NEW.logdate &gt;= DATE '2008-01-01' AND
2612             NEW.logdate &lt; DATE '2008-02-01' ) THEN
2613         INSERT INTO measurement_y2008m01 VALUES (NEW.*);
2614     ELSE
2615         RAISE EXCEPTION 'Date out of range.  Fix the measurement_insert_trigger() function!';
2616     END IF;
2617     RETURN NULL;
2618 END;
2619 $$
2620 LANGUAGE plpgsql;
2621 </programlisting>
2622
2623         The trigger definition is the same as before.
2624         Note that each <literal>IF</literal> test must exactly match the
2625         <literal>CHECK</literal> constraint for its partition.
2626        </para>
2627
2628        <para>
2629         While this function is more complex than the single-month case,
2630         it doesn't need to be updated as often, since branches can be
2631         added in advance of being needed.
2632        </para>
2633
2634        <note>
2635         <para>
2636          In practice it might be best to check the newest partition first,
2637          if most inserts go into that partition.  For simplicity we have
2638          shown the trigger's tests in the same order as in other parts
2639          of this example.
2640         </para>
2641        </note>
2642       </listitem>
2643      </orderedlist>
2644     </para>
2645
2646     <para>
2647      As we can see, a complex partitioning scheme could require a
2648      substantial amount of DDL. In the above example we would be
2649      creating a new partition each month, so it might be wise to write a
2650      script that generates the required DDL automatically.
2651     </para>
2652
2653    </sect2>
2654
2655    <sect2 id="ddl-partitioning-managing-partitions">
2656    <title>Managing Partitions</title>
2657
2658    <para>
2659      Normally the set of partitions established when initially
2660      defining the table are not intended to remain static. It is
2661      common to want to remove old partitions of data and periodically
2662      add new partitions for new data. One of the most important
2663      advantages of partitioning is precisely that it allows this
2664      otherwise painful task to be executed nearly instantaneously by
2665      manipulating the partition structure, rather than physically moving large
2666      amounts of data around.
2667    </para>
2668
2669    <para>
2670      The simplest option for removing old data is simply to drop the partition
2671      that is no longer necessary:
2672 <programlisting>
2673 DROP TABLE measurement_y2006m02;
2674 </programlisting>
2675      This can very quickly delete millions of records because it doesn't have
2676      to individually delete every record.
2677    </para>
2678
2679    <para>
2680      Another option that is often preferable is to remove the partition from
2681      the partitioned table but retain access to it as a table in its own
2682      right:
2683 <programlisting>
2684 ALTER TABLE measurement_y2006m02 NO INHERIT measurement;
2685 </programlisting>
2686      This allows further operations to be performed on the data before
2687      it is dropped. For example, this is often a useful time to back up
2688      the data using <command>COPY</>, <application>pg_dump</>, or
2689      similar tools. It might also be a useful time to aggregate data
2690      into smaller formats, perform other data manipulations, or run
2691      reports.
2692    </para>
2693
2694    <para>
2695      Similarly we can add a new partition to handle new data. We can create an
2696      empty partition in the partitioned table just as the original partitions
2697      were created above:
2698
2699 <programlisting>
2700 CREATE TABLE measurement_y2008m02 (
2701     CHECK ( logdate &gt;= DATE '2008-02-01' AND logdate &lt; DATE '2008-03-01' )
2702 ) INHERITS (measurement);
2703 </programlisting>
2704
2705      As an alternative, it is sometimes more convenient to create the
2706      new table outside the partition structure, and make it a proper
2707      partition later. This allows the data to be loaded, checked, and
2708      transformed prior to it appearing in the partitioned table:
2709
2710 <programlisting>
2711 CREATE TABLE measurement_y2008m02
2712   (LIKE measurement INCLUDING DEFAULTS INCLUDING CONSTRAINTS);
2713 ALTER TABLE measurement_y2008m02 ADD CONSTRAINT y2008m02
2714    CHECK ( logdate &gt;= DATE '2008-02-01' AND logdate &lt; DATE '2008-03-01' );
2715 \copy measurement_y2008m02 from 'measurement_y2008m02'
2716 -- possibly some other data preparation work
2717 ALTER TABLE measurement_y2008m02 INHERIT measurement;
2718 </programlisting>
2719     </para>
2720    </sect2>
2721
2722    <sect2 id="ddl-partitioning-constraint-exclusion">
2723    <title>Partitioning and Constraint Exclusion</title>
2724
2725    <indexterm>
2726     <primary>constraint exclusion</primary>
2727    </indexterm>
2728
2729    <para>
2730     <firstterm>Constraint exclusion</> is a query optimization technique
2731     that improves performance for partitioned tables defined in the
2732     fashion described above.  As an example:
2733
2734 <programlisting>
2735 SET constraint_exclusion = on;
2736 SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2008-01-01';
2737 </programlisting>
2738
2739     Without constraint exclusion, the above query would scan each of
2740     the partitions of the <structname>measurement</> table. With constraint
2741     exclusion enabled, the planner will examine the constraints of each
2742     partition and try to prove that the partition need not
2743     be scanned because it could not contain any rows meeting the query's
2744     <literal>WHERE</> clause.  When the planner can prove this, it
2745     excludes the partition from the query plan.
2746    </para>
2747
2748    <para>
2749     You can use the <command>EXPLAIN</> command to show the difference
2750     between a plan with <varname>constraint_exclusion</> on and a plan
2751     with it off.  A typical unoptimized plan for this type of table setup is:
2752
2753 <programlisting>
2754 SET constraint_exclusion = off;
2755 EXPLAIN SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2008-01-01';
2756
2757                                           QUERY PLAN
2758 -----------------------------------------------------------------------------------------------
2759  Aggregate  (cost=158.66..158.68 rows=1 width=0)
2760    -&gt;  Append  (cost=0.00..151.88 rows=2715 width=0)
2761          -&gt;  Seq Scan on measurement  (cost=0.00..30.38 rows=543 width=0)
2762                Filter: (logdate &gt;= '2008-01-01'::date)
2763          -&gt;  Seq Scan on measurement_y2006m02 measurement  (cost=0.00..30.38 rows=543 width=0)
2764                Filter: (logdate &gt;= '2008-01-01'::date)
2765          -&gt;  Seq Scan on measurement_y2006m03 measurement  (cost=0.00..30.38 rows=543 width=0)
2766                Filter: (logdate &gt;= '2008-01-01'::date)
2767 ...
2768          -&gt;  Seq Scan on measurement_y2007m12 measurement  (cost=0.00..30.38 rows=543 width=0)
2769                Filter: (logdate &gt;= '2008-01-01'::date)
2770          -&gt;  Seq Scan on measurement_y2008m01 measurement  (cost=0.00..30.38 rows=543 width=0)
2771                Filter: (logdate &gt;= '2008-01-01'::date)
2772 </programlisting>
2773
2774     Some or all of the partitions might use index scans instead of
2775     full-table sequential scans, but the point here is that there
2776     is no need to scan the older partitions at all to answer this query.
2777     When we enable constraint exclusion, we get a significantly
2778     cheaper plan that will deliver the same answer:
2779
2780 <programlisting>
2781 SET constraint_exclusion = on;
2782 EXPLAIN SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2008-01-01';
2783                                           QUERY PLAN
2784 -----------------------------------------------------------------------------------------------
2785  Aggregate  (cost=63.47..63.48 rows=1 width=0)
2786    -&gt;  Append  (cost=0.00..60.75 rows=1086 width=0)
2787          -&gt;  Seq Scan on measurement  (cost=0.00..30.38 rows=543 width=0)
2788                Filter: (logdate &gt;= '2008-01-01'::date)
2789          -&gt;  Seq Scan on measurement_y2008m01 measurement  (cost=0.00..30.38 rows=543 width=0)
2790                Filter: (logdate &gt;= '2008-01-01'::date)
2791 </programlisting>
2792    </para>
2793
2794    <para>
2795     Note that constraint exclusion is driven only by <literal>CHECK</>
2796     constraints, not by the presence of indexes.  Therefore it isn't
2797     necessary to define indexes on the key columns.  Whether an index
2798     needs to be created for a given partition depends on whether you
2799     expect that queries that scan the partition will generally scan
2800     a large part of the partition or just a small part.  An index will
2801     be helpful in the latter case but not the former.
2802    </para>
2803
2804    <para>
2805     The default (and recommended) setting of
2806     <xref linkend="guc-constraint-exclusion"> is actually neither
2807     <literal>on</> nor <literal>off</>, but an intermediate setting
2808     called <literal>partition</>, which causes the technique to be
2809     applied only to queries that are likely to be working on partitioned
2810     tables.  The <literal>on</> setting causes the planner to examine
2811     <literal>CHECK</> constraints in all queries, even simple ones that
2812     are unlikely to benefit.
2813    </para>
2814
2815    </sect2>
2816
2817    <sect2 id="ddl-partitioning-alternatives">
2818    <title>Alternative Partitioning Methods</title>
2819
2820     <para>
2821      A different approach to redirecting inserts into the appropriate
2822      partition table is to set up rules, instead of a trigger, on the
2823      master table.  For example:
2824
2825 <programlisting>
2826 CREATE RULE measurement_insert_y2006m02 AS
2827 ON INSERT TO measurement WHERE
2828     ( logdate &gt;= DATE '2006-02-01' AND logdate &lt; DATE '2006-03-01' )
2829 DO INSTEAD
2830     INSERT INTO measurement_y2006m02 VALUES (NEW.*);
2831 ...
2832 CREATE RULE measurement_insert_y2008m01 AS
2833 ON INSERT TO measurement WHERE
2834     ( logdate &gt;= DATE '2008-01-01' AND logdate &lt; DATE '2008-02-01' )
2835 DO INSTEAD
2836     INSERT INTO measurement_y2008m01 VALUES (NEW.*);
2837 </programlisting>
2838
2839      A rule has significantly more overhead than a trigger, but the overhead
2840      is paid once per query rather than once per row, so this method might be
2841      advantageous for bulk-insert situations.  In most cases, however, the
2842      trigger method will offer better performance.
2843     </para>
2844
2845     <para>
2846      Be aware that <command>COPY</> ignores rules.  If you want to
2847      use <command>COPY</> to insert data, you'll need to copy into the correct
2848      partition table rather than into the master.  <command>COPY</> does fire
2849      triggers, so you can use it normally if you use the trigger approach.
2850     </para>
2851
2852     <para>
2853      Another disadvantage of the rule approach is that there is no simple
2854      way to force an error if the set of rules doesn't cover the insertion
2855      date; the data will silently go into the master table instead.
2856     </para>
2857
2858     <para>
2859      Partitioning can also be arranged using a <literal>UNION ALL</literal>
2860      view, instead of table inheritance.  For example,
2861
2862 <programlisting>
2863 CREATE VIEW measurement AS
2864           SELECT * FROM measurement_y2006m02
2865 UNION ALL SELECT * FROM measurement_y2006m03
2866 ...
2867 UNION ALL SELECT * FROM measurement_y2007m11
2868 UNION ALL SELECT * FROM measurement_y2007m12
2869 UNION ALL SELECT * FROM measurement_y2008m01;
2870 </programlisting>
2871
2872      However, the need to recreate the view adds an extra step to adding and
2873      dropping individual partitions of the data set.  In practice this
2874      method has little to recommend it compared to using inheritance.
2875     </para>
2876
2877    </sect2>
2878
2879    <sect2 id="ddl-partitioning-caveats">
2880    <title>Caveats</title>
2881
2882    <para>
2883     The following caveats apply to partitioned tables:
2884    <itemizedlist>
2885     <listitem>
2886      <para>
2887       There is no automatic way to verify that all of the
2888       <literal>CHECK</literal> constraints are mutually
2889       exclusive.  It is safer to create code that generates
2890       partitions and creates and/or modifies associated objects than
2891       to write each by hand.
2892      </para>
2893     </listitem>
2894
2895     <listitem>
2896      <para>
2897       The schemes shown here assume that the partition key column(s)
2898       of a row never change, or at least do not change enough to require
2899       it to move to another partition.  An <command>UPDATE</> that attempts
2900       to do that will fail because of the <literal>CHECK</> constraints.
2901       If you need to handle such cases, you can put suitable update triggers
2902       on the partition tables, but it makes management of the structure
2903       much more complicated.
2904      </para>
2905     </listitem>
2906
2907     <listitem>
2908      <para>
2909       If you are using manual <command>VACUUM</command> or
2910       <command>ANALYZE</command> commands, don't forget that
2911       you need to run them on each partition individually. A command like:
2912 <programlisting>
2913 ANALYZE measurement;
2914 </programlisting>
2915       will only process the master table.
2916      </para>
2917     </listitem>
2918
2919    </itemizedlist>
2920    </para>
2921
2922    <para>
2923     The following caveats apply to constraint exclusion:
2924
2925    <itemizedlist>
2926     <listitem>
2927      <para>
2928       Constraint exclusion only works when the query's <literal>WHERE</>
2929       clause contains constants.  A parameterized query will not be
2930       optimized, since the planner cannot know which partitions the
2931       parameter value might select at run time.  For the same reason,
2932       <quote>stable</> functions such as <function>CURRENT_DATE</function>
2933       must be avoided.
2934      </para>
2935     </listitem>
2936
2937     <listitem>
2938      <para>
2939       Keep the partitioning constraints simple, else the planner may not be
2940       able to prove that partitions don't need to be visited.  Use simple
2941       equality conditions for list partitioning, or simple
2942       range tests for range partitioning, as illustrated in the preceding
2943       examples.  A good rule of thumb is that partitioning constraints should
2944       contain only comparisons of the partitioning column(s) to constants
2945       using B-tree-indexable operators.
2946      </para>
2947     </listitem>
2948
2949     <listitem>
2950      <para>
2951       All constraints on all partitions of the master table are examined
2952       during constraint exclusion, so large numbers of partitions are likely
2953       to increase query planning time considerably.  Partitioning using
2954       these techniques will work well with up to perhaps a hundred partitions;
2955       don't try to use many thousands of partitions.
2956      </para>
2957     </listitem>
2958
2959    </itemizedlist>
2960    </para>
2961   </sect2>
2962  </sect1>
2963
2964  <sect1 id="ddl-others">
2965   <title>Other Database Objects</title>
2966
2967   <para>
2968    Tables are the central objects in a relational database structure,
2969    because they hold your data.  But they are not the only objects
2970    that exist in a database.  Many other kinds of objects can be
2971    created to make the use and management of the data more efficient
2972    or convenient.  They are not discussed in this chapter, but we give
2973    you a list here so that you are aware of what is possible:
2974   </para>
2975
2976   <itemizedlist>
2977    <listitem>
2978     <para>
2979      Views
2980     </para>
2981    </listitem>
2982
2983    <listitem>
2984     <para>
2985      Functions and operators
2986     </para>
2987    </listitem>
2988
2989    <listitem>
2990     <para>
2991      Data types and domains
2992     </para>
2993    </listitem>
2994
2995    <listitem>
2996     <para>
2997      Triggers and rewrite rules
2998     </para>
2999    </listitem>
3000   </itemizedlist>
3001
3002   <para>
3003    Detailed information on
3004    these topics appears in <xref linkend="server-programming">.
3005   </para>
3006  </sect1>
3007
3008  <sect1 id="ddl-depend">
3009   <title>Dependency Tracking</title>
3010
3011   <indexterm zone="ddl-depend">
3012    <primary>CASCADE</primary>
3013    <secondary sortas="DROP">with DROP</secondary>
3014   </indexterm>
3015
3016   <indexterm zone="ddl-depend">
3017    <primary>RESTRICT</primary>
3018    <secondary sortas="DROP">with DROP</secondary>
3019   </indexterm>
3020
3021   <para>
3022    When you create complex database structures involving many tables
3023    with foreign key constraints, views, triggers, functions, etc. you
3024    implicitly create a net of dependencies between the objects.
3025    For instance, a table with a foreign key constraint depends on the
3026    table it references.
3027   </para>
3028
3029   <para>
3030    To ensure the integrity of the entire database structure,
3031    <productname>PostgreSQL</productname> makes sure that you cannot
3032    drop objects that other objects still depend on.  For example,
3033    attempting to drop the products table we had considered in <xref
3034    linkend="ddl-constraints-fk">, with the orders table depending on
3035    it, would result in an error message such as this:
3036 <screen>
3037 DROP TABLE products;
3038
3039 NOTICE:  constraint orders_product_no_fkey on table orders depends on table products
3040 ERROR:  cannot drop table products because other objects depend on it
3041 HINT:  Use DROP ... CASCADE to drop the dependent objects too.
3042 </screen>
3043    The error message contains a useful hint: if you do not want to
3044    bother deleting all the dependent objects individually, you can run:
3045 <screen>
3046 DROP TABLE products CASCADE;
3047 </screen>
3048    and all the dependent objects will be removed.  In this case, it
3049    doesn't remove the orders table, it only removes the foreign key
3050    constraint.  (If you want to check what <command>DROP ... CASCADE</> will do,
3051    run <command>DROP</> without <literal>CASCADE</> and read the <literal>NOTICE</> messages.)
3052   </para>
3053
3054   <para>
3055    All drop commands in <productname>PostgreSQL</productname> support
3056    specifying <literal>CASCADE</literal>.  Of course, the nature of
3057    the possible dependencies varies with the type of the object.  You
3058    can also write <literal>RESTRICT</literal> instead of
3059    <literal>CASCADE</literal> to get the default behavior, which is to
3060    prevent the dropping of objects that other objects depend on.
3061   </para>
3062
3063   <note>
3064    <para>
3065     According to the SQL standard, specifying either
3066     <literal>RESTRICT</literal> or <literal>CASCADE</literal> is
3067     required.  No database system actually enforces that rule, but
3068     whether the default behavior is <literal>RESTRICT</literal> or
3069     <literal>CASCADE</literal> varies across systems.
3070    </para>
3071   </note>
3072
3073   <note>
3074    <para>
3075     Foreign key constraint dependencies and serial column dependencies
3076     from <productname>PostgreSQL</productname> versions prior to 7.3
3077     are <emphasis>not</emphasis> maintained or created during the
3078     upgrade process.  All other dependency types will be properly
3079     created during an upgrade from a pre-7.3 database.
3080    </para>
3081   </note>
3082  </sect1>
3083
3084 </chapter>