granicus.if.org Git - postgresql/blob - doc/src/sgml/ddl.sgml

   1 <!-- $PostgreSQL: pgsql/doc/src/sgml/ddl.sgml,v 1.30 2004/08/08 21:33:11 tgl Exp $ -->
   2
   3 <chapter id="ddl">
   4  <title>Data Definition</title>
   5
   6  <para>
   7   This chapter covers how one creates the database structures that
   8   will hold one's data.  In a relational database, the raw data is
   9   stored in tables, so the majority of this chapter is devoted to
  10   explaining how tables are created and modified and what features are
  11   available to control what data is stored in the tables.
  12   Subsequently, we discuss how tables can be organized into
  13   schemas, and how privileges can be assigned to tables.  Finally,
  14   we will briefly look at other features that affect the data storage,
  15   such as views, functions, and triggers.
  16  </para>
  17
  18  <sect1 id="ddl-basics">
  19   <title>Table Basics</title>
  20
  21   <indexterm zone="ddl-basics">
  22    <primary>table</primary>
  23   </indexterm>
  24
  25   <indexterm>
  26    <primary>row</primary>
  27   </indexterm>
  28
  29   <indexterm>
  30    <primary>column</primary>
  31   </indexterm>
  32
  33   <para>
  34    A table in a relational database is much like a table on paper: It
  35    consists of rows and columns.  The number and order of the columns
  36    is fixed, and each column has a name.  The number of rows is
  37    variable -- it reflects how much data is stored at a given moment.
  38    SQL does not make any guarantees about the order of the rows in a
  39    table.  When a table is read, the rows will appear in random order,
  40    unless sorting is explicitly requested.  This is covered in <xref
  41    linkend="queries">.  Furthermore, SQL does not assign unique
  42    identifiers to rows, so it is possible to have several completely
  43    identical rows in a table.  This is a consequence of the
  44    mathematical model that underlies SQL but is usually not desirable.
  45    Later in this chapter we will see how to deal with this issue.
  46   </para>
  47
  48   <para>
  49    Each column has a data type.  The data type constrains the set of
  50    possible values that can be assigned to a column and assigns
  51    semantics to the data stored in the column so that it can be used
  52    for computations.  For instance, a column declared to be of a
  53    numerical type will not accept arbitrary text strings, and the data
  54    stored in such a column can be used for mathematical computations.
  55    By contrast, a column declared to be of a character string type
  56    will accept almost any kind of data but it does not lend itself to
  57    mathematical calculations, although other operations such as string
  58    concatenation are available.
  59   </para>
  60
  61   <para>
  62    <productname>PostgreSQL</productname> includes a sizable set of
  63    built-in data types that fit many applications.  Users can also
  64    define their own data types.  Most built-in data types have obvious
  65    names and semantics, so we defer a detailed explanation to <xref
  66    linkend="datatype">.  Some of the frequently used data types are
  67    <type>integer</type> for whole numbers, <type>numeric</type> for
  68    possibly fractional numbers, <type>text</type> for character
  69    strings, <type>date</type> for dates, <type>time</type> for
  70    time-of-day values, and <type>timestamp</type> for values
  71    containing both date and time.
  72   </para>
  73
  74   <indexterm>
  75    <primary>table</primary>
  76    <secondary>creating</secondary>
  77   </indexterm>
  78
  79   <para>
  80    To create a table, you use the aptly named <command>CREATE
  81    TABLE</command> command.  In this command you specify at least a
  82    name for the new table, the names of the columns and the data type
  83    of each column.  For example:
  84 <programlisting>
  85 CREATE TABLE my_first_table (
  86     first_column text,
  87     second_column integer
  88 );
  89 </programlisting>
  90    This creates a table named <literal>my_first_table</literal> with
  91    two columns.  The first column is named
  92    <literal>first_column</literal> and has a data type of
  93    <type>text</type>; the second column has the name
  94    <literal>second_column</literal> and the type <type>integer</type>.
  95    The table and column names follow the identifier syntax explained
  96    in <xref linkend="sql-syntax-identifiers">.  The type names are
  97    usually also identifiers, but there are some exceptions.  Note that the
  98    column list is comma-separated and surrounded by parentheses.
  99   </para>
 100
 101   <para>
 102    Of course, the previous example was heavily contrived.  Normally,
 103    you would give names to your tables and columns that convey what
 104    kind of data they store.  So let's look at a more realistic
 105    example:
 106 <programlisting>
 107 CREATE TABLE products (
 108     product_no integer,
 109     name text,
 110     price numeric
 111 );
 112 </programlisting>
 113    (The <type>numeric</type> type can store fractional components, as
 114    would be typical of monetary amounts.)
 115   </para>
 116
 117   <tip>
 118    <para>
 119     When you create many interrelated tables it is wise to choose a
 120     consistent naming pattern for the tables and columns.  For
 121     instance, there is a choice of using singular or plural nouns for
 122     table names, both of which are favored by some theorist or other.
 123    </para>
 124   </tip>
 125
 126   <para>
 127    There is a limit on how many columns a table can contain.
 128    Depending on the column types, it is between 250 and 1600.
 129    However, defining a table with anywhere near this many columns is
 130    highly unusual and often a questionable design.
 131   </para>
 132
 133   <indexterm>
 134    <primary>table</primary>
 135    <secondary>removing</secondary>
 136   </indexterm>
 137
 138   <para>
 139    If you no longer need a table, you can remove it using the
 140    <command>DROP TABLE</command> command.  For example:
 141 <programlisting>
 142 DROP TABLE my_first_table;
 143 DROP TABLE products;
 144 </programlisting>
 145    Attempting to drop a table that does not exist is an error.
 146    Nevertheless, it is common in SQL script files to unconditionally
 147    try to drop each table before creating it, ignoring the error
 148    messages.
 149   </para>
 150
 151   <para>
 152    If you need to modify a table that already exists look into <xref
 153    linkend="ddl-alter"> later in this chapter.
 154   </para>
 155
 156   <para>
 157    With the tools discussed so far you can create fully functional
 158    tables.  The remainder of this chapter is concerned with adding
 159    features to the table definition to ensure data integrity,
 160    security, or convenience.  If you are eager to fill your tables with
 161    data now you can skip ahead to <xref linkend="dml"> and read the
 162    rest of this chapter later.
 163   </para>
 164  </sect1>
 165
 166  <sect1 id="ddl-system-columns">
 167   <title>System Columns</title>
 168
 169   <para>
 170    Every table has several <firstterm>system columns</> that are
 171    implicitly defined by the system.  Therefore, these names cannot be
 172    used as names of user-defined columns.  (Note that these
 173    restrictions are separate from whether the name is a key word or
 174    not; quoting a name will not allow you to escape these
 175    restrictions.)  You do not really need to be concerned about these
 176    columns, just know they exist.
 177   </para>
 178
 179   <indexterm>
 180    <primary>column</primary>
 181    <secondary>system column</secondary>
 182   </indexterm>
 183
 184   <variablelist>
 185    <varlistentry>
 186     <term><structfield>oid</></term>
 187     <listitem>
 188      <para>
 189       <indexterm>
 190        <primary>OID</primary>
 191        <secondary>column</secondary>
 192       </indexterm>
 193       The object identifier (object ID) of a row.  This is a serial
 194       number that is automatically added by
 195       <productname>PostgreSQL</productname> to all table rows (unless
 196       the table was created using <literal>WITHOUT OIDS</literal>, in which
 197       case this column is not present).  This column is of type
 198       <type>oid</type> (same name as the column); see <xref
 199       linkend="datatype-oid"> for more information about the type.
 200      </para>
 201     </listitem>
 202    </varlistentry>
 203
 204    <varlistentry>
 205     <term><structfield>tableoid</></term>
 206     <listitem>
 207      <indexterm>
 208       <primary>tableoid</primary>
 209      </indexterm>
 210
 211      <para>
 212       The OID of the table containing this row.  This column is
 213       particularly handy for queries that select from inheritance
 214       hierarchies, since without it, it's difficult to tell which
 215       individual table a row came from.  The
 216       <structfield>tableoid</structfield> can be joined against the
 217       <structfield>oid</structfield> column of
 218       <structname>pg_class</structname> to obtain the table name.
 219      </para>
 220     </listitem>
 221    </varlistentry>
 222
 223    <varlistentry>
 224     <term><structfield>xmin</></term>
 225     <listitem>
 226      <indexterm>
 227       <primary>xmin</primary>
 228      </indexterm>
 229
 230      <para>
 231       The identity (transaction ID) of the inserting transaction for
 232       this row version.  (A row version is an individual state of a
 233       row; each update of a row creates a new row version for the same
 234       logical row.)
 235      </para>
 236     </listitem>
 237    </varlistentry>
 238
 239    <varlistentry>
 240     <term><structfield>cmin</></term>
 241     <listitem>
 242      <indexterm>
 243       <primary>cmin</primary>
 244      </indexterm>
 245
 246      <para>
 247       The command identifier (starting at zero) within the inserting
 248       transaction.
 249      </para>
 250     </listitem>
 251    </varlistentry>
 252
 253    <varlistentry>
 254     <term><structfield>xmax</></term>
 255     <listitem>
 256      <indexterm>
 257       <primary>xmax</primary>
 258      </indexterm>
 259
 260      <para>
 261       The identity (transaction ID) of the deleting transaction, or
 262       zero for an undeleted row version.  It is possible for this column to
 263       be nonzero in a visible row version: That usually indicates that the
 264       deleting transaction hasn't committed yet, or that an attempted
 265       deletion was rolled back.
 266      </para>
 267     </listitem>
 268    </varlistentry>
 269
 270    <varlistentry>
 271     <term><structfield>cmax</></term>
 272     <listitem>
 273      <indexterm>
 274       <primary>cmax</primary>
 275      </indexterm>
 276
 277      <para>
 278       The command identifier within the deleting transaction, or zero.
 279      </para>
 280     </listitem>
 281    </varlistentry>
 282
 283    <varlistentry>
 284     <term><structfield>ctid</></term>
 285     <listitem>
 286      <indexterm>
 287       <primary>ctid</primary>
 288      </indexterm>
 289
 290      <para>
 291       The physical location of the row version within its table.  Note that
 292       although the <structfield>ctid</structfield> can be used to
 293       locate the row version very quickly, a row's
 294       <structfield>ctid</structfield> will change each time it is
 295       updated or moved by <command>VACUUM FULL</>.  Therefore
 296       <structfield>ctid</structfield> is useless as a long-term row
 297       identifier.  The OID, or even better a user-defined serial
 298       number, should be used to identify logical rows.
 299      </para>
 300     </listitem>
 301    </varlistentry>
 302   </variablelist>
 303
 304    <para>
 305     OIDs are 32-bit quantities and are assigned from a single
 306     cluster-wide counter.  In a large or long-lived database, it is
 307     possible for the counter to wrap around.  Hence, it is bad
 308     practice to assume that OIDs are unique, unless you take steps to
 309     ensure that this is the case.  If you need to identify the rows in
 310     a table, using a sequence generator is strongly recommended.
 311     However, OIDs can be used as well, provided that a few additional
 312     precautions are taken:
 313
 314     <itemizedlist>
 315      <listitem>
 316       <para>
 317        A unique constraint should be created on the OID column of each
 318        table for which the OID will be used to identify rows.
 319       </para>
 320      </listitem>
 321      <listitem>
 322       <para>
 323        OIDs should never be assumed to be unique across tables; use
 324        the combination of <structfield>tableoid</> and row OID if you
 325        need a database-wide identifier.
 326       </para>
 327      </listitem>
 328      <listitem>
 329       <para>
 330        The tables in question should be created using <literal>WITH
 331        OIDS</literal> to ensure forward compatibility with future
 332        releases of <productname>PostgreSQL</productname> in which OIDs
 333        are not included in all tables by default.
 334       </para>
 335      </listitem>
 336     </itemizedlist>
 337    </para>
 338
 339    <para>
 340     Transaction identifiers are also 32-bit quantities.  In a
 341     long-lived database it is possible for transaction IDs to wrap
 342     around.  This is not a fatal problem given appropriate maintenance
 343     procedures; see <xref linkend="maintenance"> for details.  It is
 344     unwise, however, to depend on the uniqueness of transaction IDs
 345     over the long term (more than one billion transactions).
 346    </para>
 347
 348    <para>
 349     Command
 350     identifiers are also 32-bit quantities.  This creates a hard limit
 351     of 2<superscript>32</> (4 billion) <acronym>SQL</acronym> commands
 352     within a single transaction.  In practice this limit is not a
 353     problem --- note that the limit is on number of
 354     <acronym>SQL</acronym> commands, not number of rows processed.
 355    </para>
 356  </sect1>
 357
 358  <sect1 id="ddl-default">
 359   <title>Default Values</title>
 360
 361   <indexterm zone="ddl-default">
 362    <primary>default value</primary>
 363   </indexterm>
 364
 365   <para>
 366    A column can be assigned a default value.  When a new row is
 367    created and no values are specified for some of the columns, the
 368    columns will be filled with their respective default values.  A
 369    data manipulation command can also request explicitly that a column
 370    be set to its default value, without knowing what this value is.
 371    (Details about data manipulation commands are in <xref linkend="dml">.)
 372   </para>
 373
 374   <para>
 375    <indexterm><primary>null value</primary><secondary>default value</secondary></indexterm>
 376    If no default value is declared explicitly, the null value is the
 377    default value.  This usually makes sense because a null value can
 378    be thought to represent unknown data.
 379   </para>
 380
 381   <para>
 382    In a table definition, default values are listed after the column
 383    data type.  For example:
 384 <programlisting>
 385 CREATE TABLE products (
 386     product_no integer,
 387     name text,
 388     price numeric <emphasis>DEFAULT 9.99</emphasis>
 389 );
 390 </programlisting>
 391   </para>
 392
 393   <para>
 394    The default value may be a scalar expression, which will be
 395    evaluated whenever the default value is inserted
 396    (<emphasis>not</emphasis> when the table is created).
 397   </para>
 398  </sect1>
 399
 400  <sect1 id="ddl-constraints">
 401   <title>Constraints</title>
 402
 403   <indexterm zone="ddl-constraints">
 404    <primary>constraint</primary>
 405   </indexterm>
 406
 407   <para>
 408    Data types are a way to limit the kind of data that can be stored
 409    in a table.  For many applications, however, the constraint they
 410    provide is too coarse.  For example, a column containing a product
 411    price should probably only accept positive values.  But there is no
 412    data type that accepts only positive numbers.  Another issue is
 413    that you might want to constrain column data with respect to other
 414    columns or rows.  For example, in a table containing product
 415    information, there should only be one row for each product number.
 416   </para>
 417
 418   <para>
 419    To that end, SQL allows you to define constraints on columns and
 420    tables.  Constraints give you as much control over the data in your
 421    tables as you wish.  If a user attempts to store data in a column
 422    that would violate a constraint, an error is raised.  This applies
 423    even if the value came from the default value definition.
 424   </para>
 425
 426   <sect2>
 427    <title>Check Constraints</title>
 428
 429    <indexterm>
 430     <primary>check constraint</primary>
 431    </indexterm>
 432
 433    <indexterm>
 434     <primary>constraint</primary>
 435     <secondary>check</secondary>
 436    </indexterm>
 437
 438    <para>
 439     A check constraint is the most generic constraint type.  It allows
 440     you to specify that the value in a certain column must satisfy an
 441     arbitrary expression.  For instance, to require positive product
 442     prices, you could use:
 443 <programlisting>
 444 CREATE TABLE products (
 445     product_no integer,
 446     name text,
 447     price numeric <emphasis>CHECK (price > 0)</emphasis>
 448 );
 449 </programlisting>
 450    </para>
 451
 452    <para>
 453     As you see, the constraint definition comes after the data type,
 454     just like default value definitions.  Default values and
 455     constraints can be listed in any order.  A check constraint
 456     consists of the key word <literal>CHECK</literal> followed by an
 457     expression in parentheses.  The check constraint expression should
 458     involve the column thus constrained, otherwise the constraint
 459     would not make too much sense.
 460    </para>
 461
 462    <indexterm>
 463     <primary>constraint</primary>
 464     <secondary>name</secondary>
 465    </indexterm>
 466
 467    <para>
 468     You can also give the constraint a separate name.  This clarifies
 469     error messages and allows you to refer to the constraint when you
 470     need to change it.  The syntax is:
 471 <programlisting>
 472 CREATE TABLE products (
 473     product_no integer,
 474     name text,
 475     price numeric <emphasis>CONSTRAINT positive_price</emphasis> CHECK (price > 0)
 476 );
 477 </programlisting>
 478     So, to specify a named constraint, use the key word
 479     <literal>CONSTRAINT</literal> followed by an identifier followed
 480     by the constraint definition.
 481    </para>
 482
 483    <para>
 484     A check constraint can also refer to several columns.  Say you
 485     store a regular price and a discounted price, and you want to
 486     ensure that the discounted price is lower than the regular price.
 487 <programlisting>
 488 CREATE TABLE products (
 489     product_no integer,
 490     name text,
 491     price numeric CHECK (price > 0),
 492     discounted_price numeric CHECK (discounted_price > 0),
 493     CHECK (price > discounted_price)
 494 );
 495 </programlisting>
 496    </para>
 497
 498    <para>
 499     The first two constraints should look familiar.  The third one
 500     uses a new syntax.  It is not attached to a particular column,
 501     instead it appears as a separate item in the comma-separated
 502     column list.  Column definitions and these constraint
 503     definitions can be listed in mixed order.
 504    </para>
 505
 506    <para>
 507     We say that the first two constraints are column constraints, whereas the
 508     third one is a table constraint because it is written separately
 509     from the column definitions.  Column constraints can also be
 510     written as table constraints, while the reverse is not necessarily
 511     possible.  The above example could also be written as
 512 <programlisting>
 513 CREATE TABLE products (
 514     product_no integer,
 515     name text,
 516     price numeric,
 517     CHECK (price > 0),
 518     discounted_price numeric,
 519     CHECK (discounted_price > 0),
 520     CHECK (price > discounted_price)
 521 );
 522 </programlisting>
 523     or even
 524 <programlisting>
 525 CREATE TABLE products (
 526     product_no integer,
 527     name text,
 528     price numeric CHECK (price > 0),
 529     discounted_price numeric,
 530     CHECK (discounted_price > 0 AND price > discounted_price)
 531 );
 532 </programlisting>
 533     It's a matter of taste.
 534    </para>
 535
 536    <indexterm>
 537     <primary>null value</primary>
 538     <secondary sortas="check constraints">with check constraints</secondary>
 539    </indexterm>
 540
 541    <para>
 542     It should be noted that a check constraint is satisfied if the
 543     check expression evaluates to true or the null value.  Since most
 544     expressions will evaluate to the null value if one operand is null,
 545     they will not prevent null values in the constrained columns.  To
 546     ensure that a column does not contain null values, the not-null
 547     constraint described in the next section should be used.
 548    </para>
 549   </sect2>
 550
 551   <sect2>
 552    <title>Not-Null Constraints</title>
 553
 554    <indexterm>
 555     <primary>not-null constraint</primary>
 556    </indexterm>
 557
 558    <indexterm>
 559     <primary>constraint</primary>
 560     <secondary>NOT NULL</secondary>
 561    </indexterm>
 562
 563    <para>
 564     A not-null constraint simply specifies that a column must not
 565     assume the null value.  A syntax example:
 566 <programlisting>
 567 CREATE TABLE products (
 568     product_no integer <emphasis>NOT NULL</emphasis>,
 569     name text <emphasis>NOT NULL</emphasis>,
 570     price numeric
 571 );
 572 </programlisting>
 573    </para>
 574
 575    <para>
 576     A not-null constraint is always written as a column constraint.  A
 577     not-null constraint is functionally equivalent to creating a check
 578     constraint <literal>CHECK (<replaceable>column_name</replaceable>
 579     IS NOT NULL)</literal>, but in
 580     <productname>PostgreSQL</productname> creating an explicit
 581     not-null constraint is more efficient.  The drawback is that you
 582     cannot give explicit names to not-null constraints created that
 583     way.
 584    </para>
 585
 586    <para>
 587     Of course, a column can have more than one constraint.  Just write
 588     the constraints after one another:
 589 <programlisting>
 590 CREATE TABLE products (
 591     product_no integer NOT NULL,
 592     name text NOT NULL,
 593     price numeric NOT NULL CHECK (price > 0)
 594 );
 595 </programlisting>
 596     The order doesn't matter.  It does not necessarily determine in which
 597     order the constraints are checked.
 598    </para>
 599
 600    <para>
 601     The <literal>NOT NULL</literal> constraint has an inverse: the
 602     <literal>NULL</literal> constraint.  This does not mean that the
 603     column must be null, which would surely be useless.  Instead, this
 604     simply defines the default behavior that the column may be null.
 605     The <literal>NULL</literal> constraint is not defined in the SQL
 606     standard and should not be used in portable applications.  (It was
 607     only added to <productname>PostgreSQL</productname> to be
 608     compatible with some other database systems.)  Some users, however,
 609     like it because it makes it easy to toggle the constraint in a
 610     script file.  For example, you could start with
 611 <programlisting>
 612 CREATE TABLE products (
 613     product_no integer NULL,
 614     name text NULL,
 615     price numeric NULL
 616 );
 617 </programlisting>
 618     and then insert the <literal>NOT</literal> key word where desired.
 619    </para>
 620
 621    <tip>
 622     <para>
 623      In most database designs the majority of columns should be marked
 624      not null.
 625     </para>
 626    </tip>
 627   </sect2>
 628
 629   <sect2>
 630    <title>Unique Constraints</title>
 631
 632    <indexterm>
 633     <primary>unique constraint</primary>
 634    </indexterm>
 635
 636    <indexterm>
 637     <primary>constraint</primary>
 638     <secondary>unique</secondary>
 639    </indexterm>
 640
 641    <para>
 642     Unique constraints ensure that the data contained in a column or a
 643     group of columns is unique with respect to all the rows in the
 644     table.  The syntax is
 645 <programlisting>
 646 CREATE TABLE products (
 647     product_no integer <emphasis>UNIQUE</emphasis>,
 648     name text,
 649     price numeric
 650 );
 651 </programlisting>
 652     when written as a column constraint, and
 653 <programlisting>
 654 CREATE TABLE products (
 655     product_no integer,
 656     name text,
 657     price numeric,
 658     <emphasis>UNIQUE (product_no)</emphasis>
 659 );
 660 </programlisting>
 661     when written as a table constraint.
 662    </para>
 663
 664    <para>
 665     If a unique constraint refers to a group of columns, the columns
 666     are listed separated by commas:
 667 <programlisting>
 668 CREATE TABLE example (
 669     a integer,
 670     b integer,
 671     c integer,
 672     <emphasis>UNIQUE (a, c)</emphasis>
 673 );
 674 </programlisting>
 675    </para>
 676
 677    <para>
 678     It is also possible to assign names to unique constraints:
 679 <programlisting>
 680 CREATE TABLE products (
 681     product_no integer <emphasis>CONSTRAINT must_be_different</emphasis> UNIQUE,
 682     name text,
 683     price numeric
 684 );
 685 </programlisting>
 686    </para>
 687
 688    <indexterm>
 689     <primary>null value</primary>
 690     <secondary sortas="unique constraints">with unique constraints</secondary>
 691    </indexterm>
 692
 693    <para>
 694     In general, a unique constraint is violated when there are (at
 695     least) two rows in the table where the values of each of the
 696     corresponding columns that are part of the constraint are equal.
 697     However, null values are not considered equal in this
 698     consideration.  That means even in the presence of a
 699     unique constraint it is possible to store an unlimited number of
 700     rows that contain a null value in at least one of the constrained
 701     columns.  This behavior conforms to the SQL standard, but we have
 702     heard that other SQL databases may not follow this rule.  So be
 703     careful when developing applications that are intended to be
 704     portable.
 705    </para>
 706   </sect2>
 707
 708   <sect2>
 709    <title>Primary Keys</title>
 710
 711    <indexterm>
 712     <primary>primary key</primary>
 713    </indexterm>
 714
 715    <indexterm>
 716     <primary>constraint</primary>
 717     <secondary>primary key</secondary>
 718    </indexterm>
 719
 720    <para>
 721     Technically, a primary key constraint is simply a combination of a
 722     unique constraint and a not-null constraint.  So, the following
 723     two table definitions accept the same data:
 724 <programlisting>
 725 CREATE TABLE products (
 726     product_no integer UNIQUE NOT NULL,
 727     name text,
 728     price numeric
 729 );
 730 </programlisting>
 731
 732 <programlisting>
 733 CREATE TABLE products (
 734     product_no integer <emphasis>PRIMARY KEY</emphasis>,
 735     name text,
 736     price numeric
 737 );
 738 </programlisting>
 739    </para>
 740
 741    <para>
 742     Primary keys can also constrain more than one column; the syntax
 743     is similar to unique constraints:
 744 <programlisting>
 745 CREATE TABLE example (
 746     a integer,
 747     b integer,
 748     c integer,
 749     <emphasis>PRIMARY KEY (a, c)</emphasis>
 750 );
 751 </programlisting>
 752    </para>
 753
 754    <para>
 755     A primary key indicates that a column or group of columns can be
 756     used as a unique identifier for rows in the table.  (This is a
 757     direct consequence of the definition of a primary key.  Note that
 758     a unique constraint does not, by itself, provide a unique identifier
 759     because it does not exclude null values.)  This is useful both for
 760     documentation purposes and for client applications.  For example,
 761     a GUI application that allows modifying row values probably needs
 762     to know the primary key of a table to be able to identify rows
 763     uniquely.
 764    </para>
 765
 766    <para>
 767     A table can have at most one primary key (while it can have many
 768     unique and not-null constraints).  Relational database theory
 769     dictates that every table must have a primary key.  This rule is
 770     not enforced by <productname>PostgreSQL</productname>, but it is
 771     usually best to follow it.
 772    </para>
 773   </sect2>
 774
 775   <sect2 id="ddl-constraints-fk">
 776    <title>Foreign Keys</title>
 777
 778    <indexterm>
 779     <primary>foreign key</primary>
 780    </indexterm>
 781
 782    <indexterm>
 783     <primary>constraint</primary>
 784     <secondary>foreign key</secondary>
 785    </indexterm>
 786
 787    <indexterm>
 788     <primary>referential integrity</primary>
 789    </indexterm>
 790
 791    <para>
 792     A foreign key constraint specifies that the values in a column (or
 793     a group of columns) must match the values appearing in some row
 794     of another table.
 795     We say this maintains the <firstterm>referential
 796     integrity</firstterm> between two related tables.
 797    </para>
 798
 799    <para>
 800     Say you have the product table that we have used several times already:
 801 <programlisting>
 802 CREATE TABLE products (
 803     product_no integer PRIMARY KEY,
 804     name text,
 805     price numeric
 806 );
 807 </programlisting>
 808     Let's also assume you have a table storing orders of those
 809     products.  We want to ensure that the orders table only contains
 810     orders of products that actually exist.  So we define a foreign
 811     key constraint in the orders table that references the products
 812     table:
 813 <programlisting>
 814 CREATE TABLE orders (
 815     order_id integer PRIMARY KEY,
 816     product_no integer <emphasis>REFERENCES products (product_no)</emphasis>,
 817     quantity integer
 818 );
 819 </programlisting>
 820     Now it is impossible to create orders with
 821     <structfield>product_no</structfield> entries that do not appear in the
 822     products table.
 823    </para>
 824
 825    <para>
 826     We say that in this situation the orders table is the
 827     <firstterm>referencing</firstterm> table and the products table is
 828     the <firstterm>referenced</firstterm> table.  Similarly, there are
 829     referencing and referenced columns.
 830    </para>
 831
 832    <para>
 833     You can also shorten the above command to
 834 <programlisting>
 835 CREATE TABLE orders (
 836     order_id integer PRIMARY KEY,
 837     product_no integer REFERENCES products,
 838     quantity integer
 839 );
 840 </programlisting>
 841     because in absence of a column list the primary key of the
 842     referenced table is used as the referenced column.
 843    </para>
 844
 845    <para>
 846     A foreign key can also constrain and reference a group of columns.
 847     As usual, it then needs to be written in table constraint form.
 848     Here is a contrived syntax example:
 849 <programlisting>
 850 CREATE TABLE t1 (
 851   a integer PRIMARY KEY,
 852   b integer,
 853   c integer,
 854   <emphasis>FOREIGN KEY (b, c) REFERENCES other_table (c1, c2)</emphasis>
 855 );
 856 </programlisting>
 857     Of course, the number and type of the constrained columns needs to
 858     match the number and type of the referenced columns.
 859    </para>
 860
 861    <para>
 862     A table can contain more than one foreign key constraint.  This is
 863     used to implement many-to-many relationships between tables.  Say
 864     you have tables about products and orders, but now you want to
 865     allow one order to contain possibly many products (which the
 866     structure above did not allow).  You could use this table structure:
 867 <programlisting>
 868 CREATE TABLE products (
 869     product_no integer PRIMARY KEY,
 870     name text,
 871     price numeric
 872 );
 873
 874 CREATE TABLE orders (
 875     order_id integer PRIMARY KEY,
 876     shipping_address text,
 877     ...
 878 );
 879
 880 CREATE TABLE order_items (
 881     product_no integer REFERENCES products,
 882     order_id integer REFERENCES orders,
 883     quantity integer,
 884     PRIMARY KEY (product_no, order_id)
 885 );
 886 </programlisting>
 887     Note also that the primary key overlaps with the foreign keys in
 888     the last table.
 889    </para>
 890
 891    <indexterm>
 892     <primary>CASCADE</primary>
 893     <secondary>foreign key action</secondary>
 894    </indexterm>
 895
 896    <indexterm>
 897     <primary>RESTRICT</primary>
 898     <secondary>foreign key action</secondary>
 899    </indexterm>
 900
 901    <para>
 902     We know that the foreign keys disallow creation of orders that
 903     do not relate to any products.  But what if a product is removed
 904     after an order is created that references it?  SQL allows you to
 905     specify that as well.  Intuitively, we have a few options:
 906     <itemizedlist spacing="compact">
 907      <listitem><para>Disallow deleting a referenced product</para></listitem>
 908      <listitem><para>Delete the orders as well</para></listitem>
 909      <listitem><para>Something else?</para></listitem>
 910     </itemizedlist>
 911    </para>
 912
 913    <para>
 914     To illustrate this, let's implement the following policy on the
 915     many-to-many relationship example above: when someone wants to
 916     remove a product that is still referenced by an order (via
 917     <literal>order_items</literal>), we disallow it.  If someone
 918     removes an order, the order items are removed as well.
 919 <programlisting>
 920 CREATE TABLE products (
 921     product_no integer PRIMARY KEY,
 922     name text,
 923     price numeric
 924 );
 925
 926 CREATE TABLE orders (
 927     order_id integer PRIMARY KEY,
 928     shipping_address text,
 929     ...
 930 );
 931
 932 CREATE TABLE order_items (
 933     product_no integer REFERENCES products <emphasis>ON DELETE RESTRICT</emphasis>,
 934     order_id integer REFERENCES orders <emphasis>ON DELETE CASCADE</emphasis>,
 935     quantity integer,
 936     PRIMARY KEY (product_no, order_id)
 937 );
 938 </programlisting>
 939    </para>
 940
 941    <para>
 942     Restricting and cascading deletes are the two most common options.
 943     <literal>RESTRICT</literal> can also be written as <literal>NO
 944     ACTION</literal> and it's also the default if you do not specify
 945     anything.  There are two other options for what should happen with
 946     the foreign key columns when a primary key is deleted:
 947     <literal>SET NULL</literal> and <literal>SET DEFAULT</literal>.
 948     Note that these do not excuse you from observing any constraints.
 949     For example, if an action specifies <literal>SET DEFAULT</literal>
 950     but the default value would not satisfy the foreign key, the
 951     deletion of the primary key will fail.
 952    </para>
 953
 954    <para>
 955     Analogous to <literal>ON DELETE</literal> there is also
 956     <literal>ON UPDATE</literal> which is invoked when a primary key
 957     is changed (updated).  The possible actions are the same.
 958    </para>
 959
 960    <para>
 961     More information about updating and deleting data is in <xref
 962     linkend="dml">.
 963    </para>
 964
 965    <para>
 966     Finally, we should mention that a foreign key must reference
 967     columns that are either a primary key or form a unique constraint.
 968     If the foreign key references a unique constraint, there are some
 969     additional possibilities regarding how null values are matched.
 970     These are explained in the reference documentation for
 971     <xref linkend="sql-createtable" endterm="sql-createtable-title">.
 972    </para>
 973   </sect2>
 974  </sect1>
 975
 976  <sect1 id="ddl-inherit">
 977   <title>Inheritance</title>
 978
 979   <remark>This section needs to be rethought.  Some of the
 980   information should go into the following chapters.</remark>
 981
 982   <para>
 983    Let's create two tables. The capitals  table  contains
 984    state  capitals  which  are also cities. Naturally, the
 985    capitals table should inherit from cities.
 986
 987 <programlisting>
 988 CREATE TABLE cities (
 989     name            text,
 990     population      float,
 991     altitude        int     -- (in ft)
 992 );
 993
 994 CREATE TABLE capitals (
 995     state           char(2)
 996 ) INHERITS (cities);
 997 </programlisting>
 998
 999    In this case, a row of capitals <firstterm>inherits</firstterm> all
1000    attributes (name, population, and altitude) from its parent, cities.  State
1001    capitals have an extra attribute, state, that shows their state.  In
1002    <productname>PostgreSQL</productname>, a table can inherit from zero or
1003    more other tables, and a query can reference either all rows of a table or
1004    all rows of a table plus all of its descendants.
1005
1006    <note>
1007     <para>
1008      The inheritance hierarchy is actually a directed acyclic graph.
1009     </para>
1010    </note>
1011   </para>
1012
1013   <para>
1014     For example, the  following  query finds the  names  of  all  cities,
1015     including  state capitals, that are located at an altitude
1016     over 500ft:
1017
1018 <programlisting>
1019 SELECT name, altitude
1020     FROM cities
1021     WHERE altitude &gt; 500;
1022 </programlisting>
1023
1024    which returns:
1025
1026 <programlisting>
1027    name    | altitude
1028 -----------+----------
1029  Las Vegas |     2174
1030  Mariposa  |     1953
1031  Madison   |      845
1032 </programlisting>
1033   </para>
1034
1035   <para>
1036     On the other hand, the  following  query  finds
1037     all  the cities that are not state capitals and
1038     are situated at an altitude over 500ft:
1039
1040 <programlisting>
1041 SELECT name, altitude
1042     FROM ONLY cities
1043     WHERE altitude &gt; 500;
1044
1045    name    | altitude
1046 -----------+----------
1047  Las Vegas |     2174
1048  Mariposa  |     1953
1049 </programlisting>
1050   </para>
1051
1052   <para>
1053    Here the <quote>ONLY</quote> before cities indicates that the query should
1054    be  run over only cities and not tables below cities in the
1055    inheritance hierarchy.  Many of the  commands  that  we
1056    have  already discussed -- <command>SELECT</command>,
1057    <command>UPDATE</command> and <command>DELETE</command> --
1058    support this <quote>ONLY</quote> notation.
1059   </para>
1060
1061   <note>
1062    <title>Deprecated</title>
1063    <para>
1064      In previous versions of <productname>PostgreSQL</productname>, the
1065      default behavior was not to include child tables in queries. This was
1066      found to be error prone and is also in violation of the SQL99
1067      standard. Under the old syntax, to get the sub-tables you append
1068      <literal>*</literal> to the table name.
1069      For example
1070 <programlisting>
1071 SELECT * from cities*;
1072 </programlisting>
1073      You can still explicitly specify scanning child tables by appending
1074      <literal>*</literal>, as well as explicitly specify not scanning child tables by
1075      writing <quote>ONLY</quote>.  But beginning in version 7.1, the default
1076      behavior for an undecorated table name is to scan its child tables
1077      too, whereas before the default was not to do so.  To get the old
1078      default behavior, set the configuration option
1079      <literal>SQL_Inheritance</literal> to off, e.g.,
1080 <programlisting>
1081 SET SQL_Inheritance TO OFF;
1082 </programlisting>
1083      or add a line in your <filename>postgresql.conf</filename> file.
1084    </para>
1085   </note>
1086
1087   <para>
1088   In some cases you may wish to know which table a particular row
1089   originated from. There is a system column called
1090   <structfield>TABLEOID</structfield> in each table which can tell you the
1091   originating table:
1092
1093 <programlisting>
1094 SELECT c.tableoid, c.name, c.altitude
1095 FROM cities c
1096 WHERE c.altitude &gt; 500;
1097 </programlisting>
1098
1099    which returns:
1100
1101 <programlisting>
1102  tableoid |   name    | altitude
1103 ----------+-----------+----------
1104    139793 | Las Vegas |     2174
1105    139793 | Mariposa  |     1953
1106    139798 | Madison   |      845
1107 </programlisting>
1108
1109    (If you try to reproduce this example, you will probably get
1110    different numeric OIDs.)  By doing a join with
1111    <structname>pg_class</> you can see the actual table names:
1112
1113 <programlisting>
1114 SELECT p.relname, c.name, c.altitude
1115 FROM cities c, pg_class p
1116 WHERE c.altitude &gt; 500 and c.tableoid = p.oid;
1117 </programlisting>
1118
1119    which returns:
1120
1121 <programlisting>
1122  relname  |   name    | altitude
1123 ----------+-----------+----------
1124  cities   | Las Vegas |     2174
1125  cities   | Mariposa  |     1953
1126  capitals | Madison   |      845
1127 </programlisting>
1128
1129   </para>
1130
1131   <para>
1132    A serious limitation of the inheritance feature is that indexes (including
1133    unique constraints) and foreign key constraints only apply to single
1134    tables, not to their inheritance children.  This is true on both the
1135    referencing and referenced sides of a foreign key constraint.  Thus,
1136    in the terms of the above example:
1137
1138    <itemizedlist>
1139     <listitem>
1140      <para>
1141       If we declared <structname>cities</>.<structfield>name</> to be
1142       <literal>UNIQUE</> or a <literal>PRIMARY KEY</>, this would not stop the
1143       <structname>capitals</> table from having rows with names duplicating
1144       rows in <structname>cities</>.  And those duplicate rows would by
1145       default show up in SELECTs from <structname>cities</>.  In fact, by
1146       default <structname>capitals</> would have no unique constraint at all,
1147       and so could contain multiple rows with the same name.
1148       You could add a unique constraint to <structname>capitals</>, but this
1149       would not prevent duplication compared to <structname>cities</>.
1150      </para>
1151     </listitem>
1152
1153     <listitem>
1154      <para>
1155       Similarly, if we were to specify that
1156       <structname>cities</>.<structfield>name</> <literal>REFERENCES</> some
1157       other table, this constraint would not automatically propagate to
1158       <structname>capitals</>.  In this case you could work around it by
1159       manually adding the same <literal>REFERENCES</> constraint to
1160       <structname>capitals</>.
1161      </para>
1162     </listitem>
1163
1164     <listitem>
1165      <para>
1166       Specifying that another table's column <literal>REFERENCES
1167       cities(name)</> would allow the other table to contain city names, but
1168       not capital names.  There is no good workaround for this case.
1169      </para>
1170     </listitem>
1171    </itemizedlist>
1172
1173    These deficiencies will probably be fixed in some future release,
1174    but in the meantime considerable care is needed in deciding whether
1175    inheritance is useful for your problem.
1176   </para>
1177  </sect1>
1178
1179  <sect1 id="ddl-alter">
1180   <title>Modifying Tables</title>
1181
1182   <indexterm zone="ddl-alter">
1183    <primary>table</primary>
1184    <secondary>modifying</secondary>
1185   </indexterm>
1186
1187   <para>
1188    When you create a table and you realize that you made a mistake, or
1189    the requirements of the application changed, then you can drop the
1190    table and create it again.  But this is not a convenient option if
1191    the table is already filled with data, or if the table is
1192    referenced by other database objects (for instance a foreign key
1193    constraint).  Therefore <productname>PostgreSQL</productname>
1194    provides a family of commands to make modifications on existing
1195    tables.
1196   </para>
1197
1198   <para>
1199    You can
1200    <itemizedlist spacing="compact">
1201     <listitem>
1202      <para>Add columns,</para>
1203     </listitem>
1204     <listitem>
1205      <para>Remove columns,</para>
1206     </listitem>
1207     <listitem>
1208      <para>Add constraints,</para>
1209     </listitem>
1210     <listitem>
1211      <para>Remove constraints,</para>
1212     </listitem>
1213     <listitem>
1214      <para>Change default values,</para>
1215     </listitem>
1216     <listitem>
1217      <para>Rename columns,</para>
1218     </listitem>
1219     <listitem>
1220      <para>Rename tables.</para>
1221     </listitem>
1222    </itemizedlist>
1223
1224    All these actions are performed using the <literal>ALTER
1225    TABLE</literal> command.
1226   </para>
1227
1228   <sect2>
1229    <title>Adding a Column</title>
1230
1231    <indexterm>
1232     <primary>column</primary>
1233     <secondary>adding</secondary>
1234    </indexterm>
1235
1236    <para>
1237     To add a column, use this command:
1238 <programlisting>
1239 ALTER TABLE products ADD COLUMN description text;
1240 </programlisting>
1241     The new column will initially be filled with null values in the
1242     existing rows of the table.
1243    </para>
1244
1245    <para>
1246     You can also define a constraint on the column at the same time,
1247     using the usual syntax:
1248 <programlisting>
1249 ALTER TABLE products ADD COLUMN description text CHECK (description &lt;&gt; '');
1250 </programlisting>
1251     A new column cannot have a not-null constraint since the column
1252     initially has to contain null values.  But you can add a not-null
1253     constraint later.  Also, you cannot define a default value on a
1254     new column.  According to the SQL standard, this would have to
1255     fill the new columns in the existing rows with the default value,
1256     which is not implemented yet.  But you can adjust the column
1257     default later on.
1258    </para>
1259   </sect2>
1260
1261   <sect2>
1262    <title>Removing a Column</title>
1263
1264    <indexterm>
1265     <primary>column</primary>
1266     <secondary>removing</secondary>
1267    </indexterm>
1268
1269    <para>
1270     To remove a column, use this command:
1271 <programlisting>
1272 ALTER TABLE products DROP COLUMN description;
1273 </programlisting>
1274    </para>
1275   </sect2>
1276
1277   <sect2>
1278    <title>Adding a Constraint</title>
1279
1280    <indexterm>
1281     <primary>constraint</primary>
1282     <secondary>adding</secondary>
1283    </indexterm>
1284
1285    <para>
1286     To add a constraint, the table constraint syntax is used.  For example:
1287 <programlisting>
1288 ALTER TABLE products ADD CHECK (name &lt;&gt; '');
1289 ALTER TABLE products ADD CONSTRAINT some_name UNIQUE (product_no);
1290 ALTER TABLE products ADD FOREIGN KEY (product_group_id) REFERENCES product_groups;
1291 </programlisting>
1292     To add a not-null constraint, which cannot be written as a table
1293     constraint, use this syntax:
1294 <programlisting>
1295 ALTER TABLE products ALTER COLUMN product_no SET NOT NULL;
1296 </programlisting>
1297    </para>
1298
1299    <para>
1300     The constraint will be checked immediately, so the table data must
1301     satisfy the constraint before it can be added.
1302    </para>
1303   </sect2>
1304
1305   <sect2>
1306    <title>Removing a Constraint</title>
1307
1308    <indexterm>
1309     <primary>constraint</primary>
1310     <secondary>removing</secondary>
1311    </indexterm>
1312
1313    <para>
1314     To remove a constraint you need to know its name.  If you gave it
1315     a name then that's easy.  Otherwise the system assigned a
1316     generated name, which you need to find out.  The
1317     <application>psql</application> command <literal>\d
1318     <replaceable>tablename</replaceable></literal> can be helpful
1319     here; other interfaces might also provide a way to inspect table
1320     details.  Then the command is:
1321 <programlisting>
1322 ALTER TABLE products DROP CONSTRAINT some_name;
1323 </programlisting>
1324     (If you are dealing with a generated constraint name like <literal>$2</>,
1325     don't forget that you'll need to double-quote it to make it a valid
1326     identifier.)
1327    </para>
1328
1329    <para>
1330     This works the same for all constraint types except not-null
1331     constraints. To drop a not null constraint use
1332 <programlisting>
1333 ALTER TABLE products ALTER COLUMN product_no DROP NOT NULL;
1334 </programlisting>
1335     (Recall that not-null constraints do not have names.)
1336    </para>
1337   </sect2>
1338
1339   <sect2>
1340    <title>Changing the Default</title>
1341
1342    <indexterm>
1343     <primary>default value</primary>
1344     <secondary>changing</secondary>
1345    </indexterm>
1346
1347    <para>
1348     To set a new default for a column, use a command like this:
1349 <programlisting>
1350 ALTER TABLE products ALTER COLUMN price SET DEFAULT 7.77;
1351 </programlisting>
1352     To remove any default value, use
1353 <programlisting>
1354 ALTER TABLE products ALTER COLUMN price DROP DEFAULT;
1355 </programlisting>
1356     This is equivalent to setting the default to null, at least in
1357     <productname>PostgreSQL</>.  As a consequence, it is not an error
1358     to drop a default where one hadn't been defined, because the
1359     default is implicitly the null value.
1360    </para>
1361   </sect2>
1362
1363   <sect2>
1364    <title>Renaming a Column</title>
1365
1366    <indexterm>
1367     <primary>column</primary>
1368     <secondary>renaming</secondary>
1369    </indexterm>
1370
1371    <para>
1372     To rename a column:
1373 <programlisting>
1374 ALTER TABLE products RENAME COLUMN product_no TO product_number;
1375 </programlisting>
1376    </para>
1377   </sect2>
1378
1379   <sect2>
1380    <title>Renaming a Table</title>
1381
1382    <indexterm>
1383     <primary>table</primary>
1384     <secondary>renaming</secondary>
1385    </indexterm>
1386
1387    <para>
1388     To rename a table:
1389 <programlisting>
1390 ALTER TABLE products RENAME TO items;
1391 </programlisting>
1392    </para>
1393   </sect2>
1394  </sect1>
1395
1396  <sect1 id="ddl-priv">
1397   <title>Privileges</title>
1398
1399   <indexterm zone="ddl-priv">
1400    <primary>privilege</primary>
1401   </indexterm>
1402
1403   <indexterm>
1404    <primary>permission</primary>
1405    <see>privilege</see>
1406   </indexterm>
1407
1408   <para>
1409    When you create a database object, you become its owner.  By
1410    default, only the owner of an object can do anything with the
1411    object. In order to allow other users to use it,
1412    <firstterm>privileges</firstterm> must be granted.  (However,
1413    users that have the superuser attribute can always
1414    access any object.)
1415   </para>
1416
1417   <para>
1418    There are several different privileges: <literal>SELECT</>,
1419    <literal>INSERT</>, <literal>UPDATE</>, <literal>DELETE</>,
1420    <literal>RULE</>, <literal>REFERENCES</>, <literal>TRIGGER</>,
1421    <literal>CREATE</>, <literal>TEMPORARY</>, <literal>EXECUTE</>,
1422    and <literal>USAGE</>.  The privileges applicable to a particular
1423    object vary depending on the object's type (table, function, etc).
1424    For complete
1425    information on the different types of privileges supported by
1426    <productname>PostgreSQL</productname>, refer to the
1427    <xref linkend="sql-grant"> reference page.  The following sections
1428    and chapters will also show you how those privileges are used.
1429   </para>
1430
1431   <para>
1432    The right to modify or destroy an object is always the privilege of
1433    the owner only.
1434   </para>
1435
1436   <note>
1437    <para>
1438     To change the owner of a table, index, sequence, or view, use the
1439     <xref linkend="sql-altertable"> command.  There are corresponding
1440     <literal>ALTER</> commands for other object types.
1441    </para>
1442   </note>
1443
1444   <para>
1445    To assign privileges, the <command>GRANT</command> command is
1446    used. For example, if <literal>joe</literal> is an existing user, and
1447    <literal>accounts</literal> is an existing table, the privilege to
1448    update the table can be granted with
1449 <programlisting>
1450 GRANT UPDATE ON accounts TO joe;
1451 </programlisting>
1452    To grant a privilege to a group, use this syntax:
1453 <programlisting>
1454 GRANT SELECT ON accounts TO GROUP staff;
1455 </programlisting>
1456    The special <quote>user</quote> name <literal>PUBLIC</literal> can
1457    be used to grant a privilege to every user on the system. Writing
1458    <literal>ALL</literal> in place of a specific privilege grants all
1459    privileges that are relevant for the object type.
1460   </para>
1461
1462   <para>
1463    To revoke a privilege, use the fittingly named
1464    <command>REVOKE</command> command:
1465 <programlisting>
1466 REVOKE ALL ON accounts FROM PUBLIC;
1467 </programlisting>
1468    The special privileges of the object owner (i.e., the right to do
1469    <command>DROP</>, <command>GRANT</>, <command>REVOKE</>, etc.)
1470    are always implicit in being the owner,
1471    and cannot be granted or revoked.  But the object owner can choose
1472    to revoke his own ordinary privileges, for example to make a
1473    table read-only for himself as well as others.
1474   </para>
1475
1476   <para>
1477    Ordinarily, only the object's owner (or a superuser) can grant or revoke
1478    privileges on an object.  However, it is possible to grant a privilege
1479    <quote>with grant option</>, which gives the recipient the right to
1480    grant it in turn to others.  If the grant option is subsequently revoked
1481    then all who received the privilege from that recipient (directly or
1482    through a chain of grants) will lose the privilege.  For details see
1483    the <xref linkend="sql-grant"> and <xref linkend="sql-revoke"> reference
1484    pages.
1485   </para>
1486  </sect1>
1487
1488  <sect1 id="ddl-schemas">
1489   <title>Schemas</title>
1490
1491   <indexterm zone="ddl-schemas">
1492    <primary>schema</primary>
1493   </indexterm>
1494
1495   <para>
1496    A <productname>PostgreSQL</productname> database cluster
1497    contains one or more named databases.  Users and groups of users are
1498    shared across the entire cluster, but no other data is shared across
1499    databases.  Any given client connection to the server can access
1500    only the data in a single database, the one specified in the connection
1501    request.
1502   </para>
1503
1504   <note>
1505    <para>
1506     Users of a cluster do not necessarily have the privilege to access every
1507     database in the cluster.  Sharing of user names means that there
1508     cannot be different users named, say, <literal>joe</> in two databases
1509     in the same cluster; but the system can be configured to allow
1510     <literal>joe</> access to only some of the databases.
1511    </para>
1512   </note>
1513
1514   <para>
1515    A database contains one or more named <firstterm>schemas</>, which
1516    in turn contain tables.  Schemas also contain other kinds of named
1517    objects, including data types, functions, and operators.  The same
1518    object name can be used in different schemas without conflict; for
1519    example, both <literal>schema1</> and <literal>myschema</> may
1520    contain tables named <literal>mytable</>.  Unlike databases,
1521    schemas are not rigidly separated: a user may access objects in any
1522    of the schemas in the database he is connected to, if he has
1523    privileges to do so.
1524   </para>
1525
1526   <para>
1527    There are several reasons why one might want to use schemas:
1528
1529    <itemizedlist>
1530     <listitem>
1531      <para>
1532       To allow many users to use one database without interfering with
1533       each other.
1534      </para>
1535     </listitem>
1536
1537     <listitem>
1538      <para>
1539       To organize database objects into logical groups to make them
1540       more manageable.
1541      </para>
1542     </listitem>
1543
1544     <listitem>
1545      <para>
1546       Third-party applications can be put into separate schemas so
1547       they cannot collide with the names of other objects.
1548      </para>
1549     </listitem>
1550    </itemizedlist>
1551
1552    Schemas are analogous to directories at the operating system level,
1553    except that schemas cannot be nested.
1554   </para>
1555
1556   <sect2 id="ddl-schemas-create">
1557    <title>Creating a Schema</title>
1558
1559    <indexterm zone="ddl-schemas-create">
1560     <primary>schema</primary>
1561     <secondary>creating</secondary>
1562    </indexterm>
1563
1564    <para>
1565     To create a separate schema, use the command <literal>CREATE
1566     SCHEMA</literal>.  Give the schema a name of your choice.  For
1567     example:
1568 <programlisting>
1569 CREATE SCHEMA myschema;
1570 </programlisting>
1571    </para>
1572
1573    <indexterm>
1574     <primary>qualified name</primary>
1575    </indexterm>
1576
1577    <indexterm>
1578     <primary>name</primary>
1579     <secondary>qualified</secondary>
1580    </indexterm>
1581
1582    <para>
1583     To create or access objects in a schema, write a
1584     <firstterm>qualified name</> consisting of the schema name and
1585     table name separated by a dot:
1586 <synopsis>
1587 <replaceable>schema</><literal>.</><replaceable>table</>
1588 </synopsis>
1589     (For brevity we will speak of tables only, but the same ideas apply
1590     to other kinds of named objects, such as types and functions.)
1591    </para>
1592
1593    <para>
1594     Actually, the even more general syntax
1595 <synopsis>
1596 <replaceable>database</><literal>.</><replaceable>schema</><literal>.</><replaceable>table</>
1597 </synopsis>
1598     can be used too, but at present this is just for pro-forma compliance
1599     with the SQL standard.  If you write a database name, it must be the
1600     same as the database you are connected to.
1601    </para>
1602
1603    <para>
1604     So to create a table in the new schema, use
1605 <programlisting>
1606 CREATE TABLE myschema.mytable (
1607  ...
1608 );
1609 </programlisting>
1610     This works anywhere a table name is expected, including the table
1611     modification commands and the data access commands discussed in
1612     the following chapters.
1613    </para>
1614
1615    <indexterm>
1616     <primary>schema</primary>
1617     <secondary>removing</secondary>
1618    </indexterm>
1619
1620    <para>
1621     To drop a schema if it's empty (all objects in it have been
1622     dropped), use
1623 <programlisting>
1624 DROP SCHEMA myschema;
1625 </programlisting>
1626     To drop a schema including all contained objects, use
1627 <programlisting>
1628 DROP SCHEMA myschema CASCADE;
1629 </programlisting>
1630     See <xref linkend="ddl-depend"> for a description of the general
1631     mechanism behind this.
1632    </para>
1633
1634    <para>
1635     Often you will want to create a schema owned by someone else
1636     (since this is one of the ways to restrict the activities of your
1637     users to well-defined namespaces).  The syntax for that is:
1638 <programlisting>
1639 CREATE SCHEMA <replaceable>schemaname</replaceable> AUTHORIZATION <replaceable>username</replaceable>;
1640 </programlisting>
1641     You can even omit the schema name, in which case the schema name
1642     will be the same as the user name.  See <xref
1643     linkend="ddl-schemas-patterns"> for how this can be useful.
1644    </para>
1645
1646    <para>
1647     Schema names beginning with <literal>pg_</> are reserved for
1648     system purposes and may not be created by users.
1649    </para>
1650   </sect2>
1651
1652   <sect2 id="ddl-schemas-public">
1653    <title>The Public Schema</title>
1654
1655    <indexterm zone="ddl-schemas-public">
1656     <primary>schema</primary>
1657     <secondary>public</secondary>
1658    </indexterm>
1659
1660    <para>
1661     In the previous sections we created tables without specifying any
1662     schema names.  By default, such tables (and other objects) are
1663     automatically put into a schema named <quote>public</quote>.  Every new
1664     database contains such a schema.  Thus, the following are equivalent:
1665 <programlisting>
1666 CREATE TABLE products ( ... );
1667 </programlisting>
1668     and
1669 <programlisting>
1670 CREATE TABLE public.products ( ... );
1671 </programlisting>
1672    </para>
1673   </sect2>
1674
1675   <sect2 id="ddl-schemas-path">
1676    <title>The Schema Search Path</title>
1677
1678    <indexterm>
1679     <primary>search path</primary>
1680    </indexterm>
1681
1682    <indexterm>
1683     <primary>unqualified name</primary>
1684    </indexterm>
1685
1686    <indexterm>
1687     <primary>name</primary>
1688     <secondary>unqualified</secondary>
1689    </indexterm>
1690
1691    <para>
1692     Qualified names are tedious to write, and it's often best not to
1693     wire a particular schema name into applications anyway.  Therefore
1694     tables are often referred to by <firstterm>unqualified names</>,
1695     which consist of just the table name.  The system determines which table
1696     is meant by following a <firstterm>search path</>, which is a list
1697     of schemas to look in.  The first matching table in the search path
1698     is taken to be the one wanted.  If there is no match in the search
1699     path, an error is reported, even if matching table names exist
1700     in other schemas in the database.
1701    </para>
1702
1703    <indexterm>
1704     <primary>schema</primary>
1705     <secondary>current</secondary>
1706    </indexterm>
1707
1708    <para>
1709     The first schema named in the search path is called the current schema.
1710     Aside from being the first schema searched, it is also the schema in
1711     which new tables will be created if the <command>CREATE TABLE</>
1712     command does not specify a schema name.
1713    </para>
1714
1715    <indexterm>
1716     <primary>search_path</primary>
1717    </indexterm>
1718
1719    <para>
1720     To show the current search path, use the following command:
1721 <programlisting>
1722 SHOW search_path;
1723 </programlisting>
1724     In the default setup this returns:
1725 <screen>
1726  search_path
1727 --------------
1728  $user,public
1729 </screen>
1730     The first element specifies that a schema with the same name as
1731     the current user is to be searched.  If no such schema exists,
1732     the entry is ignored.  The second element refers to the
1733     public schema that we have seen already.
1734    </para>
1735
1736    <para>
1737     The first schema in the search path that exists is the default
1738     location for creating new objects.  That is the reason that by
1739     default objects are created in the public schema.  When objects
1740     are referenced in any other context without schema qualification
1741     (table modification, data modification, or query commands) the
1742     search path is traversed until a matching object is found.
1743     Therefore, in the default configuration, any unqualified access
1744     again can only refer to the public schema.
1745    </para>
1746
1747    <para>
1748     To put our new schema in the path, we use
1749 <programlisting>
1750 SET search_path TO myschema,public;
1751 </programlisting>
1752     (We omit the <literal>$user</literal> here because we have no
1753     immediate need for it.)  And then we can access the table without
1754     schema qualification:
1755 <programlisting>
1756 DROP TABLE mytable;
1757 </programlisting>
1758     Also, since <literal>myschema</literal> is the first element in
1759     the path, new objects would by default be created in it.
1760    </para>
1761
1762    <para>
1763     We could also have written
1764 <programlisting>
1765 SET search_path TO myschema;
1766 </programlisting>
1767     Then we no longer have access to the public schema without
1768     explicit qualification.  There is nothing special about the public
1769     schema except that it exists by default.  It can be dropped, too.
1770    </para>
1771
1772    <para>
1773     See also <xref linkend="functions-info"> for other ways to access
1774     the schema search path.
1775    </para>
1776
1777    <para>
1778     The search path works in the same way for data type names, function names,
1779     and operator names as it does for table names.  Data type and function
1780     names can be qualified in exactly the same way as table names.  If you
1781     need to write a qualified operator name in an expression, there is a
1782     special provision: you must write
1783 <synopsis>
1784 <literal>OPERATOR(</><replaceable>schema</><literal>.</><replaceable>operator</><literal>)</>
1785 </synopsis>
1786     This is needed to avoid syntactic ambiguity.  An example is
1787 <programlisting>
1788 SELECT 3 OPERATOR(pg_catalog.+) 4;
1789 </programlisting>
1790     In practice one usually relies on the search path for operators,
1791     so as not to have to write anything so ugly as that.
1792    </para>
1793   </sect2>
1794
1795   <sect2 id="ddl-schemas-priv">
1796    <title>Schemas and Privileges</title>
1797
1798    <indexterm zone="ddl-schemas-priv">
1799     <primary>privilege</primary>
1800     <secondary sortas="schemas">for schemas</secondary>
1801    </indexterm>
1802
1803    <para>
1804     By default, users cannot access any objects in schemas they do not
1805     own.  To allow that, the owner of the schema needs to grant the
1806     <literal>USAGE</literal> privilege on the schema.  To allow users
1807     to make use of the objects in the schema, additional privileges
1808     may need to be granted, as appropriate for the object.
1809    </para>
1810
1811    <para>
1812     A user can also be allowed to create objects in someone else's
1813     schema.  To allow that, the <literal>CREATE</literal> privilege on
1814     the schema needs to be granted.  Note that by default, everyone
1815     has <literal>CREATE</literal> and <literal>USAGE</literal> privileges on
1816     the schema
1817     <literal>public</literal>.  This allows all users that are able to
1818     connect to a given database to create objects in its
1819     <literal>public</literal> schema.  If you do
1820     not want to allow that, you can revoke that privilege:
1821 <programlisting>
1822 REVOKE CREATE ON SCHEMA public FROM PUBLIC;
1823 </programlisting>
1824     (The first <quote>public</quote> is the schema, the second
1825     <quote>public</quote> means <quote>every user</quote>.  In the
1826     first sense it is an identifier, in the second sense it is a
1827     reserved word, hence the different capitalization; recall the
1828     guidelines from <xref linkend="sql-syntax-identifiers">.)
1829    </para>
1830   </sect2>
1831
1832   <sect2 id="ddl-schemas-catalog">
1833    <title>The System Catalog Schema</title>
1834
1835    <indexterm zone="ddl-schemas-catalog">
1836     <primary>system catalog</primary>
1837     <secondary>schema</secondary>
1838    </indexterm>
1839
1840    <para>
1841     In addition to <literal>public</> and user-created schemas, each
1842     database contains a <literal>pg_catalog</> schema, which contains
1843     the system tables and all the built-in data types, functions, and
1844     operators.  <literal>pg_catalog</> is always effectively part of
1845     the search path.  If it is not named explicitly in the path then
1846     it is implicitly searched <emphasis>before</> searching the path's
1847     schemas.  This ensures that built-in names will always be
1848     findable.  However, you may explicitly place
1849     <literal>pg_catalog</> at the end of your search path if you
1850     prefer to have user-defined names override built-in names.
1851    </para>
1852
1853    <para>
1854     In <productname>PostgreSQL</productname> versions before 7.3,
1855     table names beginning with <literal>pg_</> were reserved.  This is
1856     no longer true: you may create such a table name if you wish, in
1857     any non-system schema.  However, it's best to continue to avoid
1858     such names, to ensure that you won't suffer a conflict if some
1859     future version defines a system table named the same as your
1860     table.  (With the default search path, an unqualified reference to
1861     your table name would be resolved as the system table instead.)
1862     System tables will continue to follow the convention of having
1863     names beginning with <literal>pg_</>, so that they will not
1864     conflict with unqualified user-table names so long as users avoid
1865     the <literal>pg_</> prefix.
1866    </para>
1867   </sect2>
1868
1869   <sect2 id="ddl-schemas-patterns">
1870    <title>Usage Patterns</title>
1871
1872    <para>
1873     Schemas can be used to organize your data in many ways.  There are
1874     a few usage patterns that are recommended and are easily supported by
1875     the default configuration:
1876     <itemizedlist>
1877      <listitem>
1878       <para>
1879        If you do not create any schemas then all users access the
1880        public schema implicitly.  This simulates the situation where
1881        schemas are not available at all.  This setup is mainly
1882        recommended when there is only a single user or a few cooperating
1883        users in a database.  This setup also allows smooth transition
1884        from the non-schema-aware world.
1885       </para>
1886      </listitem>
1887
1888      <listitem>
1889       <para>
1890        You can create a schema for each user with the same name as
1891        that user.  Recall that the default search path starts with
1892        <literal>$user</literal>, which resolves to the user name.
1893        Therefore, if each user has a separate schema, they access their
1894        own schemas by default.
1895       </para>
1896
1897       <para>
1898        If you use this setup then you might also want to revoke access
1899        to the public schema (or drop it altogether), so users are
1900        truly constrained to their own schemas.
1901       </para>
1902      </listitem>
1903
1904      <listitem>
1905       <para>
1906        To install shared applications (tables to be used by everyone,
1907        additional functions provided by third parties, etc.), put them
1908        into separate schemas.  Remember to grant appropriate
1909        privileges to allow the other users to access them.  Users can
1910        then refer to these additional objects by qualifying the names
1911        with a schema name, or they can put the additional schemas into
1912        their search path, as they choose.
1913       </para>
1914      </listitem>
1915     </itemizedlist>
1916    </para>
1917   </sect2>
1918
1919   <sect2 id="ddl-schemas-portability">
1920    <title>Portability</title>
1921
1922    <para>
1923     In the SQL standard, the notion of objects in the same schema
1924     being owned by different users does not exist.  Moreover, some
1925     implementations do not allow you to create schemas that have a
1926     different name than their owner.  In fact, the concepts of schema
1927     and user are nearly equivalent in a database system that
1928     implements only the basic schema support specified in the
1929     standard.  Therefore, many users consider qualified names to
1930     really consist of
1931     <literal><replaceable>username</>.<replaceable>tablename</></literal>.
1932     This is how <productname>PostgreSQL</productname> will effectively
1933     behave if you create a per-user schema for every user.
1934    </para>
1935
1936    <para>
1937     Also, there is no concept of a <literal>public</> schema in the
1938     SQL standard.  For maximum conformance to the standard, you should
1939     not use (perhaps even remove) the <literal>public</> schema.
1940    </para>
1941
1942    <para>
1943     Of course, some SQL database systems might not implement schemas
1944     at all, or provide namespace support by allowing (possibly
1945     limited) cross-database access.  If you need to work with those
1946     systems, then maximum portability would be achieved by not using
1947     schemas at all.
1948    </para>
1949   </sect2>
1950  </sect1>
1951
1952  <sect1 id="ddl-others">
1953   <title>Other Database Objects</title>
1954
1955   <para>
1956    Tables are the central objects in a relational database structure,
1957    because they hold your data.  But they are not the only objects
1958    that exist in a database.  Many other kinds of objects can be
1959    created to make the use and management of the data more efficient
1960    or convenient.  They are not discussed in this chapter, but we give
1961    you a list here so that you are aware of what is possible.
1962   </para>
1963
1964   <itemizedlist>
1965    <listitem>
1966     <para>
1967      Views
1968     </para>
1969    </listitem>
1970
1971    <listitem>
1972     <para>
1973      Functions, operators, data types, domains
1974     </para>
1975    </listitem>
1976
1977    <listitem>
1978     <para>
1979      Triggers and rewrite rules
1980     </para>
1981    </listitem>
1982   </itemizedlist>
1983
1984   <para>
1985    Detailed information on
1986    these topics appears in <xref linkend="server-programming">.
1987   </para>
1988  </sect1>
1989
1990  <sect1 id="ddl-depend">
1991   <title>Dependency Tracking</title>
1992
1993   <indexterm zone="ddl-depend">
1994    <primary>CASCADE</primary>
1995    <secondary sortas="DROP">with DROP</secondary>
1996   </indexterm>
1997
1998   <indexterm zone="ddl-depend">
1999    <primary>RESTRICT</primary>
2000    <secondary sortas="DROP">with DROP</secondary>
2001   </indexterm>
2002
2003   <para>
2004    When you create complex database structures involving many tables
2005    with foreign key constraints, views, triggers, functions, etc. you
2006    will implicitly create a net of dependencies between the objects.
2007    For instance, a table with a foreign key constraint depends on the
2008    table it references.
2009   </para>
2010
2011   <para>
2012    To ensure the integrity of the entire database structure,
2013    <productname>PostgreSQL</productname> makes sure that you cannot
2014    drop objects that other objects still depend on.  For example,
2015    attempting to drop the products table we had considered in <xref
2016    linkend="ddl-constraints-fk">, with the orders table depending on
2017    it, would result in an error message such as this:
2018 <screen>
2019 DROP TABLE products;
2020
2021 NOTICE:  constraint orders_product_no_fkey on table orders depends on table products
2022 ERROR:  cannot drop table products because other objects depend on it
2023 HINT:  Use DROP ... CASCADE to drop the dependent objects too.
2024 </screen>
2025    The error message contains a useful hint: if you do not want to
2026    bother deleting all the dependent objects individually, you can run
2027 <screen>
2028 DROP TABLE products CASCADE;
2029 </screen>
2030    and all the dependent objects will be removed.  In this case, it
2031    doesn't remove the orders table, it only removes the foreign key
2032    constraint.  (If you want to check what <literal>DROP ... CASCADE</> will do,
2033    run <command>DROP</> without <literal>CASCADE</> and read the <literal>NOTICE</> messages.)
2034   </para>
2035
2036   <para>
2037    All drop commands in <productname>PostgreSQL</productname> support
2038    specifying <literal>CASCADE</literal>.  Of course, the nature of
2039    the possible dependencies varies with the type of the object.  You
2040    can also write <literal>RESTRICT</literal> instead of
2041    <literal>CASCADE</literal> to get the default behavior, which is to
2042    prevent drops of objects that other objects depend on.
2043   </para>
2044
2045   <note>
2046    <para>
2047     According to the SQL standard, specifying either
2048     <literal>RESTRICT</literal> or <literal>CASCADE</literal> is
2049     required.  No database system actually implements it that way, but
2050     whether the default behavior is <literal>RESTRICT</literal> or
2051     <literal>CASCADE</literal> varies across systems.
2052    </para>
2053   </note>
2054
2055   <note>
2056    <para>
2057     Foreign key constraint dependencies and serial column dependencies
2058     from <productname>PostgreSQL</productname> versions prior to 7.3
2059     are <emphasis>not</emphasis> maintained or created during the
2060     upgrade process.  All other dependency types will be properly
2061     created during an upgrade.
2062    </para>
2063   </note>
2064  </sect1>
2065
2066 </chapter>