granicus.if.org Git - postgresql/blob - doc/src/sgml/advanced.sgml

   1 <!-- $PostgreSQL: pgsql/doc/src/sgml/advanced.sgml,v 1.54 2007/02/01 00:28:16 momjian Exp $ -->
   2
   3  <chapter id="tutorial-advanced">
   4   <title>Advanced Features</title>
   5
   6   <sect1 id="tutorial-advanced-intro">
   7    <title>Introduction</title>
   8
   9    <para>
  10     In the previous chapter we have covered the basics of using
  11     <acronym>SQL</acronym> to store and access your data in
  12     <productname>PostgreSQL</productname>.  We will now discuss some
  13     more advanced features of <acronym>SQL</acronym> that simplify
  14     management and prevent loss or corruption of your data.  Finally,
  15     we will look at some <productname>PostgreSQL</productname>
  16     extensions.
  17    </para>
  18
  19    <para>
  20     This chapter will on occasion refer to examples found in <xref
  21     linkend="tutorial-sql"> to change or improve them, so it will be
  22     of advantage if you have read that chapter.  Some examples from
  23     this chapter can also be found in
  24     <filename>advanced.sql</filename> in the tutorial directory.  This
  25     file also contains some example data to load, which is not
  26     repeated here.  (Refer to <xref linkend="tutorial-sql-intro"> for
  27     how to use the file.)
  28    </para>
  29   </sect1>
  30
  31
  32   <sect1 id="tutorial-views">
  33    <title>Views</title>
  34
  35    <indexterm zone="tutorial-views">
  36     <primary>view</primary>
  37    </indexterm>
  38
  39    <para>
  40     Refer back to the queries in <xref linkend="tutorial-join">.
  41     Suppose the combined listing of weather records and city location
  42     is of particular interest to your application, but you do not want
  43     to type the query each time you need it.  You can create a
  44     <firstterm>view</firstterm> over the query, which gives a name to
  45     the query that you can refer to like an ordinary table:
  46
  47 <programlisting>
  48 CREATE VIEW myview AS
  49     SELECT city, temp_lo, temp_hi, prcp, date, location
  50         FROM weather, cities
  51         WHERE city = name;
  52
  53 SELECT * FROM myview;
  54 </programlisting>
  55    </para>
  56
  57    <para>
  58     Making liberal use of views is a key aspect of good SQL database
  59     design.  Views allow you to encapsulate the details of the
  60     structure of your tables, which might change as your application
  61     evolves, behind consistent interfaces.
  62    </para>
  63
  64    <para>
  65     Views can be used in almost any place a real table can be used.
  66     Building views upon other views is not uncommon.
  67    </para>
  68   </sect1>
  69
  70
  71   <sect1 id="tutorial-fk">
  72    <title>Foreign Keys</title>
  73
  74    <indexterm zone="tutorial-fk">
  75     <primary>foreign key</primary>
  76    </indexterm>
  77
  78    <indexterm zone="tutorial-fk">
  79     <primary>referential integrity</primary>
  80    </indexterm>
  81
  82    <para>
  83     Recall the <classname>weather</classname> and
  84     <classname>cities</classname> tables from <xref
  85     linkend="tutorial-sql">.  Consider the following problem:  You
  86     want to make sure that no one can insert rows in the
  87     <classname>weather</classname> table that do not have a matching
  88     entry in the <classname>cities</classname> table.  This is called
  89     maintaining the <firstterm>referential integrity</firstterm> of
  90     your data.  In simplistic database systems this would be
  91     implemented (if at all) by first looking at the
  92     <classname>cities</classname> table to check if a matching record
  93     exists, and then inserting or rejecting the new
  94     <classname>weather</classname> records.  This approach has a
  95     number of problems and is very inconvenient, so
  96     <productname>PostgreSQL</productname> can do this for you.
  97    </para>
  98
  99    <para>
 100     The new declaration of the tables would look like this:
 101
 102 <programlisting>
 103 CREATE TABLE cities (
 104         city     varchar(80) primary key,
 105         location point
 106 );
 107
 108 CREATE TABLE weather (
 109         city      varchar(80) references cities(city),
 110         temp_lo   int,
 111         temp_hi   int,
 112         prcp      real,
 113         date      date
 114 );
 115 </programlisting>
 116
 117     Now try inserting an invalid record:
 118
 119 <programlisting>
 120 INSERT INTO weather VALUES ('Berkeley', 45, 53, 0.0, '1994-11-28');
 121 </programlisting>
 122
 123 <screen>
 124 ERROR:  insert or update on table "weather" violates foreign key constraint "weather_city_fkey"
 125 DETAIL:  Key (city)=(Berkeley) is not present in table "cities".
 126 </screen>
 127    </para>
 128
 129    <para>
 130     The behavior of foreign keys can be finely tuned to your
 131     application.  We will not go beyond this simple example in this
 132     tutorial, but just refer you to <xref linkend="ddl">
 133     for more information.  Making correct use of
 134     foreign keys will definitely improve the quality of your database
 135     applications, so you are strongly encouraged to learn about them.
 136    </para>
 137   </sect1>
 138
 139
 140   <sect1 id="tutorial-transactions">
 141    <title>Transactions</title>
 142
 143    <indexterm zone="tutorial-transactions">
 144     <primary>transaction</primary>
 145    </indexterm>
 146
 147    <para>
 148     <firstterm>Transactions</> are a fundamental concept of all database
 149     systems.  The essential point of a transaction is that it bundles
 150     multiple steps into a single, all-or-nothing operation.  The intermediate
 151     states between the steps are not visible to other concurrent transactions,
 152     and if some failure occurs that prevents the transaction from completing,
 153     then none of the steps affect the database at all.
 154    </para>
 155
 156    <para>
 157     For example, consider a bank database that contains balances for various
 158     customer accounts, as well as total deposit balances for branches.
 159     Suppose that we want to record a payment of $100.00 from Alice's account
 160     to Bob's account.  Simplifying outrageously, the SQL commands for this
 161     might look like:
 162
 163 <programlisting>
 164 UPDATE accounts SET balance = balance - 100.00
 165     WHERE name = 'Alice';
 166 UPDATE branches SET balance = balance - 100.00
 167     WHERE name = (SELECT branch_name FROM accounts WHERE name = 'Alice');
 168 UPDATE accounts SET balance = balance + 100.00
 169     WHERE name = 'Bob';
 170 UPDATE branches SET balance = balance + 100.00
 171     WHERE name = (SELECT branch_name FROM accounts WHERE name = 'Bob');
 172 </programlisting>
 173    </para>
 174
 175    <para>
 176     The details of these commands are not important here; the important
 177     point is that there are several separate updates involved to accomplish
 178     this rather simple operation.  Our bank's officers will want to be
 179     assured that either all these updates happen, or none of them happen.
 180     It would certainly not do for a system failure to result in Bob
 181     receiving $100.00 that was not debited from Alice.  Nor would Alice long
 182     remain a happy customer if she was debited without Bob being credited.
 183     We need a guarantee that if something goes wrong partway through the
 184     operation, none of the steps executed so far will take effect.  Grouping
 185     the updates into a <firstterm>transaction</> gives us this guarantee.
 186     A transaction is said to be <firstterm>atomic</>: from the point of
 187     view of other transactions, it either happens completely or not at all.
 188    </para>
 189
 190    <para>
 191     We also want a
 192     guarantee that once a transaction is completed and acknowledged by
 193     the database system, it has indeed been permanently recorded
 194     and won't be lost even if a crash ensues shortly thereafter.
 195     For example, if we are recording a cash withdrawal by Bob,
 196     we do not want any chance that the debit to his account will
 197     disappear in a crash just after he walks out the bank door.
 198     A transactional database guarantees that all the updates made by
 199     a transaction are logged in permanent storage (i.e., on disk) before
 200     the transaction is reported complete.
 201    </para>
 202
 203    <para>
 204     Another important property of transactional databases is closely
 205     related to the notion of atomic updates: when multiple transactions
 206     are running concurrently, each one should not be able to see the
 207     incomplete changes made by others.  For example, if one transaction
 208     is busy totalling all the branch balances, it would not do for it
 209     to include the debit from Alice's branch but not the credit to
 210     Bob's branch, nor vice versa.  So transactions must be all-or-nothing
 211     not only in terms of their permanent effect on the database, but
 212     also in terms of their visibility as they happen.  The updates made
 213     so far by an open transaction are invisible to other transactions
 214     until the transaction completes, whereupon all the updates become
 215     visible simultaneously.
 216    </para>
 217
 218    <para>
 219     In <productname>PostgreSQL</>, a transaction is set up by surrounding
 220     the SQL commands of the transaction with
 221     <command>BEGIN</> and <command>COMMIT</> commands.  So our banking
 222     transaction would actually look like:
 223
 224 <programlisting>
 225 BEGIN;
 226 UPDATE accounts SET balance = balance - 100.00
 227     WHERE name = 'Alice';
 228 -- etc etc
 229 COMMIT;
 230 </programlisting>
 231    </para>
 232
 233    <para>
 234     If, partway through the transaction, we decide we do not want to
 235     commit (perhaps we just noticed that Alice's balance went negative),
 236     we can issue the command <command>ROLLBACK</> instead of
 237     <command>COMMIT</>, and all our updates so far will be canceled.
 238    </para>
 239
 240    <para>
 241     <productname>PostgreSQL</> actually treats every SQL statement as being
 242     executed within a transaction.  If you do not issue a <command>BEGIN</>
 243     command,
 244     then each individual statement has an implicit <command>BEGIN</> and
 245     (if successful) <command>COMMIT</> wrapped around it.  A group of
 246     statements surrounded by <command>BEGIN</> and <command>COMMIT</>
 247     is sometimes called a <firstterm>transaction block</>.
 248    </para>
 249
 250    <note>
 251     <para>
 252      Some client libraries issue <command>BEGIN</> and <command>COMMIT</>
 253      commands automatically, so that you might get the effect of transaction
 254      blocks without asking.  Check the documentation for the interface
 255      you are using.
 256     </para>
 257    </note>
 258
 259    <para>
 260     It's possible to control the statements in a transaction in a more
 261     granular fashion through the use of <firstterm>savepoints</>.  Savepoints
 262     allow you to selectively discard parts of the transaction, while
 263     committing the rest.  After defining a savepoint with
 264     <command>SAVEPOINT</>, you can if needed roll back to the savepoint
 265     with <command>ROLLBACK TO</>.  All the transaction's database changes
 266     between defining the savepoint and rolling back to it are discarded, but
 267     changes earlier than the savepoint are kept.
 268    </para>
 269
 270    <para>
 271     After rolling back to a savepoint, it continues to be defined, so you can
 272     roll back to it several times.  Conversely, if you are sure you won't need
 273     to roll back to a particular savepoint again, it can be released, so the
 274     system can free some resources.  Keep in mind that either releasing or
 275     rolling back to a savepoint
 276     will automatically release all savepoints that were defined after it.
 277    </para>
 278
 279    <para>
 280     All this is happening within the transaction block, so none of it
 281     is visible to other database sessions.  When and if you commit the
 282     transaction block, the committed actions become visible as a unit
 283     to other sessions, while the rolled-back actions never become visible
 284     at all.
 285    </para>
 286
 287    <para>
 288     Remembering the bank database, suppose we debit $100.00 from Alice's
 289     account, and credit Bob's account, only to find later that we should
 290     have credited Wally's account.  We could do it using savepoints like
 291     this:
 292
 293 <programlisting>
 294 BEGIN;
 295 UPDATE accounts SET balance = balance - 100.00
 296     WHERE name = 'Alice';
 297 SAVEPOINT my_savepoint;
 298 UPDATE accounts SET balance = balance + 100.00
 299     WHERE name = 'Bob';
 300 -- oops ... forget that and use Wally's account
 301 ROLLBACK TO my_savepoint;
 302 UPDATE accounts SET balance = balance + 100.00
 303     WHERE name = 'Wally';
 304 COMMIT;
 305 </programlisting>
 306    </para>
 307
 308    <para>
 309     This example is, of course, oversimplified, but there's a lot of control
 310     to be had over a transaction block through the use of savepoints.
 311     Moreover, <command>ROLLBACK TO</> is the only way to regain control of a
 312     transaction block that was put in aborted state by the
 313     system due to an error, short of rolling it back completely and starting
 314     again.
 315    </para>
 316
 317   </sect1>
 318
 319
 320   <sect1 id="tutorial-inheritance">
 321    <title>Inheritance</title>
 322
 323    <indexterm zone="tutorial-inheritance">
 324     <primary>inheritance</primary>
 325    </indexterm>
 326
 327    <para>
 328     Inheritance is a concept from object-oriented databases.  It opens
 329     up interesting new possibilities of database design.
 330    </para>
 331
 332    <para>
 333     Let's create two tables:  A table <classname>cities</classname>
 334     and a table <classname>capitals</classname>.  Naturally, capitals
 335     are also cities, so you want some way to show the capitals
 336     implicitly when you list all cities.  If you're really clever you
 337     might invent some scheme like this:
 338
 339 <programlisting>
 340 CREATE TABLE capitals (
 341   name       text,
 342   population real,
 343   altitude   int,    -- (in ft)
 344   state      char(2)
 345 );
 346
 347 CREATE TABLE non_capitals (
 348   name       text,
 349   population real,
 350   altitude   int     -- (in ft)
 351 );
 352
 353 CREATE VIEW cities AS
 354   SELECT name, population, altitude FROM capitals
 355     UNION
 356   SELECT name, population, altitude FROM non_capitals;
 357 </programlisting>
 358
 359     This works OK as far as querying goes, but it gets ugly when you
 360     need to update several rows, for one thing.
 361    </para>
 362
 363    <para>
 364     A better solution is this:
 365
 366 <programlisting>
 367 CREATE TABLE cities (
 368   name       text,
 369   population real,
 370   altitude   int     -- (in ft)
 371 );
 372
 373 CREATE TABLE capitals (
 374   state      char(2)
 375 ) INHERITS (cities);
 376 </programlisting>
 377    </para>
 378
 379    <para>
 380     In this case, a row of <classname>capitals</classname>
 381     <firstterm>inherits</firstterm> all columns (<structfield>name</>,
 382     <structfield>population</>, and <structfield>altitude</>) from its
 383     <firstterm>parent</firstterm>, <classname>cities</classname>.  The
 384     type of the column <structfield>name</structfield> is
 385     <type>text</type>, a native <productname>PostgreSQL</productname>
 386     type for variable length character strings.  State capitals have
 387     an extra column, <structfield>state</>, that shows their state.  In
 388     <productname>PostgreSQL</productname>, a table can inherit from
 389     zero or more other tables.
 390    </para>
 391
 392    <para>
 393     For example, the  following  query finds the  names  of  all  cities,
 394     including  state capitals, that are located at an altitude
 395     over 500 feet:
 396
 397 <programlisting>
 398 SELECT name, altitude
 399   FROM cities
 400   WHERE altitude &gt; 500;
 401 </programlisting>
 402
 403     which returns:
 404
 405 <screen>
 406    name    | altitude
 407 -----------+----------
 408  Las Vegas |     2174
 409  Mariposa  |     1953
 410  Madison   |      845
 411 (3 rows)
 412 </screen>
 413    </para>
 414
 415    <para>
 416     On the other hand, the  following  query  finds
 417     all  the cities that are not state capitals and
 418     are situated at an altitude of 500 feet or higher:
 419
 420 <programlisting>
 421 SELECT name, altitude
 422     FROM ONLY cities
 423     WHERE altitude &gt; 500;
 424 </programlisting>
 425
 426 <screen>
 427    name    | altitude
 428 -----------+----------
 429  Las Vegas |     2174
 430  Mariposa  |     1953
 431 (2 rows)
 432 </screen>
 433    </para>
 434
 435    <para>
 436     Here the <literal>ONLY</literal> before <literal>cities</literal>
 437     indicates that the query should be run over only the
 438     <classname>cities</classname> table, and not tables below
 439     <classname>cities</classname> in the inheritance hierarchy.  Many
 440     of the commands that we have already discussed &mdash;
 441     <command>SELECT</command>, <command>UPDATE</command>, and
 442     <command>DELETE</command> &mdash; support this <literal>ONLY</literal>
 443     notation.
 444    </para>
 445
 446    <note>
 447     <para>
 448      Although inheritance is frequently useful, it has not been integrated
 449      with unique constraints or foreign keys, which limits its usefulness.
 450      See <xref linkend="ddl-inherit"> for more detail.
 451     </para>
 452    </note>
 453   </sect1>
 454
 455
 456   <sect1 id="tutorial-conclusion">
 457    <title>Conclusion</title>
 458
 459    <para>
 460     <productname>PostgreSQL</productname> has many features not
 461     touched upon in this tutorial introduction, which has been
 462     oriented toward newer users of <acronym>SQL</acronym>.  These
 463     features are discussed in more detail in the remainder of this
 464     book.
 465    </para>
 466
 467    <para>
 468     If you feel you need more introductory material, please visit the PostgreSQL
 469     <ulink url="http://www.postgresql.org">web site</ulink>
 470     for links to more resources.
 471    </para>
 472   </sect1>
 473  </chapter>