From 81c41e3d0ed3c63b02412a6aab824e5ce79780c2 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Wed, 5 Jan 2005 23:42:03 +0000 Subject: [PATCH] More minor updates and copy-editing. --- doc/src/sgml/arch-dev.sgml | 81 +++++++++------- doc/src/sgml/bki.sgml | 126 ++++++++++++++----------- doc/src/sgml/catalogs.sgml | 182 +++++++++++++++++++++++------------- doc/src/sgml/geqo.sgml | 40 +++++--- doc/src/sgml/plhandler.sgml | 10 +- 5 files changed, 265 insertions(+), 174 deletions(-) diff --git a/doc/src/sgml/arch-dev.sgml b/doc/src/sgml/arch-dev.sgml index f84269c032..b4f79f44bb 100644 --- a/doc/src/sgml/arch-dev.sgml +++ b/doc/src/sgml/arch-dev.sgml @@ -1,5 +1,5 @@ @@ -63,11 +63,11 @@ $PostgreSQL: pgsql/doc/src/sgml/arch-dev.sgml,v 2.24 2003/11/29 19:51:36 pgsql E system catalogs) to apply to the query tree. It performs the transformations given in the rule bodies. - One application of the rewrite system is in the realization of - views. + One application of the rewrite system is in the realization of + views. Whenever a query against a view (i.e. a virtual table) is made, the rewrite system rewrites the user's query to @@ -90,8 +90,8 @@ $PostgreSQL: pgsql/doc/src/sgml/arch-dev.sgml,v 2.24 2003/11/29 19:51:36 pgsql E relation to be scanned, there are two paths for the scan. One possibility is a simple sequential scan and the other possibility is to use the index. Next the cost for the execution of - each plan is estimated and the - cheapest plan is chosen and handed back. + each path is estimated and the cheapest path is chosen. The cheapest + path is expanded into a complete plan that the executor can use. @@ -142,7 +142,8 @@ $PostgreSQL: pgsql/doc/src/sgml/arch-dev.sgml,v 2.24 2003/11/29 19:51:36 pgsql E PostgreSQL protocol described in . Many clients are based on the C-language library libpq, but several independent - implementations exist, such as the Java JDBC driver. + implementations of the protocol exist, such as the Java + JDBC driver. @@ -339,7 +340,7 @@ $PostgreSQL: pgsql/doc/src/sgml/arch-dev.sgml,v 2.24 2003/11/29 19:51:36 pgsql E different ways, each of which will produce the same set of results. If it is computationally feasible, the query optimizer will examine each of these possible execution plans, ultimately - selecting the execution plan that will run the fastest. + selecting the execution plan that is expected to run the fastest. @@ -355,20 +356,26 @@ $PostgreSQL: pgsql/doc/src/sgml/arch-dev.sgml,v 2.24 2003/11/29 19:51:36 pgsql E - After the cheapest path is determined, a plan tree - is built to pass to the executor. This represents the desired - execution plan in sufficient detail for the executor to run it. + The planner's search procedure actually works with data structures + called paths, which are simply cut-down representations of + plans containing only as much information as the planner needs to make + its decisions. After the cheapest path is determined, a full-fledged + plan tree is built to pass to the executor. This represents + the desired execution plan in sufficient detail for the executor to run it. + In the rest of this section we'll ignore the distinction between paths + and plans. Generating Possible Plans - The planner/optimizer decides which plans should be generated - based upon the types of indexes defined on the relations appearing in - a query. There is always the possibility of performing a - sequential scan on a relation, so a plan using only - sequential scans is always created. Assume an index is defined on a + The planner/optimizer starts by generating plans for scanning each + individual relation (table) used in the query. The possible plans + are determined by the available indexes on each relation. + There is always the possibility of performing a + sequential scan on a relation, so a sequential scan plan is always + created. Assume an index is defined on a relation (for example a B-tree index) and a query contains the restriction relation.attribute OPR constant. If @@ -395,37 +402,47 @@ $PostgreSQL: pgsql/doc/src/sgml/arch-dev.sgml,v 2.24 2003/11/29 19:51:36 pgsql E - nested loop join: The right relation is scanned - once for every row found in the left relation. This strategy - is easy to implement but can be very time consuming. (However, - if the right relation can be scanned with an index scan, this can - be a good strategy. It is possible to use values from the current - row of the left relation as keys for the index scan of the right.) + nested loop join: The right relation is scanned + once for every row found in the left relation. This strategy + is easy to implement but can be very time consuming. (However, + if the right relation can be scanned with an index scan, this can + be a good strategy. It is possible to use values from the current + row of the left relation as keys for the index scan of the right.) - merge sort join: Each relation is sorted on the join - attributes before the join starts. Then the two relations are - merged together taking into account that both relations are - ordered on the join attributes. This kind of join is more - attractive because each relation has to be scanned only once. + merge sort join: Each relation is sorted on the join + attributes before the join starts. Then the two relations are + scanned in parallel, and matching rows are combined to form + join rows. This kind of join is more + attractive because each relation has to be scanned only once. + The required sorting may be achieved either by an explicit sort + step, or by scanning the relation in the proper order using an + index on the join key. - hash join: the right relation is first scanned - and loaded into a hash table, using its join attributes as hash keys. - Next the left relation is scanned and the - appropriate values of every row found are used as hash keys to - locate the matching rows in the table. + hash join: the right relation is first scanned + and loaded into a hash table, using its join attributes as hash keys. + Next the left relation is scanned and the + appropriate values of every row found are used as hash keys to + locate the matching rows in the table. + + When the query involves more than two relations, the final result + must be built up by a tree of join steps, each with two inputs. + The planner examines different possible join sequences to find the + cheapest one. + + The finished plan tree consists of sequential or index scans of the base relations, plus nested-loop, merge, or hash join nodes as @@ -512,7 +529,7 @@ $PostgreSQL: pgsql/doc/src/sgml/arch-dev.sgml,v 2.24 2003/11/29 19:51:36 pgsql E the executor top level uses this information to create a new updated row and mark the old row deleted. For DELETE, the only column that is actually returned by the plan is the TID, and the executor top - level simply uses the TID to visit the target rows and mark them deleted. + level simply uses the TID to visit each target row and mark it deleted. diff --git a/doc/src/sgml/bki.sgml b/doc/src/sgml/bki.sgml index a4b07d7573..867dd421fe 100644 --- a/doc/src/sgml/bki.sgml +++ b/doc/src/sgml/bki.sgml @@ -1,5 +1,5 @@ @@ -7,10 +7,11 @@ $PostgreSQL: pgsql/doc/src/sgml/bki.sgml,v 1.12 2003/11/29 19:51:36 pgsql Exp $ Backend Interface (BKI) files are scripts in a - special language that are input to the - PostgreSQL backend running in the special - bootstrap mode that allows it to perform database - functions without a database system already existing. + special language that is understood by the + PostgreSQL backend when running in the + bootstrap mode. The bootstrap mode allows system catalogs + to be created and filled from scratch, whereas ordinary SQL commands + require the catalogs to exist already. BKI files can therefore be used to create the database system in the first place. (And they are probably not useful for anything else.) @@ -21,8 +22,9 @@ $PostgreSQL: pgsql/doc/src/sgml/bki.sgml,v 1.12 2003/11/29 19:51:36 pgsql Exp $ to do part of its job when creating a new database cluster. The input file used by initdb is created as part of building and installing PostgreSQL - by a program named genbki.sh from some - specially formatted C header files in the source tree. The created + by a program named genbki.sh, which reads some + specially formatted C header files in the src/include/catalog/ + directory of the source tree. The created BKI file is called postgres.bki and is normally installed in the share subdirectory of the installation tree. @@ -40,9 +42,7 @@ $PostgreSQL: pgsql/doc/src/sgml/bki.sgml,v 1.12 2003/11/29 19:51:36 pgsql Exp $ This section describes how the PostgreSQL backend interprets BKI files. This description will be easier to understand if the postgres.bki - file is at hand as an example. You should also study the source - code of initdb to get an idea of how the - backend is invoked. + file is at hand as an example. @@ -67,6 +67,61 @@ $PostgreSQL: pgsql/doc/src/sgml/bki.sgml,v 1.12 2003/11/29 19:51:36 pgsql Exp $ <acronym>BKI</acronym> Commands + + + create + bootstrap + shared_relation + without_oids + tablename + (name1 = + type1 , + name2 = type2, ...) + + + + + Create a table named tablename with the columns given + in parentheses. + + + + The following column types are supported directly by + bootstrap.c: bool, + bytea, char (1 byte), + name, int2, + int4, regproc, regclass, + regtype, text, + oid, tid, xid, + cid, int2vector, oidvector, + _int4 (array), _text (array), + _aclitem (array). Although it is possible to create + tables containing columns of other types, this cannot be done until + after pg_type has been created and filled with + appropriate entries. + + + + When bootstrap is specified, + the table will only be created on disk; nothing is entered into + pg_class, + pg_attribute, etc, for it. Thus the + table will not be accessible by ordinary SQL operations until + such entries are made the hard way (with insert + commands). This option is used for creating + pg_class etc themselves. + + + + The table is created as shared if shared_relation is + specified. + It will have OIDs unless without_oids is specified. + + + + open tablename @@ -98,51 +153,6 @@ $PostgreSQL: pgsql/doc/src/sgml/bki.sgml,v 1.12 2003/11/29 19:51:36 pgsql Exp $ - - - create tablename - (name1 = - type1 , - name2 = type2, ...) - - - - - Create a table named tablename with the columns given - in parentheses. - - - - The type is not necessarily the data - type that the column will have in the SQL environment; that is - determined by the pg_attribute system - catalog. The type here is essentially only used to allocate - storage. The following types are allowed: bool, - bytea, char (1 byte), - name, int2, int2vector, - int4, regproc, regclass, - regtype, text, - oid, tid, xid, - cid, oidvector, smgr, - _int4 (array), _aclitem (array). - Array types can also be indicated by writing - [] after the name of the element type. - - - - - The table will only be created on disk, it will not - automatically be registered in the system catalogs and will - therefore not be accessible unless appropriate rows are - inserted in pg_class, - pg_attribute, etc. - - - - - insert OID = oid_value (value1 value2 ...) @@ -190,6 +200,8 @@ $PostgreSQL: pgsql/doc/src/sgml/bki.sgml,v 1.12 2003/11/29 19:51:36 pgsql Exp $ classes to use are opclass1, opclass2 etc., respectively. + The index file is created and appropriate catalog entries are + made for it, but the index contents are not initialized by this command. @@ -199,7 +211,7 @@ $PostgreSQL: pgsql/doc/src/sgml/bki.sgml,v 1.12 2003/11/29 19:51:36 pgsql Exp $ - Build the indices that have previously been declared. + Fill in the indices that have previously been declared. @@ -212,7 +224,7 @@ $PostgreSQL: pgsql/doc/src/sgml/bki.sgml,v 1.12 2003/11/29 19:51:36 pgsql Exp $ The following sequence of commands will create the - test_table table with the two columns + table test_table with two columns cola and colb of type int4 and text, respectively, and insert two rows into the table. diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index 554d75adb2..b74f6ea9f1 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -1,6 +1,6 @@ @@ -33,7 +33,7 @@ Most system catalogs are copied from the template database during database creation and are thereafter database-specific. A few catalogs are physically shared across all databases in a cluster; - these are marked in the descriptions of the individual catalogs. + these are noted in the descriptions of the individual catalogs. @@ -85,7 +85,7 @@ pg_class - tables, indexes, sequences (relations) + tables, indexes, sequences, views (relations) @@ -663,6 +663,14 @@
+ + The adsrc field is historical, and is best + not used, because it does not track outside changes that might affect + the representation of the default value. Reverse-compiling the + adbin field (with pg_get_expr for + example) is a better way to display the default value. + + @@ -678,7 +686,8 @@ table columns. There will be exactly one pg_attribute row for every column in every table in the database. (There will also be attribute entries for - indexes and other objects. See pg_class.) + indexes, and indeed all objects that have pg_class + entries.) @@ -728,7 +737,7 @@ attstattarget controls the level of detail of statistics accumulated for this column by - ANALYZE. + . A zero value indicates that no statistics should be collected. A negative value says to use the system default statistics target. The exact meaning of positive values is data type-dependent. @@ -878,6 +887,17 @@ + + + In a dropped column's pg_attribute entry, + atttypid is reset to zero, but + attlen and the other fields copied from + pg_type are still valid. This arrangement is needed + to cope with the situation where the dropped column's data type was + later dropped, and so there is no pg_type row anymore. + attlen and the other fields can be used + to interpret the contents of a row of the table. + @@ -1230,8 +1250,8 @@ relhasrules bool - Table has rules; see - pg_rewrite catalog + True if table has rules; see + pg_rewrite catalog. @@ -1239,7 +1259,7 @@ relhassubclass bool - At least one table inherits from this one + True if table has (or once had) any inheritance children. @@ -1247,9 +1267,10 @@ aclitem[] - Access privileges; see the descriptions of - GRANT and REVOKE for - details. + Access privileges; see + and + + for details. @@ -1432,8 +1453,10 @@ - The catalog pg_conversion stores encoding conversion information. See - CREATE CONVERSION for more information. + The catalog pg_conversion describes the + available encoding conversion procedures. See + + for more information. @@ -1643,7 +1666,12 @@ dataclaclitem[] - Access privileges + + Access privileges; see + and + + for details. + @@ -1827,8 +1855,8 @@ - The catalog pg_description can store an optional description or - comment for each database object. Descriptions can be manipulated + The catalog pg_description stores optional descriptions + (comments) for each database object. Descriptions can be manipulated with the COMMENT command and viewed with psql's \d commands. Descriptions of many built-in system objects are provided in the initial @@ -2081,7 +2109,9 @@ The catalog pg_inherits records information about - table inheritance hierarchies. + table inheritance hierarchies. There is one entry for each direct + child table in the database. (Indirect inheritance can be determined + by following chains of entries.)
@@ -2121,7 +2151,7 @@ int4 - If there is more than one parent for a child table (multiple + If there is more than one direct parent for a child table (multiple inheritance), this number tells the order in which the inherited columns are to be arranged. The count starts at 1. @@ -2141,10 +2171,10 @@ - The catalog pg_language registers call interfaces or + The catalog pg_language registers languages in which you can write functions or stored procedures. - See under CREATE LANGUAGE and in - for more information about language handlers. + See + and for more information about language handlers.
@@ -2165,7 +2195,7 @@ lannamename - Name of the language (to be specified when creating a function) + Name of the language @@ -2186,8 +2216,7 @@ bool - This is a trusted language. See under CREATE - LANGUAGE what this means. If this is an internal + This is a trusted language. If this is an internal language (lanispl is false) then this column is meaningless. @@ -2212,8 +2241,7 @@ This references a language validator function that is responsible for checking the syntax and validity of new functions when they - are created. See under CREATE LANGUAGE for - further information about validators. + are created. Zero if no validator is provided. @@ -2221,7 +2249,12 @@ lanaclaclitem[] - Access privileges + + Access privileges; see + and + + for details. + @@ -2309,8 +2342,10 @@ - The catalog pg_listener supports the LISTEN - and NOTIFY commands. A listener creates an entry in + The catalog pg_listener supports the + and + + commands. A listener creates an entry in pg_listener for each notification name it is listening for. A notifier scans pg_listener and updates each matching entry to show that a notification has occurred. @@ -2410,7 +2445,12 @@ nspacl aclitem[] - Access privileges + + Access privileges; see + and + + for details. + @@ -2528,9 +2568,9 @@ - The catalog pg_operator stores information about operators. See - CREATE OPERATOR and for - details on these operator parameters. + The catalog pg_operator stores information about operators. + See + and for more information.
@@ -2703,9 +2743,8 @@ The catalog pg_proc stores information about functions (or procedures). - The description of CREATE FUNCTION and - contain more information about the meaning of - some columns. + See + and for more information. @@ -2869,16 +2908,21 @@ proacl aclitem[] - Access privileges + + Access privileges; see + and + + for details. +
+ For compiled functions, both built-in and dynamically loaded, prosrc contains the function's C-language - name (link symbol) for compiled functions, both built-in and - dynamically loaded. For all other language types, + name (link symbol). For all other currently-known language types, prosrc contains the function's source text. probin is unused except for dynamically-loaded C functions, for which it gives the name of the @@ -3041,7 +3085,7 @@ usesysid int4 - User id (arbitrary number used to reference this user) + User ID (arbitrary number used to reference this user) @@ -3072,14 +3116,14 @@ passwd text - Password + Password (possibly encrypted) valuntil abstime - Account expiry time (only used for password authentication) + Password expiry time (only used for password authentication) @@ -3311,7 +3355,12 @@ spcacl aclitem[] - Access privileges + + Access privileges; see + and + + for details. + @@ -3327,8 +3376,9 @@ - The catalog pg_trigger stores triggers on tables. See under - CREATE TRIGGER for more information. + The catalog pg_trigger stores triggers on tables. + See + for more information. @@ -3443,8 +3493,8 @@ - pg_class.reltriggers needs to match up with the - entries in this table. + pg_class.reltriggers needs to agree with the + number of triggers found in this table for the given relation. @@ -3459,12 +3509,14 @@ - The catalog pg_type stores information about data types. Base types - (scalar types) are created with CREATE TYPE. + The catalog pg_type stores information about data + types. Base types (scalar types) are created with + , and + domains with + . A composite type is automatically created for each table in the database, to represent the row structure of the table. It is also possible to create - composite types with CREATE TYPE AS and - domains with CREATE DOMAIN. + composite types with CREATE TYPE AS.
@@ -3797,17 +3849,9 @@ In addition to the system catalogs, PostgreSQL - provides a number of built-in views. The system views provide convenient - access to some commonly used queries on the system catalogs. Some of these - views provide access to internal server state, as well. - - - - lists the system views described here. - More detailed documentation of each view follows below. - There are some additional views that provide access to the results of - the statistics collector; they are described in . + provides a number of built-in views. Some system views provide convenient + access to some commonly used queries on the system catalogs. Other views + provide access to internal server state. @@ -3819,6 +3863,14 @@ the information you need. + + lists the system views described here. + More detailed documentation of each view follows below. + There are some additional views that provide access to the results of + the statistics collector; they are described in . + + Except where noted, all the views described here are read-only. @@ -4398,7 +4450,7 @@ and histogram_bounds arrays can be set on a column-by-column basis using the ALTER TABLE SET STATISTICS command, or globally by setting the - default_statistics_target runtime parameter. + runtime parameter. @@ -4515,7 +4567,7 @@ usesysid int4 - User id (arbitrary number used to reference this user) + User ID (arbitrary number used to reference this user) @@ -4553,7 +4605,7 @@ valuntil abstime - Account expiry time (only used for password authentication) + Password expiry time (only used for password authentication) diff --git a/doc/src/sgml/geqo.sgml b/doc/src/sgml/geqo.sgml index cbf4da6ec7..5822199860 100644 --- a/doc/src/sgml/geqo.sgml +++ b/doc/src/sgml/geqo.sgml @@ -1,5 +1,5 @@ @@ -65,8 +65,7 @@ Genetic Optimizer enormous amount of time and memory space when the number of joins in the query grows large. This makes the ordinary PostgreSQL query optimizer - inappropriate for database application domains that involve the - need for extensive queries, such as artificial intelligence. + inappropriate for queries that join a large number of tables. @@ -97,7 +96,7 @@ Genetic Optimizer The genetic algorithm (GA) is a heuristic optimization method which operates through - determined, randomized search. The set of possible solutions for the + nondeterministic, randomized search. The set of possible solutions for the optimization problem is considered as a population of individuals. The degree of adaptation of an individual to its environment is specified @@ -176,11 +175,12 @@ Genetic Optimizer Genetic Query Optimization (<acronym>GEQO</acronym>) in PostgreSQL - The GEQO module is intended for the solution of the query - optimization problem similar to a traveling salesman problem (TSP). + The GEQO module approaches the query + optimization problem as though it were the well-known traveling salesman + problem (TSP). Possible query plans are encoded as integer strings. Each string represents the join order from one relation of the query to the next. - E. g., the query tree + For example, the join tree /\ /\ 2 @@ -245,29 +245,39 @@ Genetic Optimizer Work is still needed to improve the genetic algorithm parameter settings. - In file backend/optimizer/geqo/geqo_params.c, routines + In file src/backend/optimizer/geqo/geqo_main.c, + routines gimme_pool_size and gimme_number_generations, we have to find a compromise for the parameter settings to satisfy two competing demands: - - Optimality of the query plan - + + Optimality of the query plan + - - Computing time - + + Computing time + + + At a more basic level, it is not clear that solving query optimization + with a GA algorithm designed for TSP is appropriate. In the TSP case, + the cost associated with any substring (partial tour) is independent + of the rest of the tour, but this is certainly not true for query + optimization. Thus it is questionable whether edge recombination + crossover is the most effective mutation procedure. + + - Further Readings + Further Reading The following resources contain additional information about diff --git a/doc/src/sgml/plhandler.sgml b/doc/src/sgml/plhandler.sgml index 8090f96272..cc2cac92a2 100644 --- a/doc/src/sgml/plhandler.sgml +++ b/doc/src/sgml/plhandler.sgml @@ -1,5 +1,5 @@ @@ -56,11 +56,11 @@ $PostgreSQL: pgsql/doc/src/sgml/plhandler.sgml,v 1.3 2004/12/30 21:45:36 tgl Exp system table pg_proc and to analyze the argument and return types of the called function. The AS clause from the - CREATE FUNCTION of the function will be found + CREATE FUNCTION command for the function will be found in the prosrc column of the - pg_proc row. This may be the source - text in the procedural language itself (like for PL/Tcl), a - path name to a file, or anything else that tells the call handler + pg_proc row. This is commonly source + text in the procedural language, but in theory it could be something else, + such as a path name to a file, or anything else that tells the call handler what to do in detail. -- 2.40.0