Fix lots of bad markup, bad English, bad explanations.
This commit covers only about half the contrib modules, but I grow weary...
+<!-- $PostgreSQL: pgsql/doc/src/sgml/adminpack.sgml,v 1.3 2007/12/06 04:12:09 tgl Exp $ -->
+
<sect1 id="adminpack">
<title>adminpack</title>
-
+
<indexterm zone="adminpack">
<primary>adminpack</primary>
</indexterm>
<para>
- adminpack is a PostgreSQL standard module that implements a number of
- support functions which pgAdmin and other administration and management tools
- can use to provide additional functionality if installed on a server.
+ <filename>adminpack</> provides a number of support functions which
+ <application>pgAdmin</> and other administration and management tools can
+ use to provide additional functionality, such as remote management
+ of server log files.
</para>
<sect2>
<title>Functions implemented</title>
+
<para>
- Functions implemented by adminpack can only be run by a superuser. Here's a
- list of these functions:
- </para>
- <para>
- <programlisting>
- int8 pg_catalog.pg_file_write(fname text, data text, append bool)
- bool pg_catalog.pg_file_rename(oldname text, newname text, archivname text)
- bool pg_catalog.pg_file_rename(oldname text, newname text)
- bool pg_catalog.pg_file_unlink(fname text)
- setof record pg_catalog.pg_logdir_ls()
-
- /* Renaming of existing backend functions for pgAdmin compatibility */
- int8 pg_catalog.pg_file_read(fname text, data text, append bool)
- bigint pg_catalog.pg_file_length(text)
- int4 pg_catalog.pg_logfile_rotate()
- </programlisting>
+ The functions implemented by <filename>adminpack</> can only be run by a
+ superuser. Here's a list of these functions:
+
+<programlisting>
+int8 pg_catalog.pg_file_write(fname text, data text, append bool)
+bool pg_catalog.pg_file_rename(oldname text, newname text, archivename text)
+bool pg_catalog.pg_file_rename(oldname text, newname text)
+bool pg_catalog.pg_file_unlink(fname text)
+setof record pg_catalog.pg_logdir_ls()
+
+/* Renaming of existing backend functions for pgAdmin compatibility */
+text pg_catalog.pg_file_read(fname text, offset bigint, nbytes bigint)
+bigint pg_catalog.pg_file_length(text)
+int4 pg_catalog.pg_logfile_rotate()
+</programlisting>
</para>
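+
+ <para>
+  As a rough sketch of how a management tool might call these functions
+  (the file name <filename>adminpack_test.txt</> is purely hypothetical,
+  and a superuser session is assumed):
+ </para>
+
+<programlisting>
+-- write a new file (append = false), then remove it again
+SELECT pg_catalog.pg_file_write('adminpack_test.txt', 'hello', false);
+SELECT pg_catalog.pg_file_unlink('adminpack_test.txt');
+</programlisting>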
+
</sect2>
</sect1>
+<!-- $PostgreSQL: pgsql/doc/src/sgml/btree-gist.sgml,v 1.4 2007/12/06 04:12:09 tgl Exp $ -->
+
<sect1 id="btree-gist">
<title>btree_gist</title>
-
+
<indexterm zone="btree-gist">
<primary>btree_gist</primary>
</indexterm>
<para>
- btree_gist is a B-Tree implementation using GiST that supports the int2, int4,
- int8, float4, float8 timestamp with/without time zone, time
- with/without time zone, date, interval, oid, money, macaddr, char,
- varchar/text, bytea, numeric, bit, varbit and inet/cidr types.
+ <filename>btree_gist</> provides sample GiST operator classes that
+ implement B-Tree equivalent behavior for the data types
+ <type>int2</>, <type>int4</>, <type>int8</>, <type>float4</>,
+ <type>float8</>, <type>numeric</>, <type>timestamp with time zone</>,
+ <type>timestamp without time zone</>, <type>time with time zone</>,
+ <type>time without time zone</>, <type>date</>, <type>interval</>,
+ <type>oid</>, <type>money</>, <type>char</>,
+ <type>varchar</>, <type>text</>, <type>bytea</>, <type>bit</>,
+ <type>varbit</>, <type>macaddr</>, <type>inet</>, and <type>cidr</>.
+ </para>
+
+ <para>
+ In general, these operator classes will not outperform the equivalent
+ standard btree index methods, and they lack one major feature of the
+ standard btree code: the ability to enforce uniqueness. However,
+ they are useful for GiST testing and as a base for developing other
+ GiST operator classes.
</para>
<sect2>
<title>Example usage</title>
- <programlisting>
- CREATE TABLE test (a int4);
- -- create index
- CREATE INDEX testidx ON test USING gist (a);
- -- query
- SELECT * FROM test WHERE a < 10;
- </programlisting>
+
+<programlisting>
+CREATE TABLE test (a int4);
+-- create index
+CREATE INDEX testidx ON test USING gist (a);
+-- query
+SELECT * FROM test WHERE a < 10;
+</programlisting>
+
</sect2>
-
+
<sect2>
<title>Authors</title>
+
<para>
- All work was done by Teodor Sigaev (<email>teodor@stack.net</email>) ,
- Oleg Bartunov (<email>oleg@sai.msu.su</email>), Janko Richter
- (<email>jankorichter@yahoo.de</email>). See
- <ulink url="http://www.sai.msu.su/~megera/postgres/gist"></ulink> for additional
- information.
+ Teodor Sigaev (<email>teodor@stack.net</email>),
+ Oleg Bartunov (<email>oleg@sai.msu.su</email>), and
+ Janko Richter (<email>jankorichter@yahoo.de</email>). See
+ <ulink url="http://www.sai.msu.su/~megera/postgres/gist"></ulink>
+ for additional information.
</para>
+
</sect2>
</sect1>
+<!-- $PostgreSQL: pgsql/doc/src/sgml/chkpass.sgml,v 1.2 2007/12/06 04:12:09 tgl Exp $ -->
+
<sect1 id="chkpass">
- <title>chkpass</title>
-
- <!--
+ <title>chkpass</title>
+
<indexterm zone="chkpass">
<primary>chkpass</primary>
</indexterm>
- -->
+
<para>
- chkpass is a password type that is automatically checked and converted upon
- entry. It is stored encrypted. To compare, simply compare against a clear
+ This module implements a data type <type>chkpass</> that is
+ designed for storing encrypted passwords.
+ Each password is automatically converted to encrypted form upon entry,
+ and is always stored encrypted. To compare, simply compare against a clear
text password and the comparison function will encrypt it before comparing.
- It also returns an error if the code determines that the password is easily
- crackable. This is currently a stub that does nothing.
</para>
<para>
- Note that the chkpass data type is not indexable.
- <!--
- I haven't worried about making this type indexable. I doubt that anyone
- would ever need to sort a file in order of encrypted password.
- -->
+ There are provisions in the code to report an error if the password is
+ determined to be easily crackable. However, this is currently just
+ a stub that does nothing.
</para>
<para>
- If you precede the string with a colon, the encryption and checking are
- skipped so that you can enter existing passwords into the field.
+ If you precede an input string with a colon, it is assumed to be an
+ already-encrypted password, and is stored without further encryption.
+ This allows entry of previously-encrypted passwords.
</para>
<para>
On output, a colon is prepended. This makes it possible to dump and reload
- passwords without re-encrypting them. If you want the password (encrypted)
- without the colon then use the raw() function. This allows you to use the
+ passwords without re-encrypting them. If you want the encrypted password
+ without the colon then use the <function>raw()</> function.
+ This allows you to use the
type with things like Apache's Auth_PostgreSQL module.
</para>
<para>
- The encryption uses the standard Unix function crypt(), and so it suffers
+ The encryption uses the standard Unix function <function>crypt()</>,
+ and so it suffers
from all the usual limitations of that function; notably that only the
first eight characters of a password are considered.
</para>
<para>
- Here is some sample usage:
+ Note that the <type>chkpass</> data type is not indexable.
+ <!--
+ I haven't worried about making this type indexable. I doubt that anyone
+ would ever need to sort a file in order of encrypted password.
+ -->
</para>
- <programlisting>
+ <para>
+ Sample usage:
+ </para>
+
+<programlisting>
test=# create table test (p chkpass);
CREATE TABLE
test=# insert into test values ('hello');
----------
f
(1 row)
- </programlisting>
+</programlisting>
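+
+ <para>
+  A further sketch, assuming the <literal>test</> table created above:
+  <function>raw()</> should return the stored encrypted string without
+  the leading colon.
+ </para>
+
+<programlisting>
+-- show each password in its normal output form and in raw form
+SELECT p, raw(p) FROM test;
+</programlisting>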
<sect2>
<title>Author</title>
+
<para>
- D'Arcy J.M. Cain <email>darcy@druid.net</email>
+ D'Arcy J.M. Cain (<email>darcy@druid.net</email>)
</para>
</sect2>
-</sect1>
+</sect1>
-<!-- $PostgreSQL: pgsql/doc/src/sgml/contrib-spi.sgml,v 1.1 2007/12/03 04:18:47 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/contrib-spi.sgml,v 1.2 2007/12/06 04:12:09 tgl Exp $ -->
<sect1 id="contrib-spi">
<title>spi</title>
<para>
<function>check_primary_key()</> checks the referencing table.
- To use, create a BEFORE INSERT OR UPDATE trigger using this
- function on a table referencing another table. You are to specify
- as trigger arguments: triggered table column names which correspond
- to foreign key, referenced table name and column names in referenced
- table which correspond to primary/unique key. To handle multiple
- foreign keys, create a trigger for each reference.
+ To use, create a <literal>BEFORE INSERT OR UPDATE</> trigger using this
+ function on a table referencing another table. Specify as the trigger
+ arguments: the referencing table's column name(s) which form the foreign
+ key, the referenced table name, and the column names in the referenced table
+ which form the primary/unique key. To handle multiple foreign
+ keys, create a trigger for each reference.
</para>
<para>
<function>check_foreign_key()</> checks the referenced table.
- To use, create a BEFORE DELETE OR UPDATE trigger using this
- function on a table referenced by other table(s). You are to specify
- as trigger arguments: number of references for which the function has to
- perform checking, action if referencing key found ('cascade' — to delete
- corresponding foreign key, 'restrict' — to abort transaction if foreign keys
- exist, 'setnull' — to set foreign key referencing primary/unique key
- being deleted to null), triggered table column names which correspond
- to primary/unique key, then referencing table name and column names
- corresponding to foreign key (repeated for as many referencing tables/keys
- as were specified by first argument). Note that the primary/unique key
- columns should be marked NOT NULL and should have a unique index.
+ To use, create a <literal>BEFORE DELETE OR UPDATE</> trigger using this
+ function on a table referenced by other table(s). Specify as the trigger
+ arguments: the number of referencing tables for which the function has to
+ perform checking, the action if a referencing key is found
+ (<literal>cascade</> — to delete the referencing row,
+ <literal>restrict</> — to abort the transaction if referencing keys
+ exist, <literal>setnull</> — to set referencing key fields to null),
+ the triggered table's column names which form the primary/unique key, then
+ the referencing table name and column names (repeated for as many
+ referencing tables as were specified by the first argument). Note that the
+ primary/unique key columns should be marked NOT NULL and should have a
+ unique index.
</para>
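+
+ <para>
+  As an illustrative sketch, assuming hypothetical tables
+  <literal>a</> (key column <literal>id</>) and <literal>b</>
+  (column <literal>refb</> referencing <literal>a.id</>), the two
+  triggers might be declared like this:
+ </para>
+
+<programlisting>
+CREATE TABLE a (id int4 NOT NULL);
+CREATE UNIQUE INDEX a_id_idx ON a (id);
+CREATE TABLE b (refb int4);
+
+-- reject rows in b whose refb has no match in a.id
+CREATE TRIGGER b_check_pk
+    BEFORE INSERT OR UPDATE ON b
+    FOR EACH ROW
+    EXECUTE PROCEDURE check_primary_key('refb', 'a', 'id');
+
+-- when a row of a is deleted or its key changes, delete referencing rows of b
+CREATE TRIGGER a_check_fk
+    BEFORE DELETE OR UPDATE ON a
+    FOR EACH ROW
+    EXECUTE PROCEDURE check_foreign_key(1, 'cascade', 'id', 'b', 'refb');
+</programlisting>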
<para>
Long ago, <productname>PostgreSQL</> had a built-in time travel feature
that kept the insert and delete times for each tuple. This can be
emulated using these functions. To use these functions,
- you are to add to a table two columns of <type>abstime</> type to store
+ you must add to a table two columns of <type>abstime</> type to store
the date when a tuple was inserted (start_date) and changed/deleted
(stop_date):
<programlisting>
CREATE TABLE mytab (
... ...
- start_date abstime default now(),
- stop_date abstime default 'infinity'
+ start_date abstime,
+ stop_date abstime
... ...
);
</programlisting>
- So, tuples being inserted with unspecified start_date/stop_date will get
- the current time in start_date and <literal>infinity</> in
- stop_date.
+ The columns can be named whatever you like, but in this discussion
+ we'll call them start_date and stop_date.
+ </para>
+
+ <para>
+ When a new row is inserted, start_date should normally be set to
+ the current time, and stop_date to <literal>infinity</>. The trigger
+ will automatically substitute these values if the inserted data
+ contains nulls in these columns. Generally, inserting explicit
+ non-null data in these columns should only be done when re-loading
+ dumped data.
</para>
<para>
Tuples with stop_date equal to <literal>infinity</> are <quote>valid
- now</quote>: when trigger will be fired for UPDATE/DELETE of a tuple with
- stop_date NOT equal to <literal>infinity</> then
- this tuple will not be changed/deleted!
+ now</quote>, and can be modified. Tuples with a finite stop_date cannot
+ be modified anymore — the trigger will prevent it. (If you need
+ to do that, you can turn off time travel as shown below.)
</para>
<para>
- If stop_date is equal to <literal>infinity</> then on
- update only the stop_date in the tuple being updated will be changed (to
- current time) and a new tuple with new data (coming from SET ... in UPDATE)
- will be inserted. Start_date in this new tuple will be set to current time
- and stop_date to <literal>infinity</>.
+ For a modifiable row, on update only the stop_date in the tuple being
+ updated will be changed (to current time) and a new tuple with the modified
+ data will be inserted. Start_date in this new tuple will be set to current
+ time and stop_date to <literal>infinity</>.
</para>
<para>
- A delete does not actually remove the tuple but only set its stop_date
+ A delete does not actually remove the tuple but only sets its stop_date
to current time.
</para>
<para>
To query for tuples <quote>valid now</quote>, include
<literal>stop_date = 'infinity'</> in the query's WHERE condition.
- (You might wish to incorporate that in a view.)
- </para>
-
- <para>
- You can't change start/stop date columns with UPDATE!
- Use set_timetravel (below) if you need this.
+ (You might wish to incorporate that in a view.) Similarly, you can
+ query for tuples valid at any past time with suitable conditions on
+ start_date and stop_date.
</para>
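+
+ <para>
+  For instance, assuming the <literal>mytab</> table and column names
+  used above (the timestamp is an arbitrary illustration), such queries
+  might look like this:
+ </para>
+
+<programlisting>
+-- rows that are valid now
+SELECT * FROM mytab WHERE stop_date = 'infinity';
+
+-- rows that were valid at a given past moment
+SELECT * FROM mytab
+ WHERE start_date <= '2007-01-01 00:00:00'
+   AND stop_date > '2007-01-01 00:00:00';
+</programlisting>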
<para>
<function>timetravel()</> is the general trigger function that supports
- this behavior. Create a BEFORE INSERT OR UPDATE OR DELETE trigger using this
- function on each time-traveled table. You are to specify two trigger arguments:
- name of start_date column and name of stop_date column in triggered table.
+ this behavior. Create a <literal>BEFORE INSERT OR UPDATE OR DELETE</>
+ trigger using this function on each time-traveled table. Specify two
+ trigger arguments: the actual
+ names of the start_date and stop_date columns.
Optionally, you can specify one to three more arguments, which must refer
to columns of type <type>text</>. The trigger will store the name of
the current user into the first of these columns during INSERT, the
<literal>set_timetravel('mytab', 1)</> will turn TT ON for table mytab.
<literal>set_timetravel('mytab', 0)</> will turn TT OFF for table mytab.
In both cases the old status is reported. While TT is off, you can modify
- the start_date and stop_date columns freely.
+ the start_date and stop_date columns freely. Note that the on/off status
+ is local to the current database session — fresh sessions will
+ always start out with TT ON for all tables.
</para>
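+
+ <para>
+  A minimal sketch, again assuming the <literal>mytab</> table from above
+  (the trigger name is arbitrary):
+ </para>
+
+<programlisting>
+CREATE TRIGGER timetravel
+    BEFORE INSERT OR DELETE OR UPDATE ON mytab
+    FOR EACH ROW
+    EXECUTE PROCEDURE timetravel(start_date, stop_date);
+
+-- temporarily disable time travel in this session, e.g. to reload dumped data
+SELECT set_timetravel('mytab', 0);
+-- ... modify start_date and stop_date freely here ...
+SELECT set_timetravel('mytab', 1);
+</programlisting>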
<para>
</para>
<para>
- To use, create a BEFORE INSERT (or optionally BEFORE INSERT OR UPDATE)
- trigger using this function. You are to specify
- as trigger arguments: the name of the integer column to be modified,
+ To use, create a <literal>BEFORE INSERT</> (or optionally <literal>BEFORE
+ INSERT OR UPDATE</>) trigger using this function. Specify two
+ trigger arguments: the name of the integer column to be modified,
and the name of the sequence object that will supply values.
(Actually, you can specify any number of pairs of such names, if
you'd like to update more than one autoincrementing column.)
</para>
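+
+ <para>
+  A sketch of such a trigger, assuming the function described here is
+  <function>autoinc()</> and using a hypothetical table <literal>ids</>
+  and sequence <literal>next_id</>:
+ </para>
+
+<programlisting>
+CREATE SEQUENCE next_id;
+CREATE TABLE ids (id int4, idesc text);
+
+-- fill ids.id from the sequence next_id
+CREATE TRIGGER ids_nextid
+    BEFORE INSERT OR UPDATE ON ids
+    FOR EACH ROW
+    EXECUTE PROCEDURE autoinc(id, next_id);
+</programlisting>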
<para>
- To use, create a BEFORE INSERT and/or UPDATE
- trigger using this function. You are to specify a single trigger
+ To use, create a <literal>BEFORE INSERT</> and/or <literal>UPDATE</>
+ trigger using this function. Specify a single trigger
argument: the name of the text column to be modified.
</para>
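+
+ <para>
+  A sketch of such a trigger, assuming the function described here is
+  <function>insert_username()</> and using a hypothetical table
+  <literal>username_test</>:
+ </para>
+
+<programlisting>
+CREATE TABLE username_test (name text, username text NOT NULL);
+
+-- store the current user's name in the username column
+CREATE TRIGGER insert_usernames
+    BEFORE INSERT OR UPDATE ON username_test
+    FOR EACH ROW
+    EXECUTE PROCEDURE insert_username(username);
+</programlisting>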
</para>
<para>
- To use, create a BEFORE UPDATE
- trigger using this function. You are to specify a single trigger
+ To use, create a <literal>BEFORE UPDATE</>
+ trigger using this function. Specify a single trigger
argument: the name of the <type>timestamp</> column to be modified.
</para>
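+
+ <para>
+  A sketch of such a trigger, assuming the function described here is
+  <function>moddatetime()</> and using a hypothetical table
+  <literal>mdt</>:
+ </para>
+
+<programlisting>
+CREATE TABLE mdt (
+    id      int4,
+    idesc   text,
+    moddate timestamp DEFAULT CURRENT_TIMESTAMP NOT NULL
+);
+
+-- stamp moddate with the current time whenever a row is updated
+CREATE TRIGGER mdt_moddatetime
+    BEFORE UPDATE ON mdt
+    FOR EACH ROW
+    EXECUTE PROCEDURE moddatetime(moddate);
+</programlisting>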
-<!-- $PostgreSQL: pgsql/doc/src/sgml/contrib.sgml,v 1.7 2007/12/03 04:18:47 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/contrib.sgml,v 1.8 2007/12/06 04:12:09 tgl Exp $ -->
<appendix id="contrib">
<title>Additional Supplied Modules</title>
<para>
Many modules supply new user-defined functions, operators, or types.
To make use of one of these modules, after you have installed the code
- you need to register the new objects in the database
+ you need to register the new objects in the database
system by running the SQL commands in the <literal>.sql</> file
supplied by the module. For example,
Here, <replaceable>SHAREDIR</> means the installation's <quote>share</>
directory (<literal>pg_config --sharedir</> will tell you what this is).
+ In most cases the script must be run by a database superuser.
</para>
<para>
+<!-- $PostgreSQL: pgsql/doc/src/sgml/cube.sgml,v 1.5 2007/12/06 04:12:09 tgl Exp $ -->
<sect1 id="cube">
<title>cube</title>
-
+
<indexterm zone="cube">
<primary>cube</primary>
</indexterm>
<para>
- This module contains the user-defined type, CUBE, representing
- multidimensional cubes.
+ This module implements a data type <type>cube</> for
+ representing multi-dimensional cubes.
</para>
<sect2>
<title>Syntax</title>
<para>
- The following are valid external representations for the CUBE type:
+ The following are valid external representations for the <type>cube</>
+ type. <replaceable>x</>, <replaceable>y</>, etc. denote floating-point
+ numbers:
</para>
<table>
<tgroup cols="2">
<tbody>
<row>
- <entry>'x'</entry>
- <entry>A floating point value representing a one-dimensional point or
- one-dimensional zero length cubement
- </entry>
- </row>
- <row>
- <entry>'(x)'</entry>
- <entry>Same as above</entry>
- </row>
- <row>
- <entry>'x1,x2,x3,...,xn'</entry>
- <entry>A point in n-dimensional space, represented internally as a zero
- volume box
- </entry>
- </row>
- <row>
- <entry>'(x1,x2,x3,...,xn)'</entry>
- <entry>Same as above</entry>
- </row>
- <row>
- <entry>'(x),(y)'</entry>
- <entry>1-D cubement starting at x and ending at y or vice versa; the
- order does not matter
- </entry>
- </row>
- <row>
- <entry>'(x1,...,xn),(y1,...,yn)'</entry>
- <entry>n-dimensional box represented by a pair of its opposite corners, no
- matter which. Functions take care of swapping to achieve "lower left --
- upper right" representation before computing any values
- </entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- </sect2>
-
- <sect2>
- <title>Grammar</title>
- <table>
- <title>Cube Grammar Rules</title>
- <tgroup cols="2">
- <tbody>
- <row>
- <entry>rule 1</entry>
- <entry>box -> O_BRACKET paren_list COMMA paren_list C_BRACKET</entry>
- </row>
- <row>
- <entry>rule 2</entry>
- <entry>box -> paren_list COMMA paren_list</entry>
- </row>
- <row>
- <entry>rule 3</entry>
- <entry>box -> paren_list</entry>
- </row>
- <row>
- <entry>rule 4</entry>
- <entry>box -> list</entry>
- </row>
- <row>
- <entry>rule 5</entry>
- <entry>paren_list -> O_PAREN list C_PAREN</entry>
- </row>
- <row>
- <entry>rule 6</entry>
- <entry>list -> FLOAT</entry>
- </row>
- <row>
- <entry>rule 7</entry>
- <entry>list -> list COMMA FLOAT</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- </sect2>
-
- <sect2>
- <title>Tokens</title>
- <table>
- <title>Cube Grammar Rules</title>
- <tgroup cols="2">
- <tbody>
- <row>
- <entry>n</entry>
- <entry>[0-9]+</entry>
- </row>
- <row>
- <entry>i</entry>
- <entry>nteger [+-]?{n}</entry>
- </row>
- <row>
- <entry>real</entry>
- <entry>[+-]?({n}\.{n}?|\.{n})</entry>
- </row>
- <row>
- <entry>FLOAT</entry>
- <entry>({integer}|{real})([eE]{integer})?</entry>
- </row>
- <row>
- <entry>O_BRACKET</entry>
- <entry>\[</entry>
- </row>
- <row>
- <entry>C_BRACKET</entry>
- <entry>\]</entry>
- </row>
- <row>
- <entry>O_PAREN</entry>
- <entry>\(</entry>
- </row>
- <row>
- <entry>C_PAREN</entry>
- <entry>\)</entry>
- </row>
- <row>
- <entry>COMMA</entry>
- <entry>\,</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
- </sect2>
-
- <sect2>
- <title>Examples</title>
- <table>
- <title>Examples</title>
- <tgroup cols="2">
- <tbody>
- <row>
- <entry>'x'</entry>
- <entry>A floating point value representing a one-dimensional point
+ <entry><literal><replaceable>x</></literal></entry>
+ <entry>A one-dimensional point
(or, zero-length one-dimensional interval)
</entry>
</row>
<row>
- <entry>'(x)'</entry>
+ <entry><literal>(<replaceable>x</>)</literal></entry>
<entry>Same as above</entry>
</row>
<row>
- <entry>'x1,x2,x3,...,xn'</entry>
- <entry>A point in n-dimensional space,represented internally as a zero
- volume cube
+ <entry><literal><replaceable>x1</>,<replaceable>x2</>,...,<replaceable>xn</></literal></entry>
+ <entry>A point in n-dimensional space, represented internally as a
+ zero-volume cube
</entry>
</row>
<row>
- <entry>'(x1,x2,x3,...,xn)'</entry>
+ <entry><literal>(<replaceable>x1</>,<replaceable>x2</>,...,<replaceable>xn</>)</literal></entry>
<entry>Same as above</entry>
</row>
<row>
- <entry>'(x),(y)'</entry>
- <entry>A 1-D interval starting at x and ending at y or vice versa; the
+ <entry><literal>(<replaceable>x</>),(<replaceable>y</>)</literal></entry>
+ <entry>A one-dimensional interval starting at <replaceable>x</> and ending at <replaceable>y</> or vice versa; the
order does not matter
</entry>
</row>
<row>
- <entry>'[(x),(y)]'</entry>
+ <entry><literal>[(<replaceable>x</>),(<replaceable>y</>)]</literal></entry>
<entry>Same as above</entry>
</row>
<row>
- <entry>'(x1,...,xn),(y1,...,yn)'</entry>
- <entry>An n-dimensional box represented by a pair of its diagonally
- opposite corners, regardless of order. Swapping is provided
- by all comarison routines to ensure the
- "lower left -- upper right" representation
- before actaul comparison takes place.
+ <entry><literal>(<replaceable>x1</>,...,<replaceable>xn</>),(<replaceable>y1</>,...,<replaceable>yn</>)</literal></entry>
+ <entry>An n-dimensional cube represented by a pair of its diagonally
+ opposite corners
</entry>
</row>
<row>
- <entry>'[(x1,...,xn),(y1,...,yn)]'</entry>
+ <entry><literal>[(<replaceable>x1</>,...,<replaceable>xn</>),(<replaceable>y1</>,...,<replaceable>yn</>)]</literal></entry>
<entry>Same as above</entry>
</row>
</tbody>
</tgroup>
</table>
+
<para>
- White space is ignored, so '[(x),(y)]' can be: '[ ( x ), ( y ) ]'
+ It does not matter which order the opposite corners of a cube are
+ entered in. The <type>cube</> functions
+ automatically swap values if needed to create a uniform
+ <quote>lower left — upper right</> internal representation.
</para>
- </sect2>
- <sect2>
- <title>Defaults</title>
+
<para>
- I believe this union:
+ White space is ignored, so <literal>[(<replaceable>x</>),(<replaceable>y</>)]</literal> is the same as
+ <literal>[ ( <replaceable>x</> ), ( <replaceable>y</> ) ]</literal>.
</para>
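+
+ <para>
+  As a small sketch of these input forms (the coordinate values are
+  arbitrary), each of the following literals should be accepted, and per
+  the rules above the last two describe the same cube:
+ </para>
+
+<programlisting>
+SELECT '1,2,3'::cube;                 -- a point in three dimensions
+SELECT '(1,2),(3,4)'::cube;           -- a two-dimensional cube
+SELECT '[ (3,4), (1,2) ]'::cube;      -- same corners, other order, extra white space
+</programlisting>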
-<programlisting>
-select cube_union('(0,5,2),(2,3,1)','0');
-cube_union
--------------------
-(0, 0, 0),(2, 5, 2)
-(1 row)
-</programlisting>
-
- <para>
- does not contradict to the common sense, neither does the intersection
- </para>
-
-<programlisting>
-select cube_inter('(0,-1),(1,1)','(-2),(2)');
-cube_inter
--------------
-(0, 0),(1, 0)
-(1 row)
-</programlisting>
-
- <para>
- In all binary operations on differently sized boxes, I assume the smaller
- one to be a cartesian projection, i. e., having zeroes in place of coordinates
- omitted in the string representation. The above examples are equivalent to:
- </para>
-
-<programlisting>
-cube_union('(0,5,2),(2,3,1)','(0,0,0),(0,0,0)');
-cube_inter('(0,-1),(1,1)','(-2,0),(2,0)');
-</programlisting>
-
- <para>
- The following containment predicate uses the point syntax,
- while in fact the second argument is internally represented by a box.
- This syntax makes it unnecessary to define the special Point type
- and functions for (box,point) predicates.
- </para>
-
-<programlisting>
-select cube_contains('(0,0),(1,1)', '0.5,0.5');
-cube_contains
---------------
-t
-(1 row)
-</programlisting>
</sect2>
+
<sect2>
<title>Precision</title>
+
<para>
-Values are stored internally as 64-bit floating point numbers. This means that
-numbers with more than about 16 significant digits will be truncated.
+ Values are stored internally as 64-bit floating point numbers. This means
+ that numbers with more than about 16 significant digits will be truncated.
</para>
</sect2>
<sect2>
<title>Usage</title>
- <para>
- The access method for CUBE is a GiST index (gist_cube_ops), which is a
- generalization of R-tree. GiSTs allow the postgres implementation of
- R-tree, originally encoded to support 2-D geometric types such as
- boxes and polygons, to be used with any data type whose data domain
- can be partitioned using the concepts of containment, intersection and
- equality. In other words, everything that can intersect or contain
- its own kind can be indexed with a GiST. That includes, among other
- things, all geometric data types, regardless of their dimensionality
- (see also contrib/seg).
- </para>
<para>
- The operators supported by the GiST access method include:
- </para>
-
- <programlisting>
-a = b Same as
- </programlisting>
- <para>
- The cubements a and b are identical.
+ The <filename>cube</> module includes a GiST index operator class for
+ <type>cube</> values.
+ The operators supported by the GiST opclass include:
</para>
- <programlisting>
+ <itemizedlist>
+ <listitem>
+ <programlisting>
+a = b Same as
+ </programlisting>
+ <para>
+ The cubes a and b are identical.
+ </para>
+ </listitem>
+ <listitem>
+ <programlisting>
a && b Overlaps
- </programlisting>
- <para>
- The cubements a and b overlap.
- </para>
-
- <programlisting>
+ </programlisting>
+ <para>
+ The cubes a and b overlap.
+ </para>
+ </listitem>
+ <listitem>
+ <programlisting>
a @> b Contains
- </programlisting>
- <para>
- The cubement a contains the cubement b.
- </para>
-
+ </programlisting>
+ <para>
+ The cube a contains the cube b.
+ </para>
+ </listitem>
+ <listitem>
<programlisting>
a <@ b Contained in
</programlisting>
- <para>
- The cubement a is contained in b.
- </para>
+ <para>
+ The cube a is contained in the cube b.
+ </para>
+ </listitem>
+ </itemizedlist>
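+
+ <para>
+  A sketch of how the index might be used, assuming a hypothetical table
+  <literal>boxes</>:
+ </para>
+
+<programlisting>
+CREATE TABLE boxes (c cube);
+CREATE INDEX boxes_idx ON boxes USING gist (c);
+
+-- cubes that contain the given cube
+SELECT * FROM boxes WHERE c @> '(1,1),(2,2)';
+-- cubes that overlap the given cube
+SELECT * FROM boxes WHERE c && '(0,0),(1,1)';
+</programlisting>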
<para>
(Before PostgreSQL 8.2, the containment operators @> and <@ were
</para>
<para>
- Although the mnemonics of the following operators is questionable, I
- preserved them to maintain visual consistency with other geometric
- data types defined in Postgres.
- </para>
-
- <para>
- Other operators:
- </para>
+ The standard B-tree operators are also provided, for example
<programlisting>
[a, b] < [c, d] Less than
[a, b] > [c, d] Greater than
</programlisting>
-
- <para>
+
These operators do not make a lot of sense for any practical
purpose but sorting. These operators first compare (a) to (c),
- and if these are equal, compare (b) to (d). That accounts for
+ and if these are equal, compare (b) to (d). That results in
reasonably good sorting in most cases, which is useful if
- you want to use ORDER BY with this type
+ you want to use ORDER BY with this type.
</para>
<para>
</para>
<table>
- <title>Functions available</title>
+ <title>Cube functions</title>
<tgroup cols="2">
<tbody>
- <row>
- <entry><literal>cube_distance(cube, cube) returns double</literal></entry>
- <entry>cube_distance returns the distance between two cubes. If both
- cubes are points, this is the normal distance function.
- </entry>
- </row>
- <row>
- <entry><literal>cube(text)</literal></entry>
- <entry>Takes text input and returns a cube. This is useful for making
- cubes from computed strings.
- </entry>
- </row>
<row>
<entry><literal>cube(float8) returns cube</literal></entry>
- <entry>This makes a one dimensional cube with both coordinates the same.
- If the type of the argument is a numeric type other than float8 an
- explicit cast to float8 may be needed.
+ <entry>Makes a one-dimensional cube with both coordinates the same.
<literal>cube(1) == '(1)'</literal>
</entry>
</row>
<row>
<entry><literal>cube(float8, float8) returns cube</literal></entry>
- <entry>
- This makes a one dimensional cube.
+ <entry>Makes a one-dimensional cube.
<literal>cube(1,2) == '(1),(2)'</literal>
</entry>
</row>
<row>
<entry><literal>cube(float8[]) returns cube</literal></entry>
- <entry>This makes a zero-volume cube using the coordinates
- defined by thearray.<literal>cube(ARRAY[1,2]) == '(1,2)'</literal>
+ <entry>Makes a zero-volume cube using the coordinates
+ defined by the array.
+ <literal>cube(ARRAY[1,2]) == '(1,2)'</literal>
</entry>
</row>
<row>
<entry><literal>cube(float8[], float8[]) returns cube</literal></entry>
- <entry>This makes a cube, with upper right and lower left
- coordinates as defined by the 2 float arrays. Arrays must be of the
+ <entry>Makes a cube with upper right and lower left
+ coordinates as defined by the two arrays, which must be of the
same length.
<literal>cube('{1,2}'::float[], '{3,4}'::float[]) == '(1,2),(3,4)'
</literal>
<row>
<entry><literal>cube(cube, float8) returns cube</literal></entry>
- <entry>This builds a new cube by adding a dimension on to an
- existing cube with the same values for both parts of the new coordinate.
+ <entry>Makes a new cube by adding a dimension on to an
+ existing cube with the same values for both parts of the new coordinate.
This is useful for building cubes piece by piece from calculated values.
<literal>cube('(1)',2) == '(1,2),(1,2)'</literal>
</entry>
<row>
<entry><literal>cube(cube, float8, float8) returns cube</literal></entry>
- <entry>This builds a new cube by adding a dimension on to an
- existing cube. This is useful for building cubes piece by piece from
+ <entry>Makes a new cube by adding a dimension on to an
+ existing cube. This is useful for building cubes piece by piece from
calculated values. <literal>cube('(1,2)',3,4) == '(1,3),(2,4)'</literal>
</entry>
</row>
<row>
<entry><literal>cube_dim(cube) returns int</literal></entry>
- <entry>cube_dim returns the number of dimensions stored in the
- the data structure
- for a cube. This is useful for constraints on the dimensions of a cube.
+ <entry>Returns the number of dimensions of the cube
</entry>
</row>
<row>
<entry><literal>cube_ll_coord(cube, int) returns double </literal></entry>
- <entry>
- cube_ll_coord returns the nth coordinate value for the lower left
- corner of a cube. This is useful for doing coordinate transformations.
+ <entry>Returns the n'th coordinate value for the lower left
+ corner of a cube
</entry>
</row>
<row>
<entry><literal>cube_ur_coord(cube, int) returns double
</literal></entry>
- <entry>cube_ur_coord returns the nth coordinate value for the
- upper right corner of a cube. This is useful for doing coordinate
- transformations.
+ <entry>Returns the n'th coordinate value for the
+ upper right corner of a cube
+ </entry>
+ </row>
+
+ <row>
+ <entry><literal>cube_is_point(cube) returns bool</literal></entry>
+ <entry>Returns true if a cube is a point, that is,
+ the two defining corners are the same.</entry>
+ </row>
+
+ <row>
+ <entry><literal>cube_distance(cube, cube) returns double</literal></entry>
+ <entry>Returns the distance between two cubes. If both
+ cubes are points, this is the normal distance function.
</entry>
</row>
<row>
<entry><literal>cube_subset(cube, int[]) returns cube
</literal></entry>
- <entry>Builds a new cube from an existing cube, using a list of
- dimension indexes
- from an array. Can be used to find both the ll and ur coordinate of single
- dimenion, e.g.: cube_subset(cube('(1,3,5),(6,7,8)'), ARRAY[2]) = '(3),(7)'
- Or can be used to drop dimensions, or reorder them as desired, e.g.:
- cube_subset(cube('(1,3,5),(6,7,8)'), ARRAY[3,2,1,1]) =
- '(5, 3, 1, 1),(8, 7, 6, 6)'
+ <entry>Makes a new cube from an existing cube, using a list of
+ dimension indexes from an array. Can be used to find both the LL and UR
+ coordinates of a single dimension, e.g.
+ <literal>cube_subset(cube('(1,3,5),(6,7,8)'), ARRAY[2]) = '(3),(7)'</>.
+ Or can be used to drop dimensions, or reorder them as desired, e.g.
+ <literal>cube_subset(cube('(1,3,5),(6,7,8)'), ARRAY[3,2,1,1]) = '(5, 3,
+ 1, 1),(8, 7, 6, 6)'</>.
</entry>
</row>
<row>
- <entry><literal>cube_is_point(cube) returns bool</literal></entry>
- <entry>cube_is_point returns true if a cube is also a point.
- This is true when the two defining corners are the same.</entry>
+ <entry><literal>cube_union(cube, cube) returns cube</literal></entry>
+ <entry>Produces the union of two cubes
+ </entry>
</row>
-
+
+ <row>
+ <entry><literal>cube_inter(cube, cube) returns cube</literal></entry>
+ <entry>Produces the intersection of two cubes
+ </entry>
+ </row>
+
<row>
- <entry><literal>cube_enlarge(cube, double, int) returns cube</literal></entry>
- <entry>
- cube_enlarge increases the size of a cube by a specified
- radius in at least
- n dimensions. If the radius is negative the box is shrunk instead. This
+ <entry><literal>cube_enlarge(cube c, double r, int n) returns cube</literal></entry>
+ <entry>Increases the size of a cube by a specified radius in at least
+ n dimensions. If the radius is negative the cube is shrunk instead. This
is useful for creating bounding boxes around a point for searching for
- nearby points. All defined dimensions are changed by the radius. If n
- is greater than the number of defined dimensions and the cube is being
- increased (r >= 0) then 0 is used as the base for the extra coordinates.
- LL coordinates are decreased by r and UR coordinates are increased by r.
- If a LL coordinate is increased to larger than the corresponding UR
- coordinate (this can only happen when r < 0) than both coordinates are
- set to their average. To make it harder for people to break things there
- is an effective maximum on the dimension of cubes of 100. This is set
- in cubedata.h if you need something bigger.
+ nearby points. All defined dimensions are changed by the radius r.
+ LL coordinates are decreased by r and UR coordinates are increased by r.
+ If a LL coordinate is increased to larger than the corresponding UR
+ coordinate (this can only happen when r < 0) then both coordinates
+ are set to their average. If n is greater than the number of defined
+ dimensions and the cube is being increased (r >= 0) then 0 is used
+ as the base for the extra coordinates.
</entry>
</row>
</tbody>
</tgroup>
</table>
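+
+ <para>
+ The rules for <function>cube_enlarge</> can be illustrated with a short
+ sketch. The results given in the comments are derived by hand from the
+ description in the table above, not copied from a server session, so
+ treat them as expectations rather than verified output:
+ </para>
+
+<programlisting>
+-- grow a 2-D cube by 1: each LL coordinate decreases by 1, each UR increases by 1
+-- expected: (0, 1),(4, 5)
+select cube_enlarge('(1,2),(3,4)'::cube, 1, 2);
+
+-- shrink past the midpoint: LL would exceed UR, so both are set to their average
+-- expected: the point (1, 1)
+select cube_enlarge('(0,0),(2,2)'::cube, -2, 2);
+
+-- n greater than the number of defined dimensions, r >= 0:
+-- the extra dimension uses 0 as its base
+-- expected: (0, -1),(4, 1)
+select cube_enlarge('(1),(3)'::cube, 1, 2);
+</programlisting>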
-
+ </sect2>
+
+ <sect2>
+ <title>Defaults</title>
+
+ <para>
+ I believe this union:
+ </para>
+<programlisting>
+select cube_union('(0,5,2),(2,3,1)', '0');
+cube_union
+-------------------
+(0, 0, 0),(2, 5, 2)
+(1 row)
+</programlisting>
+
+ <para>
+ does not contradict common sense, and neither does the intersection:
+ </para>
+
+<programlisting>
+select cube_inter('(0,-1),(1,1)', '(-2),(2)');
+cube_inter
+-------------
+(0, 0),(1, 0)
+(1 row)
+</programlisting>
+
+ <para>
+ In all binary operations on differently-dimensioned cubes, I assume the
+ lower-dimensional one to be a Cartesian projection, i.e., having zeroes
+ in place of coordinates omitted in the string representation. The above
+ examples are equivalent to:
+ </para>
+
+<programlisting>
+cube_union('(0,5,2),(2,3,1)','(0,0,0),(0,0,0)');
+cube_inter('(0,-1),(1,1)','(-2,0),(2,0)');
+</programlisting>
+
+ <para>
+ The following containment predicate uses the point syntax,
+ while in fact the second argument is internally represented by a box.
+ This syntax makes it unnecessary to define a separate point type
+ and functions for (box,point) predicates.
+ </para>
+
+<programlisting>
+select cube_contains('(0,0),(1,1)', '0.5,0.5');
+cube_contains
+--------------
+t
+(1 row)
+</programlisting>
+ </sect2>
+
+ <sect2>
+ <title>Notes</title>
+
<para>
- There are a few other potentially useful functions defined in cube.c
- that vanished from the schema because I stopped using them. Some of
- these were meant to support type casting. Let me know if I was wrong:
- I will then add them back to the schema. I would also appreciate
- other ideas that would enhance the type and make it more useful.
+ For examples of usage, see the regression test <filename>sql/cube.sql</>.
</para>
<para>
- For examples of usage, see sql/cube.sql
+ To make it harder for people to break things, there
+ is a limit of 100 on the number of dimensions of cubes. This is set
+ in <filename>cubedata.h</> if you need something bigger.
</para>
</sect2>
<sect2>
<title>Credits</title>
+
<para>
- This code is essentially based on the example written for
- Illustra, <ulink url="http://garcia.me.berkeley.edu/~adong/rtree"></ulink>
- </para>
- <para>
- My thanks are primarily to Prof. Joe Hellerstein
- (<ulink url="http://db.cs.berkeley.edu/~jmh/"></ulink>) for elucidating the
- gist of the GiST (<ulink url="http://gist.cs.berkeley.edu/"></ulink>), and
- to his former student, Andy Dong
- (<ulink url="http://best.me.berkeley.edu/~adong/"></ulink>), for his exemplar.
- I am also grateful to all postgres developers, present and past, for enabling
- myself to create my own world and live undisturbed in it. And I would like to
- acknowledge my gratitude to Argonne Lab and to the U.S. Department of Energy
- for the years of faithful support of my database research.
+ Original author: Gene Selkov, Jr. <email>selkovjr@mcs.anl.gov</email>,
+ Mathematics and Computer Science Division, Argonne National Laboratory.
</para>
<para>
- Gene Selkov, Jr.
- Computational Scientist
- Mathematics and Computer Science Division
- Argonne National Laboratory
- 9700 S Cass Ave.
- Building 221
- Argonne, IL 60439-4844
- <email>selkovjr@mcs.anl.gov</email>
+ My thanks are primarily to Prof. Joe Hellerstein
+ (<ulink url="http://db.cs.berkeley.edu/~jmh/"></ulink>) for elucidating the
+ gist of the GiST (<ulink url="http://gist.cs.berkeley.edu/"></ulink>), and
+ to his former student, Andy Dong (<ulink
+ url="http://best.me.berkeley.edu/~adong/"></ulink>), for his example
+ written for Illustra,
+ <ulink url="http://garcia.me.berkeley.edu/~adong/rtree"></ulink>.
+ I am also grateful to all Postgres developers, present and past, for
+ enabling me to create my own world and live undisturbed in it. And I
+ would like to acknowledge my gratitude to Argonne Lab and to the
+ U.S. Department of Energy for the years of faithful support of my database
+ research.
</para>
<para>
- Minor updates to this package were made by Bruno Wolff III
- <email>bruno@wolff.to</email> in August/September of 2002. These include
- changing the precision from single precision to double precision and adding
+ Minor updates to this package were made by Bruno Wolff III
+ <email>bruno@wolff.to</email> in August/September of 2002. These include
+ changing the precision from single precision to double precision and adding
some new functions.
</para>
<para>
- Additional updates were made by Joshua Reich <email>josh@root.net</email> in
- July 2006. These include <literal>cube(float8[], float8[])</literal> and
- cleaning up the code to use the V1 call protocol instead of the deprecated V0
- form.
+ Additional updates were made by Joshua Reich <email>josh@root.net</email> in
+ July 2006. These include <literal>cube(float8[], float8[])</literal> and
+ cleaning up the code to use the V1 call protocol instead of the deprecated
+ V0 protocol.
</para>
</sect2>
-</sect1>
+</sect1>
+<!-- $PostgreSQL: pgsql/doc/src/sgml/dblink.sgml,v 1.3 2007/12/06 04:12:09 tgl Exp $ -->
+
<sect1 id="dblink">
<title>dblink</title>
-
+
<indexterm zone="dblink">
<primary>dblink</primary>
</indexterm>
<para>
- <literal>dblink</> is a module which allows connections with
- other databases.
+ <filename>dblink</> is a module which supports connections to
+ other <productname>PostgreSQL</> databases from within a database
+ session.
</para>
<refentry id="CONTRIB-DBLINK-CONNECT">
+ <refnamediv>
<refname>dblink_connect</refname>
<refpurpose>opens a persistent connection to a remote database</refpurpose>
</refnamediv>
-
+
<refsynopsisdiv>
<synopsis>
- dblink_connect(text connstr)
- dblink_connect(text connname, text connstr)
+ dblink_connect(text connstr) returns text
+ dblink_connect(text connname, text connstr) returns text
</synopsis>
</refsynopsisdiv>
-
- <refsect1>
- <title>Inputs</title>
-
- <refsect2>
- <title>connname</title>
- <para>
- if 2 arguments ar given, the first is used as a name for a persistent
- connection
- </para>
- </refsect2>
-
- <refsect2>
- <title>connstr</title>
- <para>
- standard libpq format connection string,
- e.g. "hostaddr=127.0.0.1 port=5432 dbname=mydb user=postgres password=mypasswd"
- </para>
- <para>
- if only one argument is given, the connection is unnamed; only one unnamed
- connection can exist at a time
- </para>
- </refsect2>
- </refsect1>
-
- <refsect1>
- <title>Outputs</title>
- <para>Returns status = "OK"</para>
- </refsect1>
-
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <function>dblink_connect()</> establishes a connection to a remote
+ <productname>PostgreSQL</> database. The server and database to
+ be contacted are identified through a standard <application>libpq</>
+ connection string. Optionally, a name can be assigned to the
+ connection. Multiple named connections can be open at once, but
+ only one unnamed connection is permitted at a time. The connection
+ will persist until closed or until the database session is ended.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Arguments</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><parameter>connname</parameter></term>
+ <listitem>
+ <para>
+ The name to use for this connection; if omitted, an unnamed
+ connection is opened, replacing any existing unnamed connection.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>connstr</parameter></term>
+ <listitem>
+ <para>
+ <application>libpq</>-style connection info string, for example
+ <literal>hostaddr=127.0.0.1 port=5432 dbname=mydb user=postgres
+ password=mypasswd</>.
+ For details see <function>PQconnectdb</> in
+ <xref linkend="libpq-connect">.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Return Value</title>
+
+ <para>
+ Returns status, which is always <literal>OK</> (since any error
+ causes the function to throw an error instead of returning).
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Notes</title>
+
+ <para>
+ Only superusers may use <function>dblink_connect</> to create
+ non-password-authenticated connections. If non-superusers need this
+ capability, use <function>dblink_connect_u</> instead.
+ </para>
+
+ <para>
+ It is unwise to choose connection names that contain equal signs,
+ as this opens a risk of confusion with connection info strings
+ in other <filename>dblink</> functions.
+ </para>
+ </refsect1>
+
<refsect1>
<title>Example</title>
+
<programlisting>
select dblink_connect('dbname=postgres');
dblink_connect
----------------
OK
(1 row)
-
- select dblink_connect('myconn','dbname=postgres');
+
+ select dblink_connect('myconn', 'dbname=postgres');
dblink_connect
----------------
OK
(1 row)
</programlisting>
</refsect1>
- </refentry>
+ </refentry>
+
+ <refentry id="CONTRIB-DBLINK-CONNECT-U">
+ <refnamediv>
+ <refname>dblink_connect_u</refname>
+ <refpurpose>opens a persistent connection to a remote database, insecurely</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+ <synopsis>
+ dblink_connect_u(text connstr) returns text
+ dblink_connect_u(text connname, text connstr) returns text
+ </synopsis>
+ </refsynopsisdiv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <function>dblink_connect_u()</> is identical to
+ <function>dblink_connect()</>, except that it will allow non-superusers
+ to connect using any authentication method.
+ </para>
+
+ <para>
+ If the remote server selects an authentication method that does not
+ involve a password, then impersonation and subsequent escalation of
+ privileges can occur, because the session will appear to have
+ originated from the user that the local <productname>PostgreSQL</>
+ server runs as. Therefore, <function>dblink_connect_u()</> is initially
+ installed with all privileges revoked from <literal>PUBLIC</>,
+ making it uncallable except by superusers. In some situations
+ it may be appropriate to grant <literal>EXECUTE</> permission for
+ <function>dblink_connect_u()</> to specific users who are considered
+ trustworthy, but this should be done with care.
+ </para>
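+
+ <para>
+ As a sketch of such a grant, assuming a hypothetical role named
+ <literal>trusted_user</> (a placeholder, not something this module
+ defines), each form of the function would be granted separately:
+ </para>
+
+<programlisting>
+-- trusted_user is a placeholder role name chosen for illustration
+grant execute on function dblink_connect_u(text) to trusted_user;
+grant execute on function dblink_connect_u(text, text) to trusted_user;
+</programlisting>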
+
+ <para>
+ For further details see <function>dblink_connect()</>.
+ </para>
+ </refsect1>
+ </refentry>
<refentry id="CONTRIB-DBLINK-DISCONNECT">
<refnamediv>
<refname>dblink_disconnect</refname>
<refpurpose>closes a persistent connection to a remote database</refpurpose>
</refnamediv>
-
+
<refsynopsisdiv>
<synopsis>
- dblink_disconnect()
- dblink_disconnect(text connname)
+ dblink_disconnect() returns text
+ dblink_disconnect(text connname) returns text
</synopsis>
</refsynopsisdiv>
-
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <function>dblink_disconnect()</> closes a connection previously opened
+ by <function>dblink_connect()</>. The form with no arguments closes
+ the unnamed connection.
+ </para>
+ </refsect1>
+
<refsect1>
- <title>Inputs</title>
-
- <refsect2>
- <title>connname</title>
- <para>
- if an argument is given, it is used as a name for a persistent
- connection to close; otherwiase the unnamed connection is closed
- </para>
- </refsect2>
+ <title>Arguments</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><parameter>connname</parameter></term>
+ <listitem>
+ <para>
+ The name of a named connection to be closed.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</refsect1>
-
+
<refsect1>
- <title>Outputs</title>
- <para>Returns status = "OK"</para>
+ <title>Return Value</title>
+
+ <para>
+ Returns status, which is always <literal>OK</> (since any error
+ causes the function to throw an error instead of returning).
+ </para>
</refsect1>
-
+
<refsect1>
<title>Example</title>
+
<programlisting>
test=# select dblink_disconnect();
dblink_disconnect
-------------------
OK
(1 row)
-
+
select dblink_disconnect('myconn');
dblink_disconnect
-------------------
OK
(1 row)
</programlisting>
</refsect1>
- </refentry>
+ </refentry>
- <refentry id="CONTRIB-DBLINK-OPEN">
+ <refentry id="CONTRIB-DBLINK">
<refnamediv>
- <refname>dblink_open</refname>
- <refpurpose>opens a cursor on a remote database</refpurpose>
+ <refname>dblink</refname>
+ <refpurpose>executes a query in a remote database</refpurpose>
</refnamediv>
-
+
<refsynopsisdiv>
<synopsis>
- dblink_open(text cursorname, text sql [, bool fail_on_error])
- dblink_open(text connname, text cursorname, text sql [, bool fail_on_error])
+ dblink(text connname, text sql [, bool fail_on_error]) returns setof record
+ dblink(text connstr, text sql [, bool fail_on_error]) returns setof record
+ dblink(text sql [, bool fail_on_error]) returns setof record
</synopsis>
</refsynopsisdiv>
-
- <refsect1>
- <title>Inputs</title>
-
- <refsect2>
- <title>connname</title>
- <para>
- if three arguments are present, the first is taken as the specific
- connection name to use; otherwise the unnamed connection is assumed
- </para>
- </refsect2>
-
- <refsect2>
- <title>cursorname</title>
- <para>
- a reference name for the cursor
- </para>
- </refsect2>
-
- <refsect2>
- <title>sql</title>
- <para>
- sql statement that you wish to execute on the remote host
- e.g. "select * from pg_class"
- </para>
- </refsect2>
-
- <refsect2>
- <title>fail_on_error</title>
- <para>
- If true (default when not present) then an ERROR thrown on the remote side
- of the connection causes an ERROR to also be thrown locally. If false, the
- remote ERROR is locally treated as a NOTICE, and the return value is set
- to 'ERROR'.
- </para>
- </refsect2>
- </refsect1>
-
- <refsect1>
- <title>Outputs</title>
- <para>Returns status = "OK"</para>
- </refsect1>
-
- <refsect1>
- <title>Note</title>
- <itemizedlist>
- <listitem>
- <para>
- dblink_connect(text connstr) must be executed first
- </para>
- </listitem>
- <listitem>
- <para>
- dblink_open starts an explicit transaction. If, after using dblink_open,
- you use dblink_exec to change data, and then an error occurs or you use
- dblink_disconnect without a dblink_close first, your change *will* be
- lost. Also, using dblink_close explicitly ends the transaction and thus
- effectively closes *all* open cursors.
- </para>
- </listitem>
- </itemizedlist>
-
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <function>dblink</> executes a query (usually a <command>SELECT</>,
+ but it can be any SQL statement that returns rows) in a remote database.
+ </para>
+
+ <para>
+ When two <type>text</> arguments are given, the first one is initially
+ looked up as a persistent connection's name; if found, the command
+ is executed on that connection. If not found, the first argument
+ is treated as a connection info string as for <function>dblink_connect</>,
+ and the indicated connection is made just for the duration of this command.
+ </para>
</refsect1>
+
<refsect1>
- <title>Example</title>
- <programlisting>
- test=# select dblink_connect('dbname=postgres');
- dblink_connect
- ----------------
- OK
- (1 row)
-
- test=# select dblink_open('foo','select proname, prosrc from pg_proc');
- dblink_open
- -------------
- OK
- (1 row)
- </programlisting>
+ <title>Arguments</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><parameter>connname</parameter></term>
+ <listitem>
+ <para>
+ Name of the connection to use; omit this parameter to use the
+ unnamed connection.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>connstr</parameter></term>
+ <listitem>
+ <para>
+ A connection info string, as previously described for
+ <function>dblink_connect</>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>sql</parameter></term>
+ <listitem>
+ <para>
+ The SQL query that you wish to execute in the remote database,
+ for example <literal>select * from foo</>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>fail_on_error</parameter></term>
+ <listitem>
+ <para>
+ If true (the default when omitted) then an error thrown on the
+ remote side of the connection causes an error to also be thrown
+ locally. If false, the remote error is locally reported as a NOTICE,
+ and the function returns no rows.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
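+
+ <para>
+ The effect of <parameter>fail_on_error</parameter> might be sketched as
+ follows. This assumes a named connection <literal>myconn</> is already
+ open, and <literal>no_such_table</> is a placeholder for a relation
+ that does not exist in the remote database:
+ </para>
+
+<programlisting>
+-- with fail_on_error = false, the remote error should be reported
+-- locally as a NOTICE and the query should return no rows
+select *
+  from dblink('myconn', 'select proname from no_such_table', false)
+    as t1(proname name);
+</programlisting>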
</refsect1>
- </refentry>
- <refentry id="CONTRIB-DBLINK-FETCH">
- <refnamediv>
- <refname>dblink_fetch</refname>
- <refpurpose>returns a set from an open cursor on a remote database</refpurpose>
- </refnamediv>
-
- <refsynopsisdiv>
- <synopsis>
- dblink_fetch(text cursorname, int32 howmany [, bool fail_on_error])
- dblink_fetch(text connname, text cursorname, int32 howmany [, bool fail_on_error])
- </synopsis>
- </refsynopsisdiv>
-
- <refsect1>
- <title>Inputs</title>
-
- <refsect2>
- <title>connname</title>
- <para>
- if three arguments are present, the first is taken as the specific
- connection name to use; otherwise the unnamed connection is assumed
- </para>
- </refsect2>
-
- <refsect2>
- <title>cursorname</title>
- <para>
- The reference name for the cursor
- </para>
- </refsect2>
-
- <refsect2>
- <title>howmany</title>
- <para>
- Maximum number of rows to retrieve. The next howmany rows are fetched,
- starting at the current cursor position, moving forward. Once the cursor
- has positioned to the end, no more rows are produced.
- </para>
- </refsect2>
-
- <refsect2>
- <title>fail_on_error</title>
- <para>
- If true (default when not present) then an ERROR thrown on the remote side
- of the connection causes an ERROR to also be thrown locally. If false, the
- remote ERROR is locally treated as a NOTICE, and no rows are returned.
- </para>
- </refsect2>
- </refsect1>
-
- <refsect1>
- <title>Outputs</title>
- <para>Returns setof record</para>
- </refsect1>
-
- <refsect1>
- <title>Note</title>
+ <refsect1>
+ <title>Return Value</title>
+
+ <para>
+ The function returns the row(s) produced by the query. Since
+ <function>dblink</> can be used with any query, it is declared
+ to return <type>record</>, rather than specifying any particular
+ set of columns. This means that you must specify the expected
+ set of columns in the calling query — otherwise
+ <productname>PostgreSQL</> would not know what to expect.
+ Here is an example:
+
+<programlisting>
+SELECT *
+ FROM dblink('dbname=mydb', 'select proname, prosrc from pg_proc')
+ AS t1(proname name, prosrc text)
+ WHERE proname LIKE 'bytea%';
+</programlisting>
+
+ The <quote>alias</> part of the <literal>FROM</> clause must
+ specify the column names and types that the function will return.
+ (Specifying column names in an alias is actually standard SQL
+ syntax, but specifying column types is a <productname>PostgreSQL</>
+ extension.) This allows the system to understand what
+ <literal>*</> should expand to, and what <structname>proname</>
+ in the <literal>WHERE</> clause refers to, in advance of trying
+ to execute the function. At runtime, an error will be thrown
+ if the actual query result from the remote database does not
+ have the same number of columns as shown in the <literal>FROM</> clause.
+ The column names need not match, however, and <function>dblink</>
+ does not insist on exact type matches either. It will succeed
+ so long as the returned data strings are valid input for the
+ column type declared in the <literal>FROM</> clause.
+ </para>
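+
+ <para>
+ As an illustration of the loose matching described above, the following
+ sketch reads two remote columns under different local names and as
+ <type>text</>. The local names <literal>id</> and <literal>rel</> are
+ arbitrary choices made for this example:
+ </para>
+
+<programlisting>
+-- the remote columns are oid and relname; only the column count must agree,
+-- and each returned string must be valid input for the declared type
+select *
+  from dblink('dbname=postgres', 'select oid, relname from pg_class')
+    as t1(id text, rel text);
+</programlisting>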
+ </refsect1>
+
+ <refsect1>
+ <title>Notes</title>
+
+ <para>
+ <function>dblink</> fetches the entire remote query result before
+ returning any of it to the local system. If the query is expected
+ to return a large number of rows, it's better to open it as a cursor
+ with <function>dblink_open</> and then fetch a manageable number
+ of rows at a time.
+ </para>
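+
+ <para>
+ A cursor-based fetch could look roughly like the sketch below. The
+ cursor name <literal>foo</> and the batch size of 100 are arbitrary,
+ and the unnamed connection is assumed to be usable for the whole
+ sequence:
+ </para>
+
+<programlisting>
+select dblink_connect('dbname=postgres');
+select dblink_open('foo', 'select proname, prosrc from pg_proc');
+
+-- repeat this call until it returns no rows
+select * from dblink_fetch('foo', 100) as t1(proname name, prosrc text);
+
+select dblink_close('foo');
+select dblink_disconnect();
+</programlisting>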
+
<para>
- On a mismatch between the number of return fields as specified in the FROM
- clause, and the actual number of fields returned by the remote cursor, an
- ERROR will be thrown. In this event, the remote cursor is still advanced
- by as many rows as it would have been if the ERROR had not occurred.
+ A convenient way to use <function>dblink</> with predetermined
+ queries is to create a view.
+ This allows the column type information to be buried in the view,
+ instead of having to spell it out in every query. For example,
+
+ <programlisting>
+ create view myremote_pg_proc as
+ select *
+ from dblink('dbname=postgres', 'select proname, prosrc from pg_proc')
+ as t1(proname name, prosrc text);
+
+ select * from myremote_pg_proc where proname like 'bytea%';
+ </programlisting>
</para>
</refsect1>
-
+
<refsect1>
<title>Example</title>
+
<programlisting>
- test=# select dblink_connect('dbname=postgres');
+ select * from dblink('dbname=postgres', 'select proname, prosrc from pg_proc')
+ as t1(proname name, prosrc text) where proname like 'bytea%';
+ proname | prosrc
+ ------------+------------
+ byteacat | byteacat
+ byteaeq | byteaeq
+ bytealt | bytealt
+ byteale | byteale
+ byteagt | byteagt
+ byteage | byteage
+ byteane | byteane
+ byteacmp | byteacmp
+ bytealike | bytealike
+ byteanlike | byteanlike
+ byteain | byteain
+ byteaout | byteaout
+ (12 rows)
+
+ select dblink_connect('dbname=postgres');
dblink_connect
----------------
OK
(1 row)
-
- test=# select dblink_open('foo','select proname, prosrc from pg_proc where proname like ''bytea%''');
- dblink_open
- -------------
+
+ select * from dblink('select proname, prosrc from pg_proc')
+ as t1(proname name, prosrc text) where proname like 'bytea%';
+ proname | prosrc
+ ------------+------------
+ byteacat | byteacat
+ byteaeq | byteaeq
+ bytealt | bytealt
+ byteale | byteale
+ byteagt | byteagt
+ byteage | byteage
+ byteane | byteane
+ byteacmp | byteacmp
+ bytealike | bytealike
+ byteanlike | byteanlike
+ byteain | byteain
+ byteaout | byteaout
+ (12 rows)
+
+ select dblink_connect('myconn', 'dbname=regression');
+ dblink_connect
+ ----------------
OK
(1 row)
-
- test=# select * from dblink_fetch('foo',5) as (funcname name, source text);
- funcname | source
- ----------+----------
- byteacat | byteacat
- byteacmp | byteacmp
- byteaeq | byteaeq
- byteage | byteage
- byteagt | byteagt
- (5 rows)
-
- test=# select * from dblink_fetch('foo',5) as (funcname name, source text);
- funcname | source
- -----------+-----------
- byteain | byteain
- byteale | byteale
- bytealike | bytealike
- bytealt | bytealt
- byteane | byteane
- (5 rows)
-
- test=# select * from dblink_fetch('foo',5) as (funcname name, source text);
- funcname | source
+
+ select * from dblink('myconn', 'select proname, prosrc from pg_proc')
+ as t1(proname name, prosrc text) where proname like 'bytea%';
+ proname | prosrc
------------+------------
+ bytearecv | bytearecv
+ byteasend | byteasend
+ byteale | byteale
+ byteagt | byteagt
+ byteage | byteage
+ byteane | byteane
+ byteacmp | byteacmp
+ bytealike | bytealike
byteanlike | byteanlike
+ byteacat | byteacat
+ byteaeq | byteaeq
+ bytealt | bytealt
+ byteain | byteain
byteaout | byteaout
- (2 rows)
-
- test=# select * from dblink_fetch('foo',5) as (funcname name, source text);
- funcname | source
- ----------+--------
- (0 rows)
+ (14 rows)
</programlisting>
</refsect1>
</refentry>
- <refentry id="CONTRIB-DBLINK-CLOSE">
+ <refentry id="CONTRIB-DBLINK-EXEC">
<refnamediv>
- <refname>dblink_close</refname>
- <refpurpose>closes a cursor on a remote database</refpurpose>
+ <refname>dblink_exec</refname>
+ <refpurpose>executes a command in a remote database</refpurpose>
</refnamediv>
-
+
<refsynopsisdiv>
<synopsis>
- dblink_close(text cursorname [, bool fail_on_error])
- dblink_close(text connname, text cursorname [, bool fail_on_error])
+ dblink_exec(text connname, text sql [, bool fail_on_error]) returns text
+ dblink_exec(text connstr, text sql [, bool fail_on_error]) returns text
+ dblink_exec(text sql [, bool fail_on_error]) returns text
</synopsis>
</refsynopsisdiv>
-
- <refsect1>
- <title>Inputs</title>
-
- <refsect2>
- <title>connname</title>
- <para>
- if two arguments are present, the first is taken as the specific
- connection name to use; otherwise the unnamed connection is assumed
- </para>
- </refsect2>
-
- <refsect2>
- <title>cursorname</title>
- <para>
- a reference name for the cursor
- </para>
- </refsect2>
-
- <refsect2>
- <title>fail_on_error</title>
- <para>
- If true (default when not present) then an ERROR thrown on the remote side
- of the connection causes an ERROR to also be thrown locally. If false, the
- remote ERROR is locally treated as a NOTICE, and the return value is set
- to 'ERROR'.
- </para>
- </refsect2>
- </refsect1>
-
- <refsect1>
- <title>Outputs</title>
- <para>Returns status = "OK"</para>
- </refsect1>
-
- <refsect1>
- <title>Note</title>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <function>dblink_exec</> executes a command (that is, any SQL statement
+ that doesn't return rows) in a remote database.
+ </para>
+
+ <para>
+ When two <type>text</> arguments are given, the first one is initially
+ looked up as a persistent connection's name; if found, the command
+ is executed on that connection. If not found, the first argument
+ is treated as a connection info string as for <function>dblink_connect</>,
+ and the indicated connection is made just for the duration of this command.
+ </para>
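+
+ <para>
+ The two interpretations of the first argument can be sketched as
+ follows, assuming (as in the examples elsewhere in this section) a
+ database <literal>regression</> that contains a table
+ <literal>foo</>; the inserted values are made up:
+ </para>
+
+<programlisting>
+-- 'myconn' matches an open named connection, so that connection is used
+select dblink_connect('myconn', 'dbname=regression');
+select dblink_exec('myconn', 'insert into foo values(22,''y'',''{"a1","b1","c1"}'')');
+
+-- no connection has this name, so the string is treated as a connection
+-- info string and a connection is made just for this command
+select dblink_exec('dbname=regression', 'insert into foo values(23,''x'',''{"a2","b2","c2"}'')');
+</programlisting>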
+ </refsect1>
+
+ <refsect1>
+ <title>Arguments</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><parameter>connname</parameter></term>
+ <listitem>
+ <para>
+ Name of the connection to use; omit this parameter to use the
+ unnamed connection.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>connstr</parameter></term>
+ <listitem>
+ <para>
+ A connection info string, as previously described for
+ <function>dblink_connect</>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>sql</parameter></term>
+ <listitem>
+ <para>
+ The SQL command that you wish to execute in the remote database,
+ for example
+ <literal>insert into foo values(0,'a','{"a0","b0","c0"}')</>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>fail_on_error</parameter></term>
+ <listitem>
+ <para>
+ If true (the default when omitted) then an error thrown on the
+ remote side of the connection causes an error to also be thrown
+ locally. If false, the remote error is locally reported as a NOTICE,
+ and the function's return value is set to <literal>ERROR</>.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Return Value</title>
+
<para>
- dblink_connect(text connstr) or dblink_connect(text connname, text connstr)
- must be executed first.
+ Returns status, either the command's status string or <literal>ERROR</>.
</para>
</refsect1>
-
+
<refsect1>
<title>Example</title>
+
<programlisting>
- test=# select dblink_connect('dbname=postgres');
+ select dblink_connect('dbname=dblink_test_slave');
dblink_connect
----------------
OK
(1 row)
-
- test=# select dblink_open('foo','select proname, prosrc from pg_proc');
- dblink_open
- -------------
- OK
- (1 row)
-
- test=# select dblink_close('foo');
- dblink_close
- --------------
- OK
+
+ select dblink_exec('insert into foo values(21,''z'',''{"a0","b0","c0"}'');');
+ dblink_exec
+ -----------------
+ INSERT 943366 1
(1 row)
-
- select dblink_connect('myconn','dbname=regression');
+
+ select dblink_connect('myconn', 'dbname=regression');
dblink_connect
----------------
OK
(1 row)
-
- select dblink_open('myconn','foo','select proname, prosrc from pg_proc');
- dblink_open
- -------------
- OK
- (1 row)
-
- select dblink_close('myconn','foo');
- dblink_close
- --------------
- OK
+
+ select dblink_exec('myconn', 'insert into foo values(21,''z'',''{"a0","b0","c0"}'');');
+ dblink_exec
+ ------------------
+ INSERT 6432584 1
+ (1 row)
+
+ select dblink_exec('myconn', 'insert into pg_class values (''foo'')',false);
+ NOTICE: sql error
+ DETAIL: ERROR: null value in column "relnamespace" violates not-null constraint
+
+ dblink_exec
+ -------------
+ ERROR
(1 row)
</programlisting>
</refsect1>
</refentry>
- <refentry id="CONTRIB-DBLINK-EXEC">
+ <refentry id="CONTRIB-DBLINK-OPEN">
<refnamediv>
- <refname>dblink_exec</refname>
- <refpurpose>executes an UPDATE/INSERT/DELETE on a remote database</refpurpose>
+ <refname>dblink_open</refname>
+ <refpurpose>opens a cursor in a remote database</refpurpose>
</refnamediv>
-
+
<refsynopsisdiv>
<synopsis>
- dblink_exec(text connstr, text sql [, bool fail_on_error])
- dblink_exec(text connname, text sql [, bool fail_on_error])
- dblink_exec(text sql [, bool fail_on_error])
+ dblink_open(text cursorname, text sql [, bool fail_on_error]) returns text
+ dblink_open(text connname, text cursorname, text sql [, bool fail_on_error]) returns text
</synopsis>
</refsynopsisdiv>
-
- <refsect1>
- <title>Inputs</title>
-
- <refsect2>
- <title>connname/connstr</title>
- <para>
- If two arguments are present, the first is first assumed to be a specific
- connection name to use. If the name is not found, the argument is then
- assumed to be a valid connection string, of standard libpq format,
- e.g.: "hostaddr=127.0.0.1 dbname=mydb user=postgres password=mypasswd"
-
- If only one argument is used, then the unnamed connection is used.
- </para>
- </refsect2>
-
- <refsect2>
- <title>sql</title>
- <para>
- sql statement that you wish to execute on the remote host, e.g.:
- insert into foo values(0,'a','{"a0","b0","c0"}');
- </para>
- </refsect2>
- <refsect2>
- <title>fail_on_error</title>
- <para>
- If true (default when not present) then an ERROR thrown on the remote side
- of the connection causes an ERROR to also be thrown locally. If false, the
- remote ERROR is locally treated as a NOTICE, and the return value is set
- to 'ERROR'.
- </para>
- </refsect2>
- </refsect1>
-
- <refsect1>
- <title>Outputs</title>
- <para>Returns status of the command, or 'ERROR' if the command failed.</para>
- </refsect1>
-
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <function>dblink_open()</> opens a cursor in a remote database.
+ The cursor can subsequently be manipulated with
+ <function>dblink_fetch()</> and <function>dblink_close()</>.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Arguments</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><parameter>connname</parameter></term>
+ <listitem>
+ <para>
+ Name of the connection to use; omit this parameter to use the
+ unnamed connection.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>cursorname</parameter></term>
+ <listitem>
+ <para>
+ The name to assign to this cursor.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>sql</parameter></term>
+ <listitem>
+ <para>
+ The <command>SELECT</> statement that you wish to execute in the remote
+ database, for example <literal>select * from pg_class</>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>fail_on_error</parameter></term>
+ <listitem>
+ <para>
+ If true (the default when omitted) then an error thrown on the
+ remote side of the connection causes an error to also be thrown
+ locally. If false, the remote error is locally reported as a NOTICE,
+ and the function's return value is set to <literal>ERROR</>.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Return Value</title>
+
+ <para>
+ Returns status, either <literal>OK</> or <literal>ERROR</>.
+ </para>
+ </refsect1>
+
<refsect1>
<title>Notes</title>
+
<para>
- dblink_open starts an explicit transaction. If, after using dblink_open,
- you use dblink_exec to change data, and then an error occurs or you use
- dblink_disconnect without a dblink_close first, your change *will* be
- lost.
+ Since a cursor can only persist within a transaction,
+ <function>dblink_open</> starts an explicit transaction block
+ (<command>BEGIN</>) on the remote side, if the remote side was
+ not already within a transaction. This transaction will be
+ closed again when the matching <function>dblink_close</> is
+ executed. Note that if
+ you use <function>dblink_exec</> to change data between
+ <function>dblink_open</> and <function>dblink_close</>,
+ and then an error occurs or you use <function>dblink_disconnect</> before
+ <function>dblink_close</>, your change <emphasis>will be
+ lost</> because the transaction will be aborted.
</para>
</refsect1>
-
+
<refsect1>
<title>Example</title>
+
<programlisting>
- select dblink_connect('dbname=dblink_test_slave');
- dblink_connect
- ----------------
- OK
- (1 row)
-
- select dblink_exec('insert into foo values(21,''z'',''{"a0","b0","c0"}'');');
- dblink_exec
- -----------------
- INSERT 943366 1
- (1 row)
-
- select dblink_connect('myconn','dbname=regression');
+ test=# select dblink_connect('dbname=postgres');
dblink_connect
----------------
OK
(1 row)
-
- select dblink_exec('myconn','insert into foo values(21,''z'',''{"a0","b0","c0"}'');');
- dblink_exec
- ------------------
- INSERT 6432584 1
- (1 row)
-
- select dblink_exec('myconn','insert into pg_class values (''foo'')',false);
- NOTICE: sql error
- DETAIL: ERROR: null value in column "relnamespace" violates not-null constraint
-
- dblink_exec
+
+ test=# select dblink_open('foo', 'select proname, prosrc from pg_proc');
+ dblink_open
-------------
- ERROR
+ OK
(1 row)
</programlisting>
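+
+ <para>
+ The transaction behavior described under Notes can be sketched as follows.
+ This is only an illustration: it continues the session above, in which
+ the cursor <literal>foo</> is the only open cursor, and it assumes a
+ hypothetical remote table <literal>mytab</> with columns
+ <literal>id</> and <literal>val</>.
+ </para>
+
+ <programlisting>
+ -- runs inside the remote transaction block started by dblink_open
+ select dblink_exec('update mytab set val = ''z'' where id = 0');
+
+ -- closing the last open cursor issues the matching COMMIT;
+ -- calling dblink_disconnect() here instead would discard the update
+ select dblink_close('foo');
+ </programlisting>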
</refsect1>
</refentry>
- <refentry id="CONTRIB-DBLINK-CURRENT-QUERY">
+ <refentry id="CONTRIB-DBLINK-FETCH">
<refnamediv>
- <refname>dblink_current_query</refname>
- <refpurpose>returns the current query string</refpurpose>
+ <refname>dblink_fetch</refname>
+ <refpurpose>returns rows from an open cursor in a remote database</refpurpose>
</refnamediv>
-
+
<refsynopsisdiv>
<synopsis>
- dblink_current_query () RETURNS text
+ dblink_fetch(text cursorname, int howmany [, bool fail_on_error]) returns setof record
+ dblink_fetch(text connname, text cursorname, int howmany [, bool fail_on_error]) returns setof record
</synopsis>
</refsynopsisdiv>
-
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <function>dblink_fetch</> fetches rows from a cursor previously
+ established by <function>dblink_open</>.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Arguments</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><parameter>connname</parameter></term>
+ <listitem>
+ <para>
+ Name of the connection to use; omit this parameter to use the
+ unnamed connection.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>cursorname</parameter></term>
+ <listitem>
+ <para>
+ The name of the cursor to fetch from.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>howmany</parameter></term>
+ <listitem>
+ <para>
+ The maximum number of rows to retrieve. The next <parameter>howmany</>
+ rows are fetched, starting at the current cursor position, moving
+ forward. Once the cursor has reached its end, no more rows are produced.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>fail_on_error</parameter></term>
+ <listitem>
+ <para>
+ If true (the default when omitted) then an error thrown on the
+ remote side of the connection causes an error to also be thrown
+ locally. If false, the remote error is locally reported as a NOTICE,
+ and the function returns no rows.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect1>
+
<refsect1>
- <title>Inputs</title>
-
- <refsect2>
- <title>None</title>
- <para>
- </para>
- </refsect2>
+ <title>Return Value</title>
+
+ <para>
+ The function returns the row(s) fetched from the cursor. To use this
+ function, you will need to specify the expected set of columns,
+ as previously discussed for <function>dblink</>.
+ </para>
</refsect1>
-
+
<refsect1>
- <title>Outputs</title>
- <para>Returns test -- a copy of the currenty executing query</para>
+ <title>Notes</title>
+
+ <para>
+ If the number of return columns specified in the <literal>FROM</>
+ clause does not match the actual number of columns returned by the
+ remote cursor, an error is thrown. In this event, the remote cursor
+ is still advanced by as many rows as it would have been if the error had
+ not occurred. The same is true for any other error occurring in the local
+ query after the remote <command>FETCH</> has been done.
+ </para>
</refsect1>
-
+
<refsect1>
<title>Example</title>
+
<programlisting>
- test=# select dblink_current_query() from (select dblink('dbname=postgres','select oid, proname from pg_proc where proname = ''byteacat''') as f1) as t1;
- dblink_current_query
- -----------------------------------------------------------------------------------------------------------------------------------------------------
- select dblink_current_query() from (select dblink('dbname=postgres','select oid, proname from pg_proc where proname = ''byteacat''') as f1) as t1;
+ test=# select dblink_connect('dbname=postgres');
+ dblink_connect
+ ----------------
+ OK
+ (1 row)
+
+ test=# select dblink_open('foo', 'select proname, prosrc from pg_proc where proname like ''bytea%''');
+ dblink_open
+ -------------
+ OK
(1 row)
+
+ test=# select * from dblink_fetch('foo', 5) as (funcname name, source text);
+ funcname | source
+ ----------+----------
+ byteacat | byteacat
+ byteacmp | byteacmp
+ byteaeq | byteaeq
+ byteage | byteage
+ byteagt | byteagt
+ (5 rows)
+
+ test=# select * from dblink_fetch('foo', 5) as (funcname name, source text);
+ funcname | source
+ -----------+-----------
+ byteain | byteain
+ byteale | byteale
+ bytealike | bytealike
+ bytealt | bytealt
+ byteane | byteane
+ (5 rows)
+
+ test=# select * from dblink_fetch('foo', 5) as (funcname name, source text);
+ funcname | source
+ ------------+------------
+ byteanlike | byteanlike
+ byteaout | byteaout
+ (2 rows)
+
+ test=# select * from dblink_fetch('foo', 5) as (funcname name, source text);
+ funcname | source
+ ----------+--------
+ (0 rows)
</programlisting>
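+
+ <para>
+ The four-argument form can be used in the same way over a named
+ connection. The following is a minimal sketch, assuming a connection
+ name <literal>myconn</> chosen only for illustration:
+ </para>
+
+ <programlisting>
+ select dblink_connect('myconn', 'dbname=postgres');
+ select dblink_open('myconn', 'foo', 'select proname, prosrc from pg_proc where proname like ''bytea%''');
+
+ -- with fail_on_error set to false, a remote error is reported
+ -- locally as a NOTICE and the function returns no rows
+ select * from dblink_fetch('myconn', 'foo', 5, false) as (funcname name, source text);
+
+ select dblink_close('myconn', 'foo');
+ </programlisting>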
</refsect1>
</refentry>
- <refentry id="CONTRIB-DBLINK-GET-PKEY">
+ <refentry id="CONTRIB-DBLINK-CLOSE">
<refnamediv>
- <refname>dblink_get_pkey</refname>
- <refpurpose>returns the position and field names of a relation's
- primary key fields
- </refpurpose>
+ <refname>dblink_close</refname>
+ <refpurpose>closes a cursor in a remote database</refpurpose>
</refnamediv>
-
+
<refsynopsisdiv>
<synopsis>
- dblink_get_pkey(text relname) RETURNS setof dblink_pkey_results
+ dblink_close(text cursorname [, bool fail_on_error]) returns text
+ dblink_close(text connname, text cursorname [, bool fail_on_error]) returns text
</synopsis>
</refsynopsisdiv>
-
- <refsect1>
- <title>Inputs</title>
-
- <refsect2>
- <title>relname</title>
- <para>
- any relation name;
- e.g. 'foobar'
- </para>
- </refsect2>
- </refsect1>
-
+
<refsect1>
- <title>Outputs</title>
+ <title>Description</title>
+
<para>
- Returns setof dblink_pkey_results -- one row for each primary key field,
- in order of position in the key. dblink_pkey_results is defined as follows:
- CREATE TYPE dblink_pkey_results AS (position int4, colname text);
+ <function>dblink_close</> closes a cursor previously opened with
+ <function>dblink_open</>.
</para>
</refsect1>
-
+
<refsect1>
- <title>Example</title>
- <programlisting>
- test=# select * from dblink_get_pkey('foobar');
- position | colname
- ----------+---------
- 1 | f1
- 2 | f2
- 3 | f3
- 4 | f4
- 5 | f5
- </programlisting>
+ <title>Arguments</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><parameter>connname</parameter></term>
+ <listitem>
+ <para>
+ Name of the connection to use; omit this parameter to use the
+ unnamed connection.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>cursorname</parameter></term>
+ <listitem>
+ <para>
+ The name of the cursor to close.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>fail_on_error</parameter></term>
+ <listitem>
+ <para>
+ If true (the default when omitted) then an error thrown on the
+ remote side of the connection causes an error to also be thrown
+ locally. If false, the remote error is locally reported as a NOTICE,
+ and the function's return value is set to <literal>ERROR</>.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</refsect1>
- </refentry>
- <refentry id="CONTRIB-DBLINK-BUILD-SQL-INSERT">
- <refnamediv>
- <refname>dblink_build_sql_insert</refname>
- <refpurpose>
- builds an insert statement using a local tuple, replacing the
- selection key field values with alternate supplied values
- </refpurpose>
- </refnamediv>
-
- <refsynopsisdiv>
- <synopsis>
- dblink_build_sql_insert(text relname
- ,int2vector primary_key_attnums
- ,int2 num_primary_key_atts
- ,_text src_pk_att_vals_array
- ,_text tgt_pk_att_vals_array) RETURNS text
- </synopsis>
- </refsynopsisdiv>
-
- <refsect1>
- <title>Inputs</title>
-
- <refsect2>
- <title>relname</title>
- <para>
- any relation name;
- e.g. 'foobar';
- </para>
- </refsect2>
- <refsect2>
- <title>primary_key_attnums</title>
- <para>
- vector of primary key attnums (1 based, see pg_index.indkey);
- e.g. '1 2'
- </para>
- </refsect2>
- <refsect2>
- <title>num_primary_key_atts</title>
- <para>
- number of primary key attnums in the vector; e.g. 2
- </para>
- </refsect2>
- <refsect2>
- <title>src_pk_att_vals_array</title>
- <para>
- array of primary key values, used to look up the local matching
- tuple, the values of which are then used to construct the SQL
- statement
- </para>
- </refsect2>
- <refsect2>
- <title>tgt_pk_att_vals_array</title>
- <para>
- array of primary key values, used to replace the local tuple
- values in the SQL statement
- </para>
- </refsect2>
- </refsect1>
-
- <refsect1>
- <title>Outputs</title>
- <para>Returns text -- requested SQL statement</para>
- </refsect1>
-
<refsect1>
- <title>Example</title>
- <programlisting>
- test=# select dblink_build_sql_insert('foo','1 2',2,'{"1", "a"}','{"1", "b''a"}');
- dblink_build_sql_insert
- --------------------------------------------------
- INSERT INTO foo(f1,f2,f3) VALUES('1','b''a','1')
- (1 row)
- </programlisting>
+ <title>Return Value</title>
+
+ <para>
+ Returns status, either <literal>OK</> or <literal>ERROR</>.
+ </para>
</refsect1>
- </refentry>
- <refentry id="CONTRIB-DBLINK-BUILD-SQL-DELETE">
- <refnamediv>
- <refname>dblink_build_sql_delete</refname>
- <refpurpose>builds a delete statement using supplied values for selection
- key field values
- </refpurpose>
- </refnamediv>
-
- <refsynopsisdiv>
- <synopsis>
- dblink_build_sql_delete(text relname
- ,int2vector primary_key_attnums
- ,int2 num_primary_key_atts
- ,_text tgt_pk_att_vals_array) RETURNS text
- </synopsis>
- </refsynopsisdiv>
-
- <refsect1>
- <title>Inputs</title>
-
- <refsect2>
- <title>relname</title>
- <para>
- any relation name;
- e.g. 'foobar';
- </para>
- </refsect2>
- <refsect2>
- <title>primary_key_attnums</title>
- <para>
- vector of primary key attnums (1 based, see pg_index.indkey);
- e.g. '1 2'
- </para>
- </refsect2>
- <refsect2>
- <title>num_primary_key_atts</title>
- <para>
- number of primary key attnums in the vector; e.g. 2
- </para>
- </refsect2>
- <refsect2>
- <title>src_pk_att_vals_array</title>
- <para>
- array of primary key values, used to look up the local matching
- tuple, the values of which are then used to construct the SQL
- statement
- </para>
- </refsect2>
- <refsect2>
- <title>tgt_pk_att_vals_array</title>
- <para>
- array of primary key values, used to replace the local tuple
- values in the SQL statement
- </para>
- </refsect2>
- </refsect1>
-
- <refsect1>
- <title>Outputs</title>
- <para>Returns text -- requested SQL statement</para>
- </refsect1>
-
<refsect1>
- <title>Example</title>
- <programlisting>
- test=# select dblink_build_sql_delete('MyFoo','1 2',2,'{"1", "b"}');
- dblink_build_sql_delete
- ---------------------------------------------
- DELETE FROM "MyFoo" WHERE f1='1' AND f2='b'
- (1 row)
- </programlisting>
+ <title>Notes</title>
+
+ <para>
+ If <function>dblink_open</> started an explicit transaction block,
+ and this is the last remaining open cursor in this connection,
+ <function>dblink_close</> will issue the matching <command>COMMIT</>.
+ </para>
</refsect1>
- </refentry>
- <refentry id="CONTRIB-DBLINK-BUILD-SQL-UPDATE">
- <refnamediv>
- <refname>dblink_build_sql_update</refname>
- <refpurpose>builds an update statement using a local tuple, replacing
- the selection key field values with alternate supplied values
- </refpurpose>
- </refnamediv>
-
- <refsynopsisdiv>
- <synopsis>
- dblink_build_sql_update(text relname
- ,int2vector primary_key_attnums
- ,int2 num_primary_key_atts
- ,_text src_pk_att_vals_array
- ,_text tgt_pk_att_vals_array) RETURNS text
- </synopsis>
- </refsynopsisdiv>
-
- <refsect1>
- <title>Inputs</title>
-
- <refsect2>
- <title>relname</title>
- <para>
- any relation name;
- e.g. 'foobar';
- </para>
- </refsect2>
- <refsect2>
- <title>primary_key_attnums</title>
- <para>
- vector of primary key attnums (1 based, see pg_index.indkey);
- e.g. '1 2'
- </para>
- </refsect2>
- <refsect2>
- <title>num_primary_key_atts</title>
- <para>
- number of primary key attnums in the vector; e.g. 2
- </para>
- </refsect2>
- <refsect2>
- <title>src_pk_att_vals_array</title>
- <para>
- array of primary key values, used to look up the local matching
- tuple, the values of which are then used to construct the SQL
- statement
- </para>
- </refsect2>
- <refsect2>
- <title>tgt_pk_att_vals_array</title>
- <para>
- array of primary key values, used to replace the local tuple
- values in the SQL statement
- </para>
- </refsect2>
- </refsect1>
-
- <refsect1>
- <title>Outputs</title>
- <para>Returns text -- requested SQL statement</para>
- </refsect1>
-
<refsect1>
<title>Example</title>
+
<programlisting>
- test=# select dblink_build_sql_update('foo','1 2',2,'{"1", "a"}','{"1", "b"}');
- dblink_build_sql_update
- -------------------------------------------------------------
- UPDATE foo SET f1='1',f2='b',f3='1' WHERE f1='1' AND f2='b'
+ test=# select dblink_connect('dbname=postgres');
+ dblink_connect
+ ----------------
+ OK
+ (1 row)
+
+ test=# select dblink_open('foo', 'select proname, prosrc from pg_proc');
+ dblink_open
+ -------------
+ OK
+ (1 row)
+
+ test=# select dblink_close('foo');
+ dblink_close
+ --------------
+ OK
(1 row)
</programlisting>
</refsect1>
<refentry id="CONTRIB-DBLINK-GET-CONNECTIONS">
<refnamediv>
<refname>dblink_get_connections</refname>
- <refpurpose>returns a text array of all active named dblink connections</refpurpose>
+ <refpurpose>returns the names of all open named dblink connections</refpurpose>
</refnamediv>
-
+
<refsynopsisdiv>
<synopsis>
- dblink_get_connections() RETURNS text[]
+ dblink_get_connections() returns text[]
</synopsis>
</refsynopsisdiv>
-
+
<refsect1>
- <title>Inputs</title>
-
- <refsect2>
- <title>none</title>
- <para></para>
- </refsect2>
+ <title>Description</title>
+
+ <para>
+ <function>dblink_get_connections</> returns an array of the names
+ of all open named <filename>dblink</> connections.
+ </para>
</refsect1>
-
+
<refsect1>
- <title>Outputs</title>
- <para>Returns text array of all active named dblink connections</para>
+ <title>Return Value</title>
+
+ <para>Returns a text array of connection names, or NULL if none.</para>
</refsect1>
-
+
<refsect1>
<title>Example</title>
+
<programlisting>
SELECT dblink_get_connections();
</programlisting>
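+
+ <para>
+ As a hedged illustration of when the result is non-NULL, the sketch
+ below opens two named connections before calling the function. The
+ connection names and the database name are assumptions taken from the
+ other examples on this page, not requirements.
+ </para>
+
+ <programlisting>
+ SELECT dblink_connect('dtest1', 'dbname=contrib_regression');
+ SELECT dblink_connect('dtest2', 'dbname=contrib_regression');
+
+ -- should now list both connection names in a text array
+ SELECT dblink_get_connections();
+
+ SELECT dblink_disconnect('dtest1');
+ SELECT dblink_disconnect('dtest2');
+ </programlisting>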
</refsect1>
- </refentry>
+ </refentry>
- <refentry id="CONTRIB-DBLINK-IS-BUSY">
+ <refentry id="CONTRIB-DBLINK-ERROR-MESSAGE">
<refnamediv>
- <refname>dblink_is_busy</refname>
- <refpurpose>checks to see if named connection is busy with an async query</refpurpose>
+ <refname>dblink_error_message</refname>
+ <refpurpose>gets last error message on the named connection</refpurpose>
</refnamediv>
-
+
<refsynopsisdiv>
<synopsis>
- dblink_is_busy(text connname) RETURNS int
+ dblink_error_message(text connname) returns text
</synopsis>
</refsynopsisdiv>
-
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <function>dblink_error_message</> fetches the most recent remote
+ error message for a given connection.
+ </para>
+ </refsect1>
+
<refsect1>
- <title>Inputs</title>
-
- <refsect2>
- <title>connname</title>
- <para>
- The specific connection name to use
- </para>
- </refsect2>
+ <title>Arguments</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><parameter>connname</parameter></term>
+ <listitem>
+ <para>
+ Name of the connection to use.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</refsect1>
-
+
<refsect1>
- <title>Outputs</title>
+ <title>Return Value</title>
+
<para>
- Returns 1 if connection is busy, 0 if it is not busy.
- If this function returns 0, it is guaranteed that dblink_get_result
- will not block.
+ Returns the last error message, or an empty string if there has been
+ no error in this connection.
</para>
</refsect1>
-
+
<refsect1>
<title>Example</title>
+
<programlisting>
- SELECT dblink_is_busy('dtest1');
+ SELECT dblink_error_message('dtest1');
</programlisting>
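+
+ <para>
+ One way the function might be used is to inspect a failure that was
+ suppressed with <parameter>fail_on_error</> set to false. The sketch
+ below assumes an open connection named <literal>dtest1</>; the table
+ name <literal>no_such_table</> is hypothetical and chosen so that the
+ remote command fails.
+ </para>
+
+ <programlisting>
+ -- the remote error is reported as a NOTICE and dblink_exec returns ERROR
+ SELECT dblink_exec('dtest1', 'insert into no_such_table values (1)', false);
+
+ -- retrieve the text of the most recent remote error
+ SELECT dblink_error_message('dtest1');
+ </programlisting>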
</refsect1>
</refentry>
- <refentry id="CONTRIB-DBLINK-CANCEL-QUERY">
+ <refentry id="CONTRIB-DBLINK-SEND-QUERY">
<refnamediv>
- <refname>dblink_cancel_query</refname>
- <refpurpose>cancels any active query on the named connection</refpurpose>
- </refnamediv>
-
+ <refname>dblink_send_query</refname>
+ <refpurpose>sends an async query to a remote database</refpurpose>
+ </refnamediv>
+
<refsynopsisdiv>
<synopsis>
- dblink_cancel_query(text connname) RETURNS text
+ dblink_send_query(text connname, text sql) returns int
</synopsis>
</refsynopsisdiv>
-
+
<refsect1>
- <title>Inputs</title>
-
- <refsect2>
- <title>connname</title>
- <para>
- The specific connection name to use.
- </para>
- </refsect2>
+ <title>Description</title>
+
+ <para>
+ <function>dblink_send_query</> sends a query to be executed
+ asynchronously, that is, without immediately waiting for the result.
+ There must not be an async query already in progress on the
+ connection.
+ </para>
+
+ <para>
+ After successfully dispatching an async query, completion status
+ can be checked with <function>dblink_is_busy</>, and the results
+ are ultimately collected with <function>dblink_get_result</>.
+ It is also possible to attempt to cancel an active async query
+ using <function>dblink_cancel_query</>.
+ </para>
</refsect1>
-
+
<refsect1>
- <title>Outputs</title>
+ <title>Arguments</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><parameter>connname</parameter></term>
+ <listitem>
+ <para>
+ Name of the connection to use.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>sql</parameter></term>
+ <listitem>
+ <para>
+ The SQL statement that you wish to execute in the remote database,
+ for example <literal>select * from pg_class</>.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Return Value</title>
+
<para>
- Returns "OK" on success, or an error message on failure.
+ Returns 1 if the query was successfully dispatched, 0 otherwise.
</para>
</refsect1>
-
+
<refsect1>
<title>Example</title>
+
<programlisting>
- SELECT dblink_cancel_query('dtest1');
+ SELECT dblink_send_query('dtest1', 'SELECT * FROM foo WHERE f1 < 3');
</programlisting>
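+
+ <para>
+ The complete asynchronous sequence described above can be sketched as
+ follows. The connection name, database name, and table
+ <literal>foo</> are assumptions borrowed from the other examples
+ on this page.
+ </para>
+
+ <programlisting>
+ SELECT dblink_connect('dtest1', 'dbname=contrib_regression');
+
+ -- dispatch the query without waiting for its result
+ SELECT dblink_send_query('dtest1', 'select * from foo where f1 > 6');
+
+ -- optional: poll until this returns 0
+ SELECT dblink_is_busy('dtest1');
+
+ -- collect the rows, then call once more to obtain the empty set
+ SELECT * FROM dblink_get_result('dtest1') AS t1(f1 int, f2 text, f3 text[]);
+ SELECT * FROM dblink_get_result('dtest1') AS t1(f1 int, f2 text, f3 text[]);
+ </programlisting>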
</refsect1>
</refentry>
- <refentry id="CONTRIB-DBLINK-ERROR-MESSAGE">
+ <refentry id="CONTRIB-DBLINK-IS-BUSY">
<refnamediv>
- <refname>dblink_error_message</refname>
- <refpurpose>gets last error message on the named connection</refpurpose>
+ <refname>dblink_is_busy</refname>
+ <refpurpose>checks if connection is busy with an async query</refpurpose>
</refnamediv>
-
+
<refsynopsisdiv>
<synopsis>
- dblink_error_message(text connname) RETURNS text
+ dblink_is_busy(text connname) returns int
</synopsis>
</refsynopsisdiv>
-
- <refsect1>
- <title>Inputs</title>
-
- <refsect2>
- <title>connname</title>
- <para>
- The specific connection name to use.
- </para>
- </refsect2>
- </refsect1>
-
+
<refsect1>
- <title>Outputs</title>
+ <title>Description</title>
+
<para>
- Returns last error message.
+ <function>dblink_is_busy</> tests whether an async query is in progress.
</para>
</refsect1>
-
+
<refsect1>
- <title>Example</title>
- <programlisting>
- SELECT dblink_error_message('dtest1');
- </programlisting>
+ <title>Arguments</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><parameter>connname</parameter></term>
+ <listitem>
+ <para>
+ Name of the connection to check.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</refsect1>
- </refentry>
- <refentry id="CONTRIB-DBLINK">
- <refnamediv>
- <refname>dblink</refname>
- <refpurpose>returns a set from a remote database</refpurpose>
- </refnamediv>
-
- <refsynopsisdiv>
- <synopsis>
- dblink(text connstr, text sql [, bool fail_on_error])
- dblink(text connname, text sql [, bool fail_on_error])
- dblink(text sql [, bool fail_on_error])
- </synopsis>
- </refsynopsisdiv>
-
- <refsect1>
- <title>Inputs</title>
-
- <refsect2>
- <title>connname/connstr</title>
- <para>
- If two arguments are present, the first is first assumed to be a specific
- connection name to use. If the name is not found, the argument is then
- assumed to be a valid connection string, of standard libpq format,
- e.g.: "hostaddr=127.0.0.1 dbname=mydb user=postgres password=mypasswd"
-
- If only one argument is used, then the unnamed connection is used.
- </para>
- </refsect2>
-
- <refsect2>
- <title>sql</title>
- <para>
- sql statement that you wish to execute on the remote host
- e.g. "select * from pg_class"
- </para>
- </refsect2>
- <refsect2>
- <title>fail_on_error</title>
- <para>
- If true (default when not present) then an ERROR thrown on the remote side
- of the connection causes an ERROR to also be thrown locally. If false, the
- remote ERROR is locally treated as a NOTICE, and no rows are returned.
- </para>
- </refsect2>
- <refsect2>
- <title></title>
- <para>
- </para>
- </refsect2>
- </refsect1>
-
- <refsect1>
- <title>Outputs</title>
- <para>Returns setof record</para>
- </refsect1>
-
<refsect1>
- <title>Example</title>
- <programlisting>
- select * from dblink('dbname=postgres','select proname, prosrc from pg_proc')
- as t1(proname name, prosrc text) where proname like 'bytea%';
- proname | prosrc
- ------------+------------
- byteacat | byteacat
- byteaeq | byteaeq
- bytealt | bytealt
- byteale | byteale
- byteagt | byteagt
- byteage | byteage
- byteane | byteane
- byteacmp | byteacmp
- bytealike | bytealike
- byteanlike | byteanlike
- byteain | byteain
- byteaout | byteaout
- (12 rows)
-
- select dblink_connect('dbname=postgres');
- dblink_connect
- ----------------
- OK
- (1 row)
-
- select * from dblink('select proname, prosrc from pg_proc')
- as t1(proname name, prosrc text) where proname like 'bytea%';
- proname | prosrc
- ------------+------------
- byteacat | byteacat
- byteaeq | byteaeq
- bytealt | bytealt
- byteale | byteale
- byteagt | byteagt
- byteage | byteage
- byteane | byteane
- byteacmp | byteacmp
- bytealike | bytealike
- byteanlike | byteanlike
- byteain | byteain
- byteaout | byteaout
- (12 rows)
-
- select dblink_connect('myconn','dbname=regression');
- dblink_connect
- ----------------
- OK
- (1 row)
-
- select * from dblink('myconn','select proname, prosrc from pg_proc')
- as t1(proname name, prosrc text) where proname like 'bytea%';
- proname | prosrc
- ------------+------------
- bytearecv | bytearecv
- byteasend | byteasend
- byteale | byteale
- byteagt | byteagt
- byteage | byteage
- byteane | byteane
- byteacmp | byteacmp
- bytealike | bytealike
- byteanlike | byteanlike
- byteacat | byteacat
- byteaeq | byteaeq
- bytealt | bytealt
- byteain | byteain
- byteaout | byteaout
- (14 rows)
- </programlisting>
- <para>
- A more convenient way to use dblink may be to create a view:
- </para>
- <programlisting>
- create view myremote_pg_proc as
- select *
- from dblink('dbname=postgres','select proname, prosrc from pg_proc')
- as t1(proname name, prosrc text);
- </programlisting>
+ <title>Return Value</title>
+
<para>
- Then you can simply write:
+ Returns 1 if connection is busy, 0 if it is not busy.
+ If this function returns 0, it is guaranteed that
+ <function>dblink_get_result</> will not block.
</para>
+ </refsect1>
+
+ <refsect1>
+ <title>Example</title>
+
<programlisting>
- select * from myremote_pg_proc where proname like 'bytea%';
+ SELECT dblink_is_busy('dtest1');
</programlisting>
</refsect1>
</refentry>
- <refentry id="CONTRIB-DBLINK-SEND-QUERY">
+ <refentry id="CONTRIB-DBLINK-GET-RESULT">
<refnamediv>
- <refname>dblink_send_query</refname>
- <refpurpose>sends an async query to a remote database</refpurpose>
+ <refname>dblink_get_result</refname>
+ <refpurpose>gets an async query result</refpurpose>
</refnamediv>
-
+
<refsynopsisdiv>
<synopsis>
- dblink_send_query(text connname, text sql)
+ dblink_get_result(text connname [, bool fail_on_error]) returns setof record
</synopsis>
</refsynopsisdiv>
-
- <refsect1>
- <title>Inputs</title>
-
- <refsect2>
- <title>connname</title>
- <para>
- The specific connection name to use.
- </para>
- </refsect2>
- <refsect2>
- <title>sql</title>
- <para>
- sql statement that you wish to execute on the remote host
- e.g. "select * from pg_class"
- </para>
- </refsect2>
- </refsect1>
-
- <refsect1>
- <title>Outputs</title>
+
+ <refsect1>
+ <title>Description</title>
+
<para>
- Returns int. A return value of 1 if the query was successfully dispatched,
- 0 otherwise. If 1, results must be fetched by dblink_get_result(connname).
- A running query may be cancelled by dblink_cancel_query(connname).
+ <function>dblink_get_result</> collects the results of an
+ asynchronous query previously sent with <function>dblink_send_query</>.
+ If the query is not already completed, <function>dblink_get_result</>
+ will wait until it is.
</para>
</refsect1>
-
+
<refsect1>
- <title>Example</title>
+ <title>Arguments</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><parameter>connname</parameter></term>
+ <listitem>
+ <para>
+ Name of the connection to use.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>fail_on_error</parameter></term>
+ <listitem>
+ <para>
+ If true (the default when omitted) then an error thrown on the
+ remote side of the connection causes an error to also be thrown
+ locally. If false, the remote error is locally reported as a NOTICE,
+ and the function returns no rows.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Return Value</title>
+
+ <para>
+ For an async query (that is, a SQL statement returning rows),
+ the function returns the row(s) produced by the query. To use this
+ function, you will need to specify the expected set of columns,
+ as previously discussed for <function>dblink</>.
+ </para>
+
<para>
- <literal>
- SELECT dblink_connect('dtest1', 'dbname=contrib_regression');
- SELECT * FROM
- dblink_send_query('dtest1', 'SELECT * FROM foo WHERE f1 < 3') AS t1;
- </literal>
+ For an async command (that is, a SQL statement not returning rows),
+ the function returns a single row with a single text column containing
+ the command's status string. It is still necessary to specify that
+ the result will have a single text column in the calling <literal>FROM</>
+ clause.
</para>
</refsect1>
- </refentry>
- <refentry id="CONTRIB-DBLINK-GET-RESULT">
- <refnamediv>
- <refname>dblink_get_result</refname>
- <refpurpose>gets an async query result</refpurpose>
- </refnamediv>
-
- <refsynopsisdiv>
- <synopsis>
- dblink_get_result(text connname [, bool fail_on_error])
- </synopsis>
- </refsynopsisdiv>
-
- <refsect1>
- <title>Inputs</title>
-
- <refsect2>
- <title>connname</title>
- <para>
- The specific connection name to use. An asynchronous query must
- have already been sent using dblink_send_query()
- </para>
- </refsect2>
- <refsect2>
- <title>fail_on_error</title>
- <para>
- If true (default when not present) then an ERROR thrown on the remote side
- of the connection causes an ERROR to also be thrown locally. If false, the
- remote ERROR is locally treated as a NOTICE, and no rows are returned.
- </para>
- </refsect2>
- </refsect1>
-
- <refsect1>
- <title>Outputs</title>
- <para>Returns setof record</para>
- </refsect1>
-
<refsect1>
<title>Notes</title>
+
<para>
- Blocks until a result gets available.
-
- This function *must* be called if dblink_send_query returned
- a 1, even on cancelled queries - otherwise the connection
- can't be used anymore. It must be called once for each query
+ This function <emphasis>must</> be called if
+ <function>dblink_send_query</> returned 1.
+ It must be called once for each query
sent, and one additional time to obtain an empty set result,
- prior to using the connection again.
+ before the connection can be used again.
</para>
</refsect1>
-
+
<refsect1>
<title>Example</title>
+
<programlisting>
contrib_regression=# SELECT dblink_connect('dtest1', 'dbname=contrib_regression');
dblink_connect
----------------
OK
(1 row)
-
+
contrib_regression=# SELECT * from
contrib_regression-# dblink_send_query('dtest1', 'select * from foo where f1 < 3') as t1;
t1
----
1
(1 row)
-
+
contrib_regression=# SELECT * from dblink_get_result('dtest1') as t1(f1 int, f2 text, f3 text[]);
f1 | f2 | f3
----+----+------------
1 | b | {a1,b1,c1}
2 | c | {a2,b2,c2}
(3 rows)
-
+
contrib_regression=# SELECT * from dblink_get_result('dtest1') as t1(f1 int, f2 text, f3 text[]);
f1 | f2 | f3
----+----+----
(0 rows)
-
+
contrib_regression=# SELECT * from
dblink_send_query('dtest1', 'select * from foo where f1 < 3; select * from foo where f1 > 6') as t1;
t1
----
1
(1 row)
-
+
contrib_regression=# SELECT * from dblink_get_result('dtest1') as t1(f1 int, f2 text, f3 text[]);
f1 | f2 | f3
----+----+------------
1 | b | {a1,b1,c1}
2 | c | {a2,b2,c2}
(3 rows)
-
+
contrib_regression=# SELECT * from dblink_get_result('dtest1') as t1(f1 int, f2 text, f3 text[]);
f1 | f2 | f3
----+----+---------------
9 | j | {a9,b9,c9}
10 | k | {a10,b10,c10}
(4 rows)
-
+
contrib_regression=# SELECT * from dblink_get_result('dtest1') as t1(f1 int, f2 text, f3 text[]);
f1 | f2 | f3
----+----+----
</programlisting>
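+
+ <para>
+ For an async command, the result is declared as a single text column,
+ as described under Return Value. The following is a sketch only; the
+ column alias <literal>status</> is arbitrary, and the
+ <command>UPDATE</> statement is a hypothetical command against the
+ table <literal>foo</> used above.
+ </para>
+
+ <programlisting>
+ SELECT dblink_send_query('dtest1', 'update foo set f2 = f2 where f1 = 0');
+
+ -- one row containing the command's status string
+ SELECT * FROM dblink_get_result('dtest1') AS t1(status text);
+
+ -- additional call to obtain the empty set before reusing the connection
+ SELECT * FROM dblink_get_result('dtest1') AS t1(status text);
+ </programlisting>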
</refsect1>
</refentry>
+
+ <refentry id="CONTRIB-DBLINK-CANCEL-QUERY">
+ <refnamediv>
+ <refname>dblink_cancel_query</refname>
+ <refpurpose>cancels any active query on the named connection</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+ <synopsis>
+ dblink_cancel_query(text connname) returns text
+ </synopsis>
+ </refsynopsisdiv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <function>dblink_cancel_query</> attempts to cancel any query that
+ is in progress on the named connection. Note that this is not
+ certain to succeed (since, for example, the remote query might
+ already have finished). A cancel request simply improves the
+ odds that the query will fail soon. You must still complete the
+ normal query protocol, for example by calling
+ <function>dblink_get_result</>.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Arguments</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><parameter>connname</parameter></term>
+ <listitem>
+ <para>
+ Name of the connection to use.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Return Value</title>
+
+ <para>
+ Returns <literal>OK</> if the cancel request has been sent, or
+ the text of an error message on failure.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Example</title>
+
+ <programlisting>
+ SELECT dblink_cancel_query('dtest1');
+ </programlisting>
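+
+ <para>
+ A minimal sketch of the full sequence, assuming the connection
+ <literal>dtest1</> and the table <literal>foo</> used in the
+ <function>dblink_send_query</> example. The cancel request is sent
+ while the asynchronous query might still be running, and the outcome
+ is then collected with <function>dblink_get_result</> before the
+ connection is reused. Whether that call returns rows or reports an
+ error depends on whether the cancel request arrived in time.
+ </para>
+
+ <programlisting>
+ -- start an asynchronous query on the named connection
+ SELECT dblink_send_query('dtest1', 'select * from foo');
+
+ -- ask the remote server to cancel it; this is only a request
+ SELECT dblink_cancel_query('dtest1');
+
+ -- still required: collect the outcome of the query that was sent
+ SELECT * FROM dblink_get_result('dtest1') AS t1(f1 int, f2 text, f3 text[]);
+ </programlisting>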
+ </refsect1>
+ </refentry>
+
+ <refentry id="CONTRIB-DBLINK-CURRENT-QUERY">
+ <refnamediv>
+ <refname>dblink_current_query</refname>
+ <refpurpose>returns the current query string</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+ <synopsis>
+ dblink_current_query() returns text
+ </synopsis>
+ </refsynopsisdiv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ Returns the currently executing interactive command string of the
+ local database session, or NULL if it can't be determined. Note
+ that this function is not really related to <filename>dblink</>'s
+ other functionality. It is provided since it is sometimes useful
+ in generating queries to be forwarded to remote databases.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Return Value</title>
+
+ <para>Returns a copy of the currently executing query string.</para>
+ </refsect1>
+
+ <refsect1>
+ <title>Example</title>
+
+ <programlisting>
+test=# select dblink_current_query();
+ dblink_current_query
+--------------------------------
+ select dblink_current_query();
+(1 row)
+ </programlisting>
+ </refsect1>
+ </refentry>
+
+ <refentry id="CONTRIB-DBLINK-GET-PKEY">
+ <refnamediv>
+ <refname>dblink_get_pkey</refname>
+ <refpurpose>returns the positions and field names of a relation's
+ primary key fields
+ </refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+ <synopsis>
+ dblink_get_pkey(text relname) returns setof dblink_pkey_results
+ </synopsis>
+ </refsynopsisdiv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <function>dblink_get_pkey</> provides information about the primary
+ key of a relation in the local database. This is sometimes useful
+ in generating queries to be sent to remote databases.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Arguments</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><parameter>relname</parameter></term>
+ <listitem>
+ <para>
+ Name of a local relation, for example <literal>foo</> or
+ <literal>myschema.mytab</>. Include double quotes if the
+ name is mixed-case or contains special characters, for
+ example <literal>"FooBar"</>; without quotes, the string
+ will be folded to lower case.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Return Value</title>
+
+ <para>
+ Returns one row for each primary key field, or no rows if the relation
+ has no primary key. The result rowtype is defined as
+
+ <programlisting>
+CREATE TYPE dblink_pkey_results AS (position int, colname text);
+ </programlisting>
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Example</title>
+
+ <programlisting>
+test=# create table foobar(f1 int, f2 int, f3 int,
+test(# primary key(f1,f2,f3));
+CREATE TABLE
+test=# select * from dblink_get_pkey('foobar');
+ position | colname
+----------+---------
+ 1 | f1
+ 2 | f2
+ 3 | f3
+(3 rows)
+ </programlisting>
+ </refsect1>
+ </refentry>
+
+ <refentry id="CONTRIB-DBLINK-BUILD-SQL-INSERT">
+ <refnamediv>
+ <refname>dblink_build_sql_insert</refname>
+ <refpurpose>
+ builds an INSERT statement using a local tuple, replacing the
+ primary key field values with alternative supplied values
+ </refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+ <synopsis>
+ dblink_build_sql_insert(text relname,
+ int2vector primary_key_attnums,
+ int2 num_primary_key_atts,
+ text[] src_pk_att_vals_array,
+ text[] tgt_pk_att_vals_array) returns text
+ </synopsis>
+ </refsynopsisdiv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <function>dblink_build_sql_insert</> can be useful in doing selective
+ replication of a local table to a remote database. It selects a row
+ from the local table based on primary key, and then builds a SQL
+ <command>INSERT</> command that will duplicate that row, but with
+ the primary key values replaced by the values in the last argument.
+ (To make an exact copy of the row, just specify the same values for
+ the last two arguments.)
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Arguments</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><parameter>relname</parameter></term>
+ <listitem>
+ <para>
+ Name of a local relation, for example <literal>foo</> or
+ <literal>myschema.mytab</>. Include double quotes if the
+ name is mixed-case or contains special characters, for
+ example <literal>"FooBar"</>; without quotes, the string
+ will be folded to lower case.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>primary_key_attnums</parameter></term>
+ <listitem>
+ <para>
+ Attribute numbers (1-based) of the primary key fields,
+ for example <literal>1 2</>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>num_primary_key_atts</parameter></term>
+ <listitem>
+ <para>
+ The number of primary key fields.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>src_pk_att_vals_array</parameter></term>
+ <listitem>
+ <para>
+ Values of the primary key fields to be used to look up the
+ local tuple. Each field is represented in text form.
+ An error is thrown if there is no local row with these
+ primary key values.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>tgt_pk_att_vals_array</parameter></term>
+ <listitem>
+ <para>
+ Values of the primary key fields to be placed in the resulting
+ <command>INSERT</> command. Each field is represented in text form.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Return Value</title>
+
+ <para>Returns the requested SQL statement as text.</para>
+ </refsect1>
+
+ <refsect1>
+ <title>Example</title>
+
+ <programlisting>
+ test=# select dblink_build_sql_insert('foo', '1 2', 2, '{"1", "a"}', '{"1", "b''a"}');
+ dblink_build_sql_insert
+ --------------------------------------------------
+ INSERT INTO foo(f1,f2,f3) VALUES('1','b''a','1')
+ (1 row)
+ </programlisting>
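+
+ <para>
+ The generated statement is ordinary text, so it can be handed to
+ <function>dblink_exec</> to run it on the remote side. A minimal sketch,
+ assuming an open connection named <literal>dtest1</> (an illustrative
+ name) whose remote database contains a compatible table
+ <literal>foo</>:
+ </para>
+
+ <programlisting>
+ -- copy the local row with key (1, a) to the remote table unchanged
+ SELECT dblink_exec('dtest1',
+        dblink_build_sql_insert('foo', '1 2', 2, '{"1", "a"}', '{"1", "a"}'));
+ </programlisting>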
+ </refsect1>
+ </refentry>
+
+ <refentry id="CONTRIB-DBLINK-BUILD-SQL-DELETE">
+ <refnamediv>
+ <refname>dblink_build_sql_delete</refname>
+ <refpurpose>builds a DELETE statement using supplied values for primary
+ key field values
+ </refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+ <synopsis>
+ dblink_build_sql_delete(text relname,
+ int2vector primary_key_attnums,
+ int2 num_primary_key_atts,
+ text[] tgt_pk_att_vals_array) returns text
+ </synopsis>
+ </refsynopsisdiv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <function>dblink_build_sql_delete</> can be useful in doing selective
+ replication of a local table to a remote database. It builds a SQL
+ <command>DELETE</> command that will delete the row with the given
+ primary key values.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Arguments</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><parameter>relname</parameter></term>
+ <listitem>
+ <para>
+ Name of a local relation, for example <literal>foo</> or
+ <literal>myschema.mytab</>. Include double quotes if the
+ name is mixed-case or contains special characters, for
+ example <literal>"FooBar"</>; without quotes, the string
+ will be folded to lower case.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>primary_key_attnums</parameter></term>
+ <listitem>
+ <para>
+ Attribute numbers (1-based) of the primary key fields,
+ for example <literal>1 2</>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>num_primary_key_atts</parameter></term>
+ <listitem>
+ <para>
+ The number of primary key fields.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>tgt_pk_att_vals_array</parameter></term>
+ <listitem>
+ <para>
+ Values of the primary key fields to be used in the resulting
+ <command>DELETE</> command. Each field is represented in text form.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Return Value</title>
+
+ <para>Returns the requested SQL statement as text.</para>
+ </refsect1>
+
+ <refsect1>
+ <title>Example</title>
+
+ <programlisting>
+ test=# select dblink_build_sql_delete('"MyFoo"', '1 2', 2, '{"1", "b"}');
+ dblink_build_sql_delete
+ ---------------------------------------------
+ DELETE FROM "MyFoo" WHERE f1='1' AND f2='b'
+ (1 row)
+ </programlisting>
+ </refsect1>
+ </refentry>
+
+ <refentry id="CONTRIB-DBLINK-BUILD-SQL-UPDATE">
+ <refnamediv>
+ <refname>dblink_build_sql_update</refname>
+ <refpurpose>builds an UPDATE statement using a local tuple, replacing
+ the primary key field values with alternative supplied values
+ </refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+ <synopsis>
+ dblink_build_sql_update(text relname,
+ int2vector primary_key_attnums,
+ int2 num_primary_key_atts,
+ text[] src_pk_att_vals_array,
+ text[] tgt_pk_att_vals_array) returns text
+ </synopsis>
+ </refsynopsisdiv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <function>dblink_build_sql_update</> can be useful in doing selective
+ replication of a local table to a remote database. It selects a row
+ from the local table based on primary key, and then builds a SQL
+ <command>UPDATE</> command that will duplicate that row, but with
+ the primary key values replaced by the values in the last argument.
+ (To make an exact copy of the row, just specify the same values for
+ the last two arguments.) The <command>UPDATE</> command always assigns
+ all fields of the row; the main difference between this and
+ <function>dblink_build_sql_insert</> is that it's assumed that
+ the target row already exists in the remote table.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Arguments</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><parameter>relname</parameter></term>
+ <listitem>
+ <para>
+ Name of a local relation, for example <literal>foo</> or
+ <literal>myschema.mytab</>. Include double quotes if the
+ name is mixed-case or contains special characters, for
+ example <literal>"FooBar"</>; without quotes, the string
+ will be folded to lower case.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>primary_key_attnums</parameter></term>
+ <listitem>
+ <para>
+ Attribute numbers (1-based) of the primary key fields,
+ for example <literal>1 2</>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>num_primary_key_atts</parameter></term>
+ <listitem>
+ <para>
+ The number of primary key fields.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>src_pk_att_vals_array</parameter></term>
+ <listitem>
+ <para>
+ Values of the primary key fields to be used to look up the
+ local tuple. Each field is represented in text form.
+ An error is thrown if there is no local row with these
+ primary key values.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><parameter>tgt_pk_att_vals_array</parameter></term>
+ <listitem>
+ <para>
+ Values of the primary key fields to be placed in the resulting
+ <command>UPDATE</> command. Each field is represented in text form.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect1>
+
+ <refsect1>
+ <title>Return Value</title>
+
+ <para>Returns the requested SQL statement as text.</para>
+ </refsect1>
+
+ <refsect1>
+ <title>Example</title>
+
+ <programlisting>
+ test=# select dblink_build_sql_update('foo', '1 2', 2, '{"1", "a"}', '{"1", "b"}');
+ dblink_build_sql_update
+ -------------------------------------------------------------
+ UPDATE foo SET f1='1',f2='b',f3='1' WHERE f1='1' AND f2='b'
+ (1 row)
+ </programlisting>
+ </refsect1>
+ </refentry>
+
</sect1>
+<!-- $PostgreSQL: pgsql/doc/src/sgml/dict-int.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
+
<sect1 id="dict-int">
<title>dict_int</title>
-
+
<indexterm zone="dict-int">
<primary>dict_int</primary>
</indexterm>
<para>
- The motivation for this example dictionary is to control the indexing of
- integers (signed and unsigned), and, consequently, to minimize the number of
- unique words which greatly affect the performance of searching.
+ <filename>dict_int</> is an example of an add-on dictionary template
+ for full-text search. The motivation for this example dictionary is to
+ control the indexing of integers (signed and unsigned), allowing such
+ numbers to be indexed while preventing excessive growth in the number of
+ unique words, which greatly affects the performance of searching.
</para>
<sect2>
<title>Configuration</title>
+
<para>
- The dictionary accepts two options:
+ The dictionary accepts two options:
</para>
<itemizedlist>
<listitem>
<para>
- The MAXLEN parameter specifies the maximum length (number of digits)
- allowed in an integer word. The default value is 6.
+ The <literal>maxlen</> parameter specifies the maximum number of
+ digits allowed in an integer word. The default value is 6.
</para>
</listitem>
<listitem>
<para>
- The REJECTLONG parameter specifies if an overlength integer should be
- truncated or ignored. If REJECTLONG=FALSE (default), the dictionary returns
- the first MAXLEN digits of the integer. If REJECTLONG=TRUE, the
- dictionary treats an overlength integer as a stop word, so that it will
- not be indexed.
+ The <literal>rejectlong</> parameter specifies whether an overlength
+ integer should be truncated or ignored. If <literal>rejectlong</> is
+ <literal>false</> (the default), the dictionary returns the first
+ <literal>maxlen</> digits of the integer. If <literal>rejectlong</> is
+ <literal>true</>, the dictionary treats an overlength integer as a stop
+ word, so that it will not be indexed. Note that this also means that
+ such an integer cannot be searched for.
</para>
</listitem>
</itemizedlist>
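+
+ <para>
+ As a sketch of how these options are set, assuming the module's
+ installation script has created a dictionary named
+ <literal>intdict</> (substitute the name of your own dictionary
+ if it differs):
+ </para>
+
+<programlisting>
+-- keep at most 4 digits, and drop longer integers instead of truncating them
+ALTER TEXT SEARCH DICTIONARY intdict (MAXLEN = 4, REJECTLONG = true);
+
+-- inspect how the dictionary now treats a long integer
+SELECT ts_lexize('intdict', '12345678');
+</programlisting>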
+<!-- $PostgreSQL: pgsql/doc/src/sgml/dict-xsyn.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
+
<sect1 id="dict-xsyn">
<title>dict_xsyn</title>
-
+
<indexterm zone="dict-xsyn">
<primary>dict_xsyn</primary>
</indexterm>
<para>
- The Extended Synonym Dictionary module replaces words with groups of their
- synonyms, and so makes it possible to search for a word using any of its
- synonyms.
+ <filename>dict_xsyn</> (Extended Synonym Dictionary) is an example of an
+ add-on dictionary template for full-text search. This dictionary type
+ replaces words with groups of their synonyms, and so makes it possible to
+ search for a word using any of its synonyms.
</para>
<sect2>
<title>Configuration</title>
+
<para>
A <literal>dict_xsyn</> dictionary accepts the following options:
</para>
<itemizedlist>
<listitem>
<para>
- KEEPORIG controls whether the original word is included, or only its
- synonyms. Default is 'true'.
+ <literal>keeporig</> controls whether the original word is included (if
+ <literal>true</>), or only its synonyms (if <literal>false</>). Default
+ is <literal>true</>.
</para>
</listitem>
<listitem>
<para>
- RULES is the base name of the file containing the list of synonyms.
- This file must be in $(prefix)/share/tsearch_data/, and its name must
- end in ".rules" (which is not included in the RULES parameter).
+ <literal>rules</> is the base name of the file containing the list of
+ synonyms. This file must be stored in
+ <filename>$SHAREDIR/tsearch_data/</> (where <literal>$SHAREDIR</> means
+ the <productname>PostgreSQL</> installation's shared-data directory).
+ Its name must end in <literal>.rules</> (which is not to be included in
+ the <literal>rules</> parameter).
</para>
</listitem>
</itemizedlist>
<listitem>
<para>
Each line represents a group of synonyms for a single word, which is
- given first on the line. Synonyms are separated by whitespace:
- </para>
+ given first on the line. Synonyms are separated by whitespace, thus:
<programlisting>
word syn1 syn2 syn3
</programlisting>
+ </para>
</listitem>
<listitem>
<para>
- Sharp ('#') sign is a comment delimiter. It may appear at any position
- inside the line. The rest of the line will be skipped.
+ The sharp (<literal>#</>) sign is a comment delimiter. It may appear at
+ any position in a line. The rest of the line will be skipped.
</para>
</listitem>
</itemizedlist>
<para>
- Look at xsyn_sample.rules, which is installed in $(prefix)/share/tsearch_data/,
- for an example.
+ Look at <filename>xsyn_sample.rules</>, which is installed in
+ <filename>$SHAREDIR/tsearch_data/</>, for an example.
</para>
</sect2>
<sect2>
<title>Usage</title>
- <programlisting>
-mydb=# SELECT ts_lexize('xsyn','word');
-ts_lexize
-----------------
-{word,syn1,syn2,syn3)
- </programlisting>
+
<para>
- Change dictionary options:
- </para>
- <programlisting>
-mydb# ALTER TEXT SEARCH DICTIONARY xsyn (KEEPORIG=false);
+ Running the installation script creates a text search template
+ <literal>xsyn_template</> and a dictionary <literal>xsyn</>
+ based on it, with default parameters. You can alter the
+ parameters, for example
+
+<programlisting>
+mydb=# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=false);
ALTER TEXT SEARCH DICTIONARY
- </programlisting>
+</programlisting>
+
+ or create new dictionaries based on the template.
+ </para>
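+
+ <para>
+ For the second option, a minimal sketch of creating a separate
+ dictionary from the template. The name <literal>my_xsyn</> is only an
+ illustration, and <literal>my_rules</> stands for a file
+ <filename>my_rules.rules</> in the directory described above:
+
+<programlisting>
+CREATE TEXT SEARCH DICTIONARY my_xsyn (
+    TEMPLATE = xsyn_template,
+    RULES = 'my_rules',
+    KEEPORIG = false
+);
+</programlisting>
+ </para>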
+
+ <para>
+ To test the dictionary, you can try
+
+<programlisting>
+mydb=# SELECT ts_lexize('xsyn', 'word');
+ ts_lexize
+-----------------------
+ {word,syn1,syn2,syn3}
+</programlisting>
+
+ but real-world usage will involve including it in a text search
+ configuration as described in <xref linkend="textsearch">.
+ That might look like this:
+
+<programlisting>
+ALTER TEXT SEARCH CONFIGURATION english
+ ALTER MAPPING FOR word, asciiword WITH xsyn, english_stem;
+</programlisting>
+
+ </para>
</sect2>
</sect1>
+<!-- $PostgreSQL: pgsql/doc/src/sgml/earthdistance.sgml,v 1.3 2007/12/06 04:12:10 tgl Exp $ -->
+
<sect1 id="earthdistance">
<title>earthdistance</title>
-
+
<indexterm zone="earthdistance">
<primary>earthdistance</primary>
</indexterm>
<para>
- This module contains two different approaches to calculating
- great circle distances on the surface of the Earth. The one described
- first depends on the contrib/cube package (which MUST be installed before
- earthdistance is installed). The second one is based on the point
- datatype using latitude and longitude for the coordinates. The install
- script makes the defined functions executable by anyone.
- </para>
- <para>
- A spherical model of the Earth is used.
- </para>
- <para>
- Data is stored in cubes that are points (both corners are the same) using 3
- coordinates representing the distance from the center of the Earth.
- </para>
- <para>
- The radius of the Earth is obtained from the earth() function. It is
- given in meters. But by changing this one function you can change it
- to use some other units or to use a different value of the radius
- that you feel is more appropiate.
- </para>
- <para>
- This package also has applications to astronomical databases as well.
- Astronomers will probably want to change earth() to return a radius of
- 180/pi() so that distances are in degrees.
- </para>
- <para>
- Functions are provided to allow for input in latitude and longitude (in
- degrees), to allow for output of latitude and longitude, to calculate
- the great circle distance between two points and to easily specify a
- bounding box usable for index searches.
- </para>
- <para>
- The functions are all 'sql' functions. If you want to make these functions
- executable by other people you will also have to make the referenced
- cube functions executable. cube(text), cube(float8), cube(cube,float8),
- cube_distance(cube,cube), cube_ll_coord(cube,int) and
- cube_enlarge(cube,float8,int) are used indirectly by the earth distance
- functions. is_point(cube) and cube_dim(cube) are used in constraints for data
- in domain earth. cube_ur_coord(cube,int) is used in the regression tests and
- might be useful for looking at bounding box coordinates in user applications.
- </para>
- <para>
- A domain of type cube named earth is defined.
- There are constraints on it defined to make sure the cube is a point,
- that it does not have more than 3 dimensions and that it is very near
- the surface of a sphere centered about the origin with the radius of
- the Earth.
- </para>
- <para>
- The following functions are provided:
+ The <filename>earthdistance</> module provides two different approaches to
+ calculating great circle distances on the surface of the Earth. The one
+ described first depends on the <filename>cube</> package (which
+ <emphasis>must</> be installed before <filename>earthdistance</> can be
+ installed). The second one is based on the built-in <type>point</> datatype,
+ using longitude and latitude for the coordinates.
</para>
- <table id="earthdistance-functions">
- <title>EarthDistance functions</title>
- <tgroup cols="2">
- <tbody>
- <row>
- <entry><literal>earth()</literal></entry>
- <entry>returns the radius of the Earth in meters.</entry>
- </row>
- <row>
- <entry><literal>sec_to_gc(float8)</literal></entry>
- <entry>converts the normal straight line
- (secant) distance between between two points on the surface of the Earth
- to the great circle distance between them.
- </entry>
- </row>
- <row>
- <entry><literal>gc_to_sec(float8)</literal></entry>
- <entry>Converts the great circle distance
- between two points on the surface of the Earth to the normal straight line
- (secant) distance between them.
- </entry>
- </row>
- <row>
- <entry><literal>ll_to_earth(float8, float8)</literal></entry>
- <entry>Returns the location of a point on the surface of the Earth given
- its latitude (argument 1) and longitude (argument 2) in degrees.
- </entry>
- </row>
- <row>
- <entry><literal>latitude(earth)</literal></entry>
- <entry>Returns the latitude in degrees of a point on the surface of the
- Earth.
- </entry>
- </row>
- <row>
- <entry><literal>longitude(earth)</literal></entry>
- <entry>Returns the longitude in degrees of a point on the surface of the
- Earth.
- </entry>
- </row>
- <row>
- <entry><literal>earth_distance(earth, earth)</literal></entry>
- <entry>Returns the great circle distance between two points on the
- surface of the Earth.
- </entry>
- </row>
- <row>
- <entry><literal>earth_box(earth, float8)</literal></entry>
- <entry>Returns a box suitable for an indexed search using the cube @>
- operator for points within a given great circle distance of a location.
- Some points in this box are further than the specified great circle
- distance from the location so a second check using earth_distance
- should be made at the same time.
- </entry>
- </row>
- <row>
- <entry><literal><@></literal> operator</entry>
- <entry>gives the distance in statute miles between
- two points on the Earth's surface. Coordinates are in degrees. Points are
- taken as (longitude, latitude) and not vice versa as longitude is closer
- to the intuitive idea of x-axis and latitude to y-axis.
- </entry>
- </row>
- </tbody>
- </tgroup>
- </table>
<para>
- One advantage of using cube representation over a point using latitude and
- longitude for coordinates, is that you don't have to worry about special
- conditions at +/- 180 degrees of longitude or near the poles.
+ In this module, the Earth is assumed to be perfectly spherical.
+ (If that's too inaccurate for you, you might want to look at the
+ <application><ulink url="http://www.postgis.org/">PostGIS</ulink></>
+ project.)
</para>
-</sect1>
+ <sect2>
+ <title>Cube-based earth distances</title>
+
+ <para>
+ Data is stored in cubes that are points (both corners are the same) using 3
+ coordinates representing the x, y, and z distance from the center of the
+ Earth. A domain <type>earth</> over <type>cube</> is provided, which
+ includes constraint checks that the value meets these restrictions and
+ is reasonably close to the actual surface of the Earth.
+ </para>
+
+ <para>
+ The radius of the Earth is obtained from the <function>earth()</>
+ function. It is given in meters. But by changing this one function you can
+ change the module to use some other units, or to use a different value of
+ the radius that you feel is more appropriate.
+ </para>
+
+ <para>
+ This package has applications to astronomical databases as well.
+ Astronomers will probably want to change <function>earth()</> to return a
+ radius of <literal>180/pi()</> so that distances are in degrees.
+ </para>
+
+ <para>
+ Functions are provided to support input in latitude and longitude (in
+ degrees), to support output of latitude and longitude, to calculate
+ the great circle distance between two points and to easily specify a
+ bounding box usable for index searches.
+ </para>
+
+ <para>
+ The following functions are provided:
+ </para>
+
+ <table id="earthdistance-cube-functions">
+ <title>Cube-based earthdistance functions</title>
+ <tgroup cols="3">
+ <thead>
+ <row>
+ <entry>Function</entry>
+ <entry>Returns</entry>
+ <entry>Description</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry><function>earth()</function></entry>
+ <entry><type>float8</type></entry>
+ <entry>Returns the assumed radius of the Earth.</entry>
+ </row>
+ <row>
+ <entry><function>sec_to_gc(float8)</function></entry>
+ <entry><type>float8</type></entry>
+ <entry>Converts the normal straight line
+ (secant) distance between two points on the surface of the Earth
+ to the great circle distance between them.
+ </entry>
+ </row>
+ <row>
+ <entry><function>gc_to_sec(float8)</function></entry>
+ <entry><type>float8</type></entry>
+ <entry>Converts the great circle distance between two points on the
+ surface of the Earth to the normal straight line (secant) distance
+ between them.
+ </entry>
+ </row>
+ <row>
+ <entry><function>ll_to_earth(float8, float8)</function></entry>
+ <entry><type>earth</type></entry>
+ <entry>Returns the location of a point on the surface of the Earth given
+ its latitude (argument 1) and longitude (argument 2) in degrees.
+ </entry>
+ </row>
+ <row>
+ <entry><function>latitude(earth)</function></entry>
+ <entry><type>float8</type></entry>
+ <entry>Returns the latitude in degrees of a point on the surface of the
+ Earth.
+ </entry>
+ </row>
+ <row>
+ <entry><function>longitude(earth)</function></entry>
+ <entry><type>float8</type></entry>
+ <entry>Returns the longitude in degrees of a point on the surface of the
+ Earth.
+ </entry>
+ </row>
+ <row>
+ <entry><function>earth_distance(earth, earth)</function></entry>
+ <entry><type>float8</type></entry>
+ <entry>Returns the great circle distance between two points on the
+ surface of the Earth.
+ </entry>
+ </row>
+ <row>
+ <entry><function>earth_box(earth, float8)</function></entry>
+ <entry><type>cube</type></entry>
+ <entry>Returns a box suitable for an indexed search using the cube
+ <literal>@></>
+ operator for points within a given great circle distance of a location.
+ Some points in this box are further than the specified great circle
+ distance from the location, so a second check using
+ <function>earth_distance</> should be included in the query.
+ </entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
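+
+ <para>
+ A sketch of how these functions combine, assuming a hypothetical table
+ <literal>places(name text, lat float8, lon float8)</> that is not part
+ of the module. The <function>earth_box</> test is the part that can
+ benefit from an index, if a suitable one exists on
+ <literal>ll_to_earth(lat, lon)</>, and the <function>earth_distance</>
+ test then removes the points that lie in the box but beyond the radius:
+ </para>
+
+<programlisting>
+-- great circle distance, in meters by default, between two lat/long positions
+SELECT earth_distance(ll_to_earth(40.7128, -74.0060),
+                      ll_to_earth(51.5074, -0.1278));
+
+-- places within 10 km of a location
+SELECT name
+  FROM places
+ WHERE earth_box(ll_to_earth(51.5074, -0.1278), 10000) @> ll_to_earth(lat, lon)
+   AND earth_distance(ll_to_earth(51.5074, -0.1278),
+                      ll_to_earth(lat, lon)) < 10000;
+</programlisting>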
+
+ </sect2>
+
+ <sect2>
+ <title>Point-based earth distances</title>
+
+ <para>
+ The second part of the module relies on representing Earth locations as
+ values of type <type>point</>, in which the first component is taken to
+ represent longitude in degrees, and the second component is taken to
+ represent latitude in degrees. Points are taken as (longitude, latitude)
+ and not vice versa because longitude is closer to the intuitive idea of
+ x-axis and latitude to y-axis.
+ </para>
+
+ <para>
+ A single operator is provided:
+ </para>
+
+ <table id="earthdistance-point-operators">
+ <title>Point-based earthdistance operators</title>
+ <tgroup cols="3">
+ <thead>
+ <row>
+ <entry>Operator</entry>
+ <entry>Returns</entry>
+ <entry>Description</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry><type>point</> <literal><@></literal> <type>point</></entry>
+ <entry><type>float8</type></entry>
+ <entry>Gives the distance in statute miles between
+ two points on the Earth's surface.
+ </entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ <para>
+ Note that unlike the <type>cube</>-based part of the module, units
+ are hardwired here: changing the <function>earth()</> function will
+ not affect the results of this operator.
+ </para>
+
+ <para>
+ One disadvantage of the longitude/latitude representation is that
+ you need to be careful about the edge conditions near the poles
+ and near +/- 180 degrees of longitude. The <type>cube</>-based
+ representation avoids these discontinuities.
+ </para>
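+
+ <para>
+ A minimal sketch of the operator in use, with illustrative coordinates.
+ Note the (longitude, latitude) order within each point:
+ </para>
+
+<programlisting>
+-- distance in statute miles between two (longitude, latitude) points
+SELECT point(-74.0060, 40.7128) <@> point(-0.1278, 51.5074) AS miles;
+</programlisting>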
+
+ </sect2>
+
+</sect1>
+<!-- $PostgreSQL: pgsql/doc/src/sgml/fuzzystrmatch.sgml,v 1.3 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="fuzzystrmatch">
<title>fuzzystrmatch</title>
-
+
+ <indexterm zone="fuzzystrmatch">
+ <primary>fuzzystrmatch</primary>
+ </indexterm>
+
<para>
- This section describes the fuzzystrmatch module which provides different
+ The <filename>fuzzystrmatch</> module provides several
functions to determine similarities and distance between strings.
</para>
<sect2>
<title>Soundex</title>
+
<para>
- The Soundex system is a method of matching similar sounding names
- (or any words) to the same code. It was initially used by the
- United States Census in 1880, 1900, and 1910, but it has little use
- beyond English names (or the English pronunciation of names), and
- it is not a linguistic tool.
+ The Soundex system is a method of matching similar-sounding names
+ by converting them to the same code. It was initially used by the
+ United States Census in 1880, 1900, and 1910. Note that Soundex
+ is not very useful for non-English names.
</para>
+
<para>
- When comparing two soundex values to determine similarity, the
- difference function reports how close the match is on a scale
- from zero to four, with zero being no match and four being an
- exact match.
+ The <filename>fuzzystrmatch</> module provides two functions
+ for working with Soundex codes:
</para>
+
+ <programlisting>
+ soundex(text) returns text
+ difference(text, text) returns int
+ </programlisting>
+
+ <para>
+ The <function>soundex</> function converts a string to its Soundex code.
+ The <function>difference</> function converts two strings to their Soundex
+ codes and then reports the number of matching code positions. Since
+ Soundex codes have four characters, the result ranges from zero to four,
+ with zero being no match and four being an exact match. (Thus, the
+ function is misnamed; <function>similarity</> would have been
+ a better name.)
+ </para>
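+
+ <para>
+ As a small illustration with two arbitrarily chosen names that share
+ one code, the following query should report the maximum result:
+ </para>
+
+ <programlisting>
+-- both names have the Soundex code A500, so all four positions match
+SELECT soundex('Anne'), soundex('Ann'), difference('Anne', 'Ann');
+ </programlisting>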
+
<para>
- The following are some usage examples:
+ Here are some usage examples:
</para>
+
<programlisting>
SELECT soundex('hello world!');
SELECT * FROM s WHERE soundex(nm) = soundex('john');
-SELECT a.nm, b.nm FROM s a, s b WHERE soundex(a.nm) = soundex(b.nm) AND a.oid <> b.oid;
-
-CREATE FUNCTION text_sx_eq(text, text) RETURNS boolean AS
-'select soundex($1) = soundex($2)'
-LANGUAGE SQL;
-
-CREATE FUNCTION text_sx_lt(text, text) RETURNS boolean AS
-'select soundex($1) < soundex($2)'
-LANGUAGE SQL;
-
-CREATE FUNCTION text_sx_gt(text, text) RETURNS boolean AS
-'select soundex($1) > soundex($2)'
-LANGUAGE SQL;
-
-CREATE FUNCTION text_sx_le(text, text) RETURNS boolean AS
-'select soundex($1) <= soundex($2)'
-LANGUAGE SQL;
-
-CREATE FUNCTION text_sx_ge(text, text) RETURNS boolean AS
-'select soundex($1) >= soundex($2)'
-LANGUAGE SQL;
+SELECT * FROM s WHERE difference(s.nm, 'john') > 2;
+ </programlisting>
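+
+ <para>
+ As a rough illustration of how the two functions relate, the following
+ queries show the codes next to the <function>difference</> result.
+ The names are arbitrary sample inputs chosen for this sketch:
+ </para>
+
+ <programlisting>
+SELECT soundex('Anne'), soundex('Ann'), difference('Anne', 'Ann');
+ soundex | soundex | difference
+---------+---------+------------
+ A500    | A500    |          4
+(1 row)
+
+SELECT soundex('Anne'), soundex('Andrew'), difference('Anne', 'Andrew');
+ soundex | soundex | difference
+---------+---------+------------
+ A500    | A536    |          2
+(1 row)
+
+SELECT soundex('Anne'), soundex('Margaret'), difference('Anne', 'Margaret');
+ soundex | soundex | difference
+---------+---------+------------
+ A500    | M626    |          0
+(1 row)
+ </programlisting>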
+ </sect2>
-CREATE FUNCTION text_sx_ne(text, text) RETURNS boolean AS
-'select soundex($1) <> soundex($2)'
-LANGUAGE SQL;
+ <sect2>
+ <title>Levenshtein</title>
-DROP OPERATOR #= (text, text);
+ <para>
+ This function calculates the Levenshtein distance between two strings:
+ </para>
-CREATE OPERATOR #= (leftarg=text, rightarg=text, procedure=text_sx_eq, commutator = #=);
+ <programlisting>
+ levenshtein(text source, text target) returns int
+ </programlisting>
-SELECT * FROM s WHERE text_sx_eq(nm, 'john');
+ <para>
+ Both <literal>source</literal> and <literal>target</literal> can be any
+ non-null string, with a maximum of 255 characters.
+ </para>
-SELECT * FROM s WHERE s.nm #= 'john';
+ <para>
+ Example:
+ </para>
-SELECT * FROM s WHERE difference(s.nm, 'john') > 2;
+ <programlisting>
+test=# SELECT levenshtein('GUMBO', 'GAMBOL');
+ levenshtein
+-------------
+ 2
+(1 row)
</programlisting>
</sect2>
<sect2>
- <title>levenshtein</title>
+ <title>Metaphone</title>
+
+ <para>
+ Metaphone, like Soundex, is based on the idea of constructing a
+ representative code for an input string. Two strings are then
+ deemed similar if they have the same codes.
+ </para>
+
<para>
- This function calculates the levenshtein distance between two strings:
+ This function calculates the metaphone code of an input string:
</para>
+
<programlisting>
- int levenshtein(text source, text target)
+ metaphone(text source, int max_output_length) returns text
</programlisting>
+
<para>
- Both <literal>source</literal> and <literal>target</literal> can be any
- NOT NULL string with a maximum of 255 characters.
+ <literal>source</literal> has to be a non-null string with a maximum of
+ 255 characters. <literal>max_output_length</literal> sets the maximum
+ length of the output metaphone code; if longer, the output is truncated
+ to this length.
</para>
+
<para>
Example:
</para>
+
<programlisting>
- SELECT levenshtein('GUMBO','GAMBOL');
+test=# SELECT metaphone('GUMBO', 4);
+ metaphone
+-----------
+ KM
+(1 row)
</programlisting>
</sect2>
<sect2>
- <title>metaphone</title>
+ <title>Double Metaphone</title>
+
<para>
- This function calculates and returns the metaphone code of an input string:
+ The Double Metaphone system computes two <quote>sounds like</> strings
+ for a given input string — a <quote>primary</> and an
+ <quote>alternate</>. In most cases they are the same, but for non-English
+ names especially they can be a bit different, depending on pronunciation.
+ These functions compute the primary and alternate codes:
</para>
+
<programlisting>
- text metahpone(text source, int max_output_length)
+ dmetaphone(text source) returns text
+ dmetaphone_alt(text source) returns text
</programlisting>
+
<para>
- <literal>source</literal> has to be a NOT NULL string with a maximum of
- 255 characters. <literal>max_output_length</literal> fixes the maximum
- length of the output metaphone code; if longer, the output is truncated
- to this length.
+ There is no length limit on the input strings.
+ </para>
+
+ <para>
+ Example:
</para>
- <para>Example</para>
+
<programlisting>
- SELECT metaphone('GUMBO',4);
+test=# SELECT dmetaphone('gumbo');
+ dmetaphone
+------------
+ KMP
+(1 row)
</programlisting>
</sect2>
+<!-- $PostgreSQL: pgsql/doc/src/sgml/hstore.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
+
<sect1 id="hstore">
<title>hstore</title>
-
+
<indexterm zone="hstore">
<primary>hstore</primary>
</indexterm>
<para>
- The <literal>hstore</literal> module is usefull for storing (key,value) pairs.
- This module can be useful in different scenarios: case with many attributes
- rarely searched, semistructural data or a lazy DBA.
+ This module implements a data type <type>hstore</> for storing sets of
+ (key,value) pairs within a single <productname>PostgreSQL</> data field.
+ This can be useful in various scenarios, such as rows with many attributes
+ that are rarely examined, or semi-structured data.
</para>
<sect2>
- <title>Operations</title>
- <itemizedlist>
- <listitem>
- <para>
- <literal>hstore -> text</literal> - get value , perl analogy $h{key}
- </para>
- <programlisting>
-select 'a=>q, b=>g'->'a';
- ?
-------
- q
- </programlisting>
- <para>
- Note the use of parenthesis in the select below, because priority of 'is' is
- higher than that of '->':
- </para>
- <programlisting>
-SELECT id FROM entrants WHERE (info->'education_period') IS NOT NULL;
- </programlisting>
- </listitem>
-
- <listitem>
- <para>
- <literal>hstore || hstore</literal> - concatenation, perl analogy %a=( %b, %c );
- </para>
- <programlisting>
-regression=# select 'a=>b'::hstore || 'c=>d'::hstore;
- ?column?
---------------------
- "a"=>"b", "c"=>"d"
-(1 row)
- </programlisting>
-
- <para>
- but, notice
- </para>
-
- <programlisting>
-regression=# select 'a=>b'::hstore || 'a=>d'::hstore;
- ?column?
-----------
- "a"=>"d"
-(1 row)
- </programlisting>
- </listitem>
-
- <listitem>
- <para>
- <literal>text => text</literal> - creates hstore type from two text strings
- </para>
- <programlisting>
-select 'a'=>'b';
- ?column?
-----------
- "a"=>"b"
- </programlisting>
- </listitem>
-
- <listitem>
- <para>
- <literal>hstore @> hstore</literal> - contains operation, check if left operand contains right.
- </para>
- <programlisting>
-regression=# select 'a=>b, b=>1, c=>NULL'::hstore @> 'a=>c';
- ?column?
-----------
- f
-(1 row)
-
-regression=# select 'a=>b, b=>1, c=>NULL'::hstore @> 'b=>1';
- ?column?
-----------
- t
-(1 row)
- </programlisting>
- </listitem>
-
- <listitem>
- <para>
- <literal>hstore <@ hstore</literal> - contained operation, check if
- left operand is contained in right
- </para>
- <para>
- (Before PostgreSQL 8.2, the containment operators @> and <@ were
- respectively called @ and ~. These names are still available, but are
- deprecated and will eventually be retired. Notice that the old names
- are reversed from the convention formerly followed by the core geometric
- datatypes!)
- </para>
- </listitem>
- </itemizedlist>
+ <title><type>hstore</> External Representation</title>
+
+ <para>
+ The text representation of an <type>hstore</> value includes zero
+ or more <replaceable>key</> <literal>=></> <replaceable>value</>
+ items, separated by commas. For example:
+
+ <programlisting>
+ k => v
+ foo => bar, baz => whatever
+ "1-a" => "anything at all"
+ </programlisting>
+
+ The order of the items is not considered significant (and may not be
+ reproduced on output). Whitespace between items or around the
+ <literal>=></> sign is ignored. Use double quotes if a key or
+ value includes whitespace, comma, <literal>=</> or <literal>></>.
+ To include a double quote or a backslash in a key or value, precede
+ it with another backslash. (Keep in mind that depending on the
+ setting of <varname>standard_conforming_strings</>, you may need to
+ double backslashes in SQL literal strings.)
+ </para>
+
+ <para>
+ A value (but not a key) can be a SQL NULL. This is represented as
+
+ <programlisting>
+ key => NULL
+ </programlisting>
+
+ The <literal>NULL</> keyword is not case-sensitive. Again, use
+ double quotes if you want the string <literal>null</> to be treated
+ as an ordinary data value.
+ </para>
+
+ <para>
+ Currently, double quotes are always used to surround key and value
+ strings on output, even when this is not strictly necessary.
+ </para>
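+
+ <para>
+ The quoting rules above might be exercised as in the following sketch.
+ The keys and values are made up for illustration, and the output is
+ omitted because the order of items on output is not guaranteed:
+
+ <programlisting>
+-- unquoted key and value
+SELECT 'k => v'::hstore;
+
+-- double quotes are needed for whitespace and commas
+SELECT '"a key" => "a value, with a comma"'::hstore;
+
+-- a is a SQL NULL; b is the ordinary four-character string NULL
+SELECT 'a => NULL, b => "NULL"'::hstore;
+ </programlisting>
+ </para>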
+
</sect2>
<sect2>
- <title>Functions</title>
-
- <itemizedlist>
- <listitem>
- <para>
- <literal>akeys(hstore)</literal> - returns all keys from hstore as array
- </para>
- <programlisting>
-regression=# select akeys('a=>1,b=>2');
- akeys
--------
- {a,b}
- </programlisting>
- </listitem>
-
- <listitem>
- <para>
- <literal>skeys(hstore)</literal> - returns all keys from hstore as strings
- </para>
- <programlisting>
-regression=# select skeys('a=>1,b=>2');
- skeys
--------
- a
- b
- </programlisting>
- </listitem>
-
- <listitem>
- <para>
- <literal>avals(hstore)</literal> - returns all values from hstore as array
- </para>
- <programlisting>
-regression=# select avals('a=>1,b=>2');
- avals
--------
- {1,2}
- </programlisting>
- </listitem>
-
- <listitem>
- <para>
- <literal>svals(hstore)</literal> - returns all values from hstore as
- strings
- </para>
- <programlisting>
-regression=# select svals('a=>1,b=>2');
- svals
--------
- 1
- 2
- </programlisting>
- </listitem>
-
- <listitem>
- <para>
- <literal>delete (hstore,text)</literal> - delete (key,value) from hstore if
- key matches argument.
- </para>
- <programlisting>
-regression=# select delete('a=>1,b=>2','b');
- delete
-----------
- "a"=>"1"
- </programlisting>
- </listitem>
-
- <listitem>
- <para>
- <literal>each(hstore)</literal> - return (key, value) pairs
- </para>
- <programlisting>
-regression=# select * from each('a=>1,b=>2');
- key | value
+ <title><type>hstore</> Operators and Functions</title>
+
+ <table id="hstore-op-table">
+ <title><type>hstore</> Operators</title>
+
+ <tgroup cols="4">
+ <thead>
+ <row>
+ <entry>Operator</entry>
+ <entry>Description</entry>
+ <entry>Example</entry>
+ <entry>Result</entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry><type>hstore</> <literal>-></> <type>text</></entry>
+ <entry>get value for key (null if not present)</entry>
+ <entry><literal>'a=>x, b=>y'::hstore -> 'a'</literal></entry>
+ <entry><literal>x</literal></entry>
+ </row>
+
+ <row>
+ <entry><type>text</> <literal>=></> <type>text</></entry>
+ <entry>make single-item <type>hstore</></entry>
+ <entry><literal>'a' => 'b'</literal></entry>
+ <entry><literal>"a"=>"b"</literal></entry>
+ </row>
+
+ <row>
+ <entry><type>hstore</> <literal>||</> <type>hstore</></entry>
+ <entry>concatenation</entry>
+ <entry><literal>'a=>b, c=>d'::hstore || 'c=>x, d=>q'::hstore</literal></entry>
+ <entry><literal>"a"=>"b", "c"=>"x", "d"=>"q"</literal></entry>
+ </row>
+
+ <row>
+ <entry><type>hstore</> <literal>?</> <type>text</></entry>
+ <entry>does <type>hstore</> contain key?</entry>
+ <entry><literal>'a=>1'::hstore ? 'a'</literal></entry>
+ <entry><literal>t</literal></entry>
+ </row>
+
+ <row>
+ <entry><type>hstore</> <literal>@></> <type>hstore</></entry>
+ <entry>does left operand contain right?</entry>
+ <entry><literal>'a=>b, b=>1, c=>NULL'::hstore @> 'b=>1'</literal></entry>
+ <entry><literal>t</literal></entry>
+ </row>
+
+ <row>
+ <entry><type>hstore</> <literal><@</> <type>hstore</></entry>
+ <entry>is left operand contained in right?</entry>
+ <entry><literal>'a=>c'::hstore <@ 'a=>b, b=>1, c=>NULL'</literal></entry>
+ <entry><literal>f</literal></entry>
+ </row>
+
+ </tbody>
+ </tgroup>
+ </table>
+
+ <para>
+ (Before <productname>PostgreSQL</> 8.2, the containment operators
+ <literal>@></> and <literal><@</> were respectively called
+ <literal>@</> and <literal>~</>. These names are still available, but are
+ deprecated and will eventually be retired. Notice that the old names
+ are reversed from the convention formerly followed by the core geometric
+ data types!)
+ </para>
+
+ <table id="hstore-func-table">
+ <title><type>hstore</> Functions</title>
+
+ <tgroup cols="5">
+ <thead>
+ <row>
+ <entry>Function</entry>
+ <entry>Return Type</entry>
+ <entry>Description</entry>
+ <entry>Example</entry>
+ <entry>Result</entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry><function>akeys(hstore)</function></entry>
+ <entry><type>text[]</type></entry>
+ <entry>get <type>hstore</>'s keys as array</entry>
+ <entry><literal>akeys('a=>1,b=>2')</literal></entry>
+ <entry><literal>{a,b}</literal></entry>
+ </row>
+
+ <row>
+ <entry><function>skeys(hstore)</function></entry>
+ <entry><type>setof text</type></entry>
+ <entry>get <type>hstore</>'s keys as set</entry>
+ <entry><literal>skeys('a=>1,b=>2')</literal></entry>
+ <entry>
+<programlisting>
+a
+b
+</programlisting></entry>
+ </row>
+
+ <row>
+ <entry><function>avals(hstore)</function></entry>
+ <entry><type>text[]</type></entry>
+ <entry>get <type>hstore</>'s values as array</entry>
+ <entry><literal>avals('a=>1,b=>2')</literal></entry>
+ <entry><literal>{1,2}</literal></entry>
+ </row>
+
+ <row>
+ <entry><function>svals(hstore)</function></entry>
+ <entry><type>setof text</type></entry>
+ <entry>get <type>hstore</>'s values as set</entry>
+ <entry><literal>svals('a=>1,b=>2')</literal></entry>
+ <entry>
+<programlisting>
+1
+2
+</programlisting></entry>
+ </row>
+
+ <row>
+ <entry><function>each(hstore)</function></entry>
+ <entry><type>setof (key text, value text)</type></entry>
+ <entry>get <type>hstore</>'s keys and values as set</entry>
+ <entry><literal>select * from each('a=>1,b=>2')</literal></entry>
+ <entry>
+<programlisting>
+ key | value
-----+-------
a | 1
b | 2
- </programlisting>
- </listitem>
-
- <listitem>
- <para>
- <literal>exist (hstore,text)</literal>
- </para>
- <para>
- <literal>hstore ? text</literal> - returns 'true if key is exists in hstore
- and false otherwise.
- </para>
- <programlisting>
-regression=# select exist('a=>1','a'), 'a=>1' ? 'a';
- exist | ?column?
--------+----------
- t | t
- </programlisting>
- </listitem>
-
- <listitem>
- <para>
- <literal>defined (hstore,text)</literal> - returns true if key is exists in
- hstore and its value is not NULL.
- </para>
- <programlisting>
-regression=# select defined('a=>NULL','a');
- defined
----------
- f
- </programlisting>
- </listitem>
- </itemizedlist>
+</programlisting></entry>
+ </row>
+
+ <row>
+ <entry><function>exist(hstore,text)</function></entry>
+ <entry><type>boolean</type></entry>
+ <entry>does <type>hstore</> contain key?</entry>
+ <entry><literal>exist('a=>1','a')</literal></entry>
+ <entry><literal>t</literal></entry>
+ </row>
+
+ <row>
+ <entry><function>defined(hstore,text)</function></entry>
+ <entry><type>boolean</type></entry>
+ <entry>does <type>hstore</> contain non-null value for key?</entry>
+ <entry><literal>defined('a=>NULL','a')</literal></entry>
+ <entry><literal>f</literal></entry>
+ </row>
+
+ <row>
+ <entry><function>delete(hstore,text)</function></entry>
+ <entry><type>hstore</type></entry>
+ <entry>delete any item matching key</entry>
+ <entry><literal>delete('a=>1,b=>2','b')</literal></entry>
+ <entry><literal>"a"=>"1"</literal></entry>
+ </row>
+
+ </tbody>
+ </tgroup>
+ </table>
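+
+ <para>
+ A few of these operators and functions might be combined in ordinary
+ queries as sketched below. The table <literal>testhstore</> with an
+ <type>hstore</> column <literal>h</>, and the key names
+ <literal>line</> and <literal>query</>, are assumed here purely for
+ illustration:
+
+ <programlisting>
+-- fetch one value from each row that has the key
+SELECT h -> 'line' AS line FROM testhstore WHERE h ? 'line';
+
+-- parentheses make the grouping explicit when testing the result for null
+SELECT count(*) FROM testhstore WHERE (h -> 'query') IS NOT NULL;
+
+-- list each row's keys as an array
+SELECT akeys(h) FROM testhstore;
+ </programlisting>
+ </para>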
</sect2>
<sect2>
- <title>Indices</title>
+ <title>Indexes</title>
+
<para>
- Module provides index support for '@>' and '?' operations.
+ <type>hstore</> has index support for the <literal>@></> and
+ <literal>?</> operators. You can use either the GiST or the GIN index
+ type. For example:
</para>
<programlisting>
CREATE INDEX hidx ON testhstore USING GIST(h);
+
CREATE INDEX hidx ON testhstore USING GIN(h);
</programlisting>
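+
+ <para>
+ Queries of the following shape are the kind such an index can assist.
+ This is only a sketch: the key names are hypothetical, and whether the
+ planner actually chooses the index depends on table size and statistics.
+ </para>
+ <programlisting>
+-- containment test
+SELECT * FROM testhstore WHERE h @> 'line=>1';
+
+-- key-existence test
+SELECT * FROM testhstore WHERE h ? 'query';
+ </programlisting>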
</sect2>
<title>Examples</title>
<para>
- Add a key:
+ Add a key, or update an existing key with a new value:
</para>
<programlisting>
-UPDATE tt SET h=h||'c=>3';
+UPDATE tab SET h = h || ('c' => '3');
</programlisting>
+
<para>
Delete a key:
</para>
<programlisting>
-UPDATE tt SET h=delete(h,'k1');
+UPDATE tab SET h = delete(h, 'k1');
</programlisting>
</sect2>
<sect2>
<title>Statistics</title>
+
<para>
-hstore type, because of its intrinsic liberality, could contain a lot of
-different keys. Checking for valid keys is the task of application.
-Examples below demonstrate several techniques how to check keys statistics.
+ Because the <type>hstore</> type places no restrictions on key names, a
+ column can accumulate a lot of different keys. Checking for valid keys is
+ the task of the application. The examples below demonstrate several
+ techniques for checking keys and obtaining statistics.
</para>
<para>
- Simple example
+ Simple example:
</para>
<programlisting>
-SELECT * FROM each('aaa=>bq, b=>NULL, ""=>1 ');
+SELECT * FROM each('aaa=>bq, b=>NULL, ""=>1');
</programlisting>
<para>
- Using table
+ Using a table:
</para>
<programlisting>
-SELECT (each(h)).key, (each(h)).value INTO stat FROM testhstore ;
+SELECT (each(h)).key, (each(h)).value INTO stat FROM testhstore;
</programlisting>
- <para>Online stat</para>
+ <para>
+ Online statistics:
+ </para>
<programlisting>
-SELECT key, count(*) FROM (SELECT (each(h)).key FROM testhstore) AS stat GROUP BY key ORDER BY count DESC, key;
- key | count
+SELECT key, count(*) FROM
+ (SELECT (each(h)).key FROM testhstore) AS stat
+ GROUP BY key
+ ORDER BY count DESC, key;
+ key | count
-----------+-------
line | 883
query | 207
<sect2>
<title>Authors</title>
+
<para>
Oleg Bartunov <email>oleg@sai.msu.su</email>, Moscow, Moscow University, Russia
</para>
+
<para>
- Teodor Sigaev <email>teodor@sigaev.ru</email>, Moscow, Delta-Soft Ltd.,Russia
+ Teodor Sigaev <email>teodor@sigaev.ru</email>, Moscow, Delta-Soft Ltd., Russia
</para>
</sect2>
-</sect1>
+</sect1>
+<!-- $PostgreSQL: pgsql/doc/src/sgml/lo.sgml,v 1.3 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="lo">
<title>lo</title>
-
+
<indexterm zone="lo">
<primary>lo</primary>
</indexterm>
<para>
- PostgreSQL type extension for managing Large Objects
+ The <filename>lo</> module provides support for managing Large Objects
+ (also called LOs or BLOBs). This includes a data type <type>lo</>
+ and a trigger <function>lo_manage</>.
</para>
<sect2>
- <title>Overview</title>
+ <title>Rationale</title>
+
<para>
One of the problems with the JDBC driver (and this affects the ODBC driver
- also), is that the specification assumes that references to BLOBS (Binary
- Large OBjectS) are stored within a table, and if that entry is changed, the
+ also), is that the specification assumes that references to BLOBs (Binary
+ Large OBjects) are stored within a table, and if that entry is changed, the
associated BLOB is deleted from the database.
</para>
+
<para>
- As PostgreSQL stands, this doesn't occur. Large objects are treated as
- objects in their own right; a table entry can reference a large object by
- OID, but there can be multiple table entries referencing the same large
- object OID, so the system doesn't delete the large object just because you
- change or remove one such entry.
- </para>
- <para>
- Now this is fine for new PostgreSQL-specific applications, but existing ones
- using JDBC or ODBC won't delete the objects, resulting in orphaning - objects
- that are not referenced by anything, and simply occupy disk space.
+ As <productname>PostgreSQL</> stands, this doesn't occur. Large objects
+ are treated as objects in their own right; a table entry can reference a
+ large object by OID, but there can be multiple table entries referencing
+ the same large object OID, so the system doesn't delete the large object
+ just because you change or remove one such entry.
</para>
- </sect2>
- <sect2>
- <title>The Fix</title>
<para>
- I've fixed this by creating a new data type 'lo', some support functions, and
- a Trigger which handles the orphaning problem. The trigger essentially just
- does a 'lo_unlink' whenever you delete or modify a value referencing a large
- object. When you use this trigger, you are assuming that there is only one
- database reference to any large object that is referenced in a
- trigger-controlled column!
+ Now this is fine for <productname>PostgreSQL</>-specific applications, but
+ standard code using JDBC or ODBC won't delete the objects, resulting in
+ orphan objects — objects that are not referenced by anything, and
+ simply occupy disk space.
</para>
+
<para>
- The 'lo' type was created because we needed to differentiate between plain
- OIDs and Large Objects. Currently the JDBC driver handles this dilemma easily,
- but (after talking to Byron), the ODBC driver needed a unique type. They had
- created an 'lo' type, but not the solution to orphaning.
+ The <filename>lo</> module allows fixing this by attaching a trigger
+ to tables that contain LO reference columns. The trigger essentially just
+ does a <function>lo_unlink</> whenever you delete or modify a value
+ referencing a large object. When you use this trigger, you are assuming
+ that there is only one database reference to any large object that is
+ referenced in a trigger-controlled column!
</para>
+
<para>
- You don't actually have to use the 'lo' type to use the trigger, but it may be
- convenient to use it to keep track of which columns in your database represent
- large objects that you are managing with the trigger.
+ The module also provides a data type <type>lo</>, which is really just
+ a domain of the <type>oid</> type. This is useful for differentiating
+ database columns that hold large object references from those that are
+ OIDs of other things. You don't have to use the <type>lo</> type to
+ use the trigger, but it may be convenient to use it to keep track of which
+ columns in your database represent large objects that you are managing with
+ the trigger. It is also rumored that the ODBC driver gets confused if you
+ don't use <type>lo</> for BLOB columns.
</para>
</sect2>
<sect2>
- <title>How to Use</title>
+ <title>How to Use It</title>
+
<para>
- The easiest way is by an example:
+ Here's a simple example of usage:
</para>
+
<programlisting>
CREATE TABLE image (title TEXT, raster lo);
+
CREATE TRIGGER t_raster BEFORE UPDATE OR DELETE ON image
FOR EACH ROW EXECUTE PROCEDURE lo_manage(raster);
</programlisting>
+
<para>
- Create a trigger for each column that contains a lo type, and give the column
- name as the trigger procedure argument. You can have more than one trigger on
- a table if you need multiple lo columns in the same table, but don't forget to
- give a different name to each trigger.
+ For each column that will contain unique references to large objects,
+ create a <literal>BEFORE UPDATE OR DELETE</> trigger, and give the column
+ name as the sole trigger argument. If you need multiple <type>lo</>
+ columns in the same table, create a separate trigger for each one,
+ remembering to give a different name to each trigger on the same table.
</para>
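+
+ <para>
+ For instance, a table with two large-object columns might be set up as
+ follows. The table, column, and trigger names in this sketch are
+ hypothetical:
+ </para>
+
+ <programlisting>
+CREATE TABLE document (title TEXT, cover lo, scan lo);
+
+-- one trigger per lo column, each with its own name
+CREATE TRIGGER t_cover BEFORE UPDATE OR DELETE ON document
+ FOR EACH ROW EXECUTE PROCEDURE lo_manage(cover);
+
+CREATE TRIGGER t_scan BEFORE UPDATE OR DELETE ON document
+ FOR EACH ROW EXECUTE PROCEDURE lo_manage(scan);
+ </programlisting>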
</sect2>
<sect2>
- <title>Issues</title>
+ <title>Limitations</title>
<itemizedlist>
<listitem>
<para>
Dropping a table will still orphan any objects it contains, as the trigger
- is not executed.
+ is not executed. You can avoid this by preceding the <command>DROP
+ TABLE</> with <command>DELETE FROM <replaceable>table</></command>.
</para>
+
<para>
- Avoid this by preceding the 'drop table' with 'delete from {table}'.
+ <command>TRUNCATE</> has the same hazard.
</para>
+
<para>
- If you already have, or suspect you have, orphaned large objects, see
- the contrib/vacuumlo module to help you clean them up. It's a good idea
- to run contrib/vacuumlo occasionally as a back-stop to the lo_manage
- trigger.
+ If you already have, or suspect you have, orphaned large objects, see the
+ <filename>contrib/vacuumlo</> module (<xref linkend="vacuumlo">) to help
+ you clean them up. It's a good idea to run <application>vacuumlo</>
+ occasionally as a back-stop to the <function>lo_manage</> trigger.
</para>
</listitem>
+
<listitem>
<para>
Some frontends may create their own tables, and will not create the
- associated trigger(s). Also, users may not remember (or know) to create
+ associated trigger(s). Also, users may not remember (or know) to create
the triggers.
</para>
</listitem>
</itemizedlist>
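+
+ <para>
+ The <command>DROP TABLE</> precaution can be sketched as follows, using
+ the <literal>image</> table from the usage example. Deleting the rows
+ first fires the trigger for each row, so the referenced large objects
+ are unlinked before the table itself is removed:
+ </para>
+
+ <programlisting>
+BEGIN;
+DELETE FROM image;   -- lo_manage unlinks each row's large object
+DROP TABLE image;
+COMMIT;
+ </programlisting>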
-
- <para>
- As the ODBC driver needs a permanent lo type (& JDBC could be optimised to
- use it if it's Oid is fixed), and as the above issues can only be fixed by
- some internal changes, I feel it should become a permanent built-in type.
- </para>
</sect2>
<sect2>
<title>Author</title>
+
<para>
- Peter Mount <email>peter@retep.org.uk</email> June 13 1998
+ Peter Mount <email>peter@retep.org.uk</email>
</para>
</sect2>
-</sect1>
+</sect1>
+<!-- $PostgreSQL: pgsql/doc/src/sgml/seg.sgml,v 1.4 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="seg">
<title>seg</title>
-
+
<indexterm zone="seg">
<primary>seg</primary>
</indexterm>
<para>
- The <literal>seg</literal> module contains the code for the user-defined
- type, <literal>SEG</literal>, representing laboratory measurements as
- floating point intervals.
+ This module implements a data type <type>seg</> for
+ representing line segments, or floating point intervals.
+ <type>seg</> can represent uncertainty in the interval endpoints,
+ making it especially useful for representing laboratory measurements.
</para>
-
+
<sect2>
<title>Rationale</title>
+
<para>
The geometry of measurements is usually more complex than that of a
point in a numeric continuum. A measurement is usually a segment of
the value being measured may naturally be an interval indicating some
condition, such as the temperature range of stability of a protein.
</para>
+
<para>
Using just common sense, it appears more convenient to store such data
as intervals, rather than pairs of numbers. In practice, it even turns
out more efficient in most applications.
</para>
+
<para>
Further along the line of common sense, the fuzziness of the limits
suggests that the use of traditional numeric data types leads to a
certain loss of information. Consider this: your instrument reads
6.50, and you input this reading into the database. What do you get
when you fetch it? Watch:
- </para>
+
<programlisting>
-test=> select 6.50 as "pH";
+test=> select 6.50 :: float8 as "pH";
pH
---
6.5
(1 row)
</programlisting>
- <para>
+
In the world of measurements, 6.50 is not the same as 6.5. It may
sometimes be critically different. The experimenters usually write
down (and publish) the digits they trust. 6.50 is actually a fuzzy
share. We definitely do not want such different data items to appear the
same.
</para>
+
<para>
Conclusion? It is nice to have a special data type that can record the
limits of an interval with arbitrarily variable precision. Variable in
- a sense that each data element records its own precision.
+ the sense that each data element records its own precision.
</para>
+
<para>
Check this out:
- </para>
- <programlisting>
+
+ <programlisting>
test=> select '6.25 .. 6.50'::seg as "pH";
pH
------------
6.25 .. 6.50
(1 row)
- </programlisting>
+ </programlisting>
+ </para>
</sect2>
<sect2>
<title>Syntax</title>
+
<para>
The external representation of an interval is formed using one or two
- floating point numbers joined by the range operator ('..' or '...').
- Optional certainty indicators (<, > and ~) are ignored by the internal
- logics, but are retained in the data.
+ floating point numbers joined by the range operator (<literal>..</literal>
+ or <literal>...</literal>). Alternatively, it can be specified as a
+ center point plus or minus a deviation.
+ Optional certainty indicators (<literal><</literal>,
+ <literal>></literal> and <literal>~</literal>) can be stored as well.
+ (Certainty indicators are ignored by all the built-in operators, however.)
+ </para>
+
+ <para>
+ In the following table, <replaceable>x</>, <replaceable>y</>, and
+ <replaceable>delta</> denote
+ floating-point numbers. <replaceable>x</> and <replaceable>y</>, but
+ not <replaceable>delta</>, can be preceded by a certainty indicator:
</para>
-
- <table>
- <title>Rules</title>
- <tgroup cols="2">
- <tbody>
- <row>
- <entry>rule 1</entry>
- <entry>seg -> boundary PLUMIN deviation</entry>
- </row>
- <row>
- <entry>rule 2</entry>
- <entry>seg -> boundary RANGE boundary</entry>
- </row>
- <row>
- <entry>rule 3</entry>
- <entry>seg -> boundary RANGE</entry>
- </row>
- <row>
- <entry>rule 4</entry>
- <entry>seg -> RANGE boundary</entry>
- </row>
- <row>
- <entry>rule 5</entry>
- <entry>seg -> boundary</entry>
- </row>
- <row>
- <entry>rule 6</entry>
- <entry>boundary -> FLOAT</entry>
- </row>
- <row>
- <entry>rule 7</entry>
- <entry>boundary -> EXTENSION FLOAT</entry>
- </row>
- <row>
- <entry>rule 8</entry>
- <entry>deviation -> FLOAT</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
<table>
- <title>Tokens</title>
+ <title><type>seg</> external representations</title>
<tgroup cols="2">
<tbody>
<row>
- <entry>RANGE</entry>
- <entry>(\.\.)(\.)?</entry>
- </row>
- <row>
- <entry>PLUMIN</entry>
- <entry>\'\+\-\'</entry>
+ <entry><literal><replaceable>x</></literal></entry>
+ <entry>Single value (zero-length interval)
+ </entry>
</row>
<row>
- <entry>integer</entry>
- <entry>[+-]?[0-9]+</entry>
+ <entry><literal><replaceable>x</> .. <replaceable>y</></literal></entry>
+ <entry>Interval from <replaceable>x</> to <replaceable>y</>
+ </entry>
</row>
<row>
- <entry>real</entry>
- <entry>[+-]?[0-9]+\.[0-9]+</entry>
+ <entry><literal><replaceable>x</> (+-) <replaceable>delta</></literal></entry>
+ <entry>Interval from <replaceable>x</> - <replaceable>delta</> to
+ <replaceable>x</> + <replaceable>delta</>
+ </entry>
</row>
<row>
- <entry>FLOAT</entry>
- <entry>({integer}|{real})([eE]{integer})?</entry>
+ <entry><literal><replaceable>x</> ..</literal></entry>
+ <entry>Open interval with lower bound <replaceable>x</>
+ </entry>
</row>
<row>
- <entry>EXTENSION</entry>
- <entry>[<>~]</entry>
+ <entry><literal>.. <replaceable>x</></literal></entry>
+ <entry>Open interval with upper bound <replaceable>x</>
+ </entry>
</row>
</tbody>
</tgroup>
</table>
-
+
<table>
- <title>Examples of valid <literal>SEG</literal> representations</title>
+ <title>Examples of valid <type>seg</> input</title>
<tgroup cols="2">
<tbody>
<row>
- <entry>Any number</entry>
+ <entry><literal>5.0</literal></entry>
<entry>
- (rules 5,6) -- creates a zero-length segment (a point,
- if you will)
+ Creates a zero-length segment (a point, if you will)
</entry>
</row>
<row>
- <entry>~5.0</entry>
+ <entry><literal>~5.0</literal></entry>
<entry>
- (rules 5,7) -- creates a zero-length segment AND records
- '~' in the data. This notation reads 'approximately 5.0',
- but its meaning is not recognized by the code. It is ignored
- until you get the value back. View it is a short-hand comment.
+ Creates a zero-length segment and records
+ <literal>~</> in the data. <literal>~</literal> is ignored
+ by <type>seg</> operations, but
+ is preserved as a comment.
</entry>
- </row>
+ </row>
<row>
- <entry><5.0</entry>
+ <entry><literal><5.0</literal></entry>
<entry>
- (rules 5,7) -- creates a point at 5.0; '<' is ignored but
- is preserved as a comment
+ Creates a point at 5.0. <literal><</literal> is ignored but
+ is preserved as a comment.
</entry>
</row>
<row>
- <entry>>5.0</entry>
+ <entry><literal>>5.0</literal></entry>
<entry>
- (rules 5,7) -- creates a point at 5.0; '>' is ignored but
- is preserved as a comment
+ Creates a point at 5.0. <literal>></literal> is ignored but
+ is preserved as a comment.
</entry>
</row>
<row>
- <entry><para>5(+-)0.3</para><para>5'+-'0.3</para></entry>
+ <entry><literal>5(+-)0.3</literal></entry>
<entry>
- <para>
- (rules 1,8) -- creates an interval '4.7..5.3'. As of this
- writing (02/09/2000), this mechanism isn't completely accurate
- in determining the number of significant digits for the
- boundaries. For example, it adds an extra digit to the lower
- boundary if the resulting interval includes a power of ten:
- </para>
- <programlisting>
-postgres=> select '10(+-)1'::seg as seg;
- seg
----------
-9.0 .. 11 -- should be: 9 .. 11
- </programlisting>
- <para>
- Also, the (+-) notation is not preserved: 'a(+-)b' will
- always be returned as '(a-b) .. (a+b)'. The purpose of this
- notation is to allow input from certain data sources without
- conversion.
- </para>
+ Creates an interval <literal>4.7 .. 5.3</literal>.
+ Note that the <literal>(+-)</> notation isn't preserved.
</entry>
</row>
<row>
- <entry>50 .. </entry>
- <entry>(rule 3) -- everything that is greater than or equal to 50</entry>
+ <entry><literal>50 .. </literal></entry>
+ <entry>Everything that is greater than or equal to 50</entry>
</row>
<row>
- <entry>.. 0</entry>
- <entry>(rule 4) -- everything that is less than or equal to 0</entry>
+ <entry><literal>.. 0</literal></entry>
+ <entry>Everything that is less than or equal to 0</entry>
</row>
<row>
- <entry>1.5e-2 .. 2E-2 </entry>
- <entry>(rule 2) -- creates an interval (0.015 .. 0.02)</entry>
+ <entry><literal>1.5e-2 .. 2E-2 </literal></entry>
+ <entry>Creates an interval <literal>0.015 .. 0.02</literal></entry>
</row>
<row>
- <entry>1 ... 2</entry>
+ <entry><literal>1 ... 2</literal></entry>
<entry>
- The same as 1...2, or 1 .. 2, or 1..2 (space is ignored).
- Because of the widespread use of '...' in the data sources,
- I decided to stick to is as a range operator. This, and
- also the fact that the white space around the range operator
- is ignored, creates a parsing conflict with numeric constants
- starting with a decimal point.
+ The same as <literal>1...2</literal>, or <literal>1 .. 2</literal>,
+ or <literal>1..2</literal>
+ (spaces around the range operator are ignored)
</entry>
</row>
</tbody>
</tgroup>
</table>
- <table>
- <title>Examples</title>
- <tgroup cols="2">
- <tbody>
- <row>
- <entry>.1e7</entry>
- <entry>should be: 0.1e7</entry>
- </row>
- <row>
- <entry>.1 .. .2</entry>
- <entry>should be: 0.1 .. 0.2</entry>
- </row>
- <row>
- <entry>2.4 E4</entry>
- <entry>should be: 2.4E4</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
<para>
- The following, although it is not a syntax error, is disallowed to improve
- the sanity of the data:
+ Because <literal>...</> is widely used in data sources, it is allowed
+ as an alternative spelling of <literal>..</>. Unfortunately, this
+ creates a parsing ambiguity: it is not clear whether the upper bound
+ in <literal>0...23</> is meant to be <literal>23</> or <literal>0.23</>.
+ This is resolved by requiring at least one digit before the decimal
+ point in all numbers in <type>seg</> input.
</para>
- <table>
- <title></title>
- <tgroup cols="2">
- <tbody>
- <row>
- <entry>5 .. 2</entry>
- <entry>should be: 2 .. 5</entry>
- </row>
- </tbody>
- </tgroup>
- </table>
+
+ <para>
+ As a sanity check, <type>seg</> rejects intervals with the lower bound
+ greater than the upper, for example <literal>5 .. 2</>.
+ </para>
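+
+ <para>
+ As an illustrative sketch (not taken from the regression tests), the
+ following statements exercise the input rules described above. The
+ comments restate what this section says about each input; they are not
+ captured output.
+ </para>
+
+ <programlisting>
+SELECT '5(+-)0.3'::seg;    -- interval 4.7 .. 5.3; the (+-) notation is not preserved
+SELECT '1 ... 2'::seg;     -- same segment as '1 .. 2'
+SELECT '0.1 .. 0.2'::seg;  -- accepted: each number has a digit before the decimal point
+SELECT '.1 .. .2'::seg;    -- rejected: no digit before the decimal point
+SELECT '5 .. 2'::seg;      -- rejected: lower bound greater than the upper
+ </programlisting>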
+
</sect2>
<sect2>
<title>Precision</title>
+
<para>
- The segments are stored internally as pairs of 32-bit floating point
- numbers. It means that the numbers with more than 7 significant digits
+ <type>seg</> values are stored internally as pairs of 32-bit floating point
+ numbers. This means that numbers with more than 7 significant digits
will be truncated.
</para>
+
<para>
- The numbers with less than or exactly 7 significant digits retain their
+ Numbers with 7 or fewer significant digits retain their
original precision. That is, if your query returns 0.00, you will be
sure that the trailing zeroes are not the artifacts of formatting: they
reflect the precision of the original data. The number of leading
<sect2>
<title>Usage</title>
+
<para>
- The access method for SEG is a GiST index (gist_seg_ops), which is a
- generalization of R-tree. GiSTs allow the postgres implementation of
- R-tree, originally encoded to support 2-D geometric types such as
- boxes and polygons, to be used with any data type whose data domain
- can be partitioned using the concepts of containment, intersection and
- equality. In other words, everything that can intersect or contain
- its own kind can be indexed with a GiST. That includes, among other
- things, all geometric data types, regardless of their dimensionality
- (see also contrib/cube).
- </para>
- <para>
- The operators supported by the GiST access method include:
+ The <filename>seg</> module includes a GiST index operator class for
+ <type>seg</> values.
+ The operators supported by the GiST opclass include:
</para>
+
<itemizedlist>
<listitem>
<programlisting>
[a, b] << [c, d] Is left of
</programlisting>
<para>
- The left operand, [a, b], occurs entirely to the left of the
- right operand, [c, d], on the axis (-inf, inf). It means,
+ [a, b] is entirely to the left of [c, d]. That is,
[a, b] << [c, d] is true if b < c and false otherwise
</para>
</listitem>
[a, b] >> [c, d] Is right of
</programlisting>
<para>
- [a, b] is occurs entirely to the right of [c, d].
- [a, b] >> [c, d] is true if a > d and false otherwise
+ [a, b] is entirely to the right of [c, d]. That is,
+ [a, b] >> [c, d] is true if a > d and false otherwise
</para>
</listitem>
<listitem>
[a, b] &< [c, d] Overlaps or is left of
</programlisting>
<para>
- This might be better read as "does not extend to right of".
- It is true when b <= d.
+ This might be better read as <quote>does not extend to right of</quote>.
+ It is true when b <= d.
</para>
</listitem>
<listitem>
[a, b] &> [c, d] Overlaps or is right of
</programlisting>
<para>
- This might be better read as "does not extend to left of".
- It is true when a >= c.
+ This might be better read as <quote>does not extend to left of</quote>.
+ It is true when a >= c.
</para>
</listitem>
<listitem>
<programlisting>
-[a, b] = [c, d] Same as
+[a, b] = [c, d] Same as
</programlisting>
<para>
- The segments [a, b] and [c, d] are identical, that is, a == b
- and c == d
+ The segments [a, b] and [c, d] are identical, that is, a = c and b = d
</para>
</listitem>
<listitem>
[a, b] && [c, d] Overlaps
</programlisting>
<para>
- The segments [a, b] and [c, d] overlap.
+ The segments [a, b] and [c, d] overlap.
</para>
</listitem>
<listitem>
<programlisting>
-[a, b] @> [c, d] Contains
+[a, b] @> [c, d] Contains
</programlisting>
<para>
- The segment [a, b] contains the segment [c, d], that is,
- a <= c and b >= d
+ The segment [a, b] contains the segment [c, d], that is,
+ a <= c and b >= d
</para>
</listitem>
<listitem>
<programlisting>
-[a, b] <@ [c, d] Contained in
+[a, b] <@ [c, d] Contained in
</programlisting>
<para>
- The segment [a, b] is contained in [c, d], that is,
- a >= c and b <= d
+ The segment [a, b] is contained in [c, d], that is,
+ a >= c and b <= d
</para>
</listitem>
</itemizedlist>
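+
+ <para>
+ A minimal sketch of using these operators with a GiST index follows.
+ The table and index names (<literal>seg_demo</>,
+ <literal>seg_demo_idx</>) are hypothetical, and the sketch assumes that
+ the module's operator class is the default GiST operator class for
+ <type>seg</>.
+ </para>
+
+ <programlisting>
+CREATE TABLE seg_demo (s seg);
+INSERT INTO seg_demo VALUES ('1 .. 2'), ('1.5 .. 3'), ('4 .. 5');
+CREATE INDEX seg_demo_idx ON seg_demo USING gist (s);
+
+-- segments that contain 1.6 .. 1.8
+SELECT s FROM seg_demo WHERE s @> '1.6 .. 1.8';
+
+-- segments lying entirely to the right of 2.5 .. 3.5
+SELECT s FROM seg_demo WHERE s >> '2.5 .. 3.5';
+ </programlisting>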
+
<para>
(Before PostgreSQL 8.2, the containment operators @> and <@ were
respectively called @ and ~. These names are still available, but are
are reversed from the convention formerly followed by the core geometric
datatypes!)
</para>
+
<para>
- Although the mnemonics of the following operators is questionable, I
- preserved them to maintain visual consistency with other geometric
- data types defined in Postgres.
- </para>
- <para>
- Other operators:
- </para>
+ The standard B-tree operators are also provided, for example
<programlisting>
[a, b] < [c, d] Less than
[a, b] > [c, d] Greater than
</programlisting>
- <para>
+
These operators do not make a lot of sense for any practical
purpose but sorting. These operators first compare (a) to (c),
- and if these are equal, compare (b) to (d). That accounts for
+ and if these are equal, compare (b) to (d). That results in
reasonably good sorting in most cases, which is useful if
- you want to use ORDER BY with this type
+ you want to use ORDER BY with this type.
</para>
+ </sect2>
+
+ <sect2>
+ <title>Notes</title>
<para>
- There are a few other potentially useful functions defined in seg.c
- that vanished from the schema because I stopped using them. Some of
- these were meant to support type casting. Let me know if I was wrong:
- I will then add them back to the schema. I would also appreciate
- other ideas that would enhance the type and make it more useful.
+ For examples of usage, see the regression test <filename>sql/seg.sql</>.
</para>
+
<para>
- For examples of usage, see sql/seg.sql
+ The mechanism that converts <literal>(+-)</> to regular ranges
+ isn't completely accurate in determining the number of significant digits
+ for the boundaries. For example, it adds an extra digit to the lower
+ boundary if the resulting interval includes a power of ten:
+
+ <programlisting>
+postgres=> select '10(+-)1'::seg as seg;
+ seg
+---------
+9.0 .. 11 -- should be: 9 .. 11
+ </programlisting>
</para>
+
<para>
- NOTE: The performance of an R-tree index can largely depend on the
+ The performance of an R-tree index can largely depend on the initial
order of input values. It may be very helpful to sort the input table
- on the SEG column (see the script sort-segments.pl for an example)
+ on the <type>seg</> column; see the script <filename>sort-segments.pl</>
+ for an example.
</para>
</sect2>
<sect2>
<title>Credits</title>
+
<para>
- My thanks are primarily to Prof. Joe Hellerstein
- (<ulink url="http://db.cs.berkeley.edu/~jmh/"></ulink>) for elucidating the
- gist of the GiST (<ulink url="http://gist.cs.berkeley.edu/"></ulink>). I am
- also grateful to all postgres developers, present and past, for enabling
- myself to create my own world and live undisturbed in it. And I would like
- to acknowledge my gratitude to Argonne Lab and to the U.S. Department of
- Energy for the years of faithful support of my database research.
+ Original author: Gene Selkov, Jr. <email>selkovjr@mcs.anl.gov</email>,
+ Mathematics and Computer Science Division, Argonne National Laboratory.
</para>
- <programlisting>
- Gene Selkov, Jr.
- Computational Scientist
- Mathematics and Computer Science Division
- Argonne National Laboratory
- 9700 S Cass Ave.
- Building 221
- Argonne, IL 60439-4844
- </programlisting>
+
<para>
- <email>selkovjr@mcs.anl.gov</email>
+ My thanks are primarily to Prof. Joe Hellerstein
+ (<ulink url="http://db.cs.berkeley.edu/~jmh/"></ulink>) for elucidating the
+ gist of the GiST (<ulink url="http://gist.cs.berkeley.edu/"></ulink>). I am
+ also grateful to all Postgres developers, present and past, for enabling
+ me to create my own world and live undisturbed in it. And I would like
+ to acknowledge my gratitude to Argonne Lab and to the U.S. Department of
+ Energy for the years of faithful support of my database research.
</para>
+
</sect2>
</sect1>
-
+<!-- $PostgreSQL: pgsql/doc/src/sgml/sslinfo.sgml,v 1.3 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="sslinfo">
<title>sslinfo</title>
-
+
<indexterm zone="sslinfo">
<primary>sslinfo</primary>
</indexterm>
<para>
- This modules provides information about current SSL certificate for PostgreSQL.
+ The <filename>sslinfo</> module provides information about the SSL
+ certificate that the current client provided when connecting to
+ <productname>PostgreSQL</>. The module is useless (most functions
+ will return NULL) if the current connection does not use SSL.
</para>
- <sect2>
- <title>Notes</title>
- <para>
- This extension won't build unless your PostgreSQL server is configured
- with --with-openssl. Information provided with these functions would
- be completely useless if you don't use SSL to connect to database.
- </para>
- </sect2>
+ <para>
+ This extension won't build at all unless the installation was
+ configured with <literal>--with-openssl</>.
+ </para>
<sect2>
- <title>Functions Description</title>
-
- <itemizedlist>
- <listitem>
- <programlisting>
-ssl_is_used() RETURNS boolean;
- </programlisting>
+ <title>Functions Provided</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><function>
+ssl_is_used() returns boolean
+ </function></term>
+ <listitem>
<para>
- Returns TRUE, if current connection to server uses SSL and FALSE
+ Returns TRUE if the current connection to the server uses SSL, and FALSE
otherwise.
</para>
- </listitem>
+ </listitem>
+ </varlistentry>
- <listitem>
- <programlisting>
-ssl_client_cert_present() RETURNS boolean
- </programlisting>
+ <varlistentry>
+ <term><function>
+ssl_client_cert_present() returns boolean
+ </function></term>
+ <listitem>
<para>
- Returns TRUE if current client have presented valid SSL client
- certificate to the server and FALSE otherwise (e.g., no SSL,
- certificate hadn't be requested by server).
+ Returns TRUE if the current client has presented a valid SSL client
+ certificate to the server, and FALSE otherwise. (The server
+ might or might not be configured to require a client certificate.)
</para>
- </listitem>
-
- <listitem>
- <programlisting>
-ssl_client_serial() RETURNS numeric
- </programlisting>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><function>
+ssl_client_serial() returns numeric
+ </function></term>
+ <listitem>
<para>
- Returns serial number of current client certificate. The combination
- of certificate serial number and certificate issuer is guaranteed to
- uniquely identify certificate (but not its owner -- the owner ought to
- regularily change his keys, and get new certificates from the issuer).
+ Returns the serial number of the current client certificate. The
+ combination of certificate serial number and certificate issuer is
+ guaranteed to uniquely identify a certificate (but not its owner, since
+ the owner ought to regularly change his keys and get new certificates
+ from the issuer).
</para>
+
<para>
- So, if you run you own CA and allow only certificates from this CA to
- be accepted by server, the serial number is the most reliable (albeit
- not very mnemonic) means to indentify user.
+ So, if you run your own CA and allow only certificates from this CA to
+ be accepted by the server, the serial number is the most reliable (albeit
+ not very mnemonic) means to identify a user.
</para>
- </listitem>
+ </listitem>
+ </varlistentry>
- <listitem>
- <programlisting>
-ssl_client_dn() RETURNS text
- </programlisting>
+ <varlistentry>
+ <term><function>
+ssl_client_dn() returns text
+ </function></term>
+ <listitem>
<para>
- Returns the full subject of current client certificate, converting
+ Returns the full subject of the current client certificate, converting
character data into the current database encoding. It is assumed that
- if you use non-Latin characters in the certificate names, your
+ if you use non-ASCII characters in the certificate names, your
database is able to represent these characters, too. If your database
- uses the SQL_ASCII encoding, non-Latin characters in the name will be
+ uses the SQL_ASCII encoding, non-ASCII characters in the name will be
represented as UTF-8 sequences.
</para>
+
<para>
- The result looks like '/CN=Somebody /C=Some country/O=Some organization'.
+ The result looks like <literal>/CN=Somebody /C=Some country/O=Some organization</>.
</para>
- </listitem>
-
- <listitem>
- <programlisting>
-ssl_issuer_dn()
- </programlisting>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><function>
+ssl_issuer_dn() returns text
+ </function></term>
+ <listitem>
<para>
- Returns the full issuer name of the client certificate, converting
- character data into current database encoding.
+ Returns the full issuer name of the current client certificate, converting
+ character data into the current database encoding. Encoding conversions
+ are handled the same as for <function>ssl_client_dn</>.
</para>
<para>
The combination of the return value of this function with the
certificate serial number uniquely identifies the certificate.
</para>
<para>
- The result of this function is really useful only if you have more
- than one trusted CA certificate in your server's root.crt file, or if
- this CA has issued some intermediate certificate authority
- certificates.
+ This function is really useful only if you have more than one trusted CA
+ certificate in your server's <filename>root.crt</> file, or if this CA
+ has issued some intermediate certificate authority certificates.
</para>
- </listitem>
-
- <listitem>
- <programlisting>
-ssl_client_dn_field(fieldName text) RETURNS text
- </programlisting>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><function>
+ssl_client_dn_field(fieldname text) returns text
+ </function></term>
+ <listitem>
<para>
This function returns the value of the specified field in the
- certificate subject. Field names are string constants that are
- converted into ASN1 object identificators using the OpenSSL object
+ certificate subject, or NULL if the field is not present.
+ Field names are string constants that are
+ converted into ASN1 object identifiers using the OpenSSL object
database. The following values are acceptable:
</para>
<programlisting>
surname (alias SN)
name
givenName (alias GN)
-countryName (alias C)
+countryName (alias C)
localityName (alias L)
stateOrProvinceName (alias ST)
organizationName (alias O)
description
dnQualifier
x500UniqueIdentifier
-pseudonim
+pseudonym
role
emailAddress
</programlisting>
<para>
- All of these fields are optional, except commonName. It depends
- entirely on your CA policy which of them would be included and which
- wouldn't. The meaning of these fields, howeer, is strictly defined by
+ All of these fields are optional, except <structfield>commonName</>.
+ It depends
+ entirely on your CA's policy which of them would be included and which
+ wouldn't. The meaning of these fields, however, is strictly defined by
the X.500 and X.509 standards, so you cannot just assign arbitrary
meaning to them.
</para>
- </listitem>
-
- <listitem>
- <programlisting>
-ssl_issuer_field(fieldName text) RETURNS text;
- </programlisting>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><function>
+ssl_issuer_field(fieldname text) returns text
+ </function></term>
+ <listitem>
<para>
- Does same as ssl_client_dn_field, but for the certificate issuer
+ Same as <function>ssl_client_dn_field</>, but for the certificate issuer
rather than the certificate subject.
</para>
- </listitem>
- </itemizedlist>
+ </listitem>
+ </varlistentry>
+ </variablelist>
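+
+ <para>
+ As a brief sketch, the functions above can be combined in one query to
+ inspect the current connection. The choice of fields and the column
+ aliases are illustrative only; on a connection that does not use SSL,
+ most of these columns are expected to be NULL.
+ </para>
+
+ <programlisting>
+SELECT ssl_is_used(),
+       ssl_client_cert_present(),
+       ssl_client_serial(),
+       ssl_client_dn(),
+       ssl_issuer_dn(),
+       ssl_client_dn_field('commonName') AS client_cn,
+       ssl_issuer_field('organizationName') AS issuer_org;
+ </programlisting>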
</sect2>
<sect2>
<title>Author</title>
+
<para>
Victor Wagner <email>vitus@cryptocom.ru</email>, Cryptocom LTD
- E-Mail of Cryptocom OpenSSL development group:
+ </para>
+
+ <para>
+ E-mail address of the Cryptocom OpenSSL development group:
<email>openssl@cryptocom.ru</email>
</para>
</sect2>
-</sect1>
+</sect1>
+<!-- $PostgreSQL: pgsql/doc/src/sgml/tablefunc.sgml,v 1.4 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="tablefunc">
<title>tablefunc</title>
-
+
<indexterm zone="tablefunc">
<primary>tablefunc</primary>
</indexterm>
<para>
- <literal>tablefunc</literal> provides functions to convert query rows into fields.
+ The <filename>tablefunc</> module includes various functions that return
+ tables (that is, multiple rows). These functions are useful both in their
+ own right and as examples of how to write C functions that return
+ multiple rows.
</para>
+
<sect2>
- <title>Functions</title>
+ <title>Functions Provided</title>
+
<table>
- <title></title>
+ <title><filename>tablefunc</> functions</title>
<tgroup cols="3">
<thead>
<row>
<entry>Function</entry>
<entry>Returns</entry>
- <entry>Comments</entry>
+ <entry>Description</entry>
</row>
</thead>
<tbody>
<row>
+ <entry><function>normal_rand(int numvals, float8 mean, float8 stddev)</function></entry>
+ <entry><type>setof float8</></entry>
<entry>
- <literal>
- normal_rand(int numvals, float8 mean, float8 stddev)
- </literal>
+ Produces a set of normally distributed random values
</entry>
+ </row>
+ <row>
+ <entry><function>crosstab(text sql)</function></entry>
+ <entry><type>setof record</></entry>
<entry>
- returns a set of normally distributed float8 values
+ Produces a <quote>pivot table</> containing
+ row names plus <replaceable>N</> value columns, where
+ <replaceable>N</> is determined by the rowtype specified in the calling
+ query
</entry>
- <entry></entry>
</row>
<row>
- <entry><literal>crosstabN(text sql)</literal></entry>
- <entry>returns a set of row_name plus N category value columns</entry>
+ <entry><function>crosstab<replaceable>N</>(text sql)</function></entry>
+ <entry><type>setof tablefunc_crosstab_<replaceable>N</></></entry>
<entry>
- crosstab2(), crosstab3(), and crosstab4() are defined for you,
- but you can create additional crosstab functions per the instructions
- in the documentation below.
+ Produces a <quote>pivot table</> containing
+ row names plus <replaceable>N</> value columns.
+ <function>crosstab2</>, <function>crosstab3</>, and
+ <function>crosstab4</> are predefined, but you can create additional
+ <function>crosstab<replaceable>N</></> functions as described below
</entry>
</row>
<row>
- <entry><literal>crosstab(text sql)</literal></entry>
- <entry>returns a set of row_name plus N category value columns</entry>
+ <entry><function>crosstab(text source_sql, text category_sql)</function></entry>
+ <entry><type>setof record</></entry>
<entry>
- requires anonymous composite type syntax in the FROM clause. See
- the instructions in the documentation below.
+ Produces a <quote>pivot table</>
+ with the value columns specified by a second query
</entry>
</row>
<row>
- <entry><literal>crosstab(text sql, N int)</literal></entry>
- <entry></entry>
+ <entry><function>crosstab(text sql, int N)</function></entry>
+ <entry><type>setof record</></entry>
<entry>
- <para>obsolete version of crosstab()</para>
- <para>
- the argument N is now ignored, since the number of value columns
- is always determined by the calling query
+ <para>Obsolete version of <function>crosstab(text)</>.
+ The parameter <replaceable>N</> is now ignored, since the number of
+ value columns is always determined by the calling query
</para>
</entry>
</row>
<row>
<entry>
- <literal>
+ <function>
connectby(text relname, text keyid_fld, text parent_keyid_fld
- [, text orderby_fld], text start_with, int max_depth
- [, text branch_delim])
- </literal>
+ [, text orderby_fld ], text start_with, int max_depth
+ [, text branch_delim ])
+ </function>
</entry>
+ <entry><type>setof record</></entry>
<entry>
- returns keyid, parent_keyid, level, and an optional branch string
- and an optional serial column for ordering siblings
- </entry>
- <entry>
- requires anonymous composite type syntax in the FROM clause. See
- the instructions in the documentation below.
+ Produces a representation of a hierarchical tree structure
</entry>
</row>
</tbody>
</table>
<sect3>
- <title><literal>normal_rand</literal></title>
+ <title><function>normal_rand</function></title>
+
<programlisting>
-normal_rand(int numvals, float8 mean, float8 stddev) RETURNS SETOF float8
+normal_rand(int numvals, float8 mean, float8 stddev) returns setof float8
</programlisting>
+
<para>
- Where <literal>numvals</literal> is the number of values to be returned
- from the function. <literal>mean</literal> is the mean of the normal
- distribution of values and <literal>stddev</literal> is the standard
- deviation of the normal distribution of values.
+ <function>normal_rand</> produces a set of normally distributed random
+ values (Gaussian distribution).
</para>
+
<para>
- Returns a float8 set of random values normally distributed (Gaussian
- distribution).
+ <parameter>numvals</parameter> is the number of values to be returned
+ from the function. <parameter>mean</parameter> is the mean of the normal
+ distribution of values and <parameter>stddev</parameter> is the standard
+ deviation of the normal distribution of values.
</para>
+
<para>
- Example:
+ For example, this call requests 1000 values with a mean of 5 and a
+ standard deviation of 3:
</para>
+
<programlisting>
- test=# SELECT * FROM
- test=# normal_rand(1000, 5, 3);
+test=# SELECT * FROM normal_rand(1000, 5, 3);
normal_rand
----------------------
1.56556322244898
2.49639286969028
(1000 rows)
</programlisting>
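+
+ <para>
+ One informal way to check the output, sketched here as a suggestion
+ rather than anything shipped with the module, is to aggregate the
+ generated values. The results vary from run to run but should be close
+ to the requested mean and standard deviation.
+ </para>
+
+ <programlisting>
+SELECT avg(normal_rand), stddev(normal_rand)
+FROM normal_rand(1000, 5, 3);
+ </programlisting>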
- <para>
- Returns 1000 values with a mean of 5 and a standard deviation of 3.
- </para>
</sect3>
-
<sect3>
- <title><literal>crosstabN(text sql)</literal></title>
- <programlisting>
-crosstabN(text sql)
- </programlisting>
- <para>
- The <literal>sql</literal> parameter is a SQL statement which produces the
- source set of data. The SQL statement must return one row_name column, one
- category column, and one value column. <literal>row_name</literal> and
- value must be of type text. The function returns a set of
- <literal>row_name</literal> plus N category value columns.
- </para>
- <para>
- Provided <literal>sql</literal> must produce a set something like:
- </para>
-<programlisting>
-row_name cat value
----------+-------+-------
- row1 cat1 val1
- row1 cat2 val2
- row1 cat3 val3
- row1 cat4 val4
- row2 cat1 val5
- row2 cat2 val6
- row2 cat3 val7
- row2 cat4 val8
- </programlisting>
- <para>
- The returned value is a <literal>SETOF table_crosstab_N</literal>, which
- is defined by:
- </para>
- <programlisting>
-CREATE TYPE tablefunc_crosstab_N AS (
- row_name TEXT,
- category_1 TEXT,
- category_2 TEXT,
- .
- .
- .
- category_N TEXT
-);
- </programlisting>
- <para>
- for the default installed functions, where N is 2, 3, or 4.
- </para>
- <para>
- e.g. the provided crosstab2 function produces a set something like:
- </para>
- <programlisting>
- <== values columns ==>
- row_name category_1 category_2
- ---------+------------+------------
- row1 val1 val2
- row2 val5 val6
- </programlisting>
- <note>
- <orderedlist>
- <listitem><para>The sql result must be ordered by 1,2.</para></listitem>
- <listitem>
- <para>
- The number of values columns depends on the tuple description
- of the function's declared return type.
- </para>
- </listitem>
- <listitem>
- <para>
- Missing values (i.e. not enough adjacent rows of same row_name to
- fill the number of result values columns) are filled in with nulls.
- </para>
- </listitem>
- <listitem>
- <para>
- Extra values (i.e. too many adjacent rows of same row_name to fill
- the number of result values columns) are skipped.
- </para>
- </listitem>
- <listitem>
- <para>
- Rows with all nulls in the values columns are skipped.
- </para>
- </listitem>
- <listitem>
- <para>
- The installed defaults are for illustration purposes. You
- can create your own return types and functions based on the
- crosstab() function of the installed library. See below for
- details.
- </para>
- </listitem>
- </orderedlist>
- </note>
- <para>
- Example:
- </para>
- <programlisting>
-create table ct(id serial, rowclass text, rowid text, attribute text, value text);
-insert into ct(rowclass, rowid, attribute, value) values('group1','test1','att1','val1');
-insert into ct(rowclass, rowid, attribute, value) values('group1','test1','att2','val2');
-insert into ct(rowclass, rowid, attribute, value) values('group1','test1','att3','val3');
-insert into ct(rowclass, rowid, attribute, value) values('group1','test1','att4','val4');
-insert into ct(rowclass, rowid, attribute, value) values('group1','test2','att1','val5');
-insert into ct(rowclass, rowid, attribute, value) values('group1','test2','att2','val6');
-insert into ct(rowclass, rowid, attribute, value) values('group1','test2','att3','val7');
-insert into ct(rowclass, rowid, attribute, value) values('group1','test2','att4','val8');
-
-select * from crosstab3(
- 'select rowid, attribute, value
- from ct
- where rowclass = ''group1''
- and (attribute = ''att2'' or attribute = ''att3'') order by 1,2;');
-
- row_name | category_1 | category_2 | category_3
-----------+------------+------------+------------
- test1 | val2 | val3 |
- test2 | val6 | val7 |
-(2 rows)
- </programlisting>
- </sect3>
+ <title><function>crosstab(text)</function></title>
- <sect3>
- <title><literal>crosstab(text)</literal></title>
<programlisting>
crosstab(text sql)
crosstab(text sql, int N)
</programlisting>
+
<para>
- The <literal>sql</literal> parameter is a SQL statement which produces the
- source set of data. The SQL statement must return one
- <literal>row_name</literal> column, one <literal>category</literal> column,
- and one <literal>value</literal> column. <literal>N</literal> is an
- obsolete argument; ignored if supplied (formerly this had to match the
- number of category columns determined by the calling query).
+ The <function>crosstab</> function is used to produce <quote>pivot</>
+ displays, wherein data is listed across the page rather than down.
+ For example, we might have data like
+ <programlisting>
+row1 val11
+row1 val12
+row1 val13
+...
+row2 val21
+row2 val22
+row2 val23
+...
+ </programlisting>
+ which we wish to display like
+ <programlisting>
+row1 val11 val12 val13 ...
+row2 val21 val22 val23 ...
+...
+ </programlisting>
+ The <function>crosstab</> function takes a text parameter that is a SQL
+ query producing raw data formatted in the first way, and produces a table
+ formatted in the second way.
</para>
+
<para>
+ The <parameter>sql</parameter> parameter is a SQL statement that produces
+ the source set of data. This statement must return one
+ <structfield>row_name</structfield> column, one
+ <structfield>category</structfield> column, and one
+ <structfield>value</structfield> column. <parameter>N</parameter> is an
+ obsolete parameter, ignored if supplied (formerly this had to match the
+ number of output value columns, but now that is determined by the
+ calling query).
</para>
+
<para>
- e.g. provided sql must produce a set something like:
+ For example, the provided query might produce a set something like:
</para>
+
<programlisting>
row_name cat value
----------+-------+-------
row2 cat3 val7
row2 cat4 val8
</programlisting>
+
<para>
- Returns a <literal>SETOF RECORD</literal>, which must be defined with a
- column definition in the FROM clause of the SELECT statement, e.g.:
+ The <function>crosstab</> function is declared to return <type>setof
+ record</type>, so the actual names and types of the output columns must be
+ defined in the <literal>FROM</> clause of the calling <command>SELECT</>
+ statement, for example:
</para>
+
<programlisting>
- SELECT *
- FROM crosstab(sql) AS ct(row_name text, category_1 text, category_2 text);
+ SELECT * FROM crosstab('...') AS ct(row_name text, category_1 text, category_2 text);
</programlisting>
+
<para>
- the example crosstab function produces a set something like:
+ This example produces a set something like:
</para>
+
<programlisting>
- <== values columns ==>
+ <== value columns ==>
row_name category_1 category_2
---------+------------+------------
row1 val1 val2
row2 val5 val6
</programlisting>
+
<para>
- Note that it follows these rules:
+ The <literal>FROM</> clause must define the output as one
+ <structfield>row_name</> column (of the same datatype as the first result
+ column of the SQL query) followed by N <structfield>value</> columns
+ (all of the same datatype as the third result column of the SQL query).
+ You can set up as many output value columns as you wish. The names of the
+ output columns are up to you.
</para>
- <orderedlist>
- <listitem><para>The sql result must be ordered by 1,2.</para></listitem>
- <listitem>
- <para>
- The number of values columns is determined by the column definition
- provided in the FROM clause. The FROM clause must define one
- row_name column (of the same datatype as the first result column
- of the sql query) followed by N category columns (of the same
- datatype as the third result column of the sql query). You can
- set up as many category columns as you wish.
- </para>
- </listitem>
- <listitem>
- <para>
- Missing values (i.e. not enough adjacent rows of same row_name to
- fill the number of result values columns) are filled in with nulls.
- </para>
- </listitem>
- <listitem>
- <para>
- Extra values (i.e. too many adjacent rows of same row_name to fill
- the number of result values columns) are skipped.
- </para>
- </listitem>
- <listitem>
- <para>
- Rows with all nulls in the values columns are skipped.
- </para>
- </listitem>
- <listitem>
- <para>
- You can avoid always having to write out a FROM clause that defines the
- output columns by setting up a custom crosstab function that has
- the desired output row type wired into its definition.
- </para>
- </listitem>
- </orderedlist>
+
+ <para>
+ The <function>crosstab</> function produces one output row for each
+ consecutive group of input rows with the same
+ <structfield>row_name</structfield> value. It fills the output
+ <structfield>value</> columns, left to right, with the
+ <structfield>value</structfield> fields from these rows. If there
+ are fewer rows in a group than there are output <structfield>value</>
+ columns, the extra output columns are filled with nulls; if there are
+ more rows, the extra input rows are skipped.
+ </para>
+
+ <para>
+ In practice the SQL query should always specify <literal>ORDER BY 1,2</>
+ to ensure that the input rows are properly ordered, that is, values with
+ the same <structfield>row_name</structfield> are brought together and
+ correctly ordered within the row. Notice that <function>crosstab</>
+ itself does not pay any attention to the second column of the query
+ result; it's just there to be ordered by, to control the order in which
+ the third-column values appear across the page.
+ </para>
+
+ <para>
+ Here is a complete example:
+ </para>
+
+ <programlisting>
+CREATE TABLE ct(id SERIAL, rowid TEXT, attribute TEXT, value TEXT);
+INSERT INTO ct(rowid, attribute, value) VALUES('test1','att1','val1');
+INSERT INTO ct(rowid, attribute, value) VALUES('test1','att2','val2');
+INSERT INTO ct(rowid, attribute, value) VALUES('test1','att3','val3');
+INSERT INTO ct(rowid, attribute, value) VALUES('test1','att4','val4');
+INSERT INTO ct(rowid, attribute, value) VALUES('test2','att1','val5');
+INSERT INTO ct(rowid, attribute, value) VALUES('test2','att2','val6');
+INSERT INTO ct(rowid, attribute, value) VALUES('test2','att3','val7');
+INSERT INTO ct(rowid, attribute, value) VALUES('test2','att4','val8');
+
+SELECT *
+FROM crosstab(
+ 'select rowid, attribute, value
+ from ct
+ where attribute = ''att2'' or attribute = ''att3''
+ order by 1,2')
+AS ct(row_name text, category_1 text, category_2 text, category_3 text);
+
+ row_name | category_1 | category_2 | category_3
+----------+------------+------------+------------
+ test1 | val2 | val3 |
+ test2 | val6 | val7 |
+(2 rows)
+ </programlisting>
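+
+ <para>
+  As a further sketch of the skipping rule described above, the same
+  <structname>ct</> table can be queried without the <literal>WHERE</>
+  filter but with only two output value columns (this column list is
+  chosen purely for illustration). Each group then supplies four input
+  rows, and the last two rows of each group should be skipped:
+ </para>
+
+ <programlisting>
+SELECT *
+FROM crosstab(
+  'select rowid, attribute, value
+   from ct
+   order by 1,2')
+AS ct(row_name text, category_1 text, category_2 text);
+
+ row_name | category_1 | category_2
+----------+------------+------------
+ test1    | val1       | val2
+ test2    | val5       | val6
+(2 rows)
+ </programlisting>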
+
+ <para>
+ You can avoid writing out a <literal>FROM</> clause to define the
+ output columns in every query by setting up a custom crosstab function
+ that has the desired output row type wired into its definition. This is
+ described in the next section. Another possibility is to embed the
+ required <literal>FROM</> clause in a view definition.
+ </para>
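+
+ <para>
+  A minimal sketch of the view approach, reusing the example above (the
+  view name <literal>ct_att23</> is hypothetical):
+ </para>
+
+ <programlisting>
+CREATE VIEW ct_att23 AS
+  SELECT *
+  FROM crosstab(
+    'select rowid, attribute, value
+     from ct
+     where attribute = ''att2'' or attribute = ''att3''
+     order by 1,2')
+  AS ct(row_name text, category_1 text, category_2 text, category_3 text);
+
+-- callers no longer need to spell out the column definition list
+SELECT * FROM ct_att23;
+ </programlisting>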
+
+ </sect3>
+
+ <sect3>
+ <title><function>crosstab<replaceable>N</>(text)</function></title>
+
+ <programlisting>
+crosstab<replaceable>N</>(text sql)
+ </programlisting>
+
+ <para>
+ The <function>crosstab<replaceable>N</></> functions are examples of how
+ to set up custom wrappers for the general <function>crosstab</> function,
+ so that you need not write out column names and types in the calling
+ <command>SELECT</> query. The <filename>tablefunc</> module includes
+ <function>crosstab2</>, <function>crosstab3</>, and
+ <function>crosstab4</>, whose output rowtypes are defined as
+ </para>
+
+ <programlisting>
+CREATE TYPE tablefunc_crosstab_N AS (
+ row_name TEXT,
+ category_1 TEXT,
+ category_2 TEXT,
+ .
+ .
+ .
+ category_N TEXT
+);
+ </programlisting>
+
+ <para>
+ Thus, these functions can be used directly when the input query produces
+ <structfield>row_name</> and <structfield>value</> columns of type
+ <type>text</>, and you want 2, 3, or 4 output value columns.
+ In all other ways they behave exactly as described above for the
+ general <function>crosstab</> function.
+ </para>
+
+ <para>
+ For instance, the example given in the previous section would also
+ work as
+ </para>
+
+ <programlisting>
+SELECT *
+FROM crosstab3(
+ 'select rowid, attribute, value
+ from ct
+ where attribute = ''att2'' or attribute = ''att3''
+ order by 1,2');
+ </programlisting>
+
<para>
- There are two ways you can set up a custom crosstab function:
+ These functions are provided mostly for illustration purposes. You
+ can create your own return types and functions based on the
+ underlying <function>crosstab()</> function. There are two ways
+ to do it:
</para>
+
<itemizedlist>
<listitem>
<para>
- Create a composite type to define your return type, similar to the
- examples in the installation script. Then define a unique function
- name accepting one text parameter and returning setof your_type_name.
- For example, if your source data produces row_names that are TEXT,
- and values that are FLOAT8, and you want 5 category columns:
+ Create a composite type describing the desired output columns,
+ similar to the examples in the installation script. Then define a
+ function with a unique name that accepts one <type>text</> parameter and
+ returns <type>setof your_type_name</>, but links to the same underlying
+ <function>crosstab</> C function. For example, if your source data
+ produces row names that are <type>text</>, and values that are
+ <type>float8</>, and you want 5 value columns:
</para>
+
<programlisting>
CREATE TYPE my_crosstab_float8_5_cols AS (
- row_name TEXT,
- category_1 FLOAT8,
- category_2 FLOAT8,
- category_3 FLOAT8,
- category_4 FLOAT8,
- category_5 FLOAT8
+ my_row_name text,
+ my_category_1 float8,
+ my_category_2 float8,
+ my_category_3 float8,
+ my_category_4 float8,
+ my_category_5 float8
);
CREATE OR REPLACE FUNCTION crosstab_float8_5_cols(text)
RETURNS setof my_crosstab_float8_5_cols
AS '$libdir/tablefunc','crosstab' LANGUAGE C STABLE STRICT;
</programlisting>
</listitem>
+
<listitem>
<para>
- Use OUT parameters to define the return type implicitly.
+ Use <literal>OUT</> parameters to define the return type implicitly.
The same example could also be done this way:
</para>
+
<programlisting>
CREATE OR REPLACE FUNCTION crosstab_float8_5_cols(IN text,
- OUT row_name TEXT,
- OUT category_1 FLOAT8,
- OUT category_2 FLOAT8,
- OUT category_3 FLOAT8,
- OUT category_4 FLOAT8,
- OUT category_5 FLOAT8)
+ OUT my_row_name text,
+ OUT my_category_1 float8,
+ OUT my_category_2 float8,
+ OUT my_category_3 float8,
+ OUT my_category_4 float8,
+ OUT my_category_5 float8)
RETURNS setof record
AS '$libdir/tablefunc','crosstab' LANGUAGE C STABLE STRICT;
</programlisting>
</listitem>
</itemizedlist>
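+
+ <para>
+  Either definition could then be called without a column definition
+  list. The following is only a sketch: the table
+  <structname>readings</> and its columns are hypothetical.
+ </para>
+
+ <programlisting>
+-- hypothetical source table: readings(sensor text, slot int, reading float8)
+SELECT *
+FROM crosstab_float8_5_cols(
+  'select sensor, slot, reading
+   from readings
+   order by 1,2');
+ </programlisting>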
- <para>
- Example:
- </para>
- <programlisting>
-CREATE TABLE ct(id SERIAL, rowclass TEXT, rowid TEXT, attribute TEXT, value TEXT);
-INSERT INTO ct(rowclass, rowid, attribute, value) VALUES('group1','test1','att1','val1');
-INSERT INTO ct(rowclass, rowid, attribute, value) VALUES('group1','test1','att2','val2');
-INSERT INTO ct(rowclass, rowid, attribute, value) VALUES('group1','test1','att3','val3');
-INSERT INTO ct(rowclass, rowid, attribute, value) VALUES('group1','test1','att4','val4');
-INSERT INTO ct(rowclass, rowid, attribute, value) VALUES('group1','test2','att1','val5');
-INSERT INTO ct(rowclass, rowid, attribute, value) VALUES('group1','test2','att2','val6');
-INSERT INTO ct(rowclass, rowid, attribute, value) VALUES('group1','test2','att3','val7');
-INSERT INTO ct(rowclass, rowid, attribute, value) VALUES('group1','test2','att4','val8');
-
-SELECT *
-FROM crosstab(
- 'select rowid, attribute, value
- from ct
- where rowclass = ''group1''
- and (attribute = ''att2'' or attribute = ''att3'') order by 1,2;', 3)
-AS ct(row_name text, category_1 text, category_2 text, category_3 text);
-
- row_name | category_1 | category_2 | category_3
-----------+------------+------------+------------
- test1 | val2 | val3 |
- test2 | val6 | val7 |
-(2 rows)
- </programlisting>
</sect3>
<sect3>
- <title><literal>crosstab(text, text)</literal></title>
+ <title><function>crosstab(text, text)</function></title>
+
<programlisting>
crosstab(text source_sql, text category_sql)
</programlisting>
<para>
- Where <literal>source_sql</literal> is a SQL statement which produces the
- source set of data. The SQL statement must return one
- <literal>row_name</literal> column, one <literal>category</literal> column,
- and one <literal>value</literal> column. It may also have one or more
- <emphasis>extra</emphasis> columns.
- </para>
- <para>
- The <literal>row_name</literal> column must be first. The
- <literal>category</literal> and <literal>value</literal> columns must be
- the last two columns, in that order. <emphasis>extra</emphasis> columns must
- be columns 2 through (N - 2), where N is the total number of columns.
+ The main limitation of the single-parameter form of <function>crosstab</>
+ is that it treats all values in a group alike, inserting each value into
+ the first available column. If you want the value
+ columns to correspond to specific categories of data, and some groups
+ might not have data for some of the categories, that doesn't work well.
+ The two-parameter form of <function>crosstab</> handles this case by
+ providing an explicit list of the categories corresponding to the
+ output columns.
</para>
+
<para>
- The <emphasis>extra</emphasis> columns are assumed to be the same for all
- rows with the same <literal>row_name</literal>. The values returned are
- copied from the first row with a given <literal>row_name</literal> and
- subsequent values of these columns are ignored until
- <literal>row_name</literal> changes.
+ <parameter>source_sql</parameter> is a SQL statement that produces the
+ source set of data. This statement must return one
+ <structfield>row_name</structfield> column, one
+ <structfield>category</structfield> column, and one
+ <structfield>value</structfield> column. It may also have one or more
+ <quote>extra</quote> columns.
+ The <structfield>row_name</structfield> column must be first. The
+ <structfield>category</structfield> and <structfield>value</structfield>
+ columns must be the last two columns, in that order. Any columns between
+ <structfield>row_name</structfield> and
+ <structfield>category</structfield> are treated as <quote>extra</>.
+ The <quote>extra</quote> columns are expected to be the same for all rows
+ with the same <structfield>row_name</structfield> value.
</para>
+
<para>
- e.g. <literal>source_sql</literal> must produce a set something like:
+ For example, <parameter>source_sql</parameter> might produce a set
+ something like:
</para>
<programlisting>
- SELECT row_name, extra_col, cat, value FROM foo;
+ SELECT row_name, extra_col, cat, value FROM foo ORDER BY 1;
row_name extra_col cat value
----------+------------+-----+---------
</programlisting>
<para>
- <literal>category_sql</literal> has to be a SQL statement which produces
- the distinct set of categories. The SQL statement must return one category
- column only. <literal>category_sql</literal> must produce at least one
- result row or an error will be generated. <literal>category_sql</literal>
- must not produce duplicate categories or an error will be generated. e.g.:
+ <parameter>category_sql</parameter> is a SQL statement that produces
+ the set of categories. This statement must return only one column.
+ It must produce at least one row and must not produce duplicate values;
+ otherwise an error will be generated.
+ <parameter>category_sql</parameter> might be something like:
</para>
+
<programlisting>
-SELECT DISTINCT cat FROM foo;
+SELECT DISTINCT cat FROM foo ORDER BY 1;
cat
-------
cat1
cat3
cat4
</programlisting>
+
<para>
- The function returns <literal>SETOF RECORD</literal>, which must be defined
- with a column definition in the FROM clause of the SELECT statement, e.g.:
+ The <function>crosstab</> function is declared to return <type>setof
+ record</type>, so the actual names and types of the output columns must be
+ defined in the <literal>FROM</> clause of the calling <command>SELECT</>
+ statement, for example:
</para>
+
<programlisting>
- SELECT * FROM crosstab(source_sql, cat_sql)
- AS ct(row_name text, extra text, cat1 text, cat2 text, cat3 text, cat4 text);
+ SELECT * FROM crosstab('...', '...')
+ AS ct(row_name text, extra text, cat1 text, cat2 text, cat3 text, cat4 text);
</programlisting>
+
<para>
- the example crosstab function produces a set something like:
+ This will produce a result something like:
</para>
+
<programlisting>
- <== values columns ==>
+ <== value columns ==>
row_name extra cat1 cat2 cat3 cat4
---------+-------+------+------+------+------
row1 extra1 val1 val2 val4
row2 extra2 val5 val6 val7 val8
</programlisting>
+
+ <para>
+ The <literal>FROM</> clause must define the proper number of output
+ columns of the proper data types. If there are <replaceable>N</>
+ columns in the <parameter>source_sql</> query's result, the first
+ <replaceable>N</>-2 of them must match up with the first
+ <replaceable>N</>-2 output columns. The remaining output columns
+ must have the type of the last column of the <parameter>source_sql</>
+ query's result, and there must be exactly as many of them as there
+ are rows in the <parameter>category_sql</parameter> query's result.
+ </para>
+
+ <para>
+ The <function>crosstab</> function produces one output row for each
+ consecutive group of input rows with the same
+ <structfield>row_name</structfield> value. The output
+ <structfield>row_name</structfield> column, plus any <quote>extra</>
+ columns, are copied from the first row of the group. The output
+ <structfield>value</> columns are filled with the
+ <structfield>value</structfield> fields from rows having matching
+ <structfield>category</> values. If a row's <structfield>category</>
+ does not match any output of the <parameter>category_sql</parameter>
+ query, its <structfield>value</structfield> is ignored. Output
+ columns whose matching category is not present in any input row
+ of the group are filled with nulls.
+ </para>
+
+ <para>
+ In practice the <parameter>source_sql</parameter> query should always
+ specify <literal>ORDER BY 1</> to ensure that values with the same
+ <structfield>row_name</structfield> are brought together. However,
+ ordering of the categories within a group is not important.
+ Also, it is essential to be sure that the order of the
+ <parameter>category_sql</parameter> query's output matches the specified
+ output column order.
+ </para>
+
<para>
- Note that it follows these rules:
+ Here are two complete examples:
</para>
- <orderedlist>
- <listitem><para>source_sql must be ordered by row_name (column 1).</para></listitem>
- <listitem>
- <para>
- The number of values columns is determined at run-time. The
- column definition provided in the FROM clause must provide for
- the correct number of columns of the proper data types.
- </para>
- </listitem>
- <listitem>
- <para>
- Missing values (i.e. not enough adjacent rows of same row_name to
- fill the number of result values columns) are filled in with nulls.
- </para>
- </listitem>
- <listitem>
- <para>
- Extra values (i.e. source rows with category not found in category_sql
- result) are skipped.
- </para>
- </listitem>
- <listitem>
- <para>
- Rows with a null row_name column are skipped.
- </para>
- </listitem>
- <listitem>
- <para>
- You can create predefined functions to avoid having to write out
- the result column names/types in each query. See the examples
- for crosstab(text).
- </para>
- </listitem>
- </orderedlist>
<programlisting>
-CREATE TABLE cth(id serial, rowid text, rowdt timestamp, attribute text, val text);
-INSERT INTO cth VALUES(DEFAULT,'test1','01 March 2003','temperature','42');
-INSERT INTO cth VALUES(DEFAULT,'test1','01 March 2003','test_result','PASS');
-INSERT INTO cth VALUES(DEFAULT,'test1','01 March 2003','volts','2.6987');
-INSERT INTO cth VALUES(DEFAULT,'test2','02 March 2003','temperature','53');
-INSERT INTO cth VALUES(DEFAULT,'test2','02 March 2003','test_result','FAIL');
-INSERT INTO cth VALUES(DEFAULT,'test2','02 March 2003','test_startdate','01 March 2003');
-INSERT INTO cth VALUES(DEFAULT,'test2','02 March 2003','volts','3.1234');
+create table sales(year int, month int, qty int);
+insert into sales values(2007, 1, 1000);
+insert into sales values(2007, 2, 1500);
+insert into sales values(2007, 7, 500);
+insert into sales values(2007, 11, 1500);
+insert into sales values(2007, 12, 2000);
+insert into sales values(2008, 1, 1000);
+
+select * from crosstab(
+ 'select year, month, qty from sales order by 1',
+ 'select m from generate_series(1,12) m'
+) as (
+ year int,
+ "Jan" int,
+ "Feb" int,
+ "Mar" int,
+ "Apr" int,
+ "May" int,
+ "Jun" int,
+ "Jul" int,
+ "Aug" int,
+ "Sep" int,
+ "Oct" int,
+ "Nov" int,
+ "Dec" int
+);
+ year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec
+------+------+------+-----+-----+-----+-----+-----+-----+-----+-----+------+------
+ 2007 | 1000 | 1500 | | | | | 500 | | | | 1500 | 2000
+ 2008 | 1000 | | | | | | | | | | |
+(2 rows)
+ </programlisting>
+
+ <programlisting>
+CREATE TABLE cth(rowid text, rowdt timestamp, attribute text, val text);
+INSERT INTO cth VALUES('test1','01 March 2003','temperature','42');
+INSERT INTO cth VALUES('test1','01 March 2003','test_result','PASS');
+INSERT INTO cth VALUES('test1','01 March 2003','volts','2.6987');
+INSERT INTO cth VALUES('test2','02 March 2003','temperature','53');
+INSERT INTO cth VALUES('test2','02 March 2003','test_result','FAIL');
+INSERT INTO cth VALUES('test2','02 March 2003','test_startdate','01 March 2003');
+INSERT INTO cth VALUES('test2','02 March 2003','volts','3.1234');
SELECT * FROM crosstab
(
test_startdate timestamp,
volts float8
);
- rowid | rowdt | temperature | test_result | test_startdate | volts
+ rowid | rowdt | temperature | test_result | test_startdate | volts
-------+--------------------------+-------------+-------------+--------------------------+--------
test1 | Sat Mar 01 00:00:00 2003 | 42 | PASS | | 2.6987
test2 | Sun Mar 02 00:00:00 2003 | 53 | FAIL | Sat Mar 01 00:00:00 2003 | 3.1234
(2 rows)
</programlisting>
+
+ <para>
+ You can create predefined functions to avoid having to write out
+ the result column names and types in each query. See the examples
+ in the previous section. The underlying C function for this form
+ of <function>crosstab</> is named <literal>crosstab_hash</>.
+ </para>
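+
+ <para>
+  For instance, a wrapper for the <structname>cth</> example might look
+  like the following sketch. The type name
+  <literal>cth_crosstab_row</> and the function name
+  <function>crosstab_cth</> are hypothetical, and the types given for
+  the <structfield>temperature</> and <structfield>test_result</>
+  columns are assumptions:
+ </para>
+
+ <programlisting>
+CREATE TYPE cth_crosstab_row AS (
+    rowid text,
+    rowdt timestamp,
+    temperature int4,        -- assumed type
+    test_result text,        -- assumed type
+    test_startdate timestamp,
+    volts float8
+);
+
+CREATE OR REPLACE FUNCTION crosstab_cth(text, text)
+    RETURNS setof cth_crosstab_row
+    AS '$libdir/tablefunc','crosstab_hash' LANGUAGE C STABLE STRICT;
+
+-- the category query is ordered so that it matches the column order above
+SELECT * FROM crosstab_cth(
+  'SELECT rowid, rowdt, attribute, val FROM cth ORDER BY 1',
+  'SELECT DISTINCT attribute FROM cth ORDER BY 1');
+ </programlisting>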
+
</sect3>
+
<sect3>
- <title>
- <literal>connectby(text, text, text[, text], text, text, int[, text])</literal>
- </title>
+ <title><function>connectby</function></title>
+
<programlisting>
connectby(text relname, text keyid_fld, text parent_keyid_fld
- [, text orderby_fld], text start_with, int max_depth
- [, text branch_delim])
+ [, text orderby_fld ], text start_with, int max_depth
+ [, text branch_delim ])
</programlisting>
+
+ <para>
+ The <function>connectby</> function produces a display of hierarchical
+ data that is stored in a table. The table must have a key field that
+ uniquely identifies rows, and a parent-key field that references the
+ parent (if any) of each row. <function>connectby</> can display the
+ sub-tree descending from any row.
+ </para>
+
<table>
- <title><literal>connectby</literal> parameters</title>
+ <title><function>connectby</function> parameters</title>
<tgroup cols="2">
<thead>
<row>
</thead>
<tbody>
<row>
- <entry><literal>relname</literal></entry>
+ <entry><parameter>relname</parameter></entry>
<entry>Name of the source relation</entry>
</row>
<row>
- <entry><literal>keyid_fld</literal></entry>
+ <entry><parameter>keyid_fld</parameter></entry>
<entry>Name of the key field</entry>
</row>
<row>
- <entry><literal>parent_keyid_fld</literal></entry>
- <entry>Name of the key_parent field</entry>
+ <entry><parameter>parent_keyid_fld</parameter></entry>
+ <entry>Name of the parent-key field</entry>
</row>
<row>
- <entry><literal>orderby_fld</literal></entry>
- <entry>
- If optional ordering of siblings is desired: Name of the field to
- order siblings
- </entry>
+ <entry><parameter>orderby_fld</parameter></entry>
+ <entry>Name of the field to order siblings by (optional)</entry>
</row>
<row>
- <entry><literal>start_with</literal></entry>
- <entry>
- Root value of the tree input as a text value regardless of
- <literal>keyid_fld</literal>
- </entry>
+ <entry><parameter>start_with</parameter></entry>
+ <entry>Key value of the row to start at</entry>
</row>
<row>
- <entry><literal>max_depth</literal></entry>
- <entry>
- Zero (0) for unlimited depth, otherwise restrict level to this depth
- </entry>
+ <entry><parameter>max_depth</parameter></entry>
+ <entry>Maximum depth to descend to, or zero for unlimited depth</entry>
</row>
<row>
- <entry><literal>branch_delim</literal></entry>
- <entry>
- If optional branch value is desired, this string is used as the delimiter.
- When not provided, a default value of '~' is used for internal
- recursion detection only, and no "branch" field is returned.
- </entry>
+ <entry><parameter>branch_delim</parameter></entry>
+ <entry>String to separate keys with in branch output (optional)</entry>
</row>
</tbody>
</tgroup>
</table>
+
<para>
- The function returns <literal>SETOF RECORD</literal>, which must defined
- with a column definition in the FROM clause of the SELECT statement, e.g.:
- </para>
- <programlisting>
- SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'row2', 0, '~')
- AS t(keyid text, parent_keyid text, level int, branch text);
- </programlisting>
- <para>
- or
+ The key and parent-key fields can be any data type, but they must be
+ the same type. Note that the <parameter>start_with</> value must be
+ entered as a text string, regardless of the type of the key field.
</para>
- <programlisting>
- SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'row2', 0)
- AS t(keyid text, parent_keyid text, level int);
- </programlisting>
+
<para>
- or
+ The <function>connectby</> function is declared to return <type>setof
+ record</type>, so the actual names and types of the output columns must be
+ defined in the <literal>FROM</> clause of the calling <command>SELECT</>
+ statement, for example:
</para>
+
<programlisting>
SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'pos', 'row2', 0, '~')
AS t(keyid text, parent_keyid text, level int, branch text, pos int);
</programlisting>
+
<para>
- or
+ The first two output columns are used for the current row's key and
+ its parent row's key; they must match the type of the table's key field.
+ The third output column is the depth in the tree and must be of type
+ <type>integer</>. If a <parameter>branch_delim</parameter> parameter was
+ given, the next output column is the branch display and must be of type
+ <type>text</>. Finally, if an <parameter>orderby_fld</parameter>
+ parameter was given, the last output column is a serial number, and must
+ be of type <type>integer</>.
</para>
- <programlisting>
- SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'pos', 'row2', 0)
- AS t(keyid text, parent_keyid text, level int, pos int);
- </programlisting>
+
<para>
- Note that it follows these rules:
+ The <quote>branch</> output column shows the path of keys taken to
+ reach the current row. The keys are separated by the specified
+ <parameter>branch_delim</parameter> string. If no branch display is
+ wanted, omit both the <parameter>branch_delim</parameter> parameter
+ and the branch column in the output column list.
</para>
- <orderedlist>
- <listitem><para>keyid and parent_keyid must be the same data type</para></listitem>
- <listitem>
- <para>
- The column definition *must* include a third column of type INT4 for
- the level value output
- </para>
- </listitem>
- <listitem>
- <para>
- If the branch field is not desired, omit both the branch_delim input
- parameter *and* the branch field in the query column definition. Note
- that when branch_delim is not provided, a default value of '~' is used
- for branch_delim for internal recursion detection, even though the branch
- field is not returned.
- </para>
- </listitem>
- <listitem>
- <para>
- If the branch field is desired, it must be the fourth column in the query
- column definition, and it must be type TEXT.
- </para>
- </listitem>
- <listitem>
- <para>
- The parameters representing table and field names must include double
- quotes if the names are mixed-case or contain special characters.
- </para>
- </listitem>
- <listitem>
- <para>
- If sorting of siblings is desired, the orderby_fld input parameter *and*
- a name for the resulting serial field (type INT32) in the query column
- definition must be given.
- </para>
- </listitem>
- </orderedlist>
+
+ <para>
+ If the ordering of siblings of the same parent is important,
+ include the <parameter>orderby_fld</parameter> parameter to
+ specify which field to order siblings by. This field can be of any
+ sortable data type. The output column list must include a final
+ integer serial-number column, if and only if
+ <parameter>orderby_fld</parameter> is specified.
+ </para>
+
<para>
- Example:
+ The parameters representing table and field names are copied as-is
+ into the SQL queries that <function>connectby</> generates internally.
+ Therefore, include double quotes if the names are mixed-case or contain
+ special characters. You may also need to schema-qualify the table name.
</para>
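+
+ <para>
+  As an illustration, assuming a hypothetical table
+  <literal>OrgChart</> in schema <literal>hr</> with mixed-case key
+  columns <literal>EmpId</> and <literal>ManagerId</> of type
+  <type>text</>, the call might be written as:
+ </para>
+
+ <programlisting>
+-- identifiers are quoted inside the string arguments; 'emp1' is a made-up key
+SELECT * FROM connectby('hr."OrgChart"', '"EmpId"', '"ManagerId"', 'emp1', 0)
+  AS t(keyid text, parent_keyid text, level int);
+ </programlisting>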
+
+ <para>
+ In large tables, performance will be poor unless there is an index on
+ the parent-key field.
+ </para>
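+
+ <para>
+  For the <structname>connectby_tree</> table used in the example below,
+  such an index could be created as follows (the index name is
+  arbitrary):
+ </para>
+
+ <programlisting>
+CREATE INDEX connectby_tree_parent_idx ON connectby_tree (parent_keyid);
+ </programlisting>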
+
+ <para>
+ It is important that the <parameter>branch_delim</parameter> string
+ not appear in any key values; otherwise <function>connectby</> may incorrectly
+ report an infinite-recursion error. Note that if
+ <parameter>branch_delim</parameter> is not provided, a default value
+ of <literal>~</> is used for recursion detection purposes.
+ <!-- That pretty well sucks. FIXME -->
+ </para>
+
+ <para>
+ Here is an example:
+ </para>
+
<programlisting>
CREATE TABLE connectby_tree(keyid text, parent_keyid text, pos int);
INSERT INTO connectby_tree VALUES('row8','row6', 0);
INSERT INTO connectby_tree VALUES('row9','row5', 0);
--- with branch, without orderby_fld
+-- with branch, without orderby_fld (order of results is not guaranteed)
SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'row2', 0, '~')
AS t(keyid text, parent_keyid text, level int, branch text);
keyid | parent_keyid | level | branch
row9 | row5 | 2 | row2~row5~row9
(6 rows)
--- without branch, without orderby_fld
+-- without branch, without orderby_fld (order of results is not guaranteed)
SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'row2', 0)
AS t(keyid text, parent_keyid text, level int);
keyid | parent_keyid | level
-- with branch, with orderby_fld (notice that row5 comes before row4)
SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'pos', 'row2', 0, '~')
- AS t(keyid text, parent_keyid text, level int, branch text, pos int) ORDER BY t.pos;
- keyid | parent_keyid | level | branch | pos
+ AS t(keyid text, parent_keyid text, level int, branch text, pos int);
+ keyid | parent_keyid | level | branch | pos
-------+--------------+-------+---------------------+-----
row2 | | 0 | row2 | 1
row5 | row2 | 1 | row2~row5 | 2
-- without branch, with orderby_fld (notice that row5 comes before row4)
SELECT * FROM connectby('connectby_tree', 'keyid', 'parent_keyid', 'pos', 'row2', 0)
- AS t(keyid text, parent_keyid text, level int, pos int) ORDER BY t.pos;
+ AS t(keyid text, parent_keyid text, level int, pos int);
keyid | parent_keyid | level | pos
-------+--------------+-------+-----
row2 | | 0 | 1
(6 rows)
</programlisting>
</sect3>
+
</sect2>
+
<sect2>
<title>Author</title>
+
<para>
Joe Conway
</para>
+
</sect2>
-</sect1>
+</sect1>
-<!-- $PostgreSQL: pgsql/doc/src/sgml/test-parser.sgml,v 1.1 2007/12/03 04:18:47 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/test-parser.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="test-parser">
<title>test_parser</title>
</indexterm>
<para>
- This is an example of a custom parser for full text search.
+ <filename>test_parser</> is an example of a custom parser for full-text
+ search. It doesn't do anything especially useful, but can serve as
+ a starting point for developing your own parser.
</para>
<para>
- It recognizes space-delimited words and returns just two token types:
+ <filename>test_parser</> recognizes words separated by white space,
+ and returns just two token types:
<programlisting>
mydb=# SELECT * FROM ts_token_type('testparser');
- tokid | alias | description
+ tokid | alias | description
-------+-------+---------------
3 | word | Word
12 | blank | Space symbols
<programlisting>
mydb=# SELECT * FROM ts_parse('testparser', 'That''s my first own parser');
- tokid | token
+ tokid | token
-------+--------
3 | That's
- 12 |
+ 12 |
3 | my
- 12 |
+ 12 |
3 | first
- 12 |
+ 12 |
3 | own
- 12 |
+ 12 |
3 | parser
</programlisting>
</para>
ALTER TEXT SEARCH CONFIGURATION
mydb=# SELECT to_tsvector('testcfg', 'That''s my first own parser');
- to_tsvector
+ to_tsvector
-------------------------------
'that':1 'first':3 'parser':5
(1 row)
mydb=# SELECT ts_headline('testcfg', 'Supernovae stars are the brightest phenomena in galaxies',
mydb(# to_tsquery('testcfg', 'star'));
- ts_headline
+ ts_headline
-----------------------------------------------------------------
Supernovae <b>stars</b> are the brightest phenomena in galaxies
(1 row)
+<!-- $PostgreSQL: pgsql/doc/src/sgml/tsearch2.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
+
<sect1 id="tsearch2">
<title>tsearch2</title>
-
+
<indexterm zone="tsearch2">
<primary>tsearch2</primary>
</indexterm>
+<!-- $PostgreSQL: pgsql/doc/src/sgml/uuid-ossp.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="uuid-ossp">
<title>uuid-ossp</title>
-
+
<indexterm zone="uuid-ossp">
<primary>uuid-ossp</primary>
</indexterm>
<para>
- This module provides functions to generate universally unique
- identifiers (UUIDs) using one of the several standard algorithms, as
- well as functions to produce certain special UUID constants.
+ The <filename>uuid-ossp</> module provides functions to generate universally
+ unique identifiers (UUIDs) using one of several standard algorithms. There
+ are also functions to produce certain special UUID constants.
+ </para>
+
+ <para>
+ This module depends on the OSSP UUID library, which can be found at
+ <ulink url="http://www.ossp.org/pkg/lib/uuid/"></ulink>.
</para>
<sect2>
- <title>UUID Generation</title>
+ <title><literal>uuid-ossp</literal> Functions</title>
+
<para>
The relevant standards ITU-T Rec. X.667, ISO/IEC 9834-8:2005, and RFC
4122 specify four algorithms for generating UUIDs, identified by the
</para>
<table>
- <title><literal>uuid-ossp</literal> functions</title>
+ <title>Functions for UUID Generation</title>
<tgroup cols="2">
<thead>
<row>
<para>
This function generates a version 3 UUID in the given namespace using
the specified input name. The namespace should be one of the special
- constants produced by the uuid_ns_*() functions shown below. (It
- could be any UUID in theory.) The name is an identifier in the
- selected namespace. For example:
- </para>
- </entry>
- </row>
- <row>
- <entry><literal>uuid_generate_v3(uuid_ns_url(), 'http://www.postgresql.org')</literal></entry>
- <entry>
- <para>
- The name parameter will be MD5-hashed, so the cleartext cannot be
- derived from the generated UUID.
- </para>
- <para>
- The generation of UUIDs by this method has no random or
- environment-dependent element and is therefore reproducible.
+ constants produced by the <function>uuid_ns_*()</> functions shown
+ below. (It could be any UUID in theory.) The name is an identifier
+ in the selected namespace.
</para>
</entry>
</row>
</tgroup>
</table>
+ <para>
+ For example:
+
+ <programlisting>
+ SELECT uuid_generate_v3(uuid_ns_url(), 'http://www.postgresql.org');
+ </programlisting>
+
+ The name parameter will be MD5-hashed, so the cleartext cannot be
+ derived from the generated UUID.
+ The generation of UUIDs by this method has no random or
+ environment-dependent element and is therefore reproducible.
+ </para>
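+
+ <para>
+  As a quick check of this reproducibility, comparing two calls with
+  identical arguments should yield true:
+ </para>
+
+ <programlisting>
+SELECT uuid_generate_v3(uuid_ns_url(), 'http://www.postgresql.org')
+     = uuid_generate_v3(uuid_ns_url(), 'http://www.postgresql.org') AS same_uuid;
+ </programlisting>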
+
<table>
- <title>UUID Constants</title>
+ <title>Functions Returning UUID Constants</title>
<tgroup cols="2">
<tbody>
<row>
<entry><literal>uuid_nil()</literal></entry>
<entry>
<para>
- A "nil" UUID constant, which does not occur as a real UUID.
+ A <quote>nil</> UUID constant, which does not occur as a real UUID.
</para>
</entry>
</row>
<entry>
<para>
Constant designating the ISO object identifier (OID) namespace for
- UUIDs. (This pertains to ASN.1 OIDs, unrelated to the OIDs used in
- PostgreSQL.)
+ UUIDs. (This pertains to ASN.1 OIDs, which are unrelated to the OIDs
+ used in <productname>PostgreSQL</>.)
</para>
</entry>
</row>
</tgroup>
</table>
</sect2>
+
<sect2>
<title>Author</title>
+
<para>
Peter Eisentraut <email>peter_e@gmx.net</email>
</para>
+
</sect2>
-</sect1>
+</sect1>
+<!-- $PostgreSQL: pgsql/doc/src/sgml/vacuumlo.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
+
<sect1 id="vacuumlo">
<title>vacuumlo</title>
-
+
<indexterm zone="vacuumlo">
<primary>vacuumlo</primary>
</indexterm>
<para>
- This is a simple utility that will remove any orphaned large objects out of a
- PostgreSQL database. An orphaned LO is considered to be any LO whose OID
- does not appear in any OID data column of the database.
- </para>
- <para>
- If you use this, you may also be interested in the lo_manage trigger in
- contrib/lo. lo_manage is useful to try to avoid creating orphaned LOs
- in the first place.
+ <application>vacuumlo</> is a simple utility program that will remove any
+ <quote>orphaned</> large objects from a
+ <productname>PostgreSQL</> database. An orphaned large object (LO) is
+ considered to be any LO whose OID does not appear in any <type>oid</> or
+ <type>lo</> data column of the database.
</para>
+
<para>
- <note>
- <para>
- It was decided to place this in contrib as it needs further testing, but hopefully,
- this (or a variant of it) would make it into the backend as a "vacuum lo"
- command in a later release.
- </para>
- </note>
+ If you use this, you may also be interested in the <function>lo_manage</>
+ trigger in <filename>contrib/lo</> (see <xref linkend="lo">).
+ <function>lo_manage</> helps
+ to avoid creating orphaned LOs in the first place.
</para>
<sect2>
<title>Usage</title>
- <programlisting>
-vacuumlo [options] database [database2 ... databasen]
- </programlisting>
+
+ <synopsis>
+vacuumlo [options] database [database2 ... databaseN]
+ </synopsis>
+
<para>
All databases named on the command line are processed. Available options
include:
</para>
- <programlisting>
--v Write a lot of progress messages
--n Don't remove large objects, just show what would be done
--U username Username to connect as
--W Prompt for password
--h hostname Database server host
--p port Database server port
- </programlisting>
+
+ <variablelist>
+ <varlistentry>
+ <term><option>-v</option></term>
+ <listitem>
+ <para>Write a lot of progress messages</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-n</option></term>
+ <listitem>
+ <para>Don't remove anything, just show what would be done</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-U</option> <replaceable>username</></term>
+ <listitem>
+ <para>Username to connect as</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-W</option></term>
+ <listitem>
+ <para>Force prompt for password (generally useless)</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-h</option> <replaceable>hostname</></term>
+ <listitem>
+ <para>Database server's host</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-p</option> <replaceable>port</></term>
+ <listitem>
+ <para>Database server's port</para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</sect2>
<sect2>
<title>Method</title>
+
<para>
First, <application>vacuumlo</> builds a temporary table which contains all of the OIDs of the
large objects in that database.
</para>
+
<para>
- It then scans through all columns in the database that are of type "oid"
- or "lo", and removes matching entries from the temporary table.
+ It then scans through all columns in the database that are of type
+ <type>oid</> or <type>lo</>, and removes matching entries from the
+ temporary table.
</para>
+
<para>
- The remaining entries in the temp table identify orphaned LOs. These are
- removed.
+ The remaining entries in the temp table identify orphaned LOs.
+ These are removed.
</para>
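<para>
 The three steps above amount to a set difference. The following Python sketch only illustrates that logic; it is not how <application>vacuumlo</> is implemented, and the function name and sample OIDs are hypothetical.
</para>

```python
def find_orphaned_los(all_lo_oids, referencing_columns):
    """Return the OIDs of large objects that no oid/lo column refers to.

    all_lo_oids: every large object OID in the database.
    referencing_columns: one collection of stored values per oid/lo column.
    """
    # Step 1: the "temporary table" holding every large object OID.
    candidates = set(all_lo_oids)
    # Step 2: remove each OID that appears in some oid or lo column.
    for column_values in referencing_columns:
        candidates.difference_update(column_values)
    # Step 3: whatever remains is considered orphaned.
    return sorted(candidates)


# Hypothetical data: three large objects, two columns of type oid or lo.
orphans = find_orphaned_los([16401, 16402, 16403], [[16401], [16403, 99999]])
print(orphans)  # [16402]
```

<para>
 Note that any matching value in an <type>oid</> column keeps a large object alive under this scheme, whether or not the column was meant to refer to large objects.
</para>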
</sect2>
<sect2>
<title>Author</title>
+
<para>
Peter Mount <email>peter@retep.org.uk</email>
</para>
- <para>
- <ulink url="http://www.retep.org.uk"></ulink>
- </para>
</sect2>
</sect1>
-
+<!-- $PostgreSQL: pgsql/doc/src/sgml/xml2.sgml,v 1.4 2007/12/06 04:12:10 tgl Exp $ -->
+
<sect1 id="xml2">
- <title>xml2: XML-handling functions</title>
-
+ <title>xml2</title>
+
<indexterm zone="xml2">
<primary>xml2</primary>
</indexterm>
+ <para>
+ The <filename>xml2</> module provides XPath querying and
+ XSLT functionality.
+ </para>
+
<sect2>
<title>Deprecation notice</title>
+
<para>
- From PostgreSQL 8.3 on, there is XML-related
- functionality based on the SQL/XML standard in the core server.
- That functionality covers XML syntax checking and XPath queries,
- which is what this module does as well, and more, but the API is
- not at all compatible. It is planned that this module will be
- removed in PostgreSQL 8.4 in favor of the newer standard API, so
- you are encouraged to try converting your applications. If you
- find that some of the functionality of this module is not
- available in an adequate form with the newer API, please explain
- your issue to pgsql-hackers@postgresql.org so that the deficiency
- can be addressed.
+ From <productname>PostgreSQL</> 8.3 on, there is XML-related
+ functionality based on the SQL/XML standard in the core server.
+ That functionality covers XML syntax checking and XPath queries,
+ which is what this module does, and more, but the API is
+ not at all compatible. It is planned that this module will be
+ removed in PostgreSQL 8.4 in favor of the newer standard API, so
+ you are encouraged to try converting your applications. If you
+ find that some of the functionality of this module is not
+ available in an adequate form with the newer API, please explain
+ your issue to pgsql-hackers@postgresql.org so that the deficiency
+ can be addressed.
</para>
</sect2>
-
+
<sect2>
<title>Description of functions</title>
+
<para>
- The first set of functions are straightforward XML parsing and XPath queries:
+ These functions provide straightforward XML parsing and XPath queries.
+ All arguments are of type <type>text</>, so for brevity that is not shown.
</para>
<table>
<tbody>
<row>
<entry>
- <programlisting>
- xml_is_well_formed(document) RETURNS bool
- </programlisting>
+ <synopsis>
+ xml_is_well_formed(document) returns bool
+ </synopsis>
</entry>
<entry>
<para>
This parses the document text in its parameter and returns true if the
- document is well-formed XML. (Note: before PostgreSQL 8.2, this function
- was called xml_valid(). That is the wrong name since validity and
- well-formedness have different meanings in XML. The old name is still
- available, but is deprecated and will be removed in 8.3.)
+ document is well-formed XML. (Note: before PostgreSQL 8.2, this
+ function was called <function>xml_valid()</>. That is the wrong name
+ since validity and well-formedness have different meanings in XML.
+ The old name is still available, but is deprecated.)
</para>
</entry>
</row>
<row>
<entry>
- <programlisting>
- xpath_string(document,query) RETURNS text
- xpath_number(document,query) RETURNS float4
- xpath_bool(document,query) RETURNS bool
- </programlisting>
+ <synopsis>
+ xpath_string(document,query) returns text
+ xpath_number(document,query) returns float4
+ xpath_bool(document,query) returns bool
+ </synopsis>
</entry>
<entry>
<para>
</row>
<row>
<entry>
- <programlisting>
- xpath_nodeset(document,query,toptag,itemtag) RETURNS text
- </programlisting>
+ <synopsis>
+ xpath_nodeset(document,query,toptag,itemtag) returns text
+ </synopsis>
</entry>
<entry>
<para>
the result is multivalued, the output will look like:
</para>
<literal>
- <toptag>
- <itemtag>Value 1 which could be an XML fragment</itemtag>
- <itemtag>Value 2....</itemtag>
- </toptag>
+ <toptag>
+ <itemtag>Value 1 which could be an XML fragment</itemtag>
+ <itemtag>Value 2....</itemtag>
+ </toptag>
</literal>
<para>
If either toptag or itemtag is an empty string, the relevant tag is omitted.
</row>
<row>
<entry>
- <programlisting>
- xpath_nodeset(document,query) RETURNS
- </programlisting>
+ <synopsis>
+ xpath_nodeset(document,query) returns text
+ </synopsis>
</entry>
<entry>
<para>
- Like xpath_nodeset(document,query,toptag,itemtag) but text omits both tags.
+ Like xpath_nodeset(document,query,toptag,itemtag) but result omits both tags.
</para>
</entry>
</row>
<row>
<entry>
- <programlisting>
- xpath_nodeset(document,query,itemtag) RETURNS
- </programlisting>
+ <synopsis>
+ xpath_nodeset(document,query,itemtag) returns text
+ </synopsis>
</entry>
<entry>
<para>
- Like xpath_nodeset(document,query,toptag,itemtag) but text omits toptag.
+ Like xpath_nodeset(document,query,toptag,itemtag) but result omits toptag.
</para>
</entry>
</row>
<row>
<entry>
- <programlisting>
- xpath_list(document,query,seperator) RETURNS text
- </programlisting>
+ <synopsis>
+ xpath_list(document,query,separator) returns text
+ </synopsis>
</entry>
<entry>
<para>
- This function returns multiple values seperated by the specified
- seperator, e.g. Value 1,Value 2,Value 3 if seperator=','.
+ This function returns multiple values separated by the specified
+ separator, for example <literal>Value 1,Value 2,Value 3</> if
+ separator is <literal>,</>.
</para>
</entry>
</row>
<row>
<entry>
- <programlisting>
- xpath_list(document,query) RETURNS text
- </programlisting>
+ <synopsis>
+ xpath_list(document,query) returns text
+ </synopsis>
</entry>
<entry>
- This is a wrapper for the above function that uses ',' as the seperator.
+ This is a wrapper for the above function that uses <literal>,</>
+ as the separator.
</entry>
</row>
</tbody>
</table>
</sect2>
-
<sect2>
<title><literal>xpath_table</literal></title>
+
+ <synopsis>
+ xpath_table(text key, text document, text relation, text xpaths, text criteria) returns setof record
+ </synopsis>
+
<para>
- This is a table function which evaluates a set of XPath queries on
- each of a set of documents and returns the results as a table. The
- primary key field from the original document table is returned as the
- first column of the result so that the resultset from xpath_table can
- be readily used in joins.
- </para>
- <para>
- The function itself takes 5 arguments, all text.
+ <function>xpath_table</> is a table function that evaluates a set of XPath
+ queries on each of a set of documents and returns the results as a
+ table. The primary key field from the original document table is returned
+ as the first column of the result so that the result set
+ can readily be used in joins.
</para>
- <programlisting>
- xpath_table(key,document,relation,xpaths,criteria)
- </programlisting>
+
<table>
<title>Parameters</title>
<tgroup cols="2">
<tbody>
<row>
- <entry><literal>key</literal></entry>
+ <entry><parameter>key</parameter></entry>
<entry>
<para>
- the name of the "key" field - this is just a field to be used as
- the first column of the output table i.e. it identifies the record from
- which each output row came (see note below about multiple values).
+ the name of the <quote>key</> field — this is just a field to be used as
+ the first column of the output table, i.e. it identifies the record from
+ which each output row came (see note below about multiple values)
</para>
</entry>
</row>
<row>
- <entry><literal>document</literal></entry>
+ <entry><parameter>document</parameter></entry>
<entry>
<para>
the name of the field containing the XML document
</entry>
</row>
<row>
- <entry><literal>relation</literal></entry>
+ <entry><parameter>relation</parameter></entry>
<entry>
<para>
the name of the table or view containing the documents
</entry>
</row>
<row>
- <entry><literal>xpaths</literal></entry>
+ <entry><parameter>xpaths</parameter></entry>
<entry>
<para>
- multiple xpath expressions separated by <literal>|</literal>
+ one or more XPath expressions, separated by <literal>|</literal>
</para>
</entry>
</row>
<row>
- <entry><literal>criteria</literal></entry>
+ <entry><parameter>criteria</parameter></entry>
<entry>
<para>
- The contents of the where clause. This needs to be specified,
- so use "true" or "1=1" here if you want to process all the rows in the
- relation.
+ the contents of the WHERE clause. This cannot be omitted, so use
+ <literal>true</literal> or <literal>1=1</literal> if you want to
+ process all the rows in the relation
</para>
</entry>
</row>
</table>
<para>
- NB These parameters (except the XPath strings) are just substituted
- into a plain SQL SELECT statement, so you have some flexibility - the
+ These parameters (except the XPath strings) are just substituted
+ into a plain SQL SELECT statement, so you have some flexibility — the
statement is
</para>
<para>
<literal>
- SELECT <key>,<document> FROM <relation> WHERE <criteria>
+ SELECT <key>, <document> FROM <relation> WHERE <criteria>
</literal>
</para>
<para>
- so those parameters can be *anything* valid in those particular
+ so those parameters can be <emphasis>anything</> valid in those particular
locations. The result from this SELECT needs to return exactly two
columns (which it will unless you try to list multiple fields for key
or document). Beware that this simplistic approach requires that you
validate any user-supplied values to avoid SQL injection attacks.
</para>
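<para>
 To see why validation matters, consider this Python sketch of the substitution described above. The helper <literal>build_source_query</literal> and the <literal>author_id</literal> column are hypothetical; the module itself is not written this way, but the statement it assembles has the same shape.
</para>

```python
def build_source_query(key, document, relation, criteria):
    # Hypothetical helper: the four parameters are pasted into the
    # statement as-is, with no quoting or validation.
    return f"SELECT {key}, {document} FROM {relation} WHERE {criteria}"


print(build_source_query("article_id", "article_xml", "articles", "true"))
# SELECT article_id, article_xml FROM articles WHERE true

# A value taken directly from a user can change which rows are selected.
user_input = "7 OR true"
print(build_source_query("article_id", "article_xml", "articles",
                         "author_id = " + user_input))
# SELECT article_id, article_xml FROM articles WHERE author_id = 7 OR true
```

<para>
 The second call was presumably meant to select one author's articles, yet it matches every row.
</para>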
-
- <para>
- Using the function
- </para>
-
+
<para>
- The function has to be used in a FROM expression. This gives the following
- form:
+ The function has to be used in a <literal>FROM</> expression, with an
+ <literal>AS</> clause to specify the output columns; for example
</para>
-
+
<programlisting>
SELECT * FROM
-xpath_table('article_id',
- 'article_xml',
- 'articles',
- '/article/author|/article/pages|/article/title',
- 'date_entered > ''2003-01-01'' ')
+xpath_table('article_id',
+ 'article_xml',
+ 'articles',
+ '/article/author|/article/pages|/article/title',
+ 'date_entered > ''2003-01-01'' ')
AS t(article_id integer, author text, page_count integer, title text);
</programlisting>
<para>
- The AS clause defines the names and types of the columns in the
- virtual table. If there are more XPath queries than result columns,
+ The <literal>AS</> clause defines the names and types of the columns in the
+ output table. The first is the <quote>key</> field and the rest correspond
+ to the XPath queries.
+ If there are more XPath queries than result columns,
the extra queries will be ignored. If there are more result columns
than XPath queries, the extra columns will be NULL.
</para>
<para>
- Note that I've said in this example that pages is an integer. The
- function deals internally with string representations, so when you say
- you want an integer in the output, it will take the string
- representation of the XPath result and use PostgreSQL input functions
- to transform it into an integer (or whatever type the AS clause
- requests). An error will result if it can't do this - for example if
- the result is empty - so you may wish to just stick to 'text' as the
- column type if you think your data has any problems.
+ Notice that this example defines the <structname>page_count</> result
+ column as an integer. The function deals internally with string
+ representations, so when you say you want an integer in the output, it will
+ take the string representation of the XPath result and use PostgreSQL input
+ functions to transform it into an integer (or whatever type the <literal>AS</>
+ clause requests). An error will result if it can't do this — for
+ example if the result is empty — so you may wish to just stick to
+ <type>text</> as the column type if you think your data has any problems.
</para>
+
<para>
- The select statement doesn't need to use * alone - it can reference the
+ The calling <command>SELECT</> statement doesn't necessarily have to be
+ just <literal>SELECT *</>; it can reference the output
columns by name or join them to other tables. The function produces a
virtual table with which you can perform any operation you wish (e.g.
aggregation, joining, sorting). So we could also have:
</para>
<programlisting>
-SELECT t.title, p.fullname, p.email
-FROM xpath_table('article_id','article_xml','articles',
- '/article/title|/article/author/@id',
- 'xpath_string(article_xml,''/article/@date'') > ''2003-03-20'' ')
- AS t(article_id integer, title text, author_id integer),
- tblPeopleInfo AS p
+SELECT t.title, p.fullname, p.email
+FROM xpath_table('article_id', 'article_xml', 'articles',
+ '/article/title|/article/author/@id',
+ 'xpath_string(article_xml,''/article/@date'') > ''2003-03-20'' ')
+ AS t(article_id integer, title text, author_id integer),
+ tblPeopleInfo AS p
WHERE t.author_id = p.person_id;
</programlisting>
as a more complicated example. Of course, you could wrap all
of this in a view for convenience.
</para>
+
<sect3>
<title>Multivalued results</title>
+
<para>
- The xpath_table function assumes that the results of each XPath query
+ The <function>xpath_table</> function assumes that the results of each XPath query
might be multi-valued, so the number of rows returned by the function
may not be the same as the number of input documents. The first row
returned contains the first result from each query, the second row the
second result from each query. If one of the queries has fewer values
than the others, NULLs will be returned instead.
</para>
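<para>
 The row-alignment rule can be pictured with a short Python sketch. This is an illustration of the behaviour described above, not the module's code; <literal>align_results</literal> is a hypothetical name and the sample values are made up.
</para>

```python
from itertools import zip_longest


def align_results(key, per_query_results):
    # Row i holds the i-th result of every query. Queries with fewer
    # results are padded with None, standing in for SQL NULL.
    return [(key, *row)
            for row in zip_longest(*per_query_results, fillvalue=None)]


# One document (key 1) and three XPath queries: the first yields a single
# value, the other two yield two values each.
rows = align_results(1, [["C1"], ["L1", "L2"], ["1", "11"]])
for row in rows:
    print(row)
# (1, 'C1', 'L1', '1')
# (1, None, 'L2', '11')
```

<para>
 The single-valued result appears only on the first row, which is the situation the next paragraph addresses.
</para>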
+
<para>
In some cases, a user will know that a given XPath query will return
- only a single result (perhaps a unique document identifier) - if used
+ only a single result (perhaps a unique document identifier) — if used
alongside an XPath query returning multiple results, the single-valued
result will appear only on the first row of the result. The solution
to this is to use the key field as part of a join against a simpler
XPath query. As an example:
</para>
-
- <para>
- <literal>
- CREATE TABLE test
- (
- id int4 NOT NULL,
- xml text,
- CONSTRAINT pk PRIMARY KEY (id)
- )
- WITHOUT OIDS;
-
- INSERT INTO test VALUES (1, '<doc num="C1">
- <line num="L1"><a>1</a><b>2</b><c>3</c></line>
- <line num="L2"><a>11</a><b>22</b><c>33</c></line>
- </doc>');
-
- INSERT INTO test VALUES (2, '<doc num="C2">
- <line num="L1"><a>111</a><b>222</b><c>333</c></line>
- <line num="L2"><a>111</a><b>222</b><c>333</c></line>
- </doc>');
- </literal>
- </para>
- </sect3>
-
- <sect3>
- <title>The query</title>
-
- <programlisting>
- SELECT * FROM xpath_table('id','xml','test',
- '/doc/@num|/doc/line/@num|/doc/line/a|/doc/line/b|/doc/line/c','1=1')
- AS t(id int4, doc_num varchar(10), line_num varchar(10), val1 int4,
- val2 int4, val3 int4)
- WHERE id = 1 ORDER BY doc_num, line_num
- </programlisting>
-
- <para>
- Gives the result:
- </para>
-
+
<programlisting>
- id | doc_num | line_num | val1 | val2 | val3
- ----+---------+----------+------+------+------
- 1 | C1 | L1 | 1 | 2 | 3
- 1 | | L2 | 11 | 22 | 33
+ CREATE TABLE test (
+ id int4 NOT NULL,
+ xml text,
+ CONSTRAINT pk PRIMARY KEY (id)
+ );
+
+ INSERT INTO test VALUES (1, '<doc num="C1">
+ <line num="L1"><a>1</a><b>2</b><c>3</c></line>
+ <line num="L2"><a>11</a><b>22</b><c>33</c></line>
+ </doc>');
+
+ INSERT INTO test VALUES (2, '<doc num="C2">
+ <line num="L1"><a>111</a><b>222</b><c>333</c></line>
+ <line num="L2"><a>111</a><b>222</b><c>333</c></line>
+ </doc>');
+
+ SELECT * FROM
+ xpath_table('id','xml','test',
+ '/doc/@num|/doc/line/@num|/doc/line/a|/doc/line/b|/doc/line/c',
+ 'true')
+ AS t(id int4, doc_num varchar(10), line_num varchar(10), val1 int4, val2 int4, val3 int4)
+ WHERE id = 1 ORDER BY doc_num, line_num;
+
+ id | doc_num | line_num | val1 | val2 | val3
+ ----+---------+----------+------+------+------
+ 1 | C1 | L1 | 1 | 2 | 3
+ 1 | | L2 | 11 | 22 | 33
</programlisting>
-
+
<para>
- To get doc_num on every line, the solution is to use two invocations
- of xpath_table and join the results:
+ To get doc_num on every line, the solution is to use two invocations
+ of xpath_table and join the results:
</para>
-
+
<programlisting>
- SELECT t.*,i.doc_num FROM
- xpath_table('id','xml','test',
- '/doc/line/@num|/doc/line/a|/doc/line/b|/doc/line/c','1=1')
- AS t(id int4, line_num varchar(10), val1 int4, val2 int4, val3 int4),
- xpath_table('id','xml','test','/doc/@num','1=1')
- AS i(id int4, doc_num varchar(10))
+ SELECT t.*,i.doc_num FROM
+ xpath_table('id', 'xml', 'test',
+ '/doc/line/@num|/doc/line/a|/doc/line/b|/doc/line/c',
+ 'true')
+ AS t(id int4, line_num varchar(10), val1 int4, val2 int4, val3 int4),
+ xpath_table('id', 'xml', 'test', '/doc/@num', 'true')
+ AS i(id int4, doc_num varchar(10))
WHERE i.id=t.id AND i.id=1
ORDER BY doc_num, line_num;
- </programlisting>
-
- <para>
- which gives the desired result:
- </para>
-
- <programlisting>
+
id | line_num | val1 | val2 | val3 | doc_num
----+----------+------+------+------+---------
1 | L1 | 1 | 2 | 3 | C1
</programlisting>
</sect3>
</sect2>
-
<sect2>
<title>XSLT functions</title>
+
<para>
The following functions are available if libxslt is installed (this is
not currently detected automatically, so you will have to amend the
- Makefile)
+ Makefile):
</para>
<sect3>
<title><literal>xslt_process</literal></title>
- <programlisting>
- xslt_process(document,stylesheet,paramlist) RETURNS text
- </programlisting>
+
+ <synopsis>
+ xslt_process(text document, text stylesheet, text paramlist) returns text
+ </synopsis>
<para>
This function applies the XSL stylesheet to the document and returns
the transformed result. The paramlist is a list of parameter
assignments to be used in the transformation, specified in the form
- 'a=1,b=2'. Note that this is also proof-of-concept code and the
- parameter parsing is very simple-minded (e.g. parameter values cannot
- contain commas!)
+ <literal>a=1,b=2</>. Note that the
+ parameter parsing is very simple-minded: parameter values cannot
+ contain commas!
</para>
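<para>
 The comma restriction follows from splitting the list on commas. The Python sketch below assumes a parser of that general kind; <literal>parse_paramlist</literal> is a hypothetical name, and the module's actual parsing may differ in details such as whitespace handling.
</para>

```python
def parse_paramlist(paramlist):
    # Assumed behaviour: split on commas, then split each item at the
    # first "=". Nothing here can escape or quote a comma.
    params = {}
    for item in paramlist.split(","):
        name, _, value = item.partition("=")
        params[name.strip()] = value.strip()
    return params


print(parse_paramlist("a=1,b=2"))
# {'a': '1', 'b': '2'}

# A comma inside a value is taken as the start of another parameter.
print(parse_paramlist("title=Hello, world"))
# {'title': 'Hello', 'world': ''}
```

<para>
 With a parser like this, a value containing a comma is silently split, so such values have to be avoided or supplied some other way.
</para>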
+
<para>
Also note that if either the document or stylesheet values do not
begin with a < then they will be treated as URLs and libxslt will
- fetch them. It thus follows that you can use xslt_process as a means
- to fetch the contents of URLs - you should be aware of the security
- implications of this.
- </para>
+ fetch them. It follows that you can use <function>xslt_process</> as a
+ means to fetch the contents of URLs — you should be aware of the
+ security implications of this.
+ </para>
+
<para>
- There is also a two-parameter version of xslt_process which does not
- pass any parameters to the transformation.
+ There is also a two-parameter version of <function>xslt_process</> which
+ does not pass any parameters to the transformation.
</para>
</sect3>
</sect2>
<sect2>
- <title>Credits</title>
- <para>
- Development of this module was sponsored by Torchbox Ltd. (www.torchbox.com)
- It has the same BSD licence as PostgreSQL.
- </para>
+ <title>Author</title>
+
<para>
- This version of the XML functions provides both XPath querying and
- XSLT functionality. There is also a new table function which allows
- the straightforward return of multiple XML results. Note that the current code
- doesn't take any particular care over character sets - this is
- something that should be fixed at some point!
+ John Gray <email>jgray@azuli.co.uk</email>
</para>
+
<para>
- If you have any comments or suggestions, please do contact me at
- <email>jgray@azuli.co.uk.</email> Unfortunately, this isn't my main job, so
- I can't guarantee a rapid response to your query!
+ Development of this module was sponsored by Torchbox Ltd. (www.torchbox.com).
+ It has the same BSD licence as PostgreSQL.
</para>
</sect2>
-</sect1>
+</sect1>