-<!-- $PostgreSQL: pgsql/doc/src/sgml/gin.sgml,v 2.11 2007/02/16 03:50:29 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/gin.sgml,v 2.12 2007/11/13 23:36:26 tgl Exp $ -->
<chapter id="GIN">
<title>GIN Indexes</title>
<para>
The <productname>PostgreSQL</productname> source distribution includes
- <acronym>GIN</acronym> classes for one-dimensional arrays of all internal
- types. The following
+ <acronym>GIN</acronym> operator classes for <type>tsvector</> and
+ for one-dimensional arrays of all internal types. The following
<filename>contrib</> modules also contain <acronym>GIN</acronym>
operator classes:
</para>
<variablelist>
+ <varlistentry>
+ <term>hstore</term>
+ <listitem>
+ <para>Module for storing (key, value) pairs</para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term>intarray</term>
<listitem>
</varlistentry>
<varlistentry>
- <term>tsearch2</term>
+ <term>pg_trgm</term>
<listitem>
- <para>Support for inverted text indexing. This is much faster for very
- large, mostly-static sets of documents.
- </para>
+ <para>Text similarity using trigram matching</para>
</listitem>
</varlistentry>
</variablelist>
-<!-- $PostgreSQL: pgsql/doc/src/sgml/gist.sgml,v 1.28 2007/01/31 20:56:17 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/gist.sgml,v 1.29 2007/11/13 23:36:26 tgl Exp $ -->
<chapter id="GiST">
<title>GiST Indexes</title>
<para>
The <productname>PostgreSQL</productname> source distribution includes
several examples of index methods implemented using
- <acronym>GiST</acronym>. The core system currently provides R-Tree
- equivalent functionality for some of the built-in geometric data types
+ <acronym>GiST</acronym>. The core system currently provides text search
+ support (indexing for <type>tsvector</> and <type>tsquery</>) as well as
+ R-Tree equivalent functionality for some of the built-in geometric data types
(see <filename>src/backend/access/gist/gistproc.c</>). The following
<filename>contrib</> modules also contain <acronym>GiST</acronym>
operator classes:
</listitem>
</varlistentry>
+ <varlistentry>
+ <term>hstore</term>
+ <listitem>
+ <para>Module for storing (key, value) pairs</para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term>intarray</term>
<listitem>
<para>Indexing for <quote>float ranges</quote></para>
</listitem>
</varlistentry>
-
- <varlistentry>
- <term>tsearch2</term>
- <listitem>
- <para>Full text indexing</para>
- </listitem>
- </varlistentry>
</variablelist>
</sect1>
-<!-- $PostgreSQL: pgsql/doc/src/sgml/indices.sgml,v 1.71 2007/04/06 22:33:41 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/indices.sgml,v 1.72 2007/11/13 23:36:26 tgl Exp $ -->
<chapter id="indexes">
<title id="indexes-title">Indexes</title>
(See <xref linkend="functions-geometry"> for the meaning of
these operators.)
- Also, an <literal>IS NULL</> condition on
- an index column can be used with a GiST index.
Many other GiST operator
classes are available in the <literal>contrib</> collection or as separate
projects. For more information see <xref linkend="GiST">.
(See <xref linkend="functions-array"> for the meaning of
these operators.)
- GIN indexes cannot use <literal>IS NULL</> as a search condition.
- Other GIN operator classes are available in the <literal>contrib</>
- <literal>tsearch2</literal> and <literal>intarray</literal> modules.
- For more information see <xref linkend="GIN">.
+ Many other GIN operator
+ classes are available in the <literal>contrib</> collection or as separate
+ projects. For more information see <xref linkend="GIN">.
</para>
</sect1>
</sect2>
<sect2>
- <title>Tsearch2 Integration</title>
+ <title>Text Search Integration</title>
<para>
Trigram matching is a very useful tool when used in conjunction
- with a text index created by the Tsearch2 contrib module. (See
- contrib/tsearch2)
+ with a full text index.
</para>
<para>
The first step is to generate an auxiliary table containing all
- the unique words in the Tsearch2 index:
+ the unique words in the documents:
</para>
<programlisting>
CREATE TABLE words AS SELECT word FROM
ts_stat('SELECT to_tsvector(''simple'', bodytext) FROM documents');
</programlisting>
<para>
- Where 'documents' is a table that has a text field 'bodytext'
- that TSearch2 is used to search. The use of the 'simple' dictionary
- with the to_tsvector function, instead of just using the already
+ where <structname>documents</> is a table that has a text field
+ <structfield>bodytext</> that we wish to search. The use of the
+ <literal>simple</> configuration with the <function>to_tsvector</>
+ function, instead of just using the already
existing vector, is to avoid creating a list of already stemmed
words. This way, only the original, unstemmed words are added
to the word list.
<para>
<note>
<para>
- Since the 'words' table has been generated as a separate,
+ Since the <structname>words</> table has been generated as a separate,
static table, it will need to be periodically regenerated so that
- it remains up to date with the word list in the Tsearch2 index.
+ it remains up to date with the document collection.
</para>
</note>
</para>
<sect2>
<title>References</title>
- <para>
- Tsearch2 Development Site
- <ulink url="http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/"></ulink>
- </para>
<para>
GiST Development Site
<ulink url="http://www.sai.msu.su/~megera/postgres/gist/"></ulink>
</para>
+ <para>
+ Tsearch2 Development Site
+ <ulink url="http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/"></ulink>
+ </para>
</sect2>
<sect2>
* User-defined opclasses. (The scheme is similar to GiST.)
* Optimized index creation (Makes use of maintenance_work_mem to accumulate
postings in memory.)
- * Tsearch2 support via an opclass
+ * Text search support via an opclass
* Soft upper limit on the returned result set using a GUC variable:
gin_fuzzy_search_limit