-<!-- $PostgreSQL: pgsql/doc/src/sgml/gin.sgml,v 2.3 2006/09/14 21:15:07 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/gin.sgml,v 2.4 2006/09/18 12:11:36 teodor Exp $ -->
<chapter id="GIN">
<title>GIN Indexes</title>
<para>
<acronym>GIN</acronym> stands for Generalized Inverted Index. It is
an index structure storing a set of (key, posting list) pairs, where
- 'posting list' is a set of rows in which the key occurs. The
+ 'posting list' is a set of rows in which the key occurs. Each
row may contain many keys.
</para>
<listitem>
<para>
Returns an array of keys of the query to be executed. n contains
- strategy number of operation (see <xref linkend="xindex-strategies">).
+ the strategy number of the operation
+ (see <xref linkend="xindex-strategies">).
Depending on n, query may be different type.
</para>
</listitem>
<term>bool consistent( bool check[], StrategyNumber n, Datum query)</term>
<listitem>
<para>
- Returns TRUE if indexed value satisfies query qualifier with strategy n
- (or may satisfy in case of RECHECK mark in operator class).
- Each element of the check array is TRUE if indexed value has a
+ Returns TRUE if the indexed value satisfies the query qualifier with
+ strategy n (or may satisfy it, if the operator is marked RECHECK in the operator class).
+ Each element of the check array is TRUE if the indexed value has a
corresponding key in the query: if (check[i] == TRUE ) the i-th key of
the query is present in the indexed value.
</para>
<term>Create vs insert</term>
<listitem>
<para>
- In most cases, insertion into <acronym>GIN</acronym> index is slow because
- many GIN keys may be inserted for each table row. So, when loading data
- in bulk it may be useful to drop index and recreate it
- after the data is loaded in the table.
+ In most cases, insertion into a <acronym>GIN</acronym> index is slow
+ due to the likelihood of many keys being inserted for each value.
+ So, for bulk insertions into a table it is advisable to drop the GIN
+ index and recreate it after finishing bulk insertion.
</para>
</listitem>
</varlistentry>
<term>gin_fuzzy_search_limit</term>
<listitem>
<para>
- The primary goal of development <acronym>GIN</acronym> indices was
+ The primary goal of developing <acronym>GIN</acronym> indices was
support for highly scalable, full-text search in
<productname>PostgreSQL</productname> and there are often situations when
a full-text search returns a very large set of results. Since reading
<para>
Such queries usually contain very frequent words, so the results are not
very helpful. To facilitate execution of such queries
- <acronym>GIN</acronym> has a configurable soft upper limit of the size
+ <acronym>GIN</acronym> has a configurable soft upper limit of the size
of the returned set, determined by the
<varname>gin_fuzzy_search_limit</varname> GUC variable. It is set to 0 by
default (no limit).
<title>Limitations</title>
<para>
- <acronym>GIN</acronym> doesn't support full scan of index due to it's
- extremely inefficiency: because of a lot of keys per value,
+ <acronym>GIN</acronym> doesn't support full index scans due to their
+ extreme inefficiency: because there are often many keys per value,
each heap pointer will returned several times.
</para>
<para>
- When extractQuery returns zero number of keys, <acronym>GIN</acronym> will
- emit a error: for different opclass and strategy semantic meaning of void
- query may be different (for example, any array contains void array,
- but they aren't overlapped with void one), and <acronym>GIN</acronym> can't
+ When extractQuery returns zero keys, <acronym>GIN</acronym> will emit an
+ error: for different opclasses and strategies the semantic meaning of a void
+ query may be different (for example, any array contains the void array,
+ but it doesn't overlap the void array), and <acronym>GIN</acronym> can't
suggest reasonable answer.
</para>
-<!-- $PostgreSQL: pgsql/doc/src/sgml/indices.sgml,v 1.63 2006/09/16 00:30:14 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/indices.sgml,v 1.64 2006/09/18 12:11:36 teodor Exp $ -->
<chapter id="indexes">
<title id="indexes-title">Indexes</title>
<see>index</see>
</indexterm>
GIN is a inverted index and it's usable for values which have more
- than one key, arrays for example. Like to GiST, GIN may support
+ than one key, arrays for example. Like GiST, GIN may support
many different user-defined indexing strategies and the particular
operators with which a GIN index can be used vary depending on the
indexing strategy.
(See <xref linkend="functions-array"> for the meaning of
these operators.)
- Another GIN operator classes are available in the <literal>contrib</>
+ Other GIN operator classes are available in the <literal>contrib</>
tsearch2 and intarray modules. For more information see <xref linkend="GIN">.
</para>
</sect1>
-<!-- $PostgreSQL: pgsql/doc/src/sgml/mvcc.sgml,v 2.61 2006/09/17 22:50:31 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/mvcc.sgml,v 2.62 2006/09/18 12:11:36 teodor Exp $ -->
<chapter id="mvcc">
<title>Concurrency Control</title>
</term>
<listitem>
<para>
- Short-term share/exclusive page-level locks are used for
- read/write access. Locks are released immediately after each
- index row is fetched or inserted. However, note that a GIN index
- usually requires several inserts for each table row.
+ Short-term share/exclusive page-level locks are used for
+ read/write access. Locks are released immediately after each
+ index row is fetched or inserted. But note that inserting a
+ GIN-indexed value usually produces several index key insertions
+ per row, so GIN may do substantial work for a single value's
+ insertion.
</para>
</listitem>
</varlistentry>