<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
To use it, mention the class name in <command>CREATE INDEX</>,
for example
<programlisting>
-CREATE INDEX ON my_table USING gist (my_inet_column inet_ops);
+CREATE INDEX ON my_table USING GIST (my_inet_column inet_ops);
</programlisting>
</para>
<para>
There are seven methods that an index operator class for
- <acronym>GiST</acronym> must provide, and an eighth that is optional.
+ <acronym>GiST</acronym> must provide, and two that are optional.
Correctness of the index is ensured
by proper implementation of the <function>same</>, <function>consistent</>
and <function>union</> methods, while efficiency (size and speed) of the
of the <command>CREATE OPERATOR CLASS</> command can be used.
The optional eighth method is <function>distance</>, which is needed
if the operator class wishes to support ordered scans (nearest-neighbor
- searches).
+ searches). The optional ninth method <function>fetch</> is needed if the
+ operator class wishes to support index-only scans.
</para>
<variablelist>
And the matching code in the C module could then follow this skeleton:
<programlisting>
-Datum my_consistent(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(my_consistent);
Datum
the value being looked up in the index. The <literal>StrategyNumber</>
parameter indicates which operator of your operator class is being
applied — it matches one of the operator numbers in the
- <command>CREATE OPERATOR CLASS</> command. Depending on what operators
- you have included in the class, the data type of <varname>query</> could
- vary with the operator, but the above skeleton assumes it doesn't.
+ <command>CREATE OPERATOR CLASS</> command.
+ </para>
+
+ <para>
+ Depending on which operators you have included in the class, the data
+ type of <varname>query</> could vary with the operator, since it will
+ be whatever type is on the right-hand side of the operator, which might
+ be different from the indexed data type appearing on the left-hand side.
+ (The above code skeleton assumes that only one type is possible; if
+ not, fetching the <varname>query</> argument value would have to depend
+ on the operator.) It is recommended that the SQL declaration of
+ the <function>consistent</> function use the opclass's indexed data
+ type for the <varname>query</> argument, even though the actual type
+ might be something else depending on the operator.
</para>
</listitem>
<programlisting>
CREATE OR REPLACE FUNCTION my_union(internal, internal)
-RETURNS internal
+RETURNS storage_type
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
</programlisting>
And the matching code in the C module could then follow this skeleton:
<programlisting>
-Datum my_union(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(my_union);
Datum
</para>
<para>
- The <function>union</> implementation function should return a
- pointer to newly <function>palloc()</>ed memory. You can't just
- return whatever the input is.
+ The result of the <function>union</> function must be a value of the
+ index's storage type, whatever that is (it might or might not be
+ different from the indexed column's type). The <function>union</>
+ function should return a pointer to newly <function>palloc()</>ed
+ memory. You can't just return the input value as-is, even if there is
+ no type change.
+ </para>
+
+ <para>
+ As shown above, the <function>union</> function's
+ first <type>internal</> argument is actually
+ a <structname>GistEntryVector</> pointer. The second argument is a
+ pointer to an integer variable, which can be ignored. (It used to be
+ required that the <function>union</> function store the size of its
+ result value into that variable, but this is no longer necessary.)
</para>
</listitem>
</varlistentry>
And the matching code in the C module could then follow this skeleton:
<programlisting>
-Datum my_compress(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(my_compress);
Datum
<para>
The reverse of the <function>compress</function> method. Converts the
index representation of the data item into a format that can be
- manipulated by the database.
+ manipulated by the other GiST methods in the operator class.
</para>
<para>
And the matching code in the C module could then follow this skeleton:
<programlisting>
-Datum my_decompress(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(my_decompress);
Datum
And the matching code in the C module could then follow this skeleton:
<programlisting>
-Datum my_penalty(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(my_penalty);
Datum
PG_RETURN_POINTER(penalty);
}
</programlisting>
+
+ For historical reasons, the <function>penalty</> function doesn't
+ just return a <type>float</> result; instead it has to store the value
+ at the location indicated by the third argument. The return
+ value per se is ignored, though it's conventional to pass back the
+ address of that argument.
</para>
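<para>
 The out-parameter convention described above can be illustrated without
 any PostgreSQL headers. The following is only a minimal sketch:
 <structname>DemoKey</> and <function>demo_penalty</> are hypothetical
 names invented for this example, and the cost formula (how much a range
 must grow to absorb the new entry) is an assumption, not what any
 particular operator class does.
</para>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an index key: a closed range of integers. */
typedef struct
{
    int lo;
    int hi;
} DemoKey;

/*
 * Store the insertion cost at *result, the way a penalty method stores
 * its value through the third argument.  The pointer handed back is the
 * same address; a caller following the convention above reads *result
 * and does not rely on the return value.
 */
static float *
demo_penalty(const DemoKey *orig, const DemoKey *add, float *result)
{
    int lo = (add->lo < orig->lo) ? add->lo : orig->lo;
    int hi = (add->hi > orig->hi) ? add->hi : orig->hi;

    assert(result != NULL);

    /* Cost: how far the original range must widen to cover the new key. */
    *result = (float) ((hi - lo) - (orig->hi - orig->lo));
    return result;
}
```

<para>
 A caller would pass the address of its own <type>float</> variable as
 the third argument and read the cost from that variable afterwards.
</para>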
<para>
And the matching code in the C module could then follow this skeleton:
<programlisting>
-Datum my_picksplit(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(my_picksplit);
Datum
my_picksplit(PG_FUNCTION_ARGS)
{
GistEntryVector *entryvec = (GistEntryVector *) PG_GETARG_POINTER(0);
+ GIST_SPLITVEC *v = (GIST_SPLITVEC *) PG_GETARG_POINTER(1);
OffsetNumber maxoff = entryvec->n - 1;
GISTENTRY *ent = entryvec->vector;
- GIST_SPLITVEC *v = (GIST_SPLITVEC *) PG_GETARG_POINTER(1);
int i,
nbytes;
OffsetNumber *left,
PG_RETURN_POINTER(v);
}
</programlisting>
+
+ Notice that the <function>picksplit</> function's result is delivered
+ by modifying the passed-in <structname>v</> structure. The return
+ value per se is ignored, though it's conventional to pass back the
+ address of <structname>v</>.
</para>
<para>
<listitem>
<para>
Returns true if two index entries are identical, false otherwise.
+ (An <quote>index entry</> is a value of the index's storage type,
+ not necessarily the original indexed column's type.)
</para>
<para>
The <acronym>SQL</> declaration of the function must look like this:
<programlisting>
-CREATE OR REPLACE FUNCTION my_same(internal, internal, internal)
+CREATE OR REPLACE FUNCTION my_same(storage_type, storage_type, internal)
RETURNS internal
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
And the matching code in the C module could then follow this skeleton:
<programlisting>
-Datum my_same(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(my_same);
Datum
For historical reasons, the <function>same</> function doesn't
just return a Boolean result; instead it has to store the flag
- at the location indicated by the third argument.
+ at the location indicated by the third argument. The return
+ value per se is ignored, though it's conventional to pass back the
+ address of that argument.
</para>
</listitem>
</varlistentry>
The <acronym>SQL</> declaration of the function must look like this:
<programlisting>
-CREATE OR REPLACE FUNCTION my_distance(internal, data_type, smallint, oid)
+CREATE OR REPLACE FUNCTION my_distance(internal, data_type, smallint, oid, internal)
RETURNS float8
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
And the matching code in the C module could then follow this skeleton:
<programlisting>
-Datum my_distance(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(my_distance);
Datum
data_type *query = PG_GETARG_DATA_TYPE_P(1);
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
/* Oid subtype = PG_GETARG_OID(3); */
+ /* bool *recheck = (bool *) PG_GETARG_POINTER(4); */
data_type *key = DatumGetDataType(entry->key);
double retval;
</programlisting>
The arguments to the <function>distance</> function are identical to
- the arguments of the <function>consistent</> function, except that no
- recheck flag is used. The distance to a leaf index entry must always
- be determined exactly, since there is no way to re-order the tuples
- once they are returned. Some approximation is allowed when determining
- the distance to an internal tree node, so long as the result is never
- greater than any child's actual distance. Thus, for example, distance
- to a bounding box is usually sufficient in geometric applications. The
- result value can be any finite <type>float8</> value. (Infinity and
- minus infinity are used internally to handle cases such as nulls, so it
- is not recommended that <function>distance</> functions return these
- values.)
+ the arguments of the <function>consistent</> function.
+ </para>
+
+ <para>
+ Some approximation is allowed when determining the distance, so long
+ as the result is never greater than the entry's actual distance. Thus,
+ for example, distance to a bounding box is usually sufficient in
+ geometric applications. For an internal tree node, the distance
+ returned must not be greater than the distance to any of the child
+ nodes. If the returned distance is not exact, the function must set
+ <literal>*recheck</> to true. (This is not necessary for internal tree
+ nodes; for them, the calculation is always assumed to be inexact.) When
+ <literal>*recheck</> is true, the executor will calculate the accurate
+ distance after fetching the tuple from the heap, and reorder the tuples
+ if necessary.
+ </para>
+
+ <para>
+ If the distance function sets <literal>*recheck = true</> for any
+ leaf node, the original ordering operator's return type must
+ be <type>float8</> or <type>float4</>, and the distance function's
+ result values must be comparable to those of the original ordering
+ operator, since the executor will sort using both distance function
+ results and recalculated ordering-operator results. Otherwise, the
+ distance function's result values can be any finite <type>float8</>
+ values, so long as the relative order of the result values matches the
+ order returned by the ordering operator. (Infinity and minus infinity
+ are used internally to handle cases such as nulls, so it is not
+ recommended that <function>distance</> functions return these values.)
</para>
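<para>
 The lower-bound rule for inexact distances can be sketched in plain C,
 using a one-dimensional range as a stand-in for a bounding box. This is
 only an illustration under stated assumptions: <structname>DemoRange</>,
 <function>demo_exact_distance</> and <function>demo_range_distance</> are
 hypothetical names, and the code does not use the real
 <structname>GISTENTRY</> interface.
</para>

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lossy key: a closed range covering the indexed values. */
typedef struct
{
    double lo;
    double hi;
} DemoRange;

/* Exact distance between two scalar values. */
static double
demo_exact_distance(double a, double b)
{
    return (a > b) ? a - b : b - a;
}

/*
 * Distance from the query value to the nearest edge of the range, or
 * zero when the query lies inside it.  This is never greater than the
 * exact distance to any value the range contains.
 */
static double
demo_range_distance(DemoRange key, double query, bool *recheck)
{
    assert(key.lo <= key.hi);

    /* The result is only a bound, so ask the caller to recheck it. */
    *recheck = true;

    if (query < key.lo)
        return key.lo - query;
    if (query > key.hi)
        return query - key.hi;
    return 0.0;
}
```

<para>
 For a query value outside the range, the sketch returns the gap to the
 nearer edge, which cannot exceed the exact distance to any value stored
 under that key.
</para>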
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><function>fetch</></term>
+ <listitem>
+ <para>
+ Converts the compressed index representation of a data item into the
+ original data type, for index-only scans. The returned data must be an
+ exact, non-lossy copy of the originally indexed value.
+ </para>
+
+ <para>
+ The <acronym>SQL</> declaration of the function must look like this:
+
+<programlisting>
+CREATE OR REPLACE FUNCTION my_fetch(internal)
+RETURNS internal
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+</programlisting>
+
+ The argument is a pointer to a <structname>GISTENTRY</> struct. On
+ entry, its <structfield>key</> field contains a non-NULL leaf datum in
+ compressed form. The return value is another <structname>GISTENTRY</>
+ struct, whose <structfield>key</> field contains the same datum in its
+ original, uncompressed form. If the opclass's compress function does
+ nothing for leaf entries, the <function>fetch</> method can return the
+ argument as-is.
+ </para>
+
+ <para>
+ The matching code in the C module could then follow this skeleton:
+
+<programlisting>
+PG_FUNCTION_INFO_V1(my_fetch);
+
+Datum
+my_fetch(PG_FUNCTION_ARGS)
+{
+ GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
+ input_data_type *in = (input_data_type *) DatumGetPointer(entry->key);
+ fetched_data_type *fetched_data;
+ GISTENTRY *retval;
+
+ retval = palloc(sizeof(GISTENTRY));
+ fetched_data = palloc(sizeof(fetched_data_type));
+
+ /*
+ * Convert 'in' into 'fetched_data', a value of the original data type.
+ */
+
+ /* fill *retval from fetched_data. */
+ gistentryinit(*retval, PointerGetDatum(fetched_data),
+ entry->rel, entry->page, entry->offset, FALSE);
+
+ PG_RETURN_POINTER(retval);
+}
+</programlisting>
+ </para>
+
+ <para>
+ If the compress method is lossy for leaf entries, the operator class
+ cannot support index-only scans, and must not define
+ a <function>fetch</> function.
+ </para>
+
+ </listitem>
+ </varlistentry>
</variablelist>
<para>