1 <!-- doc/src/sgml/spgist.sgml -->
4 <title>SP-GiST Indexes</title>
7 <primary>index</primary>
8 <secondary>SP-GiST</secondary>
11 <sect1 id="spgist-intro">
12 <title>Introduction</title>
15 <acronym>SP-GiST</acronym> is an abbreviation for space-partitioned
16 <acronym>GiST</acronym>. <acronym>SP-GiST</acronym> supports partitioned
17 search trees, which facilitate development of a wide range of different
18 non-balanced data structures, such as quad-trees, k-d trees, and radix
19 trees (tries). The common feature of these structures is that they
20 repeatedly divide the search space into partitions that need not be
21 of equal size. Searches that are well matched to the partitioning rule
26 These popular data structures were originally developed for in-memory
27 usage. In main memory, they are usually designed as a set of dynamically
28 allocated nodes linked by pointers. This is not suitable for direct
29 storing on disk, since these chains of pointers can be rather long which
30 would require too many disk accesses. In contrast, disk-based data
31 structures should have a high fanout to minimize I/O. The challenge
32 addressed by <acronym>SP-GiST</acronym> is to map search tree nodes to
33 disk pages in such a way that a search need access only a few disk pages,
34 even if it traverses many nodes.
38 Like <acronym>GiST</acronym>, <acronym>SP-GiST</acronym> is meant to allow
39 the development of custom data types with the appropriate access methods,
40 by an expert in the domain of the data type, rather than a database expert.
44 Some of the information here is derived from Purdue University's
45 SP-GiST Indexing Project
46 <ulink url="http://www.cs.purdue.edu/spgist/">web site</ulink>.
47 The <acronym>SP-GiST</acronym> implementation in
48 <productname>PostgreSQL</productname> is primarily maintained by Teodor
49 Sigaev and Oleg Bartunov, and there is more information on their
50 <!-- URL will be changed -->
51 <ulink url="http://www.sai.msu.su/~megera/wiki/spgist_dev">web site</ulink>.
56 <sect1 id="spgist-builtin-opclasses">
57 <title>Built-in Operator Classes</title>
60 The core <productname>PostgreSQL</> distribution
61 includes the <acronym>SP-GiST</acronym> operator classes shown in
62 <xref linkend="spgist-builtin-opclasses-table">.
65 <table id="spgist-builtin-opclasses-table">
66 <title>Built-in <acronym>SP-GiST</acronym> Operator Classes</title>
71 <entry>Indexed Data Type</entry>
72 <entry>Indexable Operators</entry>
77 <entry><literal>kd_point_ops</></entry>
78 <entry><type>point</></entry>
89 <entry><literal>quad_point_ops</></entry>
90 <entry><type>point</></entry>
101 <entry><literal>range_ops</></entry>
102 <entry>any range type</entry>
104 <literal>&&</>
105 <literal>&<</>
106 <literal>&></>
116 <entry><literal>box_ops</></entry>
117 <entry><type>box</></entry>
120 <literal>&<</>
121 <literal>&&</>
122 <literal>&></>
127 <literal>&<|</>
128 <literal><<|</>
129 <literal>|>></literal>
130 <literal>|&></>
134 <entry><literal>text_ops</></entry>
135 <entry><type>text</></entry>
149 <entry><literal>inet_ops</></entry>
150 <entry><type>inet</>, <type>cidr</></entry>
152 <literal>&&</>
154 <literal>>>=</>
159 <literal><<=</>
170 Of the two operator classes for type <type>point</>,
171 <literal>quad_point_ops</> is the default. <literal>kd_point_ops</>
172 supports the same operators but uses a different index data structure which
173 may offer better performance in some applications.
178 <sect1 id="spgist-extensibility">
179 <title>Extensibility</title>
182 <acronym>SP-GiST</acronym> offers an interface with a high level of
183 abstraction, requiring the access method developer to implement only
184 methods specific to a given data type. The <acronym>SP-GiST</acronym> core
185 is responsible for efficient disk mapping and searching the tree structure.
186 It also takes care of concurrency and logging considerations.
190 Leaf tuples of an <acronym>SP-GiST</acronym> tree contain values of the
191 same data type as the indexed column. Leaf tuples at the root level will
192 always contain the original indexed data value, but leaf tuples at lower
193 levels might contain only a compressed representation, such as a suffix.
194 In that case the operator class support functions must be able to
195 reconstruct the original value using information accumulated from the
196 inner tuples that are passed through to reach the leaf level.
200 Inner tuples are more complex, since they are branching points in the
201 search tree. Each inner tuple contains a set of one or more
202 <firstterm>nodes</>, which represent groups of similar leaf values.
203 A node contains a downlink that leads either to another, lower-level inner
204 tuple, or to a short list of leaf tuples that all lie on the same index page.
205 Each node normally has a <firstterm>label</> that describes it; for example,
206 in a radix tree the node label could be the next character of the string
207 value. (Alternatively, an operator class can omit the node labels, if it
208 works with a fixed set of nodes for all inner tuples;
209 see <xref linkend="spgist-null-labels">.)
210 Optionally, an inner tuple can have a <firstterm>prefix</> value
211 that describes all its members. In a radix tree this could be the common
212 prefix of the represented strings. The prefix value is not necessarily
213 really a prefix, but can be any data needed by the operator class;
214 for example, in a quad-tree it can store the central point that the four
215 quadrants are measured with respect to. A quad-tree inner tuple would
216 then also contain four nodes corresponding to the quadrants around this
221 Some tree algorithms require knowledge of level (or depth) of the current
222 tuple, so the <acronym>SP-GiST</acronym> core provides the possibility for
223 operator classes to manage level counting while descending the tree.
224 There is also support for incrementally reconstructing the represented
225 value when that is needed, and for passing down additional data (called
226 <firstterm>traverse values</>) during a tree descent.
231 The <acronym>SP-GiST</acronym> core code takes care of null entries.
232 Although <acronym>SP-GiST</acronym> indexes do store entries for nulls
233 in indexed columns, this is hidden from the index operator class code:
234 no null index entries or search conditions will ever be passed to the
235 operator class methods. (It is assumed that <acronym>SP-GiST</acronym>
236 operators are strict and so cannot succeed for null values.) Null values
237 are therefore not discussed further here.
242 There are five user-defined methods that an index operator class for
243 <acronym>SP-GiST</acronym> must provide. All five follow the convention
244 of accepting two <type>internal</> arguments, the first of which is a
245 pointer to a C struct containing input values for the support method,
246 while the second argument is a pointer to a C struct where output values
247 must be placed. Four of the methods just return <type>void</>, since
248 all their results appear in the output struct; but
249 <function>leaf_consistent</> additionally returns a <type>boolean</> result.
250 The methods must not modify any fields of their input structs. In all
251 cases, the output struct is initialized to zeroes before calling the
256 The five user-defined methods are:
261 <term><function>config</></term>
264 Returns static information about the index implementation, including
265 the data type OIDs of the prefix and node label data types.
268 The <acronym>SQL</> declaration of the function must look like this:
270 CREATE FUNCTION my_config(internal, internal) RETURNS void ...
272 The first argument is a pointer to a <structname>spgConfigIn</>
273 C struct, containing input data for the function.
274 The second argument is a pointer to a <structname>spgConfigOut</>
275 C struct, which the function must fill with result data.
277 typedef struct spgConfigIn
279 Oid attType; /* Data type to be indexed */
282 typedef struct spgConfigOut
284 Oid prefixType; /* Data type of inner-tuple prefixes */
285 Oid labelType; /* Data type of inner-tuple node labels */
286 bool canReturnData; /* Opclass can reconstruct original data */
287 bool longValuesOK; /* Opclass can cope with values > 1 page */
291 <structfield>attType</> is passed in order to support polymorphic
292 index operator classes; for ordinary fixed-data-type operator classes, it
293 will always have the same value and so can be ignored.
297 For operator classes that do not use prefixes,
298 <structfield>prefixType</> can be set to <literal>VOIDOID</>.
299 Likewise, for operator classes that do not use node labels,
300 <structfield>labelType</> can be set to <literal>VOIDOID</>.
301 <structfield>canReturnData</> should be set true if the operator class
302 is capable of reconstructing the originally-supplied index value.
303 <structfield>longValuesOK</> should be set true only when the
304 <structfield>attType</> is of variable length and the operator
305 class is capable of segmenting long values by repeated suffixing
306 (see <xref linkend="spgist-limits">).
312 <term><function>choose</></term>
315 Chooses a method for inserting a new value into an inner tuple.
319 The <acronym>SQL</> declaration of the function must look like this:
321 CREATE FUNCTION my_choose(internal, internal) RETURNS void ...
323 The first argument is a pointer to a <structname>spgChooseIn</>
324 C struct, containing input data for the function.
325 The second argument is a pointer to a <structname>spgChooseOut</>
326 C struct, which the function must fill with result data.
328 typedef struct spgChooseIn
330 Datum datum; /* original datum to be indexed */
331 Datum leafDatum; /* current datum to be stored at leaf */
332 int level; /* current level (counting from zero) */
334 /* Data from current inner tuple */
335 bool allTheSame; /* tuple is marked all-the-same? */
336 bool hasPrefix; /* tuple has a prefix? */
337 Datum prefixDatum; /* if so, the prefix value */
338 int nNodes; /* number of nodes in the inner tuple */
339 Datum *nodeLabels; /* node label values (NULL if none) */
342 typedef enum spgChooseResultType
344 spgMatchNode = 1, /* descend into existing node */
345 spgAddNode, /* add a node to the inner tuple */
346 spgSplitTuple /* split inner tuple (change its prefix) */
347 } spgChooseResultType;
349 typedef struct spgChooseOut
351 spgChooseResultType resultType; /* action code, see above */
354 struct /* results for spgMatchNode */
356 int nodeN; /* descend to this node (index from 0) */
357 int levelAdd; /* increment level by this much */
358 Datum restDatum; /* new leaf datum */
360 struct /* results for spgAddNode */
362 Datum nodeLabel; /* new node's label */
363 int nodeN; /* where to insert it (index from 0) */
365 struct /* results for spgSplitTuple */
367 /* Info to form new upper-level inner tuple with one child tuple */
368 bool prefixHasPrefix; /* tuple should have a prefix? */
369 Datum prefixPrefixDatum; /* if so, its value */
370 int prefixNNodes; /* number of nodes */
371 Datum *prefixNodeLabels; /* their labels (or NULL for
373 int childNodeN; /* which node gets child tuple */
375 /* Info to form new lower-level inner tuple with all old nodes */
376 bool postfixHasPrefix; /* tuple should have a prefix? */
377 Datum postfixPrefixDatum; /* if so, its value */
383 <structfield>datum</> is the original datum that was to be inserted
385 <structfield>leafDatum</> is initially the same as
386 <structfield>datum</>, but can change at lower levels of the tree
387 if the <function>choose</function> or <function>picksplit</function>
388 methods change it. When the insertion search reaches a leaf page,
389 the current value of <structfield>leafDatum</> is what will be stored
390 in the newly created leaf tuple.
391 <structfield>level</> is the current inner tuple's level, starting at
392 zero for the root level.
393 <structfield>allTheSame</> is true if the current inner tuple is
394 marked as containing multiple equivalent nodes
395 (see <xref linkend="spgist-all-the-same">).
396 <structfield>hasPrefix</> is true if the current inner tuple contains
398 <structfield>prefixDatum</> is its value.
399 <structfield>nNodes</> is the number of child nodes contained in the
401 <structfield>nodeLabels</> is an array of their label values, or
402 NULL if there are no labels.
406 The <function>choose</function> function can determine either that
407 the new value matches one of the existing child nodes, or that a new
408 child node must be added, or that the new value is inconsistent with
409 the tuple prefix and so the inner tuple must be split to create a
410 less restrictive prefix.
414 If the new value matches one of the existing child nodes,
415 set <structfield>resultType</> to <literal>spgMatchNode</>.
416 Set <structfield>nodeN</> to the index (from zero) of that node in
418 Set <structfield>levelAdd</> to the increment in
419 <structfield>level</> caused by descending through that node,
420 or leave it as zero if the operator class does not use levels.
421 Set <structfield>restDatum</> to equal <structfield>datum</>
422 if the operator class does not modify datums from one level to the
423 next, or otherwise set it to the modified value to be used as
424 <structfield>leafDatum</> at the next level.
428 If a new child node must be added,
429 set <structfield>resultType</> to <literal>spgAddNode</>.
430 Set <structfield>nodeLabel</> to the label to be used for the new
431 node, and set <structfield>nodeN</> to the index (from zero) at which
432 to insert the node in the node array.
433 After the node has been added, the <function>choose</function>
434 function will be called again with the modified inner tuple;
435 that call should result in an <literal>spgMatchNode</> result.
439 If the new value is inconsistent with the tuple prefix,
440 set <structfield>resultType</> to <literal>spgSplitTuple</>.
441 This action moves all the existing nodes into a new lower-level
442 inner tuple, and replaces the existing inner tuple with a tuple
443 having a single downlink pointing to the new lower-level inner tuple.
444 Set <structfield>prefixHasPrefix</> to indicate whether the new
445 upper tuple should have a prefix, and if so set
446 <structfield>prefixPrefixDatum</> to the prefix value. This new
447 prefix value must be sufficiently less restrictive than the original
448 to accept the new value to be indexed.
449 Set <structfield>prefixNNodes</> to the number of nodes needed in the
450 new tuple, and set <structfield>prefixNodeLabels</> to a palloc'd array
451 holding their labels, or to NULL if node labels are not required.
452 Note that the total size of the new upper tuple must be no more
453 than the total size of the tuple it is replacing; this constrains
454 the lengths of the new prefix and new labels.
455 Set <structfield>childNodeN</> to the index (from zero) of the node
456 that will downlink to the new lower-level inner tuple.
457 Set <structfield>postfixHasPrefix</> to indicate whether the new
458 lower-level inner tuple should have a prefix, and if so set
459 <structfield>postfixPrefixDatum</> to the prefix value. The
460 combination of these two prefixes and the downlink node's label
461 (if any) must have the same meaning as the original prefix, because
462 there is no opportunity to alter the node labels that are moved to
463 the new lower-level tuple, nor to change any child index entries.
464 After the node has been split, the <function>choose</function>
465 function will be called again with the replacement inner tuple.
466 That call may return an <literal>spgAddNode</> result, if no suitable
467 node was created by the <literal>spgSplitTuple</> action. Eventually
468 <function>choose</function> must return <literal>spgMatchNode</> to
469 allow the insertion to descend to the next level.
475 <term><function>picksplit</></term>
478 Decides how to create a new inner tuple over a set of leaf tuples.
482 The <acronym>SQL</> declaration of the function must look like this:
484 CREATE FUNCTION my_picksplit(internal, internal) RETURNS void ...
486 The first argument is a pointer to a <structname>spgPickSplitIn</>
487 C struct, containing input data for the function.
488 The second argument is a pointer to a <structname>spgPickSplitOut</>
489 C struct, which the function must fill with result data.
491 typedef struct spgPickSplitIn
493 int nTuples; /* number of leaf tuples */
494 Datum *datums; /* their datums (array of length nTuples) */
495 int level; /* current level (counting from zero) */
498 typedef struct spgPickSplitOut
500 bool hasPrefix; /* new inner tuple should have a prefix? */
501 Datum prefixDatum; /* if so, its value */
503 int nNodes; /* number of nodes for new inner tuple */
504 Datum *nodeLabels; /* their labels (or NULL for no labels) */
506 int *mapTuplesToNodes; /* node index for each leaf tuple */
507 Datum *leafTupleDatums; /* datum to store in each new leaf tuple */
511 <structfield>nTuples</> is the number of leaf tuples provided.
512 <structfield>datums</> is an array of their datum values.
513 <structfield>level</> is the current level that all the leaf tuples
514 share, which will become the level of the new inner tuple.
518 Set <structfield>hasPrefix</> to indicate whether the new inner
519 tuple should have a prefix, and if so set
520 <structfield>prefixDatum</> to the prefix value.
521 Set <structfield>nNodes</> to indicate the number of nodes that
522 the new inner tuple will contain, and
523 set <structfield>nodeLabels</> to an array of their label values,
524 or to NULL if node labels are not required.
525 Set <structfield>mapTuplesToNodes</> to an array that gives the index
526 (from zero) of the node that each leaf tuple should be assigned to.
527 Set <structfield>leafTupleDatums</> to an array of the values to
528 be stored in the new leaf tuples (these will be the same as the
529 input <structfield>datums</> if the operator class does not modify
530 datums from one level to the next).
531 Note that the <function>picksplit</> function is
532 responsible for palloc'ing the
533 <structfield>nodeLabels</>, <structfield>mapTuplesToNodes</> and
534 <structfield>leafTupleDatums</> arrays.
538 If more than one leaf tuple is supplied, it is expected that the
539 <function>picksplit</> function will classify them into more than
540 one node; otherwise it is not possible to split the leaf tuples
541 across multiple pages, which is the ultimate purpose of this
542 operation. Therefore, if the <function>picksplit</> function
543 ends up placing all the leaf tuples in the same node, the core
544 SP-GiST code will override that decision and generate an inner
545 tuple in which the leaf tuples are assigned at random to several
546 identically-labeled nodes. Such a tuple is marked
547 <literal>allTheSame</> to signify that this has happened. The
548 <function>choose</> and <function>inner_consistent</> functions
549 must take suitable care with such inner tuples.
550 See <xref linkend="spgist-all-the-same"> for more information.
554 <function>picksplit</> can be applied to a single leaf tuple only
555 in the case that the <function>config</> function set
556 <structfield>longValuesOK</> to true and a larger-than-a-page input
557 value has been supplied. In this case the point of the operation is
558 to strip off a prefix and produce a new, shorter leaf datum value.
559 The call will be repeated until a leaf datum short enough to fit on
560 a page has been produced. See <xref linkend="spgist-limits"> for
567 <term><function>inner_consistent</></term>
570 Returns set of nodes (branches) to follow during tree search.
574 The <acronym>SQL</> declaration of the function must look like this:
576 CREATE FUNCTION my_inner_consistent(internal, internal) RETURNS void ...
578 The first argument is a pointer to a <structname>spgInnerConsistentIn</>
579 C struct, containing input data for the function.
580 The second argument is a pointer to a <structname>spgInnerConsistentOut</>
581 C struct, which the function must fill with result data.
584 typedef struct spgInnerConsistentIn
586 ScanKey scankeys; /* array of operators and comparison values */
587 int nkeys; /* length of array */
589 Datum reconstructedValue; /* value reconstructed at parent */
590 void *traversalValue; /* opclass-specific traverse value */
591 MemoryContext traversalMemoryContext; /* put new traverse values here */
592 int level; /* current level (counting from zero) */
593 bool returnData; /* original data must be returned? */
595 /* Data from current inner tuple */
596 bool allTheSame; /* tuple is marked all-the-same? */
597 bool hasPrefix; /* tuple has a prefix? */
598 Datum prefixDatum; /* if so, the prefix value */
599 int nNodes; /* number of nodes in the inner tuple */
600 Datum *nodeLabels; /* node label values (NULL if none) */
601 } spgInnerConsistentIn;
603 typedef struct spgInnerConsistentOut
605 int nNodes; /* number of child nodes to be visited */
606 int *nodeNumbers; /* their indexes in the node array */
607 int *levelAdds; /* increment level by this much for each */
608 Datum *reconstructedValues; /* associated reconstructed values */
609 void **traversalValues; /* opclass-specific traverse values */
610 } spgInnerConsistentOut;
613 The array <structfield>scankeys</>, of length <structfield>nkeys</>,
614 describes the index search condition(s). These conditions are
615 combined with AND — only index entries that satisfy all of
616 them are interesting. (Note that <structfield>nkeys</> = 0 implies
617 that all index entries satisfy the query.) Usually the consistent
618 function only cares about the <structfield>sk_strategy</> and
619 <structfield>sk_argument</> fields of each array entry, which
620 respectively give the indexable operator and comparison value.
621 In particular it is not necessary to check <structfield>sk_flags</> to
622 see if the comparison value is NULL, because the SP-GiST core code
623 will filter out such conditions.
624 <structfield>reconstructedValue</> is the value reconstructed for the
625 parent tuple; it is <literal>(Datum) 0</> at the root level or if the
626 <function>inner_consistent</> function did not provide a value at the
628 <structfield>traversalValue</> is a pointer to any traverse data
629 passed down from the previous call of <function>inner_consistent</>
630 on the parent index tuple, or NULL at the root level.
631 <structfield>traversalMemoryContext</> is the memory context in which
632 to store output traverse values (see below).
633 <structfield>level</> is the current inner tuple's level, starting at
634 zero for the root level.
635 <structfield>returnData</> is <literal>true</> if reconstructed data is
636 required for this query; this will only be so if the
637 <function>config</> function asserted <structfield>canReturnData</>.
638 <structfield>allTheSame</> is true if the current inner tuple is
639 marked <quote>all-the-same</>; in this case all the nodes have the
640 same label (if any) and so either all or none of them match the query
641 (see <xref linkend="spgist-all-the-same">).
642 <structfield>hasPrefix</> is true if the current inner tuple contains
644 <structfield>prefixDatum</> is its value.
645 <structfield>nNodes</> is the number of child nodes contained in the
647 <structfield>nodeLabels</> is an array of their label values, or
648 NULL if the nodes do not have labels.
652 <structfield>nNodes</> must be set to the number of child nodes that
653 need to be visited by the search, and
654 <structfield>nodeNumbers</> must be set to an array of their indexes.
655 If the operator class keeps track of levels, set
656 <structfield>levelAdds</> to an array of the level increments
657 required when descending to each node to be visited. (Often these
658 increments will be the same for all the nodes, but that's not
659 necessarily so, so an array is used.)
660 If value reconstruction is needed, set
661 <structfield>reconstructedValues</> to an array of the values
662 reconstructed for each child node to be visited; otherwise, leave
663 <structfield>reconstructedValues</> as NULL.
664 If it is desired to pass down additional out-of-band information
665 (<quote>traverse values</>) to lower levels of the tree search,
666 set <structfield>traversalValues</> to an array of the appropriate
667 traverse values, one for each child node to be visited; otherwise,
668 leave <structfield>traversalValues</> as NULL.
669 Note that the <function>inner_consistent</> function is
670 responsible for palloc'ing the
671 <structfield>nodeNumbers</>, <structfield>levelAdds</>,
672 <structfield>reconstructedValues</>, and
673 <structfield>traversalValues</> arrays in the current memory context.
674 However, any output traverse values pointed to by
675 the <structfield>traversalValues</> array should be allocated
676 in <structfield>traversalMemoryContext</>.
677 Each traverse value must be a single palloc'd chunk.
683 <term><function>leaf_consistent</></term>
686 Returns true if a leaf tuple satisfies a query.
690 The <acronym>SQL</> declaration of the function must look like this:
692 CREATE FUNCTION my_leaf_consistent(internal, internal) RETURNS bool ...
694 The first argument is a pointer to a <structname>spgLeafConsistentIn</>
695 C struct, containing input data for the function.
696 The second argument is a pointer to a <structname>spgLeafConsistentOut</>
697 C struct, which the function must fill with result data.
699 typedef struct spgLeafConsistentIn
701 ScanKey scankeys; /* array of operators and comparison values */
702 int nkeys; /* length of array */
704 Datum reconstructedValue; /* value reconstructed at parent */
705 void *traversalValue; /* opclass-specific traverse value */
706 int level; /* current level (counting from zero) */
707 bool returnData; /* original data must be returned? */
709 Datum leafDatum; /* datum in leaf tuple */
710 } spgLeafConsistentIn;
712 typedef struct spgLeafConsistentOut
714 Datum leafValue; /* reconstructed original data, if any */
715 bool recheck; /* set true if operator must be rechecked */
716 } spgLeafConsistentOut;
719 The array <structfield>scankeys</>, of length <structfield>nkeys</>,
720 describes the index search condition(s). These conditions are
721 combined with AND — only index entries that satisfy all of
722 them satisfy the query. (Note that <structfield>nkeys</> = 0 implies
723 that all index entries satisfy the query.) Usually the consistent
724 function only cares about the <structfield>sk_strategy</> and
725 <structfield>sk_argument</> fields of each array entry, which
726 respectively give the indexable operator and comparison value.
727 In particular it is not necessary to check <structfield>sk_flags</> to
728 see if the comparison value is NULL, because the SP-GiST core code
729 will filter out such conditions.
730 <structfield>reconstructedValue</> is the value reconstructed for the
731 parent tuple; it is <literal>(Datum) 0</> at the root level or if the
732 <function>inner_consistent</> function did not provide a value at the
734 <structfield>traversalValue</> is a pointer to any traverse data
735 passed down from the previous call of <function>inner_consistent</>
736 on the parent index tuple, or NULL at the root level.
737 <structfield>level</> is the current leaf tuple's level, starting at
738 zero for the root level.
739 <structfield>returnData</> is <literal>true</> if reconstructed data is
740 required for this query; this will only be so if the
741 <function>config</> function asserted <structfield>canReturnData</>.
742 <structfield>leafDatum</> is the key value stored in the current
747 The function must return <literal>true</> if the leaf tuple matches the
748 query, or <literal>false</> if not. In the <literal>true</> case,
749 if <structfield>returnData</> is <literal>true</> then
750 <structfield>leafValue</> must be set to the value originally supplied
751 to be indexed for this leaf tuple. Also,
752 <structfield>recheck</> may be set to <literal>true</> if the match
753 is uncertain and so the operator(s) must be re-applied to the actual
754 heap tuple to verify the match.
761 All the SP-GiST support methods are normally called in a short-lived
762 memory context; that is, <varname>CurrentMemoryContext</> will be reset
763 after processing of each tuple. It is therefore not very important to
764 worry about pfree'ing everything you palloc. (The <function>config</>
765 method is an exception: it should try to avoid leaking memory. But
766 usually the <function>config</> method need do nothing but assign
767 constants into the passed parameter struct.)
771 If the indexed column is of a collatable data type, the index collation
772 will be passed to all the support methods, using the standard
773 <function>PG_GET_COLLATION()</> mechanism.
778 <sect1 id="spgist-implementation">
779 <title>Implementation</title>
782 This section covers implementation details and other tricks that are
783 useful for implementers of <acronym>SP-GiST</acronym> operator classes to
787 <sect2 id="spgist-limits">
788 <title>SP-GiST Limits</title>
791 Individual leaf tuples and inner tuples must fit on a single index page
792 (8kB by default). Therefore, when indexing values of variable-length
793 data types, long values can only be supported by methods such as radix
794 trees, in which each level of the tree includes a prefix that is short
795 enough to fit on a page, and the final leaf level includes a suffix also
796 short enough to fit on a page. The operator class should set
797 <structfield>longValuesOK</> to TRUE only if it is prepared to arrange for
798 this to happen. Otherwise, the <acronym>SP-GiST</acronym> core will
799 reject any request to index a value that is too large to fit
804 Likewise, it is the operator class's responsibility that inner tuples
805 do not grow too large to fit on an index page; this limits the number
806 of child nodes that can be used in one inner tuple, as well as the
807 maximum size of a prefix value.
811 Another limitation is that when an inner tuple's node points to a set
812 of leaf tuples, those tuples must all be in the same index page.
813 (This is a design decision to reduce seeking and save space in the
814 links that chain such tuples together.) If the set of leaf tuples
815 grows too large for a page, a split is performed and an intermediate
816 inner tuple is inserted. For this to fix the problem, the new inner
817 tuple <emphasis>must</> divide the set of leaf values into more than one
818 node group. If the operator class's <function>picksplit</> function
819 fails to do that, the <acronym>SP-GiST</acronym> core resorts to
820 extraordinary measures described in <xref linkend="spgist-all-the-same">.
824 <sect2 id="spgist-null-labels">
825 <title>SP-GiST Without Node Labels</title>
828 Some tree algorithms use a fixed set of nodes for each inner tuple;
829 for example, in a quad-tree there are always exactly four nodes
830 corresponding to the four quadrants around the inner tuple's centroid
831 point. In such a case the code typically works with the nodes by
832 number, and there is no need for explicit node labels. To suppress
833 node labels (and thereby save some space), the <function>picksplit</>
834 function can return NULL for the <structfield>nodeLabels</> array,
835 and likewise the <function>choose</> function can return NULL for
836 the <structfield>prefixNodeLabels</> array during
837 a <literal>spgSplitTuple</> action.
838 This will in turn result in <structfield>nodeLabels</> being NULL during
839 subsequent calls to <function>choose</> and <function>inner_consistent</>.
840 In principle, node labels could be used for some inner tuples and omitted
841 for others in the same index.
845 When working with an inner tuple having unlabeled nodes, it is an error
846 for <function>choose</> to return <literal>spgAddNode</>, since the set
847 of nodes is supposed to be fixed in such cases.
851 <sect2 id="spgist-all-the-same">
852 <title><quote>All-the-same</> Inner Tuples</title>
855 The <acronym>SP-GiST</acronym> core can override the results of the
856 operator class's <function>picksplit</> function when
857 <function>picksplit</> fails to divide the supplied leaf values into
858 at least two node categories. When this happens, the new inner tuple
859 is created with multiple nodes that each have the same label (if any)
860 that <function>picksplit</> gave to the one node it did use, and the
861 leaf values are divided at random among these equivalent nodes.
862 The <literal>allTheSame</> flag is set on the inner tuple to warn the
863 <function>choose</> and <function>inner_consistent</> functions that the
864 tuple does not have the node set that they might otherwise expect.
868 When dealing with an <literal>allTheSame</> tuple, a <function>choose</>
869 result of <literal>spgMatchNode</> is interpreted to mean that the new
870 value can be assigned to any of the equivalent nodes; the core code will
871 ignore the supplied <structfield>nodeN</> value and descend into one
872 of the nodes at random (so as to keep the tree balanced). It is an
873 error for <function>choose</> to return <literal>spgAddNode</>, since
874 that would make the nodes not all equivalent; the
875 <literal>spgSplitTuple</> action must be used if the value to be inserted
876 doesn't match the existing nodes.
880 When dealing with an <literal>allTheSame</> tuple, the
881 <function>inner_consistent</> function should return either all or none
882 of the nodes as targets for continuing the index search, since they are
883 all equivalent. This may or may not require any special-case code,
884 depending on how much the <function>inner_consistent</> function normally
885 assumes about the meaning of the nodes.
891 <sect1 id="spgist-examples">
892 <title>Examples</title>
895 The <productname>PostgreSQL</productname> source distribution includes
896 several examples of index operator classes for <acronym>SP-GiST</acronym>,
897 as described in <xref linkend="spgist-builtin-opclasses-table">. Look
898 into <filename>src/backend/access/spgist/</>
899 and <filename>src/backend/utils/adt/</> to see the code.