1 <!-- doc/src/sgml/xindex.sgml -->
4 <title>Interfacing Extensions to Indexes</title>
6 <indexterm zone="xindex">
7 <primary>index</primary>
8 <secondary>for user-defined data type</secondary>
12 The procedures described thus far let you define new types, new
13 functions, and new operators. However, we cannot yet define an
14 index on a column of a new data type. To do this, we must define an
15 <firstterm>operator class</firstterm> for the new data type. Later in this
16 section, we will illustrate this concept in an example: a new
17 operator class for the B-tree index method that stores and sorts
18 complex numbers in ascending absolute value order.
22 Operator classes can be grouped into <firstterm>operator families</firstterm>
23 to show the relationships between semantically compatible classes.
24 When only a single data type is involved, an operator class is sufficient,
25 so we'll focus on that case first and then return to operator families.
28 <sect2 id="xindex-opclass">
29 <title>Index Methods and Operator Classes</title>
32 The <classname>pg_am</classname> table contains one row for every
33 index method (internally known as access method). Support for
34 regular access to tables is built into
35 <productname>PostgreSQL</productname>, but all index methods are
36 described in <classname>pg_am</classname>. It is possible to add a
37 new index access method by writing the necessary code and
38 then creating an entry in <classname>pg_am</classname> — but that is
39 beyond the scope of this chapter (see <xref linkend="indexam"/>).
43 The routines for an index method do not directly know anything
44 about the data types that the index method will operate on.
45 Instead, an <firstterm>operator
46 class</firstterm><indexterm><primary>operator class</primary></indexterm>
47 identifies the set of operations that the index method needs to use
48 to work with a particular data type. Operator classes are so
49 called because one thing they specify is the set of
50 <literal>WHERE</literal>-clause operators that can be used with an index
51 (i.e., can be converted into an index-scan qualification). An
52 operator class can also specify some <firstterm>support
53 function</firstterm> that are needed by the internal operations of the
54 index method, but do not directly correspond to any
55 <literal>WHERE</literal>-clause operator that can be used with the index.
59 It is possible to define multiple operator classes for the same
60 data type and index method. By doing this, multiple
61 sets of indexing semantics can be defined for a single data type.
62 For example, a B-tree index requires a sort ordering to be defined
63 for each data type it works on.
64 It might be useful for a complex-number data type
65 to have one B-tree operator class that sorts the data by complex
66 absolute value, another that sorts by real part, and so on.
67 Typically, one of the operator classes will be deemed most commonly
68 useful and will be marked as the default operator class for that
69 data type and index method.
73 The same operator class name
74 can be used for several different index methods (for example, both B-tree
75 and hash index methods have operator classes named
76 <literal>int4_ops</literal>), but each such class is an independent
77 entity and must be defined separately.
81 <sect2 id="xindex-strategies">
82 <title>Index Method Strategies</title>
85 The operators associated with an operator class are identified by
86 <quote>strategy numbers</quote>, which serve to identify the semantics of
87 each operator within the context of its operator class.
88 For example, B-trees impose a strict ordering on keys, lesser to greater,
89 and so operators like <quote>less than</quote> and <quote>greater than or equal
90 to</quote> are interesting with respect to a B-tree.
92 <productname>PostgreSQL</productname> allows the user to define operators,
93 <productname>PostgreSQL</productname> cannot look at the name of an operator
94 (e.g., <literal><</literal> or <literal>>=</literal>) and tell what kind of
95 comparison it is. Instead, the index method defines a set of
96 <quote>strategies</quote>, which can be thought of as generalized operators.
97 Each operator class specifies which actual operator corresponds to each
98 strategy for a particular data type and interpretation of the index
103 The B-tree index method defines five strategies, shown in <xref
104 linkend="xindex-btree-strat-table"/>.
107 <table tocentry="1" id="xindex-btree-strat-table">
108 <title>B-Tree Strategies</title>
112 <entry>Operation</entry>
113 <entry>Strategy Number</entry>
118 <entry>less than</entry>
122 <entry>less than or equal</entry>
130 <entry>greater than or equal</entry>
134 <entry>greater than</entry>
142 Hash indexes support only equality comparisons, and so they use only one
143 strategy, shown in <xref linkend="xindex-hash-strat-table"/>.
146 <table tocentry="1" id="xindex-hash-strat-table">
147 <title>Hash Strategies</title>
151 <entry>Operation</entry>
152 <entry>Strategy Number</entry>
165 GiST indexes are more flexible: they do not have a fixed set of
166 strategies at all. Instead, the <quote>consistency</quote> support routine
167 of each particular GiST operator class interprets the strategy numbers
168 however it likes. As an example, several of the built-in GiST index
169 operator classes index two-dimensional geometric objects, providing
170 the <quote>R-tree</quote> strategies shown in
171 <xref linkend="xindex-rtree-strat-table"/>. Four of these are true
172 two-dimensional tests (overlaps, same, contains, contained by);
173 four of them consider only the X direction; and the other four
174 provide the same tests in the Y direction.
177 <table tocentry="1" id="xindex-rtree-strat-table">
178 <title>GiST Two-Dimensional <quote>R-tree</quote> Strategies</title>
182 <entry>Operation</entry>
183 <entry>Strategy Number</entry>
188 <entry>strictly left of</entry>
192 <entry>does not extend to right of</entry>
196 <entry>overlaps</entry>
200 <entry>does not extend to left of</entry>
204 <entry>strictly right of</entry>
212 <entry>contains</entry>
216 <entry>contained by</entry>
220 <entry>does not extend above</entry>
224 <entry>strictly below</entry>
228 <entry>strictly above</entry>
232 <entry>does not extend below</entry>
240 SP-GiST indexes are similar to GiST indexes in flexibility: they don't have
241 a fixed set of strategies. Instead the support routines of each operator
242 class interpret the strategy numbers according to the operator class's
243 definition. As an example, the strategy numbers used by the built-in
244 operator classes for points are shown in <xref
245 linkend="xindex-spgist-point-strat-table"/>.
248 <table tocentry="1" id="xindex-spgist-point-strat-table">
249 <title>SP-GiST Point Strategies</title>
253 <entry>Operation</entry>
254 <entry>Strategy Number</entry>
259 <entry>strictly left of</entry>
263 <entry>strictly right of</entry>
271 <entry>contained by</entry>
275 <entry>strictly below</entry>
279 <entry>strictly above</entry>
287 GIN indexes are similar to GiST and SP-GiST indexes, in that they don't
288 have a fixed set of strategies either. Instead the support routines of
289 each operator class interpret the strategy numbers according to the
290 operator class's definition. As an example, the strategy numbers used by
291 the built-in operator class for arrays are shown in
292 <xref linkend="xindex-gin-array-strat-table"/>.
295 <table tocentry="1" id="xindex-gin-array-strat-table">
296 <title>GIN Array Strategies</title>
300 <entry>Operation</entry>
301 <entry>Strategy Number</entry>
306 <entry>overlap</entry>
310 <entry>contains</entry>
314 <entry>is contained by</entry>
326 BRIN indexes are similar to GiST, SP-GiST and GIN indexes in that they
327 don't have a fixed set of strategies either. Instead the support routines
328 of each operator class interpret the strategy numbers according to the
329 operator class's definition. As an example, the strategy numbers used by
330 the built-in <literal>Minmax</literal> operator classes are shown in
331 <xref linkend="xindex-brin-minmax-strat-table"/>.
334 <table tocentry="1" id="xindex-brin-minmax-strat-table">
335 <title>BRIN Minmax Strategies</title>
339 <entry>Operation</entry>
340 <entry>Strategy Number</entry>
345 <entry>less than</entry>
349 <entry>less than or equal</entry>
357 <entry>greater than or equal</entry>
361 <entry>greater than</entry>
369 Notice that all the operators listed above return Boolean values. In
370 practice, all operators defined as index method search operators must
371 return type <type>boolean</type>, since they must appear at the top
372 level of a <literal>WHERE</literal> clause to be used with an index.
373 (Some index access methods also support <firstterm>ordering operators</firstterm>,
374 which typically don't return Boolean values; that feature is discussed
375 in <xref linkend="xindex-ordering-ops"/>.)
379 <sect2 id="xindex-support">
380 <title>Index Method Support Routines</title>
383 Strategies aren't usually enough information for the system to figure
384 out how to use an index. In practice, the index methods require
385 additional support routines in order to work. For example, the B-tree
386 index method must be able to compare two keys and determine whether one
387 is greater than, equal to, or less than the other. Similarly, the
388 hash index method must be able to compute hash codes for key values.
389 These operations do not correspond to operators used in qualifications in
390 SQL commands; they are administrative routines used by
391 the index methods, internally.
395 Just as with strategies, the operator class identifies which specific
396 functions should play each of these roles for a given data type and
397 semantic interpretation. The index method defines the set
398 of functions it needs, and the operator class identifies the correct
399 functions to use by assigning them to the <quote>support function numbers</quote>
400 specified by the index method.
404 B-trees require a comparison support function,
405 and allow two additional support functions to be
406 supplied at the operator class author's option, as shown in <xref
407 linkend="xindex-btree-support-table"/>.
408 The requirements for these support functions are explained further in
409 <xref linkend="btree-support-funcs"/>.
412 <table tocentry="1" id="xindex-btree-support-table">
413 <title>B-Tree Support Functions</title>
417 <entry>Function</entry>
418 <entry>Support Number</entry>
424 Compare two keys and return an integer less than zero, zero, or
425 greater than zero, indicating whether the first key is less than,
426 equal to, or greater than the second
432 Return the addresses of C-callable sort support function(s)
439 Compare a test value to a base value plus/minus an offset, and return
440 true or false according to the comparison result (optional)
449 Hash indexes require one support function, and allow a second one to be
450 supplied at the operator class author's option, as shown in <xref
451 linkend="xindex-hash-support-table"/>.
454 <table tocentry="1" id="xindex-hash-support-table">
455 <title>Hash Support Functions</title>
459 <entry>Function</entry>
460 <entry>Support Number</entry>
465 <entry>Compute the 32-bit hash value for a key</entry>
470 Compute the 64-bit hash value for a key given a 64-bit salt; if
471 the salt is 0, the low 32 bits of the result must match the value
472 that would have been computed by function 1
482 GiST indexes have nine support functions, two of which are optional,
483 as shown in <xref linkend="xindex-gist-support-table"/>.
484 (For more information see <xref linkend="gist"/>.)
487 <table tocentry="1" id="xindex-gist-support-table">
488 <title>GiST Support Functions</title>
492 <entry>Function</entry>
493 <entry>Description</entry>
494 <entry>Support Number</entry>
499 <entry><function>consistent</function></entry>
500 <entry>determine whether key satisfies the
501 query qualifier</entry>
505 <entry><function>union</function></entry>
506 <entry>compute union of a set of keys</entry>
510 <entry><function>compress</function></entry>
511 <entry>compute a compressed representation of a key or value
512 to be indexed</entry>
516 <entry><function>decompress</function></entry>
517 <entry>compute a decompressed representation of a
518 compressed key</entry>
522 <entry><function>penalty</function></entry>
523 <entry>compute penalty for inserting new key into subtree
524 with given subtree's key</entry>
528 <entry><function>picksplit</function></entry>
529 <entry>determine which entries of a page are to be moved
530 to the new page and compute the union keys for resulting pages</entry>
534 <entry><function>equal</function></entry>
535 <entry>compare two keys and return true if they are equal</entry>
539 <entry><function>distance</function></entry>
540 <entry>determine distance from key to query value (optional)</entry>
544 <entry><function>fetch</function></entry>
545 <entry>compute original representation of a compressed key for
546 index-only scans (optional)</entry>
554 SP-GiST indexes require five support functions, as
555 shown in <xref linkend="xindex-spgist-support-table"/>.
556 (For more information see <xref linkend="spgist"/>.)
559 <table tocentry="1" id="xindex-spgist-support-table">
560 <title>SP-GiST Support Functions</title>
564 <entry>Function</entry>
565 <entry>Description</entry>
566 <entry>Support Number</entry>
571 <entry><function>config</function></entry>
572 <entry>provide basic information about the operator class</entry>
576 <entry><function>choose</function></entry>
577 <entry>determine how to insert a new value into an inner tuple</entry>
581 <entry><function>picksplit</function></entry>
582 <entry>determine how to partition a set of values</entry>
586 <entry><function>inner_consistent</function></entry>
587 <entry>determine which sub-partitions need to be searched for a
592 <entry><function>leaf_consistent</function></entry>
593 <entry>determine whether key satisfies the
594 query qualifier</entry>
602 GIN indexes have six support functions, three of which are optional,
603 as shown in <xref linkend="xindex-gin-support-table"/>.
604 (For more information see <xref linkend="gin"/>.)
607 <table tocentry="1" id="xindex-gin-support-table">
608 <title>GIN Support Functions</title>
612 <entry>Function</entry>
613 <entry>Description</entry>
614 <entry>Support Number</entry>
619 <entry><function>compare</function></entry>
621 compare two keys and return an integer less than zero, zero,
622 or greater than zero, indicating whether the first key is less than,
623 equal to, or greater than the second
628 <entry><function>extractValue</function></entry>
629 <entry>extract keys from a value to be indexed</entry>
633 <entry><function>extractQuery</function></entry>
634 <entry>extract keys from a query condition</entry>
638 <entry><function>consistent</function></entry>
640 determine whether value matches query condition (Boolean variant)
641 (optional if support function 6 is present)
646 <entry><function>comparePartial</function></entry>
648 compare partial key from
649 query and key from index, and return an integer less than zero, zero,
650 or greater than zero, indicating whether GIN should ignore this index
651 entry, treat the entry as a match, or stop the index scan (optional)
656 <entry><function>triConsistent</function></entry>
658 determine whether value matches query condition (ternary variant)
659 (optional if support function 4 is present)
668 BRIN indexes have four basic support functions, as shown in
669 <xref linkend="xindex-brin-support-table"/>; those basic functions
670 may require additional support functions to be provided.
671 (For more information see <xref linkend="brin-extensibility"/>.)
674 <table tocentry="1" id="xindex-brin-support-table">
675 <title>BRIN Support Functions</title>
679 <entry>Function</entry>
680 <entry>Description</entry>
681 <entry>Support Number</entry>
686 <entry><function>opcInfo</function></entry>
688 return internal information describing the indexed columns'
694 <entry><function>add_value</function></entry>
695 <entry>add a new value to an existing summary index tuple</entry>
699 <entry><function>consistent</function></entry>
700 <entry>determine whether value matches query condition</entry>
704 <entry><function>union</function></entry>
706 compute union of two summary tuples
715 Unlike search operators, support functions return whichever data
716 type the particular index method expects; for example in the case
717 of the comparison function for B-trees, a signed integer. The number
718 and types of the arguments to each support function are likewise
719 dependent on the index method. For B-tree and hash the comparison and
720 hashing support functions take the same input data types as do the
721 operators included in the operator class, but this is not the case for
722 most GiST, SP-GiST, GIN, and BRIN support functions.
726 <sect2 id="xindex-example">
727 <title>An Example</title>
730 Now that we have seen the ideas, here is the promised example of
731 creating a new operator class.
732 (You can find a working copy of this example in
733 <filename>src/tutorial/complex.c</filename> and
734 <filename>src/tutorial/complex.sql</filename> in the source
736 The operator class encapsulates
737 operators that sort complex numbers in absolute value order, so we
738 choose the name <literal>complex_abs_ops</literal>. First, we need
739 a set of operators. The procedure for defining operators was
740 discussed in <xref linkend="xoper"/>. For an operator class on
741 B-trees, the operators we require are:
743 <itemizedlist spacing="compact">
744 <listitem><simpara>absolute-value less-than (strategy 1)</simpara></listitem>
745 <listitem><simpara>absolute-value less-than-or-equal (strategy 2)</simpara></listitem>
746 <listitem><simpara>absolute-value equal (strategy 3)</simpara></listitem>
747 <listitem><simpara>absolute-value greater-than-or-equal (strategy 4)</simpara></listitem>
748 <listitem><simpara>absolute-value greater-than (strategy 5)</simpara></listitem>
753 The least error-prone way to define a related set of comparison operators
754 is to write the B-tree comparison support function first, and then write the
755 other functions as one-line wrappers around the support function. This
756 reduces the odds of getting inconsistent results for corner cases.
757 Following this approach, we first write:
759 <programlisting><![CDATA[
760 #define Mag(c) ((c)->x*(c)->x + (c)->y*(c)->y)
763 complex_abs_cmp_internal(Complex *a, Complex *b)
765 double amag = Mag(a),
777 Now the less-than function looks like:
779 <programlisting><![CDATA[
780 PG_FUNCTION_INFO_V1(complex_abs_lt);
783 complex_abs_lt(PG_FUNCTION_ARGS)
785 Complex *a = (Complex *) PG_GETARG_POINTER(0);
786 Complex *b = (Complex *) PG_GETARG_POINTER(1);
788 PG_RETURN_BOOL(complex_abs_cmp_internal(a, b) < 0);
793 The other four functions differ only in how they compare the internal
794 function's result to zero.
798 Next we declare the functions and the operators based on the functions
802 CREATE FUNCTION complex_abs_lt(complex, complex) RETURNS bool
803 AS '<replaceable>filename</replaceable>', 'complex_abs_lt'
804 LANGUAGE C IMMUTABLE STRICT;
806 CREATE OPERATOR < (
807 leftarg = complex, rightarg = complex, procedure = complex_abs_lt,
808 commutator = > , negator = >= ,
809 restrict = scalarltsel, join = scalarltjoinsel
812 It is important to specify the correct commutator and negator operators,
813 as well as suitable restriction and join selectivity
814 functions, otherwise the optimizer will be unable to make effective
819 Other things worth noting are happening here:
824 There can only be one operator named, say, <literal>=</literal>
825 and taking type <type>complex</type> for both operands. In this
826 case we don't have any other operator <literal>=</literal> for
827 <type>complex</type>, but if we were building a practical data
828 type we'd probably want <literal>=</literal> to be the ordinary
829 equality operation for complex numbers (and not the equality of
830 the absolute values). In that case, we'd need to use some other
831 operator name for <function>complex_abs_eq</function>.
837 Although <productname>PostgreSQL</productname> can cope with
838 functions having the same SQL name as long as they have different
839 argument data types, C can only cope with one global function
840 having a given name. So we shouldn't name the C function
841 something simple like <filename>abs_eq</filename>. Usually it's
842 a good practice to include the data type name in the C function
843 name, so as not to conflict with functions for other data types.
849 We could have made the SQL name
850 of the function <filename>abs_eq</filename>, relying on
851 <productname>PostgreSQL</productname> to distinguish it by
852 argument data types from any other SQL function of the same name.
853 To keep the example simple, we make the function have the same
854 names at the C level and SQL level.
861 The next step is the registration of the support routine required
862 by B-trees. The example C code that implements this is in the same
863 file that contains the operator functions. This is how we declare
867 CREATE FUNCTION complex_abs_cmp(complex, complex)
869 AS '<replaceable>filename</replaceable>'
870 LANGUAGE C IMMUTABLE STRICT;
875 Now that we have the required operators and support routine,
876 we can finally create the operator class:
878 <programlisting><![CDATA[
879 CREATE OPERATOR CLASS complex_abs_ops
880 DEFAULT FOR TYPE complex USING btree AS
886 FUNCTION 1 complex_abs_cmp(complex, complex);
892 And we're done! It should now be possible to create
893 and use B-tree indexes on <type>complex</type> columns.
897 We could have written the operator entries more verbosely, as in:
899 OPERATOR 1 < (complex, complex) ,
901 but there is no need to do so when the operators take the same data type
902 we are defining the operator class for.
906 The above example assumes that you want to make this new operator class the
907 default B-tree operator class for the <type>complex</type> data type.
908 If you don't, just leave out the word <literal>DEFAULT</literal>.
912 <sect2 id="xindex-opfamily">
913 <title>Operator Classes and Operator Families</title>
916 So far we have implicitly assumed that an operator class deals with
917 only one data type. While there certainly can be only one data type in
918 a particular index column, it is often useful to index operations that
919 compare an indexed column to a value of a different data type. Also,
920 if there is use for a cross-data-type operator in connection with an
921 operator class, it is often the case that the other data type has a
922 related operator class of its own. It is helpful to make the connections
923 between related classes explicit, because this can aid the planner in
924 optimizing SQL queries (particularly for B-tree operator classes, since
925 the planner contains a great deal of knowledge about how to work with them).
929 To handle these needs, <productname>PostgreSQL</productname>
930 uses the concept of an <firstterm>operator
931 family</firstterm><indexterm><primary>operator family</primary></indexterm>.
932 An operator family contains one or more operator classes, and can also
933 contain indexable operators and corresponding support functions that
934 belong to the family as a whole but not to any single class within the
935 family. We say that such operators and functions are <quote>loose</quote>
936 within the family, as opposed to being bound into a specific class.
937 Typically each operator class contains single-data-type operators
938 while cross-data-type operators are loose in the family.
942 All the operators and functions in an operator family must have compatible
943 semantics, where the compatibility requirements are set by the index
944 method. You might therefore wonder why bother to single out particular
945 subsets of the family as operator classes; and indeed for many purposes
946 the class divisions are irrelevant and the family is the only interesting
947 grouping. The reason for defining operator classes is that they specify
948 how much of the family is needed to support any particular index.
949 If there is an index using an operator class, then that operator class
950 cannot be dropped without dropping the index — but other parts of
951 the operator family, namely other operator classes and loose operators,
952 could be dropped. Thus, an operator class should be specified to contain
953 the minimum set of operators and functions that are reasonably needed
954 to work with an index on a specific data type, and then related but
955 non-essential operators can be added as loose members of the operator
960 As an example, <productname>PostgreSQL</productname> has a built-in
961 B-tree operator family <literal>integer_ops</literal>, which includes operator
962 classes <literal>int8_ops</literal>, <literal>int4_ops</literal>, and
963 <literal>int2_ops</literal> for indexes on <type>bigint</type> (<type>int8</type>),
964 <type>integer</type> (<type>int4</type>), and <type>smallint</type> (<type>int2</type>)
965 columns respectively. The family also contains cross-data-type comparison
966 operators allowing any two of these types to be compared, so that an index
967 on one of these types can be searched using a comparison value of another
968 type. The family could be duplicated by these definitions:
970 <programlisting><![CDATA[
971 CREATE OPERATOR FAMILY integer_ops USING btree;
973 CREATE OPERATOR CLASS int8_ops
974 DEFAULT FOR TYPE int8 USING btree FAMILY integer_ops AS
975 -- standard int8 comparisons
981 FUNCTION 1 btint8cmp(int8, int8) ,
982 FUNCTION 2 btint8sortsupport(internal) ,
983 FUNCTION 3 in_range(int8, int8, int8, boolean, boolean) ;
985 CREATE OPERATOR CLASS int4_ops
986 DEFAULT FOR TYPE int4 USING btree FAMILY integer_ops AS
987 -- standard int4 comparisons
993 FUNCTION 1 btint4cmp(int4, int4) ,
994 FUNCTION 2 btint4sortsupport(internal) ,
995 FUNCTION 3 in_range(int4, int4, int4, boolean, boolean) ;
997 CREATE OPERATOR CLASS int2_ops
998 DEFAULT FOR TYPE int2 USING btree FAMILY integer_ops AS
999 -- standard int2 comparisons
1005 FUNCTION 1 btint2cmp(int2, int2) ,
1006 FUNCTION 2 btint2sortsupport(internal) ,
1007 FUNCTION 3 in_range(int2, int2, int2, boolean, boolean) ;
1009 ALTER OPERATOR FAMILY integer_ops USING btree ADD
1010 -- cross-type comparisons int8 vs int2
1011 OPERATOR 1 < (int8, int2) ,
1012 OPERATOR 2 <= (int8, int2) ,
1013 OPERATOR 3 = (int8, int2) ,
1014 OPERATOR 4 >= (int8, int2) ,
1015 OPERATOR 5 > (int8, int2) ,
1016 FUNCTION 1 btint82cmp(int8, int2) ,
1018 -- cross-type comparisons int8 vs int4
1019 OPERATOR 1 < (int8, int4) ,
1020 OPERATOR 2 <= (int8, int4) ,
1021 OPERATOR 3 = (int8, int4) ,
1022 OPERATOR 4 >= (int8, int4) ,
1023 OPERATOR 5 > (int8, int4) ,
1024 FUNCTION 1 btint84cmp(int8, int4) ,
1026 -- cross-type comparisons int4 vs int2
1027 OPERATOR 1 < (int4, int2) ,
1028 OPERATOR 2 <= (int4, int2) ,
1029 OPERATOR 3 = (int4, int2) ,
1030 OPERATOR 4 >= (int4, int2) ,
1031 OPERATOR 5 > (int4, int2) ,
1032 FUNCTION 1 btint42cmp(int4, int2) ,
1034 -- cross-type comparisons int4 vs int8
1035 OPERATOR 1 < (int4, int8) ,
1036 OPERATOR 2 <= (int4, int8) ,
1037 OPERATOR 3 = (int4, int8) ,
1038 OPERATOR 4 >= (int4, int8) ,
1039 OPERATOR 5 > (int4, int8) ,
1040 FUNCTION 1 btint48cmp(int4, int8) ,
1042 -- cross-type comparisons int2 vs int8
1043 OPERATOR 1 < (int2, int8) ,
1044 OPERATOR 2 <= (int2, int8) ,
1045 OPERATOR 3 = (int2, int8) ,
1046 OPERATOR 4 >= (int2, int8) ,
1047 OPERATOR 5 > (int2, int8) ,
1048 FUNCTION 1 btint28cmp(int2, int8) ,
1050 -- cross-type comparisons int2 vs int4
1051 OPERATOR 1 < (int2, int4) ,
1052 OPERATOR 2 <= (int2, int4) ,
1053 OPERATOR 3 = (int2, int4) ,
1054 OPERATOR 4 >= (int2, int4) ,
1055 OPERATOR 5 > (int2, int4) ,
1056 FUNCTION 1 btint24cmp(int2, int4) ,
1058 -- cross-type in_range functions
1059 FUNCTION 3 in_range(int4, int4, int8, boolean, boolean) ,
1060 FUNCTION 3 in_range(int4, int4, int2, boolean, boolean) ,
1061 FUNCTION 3 in_range(int2, int2, int8, boolean, boolean) ,
1062 FUNCTION 3 in_range(int2, int2, int4, boolean, boolean) ;
1066 Notice that this definition <quote>overloads</quote> the operator strategy and
1067 support function numbers: each number occurs multiple times within the
1068 family. This is allowed so long as each instance of a
1069 particular number has distinct input data types. The instances that have
1070 both input types equal to an operator class's input type are the
1071 primary operators and support functions for that operator class,
1072 and in most cases should be declared as part of the operator class rather
1073 than as loose members of the family.
1077 In a B-tree operator family, all the operators in the family must sort
1078 compatibly, as is specified in detail in <xref linkend="btree-behavior"/>.
1080 operator in the family there must be a support function having the same
1081 two input data types as the operator. It is recommended that a family be
1082 complete, i.e., for each combination of data types, all operators are
1083 included. Each operator class should include just the non-cross-type
1084 operators and support function for its data type.
1088 To build a multiple-data-type hash operator family, compatible hash
1089 support functions must be created for each data type supported by the
1090 family. Here compatibility means that the functions are guaranteed to
1091 return the same hash code for any two values that are considered equal
1092 by the family's equality operators, even when the values are of different
1093 types. This is usually difficult to accomplish when the types have
1094 different physical representations, but it can be done in some cases.
1095 Furthermore, casting a value from one data type represented in the operator
1096 family to another data type also represented in the operator family via
1097 an implicit or binary coercion cast must not change the computed hash value.
1098 Notice that there is only one support function per data type, not one
1099 per equality operator. It is recommended that a family be complete, i.e.,
1100 provide an equality operator for each combination of data types.
1101 Each operator class should include just the non-cross-type equality
1102 operator and the support function for its data type.
1106 GiST, SP-GiST, and GIN indexes do not have any explicit notion of
1107 cross-data-type operations. The set of operators supported is just
1108 whatever the primary support functions for a given operator class can
1113 In BRIN, the requirements depends on the framework that provides the
1114 operator classes. For operator classes based on <literal>minmax</literal>,
1115 the behavior required is the same as for B-tree operator families:
1116 all the operators in the family must sort compatibly, and casts must
1117 not change the associated sort ordering.
1122 Prior to <productname>PostgreSQL</productname> 8.3, there was no concept
1123 of operator families, and so any cross-data-type operators intended to be
1124 used with an index had to be bound directly into the index's operator
1125 class. While this approach still works, it is deprecated because it
1126 makes an index's dependencies too broad, and because the planner can
1127 handle cross-data-type comparisons more effectively when both data types
1128 have operators in the same operator family.
1133 <sect2 id="xindex-opclass-dependencies">
1134 <title>System Dependencies on Operator Classes</title>
1137 <primary>ordering operator</primary>
1141 <productname>PostgreSQL</productname> uses operator classes to infer the
1142 properties of operators in more ways than just whether they can be used
1143 with indexes. Therefore, you might want to create operator classes
1144 even if you have no intention of indexing any columns of your data type.
1148 In particular, there are SQL features such as <literal>ORDER BY</literal> and
1149 <literal>DISTINCT</literal> that require comparison and sorting of values.
1150 To implement these features on a user-defined data type,
1151 <productname>PostgreSQL</productname> looks for the default B-tree operator
1152 class for the data type. The <quote>equals</quote> member of this operator
1153 class defines the system's notion of equality of values for
1154 <literal>GROUP BY</literal> and <literal>DISTINCT</literal>, and the sort ordering
1155 imposed by the operator class defines the default <literal>ORDER BY</literal>
1160 If there is no default B-tree operator class for a data type, the system
1161 will look for a default hash operator class. But since that kind of
1162 operator class only provides equality, it is only able to support grouping
1167 When there is no default operator class for a data type, you will get
1168 errors like <quote>could not identify an ordering operator</quote> if you
1169 try to use these SQL features with the data type.
1174 In <productname>PostgreSQL</productname> versions before 7.4,
1175 sorting and grouping operations would implicitly use operators named
1176 <literal>=</literal>, <literal><</literal>, and <literal>></literal>. The new
1177 behavior of relying on default operator classes avoids having to make
1178 any assumption about the behavior of operators with particular names.
1183 Sorting by a non-default B-tree operator class is possible by specifying
1184 the class's less-than operator in a <literal>USING</literal> option,
1187 SELECT * FROM mytable ORDER BY somecol USING ~<~;
1189 Alternatively, specifying the class's greater-than operator
1190 in <literal>USING</literal> selects a descending-order sort.
1194 Comparison of arrays of a user-defined type also relies on the semantics
1195 defined by the type's default B-tree operator class. If there is no
1196 default B-tree operator class, but there is a default hash operator class,
1197 then array equality is supported, but not ordering comparisons.
1201 Another SQL feature that requires even more data-type-specific knowledge
1202 is the <literal>RANGE</literal> <replaceable>offset</replaceable>
1203 <literal>PRECEDING</literal>/<literal>FOLLOWING</literal> framing option
1204 for window functions (see <xref linkend="syntax-window-functions"/>).
1207 SELECT sum(x) OVER (ORDER BY x RANGE BETWEEN 5 PRECEDING AND 10 FOLLOWING)
1210 it is not sufficient to know how to order by <literal>x</literal>;
1211 the database must also understand how to <quote>subtract 5</quote> or
1212 <quote>add 10</quote> to the current row's value of <literal>x</literal>
1213 to identify the bounds of the current window frame. Comparing the
1214 resulting bounds to other rows' values of <literal>x</literal> is
1215 possible using the comparison operators provided by the B-tree operator
1216 class that defines the <literal>ORDER BY</literal> ordering — but
1217 addition and subtraction operators are not part of the operator class, so
1218 which ones should be used? Hard-wiring that choice would be undesirable,
1219 because different sort orders (different B-tree operator classes) might
1220 need different behavior. Therefore, a B-tree operator class can specify
1221 an <firstterm>in_range</firstterm> support function that encapsulates the
1222 addition and subtraction behaviors that make sense for its sort order.
1223 It can even provide more than one in_range support function, in case
1224 there is more than one data type that makes sense to use as the offset
1225 in <literal>RANGE</literal> clauses.
1226 If the B-tree operator class associated with the window's <literal>ORDER
1227 BY</literal> clause does not have a matching in_range support function,
1228 the <literal>RANGE</literal> <replaceable>offset</replaceable>
1229 <literal>PRECEDING</literal>/<literal>FOLLOWING</literal>
1230 option is not supported.
1234 Another important point is that an equality operator that
1235 appears in a hash operator family is a candidate for hash joins,
1236 hash aggregation, and related optimizations. The hash operator family
1237 is essential here since it identifies the hash function(s) to use.
1241 <sect2 id="xindex-ordering-ops">
1242 <title>Ordering Operators</title>
1245 Some index access methods (currently, only GiST and SP-GiST) support the concept of
1246 <firstterm>ordering operators</firstterm>. What we have been discussing so far
1247 are <firstterm>search operators</firstterm>. A search operator is one for which
1248 the index can be searched to find all rows satisfying
1249 <literal>WHERE</literal>
1250 <replaceable>indexed_column</replaceable>
1251 <replaceable>operator</replaceable>
1252 <replaceable>constant</replaceable>.
1253 Note that nothing is promised about the order in which the matching rows
1254 will be returned. In contrast, an ordering operator does not restrict the
1255 set of rows that can be returned, but instead determines their order.
1256 An ordering operator is one for which the index can be scanned to return
1257 rows in the order represented by
1258 <literal>ORDER BY</literal>
1259 <replaceable>indexed_column</replaceable>
1260 <replaceable>operator</replaceable>
1261 <replaceable>constant</replaceable>.
1262 The reason for defining ordering operators that way is that it supports
1263 nearest-neighbor searches, if the operator is one that measures distance.
1264 For example, a query like
1265 <programlisting><![CDATA[
1266 SELECT * FROM places ORDER BY location <-> point '(101,456)' LIMIT 10;
1269 finds the ten places closest to a given target point. A GiST index
1270 on the location column can do this efficiently because
1271 <literal><-></literal> is an ordering operator.
1275 While search operators have to return Boolean results, ordering operators
1276 usually return some other type, such as float or numeric for distances.
1277 This type is normally not the same as the data type being indexed.
1278 To avoid hard-wiring assumptions about the behavior of different data
1279 types, the definition of an ordering operator is required to name
1280 a B-tree operator family that specifies the sort ordering of the result
1281 data type. As was stated in the previous section, B-tree operator families
1282 define <productname>PostgreSQL</productname>'s notion of ordering, so
1283 this is a natural representation. Since the point <literal><-></literal>
1284 operator returns <type>float8</type>, it could be specified in an operator
1285 class creation command like this:
1286 <programlisting><![CDATA[
1287 OPERATOR 15 <-> (point, point) FOR ORDER BY float_ops
1290 where <literal>float_ops</literal> is the built-in operator family that includes
1291 operations on <type>float8</type>. This declaration states that the index
1292 is able to return rows in order of increasing values of the
1293 <literal><-></literal> operator.
1297 <sect2 id="xindex-opclass-features">
1298 <title>Special Features of Operator Classes</title>
1301 There are two special features of operator classes that we have
1302 not discussed yet, mainly because they are not useful
1303 with the most commonly used index methods.
1307 Normally, declaring an operator as a member of an operator class
1308 (or family) means that the index method can retrieve exactly the set of rows
1309 that satisfy a <literal>WHERE</literal> condition using the operator. For example:
1311 SELECT * FROM table WHERE integer_column < 4;
1313 can be satisfied exactly by a B-tree index on the integer column.
1314 But there are cases where an index is useful as an inexact guide to
1315 the matching rows. For example, if a GiST index stores only bounding boxes
1316 for geometric objects, then it cannot exactly satisfy a <literal>WHERE</literal>
1317 condition that tests overlap between nonrectangular objects such as
1318 polygons. Yet we could use the index to find objects whose bounding
1319 box overlaps the bounding box of the target object, and then do the
1320 exact overlap test only on the objects found by the index. If this
1321 scenario applies, the index is said to be <quote>lossy</quote> for the
1322 operator. Lossy index searches are implemented by having the index
1323 method return a <firstterm>recheck</firstterm> flag when a row might or might
1324 not really satisfy the query condition. The core system will then
1325 test the original query condition on the retrieved row to see whether
1326 it should be returned as a valid match. This approach works if
1327 the index is guaranteed to return all the required rows, plus perhaps
1328 some additional rows, which can be eliminated by performing the original
1329 operator invocation. The index methods that support lossy searches
1330 (currently, GiST, SP-GiST and GIN) allow the support functions of individual
1331 operator classes to set the recheck flag, and so this is essentially an
1332 operator-class feature.
1336 Consider again the situation where we are storing in the index only
1337 the bounding box of a complex object such as a polygon. In this
1338 case there's not much value in storing the whole polygon in the index
1339 entry — we might as well store just a simpler object of type
1340 <type>box</type>. This situation is expressed by the <literal>STORAGE</literal>
1341 option in <command>CREATE OPERATOR CLASS</command>: we'd write something like:
1344 CREATE OPERATOR CLASS polygon_ops
1345 DEFAULT FOR TYPE polygon USING gist AS
1350 At present, only the GiST, GIN and BRIN index methods support a
1351 <literal>STORAGE</literal> type that's different from the column data type.
1352 The GiST <function>compress</function> and <function>decompress</function> support
1353 routines must deal with data-type conversion when <literal>STORAGE</literal>
1354 is used. In GIN, the <literal>STORAGE</literal> type identifies the type of
1355 the <quote>key</quote> values, which normally is different from the type
1356 of the indexed column — for example, an operator class for
1357 integer-array columns might have keys that are just integers. The
1358 GIN <function>extractValue</function> and <function>extractQuery</function> support
1359 routines are responsible for extracting keys from indexed values.
1360 BRIN is similar to GIN: the <literal>STORAGE</literal> type identifies the
1361 type of the stored summary values, and operator classes' support
1362 procedures are responsible for interpreting the summary values