2 $Header: /cvsroot/pgsql/doc/src/sgml/xindex.sgml,v 1.21 2001/11/21 06:09:45 thomas Exp $
3 PostgreSQL documentation
7 <title>Interfacing Extensions To Indexes</title>
10 The procedures described thus far let you define a new type, new
11 functions and new operators. However, we cannot yet define a secondary
12 index (such as a <acronym>B-tree</acronym>, <acronym>R-tree</acronym> or
13 hash access method) over a new type or its operators.
18 <xref linkend="EXTEND-CATALOGS">.
19 The right half shows the catalogs that we must modify in order to tell
20 <productname>PostgreSQL</productname> how to use a user-defined type and/or
21 user-defined operators with an index (i.e., <filename>pg_am, pg_amop,
22 pg_amproc, pg_operator</filename> and <filename>pg_opclass</filename>).
23 Unfortunately, there is no simple command to do this. We will demonstrate
24 how to modify these catalogs through a running example: a new operator
25 class for the <acronym>B-tree</acronym> access method that stores and
26 sorts complex numbers in ascending absolute value order.
30 The <filename>pg_am</filename> table contains one row for every index
31 access method. Support for the heap access method is built into
32 <productname>PostgreSQL</productname>, but every other access method is
33 described in <filename>pg_am</filename>. The schema is
36 <title>Index Access Method Schema</title>
42 <entry>Description</entry>
48 <entry>name of the access method</entry>
51 <entry>amowner</entry>
52 <entry>user id of the owner</entry>
55 <entry>amstrategies</entry>
56 <entry>number of strategies for this access method (see below)</entry>
59 <entry>amsupport</entry>
60 <entry>number of support routines for this access method (see below)</entry>
63 <entry>amorderstrategy</entry>
64 <entry>zero if the index offers no sort order, otherwise the strategy
65 number of the strategy operator that describes the sort order</entry>
68 <entry>amcanunique</entry>
69 <entry>does AM support UNIQUE indexes?</entry>
72 <entry>amcanmulticol</entry>
73 <entry>does AM support multicolumn indexes?</entry>
76 <entry>amindexnulls</entry>
77 <entry>does AM support NULL index entries?</entry>
80 <entry>amconcurrent</entry>
81 <entry>does AM support concurrent updates?</entry>
84 <entry>amgettuple</entry>
87 <entry>aminsert</entry>
91 <entry>procedure identifiers for interface routines to the access
92 method. For example, regproc ids for opening, closing, and
93 getting rows from the access method appear here.</entry>
101 The <acronym>object ID</acronym> of the row in
102 <filename>pg_am</filename> is used as a foreign key in a lot of other
103 tables. You do not need to add a new row to this table; all that
104 you are interested in is the <acronym>object ID</acronym> of the access
105 method you want to extend:
108 SELECT oid FROM pg_am WHERE amname = 'btree';
116 We will use that <command>SELECT</command> in a <command>WHERE</command>
121 The <filename>amstrategies</filename> column exists to standardize
122 comparisons across data types. For example, <acronym>B-tree</acronym>s
123 impose a strict ordering on keys, lesser to greater. Since
124 <productname>PostgreSQL</productname> allows the user to define operators,
125 <productname>PostgreSQL</productname> cannot look at the name of an operator
126 (e.g., <literal>></> or <literal><</>) and tell what kind of comparison it is. In fact,
127 some access methods don't impose any ordering at all. For example,
128 <acronym>R-tree</acronym>s express a rectangle-containment relationship,
129 whereas a hashed data structure expresses only bitwise similarity based
130 on the value of a hash function. <productname>PostgreSQL</productname>
131 needs some consistent way of taking a qualification in your query,
132 looking at the operator and then deciding if a usable index exists. This
133 implies that <productname>PostgreSQL</productname> needs to know, for
134 example, that the <literal><=</> and <literal>></> operators partition a
135 <acronym>B-tree</acronym>. <productname>PostgreSQL</productname>
136 uses strategies to express these relationships between
137 operators and the way they can be used to scan indexes.
141 Defining a new set of strategies is beyond the scope of this discussion,
142 but we'll explain how <acronym>B-tree</acronym> strategies work because
143 you'll need to know that to add a new B-tree operator class. In the
144 <filename>pg_am</filename> table, the amstrategies column is the
145 number of strategies defined for this access method. For
146 <acronym>B-tree</acronym>s, this number is 5. These strategies
150 <title>B-tree Strategies</title>
151 <titleabbrev>B-tree</titleabbrev>
155 <entry>Operation</entry>
161 <entry>less than</entry>
165 <entry>less than or equal</entry>
173 <entry>greater than or equal</entry>
177 <entry>greater than</entry>
186 The idea is that you'll need to add operators corresponding to the
187 comparisons above to the <filename>pg_amop</filename> relation (see below).
188 The access method code can use these strategy numbers, regardless of data
189 type, to figure out how to partition the <acronym>B-tree</acronym>,
190 compute selectivity, and so on. Don't worry about the details of adding
191 operators yet; just understand that there must be a set of these
192 operators for <filename>int2, int4, oid,</filename> and every other
193 data type on which a <acronym>B-tree</acronym> can operate.
197 Sometimes, strategies aren't enough information for the system to figure
198 out how to use an index. Some access methods require additional support
199 routines in order to work. For example, the <acronym>B-tree</acronym>
200 access method must be able to compare two keys and determine whether one
201 is greater than, equal to, or less than the other. Similarly, the
202 <acronym>R-tree</acronym> access method must be able to compute
203 intersections, unions, and sizes of rectangles. These
204 operations do not correspond to operators used in qualifications in
205 SQL queries; they are administrative routines used by
206 the access methods, internally.
210 In order to manage diverse support routines consistently across all
211 <productname>PostgreSQL</productname> access methods,
212 <filename>pg_am</filename> includes a column called
213 <filename>amsupport</filename>. This column records the number of
214 support routines used by an access method. For <acronym>B-tree</acronym>s,
215 this number is one -- the routine to take two keys and return -1, 0, or
216 +1, depending on whether the first key is less than, equal
217 to, or greater than the second.
221 Strictly speaking, this routine can return a negative
222 number (< 0), zero, or a non-zero positive number (> 0).
228 The <filename>amstrategies</filename> entry in <filename>pg_am</filename>
230 of strategies defined for the access method in question. The operators
231 for less than, less equal, and so on don't appear in
232 <filename>pg_am</filename>. Similarly, <filename>amsupport</filename>
233 is just the number of support routines required by the access
234 method. The actual routines are listed elsewhere.
238 By the way, the <filename>amorderstrategy</filename> entry tells whether
239 the access method supports ordered scan. Zero means it doesn't; if it
240 does, <filename>amorderstrategy</filename> is the number of the strategy
241 routine that corresponds to the ordering operator. For example, B-tree
242 has <filename>amorderstrategy</filename> = 1 which is its
243 <quote>less than</quote> strategy number.
247 The next table of interest is <filename>pg_opclass</filename>. This table
248 defines operator class names and input data types for each of the operator
249 classes supported by a given index access method. The same class name
250 can be used for several different access methods (for example, both B-tree
251 and hash access methods have operator classes named
252 <filename>oid_ops</filename>), but a separate
253 <filename>pg_opclass</filename> row must appear for each access method.
254 The <filename>oid</filename> of the <filename>pg_opclass</filename> row is
256 key in other tables to associate specific operators and support routines
257 with the operator class.
261 You need to add a row with your opclass name (for example,
262 <filename>complex_abs_ops</filename>) to
263 <filename>pg_opclass</filename>:
266 INSERT INTO pg_opclass (opcamid, opcname, opcintype, opcdefault, opckeytype)
268 (SELECT oid FROM pg_am WHERE amname = 'btree'),
270 (SELECT oid FROM pg_type WHERE typname = 'complex'),
276 WHERE opcname = 'complex_abs_ops';
278 oid | opcamid | opcname | opcintype | opcdefault | opckeytype
279 --------+---------+-----------------+-----------+------------+------------
280 277975 | 403 | complex_abs_ops | 277946 | t | 0
284 Note that the oid for your <filename>pg_opclass</filename> row will
285 be different! Don't worry about this though. We'll get this number
286 from the system later just like we got the oid of the type here.
290 The above example assumes that you want to make this new opclass the
291 default B-tree opclass for the <filename>complex</filename> data type.
292 If you don't, just set <filename>opcdefault</filename> to false instead.
293 <filename>opckeytype</filename> is not described here; it should always
294 be zero for B-tree opclasses.
298 So now we have an access method and an operator class.
299 We still need a set of operators. The procedure for
300 defining operators was discussed earlier in this manual.
301 For the <filename>complex_abs_ops</filename> operator class on B-trees,
302 the operators we require are:
305 absolute value less-than
306 absolute value less-than-or-equal
308 absolute value greater-than-or-equal
309 absolute value greater-than
314 Suppose the code that implements these functions
315 is stored in the file
316 <replaceable>PGROOT</replaceable><filename>/tutorial/complex.c</filename>,
317 which we have compiled into
318 <replaceable>PGROOT</replaceable><filename>/tutorial/complex.so</filename>.
322 Part of the C code looks like this: (note that we will only show the
323 equality operator for the rest of the examples. The other four
324 operators are very similar. Refer to <filename>complex.c</filename>
325 or <filename>complex.source</filename> for the details.)
328 #define Mag(c) ((c)->x*(c)->x + (c)->y*(c)->y)
331 complex_abs_eq(Complex *a, Complex *b)
333 double amag = Mag(a), bmag = Mag(b);
340 We make the function known to <productname>PostgreSQL</productname> like this:
342 CREATE FUNCTION complex_abs_eq(complex, complex)
344 AS '<replaceable>PGROOT</replaceable>/tutorial/complex'
350 There are some important things that are happening here.
354 First, note that operators for less-than, less-than-or-equal, equal,
355 greater-than-or-equal, and greater-than for <filename>complex</filename>
356 are being defined. We can only have one operator named, say, = and
357 taking type <filename>complex</filename> for both operands. In this case
358 we don't have any other operator = for <filename>complex</filename>,
359 but if we were building a practical data type we'd probably want = to
360 be the ordinary equality operation for complex numbers. In that case,
361 we'd need to use some other operator name for complex_abs_eq.
365 Second, although <productname>PostgreSQL</productname> can cope with operators having
366 the same name as long as they have different input data types, C can only
367 cope with one global routine having a given name, period. So we shouldn't
368 name the C function something simple like <filename>abs_eq</filename>.
369 Usually it's a good practice to include the data type name in the C
370 function name, so as not to conflict with functions for other data types.
374 Third, we could have made the <productname>PostgreSQL</productname> name of the function
375 <filename>abs_eq</filename>, relying on <productname>PostgreSQL</productname> to distinguish it
376 by input data types from any other <productname>PostgreSQL</productname> function of the same name.
377 To keep the example simple, we make the function have the same names
378 at the C level and <productname>PostgreSQL</productname> level.
382 Finally, note that these operator functions return Boolean values.
383 In practice, all operators defined as index access method strategies
384 must return Boolean, since they must appear at the top level of a WHERE
385 clause to be used with an index.
387 hand, the support function returns whatever the particular access method
388 expects -- in this case, a signed integer.)
392 The final routine in the
393 file is the <quote>support routine</quote> mentioned when we discussed the amsupport
394 column of the <filename>pg_am</filename> table. We will use this
395 later on. For now, ignore it.
399 Now we are ready to define the operators:
403 leftarg = complex, rightarg = complex,
404 procedure = complex_abs_eq,
405 restrict = eqsel, join = eqjoinsel
410 things here are the procedure names (which are the <acronym>C</acronym>
411 functions defined above) and the restriction and join selectivity
412 functions. You should just use the selectivity functions used in
413 the example (see <filename>complex.source</filename>).
415 are different such functions for the less-than, equal, and greater-than
416 cases. These must be supplied, or the optimizer will be unable to
417 make effective use of the index.
421 The next step is to add entries for these operators to
422 the <filename>pg_amop</filename> relation. To do this,
423 we'll need the <filename>oid</filename>s of the operators we just
424 defined. We'll look up the names of all the operators that take
425 two <filename>complex</filename>es, and pick ours out:
428 SELECT o.oid AS opoid, o.oprname
429 INTO TEMP TABLE complex_ops_tmp
430 FROM pg_operator o, pg_type t
431 WHERE o.oprleft = t.oid and o.oprright = t.oid
432 and t.typname = 'complex';
445 (Again, some of your <filename>oid</filename> numbers will almost
446 certainly be different.) The operators we are interested in are those
447 with <filename>oid</filename>s 277970 through 277974. The values you
448 get will probably be different, and you should substitute them for the
449 values below. We will do this with a select statement.
453 Now we are ready to insert entries into <filename>pg_amop</filename> for
454 our new operator class. These entries must associate the correct
455 B-tree strategy numbers with each of the operators we need.
456 The command to insert the less-than operator looks like:
459 INSERT INTO pg_amop (amopclaid, amopstrategy, amopreqcheck, amopopr)
460 SELECT opcl.oid, 1, false, c.opoid
461 FROM pg_opclass opcl, complex_ops_tmp c
463 opcamid = (SELECT oid FROM pg_am WHERE amname = 'btree') AND
464 opcname = 'complex_abs_ops' AND
468 Now do this for the other operators substituting for the <literal>1</> in the
469 second line above and the <literal><</> in the last line. Note the order:
470 <quote>less than</> is 1, <quote>less than or equal</> is 2,
471 <quote>equal</> is 3, <quote>greater than or equal</quote> is 4, and
472 <quote>greater than</quote> is 5.
476 The field <filename>amopreqcheck</filename> is not discussed here; it
477 should always be false for B-tree operators.
481 The final step is registration of the <quote>support routine</quote> previously
482 described in our discussion of <filename>pg_am</filename>. The
483 <filename>oid</filename> of this support routine is stored in the
484 <filename>pg_amproc</filename> table, keyed by the operator class
485 <filename>oid</filename> and the support routine number.
486 First, we need to register the function in
487 <productname>PostgreSQL</productname> (recall that we put the
488 <acronym>C</acronym> code that implements this routine in the bottom of
489 the file in which we implemented the operator routines):
492 CREATE FUNCTION complex_abs_cmp(complex, complex)
494 AS '<replaceable>PGROOT</replaceable>/tutorial/complex'
497 SELECT oid, proname FROM pg_proc
498 WHERE proname = 'complex_abs_cmp';
501 --------+-----------------
502 277997 | complex_abs_cmp
506 (Again, your <filename>oid</filename> number will probably be different.)
507 We can add the new row as follows:
510 INSERT INTO pg_amproc (amopclaid, amprocnum, amproc)
511 SELECT opcl.oid, 1, p.oid
512 FROM pg_opclass opcl, pg_proc p
514 opcamid = (SELECT oid FROM pg_am WHERE amname = 'btree') AND
515 opcname = 'complex_abs_ops' AND
516 p.proname = 'complex_abs_cmp';
521 And we're done! (Whew.) It should now be possible to create
522 and use B-tree indexes on <filename>complex</filename> columns.
527 <!-- Keep this comment at the end of the file
532 sgml-minimize-attributes:nil
533 sgml-always-quote-attributes:t
536 sgml-parent-document:nil
537 sgml-default-dtd-file:"./reference.ced"
538 sgml-exposed-tags:nil
539 sgml-local-catalogs:("/usr/lib/sgml/catalog")
540 sgml-local-ecat-files:nil