2 $Header: /cvsroot/pgsql/doc/src/sgml/xindex.sgml,v 1.9 2000/03/28 02:53:02 tgl Exp $
7 <title>Interfacing Extensions To Indices</title>
10 The procedures described thus far let you define a new type, new
11 functions and new operators. However, we cannot yet define a secondary
12 index (such as a <acronym>B-tree</acronym>, <acronym>R-tree</acronym> or
13 hash access method) over a new type or its operators.
18 <xref endterm="EXTEND-CATALOGS" linkend="EXTEND-CATALOGS">.
19 The right half shows the catalogs that we must modify in order to tell
20 <productname>Postgres</productname> how to use a user-defined type and/or
21 user-defined operators with an index (i.e., <filename>pg_am, pg_amop,
22 pg_amproc, pg_operator</filename> and <filename>pg_opclass</filename>).
23 Unfortunately, there is no simple command to do this. We will demonstrate
24 how to modify these catalogs through a running example: a new operator
25 class for the <acronym>B-tree</acronym> access method that stores and
26 sorts complex numbers in ascending absolute value order.
30 The <filename>pg_am</filename> class contains one instance for every user
31 defined access method. Support for the heap access method is built into
32 <productname>Postgres</productname>, but every other access method is
33 described here. The schema is
36 <title>Index Schema</title>
37 <titleabbrev>Indices</titleabbrev>
41 <entry>Attribute</entry>
42 <entry>Description</entry>
48 <entry>name of the access method</entry>
51 <entry>amowner</entry>
52 <entry>object id of the owner's instance in pg_user</entry>
55 <entry>amstrategies</entry>
56 <entry>number of strategies for this access method (see below)</entry>
59 <entry>amsupport</entry>
60 <entry>number of support routines for this access method (see below)</entry>
63 <entry>amorderstrategy</entry>
64 <entry>zero if the index offers no sort order, otherwise the strategy
65 number of the strategy operator that describes the sort order</entry>
68 <entry>amgettuple</entry>
71 <entry>aminsert</entry>
75 <entry>procedure identifiers for interface routines to the access
76 method. For example, regproc ids for opening, closing, and
77 getting instances from the access method appear here.</entry>
85 The <acronym>object ID</acronym> of the instance in
86 <filename>pg_am</filename> is used as a foreign key in lots of other
87 classes. You don't need to add a new instance to this class; all
88 you're interested in is the <acronym>object ID</acronym> of the access
89 method instance you want to extend:
92 SELECT oid FROM pg_am WHERE amname = 'btree';
100 We will use that <command>SELECT</command> in a <command>WHERE</command>
105 The <filename>amstrategies</filename> attribute exists to standardize
106 comparisons across data types. For example, <acronym>B-tree</acronym>s
107 impose a strict ordering on keys, lesser to greater. Since
108 <productname>Postgres</productname> allows the user to define operators,
109 <productname>Postgres</productname> cannot look at the name of an operator
110 (eg, ">" or "<") and tell what kind of comparison it is. In fact,
111 some access methods don't impose any ordering at all. For example,
112 <acronym>R-tree</acronym>s express a rectangle-containment relationship,
113 whereas a hashed data structure expresses only bitwise similarity based
114 on the value of a hash function. <productname>Postgres</productname>
115 needs some consistent way of taking a qualification in your query,
116 looking at the operator and then deciding if a usable index exists. This
117 implies that <productname>Postgres</productname> needs to know, for
118 example, that the "<=" and ">" operators partition a
119 <acronym>B-tree</acronym>. <productname>Postgres</productname>
120 uses strategies to express these relationships between
121 operators and the way they can be used to scan indices.
125 Defining a new set of strategies is beyond the scope of this discussion,
126 but we'll explain how <acronym>B-tree</acronym> strategies work because
127 you'll need to know that to add a new operator class. In the
128 <filename>pg_am</filename> class, the amstrategies attribute is the
129 number of strategies defined for this access method. For
130 <acronym>B-tree</acronym>s, this number is 5. These strategies
134 <title>B-tree Strategies</title>
135 <titleabbrev>B-tree</titleabbrev>
139 <entry>Operation</entry>
145 <entry>less than</entry>
149 <entry>less than or equal</entry>
157 <entry>greater than or equal</entry>
161 <entry>greater than</entry>
170 The idea is that you'll need to add procedures corresponding to the
171 comparisons above to the <filename>pg_amop</filename> relation (see below).
172 The access method code can use these strategy numbers, regardless of data
173 type, to figure out how to partition the <acronym>B-tree</acronym>,
174 compute selectivity, and so on. Don't worry about the details of adding
175 procedures yet; just understand that there must be a set of these
176 procedures for <filename>int2, int4, oid,</filename> and every other
177 data type on which a <acronym>B-tree</acronym> can operate.
181 Sometimes, strategies aren't enough information for the system to figure
182 out how to use an index. Some access methods require other support
183 routines in order to work. For example, the <acronym>B-tree</acronym>
184 access method must be able to compare two keys and determine whether one
185 is greater than, equal to, or less than the other. Similarly, the
186 <acronym>R-tree</acronym> access method must be able to compute
187 intersections, unions, and sizes of rectangles. These
188 operations do not correspond to user qualifications in
189 SQL queries; they are administrative routines used by
190 the access methods, internally.
194 In order to manage diverse support routines consistently across all
195 <productname>Postgres</productname> access methods,
196 <filename>pg_am</filename> includes an attribute called
197 <filename>amsupport</filename>. This attribute records the number of
198 support routines used by an access method. For <acronym>B-tree</acronym>s,
199 this number is one -- the routine to take two keys and return -1, 0, or
200 +1, depending on whether the first key is less than, equal
201 to, or greater than the second.
205 Strictly speaking, this routine can return a negative
206 number (< 0), 0, or a non-zero positive number (> 0).
212 The <filename>amstrategies</filename> entry in <filename>pg_am</filename>
214 of strategies defined for the access method in question. The procedures
215 for less than, less equal, and so on don't appear in
216 <filename>pg_am</filename>. Similarly, <filename>amsupport</filename>
217 is just the number of support routines required by the access
218 method. The actual routines are listed elsewhere.
222 By the way, the <filename>amorderstrategy</filename> entry tells whether
223 the access method supports ordered scan. Zero means it doesn't; if it
224 does, <filename>amorderstrategy</filename> is the number of the strategy
225 routine that corresponds to the ordering operator. For example, btree
226 has <filename>amorderstrategy</filename> = 1 which is its
227 "less than" strategy number.
231 The next class of interest is <filename>pg_opclass</filename>. This class
232 exists only to associate an operator class name and perhaps a default type
233 with an operator class oid. Some existing opclasses are <filename>int2_ops,
234 int4_ops,</filename> and <filename>oid_ops</filename>. You need to add an
235 instance with your opclass name (for example,
236 <filename>complex_abs_ops</filename>) to
237 <filename>pg_opclass</filename>. The <filename>oid</filename> of
238 this instance will be a foreign key in other classes, notably
239 <filename>pg_amop</filename>.
242 INSERT INTO pg_opclass (opcname, opcdeftype)
243 SELECT 'complex_abs_ops', oid FROM pg_type WHERE typname = 'complex';
245 SELECT oid, opcname, opcdeftype
247 WHERE opcname = 'complex_abs_ops';
249 oid | opcname | opcdeftype
250 --------+-----------------+------------
251 277975 | complex_abs_ops | 277946
255 Note that the oid for your <filename>pg_opclass</filename> instance will
256 be different! Don't worry about this though. We'll get this number
257 from the system later just like we got the oid of the type here.
261 The above example assumes that you want to make this new opclass the
262 default index opclass for the <filename>complex</filename> datatype.
263 If you don't, just insert zero into <filename>opcdeftype</filename>,
264 rather than inserting the datatype's oid:
267 INSERT INTO pg_opclass (opcname, opcdeftype) VALUES ('complex_abs_ops', 0);
273 So now we have an access method and an operator class.
274 We still need a set of operators. The procedure for
275 defining operators was discussed earlier in this manual.
276 For the <filename>complex_abs_ops</filename> operator class on Btrees,
277 the operators we require are:
280 absolute value less-than
281 absolute value less-than-or-equal
283 absolute value greater-than-or-equal
284 absolute value greater-than
289 Suppose the code that implements the functions defined
290 is stored in the file
291 <filename>PGROOT/src/tutorial/complex.c</filename>
295 Part of the C code looks like this: (note that we will only show the
296 equality operator for the rest of the examples. The other four
297 operators are very similar. Refer to <filename>complex.c</filename>
298 or <filename>complex.source</filename> for the details.)
301 #define Mag(c) ((c)->x*(c)->x + (c)->y*(c)->y)
304 complex_abs_eq(Complex *a, Complex *b)
306 double amag = Mag(a), bmag = Mag(b);
313 We make the function known to Postgres like this:
315 CREATE FUNCTION complex_abs_eq(complex, complex)
317 AS 'PGROOT/tutorial/obj/complex.so'
323 There are some important things that are happening here.
327 First, note that operators for less-than, less-than-or-equal, equal,
328 greater-than-or-equal, and greater-than for <filename>complex</filename>
329 are being defined. We can only have one operator named, say, = and
330 taking type <filename>complex</filename> for both operands. In this case
331 we don't have any other operator = for <filename>complex</filename>,
332 but if we were building a practical datatype we'd probably want = to
333 be the ordinary equality operation for complex numbers. In that case,
334 we'd need to use some other operator name for complex_abs_eq.
338 Second, although Postgres can cope with operators having
339 the same name as long as they have different input datatypes, C can only
340 cope with one global routine having a given name, period. So we shouldn't
341 name the C function something simple like <filename>abs_eq</filename>.
342 Usually it's a good practice to include the datatype name in the C
343 function name, so as not to conflict with functions for other datatypes.
347 Third, we could have made the Postgres name of the function
348 <filename>abs_eq</filename>, relying on Postgres to distinguish it
349 by input datatypes from any other Postgres function of the same name.
350 To keep the example simple, we make the function have the same names
351 at the C level and Postgres level.
355 Finally, note that these operator functions return Boolean values.
356 The access methods rely on this fact. (On the other
357 hand, the support function returns whatever the particular access method
358 expects -- in this case, a signed integer.) The final routine in the
359 file is the "support routine" mentioned when we discussed the amsupport
360 attribute of the <filename>pg_am</filename> class. We will use this
361 later on. For now, ignore it.
365 Now we are ready to define the operators:
369 leftarg = complex, rightarg = complex,
370 procedure = complex_abs_eq,
371 restrict = eqsel, join = eqjoinsel
376 things here are the procedure names (which are the <acronym>C</acronym>
377 functions defined above) and the restriction and join selectivity
378 functions. You should just use the selectivity functions used in
379 the example (see <filename>complex.source</filename>).
381 are different such functions for the less-than, equal, and greater-than
382 cases. These must be supplied, or the optimizer will be unable to
383 make effective use of the index.
387 The next step is to add entries for these operators to
388 the <filename>pg_amop</filename> relation. To do this,
389 we'll need the <filename>oid</filename>s of the operators we just
390 defined. We'll look up the names of all the operators that take
391 two <filename>complex</filename>es, and pick ours out:
394 SELECT o.oid AS opoid, o.oprname
395 INTO TABLE complex_ops_tmp
396 FROM pg_operator o, pg_type t
397 WHERE o.oprleft = t.oid and o.oprright = t.oid
398 and t.typname = 'complex';
411 (Again, some of your <filename>oid</filename> numbers will almost
412 certainly be different.) The operators we are interested in are those
413 with <filename>oid</filename>s 277970 through 277974. The values you
414 get will probably be different, and you should substitute them for the
415 values below. We will do this with a select statement.
419 Now we're ready to update <filename>pg_amop</filename> with our new
420 operator class. The most important thing in this entire discussion
421 is that the operators are ordered, from less than through greater
422 than, in <filename>pg_amop</filename>. We add the instances we need:
425 INSERT INTO pg_amop (amopid, amopclaid, amopopr, amopstrategy)
426 SELECT am.oid, opcl.oid, c.opoid, 1
427 FROM pg_am am, pg_opclass opcl, complex_ops_tmp c
428 WHERE amname = 'btree' AND
429 opcname = 'complex_abs_ops' AND
433 Now do this for the other operators substituting for the "1" in the
434 third line above and the "<" in the last line. Note the order:
435 "less than" is 1, "less than or equal" is 2, "equal" is 3, "greater
436 than or equal" is 4, and "greater than" is 5.
440 The next step is registration of the "support routine" previously
441 described in our discussion of <filename>pg_am</filename>. The
442 <filename>oid</filename> of this support routine is stored in the
443 <filename>pg_amproc</filename> class, keyed by the access method
444 <filename>oid</filename> and the operator class <filename>oid</filename>.
445 First, we need to register the function in
446 <productname>Postgres</productname> (recall that we put the
447 <acronym>C</acronym> code that implements this routine in the bottom of
448 the file in which we implemented the operator routines):
451 CREATE FUNCTION complex_abs_cmp(complex, complex)
453 AS 'PGROOT/tutorial/obj/complex.so'
456 SELECT oid, proname FROM pg_proc
457 WHERE proname = 'complex_abs_cmp';
460 --------+-----------------
461 277997 | complex_abs_cmp
465 (Again, your <filename>oid</filename> number will probably be different.)
466 We can add the new instance as follows:
469 INSERT INTO pg_amproc (amid, amopclaid, amproc, amprocnum)
470 SELECT a.oid, b.oid, c.oid, 1
471 FROM pg_am a, pg_opclass b, pg_proc c
472 WHERE a.amname = 'btree' AND
473 b.opcname = 'complex_abs_ops' AND
474 c.proname = 'complex_abs_cmp';
479 And we're done! (Whew.) It should now be possible to create
480 and use btree indexes on <filename>complex</filename> columns.
485 <!-- Keep this comment at the end of the file
490 sgml-minimize-attributes:nil
491 sgml-always-quote-attributes:t
494 sgml-parent-document:nil
495 sgml-default-dtd-file:"./reference.ced"
496 sgml-exposed-tags:nil
497 sgml-local-catalogs:"/usr/lib/sgml/catalog"
498 sgml-local-ecat-files:nil