2 $Header: /cvsroot/pgsql/doc/src/sgml/xfunc.sgml,v 1.39 2001/10/26 19:58:12 tgl Exp $
6 <title id="xfunc-title">Extending <acronym>SQL</acronym>: Functions</title>
8 <sect1 id="xfunc-intro">
9 <title>Introduction</title>
12 Historically, functions were perhaps considered a tool for creating
13 types. Today, few people build their own types but many write
14 their own functions. This introduction ought to be changed to
19 As it turns out, part of defining a new type is the
20 definition of functions that describe its behavior.
21 Consequently, while it is possible to define a new
22 function without defining a new type, the reverse is
23 not true. We therefore describe how to add new functions
24 to <productname>Postgres</productname> before describing
29 <productname>PostgreSQL</productname> provides four kinds of
35 query language functions
36 (functions written in <acronym>SQL</acronym>)
42 functions (functions written in, for example, <application>PL/Tcl</> or <application>PL/pgSQL</>)
60 of function can take a base type, a composite type or
61 some combination as arguments (parameters). In addition,
62 every kind of function can return a base type or
63 a composite type. It's easiest to define <acronym>SQL</acronym>
64 functions, so we'll start with those. Examples in this section
65 can also be found in <filename>funcs.sql</filename>
66 and <filename>funcs.c</filename> in the tutorial directory.
70 <sect1 id="xfunc-sql">
71 <title>Query Language (<acronym>SQL</acronym>) Functions</title>
74 SQL functions execute an arbitrary list of SQL statements, returning
75 the results of the last query in the list. In the simple (non-set)
76 case, the first row of the last query's result will be returned.
77 (Bear in mind that <quote>the first row</quote> is not well-defined
78 unless you use <literal>ORDER BY</>.) If the last query happens
79 to return no rows at all, NULL will be returned.
83 Alternatively, an SQL function may be declared to return a set,
84 by specifying the function's return type
85 as <literal>SETOF</literal> <replaceable>sometype</>. In this case
86 all rows of the last query's result are returned. Further details
91 The body of an SQL function should be a list of one or more SQL
92 statements separated by semicolons. Note that because the syntax
93 of the <command>CREATE FUNCTION</command> command requires the body of the
94 function to be enclosed in single quotes, single quote marks
96 in the body of the function must be escaped, by writing two single
97 quotes (<literal>''</>) or a backslash (<literal>\'</>) where each
102 Arguments to the SQL function may be referenced in the function
103 body using the syntax <literal>$<replaceable>n</></>: $1 refers to
104 the first argument, $2 to the second, and so on. If an argument
105 is of a composite type, then the <quote>dot notation</quote>,
106 e.g., <literal>$1.emp</literal>, may be used to access attributes
111 <title>Examples</title>
114 To illustrate a simple SQL function, consider the following,
115 which might be used to debit a bank account:
118 CREATE FUNCTION tp1 (integer, numeric) RETURNS integer AS '
120 SET balance = balance - $2
121 WHERE accountno = $1;
126 A user could execute this function to debit account 17 by $100.00 as
130 SELECT tp1(17, 100.0);
135 In practice one would probably like a more useful result from the
136 function than a constant <quote>1</>, so a more likely definition
140 CREATE FUNCTION tp1 (integer, numeric) RETURNS numeric AS '
142 SET balance = balance - $2
143 WHERE accountno = $1;
144 SELECT balance FROM bank WHERE accountno = $1;
148 which adjusts the balance and returns the new balance.
152 Any collection of commands in the <acronym>SQL</acronym>
153 language can be packaged together and defined as a function.
154 The commands can include data modification (i.e.,
155 <command>INSERT</command>, <command>UPDATE</command>, and
156 <command>DELETE</command>) as well
157 as <command>SELECT</command> queries. However, the final command
158 must be a <command>SELECT</command> that returns whatever is
159 specified as the function's return type.
162 CREATE FUNCTION clean_EMP () RETURNS integer AS '
164 WHERE EMP.salary <= 0;
165 SELECT 1 AS ignore_this;
180 <title><acronym>SQL</acronym> Functions on Base Types</title>
183 The simplest possible <acronym>SQL</acronym> function has no arguments and
184 simply returns a base type, such as <type>integer</type>:
187 CREATE FUNCTION one() RETURNS integer AS '
202 Notice that we defined a column alias within the function body for the result of the function
203 (with the name <literal>RESULT</>), but this column alias is not visible
204 outside the function. Hence, the result is labelled <literal>one</>
205 instead of <literal>RESULT</>.
209 It is almost as easy to define <acronym>SQL</acronym> functions
210 that take base types as arguments. In the example below, notice
211 how we refer to the arguments within the function as <literal>$1</>
215 CREATE FUNCTION add_em(integer, integer) RETURNS integer AS '
219 SELECT add_em(1, 2) AS answer;
231 <title><acronym>SQL</acronym> Functions on Composite Types</title>
234 When specifying functions with arguments of composite
235 types, we must not only specify which
236 argument we want (as we did above with <literal>$1</> and <literal>$2</literal>) but
237 also the attributes of that argument. For example, suppose that
238 <type>EMP</type> is a table containing employee data, and therefore
239 also the name of the composite type of each row of the table. Here
240 is a function <function>double_salary</function> that computes what your
241 salary would be if it were doubled:
244 CREATE FUNCTION double_salary(EMP) RETURNS integer AS '
245 SELECT $1.salary * 2 AS salary;
248 SELECT name, double_salary(EMP) AS dream
250 WHERE EMP.cubicle ~= point '(2,1)';
261 Notice the use of the syntax <literal>$1.salary</literal>
262 to select one field of the argument row value. Also notice
263 how the calling SELECT command uses a table name to denote
264 the entire current row of that table as a composite value.
268 It is also possible to build a function that returns a composite type.
269 (However, as we'll see below, there are some
270 unfortunate restrictions on how the function may be used.)
271 This is an example of a function
272 that returns a single <type>EMP</type> row:
275 CREATE FUNCTION new_emp() RETURNS EMP AS '
276 SELECT text ''None'' AS name,
279 point ''(2,2)'' AS cubicle;
285 In this case we have specified each of the attributes
286 with a constant value, but any computation or expression
287 could have been substituted for these constants.
288 Note two important things about defining the function:
293 The target list order must be exactly the same as
294 that in which the columns appear in the table associated
295 with the composite type.
300 You must typecast the expressions to match the
301 definition of the composite type, or you will get errors like this:
304 ERROR: function declared to return emp returns varchar instead of text at column 1
313 In the present release of <productname>PostgreSQL</productname>
314 there are some unpleasant restrictions on how functions returning
315 composite types can be used. Briefly, when calling a function that
316 returns a row, we cannot retrieve the entire row. We must either
317 project a single attribute out of the row or pass the entire row into
318 another function. (Trying to display the entire row value will yield
319 a meaningless number.) For example,
322 SELECT name(new_emp());
333 This example makes use of the
334 function notation for projecting attributes. The simple way
335 to explain this is that we can usually use the
336 notations <literal>attribute(table)</> and <literal>table.attribute</>
341 -- this is the same as:
342 -- SELECT EMP.name AS youngster FROM EMP WHERE EMP.age < 30
344 SELECT name(EMP) AS youngster
346 WHERE age(EMP) < 30;
357 The reason why, in general, we must use the function
358 syntax for projecting attributes of function return
359 values is that the parser just doesn't understand
360 the dot syntax for projection when combined
364 SELECT new_emp().name AS nobody;
365 ERROR: parser: parse error at or near "."
370 Another way to use a function returning a row result is to declare a
371 second function accepting a rowtype parameter, and pass the function
375 CREATE FUNCTION getname(emp) RETURNS text AS
381 SELECT getname(new_emp());
391 <title><acronym>SQL</acronym> Functions Returning Sets</title>
394 As previously mentioned, an SQL function may be declared as
395 returning <literal>SETOF</literal> <replaceable>sometype</>.
396 In this case the function's final SELECT query is executed to
397 completion, and each row it outputs is returned as an element
402 Functions returning sets may only be called in the target list
403 of a SELECT query. For each row that the SELECT generates by itself,
404 the function returning set is invoked, and an output row is generated
405 for each element of the function's result set. An example:
408 CREATE FUNCTION listchildren(text) RETURNS SETOF text AS
409 'SELECT name FROM nodes WHERE parent = $1'
425 SELECT listchildren('Top');
433 SELECT name, listchildren(name) FROM nodes;
435 --------+--------------
444 Notice that no output row appears for Child2, Child3, etc.
445 This happens because listchildren() returns an empty set
446 for those inputs, so no output rows are generated.
451 <sect1 id="xfunc-pl">
452 <title>Procedural Language Functions</title>
455 Procedural languages aren't built into the <productname>PostgreSQL</productname> server; they are offered
456 by loadable modules. Please refer to the documentation of the
457 procedural language in question for details about the syntax and how the function body
458 is interpreted for each language.
462 There are currently four procedural languages available in the
463 standard <productname>PostgreSQL</productname> distribution:
464 <application>PL/pgSQL</application>, <application>PL/Tcl</application>,
465 <application>PL/Perl</application>, and <application>PL/Python</application>. Other languages can be
466 defined by users. Refer to <xref linkend="xplang"> for more
467 information. The basics of developing a new procedural language are covered in <xref linkend="xfunc-plhandler">.
471 <sect1 id="xfunc-internal">
472 <title>Internal Functions</title>
475 Internal functions are functions written in C that have been statically
476 linked into the <productname>PostgreSQL</productname> server.
477 The <quote>body</quote> of the function definition
478 specifies the C-language name of the function, which need not be the
479 same as the name being declared for SQL use.
480 (For reasons of backwards compatibility, an empty body
481 is accepted as meaning that the C-language function name is the
482 same as the SQL name.)
486 Normally, all internal functions present in the
487 backend are declared during the initialization of the database cluster (<command>initdb</command>),
488 but a user could use <command>CREATE FUNCTION</command>
489 to create additional alias names for an internal function.
490 Internal functions are declared in <command>CREATE FUNCTION</command>
491 with language name <literal>internal</literal>. For instance, to
492 create an alias for the <function>sqrt</function> function:
494 CREATE FUNCTION square_root(double precision) RETURNS double precision
499 (Most internal functions expect to be declared <quote>strict</quote>.)
504 Not all <quote>predefined</quote> functions are
505 <quote>internal</quote> in the above sense. Some predefined
506 functions are written in SQL.
512 <title>C Language Functions</title>
515 User-defined functions can be written in C (or a language that can
516 be made compatible with C, such as C++). Such functions are
517 compiled into dynamically loadable objects (also called shared
518 libraries) and are loaded by the server on demand. The dynamic
519 loading feature is what distinguishes <quote>C language</> functions
520 from <quote>internal</> functions --- the actual coding conventions
521 are essentially the same for both. (Hence, the standard internal
522 function library is a rich source of coding examples for user-defined
527 Two different calling conventions are currently used for C functions.
528 The newer <quote>version 1</quote> calling convention is indicated by writing
529 a <literal>PG_FUNCTION_INFO_V1()</literal> macro call for the function,
530 as illustrated below. Lack of such a macro indicates an old-style
531 ("version 0") function. The language name specified in <command>CREATE FUNCTION</command>
532 is <literal>C</literal> in either case. Old-style functions are now deprecated
533 because of portability problems and lack of functionality, but they
534 are still supported for compatibility reasons.
537 <sect2 id="xfunc-c-dynload">
538 <title>Dynamic Loading</title>
541 The first time a user-defined function in a particular
542 loadable object file is called in a backend session,
543 the dynamic loader loads that object file into memory so that the
544 function can be called. The <command>CREATE FUNCTION</command>
545 for a user-defined C function must therefore specify two pieces of
546 information for the function: the name of the loadable
547 object file, and the C name (link symbol) of the specific function to call
548 within that object file. If the C name is not explicitly specified then
549 it is assumed to be the same as the SQL function name.
553 The following algorithm is used to locate the shared object file
554 based on the name given in the <command>CREATE FUNCTION</command>
560 If the name is an absolute path, the given file is loaded.
566 If the name starts with the string <literal>$libdir</literal>,
567 that part is replaced by the PostgreSQL package library directory
568 name, which is determined at build time.
574 If the name does not contain a directory part, the file is
575 searched for in the path specified by the configuration variable
576 <varname>dynamic_library_path</varname>.
582 Otherwise (the file was not found in the path, or it contains a
583 non-absolute directory part), the dynamic loader will try to
584 take the name as given, which will most likely fail. (It is
585 unreliable to depend on the current working directory.)
590 If this sequence does not work, the platform-specific shared
591 library file name extension (often <filename>.so</filename>) is
592 appended to the given name and this sequence is tried again. If
593 that fails as well, the load will fail.
598 The user id the <application>PostgreSQL</application> server runs
599 as must be able to traverse the path to the file you intend to
600 load. Making the file or a higher-level directory not readable
601 and/or not executable by the <quote>postgres</quote> user is a
607 In any case, the file name that is given in the
608 <command>CREATE FUNCTION</command> command is recorded literally
609 in the system catalogs, so if the file needs to be loaded again
610 the same procedure is applied.
615 <application>PostgreSQL</application> will not compile a C function
616 automatically. The object file must be compiled before it is referenced
618 FUNCTION</> command. See <xref linkend="dfunc"> for additional
625 After it is used for the first time, a dynamically loaded object
626 file is retained in memory. Future calls in the same session to the
627 function(s) in that file will only incur the small overhead of a symbol
628 table lookup. If you need to force a reload of an object file, for
629 example after recompiling it, use the <command>LOAD</> command or
630 begin a fresh session.
635 It is recommended to locate shared libraries either relative to
636 <literal>$libdir</literal> or through the dynamic library path.
637 This simplifies version upgrades if the new installation is at a
638 different location. The actual directory that
639 <literal>$libdir</literal> stands for can be found out with the
640 command <literal>pg_config --pkglibdir</literal>.
645 Before <application>PostgreSQL</application> release 7.2, only exact
646 absolute paths to object files could be specified in <command>CREATE
647 FUNCTION</>. This approach is now deprecated since it makes the
648 function definition unnecessarily unportable. It's best to specify
649 just the shared library name with no path nor extension, and let
650 the search mechanism provide that information instead.
657 <title>Base Types in C-Language Functions</title>
660 <xref linkend="xfunc-c-type-table"> gives the C type required for
661 parameters in the C functions that will be loaded into Postgres.
662 The <quote>Defined In</quote> column gives the header file that
663 needs to be included to get the type definition. (The actual
664 definition may be in a different file that is included by the
665 listed file. It is recommended that users stick to the defined
666 interface.) Note that you should always include
667 <filename>postgres.h</filename> first in any source file, because
668 it declares a number of things that you will need anyway.
671 <table tocentry="1" id="xfunc-c-type-table">
672 <title>Equivalent C Types
673 for Built-In <productname>PostgreSQL</productname> Types</title>
674 <titleabbrev>Equivalent C Types</titleabbrev>
691 <entry><type>abstime</type></entry>
692 <entry><type>AbsoluteTime</type></entry>
693 <entry><filename>utils/nabstime.h</filename></entry>
696 <entry><type>boolean</type></entry>
697 <entry><type>bool</type></entry>
698 <entry><filename>postgres.h</filename> (maybe compiler built-in)</entry>
701 <entry><type>box</type></entry>
702 <entry><type>BOX*</type></entry>
703 <entry><filename>utils/geo-decls.h</filename></entry>
706 <entry><type>bytea</type></entry>
707 <entry><type>bytea*</type></entry>
708 <entry><filename>postgres.h</filename></entry>
711 <entry><type>"char"</type></entry>
712 <entry><type>char</type></entry>
713 <entry>(compiler built-in)</entry>
716 <entry><type>character</type></entry>
717 <entry><type>BpChar*</type></entry>
718 <entry><filename>postgres.h</filename></entry>
721 <entry><type>cid</type></entry>
722 <entry><type>CommandId</type></entry>
723 <entry><filename>postgres.h</filename></entry>
726 <entry><type>date</type></entry>
727 <entry><type>DateADT</type></entry>
728 <entry><filename>utils/date.h</filename></entry>
731 <entry><type>smallint</type> (<type>int2</type>)</entry>
732 <entry><type>int2</type> or <type>int16</type></entry>
733 <entry><filename>postgres.h</filename></entry>
736 <entry><type>int2vector</type></entry>
737 <entry><type>int2vector*</type></entry>
738 <entry><filename>postgres.h</filename></entry>
741 <entry><type>integer</type> (<type>int4</type>)</entry>
742 <entry><type>int4</type> or <type>int32</type></entry>
743 <entry><filename>postgres.h</filename></entry>
746 <entry><type>real</type> (<type>float4</type>)</entry>
747 <entry><type>float4*</type></entry>
748 <entry><filename>postgres.h</filename></entry>
751 <entry><type>double precision</type> (<type>float8</type>)</entry>
752 <entry><type>float8*</type></entry>
753 <entry><filename>postgres.h</filename></entry>
756 <entry><type>interval</type></entry>
757 <entry><type>Interval*</type></entry>
758 <entry><filename>utils/timestamp.h</filename></entry>
761 <entry><type>lseg</type></entry>
762 <entry><type>LSEG*</type></entry>
763 <entry><filename>utils/geo-decls.h</filename></entry>
766 <entry><type>name</type></entry>
767 <entry><type>Name</type></entry>
768 <entry><filename>postgres.h</filename></entry>
771 <entry><type>oid</type></entry>
772 <entry><type>Oid</type></entry>
773 <entry><filename>postgres.h</filename></entry>
776 <entry><type>oidvector</type></entry>
777 <entry><type>oidvector*</type></entry>
778 <entry><filename>postgres.h</filename></entry>
781 <entry><type>path</type></entry>
782 <entry><type>PATH*</type></entry>
783 <entry><filename>utils/geo-decls.h</filename></entry>
786 <entry><type>point</type></entry>
787 <entry><type>POINT*</type></entry>
788 <entry><filename>utils/geo-decls.h</filename></entry>
791 <entry><type>regproc</type></entry>
792 <entry><type>regproc</type></entry>
793 <entry><filename>postgres.h</filename></entry>
796 <entry><type>reltime</type></entry>
797 <entry><type>RelativeTime</type></entry>
798 <entry><filename>utils/nabstime.h</filename></entry>
801 <entry><type>text</type></entry>
802 <entry><type>text*</type></entry>
803 <entry><filename>postgres.h</filename></entry>
806 <entry><type>tid</type></entry>
807 <entry><type>ItemPointer</type></entry>
808 <entry><filename>storage/itemptr.h</filename></entry>
811 <entry><type>time</type></entry>
812 <entry><type>TimeADT</type></entry>
813 <entry><filename>utils/date.h</filename></entry>
816 <entry><type>time with time zone</type></entry>
817 <entry><type>TimeTzADT</type></entry>
818 <entry><filename>utils/date.h</filename></entry>
821 <entry><type>timestamp</type></entry>
822 <entry><type>Timestamp*</type></entry>
823 <entry><filename>utils/timestamp.h</filename></entry>
826 <entry><type>tinterval</type></entry>
827 <entry><type>TimeInterval</type></entry>
828 <entry><filename>utils/nabstime.h</filename></entry>
831 <entry><type>varchar</type></entry>
832 <entry><type>VarChar*</type></entry>
833 <entry><filename>postgres.h</filename></entry>
836 <entry><type>xid</type></entry>
837 <entry><type>TransactionId</type></entry>
838 <entry><filename>postgres.h</filename></entry>
845 Internally, <productname>Postgres</productname> regards a
846 base type as a <quote>blob of memory</quote>. The user-defined
847 functions that you define over a type in turn define the
848 way that <productname>Postgres</productname> can operate
849 on it. That is, <productname>Postgres</productname> will
850 only store and retrieve the data from disk and use your
851 user-defined functions to input, process, and output the data.
852 Base types can have one of three internal formats:
857 pass by value, fixed-length
862 pass by reference, fixed-length
867 pass by reference, variable-length
874 By-value types can only be 1, 2 or 4 bytes in length
875 (also 8 bytes, if <literal>sizeof(Datum)</literal> is 8 on your machine).
876 You should be careful
877 to define your types such that they will be the same
878 size (in bytes) on all architectures. For example, the
879 <literal>long</literal> type is dangerous because it
880 is 4 bytes on some machines and 8 bytes on others, whereas
881 <type>int</type> type is 4 bytes on most
882 Unix machines. A reasonable implementation of
883 the <type>int4</type> type on Unix
887 /* 4-byte integer, passed by value */
891 <productname>PostgreSQL</productname> automatically figures
892 things out so that the integer types really have the size they
897 On the other hand, fixed-length types of any size may
898 be passed by-reference. For example, here is a sample
899 implementation of a <productname>PostgreSQL</productname> type:
902 /* 16-byte structure, passed by reference */
911 Only pointers to such types can be used when passing
912 them in and out of <productname>Postgres</productname> functions.
913 To return a value of such a type, allocate the right amount of
914 memory with <literal>palloc()</literal>, fill in the allocated memory,
915 and return a pointer to it. (Alternatively, you can return an input
916 value of the same type by returning its pointer. <emphasis>Never</>
917 modify the contents of a pass-by-reference input value, however.)
921 Finally, all variable-length types must also be passed
922 by reference. All variable-length types must begin
923 with a length field of exactly 4 bytes, and all data to
924 be stored within that type must be located in the memory
925 immediately following that length field. The
926 length field is the total length of the structure
927 (i.e., it includes the size of the length field
928 itself). We can define the text type as follows:
939 Obviously, the data field shown here is not long enough to hold
940 all possible strings; it's impossible to declare such
941 a structure in <acronym>C</acronym>. When manipulating
942 variable-length types, we must be careful to allocate
943 the correct amount of memory and initialize the length field.
944 For example, if we wanted to store 40 bytes in a text
945 structure, we might use a code fragment like this:
948 #include "postgres.h"
950 char buffer[40]; /* our source data */
952 text *destination = (text *) palloc(VARHDRSZ + 40);
953 destination->length = VARHDRSZ + 40;
954 memmove(destination->data, buffer, 40);
960 Now that we've gone over all of the possible structures
961 for base types, we can show some examples of real functions.
966 <title>Version-0 Calling Conventions for C-Language Functions</title>
969 We present the <quote>old style</quote> calling convention first --- although
970 this approach is now deprecated, it's easier to get a handle on
971 initially. In the version-0 method, the arguments and result
972 of the C function are just declared in normal C style, but being
973 careful to use the C representation of each SQL data type as shown
978 Here are some examples:
981 #include "postgres.h"
982 #include <string.h>
992 /* By Reference, Fixed Length */
995 add_one_float8(float8 *arg)
997 float8 *result = (float8 *) palloc(sizeof(float8));
999 *result = *arg + 1.0;
1005 makepoint(Point *pointx, Point *pointy)
1007 Point *new_point = (Point *) palloc(sizeof(Point));
1009 new_point->x = pointx->x;
1010 new_point->y = pointy->y;
1015 /* By Reference, Variable Length */
1021 * VARSIZE is the total size of the struct in bytes.
1023 text *new_t = (text *) palloc(VARSIZE(t));
1024 VARATT_SIZEP(new_t) = VARSIZE(t);
1026 * VARDATA is a pointer to the data region of the struct.
1028 memcpy((void *) VARDATA(new_t), /* destination */
1029 (void *) VARDATA(t), /* source */
1030 VARSIZE(t)-VARHDRSZ); /* how many bytes */
1035 concat_text(text *arg1, text *arg2)
1037 int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
1038 text *new_text = (text *) palloc(new_text_size);
1040 VARATT_SIZEP(new_text) = new_text_size;
1041 memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1)-VARHDRSZ);
1042 memcpy(VARDATA(new_text) + (VARSIZE(arg1)-VARHDRSZ),
1043 VARDATA(arg2), VARSIZE(arg2)-VARHDRSZ);
1050 Supposing that the above code has been prepared in file
1051 <filename>funcs.c</filename> and compiled into a shared object,
1052 we could define the functions to <productname>Postgres</productname>
1053 with commands like this:
1056 CREATE FUNCTION add_one(int4) RETURNS int4
1057 AS '<replaceable>PGROOT</replaceable>/tutorial/funcs' LANGUAGE 'c'
1060 -- note overloading of SQL function name add_one()
1061 CREATE FUNCTION add_one(float8) RETURNS float8
1062 AS '<replaceable>PGROOT</replaceable>/tutorial/funcs',
1064 LANGUAGE 'c' WITH (isStrict);
1066 CREATE FUNCTION makepoint(point, point) RETURNS point
1067 AS '<replaceable>PGROOT</replaceable>/tutorial/funcs' LANGUAGE 'c'
1070 CREATE FUNCTION copytext(text) RETURNS text
1071 AS '<replaceable>PGROOT</replaceable>/tutorial/funcs' LANGUAGE 'c'
1074 CREATE FUNCTION concat_text(text, text) RETURNS text
1075 AS '<replaceable>PGROOT</replaceable>/tutorial/funcs' LANGUAGE 'c'
1081 Here <replaceable>PGROOT</replaceable> stands for the full path to
1082 the <productname>Postgres</productname> source tree. (Better style would
1083 be to use just <literal>'funcs'</> in the <literal>AS</> clause,
1084 after having added <replaceable>PGROOT</replaceable><literal>/tutorial</>
1085 to the search path. In any case, we may omit the system-specific
1086 extension for a shared library, commonly <literal>.so</literal> or
1087 <literal>.sl</literal>.)
1091 Notice that we have specified the functions as <quote>strict</quote>,
1093 the system should automatically assume a NULL result if any input
1094 value is NULL. By doing this, we avoid having to check for NULL inputs
1095 in the function code. Without this, we'd have to check for NULLs
1096 explicitly, for example by checking for a null pointer for each
1097 pass-by-reference argument. (For pass-by-value arguments, we don't
1098 even have a way to check!)
1102 Although this calling convention is simple to use,
1103 it is not very portable; on some architectures there are problems
1104 with passing smaller-than-int data types this way. Also, there is
1105 no simple way to return a NULL result, nor to cope with NULL arguments
1106 in any way other than making the function strict. The version-1
1107 convention, presented next, overcomes these objections.
1112 <title>Version-1 Calling Conventions for C-Language Functions</title>
1115 The version-1 calling convention relies on macros to suppress most
1116 of the complexity of passing arguments and results. The C declaration
1117 of a version-1 function is always
1119 Datum funcname(PG_FUNCTION_ARGS)
1121 In addition, the macro call
1123 PG_FUNCTION_INFO_V1(funcname);
1125 must appear in the same source file (conventionally it's written
1126 just before the function itself). This macro call is not needed
1127 for <literal>internal</>-language functions, since Postgres currently
1128 assumes all internal functions are version-1. However, it is
1129 <emphasis>required</emphasis> for dynamically-loaded functions.
1133 In a version-1 function, each actual argument is fetched using a
1134 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
1135 macro that corresponds to the argument's datatype, and the result
1137 <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
1138 macro for the return type.
1142 Here we show the same functions as above, coded in version-1 style:
1145 #include "postgres.h"
1146 #include <string.h>
1151 PG_FUNCTION_INFO_V1(add_one);
1154 add_one(PG_FUNCTION_ARGS)
1156 int32 arg = PG_GETARG_INT32(0);
1158 PG_RETURN_INT32(arg + 1);
1161 /* By Reference, Fixed Length */
1163 PG_FUNCTION_INFO_V1(add_one_float8);
1166 add_one_float8(PG_FUNCTION_ARGS)
1168 /* The macros for FLOAT8 hide its pass-by-reference nature */
1169 float8 arg = PG_GETARG_FLOAT8(0);
1171 PG_RETURN_FLOAT8(arg + 1.0);
1174 PG_FUNCTION_INFO_V1(makepoint);
1177 makepoint(PG_FUNCTION_ARGS)
1179 /* Here, the pass-by-reference nature of Point is not hidden */
1180 Point *pointx = PG_GETARG_POINT_P(0);
1181 Point *pointy = PG_GETARG_POINT_P(1);
1182 Point *new_point = (Point *) palloc(sizeof(Point));
1184 new_point->x = pointx->x;
1185 new_point->y = pointy->y;
1187 PG_RETURN_POINT_P(new_point);
1190 /* By Reference, Variable Length */
1192 PG_FUNCTION_INFO_V1(copytext);
1195 copytext(PG_FUNCTION_ARGS)
1197 text *t = PG_GETARG_TEXT_P(0);
1199 * VARSIZE is the total size of the struct in bytes.
1201 text *new_t = (text *) palloc(VARSIZE(t));
1202 VARATT_SIZEP(new_t) = VARSIZE(t);
1204 * VARDATA is a pointer to the data region of the struct.
1206 memcpy((void *) VARDATA(new_t), /* destination */
1207 (void *) VARDATA(t), /* source */
1208 VARSIZE(t)-VARHDRSZ); /* how many bytes */
1209 PG_RETURN_TEXT_P(new_t);
1212 PG_FUNCTION_INFO_V1(concat_text);
1215 concat_text(PG_FUNCTION_ARGS)
1217 text *arg1 = PG_GETARG_TEXT_P(0);
1218 text *arg2 = PG_GETARG_TEXT_P(1);
1219 int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
1220 text *new_text = (text *) palloc(new_text_size);
1222 VARATT_SIZEP(new_text) = new_text_size;
1223 memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1)-VARHDRSZ);
1224 memcpy(VARDATA(new_text) + (VARSIZE(arg1)-VARHDRSZ),
1225 VARDATA(arg2), VARSIZE(arg2)-VARHDRSZ);
1226 PG_RETURN_TEXT_P(new_text);
1232 The <command>CREATE FUNCTION</command> commands are the same as
1233 for the version-0 equivalents.
1237 At first glance, the version-1 coding conventions may appear to
1238 be just pointless obscurantism. However, they do offer a number
1239 of improvements, because the macros can hide unnecessary detail.
1240 An example is that in coding add_one_float8, we no longer need to
1241 be aware that float8 is a pass-by-reference type. Another
1242 example is that the GETARG macros for variable-length types hide
1243 the need to deal with fetching <quote>toasted</quote> (compressed or
1244 out-of-line) values. The old-style <function>copytext</function>
1245 and <function>concat_text</function> functions shown above are
1246 actually wrong in the presence of toasted values, because they
1247 don't call <function>pg_detoast_datum()</function> on their
1248 inputs. (The handler for old-style dynamically-loaded functions
1249 currently takes care of this detail, but it does so less
1250 efficiently than is possible for a version-1 function.)
1254 One big improvement in version-1 functions is better handling of NULL
1255 inputs and results. The macro <function>PG_ARGISNULL(n)</function>
1256 allows a function to test whether each input is NULL (of course, doing
1257 this is only necessary in functions not declared <quote>strict</>).
1259 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> macros,
1260 the input arguments are counted beginning at zero. Note that one
1261 should refrain from executing
1262 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> until
1263 one has verified that the argument isn't NULL.
1264 To return a NULL result, execute <function>PG_RETURN_NULL()</function>;
1265 this works in both strict and non-strict functions.
1269 The version-1 function call conventions make it possible to
1270 return <quote>set</quote> results and implement trigger functions and
1271 procedural-language call handlers. Version-1 code is also more
1272 portable than version-0, because it does not break ANSI C restrictions
1273 on function call protocol. For more details see
1274 <filename>src/backend/utils/fmgr/README</filename> in the source
1280 <title>Composite Types in C-Language Functions</title>
1283 Composite types do not have a fixed layout like C
1284 structures. Instances of a composite type may contain
1285 null fields. In addition, composite types that are
1286 part of an inheritance hierarchy may have different
1287 fields than other members of the same inheritance hierarchy.
1288 Therefore, <productname>Postgres</productname> provides
1289 a procedural interface for accessing fields of composite types
1290 from C. As <productname>Postgres</productname> processes
1291 a set of rows, each row will be passed into your
1292 function as an opaque structure of type <literal>TUPLE</literal>.
1293 Suppose we want to write a function to answer the query
1296 SELECT name, c_overpaid(emp, 1500) AS overpaid
1298 WHERE name = 'Bill' OR name = 'Sam';
1301 In the query above, we can define c_overpaid as:
1304 #include "postgres.h"
1305 #include "executor/executor.h" /* for GetAttributeByName() */
1308 c_overpaid(TupleTableSlot *t, /* the current row of EMP */
1314 salary = DatumGetInt32(GetAttributeByName(t, "salary", &isnull));
1317 return salary > limit;
1320 /* In version-1 coding, the above would look like this: */
1322 PG_FUNCTION_INFO_V1(c_overpaid);
1325 c_overpaid(PG_FUNCTION_ARGS)
1327 TupleTableSlot *t = (TupleTableSlot *) PG_GETARG_POINTER(0);
1328 int32 limit = PG_GETARG_INT32(1);
1332 salary = DatumGetInt32(GetAttributeByName(t, "salary", &isnull));
1334 PG_RETURN_BOOL(false);
1335 /* Alternatively, we might prefer to do PG_RETURN_NULL() for null salary */
1337 PG_RETURN_BOOL(salary > limit);
1343 <function>GetAttributeByName</function> is the
1344 <productname>Postgres</productname> system function that
1345 returns attributes out of the current row. It has
1346 three arguments: the argument of type <type>TupleTableSlot*</type> passed into
1347 the function, the name of the desired attribute, and a
1348 return parameter that tells whether the attribute
1349 is null. <function>GetAttributeByName</function> returns a Datum
1350 value that you can convert to the proper datatype by using the
1351 appropriate <function>DatumGet<replaceable>XXX</replaceable>()</function> macro.
1355 The following query lets <productname>Postgres</productname>
1356 know about the <function>c_overpaid</function> function:
1359 CREATE FUNCTION c_overpaid(emp, int4)
1361 AS '<replaceable>PGROOT</replaceable>/tutorial/obj/funcs'
1367 While there are ways to construct new rows or modify
1368 existing rows from within a C function, these
1369 are far too complex to discuss in this manual.
1370 Consult the backend source code for examples.
1375 <title>Writing Code</title>
1378 We now turn to the more difficult task of writing
1379 programming language functions. Be warned: this section
1380 of the manual will not make you a programmer. You must
1381 have a good understanding of <acronym>C</acronym>
1382 (including the use of pointers and the malloc memory manager)
1383 before trying to write <acronym>C</acronym> functions for
1384 use with <productname>Postgres</productname>. While it may
1385 be possible to load functions written in languages other
1386 than <acronym>C</acronym> into <productname>Postgres</productname>,
1387 this is often difficult (when it is possible at all)
1388 because other languages, such as <acronym>FORTRAN</acronym>
1389 and <acronym>Pascal</acronym> often do not follow the same
1390 <firstterm>calling convention</firstterm>
1391 as <acronym>C</acronym>. That is, other
1392 languages do not pass argument and return values
1393 between functions in the same way. For this reason, we
1394 will assume that your programming language functions
1395 are written in <acronym>C</acronym>.
1399 The basic rules for building <acronym>C</acronym> functions
1405 Use <literal>pg_config --includedir-server</literal> to find
1406 out where the PostgreSQL server header files are installed on
1407 your system (or the system that your users will be running
1408 on). This option is new with PostgreSQL 7.2. For PostgreSQL
1409 7.1 you should use the option <option>--includedir</option>.
1410 (<command>pg_config</command> will exit with a non-zero status
1411 if it encounters an unknown option.) For releases prior to
1412 7.1 you will have to guess, but since that was before the
1413 current calling conventions were introduced, it is unlikely
1414 that you want to support those releases.
1420 When allocating memory, use the
1421 <productname>Postgres</productname> routines
1422 <function>palloc</function> and <function>pfree</function>
1423 instead of the corresponding <acronym>C</acronym> library
1424 routines <function>malloc</function> and
1425 <function>free</function>. The memory allocated by
1426 <function>palloc</function> will be freed automatically at the
1427 end of each transaction, preventing memory leaks.
1433 Always zero the bytes of your structures using
1434 <function>memset</function> or <function>bzero</function>.
1435 Several routines (such as the hash access method, hash join
1436 and the sort algorithm) compute functions of the raw bits
1437 contained in your structure. Even if you initialize all
1438 fields of your structure, there may be several bytes of
1439 alignment padding (holes in the structure) that may contain
1446 Most of the internal <productname>Postgres</productname> types
1447 are declared in <filename>postgres.h</filename>, while the function
1448 manager interfaces (<symbol>PG_FUNCTION_ARGS</symbol>, etc.)
1449 are in <filename>fmgr.h</filename>, so you will need to
1450 include at least these two files. For portability reasons it's best
1451 to include <filename>postgres.h</filename> <emphasis>first</>,
1452 before any other system or user header files.
1453 Including <filename>postgres.h</filename> will also include
1454 <filename>elog.h</filename> and <filename>palloc.h</filename>
1461 Symbol names defined within object files must not conflict
1462 with each other or with symbols defined in the
1463 <productname>PostgreSQL</productname> server executable. You
1464 will have to rename your functions or variables if you get
1465 error messages to this effect.
1471 Compiling and linking your object code so that
1472 it can be dynamically loaded into
1473 <productname>Postgres</productname>
1474 always requires special flags.
1475 See <xref linkend="dfunc">
1476 for a detailed explanation of how to do it for
1477 your particular operating system.
1488 <sect1 id="xfunc-overload">
1489 <title>Function Overloading</title>
1492 More than one function may be defined with the same SQL name, so long
1493 as the arguments they take are different. In other words,
1494 function names can be <firstterm>overloaded</firstterm>. When a
1495 query is executed, the server will determine which function to
1496 call from the data types and the number of the provided arguments.
1497 Overloading can also be used to simulate functions with a variable
1498 number of arguments, up to a finite maximum number.
1502 A function may also have the same name as an attribute. In the case
1503 that there is an ambiguity between a function on a complex type and
1504 an attribute of the complex type, the attribute will always be used.
1508 When creating a family of overloaded functions, one should be
1509 careful not to create ambiguities. For instance, given the
1512 CREATE FUNCTION test(int, real) RETURNS ...
1513 CREATE FUNCTION test(smallint, double precision) RETURNS ...
1515 it is not immediately clear which function would be called with
1516 some trivial input like <literal>test(1, 1.5)</literal>. The
1517 currently implemented resolution rules are described in the
1518 <citetitle>User's Guide</citetitle>, but it is unwise to design a
1519 system that subtly relies on this behavior.
1523 When overloading C language functions, there is an additional
1524 constraint: The C name of each function in the family of
1525 overloaded functions must be different from the C names of all
1526 other functions, either internal or dynamically loaded. If this
1527 rule is violated, the behavior is not portable. You might get a
1528 run-time linker error, or one of the functions will get called
1529 (usually the internal one). The alternative form of the
1530 <literal>AS</> clause for the SQL <command>CREATE
1531 FUNCTION</command> command decouples the SQL function name from
1532 the function name in the C source code. E.g.,
1534 CREATE FUNCTION test(int) RETURNS int
1535 AS '<replaceable>filename</>', 'test_1arg'
1537 CREATE FUNCTION test(int, int) RETURNS int
1538 AS '<replaceable>filename</>', 'test_2arg'
1541 The names of the C functions here reflect one of many possible conventions.
1545 Prior to <productname>PostgreSQL</productname> 7.0, this
1546 alternative syntax did not exist. There is a trick to get around
1547 the problem, by defining a set of C functions with different names
1548 and then define a set of identically-named SQL function wrappers
1549 that take the appropriate argument types and call the matching C
1555 <sect1 id="xfunc-plhandler">
1556 <title>Procedural Language Handlers</title>
1559 All calls to functions that are written in a language other than
1560 the current <quote>version 1</quote> interface for compiled
1561 languages (this includes functions in user-defined procedural languages,
1562 functions written in SQL, and functions using the version 0 compiled
1563 language interface), go through a <firstterm>call handler</firstterm>
1564 function for the specific language. It is the responsibility of
1565 the call handler to execute the function in a meaningful way, such
1566 as by interpreting the supplied source text. This section
1567 describes how a language call handler can be written. This is not
1568 a common task, in fact, it has only been done a handful of times
1569 in the history of <productname>PostgreSQL</productname>, but the
1570 topic naturally belongs in this chapter, and the material might
1571 give some insight into the extensible nature of the
1572 <productname>PostgreSQL</productname> system.
1576 The call handler for a procedural language is a
1577 <quote>normal</quote> function, which must be written in a
1578 compiled language such as C and registered with
1579 <productname>PostgreSQL</productname> as taking no arguments and
1580 returning the <type>opaque</type> type, a placeholder for
1581 unspecified or undefined types. This prevents the call handler
1582 from being called directly as a function from queries. (However,
1583 arguments may be supplied in the actual call to the handler when a
1584 function in the language offered by the handler is to be
1590 In <productname>PostgreSQL</productname> 7.1 and later, call
1591 handlers must adhere to the <quote>version 1</quote> function
1592 manager interface, not the old-style interface.
1597 The call handler is called in the same way as any other function:
1598 It receives a pointer to a
1599 <structname>FunctionCallInfoData</structname> struct containing
1600 argument values and information about the called function, and it
1601 is expected to return a <type>Datum</type> result (and possibly
1602 set the <structfield>isnull</structfield> field of the
1603 <structname>FunctionCallInfoData</structname> struct, if it wishes
1604 to return an SQL NULL result). The difference between a call
1605 handler and an ordinary callee function is that the
1606 <structfield>flinfo->fn_oid</structfield> field of the
1607 <structname>FunctionCallInfoData</structname> struct will contain
1608 the OID of the actual function to be called, not of the call
1609 handler itself. The call handler must use this field to determine
1610 which function to execute. Also, the passed argument list has
1611 been set up according to the declaration of the target function,
1612 not of the call handler.
1616 It's up to the call handler to fetch the
1617 <classname>pg_proc</classname> entry and to analyze the argument
1618 and return types of the called procedure. The AS clause from the
1619 <command>CREATE FUNCTION</command> of the procedure will be found
1620 in the <literal>prosrc</literal> attribute of the
1621 <classname>pg_proc</classname> table entry. This may be the source
1622 text in the procedural language itself (like for PL/Tcl), a
1623 path name to a file, or anything else that tells the call handler
1624 what to do in detail.
1628 Often, the same function is called many times per SQL statement.
1629 A call handler can avoid repeated lookups of information about the
1630 called function by using the
1631 <structfield>flinfo->fn_extra</structfield> field. This will
1632 initially be NULL, but can be set by the call handler to point at
1633 information about the PL function. On subsequent calls, if
1634 <structfield>flinfo->fn_extra</structfield> is already non-NULL
1635 then it can be used and the information lookup step skipped. The
1636 call handler must be careful that
1637 <structfield>flinfo->fn_extra</structfield> is made to point at
1638 memory that will live at least until the end of the current query,
1639 since an <structname>FmgrInfo</structname> data structure could be
1640 kept that long. One way to do this is to allocate the extra data
1641 in the memory context specified by
1642 <structfield>flinfo->fn_mcxt</structfield>; such data will
1643 normally have the same lifespan as the
1644 <structname>FmgrInfo</structname> itself. But the handler could
1645 also choose to use a longer-lived context so that it can cache
1646 function definition information across queries.
1650 When a PL function is invoked as a trigger, no explicit arguments
1652 <structname>FunctionCallInfoData</structname>'s
1653 <structfield>context</structfield> field points at a
1654 <structname>TriggerData</structname> node, rather than being NULL
1655 as it is in a plain function call. A language handler should
1656 provide mechanisms for PL functions to get at the trigger
1661 This is a template for a PL handler written in C:
1663 #include "postgres.h"
1664 #include "executor/spi.h"
1665 #include "commands/trigger.h"
1666 #include "utils/elog.h"
1668 #include "access/heapam.h"
1669 #include "utils/syscache.h"
1670 #include "catalog/pg_proc.h"
1671 #include "catalog/pg_type.h"
1673 PG_FUNCTION_INFO_V1(plsample_call_handler);
1676 plsample_call_handler(PG_FUNCTION_ARGS)
1680 if (CALLED_AS_TRIGGER(fcinfo))
1683 * Called as a trigger procedure
1685 TriggerData *trigdata = (TriggerData *) fcinfo->context;
1691 * Called as a function
1703 Only a few thousand lines of code have to be added instead of the
1704 dots to complete the call handler. See <xref linkend="xfunc-c">
1705 for information on how to compile it into a loadable module.
1709 The following commands then register the sample procedural
1712 CREATE FUNCTION plsample_call_handler () RETURNS opaque
1713 AS '/usr/local/pgsql/lib/plsample'
1715 CREATE LANGUAGE plsample
1716 HANDLER plsample_call_handler;
1722 <!-- Keep this comment at the end of the file
1727 sgml-minimize-attributes:nil
1728 sgml-always-quote-attributes:t
1731 sgml-parent-document:nil
1732 sgml-default-dtd-file:"./reference.ced"
1733 sgml-exposed-tags:nil
1734 sgml-local-catalogs:("/usr/lib/sgml/catalog")
1735 sgml-local-ecat-files:nil