1 <!-- doc/src/sgml/xfunc.sgml -->
4 <title>User-defined Functions</title>
6 <indexterm zone="xfunc">
7 <primary>function</primary>
8 <secondary>user-defined</secondary>
12 <productname>PostgreSQL</productname> provides four kinds of
18 query language functions (functions written in
19 <acronym>SQL</acronym>) (<xref linkend="xfunc-sql">)
24 procedural language functions (functions written in, for
25 example, <application>PL/pgSQL</> or <application>PL/Tcl</>)
26 (<xref linkend="xfunc-pl">)
31 internal functions (<xref linkend="xfunc-internal">)
36 C-language functions (<xref linkend="xfunc-c">)
44 of function can take base types, composite types, or
45 combinations of these as arguments (parameters). In addition,
46 every kind of function can return a base type or
47 a composite type. Functions can also be defined to return
48 sets of base or composite values.
52 Many kinds of functions can take or return certain pseudo-types
53 (such as polymorphic types), but the available facilities vary.
54 Consult the description of each kind of function for more details.
58 It's easiest to define <acronym>SQL</acronym>
59 functions, so we'll start by discussing those.
60 Most of the concepts presented for <acronym>SQL</acronym> functions
61 will carry over to the other types of functions.
65 Throughout this chapter, it can be useful to look at the reference
66 page of the <xref linkend="sql-createfunction"> command to
67 understand the examples better. Some examples from this chapter
68 can be found in <filename>funcs.sql</filename> and
69 <filename>funcs.c</filename> in the <filename>src/tutorial</>
70 directory in the <productname>PostgreSQL</productname> source
75 <sect1 id="xfunc-sql">
76 <title>Query Language (<acronym>SQL</acronym>) Functions</title>
78 <indexterm zone="xfunc-sql">
79 <primary>function</primary>
80 <secondary>user-defined</secondary>
81 <tertiary>in SQL</tertiary>
85 SQL functions execute an arbitrary list of SQL statements, returning
86 the result of the last query in the list.
87 In the simple (non-set)
88 case, the first row of the last query's result will be returned.
89 (Bear in mind that <quote>the first row</quote> of a multirow
90 result is not well-defined unless you use <literal>ORDER BY</>.)
91 If the last query happens
92 to return no rows at all, the null value will be returned.
96 Alternatively, an SQL function can be declared to return a set (that is,
97 multiple rows) by specifying the function's return type as <literal>SETOF
98 <replaceable>sometype</></literal>, or equivalently by declaring it as
99 <literal>RETURNS TABLE(<replaceable>columns</>)</literal>. In this case
100 all rows of the last query's result are returned. Further details appear
105 The body of an SQL function must be a list of SQL
106 statements separated by semicolons. A semicolon after the last
107 statement is optional. Unless the function is declared to return
108 <type>void</>, the last statement must be a <command>SELECT</>,
109 or an <command>INSERT</>, <command>UPDATE</>, or <command>DELETE</>
110 that has a <literal>RETURNING</> clause.
114 Any collection of commands in the <acronym>SQL</acronym>
115 language can be packaged together and defined as a function.
116 Besides <command>SELECT</command> queries, the commands can include data
117 modification queries (<command>INSERT</command>,
118 <command>UPDATE</command>, and <command>DELETE</command>), as well as
119 other SQL commands. (You cannot use transaction control commands, e.g.
120 <command>COMMIT</>, <command>SAVEPOINT</>, and some utility
121 commands, e.g. <literal>VACUUM</>, in <acronym>SQL</acronym> functions.)
122 However, the final command
123 must be a <command>SELECT</command> or have a <literal>RETURNING</>
124 clause that returns whatever is
125 specified as the function's return type. Alternatively, if you
126 want to define a SQL function that performs actions but has no
127 useful value to return, you can define it as returning <type>void</>.
128 For example, this function removes rows with negative salaries from
129 the <literal>emp</> table:
132 CREATE FUNCTION clean_emp() RETURNS void AS '
147 The syntax of the <command>CREATE FUNCTION</command> command requires
148 the function body to be written as a string constant. It is usually
149 most convenient to use dollar quoting (see <xref
150 linkend="sql-syntax-dollar-quoting">) for the string constant.
151 If you choose to use regular single-quoted string constant syntax,
152 you must double single quote marks (<literal>'</>) and backslashes
153 (<literal>\</>) (assuming escape string syntax) in the body of
154 the function (see <xref linkend="sql-syntax-strings">).
157 <sect2 id="xfunc-sql-function-arguments">
158 <title>Arguments for <acronym>SQL</acronym> Functions</title>
161 <primary>function</primary>
162 <secondary>named argument</secondary>
166 Arguments of a SQL function can be referenced in the function
167 body using either names or numbers. Examples of both methods appear
172 To use a name, declare the function argument as having a name, and
173 then just write that name in the function body. If the argument name
174 is the same as any column name in the current SQL command within the
175 function, the column name will take precedence. To override this,
176 qualify the argument name with the name of the function itself, that is
177 <literal><replaceable>function_name</>.<replaceable>argument_name</></literal>.
178 (If this would conflict with a qualified column name, again the column
179 name wins. You can avoid the ambiguity by choosing a different alias for
180 the table within the SQL command.)
184 In the older numeric approach, arguments are referenced using the syntax
185 <literal>$<replaceable>n</></>: <literal>$1</> refers to the first input
186 argument, <literal>$2</> to the second, and so on. This will work
187 whether or not the particular argument was declared with a name.
191 If an argument is of a composite type, then the dot notation,
192 e.g., <literal>argname.fieldname</literal> or
193 <literal>$1.fieldname</literal>, can be used to access attributes of the
194 argument. Again, you might need to qualify the argument's name with the
195 function name to make the form with an argument name unambiguous.
199 SQL function arguments can only be used as data values,
200 not as identifiers. Thus for example this is reasonable:
202 INSERT INTO mytable VALUES ($1);
204 but this will not work:
206 INSERT INTO $1 VALUES (42);
212 The ability to use names to reference SQL function arguments was added
213 in <productname>PostgreSQL</productname> 9.2. Functions to be used in
214 older servers must use the <literal>$<replaceable>n</></> notation.
219 <sect2 id="xfunc-sql-base-functions">
220 <title><acronym>SQL</acronym> Functions on Base Types</title>
223 The simplest possible <acronym>SQL</acronym> function has no arguments and
224 simply returns a base type, such as <type>integer</type>:
227 CREATE FUNCTION one() RETURNS integer AS $$
231 -- Alternative syntax for string literal:
232 CREATE FUNCTION one() RETURNS integer AS '
245 Notice that we defined a column alias within the function body for the result of the function
246 (with the name <literal>result</>), but this column alias is not visible
247 outside the function. Hence, the result is labeled <literal>one</>
248 instead of <literal>result</>.
252 It is almost as easy to define <acronym>SQL</acronym> functions
253 that take base types as arguments:
256 CREATE FUNCTION add_em(x integer, y integer) RETURNS integer AS $$
260 SELECT add_em(1, 2) AS answer;
269 Alternatively, we could dispense with names for the arguments and
273 CREATE FUNCTION add_em(integer, integer) RETURNS integer AS $$
277 SELECT add_em(1, 2) AS answer;
286 Here is a more useful function, which might be used to debit a
290 CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS integer AS $$
292 SET balance = balance - debit
293 WHERE accountno = tf1.accountno;
298 A user could execute this function to debit account 17 by $100.00 as
302 SELECT tf1(17, 100.0);
307 In this example, we chose the name <literal>accountno</> for the first
308 argument, but this is the same as the name of a column in the
309 <literal>bank</> table. Within the <command>UPDATE</> command,
310 <literal>accountno</> refers to the column <literal>bank.accountno</>,
311 so <literal>tf1.accountno</> must be used to refer to the argument.
312 We could of course avoid this by using a different name for the argument.
316 In practice one would probably like a more useful result from the
317 function than a constant 1, so a more likely definition
321 CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS integer AS $$
323 SET balance = balance - debit
324 WHERE accountno = tf1.accountno;
325 SELECT balance FROM bank WHERE accountno = tf1.accountno;
329 which adjusts the balance and returns the new balance.
330 The same thing could be done in one command using <literal>RETURNING</>:
333 CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS integer AS $$
335 SET balance = balance - debit
336 WHERE accountno = tf1.accountno
343 <sect2 id="xfunc-sql-composite-functions">
344 <title><acronym>SQL</acronym> Functions on Composite Types</title>
347 When writing functions with arguments of composite types, we must not
348 only specify which argument we want but also the desired attribute
349 (field) of that argument. For example, suppose that
350 <type>emp</type> is a table containing employee data, and therefore
351 also the name of the composite type of each row of the table. Here
352 is a function <function>double_salary</function> that computes what someone's
353 salary would be if it were doubled:
363 INSERT INTO emp VALUES ('Bill', 4200, 45, '(2,1)');
365 CREATE FUNCTION double_salary(emp) RETURNS numeric AS $$
366 SELECT $1.salary * 2 AS salary;
369 SELECT name, double_salary(emp.*) AS dream
371 WHERE emp.cubicle ~= point '(2,1)';
380 Notice the use of the syntax <literal>$1.salary</literal>
381 to select one field of the argument row value. Also notice
382 how the calling <command>SELECT</> command uses <literal>*</>
384 the entire current row of a table as a composite value. The table
385 row can alternatively be referenced using just the table name,
388 SELECT name, double_salary(emp) AS dream
390 WHERE emp.cubicle ~= point '(2,1)';
392 but this usage is deprecated since it's easy to get confused.
396 Sometimes it is handy to construct a composite argument value
397 on-the-fly. This can be done with the <literal>ROW</> construct.
398 For example, we could adjust the data being passed to the function:
400 SELECT name, double_salary(ROW(name, salary*1.1, age, cubicle)) AS dream
406 It is also possible to build a function that returns a composite type.
407 This is an example of a function
408 that returns a single <type>emp</type> row:
411 CREATE FUNCTION new_emp() RETURNS emp AS $$
412 SELECT text 'None' AS name,
415 point '(2,2)' AS cubicle;
419 In this example we have specified each of the attributes
420 with a constant value, but any computation
421 could have been substituted for these constants.
425 Note two important things about defining the function:
430 The select list order in the query must be exactly the same as
431 that in which the columns appear in the table associated
432 with the composite type. (Naming the columns, as we did above,
433 is irrelevant to the system.)
438 You must typecast the expressions to match the
439 definition of the composite type, or you will get errors like this:
442 ERROR: function declared to return emp returns varchar instead of text at column 1
451 A different way to define the same function is:
454 CREATE FUNCTION new_emp() RETURNS emp AS $$
455 SELECT ROW('None', 1000.0, 25, '(2,2)')::emp;
459 Here we wrote a <command>SELECT</> that returns just a single
460 column of the correct composite type. This isn't really better
461 in this situation, but it is a handy alternative in some cases
462 — for example, if we need to compute the result by calling
463 another function that returns the desired composite value.
467 We could call this function directly in either of two ways:
473 --------------------------
474 (None,1000.0,25,"(2,2)")
476 SELECT * FROM new_emp();
478 name | salary | age | cubicle
479 ------+--------+-----+---------
480 None | 1000.0 | 25 | (2,2)
483 The second way is described more fully in <xref
484 linkend="xfunc-sql-table-functions">.
488 When you use a function that returns a composite type,
489 you might want only one field (attribute) from its result.
490 You can do that with syntax like this:
493 SELECT (new_emp()).name;
500 The extra parentheses are needed to keep the parser from getting
501 confused. If you try to do it without them, you get something like this:
504 SELECT new_emp().name;
505 ERROR: syntax error at or near "."
506 LINE 1: SELECT new_emp().name;
512 Another option is to use
513 functional notation for extracting an attribute. The simple way
514 to explain this is that we can use the
515 notations <literal>attribute(table)</> and <literal>table.attribute</>
519 SELECT name(new_emp());
527 -- This is the same as:
528 -- SELECT emp.name AS youngster FROM emp WHERE emp.age < 30;
530 SELECT name(emp) AS youngster FROM emp WHERE age(emp) < 30;
541 The equivalence between functional notation and attribute notation
542 makes it possible to use functions on composite types to emulate
543 <quote>computed fields</>.
545 <primary>computed field</primary>
548 <primary>field</primary>
549 <secondary>computed</secondary>
551 For example, using the previous definition
552 for <literal>double_salary(emp)</>, we can write
555 SELECT emp.name, emp.double_salary FROM emp;
558 An application using this wouldn't need to be directly aware that
559 <literal>double_salary</> isn't a real column of the table.
560 (You can also emulate computed fields with views.)
564 Because of this behavior, it's unwise to give a function that takes
565 a single composite-type argument the same name as any of the fields of
571 Another way to use a function returning a composite type is to pass the
572 result to another function that accepts the correct row type as input:
575 CREATE FUNCTION getname(emp) RETURNS text AS $$
579 SELECT getname(new_emp());
588 Still another way to use a function that returns a composite type is to
589 call it as a table function, as described in <xref
590 linkend="xfunc-sql-table-functions">.
594 <sect2 id="xfunc-output-parameters">
595 <title><acronym>SQL</> Functions with Output Parameters</title>
598 <primary>function</primary>
599 <secondary>output parameter</secondary>
603 An alternative way of describing a function's results is to define it
604 with <firstterm>output parameters</>, as in this example:
607 CREATE FUNCTION add_em (IN x int, IN y int, OUT sum int)
618 This is not essentially different from the version of <literal>add_em</>
619 shown in <xref linkend="xfunc-sql-base-functions">. The real value of
620 output parameters is that they provide a convenient way of defining
621 functions that return several columns. For example,
624 CREATE FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int)
625 AS 'SELECT x + y, x * y'
628 SELECT * FROM sum_n_product(11,42);
635 What has essentially happened here is that we have created an anonymous
636 composite type for the result of the function. The above example has
637 the same end result as
640 CREATE TYPE sum_prod AS (sum int, product int);
642 CREATE FUNCTION sum_n_product (int, int) RETURNS sum_prod
643 AS 'SELECT $1 + $2, $1 * $2'
647 but not having to bother with the separate composite type definition
648 is often handy. Notice that the names attached to the output parameters
649 are not just decoration, but determine the column names of the anonymous
650 composite type. (If you omit a name for an output parameter, the
651 system will choose a name on its own.)
655 Notice that output parameters are not included in the calling argument
656 list when invoking such a function from SQL. This is because
657 <productname>PostgreSQL</productname> considers only the input
658 parameters to define the function's calling signature. That means
659 also that only the input parameters matter when referencing the function
660 for purposes such as dropping it. We could drop the above function
664 DROP FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int);
665 DROP FUNCTION sum_n_product (int, int);
670 Parameters can be marked as <literal>IN</> (the default),
671 <literal>OUT</>, <literal>INOUT</>, or <literal>VARIADIC</>.
673 parameter serves as both an input parameter (part of the calling
674 argument list) and an output parameter (part of the result record type).
675 <literal>VARIADIC</> parameters are input parameters, but are treated
676 specially as described next.
680 <sect2 id="xfunc-sql-variadic-functions">
681 <title><acronym>SQL</> Functions with Variable Numbers of Arguments</title>
684 <primary>function</primary>
685 <secondary>variadic</secondary>
689 <primary>variadic function</primary>
693 <acronym>SQL</acronym> functions can be declared to accept
694 variable numbers of arguments, so long as all the <quote>optional</>
695 arguments are of the same data type. The optional arguments will be
696 passed to the function as an array. The function is declared by
697 marking the last parameter as <literal>VARIADIC</>; this parameter
698 must be declared as being of an array type. For example:
701 CREATE FUNCTION mleast(VARIADIC arr numeric[]) RETURNS numeric AS $$
702 SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
705 SELECT mleast(10, -1, 5, 4.4);
712 Effectively, all the actual arguments at or beyond the
713 <literal>VARIADIC</> position are gathered up into a one-dimensional
714 array, as if you had written
717 SELECT mleast(ARRAY[10, -1, 5, 4.4]); -- doesn't work
720 You can't actually write that, though — or at least, it will
721 not match this function definition. A parameter marked
722 <literal>VARIADIC</> matches one or more occurrences of its element
723 type, not of its own type.
727 Sometimes it is useful to be able to pass an already-constructed array
728 to a variadic function; this is particularly handy when one variadic
729 function wants to pass on its array parameter to another one. You can
730 do that by specifying <literal>VARIADIC</> in the call:
733 SELECT mleast(VARIADIC ARRAY[10, -1, 5, 4.4]);
736 This prevents expansion of the function's variadic parameter into its
737 element type, thereby allowing the array argument value to match
738 normally. <literal>VARIADIC</> can only be attached to the last
739 actual argument of a function call.
743 The array element parameters generated from a variadic parameter are
744 treated as not having any names of their own. This means it is not
745 possible to call a variadic function using named arguments (<xref
746 linkend="sql-syntax-calling-funcs">), except when you specify
747 <literal>VARIADIC</>. For example, this will work:
750 SELECT mleast(VARIADIC arr := ARRAY[10, -1, 5, 4.4]);
756 SELECT mleast(arr := 10);
757 SELECT mleast(arr := ARRAY[10, -1, 5, 4.4]);
762 <sect2 id="xfunc-sql-parameter-defaults">
763 <title><acronym>SQL</> Functions with Default Values for Arguments</title>
766 <primary>function</primary>
767 <secondary>default values for arguments</secondary>
771 Functions can be declared with default values for some or all input
772 arguments. The default values are inserted whenever the function is
773 called with insufficiently many actual arguments. Since arguments
774 can only be omitted from the end of the actual argument list, all
775 parameters after a parameter with a default value have to have
776 default values as well. (Although the use of named argument notation
777 could allow this restriction to be relaxed, it's still enforced so that
778 positional argument notation works sensibly.)
784 CREATE FUNCTION foo(a int, b int DEFAULT 2, c int DEFAULT 3)
791 SELECT foo(10, 20, 30);
809 SELECT foo(); -- fails since there is no default for the first argument
810 ERROR: function foo() does not exist
812 The <literal>=</literal> sign can also be used in place of the
813 key word <literal>DEFAULT</literal>.
817 <sect2 id="xfunc-sql-table-functions">
818 <title><acronym>SQL</acronym> Functions as Table Sources</title>
821 All SQL functions can be used in the <literal>FROM</> clause of a query,
822 but it is particularly useful for functions returning composite types.
823 If the function is defined to return a base type, the table function
824 produces a one-column table. If the function is defined to return
825 a composite type, the table function produces a column for each attribute
826 of the composite type.
833 CREATE TABLE foo (fooid int, foosubid int, fooname text);
834 INSERT INTO foo VALUES (1, 1, 'Joe');
835 INSERT INTO foo VALUES (1, 2, 'Ed');
836 INSERT INTO foo VALUES (2, 1, 'Mary');
838 CREATE FUNCTION getfoo(int) RETURNS foo AS $$
839 SELECT * FROM foo WHERE fooid = $1;
842 SELECT *, upper(fooname) FROM getfoo(1) AS t1;
844 fooid | foosubid | fooname | upper
845 -------+----------+---------+-------
850 As the example shows, we can work with the columns of the function's
851 result just the same as if they were columns of a regular table.
855 Note that we only got one row out of the function. This is because
856 we did not use <literal>SETOF</>. That is described in the next section.
860 <sect2 id="xfunc-sql-functions-returning-set">
861 <title><acronym>SQL</acronym> Functions Returning Sets</title>
864 <primary>function</primary>
865 <secondary>with SETOF</secondary>
869 When an SQL function is declared as returning <literal>SETOF
870 <replaceable>sometype</></literal>, the function's final
871 query is executed to completion, and each row it
872 outputs is returned as an element of the result set.
876 This feature is normally used when calling the function in the <literal>FROM</>
877 clause. In this case each row returned by the function becomes
878 a row of the table seen by the query. For example, assume that
879 table <literal>foo</> has the same contents as above, and we say:
882 CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS $$
883 SELECT * FROM foo WHERE fooid = $1;
886 SELECT * FROM getfoo(1) AS t1;
891 fooid | foosubid | fooname
892 -------+----------+---------
900 It is also possible to return multiple rows with the columns defined by
901 output parameters, like this:
904 CREATE TABLE tab (y int, z int);
905 INSERT INTO tab VALUES (1, 2), (3, 4), (5, 6), (7, 8);
907 CREATE FUNCTION sum_n_product_with_tab (x int, OUT sum int, OUT product int)
910 SELECT $1 + tab.y, $1 * tab.y FROM tab;
913 SELECT * FROM sum_n_product_with_tab(10);
923 The key point here is that you must write <literal>RETURNS SETOF record</>
924 to indicate that the function returns multiple rows instead of just one.
925 If there is only one output parameter, write that parameter's type
926 instead of <type>record</>.
930 It is frequently useful to construct a query's result by invoking a
931 set-returning function multiple times, with the parameters for each
932 invocation coming from successive rows of a table or subquery. The
933 preferred way to do this is to use the <literal>LATERAL</> key word,
934 which is described in <xref linkend="queries-lateral">.
935 Here is an example using a set-returning function to enumerate
936 elements of a tree structure:
950 CREATE FUNCTION listchildren(text) RETURNS SETOF text AS $$
951 SELECT name FROM nodes WHERE parent = $1
952 $$ LANGUAGE SQL STABLE;
954 SELECT * FROM listchildren('Top');
962 SELECT name, child FROM nodes, LATERAL listchildren(name) AS child;
973 This example does not do anything that we couldn't have done with a
974 simple join, but in more complex calculations the option to put
975 some of the work into a function can be quite convenient.
979 Currently, functions returning sets can also be called in the select list
980 of a query. For each row that the query
981 generates by itself, the function returning set is invoked, and an output
982 row is generated for each element of the function's result set. Note,
983 however, that this capability is deprecated and might be removed in future
984 releases. The previous example could also be done with queries like
988 SELECT listchildren('Top');
996 SELECT name, listchildren(name) FROM nodes;
998 --------+--------------
1007 In the last <command>SELECT</command>,
1008 notice that no output row appears for <literal>Child2</>, <literal>Child3</>, etc.
1009 This happens because <function>listchildren</function> returns an empty set
1010 for those arguments, so no result rows are generated. This is the same
1011 behavior as we got from an inner join to the function result when using
1012 the <literal>LATERAL</> syntax.
1017 If a function's last command is <command>INSERT</>, <command>UPDATE</>,
1018 or <command>DELETE</> with <literal>RETURNING</>, that command will
1019 always be executed to completion, even if the function is not declared
1020 with <literal>SETOF</> or the calling query does not fetch all the
1021 result rows. Any extra rows produced by the <literal>RETURNING</>
1022 clause are silently dropped, but the commanded table modifications
1023 still happen (and are all completed before returning from the function).
1029 The key problem with using set-returning functions in the select list,
1030 rather than the <literal>FROM</> clause, is that putting more than one
1031 set-returning function in the same select list does not behave very
1032 sensibly. (What you actually get if you do so is a number of output
1033 rows equal to the least common multiple of the numbers of rows produced
1034 by each set-returning function.) The <literal>LATERAL</> syntax
1035 produces less surprising results when calling multiple set-returning
1036 functions, and should usually be used instead.
1041 <sect2 id="xfunc-sql-functions-returning-table">
1042 <title><acronym>SQL</acronym> Functions Returning <literal>TABLE</></title>
1045 <primary>function</primary>
1046 <secondary>RETURNS TABLE</secondary>
1050 There is another way to declare a function as returning a set,
1051 which is to use the syntax
1052 <literal>RETURNS TABLE(<replaceable>columns</>)</literal>.
1053 This is equivalent to using one or more <literal>OUT</> parameters plus
1054 marking the function as returning <literal>SETOF record</> (or
1055 <literal>SETOF</> a single output parameter's type, as appropriate).
1056 This notation is specified in recent versions of the SQL standard, and
1057 thus may be more portable than using <literal>SETOF</>.
1061 For example, the preceding sum-and-product example could also be
1065 CREATE FUNCTION sum_n_product_with_tab (x int)
1066 RETURNS TABLE(sum int, product int) AS $$
1067 SELECT $1 + tab.y, $1 * tab.y FROM tab;
1071 It is not allowed to use explicit <literal>OUT</> or <literal>INOUT</>
1072 parameters with the <literal>RETURNS TABLE</> notation — you must
1073 put all the output columns in the <literal>TABLE</> list.
1078 <title>Polymorphic <acronym>SQL</acronym> Functions</title>
1081 <acronym>SQL</acronym> functions can be declared to accept and
1082 return the polymorphic types <type>anyelement</type>,
1083 <type>anyarray</type>, <type>anynonarray</type>,
1084 <type>anyenum</type>, and <type>anyrange</type>. See <xref
1085 linkend="extend-types-polymorphic"> for a more detailed
1086 explanation of polymorphic functions. Here is a polymorphic
1087 function <function>make_array</function> that builds up an array
1088 from two arbitrary data type elements:
1090 CREATE FUNCTION make_array(anyelement, anyelement) RETURNS anyarray AS $$
1091 SELECT ARRAY[$1, $2];
1094 SELECT make_array(1, 2) AS intarray, make_array('a'::text, 'b') AS textarray;
1095 intarray | textarray
1096 ----------+-----------
1103 Notice the use of the typecast <literal>'a'::text</literal>
1104 to specify that the argument is of type <type>text</type>. This is
1105 required if the argument is just a string literal, since otherwise
1106 it would be treated as type
1107 <type>unknown</type>, and array of <type>unknown</type> is not a valid
1109 Without the typecast, you will get errors like this:
1112 ERROR: could not determine polymorphic type because input has type "unknown"
1118 It is permitted to have polymorphic arguments with a fixed
1119 return type, but the converse is not. For example:
1121 CREATE FUNCTION is_greater(anyelement, anyelement) RETURNS boolean AS $$
1125 SELECT is_greater(1, 2);
1131 CREATE FUNCTION invalid_func() RETURNS anyelement AS $$
1134 ERROR: cannot determine result data type
1135 DETAIL: A function returning a polymorphic type must have at least one polymorphic argument.
1140 Polymorphism can be used with functions that have output arguments.
1143 CREATE FUNCTION dup (f1 anyelement, OUT f2 anyelement, OUT f3 anyarray)
1144 AS 'select $1, array[$1,$1]' LANGUAGE SQL;
1146 SELECT * FROM dup(22);
1155 Polymorphism can also be used with variadic functions.
1158 CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
1159 SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
1162 SELECT anyleast(10, -1, 5, 4);
1168 SELECT anyleast('abc'::text, 'def');
1174 CREATE FUNCTION concat_values(text, VARIADIC anyarray) RETURNS text AS $$
1175 SELECT array_to_string($2, $1);
1178 SELECT concat_values('|', 1, 4, 2);
1188 <title><acronym>SQL</acronym> Functions with Collations</title>
1191 <primary>collation</>
1192 <secondary>in SQL functions</>
1196 When a SQL function has one or more parameters of collatable data types,
1197 a collation is identified for each function call depending on the
1198 collations assigned to the actual arguments, as described in <xref
1199 linkend="collation">. If a collation is successfully identified
1200 (i.e., there are no conflicts of implicit collations among the arguments)
1201 then all the collatable parameters are treated as having that collation
1202 implicitly. This will affect the behavior of collation-sensitive
1203 operations within the function. For example, using the
1204 <function>anyleast</> function described above, the result of
1206 SELECT anyleast('abc'::text, 'ABC');
1208 will depend on the database's default collation. In <literal>C</> locale
1209 the result will be <literal>ABC</>, but in many other locales it will
1210 be <literal>abc</>. The collation to use can be forced by adding
1211 a <literal>COLLATE</> clause to any of the arguments, for example
1213 SELECT anyleast('abc'::text, 'ABC' COLLATE "C");
1215 Alternatively, if you wish a function to operate with a particular
1216 collation regardless of what it is called with, insert
1217 <literal>COLLATE</> clauses as needed in the function definition.
1218 This version of <function>anyleast</> would always use <literal>en_US</>
1219 locale to compare strings:
1221 CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
1222 SELECT min($1[i] COLLATE "en_US") FROM generate_subscripts($1, 1) g(i);
1225 But note that this will throw an error if applied to a non-collatable
1230 If no common collation can be identified among the actual arguments,
1231 then a SQL function treats its parameters as having their data types'
1232 default collation (which is usually the database's default collation,
1233 but could be different for parameters of domain types).
1237 The behavior of collatable parameters can be thought of as a limited
1238 form of polymorphism, applicable only to textual data types.
1243 <sect1 id="xfunc-overload">
1244 <title>Function Overloading</title>
1246 <indexterm zone="xfunc-overload">
1247 <primary>overloading</primary>
1248 <secondary>functions</secondary>
1252 More than one function can be defined with the same SQL name, so long
1253 as the arguments they take are different. In other words,
1254 function names can be <firstterm>overloaded</firstterm>. When a
1255 query is executed, the server will determine which function to
1256 call from the data types and the number of the provided arguments.
1257 Overloading can also be used to simulate functions with a variable
1258 number of arguments, up to a finite maximum number.
1262 When creating a family of overloaded functions, one should be
1263 careful not to create ambiguities. For instance, given the
1266 CREATE FUNCTION test(int, real) RETURNS ...
1267 CREATE FUNCTION test(smallint, double precision) RETURNS ...
1269 it is not immediately clear which function would be called with
1270 some trivial input like <literal>test(1, 1.5)</literal>. The
1271 currently implemented resolution rules are described in
1272 <xref linkend="typeconv">, but it is unwise to design a system that subtly
1273 relies on this behavior.
1277 A function that takes a single argument of a composite type should
1278 generally not have the same name as any attribute (field) of that type.
1279 Recall that <literal>attribute(table)</literal> is considered equivalent
1280 to <literal>table.attribute</literal>. In the case that there is an
1281 ambiguity between a function on a composite type and an attribute of
1282 the composite type, the attribute will always be used. It is possible
1283 to override that choice by schema-qualifying the function name
1284 (that is, <literal>schema.func(table)</literal>) but it's better to
1285 avoid the problem by not choosing conflicting names.
1289 Another possible conflict is between variadic and non-variadic functions.
1290 For instance, it is possible to create both <literal>foo(numeric)</> and
1291 <literal>foo(VARIADIC numeric[])</>. In this case it is unclear which one
1292 should be matched to a call providing a single numeric argument, such as
1293 <literal>foo(10.1)</>. The rule is that the function appearing
1294 earlier in the search path is used, or if the two functions are in the
1295 same schema, the non-variadic one is preferred.
1299 When overloading C-language functions, there is an additional
1300 constraint: The C name of each function in the family of
1301 overloaded functions must be different from the C names of all
1302 other functions, either internal or dynamically loaded. If this
1303 rule is violated, the behavior is not portable. You might get a
1304 run-time linker error, or one of the functions will get called
1305 (usually the internal one). The alternative form of the
1306 <literal>AS</> clause for the SQL <command>CREATE
1307 FUNCTION</command> command decouples the SQL function name from
1308 the function name in the C source code. For instance:
1310 CREATE FUNCTION test(int) RETURNS int
1311 AS '<replaceable>filename</>', 'test_1arg'
1313 CREATE FUNCTION test(int, int) RETURNS int
1314 AS '<replaceable>filename</>', 'test_2arg'
1317 The names of the C functions here reflect one of many possible conventions.
1321 <sect1 id="xfunc-volatility">
1322 <title>Function Volatility Categories</title>
1324 <indexterm zone="xfunc-volatility">
1325 <primary>volatility</primary>
1326 <secondary>functions</secondary>
1328 <indexterm zone="xfunc-volatility">
1329 <primary>VOLATILE</primary>
1331 <indexterm zone="xfunc-volatility">
1332 <primary>STABLE</primary>
1334 <indexterm zone="xfunc-volatility">
1335 <primary>IMMUTABLE</primary>
1339 Every function has a <firstterm>volatility</> classification, with
1340 the possibilities being <literal>VOLATILE</>, <literal>STABLE</>, or
1341 <literal>IMMUTABLE</>. <literal>VOLATILE</> is the default if the
1342 <xref linkend="sql-createfunction">
1343 command does not specify a category. The volatility category is a
1344 promise to the optimizer about the behavior of the function:
1349 A <literal>VOLATILE</> function can do anything, including modifying
1350 the database. It can return different results on successive calls with
1351 the same arguments. The optimizer makes no assumptions about the
1352 behavior of such functions. A query using a volatile function will
1353 re-evaluate the function at every row where its value is needed.
1358 A <literal>STABLE</> function cannot modify the database and is
1359 guaranteed to return the same results given the same arguments
1360 for all rows within a single statement. This category allows the
1361 optimizer to optimize multiple calls of the function to a single
1362 call. In particular, it is safe to use an expression containing
1363 such a function in an index scan condition. (Since an index scan
1364 will evaluate the comparison value only once, not once at each
1365 row, it is not valid to use a <literal>VOLATILE</> function in an
1366 index scan condition.)
1371 An <literal>IMMUTABLE</> function cannot modify the database and is
1372 guaranteed to return the same results given the same arguments forever.
1373 This category allows the optimizer to pre-evaluate the function when
1374 a query calls it with constant arguments. For example, a query like
1375 <literal>SELECT ... WHERE x = 2 + 2</> can be simplified on sight to
1376 <literal>SELECT ... WHERE x = 4</>, because the function underlying
1377 the integer addition operator is marked <literal>IMMUTABLE</>.
1384 For best optimization results, you should label your functions with the
1385 strictest volatility category that is valid for them.
1389 Any function with side-effects <emphasis>must</> be labeled
1390 <literal>VOLATILE</>, so that calls to it cannot be optimized away.
1391 Even a function with no side-effects needs to be labeled
1392 <literal>VOLATILE</> if its value can change within a single query;
1393 some examples are <literal>random()</>, <literal>currval()</>,
1394 <literal>timeofday()</>.
1398 Another important example is that the <function>current_timestamp</>
1399 family of functions qualify as <literal>STABLE</>, since their values do
1400 not change within a transaction.
1404 There is relatively little difference between <literal>STABLE</> and
1405 <literal>IMMUTABLE</> categories when considering simple interactive
1406 queries that are planned and immediately executed: it doesn't matter
1407 a lot whether a function is executed once during planning or once during
1408 query execution startup. But there is a big difference if the plan is
1409 saved and reused later. Labeling a function <literal>IMMUTABLE</> when
1410 it really isn't might allow it to be prematurely folded to a constant during
1411 planning, resulting in a stale value being re-used during subsequent uses
1412 of the plan. This is a hazard when using prepared statements or when
1413 using function languages that cache plans (such as
1414 <application>PL/pgSQL</>).
1418 For functions written in SQL or in any of the standard procedural
1419 languages, there is a second important property determined by the
1420 volatility category, namely the visibility of any data changes that have
1421 been made by the SQL command that is calling the function. A
1422 <literal>VOLATILE</> function will see such changes, a <literal>STABLE</>
1423 or <literal>IMMUTABLE</> function will not. This behavior is implemented
1424 using the snapshotting behavior of MVCC (see <xref linkend="mvcc">):
1425 <literal>STABLE</> and <literal>IMMUTABLE</> functions use a snapshot
1426 established as of the start of the calling query, whereas
1427 <literal>VOLATILE</> functions obtain a fresh snapshot at the start of
1428 each query they execute.
1433 Functions written in C can manage snapshots however they want, but it's
1434 usually a good idea to make C functions work this way too.
1439 Because of this snapshotting behavior,
1440 a function containing only <command>SELECT</> commands can safely be
1441 marked <literal>STABLE</>, even if it selects from tables that might be
1442 undergoing modifications by concurrent queries.
1443 <productname>PostgreSQL</productname> will execute all commands of a
1444 <literal>STABLE</> function using the snapshot established for the
1445 calling query, and so it will see a fixed view of the database throughout
1450 The same snapshotting behavior is used for <command>SELECT</> commands
1451 within <literal>IMMUTABLE</> functions. It is generally unwise to select
1452 from database tables within an <literal>IMMUTABLE</> function at all,
1453 since the immutability will be broken if the table contents ever change.
1454 However, <productname>PostgreSQL</productname> does not enforce that you
1459 A common error is to label a function <literal>IMMUTABLE</> when its
1460 results depend on a configuration parameter. For example, a function
1461 that manipulates timestamps might well have results that depend on the
1462 <xref linkend="guc-timezone"> setting. For safety, such functions should
1463 be labeled <literal>STABLE</> instead.
1468 Before <productname>PostgreSQL</productname> release 8.0, the requirement
1469 that <literal>STABLE</> and <literal>IMMUTABLE</> functions cannot modify
1470 the database was not enforced by the system. Releases 8.0 and later enforce it
1471 by requiring SQL functions and procedural language functions of these
1472 categories to contain no SQL commands other than <command>SELECT</>.
1473 (This is not a completely bulletproof test, since such functions could
1474 still call <literal>VOLATILE</> functions that modify the database.
1475 If you do that, you will find that the <literal>STABLE</> or
1476 <literal>IMMUTABLE</> function does not notice the database changes
1477 applied by the called function, since they are hidden from its snapshot.)
1482 <sect1 id="xfunc-pl">
1483 <title>Procedural Language Functions</title>
1486 <productname>PostgreSQL</productname> allows user-defined functions
1487 to be written in other languages besides SQL and C. These other
1488 languages are generically called <firstterm>procedural
1489 languages</firstterm> (<acronym>PL</>s).
1490 Procedural languages aren't built into the
1491 <productname>PostgreSQL</productname> server; they are offered
1492 by loadable modules.
1493 See <xref linkend="xplang"> and following chapters for more
1498 <sect1 id="xfunc-internal">
1499 <title>Internal Functions</title>
1501 <indexterm zone="xfunc-internal"><primary>function</><secondary>internal</></>
1504 Internal functions are functions written in C that have been statically
1505 linked into the <productname>PostgreSQL</productname> server.
1506 The <quote>body</quote> of the function definition
1507 specifies the C-language name of the function, which need not be the
1508 same as the name being declared for SQL use.
1509 (For reasons of backward compatibility, an empty body
1510 is accepted as meaning that the C-language function name is the
1511 same as the SQL name.)
1515 Normally, all internal functions present in the
1516 server are declared during the initialization of the database cluster
1517 (see <xref linkend="creating-cluster">),
1518 but a user could use <command>CREATE FUNCTION</command>
1519 to create additional alias names for an internal function.
1520 Internal functions are declared in <command>CREATE FUNCTION</command>
1521 with language name <literal>internal</literal>. For instance, to
1522 create an alias for the <function>sqrt</function> function:
1524 CREATE FUNCTION square_root(double precision) RETURNS double precision
1529 (Most internal functions expect to be declared <quote>strict</quote>.)
1534 Not all <quote>predefined</quote> functions are
1535 <quote>internal</quote> in the above sense. Some predefined
1536 functions are written in SQL.
1541 <sect1 id="xfunc-c">
1542 <title>C-Language Functions</title>
1544 <indexterm zone="xfunc-c">
1545 <primary>function</primary>
1546 <secondary>user-defined</secondary>
1547 <tertiary>in C</tertiary>
1551 User-defined functions can be written in C (or a language that can
1552 be made compatible with C, such as C++). Such functions are
1553 compiled into dynamically loadable objects (also called shared
1554 libraries) and are loaded by the server on demand. The dynamic
1555 loading feature is what distinguishes <quote>C language</> functions
1556 from <quote>internal</> functions — the actual coding conventions
1557 are essentially the same for both. (Hence, the standard internal
1558 function library is a rich source of coding examples for user-defined
1563 Two different calling conventions are currently used for C functions.
1564 The newer <quote>version 1</quote> calling convention is indicated by writing
1565 a <literal>PG_FUNCTION_INFO_V1()</literal> macro call for the function,
1566 as illustrated below. Lack of such a macro indicates an old-style
1567 (<quote>version 0</quote>) function. The language name specified in <command>CREATE FUNCTION</command>
1568 is <literal>C</literal> in either case. Old-style functions are now deprecated
1569 because of portability problems and lack of functionality, but they
1570 are still supported for compatibility reasons.
1573 <sect2 id="xfunc-c-dynload">
1574 <title>Dynamic Loading</title>
1576 <indexterm zone="xfunc-c-dynload">
1577 <primary>dynamic loading</primary>
1581 The first time a user-defined function in a particular
1582 loadable object file is called in a session,
1583 the dynamic loader loads that object file into memory so that the
1584 function can be called. The <command>CREATE FUNCTION</command>
1585 for a user-defined C function must therefore specify two pieces of
1586 information for the function: the name of the loadable
1587 object file, and the C name (link symbol) of the specific function to call
1588 within that object file. If the C name is not explicitly specified then
1589 it is assumed to be the same as the SQL function name.
1593 The following algorithm is used to locate the shared object file
1594 based on the name given in the <command>CREATE FUNCTION</command>
1600 If the name is an absolute path, the given file is loaded.
1606 If the name starts with the string <literal>$libdir</literal>,
1607 that part is replaced by the <productname>PostgreSQL</> package
1609 name, which is determined at build time.<indexterm><primary>$libdir</></>
1615 If the name does not contain a directory part, the file is
1616 searched for in the path specified by the configuration variable
1617 <xref linkend="guc-dynamic-library-path">.<indexterm><primary>dynamic_library_path</></>
1623 Otherwise (the file was not found in the path, or it contains a
1624 non-absolute directory part), the dynamic loader will try to
1625 take the name as given, which will most likely fail. (It is
1626 unreliable to depend on the current working directory.)
1631 If this sequence does not work, the platform-specific shared
1632 library file name extension (often <filename>.so</filename>) is
1633 appended to the given name and this sequence is tried again. If
1634 that fails as well, the load will fail.
1638 It is recommended to locate shared libraries either relative to
1639 <literal>$libdir</literal> or through the dynamic library path.
1640 This simplifies version upgrades if the new installation is at a
1641 different location. The actual directory that
1642 <literal>$libdir</literal> stands for can be found out with the
1643 command <literal>pg_config --pkglibdir</literal>.
1647 The user ID the <productname>PostgreSQL</productname> server runs
1648 as must be able to traverse the path to the file you intend to
1649 load. Making the file or a higher-level directory not readable
1650 and/or not executable by the <systemitem>postgres</systemitem>
1651 user is a common mistake.
1655 In any case, the file name that is given in the
1656 <command>CREATE FUNCTION</command> command is recorded literally
1657 in the system catalogs, so if the file needs to be loaded again
1658 the same procedure is applied.
1663 <productname>PostgreSQL</productname> will not compile a C function
1664 automatically. The object file must be compiled before it is referenced
1665 in a <command>CREATE
1666 FUNCTION</> command. See <xref linkend="dfunc"> for additional
1671 <indexterm zone="xfunc-c-dynload">
1672 <primary>magic block</primary>
1676 To ensure that a dynamically loaded object file is not loaded into an
1677 incompatible server, <productname>PostgreSQL</productname> checks that the
1678 file contains a <quote>magic block</> with the appropriate contents.
1679 This allows the server to detect obvious incompatibilities, such as code
1680 compiled for a different major version of
1681 <productname>PostgreSQL</productname>. A magic block is required as of
1682 <productname>PostgreSQL</productname> 8.2. To include a magic block,
1683 write this in one (and only one) of the module source files, after having
1684 included the header <filename>fmgr.h</>:
1687 #ifdef PG_MODULE_MAGIC
1692 The <literal>#ifdef</> test can be omitted if the code doesn't
1693 need to compile against pre-8.2 <productname>PostgreSQL</productname>
1698 After it is used for the first time, a dynamically loaded object
1699 file is retained in memory. Future calls in the same session to
1700 the function(s) in that file will only incur the small overhead of
1701 a symbol table lookup. If you need to force a reload of an object
1702 file, for example after recompiling it, begin a fresh session.
1705 <indexterm zone="xfunc-c-dynload">
1706 <primary>_PG_init</primary>
1708 <indexterm zone="xfunc-c-dynload">
1709 <primary>_PG_fini</primary>
1711 <indexterm zone="xfunc-c-dynload">
1712 <primary>library initialization function</primary>
1714 <indexterm zone="xfunc-c-dynload">
1715 <primary>library finalization function</primary>
1719 Optionally, a dynamically loaded file can contain initialization and
1720 finalization functions. If the file includes a function named
1721 <function>_PG_init</>, that function will be called immediately after
1722 loading the file. The function receives no parameters and should
1723 return void. If the file includes a function named
1724 <function>_PG_fini</>, that function will be called immediately before
1725 unloading the file. Likewise, the function receives no parameters and
1726 should return void. Note that <function>_PG_fini</> will only be called
1727 during an unload of the file, not during process termination.
1728 (Presently, unloads are disabled and will never occur, but this may
1729 change in the future.)
1734 <sect2 id="xfunc-c-basetype">
1735 <title>Base Types in C-Language Functions</title>
1737 <indexterm zone="xfunc-c-basetype">
1738 <primary>data type</primary>
1739 <secondary>internal organization</secondary>
1743 To know how to write C-language functions, you need to know how
1744 <productname>PostgreSQL</productname> internally represents base
1745 data types and how they can be passed to and from functions.
1746 Internally, <productname>PostgreSQL</productname> regards a base
1747 type as a <quote>blob of memory</quote>. The user-defined
1748 functions that you define over a type in turn define the way that
1749 <productname>PostgreSQL</productname> can operate on it. That
1750 is, <productname>PostgreSQL</productname> will only store and
1751 retrieve the data from disk and use your user-defined functions
1752 to input, process, and output the data.
1756 Base types can have one of three internal formats:
1761 pass by value, fixed-length
1766 pass by reference, fixed-length
1771 pass by reference, variable-length
1778 By-value types can only be 1, 2, or 4 bytes in length
1779 (also 8 bytes, if <literal>sizeof(Datum)</literal> is 8 on your machine).
1780 You should be careful to define your types such that they will be the
1781 same size (in bytes) on all architectures. For example, the
1782 <literal>long</literal> type is dangerous because it is 4 bytes on some
1783 machines and 8 bytes on others, whereas <type>int</type> type is 4 bytes
1784 on most Unix machines. A reasonable implementation of the
1785 <type>int4</type> type on Unix machines might be:
1788 /* 4-byte integer, passed by value */
1792 (The actual PostgreSQL C code calls this type <type>int32</type>, because
1793 it is a convention in C that <type>int<replaceable>XX</replaceable></type>
1794 means <replaceable>XX</replaceable> <emphasis>bits</emphasis>. Note
1795 therefore also that the C type <type>int8</type> is 1 byte in size. The
1796 SQL type <type>int8</type> is called <type>int64</type> in C. See also
1797 <xref linkend="xfunc-c-type-table">.)
1801 On the other hand, fixed-length types of any size can
1802 be passed by-reference. For example, here is a sample
1803 implementation of a <productname>PostgreSQL</productname> type:
1806 /* 16-byte structure, passed by reference */
1813 Only pointers to such types can be used when passing
1814 them in and out of <productname>PostgreSQL</productname> functions.
1815 To return a value of such a type, allocate the right amount of
1816 memory with <literal>palloc</literal>, fill in the allocated memory,
1817 and return a pointer to it. (Also, if you just want to return the
1818 same value as one of your input arguments that's of the same data type,
1819 you can skip the extra <literal>palloc</literal> and just return the
1820 pointer to the input value.)
1824 Finally, all variable-length types must also be passed
1825 by reference. All variable-length types must begin
1826 with an opaque length field of exactly 4 bytes, which will be set
1827 by <symbol>SET_VARSIZE</symbol>; never set this field directly! All data to
1828 be stored within that type must be located in the memory
1829 immediately following that length field. The
1830 length field contains the total length of the structure,
1831 that is, it includes the size of the length field
1836 Another important point is to avoid leaving any uninitialized bits
1837 within data type values; for example, take care to zero out any
1838 alignment padding bytes that might be present in structs. Without
1839 this, logically-equivalent constants of your data type might be
1840 seen as unequal by the planner, leading to inefficient (though not
1846 <emphasis>Never</> modify the contents of a pass-by-reference input
1847 value. If you do so you are likely to corrupt on-disk data, since
1848 the pointer you are given might point directly into a disk buffer.
1849 The sole exception to this rule is explained in
1850 <xref linkend="xaggr">.
1855 As an example, we can define the type <type>text</type> as
1865 Obviously, the data field declared here is not long enough to hold
1866 all possible strings. Since it's impossible to declare a variable-size
1867 structure in <acronym>C</acronym>, we rely on the knowledge that the
1868 <acronym>C</acronym> compiler won't range-check array subscripts. We
1869 just allocate the necessary amount of space and then access the array as
1870 if it were declared the right length. (This is a common trick, which
1871 you can read about in many textbooks about C.)
1876 variable-length types, we must be careful to allocate
1877 the correct amount of memory and set the length field correctly.
1878 For example, if we wanted to store 40 bytes in a <structname>text</>
1879 structure, we might use a code fragment like this:
1881 <programlisting><![CDATA[
1882 #include "postgres.h"
1884 char buffer[40]; /* our source data */
1886 text *destination = (text *) palloc(VARHDRSZ + 40);
1887 SET_VARSIZE(destination, VARHDRSZ + 40);
1888 memcpy(destination->data, buffer, 40);
1893 <literal>VARHDRSZ</> is the same as <literal>sizeof(int32)</>, but
1894 it's considered good style to use the macro <literal>VARHDRSZ</>
1895 to refer to the size of the overhead for a variable-length type.
1896 Also, the length field <emphasis>must</> be set using the
1897 <literal>SET_VARSIZE</> macro, not by simple assignment.
1901 <xref linkend="xfunc-c-type-table"> specifies which C type
1902 corresponds to which SQL type when writing a C-language function
1903 that uses a built-in type of <productname>PostgreSQL</>.
1904 The <quote>Defined In</quote> column gives the header file that
1905 needs to be included to get the type definition. (The actual
1906 definition might be in a different file that is included by the
1907 listed file. It is recommended that users stick to the defined
1908 interface.) Note that you should always include
1909 <filename>postgres.h</filename> first in any source file, because
1910 it declares a number of things that you will need anyway.
1913 <table tocentry="1" id="xfunc-c-type-table">
1914 <title>Equivalent C Types for Built-in SQL Types</title>
1931 <entry><type>abstime</type></entry>
1932 <entry><type>AbsoluteTime</type></entry>
1933 <entry><filename>utils/nabstime.h</filename></entry>
1936 <entry><type>boolean</type></entry>
1937 <entry><type>bool</type></entry>
1938 <entry><filename>postgres.h</filename> (maybe compiler built-in)</entry>
1941 <entry><type>box</type></entry>
1942 <entry><type>BOX*</type></entry>
1943 <entry><filename>utils/geo_decls.h</filename></entry>
1946 <entry><type>bytea</type></entry>
1947 <entry><type>bytea*</type></entry>
1948 <entry><filename>postgres.h</filename></entry>
1951 <entry><type>"char"</type></entry>
1952 <entry><type>char</type></entry>
1953 <entry>(compiler built-in)</entry>
1956 <entry><type>character</type></entry>
1957 <entry><type>BpChar*</type></entry>
1958 <entry><filename>postgres.h</filename></entry>
1961 <entry><type>cid</type></entry>
1962 <entry><type>CommandId</type></entry>
1963 <entry><filename>postgres.h</filename></entry>
1966 <entry><type>date</type></entry>
1967 <entry><type>DateADT</type></entry>
1968 <entry><filename>utils/date.h</filename></entry>
1971 <entry><type>smallint</type> (<type>int2</type>)</entry>
1972 <entry><type>int16</type></entry>
1973 <entry><filename>postgres.h</filename></entry>
1976 <entry><type>int2vector</type></entry>
1977 <entry><type>int2vector*</type></entry>
1978 <entry><filename>postgres.h</filename></entry>
1981 <entry><type>integer</type> (<type>int4</type>)</entry>
1982 <entry><type>int32</type></entry>
1983 <entry><filename>postgres.h</filename></entry>
1986 <entry><type>real</type> (<type>float4</type>)</entry>
1987 <entry><type>float4*</type></entry>
1988 <entry><filename>postgres.h</filename></entry>
1991 <entry><type>double precision</type> (<type>float8</type>)</entry>
1992 <entry><type>float8*</type></entry>
1993 <entry><filename>postgres.h</filename></entry>
1996 <entry><type>interval</type></entry>
1997 <entry><type>Interval*</type></entry>
1998 <entry><filename>datatype/timestamp.h</filename></entry>
2001 <entry><type>lseg</type></entry>
2002 <entry><type>LSEG*</type></entry>
2003 <entry><filename>utils/geo_decls.h</filename></entry>
2006 <entry><type>name</type></entry>
2007 <entry><type>Name</type></entry>
2008 <entry><filename>postgres.h</filename></entry>
2011 <entry><type>oid</type></entry>
2012 <entry><type>Oid</type></entry>
2013 <entry><filename>postgres.h</filename></entry>
2016 <entry><type>oidvector</type></entry>
2017 <entry><type>oidvector*</type></entry>
2018 <entry><filename>postgres.h</filename></entry>
2021 <entry><type>path</type></entry>
2022 <entry><type>PATH*</type></entry>
2023 <entry><filename>utils/geo_decls.h</filename></entry>
2026 <entry><type>point</type></entry>
2027 <entry><type>POINT*</type></entry>
2028 <entry><filename>utils/geo_decls.h</filename></entry>
2031 <entry><type>regproc</type></entry>
2032 <entry><type>regproc</type></entry>
2033 <entry><filename>postgres.h</filename></entry>
2036 <entry><type>reltime</type></entry>
2037 <entry><type>RelativeTime</type></entry>
2038 <entry><filename>utils/nabstime.h</filename></entry>
2041 <entry><type>text</type></entry>
2042 <entry><type>text*</type></entry>
2043 <entry><filename>postgres.h</filename></entry>
2046 <entry><type>tid</type></entry>
2047 <entry><type>ItemPointer</type></entry>
2048 <entry><filename>storage/itemptr.h</filename></entry>
2051 <entry><type>time</type></entry>
2052 <entry><type>TimeADT</type></entry>
2053 <entry><filename>utils/date.h</filename></entry>
2056 <entry><type>time with time zone</type></entry>
2057 <entry><type>TimeTzADT</type></entry>
2058 <entry><filename>utils/date.h</filename></entry>
2061 <entry><type>timestamp</type></entry>
2062 <entry><type>Timestamp*</type></entry>
2063 <entry><filename>datatype/timestamp.h</filename></entry>
2066 <entry><type>tinterval</type></entry>
2067 <entry><type>TimeInterval</type></entry>
2068 <entry><filename>utils/nabstime.h</filename></entry>
2071 <entry><type>varchar</type></entry>
2072 <entry><type>VarChar*</type></entry>
2073 <entry><filename>postgres.h</filename></entry>
2076 <entry><type>xid</type></entry>
2077 <entry><type>TransactionId</type></entry>
2078 <entry><filename>postgres.h</filename></entry>
2085 Now that we've gone over all of the possible structures
2086 for base types, we can show some examples of real functions.
2091 <title>Version 0 Calling Conventions</title>
2094 We present the <quote>old style</quote> calling convention first — although
2095 this approach is now deprecated, it's easier to get a handle on
2096 initially. In the version-0 method, the arguments and result
2097 of the C function are just declared in normal C style, but being
2098 careful to use the C representation of each SQL data type as shown
2103 Here are some examples:
2105 <programlisting><![CDATA[
2106 #include "postgres.h"
2108 #include "utils/geo_decls.h"
2110 #ifdef PG_MODULE_MAGIC
2122 /* by reference, fixed length */
2125 add_one_float8(float8 *arg)
2127 float8 *result = (float8 *) palloc(sizeof(float8));
2129 *result = *arg + 1.0;
2135 makepoint(Point *pointx, Point *pointy)
2137 Point *new_point = (Point *) palloc(sizeof(Point));
2139 new_point->x = pointx->x;
2140 new_point->y = pointy->y;
2145 /* by reference, variable length */
2151 * VARSIZE is the total size of the struct in bytes.
2153 text *new_t = (text *) palloc(VARSIZE(t));
2154 SET_VARSIZE(new_t, VARSIZE(t));
2156 * VARDATA is a pointer to the data region of the struct.
2158 memcpy((void *) VARDATA(new_t), /* destination */
2159 (void *) VARDATA(t), /* source */
2160 VARSIZE(t) - VARHDRSZ); /* how many bytes */
2165 concat_text(text *arg1, text *arg2)
2167 int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
2168 text *new_text = (text *) palloc(new_text_size);
2170 SET_VARSIZE(new_text, new_text_size);
2171 memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1) - VARHDRSZ);
2172 memcpy(VARDATA(new_text) + (VARSIZE(arg1) - VARHDRSZ),
2173 VARDATA(arg2), VARSIZE(arg2) - VARHDRSZ);
2181 Supposing that the above code has been prepared in file
2182 <filename>funcs.c</filename> and compiled into a shared object,
2183 we could define the functions to <productname>PostgreSQL</productname>
2184 with commands like this:
2187 CREATE FUNCTION add_one(integer) RETURNS integer
2188 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one'
2191 -- note overloading of SQL function name "add_one"
2192 CREATE FUNCTION add_one(double precision) RETURNS double precision
2193 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one_float8'
2196 CREATE FUNCTION makepoint(point, point) RETURNS point
2197 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'makepoint'
2200 CREATE FUNCTION copytext(text) RETURNS text
2201 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'copytext'
2204 CREATE FUNCTION concat_text(text, text) RETURNS text
2205 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'concat_text'
2211 Here, <replaceable>DIRECTORY</replaceable> stands for the
2212 directory of the shared library file (for instance the
2213 <productname>PostgreSQL</productname> tutorial directory, which
2214 contains the code for the examples used in this section).
2215 (Better style would be to use just <literal>'funcs'</> in the
2216 <literal>AS</> clause, after having added
2217 <replaceable>DIRECTORY</replaceable> to the search path. In any
2218 case, we can omit the system-specific extension for a shared
2219 library, commonly <literal>.so</literal> or
2220 <literal>.sl</literal>.)
2224 Notice that we have specified the functions as <quote>strict</quote>,
2226 the system should automatically assume a null result if any input
2227 value is null. By doing this, we avoid having to check for null inputs
2228 in the function code. Without this, we'd have to check for null values
2229 explicitly, by checking for a null pointer for each
2230 pass-by-reference argument. (For pass-by-value arguments, we don't
2231 even have a way to check!)
2235 Although this calling convention is simple to use,
2236 it is not very portable; on some architectures there are problems
2237 with passing data types that are smaller than <type>int</type> this way. Also, there is
2238 no simple way to return a null result, nor to cope with null arguments
2239 in any way other than making the function strict. The version-1
2240 convention, presented next, overcomes these objections.
2245 <title>Version 1 Calling Conventions</title>
2248 The version-1 calling convention relies on macros to suppress most
2249 of the complexity of passing arguments and results. The C declaration
2250 of a version-1 function is always:
2252 Datum funcname(PG_FUNCTION_ARGS)
2254 In addition, the macro call:
2256 PG_FUNCTION_INFO_V1(funcname);
2258 must appear in the same source file. (Conventionally, it's
2259 written just before the function itself.) This macro call is not
2260 needed for <literal>internal</>-language functions, since
2261 <productname>PostgreSQL</> assumes that all internal functions
2262 use the version-1 convention. It is, however, required for
2263 dynamically-loaded functions.
2267 In a version-1 function, each actual argument is fetched using a
2268 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2269 macro that corresponds to the argument's data type, and the
2270 result is returned using a
2271 <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2272 macro for the return type.
2273 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2274 takes as its argument the number of the function argument to
2275 fetch, where the count starts at 0.
2276 <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2277 takes as its argument the actual value to return.
2281 Here we show the same functions as above, coded in version-1 style:
2283 <programlisting><![CDATA[
2284 #include "postgres.h"
2287 #include "utils/geo_decls.h"
2289 #ifdef PG_MODULE_MAGIC
2295 PG_FUNCTION_INFO_V1(add_one);
2298 add_one(PG_FUNCTION_ARGS)
2300 int32 arg = PG_GETARG_INT32(0);
2302 PG_RETURN_INT32(arg + 1);
2305 /* by reference, fixed length */
2307 PG_FUNCTION_INFO_V1(add_one_float8);
2310 add_one_float8(PG_FUNCTION_ARGS)
2312 /* The macros for FLOAT8 hide its pass-by-reference nature. */
2313 float8 arg = PG_GETARG_FLOAT8(0);
2315 PG_RETURN_FLOAT8(arg + 1.0);
2318 PG_FUNCTION_INFO_V1(makepoint);
2321 makepoint(PG_FUNCTION_ARGS)
2323 /* Here, the pass-by-reference nature of Point is not hidden. */
2324 Point *pointx = PG_GETARG_POINT_P(0);
2325 Point *pointy = PG_GETARG_POINT_P(1);
2326 Point *new_point = (Point *) palloc(sizeof(Point));
2328 new_point->x = pointx->x;
2329 new_point->y = pointy->y;
2331 PG_RETURN_POINT_P(new_point);
2334 /* by reference, variable length */
2336 PG_FUNCTION_INFO_V1(copytext);
2339 copytext(PG_FUNCTION_ARGS)
2341 text *t = PG_GETARG_TEXT_P(0);
2343 * VARSIZE is the total size of the struct in bytes.
2345 text *new_t = (text *) palloc(VARSIZE(t));
2346 SET_VARSIZE(new_t, VARSIZE(t));
2348 * VARDATA is a pointer to the data region of the struct.
2350 memcpy((void *) VARDATA(new_t), /* destination */
2351 (void *) VARDATA(t), /* source */
2352 VARSIZE(t) - VARHDRSZ); /* how many bytes */
2353 PG_RETURN_TEXT_P(new_t);
2356 PG_FUNCTION_INFO_V1(concat_text);
2359 concat_text(PG_FUNCTION_ARGS)
2361 text *arg1 = PG_GETARG_TEXT_P(0);
2362 text *arg2 = PG_GETARG_TEXT_P(1);
2363 int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
2364 text *new_text = (text *) palloc(new_text_size);
2366 SET_VARSIZE(new_text, new_text_size);
2367 memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1) - VARHDRSZ);
2368 memcpy(VARDATA(new_text) + (VARSIZE(arg1) - VARHDRSZ),
2369 VARDATA(arg2), VARSIZE(arg2) - VARHDRSZ);
2370 PG_RETURN_TEXT_P(new_text);
2377 The <command>CREATE FUNCTION</command> commands are the same as
2378 for the version-0 equivalents.
2382 At first glance, the version-1 coding conventions might appear to
2383 be just pointless obscurantism. They do, however, offer a number
2384 of improvements, because the macros can hide unnecessary detail.
2385 An example is that in coding <function>add_one_float8</>, we no longer need to
2386 be aware that <type>float8</type> is a pass-by-reference type. Another
2387 example is that the <literal>GETARG</> macros for variable-length types allow
2388 for more efficient fetching of <quote>toasted</quote> (compressed or
2389 out-of-line) values.
2393 One big improvement in version-1 functions is better handling of null
2394 inputs and results. The macro <function>PG_ARGISNULL(<replaceable>n</>)</function>
2395 allows a function to test whether each input is null. (Of course, doing
2396 this is only necessary in functions not declared <quote>strict</>.)
2398 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> macros,
2399 the input arguments are counted beginning at zero. Note that one
2400 should refrain from executing
2401 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> until
2402 one has verified that the argument isn't null.
2403 To return a null result, execute <function>PG_RETURN_NULL()</function>;
2404 this works in both strict and nonstrict functions.
2408 Other options provided in the new-style interface are two
2410 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2411 macros. The first of these,
2412 <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>,
2413 guarantees to return a copy of the specified argument that is
2414 safe for writing into. (The normal macros will sometimes return a
2415 pointer to a value that is physically stored in a table, which
2416 must not be written to. Using the
2417 <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>
2418 macros guarantees a writable result.)
2419 The second variant consists of the
2420 <function>PG_GETARG_<replaceable>xxx</replaceable>_SLICE()</function>
2421 macros which take three arguments. The first is the number of the
2422 function argument (as above). The second and third are the offset and
2423 length of the segment to be returned. Offsets are counted from
2424 zero, and a negative length requests that the remainder of the
2425 value be returned. These macros provide more efficient access to
2426 parts of large values in the case where they have storage type
2427 <quote>external</quote>. (The storage type of a column can be specified using
2428 <literal>ALTER TABLE <replaceable>tablename</replaceable> ALTER
2429 COLUMN <replaceable>colname</replaceable> SET STORAGE
2430 <replaceable>storagetype</replaceable></literal>. <replaceable>storagetype</replaceable> is one of
2431 <literal>plain</>, <literal>external</>, <literal>extended</literal>,
2432 or <literal>main</>.)
2436 Finally, the version-1 function call conventions make it possible
2437 to return set results (<xref linkend="xfunc-c-return-set">) and
2438 implement trigger functions (<xref linkend="triggers">) and
2439 procedural-language call handlers (<xref
2440 linkend="plhandler">). Version-1 code is also more
2441 portable than version-0, because it does not break restrictions
2442 on function call protocol in the C standard. For more details
2443 see <filename>src/backend/utils/fmgr/README</filename> in the
2444 source distribution.
2449 <title>Writing Code</title>
2452 Before we turn to the more advanced topics, we should discuss
2453 some coding rules for <productname>PostgreSQL</productname>
2454 C-language functions. While it might be possible to load functions
2455 written in languages other than C into
2456 <productname>PostgreSQL</productname>, this is usually difficult
2457 (when it is possible at all) because other languages, such as
2458 C++, FORTRAN, or Pascal often do not follow the same calling
2459 convention as C. That is, other languages do not pass argument
2460 and return values between functions in the same way. For this
2461 reason, we will assume that your C-language functions are
2462 actually written in C.
2466 The basic rules for writing and building C functions are as follows:
2471 Use <literal>pg_config
2472 --includedir-server</literal><indexterm><primary>pg_config</><secondary>with user-defined C functions</></>
2473 to find out where the <productname>PostgreSQL</> server header
2474 files are installed on your system (or the system that your
2475 users will be running on).
2481 Compiling and linking your code so that it can be dynamically
2482 loaded into <productname>PostgreSQL</productname> always
2483 requires special flags. See <xref linkend="dfunc"> for a
2484 detailed explanation of how to do it for your particular
2491 Remember to define a <quote>magic block</> for your shared library,
2492 as described in <xref linkend="xfunc-c-dynload">.
2498 When allocating memory, use the
2499 <productname>PostgreSQL</productname> functions
2500 <function>palloc</function><indexterm><primary>palloc</></> and <function>pfree</function><indexterm><primary>pfree</></>
2501 instead of the corresponding C library functions
2502 <function>malloc</function> and <function>free</function>.
2503 The memory allocated by <function>palloc</function> will be
2504 freed automatically at the end of each transaction, preventing
2511 Always zero the bytes of your structures using <function>memset</>
2512 (or allocate them with <function>palloc0</> in the first place).
2513 Even if you assign to each field of your structure, there might be
2514 alignment padding (holes in the structure) that contain
2515 garbage values. Without this, it's difficult to
2516 support hash indexes or hash joins, as you must pick out only
2517 the significant bits of your data structure to compute a hash.
2518 The planner also sometimes relies on comparing constants via
2519 bitwise equality, so you can get undesirable planning results if
2520 logically-equivalent values aren't bitwise equal.
2526 Most of the internal <productname>PostgreSQL</productname>
2527 types are declared in <filename>postgres.h</filename>, while
2528 the function manager interfaces
2529 (<symbol>PG_FUNCTION_ARGS</symbol>, etc.) are in
2530 <filename>fmgr.h</filename>, so you will need to include at
2531 least these two files. For portability reasons it's best to
2532 include <filename>postgres.h</filename> <emphasis>first</>,
2533 before any other system or user header files. Including
2534 <filename>postgres.h</filename> will also include
2535 <filename>elog.h</filename> and <filename>palloc.h</filename>
2542 Symbol names defined within object files must not conflict
2543 with each other or with symbols defined in the
2544 <productname>PostgreSQL</productname> server executable. You
2545 will have to rename your functions or variables if you get
2546 error messages to this effect.
2556 <title>Composite-type Arguments</title>
2559 Composite types do not have a fixed layout like C structures.
2560 Instances of a composite type can contain null fields. In
2561 addition, composite types that are part of an inheritance
2562 hierarchy can have different fields than other members of the
2563 same inheritance hierarchy. Therefore,
2564 <productname>PostgreSQL</productname> provides a function
2565 interface for accessing fields of composite types from C.
2569 Suppose we want to write a function to answer the query:
2572 SELECT name, c_overpaid(emp, 1500) AS overpaid
2574 WHERE name = 'Bill' OR name = 'Sam';
2577 Using call conventions version 0, we can define
2578 <function>c_overpaid</> as:
2580 <programlisting><![CDATA[
2581 #include "postgres.h"
2582 #include "executor/executor.h" /* for GetAttributeByName() */
2584 #ifdef PG_MODULE_MAGIC
2589 c_overpaid(HeapTupleHeader t, /* the current row of emp */
2595 salary = DatumGetInt32(GetAttributeByName(t, "salary", &isnull));
2598 return salary > limit;
2603 In version-1 coding, the above would look like this:
2605 <programlisting><![CDATA[
2606 #include "postgres.h"
2607 #include "executor/executor.h" /* for GetAttributeByName() */
2609 #ifdef PG_MODULE_MAGIC
2613 PG_FUNCTION_INFO_V1(c_overpaid);
2616 c_overpaid(PG_FUNCTION_ARGS)
2618 HeapTupleHeader t = PG_GETARG_HEAPTUPLEHEADER(0);
2619 int32 limit = PG_GETARG_INT32(1);
2623 salary = GetAttributeByName(t, "salary", &isnull);
2625 PG_RETURN_BOOL(false);
2626 /* Alternatively, we might prefer to do PG_RETURN_NULL() for null salary. */
2628 PG_RETURN_BOOL(DatumGetInt32(salary) > limit);
2635 <function>GetAttributeByName</function> is the
2636 <productname>PostgreSQL</productname> system function that
2637 returns attributes out of the specified row. It has
2638 three arguments: the argument of type <type>HeapTupleHeader</type> passed
2640 the function, the name of the desired attribute, and a
2641 return parameter that tells whether the attribute
2642 is null. <function>GetAttributeByName</function> returns a <type>Datum</type>
2643 value that you can convert to the proper data type by using the
2644 appropriate <function>DatumGet<replaceable>XXX</replaceable>()</function>
2645 macro. Note that the return value is meaningless if the null flag is
2646 set; always check the null flag before trying to do anything with the
2651 There is also <function>GetAttributeByNum</function>, which selects
2652 the target attribute by column number instead of name.
2656 The following command declares the function
2657 <function>c_overpaid</function> in SQL:
2660 CREATE FUNCTION c_overpaid(emp, integer) RETURNS boolean
2661 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'c_overpaid'
2665 Notice we have used <literal>STRICT</> so that we did not have to
2666 check whether the input arguments were NULL.
2671 <title>Returning Rows (Composite Types)</title>
2674 To return a row or composite-type value from a C-language
2675 function, you can use a special API that provides macros and
2676 functions to hide most of the complexity of building composite
2677 data types. To use this API, the source file must include:
2679 #include "funcapi.h"
2684 There are two ways you can build a composite data value (henceforth
2685 a <quote>tuple</>): you can build it from an array of Datum values,
2686 or from an array of C strings that can be passed to the input
2687 conversion functions of the tuple's column data types. In either
2688 case, you first need to obtain or construct a <structname>TupleDesc</>
2689 descriptor for the tuple structure. When working with Datums, you
2690 pass the <structname>TupleDesc</> to <function>BlessTupleDesc</>,
2691 and then call <function>heap_form_tuple</> for each row. When working
2692 with C strings, you pass the <structname>TupleDesc</> to
2693 <function>TupleDescGetAttInMetadata</>, and then call
2694 <function>BuildTupleFromCStrings</> for each row. In the case of a
2695 function returning a set of tuples, the setup steps can all be done
2696 once during the first call of the function.
2700 Several helper functions are available for setting up the needed
2701 <structname>TupleDesc</>. The recommended way to do this in most
2702 functions returning composite values is to call:
2704 TypeFuncClass get_call_result_type(FunctionCallInfo fcinfo,
2706 TupleDesc *resultTupleDesc)
2708 passing the same <literal>fcinfo</> struct passed to the calling function
2709 itself. (This of course requires that you use the version-1
2710 calling conventions.) <varname>resultTypeId</> can be specified
2711 as <literal>NULL</> or as the address of a local variable to receive the
2712 function's result type OID. <varname>resultTupleDesc</> should be the
2713 address of a local <structname>TupleDesc</> variable. Check that the
2714 result is <literal>TYPEFUNC_COMPOSITE</>; if so,
2715 <varname>resultTupleDesc</> has been filled with the needed
2716 <structname>TupleDesc</>. (If it is not, you can report an error along
2717 the lines of <quote>function returning record called in context that
2718 cannot accept type record</quote>.)
2723 <function>get_call_result_type</> can resolve the actual type of a
2724 polymorphic function result; so it is useful in functions that return
2725 scalar polymorphic results, not only functions that return composites.
2726 The <varname>resultTypeId</> output is primarily useful for functions
2727 returning polymorphic scalars.
2733 <function>get_call_result_type</> has a sibling
2734 <function>get_expr_result_type</>, which can be used to resolve the
2735 expected output type for a function call represented by an expression
2736 tree. This can be used when trying to determine the result type from
2737 outside the function itself. There is also
2738 <function>get_func_result_type</>, which can be used when only the
2739 function's OID is available. However these functions are not able
2740 to deal with functions declared to return <structname>record</>, and
2741 <function>get_func_result_type</> cannot resolve polymorphic types,
2742 so you should preferentially use <function>get_call_result_type</>.
2747 Older, now-deprecated functions for obtaining
2748 <structname>TupleDesc</>s are:
2750 TupleDesc RelationNameGetTupleDesc(const char *relname)
2752 to get a <structname>TupleDesc</> for the row type of a named relation,
2755 TupleDesc TypeGetTupleDesc(Oid typeoid, List *colaliases)
2757 to get a <structname>TupleDesc</> based on a type OID. This can
2758 be used to get a <structname>TupleDesc</> for a base or
2759 composite type. It will not work for a function that returns
2760 <structname>record</>, however, and it cannot resolve polymorphic
2765 Once you have a <structname>TupleDesc</>, call:
2767 TupleDesc BlessTupleDesc(TupleDesc tupdesc)
2769 if you plan to work with Datums, or:
2771 AttInMetadata *TupleDescGetAttInMetadata(TupleDesc tupdesc)
2773 if you plan to work with C strings. If you are writing a function
2774 returning set, you can save the results of these functions in the
2775 <structname>FuncCallContext</> structure — use the
2776 <structfield>tuple_desc</> or <structfield>attinmeta</> field
2781 When working with Datums, use:
2783 HeapTuple heap_form_tuple(TupleDesc tupdesc, Datum *values, bool *isnull)
2785 to build a <structname>HeapTuple</> given user data in Datum form.
2789 When working with C strings, use:
2791 HeapTuple BuildTupleFromCStrings(AttInMetadata *attinmeta, char **values)
2793 to build a <structname>HeapTuple</> given user data
2794 in C string form. <literal>values</literal> is an array of C strings,
2795 one for each attribute of the return row. Each C string should be in
2796 the form expected by the input function of the attribute data
2797 type. In order to return a null value for one of the attributes,
2798 the corresponding pointer in the <parameter>values</> array
2799 should be set to <symbol>NULL</>. This function will need to
2800 be called again for each row you return.
2804 Once you have built a tuple to return from your function, it
2805 must be converted into a <type>Datum</>. Use:
2807 HeapTupleGetDatum(HeapTuple tuple)
2809 to convert a <structname>HeapTuple</> into a valid Datum. This
2810 <type>Datum</> can be returned directly if you intend to return
2811 just a single row, or it can be used as the current return value
2812 in a set-returning function.
2816 An example appears in the next section.
2821 <sect2 id="xfunc-c-return-set">
2822 <title>Returning Sets</title>
2825 There is also a special API that provides support for returning
2826 sets (multiple rows) from a C-language function. A set-returning
2827 function must follow the version-1 calling conventions. Also,
2828 source files must include <filename>funcapi.h</filename>, as
2833 A set-returning function (<acronym>SRF</>) is called
2834 once for each item it returns. The <acronym>SRF</> must
2835 therefore save enough state to remember what it was doing and
2836 return the next item on each call.
2837 The structure <structname>FuncCallContext</> is provided to help
2838 control this process. Within a function, <literal>fcinfo->flinfo->fn_extra</>
2839 is used to hold a pointer to <structname>FuncCallContext</>
2845 * Number of times we've been called before
2847 * call_cntr is initialized to 0 for you by SRF_FIRSTCALL_INIT(), and
2848 * incremented for you every time SRF_RETURN_NEXT() is called.
2853 * OPTIONAL maximum number of calls
2855 * max_calls is here for convenience only and setting it is optional.
2856 * If not set, you must provide alternative means to know when the
2862 * OPTIONAL pointer to result slot
2864 * This is obsolete and only present for backward compatibility, viz,
2865 * user-defined SRFs that use the deprecated TupleDescGetSlot().
2867 TupleTableSlot *slot;
2870 * OPTIONAL pointer to miscellaneous user-provided context information
2872 * user_fctx is for use as a pointer to your own data to retain
2873 * arbitrary context information between calls of your function.
2878 * OPTIONAL pointer to struct containing attribute type input metadata
2880 * attinmeta is for use when returning tuples (i.e., composite data types)
2881 * and is not used when returning base data types. It is only needed
2882 * if you intend to use BuildTupleFromCStrings() to create the return
2885 AttInMetadata *attinmeta;
2888 * memory context used for structures that must live for multiple calls
2890 * multi_call_memory_ctx is set by SRF_FIRSTCALL_INIT() for you, and used
2891 * by SRF_RETURN_DONE() for cleanup. It is the most appropriate memory
2892 * context for any memory that is to be reused across multiple calls
2895 MemoryContext multi_call_memory_ctx;
2898 * OPTIONAL pointer to struct containing tuple description
2900 * tuple_desc is for use when returning tuples (i.e., composite data types)
2901 * and is only needed if you are going to build the tuples with
2902 * heap_form_tuple() rather than with BuildTupleFromCStrings(). Note that
2903 * the TupleDesc pointer stored here should usually have been run through
2904 * BlessTupleDesc() first.
2906 TupleDesc tuple_desc;
2913 An <acronym>SRF</> uses several functions and macros that
2914 automatically manipulate the <structname>FuncCallContext</>
2915 structure (and expect to find it via <literal>fn_extra</>). Use:
2919 to determine if your function is being called for the first or a
2920 subsequent time. On the first call (only) use:
2922 SRF_FIRSTCALL_INIT()
2924 to initialize the <structname>FuncCallContext</>. On every function call,
2925 including the first, use:
2929 to properly set up for using the <structname>FuncCallContext</>
2930 and clearing any previously returned data left over from the
2935 If your function has data to return, use:
2937 SRF_RETURN_NEXT(funcctx, result)
2939 to return it to the caller. (<literal>result</> must be of type
2940 <type>Datum</>, either a single value or a tuple prepared as
2941 described above.) Finally, when your function is finished
2942 returning data, use:
2944 SRF_RETURN_DONE(funcctx)
2946 to clean up and end the <acronym>SRF</>.
2950 The memory context that is current when the <acronym>SRF</> is called is
2951 a transient context that will be cleared between calls. This means
2952 that you do not need to call <function>pfree</> on everything
2953 you allocated using <function>palloc</>; it will go away anyway. However, if you want to allocate
2954 any data structures to live across calls, you need to put them somewhere
2955 else. The memory context referenced by
2956 <structfield>multi_call_memory_ctx</> is a suitable location for any
2957 data that needs to survive until the <acronym>SRF</> is finished running. In most
2958 cases, this means that you should switch into
2959 <structfield>multi_call_memory_ctx</> while doing the first-call setup.
2963 A complete pseudo-code example looks like the following:
2966 my_set_returning_function(PG_FUNCTION_ARGS)
2968 FuncCallContext *funcctx;
2970 <replaceable>further declarations as needed</replaceable>
2972 if (SRF_IS_FIRSTCALL())
2974 MemoryContext oldcontext;
2976 funcctx = SRF_FIRSTCALL_INIT();
2977 oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
2978 /* One-time setup code appears here: */
2979 <replaceable>user code</replaceable>
2980 <replaceable>if returning composite</replaceable>
2981 <replaceable>build TupleDesc, and perhaps AttInMetadata</replaceable>
2982 <replaceable>endif returning composite</replaceable>
2983 <replaceable>user code</replaceable>
2984 MemoryContextSwitchTo(oldcontext);
2987 /* Each-time setup code appears here: */
2988 <replaceable>user code</replaceable>
2989 funcctx = SRF_PERCALL_SETUP();
2990 <replaceable>user code</replaceable>
2992 /* this is just one way we might test whether we are done: */
2993 if (funcctx->call_cntr < funcctx->max_calls)
2995 /* Here we want to return another item: */
2996 <replaceable>user code</replaceable>
2997 <replaceable>obtain result Datum</replaceable>
2998 SRF_RETURN_NEXT(funcctx, result);
3002 /* Here we are done returning items and just need to clean up: */
3003 <replaceable>user code</replaceable>
3004 SRF_RETURN_DONE(funcctx);
3011 A complete example of a simple <acronym>SRF</> returning a composite type
3013 <programlisting><![CDATA[
3014 PG_FUNCTION_INFO_V1(retcomposite);
3017 retcomposite(PG_FUNCTION_ARGS)
3019 FuncCallContext *funcctx;
3023 AttInMetadata *attinmeta;
3025 /* stuff done only on the first call of the function */
3026 if (SRF_IS_FIRSTCALL())
3028 MemoryContext oldcontext;
3030 /* create a function context for cross-call persistence */
3031 funcctx = SRF_FIRSTCALL_INIT();
3033 /* switch to memory context appropriate for multiple function calls */
3034 oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
3036 /* total number of tuples to be returned */
3037 funcctx->max_calls = PG_GETARG_UINT32(0);
3039 /* Build a tuple descriptor for our result type */
3040 if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
3042 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
3043 errmsg("function returning record called in context "
3044 "that cannot accept type record")));
3047 * generate attribute metadata needed later to produce tuples from raw
3050 attinmeta = TupleDescGetAttInMetadata(tupdesc);
3051 funcctx->attinmeta = attinmeta;
3053 MemoryContextSwitchTo(oldcontext);
3056 /* stuff done on every call of the function */
3057 funcctx = SRF_PERCALL_SETUP();
3059 call_cntr = funcctx->call_cntr;
3060 max_calls = funcctx->max_calls;
3061 attinmeta = funcctx->attinmeta;
3063 if (call_cntr < max_calls) /* do when there is more left to send */
3070 * Prepare a values array for building the returned tuple.
3071 * This should be an array of C strings which will
3072 * be processed later by the type input functions.
3074 values = (char **) palloc(3 * sizeof(char *));
3075 values[0] = (char *) palloc(16 * sizeof(char));
3076 values[1] = (char *) palloc(16 * sizeof(char));
3077 values[2] = (char *) palloc(16 * sizeof(char));
3079 snprintf(values[0], 16, "%d", 1 * PG_GETARG_INT32(1));
3080 snprintf(values[1], 16, "%d", 2 * PG_GETARG_INT32(1));
3081 snprintf(values[2], 16, "%d", 3 * PG_GETARG_INT32(1));
3084 tuple = BuildTupleFromCStrings(attinmeta, values);
3086 /* make the tuple into a datum */
3087 result = HeapTupleGetDatum(tuple);
3089 /* clean up (this is not really necessary) */
3095 SRF_RETURN_NEXT(funcctx, result);
3097 else /* do when there is no more left */
3099 SRF_RETURN_DONE(funcctx);
3105 One way to declare this function in SQL is:
3107 CREATE TYPE __retcomposite AS (f1 integer, f2 integer, f3 integer);
3109 CREATE OR REPLACE FUNCTION retcomposite(integer, integer)
3110 RETURNS SETOF __retcomposite
3111 AS '<replaceable>filename</>', 'retcomposite'
3112 LANGUAGE C IMMUTABLE STRICT;
3114 A different way is to use OUT parameters:
3116 CREATE OR REPLACE FUNCTION retcomposite(IN integer, IN integer,
3117 OUT f1 integer, OUT f2 integer, OUT f3 integer)
3118 RETURNS SETOF record
3119 AS '<replaceable>filename</>', 'retcomposite'
3120 LANGUAGE C IMMUTABLE STRICT;
3122 Notice that in this method the output type of the function is formally
3123 an anonymous <structname>record</> type.
3127 The directory <link linkend="tablefunc">contrib/tablefunc</>
3128 module in the source distribution contains more examples of
3129 set-returning functions.
3134 <title>Polymorphic Arguments and Return Types</title>
3137 C-language functions can be declared to accept and
3138 return the polymorphic types
3139 <type>anyelement</type>, <type>anyarray</type>, <type>anynonarray</type>,
3140 <type>anyenum</type>, and <type>anyrange</type>.
3141 See <xref linkend="extend-types-polymorphic"> for a more detailed explanation
3142 of polymorphic functions. When function arguments or return types
3143 are defined as polymorphic types, the function author cannot know
3144 in advance what data type it will be called with, or
3145 need to return. There are two routines provided in <filename>fmgr.h</>
3146 to allow a version-1 C function to discover the actual data types
3147 of its arguments and the type it is expected to return. The routines are
3148 called <literal>get_fn_expr_rettype(FmgrInfo *flinfo)</> and
3149 <literal>get_fn_expr_argtype(FmgrInfo *flinfo, int argnum)</>.
3150 They return the result or argument type OID, or <symbol>InvalidOid</symbol> if the
3151 information is not available.
3152 The structure <literal>flinfo</> is normally accessed as
3153 <literal>fcinfo->flinfo</>. The parameter <literal>argnum</>
3154 is zero based. <function>get_call_result_type</> can also be used
3155 as an alternative to <function>get_fn_expr_rettype</>.
3156 There is also <function>get_fn_expr_variadic</>, which can be used to
3157 find out whether the call contained an explicit <literal>VARIADIC</>
3158 keyword. This is primarily useful for <literal>VARIADIC "any"</>
3159 functions, as described below.
3163 For example, suppose we want to write a function to accept a single
3164 element of any type, and return a one-dimensional array of that type:
3167 PG_FUNCTION_INFO_V1(make_array);
3169 make_array(PG_FUNCTION_ARGS)
3172 Oid element_type = get_fn_expr_argtype(fcinfo->flinfo, 0);
3182 if (!OidIsValid(element_type))
3183 elog(ERROR, "could not determine data type of input");
3185 /* get the provided element, being careful in case it's NULL */
3186 isnull = PG_ARGISNULL(0);
3188 element = (Datum) 0;
3190 element = PG_GETARG_DATUM(0);
3192 /* we have one dimension */
3194 /* and one element */
3196 /* and lower bound is 1 */
3199 /* get required info about the element type */
3200 get_typlenbyvalalign(element_type, &typlen, &typbyval, &typalign);
3202 /* now build the array */
3203 result = construct_md_array(&element, &isnull, ndims, dims, lbs,
3204 element_type, typlen, typbyval, typalign);
3206 PG_RETURN_ARRAYTYPE_P(result);
3212 The following command declares the function
3213 <function>make_array</function> in SQL:
3216 CREATE FUNCTION make_array(anyelement) RETURNS anyarray
3217 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'make_array'
3218 LANGUAGE C IMMUTABLE;
3223 There is a variant of polymorphism that is only available to C-language
3224 functions: they can be declared to take parameters of type
3225 <literal>"any"</>. (Note that this type name must be double-quoted,
3226 since it's also a SQL reserved word.) This works like
3227 <type>anyelement</> except that it does not constrain different
3228 <literal>"any"</> arguments to be the same type, nor do they help
3229 determine the function's result type. A C-language function can also
3230 declare its final parameter to be <literal>VARIADIC "any"</>. This will
3231 match one or more actual arguments of any type (not necessarily the same
3232 type). These arguments will <emphasis>not</> be gathered into an array
3233 as happens with normal variadic functions; they will just be passed to
3234 the function separately. The <function>PG_NARGS()</> macro and the
3235 methods described above must be used to determine the number of actual
3236 arguments and their types when using this feature. Also, users of such
3237 a function might wish to use the <literal>VARIADIC</> keyword in their
3238 function call, with the expectation that the function would treat the
3239 array elements as separate arguments. The function itself must implement
3240 that behavior if wanted, after using <function>get_fn_expr_variadic</> to
3241 detect that the actual argument was marked with <literal>VARIADIC</>.
3245 <sect2 id="xfunc-transform-functions">
3246 <title>Transform Functions</title>
3249 Some function calls can be simplified during planning based on
3250 properties specific to the function. For example,
3251 <literal>int4mul(n, 1)</> could be simplified to just <literal>n</>.
3252 To define such function-specific optimizations, write a
3253 <firstterm>transform function</> and place its OID in the
3254 <structfield>protransform</> field of the primary function's
3255 <structname>pg_proc</> entry. The transform function must have the SQL
3256 signature <literal>protransform(internal) RETURNS internal</>. The
3257 argument, actually <type>FuncExpr *</>, is a dummy node representing a
3258 call to the primary function. If the transform function's study of the
3259 expression tree proves that a simplified expression tree can substitute
3260 for all possible concrete calls represented thereby, build and return
3261 that simplified expression. Otherwise, return a <literal>NULL</>
3262 pointer (<emphasis>not</> a SQL null).
3266 We make no guarantee that <productname>PostgreSQL</> will never call the
3267 primary function in cases that the transform function could simplify.
3268 Ensure rigorous equivalence between the simplified expression and an
3269 actual call to the primary function.
3273 Currently, this facility is not exposed to users at the SQL level
3274 because of security concerns, so it is only practical to use for
3275 optimizing built-in functions.
3280 <title>Shared Memory and LWLocks</title>
3283 Add-ins can reserve LWLocks and an allocation of shared memory on server
3284 startup. The add-in's shared library must be preloaded by specifying
3286 <xref linkend="guc-shared-preload-libraries"><indexterm><primary>shared_preload_libraries</></>.
3287 Shared memory is reserved by calling:
3289 void RequestAddinShmemSpace(int size)
3291 from your <function>_PG_init</> function.
3294 LWLocks are reserved by calling:
3296 void RequestAddinLWLocks(int n)
3298 from <function>_PG_init</>.
3301 To avoid possible race-conditions, each backend should use the LWLock
3302 <function>AddinShmemInitLock</> when connecting to and initializing
3303 its allocation of shared memory, as shown here:
3305 static mystruct *ptr = NULL;
3311 LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
3312 ptr = ShmemInitStruct("my struct name", size, &found);
3315 initialize contents of shmem area;
3316 acquire any requested LWLocks using:
3317 ptr->mylockid = LWLockAssign();
3319 LWLockRelease(AddinShmemInitLock);
3325 <sect2 id="extend-Cpp">
3326 <title>Using C++ for Extensibility</title>
3328 <indexterm zone="extend-Cpp">
3329 <primary>C++</primary>
3333 Although the <productname>PostgreSQL</productname> backend is written in
3334 C, it is possible to write extensions in C++ if these guidelines are
3340 All functions accessed by the backend must present a C interface
3341 to the backend; these C functions can then call C++ functions.
3342 For example, <literal>extern C</> linkage is required for
3343 backend-accessed functions. This is also necessary for any
3344 functions that are passed as pointers between the backend and
3350 Free memory using the appropriate deallocation method. For example,
3351 most backend memory is allocated using <function>palloc()</>, so use
3352 <function>pfree()</> to free it. Using C++
3353 <function>delete</> in such cases will fail.
3358 Prevent exceptions from propagating into the C code (use a catch-all
3359 block at the top level of all <literal>extern C</> functions). This
3360 is necessary even if the C++ code does not explicitly throw any
3361 exceptions, because events like out-of-memory can still throw
3362 exceptions. Any exceptions must be caught and appropriate errors
3363 passed back to the C interface. If possible, compile C++ with
3364 <option>-fno-exceptions</> to eliminate exceptions entirely; in such
3365 cases, you must check for failures in your C++ code, e.g. check for
3366 NULL returned by <function>new()</>.
3371 If calling backend functions from C++ code, be sure that the
3372 C++ call stack contains only plain old data structures
3373 (<acronym>POD</>). This is necessary because backend errors
3374 generate a distant <function>longjmp()</> that does not properly
3375 unroll a C++ call stack with non-POD objects.
3382 In summary, it is best to place C++ code behind a wall of
3383 <literal>extern C</> functions that interface to the backend,
3384 and avoid exception, memory, and call stack leakage.