1 <!-- doc/src/sgml/xfunc.sgml -->
4 <title>User-defined Functions</title>
6 <indexterm zone="xfunc">
7 <primary>function</primary>
8 <secondary>user-defined</secondary>
<productname>PostgreSQL</productname> provides four kinds of functions:
18 query language functions (functions written in
19 <acronym>SQL</acronym>) (<xref linkend="xfunc-sql">)
24 procedural language functions (functions written in, for
25 example, <application>PL/pgSQL</> or <application>PL/Tcl</>)
26 (<xref linkend="xfunc-pl">)
31 internal functions (<xref linkend="xfunc-internal">)
36 C-language functions (<xref linkend="xfunc-c">)
Every kind of function can take base types, composite types, or
45 combinations of these as arguments (parameters). In addition,
46 every kind of function can return a base type or
47 a composite type. Functions can also be defined to return
48 sets of base or composite values.
52 Many kinds of functions can take or return certain pseudo-types
53 (such as polymorphic types), but the available facilities vary.
54 Consult the description of each kind of function for more details.
58 It's easiest to define <acronym>SQL</acronym>
59 functions, so we'll start by discussing those.
60 Most of the concepts presented for <acronym>SQL</acronym> functions
61 will carry over to the other types of functions.
65 Throughout this chapter, it can be useful to look at the reference
66 page of the <xref linkend="sql-createfunction"> command to
67 understand the examples better. Some examples from this chapter
68 can be found in <filename>funcs.sql</filename> and
69 <filename>funcs.c</filename> in the <filename>src/tutorial</>
directory in the <productname>PostgreSQL</productname> source distribution.
75 <sect1 id="xfunc-sql">
76 <title>Query Language (<acronym>SQL</acronym>) Functions</title>
78 <indexterm zone="xfunc-sql">
79 <primary>function</primary>
80 <secondary>user-defined</secondary>
81 <tertiary>in SQL</tertiary>
85 SQL functions execute an arbitrary list of SQL statements, returning
86 the result of the last query in the list.
87 In the simple (non-set)
88 case, the first row of the last query's result will be returned.
89 (Bear in mind that <quote>the first row</quote> of a multirow
90 result is not well-defined unless you use <literal>ORDER BY</>.)
91 If the last query happens
92 to return no rows at all, the null value will be returned.
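<para>
As a minimal sketch of these rules, assume a hypothetical table
<literal>demo_scores</> and function <literal>top_score</> (neither is
part of the tutorial files). The function returns the first row of its
final query, which <literal>ORDER BY</> makes well-defined, and yields
null once the table is empty:
<programlisting>
-- demo_scores and top_score are illustrative names only
CREATE TABLE demo_scores (player text, score int);
INSERT INTO demo_scores VALUES ('ann', 10), ('bob', 30);

CREATE FUNCTION top_score() RETURNS int AS $$
    SELECT score FROM demo_scores ORDER BY score DESC;
$$ LANGUAGE SQL;

SELECT top_score();          -- first row of the ordered result: 30

DELETE FROM demo_scores;
SELECT top_score() IS NULL;  -- the final query returns no rows, so this is true
</programlisting>
</para>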
96 Alternatively, an SQL function can be declared to return a set (that is,
97 multiple rows) by specifying the function's return type as <literal>SETOF
98 <replaceable>sometype</></literal>, or equivalently by declaring it as
99 <literal>RETURNS TABLE(<replaceable>columns</>)</literal>. In this case
all rows of the last query's result are returned. Further details appear below.
105 The body of an SQL function must be a list of SQL
106 statements separated by semicolons. A semicolon after the last
107 statement is optional. Unless the function is declared to return
108 <type>void</>, the last statement must be a <command>SELECT</>,
109 or an <command>INSERT</>, <command>UPDATE</>, or <command>DELETE</>
110 that has a <literal>RETURNING</> clause.
114 Any collection of commands in the <acronym>SQL</acronym>
115 language can be packaged together and defined as a function.
116 Besides <command>SELECT</command> queries, the commands can include data
117 modification queries (<command>INSERT</command>,
118 <command>UPDATE</command>, and <command>DELETE</command>), as well as
119 other SQL commands. (You cannot use transaction control commands, e.g.
120 <command>COMMIT</>, <command>SAVEPOINT</>, and some utility
121 commands, e.g. <literal>VACUUM</>, in <acronym>SQL</acronym> functions.)
122 However, the final command
123 must be a <command>SELECT</command> or have a <literal>RETURNING</>
124 clause that returns whatever is
125 specified as the function's return type. Alternatively, if you
126 want to define a SQL function that performs actions but has no
127 useful value to return, you can define it as returning <type>void</>.
128 For example, this function removes rows with negative salaries from
129 the <literal>emp</> table:
132 CREATE FUNCTION clean_emp() RETURNS void AS '
148 The entire body of a SQL function is parsed before any of it is
149 executed. While a SQL function can contain commands that alter
150 the system catalogs (e.g., <command>CREATE TABLE</>), the effects
151 of such commands will not be visible during parse analysis of
152 later commands in the function. Thus, for example,
153 <literal>CREATE TABLE foo (...); INSERT INTO foo VALUES(...);</literal>
154 will not work as desired if packaged up into a single SQL function,
155 since <structname>foo</> won't exist yet when the <command>INSERT</>
command is parsed. It's recommended to use <application>PL/pgSQL</>
157 instead of a SQL function in this type of situation.
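<para>
A sketch of the <application>PL/pgSQL</> alternative, using the
hypothetical names <literal>make_and_fill</> and <literal>foo_demo</>.
It relies on <application>PL/pgSQL</> analyzing each statement only when
that statement is first executed, so the <command>INSERT</> is parsed
after the table already exists:
<programlisting>
-- make_and_fill and foo_demo are illustrative names only
CREATE FUNCTION make_and_fill() RETURNS void AS $$
BEGIN
    CREATE TABLE foo_demo (id int);
    INSERT INTO foo_demo VALUES (1);   -- parsed only when this line is reached
END;
$$ LANGUAGE plpgsql;

SELECT make_and_fill();
SELECT id FROM foo_demo;
</programlisting>
</para>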
162 The syntax of the <command>CREATE FUNCTION</command> command requires
163 the function body to be written as a string constant. It is usually
164 most convenient to use dollar quoting (see <xref
165 linkend="sql-syntax-dollar-quoting">) for the string constant.
166 If you choose to use regular single-quoted string constant syntax,
167 you must double single quote marks (<literal>'</>) and backslashes
168 (<literal>\</>) (assuming escape string syntax) in the body of
169 the function (see <xref linkend="sql-syntax-strings">).
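<para>
To illustrate the difference, here are two hypothetical functions with
the same body, written once with dollar quoting and once as a regular
single-quoted string. In the second form every quote mark inside the
body has to be doubled:
<programlisting>
-- greet_dollar and greet_quoted are illustrative names only
CREATE FUNCTION greet_dollar() RETURNS text AS $$
    SELECT 'it''s fine'::text;
$$ LANGUAGE SQL;

CREATE FUNCTION greet_quoted() RETURNS text AS '
    SELECT ''it''''s fine''::text;
' LANGUAGE SQL;

-- Both functions return the string: it's fine
SELECT greet_dollar(), greet_quoted();
</programlisting>
</para>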
172 <sect2 id="xfunc-sql-function-arguments">
173 <title>Arguments for <acronym>SQL</acronym> Functions</title>
176 <primary>function</primary>
177 <secondary>named argument</secondary>
181 Arguments of a SQL function can be referenced in the function
body using either names or numbers. Examples of both methods appear below.
187 To use a name, declare the function argument as having a name, and
188 then just write that name in the function body. If the argument name
189 is the same as any column name in the current SQL command within the
190 function, the column name will take precedence. To override this,
191 qualify the argument name with the name of the function itself, that is
192 <literal><replaceable>function_name</>.<replaceable>argument_name</></literal>.
193 (If this would conflict with a qualified column name, again the column
194 name wins. You can avoid the ambiguity by choosing a different alias for
195 the table within the SQL command.)
199 In the older numeric approach, arguments are referenced using the syntax
200 <literal>$<replaceable>n</></>: <literal>$1</> refers to the first input
201 argument, <literal>$2</> to the second, and so on. This will work
202 whether or not the particular argument was declared with a name.
206 If an argument is of a composite type, then the dot notation,
207 e.g., <literal><replaceable>argname</>.<replaceable>fieldname</></literal> or
208 <literal>$1.<replaceable>fieldname</></literal>, can be used to access attributes of the
209 argument. Again, you might need to qualify the argument's name with the
210 function name to make the form with an argument name unambiguous.
214 SQL function arguments can only be used as data values,
215 not as identifiers. Thus for example this is reasonable:
217 INSERT INTO mytable VALUES ($1);
219 but this will not work:
221 INSERT INTO $1 VALUES (42);
227 The ability to use names to reference SQL function arguments was added
228 in <productname>PostgreSQL</productname> 9.2. Functions to be used in
229 older servers must use the <literal>$<replaceable>n</></> notation.
234 <sect2 id="xfunc-sql-base-functions">
235 <title><acronym>SQL</acronym> Functions on Base Types</title>
238 The simplest possible <acronym>SQL</acronym> function has no arguments and
239 simply returns a base type, such as <type>integer</type>:
242 CREATE FUNCTION one() RETURNS integer AS $$
246 -- Alternative syntax for string literal:
247 CREATE FUNCTION one() RETURNS integer AS '
260 Notice that we defined a column alias within the function body for the result of the function
261 (with the name <literal>result</>), but this column alias is not visible
262 outside the function. Hence, the result is labeled <literal>one</>
263 instead of <literal>result</>.
267 It is almost as easy to define <acronym>SQL</acronym> functions
268 that take base types as arguments:
CREATE FUNCTION add_em(x integer, y integer) RETURNS integer AS $$
    SELECT x + y;
$$ LANGUAGE SQL;

SELECT add_em(1, 2) AS answer;
Alternatively, we could dispense with names for the arguments and use numbers:

CREATE FUNCTION add_em(integer, integer) RETURNS integer AS $$
    SELECT $1 + $2;
$$ LANGUAGE SQL;

SELECT add_em(1, 2) AS answer;
Here is a more useful function, which might be used to debit a bank account:

CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS integer AS $$
    UPDATE bank
        SET balance = balance - debit
        WHERE accountno = tf1.accountno;
    SELECT 1;
$$ LANGUAGE SQL;
A user could execute this function to debit account 17 by $100.00 as follows:
317 SELECT tf1(17, 100.0);
322 In this example, we chose the name <literal>accountno</> for the first
323 argument, but this is the same as the name of a column in the
324 <literal>bank</> table. Within the <command>UPDATE</> command,
325 <literal>accountno</> refers to the column <literal>bank.accountno</>,
326 so <literal>tf1.accountno</> must be used to refer to the argument.
327 We could of course avoid this by using a different name for the argument.
In practice one would probably like a more useful result from the
function than a constant 1, so a more likely definition is:

CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS integer AS $$
    UPDATE bank
        SET balance = balance - debit
        WHERE accountno = tf1.accountno;
    SELECT balance FROM bank WHERE accountno = tf1.accountno;
$$ LANGUAGE SQL;
344 which adjusts the balance and returns the new balance.
345 The same thing could be done in one command using <literal>RETURNING</>:
CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS integer AS $$
    UPDATE bank
        SET balance = balance - debit
        WHERE accountno = tf1.accountno
    RETURNING balance;
$$ LANGUAGE SQL;
358 <sect2 id="xfunc-sql-composite-functions">
359 <title><acronym>SQL</acronym> Functions on Composite Types</title>
362 When writing functions with arguments of composite types, we must not
363 only specify which argument we want but also the desired attribute
364 (field) of that argument. For example, suppose that
365 <type>emp</type> is a table containing employee data, and therefore
366 also the name of the composite type of each row of the table. Here
367 is a function <function>double_salary</function> that computes what someone's
368 salary would be if it were doubled:
378 INSERT INTO emp VALUES ('Bill', 4200, 45, '(2,1)');
CREATE FUNCTION double_salary(emp) RETURNS numeric AS $$
    SELECT $1.salary * 2 AS salary;
$$ LANGUAGE SQL;

SELECT name, double_salary(emp.*) AS dream
    FROM emp
    WHERE emp.cubicle ~= point '(2,1)';
395 Notice the use of the syntax <literal>$1.salary</literal>
396 to select one field of the argument row value. Also notice
397 how the calling <command>SELECT</> command
398 uses <replaceable>table_name</><literal>.*</> to select
399 the entire current row of a table as a composite value. The table
row can alternatively be referenced using just the table name, like this:

SELECT name, double_salary(emp) AS dream
    FROM emp
    WHERE emp.cubicle ~= point '(2,1)';
407 but this usage is deprecated since it's easy to get confused.
408 (See <xref linkend="rowtypes-usage"> for details about these
409 two notations for the composite value of a table row.)
413 Sometimes it is handy to construct a composite argument value
414 on-the-fly. This can be done with the <literal>ROW</> construct.
415 For example, we could adjust the data being passed to the function:
SELECT name, double_salary(ROW(name, salary*1.1, age, cubicle)) AS dream
    FROM emp;
423 It is also possible to build a function that returns a composite type.
424 This is an example of a function
425 that returns a single <type>emp</type> row:
CREATE FUNCTION new_emp() RETURNS emp AS $$
    SELECT text 'None' AS name,
        1000.0 AS salary,
        25 AS age,
        point '(2,2)' AS cubicle;
$$ LANGUAGE SQL;
436 In this example we have specified each of the attributes
437 with a constant value, but any computation
438 could have been substituted for these constants.
442 Note two important things about defining the function:
447 The select list order in the query must be exactly the same as
448 that in which the columns appear in the table associated
449 with the composite type. (Naming the columns, as we did above,
450 is irrelevant to the system.)
455 You must typecast the expressions to match the
456 definition of the composite type, or you will get errors like this:
459 ERROR: function declared to return emp returns varchar instead of text at column 1
468 A different way to define the same function is:
CREATE FUNCTION new_emp() RETURNS emp AS $$
    SELECT ROW('None', 1000.0, 25, '(2,2)')::emp;
$$ LANGUAGE SQL;
476 Here we wrote a <command>SELECT</> that returns just a single
477 column of the correct composite type. This isn't really better
478 in this situation, but it is a handy alternative in some cases
479 — for example, if we need to compute the result by calling
480 another function that returns the desired composite value.
We could call this function directly either by using it in a value expression:
491 --------------------------
492 (None,1000.0,25,"(2,2)")
495 or by calling it as a table function:
498 SELECT * FROM new_emp();
500 name | salary | age | cubicle
501 ------+--------+-----+---------
502 None | 1000.0 | 25 | (2,2)
505 The second way is described more fully in <xref
506 linkend="xfunc-sql-table-functions">.
510 When you use a function that returns a composite type,
511 you might want only one field (attribute) from its result.
512 You can do that with syntax like this:
515 SELECT (new_emp()).name;
522 The extra parentheses are needed to keep the parser from getting
523 confused. If you try to do it without them, you get something like this:
526 SELECT new_emp().name;
527 ERROR: syntax error at or near "."
528 LINE 1: SELECT new_emp().name;
534 Another option is to use functional notation for extracting an attribute:
537 SELECT name(new_emp());
544 As explained in <xref linkend="rowtypes-usage">, the field notation and
545 functional notation are equivalent.
549 Another way to use a function returning a composite type is to pass the
550 result to another function that accepts the correct row type as input:
CREATE FUNCTION getname(emp) RETURNS text AS $$
    SELECT $1.name;
$$ LANGUAGE SQL;

SELECT getname(new_emp());
566 <sect2 id="xfunc-output-parameters">
567 <title><acronym>SQL</> Functions with Output Parameters</title>
570 <primary>function</primary>
571 <secondary>output parameter</secondary>
575 An alternative way of describing a function's results is to define it
576 with <firstterm>output parameters</>, as in this example:
579 CREATE FUNCTION add_em (IN x int, IN y int, OUT sum int)
590 This is not essentially different from the version of <literal>add_em</>
591 shown in <xref linkend="xfunc-sql-base-functions">. The real value of
592 output parameters is that they provide a convenient way of defining
593 functions that return several columns. For example,
CREATE FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int)
AS 'SELECT x + y, x * y'
LANGUAGE SQL;

SELECT * FROM sum_n_product(11,42);
607 What has essentially happened here is that we have created an anonymous
608 composite type for the result of the function. The above example has
609 the same end result as
CREATE TYPE sum_prod AS (sum int, product int);

CREATE FUNCTION sum_n_product (int, int) RETURNS sum_prod
AS 'SELECT $1 + $2, $1 * $2'
LANGUAGE SQL;
619 but not having to bother with the separate composite type definition
620 is often handy. Notice that the names attached to the output parameters
621 are not just decoration, but determine the column names of the anonymous
622 composite type. (If you omit a name for an output parameter, the
623 system will choose a name on its own.)
627 Notice that output parameters are not included in the calling argument
628 list when invoking such a function from SQL. This is because
629 <productname>PostgreSQL</productname> considers only the input
630 parameters to define the function's calling signature. That means
631 also that only the input parameters matter when referencing the function
for purposes such as dropping it. We could drop the above function with either of
636 DROP FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int);
637 DROP FUNCTION sum_n_product (int, int);
642 Parameters can be marked as <literal>IN</> (the default),
643 <literal>OUT</>, <literal>INOUT</>, or <literal>VARIADIC</>.
An <literal>INOUT</>
parameter serves as both an input parameter (part of the calling
646 argument list) and an output parameter (part of the result record type).
647 <literal>VARIADIC</> parameters are input parameters, but are treated
648 specially as described next.
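<para>
As a small sketch of an <literal>INOUT</> parameter, consider the
hypothetical function <literal>double_it</>. Its single parameter is
supplied by the caller and also defines the result, so no
<literal>RETURNS</> clause is written:
<programlisting>
-- double_it is an illustrative name only
CREATE FUNCTION double_it (INOUT val int)
AS 'SELECT val * 2'
LANGUAGE SQL;

SELECT double_it(21);   -- val is passed in as 21 and comes back as 42
</programlisting>
</para>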
652 <sect2 id="xfunc-sql-variadic-functions">
653 <title><acronym>SQL</> Functions with Variable Numbers of Arguments</title>
656 <primary>function</primary>
657 <secondary>variadic</secondary>
661 <primary>variadic function</primary>
665 <acronym>SQL</acronym> functions can be declared to accept
666 variable numbers of arguments, so long as all the <quote>optional</>
667 arguments are of the same data type. The optional arguments will be
668 passed to the function as an array. The function is declared by
669 marking the last parameter as <literal>VARIADIC</>; this parameter
670 must be declared as being of an array type. For example:
CREATE FUNCTION mleast(VARIADIC arr numeric[]) RETURNS numeric AS $$
    SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
$$ LANGUAGE SQL;

SELECT mleast(10, -1, 5, 4.4);
684 Effectively, all the actual arguments at or beyond the
685 <literal>VARIADIC</> position are gathered up into a one-dimensional
686 array, as if you had written
689 SELECT mleast(ARRAY[10, -1, 5, 4.4]); -- doesn't work
692 You can't actually write that, though — or at least, it will
693 not match this function definition. A parameter marked
694 <literal>VARIADIC</> matches one or more occurrences of its element
695 type, not of its own type.
699 Sometimes it is useful to be able to pass an already-constructed array
700 to a variadic function; this is particularly handy when one variadic
701 function wants to pass on its array parameter to another one. You can
702 do that by specifying <literal>VARIADIC</> in the call:
705 SELECT mleast(VARIADIC ARRAY[10, -1, 5, 4.4]);
708 This prevents expansion of the function's variadic parameter into its
709 element type, thereby allowing the array argument value to match
710 normally. <literal>VARIADIC</> can only be attached to the last
711 actual argument of a function call.
715 Specifying <literal>VARIADIC</> in the call is also the only way to
716 pass an empty array to a variadic function, for example:
719 SELECT mleast(VARIADIC ARRAY[]::numeric[]);
722 Simply writing <literal>SELECT mleast()</> does not work because a
723 variadic parameter must match at least one actual argument.
724 (You could define a second function also named <literal>mleast</>,
725 with no parameters, if you wanted to allow such calls.)
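<para>
A sketch of such a companion function follows. Returning null for the
empty case is an assumption made here for illustration; any other
convention would work equally well:
<programlisting>
-- Zero-argument overload; the null result is an illustrative choice
CREATE FUNCTION mleast() RETURNS numeric AS $$
    SELECT NULL::numeric;
$$ LANGUAGE SQL;

SELECT mleast();   -- resolves to the zero-argument function
</programlisting>
</para>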
729 The array element parameters generated from a variadic parameter are
730 treated as not having any names of their own. This means it is not
731 possible to call a variadic function using named arguments (<xref
732 linkend="sql-syntax-calling-funcs">), except when you specify
733 <literal>VARIADIC</>. For example, this will work:
736 SELECT mleast(VARIADIC arr => ARRAY[10, -1, 5, 4.4]);
but not these:
742 SELECT mleast(arr => 10);
743 SELECT mleast(arr => ARRAY[10, -1, 5, 4.4]);
748 <sect2 id="xfunc-sql-parameter-defaults">
749 <title><acronym>SQL</> Functions with Default Values for Arguments</title>
752 <primary>function</primary>
753 <secondary>default values for arguments</secondary>
757 Functions can be declared with default values for some or all input
arguments. The default values are inserted whenever the function is
called with fewer actual arguments than it has parameters. Since arguments
760 can only be omitted from the end of the actual argument list, all
761 parameters after a parameter with a default value have to have
762 default values as well. (Although the use of named argument notation
763 could allow this restriction to be relaxed, it's still enforced so that
764 positional argument notation works sensibly.)
770 CREATE FUNCTION foo(a int, b int DEFAULT 2, c int DEFAULT 3)
777 SELECT foo(10, 20, 30);
795 SELECT foo(); -- fails since there is no default for the first argument
796 ERROR: function foo() does not exist
798 The <literal>=</literal> sign can also be used in place of the
799 key word <literal>DEFAULT</literal>.
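<para>
The following sketch uses the <literal>=</literal> spelling with a
hypothetical function <literal>add_three</>, and shows how trailing
arguments can be omitted one at a time:
<programlisting>
-- add_three is an illustrative name only
CREATE FUNCTION add_three(a int, b int = 2, c int = 3) RETURNS int AS $$
    SELECT a + b + c;
$$ LANGUAGE SQL;

SELECT add_three(10, 20, 30);   -- no defaults used: 60
SELECT add_three(10, 20);       -- c defaults to 3: 33
SELECT add_three(10);           -- b and c default to 2 and 3: 15
</programlisting>
</para>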
803 <sect2 id="xfunc-sql-table-functions">
804 <title><acronym>SQL</acronym> Functions as Table Sources</title>
807 All SQL functions can be used in the <literal>FROM</> clause of a query,
808 but it is particularly useful for functions returning composite types.
809 If the function is defined to return a base type, the table function
810 produces a one-column table. If the function is defined to return
811 a composite type, the table function produces a column for each attribute
812 of the composite type.
819 CREATE TABLE foo (fooid int, foosubid int, fooname text);
820 INSERT INTO foo VALUES (1, 1, 'Joe');
821 INSERT INTO foo VALUES (1, 2, 'Ed');
822 INSERT INTO foo VALUES (2, 1, 'Mary');
CREATE FUNCTION getfoo(int) RETURNS foo AS $$
    SELECT * FROM foo WHERE fooid = $1;
$$ LANGUAGE SQL;

SELECT *, upper(fooname) FROM getfoo(1) AS t1;
830 fooid | foosubid | fooname | upper
831 -------+----------+---------+-------
836 As the example shows, we can work with the columns of the function's
837 result just the same as if they were columns of a regular table.
841 Note that we only got one row out of the function. This is because
842 we did not use <literal>SETOF</>. That is described in the next section.
846 <sect2 id="xfunc-sql-functions-returning-set">
847 <title><acronym>SQL</acronym> Functions Returning Sets</title>
850 <primary>function</primary>
851 <secondary>with SETOF</secondary>
855 When an SQL function is declared as returning <literal>SETOF
856 <replaceable>sometype</></literal>, the function's final
857 query is executed to completion, and each row it
858 outputs is returned as an element of the result set.
862 This feature is normally used when calling the function in the <literal>FROM</>
863 clause. In this case each row returned by the function becomes
864 a row of the table seen by the query. For example, assume that
865 table <literal>foo</> has the same contents as above, and we say:
CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS $$
    SELECT * FROM foo WHERE fooid = $1;
$$ LANGUAGE SQL;

SELECT * FROM getfoo(1) AS t1;
877 fooid | foosubid | fooname
878 -------+----------+---------
886 It is also possible to return multiple rows with the columns defined by
887 output parameters, like this:
890 CREATE TABLE tab (y int, z int);
891 INSERT INTO tab VALUES (1, 2), (3, 4), (5, 6), (7, 8);
CREATE FUNCTION sum_n_product_with_tab (x int, OUT sum int, OUT product int)
RETURNS SETOF record
AS $$
    SELECT $1 + tab.y, $1 * tab.y FROM tab;
$$ LANGUAGE SQL;

SELECT * FROM sum_n_product_with_tab(10);
909 The key point here is that you must write <literal>RETURNS SETOF record</>
910 to indicate that the function returns multiple rows instead of just one.
911 If there is only one output parameter, write that parameter's type
912 instead of <type>record</>.
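<para>
For the single-output-parameter case, a minimal sketch might look like
this. The function name <literal>tab_doubled</> is hypothetical, and the
table <literal>tab</> is the one created above:
<programlisting>
-- One OUT parameter, so the declared result is SETOF int, not SETOF record
CREATE FUNCTION tab_doubled (OUT doubled int)
RETURNS SETOF int
AS $$
    SELECT tab.y * 2 FROM tab;
$$ LANGUAGE SQL;

SELECT * FROM tab_doubled();   -- one row per row of tab
</programlisting>
</para>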
916 It is frequently useful to construct a query's result by invoking a
917 set-returning function multiple times, with the parameters for each
918 invocation coming from successive rows of a table or subquery. The
919 preferred way to do this is to use the <literal>LATERAL</> key word,
920 which is described in <xref linkend="queries-lateral">.
921 Here is an example using a set-returning function to enumerate
922 elements of a tree structure:
936 CREATE FUNCTION listchildren(text) RETURNS SETOF text AS $$
937 SELECT name FROM nodes WHERE parent = $1
938 $$ LANGUAGE SQL STABLE;
940 SELECT * FROM listchildren('Top');
948 SELECT name, child FROM nodes, LATERAL listchildren(name) AS child;
959 This example does not do anything that we couldn't have done with a
960 simple join, but in more complex calculations the option to put
961 some of the work into a function can be quite convenient.
965 Currently, functions returning sets can also be called in the select list
966 of a query. For each row that the query
967 generates by itself, the function returning set is invoked, and an output
968 row is generated for each element of the function's result set. Note,
969 however, that this capability is deprecated and might be removed in future
releases. The previous example could also be done with queries like these:
974 SELECT listchildren('Top');
982 SELECT name, listchildren(name) FROM nodes;
984 --------+--------------
993 In the last <command>SELECT</command>,
994 notice that no output row appears for <literal>Child2</>, <literal>Child3</>, etc.
995 This happens because <function>listchildren</function> returns an empty set
996 for those arguments, so no result rows are generated. This is the same
997 behavior as we got from an inner join to the function result when using
998 the <literal>LATERAL</> syntax.
1003 If a function's last command is <command>INSERT</>, <command>UPDATE</>,
1004 or <command>DELETE</> with <literal>RETURNING</>, that command will
1005 always be executed to completion, even if the function is not declared
1006 with <literal>SETOF</> or the calling query does not fetch all the
1007 result rows. Any extra rows produced by the <literal>RETURNING</>
1008 clause are silently dropped, but the commanded table modifications
1009 still happen (and are all completed before returning from the function).
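<para>
The following sketch illustrates this behavior with a hypothetical table
<literal>demo_queue</> and function <literal>drain_queue</>. The function
is not declared with <literal>SETOF</>, so only one value comes back, yet
every row is deleted:
<programlisting>
-- demo_queue and drain_queue are illustrative names only
CREATE TABLE demo_queue (id int);
INSERT INTO demo_queue VALUES (1), (2), (3);

CREATE FUNCTION drain_queue() RETURNS int AS $$
    DELETE FROM demo_queue RETURNING id;
$$ LANGUAGE SQL;

SELECT drain_queue();              -- returns a single id; the others are dropped
SELECT count(*) FROM demo_queue;   -- 0, because the DELETE ran to completion
</programlisting>
</para>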
1015 The key problem with using set-returning functions in the select list,
1016 rather than the <literal>FROM</> clause, is that putting more than one
1017 set-returning function in the same select list does not behave very
1018 sensibly. (What you actually get if you do so is a number of output
1019 rows equal to the least common multiple of the numbers of rows produced
1020 by each set-returning function.) The <literal>LATERAL</> syntax
1021 produces less surprising results when calling multiple set-returning
1022 functions, and should usually be used instead.
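<para>
As a sketch of the recommended form, two calls to the built-in
<function>generate_series</function> function placed in the
<literal>FROM</> clause combine like any other pair of table sources.
The aliases <literal>a</> and <literal>b</> are chosen here only for
illustration:
<programlisting>
-- Every combination of a and b is produced: 2 * 4 = 8 rows
SELECT a, b
    FROM generate_series(1, 2) AS a,
         LATERAL generate_series(1, 4) AS b;
</programlisting>
</para>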
1027 <sect2 id="xfunc-sql-functions-returning-table">
1028 <title><acronym>SQL</acronym> Functions Returning <literal>TABLE</></title>
1031 <primary>function</primary>
1032 <secondary>RETURNS TABLE</secondary>
1036 There is another way to declare a function as returning a set,
1037 which is to use the syntax
1038 <literal>RETURNS TABLE(<replaceable>columns</>)</literal>.
1039 This is equivalent to using one or more <literal>OUT</> parameters plus
1040 marking the function as returning <literal>SETOF record</> (or
1041 <literal>SETOF</> a single output parameter's type, as appropriate).
1042 This notation is specified in recent versions of the SQL standard, and
1043 thus may be more portable than using <literal>SETOF</>.
For example, the preceding sum-and-product example could also be done this way:

CREATE FUNCTION sum_n_product_with_tab (x int)
RETURNS TABLE(sum int, product int) AS $$
    SELECT $1 + tab.y, $1 * tab.y FROM tab;
$$ LANGUAGE SQL;
1057 It is not allowed to use explicit <literal>OUT</> or <literal>INOUT</>
1058 parameters with the <literal>RETURNS TABLE</> notation — you must
1059 put all the output columns in the <literal>TABLE</> list.
1064 <title>Polymorphic <acronym>SQL</acronym> Functions</title>
1067 <acronym>SQL</acronym> functions can be declared to accept and
1068 return the polymorphic types <type>anyelement</type>,
1069 <type>anyarray</type>, <type>anynonarray</type>,
1070 <type>anyenum</type>, and <type>anyrange</type>. See <xref
1071 linkend="extend-types-polymorphic"> for a more detailed
1072 explanation of polymorphic functions. Here is a polymorphic
1073 function <function>make_array</function> that builds up an array
1074 from two arbitrary data type elements:
CREATE FUNCTION make_array(anyelement, anyelement) RETURNS anyarray AS $$
    SELECT ARRAY[$1, $2];
$$ LANGUAGE SQL;

SELECT make_array(1, 2) AS intarray, make_array('a'::text, 'b') AS textarray;
1081 intarray | textarray
1082 ----------+-----------
1089 Notice the use of the typecast <literal>'a'::text</literal>
1090 to specify that the argument is of type <type>text</type>. This is
1091 required if the argument is just a string literal, since otherwise
1092 it would be treated as type
<type>unknown</type>, and array of <type>unknown</type> is not a valid type.
1095 Without the typecast, you will get errors like this:
1098 ERROR: could not determine polymorphic type because input has type "unknown"
1104 It is permitted to have polymorphic arguments with a fixed
1105 return type, but the converse is not. For example:
CREATE FUNCTION is_greater(anyelement, anyelement) RETURNS boolean AS $$
    SELECT $1 > $2;
$$ LANGUAGE SQL;

SELECT is_greater(1, 2);
1117 CREATE FUNCTION invalid_func() RETURNS anyelement AS $$
1120 ERROR: cannot determine result data type
1121 DETAIL: A function returning a polymorphic type must have at least one polymorphic argument.
1126 Polymorphism can be used with functions that have output arguments.
1129 CREATE FUNCTION dup (f1 anyelement, OUT f2 anyelement, OUT f3 anyarray)
1130 AS 'select $1, array[$1,$1]' LANGUAGE SQL;
1132 SELECT * FROM dup(22);
1141 Polymorphism can also be used with variadic functions.
CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
    SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
$$ LANGUAGE SQL;

SELECT anyleast(10, -1, 5, 4);
1154 SELECT anyleast('abc'::text, 'def');
CREATE FUNCTION concat_values(text, VARIADIC anyarray) RETURNS text AS $$
    SELECT array_to_string($2, $1);
$$ LANGUAGE SQL;

SELECT concat_values('|', 1, 4, 2);
1174 <title><acronym>SQL</acronym> Functions with Collations</title>
1177 <primary>collation</>
1178 <secondary>in SQL functions</>
1182 When a SQL function has one or more parameters of collatable data types,
1183 a collation is identified for each function call depending on the
1184 collations assigned to the actual arguments, as described in <xref
1185 linkend="collation">. If a collation is successfully identified
1186 (i.e., there are no conflicts of implicit collations among the arguments)
1187 then all the collatable parameters are treated as having that collation
1188 implicitly. This will affect the behavior of collation-sensitive
1189 operations within the function. For example, using the
1190 <function>anyleast</> function described above, the result of
1192 SELECT anyleast('abc'::text, 'ABC');
1194 will depend on the database's default collation. In <literal>C</> locale
1195 the result will be <literal>ABC</>, but in many other locales it will
1196 be <literal>abc</>. The collation to use can be forced by adding
1197 a <literal>COLLATE</> clause to any of the arguments, for example
1199 SELECT anyleast('abc'::text, 'ABC' COLLATE "C");
1201 Alternatively, if you wish a function to operate with a particular
1202 collation regardless of what it is called with, insert
1203 <literal>COLLATE</> clauses as needed in the function definition.
1204 This version of <function>anyleast</> would always use <literal>en_US</>
1205 locale to compare strings:
CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
    SELECT min($1[i] COLLATE "en_US") FROM generate_subscripts($1, 1) g(i);
$$ LANGUAGE SQL;

But note that this will throw an error if applied to a non-collatable data type.
1216 If no common collation can be identified among the actual arguments,
1217 then a SQL function treats its parameters as having their data types'
1218 default collation (which is usually the database's default collation,
1219 but could be different for parameters of domain types).
1223 The behavior of collatable parameters can be thought of as a limited
1224 form of polymorphism, applicable only to textual data types.
1229 <sect1 id="xfunc-overload">
1230 <title>Function Overloading</title>
1232 <indexterm zone="xfunc-overload">
1233 <primary>overloading</primary>
1234 <secondary>functions</secondary>
1238 More than one function can be defined with the same SQL name, so long
1239 as the arguments they take are different. In other words,
1240 function names can be <firstterm>overloaded</firstterm>. When a
1241 query is executed, the server will determine which function to
1242 call from the data types and the number of the provided arguments.
1243 Overloading can also be used to simulate functions with a variable
1244 number of arguments, up to a finite maximum number.
1248 When creating a family of overloaded functions, one should be
1249 careful not to create ambiguities. For instance, given the functions:
1252 CREATE FUNCTION test(int, real) RETURNS ...
1253 CREATE FUNCTION test(smallint, double precision) RETURNS ...
1255 it is not immediately clear which function would be called with
1256 some trivial input like <literal>test(1, 1.5)</literal>. The
1257 currently implemented resolution rules are described in
1258 <xref linkend="typeconv">, but it is unwise to design a system that subtly
1259 relies on this behavior.
1263 A function that takes a single argument of a composite type should
1264 generally not have the same name as any attribute (field) of that type.
1265 Recall that <literal><replaceable>attribute</>(<replaceable>table</>)</literal>
1266 is considered equivalent
1267 to <literal><replaceable>table</>.<replaceable>attribute</></literal>.
1268 In the case that there is an
1269 ambiguity between a function on a composite type and an attribute of
1270 the composite type, the attribute will always be used. It is possible
1271 to override that choice by schema-qualifying the function name
1272 (that is, <literal><replaceable>schema</>.<replaceable>func</>(<replaceable>table</>)
1273 </literal>) but it's better to
1274 avoid the problem by not choosing conflicting names.
1278 Another possible conflict is between variadic and non-variadic functions.
1279 For instance, it is possible to create both <literal>foo(numeric)</> and
1280 <literal>foo(VARIADIC numeric[])</>. In this case it is unclear which one
1281 should be matched to a call providing a single numeric argument, such as
1282 <literal>foo(10.1)</>. The rule is that the function appearing
1283 earlier in the search path is used, or if the two functions are in the
1284 same schema, the non-variadic one is preferred.
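<para>
 The preference rule for a variadic and a non-variadic function that both
 match a call can be sketched as a small comparison. This is only an
 illustration of the rule as stated above, not the server's resolution code;
 the <type>Candidate</type> structure, its fields, and
 <function>prefer_candidate</function> are hypothetical names invented for
 this sketch.
</para>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one matching function; not a server structure. */
typedef struct
{
    const char *signature;      /* e.g. "foo(numeric)" */
    int         path_position;  /* 0 = first schema in the search path */
    bool        is_variadic;    /* true if declared with VARIADIC */
} Candidate;

/* Apply the stated rule to two candidates that both match the call. */
const Candidate *
prefer_candidate(const Candidate *a, const Candidate *b)
{
    assert(a != NULL && b != NULL);

    /* The function whose schema appears earlier in the search path wins. */
    if (a->path_position != b->path_position)
        return (a->path_position < b->path_position) ? a : b;

    /* Same schema: the non-variadic function is preferred. */
    if (a->is_variadic != b->is_variadic)
        return a->is_variadic ? b : a;

    /* The rule expresses no preference here; keep the first candidate. */
    return a;
}
```

<para>
 With both functions in the same schema, a call such as
 <literal>foo(10.1)</literal> selects the non-variadic candidate in this
 sketch; moving the variadic function to an earlier schema reverses the
 choice.
</para>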
1288 When overloading C-language functions, there is an additional
1289 constraint: The C name of each function in the family of
1290 overloaded functions must be different from the C names of all
1291 other functions, either internal or dynamically loaded. If this
1292 rule is violated, the behavior is not portable. You might get a
1293 run-time linker error, or one of the functions will get called
1294 (usually the internal one). The alternative form of the
1295 <literal>AS</> clause for the SQL <command>CREATE
1296 FUNCTION</command> command decouples the SQL function name from
1297 the function name in the C source code. For instance:
1299 CREATE FUNCTION test(int) RETURNS int
1300 AS '<replaceable>filename</>', 'test_1arg'
1302 CREATE FUNCTION test(int, int) RETURNS int
1303 AS '<replaceable>filename</>', 'test_2arg'
1306 The names of the C functions here reflect one of many possible conventions.
1310 <sect1 id="xfunc-volatility">
1311 <title>Function Volatility Categories</title>
1313 <indexterm zone="xfunc-volatility">
1314 <primary>volatility</primary>
1315 <secondary>functions</secondary>
1317 <indexterm zone="xfunc-volatility">
1318 <primary>VOLATILE</primary>
1320 <indexterm zone="xfunc-volatility">
1321 <primary>STABLE</primary>
1323 <indexterm zone="xfunc-volatility">
1324 <primary>IMMUTABLE</primary>
1328 Every function has a <firstterm>volatility</> classification, with
1329 the possibilities being <literal>VOLATILE</>, <literal>STABLE</>, or
1330 <literal>IMMUTABLE</>. <literal>VOLATILE</> is the default if the
1331 <xref linkend="sql-createfunction">
1332 command does not specify a category. The volatility category is a
1333 promise to the optimizer about the behavior of the function:
1338 A <literal>VOLATILE</> function can do anything, including modifying
1339 the database. It can return different results on successive calls with
1340 the same arguments. The optimizer makes no assumptions about the
1341 behavior of such functions. A query using a volatile function will
1342 re-evaluate the function at every row where its value is needed.
1347 A <literal>STABLE</> function cannot modify the database and is
1348 guaranteed to return the same results given the same arguments
1349 for all rows within a single statement. This category allows the
1350 optimizer to optimize multiple calls of the function to a single
1351 call. In particular, it is safe to use an expression containing
1352 such a function in an index scan condition. (Since an index scan
1353 will evaluate the comparison value only once, not once at each
1354 row, it is not valid to use a <literal>VOLATILE</> function in an
1355 index scan condition.)
1360 An <literal>IMMUTABLE</> function cannot modify the database and is
1361 guaranteed to return the same results given the same arguments forever.
1362 This category allows the optimizer to pre-evaluate the function when
1363 a query calls it with constant arguments. For example, a query like
1364 <literal>SELECT ... WHERE x = 2 + 2</> can be simplified on sight to
1365 <literal>SELECT ... WHERE x = 4</>, because the function underlying
1366 the integer addition operator is marked <literal>IMMUTABLE</>.
1373 For best optimization results, you should label your functions with the
1374 strictest volatility category that is valid for them.
1378 Any function with side-effects <emphasis>must</> be labeled
1379 <literal>VOLATILE</>, so that calls to it cannot be optimized away.
1380 Even a function with no side-effects needs to be labeled
1381 <literal>VOLATILE</> if its value can change within a single query;
1382 some examples are <literal>random()</>, <literal>currval()</>, and
1383 <literal>timeofday()</>.
1387 Another important example is that the <function>current_timestamp</>
1388 family of functions qualifies as <literal>STABLE</>, since their values do
1389 not change within a transaction.
1393 There is relatively little difference between <literal>STABLE</> and
1394 <literal>IMMUTABLE</> categories when considering simple interactive
1395 queries that are planned and immediately executed: it doesn't matter
1396 a lot whether a function is executed once during planning or once during
1397 query execution startup. But there is a big difference if the plan is
1398 saved and reused later. Labeling a function <literal>IMMUTABLE</> when
1399 it really isn't might allow it to be prematurely folded to a constant during
1400 planning, resulting in a stale value being re-used during subsequent uses
1401 of the plan. This is a hazard when using prepared statements or when
1402 using function languages that cache plans (such as
1403 <application>PL/pgSQL</>).
1407 For functions written in SQL or in any of the standard procedural
1408 languages, there is a second important property determined by the
1409 volatility category, namely the visibility of any data changes that have
1410 been made by the SQL command that is calling the function. A
1411 <literal>VOLATILE</> function will see such changes, a <literal>STABLE</>
1412 or <literal>IMMUTABLE</> function will not. This behavior is implemented
1413 using the snapshotting behavior of MVCC (see <xref linkend="mvcc">):
1414 <literal>STABLE</> and <literal>IMMUTABLE</> functions use a snapshot
1415 established as of the start of the calling query, whereas
1416 <literal>VOLATILE</> functions obtain a fresh snapshot at the start of
1417 each query they execute.
1422 Functions written in C can manage snapshots however they want, but it's
1423 usually a good idea to make C functions work this way too.
1428 Because of this snapshotting behavior,
1429 a function containing only <command>SELECT</> commands can safely be
1430 marked <literal>STABLE</>, even if it selects from tables that might be
1431 undergoing modifications by concurrent queries.
1432 <productname>PostgreSQL</productname> will execute all commands of a
1433 <literal>STABLE</> function using the snapshot established for the
1434 calling query, and so it will see a fixed view of the database throughout that query.
1439 The same snapshotting behavior is used for <command>SELECT</> commands
1440 within <literal>IMMUTABLE</> functions. It is generally unwise to select
1441 from database tables within an <literal>IMMUTABLE</> function at all,
1442 since the immutability will be broken if the table contents ever change.
1443 However, <productname>PostgreSQL</productname> does not enforce that you do not do that.
1448 A common error is to label a function <literal>IMMUTABLE</> when its
1449 results depend on a configuration parameter. For example, a function
1450 that manipulates timestamps might well have results that depend on the
1451 <xref linkend="guc-timezone"> setting. For safety, such functions should
1452 be labeled <literal>STABLE</> instead.
1457 <productname>PostgreSQL</productname> requires that <literal>STABLE</>
1458 and <literal>IMMUTABLE</> functions contain no SQL commands other
1459 than <command>SELECT</> to prevent data modification.
1460 (This is not a completely bulletproof test, since such functions could
1461 still call <literal>VOLATILE</> functions that modify the database.
1462 If you do that, you will find that the <literal>STABLE</> or
1463 <literal>IMMUTABLE</> function does not notice the database changes
1464 applied by the called function, since they are hidden from its snapshot.)
1469 <sect1 id="xfunc-pl">
1470 <title>Procedural Language Functions</title>
1473 <productname>PostgreSQL</productname> allows user-defined functions
1474 to be written in other languages besides SQL and C. These other
1475 languages are generically called <firstterm>procedural
1476 languages</firstterm> (<acronym>PL</>s).
1477 Procedural languages aren't built into the
1478 <productname>PostgreSQL</productname> server; they are offered
1479 by loadable modules.
1480 See <xref linkend="xplang"> and following chapters for more information.
1485 <sect1 id="xfunc-internal">
1486 <title>Internal Functions</title>
1488 <indexterm zone="xfunc-internal"><primary>function</><secondary>internal</></>
1491 Internal functions are functions written in C that have been statically
1492 linked into the <productname>PostgreSQL</productname> server.
1493 The <quote>body</quote> of the function definition
1494 specifies the C-language name of the function, which need not be the
1495 same as the name being declared for SQL use.
1496 (For reasons of backward compatibility, an empty body
1497 is accepted as meaning that the C-language function name is the
1498 same as the SQL name.)
1502 Normally, all internal functions present in the
1503 server are declared during the initialization of the database cluster
1504 (see <xref linkend="creating-cluster">),
1505 but a user could use <command>CREATE FUNCTION</command>
1506 to create additional alias names for an internal function.
1507 Internal functions are declared in <command>CREATE FUNCTION</command>
1508 with language name <literal>internal</literal>. For instance, to
1509 create an alias for the <function>sqrt</function> function:
1511 CREATE FUNCTION square_root(double precision) RETURNS double precision
1516 (Most internal functions expect to be declared <quote>strict</quote>.)
1521 Not all <quote>predefined</quote> functions are
1522 <quote>internal</quote> in the above sense. Some predefined
1523 functions are written in SQL.
1528 <sect1 id="xfunc-c">
1529 <title>C-Language Functions</title>
1531 <indexterm zone="xfunc-c">
1532 <primary>function</primary>
1533 <secondary>user-defined</secondary>
1534 <tertiary>in C</tertiary>
1538 User-defined functions can be written in C (or a language that can
1539 be made compatible with C, such as C++). Such functions are
1540 compiled into dynamically loadable objects (also called shared
1541 libraries) and are loaded by the server on demand. The dynamic
1542 loading feature is what distinguishes <quote>C language</> functions
1543 from <quote>internal</> functions — the actual coding conventions
1544 are essentially the same for both. (Hence, the standard internal
1545 function library is a rich source of coding examples for user-defined C functions.)
1550 Two different calling conventions are currently used for C functions.
1551 The newer <quote>version 1</quote> calling convention is indicated by writing
1552 a <literal>PG_FUNCTION_INFO_V1()</literal> macro call for the function,
1553 as illustrated below. Lack of such a macro indicates an old-style
1554 (<quote>version 0</quote>) function. The language name specified in <command>CREATE FUNCTION</command>
1555 is <literal>C</literal> in either case. Old-style functions are now deprecated
1556 because of portability problems and lack of functionality, but they
1557 are still supported for compatibility reasons.
1560 <sect2 id="xfunc-c-dynload">
1561 <title>Dynamic Loading</title>
1563 <indexterm zone="xfunc-c-dynload">
1564 <primary>dynamic loading</primary>
1568 The first time a user-defined function in a particular
1569 loadable object file is called in a session,
1570 the dynamic loader loads that object file into memory so that the
1571 function can be called. The <command>CREATE FUNCTION</command>
1572 for a user-defined C function must therefore specify two pieces of
1573 information for the function: the name of the loadable
1574 object file, and the C name (link symbol) of the specific function to call
1575 within that object file. If the C name is not explicitly specified then
1576 it is assumed to be the same as the SQL function name.
1580 The following algorithm is used to locate the shared object file
1581 based on the name given in the <command>CREATE FUNCTION</command> command:
1587 If the name is an absolute path, the given file is loaded.
1593 If the name starts with the string <literal>$libdir</literal>,
1594 that part is replaced by the <productname>PostgreSQL</> package library directory
1596 name, which is determined at build time.<indexterm><primary>$libdir</></>
1602 If the name does not contain a directory part, the file is
1603 searched for in the path specified by the configuration variable
1604 <xref linkend="guc-dynamic-library-path">.<indexterm><primary>dynamic_library_path</></>
1610 Otherwise (the file was not found in the path, or it contains a
1611 non-absolute directory part), the dynamic loader will try to
1612 take the name as given, which will most likely fail. (It is
1613 unreliable to depend on the current working directory.)
1618 If this sequence does not work, the platform-specific shared
1619 library file name extension (often <filename>.so</filename>) is
1620 appended to the given name and this sequence is tried again. If
1621 that fails as well, the load will fail.
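<para>
 The first decision in the lookup sequence above can be sketched as a
 stand-alone classification of the given name. This is an illustrative
 approximation, not the server's implementation: the enum, the functions
 <function>classify_library_name</function> and
 <function>expand_libdir</function>, and the assumption that
 <literal>/</literal> is the only directory separator are all inventions of
 this example. It shows only which step is tried first; the fallback to the
 name as given and the retry with the platform's file name extension are
 left out.
</para>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum
{
    LIB_ABSOLUTE,       /* absolute path: load the given file */
    LIB_LIBDIR,         /* starts with $libdir: substitute the directory */
    LIB_SEARCH_PATH,    /* no directory part: search the library path */
    LIB_AS_GIVEN        /* relative directory part: pass through unchanged */
} LibNameKind;

/* Decide which lookup step applies first (assumes Unix-style paths). */
LibNameKind
classify_library_name(const char *name)
{
    assert(name != NULL);

    if (name[0] == '/')
        return LIB_ABSOLUTE;
    if (strncmp(name, "$libdir", strlen("$libdir")) == 0)
        return LIB_LIBDIR;
    if (strchr(name, '/') == NULL)
        return LIB_SEARCH_PATH;
    return LIB_AS_GIVEN;
}

/*
 * Replace a leading "$libdir" with pkglibdir, writing the result to out.
 * Names without that prefix are copied unchanged.
 * Returns 0 on success, or -1 if the result does not fit in outlen bytes.
 */
int
expand_libdir(const char *name, const char *pkglibdir,
              char *out, size_t outlen)
{
    size_t      prefix = strlen("$libdir");
    int         n;

    assert(name != NULL && pkglibdir != NULL && out != NULL);

    if (strncmp(name, "$libdir", prefix) == 0)
        n = snprintf(out, outlen, "%s%s", pkglibdir, name + prefix);
    else
        n = snprintf(out, outlen, "%s", name);

    return (n >= 0 && (size_t) n < outlen) ? 0 : -1;
}
```

<para>
 In this sketch, <literal>funcs</literal> is classified for a search of the
 library path, <literal>subdir/funcs</literal> is passed through as given,
 and <literal>$libdir/funcs</literal> expands to a file under whatever
 directory is supplied as <literal>pkglibdir</literal>.
</para>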
1625 It is recommended to locate shared libraries either relative to
1626 <literal>$libdir</literal> or through the dynamic library path.
1627 This simplifies version upgrades if the new installation is at a
1628 different location. The actual directory that
1629 <literal>$libdir</literal> stands for can be found out with the
1630 command <literal>pg_config --pkglibdir</literal>.
1634 The user ID the <productname>PostgreSQL</productname> server runs
1635 as must be able to traverse the path to the file you intend to
1636 load. Making the file or a higher-level directory not readable
1637 and/or not executable by the <systemitem>postgres</systemitem>
1638 user is a common mistake.
1642 In any case, the file name that is given in the
1643 <command>CREATE FUNCTION</command> command is recorded literally
1644 in the system catalogs, so if the file needs to be loaded again
1645 the same procedure is applied.
1650 <productname>PostgreSQL</productname> will not compile a C function
1651 automatically. The object file must be compiled before it is referenced
1652 in a <command>CREATE
1653 FUNCTION</> command. See <xref linkend="dfunc"> for additional information.
1658 <indexterm zone="xfunc-c-dynload">
1659 <primary>magic block</primary>
1663 To ensure that a dynamically loaded object file is not loaded into an
1664 incompatible server, <productname>PostgreSQL</productname> checks that the
1665 file contains a <quote>magic block</> with the appropriate contents.
1666 This allows the server to detect obvious incompatibilities, such as code
1667 compiled for a different major version of
1668 <productname>PostgreSQL</productname>. A magic block is required as of
1669 <productname>PostgreSQL</productname> 8.2. To include a magic block,
1670 write this in one (and only one) of the module source files, after having
1671 included the header <filename>fmgr.h</>:
1674 #ifdef PG_MODULE_MAGIC
1679 The <literal>#ifdef</> test can be omitted if the code doesn't
1680 need to compile against pre-8.2 <productname>PostgreSQL</productname> releases.
1685 After it is used for the first time, a dynamically loaded object
1686 file is retained in memory. Future calls in the same session to
1687 the function(s) in that file will only incur the small overhead of
1688 a symbol table lookup. If you need to force a reload of an object
1689 file, for example after recompiling it, begin a fresh session.
1692 <indexterm zone="xfunc-c-dynload">
1693 <primary>_PG_init</primary>
1695 <indexterm zone="xfunc-c-dynload">
1696 <primary>_PG_fini</primary>
1698 <indexterm zone="xfunc-c-dynload">
1699 <primary>library initialization function</primary>
1701 <indexterm zone="xfunc-c-dynload">
1702 <primary>library finalization function</primary>
1706 Optionally, a dynamically loaded file can contain initialization and
1707 finalization functions. If the file includes a function named
1708 <function>_PG_init</>, that function will be called immediately after
1709 loading the file. The function receives no parameters and should
1710 return void. If the file includes a function named
1711 <function>_PG_fini</>, that function will be called immediately before
1712 unloading the file. Likewise, the function receives no parameters and
1713 should return void. Note that <function>_PG_fini</> will only be called
1714 during an unload of the file, not during process termination.
1715 (Presently, unloads are disabled and will never occur, but this may
1716 change in the future.)
1721 <sect2 id="xfunc-c-basetype">
1722 <title>Base Types in C-Language Functions</title>
1724 <indexterm zone="xfunc-c-basetype">
1725 <primary>data type</primary>
1726 <secondary>internal organization</secondary>
1730 To know how to write C-language functions, you need to know how
1731 <productname>PostgreSQL</productname> internally represents base
1732 data types and how they can be passed to and from functions.
1733 Internally, <productname>PostgreSQL</productname> regards a base
1734 type as a <quote>blob of memory</quote>. The user-defined
1735 functions that you define over a type in turn define the way that
1736 <productname>PostgreSQL</productname> can operate on it. That
1737 is, <productname>PostgreSQL</productname> will only store and
1738 retrieve the data from disk and use your user-defined functions
1739 to input, process, and output the data.
1743 Base types can have one of three internal formats:
1748 pass by value, fixed-length
1753 pass by reference, fixed-length
1758 pass by reference, variable-length
1765 By-value types can only be 1, 2, or 4 bytes in length
1766 (also 8 bytes, if <literal>sizeof(Datum)</literal> is 8 on your machine).
1767 You should be careful to define your types such that they will be the
1768 same size (in bytes) on all architectures. For example, the
1769 <literal>long</literal> type is dangerous because it is 4 bytes on some
1770 machines and 8 bytes on others, whereas the <type>int</type> type is 4 bytes
1771 on most Unix machines. A reasonable implementation of the
1772 <type>int4</type> type on Unix machines might be:
1775 /* 4-byte integer, passed by value */
1779 (The actual PostgreSQL C code calls this type <type>int32</type>, because
1780 it is a convention in C that <type>int<replaceable>XX</replaceable></type>
1781 means <replaceable>XX</replaceable> <emphasis>bits</emphasis>. Note
1782 therefore also that the C type <type>int8</type> is 1 byte in size. The
1783 SQL type <type>int8</type> is called <type>int64</type> in C. See also
1784 <xref linkend="xfunc-c-type-table">.)
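<para>
 The size rule for by-value types can be written down as a short check. The
 sketch below is a stand-alone illustration that uses the fixed-width types
 from <filename>stdint.h</filename> in place of the server's own typedefs;
 the names <type>my_int2</type>, <type>my_int4</type>,
 <type>my_int8</type>, and <function>can_pass_by_value</function> are
 hypothetical, and the width of a <literal>Datum</literal> is passed in as a
 parameter rather than taken from the server headers.
</para>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-width stand-ins: the same size in bytes on every architecture. */
typedef int16_t my_int2;
typedef int32_t my_int4;
typedef int64_t my_int8;

/*
 * Return true if a fixed-length type of type_len bytes may be passed by
 * value, given a Datum of datum_size bytes (4 or 8).
 */
bool
can_pass_by_value(size_t type_len, size_t datum_size)
{
    assert(datum_size == 4 || datum_size == 8);

    switch (type_len)
    {
        case 1:
        case 2:
        case 4:
            return true;
        case 8:
            /* 8-byte values fit only when a Datum is 8 bytes wide. */
            return datum_size == 8;
        default:
            /* Any other length has to be passed by reference. */
            return false;
    }
}
```

<para>
 Using a fixed-width type rather than <literal>long</literal> keeps the
 answer independent of the compiler's choice of integer widths.
</para>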
1788 On the other hand, fixed-length types of any size can
1789 be passed by-reference. For example, here is a sample
1790 implementation of a <productname>PostgreSQL</productname> type:
1793 /* 16-byte structure, passed by reference */
1800 Only pointers to such types can be used when passing
1801 them in and out of <productname>PostgreSQL</productname> functions.
1802 To return a value of such a type, allocate the right amount of
1803 memory with <literal>palloc</literal>, fill in the allocated memory,
1804 and return a pointer to it. (Also, if you just want to return the
1805 same value as one of your input arguments that's of the same data type,
1806 you can skip the extra <literal>palloc</literal> and just return the
1807 pointer to the input value.)
1811 Finally, all variable-length types must also be passed
1812 by reference. All variable-length types must begin
1813 with an opaque length field of exactly 4 bytes, which will be set
1814 by <symbol>SET_VARSIZE</symbol>; never set this field directly! All data to
1815 be stored within that type must be located in the memory
1816 immediately following that length field. The
1817 length field contains the total length of the structure,
1818 that is, it includes the size of the length field itself.
1823 Another important point is to avoid leaving any uninitialized bits
1824 within data type values; for example, take care to zero out any
1825 alignment padding bytes that might be present in structs. Without
1826 this, logically-equivalent constants of your data type might be
1827 seen as unequal by the planner, leading to inefficient (though not incorrect) plans.
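<para>
 A minimal stand-alone sketch of zeroing padding before filling in a value
 follows. The <type>Measurement</type> structure and both functions are
 hypothetical and exist only for this illustration; the sketch relies on the
 common behavior that storing to a member leaves previously zeroed padding
 bytes untouched, which holds on the usual compilers but is not spelled out
 by the C standard.
</para>

```c
#include <assert.h>
#include <string.h>

/* Hypothetical fixed-length type; not a PostgreSQL built-in. */
typedef struct
{
    char        unit;       /* 1 byte, typically followed by padding */
    double      amount;     /* usually aligned on an 8-byte boundary */
} Measurement;

/* Zero the whole struct first so that padding bytes are well defined. */
void
measurement_init(Measurement *m, char unit, double amount)
{
    assert(m != NULL);

    memset(m, 0, sizeof(Measurement));
    m->unit = unit;
    m->amount = amount;
}

/* Bytewise comparison, the kind of test that padding garbage would defeat. */
int
measurement_same_bytes(const Measurement *a, const Measurement *b)
{
    return memcmp(a, b, sizeof(Measurement)) == 0;
}
```

<para>
 Two values initialized this way with the same fields compare equal byte for
 byte, even if the memory they occupy previously held unrelated data.
</para>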
1833 <emphasis>Never</> modify the contents of a pass-by-reference input
1834 value. If you do so you are likely to corrupt on-disk data, since
1835 the pointer you are given might point directly into a disk buffer.
1836 The sole exception to this rule is explained in
1837 <xref linkend="xaggr">.
1842 As an example, we can define the type <type>text</type> as follows:
1848 char data[FLEXIBLE_ARRAY_MEMBER];
1852 The <literal>[FLEXIBLE_ARRAY_MEMBER]</> notation means that the actual
1853 length of the data part is not specified by this declaration.
1858 When manipulating variable-length types, we must be careful to allocate
1859 the correct amount of memory and set the length field correctly.
1860 For example, if we wanted to store 40 bytes in a <structname>text</>
1861 structure, we might use a code fragment like this:
1863 <programlisting><![CDATA[
1864 #include "postgres.h"
1866 char buffer[40]; /* our source data */
1868 text *destination = (text *) palloc(VARHDRSZ + 40);
1869 SET_VARSIZE(destination, VARHDRSZ + 40);
1870 memcpy(destination->data, buffer, 40);
1875 <literal>VARHDRSZ</> is the same as <literal>sizeof(int32)</>, but
1876 it's considered good style to use the macro <literal>VARHDRSZ</>
1877 to refer to the size of the overhead for a variable-length type.
1878 Also, the length field <emphasis>must</> be set using the
1879 <literal>SET_VARSIZE</> macro, not by simple assignment.
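<para>
 The length arithmetic can be tried outside the server with a simplified
 model. The sketch below is an assumption-laden stand-in, not the real
 representation: it uses <function>malloc</function> instead of
 <function>palloc</function>, stores the total length in a plain 4-byte
 integer instead of the opaque header written by
 <literal>SET_VARSIZE</literal>, and the names <type>ToyVarlena</type>,
 <literal>TOY_HDRSZ</literal>, <function>toy_make</function>,
 <function>toy_payload_len</function>, and <function>toy_concat</function>
 are hypothetical.
</para>

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for VARHDRSZ: the size of the length header in bytes. */
#define TOY_HDRSZ ((int32_t) sizeof(int32_t))

typedef struct
{
    int32_t     total_len;  /* header plus payload, in bytes */
    char        data[];     /* payload follows the header immediately */
} ToyVarlena;

/* Allocate a value holding payload_len bytes copied from src. */
ToyVarlena *
toy_make(const void *src, int32_t payload_len)
{
    ToyVarlena *v;

    assert(payload_len >= 0);

    v = malloc((size_t) TOY_HDRSZ + (size_t) payload_len);
    if (v == NULL)
        return NULL;

    v->total_len = TOY_HDRSZ + payload_len;     /* length includes header */
    if (payload_len > 0)
        memcpy(v->data, src, (size_t) payload_len);
    return v;
}

/* Payload size is the total length minus the header. */
int32_t
toy_payload_len(const ToyVarlena *v)
{
    return v->total_len - TOY_HDRSZ;
}

/* Concatenate two values into a newly allocated one. */
ToyVarlena *
toy_concat(const ToyVarlena *a, const ToyVarlena *b)
{
    int32_t     a_len = toy_payload_len(a);
    int32_t     b_len = toy_payload_len(b);
    /* Both totals count a header, so one header is subtracted from the sum. */
    int32_t     total = a->total_len + b->total_len - TOY_HDRSZ;
    ToyVarlena *v = malloc((size_t) total);

    if (v == NULL)
        return NULL;

    v->total_len = total;
    memcpy(v->data, a->data, (size_t) a_len);
    memcpy(v->data + a_len, b->data, (size_t) b_len);
    return v;
}
```

<para>
 Concatenating a 2-byte and a 3-byte payload in this model yields a total
 length of <literal>TOY_HDRSZ + 5</literal>, since only one header is kept.
</para>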
1883 <xref linkend="xfunc-c-type-table"> specifies which C type
1884 corresponds to which SQL type when writing a C-language function
1885 that uses a built-in type of <productname>PostgreSQL</>.
1886 The <quote>Defined In</quote> column gives the header file that
1887 needs to be included to get the type definition. (The actual
1888 definition might be in a different file that is included by the
1889 listed file. It is recommended that users stick to the defined
1890 interface.) Note that you should always include
1891 <filename>postgres.h</filename> first in any source file, because
1892 it declares a number of things that you will need anyway.
1895 <table tocentry="1" id="xfunc-c-type-table">
1896 <title>Equivalent C Types for Built-in SQL Types</title>
1913 <entry><type>abstime</type></entry>
1914 <entry><type>AbsoluteTime</type></entry>
1915 <entry><filename>utils/nabstime.h</filename></entry>
1918 <entry><type>bigint</type> (<type>int8</type>)</entry>
1919 <entry><type>int64</type></entry>
1920 <entry><filename>postgres.h</filename></entry>
1923 <entry><type>boolean</type></entry>
1924 <entry><type>bool</type></entry>
1925 <entry><filename>postgres.h</filename> (maybe compiler built-in)</entry>
1928 <entry><type>box</type></entry>
1929 <entry><type>BOX*</type></entry>
1930 <entry><filename>utils/geo_decls.h</filename></entry>
1933 <entry><type>bytea</type></entry>
1934 <entry><type>bytea*</type></entry>
1935 <entry><filename>postgres.h</filename></entry>
1938 <entry><type>"char"</type></entry>
1939 <entry><type>char</type></entry>
1940 <entry>(compiler built-in)</entry>
1943 <entry><type>character</type></entry>
1944 <entry><type>BpChar*</type></entry>
1945 <entry><filename>postgres.h</filename></entry>
1948 <entry><type>cid</type></entry>
1949 <entry><type>CommandId</type></entry>
1950 <entry><filename>postgres.h</filename></entry>
1953 <entry><type>date</type></entry>
1954 <entry><type>DateADT</type></entry>
1955 <entry><filename>utils/date.h</filename></entry>
1958 <entry><type>smallint</type> (<type>int2</type>)</entry>
1959 <entry><type>int16</type></entry>
1960 <entry><filename>postgres.h</filename></entry>
1963 <entry><type>int2vector</type></entry>
1964 <entry><type>int2vector*</type></entry>
1965 <entry><filename>postgres.h</filename></entry>
1968 <entry><type>integer</type> (<type>int4</type>)</entry>
1969 <entry><type>int32</type></entry>
1970 <entry><filename>postgres.h</filename></entry>
1973 <entry><type>real</type> (<type>float4</type>)</entry>
1974 <entry><type>float4*</type></entry>
1975 <entry><filename>postgres.h</filename></entry>
1978 <entry><type>double precision</type> (<type>float8</type>)</entry>
1979 <entry><type>float8*</type></entry>
1980 <entry><filename>postgres.h</filename></entry>
1983 <entry><type>interval</type></entry>
1984 <entry><type>Interval*</type></entry>
1985 <entry><filename>datatype/timestamp.h</filename></entry>
1988 <entry><type>lseg</type></entry>
1989 <entry><type>LSEG*</type></entry>
1990 <entry><filename>utils/geo_decls.h</filename></entry>
1993 <entry><type>name</type></entry>
1994 <entry><type>Name</type></entry>
1995 <entry><filename>postgres.h</filename></entry>
1998 <entry><type>oid</type></entry>
1999 <entry><type>Oid</type></entry>
2000 <entry><filename>postgres.h</filename></entry>
2003 <entry><type>oidvector</type></entry>
2004 <entry><type>oidvector*</type></entry>
2005 <entry><filename>postgres.h</filename></entry>
2008 <entry><type>path</type></entry>
2009 <entry><type>PATH*</type></entry>
2010 <entry><filename>utils/geo_decls.h</filename></entry>
2013 <entry><type>point</type></entry>
2014 <entry><type>POINT*</type></entry>
2015 <entry><filename>utils/geo_decls.h</filename></entry>
2018 <entry><type>regproc</type></entry>
2019 <entry><type>regproc</type></entry>
2020 <entry><filename>postgres.h</filename></entry>
2023 <entry><type>reltime</type></entry>
2024 <entry><type>RelativeTime</type></entry>
2025 <entry><filename>utils/nabstime.h</filename></entry>
2028 <entry><type>text</type></entry>
2029 <entry><type>text*</type></entry>
2030 <entry><filename>postgres.h</filename></entry>
2033 <entry><type>tid</type></entry>
2034 <entry><type>ItemPointer</type></entry>
2035 <entry><filename>storage/itemptr.h</filename></entry>
2038 <entry><type>time</type></entry>
2039 <entry><type>TimeADT</type></entry>
2040 <entry><filename>utils/date.h</filename></entry>
2043 <entry><type>time with time zone</type></entry>
2044 <entry><type>TimeTzADT</type></entry>
2045 <entry><filename>utils/date.h</filename></entry>
2048 <entry><type>timestamp</type></entry>
2049 <entry><type>Timestamp*</type></entry>
2050 <entry><filename>datatype/timestamp.h</filename></entry>
2053 <entry><type>tinterval</type></entry>
2054 <entry><type>TimeInterval</type></entry>
2055 <entry><filename>utils/nabstime.h</filename></entry>
2058 <entry><type>varchar</type></entry>
2059 <entry><type>VarChar*</type></entry>
2060 <entry><filename>postgres.h</filename></entry>
2063 <entry><type>xid</type></entry>
2064 <entry><type>TransactionId</type></entry>
2065 <entry><filename>postgres.h</filename></entry>
2072 Now that we've gone over all of the possible structures
2073 for base types, we can show some examples of real functions.
2078 <title>Version 0 Calling Conventions</title>
2081 We present the <quote>old style</quote> calling convention first — although
2082 this approach is now deprecated, it's easier to get a handle on
2083 initially. In the version-0 method, the arguments and result
2084 of the C function are declared in normal C style, taking
2085 care to use the C representation of each SQL data type as shown above.
2090 Here are some examples:
2092 <programlisting><![CDATA[
2093 #include "postgres.h"
2095 #include "utils/geo_decls.h"
2097 #ifdef PG_MODULE_MAGIC
2109 /* by reference, fixed length */
2112 add_one_float8(float8 *arg)
2114 float8 *result = (float8 *) palloc(sizeof(float8));
2116 *result = *arg + 1.0;
2122 makepoint(Point *pointx, Point *pointy)
2124 Point *new_point = (Point *) palloc(sizeof(Point));
2126 new_point->x = pointx->x;
2127 new_point->y = pointy->y;
2132 /* by reference, variable length */
2138 * VARSIZE is the total size of the struct in bytes.
2140 text *new_t = (text *) palloc(VARSIZE(t));
2141 SET_VARSIZE(new_t, VARSIZE(t));
2143 * VARDATA is a pointer to the data region of the struct.
2145 memcpy((void *) VARDATA(new_t), /* destination */
2146 (void *) VARDATA(t), /* source */
2147 VARSIZE(t) - VARHDRSZ); /* how many bytes */
2152 concat_text(text *arg1, text *arg2)
2154 int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
2155 text *new_text = (text *) palloc(new_text_size);
2157 SET_VARSIZE(new_text, new_text_size);
2158 memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1) - VARHDRSZ);
2159 memcpy(VARDATA(new_text) + (VARSIZE(arg1) - VARHDRSZ),
2160 VARDATA(arg2), VARSIZE(arg2) - VARHDRSZ);
2168 Supposing that the above code has been prepared in file
2169 <filename>funcs.c</filename> and compiled into a shared object,
2170 we could define the functions to <productname>PostgreSQL</productname>
2171 with commands like this:
2174 CREATE FUNCTION add_one(integer) RETURNS integer
2175 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one'
2178 -- note overloading of SQL function name "add_one"
2179 CREATE FUNCTION add_one(double precision) RETURNS double precision
2180 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one_float8'
2183 CREATE FUNCTION makepoint(point, point) RETURNS point
2184 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'makepoint'
2187 CREATE FUNCTION copytext(text) RETURNS text
2188 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'copytext'
2191 CREATE FUNCTION concat_text(text, text) RETURNS text
2192 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'concat_text'
2198 Here, <replaceable>DIRECTORY</replaceable> stands for the
2199 directory of the shared library file (for instance the
2200 <productname>PostgreSQL</productname> tutorial directory, which
2201 contains the code for the examples used in this section).
2202 (Better style would be to use just <literal>'funcs'</> in the
2203 <literal>AS</> clause, after having added
2204 <replaceable>DIRECTORY</replaceable> to the search path. In any
2205 case, we can omit the system-specific extension for a shared
2206 library, commonly <literal>.so</literal> or
2207 <literal>.sl</literal>.)
2211 Notice that we have specified the functions as <quote>strict</quote>, meaning that
2213 the system should automatically assume a null result if any input
2214 value is null. By doing this, we avoid having to check for null inputs
2215 in the function code. Without this, we'd have to check for null values
2216 explicitly, by checking for a null pointer for each
2217 pass-by-reference argument. (For pass-by-value arguments, we don't
2218 even have a way to check!)
2222 Although this calling convention is simple to use,
2223 it is not very portable; on some architectures there are problems
2224 with passing data types that are smaller than <type>int</type> this way. Also, there is
2225 no simple way to return a null result, nor to cope with null arguments
2226 in any way other than making the function strict. The version-1
2227 convention, presented next, overcomes these objections.
2232 <title>Version 1 Calling Conventions</title>
2235 The version-1 calling convention relies on macros to suppress most
2236 of the complexity of passing arguments and results. The C declaration
2237 of a version-1 function is always:
2239 Datum funcname(PG_FUNCTION_ARGS)
2241 In addition, the macro call:
2243 PG_FUNCTION_INFO_V1(funcname);
2245 must appear in the same source file. (Conventionally, it's
2246 written just before the function itself.) This macro call is not
2247 needed for <literal>internal</>-language functions, since
2248 <productname>PostgreSQL</> assumes that all internal functions
2249 use the version-1 convention. It is, however, required for
2250 dynamically-loaded functions.
2254 In a version-1 function, each actual argument is fetched using a
2255 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2256 macro that corresponds to the argument's data type, and the
2257 result is returned using a
2258 <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2259 macro for the return type.
2260 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2261 takes as its argument the number of the function argument to
2262 fetch, where the count starts at 0.
2263 <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2264 takes as its argument the actual value to return.
2268 Here we show the same functions as above, coded in version-1 style:
<programlisting><![CDATA[
#include "postgres.h"
#include <string.h>
#include "fmgr.h"
#include "utils/geo_decls.h"

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

/* by value */

PG_FUNCTION_INFO_V1(add_one);

Datum
add_one(PG_FUNCTION_ARGS)
{
    int32   arg = PG_GETARG_INT32(0);

    PG_RETURN_INT32(arg + 1);
}

/* by reference, fixed length */

PG_FUNCTION_INFO_V1(add_one_float8);

Datum
add_one_float8(PG_FUNCTION_ARGS)
{
    /* The macros for FLOAT8 hide its pass-by-reference nature. */
    float8   arg = PG_GETARG_FLOAT8(0);

    PG_RETURN_FLOAT8(arg + 1.0);
}

PG_FUNCTION_INFO_V1(makepoint);

Datum
makepoint(PG_FUNCTION_ARGS)
{
    /* Here, the pass-by-reference nature of Point is not hidden. */
    Point     *pointx = PG_GETARG_POINT_P(0);
    Point     *pointy = PG_GETARG_POINT_P(1);
    Point     *new_point = (Point *) palloc(sizeof(Point));

    new_point->x = pointx->x;
    new_point->y = pointy->y;

    PG_RETURN_POINT_P(new_point);
}

/* by reference, variable length */

PG_FUNCTION_INFO_V1(copytext);

Datum
copytext(PG_FUNCTION_ARGS)
{
    text     *t = PG_GETARG_TEXT_P(0);
    /*
     * VARSIZE is the total size of the struct in bytes.
     */
    text     *new_t = (text *) palloc(VARSIZE(t));
    SET_VARSIZE(new_t, VARSIZE(t));
    /*
     * VARDATA is a pointer to the data region of the struct.
     */
    memcpy((void *) VARDATA(new_t), /* destination */
           (void *) VARDATA(t),     /* source */
           VARSIZE(t) - VARHDRSZ);  /* how many bytes */
    PG_RETURN_TEXT_P(new_t);
}

PG_FUNCTION_INFO_V1(concat_text);

Datum
concat_text(PG_FUNCTION_ARGS)
{
    text  *arg1 = PG_GETARG_TEXT_P(0);
    text  *arg2 = PG_GETARG_TEXT_P(1);
    int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
    text *new_text = (text *) palloc(new_text_size);

    SET_VARSIZE(new_text, new_text_size);
    memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1) - VARHDRSZ);
    memcpy(VARDATA(new_text) + (VARSIZE(arg1) - VARHDRSZ),
           VARDATA(arg2), VARSIZE(arg2) - VARHDRSZ);
    PG_RETURN_TEXT_P(new_text);
}
]]>
</programlisting>
2364 The <command>CREATE FUNCTION</command> commands are the same as
2365 for the version-0 equivalents.
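The size arithmetic used by <function>concat_text</function> can be checked outside the server. The following is a minimal stand-alone sketch, not PostgreSQL code: it assumes a simplified layout in which a plain 4-byte total-size field is followed by the data bytes, and the names <literal>HDRSZ</literal>, <literal>Blob</literal>, <literal>blob_make</literal> and <literal>blob_concat</literal> are hypothetical. The real variable-length header is more involved, which is why server code should only touch it through <literal>VARSIZE</literal>, <literal>SET_VARSIZE</literal> and <literal>VARDATA</literal>.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a variable-length value: a 4-byte total
 * size (header included) followed by the payload bytes. */
#define HDRSZ ((uint32_t) sizeof(uint32_t))

typedef struct Blob
{
    uint32_t size;      /* total size in bytes, header included */
    char     data[];    /* payload, not NUL-terminated */
} Blob;

/* Build a Blob from a C string; the caller releases it with free(). */
static Blob *
blob_make(const char *s)
{
    uint32_t len = (uint32_t) strlen(s);
    Blob    *b = malloc(HDRSZ + len);

    assert(b != NULL);
    b->size = HDRSZ + len;
    memcpy(b->data, s, len);
    return b;
}

/* Concatenate two Blobs.  Each input counts its own header, so the
 * result needs size(a) + size(b) minus one header. */
static Blob *
blob_concat(const Blob *a, const Blob *b)
{
    uint32_t new_size = a->size + b->size - HDRSZ;
    Blob    *r = malloc(new_size);

    assert(r != NULL);
    r->size = new_size;
    memcpy(r->data, a->data, a->size - HDRSZ);
    memcpy(r->data + (a->size - HDRSZ), b->data, b->size - HDRSZ);
    return r;
}
```

Concatenating a 3-byte and a 6-byte payload this way yields a total size of one header plus 9 bytes, which mirrors the <literal>VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ</literal> computation.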
2369 At first glance, the version-1 coding conventions might appear to
2370 be just pointless obscurantism. They do, however, offer a number
2371 of improvements, because the macros can hide unnecessary detail.
2372 An example is that in coding <function>add_one_float8</>, we no longer need to
2373 be aware that <type>float8</type> is a pass-by-reference type. Another
2374 example is that the <literal>GETARG</> macros for variable-length types allow
2375 for more efficient fetching of <quote>toasted</quote> (compressed or
2376 out-of-line) values.
2380 One big improvement in version-1 functions is better handling of null
2381 inputs and results. The macro <function>PG_ARGISNULL(<replaceable>n</>)</function>
2382 allows a function to test whether each input is null. (Of course, doing
this is only necessary in functions not declared <quote>strict</>.)
As with the
<function>PG_GETARG_<replaceable>xxx</replaceable>()</function> macros,
the input arguments are counted beginning at zero.  Note that one
2387 should refrain from executing
2388 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> until
2389 one has verified that the argument isn't null.
2390 To return a null result, execute <function>PG_RETURN_NULL()</function>;
2391 this works in both strict and nonstrict functions.
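The order of operations for a nonstrict function (test the null flag first, fetch the value only afterwards, and signal a null result separately from the returned value) can be sketched in plain C. This is a simplified model under stated assumptions, not the function manager's actual data structure: <literal>MiniCallInfo</literal> and <literal>mini_add_nulls_as_zero</literal> are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified call frame: values and null flags travel
 * in parallel arrays, and the result's null flag is separate. */
typedef struct MiniCallInfo
{
    int      nargs;
    intptr_t arg[2];        /* argument values (meaningless if null) */
    bool     argnull[2];    /* true if the matching argument is null */
    bool     isnull;        /* set by the callee to return null */
} MiniCallInfo;

/* A nonstrict addition: a null input counts as zero, and the result
 * is null only when both inputs are null. */
static intptr_t
mini_add_nulls_as_zero(MiniCallInfo *ci)
{
    intptr_t sum = 0;

    assert(ci->nargs == 2);
    ci->isnull = false;

    /* Both inputs null: report a null result; the value is ignored. */
    if (ci->argnull[0] && ci->argnull[1])
    {
        ci->isnull = true;
        return 0;
    }

    /* Test each null flag first, and only then read the value. */
    for (int i = 0; i < ci->nargs; i++)
    {
        if (!ci->argnull[i])
            sum += ci->arg[i];
    }
    return sum;
}
```

A strict function could not express this behavior, because the system would return null as soon as either input was null, without calling the function at all.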
Other options provided in the new-style interface are two
variants of the
<function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
macros. The first of these,
2399 <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>,
2400 guarantees to return a copy of the specified argument that is
2401 safe for writing into. (The normal macros will sometimes return a
2402 pointer to a value that is physically stored in a table, which
2403 must not be written to. Using the
2404 <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>
2405 macros guarantees a writable result.)
The second variant consists of the
<function>PG_GETARG_<replaceable>xxx</replaceable>_SLICE()</function>
macros, which take three arguments. The first is the number of the
2409 function argument (as above). The second and third are the offset and
2410 length of the segment to be returned. Offsets are counted from
2411 zero, and a negative length requests that the remainder of the
2412 value be returned. These macros provide more efficient access to
2413 parts of large values in the case where they have storage type
2414 <quote>external</quote>. (The storage type of a column can be specified using
2415 <literal>ALTER TABLE <replaceable>tablename</replaceable> ALTER
2416 COLUMN <replaceable>colname</replaceable> SET STORAGE
2417 <replaceable>storagetype</replaceable></literal>. <replaceable>storagetype</replaceable> is one of
2418 <literal>plain</>, <literal>external</>, <literal>extended</literal>,
2419 or <literal>main</>.)
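The offset and length rules described above for the slice macros can be modeled on an ordinary byte buffer. The sketch below is a stand-alone illustration, not the server's implementation: <literal>slice_bytes</literal> is a hypothetical helper, and clamping of out-of-range requests is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper modeling a slice fetch: offsets start at zero
 * and a negative length means "through the end of the value".
 * Out-of-range requests are clamped (an assumption of this sketch).
 * Returns a malloc'd copy and stores its length in *out_len. */
static char *
slice_bytes(const char *data, size_t data_len,
            size_t offset, long length, size_t *out_len)
{
    size_t n;
    char  *copy;

    if (offset >= data_len)
        n = 0;                          /* nothing at or past the end */
    else if (length < 0 || (size_t) length > data_len - offset)
        n = data_len - offset;          /* remainder of the value */
    else
        n = (size_t) length;

    copy = malloc(n + 1);
    assert(copy != NULL);
    if (n > 0)
        memcpy(copy, data + offset, n);
    copy[n] = '\0';                     /* convenience terminator */
    *out_len = n;
    return copy;
}
```

With the 8-byte input <literal>abcdefgh</literal>, an offset of 2 and a length of 3 select <literal>cde</literal>, while an offset of 5 and a length of -1 select the remainder, <literal>fgh</literal>.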
2423 Finally, the version-1 function call conventions make it possible
2424 to return set results (<xref linkend="xfunc-c-return-set">) and
2425 implement trigger functions (<xref linkend="triggers">) and
2426 procedural-language call handlers (<xref
2427 linkend="plhandler">). Version-1 code is also more
2428 portable than version-0, because it does not break restrictions
2429 on function call protocol in the C standard. For more details
2430 see <filename>src/backend/utils/fmgr/README</filename> in the
2431 source distribution.
2436 <title>Writing Code</title>
2439 Before we turn to the more advanced topics, we should discuss
2440 some coding rules for <productname>PostgreSQL</productname>
2441 C-language functions. While it might be possible to load functions
2442 written in languages other than C into
2443 <productname>PostgreSQL</productname>, this is usually difficult
2444 (when it is possible at all) because other languages, such as
2445 C++, FORTRAN, or Pascal often do not follow the same calling
2446 convention as C. That is, other languages do not pass argument
2447 and return values between functions in the same way. For this
2448 reason, we will assume that your C-language functions are
2449 actually written in C.
2453 The basic rules for writing and building C functions are as follows:
2458 Use <literal>pg_config
2459 --includedir-server</literal><indexterm><primary>pg_config</><secondary>with user-defined C functions</></>
2460 to find out where the <productname>PostgreSQL</> server header
2461 files are installed on your system (or the system that your
2462 users will be running on).
2468 Compiling and linking your code so that it can be dynamically
2469 loaded into <productname>PostgreSQL</productname> always
requires special flags.  See <xref linkend="dfunc"> for a
detailed explanation of how to do it for your particular
operating system.
2478 Remember to define a <quote>magic block</> for your shared library,
2479 as described in <xref linkend="xfunc-c-dynload">.
2485 When allocating memory, use the
2486 <productname>PostgreSQL</productname> functions
2487 <function>palloc</function><indexterm><primary>palloc</></> and <function>pfree</function><indexterm><primary>pfree</></>
2488 instead of the corresponding C library functions
2489 <function>malloc</function> and <function>free</function>.
The memory allocated by <function>palloc</function> will be
freed automatically at the end of each transaction, preventing
memory leaks.
2498 Always zero the bytes of your structures using <function>memset</>
2499 (or allocate them with <function>palloc0</> in the first place).
2500 Even if you assign to each field of your structure, there might be
2501 alignment padding (holes in the structure) that contain
2502 garbage values. Without this, it's difficult to
2503 support hash indexes or hash joins, as you must pick out only
2504 the significant bits of your data structure to compute a hash.
2505 The planner also sometimes relies on comparing constants via
2506 bitwise equality, so you can get undesirable planning results if
2507 logically-equivalent values aren't bitwise equal.
2513 Most of the internal <productname>PostgreSQL</productname>
2514 types are declared in <filename>postgres.h</filename>, while
2515 the function manager interfaces
2516 (<symbol>PG_FUNCTION_ARGS</symbol>, etc.) are in
2517 <filename>fmgr.h</filename>, so you will need to include at
2518 least these two files. For portability reasons it's best to
2519 include <filename>postgres.h</filename> <emphasis>first</>,
2520 before any other system or user header files. Including
<filename>postgres.h</filename> will also include
<filename>elog.h</filename> and <filename>palloc.h</filename>
for you.
2529 Symbol names defined within object files must not conflict
2530 with each other or with symbols defined in the
2531 <productname>PostgreSQL</productname> server executable. You
2532 will have to rename your functions or variables if you get
2533 error messages to this effect.
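The rule above about zeroing structures can be illustrated with a small stand-alone program fragment. It is a sketch under stated assumptions: <literal>Sample</literal>, <literal>sample_make</literal> and <literal>sample_bitwise_equal</literal> are hypothetical names, <function>malloc</function> stands in for <function>palloc</function> so that the fragment compiles without the server headers, and the layout with padding between the two fields is typical of common platforms rather than guaranteed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical value type; on common ABIs there are padding bytes
 * between "flag" and "weight". */
typedef struct Sample
{
    char   flag;
    double weight;
} Sample;

/* Allocate a Sample with every byte, padding included, set to zero
 * before the fields are assigned. */
static Sample *
sample_make(char flag, double weight)
{
    Sample *s = malloc(sizeof(Sample));

    assert(s != NULL);
    memset(s, 0, sizeof(Sample));
    s->flag = flag;
    s->weight = weight;
    return s;
}

/* Bytewise equality, the kind of comparison that hashing and some
 * planner checks effectively rely on. */
static int
sample_bitwise_equal(const Sample *a, const Sample *b)
{
    return memcmp(a, b, sizeof(Sample)) == 0;
}
```

Without the <function>memset</function> call, two values with identical fields could still differ in their padding bytes, and the bytewise comparison would then report them as unequal.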
2543 <title>Composite-type Arguments</title>
2546 Composite types do not have a fixed layout like C structures.
2547 Instances of a composite type can contain null fields. In
2548 addition, composite types that are part of an inheritance
2549 hierarchy can have different fields than other members of the
2550 same inheritance hierarchy. Therefore,
2551 <productname>PostgreSQL</productname> provides a function
2552 interface for accessing fields of composite types from C.
2556 Suppose we want to write a function to answer the query:
<programlisting>
SELECT name, c_overpaid(emp, 1500) AS overpaid
    FROM emp
    WHERE name = 'Bill' OR name = 'Sam';
</programlisting>
Using the version-0 calling conventions, we can define
<function>c_overpaid</> as:
<programlisting><![CDATA[
#include "postgres.h"
#include "executor/executor.h"  /* for GetAttributeByName() */

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

bool
c_overpaid(HeapTupleHeader t, /* the current row of emp */
           int32 limit)
{
    bool isnull;
    int32 salary;

    salary = DatumGetInt32(GetAttributeByName(t, "salary", &isnull));
    if (isnull)
        return false;
    return salary > limit;
}
]]>
</programlisting>
2590 In version-1 coding, the above would look like this:
<programlisting><![CDATA[
#include "postgres.h"
#include "executor/executor.h"  /* for GetAttributeByName() */

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

PG_FUNCTION_INFO_V1(c_overpaid);

Datum
c_overpaid(PG_FUNCTION_ARGS)
{
    HeapTupleHeader  t = PG_GETARG_HEAPTUPLEHEADER(0);
    int32            limit = PG_GETARG_INT32(1);
    bool isnull;
    Datum salary;

    salary = GetAttributeByName(t, "salary", &isnull);
    if (isnull)
        PG_RETURN_BOOL(false);
    /* Alternatively, we might prefer to do PG_RETURN_NULL() for null salary. */

    PG_RETURN_BOOL(DatumGetInt32(salary) > limit);
}
]]>
</programlisting>
2622 <function>GetAttributeByName</function> is the
2623 <productname>PostgreSQL</productname> system function that
2624 returns attributes out of the specified row. It has
three arguments: the argument of type <type>HeapTupleHeader</type> passed
into
the function, the name of the desired attribute, and a
2628 return parameter that tells whether the attribute
2629 is null. <function>GetAttributeByName</function> returns a <type>Datum</type>
2630 value that you can convert to the proper data type by using the
2631 appropriate <function>DatumGet<replaceable>XXX</replaceable>()</function>
2632 macro. Note that the return value is meaningless if the null flag is
set; always check the null flag before trying to do anything with the
result.
2638 There is also <function>GetAttributeByNum</function>, which selects
2639 the target attribute by column number instead of name.
2643 The following command declares the function
2644 <function>c_overpaid</function> in SQL:
<programlisting>
CREATE FUNCTION c_overpaid(emp, integer) RETURNS boolean
    AS '<replaceable>DIRECTORY</replaceable>/funcs', 'c_overpaid'
    LANGUAGE C STRICT;
</programlisting>
2652 Notice we have used <literal>STRICT</> so that we did not have to
2653 check whether the input arguments were NULL.
2658 <title>Returning Rows (Composite Types)</title>
2661 To return a row or composite-type value from a C-language
2662 function, you can use a special API that provides macros and
2663 functions to hide most of the complexity of building composite
2664 data types. To use this API, the source file must include:
2666 #include "funcapi.h"
2671 There are two ways you can build a composite data value (henceforth
2672 a <quote>tuple</>): you can build it from an array of Datum values,
2673 or from an array of C strings that can be passed to the input
2674 conversion functions of the tuple's column data types. In either
2675 case, you first need to obtain or construct a <structname>TupleDesc</>
2676 descriptor for the tuple structure. When working with Datums, you
2677 pass the <structname>TupleDesc</> to <function>BlessTupleDesc</>,
2678 and then call <function>heap_form_tuple</> for each row. When working
2679 with C strings, you pass the <structname>TupleDesc</> to
2680 <function>TupleDescGetAttInMetadata</>, and then call
2681 <function>BuildTupleFromCStrings</> for each row. In the case of a
2682 function returning a set of tuples, the setup steps can all be done
2683 once during the first call of the function.
2687 Several helper functions are available for setting up the needed
2688 <structname>TupleDesc</>. The recommended way to do this in most
2689 functions returning composite values is to call:
<programlisting>
TypeFuncClass get_call_result_type(FunctionCallInfo fcinfo,
                                   Oid *resultTypeId,
                                   TupleDesc *resultTupleDesc)
</programlisting>
2695 passing the same <literal>fcinfo</> struct passed to the calling function
2696 itself. (This of course requires that you use the version-1
2697 calling conventions.) <varname>resultTypeId</> can be specified
2698 as <literal>NULL</> or as the address of a local variable to receive the
2699 function's result type OID. <varname>resultTupleDesc</> should be the
2700 address of a local <structname>TupleDesc</> variable. Check that the
2701 result is <literal>TYPEFUNC_COMPOSITE</>; if so,
2702 <varname>resultTupleDesc</> has been filled with the needed
2703 <structname>TupleDesc</>. (If it is not, you can report an error along
2704 the lines of <quote>function returning record called in context that
2705 cannot accept type record</quote>.)
2710 <function>get_call_result_type</> can resolve the actual type of a
2711 polymorphic function result; so it is useful in functions that return
2712 scalar polymorphic results, not only functions that return composites.
2713 The <varname>resultTypeId</> output is primarily useful for functions
2714 returning polymorphic scalars.
2720 <function>get_call_result_type</> has a sibling
2721 <function>get_expr_result_type</>, which can be used to resolve the
2722 expected output type for a function call represented by an expression
2723 tree. This can be used when trying to determine the result type from
2724 outside the function itself. There is also
2725 <function>get_func_result_type</>, which can be used when only the
function's OID is available.  However, these functions are not able
2727 to deal with functions declared to return <structname>record</>, and
2728 <function>get_func_result_type</> cannot resolve polymorphic types,
2729 so you should preferentially use <function>get_call_result_type</>.
2734 Older, now-deprecated functions for obtaining
2735 <structname>TupleDesc</>s are:
2737 TupleDesc RelationNameGetTupleDesc(const char *relname)
to get a <structname>TupleDesc</> for the row type of a named relation,
and:
2742 TupleDesc TypeGetTupleDesc(Oid typeoid, List *colaliases)
2744 to get a <structname>TupleDesc</> based on a type OID. This can
2745 be used to get a <structname>TupleDesc</> for a base or
2746 composite type. It will not work for a function that returns
<structname>record</>, however, and it cannot resolve polymorphic
types.
2752 Once you have a <structname>TupleDesc</>, call:
2754 TupleDesc BlessTupleDesc(TupleDesc tupdesc)
2756 if you plan to work with Datums, or:
2758 AttInMetadata *TupleDescGetAttInMetadata(TupleDesc tupdesc)
if you plan to work with C strings.  If you are writing a function
returning a set, you can save the results of these functions in the
<structname>FuncCallContext</> structure; use the
<structfield>tuple_desc</> or <structfield>attinmeta</> field,
respectively.
2768 When working with Datums, use:
2770 HeapTuple heap_form_tuple(TupleDesc tupdesc, Datum *values, bool *isnull)
2772 to build a <structname>HeapTuple</> given user data in Datum form.
2776 When working with C strings, use:
2778 HeapTuple BuildTupleFromCStrings(AttInMetadata *attinmeta, char **values)
2780 to build a <structname>HeapTuple</> given user data
2781 in C string form. <parameter>values</parameter> is an array of C strings,
2782 one for each attribute of the return row. Each C string should be in
2783 the form expected by the input function of the attribute data
2784 type. In order to return a null value for one of the attributes,
2785 the corresponding pointer in the <parameter>values</> array
2786 should be set to <symbol>NULL</>. This function will need to
2787 be called again for each row you return.
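The shape of the <parameter>values</parameter> array, including a null pointer for a null attribute, can be sketched without the server headers. The fragment below only prepares the array of C strings; it does not build a tuple. <literal>NCOLS</literal>, <literal>make_values</literal> and <literal>free_values</literal> are hypothetical names, and <function>malloc</function> stands in for <function>palloc</function> so that the sketch is self-contained.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define NCOLS 3

/* Hypothetical helper: render NCOLS optional integers as the array of
 * C strings that a string-based tuple builder would consume.  A column
 * whose is_null flag is set gets a NULL pointer instead of text. */
static char **
make_values(const int vals[NCOLS], const int is_null[NCOLS])
{
    char **values = malloc(NCOLS * sizeof(char *));

    assert(values != NULL);
    for (int i = 0; i < NCOLS; i++)
    {
        if (is_null[i])
        {
            values[i] = NULL;           /* null for this column */
            continue;
        }
        values[i] = malloc(16);         /* room for any 32-bit integer */
        assert(values[i] != NULL);
        snprintf(values[i], 16, "%d", vals[i]);
    }
    return values;
}

/* Release everything allocated by make_values(). */
static void
free_values(char **values)
{
    for (int i = 0; i < NCOLS; i++)
        free(values[i]);                /* free(NULL) is a no-op */
    free(values);
}
```

In a real function the resulting array would be passed, together with the <structname>AttInMetadata</structname>, to <function>BuildTupleFromCStrings</function> once per returned row.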
2791 Once you have built a tuple to return from your function, it
2792 must be converted into a <type>Datum</>. Use:
2794 HeapTupleGetDatum(HeapTuple tuple)
2796 to convert a <structname>HeapTuple</> into a valid Datum. This
2797 <type>Datum</> can be returned directly if you intend to return
2798 just a single row, or it can be used as the current return value
2799 in a set-returning function.
2803 An example appears in the next section.
2808 <sect2 id="xfunc-c-return-set">
2809 <title>Returning Sets</title>
2812 There is also a special API that provides support for returning
2813 sets (multiple rows) from a C-language function. A set-returning
2814 function must follow the version-1 calling conventions. Also,
source files must include <filename>funcapi.h</filename>, as
above.
2820 A set-returning function (<acronym>SRF</>) is called
2821 once for each item it returns. The <acronym>SRF</> must
2822 therefore save enough state to remember what it was doing and
2823 return the next item on each call.
2824 The structure <structname>FuncCallContext</> is provided to help
2825 control this process. Within a function, <literal>fcinfo->flinfo->fn_extra</>
is used to hold a pointer to <structname>FuncCallContext</>
across calls.
<programlisting>
typedef struct FuncCallContext
{
    /*
     * Number of times we've been called before
     * call_cntr is initialized to 0 for you by SRF_FIRSTCALL_INIT(), and
     * incremented for you every time SRF_RETURN_NEXT() is called.
     */
    uint32 call_cntr;

    /*
     * OPTIONAL maximum number of calls
     * max_calls is here for convenience only and setting it is optional.
     * If not set, you must provide alternative means to know when the
     * function is done.
     */
    uint32 max_calls;

    /*
     * OPTIONAL pointer to result slot
     * This is obsolete and only present for backward compatibility, viz,
     * user-defined SRFs that use the deprecated TupleDescGetSlot().
     */
    TupleTableSlot *slot;

    /*
     * OPTIONAL pointer to miscellaneous user-provided context information
     * user_fctx is for use as a pointer to your own data to retain
     * arbitrary context information between calls of your function.
     */
    void *user_fctx;

    /*
     * OPTIONAL pointer to struct containing attribute type input metadata
     * attinmeta is for use when returning tuples (i.e., composite data types)
     * and is not used when returning base data types.  It is only needed
     * if you intend to use BuildTupleFromCStrings() to create the return
     * tuple.
     */
    AttInMetadata *attinmeta;

    /*
     * memory context used for structures that must live for multiple calls
     * multi_call_memory_ctx is set by SRF_FIRSTCALL_INIT() for you, and used
     * by SRF_RETURN_DONE() for cleanup.  It is the most appropriate memory
     * context for any memory that is to be reused across multiple calls
     * of the SRF.
     */
    MemoryContext multi_call_memory_ctx;

    /*
     * OPTIONAL pointer to struct containing tuple description
     * tuple_desc is for use when returning tuples (i.e., composite data types)
     * and is only needed if you are going to build the tuples with
     * heap_form_tuple() rather than with BuildTupleFromCStrings().  Note that
     * the TupleDesc pointer stored here should usually have been run through
     * BlessTupleDesc() first.
     */
    TupleDesc tuple_desc;
} FuncCallContext;
</programlisting>
2900 An <acronym>SRF</> uses several functions and macros that
2901 automatically manipulate the <structname>FuncCallContext</>
structure (and expect to find it via <literal>fn_extra</>).  Use:
<programlisting>
SRF_IS_FIRSTCALL()
</programlisting>
to determine if your function is being called for the first or a
subsequent time. On the first call (only) use:
<programlisting>
SRF_FIRSTCALL_INIT()
</programlisting>
to initialize the <structname>FuncCallContext</>. On every function call,
including the first, use:
<programlisting>
SRF_PERCALL_SETUP()
</programlisting>
to properly set up for using the <structname>FuncCallContext</>
and clearing any previously returned data left over from the
previous pass.
2922 If your function has data to return, use:
2924 SRF_RETURN_NEXT(funcctx, result)
2926 to return it to the caller. (<literal>result</> must be of type
2927 <type>Datum</>, either a single value or a tuple prepared as
2928 described above.) Finally, when your function is finished
2929 returning data, use:
2931 SRF_RETURN_DONE(funcctx)
2933 to clean up and end the <acronym>SRF</>.
2937 The memory context that is current when the <acronym>SRF</> is called is
2938 a transient context that will be cleared between calls. This means
2939 that you do not need to call <function>pfree</> on everything
2940 you allocated using <function>palloc</>; it will go away anyway. However, if you want to allocate
2941 any data structures to live across calls, you need to put them somewhere
2942 else. The memory context referenced by
2943 <structfield>multi_call_memory_ctx</> is a suitable location for any
2944 data that needs to survive until the <acronym>SRF</> is finished running. In most
2945 cases, this means that you should switch into
2946 <structfield>multi_call_memory_ctx</> while doing the first-call setup.
2951 While the actual arguments to the function remain unchanged between
2952 calls, if you detoast the argument values (which is normally done
2953 transparently by the
2954 <function>PG_GETARG_<replaceable>xxx</replaceable></function> macro)
2955 in the transient context then the detoasted copies will be freed on
2956 each cycle. Accordingly, if you keep references to such values in
2957 your <structfield>user_fctx</>, you must either copy them into the
2958 <structfield>multi_call_memory_ctx</> after detoasting, or ensure
2959 that you detoast the values only in that context.
2964 A complete pseudo-code example looks like the following:
<programlisting>
Datum
my_set_returning_function(PG_FUNCTION_ARGS)
{
    FuncCallContext  *funcctx;
    Datum             result;
    <replaceable>further declarations as needed</replaceable>

    if (SRF_IS_FIRSTCALL())
    {
        MemoryContext oldcontext;

        funcctx = SRF_FIRSTCALL_INIT();
        oldcontext = MemoryContextSwitchTo(funcctx-&gt;multi_call_memory_ctx);
        /* One-time setup code appears here: */
        <replaceable>user code</replaceable>
        <replaceable>if returning composite</replaceable>
            <replaceable>build TupleDesc, and perhaps AttInMetadata</replaceable>
        <replaceable>endif returning composite</replaceable>
        <replaceable>user code</replaceable>
        MemoryContextSwitchTo(oldcontext);
    }

    /* Each-time setup code appears here: */
    <replaceable>user code</replaceable>
    funcctx = SRF_PERCALL_SETUP();
    <replaceable>user code</replaceable>

    /* this is just one way we might test whether we are done: */
    if (funcctx-&gt;call_cntr &lt; funcctx-&gt;max_calls)
    {
        /* Here we want to return another item: */
        <replaceable>user code</replaceable>
        <replaceable>obtain result Datum</replaceable>
        SRF_RETURN_NEXT(funcctx, result);
    }
    else
    {
        /* Here we are done returning items and just need to clean up: */
        <replaceable>user code</replaceable>
        SRF_RETURN_DONE(funcctx);
    }
}
</programlisting>
A complete example of a simple <acronym>SRF</> returning a composite type
looks like:
<programlisting><![CDATA[
PG_FUNCTION_INFO_V1(retcomposite);

Datum
retcomposite(PG_FUNCTION_ARGS)
{
    FuncCallContext     *funcctx;
    int                  call_cntr;
    int                  max_calls;
    TupleDesc            tupdesc;
    AttInMetadata       *attinmeta;

    /* stuff done only on the first call of the function */
    if (SRF_IS_FIRSTCALL())
    {
        MemoryContext   oldcontext;

        /* create a function context for cross-call persistence */
        funcctx = SRF_FIRSTCALL_INIT();

        /* switch to memory context appropriate for multiple function calls */
        oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);

        /* total number of tuples to be returned */
        funcctx->max_calls = PG_GETARG_UINT32(0);

        /* Build a tuple descriptor for our result type */
        if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
            ereport(ERROR,
                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                     errmsg("function returning record called in context "
                            "that cannot accept type record")));

        /*
         * generate attribute metadata needed later to produce tuples from raw
         * C strings
         */
        attinmeta = TupleDescGetAttInMetadata(tupdesc);
        funcctx->attinmeta = attinmeta;

        MemoryContextSwitchTo(oldcontext);
    }

    /* stuff done on every call of the function */
    funcctx = SRF_PERCALL_SETUP();

    call_cntr = funcctx->call_cntr;
    max_calls = funcctx->max_calls;
    attinmeta = funcctx->attinmeta;

    if (call_cntr < max_calls)    /* do when there is more left to send */
    {
        char       **values;
        HeapTuple    tuple;
        Datum        result;

        /*
         * Prepare a values array for building the returned tuple.
         * This should be an array of C strings which will
         * be processed later by the type input functions.
         */
        values = (char **) palloc(3 * sizeof(char *));
        values[0] = (char *) palloc(16 * sizeof(char));
        values[1] = (char *) palloc(16 * sizeof(char));
        values[2] = (char *) palloc(16 * sizeof(char));

        snprintf(values[0], 16, "%d", 1 * PG_GETARG_INT32(1));
        snprintf(values[1], 16, "%d", 2 * PG_GETARG_INT32(1));
        snprintf(values[2], 16, "%d", 3 * PG_GETARG_INT32(1));

        /* build a tuple */
        tuple = BuildTupleFromCStrings(attinmeta, values);

        /* make the tuple into a datum */
        result = HeapTupleGetDatum(tuple);

        /* clean up (this is not really necessary) */
        pfree(values[0]);
        pfree(values[1]);
        pfree(values[2]);
        pfree(values);

        SRF_RETURN_NEXT(funcctx, result);
    }
    else    /* do when there is no more left */
    {
        SRF_RETURN_DONE(funcctx);
    }
}
]]>
</programlisting>
3106 One way to declare this function in SQL is:
3108 CREATE TYPE __retcomposite AS (f1 integer, f2 integer, f3 integer);
3110 CREATE OR REPLACE FUNCTION retcomposite(integer, integer)
3111 RETURNS SETOF __retcomposite
3112 AS '<replaceable>filename</>', 'retcomposite'
3113 LANGUAGE C IMMUTABLE STRICT;
3115 A different way is to use OUT parameters:
3117 CREATE OR REPLACE FUNCTION retcomposite(IN integer, IN integer,
3118 OUT f1 integer, OUT f2 integer, OUT f3 integer)
3119 RETURNS SETOF record
3120 AS '<replaceable>filename</>', 'retcomposite'
3121 LANGUAGE C IMMUTABLE STRICT;
3123 Notice that in this method the output type of the function is formally
3124 an anonymous <structname>record</> type.
The <link linkend="tablefunc">contrib/tablefunc</>
module in the source distribution contains more examples of
3130 set-returning functions.
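The call-per-item protocol described in this section can also be modeled outside the server, which may help when reasoning about what belongs in first-call setup and what belongs in per-call code. The sketch below is a simplified stand-in, not the <acronym>SRF</acronym> API: <literal>CounterState</literal> and <literal>counter_next</literal> are hypothetical names, the caller-held pointer plays the role of <literal>fn_extra</literal>, and <function>malloc</function> and <function>free</function> stand in for allocation in the multi-call memory context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical cross-call state, playing the role that the
 * function-call context plays for a real set-returning function. */
typedef struct CounterState
{
    unsigned call_cntr;     /* calls that have already returned an item */
    unsigned max_calls;     /* total number of items to return */
    int      step;          /* user data kept between calls */
} CounterState;

/* Called once per item.  *state is NULL on the first call.  Returns
 * true and stores the next item in *result, or returns false (after
 * freeing the state) when the set is exhausted. */
static bool
counter_next(CounterState **state, unsigned count, int step, int *result)
{
    CounterState *st;

    if (*state == NULL)
    {
        /* first call: one-time setup that must outlive this call */
        st = malloc(sizeof(CounterState));
        assert(st != NULL);
        st->call_cntr = 0;
        st->max_calls = count;
        st->step = step;
        *state = st;
    }

    /* every call: pick up the saved state */
    st = *state;

    if (st->call_cntr < st->max_calls)
    {
        *result = (int) (st->call_cntr + 1) * st->step;
        st->call_cntr++;
        return true;                    /* one more item */
    }

    free(st);                           /* done: release the state */
    *state = NULL;
    return false;
}
```

A caller drives the function in a loop until it reports that it is done, much as the executor calls an <acronym>SRF</acronym> repeatedly until it returns through <literal>SRF_RETURN_DONE</literal>.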
3135 <title>Polymorphic Arguments and Return Types</title>
3138 C-language functions can be declared to accept and
3139 return the polymorphic types
3140 <type>anyelement</type>, <type>anyarray</type>, <type>anynonarray</type>,
3141 <type>anyenum</type>, and <type>anyrange</type>.
3142 See <xref linkend="extend-types-polymorphic"> for a more detailed explanation
3143 of polymorphic functions. When function arguments or return types
3144 are defined as polymorphic types, the function author cannot know
3145 in advance what data type it will be called with, or
3146 need to return. There are two routines provided in <filename>fmgr.h</>
3147 to allow a version-1 C function to discover the actual data types
3148 of its arguments and the type it is expected to return. The routines are
3149 called <literal>get_fn_expr_rettype(FmgrInfo *flinfo)</> and
3150 <literal>get_fn_expr_argtype(FmgrInfo *flinfo, int argnum)</>.
3151 They return the result or argument type OID, or <symbol>InvalidOid</symbol> if the
3152 information is not available.
3153 The structure <literal>flinfo</> is normally accessed as
3154 <literal>fcinfo->flinfo</>. The parameter <literal>argnum</>
3155 is zero based. <function>get_call_result_type</> can also be used
3156 as an alternative to <function>get_fn_expr_rettype</>.
3157 There is also <function>get_fn_expr_variadic</>, which can be used to
3158 find out whether variadic arguments have been merged into an array.
3159 This is primarily useful for <literal>VARIADIC "any"</> functions,
3160 since such merging will always have occurred for variadic functions
3161 taking ordinary array types.
3165 For example, suppose we want to write a function to accept a single
3166 element of any type, and return a one-dimensional array of that type:
<programlisting><![CDATA[
PG_FUNCTION_INFO_V1(make_array);
Datum
make_array(PG_FUNCTION_ARGS)
{
    ArrayType  *result;
    Oid         element_type = get_fn_expr_argtype(fcinfo->flinfo, 0);
    Datum       element;
    bool        isnull;
    int16       typlen;
    bool        typbyval;
    char        typalign;
    int         ndims;
    int         dims[MAXDIM];
    int         lbs[MAXDIM];

    if (!OidIsValid(element_type))
        elog(ERROR, "could not determine data type of input");

    /* get the provided element, being careful in case it's NULL */
    isnull = PG_ARGISNULL(0);
    if (isnull)
        element = (Datum) 0;
    else
        element = PG_GETARG_DATUM(0);

    /* we have one dimension */
    ndims = 1;
    /* and one element */
    dims[0] = 1;
    /* and lower bound is 1 */
    lbs[0] = 1;

    /* get required info about the element type */
    get_typlenbyvalalign(element_type, &typlen, &typbyval, &typalign);

    /* now build the array */
    result = construct_md_array(&element, &isnull, ndims, dims, lbs,
                                element_type, typlen, typbyval, typalign);

    PG_RETURN_ARRAYTYPE_P(result);
}
]]>
</programlisting>
3214 The following command declares the function
3215 <function>make_array</function> in SQL:
3218 CREATE FUNCTION make_array(anyelement) RETURNS anyarray
3219 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'make_array'
3220 LANGUAGE C IMMUTABLE;
There is a variant of polymorphism that is only available to C-language
functions: they can be declared to take parameters of type
<literal>"any"</>. (Note that this type name must be double-quoted,
since it's also a SQL reserved word.) This works like
<type>anyelement</> except that it does not constrain different
<literal>"any"</> arguments to be the same type, nor do they help
determine the function's result type. A C-language function can also
declare its final parameter to be <literal>VARIADIC "any"</>. This will
match one or more actual arguments of any type (not necessarily the same
type). These arguments will <emphasis>not</> be gathered into an array
as happens with normal variadic functions; they will just be passed to
the function separately. The <function>PG_NARGS()</> macro and the
methods described above must be used to determine the number of actual
arguments and their types when using this feature. Also, users of such
a function might wish to use the <literal>VARIADIC</> keyword in their
function call, with the expectation that the function would treat the
array elements as separate arguments. The function itself must implement
that behavior if wanted, after using <function>get_fn_expr_variadic</> to
detect that the actual argument was marked with <literal>VARIADIC</>.
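As a rough illustration of the last point, the following self-contained sketch models the decision a <literal>VARIADIC "any"</> function has to make. It is not PostgreSQL code: <type>ToyArg</>, <type>ToyCall</>, and <function>toy_logical_nargs</> are hypothetical names invented for this example. The <structfield>nargs</> field stands in for <function>PG_NARGS()</>, the <structfield>type</> field for the per-argument type lookup, and <structfield>variadic_marked</> for the result of <function>get_fn_expr_variadic</>. The sketch assumes the function chooses to expand a trailing array when the call was marked <literal>VARIADIC</>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type tags; a real function would compare type OIDs instead. */
typedef enum { TOY_INT, TOY_TEXT, TOY_ARRAY } ToyType;

/* One actual argument as the function sees it (stand-in for a Datum and its type). */
typedef struct ToyArg
{
    ToyType type;       /* stand-in for what get_fn_expr_argtype would report */
    size_t  array_len;  /* number of elements; meaningful only for TOY_ARRAY */
} ToyArg;

/* Stand-in for the call information handed to the function. */
typedef struct ToyCall
{
    size_t        nargs;            /* plays the role of PG_NARGS() */
    const ToyArg *args;
    bool          variadic_marked;  /* plays the role of get_fn_expr_variadic() */
} ToyCall;

/*
 * Count the arguments the caller logically supplied.  Without VARIADIC in
 * the call, every actual argument counts once.  With VARIADIC, the final
 * argument is an array whose elements the function itself must treat as
 * separate arguments.
 */
size_t
toy_logical_nargs(const ToyCall *call)
{
    assert(call != NULL);

    if (call->variadic_marked && call->nargs > 0)
    {
        const ToyArg *last = &call->args[call->nargs - 1];

        if (last->type == TOY_ARRAY)
            return (call->nargs - 1) + last->array_len;
    }
    return call->nargs;
}
```

This model only counts the logical arguments; a real function would go on to process each expanded element according to its type.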
<sect2 id="xfunc-transform-functions">
<title>Transform Functions</title>

Some function calls can be simplified during planning based on
properties specific to the function. For example,
<literal>int4mul(n, 1)</> could be simplified to just <literal>n</>.
To define such function-specific optimizations, write a
<firstterm>transform function</> and place its OID in the
<structfield>protransform</> field of the primary function's
<structname>pg_proc</> entry. The transform function must have the SQL
signature <literal>protransform(internal) RETURNS internal</>. The
argument, actually <type>FuncExpr *</>, is a dummy node representing a
call to the primary function. If the transform function's study of the
expression tree proves that a simplified expression tree can substitute
for all possible concrete calls represented thereby, build and return
that simplified expression. Otherwise, return a <literal>NULL</>
pointer (<emphasis>not</> a SQL null).

We make no guarantee that <productname>PostgreSQL</> will never call the
primary function in cases that the transform function could simplify.
Ensure rigorous equivalence between the simplified expression and an
actual call to the primary function.

Currently, this facility is not exposed to users at the SQL level
because of security concerns, so it is only practical to use for
optimizing built-in functions.
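The contract of a transform function (return a proven-equivalent simpler expression, or a <literal>NULL</> pointer) can be sketched on a toy expression tree. This is a minimal model under stated assumptions, not the planner's node API: <type>ToyExpr</> and <function>toy_mul_transform</> are hypothetical stand-ins for <type>FuncExpr *</> and a real transform function. The <literal>1 * n</> case is added here for symmetry and is not taken from the text above.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TOY_CONST, TOY_VAR, TOY_MUL } ToyKind;

/* Hypothetical expression node; the real argument is a FuncExpr node. */
typedef struct ToyExpr
{
    ToyKind         kind;
    int             value;  /* used when kind == TOY_CONST */
    struct ToyExpr *left;   /* operands, used when kind == TOY_MUL */
    struct ToyExpr *right;
} ToyExpr;

static int
is_const_one(const ToyExpr *e)
{
    return e != NULL && e->kind == TOY_CONST && e->value == 1;
}

/*
 * Given a node representing a multiplication call, return a simpler
 * expression that is equivalent for every possible input, or NULL when
 * no simplification can be proven.
 */
ToyExpr *
toy_mul_transform(ToyExpr *call)
{
    assert(call != NULL && call->kind == TOY_MUL);

    if (is_const_one(call->right))
        return call->left;      /* n * 1  ->  n */
    if (is_const_one(call->left))
        return call->right;     /* 1 * n  ->  n */
    return NULL;                /* no proof: keep the original call */
}
```

Returning <literal>NULL</> in the fallback case mirrors the rule above: the caller keeps the original call whenever equivalence cannot be established for all inputs.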
<title>Shared Memory and LWLocks</title>

Add-ins can reserve LWLocks and an allocation of shared memory on server
startup. The add-in's shared library must be preloaded by specifying it in
<xref linkend="guc-shared-preload-libraries"><indexterm><primary>shared_preload_libraries</></>.
Shared memory is reserved by calling:
<programlisting>
void RequestAddinShmemSpace(int size)
</programlisting>
from your <function>_PG_init</> function.

LWLocks are reserved by calling:
<programlisting>
void RequestNamedLWLockTranche(const char *tranche_name, int num_lwlocks)
</programlisting>
from <function>_PG_init</>. This will ensure that an array of
<literal>num_lwlocks</> LWLocks is available under the name
<literal>tranche_name</>. Use <function>GetNamedLWLockTranche</>
to get a pointer to this array.

To avoid possible race conditions, each backend should use the LWLock
<function>AddinShmemInitLock</> when connecting to and initializing
its allocation of shared memory, as shown here:
<programlisting>
static mystruct *ptr = NULL;
if (!ptr)
{
    bool        found;
    LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
    ptr = ShmemInitStruct("my struct name", size, &found);
    if (!found)
    {
        /* initialize contents of shmem area here */
        /* then acquire any requested LWLocks using: */
        ptr->locks = GetNamedLWLockTranche("my tranche name");
    }
    LWLockRelease(AddinShmemInitLock);
}
</programlisting>
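The find-or-create pattern above can be exercised outside the server with a small simulation. This is only a sketch: a static variable stands in for the shared segment, a flag stands in for <function>AddinShmemInitLock</>, and <type>ToyShared</>, <function>toy_shmem_init_struct</>, and <function>toy_attach</> are hypothetical names, not backend functions. It shows that only the first caller initializes the area, while later callers attach to the existing contents.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a struct kept in shared memory. */
typedef struct ToyShared
{
    int counter;
} ToyShared;

/* A static slot simulates the shared segment; it is not real shared memory. */
static ToyShared toy_segment;
static bool      toy_segment_exists = false;
static bool      toy_lock_held = false;     /* simulated initialization lock */
static int       toy_init_runs = 0;         /* how often initialization ran */

static void toy_lock_acquire(void) { assert(!toy_lock_held); toy_lock_held = true; }
static void toy_lock_release(void) { assert(toy_lock_held); toy_lock_held = false; }

/* Find or create the area, reporting through *found whether it already existed. */
static ToyShared *
toy_shmem_init_struct(bool *found)
{
    assert(toy_lock_held);      /* callers must hold the initialization lock */
    *found = toy_segment_exists;
    toy_segment_exists = true;
    return &toy_segment;
}

/* What each simulated backend does when it first needs the shared struct. */
ToyShared *
toy_attach(void)
{
    ToyShared *ptr;
    bool       found;

    toy_lock_acquire();
    ptr = toy_shmem_init_struct(&found);
    if (!found)
    {
        /* only the creator initializes the contents */
        memset(ptr, 0, sizeof(*ptr));
        ptr->counter = 100;
        toy_init_runs++;
    }
    toy_lock_release();
    return ptr;
}
```

Calling <function>toy_attach</> twice returns the same pointer both times, and the initialization branch runs only on the first call.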
<sect2 id="extend-Cpp">
<title>Using C++ for Extensibility</title>

<indexterm zone="extend-Cpp">
<primary>C++</primary>
</indexterm>

Although the <productname>PostgreSQL</productname> backend is written in
C, it is possible to write extensions in C++ if these guidelines are
followed:

All functions accessed by the backend must present a C interface
to the backend; these C functions can then call C++ functions.
For example, <literal>extern "C"</> linkage is required for
backend-accessed functions. This is also necessary for any
functions that are passed as pointers between the backend and
C++ code.

Free memory using the appropriate deallocation method. For example,
most backend memory is allocated using <function>palloc()</>, so use
<function>pfree()</> to free it. Using C++
<function>delete</> in such cases will fail.

Prevent exceptions from propagating into the C code (use a catch-all
block at the top level of all <literal>extern "C"</> functions). This
is necessary even if the C++ code does not explicitly throw any
exceptions, because events like out-of-memory can still throw
exceptions. Any exceptions must be caught and appropriate errors
passed back to the C interface. If possible, compile C++ with
<option>-fno-exceptions</> to eliminate exceptions entirely; in such
cases, you must check for failures in your C++ code, e.g., check for
NULL returned by <function>new()</>.

If calling backend functions from C++ code, be sure that the
C++ call stack contains only plain old data structures
(<acronym>POD</>). This is necessary because backend errors
generate a distant <function>longjmp()</> that does not properly
unwind a C++ call stack with non-POD objects.

In summary, it is best to place C++ code behind a wall of
<literal>extern "C"</> functions that interface to the backend,
and avoid exception, memory, and call stack leakage.
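The call-stack guideline rests on how <function>longjmp()</> behaves: control jumps straight back to the recovery point, and the rest of every intermediate frame is abandoned. The following plain C sketch shows the effect using only the standard <filename>setjmp.h</> facilities. All names in it are hypothetical; <function>toy_raise_error</> merely stands in for a backend function that reports an error. In C++, a destructor in an abandoned frame is skipped in the same way as the cleanup statement here, and jumping over objects with non-trivial destructors is undefined behavior.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf toy_error_context;   /* stand-in for the error-recovery point */
static int     toy_cleanups_run = 0;

/* Stand-in for a backend function that reports an error. */
static void
toy_raise_error(void)
{
    longjmp(toy_error_context, 1);
}

/* Stand-in for extension code with cleanup work placed after the call. */
static void
toy_extension_code(void)
{
    toy_raise_error();
    toy_cleanups_run++;     /* never reached: longjmp abandons this frame */
}

/* Returns 1 if an error was caught at the recovery point, 0 otherwise. */
int
toy_run(void)
{
    if (setjmp(toy_error_context) != 0)
        return 1;           /* control arrives here directly from toy_raise_error */
    toy_extension_code();
    return 0;
}
```

After <function>toy_run</> returns, <varname>toy_cleanups_run</> is still zero, which is why cleanup that depends on normal scope exit cannot be relied upon across a backend error.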