1 <!-- doc/src/sgml/xfunc.sgml -->
4 <title>User-defined Functions</title>
6 <indexterm zone="xfunc">
7 <primary>function</primary>
8 <secondary>user-defined</secondary>
12 <productname>PostgreSQL</productname> provides four kinds of
18 query language functions (functions written in
19 <acronym>SQL</acronym>) (<xref linkend="xfunc-sql">)
24 procedural language functions (functions written in, for
25 example, <application>PL/pgSQL</application> or <application>PL/Tcl</application>)
26 (<xref linkend="xfunc-pl">)
31 internal functions (<xref linkend="xfunc-internal">)
36 C-language functions (<xref linkend="xfunc-c">)
44 of function can take base types, composite types, or
45 combinations of these as arguments (parameters). In addition,
46 every kind of function can return a base type or
47 a composite type. Functions can also be defined to return
48 sets of base or composite values.
52 Many kinds of functions can take or return certain pseudo-types
53 (such as polymorphic types), but the available facilities vary.
54 Consult the description of each kind of function for more details.
58 It's easiest to define <acronym>SQL</acronym>
59 functions, so we'll start by discussing those.
60 Most of the concepts presented for <acronym>SQL</acronym> functions
61 will carry over to the other types of functions.
65 Throughout this chapter, it can be useful to look at the reference
66 page of the <xref linkend="sql-createfunction"> command to
67 understand the examples better. Some examples from this chapter
68 can be found in <filename>funcs.sql</filename> and
69 <filename>funcs.c</filename> in the <filename>src/tutorial</filename>
70 directory in the <productname>PostgreSQL</productname> source
75 <sect1 id="xfunc-sql">
76 <title>Query Language (<acronym>SQL</acronym>) Functions</title>
78 <indexterm zone="xfunc-sql">
79 <primary>function</primary>
80 <secondary>user-defined</secondary>
81 <tertiary>in SQL</tertiary>
85 SQL functions execute an arbitrary list of SQL statements, returning
86 the result of the last query in the list.
87 In the simple (non-set)
88 case, the first row of the last query's result will be returned.
89 (Bear in mind that <quote>the first row</quote> of a multirow
90 result is not well-defined unless you use <literal>ORDER BY</literal>.)
91 If the last query happens
92 to return no rows at all, the null value will be returned.
96 Alternatively, an SQL function can be declared to return a set (that is,
97 multiple rows) by specifying the function's return type as <literal>SETOF
98 <replaceable>sometype</replaceable></literal>, or equivalently by declaring it as
99 <literal>RETURNS TABLE(<replaceable>columns</replaceable>)</literal>. In this case
100 all rows of the last query's result are returned. Further details appear
105 The body of an SQL function must be a list of SQL
106 statements separated by semicolons. A semicolon after the last
107 statement is optional. Unless the function is declared to return
108 <type>void</type>, the last statement must be a <command>SELECT</command>,
109 or an <command>INSERT</command>, <command>UPDATE</command>, or <command>DELETE</command>
110 that has a <literal>RETURNING</literal> clause.
114 Any collection of commands in the <acronym>SQL</acronym>
115 language can be packaged together and defined as a function.
116 Besides <command>SELECT</command> queries, the commands can include data
117 modification queries (<command>INSERT</command>,
118 <command>UPDATE</command>, and <command>DELETE</command>), as well as
119 other SQL commands. (You cannot use transaction control commands, e.g.
120 <command>COMMIT</command>, <command>SAVEPOINT</command>, and some utility
121 commands, e.g. <literal>VACUUM</literal>, in <acronym>SQL</acronym> functions.)
122 However, the final command
123 must be a <command>SELECT</command> or have a <literal>RETURNING</literal>
124 clause that returns whatever is
125 specified as the function's return type. Alternatively, if you
126 want to define a SQL function that performs actions but has no
127 useful value to return, you can define it as returning <type>void</type>.
128 For example, this function removes rows with negative salaries from
129 the <literal>emp</literal> table:
132 CREATE FUNCTION clean_emp() RETURNS void AS '
148 The entire body of a SQL function is parsed before any of it is
149 executed. While a SQL function can contain commands that alter
150 the system catalogs (e.g., <command>CREATE TABLE</command>), the effects
151 of such commands will not be visible during parse analysis of
152 later commands in the function. Thus, for example,
153 <literal>CREATE TABLE foo (...); INSERT INTO foo VALUES(...);</literal>
154 will not work as desired if packaged up into a single SQL function,
155 since <structname>foo</structname> won't exist yet when the <command>INSERT</command>
156 command is parsed. It's recommended to use <application>PL/pgSQL</application>
157 instead of a SQL function in this type of situation.
162 The syntax of the <command>CREATE FUNCTION</command> command requires
163 the function body to be written as a string constant. It is usually
164 most convenient to use dollar quoting (see <xref
165 linkend="sql-syntax-dollar-quoting">) for the string constant.
166 If you choose to use regular single-quoted string constant syntax,
167 you must double single quote marks (<literal>'</literal>) and backslashes
168 (<literal>\</literal>) (assuming escape string syntax) in the body of
169 the function (see <xref linkend="sql-syntax-strings">).
172 <sect2 id="xfunc-sql-function-arguments">
173 <title>Arguments for <acronym>SQL</acronym> Functions</title>
176 <primary>function</primary>
177 <secondary>named argument</secondary>
181 Arguments of a SQL function can be referenced in the function
182 body using either names or numbers. Examples of both methods appear
187 To use a name, declare the function argument as having a name, and
188 then just write that name in the function body. If the argument name
189 is the same as any column name in the current SQL command within the
190 function, the column name will take precedence. To override this,
191 qualify the argument name with the name of the function itself, that is
192 <literal><replaceable>function_name</replaceable>.<replaceable>argument_name</replaceable></literal>.
193 (If this would conflict with a qualified column name, again the column
194 name wins. You can avoid the ambiguity by choosing a different alias for
195 the table within the SQL command.)
199 In the older numeric approach, arguments are referenced using the syntax
200 <literal>$<replaceable>n</replaceable></literal>: <literal>$1</literal> refers to the first input
201 argument, <literal>$2</literal> to the second, and so on. This will work
202 whether or not the particular argument was declared with a name.
206 If an argument is of a composite type, then the dot notation,
207 e.g., <literal><replaceable>argname</replaceable>.<replaceable>fieldname</replaceable></literal> or
208 <literal>$1.<replaceable>fieldname</replaceable></literal>, can be used to access attributes of the
209 argument. Again, you might need to qualify the argument's name with the
210 function name to make the form with an argument name unambiguous.
214 SQL function arguments can only be used as data values,
215 not as identifiers. Thus for example this is reasonable:
217 INSERT INTO mytable VALUES ($1);
219 but this will not work:
221 INSERT INTO $1 VALUES (42);
227 The ability to use names to reference SQL function arguments was added
228 in <productname>PostgreSQL</productname> 9.2. Functions to be used in
229 older servers must use the <literal>$<replaceable>n</replaceable></literal> notation.
234 <sect2 id="xfunc-sql-base-functions">
235 <title><acronym>SQL</acronym> Functions on Base Types</title>
238 The simplest possible <acronym>SQL</acronym> function has no arguments and
239 simply returns a base type, such as <type>integer</type>:
242 CREATE FUNCTION one() RETURNS integer AS $$
246 -- Alternative syntax for string literal:
247 CREATE FUNCTION one() RETURNS integer AS '
260 Notice that we defined a column alias within the function body for the result of the function
261 (with the name <literal>result</literal>), but this column alias is not visible
262 outside the function. Hence, the result is labeled <literal>one</literal>
263 instead of <literal>result</literal>.
267 It is almost as easy to define <acronym>SQL</acronym> functions
268 that take base types as arguments:
271 CREATE FUNCTION add_em(x integer, y integer) RETURNS integer AS $$
275 SELECT add_em(1, 2) AS answer;
284 Alternatively, we could dispense with names for the arguments and
288 CREATE FUNCTION add_em(integer, integer) RETURNS integer AS $$
292 SELECT add_em(1, 2) AS answer;
301 Here is a more useful function, which might be used to debit a
305 CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
307 SET balance = balance - debit
308 WHERE accountno = tf1.accountno;
313 A user could execute this function to debit account 17 by $100.00 as
317 SELECT tf1(17, 100.0);
322 In this example, we chose the name <literal>accountno</literal> for the first
323 argument, but this is the same as the name of a column in the
324 <literal>bank</literal> table. Within the <command>UPDATE</command> command,
325 <literal>accountno</literal> refers to the column <literal>bank.accountno</literal>,
326 so <literal>tf1.accountno</literal> must be used to refer to the argument.
327 We could of course avoid this by using a different name for the argument.
331 In practice one would probably like a more useful result from the
332 function than a constant 1, so a more likely definition
336 CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
338 SET balance = balance - debit
339 WHERE accountno = tf1.accountno;
340 SELECT balance FROM bank WHERE accountno = tf1.accountno;
344 which adjusts the balance and returns the new balance.
345 The same thing could be done in one command using <literal>RETURNING</literal>:
348 CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
350 SET balance = balance - debit
351 WHERE accountno = tf1.accountno
358 <sect2 id="xfunc-sql-composite-functions">
359 <title><acronym>SQL</acronym> Functions on Composite Types</title>
362 When writing functions with arguments of composite types, we must not
363 only specify which argument we want but also the desired attribute
364 (field) of that argument. For example, suppose that
365 <type>emp</type> is a table containing employee data, and therefore
366 also the name of the composite type of each row of the table. Here
367 is a function <function>double_salary</function> that computes what someone's
368 salary would be if it were doubled:
378 INSERT INTO emp VALUES ('Bill', 4200, 45, '(2,1)');
380 CREATE FUNCTION double_salary(emp) RETURNS numeric AS $$
381 SELECT $1.salary * 2 AS salary;
384 SELECT name, double_salary(emp.*) AS dream
386 WHERE emp.cubicle ~= point '(2,1)';
395 Notice the use of the syntax <literal>$1.salary</literal>
396 to select one field of the argument row value. Also notice
397 how the calling <command>SELECT</command> command
398 uses <replaceable>table_name</replaceable><literal>.*</literal> to select
399 the entire current row of a table as a composite value. The table
400 row can alternatively be referenced using just the table name,
403 SELECT name, double_salary(emp) AS dream
405 WHERE emp.cubicle ~= point '(2,1)';
407 but this usage is deprecated since it's easy to get confused.
408 (See <xref linkend="rowtypes-usage"> for details about these
409 two notations for the composite value of a table row.)
413 Sometimes it is handy to construct a composite argument value
414 on-the-fly. This can be done with the <literal>ROW</literal> construct.
415 For example, we could adjust the data being passed to the function:
417 SELECT name, double_salary(ROW(name, salary*1.1, age, cubicle)) AS dream
423 It is also possible to build a function that returns a composite type.
424 This is an example of a function
425 that returns a single <type>emp</type> row:
428 CREATE FUNCTION new_emp() RETURNS emp AS $$
429 SELECT text 'None' AS name,
432 point '(2,2)' AS cubicle;
436 In this example we have specified each of the attributes
437 with a constant value, but any computation
438 could have been substituted for these constants.
442 Note two important things about defining the function:
447 The select list order in the query must be exactly the same as
448 that in which the columns appear in the table associated
449 with the composite type. (Naming the columns, as we did above,
450 is irrelevant to the system.)
455 You must typecast the expressions to match the
456 definition of the composite type, or you will get errors like this:
459 ERROR: function declared to return emp returns varchar instead of text at column 1
468 A different way to define the same function is:
471 CREATE FUNCTION new_emp() RETURNS emp AS $$
472 SELECT ROW('None', 1000.0, 25, '(2,2)')::emp;
476 Here we wrote a <command>SELECT</command> that returns just a single
477 column of the correct composite type. This isn't really better
478 in this situation, but it is a handy alternative in some cases
479 — for example, if we need to compute the result by calling
480 another function that returns the desired composite value.
484 We could call this function directly either by using it in
491 --------------------------
492 (None,1000.0,25,"(2,2)")
495 or by calling it as a table function:
498 SELECT * FROM new_emp();
500 name | salary | age | cubicle
501 ------+--------+-----+---------
502 None | 1000.0 | 25 | (2,2)
505 The second way is described more fully in <xref
506 linkend="xfunc-sql-table-functions">.
510 When you use a function that returns a composite type,
511 you might want only one field (attribute) from its result.
512 You can do that with syntax like this:
515 SELECT (new_emp()).name;
522 The extra parentheses are needed to keep the parser from getting
523 confused. If you try to do it without them, you get something like this:
526 SELECT new_emp().name;
527 ERROR: syntax error at or near "."
528 LINE 1: SELECT new_emp().name;
534 Another option is to use functional notation for extracting an attribute:
537 SELECT name(new_emp());
544 As explained in <xref linkend="rowtypes-usage">, the field notation and
545 functional notation are equivalent.
549 Another way to use a function returning a composite type is to pass the
550 result to another function that accepts the correct row type as input:
553 CREATE FUNCTION getname(emp) RETURNS text AS $$
557 SELECT getname(new_emp());
566 <sect2 id="xfunc-output-parameters">
567 <title><acronym>SQL</acronym> Functions with Output Parameters</title>
570 <primary>function</primary>
571 <secondary>output parameter</secondary>
575 An alternative way of describing a function's results is to define it
576 with <firstterm>output parameters</firstterm>, as in this example:
579 CREATE FUNCTION add_em (IN x int, IN y int, OUT sum int)
590 This is not essentially different from the version of <literal>add_em</literal>
591 shown in <xref linkend="xfunc-sql-base-functions">. The real value of
592 output parameters is that they provide a convenient way of defining
593 functions that return several columns. For example,
596 CREATE FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int)
597 AS 'SELECT x + y, x * y'
600 SELECT * FROM sum_n_product(11,42);
607 What has essentially happened here is that we have created an anonymous
608 composite type for the result of the function. The above example has
609 the same end result as
612 CREATE TYPE sum_prod AS (sum int, product int);
614 CREATE FUNCTION sum_n_product (int, int) RETURNS sum_prod
615 AS 'SELECT $1 + $2, $1 * $2'
619 but not having to bother with the separate composite type definition
620 is often handy. Notice that the names attached to the output parameters
621 are not just decoration, but determine the column names of the anonymous
622 composite type. (If you omit a name for an output parameter, the
623 system will choose a name on its own.)
627 Notice that output parameters are not included in the calling argument
628 list when invoking such a function from SQL. This is because
629 <productname>PostgreSQL</productname> considers only the input
630 parameters to define the function's calling signature. That means
631 also that only the input parameters matter when referencing the function
632 for purposes such as dropping it. We could drop the above function
636 DROP FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int);
637 DROP FUNCTION sum_n_product (int, int);
642 Parameters can be marked as <literal>IN</literal> (the default),
643 <literal>OUT</literal>, <literal>INOUT</literal>, or <literal>VARIADIC</literal>.
644 An <literal>INOUT</literal>
645 parameter serves as both an input parameter (part of the calling
646 argument list) and an output parameter (part of the result record type).
647 <literal>VARIADIC</literal> parameters are input parameters, but are treated
648 specially as described next.
652 <sect2 id="xfunc-sql-variadic-functions">
653 <title><acronym>SQL</acronym> Functions with Variable Numbers of Arguments</title>
656 <primary>function</primary>
657 <secondary>variadic</secondary>
661 <primary>variadic function</primary>
665 <acronym>SQL</acronym> functions can be declared to accept
666 variable numbers of arguments, so long as all the <quote>optional</quote>
667 arguments are of the same data type. The optional arguments will be
668 passed to the function as an array. The function is declared by
669 marking the last parameter as <literal>VARIADIC</literal>; this parameter
670 must be declared as being of an array type. For example:
673 CREATE FUNCTION mleast(VARIADIC arr numeric[]) RETURNS numeric AS $$
674 SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
677 SELECT mleast(10, -1, 5, 4.4);
684 Effectively, all the actual arguments at or beyond the
685 <literal>VARIADIC</literal> position are gathered up into a one-dimensional
686 array, as if you had written
689 SELECT mleast(ARRAY[10, -1, 5, 4.4]); -- doesn't work
692 You can't actually write that, though — or at least, it will
693 not match this function definition. A parameter marked
694 <literal>VARIADIC</literal> matches one or more occurrences of its element
695 type, not of its own type.
699 Sometimes it is useful to be able to pass an already-constructed array
700 to a variadic function; this is particularly handy when one variadic
701 function wants to pass on its array parameter to another one. You can
702 do that by specifying <literal>VARIADIC</literal> in the call:
705 SELECT mleast(VARIADIC ARRAY[10, -1, 5, 4.4]);
708 This prevents expansion of the function's variadic parameter into its
709 element type, thereby allowing the array argument value to match
710 normally. <literal>VARIADIC</literal> can only be attached to the last
711 actual argument of a function call.
715 Specifying <literal>VARIADIC</literal> in the call is also the only way to
716 pass an empty array to a variadic function, for example:
719 SELECT mleast(VARIADIC ARRAY[]::numeric[]);
722 Simply writing <literal>SELECT mleast()</literal> does not work because a
723 variadic parameter must match at least one actual argument.
724 (You could define a second function also named <literal>mleast</literal>,
725 with no parameters, if you wanted to allow such calls.)
729 The array element parameters generated from a variadic parameter are
730 treated as not having any names of their own. This means it is not
731 possible to call a variadic function using named arguments (<xref
732 linkend="sql-syntax-calling-funcs">), except when you specify
733 <literal>VARIADIC</literal>. For example, this will work:
736 SELECT mleast(VARIADIC arr => ARRAY[10, -1, 5, 4.4]);
742 SELECT mleast(arr => 10);
743 SELECT mleast(arr => ARRAY[10, -1, 5, 4.4]);
748 <sect2 id="xfunc-sql-parameter-defaults">
749 <title><acronym>SQL</acronym> Functions with Default Values for Arguments</title>
752 <primary>function</primary>
753 <secondary>default values for arguments</secondary>
757 Functions can be declared with default values for some or all input
758 arguments. The default values are inserted whenever the function is
759 called with insufficiently many actual arguments. Since arguments
760 can only be omitted from the end of the actual argument list, all
761 parameters after a parameter with a default value have to have
762 default values as well. (Although the use of named argument notation
763 could allow this restriction to be relaxed, it's still enforced so that
764 positional argument notation works sensibly.)
770 CREATE FUNCTION foo(a int, b int DEFAULT 2, c int DEFAULT 3)
777 SELECT foo(10, 20, 30);
795 SELECT foo(); -- fails since there is no default for the first argument
796 ERROR: function foo() does not exist
798 The <literal>=</literal> sign can also be used in place of the
799 key word <literal>DEFAULT</literal>.
803 <sect2 id="xfunc-sql-table-functions">
804 <title><acronym>SQL</acronym> Functions as Table Sources</title>
807 All SQL functions can be used in the <literal>FROM</literal> clause of a query,
808 but it is particularly useful for functions returning composite types.
809 If the function is defined to return a base type, the table function
810 produces a one-column table. If the function is defined to return
811 a composite type, the table function produces a column for each attribute
812 of the composite type.
819 CREATE TABLE foo (fooid int, foosubid int, fooname text);
820 INSERT INTO foo VALUES (1, 1, 'Joe');
821 INSERT INTO foo VALUES (1, 2, 'Ed');
822 INSERT INTO foo VALUES (2, 1, 'Mary');
824 CREATE FUNCTION getfoo(int) RETURNS foo AS $$
825 SELECT * FROM foo WHERE fooid = $1;
828 SELECT *, upper(fooname) FROM getfoo(1) AS t1;
830 fooid | foosubid | fooname | upper
831 -------+----------+---------+-------
836 As the example shows, we can work with the columns of the function's
837 result just the same as if they were columns of a regular table.
841 Note that we only got one row out of the function. This is because
842 we did not use <literal>SETOF</literal>. That is described in the next section.
846 <sect2 id="xfunc-sql-functions-returning-set">
847 <title><acronym>SQL</acronym> Functions Returning Sets</title>
850 <primary>function</primary>
851 <secondary>with SETOF</secondary>
855 When an SQL function is declared as returning <literal>SETOF
856 <replaceable>sometype</replaceable></literal>, the function's final
857 query is executed to completion, and each row it
858 outputs is returned as an element of the result set.
862 This feature is normally used when calling the function in the <literal>FROM</literal>
863 clause. In this case each row returned by the function becomes
864 a row of the table seen by the query. For example, assume that
865 table <literal>foo</literal> has the same contents as above, and we say:
868 CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS $$
869 SELECT * FROM foo WHERE fooid = $1;
872 SELECT * FROM getfoo(1) AS t1;
877 fooid | foosubid | fooname
878 -------+----------+---------
886 It is also possible to return multiple rows with the columns defined by
887 output parameters, like this:
890 CREATE TABLE tab (y int, z int);
891 INSERT INTO tab VALUES (1, 2), (3, 4), (5, 6), (7, 8);
893 CREATE FUNCTION sum_n_product_with_tab (x int, OUT sum int, OUT product int)
896 SELECT $1 + tab.y, $1 * tab.y FROM tab;
899 SELECT * FROM sum_n_product_with_tab(10);
909 The key point here is that you must write <literal>RETURNS SETOF record</literal>
910 to indicate that the function returns multiple rows instead of just one.
911 If there is only one output parameter, write that parameter's type
912 instead of <type>record</type>.
916 It is frequently useful to construct a query's result by invoking a
917 set-returning function multiple times, with the parameters for each
918 invocation coming from successive rows of a table or subquery. The
919 preferred way to do this is to use the <literal>LATERAL</literal> key word,
920 which is described in <xref linkend="queries-lateral">.
921 Here is an example using a set-returning function to enumerate
922 elements of a tree structure:
936 CREATE FUNCTION listchildren(text) RETURNS SETOF text AS $$
937 SELECT name FROM nodes WHERE parent = $1
938 $$ LANGUAGE SQL STABLE;
940 SELECT * FROM listchildren('Top');
948 SELECT name, child FROM nodes, LATERAL listchildren(name) AS child;
959 This example does not do anything that we couldn't have done with a
960 simple join, but in more complex calculations the option to put
961 some of the work into a function can be quite convenient.
965 Functions returning sets can also be called in the select list
966 of a query. For each row that the query
967 generates by itself, the set-returning function is invoked, and an output
968 row is generated for each element of the function's result set.
969 The previous example could also be done with queries like
973 SELECT listchildren('Top');
981 SELECT name, listchildren(name) FROM nodes;
983 --------+--------------
992 In the last <command>SELECT</command>,
993 notice that no output row appears for <literal>Child2</literal>, <literal>Child3</literal>, etc.
994 This happens because <function>listchildren</function> returns an empty set
995 for those arguments, so no result rows are generated. This is the same
996 behavior as we got from an inner join to the function result when using
997 the <literal>LATERAL</literal> syntax.
1001 <productname>PostgreSQL</productname>'s behavior for a set-returning function in a
1002 query's select list is almost exactly the same as if the set-returning
1003 function had been written in a <literal>LATERAL FROM</literal>-clause item
1004 instead. For example,
1006 SELECT x, generate_series(1,5) AS g FROM tab;
1008 is almost equivalent to
1010 SELECT x, g FROM tab, LATERAL generate_series(1,5) AS g;
1012 It would be exactly the same, except that in this specific example,
1013 the planner could choose to put <structname>g</structname> on the outside of the
1014 nestloop join, since <structname>g</structname> has no actual lateral dependency
1015 on <structname>tab</structname>. That would result in a different output row
1016 order. Set-returning functions in the select list are always evaluated
1017 as though they are on the inside of a nestloop join with the rest of
1018 the <literal>FROM</literal> clause, so that the function(s) are run to
1019 completion before the next row from the <literal>FROM</literal> clause is
1024 If there is more than one set-returning function in the query's select
1025 list, the behavior is similar to what you get from putting the functions
1026 into a single <literal>LATERAL ROWS FROM( ... )</literal> <literal>FROM</literal>-clause
1027 item. For each row from the underlying query, there is an output row
1028 using the first result from each function, then an output row using the
1029 second result, and so on. If some of the set-returning functions
1030 produce fewer outputs than others, null values are substituted for the
1031 missing data, so that the total number of rows emitted for one
1032 underlying row is the same as for the set-returning function that
1033 produced the most outputs. Thus the set-returning functions
1034 run <quote>in lockstep</quote> until they are all exhausted, and then
1035 execution continues with the next underlying row.
1039 Set-returning functions can be nested in a select list, although that is
1040 not allowed in <literal>FROM</literal>-clause items. In such cases, each level
1041 of nesting is treated separately, as though it were
1042 a separate <literal>LATERAL ROWS FROM( ... )</literal> item. For example, in
1044 SELECT srf1(srf2(x), srf3(y)), srf4(srf5(z)) FROM tab;
1046 the set-returning functions <function>srf2</function>, <function>srf3</function>,
1047 and <function>srf5</function> would be run in lockstep for each row
1048 of <structname>tab</structname>, and then <function>srf1</function> and <function>srf4</function>
1049 would be applied in lockstep to each row produced by the lower
1054 Set-returning functions cannot be used within conditional-evaluation
1055 constructs, such as <literal>CASE</literal> or <literal>COALESCE</literal>. For
1058 SELECT x, CASE WHEN x > 0 THEN generate_series(1, 5) ELSE 0 END FROM tab;
1060 It might seem that this should produce five repetitions of input rows
1061 that have <literal>x > 0</literal>, and a single repetition of those that do
1062 not; but actually, because <function>generate_series(1, 5)</function> would be
1063 run in an implicit <literal>LATERAL FROM</literal> item before
1064 the <literal>CASE</literal> expression is ever evaluated, it would produce five
1065 repetitions of every input row. To reduce confusion, such cases produce
1066 a parse-time error instead.
1071 If a function's last command is <command>INSERT</command>, <command>UPDATE</command>,
1072 or <command>DELETE</command> with <literal>RETURNING</literal>, that command will
1073 always be executed to completion, even if the function is not declared
1074 with <literal>SETOF</literal> or the calling query does not fetch all the
1075 result rows. Any extra rows produced by the <literal>RETURNING</literal>
1076 clause are silently dropped, but the commanded table modifications
1077 still happen (and are all completed before returning from the function).
1083 Before <productname>PostgreSQL</productname> 10, putting more than one
1084 set-returning function in the same select list did not behave very
1085 sensibly unless they always produced equal numbers of rows. Otherwise,
1086 what you got was a number of output rows equal to the least common
1087 multiple of the numbers of rows produced by the set-returning
1088 functions. Also, nested set-returning functions did not work as
1089 described above; instead, a set-returning function could have at most
1090 one set-returning argument, and each nest of set-returning functions
1091 was run independently. Also, conditional execution (set-returning
1092 functions inside <literal>CASE</literal> etc) was previously allowed,
1093 complicating things even more.
1094 Use of the <literal>LATERAL</literal> syntax is recommended when writing
1095 queries that need to work in older <productname>PostgreSQL</productname> versions,
1096 because that will give consistent results across different versions.
1097 If you have a query that is relying on conditional execution of a
1098 set-returning function, you may be able to fix it by moving the
1099 conditional test into a custom set-returning function. For example,
1101 SELECT x, CASE WHEN y > 0 THEN generate_series(1, z) ELSE 5 END FROM tab;
1105 CREATE FUNCTION case_generate_series(cond bool, start int, fin int, els int)
1106 RETURNS SETOF int AS $$
1109 RETURN QUERY SELECT generate_series(start, fin);
1111 RETURN QUERY SELECT els;
1113 END$$ LANGUAGE plpgsql;
1115 SELECT x, case_generate_series(y > 0, 1, z, 5) FROM tab;
1117 This formulation will work the same in all versions
1118 of <productname>PostgreSQL</productname>.
1123 <sect2 id="xfunc-sql-functions-returning-table">
1124 <title><acronym>SQL</acronym> Functions Returning <literal>TABLE</literal></title>
1127 <primary>function</primary>
1128 <secondary>RETURNS TABLE</secondary>
1132 There is another way to declare a function as returning a set,
1133 which is to use the syntax
1134 <literal>RETURNS TABLE(<replaceable>columns</replaceable>)</literal>.
1135 This is equivalent to using one or more <literal>OUT</literal> parameters plus
1136 marking the function as returning <literal>SETOF record</literal> (or
1137 <literal>SETOF</literal> a single output parameter's type, as appropriate).
1138 This notation is specified in recent versions of the SQL standard, and
1139 thus may be more portable than using <literal>SETOF</literal>.
1143 For example, the preceding sum-and-product example could also be
1147 CREATE FUNCTION sum_n_product_with_tab (x int)
1148 RETURNS TABLE(sum int, product int) AS $$
1149 SELECT $1 + tab.y, $1 * tab.y FROM tab;
1153 It is not allowed to use explicit <literal>OUT</literal> or <literal>INOUT</literal>
1154 parameters with the <literal>RETURNS TABLE</literal> notation — you must
1155 put all the output columns in the <literal>TABLE</literal> list.
1160 <title>Polymorphic <acronym>SQL</acronym> Functions</title>
1163 <acronym>SQL</acronym> functions can be declared to accept and
1164 return the polymorphic types <type>anyelement</type>,
1165 <type>anyarray</type>, <type>anynonarray</type>,
1166 <type>anyenum</type>, and <type>anyrange</type>. See <xref
1167 linkend="extend-types-polymorphic"> for a more detailed
1168 explanation of polymorphic functions. Here is a polymorphic
1169 function <function>make_array</function> that builds up an array
1170 from two arbitrary data type elements:
1172 CREATE FUNCTION make_array(anyelement, anyelement) RETURNS anyarray AS $$
1173 SELECT ARRAY[$1, $2];
1176 SELECT make_array(1, 2) AS intarray, make_array('a'::text, 'b') AS textarray;
1177 intarray | textarray
1178 ----------+-----------
1185 Notice the use of the typecast <literal>'a'::text</literal>
1186 to specify that the argument is of type <type>text</type>. This is
1187 required if the argument is just a string literal, since otherwise
1188 it would be treated as type
1189 <type>unknown</type>, and array of <type>unknown</type> is not a valid
1191 Without the typecast, you will get errors like this:
1194 ERROR: could not determine polymorphic type because input has type "unknown"
1200 It is permitted to have polymorphic arguments with a fixed
1201 return type, but the converse is not. For example:
1203 CREATE FUNCTION is_greater(anyelement, anyelement) RETURNS boolean AS $$
1207 SELECT is_greater(1, 2);
1213 CREATE FUNCTION invalid_func() RETURNS anyelement AS $$
1216 ERROR: cannot determine result data type
1217 DETAIL: A function returning a polymorphic type must have at least one polymorphic argument.
1222 Polymorphism can be used with functions that have output arguments.
1225 CREATE FUNCTION dup (f1 anyelement, OUT f2 anyelement, OUT f3 anyarray)
1226 AS 'select $1, array[$1,$1]' LANGUAGE SQL;
1228 SELECT * FROM dup(22);
1237 Polymorphism can also be used with variadic functions.
1240 CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
1241 SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
1244 SELECT anyleast(10, -1, 5, 4);
1250 SELECT anyleast('abc'::text, 'def');
1256 CREATE FUNCTION concat_values(text, VARIADIC anyarray) RETURNS text AS $$
1257 SELECT array_to_string($2, $1);
1260 SELECT concat_values('|', 1, 4, 2);
1270 <title><acronym>SQL</acronym> Functions with Collations</title>
1273 <primary>collation</primary>
1274 <secondary>in SQL functions</secondary>
1278 When a SQL function has one or more parameters of collatable data types,
1279 a collation is identified for each function call depending on the
1280 collations assigned to the actual arguments, as described in <xref
1281 linkend="collation">. If a collation is successfully identified
1282 (i.e., there are no conflicts of implicit collations among the arguments)
1283 then all the collatable parameters are treated as having that collation
1284 implicitly. This will affect the behavior of collation-sensitive
1285 operations within the function. For example, using the
1286 <function>anyleast</function> function described above, the result of
1288 SELECT anyleast('abc'::text, 'ABC');
1290 will depend on the database's default collation. In <literal>C</literal> locale
1291 the result will be <literal>ABC</literal>, but in many other locales it will
1292 be <literal>abc</literal>. The collation to use can be forced by adding
1293 a <literal>COLLATE</literal> clause to any of the arguments, for example
1295 SELECT anyleast('abc'::text, 'ABC' COLLATE "C");
1297 Alternatively, if you wish a function to operate with a particular
1298 collation regardless of what it is called with, insert
1299 <literal>COLLATE</literal> clauses as needed in the function definition.
1300 This version of <function>anyleast</function> would always use <literal>en_US</literal>
1301 locale to compare strings:
1303 CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
1304 SELECT min($1[i] COLLATE "en_US") FROM generate_subscripts($1, 1) g(i);
1307 But note that this will throw an error if applied to a non-collatable
1312 If no common collation can be identified among the actual arguments,
1313 then a SQL function treats its parameters as having their data types'
1314 default collation (which is usually the database's default collation,
1315 but could be different for parameters of domain types).
1319 The behavior of collatable parameters can be thought of as a limited
1320 form of polymorphism, applicable only to textual data types.
1325 <sect1 id="xfunc-overload">
1326 <title>Function Overloading</title>
1328 <indexterm zone="xfunc-overload">
1329 <primary>overloading</primary>
1330 <secondary>functions</secondary>
1334 More than one function can be defined with the same SQL name, so long
1335 as the arguments they take are different. In other words,
1336 function names can be <firstterm>overloaded</firstterm>. When a
1337 query is executed, the server will determine which function to
1338 call from the data types and the number of the provided arguments.
1339 Overloading can also be used to simulate functions with a variable
1340 number of arguments, up to a finite maximum number.
1344 When creating a family of overloaded functions, one should be
1345 careful not to create ambiguities. For instance, given the
1348 CREATE FUNCTION test(int, real) RETURNS ...
1349 CREATE FUNCTION test(smallint, double precision) RETURNS ...
1351 it is not immediately clear which function would be called with
1352 some trivial input like <literal>test(1, 1.5)</literal>. The
1353 currently implemented resolution rules are described in
1354 <xref linkend="typeconv">, but it is unwise to design a system that subtly
1355 relies on this behavior.
1359 A function that takes a single argument of a composite type should
1360 generally not have the same name as any attribute (field) of that type.
1361 Recall that <literal><replaceable>attribute</replaceable>(<replaceable>table</replaceable>)</literal>
1362 is considered equivalent
1363 to <literal><replaceable>table</replaceable>.<replaceable>attribute</replaceable></literal>.
1364 In the case that there is an
1365 ambiguity between a function on a composite type and an attribute of
1366 the composite type, the attribute will always be used. It is possible
1367 to override that choice by schema-qualifying the function name
1368 (that is, <literal><replaceable>schema</replaceable>.<replaceable>func</replaceable>(<replaceable>table</replaceable>)
1369 </literal>) but it's better to
1370 avoid the problem by not choosing conflicting names.
1374 Another possible conflict is between variadic and non-variadic functions.
1375 For instance, it is possible to create both <literal>foo(numeric)</literal> and
1376 <literal>foo(VARIADIC numeric[])</literal>. In this case it is unclear which one
1377 should be matched to a call providing a single numeric argument, such as
1378 <literal>foo(10.1)</literal>. The rule is that the function appearing
1379 earlier in the search path is used, or if the two functions are in the
1380 same schema, the non-variadic one is preferred.
1384 When overloading C-language functions, there is an additional
1385 constraint: The C name of each function in the family of
1386 overloaded functions must be different from the C names of all
1387 other functions, either internal or dynamically loaded. If this
1388 rule is violated, the behavior is not portable. You might get a
1389 run-time linker error, or one of the functions will get called
1390 (usually the internal one). The alternative form of the
1391 <literal>AS</literal> clause for the SQL <command>CREATE
1392 FUNCTION</command> command decouples the SQL function name from
1393 the function name in the C source code. For instance:
1395 CREATE FUNCTION test(int) RETURNS int
1396 AS '<replaceable>filename</replaceable>', 'test_1arg'
1398 CREATE FUNCTION test(int, int) RETURNS int
1399 AS '<replaceable>filename</replaceable>', 'test_2arg'
1402 The names of the C functions here reflect one of many possible conventions.
1406 <sect1 id="xfunc-volatility">
1407 <title>Function Volatility Categories</title>
1409 <indexterm zone="xfunc-volatility">
1410 <primary>volatility</primary>
1411 <secondary>functions</secondary>
1413 <indexterm zone="xfunc-volatility">
1414 <primary>VOLATILE</primary>
1416 <indexterm zone="xfunc-volatility">
1417 <primary>STABLE</primary>
1419 <indexterm zone="xfunc-volatility">
1420 <primary>IMMUTABLE</primary>
1424 Every function has a <firstterm>volatility</firstterm> classification, with
1425 the possibilities being <literal>VOLATILE</literal>, <literal>STABLE</literal>, or
1426 <literal>IMMUTABLE</literal>. <literal>VOLATILE</literal> is the default if the
1427 <xref linkend="sql-createfunction">
1428 command does not specify a category. The volatility category is a
1429 promise to the optimizer about the behavior of the function:
1434 A <literal>VOLATILE</literal> function can do anything, including modifying
1435 the database. It can return different results on successive calls with
1436 the same arguments. The optimizer makes no assumptions about the
1437 behavior of such functions. A query using a volatile function will
1438 re-evaluate the function at every row where its value is needed.
1443 A <literal>STABLE</literal> function cannot modify the database and is
1444 guaranteed to return the same results given the same arguments
1445 for all rows within a single statement. This category allows the
1446 optimizer to optimize multiple calls of the function to a single
1447 call. In particular, it is safe to use an expression containing
1448 such a function in an index scan condition. (Since an index scan
1449 will evaluate the comparison value only once, not once at each
1450 row, it is not valid to use a <literal>VOLATILE</literal> function in an
1451 index scan condition.)
1456 An <literal>IMMUTABLE</literal> function cannot modify the database and is
1457 guaranteed to return the same results given the same arguments forever.
1458 This category allows the optimizer to pre-evaluate the function when
1459 a query calls it with constant arguments. For example, a query like
1460 <literal>SELECT ... WHERE x = 2 + 2</literal> can be simplified on sight to
1461 <literal>SELECT ... WHERE x = 4</literal>, because the function underlying
1462 the integer addition operator is marked <literal>IMMUTABLE</literal>.
1469 For best optimization results, you should label your functions with the
1470 strictest volatility category that is valid for them.
1474 Any function with side-effects <emphasis>must</emphasis> be labeled
1475 <literal>VOLATILE</literal>, so that calls to it cannot be optimized away.
1476 Even a function with no side-effects needs to be labeled
1477 <literal>VOLATILE</literal> if its value can change within a single query;
1478 some examples are <literal>random()</literal>, <literal>currval()</literal>,
1479 <literal>timeofday()</literal>.
1483 Another important example is that the <function>current_timestamp</function>
1484 family of functions qualify as <literal>STABLE</literal>, since their values do
1485 not change within a transaction.
1489 There is relatively little difference between <literal>STABLE</literal> and
1490 <literal>IMMUTABLE</literal> categories when considering simple interactive
1491 queries that are planned and immediately executed: it doesn't matter
1492 a lot whether a function is executed once during planning or once during
1493 query execution startup. But there is a big difference if the plan is
1494 saved and reused later. Labeling a function <literal>IMMUTABLE</literal> when
1495 it really isn't might allow it to be prematurely folded to a constant during
1496 planning, resulting in a stale value being re-used during subsequent uses
1497 of the plan. This is a hazard when using prepared statements or when
1498 using function languages that cache plans (such as
1499 <application>PL/pgSQL</application>).
1503 For functions written in SQL or in any of the standard procedural
1504 languages, there is a second important property determined by the
1505 volatility category, namely the visibility of any data changes that have
1506 been made by the SQL command that is calling the function. A
1507 <literal>VOLATILE</literal> function will see such changes, a <literal>STABLE</literal>
1508 or <literal>IMMUTABLE</literal> function will not. This behavior is implemented
1509 using the snapshotting behavior of MVCC (see <xref linkend="mvcc">):
1510 <literal>STABLE</literal> and <literal>IMMUTABLE</literal> functions use a snapshot
1511 established as of the start of the calling query, whereas
1512 <literal>VOLATILE</literal> functions obtain a fresh snapshot at the start of
1513 each query they execute.
1518 Functions written in C can manage snapshots however they want, but it's
1519 usually a good idea to make C functions work this way too.
1524 Because of this snapshotting behavior,
1525 a function containing only <command>SELECT</command> commands can safely be
1526 marked <literal>STABLE</literal>, even if it selects from tables that might be
1527 undergoing modifications by concurrent queries.
1528 <productname>PostgreSQL</productname> will execute all commands of a
1529 <literal>STABLE</literal> function using the snapshot established for the
1530 calling query, and so it will see a fixed view of the database throughout
1535 The same snapshotting behavior is used for <command>SELECT</command> commands
1536 within <literal>IMMUTABLE</literal> functions. It is generally unwise to select
1537 from database tables within an <literal>IMMUTABLE</literal> function at all,
1538 since the immutability will be broken if the table contents ever change.
1539 However, <productname>PostgreSQL</productname> does not enforce that you
1544 A common error is to label a function <literal>IMMUTABLE</literal> when its
1545 results depend on a configuration parameter. For example, a function
1546 that manipulates timestamps might well have results that depend on the
1547 <xref linkend="guc-timezone"> setting. For safety, such functions should
1548 be labeled <literal>STABLE</literal> instead.
1553 <productname>PostgreSQL</productname> requires that <literal>STABLE</literal>
1554 and <literal>IMMUTABLE</literal> functions contain no SQL commands other
1555 than <command>SELECT</command> to prevent data modification.
1556 (This is not a completely bulletproof test, since such functions could
1557 still call <literal>VOLATILE</literal> functions that modify the database.
1558 If you do that, you will find that the <literal>STABLE</literal> or
1559 <literal>IMMUTABLE</literal> function does not notice the database changes
1560 applied by the called function, since they are hidden from its snapshot.)
1565 <sect1 id="xfunc-pl">
1566 <title>Procedural Language Functions</title>
1569 <productname>PostgreSQL</productname> allows user-defined functions
1570 to be written in other languages besides SQL and C. These other
1571 languages are generically called <firstterm>procedural
1572 languages</firstterm> (<acronym>PL</acronym>s).
1573 Procedural languages aren't built into the
1574 <productname>PostgreSQL</productname> server; they are offered
1575 by loadable modules.
1576 See <xref linkend="xplang"> and following chapters for more
1581 <sect1 id="xfunc-internal">
1582 <title>Internal Functions</title>
1584 <indexterm zone="xfunc-internal"><primary>function</primary><secondary>internal</secondary></indexterm>
1587 Internal functions are functions written in C that have been statically
1588 linked into the <productname>PostgreSQL</productname> server.
1589 The <quote>body</quote> of the function definition
1590 specifies the C-language name of the function, which need not be the
1591 same as the name being declared for SQL use.
1592 (For reasons of backward compatibility, an empty body
1593 is accepted as meaning that the C-language function name is the
1594 same as the SQL name.)
1598 Normally, all internal functions present in the
1599 server are declared during the initialization of the database cluster
1600 (see <xref linkend="creating-cluster">),
1601 but a user could use <command>CREATE FUNCTION</command>
1602 to create additional alias names for an internal function.
1603 Internal functions are declared in <command>CREATE FUNCTION</command>
1604 with language name <literal>internal</literal>. For instance, to
1605 create an alias for the <function>sqrt</function> function:
1607 CREATE FUNCTION square_root(double precision) RETURNS double precision
1612 (Most internal functions expect to be declared <quote>strict</quote>.)
1617 Not all <quote>predefined</quote> functions are
1618 <quote>internal</quote> in the above sense. Some predefined
1619 functions are written in SQL.
1624 <sect1 id="xfunc-c">
1625 <title>C-Language Functions</title>
1627 <indexterm zone="xfunc-c">
1628 <primary>function</primary>
1629 <secondary>user-defined</secondary>
1630 <tertiary>in C</tertiary>
1634 User-defined functions can be written in C (or a language that can
1635 be made compatible with C, such as C++). Such functions are
1636 compiled into dynamically loadable objects (also called shared
1637 libraries) and are loaded by the server on demand. The dynamic
1638 loading feature is what distinguishes <quote>C language</quote> functions
1639 from <quote>internal</quote> functions — the actual coding conventions
1640 are essentially the same for both. (Hence, the standard internal
1641 function library is a rich source of coding examples for user-defined
1646 Currently only one calling convention is used for C functions
1647 (<quote>version 1</quote>). Support for that calling convention is
1648 indicated by writing a <literal>PG_FUNCTION_INFO_V1()</literal> macro
1649 call for the function, as illustrated below.
1652 <sect2 id="xfunc-c-dynload">
1653 <title>Dynamic Loading</title>
1655 <indexterm zone="xfunc-c-dynload">
1656 <primary>dynamic loading</primary>
1660 The first time a user-defined function in a particular
1661 loadable object file is called in a session,
1662 the dynamic loader loads that object file into memory so that the
1663 function can be called. The <command>CREATE FUNCTION</command>
1664 for a user-defined C function must therefore specify two pieces of
1665 information for the function: the name of the loadable
1666 object file, and the C name (link symbol) of the specific function to call
1667 within that object file. If the C name is not explicitly specified then
1668 it is assumed to be the same as the SQL function name.
1672 The following algorithm is used to locate the shared object file
1673 based on the name given in the <command>CREATE FUNCTION</command>
1679 If the name is an absolute path, the given file is loaded.
1685 If the name starts with the string <literal>$libdir</literal>,
1686 that part is replaced by the <productname>PostgreSQL</productname> package
1688 name, which is determined at build time.<indexterm><primary>$libdir</primary></indexterm>
1694 If the name does not contain a directory part, the file is
1695 searched for in the path specified by the configuration variable
1696 <xref linkend="guc-dynamic-library-path">.<indexterm><primary>dynamic_library_path</primary></indexterm>
1702 Otherwise (the file was not found in the path, or it contains a
1703 non-absolute directory part), the dynamic loader will try to
1704 take the name as given, which will most likely fail. (It is
1705 unreliable to depend on the current working directory.)
1710 If this sequence does not work, the platform-specific shared
1711 library file name extension (often <filename>.so</filename>) is
1712 appended to the given name and this sequence is tried again. If
1713 that fails as well, the load will fail.
1717 It is recommended to locate shared libraries either relative to
1718 <literal>$libdir</literal> or through the dynamic library path.
1719 This simplifies version upgrades if the new installation is at a
1720 different location. The actual directory that
1721 <literal>$libdir</literal> stands for can be found out with the
1722 command <literal>pg_config --pkglibdir</literal>.
1726 The user ID the <productname>PostgreSQL</productname> server runs
1727 as must be able to traverse the path to the file you intend to
1728 load. Making the file or a higher-level directory not readable
1729 and/or not executable by the <systemitem>postgres</systemitem>
1730 user is a common mistake.
1734 In any case, the file name that is given in the
1735 <command>CREATE FUNCTION</command> command is recorded literally
1736 in the system catalogs, so if the file needs to be loaded again
1737 the same procedure is applied.
1742 <productname>PostgreSQL</productname> will not compile a C function
1743 automatically. The object file must be compiled before it is referenced
1744 in a <command>CREATE
1745 FUNCTION</command> command. See <xref linkend="dfunc"> for additional
1750 <indexterm zone="xfunc-c-dynload">
1751 <primary>magic block</primary>
1755 To ensure that a dynamically loaded object file is not loaded into an
1756 incompatible server, <productname>PostgreSQL</productname> checks that the
1757 file contains a <quote>magic block</quote> with the appropriate contents.
1758 This allows the server to detect obvious incompatibilities, such as code
1759 compiled for a different major version of
1760 <productname>PostgreSQL</productname>. To include a magic block,
1761 write this in one (and only one) of the module source files, after having
1762 included the header <filename>fmgr.h</filename>:
1770 After it is used for the first time, a dynamically loaded object
1771 file is retained in memory. Future calls in the same session to
1772 the function(s) in that file will only incur the small overhead of
1773 a symbol table lookup. If you need to force a reload of an object
1774 file, for example after recompiling it, begin a fresh session.
1777 <indexterm zone="xfunc-c-dynload">
1778 <primary>_PG_init</primary>
1780 <indexterm zone="xfunc-c-dynload">
1781 <primary>_PG_fini</primary>
1783 <indexterm zone="xfunc-c-dynload">
1784 <primary>library initialization function</primary>
1786 <indexterm zone="xfunc-c-dynload">
1787 <primary>library finalization function</primary>
1791 Optionally, a dynamically loaded file can contain initialization and
1792 finalization functions. If the file includes a function named
1793 <function>_PG_init</function>, that function will be called immediately after
1794 loading the file. The function receives no parameters and should
1795 return void. If the file includes a function named
1796 <function>_PG_fini</function>, that function will be called immediately before
1797 unloading the file. Likewise, the function receives no parameters and
1798 should return void. Note that <function>_PG_fini</function> will only be called
1799 during an unload of the file, not during process termination.
1800 (Presently, unloads are disabled and will never occur, but this may
1801 change in the future.)
1806 <sect2 id="xfunc-c-basetype">
1807 <title>Base Types in C-Language Functions</title>
1809 <indexterm zone="xfunc-c-basetype">
1810 <primary>data type</primary>
1811 <secondary>internal organization</secondary>
1815 To know how to write C-language functions, you need to know how
1816 <productname>PostgreSQL</productname> internally represents base
1817 data types and how they can be passed to and from functions.
1818 Internally, <productname>PostgreSQL</productname> regards a base
1819 type as a <quote>blob of memory</quote>. The user-defined
1820 functions that you define over a type in turn define the way that
1821 <productname>PostgreSQL</productname> can operate on it. That
1822 is, <productname>PostgreSQL</productname> will only store and
1823 retrieve the data from disk and use your user-defined functions
1824 to input, process, and output the data.
1828 Base types can have one of three internal formats:
1833 pass by value, fixed-length
1838 pass by reference, fixed-length
1843 pass by reference, variable-length
1850 By-value types can only be 1, 2, or 4 bytes in length
1851 (also 8 bytes, if <literal>sizeof(Datum)</literal> is 8 on your machine).
1852 You should be careful to define your types such that they will be the
1853 same size (in bytes) on all architectures. For example, the
1854 <literal>long</literal> type is dangerous because it is 4 bytes on some
1855 machines and 8 bytes on others, whereas <type>int</type> type is 4 bytes
1856 on most Unix machines. A reasonable implementation of the
1857 <type>int4</type> type on Unix machines might be:
1860 /* 4-byte integer, passed by value */
1864 (The actual PostgreSQL C code calls this type <type>int32</type>, because
1865 it is a convention in C that <type>int<replaceable>XX</replaceable></type>
1866 means <replaceable>XX</replaceable> <emphasis>bits</emphasis>. Note
1867 therefore also that the C type <type>int8</type> is 1 byte in size. The
1868 SQL type <type>int8</type> is called <type>int64</type> in C. See also
1869 <xref linkend="xfunc-c-type-table">.)
1873 On the other hand, fixed-length types of any size can
1874 be passed by-reference. For example, here is a sample
1875 implementation of a <productname>PostgreSQL</productname> type:
1878 /* 16-byte structure, passed by reference */
1885 Only pointers to such types can be used when passing
1886 them in and out of <productname>PostgreSQL</productname> functions.
1887 To return a value of such a type, allocate the right amount of
1888 memory with <literal>palloc</literal>, fill in the allocated memory,
1889 and return a pointer to it. (Also, if you just want to return the
1890 same value as one of your input arguments that's of the same data type,
1891 you can skip the extra <literal>palloc</literal> and just return the
1892 pointer to the input value.)
1896 Finally, all variable-length types must also be passed
1897 by reference. All variable-length types must begin
1898 with an opaque length field of exactly 4 bytes, which will be set
1899 by <symbol>SET_VARSIZE</symbol>; never set this field directly! All data to
1900 be stored within that type must be located in the memory
1901 immediately following that length field. The
1902 length field contains the total length of the structure,
1903 that is, it includes the size of the length field
1908 Another important point is to avoid leaving any uninitialized bits
1909 within data type values; for example, take care to zero out any
1910 alignment padding bytes that might be present in structs. Without
1911 this, logically-equivalent constants of your data type might be
1912 seen as unequal by the planner, leading to inefficient (though not
1918 <emphasis>Never</emphasis> modify the contents of a pass-by-reference input
1919 value. If you do so you are likely to corrupt on-disk data, since
1920 the pointer you are given might point directly into a disk buffer.
1921 The sole exception to this rule is explained in
1922 <xref linkend="xaggr">.
1927 As an example, we can define the type <type>text</type> as
1933 char data[FLEXIBLE_ARRAY_MEMBER];
1937 The <literal>[FLEXIBLE_ARRAY_MEMBER]</literal> notation means that the actual
1938 length of the data part is not specified by this declaration.
1943 variable-length types, we must be careful to allocate
1944 the correct amount of memory and set the length field correctly.
1945 For example, if we wanted to store 40 bytes in a <structname>text</structname>
1946 structure, we might use a code fragment like this:
1948 <programlisting><![CDATA[
1949 #include "postgres.h"
1951 char buffer[40]; /* our source data */
1953 text *destination = (text *) palloc(VARHDRSZ + 40);
1954 SET_VARSIZE(destination, VARHDRSZ + 40);
1955 memcpy(destination->data, buffer, 40);
1960 <literal>VARHDRSZ</literal> is the same as <literal>sizeof(int32)</literal>, but
1961 it's considered good style to use the macro <literal>VARHDRSZ</literal>
1962 to refer to the size of the overhead for a variable-length type.
1963 Also, the length field <emphasis>must</emphasis> be set using the
1964 <literal>SET_VARSIZE</literal> macro, not by simple assignment.
1968 <xref linkend="xfunc-c-type-table"> specifies which C type
1969 corresponds to which SQL type when writing a C-language function
1970 that uses a built-in type of <productname>PostgreSQL</productname>.
1971 The <quote>Defined In</quote> column gives the header file that
1972 needs to be included to get the type definition. (The actual
1973 definition might be in a different file that is included by the
1974 listed file. It is recommended that users stick to the defined
1975 interface.) Note that you should always include
1976 <filename>postgres.h</filename> first in any source file, because
1977 it declares a number of things that you will need anyway.
1980 <table tocentry="1" id="xfunc-c-type-table">
1981 <title>Equivalent C Types for Built-in SQL Types</title>
1998 <entry><type>abstime</type></entry>
1999 <entry><type>AbsoluteTime</type></entry>
2000 <entry><filename>utils/nabstime.h</filename></entry>
2003 <entry><type>bigint</type> (<type>int8</type>)</entry>
2004 <entry><type>int64</type></entry>
2005 <entry><filename>postgres.h</filename></entry>
2008 <entry><type>boolean</type></entry>
2009 <entry><type>bool</type></entry>
2010 <entry><filename>postgres.h</filename> (maybe compiler built-in)</entry>
2013 <entry><type>box</type></entry>
2014 <entry><type>BOX*</type></entry>
2015 <entry><filename>utils/geo_decls.h</filename></entry>
2018 <entry><type>bytea</type></entry>
2019 <entry><type>bytea*</type></entry>
2020 <entry><filename>postgres.h</filename></entry>
2023 <entry><type>"char"</type></entry>
2024 <entry><type>char</type></entry>
2025 <entry>(compiler built-in)</entry>
2028 <entry><type>character</type></entry>
2029 <entry><type>BpChar*</type></entry>
2030 <entry><filename>postgres.h</filename></entry>
2033 <entry><type>cid</type></entry>
2034 <entry><type>CommandId</type></entry>
2035 <entry><filename>postgres.h</filename></entry>
2038 <entry><type>date</type></entry>
2039 <entry><type>DateADT</type></entry>
2040 <entry><filename>utils/date.h</filename></entry>
2043 <entry><type>smallint</type> (<type>int2</type>)</entry>
2044 <entry><type>int16</type></entry>
2045 <entry><filename>postgres.h</filename></entry>
2048 <entry><type>int2vector</type></entry>
2049 <entry><type>int2vector*</type></entry>
2050 <entry><filename>postgres.h</filename></entry>
2053 <entry><type>integer</type> (<type>int4</type>)</entry>
2054 <entry><type>int32</type></entry>
2055 <entry><filename>postgres.h</filename></entry>
2058 <entry><type>real</type> (<type>float4</type>)</entry>
2059 <entry><type>float4*</type></entry>
2060 <entry><filename>postgres.h</filename></entry>
2063 <entry><type>double precision</type> (<type>float8</type>)</entry>
2064 <entry><type>float8*</type></entry>
2065 <entry><filename>postgres.h</filename></entry>
2068 <entry><type>interval</type></entry>
2069 <entry><type>Interval*</type></entry>
2070 <entry><filename>datatype/timestamp.h</filename></entry>
2073 <entry><type>lseg</type></entry>
2074 <entry><type>LSEG*</type></entry>
2075 <entry><filename>utils/geo_decls.h</filename></entry>
2078 <entry><type>name</type></entry>
2079 <entry><type>Name</type></entry>
2080 <entry><filename>postgres.h</filename></entry>
2083 <entry><type>oid</type></entry>
2084 <entry><type>Oid</type></entry>
2085 <entry><filename>postgres.h</filename></entry>
2088 <entry><type>oidvector</type></entry>
2089 <entry><type>oidvector*</type></entry>
2090 <entry><filename>postgres.h</filename></entry>
2093 <entry><type>path</type></entry>
2094 <entry><type>PATH*</type></entry>
2095 <entry><filename>utils/geo_decls.h</filename></entry>
2098 <entry><type>point</type></entry>
2099 <entry><type>POINT*</type></entry>
2100 <entry><filename>utils/geo_decls.h</filename></entry>
2103 <entry><type>regproc</type></entry>
2104 <entry><type>regproc</type></entry>
2105 <entry><filename>postgres.h</filename></entry>
2108 <entry><type>reltime</type></entry>
2109 <entry><type>RelativeTime</type></entry>
2110 <entry><filename>utils/nabstime.h</filename></entry>
2113 <entry><type>text</type></entry>
2114 <entry><type>text*</type></entry>
2115 <entry><filename>postgres.h</filename></entry>
2118 <entry><type>tid</type></entry>
2119 <entry><type>ItemPointer</type></entry>
2120 <entry><filename>storage/itemptr.h</filename></entry>
2123 <entry><type>time</type></entry>
2124 <entry><type>TimeADT</type></entry>
2125 <entry><filename>utils/date.h</filename></entry>
2128 <entry><type>time with time zone</type></entry>
2129 <entry><type>TimeTzADT</type></entry>
2130 <entry><filename>utils/date.h</filename></entry>
2133 <entry><type>timestamp</type></entry>
2134 <entry><type>Timestamp*</type></entry>
2135 <entry><filename>datatype/timestamp.h</filename></entry>
2138 <entry><type>tinterval</type></entry>
2139 <entry><type>TimeInterval</type></entry>
2140 <entry><filename>utils/nabstime.h</filename></entry>
2143 <entry><type>varchar</type></entry>
2144 <entry><type>VarChar*</type></entry>
2145 <entry><filename>postgres.h</filename></entry>
2148 <entry><type>xid</type></entry>
2149 <entry><type>TransactionId</type></entry>
2150 <entry><filename>postgres.h</filename></entry>
2157 Now that we've gone over all of the possible structures
2158 for base types, we can show some examples of real functions.
2163 <title>Version 1 Calling Conventions</title>
2166 The version-1 calling convention relies on macros to suppress most
2167 of the complexity of passing arguments and results. The C declaration
2168 of a version-1 function is always:
2170 Datum funcname(PG_FUNCTION_ARGS)
2172 In addition, the macro call:
2174 PG_FUNCTION_INFO_V1(funcname);
2176 must appear in the same source file. (Conventionally, it's
2177 written just before the function itself.) This macro call is not
2178 needed for <literal>internal</literal>-language functions, since
2179 <productname>PostgreSQL</productname> assumes that all internal functions
2180 use the version-1 convention. It is, however, required for
2181 dynamically-loaded functions.
2185 In a version-1 function, each actual argument is fetched using a
2186 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2187 macro that corresponds to the argument's data type. In non-strict
2188 functions there needs to be a previous check about argument null-ness
2189 using <function>PG_ARGNULL_<replaceable>xxx</replaceable>()</function>.
2190 The result is returned using a
2191 <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2192 macro for the return type.
2193 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2194 takes as its argument the number of the function argument to
2195 fetch, where the count starts at 0.
2196 <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2197 takes as its argument the actual value to return.
2201 Here are some examples using the version-1 calling convention:
2204 <programlisting><![CDATA[
2205 #include "postgres.h"
2208 #include "utils/geo_decls.h"
2214 PG_FUNCTION_INFO_V1(add_one);
2217 add_one(PG_FUNCTION_ARGS)
2219 int32 arg = PG_GETARG_INT32(0);
2221 PG_RETURN_INT32(arg + 1);
2224 /* by reference, fixed length */
2226 PG_FUNCTION_INFO_V1(add_one_float8);
2229 add_one_float8(PG_FUNCTION_ARGS)
2231 /* The macros for FLOAT8 hide its pass-by-reference nature. */
2232 float8 arg = PG_GETARG_FLOAT8(0);
2234 PG_RETURN_FLOAT8(arg + 1.0);
2237 PG_FUNCTION_INFO_V1(makepoint);
2240 makepoint(PG_FUNCTION_ARGS)
2242 /* Here, the pass-by-reference nature of Point is not hidden. */
2243 Point *pointx = PG_GETARG_POINT_P(0);
2244 Point *pointy = PG_GETARG_POINT_P(1);
2245 Point *new_point = (Point *) palloc(sizeof(Point));
2247 new_point->x = pointx->x;
2248 new_point->y = pointy->y;
2250 PG_RETURN_POINT_P(new_point);
2253 /* by reference, variable length */
2255 PG_FUNCTION_INFO_V1(copytext);
2258 copytext(PG_FUNCTION_ARGS)
2260 text *t = PG_GETARG_TEXT_PP(0);
2263 * VARSIZE_ANY_EXHDR is the size of the struct in bytes, minus the
2264 * VARHDRSZ or VARHDRSZ_SHORT of its header. Construct the copy with a
2265 * full-length header.
2267 text *new_t = (text *) palloc(VARSIZE_ANY_EXHDR(t) + VARHDRSZ);
2268 SET_VARSIZE(new_t, VARSIZE_ANY_EXHDR(t) + VARHDRSZ);
2271 * VARDATA is a pointer to the data region of the new struct. The source
2272 * could be a short datum, so retrieve its data through VARDATA_ANY.
2274 memcpy((void *) VARDATA(new_t), /* destination */
2275 (void *) VARDATA_ANY(t), /* source */
2276 VARSIZE_ANY_EXHDR(t)); /* how many bytes */
2277 PG_RETURN_TEXT_P(new_t);
2280 PG_FUNCTION_INFO_V1(concat_text);
2283 concat_text(PG_FUNCTION_ARGS)
2285 text *arg1 = PG_GETARG_TEXT_PP(0);
2286 text *arg2 = PG_GETARG_TEXT_PP(1);
2287 int32 arg1_size = VARSIZE_ANY_EXHDR(arg1);
2288 int32 arg2_size = VARSIZE_ANY_EXHDR(arg2);
2289 int32 new_text_size = arg1_size + arg2_size + VARHDRSZ;
2290 text *new_text = (text *) palloc(new_text_size);
2292 SET_VARSIZE(new_text, new_text_size);
2293 memcpy(VARDATA(new_text), VARDATA_ANY(arg1), arg1_size);
2294 memcpy(VARDATA(new_text) + arg1_size, VARDATA_ANY(arg2), arg2_size);
2295 PG_RETURN_TEXT_P(new_text);
2301 Supposing that the above code has been prepared in file
2302 <filename>funcs.c</filename> and compiled into a shared object,
2303 we could define the functions to <productname>PostgreSQL</productname>
2304 with commands like this:
2308 CREATE FUNCTION add_one(integer) RETURNS integer
2309 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one'
2312 -- note overloading of SQL function name "add_one"
2313 CREATE FUNCTION add_one(double precision) RETURNS double precision
2314 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one_float8'
2317 CREATE FUNCTION makepoint(point, point) RETURNS point
2318 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'makepoint'
2321 CREATE FUNCTION copytext(text) RETURNS text
2322 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'copytext'
2325 CREATE FUNCTION concat_text(text, text) RETURNS text
2326 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'concat_text'
2331 Here, <replaceable>DIRECTORY</replaceable> stands for the
2332 directory of the shared library file (for instance the
2333 <productname>PostgreSQL</productname> tutorial directory, which
2334 contains the code for the examples used in this section).
2335 (Better style would be to use just <literal>'funcs'</literal> in the
2336 <literal>AS</literal> clause, after having added
2337 <replaceable>DIRECTORY</replaceable> to the search path. In any
2338 case, we can omit the system-specific extension for a shared
2339 library, commonly <literal>.so</literal>.)
2343 Notice that we have specified the functions as <quote>strict</quote>,
2345 the system should automatically assume a null result if any input
2346 value is null. By doing this, we avoid having to check for null inputs
2347 in the function code. Without this, we'd have to check for null values
2348 explicitly, using <function>PG_ARGISNULL()</function>.
2352 At first glance, the version-1 coding conventions might appear to be just
2353 pointless obscurantism, over using plain <literal>C</literal> calling
2354 conventions. They do however allow to deal with <literal>NULL</literal>able
2355 arguments/return values, and <quote>toasted</quote> (compressed or
2356 out-of-line) values.
2360 The macro <function>PG_ARGISNULL(<replaceable>n</replaceable>)</function>
2361 allows a function to test whether each input is null. (Of course, doing
2362 this is only necessary in functions not declared <quote>strict</quote>.)
2364 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> macros,
2365 the input arguments are counted beginning at zero. Note that one
2366 should refrain from executing
2367 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> until
2368 one has verified that the argument isn't null.
2369 To return a null result, execute <function>PG_RETURN_NULL()</function>;
2370 this works in both strict and nonstrict functions.
2374 Other options provided by the version-1 interface are two
2376 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2377 macros. The first of these,
2378 <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>,
2379 guarantees to return a copy of the specified argument that is
2380 safe for writing into. (The normal macros will sometimes return a
2381 pointer to a value that is physically stored in a table, which
2382 must not be written to. Using the
2383 <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>
2384 macros guarantees a writable result.)
2385 The second variant consists of the
2386 <function>PG_GETARG_<replaceable>xxx</replaceable>_SLICE()</function>
2387 macros which take three arguments. The first is the number of the
2388 function argument (as above). The second and third are the offset and
2389 length of the segment to be returned. Offsets are counted from
2390 zero, and a negative length requests that the remainder of the
2391 value be returned. These macros provide more efficient access to
2392 parts of large values in the case where they have storage type
2393 <quote>external</quote>. (The storage type of a column can be specified using
2394 <literal>ALTER TABLE <replaceable>tablename</replaceable> ALTER
2395 COLUMN <replaceable>colname</replaceable> SET STORAGE
2396 <replaceable>storagetype</replaceable></literal>. <replaceable>storagetype</replaceable> is one of
2397 <literal>plain</literal>, <literal>external</literal>, <literal>extended</literal>,
2398 or <literal>main</literal>.)
2402 Finally, the version-1 function call conventions make it possible
2403 to return set results (<xref linkend="xfunc-c-return-set">) and
2404 implement trigger functions (<xref linkend="triggers">) and
2405 procedural-language call handlers (<xref
2406 linkend="plhandler">). For more details
2407 see <filename>src/backend/utils/fmgr/README</filename> in the
2408 source distribution.
2413 <title>Writing Code</title>
2416 Before we turn to the more advanced topics, we should discuss
2417 some coding rules for <productname>PostgreSQL</productname>
2418 C-language functions. While it might be possible to load functions
2419 written in languages other than C into
2420 <productname>PostgreSQL</productname>, this is usually difficult
2421 (when it is possible at all) because other languages, such as
2422 C++, FORTRAN, or Pascal often do not follow the same calling
2423 convention as C. That is, other languages do not pass argument
2424 and return values between functions in the same way. For this
2425 reason, we will assume that your C-language functions are
2426 actually written in C.
2430 The basic rules for writing and building C functions are as follows:
2435 Use <literal>pg_config
2436 --includedir-server</literal><indexterm><primary>pg_config</primary><secondary>with user-defined C functions</secondary></indexterm>
2437 to find out where the <productname>PostgreSQL</productname> server header
2438 files are installed on your system (or the system that your
2439 users will be running on).
2445 Compiling and linking your code so that it can be dynamically
2446 loaded into <productname>PostgreSQL</productname> always
2447 requires special flags. See <xref linkend="dfunc"> for a
2448 detailed explanation of how to do it for your particular
2455 Remember to define a <quote>magic block</quote> for your shared library,
2456 as described in <xref linkend="xfunc-c-dynload">.
2462 When allocating memory, use the
2463 <productname>PostgreSQL</productname> functions
2464 <function>palloc</function><indexterm><primary>palloc</primary></indexterm> and <function>pfree</function><indexterm><primary>pfree</primary></indexterm>
2465 instead of the corresponding C library functions
2466 <function>malloc</function> and <function>free</function>.
2467 The memory allocated by <function>palloc</function> will be
2468 freed automatically at the end of each transaction, preventing
2475 Always zero the bytes of your structures using <function>memset</function>
2476 (or allocate them with <function>palloc0</function> in the first place).
2477 Even if you assign to each field of your structure, there might be
2478 alignment padding (holes in the structure) that contain
2479 garbage values. Without this, it's difficult to
2480 support hash indexes or hash joins, as you must pick out only
2481 the significant bits of your data structure to compute a hash.
2482 The planner also sometimes relies on comparing constants via
2483 bitwise equality, so you can get undesirable planning results if
2484 logically-equivalent values aren't bitwise equal.
2490 Most of the internal <productname>PostgreSQL</productname>
2491 types are declared in <filename>postgres.h</filename>, while
2492 the function manager interfaces
2493 (<symbol>PG_FUNCTION_ARGS</symbol>, etc.) are in
2494 <filename>fmgr.h</filename>, so you will need to include at
2495 least these two files. For portability reasons it's best to
2496 include <filename>postgres.h</filename> <emphasis>first</emphasis>,
2497 before any other system or user header files. Including
2498 <filename>postgres.h</filename> will also include
2499 <filename>elog.h</filename> and <filename>palloc.h</filename>
2506 Symbol names defined within object files must not conflict
2507 with each other or with symbols defined in the
2508 <productname>PostgreSQL</productname> server executable. You
2509 will have to rename your functions or variables if you get
2510 error messages to this effect.
2520 <title>Composite-type Arguments</title>
2523 Composite types do not have a fixed layout like C structures.
2524 Instances of a composite type can contain null fields. In
2525 addition, composite types that are part of an inheritance
2526 hierarchy can have different fields than other members of the
2527 same inheritance hierarchy. Therefore,
2528 <productname>PostgreSQL</productname> provides a function
2529 interface for accessing fields of composite types from C.
2533 Suppose we want to write a function to answer the query:
2536 SELECT name, c_overpaid(emp, 1500) AS overpaid
2538 WHERE name = 'Bill' OR name = 'Sam';
2541 Using the version-1 calling conventions, we can define
2542 <function>c_overpaid</function> as:
2544 <programlisting><![CDATA[
2545 #include "postgres.h"
2546 #include "executor/executor.h" /* for GetAttributeByName() */
2550 PG_FUNCTION_INFO_V1(c_overpaid);
2553 c_overpaid(PG_FUNCTION_ARGS)
2555 HeapTupleHeader t = PG_GETARG_HEAPTUPLEHEADER(0);
2556 int32 limit = PG_GETARG_INT32(1);
2560 salary = GetAttributeByName(t, "salary", &isnull);
2562 PG_RETURN_BOOL(false);
2563 /* Alternatively, we might prefer to do PG_RETURN_NULL() for null salary. */
2565 PG_RETURN_BOOL(DatumGetInt32(salary) > limit);
2572 <function>GetAttributeByName</function> is the
2573 <productname>PostgreSQL</productname> system function that
2574 returns attributes out of the specified row. It has
2575 three arguments: the argument of type <type>HeapTupleHeader</type> passed
2577 the function, the name of the desired attribute, and a
2578 return parameter that tells whether the attribute
2579 is null. <function>GetAttributeByName</function> returns a <type>Datum</type>
2580 value that you can convert to the proper data type by using the
2581 appropriate <function>DatumGet<replaceable>XXX</replaceable>()</function>
2582 macro. Note that the return value is meaningless if the null flag is
2583 set; always check the null flag before trying to do anything with the
2588 There is also <function>GetAttributeByNum</function>, which selects
2589 the target attribute by column number instead of name.
2593 The following command declares the function
2594 <function>c_overpaid</function> in SQL:
2597 CREATE FUNCTION c_overpaid(emp, integer) RETURNS boolean
2598 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'c_overpaid'
2602 Notice we have used <literal>STRICT</literal> so that we did not have to
2603 check whether the input arguments were NULL.
2608 <title>Returning Rows (Composite Types)</title>
2611 To return a row or composite-type value from a C-language
2612 function, you can use a special API that provides macros and
2613 functions to hide most of the complexity of building composite
2614 data types. To use this API, the source file must include:
2616 #include "funcapi.h"
2621 There are two ways you can build a composite data value (henceforth
2622 a <quote>tuple</quote>): you can build it from an array of Datum values,
2623 or from an array of C strings that can be passed to the input
2624 conversion functions of the tuple's column data types. In either
2625 case, you first need to obtain or construct a <structname>TupleDesc</structname>
2626 descriptor for the tuple structure. When working with Datums, you
2627 pass the <structname>TupleDesc</structname> to <function>BlessTupleDesc</function>,
2628 and then call <function>heap_form_tuple</function> for each row. When working
2629 with C strings, you pass the <structname>TupleDesc</structname> to
2630 <function>TupleDescGetAttInMetadata</function>, and then call
2631 <function>BuildTupleFromCStrings</function> for each row. In the case of a
2632 function returning a set of tuples, the setup steps can all be done
2633 once during the first call of the function.
2637 Several helper functions are available for setting up the needed
2638 <structname>TupleDesc</structname>. The recommended way to do this in most
2639 functions returning composite values is to call:
2641 TypeFuncClass get_call_result_type(FunctionCallInfo fcinfo,
2643 TupleDesc *resultTupleDesc)
2645 passing the same <literal>fcinfo</literal> struct passed to the calling function
2646 itself. (This of course requires that you use the version-1
2647 calling conventions.) <varname>resultTypeId</varname> can be specified
2648 as <literal>NULL</literal> or as the address of a local variable to receive the
2649 function's result type OID. <varname>resultTupleDesc</varname> should be the
2650 address of a local <structname>TupleDesc</structname> variable. Check that the
2651 result is <literal>TYPEFUNC_COMPOSITE</literal>; if so,
2652 <varname>resultTupleDesc</varname> has been filled with the needed
2653 <structname>TupleDesc</structname>. (If it is not, you can report an error along
2654 the lines of <quote>function returning record called in context that
2655 cannot accept type record</quote>.)
2660 <function>get_call_result_type</function> can resolve the actual type of a
2661 polymorphic function result; so it is useful in functions that return
2662 scalar polymorphic results, not only functions that return composites.
2663 The <varname>resultTypeId</varname> output is primarily useful for functions
2664 returning polymorphic scalars.
2670 <function>get_call_result_type</function> has a sibling
2671 <function>get_expr_result_type</function>, which can be used to resolve the
2672 expected output type for a function call represented by an expression
2673 tree. This can be used when trying to determine the result type from
2674 outside the function itself. There is also
2675 <function>get_func_result_type</function>, which can be used when only the
2676 function's OID is available. However these functions are not able
2677 to deal with functions declared to return <structname>record</structname>, and
2678 <function>get_func_result_type</function> cannot resolve polymorphic types,
2679 so you should preferentially use <function>get_call_result_type</function>.
2684 Older, now-deprecated functions for obtaining
2685 <structname>TupleDesc</structname>s are:
2687 TupleDesc RelationNameGetTupleDesc(const char *relname)
2689 to get a <structname>TupleDesc</structname> for the row type of a named relation,
2692 TupleDesc TypeGetTupleDesc(Oid typeoid, List *colaliases)
2694 to get a <structname>TupleDesc</structname> based on a type OID. This can
2695 be used to get a <structname>TupleDesc</structname> for a base or
2696 composite type. It will not work for a function that returns
2697 <structname>record</structname>, however, and it cannot resolve polymorphic
2702 Once you have a <structname>TupleDesc</structname>, call:
2704 TupleDesc BlessTupleDesc(TupleDesc tupdesc)
2706 if you plan to work with Datums, or:
2708 AttInMetadata *TupleDescGetAttInMetadata(TupleDesc tupdesc)
2710 if you plan to work with C strings. If you are writing a function
2711 returning set, you can save the results of these functions in the
2712 <structname>FuncCallContext</structname> structure — use the
2713 <structfield>tuple_desc</structfield> or <structfield>attinmeta</structfield> field
2718 When working with Datums, use:
2720 HeapTuple heap_form_tuple(TupleDesc tupdesc, Datum *values, bool *isnull)
2722 to build a <structname>HeapTuple</structname> given user data in Datum form.
2726 When working with C strings, use:
2728 HeapTuple BuildTupleFromCStrings(AttInMetadata *attinmeta, char **values)
2730 to build a <structname>HeapTuple</structname> given user data
2731 in C string form. <parameter>values</parameter> is an array of C strings,
2732 one for each attribute of the return row. Each C string should be in
2733 the form expected by the input function of the attribute data
2734 type. In order to return a null value for one of the attributes,
2735 the corresponding pointer in the <parameter>values</parameter> array
2736 should be set to <symbol>NULL</symbol>. This function will need to
2737 be called again for each row you return.
2741 Once you have built a tuple to return from your function, it
2742 must be converted into a <type>Datum</type>. Use:
2744 HeapTupleGetDatum(HeapTuple tuple)
2746 to convert a <structname>HeapTuple</structname> into a valid Datum. This
2747 <type>Datum</type> can be returned directly if you intend to return
2748 just a single row, or it can be used as the current return value
2749 in a set-returning function.
2753 An example appears in the next section.
2758 <sect2 id="xfunc-c-return-set">
2759 <title>Returning Sets</title>
2762 There is also a special API that provides support for returning
2763 sets (multiple rows) from a C-language function. A set-returning
2764 function must follow the version-1 calling conventions. Also,
2765 source files must include <filename>funcapi.h</filename>, as
2770 A set-returning function (<acronym>SRF</acronym>) is called
2771 once for each item it returns. The <acronym>SRF</acronym> must
2772 therefore save enough state to remember what it was doing and
2773 return the next item on each call.
2774 The structure <structname>FuncCallContext</structname> is provided to help
2775 control this process. Within a function, <literal>fcinfo->flinfo->fn_extra</literal>
2776 is used to hold a pointer to <structname>FuncCallContext</structname>
2779 typedef struct FuncCallContext
2782 * Number of times we've been called before
2784 * call_cntr is initialized to 0 for you by SRF_FIRSTCALL_INIT(), and
2785 * incremented for you every time SRF_RETURN_NEXT() is called.
2790 * OPTIONAL maximum number of calls
2792 * max_calls is here for convenience only and setting it is optional.
2793 * If not set, you must provide alternative means to know when the
2799 * OPTIONAL pointer to result slot
2801 * This is obsolete and only present for backward compatibility, viz,
2802 * user-defined SRFs that use the deprecated TupleDescGetSlot().
2804 TupleTableSlot *slot;
2807 * OPTIONAL pointer to miscellaneous user-provided context information
2809 * user_fctx is for use as a pointer to your own data to retain
2810 * arbitrary context information between calls of your function.
2815 * OPTIONAL pointer to struct containing attribute type input metadata
2817 * attinmeta is for use when returning tuples (i.e., composite data types)
2818 * and is not used when returning base data types. It is only needed
2819 * if you intend to use BuildTupleFromCStrings() to create the return
2822 AttInMetadata *attinmeta;
2825 * memory context used for structures that must live for multiple calls
2827 * multi_call_memory_ctx is set by SRF_FIRSTCALL_INIT() for you, and used
2828 * by SRF_RETURN_DONE() for cleanup. It is the most appropriate memory
2829 * context for any memory that is to be reused across multiple calls
2832 MemoryContext multi_call_memory_ctx;
2835 * OPTIONAL pointer to struct containing tuple description
2837 * tuple_desc is for use when returning tuples (i.e., composite data types)
2838 * and is only needed if you are going to build the tuples with
2839 * heap_form_tuple() rather than with BuildTupleFromCStrings(). Note that
2840 * the TupleDesc pointer stored here should usually have been run through
2841 * BlessTupleDesc() first.
2843 TupleDesc tuple_desc;
2850 An <acronym>SRF</acronym> uses several functions and macros that
2851 automatically manipulate the <structname>FuncCallContext</structname>
2852 structure (and expect to find it via <literal>fn_extra</literal>). Use:
2856 to determine if your function is being called for the first or a
2857 subsequent time. On the first call (only) use:
2859 SRF_FIRSTCALL_INIT()
2861 to initialize the <structname>FuncCallContext</structname>. On every function call,
2862 including the first, use:
2866 to properly set up for using the <structname>FuncCallContext</structname>
2867 and clearing any previously returned data left over from the
2872 If your function has data to return, use:
2874 SRF_RETURN_NEXT(funcctx, result)
2876 to return it to the caller. (<literal>result</literal> must be of type
2877 <type>Datum</type>, either a single value or a tuple prepared as
2878 described above.) Finally, when your function is finished
2879 returning data, use:
2881 SRF_RETURN_DONE(funcctx)
2883 to clean up and end the <acronym>SRF</acronym>.
2887 The memory context that is current when the <acronym>SRF</acronym> is called is
2888 a transient context that will be cleared between calls. This means
2889 that you do not need to call <function>pfree</function> on everything
2890 you allocated using <function>palloc</function>; it will go away anyway. However, if you want to allocate
2891 any data structures to live across calls, you need to put them somewhere
2892 else. The memory context referenced by
2893 <structfield>multi_call_memory_ctx</structfield> is a suitable location for any
2894 data that needs to survive until the <acronym>SRF</acronym> is finished running. In most
2895 cases, this means that you should switch into
2896 <structfield>multi_call_memory_ctx</structfield> while doing the first-call setup.
2901 While the actual arguments to the function remain unchanged between
2902 calls, if you detoast the argument values (which is normally done
2903 transparently by the
2904 <function>PG_GETARG_<replaceable>xxx</replaceable></function> macro)
2905 in the transient context then the detoasted copies will be freed on
2906 each cycle. Accordingly, if you keep references to such values in
2907 your <structfield>user_fctx</structfield>, you must either copy them into the
2908 <structfield>multi_call_memory_ctx</structfield> after detoasting, or ensure
2909 that you detoast the values only in that context.
2914 A complete pseudo-code example looks like the following:
2917 my_set_returning_function(PG_FUNCTION_ARGS)
2919 FuncCallContext *funcctx;
2921 <replaceable>further declarations as needed</replaceable>
2923 if (SRF_IS_FIRSTCALL())
2925 MemoryContext oldcontext;
2927 funcctx = SRF_FIRSTCALL_INIT();
2928 oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
2929 /* One-time setup code appears here: */
2930 <replaceable>user code</replaceable>
2931 <replaceable>if returning composite</replaceable>
2932 <replaceable>build TupleDesc, and perhaps AttInMetadata</replaceable>
2933 <replaceable>endif returning composite</replaceable>
2934 <replaceable>user code</replaceable>
2935 MemoryContextSwitchTo(oldcontext);
2938 /* Each-time setup code appears here: */
2939 <replaceable>user code</replaceable>
2940 funcctx = SRF_PERCALL_SETUP();
2941 <replaceable>user code</replaceable>
2943 /* this is just one way we might test whether we are done: */
2944 if (funcctx->call_cntr < funcctx->max_calls)
2946 /* Here we want to return another item: */
2947 <replaceable>user code</replaceable>
2948 <replaceable>obtain result Datum</replaceable>
2949 SRF_RETURN_NEXT(funcctx, result);
2953 /* Here we are done returning items and just need to clean up: */
2954 <replaceable>user code</replaceable>
2955 SRF_RETURN_DONE(funcctx);
2962 A complete example of a simple <acronym>SRF</acronym> returning a composite type
2964 <programlisting><![CDATA[
2965 PG_FUNCTION_INFO_V1(retcomposite);
2968 retcomposite(PG_FUNCTION_ARGS)
2970 FuncCallContext *funcctx;
2974 AttInMetadata *attinmeta;
2976 /* stuff done only on the first call of the function */
2977 if (SRF_IS_FIRSTCALL())
2979 MemoryContext oldcontext;
2981 /* create a function context for cross-call persistence */
2982 funcctx = SRF_FIRSTCALL_INIT();
2984 /* switch to memory context appropriate for multiple function calls */
2985 oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
2987 /* total number of tuples to be returned */
2988 funcctx->max_calls = PG_GETARG_UINT32(0);
2990 /* Build a tuple descriptor for our result type */
2991 if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
2993 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
2994 errmsg("function returning record called in context "
2995 "that cannot accept type record")));
2998 * generate attribute metadata needed later to produce tuples from raw
3001 attinmeta = TupleDescGetAttInMetadata(tupdesc);
3002 funcctx->attinmeta = attinmeta;
3004 MemoryContextSwitchTo(oldcontext);
3007 /* stuff done on every call of the function */
3008 funcctx = SRF_PERCALL_SETUP();
3010 call_cntr = funcctx->call_cntr;
3011 max_calls = funcctx->max_calls;
3012 attinmeta = funcctx->attinmeta;
3014 if (call_cntr < max_calls) /* do when there is more left to send */
3021 * Prepare a values array for building the returned tuple.
3022 * This should be an array of C strings which will
3023 * be processed later by the type input functions.
3025 values = (char **) palloc(3 * sizeof(char *));
3026 values[0] = (char *) palloc(16 * sizeof(char));
3027 values[1] = (char *) palloc(16 * sizeof(char));
3028 values[2] = (char *) palloc(16 * sizeof(char));
3030 snprintf(values[0], 16, "%d", 1 * PG_GETARG_INT32(1));
3031 snprintf(values[1], 16, "%d", 2 * PG_GETARG_INT32(1));
3032 snprintf(values[2], 16, "%d", 3 * PG_GETARG_INT32(1));
3035 tuple = BuildTupleFromCStrings(attinmeta, values);
3037 /* make the tuple into a datum */
3038 result = HeapTupleGetDatum(tuple);
3040 /* clean up (this is not really necessary) */
3046 SRF_RETURN_NEXT(funcctx, result);
3048 else /* do when there is no more left */
3050 SRF_RETURN_DONE(funcctx);
3056 One way to declare this function in SQL is:
3058 CREATE TYPE __retcomposite AS (f1 integer, f2 integer, f3 integer);
3060 CREATE OR REPLACE FUNCTION retcomposite(integer, integer)
3061 RETURNS SETOF __retcomposite
3062 AS '<replaceable>filename</replaceable>', 'retcomposite'
3063 LANGUAGE C IMMUTABLE STRICT;
3065 A different way is to use OUT parameters:
3067 CREATE OR REPLACE FUNCTION retcomposite(IN integer, IN integer,
3068 OUT f1 integer, OUT f2 integer, OUT f3 integer)
3069 RETURNS SETOF record
3070 AS '<replaceable>filename</replaceable>', 'retcomposite'
3071 LANGUAGE C IMMUTABLE STRICT;
3073 Notice that in this method the output type of the function is formally
3074 an anonymous <structname>record</structname> type.
3078 The directory <link linkend="tablefunc"><filename>contrib/tablefunc</filename></link>
3079 module in the source distribution contains more examples of
3080 set-returning functions.
3085 <title>Polymorphic Arguments and Return Types</title>
3088 C-language functions can be declared to accept and
3089 return the polymorphic types
3090 <type>anyelement</type>, <type>anyarray</type>, <type>anynonarray</type>,
3091 <type>anyenum</type>, and <type>anyrange</type>.
3092 See <xref linkend="extend-types-polymorphic"> for a more detailed explanation
3093 of polymorphic functions. When function arguments or return types
3094 are defined as polymorphic types, the function author cannot know
3095 in advance what data type it will be called with, or
3096 need to return. There are two routines provided in <filename>fmgr.h</filename>
3097 to allow a version-1 C function to discover the actual data types
3098 of its arguments and the type it is expected to return. The routines are
3099 called <literal>get_fn_expr_rettype(FmgrInfo *flinfo)</literal> and
3100 <literal>get_fn_expr_argtype(FmgrInfo *flinfo, int argnum)</literal>.
3101 They return the result or argument type OID, or <symbol>InvalidOid</symbol> if the
3102 information is not available.
3103 The structure <literal>flinfo</literal> is normally accessed as
3104 <literal>fcinfo->flinfo</literal>. The parameter <literal>argnum</literal>
3105 is zero based. <function>get_call_result_type</function> can also be used
3106 as an alternative to <function>get_fn_expr_rettype</function>.
3107 There is also <function>get_fn_expr_variadic</function>, which can be used to
3108 find out whether variadic arguments have been merged into an array.
3109 This is primarily useful for <literal>VARIADIC "any"</literal> functions,
3110 since such merging will always have occurred for variadic functions
3111 taking ordinary array types.
3115 For example, suppose we want to write a function to accept a single
3116 element of any type, and return a one-dimensional array of that type:
3119 PG_FUNCTION_INFO_V1(make_array);
3121 make_array(PG_FUNCTION_ARGS)
3124 Oid element_type = get_fn_expr_argtype(fcinfo->flinfo, 0);
3134 if (!OidIsValid(element_type))
3135 elog(ERROR, "could not determine data type of input");
3137 /* get the provided element, being careful in case it's NULL */
3138 isnull = PG_ARGISNULL(0);
3140 element = (Datum) 0;
3142 element = PG_GETARG_DATUM(0);
3144 /* we have one dimension */
3146 /* and one element */
3148 /* and lower bound is 1 */
3151 /* get required info about the element type */
3152 get_typlenbyvalalign(element_type, &typlen, &typbyval, &typalign);
3154 /* now build the array */
3155 result = construct_md_array(&element, &isnull, ndims, dims, lbs,
3156 element_type, typlen, typbyval, typalign);
3158 PG_RETURN_ARRAYTYPE_P(result);
3164 The following command declares the function
3165 <function>make_array</function> in SQL:
3168 CREATE FUNCTION make_array(anyelement) RETURNS anyarray
3169 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'make_array'
3170 LANGUAGE C IMMUTABLE;
3175 There is a variant of polymorphism that is only available to C-language
3176 functions: they can be declared to take parameters of type
3177 <literal>"any"</literal>. (Note that this type name must be double-quoted,
3178 since it's also a SQL reserved word.) This works like
3179 <type>anyelement</type> except that it does not constrain different
3180 <literal>"any"</literal> arguments to be the same type, nor do they help
3181 determine the function's result type. A C-language function can also
3182 declare its final parameter to be <literal>VARIADIC "any"</literal>. This will
3183 match one or more actual arguments of any type (not necessarily the same
3184 type). These arguments will <emphasis>not</emphasis> be gathered into an array
3185 as happens with normal variadic functions; they will just be passed to
3186 the function separately. The <function>PG_NARGS()</function> macro and the
3187 methods described above must be used to determine the number of actual
3188 arguments and their types when using this feature. Also, users of such
3189 a function might wish to use the <literal>VARIADIC</literal> keyword in their
3190 function call, with the expectation that the function would treat the
3191 array elements as separate arguments. The function itself must implement
3192 that behavior if wanted, after using <function>get_fn_expr_variadic</function> to
3193 detect that the actual argument was marked with <literal>VARIADIC</literal>.
3197 <sect2 id="xfunc-transform-functions">
3198 <title>Transform Functions</title>
3201 Some function calls can be simplified during planning based on
3202 properties specific to the function. For example,
3203 <literal>int4mul(n, 1)</literal> could be simplified to just <literal>n</literal>.
3204 To define such function-specific optimizations, write a
3205 <firstterm>transform function</firstterm> and place its OID in the
3206 <structfield>protransform</structfield> field of the primary function's
3207 <structname>pg_proc</structname> entry. The transform function must have the SQL
3208 signature <literal>protransform(internal) RETURNS internal</literal>. The
3209 argument, actually <type>FuncExpr *</type>, is a dummy node representing a
3210 call to the primary function. If the transform function's study of the
3211 expression tree proves that a simplified expression tree can substitute
3212 for all possible concrete calls represented thereby, build and return
3213 that simplified expression. Otherwise, return a <literal>NULL</literal>
3214 pointer (<emphasis>not</emphasis> a SQL null).
3218 We make no guarantee that <productname>PostgreSQL</productname> will never call the
3219 primary function in cases that the transform function could simplify.
3220 Ensure rigorous equivalence between the simplified expression and an
3221 actual call to the primary function.
3225 Currently, this facility is not exposed to users at the SQL level
3226 because of security concerns, so it is only practical to use for
3227 optimizing built-in functions.
3232 <title>Shared Memory and LWLocks</title>
3235 Add-ins can reserve LWLocks and an allocation of shared memory on server
3236 startup. The add-in's shared library must be preloaded by specifying
3238 <xref linkend="guc-shared-preload-libraries"><indexterm><primary>shared_preload_libraries</primary></indexterm>.
3239 Shared memory is reserved by calling:
3241 void RequestAddinShmemSpace(int size)
3243 from your <function>_PG_init</function> function.
3246 LWLocks are reserved by calling:
3248 void RequestNamedLWLockTranche(const char *tranche_name, int num_lwlocks)
3250 from <function>_PG_init</function>. This will ensure that an array of
3251 <literal>num_lwlocks</literal> LWLocks is available under the name
3252 <literal>tranche_name</literal>. Use <function>GetNamedLWLockTranche</function>
3253 to get a pointer to this array.
3256 To avoid possible race-conditions, each backend should use the LWLock
3257 <function>AddinShmemInitLock</function> when connecting to and initializing
3258 its allocation of shared memory, as shown here:
3260 static mystruct *ptr = NULL;
3266 LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
3267 ptr = ShmemInitStruct("my struct name", size, &found);
3270 initialize contents of shmem area;
3271 acquire any requested LWLocks using:
3272 ptr->locks = GetNamedLWLockTranche("my tranche name");
3274 LWLockRelease(AddinShmemInitLock);
3280 <sect2 id="extend-cpp">
3281 <title>Using C++ for Extensibility</title>
3283 <indexterm zone="extend-cpp">
3284 <primary>C++</primary>
3288 Although the <productname>PostgreSQL</productname> backend is written in
3289 C, it is possible to write extensions in C++ if these guidelines are
3295 All functions accessed by the backend must present a C interface
3296 to the backend; these C functions can then call C++ functions.
3297 For example, <literal>extern C</literal> linkage is required for
3298 backend-accessed functions. This is also necessary for any
3299 functions that are passed as pointers between the backend and
3305 Free memory using the appropriate deallocation method. For example,
3306 most backend memory is allocated using <function>palloc()</function>, so use
3307 <function>pfree()</function> to free it. Using C++
3308 <function>delete</function> in such cases will fail.
3313 Prevent exceptions from propagating into the C code (use a catch-all
3314 block at the top level of all <literal>extern C</literal> functions). This
3315 is necessary even if the C++ code does not explicitly throw any
3316 exceptions, because events like out-of-memory can still throw
3317 exceptions. Any exceptions must be caught and appropriate errors
3318 passed back to the C interface. If possible, compile C++ with
3319 <option>-fno-exceptions</option> to eliminate exceptions entirely; in such
3320 cases, you must check for failures in your C++ code, e.g. check for
3321 NULL returned by <function>new()</function>.
3326 If calling backend functions from C++ code, be sure that the
3327 C++ call stack contains only plain old data structures
3328 (<acronym>POD</acronym>). This is necessary because backend errors
3329 generate a distant <function>longjmp()</function> that does not properly
3330 unroll a C++ call stack with non-POD objects.
3337 In summary, it is best to place C++ code behind a wall of
3338 <literal>extern C</literal> functions that interface to the backend,
3339 and avoid exception, memory, and call stack leakage.