1 <!-- doc/src/sgml/xfunc.sgml -->
4 <title>User-defined Functions</title>
6 <indexterm zone="xfunc">
7 <primary>function</primary>
8 <secondary>user-defined</secondary>
<productname>PostgreSQL</productname> provides four kinds of functions:
18 query language functions (functions written in
19 <acronym>SQL</acronym>) (<xref linkend="xfunc-sql">)
24 procedural language functions (functions written in, for
25 example, <application>PL/pgSQL</> or <application>PL/Tcl</>)
26 (<xref linkend="xfunc-pl">)
31 internal functions (<xref linkend="xfunc-internal">)
36 C-language functions (<xref linkend="xfunc-c">)
Every kind of function can take base types, composite types, or
45 combinations of these as arguments (parameters). In addition,
46 every kind of function can return a base type or
47 a composite type. Functions can also be defined to return
48 sets of base or composite values.
52 Many kinds of functions can take or return certain pseudo-types
53 (such as polymorphic types), but the available facilities vary.
54 Consult the description of each kind of function for more details.
58 It's easiest to define <acronym>SQL</acronym>
59 functions, so we'll start by discussing those.
60 Most of the concepts presented for <acronym>SQL</acronym> functions
61 will carry over to the other types of functions.
65 Throughout this chapter, it can be useful to look at the reference
66 page of the <xref linkend="sql-createfunction"> command to
67 understand the examples better. Some examples from this chapter
68 can be found in <filename>funcs.sql</filename> and
69 <filename>funcs.c</filename> in the <filename>src/tutorial</>
directory in the <productname>PostgreSQL</productname> source distribution.
75 <sect1 id="xfunc-sql">
76 <title>Query Language (<acronym>SQL</acronym>) Functions</title>
78 <indexterm zone="xfunc-sql">
79 <primary>function</primary>
80 <secondary>user-defined</secondary>
81 <tertiary>in SQL</tertiary>
85 SQL functions execute an arbitrary list of SQL statements, returning
86 the result of the last query in the list.
87 In the simple (non-set)
88 case, the first row of the last query's result will be returned.
89 (Bear in mind that <quote>the first row</quote> of a multirow
90 result is not well-defined unless you use <literal>ORDER BY</>.)
91 If the last query happens
92 to return no rows at all, the null value will be returned.
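<para>
 As an illustrative sketch of the point about row order, a non-set
 function can make <quote>the first row</quote> well-defined by ordering
 its final query. The <structname>readings</structname> table and the
 <function>latest_reading</function> function are hypothetical names
 assumed only for this example:
<programlisting>
-- Hypothetical table, assumed only for this sketch.
CREATE TABLE readings (sensor int, taken_at timestamptz, value numeric);

CREATE FUNCTION latest_reading(int) RETURNS numeric AS $$
    SELECT value FROM readings
     WHERE sensor = $1
     ORDER BY taken_at DESC;   -- the newest row becomes "the first row"
$$ LANGUAGE SQL;

-- Yields the null value if sensor 42 has no rows at all.
SELECT latest_reading(42);
</programlisting>
</para>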
96 Alternatively, an SQL function can be declared to return a set (that is,
97 multiple rows) by specifying the function's return type as <literal>SETOF
98 <replaceable>sometype</></literal>, or equivalently by declaring it as
99 <literal>RETURNS TABLE(<replaceable>columns</>)</literal>. In this case
all rows of the last query's result are returned. Further details appear
below in this chapter.
105 The body of an SQL function must be a list of SQL
106 statements separated by semicolons. A semicolon after the last
107 statement is optional. Unless the function is declared to return
108 <type>void</>, the last statement must be a <command>SELECT</>,
109 or an <command>INSERT</>, <command>UPDATE</>, or <command>DELETE</>
110 that has a <literal>RETURNING</> clause.
114 Any collection of commands in the <acronym>SQL</acronym>
115 language can be packaged together and defined as a function.
116 Besides <command>SELECT</command> queries, the commands can include data
117 modification queries (<command>INSERT</command>,
118 <command>UPDATE</command>, and <command>DELETE</command>), as well as
119 other SQL commands. (You cannot use transaction control commands, e.g.
120 <command>COMMIT</>, <command>SAVEPOINT</>, and some utility
121 commands, e.g. <literal>VACUUM</>, in <acronym>SQL</acronym> functions.)
122 However, the final command
123 must be a <command>SELECT</command> or have a <literal>RETURNING</>
124 clause that returns whatever is
125 specified as the function's return type. Alternatively, if you
126 want to define a SQL function that performs actions but has no
127 useful value to return, you can define it as returning <type>void</>.
128 For example, this function removes rows with negative salaries from
129 the <literal>emp</> table:
CREATE FUNCTION clean_emp() RETURNS void AS '
    DELETE FROM emp WHERE salary < 0' LANGUAGE SQL;
148 The entire body of a SQL function is parsed before any of it is
149 executed. While a SQL function can contain commands that alter
150 the system catalogs (e.g., <command>CREATE TABLE</>), the effects
151 of such commands will not be visible during parse analysis of
152 later commands in the function. Thus, for example,
153 <literal>CREATE TABLE foo (...); INSERT INTO foo VALUES(...);</literal>
154 will not work as desired if packaged up into a single SQL function,
155 since <structname>foo</> won't exist yet when the <command>INSERT</>
156 command is parsed. It's recommended to use <application>PL/pgSQL</>
157 instead of a SQL function in this type of situation.
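<para>
 A minimal sketch of that recommendation follows. The table name
 <structname>scratch</structname> and the function name
 <function>make_scratch</function> are made up for illustration.
 <application>PL/pgSQL</application> analyzes each statement when it is
 first executed rather than when the function body is parsed, so the
 table already exists by the time the <command>INSERT</command> is
 analyzed:
<programlisting>
CREATE FUNCTION make_scratch() RETURNS void AS $$
BEGIN
    CREATE TABLE scratch (id int);
    INSERT INTO scratch VALUES (1);   -- analyzed only after the CREATE has run
END;
$$ LANGUAGE plpgsql;

SELECT make_scratch();
</programlisting>
</para>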
162 The syntax of the <command>CREATE FUNCTION</command> command requires
163 the function body to be written as a string constant. It is usually
164 most convenient to use dollar quoting (see <xref
165 linkend="sql-syntax-dollar-quoting">) for the string constant.
166 If you choose to use regular single-quoted string constant syntax,
167 you must double single quote marks (<literal>'</>) and backslashes
168 (<literal>\</>) (assuming escape string syntax) in the body of
169 the function (see <xref linkend="sql-syntax-strings">).
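<para>
 To make the difference concrete, here is the same one-line body written
 both ways. The function names <function>greeting_dq</function> and
 <function>greeting_sq</function> are hypothetical, and the sketch assumes
 the default setting in which backslashes are not special in ordinary
 string constants. Both functions return the text
 <literal>it's here</literal>:
<programlisting>
-- Dollar quoting: the body is written exactly as it would be run.
CREATE FUNCTION greeting_dq() RETURNS text AS $$
    SELECT 'it''s here';
$$ LANGUAGE SQL;

-- Single-quoted body: every quote mark inside the body is doubled again.
CREATE FUNCTION greeting_sq() RETURNS text AS '
    SELECT ''it''''s here'';
' LANGUAGE SQL;
</programlisting>
</para>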
172 <sect2 id="xfunc-sql-function-arguments">
173 <title>Arguments for <acronym>SQL</acronym> Functions</title>
176 <primary>function</primary>
177 <secondary>named argument</secondary>
181 Arguments of a SQL function can be referenced in the function
body using either names or numbers. Examples of both methods appear below.
187 To use a name, declare the function argument as having a name, and
188 then just write that name in the function body. If the argument name
189 is the same as any column name in the current SQL command within the
190 function, the column name will take precedence. To override this,
191 qualify the argument name with the name of the function itself, that is
192 <literal><replaceable>function_name</>.<replaceable>argument_name</></literal>.
193 (If this would conflict with a qualified column name, again the column
194 name wins. You can avoid the ambiguity by choosing a different alias for
195 the table within the SQL command.)
199 In the older numeric approach, arguments are referenced using the syntax
200 <literal>$<replaceable>n</></>: <literal>$1</> refers to the first input
201 argument, <literal>$2</> to the second, and so on. This will work
202 whether or not the particular argument was declared with a name.
206 If an argument is of a composite type, then the dot notation,
207 e.g., <literal><replaceable>argname</>.<replaceable>fieldname</></literal> or
208 <literal>$1.<replaceable>fieldname</></literal>, can be used to access attributes of the
209 argument. Again, you might need to qualify the argument's name with the
210 function name to make the form with an argument name unambiguous.
214 SQL function arguments can only be used as data values,
215 not as identifiers. Thus for example this is reasonable:
217 INSERT INTO mytable VALUES ($1);
219 but this will not work:
221 INSERT INTO $1 VALUES (42);
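<para>
 When the target table really must vary, one possible workaround (a
 different technique from the SQL functions described here) is dynamic
 SQL in <application>PL/pgSQL</application>. The sketch below assumes a
 hypothetical function name <function>insert_answer</function> and that
 any table passed in has a single column that accepts an integer:
<programlisting>
CREATE FUNCTION insert_answer(tbl regclass) RETURNS void AS $$
BEGIN
    -- A regclass value is output already quoted as needed, so %s is used here.
    EXECUTE format('INSERT INTO %s VALUES (42)', tbl);
END;
$$ LANGUAGE plpgsql;

SELECT insert_answer('mytable');
</programlisting>
</para>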
227 The ability to use names to reference SQL function arguments was added
228 in <productname>PostgreSQL</productname> 9.2. Functions to be used in
229 older servers must use the <literal>$<replaceable>n</></> notation.
234 <sect2 id="xfunc-sql-base-functions">
235 <title><acronym>SQL</acronym> Functions on Base Types</title>
238 The simplest possible <acronym>SQL</acronym> function has no arguments and
239 simply returns a base type, such as <type>integer</type>:
CREATE FUNCTION one() RETURNS integer AS $$
    SELECT 1 AS result;
$$ LANGUAGE SQL;
-- Alternative syntax for string literal:
CREATE FUNCTION one() RETURNS integer AS '
    SELECT 1 AS result' LANGUAGE SQL;
260 Notice that we defined a column alias within the function body for the result of the function
261 (with the name <literal>result</>), but this column alias is not visible
262 outside the function. Hence, the result is labeled <literal>one</>
263 instead of <literal>result</>.
267 It is almost as easy to define <acronym>SQL</acronym> functions
268 that take base types as arguments:
CREATE FUNCTION add_em(x integer, y integer) RETURNS integer AS $$
    SELECT x + y;
$$ LANGUAGE SQL;
SELECT add_em(1, 2) AS answer;
Alternatively, we could dispense with names for the arguments and use numbers:
CREATE FUNCTION add_em(integer, integer) RETURNS integer AS $$
    SELECT $1 + $2;
$$ LANGUAGE SQL;
SELECT add_em(1, 2) AS answer;
Here is a more useful function, which might be used to debit a bank account:
CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
    UPDATE bank
        SET balance = balance - debit
        WHERE accountno = tf1.accountno;
    SELECT 1;
$$ LANGUAGE SQL;
A user could execute this function to debit account 17 by $100.00 as follows:
317 SELECT tf1(17, 100.0);
322 In this example, we chose the name <literal>accountno</> for the first
323 argument, but this is the same as the name of a column in the
324 <literal>bank</> table. Within the <command>UPDATE</> command,
325 <literal>accountno</> refers to the column <literal>bank.accountno</>,
326 so <literal>tf1.accountno</> must be used to refer to the argument.
327 We could of course avoid this by using a different name for the argument.
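<para>
 As a sketch of that alternative, the variant below uses an argument name
 that does not collide with any column, so no qualification is needed.
 The function name <function>tf1_renamed</function> and the argument name
 <literal>acct</literal> are hypothetical, and the example assumes that
 <literal>bank</literal> has no column called <literal>acct</literal> or
 <literal>debit</literal>:
<programlisting>
CREATE FUNCTION tf1_renamed (acct integer, debit numeric) RETURNS void AS $$
    UPDATE bank
        SET balance = balance - debit
        WHERE accountno = acct;   -- acct can only mean the argument
$$ LANGUAGE SQL;

SELECT tf1_renamed(17, 100.0);
</programlisting>
</para>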
In practice one would probably like a more useful result from the
function than a constant 1, so a more likely definition is:
CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
    UPDATE bank
        SET balance = balance - debit
        WHERE accountno = tf1.accountno;
    SELECT balance FROM bank WHERE accountno = tf1.accountno;
$$ LANGUAGE SQL;
344 which adjusts the balance and returns the new balance.
345 The same thing could be done in one command using <literal>RETURNING</>:
CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
    UPDATE bank
        SET balance = balance - debit
        WHERE accountno = tf1.accountno
    RETURNING balance;
$$ LANGUAGE SQL;
358 <sect2 id="xfunc-sql-composite-functions">
359 <title><acronym>SQL</acronym> Functions on Composite Types</title>
362 When writing functions with arguments of composite types, we must not
363 only specify which argument we want but also the desired attribute
364 (field) of that argument. For example, suppose that
365 <type>emp</type> is a table containing employee data, and therefore
366 also the name of the composite type of each row of the table. Here
367 is a function <function>double_salary</function> that computes what someone's
368 salary would be if it were doubled:
CREATE TABLE emp (name text, salary numeric, age integer, cubicle point);
INSERT INTO emp VALUES ('Bill', 4200, 45, '(2,1)');
CREATE FUNCTION double_salary(emp) RETURNS numeric AS $$
    SELECT $1.salary * 2 AS salary;
$$ LANGUAGE SQL;
SELECT name, double_salary(emp.*) AS dream
    FROM emp
    WHERE emp.cubicle ~= point '(2,1)';
395 Notice the use of the syntax <literal>$1.salary</literal>
396 to select one field of the argument row value. Also notice
397 how the calling <command>SELECT</> command
398 uses <replaceable>table_name</><literal>.*</> to select
399 the entire current row of a table as a composite value. The table
row can alternatively be referenced using just the table name, like this:
SELECT name, double_salary(emp) AS dream
    FROM emp
    WHERE emp.cubicle ~= point '(2,1)';
407 but this usage is deprecated since it's easy to get confused.
408 (See <xref linkend="rowtypes-usage"> for details about these
409 two notations for the composite value of a table row.)
413 Sometimes it is handy to construct a composite argument value
414 on-the-fly. This can be done with the <literal>ROW</> construct.
415 For example, we could adjust the data being passed to the function:
SELECT name, double_salary(ROW(name, salary*1.1, age, cubicle)) AS dream
    FROM emp;
423 It is also possible to build a function that returns a composite type.
424 This is an example of a function
425 that returns a single <type>emp</type> row:
CREATE FUNCTION new_emp() RETURNS emp AS $$
    SELECT text 'None' AS name,
        1000.0 AS salary,
        25 AS age,
        point '(2,2)' AS cubicle;
$$ LANGUAGE SQL;
436 In this example we have specified each of the attributes
437 with a constant value, but any computation
438 could have been substituted for these constants.
442 Note two important things about defining the function:
447 The select list order in the query must be exactly the same as
448 that in which the columns appear in the table associated
449 with the composite type. (Naming the columns, as we did above,
450 is irrelevant to the system.)
455 You must typecast the expressions to match the
456 definition of the composite type, or you will get errors like this:
459 ERROR: function declared to return emp returns varchar instead of text at column 1
468 A different way to define the same function is:
CREATE FUNCTION new_emp() RETURNS emp AS $$
    SELECT ROW('None', 1000.0, 25, '(2,2)')::emp;
$$ LANGUAGE SQL;
476 Here we wrote a <command>SELECT</> that returns just a single
477 column of the correct composite type. This isn't really better
478 in this situation, but it is a handy alternative in some cases
479 — for example, if we need to compute the result by calling
480 another function that returns the desired composite value.
We could call this function directly either by using it in a value expression:
SELECT new_emp();
         new_emp
--------------------------
 (None,1000.0,25,"(2,2)")
495 or by calling it as a table function:
498 SELECT * FROM new_emp();
500 name | salary | age | cubicle
501 ------+--------+-----+---------
502 None | 1000.0 | 25 | (2,2)
505 The second way is described more fully in <xref
506 linkend="xfunc-sql-table-functions">.
510 When you use a function that returns a composite type,
511 you might want only one field (attribute) from its result.
512 You can do that with syntax like this:
515 SELECT (new_emp()).name;
522 The extra parentheses are needed to keep the parser from getting
523 confused. If you try to do it without them, you get something like this:
526 SELECT new_emp().name;
527 ERROR: syntax error at or near "."
528 LINE 1: SELECT new_emp().name;
534 Another option is to use functional notation for extracting an attribute:
537 SELECT name(new_emp());
544 As explained in <xref linkend="rowtypes-usage">, the field notation and
545 functional notation are equivalent.
549 Another way to use a function returning a composite type is to pass the
550 result to another function that accepts the correct row type as input:
CREATE FUNCTION getname(emp) RETURNS text AS $$
    SELECT $1.name;
$$ LANGUAGE SQL;
SELECT getname(new_emp());
566 <sect2 id="xfunc-output-parameters">
567 <title><acronym>SQL</> Functions with Output Parameters</title>
570 <primary>function</primary>
571 <secondary>output parameter</secondary>
575 An alternative way of describing a function's results is to define it
576 with <firstterm>output parameters</>, as in this example:
CREATE FUNCTION add_em (IN x int, IN y int, OUT sum int)
AS 'SELECT x + y' LANGUAGE SQL;
590 This is not essentially different from the version of <literal>add_em</>
591 shown in <xref linkend="xfunc-sql-base-functions">. The real value of
592 output parameters is that they provide a convenient way of defining
593 functions that return several columns. For example,
CREATE FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int)
AS 'SELECT x + y, x * y'
LANGUAGE SQL;
SELECT * FROM sum_n_product(11,42);
607 What has essentially happened here is that we have created an anonymous
608 composite type for the result of the function. The above example has
609 the same end result as
CREATE TYPE sum_prod AS (sum int, product int);
CREATE FUNCTION sum_n_product (int, int) RETURNS sum_prod
AS 'SELECT $1 + $2, $1 * $2'
LANGUAGE SQL;
619 but not having to bother with the separate composite type definition
620 is often handy. Notice that the names attached to the output parameters
621 are not just decoration, but determine the column names of the anonymous
622 composite type. (If you omit a name for an output parameter, the
623 system will choose a name on its own.)
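<para>
 A short usage sketch, relying only on the
 <function>sum_n_product</function> function defined above, shows how the
 output parameter names act as column names of the result. With the
 inputs 11 and 42 the sum is 53 and the product is 462:
<programlisting>
-- Pick individual output columns by the names of the OUT parameters.
SELECT sum, product FROM sum_n_product(11, 42);

-- The same names work as field names of the composite result.
SELECT (sum_n_product(11, 42)).product;
</programlisting>
</para>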
627 Notice that output parameters are not included in the calling argument
628 list when invoking such a function from SQL. This is because
629 <productname>PostgreSQL</productname> considers only the input
630 parameters to define the function's calling signature. That means
631 also that only the input parameters matter when referencing the function
for purposes such as dropping it. We could drop the above function with either of
636 DROP FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int);
637 DROP FUNCTION sum_n_product (int, int);
642 Parameters can be marked as <literal>IN</> (the default),
643 <literal>OUT</>, <literal>INOUT</>, or <literal>VARIADIC</>.
An <literal>INOUT</> parameter serves as both an input parameter (part of the calling
646 argument list) and an output parameter (part of the result record type).
647 <literal>VARIADIC</> parameters are input parameters, but are treated
648 specially as described next.
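<para>
 As a small sketch of an <literal>INOUT</literal> parameter (the function
 name <function>double_it</function> is hypothetical), the single
 parameter below supplies the input value and also determines the result
 type, so no <literal>RETURNS</literal> clause is written:
<programlisting>
CREATE FUNCTION double_it (INOUT n int)
AS 'SELECT n * 2'
LANGUAGE SQL;

-- n appears in the argument list and is also the result column.
SELECT double_it(21);
</programlisting>
</para>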
652 <sect2 id="xfunc-sql-variadic-functions">
653 <title><acronym>SQL</> Functions with Variable Numbers of Arguments</title>
656 <primary>function</primary>
657 <secondary>variadic</secondary>
661 <primary>variadic function</primary>
665 <acronym>SQL</acronym> functions can be declared to accept
666 variable numbers of arguments, so long as all the <quote>optional</>
667 arguments are of the same data type. The optional arguments will be
668 passed to the function as an array. The function is declared by
669 marking the last parameter as <literal>VARIADIC</>; this parameter
670 must be declared as being of an array type. For example:
CREATE FUNCTION mleast(VARIADIC arr numeric[]) RETURNS numeric AS $$
    SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
$$ LANGUAGE SQL;
SELECT mleast(10, -1, 5, 4.4);
684 Effectively, all the actual arguments at or beyond the
685 <literal>VARIADIC</> position are gathered up into a one-dimensional
686 array, as if you had written
689 SELECT mleast(ARRAY[10, -1, 5, 4.4]); -- doesn't work
692 You can't actually write that, though — or at least, it will
693 not match this function definition. A parameter marked
694 <literal>VARIADIC</> matches one or more occurrences of its element
695 type, not of its own type.
699 Sometimes it is useful to be able to pass an already-constructed array
700 to a variadic function; this is particularly handy when one variadic
701 function wants to pass on its array parameter to another one. You can
702 do that by specifying <literal>VARIADIC</> in the call:
705 SELECT mleast(VARIADIC ARRAY[10, -1, 5, 4.4]);
708 This prevents expansion of the function's variadic parameter into its
709 element type, thereby allowing the array argument value to match
710 normally. <literal>VARIADIC</> can only be attached to the last
711 actual argument of a function call.
715 Specifying <literal>VARIADIC</> in the call is also the only way to
716 pass an empty array to a variadic function, for example:
719 SELECT mleast(VARIADIC ARRAY[]::numeric[]);
722 Simply writing <literal>SELECT mleast()</> does not work because a
723 variadic parameter must match at least one actual argument.
724 (You could define a second function also named <literal>mleast</>,
725 with no parameters, if you wanted to allow such calls.)
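<para>
 A sketch of such a companion function follows. Returning the null value
 is an assumption made here so that the result matches what the variadic
 version gives for an empty array; another definition could choose
 differently:
<programlisting>
-- Zero-argument overload; coexists with mleast(VARIADIC numeric[]).
CREATE FUNCTION mleast() RETURNS numeric AS $$
    SELECT NULL::numeric;
$$ LANGUAGE SQL;

SELECT mleast();
</programlisting>
</para>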
729 The array element parameters generated from a variadic parameter are
730 treated as not having any names of their own. This means it is not
731 possible to call a variadic function using named arguments (<xref
732 linkend="sql-syntax-calling-funcs">), except when you specify
733 <literal>VARIADIC</>. For example, this will work:
SELECT mleast(VARIADIC arr => ARRAY[10, -1, 5, 4.4]);
but not these:
SELECT mleast(arr => 10);
SELECT mleast(arr => ARRAY[10, -1, 5, 4.4]);
748 <sect2 id="xfunc-sql-parameter-defaults">
749 <title><acronym>SQL</> Functions with Default Values for Arguments</title>
752 <primary>function</primary>
753 <secondary>default values for arguments</secondary>
757 Functions can be declared with default values for some or all input
arguments. The default values are inserted whenever the function is
called with fewer actual arguments than it has parameters. Since arguments
760 can only be omitted from the end of the actual argument list, all
761 parameters after a parameter with a default value have to have
762 default values as well. (Although the use of named argument notation
763 could allow this restriction to be relaxed, it's still enforced so that
764 positional argument notation works sensibly.)
770 CREATE FUNCTION foo(a int, b int DEFAULT 2, c int DEFAULT 3)
777 SELECT foo(10, 20, 30);
795 SELECT foo(); -- fails since there is no default for the first argument
796 ERROR: function foo() does not exist
798 The <literal>=</literal> sign can also be used in place of the
799 key word <literal>DEFAULT</literal>.
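<para>
 For instance, a sketch using the <literal>=</literal> spelling might look
 like this. The function name <function>scaled_by</function> and its
 parameters are hypothetical:
<programlisting>
CREATE FUNCTION scaled_by(v numeric, factor numeric = 2) RETURNS numeric AS $$
    SELECT v * factor;
$$ LANGUAGE SQL;

SELECT scaled_by(10);               -- factor falls back to its default, 2
SELECT scaled_by(10, 3);            -- positional notation overrides the default
SELECT scaled_by(10, factor => 3);  -- named notation does the same
</programlisting>
</para>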
803 <sect2 id="xfunc-sql-table-functions">
804 <title><acronym>SQL</acronym> Functions as Table Sources</title>
All SQL functions can be used in the <literal>FROM</> clause of a query,
but this is particularly useful for functions returning composite types.
809 If the function is defined to return a base type, the table function
810 produces a one-column table. If the function is defined to return
811 a composite type, the table function produces a column for each attribute
812 of the composite type.
819 CREATE TABLE foo (fooid int, foosubid int, fooname text);
820 INSERT INTO foo VALUES (1, 1, 'Joe');
821 INSERT INTO foo VALUES (1, 2, 'Ed');
822 INSERT INTO foo VALUES (2, 1, 'Mary');
CREATE FUNCTION getfoo(int) RETURNS foo AS $$
    SELECT * FROM foo WHERE fooid = $1;
$$ LANGUAGE SQL;
SELECT *, upper(fooname) FROM getfoo(1) AS t1;
830 fooid | foosubid | fooname | upper
831 -------+----------+---------+-------
836 As the example shows, we can work with the columns of the function's
837 result just the same as if they were columns of a regular table.
841 Note that we only got one row out of the function. This is because
842 we did not use <literal>SETOF</>. That is described in the next section.
846 <sect2 id="xfunc-sql-functions-returning-set">
847 <title><acronym>SQL</acronym> Functions Returning Sets</title>
850 <primary>function</primary>
851 <secondary>with SETOF</secondary>
855 When an SQL function is declared as returning <literal>SETOF
856 <replaceable>sometype</></literal>, the function's final
857 query is executed to completion, and each row it
858 outputs is returned as an element of the result set.
862 This feature is normally used when calling the function in the <literal>FROM</>
863 clause. In this case each row returned by the function becomes
864 a row of the table seen by the query. For example, assume that
865 table <literal>foo</> has the same contents as above, and we say:
CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS $$
    SELECT * FROM foo WHERE fooid = $1;
$$ LANGUAGE SQL;
SELECT * FROM getfoo(1) AS t1;
877 fooid | foosubid | fooname
878 -------+----------+---------
886 It is also possible to return multiple rows with the columns defined by
887 output parameters, like this:
890 CREATE TABLE tab (y int, z int);
891 INSERT INTO tab VALUES (1, 2), (3, 4), (5, 6), (7, 8);
CREATE FUNCTION sum_n_product_with_tab (x int, OUT sum int, OUT product int)
RETURNS SETOF record
AS $$
    SELECT $1 + tab.y, $1 * tab.y FROM tab;
$$ LANGUAGE SQL;
SELECT * FROM sum_n_product_with_tab(10);
909 The key point here is that you must write <literal>RETURNS SETOF record</>
910 to indicate that the function returns multiple rows instead of just one.
911 If there is only one output parameter, write that parameter's type
912 instead of <type>record</>.
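<para>
 A sketch of the single-output-parameter case, reusing the
 <structname>tab</structname> table from above. The function name
 <function>sums_with_tab</function> is hypothetical:
<programlisting>
CREATE FUNCTION sums_with_tab (x int, OUT sum int)
RETURNS SETOF int AS $$      -- the parameter's type, not record
    SELECT $1 + tab.y FROM tab;
$$ LANGUAGE SQL;

-- Returns one row per row of tab, in a single column named sum.
SELECT * FROM sums_with_tab(10);
</programlisting>
</para>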
916 It is frequently useful to construct a query's result by invoking a
917 set-returning function multiple times, with the parameters for each
918 invocation coming from successive rows of a table or subquery. The
919 preferred way to do this is to use the <literal>LATERAL</> key word,
920 which is described in <xref linkend="queries-lateral">.
921 Here is an example using a set-returning function to enumerate
922 elements of a tree structure:
936 CREATE FUNCTION listchildren(text) RETURNS SETOF text AS $$
937 SELECT name FROM nodes WHERE parent = $1
938 $$ LANGUAGE SQL STABLE;
940 SELECT * FROM listchildren('Top');
948 SELECT name, child FROM nodes, LATERAL listchildren(name) AS child;
959 This example does not do anything that we couldn't have done with a
960 simple join, but in more complex calculations the option to put
961 some of the work into a function can be quite convenient.
965 Functions returning sets can also be called in the select list
966 of a query. For each row that the query
967 generates by itself, the set-returning function is invoked, and an output
968 row is generated for each element of the function's result set.
969 The previous example could also be done with queries like
973 SELECT listchildren('Top');
981 SELECT name, listchildren(name) FROM nodes;
983 --------+--------------
992 In the last <command>SELECT</command>,
993 notice that no output row appears for <literal>Child2</>, <literal>Child3</>, etc.
994 This happens because <function>listchildren</function> returns an empty set
995 for those arguments, so no result rows are generated. This is the same
996 behavior as we got from an inner join to the function result when using
997 the <literal>LATERAL</> syntax.
1001 <productname>PostgreSQL</>'s behavior for a set-returning function in a
1002 query's select list is almost exactly the same as if the set-returning
1003 function had been written in a <literal>LATERAL FROM</>-clause item
1004 instead. For example,
1006 SELECT x, generate_series(1,5) AS g FROM tab;
1008 is almost equivalent to
1010 SELECT x, g FROM tab, LATERAL generate_series(1,5) AS g;
1012 It would be exactly the same, except that in this specific example,
1013 the planner could choose to put <structname>g</> on the outside of the
1014 nestloop join, since <structname>g</> has no actual lateral dependency
1015 on <structname>tab</>. That would result in a different output row
1016 order. Set-returning functions in the select list are always evaluated
1017 as though they are on the inside of a nestloop join with the rest of
1018 the <literal>FROM</> clause, so that the function(s) are run to
completion before the next row from the <literal>FROM</> clause is considered.
1024 If there is more than one set-returning function in the query's select
1025 list, the behavior is similar to what you get from putting the functions
1026 into a single <literal>LATERAL ROWS FROM( ... )</> <literal>FROM</>-clause
1027 item. For each row from the underlying query, there is an output row
1028 using the first result from each function, then an output row using the
1029 second result, and so on. If some of the set-returning functions
1030 produce fewer outputs than others, null values are substituted for the
1031 missing data, so that the total number of rows emitted for one
1032 underlying row is the same as for the set-returning function that
1033 produced the most outputs. Thus the set-returning functions
1034 run <quote>in lockstep</> until they are all exhausted, and then
1035 execution continues with the next underlying row.
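<para>
 A small sketch of this lockstep behavior, using only the built-in
 <function>generate_series</function> function and assuming
 <productname>PostgreSQL</productname> 10 or later:
<programlisting>
-- Three output rows: (1,10), (2,11), and (3,NULL).
-- The shorter series is padded with a null value.
SELECT generate_series(1, 3) AS a, generate_series(10, 11) AS b;

-- The FROM-clause form that the select-list behavior resembles.
SELECT * FROM ROWS FROM (generate_series(1, 3), generate_series(10, 11)) AS t(a, b);
</programlisting>
</para>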
1039 Set-returning functions can be nested in a select list, although that is
1040 not allowed in <literal>FROM</>-clause items. In such cases, each level
1041 of nesting is treated separately, as though it were
1042 a separate <literal>LATERAL ROWS FROM( ... )</> item. For example, in
1044 SELECT srf1(srf2(x), srf3(y)), srf4(srf5(z)) FROM tab;
1046 the set-returning functions <function>srf2</>, <function>srf3</>,
1047 and <function>srf5</> would be run in lockstep for each row
1048 of <structname>tab</>, and then <function>srf1</> and <function>srf4</>
would be applied in lockstep to each row produced by the lower functions.
1054 Set-returning functions cannot be used within conditional-evaluation
constructs, such as <literal>CASE</> or <literal>COALESCE</>. For example, consider
1058 SELECT x, CASE WHEN x > 0 THEN generate_series(1, 5) ELSE 0 END FROM tab;
1060 It might seem that this should produce five repetitions of input rows
1061 that have <literal>x > 0</>, and a single repetition of those that do
1062 not; but actually, because <function>generate_series(1, 5)</> would be
1063 run in an implicit <literal>LATERAL FROM</> item before
1064 the <literal>CASE</> expression is ever evaluated, it would produce five
1065 repetitions of every input row. To reduce confusion, such cases produce
1066 a parse-time error instead.
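<para>
 If the intuitive result is what is wanted, one possible formulation (a
 sketch only, assuming <structname>tab</structname> has a column
 <literal>x</literal>) moves the set-returning function into the
 <literal>FROM</literal> clause and puts the condition into an outer
 join, so that rows failing the test are kept once with a substitute
 value:
<programlisting>
-- Five rows for each input row with x > 0, one row (with g = 0) otherwise.
SELECT x, COALESCE(g, 0) AS g
    FROM tab LEFT JOIN LATERAL generate_series(1, 5) AS g ON x > 0;
</programlisting>
</para>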
1071 If a function's last command is <command>INSERT</>, <command>UPDATE</>,
1072 or <command>DELETE</> with <literal>RETURNING</>, that command will
1073 always be executed to completion, even if the function is not declared
1074 with <literal>SETOF</> or the calling query does not fetch all the
1075 result rows. Any extra rows produced by the <literal>RETURNING</>
1076 clause are silently dropped, but the commanded table modifications
1077 still happen (and are all completed before returning from the function).
1083 Before <productname>PostgreSQL</> 10, putting more than one
1084 set-returning function in the same select list did not behave very
1085 sensibly unless they always produced equal numbers of rows. Otherwise,
1086 what you got was a number of output rows equal to the least common
1087 multiple of the numbers of rows produced by the set-returning
1088 functions. Also, nested set-returning functions did not work as
1089 described above; instead, a set-returning function could have at most
1090 one set-returning argument, and each nest of set-returning functions
1091 was run independently. Also, conditional execution (set-returning
1092 functions inside <literal>CASE</> etc) was previously allowed,
1093 complicating things even more.
1094 Use of the <literal>LATERAL</> syntax is recommended when writing
1095 queries that need to work in older <productname>PostgreSQL</> versions,
1096 because that will give consistent results across different versions.
1097 If you have a query that is relying on conditional execution of a
1098 set-returning function, you may be able to fix it by moving the
1099 conditional test into a custom set-returning function. For example,
SELECT x, CASE WHEN y > 0 THEN generate_series(1, z) ELSE 5 END FROM tab;
could become
CREATE FUNCTION case_generate_series(cond bool, start int, fin int, els int)
  RETURNS SETOF int AS $$
BEGIN
  IF cond THEN
    RETURN QUERY SELECT generate_series(start, fin);
  ELSE
    RETURN QUERY SELECT els;
  END IF;
END$$ LANGUAGE plpgsql;
SELECT x, case_generate_series(y > 0, 1, z, 5) FROM tab;
1117 This formulation will work the same in all versions
1118 of <productname>PostgreSQL</>.
1123 <sect2 id="xfunc-sql-functions-returning-table">
1124 <title><acronym>SQL</acronym> Functions Returning <literal>TABLE</></title>
1127 <primary>function</primary>
1128 <secondary>RETURNS TABLE</secondary>
1132 There is another way to declare a function as returning a set,
1133 which is to use the syntax
1134 <literal>RETURNS TABLE(<replaceable>columns</>)</literal>.
1135 This is equivalent to using one or more <literal>OUT</> parameters plus
1136 marking the function as returning <literal>SETOF record</> (or
1137 <literal>SETOF</> a single output parameter's type, as appropriate).
1138 This notation is specified in recent versions of the SQL standard, and
1139 thus may be more portable than using <literal>SETOF</>.
For example, the preceding sum-and-product example could also be done this way:
CREATE FUNCTION sum_n_product_with_tab (x int)
RETURNS TABLE(sum int, product int) AS $$
    SELECT $1 + tab.y, $1 * tab.y FROM tab;
$$ LANGUAGE SQL;
1153 It is not allowed to use explicit <literal>OUT</> or <literal>INOUT</>
1154 parameters with the <literal>RETURNS TABLE</> notation — you must
1155 put all the output columns in the <literal>TABLE</> list.
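<para>
 A single-column sketch of the same notation, using a hypothetical
 function <function>evens_upto</function> built on the built-in
 <function>generate_series</function> function:
<programlisting>
CREATE FUNCTION evens_upto(n int) RETURNS TABLE(v int) AS $$
    SELECT g FROM generate_series(2, n, 2) AS g;
$$ LANGUAGE SQL;

-- Returns the rows 2, 4, and 6 in a column named v.
SELECT * FROM evens_upto(6);
</programlisting>
</para>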
1160 <title>Polymorphic <acronym>SQL</acronym> Functions</title>
1163 <acronym>SQL</acronym> functions can be declared to accept and
1164 return the polymorphic types <type>anyelement</type>,
1165 <type>anyarray</type>, <type>anynonarray</type>,
1166 <type>anyenum</type>, and <type>anyrange</type>. See <xref
1167 linkend="extend-types-polymorphic"> for a more detailed
1168 explanation of polymorphic functions. Here is a polymorphic
1169 function <function>make_array</function> that builds up an array
1170 from two arbitrary data type elements:
CREATE FUNCTION make_array(anyelement, anyelement) RETURNS anyarray AS $$
    SELECT ARRAY[$1, $2];
$$ LANGUAGE SQL;

SELECT make_array(1, 2) AS intarray, make_array('a'::text, 'b') AS textarray;
 intarray | textarray
----------+-----------
 {1,2}    | {a,b}
(1 row)
1185 Notice the use of the typecast <literal>'a'::text</literal>
1186 to specify that the argument is of type <type>text</type>. This is
1187 required if the argument is just a string literal, since otherwise
1188 it would be treated as type
<type>unknown</type>, and array of <type>unknown</type> is not a valid type.
1191 Without the typecast, you will get errors like this:
1194 ERROR: could not determine polymorphic type because input has type "unknown"
1200 It is permitted to have polymorphic arguments with a fixed
1201 return type, but the converse is not. For example:
CREATE FUNCTION is_greater(anyelement, anyelement) RETURNS boolean AS $$
    SELECT $1 > $2;
$$ LANGUAGE SQL;

SELECT is_greater(1, 2);

CREATE FUNCTION invalid_func() RETURNS anyelement AS $$
    SELECT 1;
$$ LANGUAGE SQL;
ERROR:  cannot determine result data type
DETAIL:  A function returning a polymorphic type must have at least one polymorphic argument.
1222 Polymorphism can be used with functions that have output arguments.
1225 CREATE FUNCTION dup (f1 anyelement, OUT f2 anyelement, OUT f3 anyarray)
1226 AS 'select $1, array[$1,$1]' LANGUAGE SQL;
1228 SELECT * FROM dup(22);
1237 Polymorphism can also be used with variadic functions.
CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
    SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
$$ LANGUAGE SQL;

SELECT anyleast(10, -1, 5, 4);

SELECT anyleast('abc'::text, 'def');

CREATE FUNCTION concat_values(text, VARIADIC anyarray) RETURNS text AS $$
    SELECT array_to_string($2, $1);
$$ LANGUAGE SQL;

SELECT concat_values('|', 1, 4, 2);
1270 <title><acronym>SQL</acronym> Functions with Collations</title>
1273 <primary>collation</>
1274 <secondary>in SQL functions</>
1278 When a SQL function has one or more parameters of collatable data types,
1279 a collation is identified for each function call depending on the
1280 collations assigned to the actual arguments, as described in <xref
1281 linkend="collation">. If a collation is successfully identified
1282 (i.e., there are no conflicts of implicit collations among the arguments)
1283 then all the collatable parameters are treated as having that collation
1284 implicitly. This will affect the behavior of collation-sensitive
1285 operations within the function. For example, using the
1286 <function>anyleast</> function described above, the result of
1288 SELECT anyleast('abc'::text, 'ABC');
1290 will depend on the database's default collation. In <literal>C</> locale
1291 the result will be <literal>ABC</>, but in many other locales it will
1292 be <literal>abc</>. The collation to use can be forced by adding
1293 a <literal>COLLATE</> clause to any of the arguments, for example
1295 SELECT anyleast('abc'::text, 'ABC' COLLATE "C");
1297 Alternatively, if you wish a function to operate with a particular
1298 collation regardless of what it is called with, insert
1299 <literal>COLLATE</> clauses as needed in the function definition.
1300 This version of <function>anyleast</> would always use <literal>en_US</>
1301 locale to compare strings:
CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
    SELECT min($1[i] COLLATE "en_US") FROM generate_subscripts($1, 1) g(i);
$$ LANGUAGE SQL;

But note that this will throw an error if applied to a non-collatable
data type.
1312 If no common collation can be identified among the actual arguments,
1313 then a SQL function treats its parameters as having their data types'
1314 default collation (which is usually the database's default collation,
1315 but could be different for parameters of domain types).
1319 The behavior of collatable parameters can be thought of as a limited
1320 form of polymorphism, applicable only to textual data types.
1325 <sect1 id="xfunc-overload">
1326 <title>Function Overloading</title>
1328 <indexterm zone="xfunc-overload">
1329 <primary>overloading</primary>
1330 <secondary>functions</secondary>
1334 More than one function can be defined with the same SQL name, so long
1335 as the arguments they take are different. In other words,
1336 function names can be <firstterm>overloaded</firstterm>. When a
1337 query is executed, the server will determine which function to
1338 call from the data types and the number of the provided arguments.
1339 Overloading can also be used to simulate functions with a variable
1340 number of arguments, up to a finite maximum number.
1344 When creating a family of overloaded functions, one should be
careful not to create ambiguities. For instance, given the functions:
1348 CREATE FUNCTION test(int, real) RETURNS ...
1349 CREATE FUNCTION test(smallint, double precision) RETURNS ...
1351 it is not immediately clear which function would be called with
1352 some trivial input like <literal>test(1, 1.5)</literal>. The
1353 currently implemented resolution rules are described in
1354 <xref linkend="typeconv">, but it is unwise to design a system that subtly
1355 relies on this behavior.
1359 A function that takes a single argument of a composite type should
1360 generally not have the same name as any attribute (field) of that type.
1361 Recall that <literal><replaceable>attribute</>(<replaceable>table</>)</literal>
1362 is considered equivalent
1363 to <literal><replaceable>table</>.<replaceable>attribute</></literal>.
1364 In the case that there is an
1365 ambiguity between a function on a composite type and an attribute of
1366 the composite type, the attribute will always be used. It is possible
1367 to override that choice by schema-qualifying the function name
1368 (that is, <literal><replaceable>schema</>.<replaceable>func</>(<replaceable>table</>)
1369 </literal>) but it's better to
1370 avoid the problem by not choosing conflicting names.
1374 Another possible conflict is between variadic and non-variadic functions.
1375 For instance, it is possible to create both <literal>foo(numeric)</> and
1376 <literal>foo(VARIADIC numeric[])</>. In this case it is unclear which one
1377 should be matched to a call providing a single numeric argument, such as
1378 <literal>foo(10.1)</>. The rule is that the function appearing
1379 earlier in the search path is used, or if the two functions are in the
1380 same schema, the non-variadic one is preferred.
1384 When overloading C-language functions, there is an additional
1385 constraint: The C name of each function in the family of
1386 overloaded functions must be different from the C names of all
1387 other functions, either internal or dynamically loaded. If this
1388 rule is violated, the behavior is not portable. You might get a
1389 run-time linker error, or one of the functions will get called
1390 (usually the internal one). The alternative form of the
1391 <literal>AS</> clause for the SQL <command>CREATE
1392 FUNCTION</command> command decouples the SQL function name from
1393 the function name in the C source code. For instance:
CREATE FUNCTION test(int) RETURNS int
    AS '<replaceable>filename</>', 'test_1arg'
    LANGUAGE C;
CREATE FUNCTION test(int, int) RETURNS int
    AS '<replaceable>filename</>', 'test_2arg'
    LANGUAGE C;
1402 The names of the C functions here reflect one of many possible conventions.
1406 <sect1 id="xfunc-volatility">
1407 <title>Function Volatility Categories</title>
1409 <indexterm zone="xfunc-volatility">
1410 <primary>volatility</primary>
1411 <secondary>functions</secondary>
1413 <indexterm zone="xfunc-volatility">
1414 <primary>VOLATILE</primary>
1416 <indexterm zone="xfunc-volatility">
1417 <primary>STABLE</primary>
1419 <indexterm zone="xfunc-volatility">
1420 <primary>IMMUTABLE</primary>
1424 Every function has a <firstterm>volatility</> classification, with
1425 the possibilities being <literal>VOLATILE</>, <literal>STABLE</>, or
1426 <literal>IMMUTABLE</>. <literal>VOLATILE</> is the default if the
1427 <xref linkend="sql-createfunction">
1428 command does not specify a category. The volatility category is a
1429 promise to the optimizer about the behavior of the function:
1434 A <literal>VOLATILE</> function can do anything, including modifying
1435 the database. It can return different results on successive calls with
1436 the same arguments. The optimizer makes no assumptions about the
1437 behavior of such functions. A query using a volatile function will
1438 re-evaluate the function at every row where its value is needed.
1443 A <literal>STABLE</> function cannot modify the database and is
1444 guaranteed to return the same results given the same arguments
1445 for all rows within a single statement. This category allows the
1446 optimizer to optimize multiple calls of the function to a single
1447 call. In particular, it is safe to use an expression containing
1448 such a function in an index scan condition. (Since an index scan
1449 will evaluate the comparison value only once, not once at each
1450 row, it is not valid to use a <literal>VOLATILE</> function in an
1451 index scan condition.)
1456 An <literal>IMMUTABLE</> function cannot modify the database and is
1457 guaranteed to return the same results given the same arguments forever.
1458 This category allows the optimizer to pre-evaluate the function when
1459 a query calls it with constant arguments. For example, a query like
1460 <literal>SELECT ... WHERE x = 2 + 2</> can be simplified on sight to
1461 <literal>SELECT ... WHERE x = 4</>, because the function underlying
1462 the integer addition operator is marked <literal>IMMUTABLE</>.
1469 For best optimization results, you should label your functions with the
1470 strictest volatility category that is valid for them.
1474 Any function with side-effects <emphasis>must</> be labeled
1475 <literal>VOLATILE</>, so that calls to it cannot be optimized away.
1476 Even a function with no side-effects needs to be labeled
1477 <literal>VOLATILE</> if its value can change within a single query;
some examples are <literal>random()</>, <literal>currval()</>, and
<literal>timeofday()</>.
Another important example is that the <function>current_timestamp</>
family of functions qualifies as <literal>STABLE</>, since their values do
not change within a transaction.
1489 There is relatively little difference between <literal>STABLE</> and
1490 <literal>IMMUTABLE</> categories when considering simple interactive
1491 queries that are planned and immediately executed: it doesn't matter
1492 a lot whether a function is executed once during planning or once during
1493 query execution startup. But there is a big difference if the plan is
1494 saved and reused later. Labeling a function <literal>IMMUTABLE</> when
1495 it really isn't might allow it to be prematurely folded to a constant during
1496 planning, resulting in a stale value being re-used during subsequent uses
1497 of the plan. This is a hazard when using prepared statements or when
1498 using function languages that cache plans (such as
1499 <application>PL/pgSQL</>).
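The stale-constant hazard can be illustrated with a small standalone C model.
This is only a sketch and does not use any <productname>PostgreSQL</productname>
interfaces: <literal>zone_offset_hours</literal> stands in for a configuration
parameter, and <literal>ToyPlan</literal>, <literal>toy_plan</literal>, and
<literal>toy_execute</literal> are hypothetical names that imitate a saved plan
in which a call labeled <literal>IMMUTABLE</literal> has been folded to a constant.

```c
#include <assert.h>
#include <stdbool.h>

/* Stands in for a configuration parameter (assumption for illustration). */
int zone_offset_hours = 0;

/* The result depends on the setting above, so this is at best STABLE. */
int local_hour(int utc_hour)
{
    assert(utc_hour >= 0 && utc_hour < 24);
    return (utc_hour + zone_offset_hours + 24) % 24;
}

/* A toy "plan": either a pre-evaluated constant or a deferred call. */
typedef struct ToyPlan
{
    bool folded;    /* true if the call was folded at plan time */
    int  constant;  /* value computed at plan time, if folded */
    int  arg;       /* argument to pass at execution time otherwise */
} ToyPlan;

/* Planning: an IMMUTABLE label permits folding the call to a constant. */
ToyPlan toy_plan(int utc_hour, bool labeled_immutable)
{
    ToyPlan plan = { labeled_immutable, 0, utc_hour };

    if (labeled_immutable)
        plan.constant = local_hour(utc_hour);
    return plan;
}

/* Execution: a folded plan never calls the function again. */
int toy_execute(const ToyPlan *plan)
{
    return plan->folded ? plan->constant : local_hour(plan->arg);
}
```

If a plan is built with the call folded while the offset is 0 and the offset
is later changed to 2, executing the saved plan still yields the old value,
whereas a plan that defers the call reflects the new setting.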
1503 For functions written in SQL or in any of the standard procedural
1504 languages, there is a second important property determined by the
1505 volatility category, namely the visibility of any data changes that have
1506 been made by the SQL command that is calling the function. A
1507 <literal>VOLATILE</> function will see such changes, a <literal>STABLE</>
1508 or <literal>IMMUTABLE</> function will not. This behavior is implemented
1509 using the snapshotting behavior of MVCC (see <xref linkend="mvcc">):
1510 <literal>STABLE</> and <literal>IMMUTABLE</> functions use a snapshot
1511 established as of the start of the calling query, whereas
1512 <literal>VOLATILE</> functions obtain a fresh snapshot at the start of
1513 each query they execute.
1518 Functions written in C can manage snapshots however they want, but it's
1519 usually a good idea to make C functions work this way too.
1524 Because of this snapshotting behavior,
1525 a function containing only <command>SELECT</> commands can safely be
1526 marked <literal>STABLE</>, even if it selects from tables that might be
1527 undergoing modifications by concurrent queries.
1528 <productname>PostgreSQL</productname> will execute all commands of a
1529 <literal>STABLE</> function using the snapshot established for the
calling query, and so it will see a fixed view of the database throughout
that query.
1535 The same snapshotting behavior is used for <command>SELECT</> commands
1536 within <literal>IMMUTABLE</> functions. It is generally unwise to select
1537 from database tables within an <literal>IMMUTABLE</> function at all,
1538 since the immutability will be broken if the table contents ever change.
However, <productname>PostgreSQL</productname> does not prevent you
from doing so.
1544 A common error is to label a function <literal>IMMUTABLE</> when its
1545 results depend on a configuration parameter. For example, a function
1546 that manipulates timestamps might well have results that depend on the
1547 <xref linkend="guc-timezone"> setting. For safety, such functions should
1548 be labeled <literal>STABLE</> instead.
1553 <productname>PostgreSQL</productname> requires that <literal>STABLE</>
1554 and <literal>IMMUTABLE</> functions contain no SQL commands other
1555 than <command>SELECT</> to prevent data modification.
1556 (This is not a completely bulletproof test, since such functions could
1557 still call <literal>VOLATILE</> functions that modify the database.
1558 If you do that, you will find that the <literal>STABLE</> or
1559 <literal>IMMUTABLE</> function does not notice the database changes
1560 applied by the called function, since they are hidden from its snapshot.)
1565 <sect1 id="xfunc-pl">
1566 <title>Procedural Language Functions</title>
1569 <productname>PostgreSQL</productname> allows user-defined functions
1570 to be written in other languages besides SQL and C. These other
1571 languages are generically called <firstterm>procedural
1572 languages</firstterm> (<acronym>PL</>s).
1573 Procedural languages aren't built into the
1574 <productname>PostgreSQL</productname> server; they are offered
1575 by loadable modules.
See <xref linkend="xplang"> and following chapters for more
information.
1581 <sect1 id="xfunc-internal">
1582 <title>Internal Functions</title>
1584 <indexterm zone="xfunc-internal"><primary>function</><secondary>internal</></>
1587 Internal functions are functions written in C that have been statically
1588 linked into the <productname>PostgreSQL</productname> server.
1589 The <quote>body</quote> of the function definition
1590 specifies the C-language name of the function, which need not be the
1591 same as the name being declared for SQL use.
1592 (For reasons of backward compatibility, an empty body
1593 is accepted as meaning that the C-language function name is the
1594 same as the SQL name.)
1598 Normally, all internal functions present in the
1599 server are declared during the initialization of the database cluster
1600 (see <xref linkend="creating-cluster">),
1601 but a user could use <command>CREATE FUNCTION</command>
1602 to create additional alias names for an internal function.
1603 Internal functions are declared in <command>CREATE FUNCTION</command>
1604 with language name <literal>internal</literal>. For instance, to
1605 create an alias for the <function>sqrt</function> function:
CREATE FUNCTION square_root(double precision) RETURNS double precision
    AS 'dsqrt' LANGUAGE internal STRICT;
1612 (Most internal functions expect to be declared <quote>strict</quote>.)
1617 Not all <quote>predefined</quote> functions are
1618 <quote>internal</quote> in the above sense. Some predefined
1619 functions are written in SQL.
1624 <sect1 id="xfunc-c">
1625 <title>C-Language Functions</title>
1627 <indexterm zone="xfunc-c">
1628 <primary>function</primary>
1629 <secondary>user-defined</secondary>
1630 <tertiary>in C</tertiary>
1634 User-defined functions can be written in C (or a language that can
1635 be made compatible with C, such as C++). Such functions are
1636 compiled into dynamically loadable objects (also called shared
1637 libraries) and are loaded by the server on demand. The dynamic
1638 loading feature is what distinguishes <quote>C language</> functions
1639 from <quote>internal</> functions — the actual coding conventions
1640 are essentially the same for both. (Hence, the standard internal
function library is a rich source of coding examples for user-defined
C functions.)
1646 Currently only one calling convention is used for C functions
1647 (<quote>version 1</quote>). Support for that calling convention is
1648 indicated by writing a <literal>PG_FUNCTION_INFO_V1()</literal> macro
1649 call for the function, as illustrated below.
1652 <sect2 id="xfunc-c-dynload">
1653 <title>Dynamic Loading</title>
1655 <indexterm zone="xfunc-c-dynload">
1656 <primary>dynamic loading</primary>
1660 The first time a user-defined function in a particular
1661 loadable object file is called in a session,
1662 the dynamic loader loads that object file into memory so that the
1663 function can be called. The <command>CREATE FUNCTION</command>
1664 for a user-defined C function must therefore specify two pieces of
1665 information for the function: the name of the loadable
1666 object file, and the C name (link symbol) of the specific function to call
1667 within that object file. If the C name is not explicitly specified then
1668 it is assumed to be the same as the SQL function name.
1672 The following algorithm is used to locate the shared object file
based on the name given in the <command>CREATE FUNCTION</command>
command:
1679 If the name is an absolute path, the given file is loaded.
1685 If the name starts with the string <literal>$libdir</literal>,
that part is replaced by the <productname>PostgreSQL</> package
library directory
name, which is determined at build time.<indexterm><primary>$libdir</></>
1694 If the name does not contain a directory part, the file is
1695 searched for in the path specified by the configuration variable
1696 <xref linkend="guc-dynamic-library-path">.<indexterm><primary>dynamic_library_path</></>
1702 Otherwise (the file was not found in the path, or it contains a
1703 non-absolute directory part), the dynamic loader will try to
1704 take the name as given, which will most likely fail. (It is
1705 unreliable to depend on the current working directory.)
1710 If this sequence does not work, the platform-specific shared
1711 library file name extension (often <filename>.so</filename>) is
1712 appended to the given name and this sequence is tried again. If
1713 that fails as well, the load will fail.
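The first pass of the lookup rules above can be sketched in standalone C.
This is not the server's implementation: the names
<literal>LoadRule</literal>, <literal>classify_library_name</literal>, and
<literal>expand_library_name</literal> are hypothetical, only Unix-style paths
with <literal>/</literal> separators are assumed, and neither the search of
the dynamic library path nor the retry with a file name extension is modeled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Which of the documented rules applies to a given library name. */
typedef enum LoadRule
{
    LOAD_ABSOLUTE,      /* absolute path: load the file as given */
    LOAD_LIBDIR,        /* starts with $libdir: substitute the directory */
    LOAD_SEARCH_PATH,   /* no directory part: search the library path */
    LOAD_AS_GIVEN       /* relative directory part: try the name as is */
} LoadRule;

LoadRule classify_library_name(const char *name)
{
    assert(name != NULL);
    if (name[0] == '/')
        return LOAD_ABSOLUTE;
    if (strncmp(name, "$libdir", 7) == 0)
        return LOAD_LIBDIR;
    if (strchr(name, '/') == NULL)
        return LOAD_SEARCH_PATH;
    return LOAD_AS_GIVEN;
}

/*
 * Write the name to "out", replacing a leading $libdir with pkglibdir.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int expand_library_name(const char *name, const char *pkglibdir,
                        char *out, size_t outsize)
{
    int n;

    if (classify_library_name(name) == LOAD_LIBDIR)
        n = snprintf(out, outsize, "%s%s", pkglibdir, name + 7);
    else
        n = snprintf(out, outsize, "%s", name);
    return (n >= 0 && (size_t) n < outsize) ? 0 : -1;
}
```

For example, with a package library directory of
<filename>/opt/pg/lib</filename>, the name <literal>$libdir/foo</literal>
expands to <filename>/opt/pg/lib/foo</filename>, while a bare
<literal>foo</literal> is classified as a name to be searched for in the path.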
1717 It is recommended to locate shared libraries either relative to
1718 <literal>$libdir</literal> or through the dynamic library path.
1719 This simplifies version upgrades if the new installation is at a
1720 different location. The actual directory that
1721 <literal>$libdir</literal> stands for can be found out with the
1722 command <literal>pg_config --pkglibdir</literal>.
1726 The user ID the <productname>PostgreSQL</productname> server runs
1727 as must be able to traverse the path to the file you intend to
1728 load. Making the file or a higher-level directory not readable
1729 and/or not executable by the <systemitem>postgres</systemitem>
1730 user is a common mistake.
1734 In any case, the file name that is given in the
1735 <command>CREATE FUNCTION</command> command is recorded literally
1736 in the system catalogs, so if the file needs to be loaded again
1737 the same procedure is applied.
1742 <productname>PostgreSQL</productname> will not compile a C function
1743 automatically. The object file must be compiled before it is referenced
1744 in a <command>CREATE
FUNCTION</> command. See <xref linkend="dfunc"> for additional
information.
1750 <indexterm zone="xfunc-c-dynload">
1751 <primary>magic block</primary>
1755 To ensure that a dynamically loaded object file is not loaded into an
1756 incompatible server, <productname>PostgreSQL</productname> checks that the
1757 file contains a <quote>magic block</> with the appropriate contents.
1758 This allows the server to detect obvious incompatibilities, such as code
1759 compiled for a different major version of
1760 <productname>PostgreSQL</productname>. A magic block is required as of
1761 <productname>PostgreSQL</productname> 8.2. To include a magic block,
1762 write this in one (and only one) of the module source files, after having
1763 included the header <filename>fmgr.h</>:
#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

The <literal>#ifdef</> test can be omitted if the code doesn't
need to compile against pre-8.2 <productname>PostgreSQL</productname>
releases.
1777 After it is used for the first time, a dynamically loaded object
1778 file is retained in memory. Future calls in the same session to
1779 the function(s) in that file will only incur the small overhead of
1780 a symbol table lookup. If you need to force a reload of an object
1781 file, for example after recompiling it, begin a fresh session.
1784 <indexterm zone="xfunc-c-dynload">
1785 <primary>_PG_init</primary>
1787 <indexterm zone="xfunc-c-dynload">
1788 <primary>_PG_fini</primary>
1790 <indexterm zone="xfunc-c-dynload">
1791 <primary>library initialization function</primary>
1793 <indexterm zone="xfunc-c-dynload">
1794 <primary>library finalization function</primary>
1798 Optionally, a dynamically loaded file can contain initialization and
1799 finalization functions. If the file includes a function named
1800 <function>_PG_init</>, that function will be called immediately after
1801 loading the file. The function receives no parameters and should
1802 return void. If the file includes a function named
1803 <function>_PG_fini</>, that function will be called immediately before
1804 unloading the file. Likewise, the function receives no parameters and
1805 should return void. Note that <function>_PG_fini</> will only be called
1806 during an unload of the file, not during process termination.
1807 (Presently, unloads are disabled and will never occur, but this may
1808 change in the future.)
1813 <sect2 id="xfunc-c-basetype">
1814 <title>Base Types in C-Language Functions</title>
1816 <indexterm zone="xfunc-c-basetype">
1817 <primary>data type</primary>
1818 <secondary>internal organization</secondary>
1822 To know how to write C-language functions, you need to know how
1823 <productname>PostgreSQL</productname> internally represents base
1824 data types and how they can be passed to and from functions.
1825 Internally, <productname>PostgreSQL</productname> regards a base
1826 type as a <quote>blob of memory</quote>. The user-defined
1827 functions that you define over a type in turn define the way that
1828 <productname>PostgreSQL</productname> can operate on it. That
1829 is, <productname>PostgreSQL</productname> will only store and
1830 retrieve the data from disk and use your user-defined functions
1831 to input, process, and output the data.
1835 Base types can have one of three internal formats:
1840 pass by value, fixed-length
1845 pass by reference, fixed-length
1850 pass by reference, variable-length
1857 By-value types can only be 1, 2, or 4 bytes in length
1858 (also 8 bytes, if <literal>sizeof(Datum)</literal> is 8 on your machine).
1859 You should be careful to define your types such that they will be the
1860 same size (in bytes) on all architectures. For example, the
1861 <literal>long</literal> type is dangerous because it is 4 bytes on some
1862 machines and 8 bytes on others, whereas <type>int</type> type is 4 bytes
1863 on most Unix machines. A reasonable implementation of the
1864 <type>int4</type> type on Unix machines might be:
/* 4-byte integer, passed by value */
typedef int int4;
1871 (The actual PostgreSQL C code calls this type <type>int32</type>, because
1872 it is a convention in C that <type>int<replaceable>XX</replaceable></type>
1873 means <replaceable>XX</replaceable> <emphasis>bits</emphasis>. Note
1874 therefore also that the C type <type>int8</type> is 1 byte in size. The
1875 SQL type <type>int8</type> is called <type>int64</type> in C. See also
1876 <xref linkend="xfunc-c-type-table">.)
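The portability concern can be checked at compile time in plain C.
The following sketch uses the standard <filename>stdint.h</filename> types
rather than <productname>PostgreSQL</productname>'s own headers;
<literal>my_int4</literal>, <literal>my_int8</literal>, and
<literal>bits_in_long</literal> are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical aliases; PostgreSQL's own headers use int32 and int64. */
typedef int32_t my_int4;    /* SQL integer: always 32 bits */
typedef int64_t my_int8;    /* SQL bigint: always 64 bits */

/* Compilation fails if either width is not what the name promises. */
static_assert(sizeof(my_int4) * CHAR_BIT == 32, "my_int4 must be 32 bits");
static_assert(sizeof(my_int8) * CHAR_BIT == 64, "my_int8 must be 64 bits");

/* Width of long in bits; commonly 32 or 64, depending on the platform. */
int bits_in_long(void)
{
    return (int) (sizeof(long) * CHAR_BIT);
}
```

A type defined in terms of <literal>long</literal> would therefore change
size between platforms, while the fixed-width aliases would not.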
1880 On the other hand, fixed-length types of any size can
1881 be passed by-reference. For example, here is a sample
1882 implementation of a <productname>PostgreSQL</productname> type:
/* 16-byte structure, passed by reference */
typedef struct { double x, y; } Point;
1892 Only pointers to such types can be used when passing
1893 them in and out of <productname>PostgreSQL</productname> functions.
1894 To return a value of such a type, allocate the right amount of
1895 memory with <literal>palloc</literal>, fill in the allocated memory,
1896 and return a pointer to it. (Also, if you just want to return the
1897 same value as one of your input arguments that's of the same data type,
1898 you can skip the extra <literal>palloc</literal> and just return the
1899 pointer to the input value.)
1903 Finally, all variable-length types must also be passed
1904 by reference. All variable-length types must begin
1905 with an opaque length field of exactly 4 bytes, which will be set
1906 by <symbol>SET_VARSIZE</symbol>; never set this field directly! All data to
1907 be stored within that type must be located in the memory
1908 immediately following that length field. The
1909 length field contains the total length of the structure,
that is, it includes the size of the length field
itself.
1915 Another important point is to avoid leaving any uninitialized bits
1916 within data type values; for example, take care to zero out any
1917 alignment padding bytes that might be present in structs. Without
1918 this, logically-equivalent constants of your data type might be
seen as unequal by the planner, leading to inefficient (though not
incorrect) plans.
1925 <emphasis>Never</> modify the contents of a pass-by-reference input
1926 value. If you do so you are likely to corrupt on-disk data, since
1927 the pointer you are given might point directly into a disk buffer.
1928 The sole exception to this rule is explained in
1929 <xref linkend="xaggr">.
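The two rules above (leave no uninitialized padding, and never modify a
pass-by-reference input) can be illustrated with a standalone C sketch.
It does not use <productname>PostgreSQL</productname> headers:
<literal>Tagged</literal> and <literal>tagged_double</literal> are
hypothetical, and <function>calloc</function> stands in for the server's
allocator. In a real function, <function>palloc0</function> (or
<function>palloc</function> followed by <function>memset</function>) would
typically play that role.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical fixed-length type. Compilers typically insert padding
 * bytes between "tag" and "value" to align the double.
 */
typedef struct Tagged
{
    char   tag;
    double value;
} Tagged;

/*
 * Return a newly allocated result with the value doubled. The input is
 * only read (note the const), and the result starts out all-zero so that
 * padding bytes have a defined content.
 */
Tagged *tagged_double(const Tagged *in)
{
    Tagged *out;

    assert(in != NULL);
    out = calloc(1, sizeof(Tagged));    /* zeroes the padding as well */
    if (out == NULL)
        return NULL;
    out->tag = in->tag;
    out->value = in->value * 2.0;
    return out;
}
```

Because every result starts from zeroed memory, two results computed from
the same input are expected to compare equal byte for byte, which is the
property the planner relies on when comparing constants.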
As an example, we can define the type <type>text</type> as
follows:

typedef struct {
    int32 length;
    char data[FLEXIBLE_ARRAY_MEMBER];
} text;
1944 The <literal>[FLEXIBLE_ARRAY_MEMBER]</> notation means that the actual
1945 length of the data part is not specified by this declaration.
When manipulating
variable-length types, we must be careful to allocate
1951 the correct amount of memory and set the length field correctly.
1952 For example, if we wanted to store 40 bytes in a <structname>text</>
1953 structure, we might use a code fragment like this:
1955 <programlisting><![CDATA[
1956 #include "postgres.h"
1958 char buffer[40]; /* our source data */
1960 text *destination = (text *) palloc(VARHDRSZ + 40);
1961 SET_VARSIZE(destination, VARHDRSZ + 40);
1962 memcpy(destination->data, buffer, 40);
1967 <literal>VARHDRSZ</> is the same as <literal>sizeof(int32)</>, but
1968 it's considered good style to use the macro <literal>VARHDRSZ</>
1969 to refer to the size of the overhead for a variable-length type.
1970 Also, the length field <emphasis>must</> be set using the
1971 <literal>SET_VARSIZE</> macro, not by simple assignment.
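The layout described above (a 4-byte header whose value counts the header
itself, followed immediately by the data) can be modeled in standalone C.
This is a simplified sketch under stated assumptions, not the server's
representation: <literal>ModelVarlena</literal>, <literal>MODEL_HDRSZ</literal>,
<literal>model_make</literal>, and <literal>model_payload_len</literal> are
hypothetical, and the model stores the length as a plain integer, whereas
real code must treat the header as opaque and go through
<literal>SET_VARSIZE</literal> and <literal>VARSIZE</literal>.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Size of the length header in this model (4 bytes). */
#define MODEL_HDRSZ ((int32_t) sizeof(int32_t))

typedef struct ModelVarlena
{
    int32_t total_len;  /* total size in bytes, header included */
    char    data[];     /* payload follows the header immediately */
} ModelVarlena;

/* Allocate header plus payload, record the total length, copy the data. */
ModelVarlena *model_make(const void *src, int32_t nbytes)
{
    ModelVarlena *v;

    assert(nbytes >= 0);
    v = malloc((size_t) MODEL_HDRSZ + (size_t) nbytes);
    if (v == NULL)
        return NULL;
    v->total_len = MODEL_HDRSZ + nbytes;
    if (nbytes > 0)
        memcpy(v->data, src, (size_t) nbytes);
    return v;
}

/* Payload size is the total length minus the header. */
int32_t model_payload_len(const ModelVarlena *v)
{
    return v->total_len - MODEL_HDRSZ;
}
```

Storing 40 bytes this way records a total length of 44, matching the
<literal>VARHDRSZ + 40</literal> computation in the fragment above.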
1975 <xref linkend="xfunc-c-type-table"> specifies which C type
1976 corresponds to which SQL type when writing a C-language function
1977 that uses a built-in type of <productname>PostgreSQL</>.
1978 The <quote>Defined In</quote> column gives the header file that
1979 needs to be included to get the type definition. (The actual
1980 definition might be in a different file that is included by the
1981 listed file. It is recommended that users stick to the defined
1982 interface.) Note that you should always include
1983 <filename>postgres.h</filename> first in any source file, because
1984 it declares a number of things that you will need anyway.
1987 <table tocentry="1" id="xfunc-c-type-table">
1988 <title>Equivalent C Types for Built-in SQL Types</title>
2005 <entry><type>abstime</type></entry>
2006 <entry><type>AbsoluteTime</type></entry>
2007 <entry><filename>utils/nabstime.h</filename></entry>
2010 <entry><type>bigint</type> (<type>int8</type>)</entry>
2011 <entry><type>int64</type></entry>
2012 <entry><filename>postgres.h</filename></entry>
2015 <entry><type>boolean</type></entry>
2016 <entry><type>bool</type></entry>
2017 <entry><filename>postgres.h</filename> (maybe compiler built-in)</entry>
2020 <entry><type>box</type></entry>
2021 <entry><type>BOX*</type></entry>
2022 <entry><filename>utils/geo_decls.h</filename></entry>
2025 <entry><type>bytea</type></entry>
2026 <entry><type>bytea*</type></entry>
2027 <entry><filename>postgres.h</filename></entry>
2030 <entry><type>"char"</type></entry>
2031 <entry><type>char</type></entry>
2032 <entry>(compiler built-in)</entry>
2035 <entry><type>character</type></entry>
2036 <entry><type>BpChar*</type></entry>
2037 <entry><filename>postgres.h</filename></entry>
2040 <entry><type>cid</type></entry>
2041 <entry><type>CommandId</type></entry>
2042 <entry><filename>postgres.h</filename></entry>
2045 <entry><type>date</type></entry>
2046 <entry><type>DateADT</type></entry>
2047 <entry><filename>utils/date.h</filename></entry>
2050 <entry><type>smallint</type> (<type>int2</type>)</entry>
2051 <entry><type>int16</type></entry>
2052 <entry><filename>postgres.h</filename></entry>
2055 <entry><type>int2vector</type></entry>
2056 <entry><type>int2vector*</type></entry>
2057 <entry><filename>postgres.h</filename></entry>
2060 <entry><type>integer</type> (<type>int4</type>)</entry>
2061 <entry><type>int32</type></entry>
2062 <entry><filename>postgres.h</filename></entry>
2065 <entry><type>real</type> (<type>float4</type>)</entry>
2066 <entry><type>float4*</type></entry>
2067 <entry><filename>postgres.h</filename></entry>
2070 <entry><type>double precision</type> (<type>float8</type>)</entry>
2071 <entry><type>float8*</type></entry>
2072 <entry><filename>postgres.h</filename></entry>
2075 <entry><type>interval</type></entry>
2076 <entry><type>Interval*</type></entry>
2077 <entry><filename>datatype/timestamp.h</filename></entry>
2080 <entry><type>lseg</type></entry>
2081 <entry><type>LSEG*</type></entry>
2082 <entry><filename>utils/geo_decls.h</filename></entry>
2085 <entry><type>name</type></entry>
2086 <entry><type>Name</type></entry>
2087 <entry><filename>postgres.h</filename></entry>
2090 <entry><type>oid</type></entry>
2091 <entry><type>Oid</type></entry>
2092 <entry><filename>postgres.h</filename></entry>
2095 <entry><type>oidvector</type></entry>
2096 <entry><type>oidvector*</type></entry>
2097 <entry><filename>postgres.h</filename></entry>
2100 <entry><type>path</type></entry>
2101 <entry><type>PATH*</type></entry>
2102 <entry><filename>utils/geo_decls.h</filename></entry>
2105 <entry><type>point</type></entry>
2106 <entry><type>POINT*</type></entry>
2107 <entry><filename>utils/geo_decls.h</filename></entry>
2110 <entry><type>regproc</type></entry>
2111 <entry><type>regproc</type></entry>
2112 <entry><filename>postgres.h</filename></entry>
2115 <entry><type>reltime</type></entry>
2116 <entry><type>RelativeTime</type></entry>
2117 <entry><filename>utils/nabstime.h</filename></entry>
2120 <entry><type>text</type></entry>
2121 <entry><type>text*</type></entry>
2122 <entry><filename>postgres.h</filename></entry>
2125 <entry><type>tid</type></entry>
2126 <entry><type>ItemPointer</type></entry>
2127 <entry><filename>storage/itemptr.h</filename></entry>
2130 <entry><type>time</type></entry>
2131 <entry><type>TimeADT</type></entry>
2132 <entry><filename>utils/date.h</filename></entry>
2135 <entry><type>time with time zone</type></entry>
2136 <entry><type>TimeTzADT</type></entry>
2137 <entry><filename>utils/date.h</filename></entry>
2140 <entry><type>timestamp</type></entry>
2141 <entry><type>Timestamp*</type></entry>
2142 <entry><filename>datatype/timestamp.h</filename></entry>
2145 <entry><type>tinterval</type></entry>
2146 <entry><type>TimeInterval</type></entry>
2147 <entry><filename>utils/nabstime.h</filename></entry>
2150 <entry><type>varchar</type></entry>
2151 <entry><type>VarChar*</type></entry>
2152 <entry><filename>postgres.h</filename></entry>
2155 <entry><type>xid</type></entry>
2156 <entry><type>TransactionId</type></entry>
2157 <entry><filename>postgres.h</filename></entry>
2164 Now that we've gone over all of the possible structures
2165 for base types, we can show some examples of real functions.
2170 <title>Version 1 Calling Conventions</title>
2173 The version-1 calling convention relies on macros to suppress most
2174 of the complexity of passing arguments and results. The C declaration
2175 of a version-1 function is always:
2177 Datum funcname(PG_FUNCTION_ARGS)
2179 In addition, the macro call:
2181 PG_FUNCTION_INFO_V1(funcname);
2183 must appear in the same source file. (Conventionally, it's
2184 written just before the function itself.) This macro call is not
2185 needed for <literal>internal</>-language functions, since
2186 <productname>PostgreSQL</> assumes that all internal functions
2187 use the version-1 convention. It is, however, required for
2188 dynamically-loaded functions.
2192 In a version-1 function, each actual argument is fetched using a
2193 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
macro that corresponds to the argument's data type. In non-strict
functions, first check whether the argument is null
using <function>PG_ARGISNULL()</function>.
2197 The result is returned using a
2198 <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2199 macro for the return type.
2200 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2201 takes as its argument the number of the function argument to
2202 fetch, where the count starts at 0.
2203 <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2204 takes as its argument the actual value to return.
2208 Here are some examples using the version-1 calling convention:
<programlisting><![CDATA[
#include "postgres.h"
#include <string.h>
#include "fmgr.h"
#include "utils/geo_decls.h"

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

/* by value */

PG_FUNCTION_INFO_V1(add_one);

Datum
add_one(PG_FUNCTION_ARGS)
{
    int32   arg = PG_GETARG_INT32(0);

    PG_RETURN_INT32(arg + 1);
}

/* by reference, fixed length */

PG_FUNCTION_INFO_V1(add_one_float8);

Datum
add_one_float8(PG_FUNCTION_ARGS)
{
    /* The macros for FLOAT8 hide its pass-by-reference nature. */
    float8   arg = PG_GETARG_FLOAT8(0);

    PG_RETURN_FLOAT8(arg + 1.0);
}

PG_FUNCTION_INFO_V1(makepoint);

Datum
makepoint(PG_FUNCTION_ARGS)
{
    /* Here, the pass-by-reference nature of Point is not hidden. */
    Point     *pointx = PG_GETARG_POINT_P(0);
    Point     *pointy = PG_GETARG_POINT_P(1);
    Point     *new_point = (Point *) palloc(sizeof(Point));

    new_point->x = pointx->x;
    new_point->y = pointy->y;

    PG_RETURN_POINT_P(new_point);
}

/* by reference, variable length */

PG_FUNCTION_INFO_V1(copytext);

Datum
copytext(PG_FUNCTION_ARGS)
{
    text     *t = PG_GETARG_TEXT_PP(0);

    /*
     * VARSIZE_ANY_EXHDR is the size of the struct in bytes, minus the
     * VARHDRSZ or VARHDRSZ_SHORT of its header.  Construct the copy with a
     * full-length header.
     */
    text     *new_t = (text *) palloc(VARSIZE_ANY_EXHDR(t) + VARHDRSZ);
    SET_VARSIZE(new_t, VARSIZE_ANY_EXHDR(t) + VARHDRSZ);

    /*
     * VARDATA is a pointer to the data region of the new struct.  The source
     * could be a short datum, so retrieve its data through VARDATA_ANY.
     */
    memcpy((void *) VARDATA(new_t), /* destination */
           (void *) VARDATA_ANY(t), /* source */
           VARSIZE_ANY_EXHDR(t));   /* how many bytes */
    PG_RETURN_TEXT_P(new_t);
}

PG_FUNCTION_INFO_V1(concat_text);

Datum
concat_text(PG_FUNCTION_ARGS)
{
    text  *arg1 = PG_GETARG_TEXT_PP(0);
    text  *arg2 = PG_GETARG_TEXT_PP(1);
    int32  arg1_size = VARSIZE_ANY_EXHDR(arg1);
    int32  arg2_size = VARSIZE_ANY_EXHDR(arg2);
    int32  new_text_size = arg1_size + arg2_size + VARHDRSZ;
    text  *new_text = (text *) palloc(new_text_size);

    SET_VARSIZE(new_text, new_text_size);
    memcpy(VARDATA(new_text), VARDATA_ANY(arg1), arg1_size);
    memcpy(VARDATA(new_text) + arg1_size, VARDATA_ANY(arg2), arg2_size);
    PG_RETURN_TEXT_P(new_text);
}
]]></programlisting>
2310 Supposing that the above code has been prepared in file
2311 <filename>funcs.c</filename> and compiled into a shared object,
2312 we could define the functions to <productname>PostgreSQL</productname>
2313 with commands like this:
CREATE FUNCTION add_one(integer) RETURNS integer
     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one'
     LANGUAGE C STRICT;

-- note overloading of SQL function name "add_one"
CREATE FUNCTION add_one(double precision) RETURNS double precision
     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one_float8'
     LANGUAGE C STRICT;

CREATE FUNCTION makepoint(point, point) RETURNS point
     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'makepoint'
     LANGUAGE C STRICT;

CREATE FUNCTION copytext(text) RETURNS text
     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'copytext'
     LANGUAGE C STRICT;

CREATE FUNCTION concat_text(text, text) RETURNS text
     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'concat_text'
     LANGUAGE C STRICT;
2340 Here, <replaceable>DIRECTORY</replaceable> stands for the
2341 directory of the shared library file (for instance the
2342 <productname>PostgreSQL</productname> tutorial directory, which
2343 contains the code for the examples used in this section).
2344 (Better style would be to use just <literal>'funcs'</> in the
2345 <literal>AS</> clause, after having added
2346 <replaceable>DIRECTORY</replaceable> to the search path. In any
2347 case, we can omit the system-specific extension for a shared
2348 library, commonly <literal>.so</literal>.)
Notice that we have specified the functions as <quote>strict</quote>,
meaning that
the system should automatically assume a null result if any input
value is null. By doing this, we avoid having to check for null inputs
in the function code. Without this, we'd have to check for null values
explicitly, using <function>PG_ARGISNULL()</function>.
At first glance, the version-1 coding conventions might appear to be just
pointless obscurantism compared with plain <literal>C</> calling
conventions. They do, however, allow us to deal with <literal>NULL</>able
arguments/return values, and <quote>toasted</quote> (compressed or
out-of-line) values.
2369 The macro <function>PG_ARGISNULL(<replaceable>n</>)</function>
2370 allows a function to test whether each input is null. (Of course, doing
this is only necessary in functions not declared <quote>strict</>.)
As with the
<function>PG_GETARG_<replaceable>xxx</replaceable>()</function> macros,
the input arguments are counted beginning at zero. Note that one
2375 should refrain from executing
2376 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> until
2377 one has verified that the argument isn't null.
2378 To return a null result, execute <function>PG_RETURN_NULL()</function>;
2379 this works in both strict and nonstrict functions.
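The division of labor between a strict declaration and an explicit null
test can be illustrated outside the server. The following is a minimal,
self-contained sketch in plain C, not PostgreSQL code:
<literal>CallInfo</literal>, <literal>call_function</literal>, and the two
sample functions are hypothetical stand-ins for the function manager and for
version-1 functions, chosen only to show the order of the checks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the call information passed to a function. */
typedef struct CallInfo
{
    int         nargs;      /* number of arguments */
    const int  *args;       /* argument values (meaningful only if not null) */
    const bool *argnull;    /* true where the argument is SQL null */
    bool        isnull;     /* set to true to return SQL null */
} CallInfo;

typedef int (*Func) (CallInfo *ci);

/* Non-strict style: test null-ness before fetching each argument. */
int
add_or_zero(CallInfo *ci)
{
    int     a = ci->argnull[0] ? 0 : ci->args[0];
    int     b = ci->argnull[1] ? 0 : ci->args[1];

    return a + b;
}

/* Strict style: the body may assume that no argument is null. */
int
add_strict(CallInfo *ci)
{
    return ci->args[0] + ci->args[1];
}

/*
 * Model of the caller: for a strict function, a null input yields a null
 * result and the function body is never entered.
 */
int
call_function(Func fn, bool strict, CallInfo *ci)
{
    ci->isnull = false;
    if (strict)
    {
        for (int i = 0; i < ci->nargs; i++)
        {
            if (ci->argnull[i])
            {
                ci->isnull = true;
                return 0;
            }
        }
    }
    return fn(ci);
}
```

In this model <literal>add_strict</literal> never sees a null input, while
<literal>add_or_zero</literal> has to look at the null flag before it reads
each value, which mirrors the advice above about testing before fetching.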
Other options provided by the version-1 interface are two
variants of the basic
<function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
macros. The first of these,
2387 <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>,
2388 guarantees to return a copy of the specified argument that is
2389 safe for writing into. (The normal macros will sometimes return a
2390 pointer to a value that is physically stored in a table, which
2391 must not be written to. Using the
2392 <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>
2393 macros guarantees a writable result.)
2394 The second variant consists of the
2395 <function>PG_GETARG_<replaceable>xxx</replaceable>_SLICE()</function>
2396 macros which take three arguments. The first is the number of the
2397 function argument (as above). The second and third are the offset and
2398 length of the segment to be returned. Offsets are counted from
2399 zero, and a negative length requests that the remainder of the
2400 value be returned. These macros provide more efficient access to
2401 parts of large values in the case where they have storage type
2402 <quote>external</quote>. (The storage type of a column can be specified using
2403 <literal>ALTER TABLE <replaceable>tablename</replaceable> ALTER
2404 COLUMN <replaceable>colname</replaceable> SET STORAGE
2405 <replaceable>storagetype</replaceable></literal>. <replaceable>storagetype</replaceable> is one of
2406 <literal>plain</>, <literal>external</>, <literal>extended</literal>,
2407 or <literal>main</>.)
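The offset and length rules of the slice macros can be modeled in a few
lines of plain C. This is only a sketch of the arithmetic:
<literal>slice_bytes</literal> is a hypothetical helper, not part of
<productname>PostgreSQL</productname>, and clamping an over-long request to
the available bytes is an assumption of the sketch rather than something
stated above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy a slice of src (src_len bytes) into dst and return the number of
 * bytes copied.  Offsets count from zero; a negative length requests the
 * remainder of the value.  dst must have room for the slice.
 */
size_t
slice_bytes(const char *src, size_t src_len,
            size_t offset, long length, char *dst)
{
    size_t  avail;
    size_t  n;

    if (offset >= src_len)
        return 0;               /* nothing at or beyond the end */

    avail = src_len - offset;
    if (length < 0 || (size_t) length > avail)
        n = avail;              /* remainder of the value */
    else
        n = (size_t) length;

    memcpy(dst, src + offset, n);
    return n;
}
```

With an eight-byte source, an offset of 2 and a length of 3 select the third
through fifth bytes, and an offset of 5 with a length of -1 selects the last
three.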
2411 Finally, the version-1 function call conventions make it possible
2412 to return set results (<xref linkend="xfunc-c-return-set">) and
2413 implement trigger functions (<xref linkend="triggers">) and
2414 procedural-language call handlers (<xref
2415 linkend="plhandler">). For more details
2416 see <filename>src/backend/utils/fmgr/README</filename> in the
2417 source distribution.
2422 <title>Writing Code</title>
2425 Before we turn to the more advanced topics, we should discuss
2426 some coding rules for <productname>PostgreSQL</productname>
2427 C-language functions. While it might be possible to load functions
2428 written in languages other than C into
2429 <productname>PostgreSQL</productname>, this is usually difficult
2430 (when it is possible at all) because other languages, such as
2431 C++, FORTRAN, or Pascal often do not follow the same calling
2432 convention as C. That is, other languages do not pass argument
2433 and return values between functions in the same way. For this
2434 reason, we will assume that your C-language functions are
2435 actually written in C.
2439 The basic rules for writing and building C functions are as follows:
2444 Use <literal>pg_config
2445 --includedir-server</literal><indexterm><primary>pg_config</><secondary>with user-defined C functions</></>
2446 to find out where the <productname>PostgreSQL</> server header
2447 files are installed on your system (or the system that your
2448 users will be running on).
Compiling and linking your code so that it can be dynamically
loaded into <productname>PostgreSQL</productname> always
requires special flags. See <xref linkend="dfunc"> for a
detailed explanation of how to do it for your particular
operating system.
2464 Remember to define a <quote>magic block</> for your shared library,
2465 as described in <xref linkend="xfunc-c-dynload">.
2471 When allocating memory, use the
2472 <productname>PostgreSQL</productname> functions
2473 <function>palloc</function><indexterm><primary>palloc</></> and <function>pfree</function><indexterm><primary>pfree</></>
2474 instead of the corresponding C library functions
2475 <function>malloc</function> and <function>free</function>.
The memory allocated by <function>palloc</function> will be
freed automatically at the end of each transaction, preventing
memory leaks.
2484 Always zero the bytes of your structures using <function>memset</>
2485 (or allocate them with <function>palloc0</> in the first place).
Even if you assign to each field of your structure, there might be
alignment padding (holes in the structure) that contains
garbage values. Without this zeroing, it's difficult to
2489 support hash indexes or hash joins, as you must pick out only
2490 the significant bits of your data structure to compute a hash.
2491 The planner also sometimes relies on comparing constants via
2492 bitwise equality, so you can get undesirable planning results if
2493 logically-equivalent values aren't bitwise equal.
2499 Most of the internal <productname>PostgreSQL</productname>
2500 types are declared in <filename>postgres.h</filename>, while
2501 the function manager interfaces
2502 (<symbol>PG_FUNCTION_ARGS</symbol>, etc.) are in
2503 <filename>fmgr.h</filename>, so you will need to include at
2504 least these two files. For portability reasons it's best to
2505 include <filename>postgres.h</filename> <emphasis>first</>,
before any other system or user header files. Including
<filename>postgres.h</filename> will also include
<filename>elog.h</filename> and <filename>palloc.h</filename>
for you.
2515 Symbol names defined within object files must not conflict
2516 with each other or with symbols defined in the
2517 <productname>PostgreSQL</productname> server executable. You
2518 will have to rename your functions or variables if you get
2519 error messages to this effect.
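The rule about zeroing structures is easy to demonstrate in isolation. The
following plain C sketch uses a hypothetical <literal>Sample</literal> type
and <function>memset</function> rather than <function>palloc0</function>, so
that it can be compiled outside the server. It assumes, as is usual in
practice, that the compiler leaves padding bytes alone when individual
fields are assigned.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical value type; most ABIs insert padding after "flag". */
typedef struct Sample
{
    char    flag;
    double  value;
} Sample;

/* Zero the whole struct first so that any padding bytes are deterministic. */
void
sample_init(Sample *s, char flag, double value)
{
    memset(s, 0, sizeof(Sample));
    s->flag = flag;
    s->value = value;
}

/* Bitwise comparison over the full struct, padding included. */
bool
sample_bitwise_equal(const Sample *a, const Sample *b)
{
    return memcmp(a, b, sizeof(Sample)) == 0;
}
```

Two values built through <literal>sample_init</literal> with the same fields
compare equal byte for byte even if the memory they occupy previously held
different garbage. Without the <function>memset</function> call, the result
of the comparison would depend on whatever was left in the padding.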
2529 <title>Composite-type Arguments</title>
2532 Composite types do not have a fixed layout like C structures.
2533 Instances of a composite type can contain null fields. In
2534 addition, composite types that are part of an inheritance
2535 hierarchy can have different fields than other members of the
2536 same inheritance hierarchy. Therefore,
2537 <productname>PostgreSQL</productname> provides a function
2538 interface for accessing fields of composite types from C.
2542 Suppose we want to write a function to answer the query:
SELECT name, c_overpaid(emp, 1500) AS overpaid
    FROM emp
    WHERE name = 'Bill' OR name = 'Sam';
2550 Using the version-1 calling conventions, we can define
2551 <function>c_overpaid</> as:
<programlisting><![CDATA[
#include "postgres.h"
#include "executor/executor.h"  /* for GetAttributeByName() */

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

PG_FUNCTION_INFO_V1(c_overpaid);
Datum
c_overpaid(PG_FUNCTION_ARGS)
{
    HeapTupleHeader  t = PG_GETARG_HEAPTUPLEHEADER(0);
    int32            limit = PG_GETARG_INT32(1);
    bool             isnull;
    Datum            salary;

    salary = GetAttributeByName(t, "salary", &isnull);
    if (isnull)
        PG_RETURN_BOOL(false);
    /* Alternatively, we might prefer to do PG_RETURN_NULL() for null salary. */
    PG_RETURN_BOOL(DatumGetInt32(salary) > limit);
}
]]></programlisting>
2583 <function>GetAttributeByName</function> is the
2584 <productname>PostgreSQL</productname> system function that
2585 returns attributes out of the specified row. It has
three arguments: the argument of type <type>HeapTupleHeader</type> passed
into
the function, the name of the desired attribute, and a
2589 return parameter that tells whether the attribute
2590 is null. <function>GetAttributeByName</function> returns a <type>Datum</type>
2591 value that you can convert to the proper data type by using the
2592 appropriate <function>DatumGet<replaceable>XXX</replaceable>()</function>
macro. Note that the return value is meaningless if the null flag is
set; always check the null flag before trying to do anything with the
result.
2599 There is also <function>GetAttributeByNum</function>, which selects
2600 the target attribute by column number instead of name.
2604 The following command declares the function
2605 <function>c_overpaid</function> in SQL:
CREATE FUNCTION c_overpaid(emp, integer) RETURNS boolean
    AS '<replaceable>DIRECTORY</replaceable>/funcs', 'c_overpaid'
    LANGUAGE C STRICT;
2613 Notice we have used <literal>STRICT</> so that we did not have to
2614 check whether the input arguments were NULL.
2619 <title>Returning Rows (Composite Types)</title>
2622 To return a row or composite-type value from a C-language
2623 function, you can use a special API that provides macros and
2624 functions to hide most of the complexity of building composite
2625 data types. To use this API, the source file must include:
2627 #include "funcapi.h"
2632 There are two ways you can build a composite data value (henceforth
2633 a <quote>tuple</>): you can build it from an array of Datum values,
2634 or from an array of C strings that can be passed to the input
2635 conversion functions of the tuple's column data types. In either
2636 case, you first need to obtain or construct a <structname>TupleDesc</>
2637 descriptor for the tuple structure. When working with Datums, you
2638 pass the <structname>TupleDesc</> to <function>BlessTupleDesc</>,
2639 and then call <function>heap_form_tuple</> for each row. When working
2640 with C strings, you pass the <structname>TupleDesc</> to
2641 <function>TupleDescGetAttInMetadata</>, and then call
2642 <function>BuildTupleFromCStrings</> for each row. In the case of a
2643 function returning a set of tuples, the setup steps can all be done
2644 once during the first call of the function.
2648 Several helper functions are available for setting up the needed
2649 <structname>TupleDesc</>. The recommended way to do this in most
2650 functions returning composite values is to call:
TypeFuncClass get_call_result_type(FunctionCallInfo fcinfo,
                                   Oid *resultTypeId,
                                   TupleDesc *resultTupleDesc)
2656 passing the same <literal>fcinfo</> struct passed to the calling function
2657 itself. (This of course requires that you use the version-1
2658 calling conventions.) <varname>resultTypeId</> can be specified
2659 as <literal>NULL</> or as the address of a local variable to receive the
2660 function's result type OID. <varname>resultTupleDesc</> should be the
2661 address of a local <structname>TupleDesc</> variable. Check that the
2662 result is <literal>TYPEFUNC_COMPOSITE</>; if so,
2663 <varname>resultTupleDesc</> has been filled with the needed
2664 <structname>TupleDesc</>. (If it is not, you can report an error along
2665 the lines of <quote>function returning record called in context that
2666 cannot accept type record</quote>.)
2671 <function>get_call_result_type</> can resolve the actual type of a
2672 polymorphic function result; so it is useful in functions that return
2673 scalar polymorphic results, not only functions that return composites.
2674 The <varname>resultTypeId</> output is primarily useful for functions
2675 returning polymorphic scalars.
2681 <function>get_call_result_type</> has a sibling
2682 <function>get_expr_result_type</>, which can be used to resolve the
2683 expected output type for a function call represented by an expression
2684 tree. This can be used when trying to determine the result type from
2685 outside the function itself. There is also
2686 <function>get_func_result_type</>, which can be used when only the
2687 function's OID is available. However these functions are not able
2688 to deal with functions declared to return <structname>record</>, and
2689 <function>get_func_result_type</> cannot resolve polymorphic types,
2690 so you should preferentially use <function>get_call_result_type</>.
2695 Older, now-deprecated functions for obtaining
2696 <structname>TupleDesc</>s are:
2698 TupleDesc RelationNameGetTupleDesc(const char *relname)
to get a <structname>TupleDesc</> for the row type of a named relation,
and:
2703 TupleDesc TypeGetTupleDesc(Oid typeoid, List *colaliases)
2705 to get a <structname>TupleDesc</> based on a type OID. This can
2706 be used to get a <structname>TupleDesc</> for a base or
composite type. It will not work for a function that returns
<structname>record</>, however, and it cannot resolve polymorphic
types.
2713 Once you have a <structname>TupleDesc</>, call:
2715 TupleDesc BlessTupleDesc(TupleDesc tupdesc)
2717 if you plan to work with Datums, or:
2719 AttInMetadata *TupleDescGetAttInMetadata(TupleDesc tupdesc)
2721 if you plan to work with C strings. If you are writing a function
2722 returning set, you can save the results of these functions in the
<structname>FuncCallContext</> structure; use the
<structfield>tuple_desc</> or <structfield>attinmeta</> field
respectively.
2729 When working with Datums, use:
2731 HeapTuple heap_form_tuple(TupleDesc tupdesc, Datum *values, bool *isnull)
2733 to build a <structname>HeapTuple</> given user data in Datum form.
2737 When working with C strings, use:
2739 HeapTuple BuildTupleFromCStrings(AttInMetadata *attinmeta, char **values)
2741 to build a <structname>HeapTuple</> given user data
2742 in C string form. <parameter>values</parameter> is an array of C strings,
2743 one for each attribute of the return row. Each C string should be in
2744 the form expected by the input function of the attribute data
2745 type. In order to return a null value for one of the attributes,
2746 the corresponding pointer in the <parameter>values</> array
2747 should be set to <symbol>NULL</>. This function will need to
2748 be called again for each row you return.
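The shape of the <parameter>values</parameter> array can be sketched without
any server headers. In the plain C fragment below,
<literal>fill_values</literal>, <literal>NCOLS</literal>, and
<literal>COLWIDTH</literal> are hypothetical names introduced for
illustration, and the caller supplies the string storage. In a real function
the resulting array would then be handed to
<function>BuildTupleFromCStrings</function>.

```c
#include <stddef.h>
#include <stdio.h>

#define NCOLS    3      /* columns in the (hypothetical) result row */
#define COLWIDTH 16     /* bytes reserved for each column's text form */

/*
 * Fill values[] with the text form of each column.  Column i receives
 * (i + 1) * base, except that column null_col is marked as null by storing
 * a NULL pointer.  bufs provides the storage for the strings.
 */
void
fill_values(char *values[NCOLS], char bufs[NCOLS][COLWIDTH],
            int base, int null_col)
{
    for (int i = 0; i < NCOLS; i++)
    {
        if (i == null_col)
        {
            values[i] = NULL;   /* null attribute */
            continue;
        }
        snprintf(bufs[i], COLWIDTH, "%d", (i + 1) * base);
        values[i] = bufs[i];
    }
}
```

Each non-null entry is a string that an integer input function would accept,
and the null column is represented by the pointer itself, not by an empty
string.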
2752 Once you have built a tuple to return from your function, it
2753 must be converted into a <type>Datum</>. Use:
2755 HeapTupleGetDatum(HeapTuple tuple)
2757 to convert a <structname>HeapTuple</> into a valid Datum. This
2758 <type>Datum</> can be returned directly if you intend to return
2759 just a single row, or it can be used as the current return value
2760 in a set-returning function.
2764 An example appears in the next section.
2769 <sect2 id="xfunc-c-return-set">
2770 <title>Returning Sets</title>
2773 There is also a special API that provides support for returning
2774 sets (multiple rows) from a C-language function. A set-returning
function must follow the version-1 calling conventions. Also,
source files must include <filename>funcapi.h</filename>, as
above.
2781 A set-returning function (<acronym>SRF</>) is called
2782 once for each item it returns. The <acronym>SRF</> must
2783 therefore save enough state to remember what it was doing and
2784 return the next item on each call.
2785 The structure <structname>FuncCallContext</> is provided to help
2786 control this process. Within a function, <literal>fcinfo->flinfo->fn_extra</>
is used to hold a pointer to <structname>FuncCallContext</>
across calls.
typedef struct FuncCallContext
{
    /*
     * Number of times we've been called before
     * call_cntr is initialized to 0 for you by SRF_FIRSTCALL_INIT(), and
     * incremented for you every time SRF_RETURN_NEXT() is called.
     */
    uint64 call_cntr;

    /*
     * OPTIONAL maximum number of calls
     * max_calls is here for convenience only and setting it is optional.
     * If not set, you must provide alternative means to know when the
     * function is done.
     */
    uint64 max_calls;

    /*
     * OPTIONAL pointer to result slot
     * This is obsolete and only present for backward compatibility, viz,
     * user-defined SRFs that use the deprecated TupleDescGetSlot().
     */
    TupleTableSlot *slot;

    /*
     * OPTIONAL pointer to miscellaneous user-provided context information
     * user_fctx is for use as a pointer to your own data to retain
     * arbitrary context information between calls of your function.
     */
    void *user_fctx;

    /*
     * OPTIONAL pointer to struct containing attribute type input metadata
     * attinmeta is for use when returning tuples (i.e., composite data types)
     * and is not used when returning base data types. It is only needed
     * if you intend to use BuildTupleFromCStrings() to create the return
     * tuple.
     */
    AttInMetadata *attinmeta;

    /*
     * memory context used for structures that must live for multiple calls
     * multi_call_memory_ctx is set by SRF_FIRSTCALL_INIT() for you, and used
     * by SRF_RETURN_DONE() for cleanup. It is the most appropriate memory
     * context for any memory that is to be reused across multiple calls
     * of the SRF.
     */
    MemoryContext multi_call_memory_ctx;

    /*
     * OPTIONAL pointer to struct containing tuple description
     * tuple_desc is for use when returning tuples (i.e., composite data types)
     * and is only needed if you are going to build the tuples with
     * heap_form_tuple() rather than with BuildTupleFromCStrings().  Note that
     * the TupleDesc pointer stored here should usually have been run through
     * BlessTupleDesc() first.
     */
    TupleDesc tuple_desc;
} FuncCallContext;
An <acronym>SRF</> uses several functions and macros that
automatically manipulate the <structname>FuncCallContext</>
structure (and expect to find it via <literal>fn_extra</>). Use:
SRF_IS_FIRSTCALL()
to determine if your function is being called for the first or a
subsequent time. On the first call (only) use:
SRF_FIRSTCALL_INIT()
to initialize the <structname>FuncCallContext</>. On every function call,
including the first, use:
SRF_PERCALL_SETUP()
to properly set up for using the <structname>FuncCallContext</>
and clearing any previously returned data left over from the
previous pass.
2883 If your function has data to return, use:
2885 SRF_RETURN_NEXT(funcctx, result)
2887 to return it to the caller. (<literal>result</> must be of type
2888 <type>Datum</>, either a single value or a tuple prepared as
2889 described above.) Finally, when your function is finished
2890 returning data, use:
2892 SRF_RETURN_DONE(funcctx)
2894 to clean up and end the <acronym>SRF</>.
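The call-per-item protocol can be modeled in plain C to make the control
flow concrete. The sketch below is not PostgreSQL code:
<literal>CallState</literal> and <literal>next_square</literal> are
hypothetical stand-ins for <structname>FuncCallContext</structname> and for
an <acronym>SRF</acronym>, and a Boolean return value takes the place of the
<literal>SRF_RETURN_NEXT</literal> and <literal>SRF_RETURN_DONE</literal>
macros.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state an SRF keeps between calls. */
typedef struct CallState
{
    bool          initialized;  /* false until the first call has run */
    unsigned long call_cntr;    /* number of items returned so far */
    unsigned long max_calls;    /* total number of items to return */
} CallState;

/*
 * One call of the model function.  Returns true and stores the next item in
 * *result, or returns false once the set is exhausted.
 */
bool
next_square(CallState *state, unsigned long count, unsigned long *result)
{
    if (!state->initialized)
    {
        /* first call only: one-time setup */
        state->initialized = true;
        state->call_cntr = 0;
        state->max_calls = count;
    }

    if (state->call_cntr < state->max_calls)
    {
        unsigned long n = state->call_cntr + 1;

        *result = n * n;
        state->call_cntr++;     /* done by the macro in a real SRF */
        return true;            /* another item is available */
    }

    return false;               /* no more items */
}
```

The caller invokes the function repeatedly with the same state until it
reports that it is done, which is the role the executor plays for a real
set-returning function.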
2898 The memory context that is current when the <acronym>SRF</> is called is
2899 a transient context that will be cleared between calls. This means
2900 that you do not need to call <function>pfree</> on everything
2901 you allocated using <function>palloc</>; it will go away anyway. However, if you want to allocate
2902 any data structures to live across calls, you need to put them somewhere
2903 else. The memory context referenced by
2904 <structfield>multi_call_memory_ctx</> is a suitable location for any
2905 data that needs to survive until the <acronym>SRF</> is finished running. In most
2906 cases, this means that you should switch into
2907 <structfield>multi_call_memory_ctx</> while doing the first-call setup.
2912 While the actual arguments to the function remain unchanged between
2913 calls, if you detoast the argument values (which is normally done
2914 transparently by the
2915 <function>PG_GETARG_<replaceable>xxx</replaceable></function> macro)
2916 in the transient context then the detoasted copies will be freed on
2917 each cycle. Accordingly, if you keep references to such values in
2918 your <structfield>user_fctx</>, you must either copy them into the
2919 <structfield>multi_call_memory_ctx</> after detoasting, or ensure
2920 that you detoast the values only in that context.
2925 A complete pseudo-code example looks like the following:
Datum
my_set_returning_function(PG_FUNCTION_ARGS)
{
    FuncCallContext  *funcctx;
    Datum             result;
    <replaceable>further declarations as needed</replaceable>

    if (SRF_IS_FIRSTCALL())
    {
        MemoryContext oldcontext;

        funcctx = SRF_FIRSTCALL_INIT();
        oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
        /* One-time setup code appears here: */
        <replaceable>user code</replaceable>
        <replaceable>if returning composite</replaceable>
            <replaceable>build TupleDesc, and perhaps AttInMetadata</replaceable>
        <replaceable>endif returning composite</replaceable>
        <replaceable>user code</replaceable>
        MemoryContextSwitchTo(oldcontext);
    }

    /* Each-time setup code appears here: */
    <replaceable>user code</replaceable>
    funcctx = SRF_PERCALL_SETUP();
    <replaceable>user code</replaceable>

    /* this is just one way we might test whether we are done: */
    if (funcctx->call_cntr < funcctx->max_calls)
    {
        /* Here we want to return another item: */
        <replaceable>user code</replaceable>
        <replaceable>obtain result Datum</replaceable>
        SRF_RETURN_NEXT(funcctx, result);
    }
    else
    {
        /* Here we are done returning items and just need to clean up: */
        <replaceable>user code</replaceable>
        SRF_RETURN_DONE(funcctx);
    }
}
A complete example of a simple <acronym>SRF</> returning a composite type
looks like:
<programlisting><![CDATA[
PG_FUNCTION_INFO_V1(retcomposite);

Datum
retcomposite(PG_FUNCTION_ARGS)
{
    FuncCallContext     *funcctx;
    int                  call_cntr;
    int                  max_calls;
    TupleDesc            tupdesc;
    AttInMetadata       *attinmeta;

    /* stuff done only on the first call of the function */
    if (SRF_IS_FIRSTCALL())
    {
        MemoryContext   oldcontext;

        /* create a function context for cross-call persistence */
        funcctx = SRF_FIRSTCALL_INIT();

        /* switch to memory context appropriate for multiple function calls */
        oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);

        /* total number of tuples to be returned */
        funcctx->max_calls = PG_GETARG_UINT32(0);

        /* Build a tuple descriptor for our result type */
        if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
            ereport(ERROR,
                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                     errmsg("function returning record called in context "
                            "that cannot accept type record")));

        /*
         * generate attribute metadata needed later to produce tuples from raw
         * C strings
         */
        attinmeta = TupleDescGetAttInMetadata(tupdesc);
        funcctx->attinmeta = attinmeta;

        MemoryContextSwitchTo(oldcontext);
    }

    /* stuff done on every call of the function */
    funcctx = SRF_PERCALL_SETUP();

    call_cntr = funcctx->call_cntr;
    max_calls = funcctx->max_calls;
    attinmeta = funcctx->attinmeta;

    if (call_cntr < max_calls)    /* do when there is more left to send */
    {
        char       **values;
        HeapTuple    tuple;
        Datum        result;

        /*
         * Prepare a values array for building the returned tuple.
         * This should be an array of C strings which will
         * be processed later by the type input functions.
         */
        values = (char **) palloc(3 * sizeof(char *));
        values[0] = (char *) palloc(16 * sizeof(char));
        values[1] = (char *) palloc(16 * sizeof(char));
        values[2] = (char *) palloc(16 * sizeof(char));

        snprintf(values[0], 16, "%d", 1 * PG_GETARG_INT32(1));
        snprintf(values[1], 16, "%d", 2 * PG_GETARG_INT32(1));
        snprintf(values[2], 16, "%d", 3 * PG_GETARG_INT32(1));

        /* build a tuple */
        tuple = BuildTupleFromCStrings(attinmeta, values);

        /* make the tuple into a datum */
        result = HeapTupleGetDatum(tuple);

        /* clean up (this is not really necessary) */
        pfree(values[0]);
        pfree(values[1]);
        pfree(values[2]);
        pfree(values);

        SRF_RETURN_NEXT(funcctx, result);
    }
    else    /* do when there is no more left */
    {
        SRF_RETURN_DONE(funcctx);
    }
}
]]></programlisting>
3067 One way to declare this function in SQL is:
3069 CREATE TYPE __retcomposite AS (f1 integer, f2 integer, f3 integer);
3071 CREATE OR REPLACE FUNCTION retcomposite(integer, integer)
3072 RETURNS SETOF __retcomposite
3073 AS '<replaceable>filename</>', 'retcomposite'
3074 LANGUAGE C IMMUTABLE STRICT;
3076 A different way is to use OUT parameters:
3078 CREATE OR REPLACE FUNCTION retcomposite(IN integer, IN integer,
3079 OUT f1 integer, OUT f2 integer, OUT f3 integer)
3080 RETURNS SETOF record
3081 AS '<replaceable>filename</>', 'retcomposite'
3082 LANGUAGE C IMMUTABLE STRICT;
3084 Notice that in this method the output type of the function is formally
3085 an anonymous <structname>record</> type.
The <link linkend="tablefunc">contrib/tablefunc</>
3090 module in the source distribution contains more examples of
3091 set-returning functions.
3096 <title>Polymorphic Arguments and Return Types</title>
3099 C-language functions can be declared to accept and
3100 return the polymorphic types
3101 <type>anyelement</type>, <type>anyarray</type>, <type>anynonarray</type>,
3102 <type>anyenum</type>, and <type>anyrange</type>.
3103 See <xref linkend="extend-types-polymorphic"> for a more detailed explanation
3104 of polymorphic functions. When function arguments or return types
3105 are defined as polymorphic types, the function author cannot know
3106 in advance what data type it will be called with, or
3107 need to return. There are two routines provided in <filename>fmgr.h</>
3108 to allow a version-1 C function to discover the actual data types
3109 of its arguments and the type it is expected to return. The routines are
3110 called <literal>get_fn_expr_rettype(FmgrInfo *flinfo)</> and
3111 <literal>get_fn_expr_argtype(FmgrInfo *flinfo, int argnum)</>.
3112 They return the result or argument type OID, or <symbol>InvalidOid</symbol> if the
3113 information is not available.
3114 The structure <literal>flinfo</> is normally accessed as
3115 <literal>fcinfo->flinfo</>. The parameter <literal>argnum</>
3116 is zero based. <function>get_call_result_type</> can also be used
3117 as an alternative to <function>get_fn_expr_rettype</>.
3118 There is also <function>get_fn_expr_variadic</>, which can be used to
3119 find out whether variadic arguments have been merged into an array.
3120 This is primarily useful for <literal>VARIADIC "any"</> functions,
3121 since such merging will always have occurred for variadic functions
3122 taking ordinary array types.
3126 For example, suppose we want to write a function to accept a single
3127 element of any type, and return a one-dimensional array of that type:
PG_FUNCTION_INFO_V1(make_array);
Datum
make_array(PG_FUNCTION_ARGS)
{
    ArrayType  *result;
    Oid         element_type = get_fn_expr_argtype(fcinfo->flinfo, 0);
    Datum       element;
    bool        isnull;
    int16       typlen;
    bool        typbyval;
    char        typalign;
    int         ndims;
    int         dims[MAXDIM];
    int         lbs[MAXDIM];

    if (!OidIsValid(element_type))
        elog(ERROR, "could not determine data type of input");
    /* get the provided element, being careful in case it's NULL */
    isnull = PG_ARGISNULL(0);
    if (isnull)
        element = (Datum) 0;
    else
        element = PG_GETARG_DATUM(0);
    /* we have one dimension */
    ndims = 1;
    /* and one element */
    dims[0] = 1;
    /* and lower bound is 1 */
    lbs[0] = 1;
    /* get required info about the element type */
    get_typlenbyvalalign(element_type, &typlen, &typbyval, &typalign);
    /* now build the array */
    result = construct_md_array(&element, &isnull, ndims, dims, lbs,
                                element_type, typlen, typbyval, typalign);
    PG_RETURN_ARRAYTYPE_P(result);
}
The following command declares the function
<function>make_array</function> in SQL:
<programlisting>
CREATE FUNCTION make_array(anyelement) RETURNS anyarray
    AS '<replaceable>DIRECTORY</replaceable>/funcs', 'make_array'
    LANGUAGE C IMMUTABLE;
</programlisting>
There is a variant of polymorphism that is only available to C-language
functions: they can be declared to take parameters of type
<literal>"any"</>. (Note that this type name must be double-quoted,
since it's also a SQL reserved word.) This works like
<type>anyelement</> except that it does not constrain different
<literal>"any"</> arguments to be the same type, nor do they help
determine the function's result type. A C-language function can also
declare its final parameter to be <literal>VARIADIC "any"</>. This will
match one or more actual arguments of any type (not necessarily the same
type). These arguments will <emphasis>not</> be gathered into an array
as happens with normal variadic functions; they will just be passed to
the function separately. The <function>PG_NARGS()</> macro and the
methods described above must be used to determine the number of actual
arguments and their types when using this feature. Also, users of such
a function might wish to use the <literal>VARIADIC</> keyword in their
function call, with the expectation that the function would treat the
array elements as separate arguments. The function itself must implement
that behavior if wanted, after using <function>get_fn_expr_variadic</> to
detect that the actual argument was marked with <literal>VARIADIC</>.
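The per-argument inspection described above can be sketched with a toy model that compiles without the PostgreSQL headers. Everything named here is a hypothetical stand-in introduced for illustration: ToyCallInfo plays the role of fcinfo, toy_nargs models PG_NARGS(), toy_argtype models get_fn_expr_argtype (with a zero-based argnum), and the ToyType tags replace real type OIDs. It shows only the control flow of a function that receives its arguments separately and dispatches on each one's run-time type.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical type tags standing in for PostgreSQL type OIDs. */
typedef enum { TOY_INVALID = 0, TOY_INT4, TOY_TEXT } ToyType;

/* Hypothetical stand-in for one actual argument. */
typedef struct
{
    ToyType     type;
    int         isnull;
    union { int i; const char *s; } value;
} ToyArg;

/* Hypothetical stand-in for fcinfo: arguments are passed separately. */
typedef struct
{
    int           nargs;
    const ToyArg *args;
} ToyCallInfo;

/* Models PG_NARGS(): the number of actual arguments. */
static int
toy_nargs(const ToyCallInfo *fcinfo)
{
    return fcinfo->nargs;
}

/* Models get_fn_expr_argtype(): argnum is zero-based. */
static ToyType
toy_argtype(const ToyCallInfo *fcinfo, int argnum)
{
    if (argnum < 0 || argnum >= fcinfo->nargs)
        return TOY_INVALID;
    return fcinfo->args[argnum].type;
}

/*
 * Writes a comma-separated description of the arguments into buf and
 * returns how many arguments were fully described.
 */
static int
toy_describe_args(const ToyCallInfo *fcinfo, char *buf, size_t buflen)
{
    size_t      used = 0;
    int         i;

    assert(buflen > 0);
    buf[0] = '\0';
    for (i = 0; i < toy_nargs(fcinfo); i++)
    {
        const ToyArg *arg = &fcinfo->args[i];
        const char *sep = (i > 0) ? "," : "";
        int         n;

        if (arg->isnull)
            n = snprintf(buf + used, buflen - used, "%sNULL", sep);
        else if (toy_argtype(fcinfo, i) == TOY_INT4)
            n = snprintf(buf + used, buflen - used, "%sint4:%d", sep, arg->value.i);
        else if (toy_argtype(fcinfo, i) == TOY_TEXT)
            n = snprintf(buf + used, buflen - used, "%stext:%s", sep, arg->value.s);
        else
            break;              /* unknown type: stop here */

        if (n < 0 || (size_t) n >= buflen - used)
            break;              /* output did not fit */
        used += (size_t) n;
    }
    return i;
}
```

In a real C-language function the same loop would run from 0 to PG_NARGS() - 1, checking PG_ARGISNULL(i) before fetching each datum.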
<sect2 id="xfunc-transform-functions">
<title>Transform Functions</title>
Some function calls can be simplified during planning based on
properties specific to the function. For example,
<literal>int4mul(n, 1)</> could be simplified to just <literal>n</>.
To define such function-specific optimizations, write a
<firstterm>transform function</> and place its OID in the
<structfield>protransform</> field of the primary function's
<structname>pg_proc</> entry. The transform function must have the SQL
signature <literal>protransform(internal) RETURNS internal</>. The
argument, actually <type>FuncExpr *</>, is a dummy node representing a
call to the primary function. If the transform function's study of the
expression tree proves that a simplified expression tree can substitute
for all possible concrete calls represented thereby, build and return
that simplified expression. Otherwise, return a <literal>NULL</>
pointer (<emphasis>not</> a SQL null).
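The contract described above (return a simplified tree when it is provably equivalent, otherwise a NULL pointer) can be illustrated with a toy model. This sketch does not use the planner's node types: ToyExpr, its tags, and toy_mul_transform are hypothetical names invented here, and the tree is far simpler than a real <type>FuncExpr</>.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical expression node kinds for the toy model. */
typedef enum { TOY_VAR, TOY_CONST, TOY_MUL } ToyTag;

typedef struct ToyExpr
{
    ToyTag          tag;
    int             constval;   /* used when tag == TOY_CONST */
    struct ToyExpr *left;       /* used when tag == TOY_MUL */
    struct ToyExpr *right;      /* used when tag == TOY_MUL */
} ToyExpr;

/*
 * Toy counterpart of a transform function for multiplication.
 *
 * Returns a simpler expression that can replace the call, or a NULL
 * pointer when no simplification is proven safe.  Only multiplication
 * by the constant 1 is folded.  Multiplication by 0 is deliberately
 * left alone: a strict function returns null for a null input, so
 * replacing n * 0 with the constant 0 would not be equivalent.
 */
static ToyExpr *
toy_mul_transform(ToyExpr *call)
{
    assert(call != NULL && call->tag == TOY_MUL);

    if (call->right->tag == TOY_CONST && call->right->constval == 1)
        return call->left;
    if (call->left->tag == TOY_CONST && call->left->constval == 1)
        return call->right;

    return NULL;                /* caller keeps the original call */
}
```

The caller treats a NULL result as "no change" and continues to plan the original call.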
We make no guarantee that <productname>PostgreSQL</> will never call the
primary function in cases that the transform function could simplify.
Ensure rigorous equivalence between the simplified expression and an
actual call to the primary function.
Currently, this facility is not exposed to users at the SQL level
because of security concerns, so it is only practical to use for
optimizing built-in functions.
<title>Shared Memory and LWLocks</title>
Add-ins can reserve LWLocks and an allocation of shared memory on server
startup. The add-in's shared library must be preloaded by specifying it in
<xref linkend="guc-shared-preload-libraries"><indexterm><primary>shared_preload_libraries</></>.
Shared memory is reserved by calling:
<programlisting>
void RequestAddinShmemSpace(int size)
</programlisting>
from your <function>_PG_init</> function.
LWLocks are reserved by calling:
<programlisting>
void RequestNamedLWLockTranche(const char *tranche_name, int num_lwlocks)
</programlisting>
from <function>_PG_init</>. This will ensure that an array of
<literal>num_lwlocks</> LWLocks is available under the name
<literal>tranche_name</>. Use <function>GetNamedLWLockTranche</>
to get a pointer to this array.
To avoid possible race conditions, each backend should use the LWLock
<function>AddinShmemInitLock</> when connecting to and initializing
its allocation of shared memory, as shown here:
<programlisting>
static mystruct *ptr = NULL;
if (!ptr)
{
    bool    found;
    LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
    ptr = ShmemInitStruct("my struct name", size, &found);
    if (!found)
    {
        /* initialize contents of shmem area; */
        /* acquire any requested LWLocks using: */
        ptr->locks = GetNamedLWLockTranche("my tranche name");
    }
    LWLockRelease(AddinShmemInitLock);
}
</programlisting>
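The attach-or-initialize pattern above can be tried outside the server with a single-process toy model. All names below are hypothetical stand-ins: toy_shmem_init_struct imitates only the find-or-create behavior and the found flag of ShmemInitStruct, the "shared" area is ordinary heap memory, and toy_lock_acquire and toy_lock_release are a simple flag marking where AddinShmemInitLock would be held. It is a sketch of the control flow, not of real inter-process locking.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_AREAS 8

/* Hypothetical index of named areas, standing in for the shmem index. */
typedef struct
{
    const char *name;
    void       *addr;
} ToyShmemEntry;

static ToyShmemEntry toy_index[TOY_MAX_AREAS];
static int  toy_nareas = 0;
static int  toy_lock_held = 0;  /* marks where the init lock is held */

static void
toy_lock_acquire(void)
{
    assert(!toy_lock_held);
    toy_lock_held = 1;
}

static void
toy_lock_release(void)
{
    assert(toy_lock_held);
    toy_lock_held = 0;
}

/* Finds the named area or creates it; *found reports which happened. */
static void *
toy_shmem_init_struct(const char *name, size_t size, int *found)
{
    int         i;

    for (i = 0; i < toy_nareas; i++)
    {
        if (strcmp(toy_index[i].name, name) == 0)
        {
            *found = 1;
            return toy_index[i].addr;
        }
    }
    assert(toy_nareas < TOY_MAX_AREAS);
    toy_index[toy_nareas].name = name;
    toy_index[toy_nareas].addr = calloc(1, size);
    assert(toy_index[toy_nareas].addr != NULL);
    *found = 0;
    return toy_index[toy_nareas++].addr;
}

typedef struct
{
    int         counter;
} ToyState;

/* What each "backend" does: attach, initializing only if first. */
static ToyState *
toy_attach(void)
{
    ToyState   *state;
    int         found;

    toy_lock_acquire();
    state = toy_shmem_init_struct("toy state", sizeof(ToyState), &found);
    if (!found)
        state->counter = 100;   /* only the first caller initializes */
    toy_lock_release();
    return state;
}
```

Holding the lock across both the lookup and the initialization is the point of the pattern: a second caller can never observe found as true before the first caller has finished filling in the contents.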
<sect2 id="extend-Cpp">
<title>Using C++ for Extensibility</title>
<indexterm zone="extend-Cpp">
<primary>C++</primary>
</indexterm>
Although the <productname>PostgreSQL</productname> backend is written in
C, it is possible to write extensions in C++ if these guidelines are
followed:
<itemizedlist>
<listitem>
<para>
All functions accessed by the backend must present a C interface
to the backend; these C functions can then call C++ functions.
For example, <literal>extern "C"</> linkage is required for
backend-accessed functions. This is also necessary for any
functions that are passed as pointers between the backend and
C++ code.
</para>
</listitem>
<listitem>
<para>
Free memory using the appropriate deallocation method. For example,
most backend memory is allocated using <function>palloc()</>, so use
<function>pfree()</> to free it. Using C++
<function>delete</> in such cases will fail.
</para>
</listitem>
<listitem>
<para>
Prevent exceptions from propagating into the C code (use a catch-all
block at the top level of all <literal>extern "C"</> functions). This
is necessary even if the C++ code does not explicitly throw any
exceptions, because events like out-of-memory can still throw
exceptions. Any exceptions must be caught and appropriate errors
passed back to the C interface. If possible, compile C++ with
<option>-fno-exceptions</> to eliminate exceptions entirely; in such
cases, you must check for failures in your C++ code, e.g., check for
NULL returned by <function>new()</>.
</para>
</listitem>
<listitem>
<para>
If calling backend functions from C++ code, be sure that the
C++ call stack contains only plain old data structures
(<acronym>POD</>). This is necessary because backend errors
generate a distant <function>longjmp()</> that does not properly
unwind a C++ call stack with non-POD objects.
</para>
</listitem>
</itemizedlist>
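The effect of a distant <function>longjmp()</> on intermediate stack frames can be seen in a small standalone C program. This is a toy model under stated assumptions: toy_backend_error stands in for a backend error exit, and the counter toy_open_resources stands in for whatever a C++ object's constructor would acquire and its destructor would release. All names are hypothetical.

```c
#include <assert.h>
#include <setjmp.h>

/* Models the backend's error recovery point. */
static jmp_buf toy_error_context;

/* Counts resources that were acquired but not yet released. */
static int  toy_open_resources = 0;

/* Models a backend error: control jumps straight to the recovery point. */
static void
toy_backend_error(void)
{
    longjmp(toy_error_context, 1);
}

/* Models extension code that holds a resource across a backend call. */
static void
toy_extension_work(int fail)
{
    toy_open_resources++;       /* acquire, like a constructor */
    if (fail)
        toy_backend_error();    /* never returns */
    toy_open_resources--;       /* release, like a destructor */
}

/* Runs the work and returns how many resources were left open. */
static int
toy_run(int fail)
{
    toy_open_resources = 0;
    if (setjmp(toy_error_context) == 0)
        toy_extension_work(fail);
    /* reached either normally or via longjmp() */
    return toy_open_resources;
}
```

When the error path is taken, the release step in toy_extension_work is skipped entirely, which is the same reason destructors of non-POD objects on the C++ call stack would not run.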
In summary, it is best to place C++ code behind a wall of
<literal>extern "C"</> functions that interface to the backend,
and avoid exception, memory, and call stack leakage.