1 <!-- doc/src/sgml/xfunc.sgml -->
4 <title>User-defined Functions</title>
6 <indexterm zone="xfunc">
7 <primary>function</primary>
8 <secondary>user-defined</secondary>
12 <productname>PostgreSQL</productname> provides four kinds of
18 query language functions (functions written in
19 <acronym>SQL</acronym>) (<xref linkend="xfunc-sql">)
24 procedural language functions (functions written in, for
25 example, <application>PL/pgSQL</> or <application>PL/Tcl</>)
26 (<xref linkend="xfunc-pl">)
31 internal functions (<xref linkend="xfunc-internal">)
36 C-language functions (<xref linkend="xfunc-c">)
44 of function can take base types, composite types, or
45 combinations of these as arguments (parameters). In addition,
46 every kind of function can return a base type or
47 a composite type. Functions can also be defined to return
48 sets of base or composite values.
52 Many kinds of functions can take or return certain pseudo-types
53 (such as polymorphic types), but the available facilities vary.
54 Consult the description of each kind of function for more details.
58 It's easiest to define <acronym>SQL</acronym>
59 functions, so we'll start by discussing those.
60 Most of the concepts presented for <acronym>SQL</acronym> functions
61 will carry over to the other types of functions.
65 Throughout this chapter, it can be useful to look at the reference
66 page of the <xref linkend="sql-createfunction"> command to
67 understand the examples better. Some examples from this chapter
68 can be found in <filename>funcs.sql</filename> and
69 <filename>funcs.c</filename> in the <filename>src/tutorial</>
70 directory in the <productname>PostgreSQL</productname> source
75 <sect1 id="xfunc-sql">
76 <title>Query Language (<acronym>SQL</acronym>) Functions</title>
78 <indexterm zone="xfunc-sql">
79 <primary>function</primary>
80 <secondary>user-defined</secondary>
81 <tertiary>in SQL</tertiary>
85 SQL functions execute an arbitrary list of SQL statements, returning
86 the result of the last query in the list.
87 In the simple (non-set)
88 case, the first row of the last query's result will be returned.
89 (Bear in mind that <quote>the first row</quote> of a multirow
90 result is not well-defined unless you use <literal>ORDER BY</>.)
91 If the last query happens
92 to return no rows at all, the null value will be returned.
96 Alternatively, an SQL function can be declared to return a set (that is,
97 multiple rows) by specifying the function's return type as <literal>SETOF
98 <replaceable>sometype</></literal>, or equivalently by declaring it as
99 <literal>RETURNS TABLE(<replaceable>columns</>)</literal>. In this case
100 all rows of the last query's result are returned. Further details appear
105 The body of an SQL function must be a list of SQL
106 statements separated by semicolons. A semicolon after the last
107 statement is optional. Unless the function is declared to return
108 <type>void</>, the last statement must be a <command>SELECT</>,
109 or an <command>INSERT</>, <command>UPDATE</>, or <command>DELETE</>
110 that has a <literal>RETURNING</> clause.
114 Any collection of commands in the <acronym>SQL</acronym>
115 language can be packaged together and defined as a function.
116 Besides <command>SELECT</command> queries, the commands can include data
117 modification queries (<command>INSERT</command>,
118 <command>UPDATE</command>, and <command>DELETE</command>), as well as
119 other SQL commands. (You cannot use transaction control commands, e.g.
120 <command>COMMIT</>, <command>SAVEPOINT</>, and some utility
121 commands, e.g. <literal>VACUUM</>, in <acronym>SQL</acronym> functions.)
122 However, the final command
123 must be a <command>SELECT</command> or have a <literal>RETURNING</>
124 clause that returns whatever is
125 specified as the function's return type. Alternatively, if you
126 want to define a SQL function that performs actions but has no
127 useful value to return, you can define it as returning <type>void</>.
128 For example, this function removes rows with negative salaries from
129 the <literal>emp</> table:
132 CREATE FUNCTION clean_emp() RETURNS void AS '
148 The entire body of a SQL function is parsed before any of it is
149 executed. While a SQL function can contain commands that alter
150 the system catalogs (e.g., <command>CREATE TABLE</>), the effects
151 of such commands will not be visible during parse analysis of
152 later commands in the function. Thus, for example,
153 <literal>CREATE TABLE foo (...); INSERT INTO foo VALUES(...);</literal>
154 will not work as desired if packaged up into a single SQL function,
155 since <structname>foo</> won't exist yet when the <command>INSERT</>
156 command is parsed. It's recommended to use <application>PL/PgSQL</>
157 instead of a SQL function in this type of situation.
162 The syntax of the <command>CREATE FUNCTION</command> command requires
163 the function body to be written as a string constant. It is usually
164 most convenient to use dollar quoting (see <xref
165 linkend="sql-syntax-dollar-quoting">) for the string constant.
166 If you choose to use regular single-quoted string constant syntax,
167 you must double single quote marks (<literal>'</>) and backslashes
168 (<literal>\</>) (assuming escape string syntax) in the body of
169 the function (see <xref linkend="sql-syntax-strings">).
172 <sect2 id="xfunc-sql-function-arguments">
173 <title>Arguments for <acronym>SQL</acronym> Functions</title>
176 <primary>function</primary>
177 <secondary>named argument</secondary>
181 Arguments of a SQL function can be referenced in the function
182 body using either names or numbers. Examples of both methods appear
187 To use a name, declare the function argument as having a name, and
188 then just write that name in the function body. If the argument name
189 is the same as any column name in the current SQL command within the
190 function, the column name will take precedence. To override this,
191 qualify the argument name with the name of the function itself, that is
192 <literal><replaceable>function_name</>.<replaceable>argument_name</></literal>.
193 (If this would conflict with a qualified column name, again the column
194 name wins. You can avoid the ambiguity by choosing a different alias for
195 the table within the SQL command.)
199 In the older numeric approach, arguments are referenced using the syntax
200 <literal>$<replaceable>n</></>: <literal>$1</> refers to the first input
201 argument, <literal>$2</> to the second, and so on. This will work
202 whether or not the particular argument was declared with a name.
206 If an argument is of a composite type, then the dot notation,
207 e.g., <literal><replaceable>argname</>.<replaceable>fieldname</></literal> or
208 <literal>$1.<replaceable>fieldname</></literal>, can be used to access attributes of the
209 argument. Again, you might need to qualify the argument's name with the
210 function name to make the form with an argument name unambiguous.
214 SQL function arguments can only be used as data values,
215 not as identifiers. Thus for example this is reasonable:
217 INSERT INTO mytable VALUES ($1);
219 but this will not work:
221 INSERT INTO $1 VALUES (42);
227 The ability to use names to reference SQL function arguments was added
228 in <productname>PostgreSQL</productname> 9.2. Functions to be used in
229 older servers must use the <literal>$<replaceable>n</></> notation.
234 <sect2 id="xfunc-sql-base-functions">
235 <title><acronym>SQL</acronym> Functions on Base Types</title>
238 The simplest possible <acronym>SQL</acronym> function has no arguments and
239 simply returns a base type, such as <type>integer</type>:
242 CREATE FUNCTION one() RETURNS integer AS $$
246 -- Alternative syntax for string literal:
247 CREATE FUNCTION one() RETURNS integer AS '
260 Notice that we defined a column alias within the function body for the result of the function
261 (with the name <literal>result</>), but this column alias is not visible
262 outside the function. Hence, the result is labeled <literal>one</>
263 instead of <literal>result</>.
267 It is almost as easy to define <acronym>SQL</acronym> functions
268 that take base types as arguments:
271 CREATE FUNCTION add_em(x integer, y integer) RETURNS integer AS $$
275 SELECT add_em(1, 2) AS answer;
284 Alternatively, we could dispense with names for the arguments and
288 CREATE FUNCTION add_em(integer, integer) RETURNS integer AS $$
292 SELECT add_em(1, 2) AS answer;
301 Here is a more useful function, which might be used to debit a
305 CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS integer AS $$
307 SET balance = balance - debit
308 WHERE accountno = tf1.accountno;
313 A user could execute this function to debit account 17 by $100.00 as
317 SELECT tf1(17, 100.0);
322 In this example, we chose the name <literal>accountno</> for the first
323 argument, but this is the same as the name of a column in the
324 <literal>bank</> table. Within the <command>UPDATE</> command,
325 <literal>accountno</> refers to the column <literal>bank.accountno</>,
326 so <literal>tf1.accountno</> must be used to refer to the argument.
327 We could of course avoid this by using a different name for the argument.
331 In practice one would probably like a more useful result from the
332 function than a constant 1, so a more likely definition
336 CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS integer AS $$
338 SET balance = balance - debit
339 WHERE accountno = tf1.accountno;
340 SELECT balance FROM bank WHERE accountno = tf1.accountno;
344 which adjusts the balance and returns the new balance.
345 The same thing could be done in one command using <literal>RETURNING</>:
348 CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS integer AS $$
350 SET balance = balance - debit
351 WHERE accountno = tf1.accountno
358 <sect2 id="xfunc-sql-composite-functions">
359 <title><acronym>SQL</acronym> Functions on Composite Types</title>
362 When writing functions with arguments of composite types, we must not
363 only specify which argument we want but also the desired attribute
364 (field) of that argument. For example, suppose that
365 <type>emp</type> is a table containing employee data, and therefore
366 also the name of the composite type of each row of the table. Here
367 is a function <function>double_salary</function> that computes what someone's
368 salary would be if it were doubled:
378 INSERT INTO emp VALUES ('Bill', 4200, 45, '(2,1)');
380 CREATE FUNCTION double_salary(emp) RETURNS numeric AS $$
381 SELECT $1.salary * 2 AS salary;
384 SELECT name, double_salary(emp.*) AS dream
386 WHERE emp.cubicle ~= point '(2,1)';
395 Notice the use of the syntax <literal>$1.salary</literal>
396 to select one field of the argument row value. Also notice
397 how the calling <command>SELECT</> command uses <literal>*</>
399 the entire current row of a table as a composite value. The table
400 row can alternatively be referenced using just the table name,
403 SELECT name, double_salary(emp) AS dream
405 WHERE emp.cubicle ~= point '(2,1)';
407 but this usage is deprecated since it's easy to get confused.
411 Sometimes it is handy to construct a composite argument value
412 on-the-fly. This can be done with the <literal>ROW</> construct.
413 For example, we could adjust the data being passed to the function:
415 SELECT name, double_salary(ROW(name, salary*1.1, age, cubicle)) AS dream
421 It is also possible to build a function that returns a composite type.
422 This is an example of a function
423 that returns a single <type>emp</type> row:
426 CREATE FUNCTION new_emp() RETURNS emp AS $$
427 SELECT text 'None' AS name,
430 point '(2,2)' AS cubicle;
434 In this example we have specified each of the attributes
435 with a constant value, but any computation
436 could have been substituted for these constants.
440 Note two important things about defining the function:
445 The select list order in the query must be exactly the same as
446 that in which the columns appear in the table associated
447 with the composite type. (Naming the columns, as we did above,
448 is irrelevant to the system.)
453 You must typecast the expressions to match the
454 definition of the composite type, or you will get errors like this:
457 ERROR: function declared to return emp returns varchar instead of text at column 1
466 A different way to define the same function is:
469 CREATE FUNCTION new_emp() RETURNS emp AS $$
470 SELECT ROW('None', 1000.0, 25, '(2,2)')::emp;
474 Here we wrote a <command>SELECT</> that returns just a single
475 column of the correct composite type. This isn't really better
476 in this situation, but it is a handy alternative in some cases
477 — for example, if we need to compute the result by calling
478 another function that returns the desired composite value.
482 We could call this function directly in either of two ways:
488 --------------------------
489 (None,1000.0,25,"(2,2)")
491 SELECT * FROM new_emp();
493 name | salary | age | cubicle
494 ------+--------+-----+---------
495 None | 1000.0 | 25 | (2,2)
498 The second way is described more fully in <xref
499 linkend="xfunc-sql-table-functions">.
503 When you use a function that returns a composite type,
504 you might want only one field (attribute) from its result.
505 You can do that with syntax like this:
508 SELECT (new_emp()).name;
515 The extra parentheses are needed to keep the parser from getting
516 confused. If you try to do it without them, you get something like this:
519 SELECT new_emp().name;
520 ERROR: syntax error at or near "."
521 LINE 1: SELECT new_emp().name;
527 Another option is to use
528 functional notation for extracting an attribute. The simple way
529 to explain this is that we can use the
530 notations <literal><replaceable>attribute</>(<replaceable>table</>)</>
531 and <literal><replaceable>table</>.<replaceable>attribute</></>
535 SELECT name(new_emp());
543 -- This is the same as:
544 -- SELECT emp.name AS youngster FROM emp WHERE emp.age < 30;
546 SELECT name(emp) AS youngster FROM emp WHERE age(emp) < 30;
557 The equivalence between functional notation and attribute notation
558 makes it possible to use functions on composite types to emulate
559 <quote>computed fields</>.
561 <primary>computed field</primary>
564 <primary>field</primary>
565 <secondary>computed</secondary>
567 For example, using the previous definition
568 for <literal>double_salary(emp)</>, we can write
571 SELECT emp.name, emp.double_salary FROM emp;
574 An application using this wouldn't need to be directly aware that
575 <literal>double_salary</> isn't a real column of the table.
576 (You can also emulate computed fields with views.)
580 Because of this behavior, it's unwise to give a function that takes
581 a single composite-type argument the same name as any of the fields of
587 Another way to use a function returning a composite type is to pass the
588 result to another function that accepts the correct row type as input:
591 CREATE FUNCTION getname(emp) RETURNS text AS $$
595 SELECT getname(new_emp());
604 Still another way to use a function that returns a composite type is to
605 call it as a table function, as described in <xref
606 linkend="xfunc-sql-table-functions">.
610 <sect2 id="xfunc-output-parameters">
611 <title><acronym>SQL</> Functions with Output Parameters</title>
614 <primary>function</primary>
615 <secondary>output parameter</secondary>
619 An alternative way of describing a function's results is to define it
620 with <firstterm>output parameters</>, as in this example:
623 CREATE FUNCTION add_em (IN x int, IN y int, OUT sum int)
634 This is not essentially different from the version of <literal>add_em</>
635 shown in <xref linkend="xfunc-sql-base-functions">. The real value of
636 output parameters is that they provide a convenient way of defining
637 functions that return several columns. For example,
640 CREATE FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int)
641 AS 'SELECT x + y, x * y'
644 SELECT * FROM sum_n_product(11,42);
651 What has essentially happened here is that we have created an anonymous
652 composite type for the result of the function. The above example has
653 the same end result as
656 CREATE TYPE sum_prod AS (sum int, product int);
658 CREATE FUNCTION sum_n_product (int, int) RETURNS sum_prod
659 AS 'SELECT $1 + $2, $1 * $2'
663 but not having to bother with the separate composite type definition
664 is often handy. Notice that the names attached to the output parameters
665 are not just decoration, but determine the column names of the anonymous
666 composite type. (If you omit a name for an output parameter, the
667 system will choose a name on its own.)
671 Notice that output parameters are not included in the calling argument
672 list when invoking such a function from SQL. This is because
673 <productname>PostgreSQL</productname> considers only the input
674 parameters to define the function's calling signature. That means
675 also that only the input parameters matter when referencing the function
676 for purposes such as dropping it. We could drop the above function
680 DROP FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int);
681 DROP FUNCTION sum_n_product (int, int);
686 Parameters can be marked as <literal>IN</> (the default),
687 <literal>OUT</>, <literal>INOUT</>, or <literal>VARIADIC</>.
689 parameter serves as both an input parameter (part of the calling
690 argument list) and an output parameter (part of the result record type).
691 <literal>VARIADIC</> parameters are input parameters, but are treated
692 specially as described next.
696 <sect2 id="xfunc-sql-variadic-functions">
697 <title><acronym>SQL</> Functions with Variable Numbers of Arguments</title>
700 <primary>function</primary>
701 <secondary>variadic</secondary>
705 <primary>variadic function</primary>
709 <acronym>SQL</acronym> functions can be declared to accept
710 variable numbers of arguments, so long as all the <quote>optional</>
711 arguments are of the same data type. The optional arguments will be
712 passed to the function as an array. The function is declared by
713 marking the last parameter as <literal>VARIADIC</>; this parameter
714 must be declared as being of an array type. For example:
717 CREATE FUNCTION mleast(VARIADIC arr numeric[]) RETURNS numeric AS $$
718 SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
721 SELECT mleast(10, -1, 5, 4.4);
728 Effectively, all the actual arguments at or beyond the
729 <literal>VARIADIC</> position are gathered up into a one-dimensional
730 array, as if you had written
733 SELECT mleast(ARRAY[10, -1, 5, 4.4]); -- doesn't work
736 You can't actually write that, though — or at least, it will
737 not match this function definition. A parameter marked
738 <literal>VARIADIC</> matches one or more occurrences of its element
739 type, not of its own type.
743 Sometimes it is useful to be able to pass an already-constructed array
744 to a variadic function; this is particularly handy when one variadic
745 function wants to pass on its array parameter to another one. You can
746 do that by specifying <literal>VARIADIC</> in the call:
749 SELECT mleast(VARIADIC ARRAY[10, -1, 5, 4.4]);
752 This prevents expansion of the function's variadic parameter into its
753 element type, thereby allowing the array argument value to match
754 normally. <literal>VARIADIC</> can only be attached to the last
755 actual argument of a function call.
759 Specifying <literal>VARIADIC</> in the call is also the only way to
760 pass an empty array to a variadic function, for example:
763 SELECT mleast(VARIADIC ARRAY[]::numeric[]);
766 Simply writing <literal>SELECT mleast()</> does not work because a
767 variadic parameter must match at least one actual argument.
768 (You could define a second function also named <literal>mleast</>,
769 with no parameters, if you wanted to allow such calls.)
773 The array element parameters generated from a variadic parameter are
774 treated as not having any names of their own. This means it is not
775 possible to call a variadic function using named arguments (<xref
776 linkend="sql-syntax-calling-funcs">), except when you specify
777 <literal>VARIADIC</>. For example, this will work:
780 SELECT mleast(VARIADIC arr => ARRAY[10, -1, 5, 4.4]);
786 SELECT mleast(arr => 10);
787 SELECT mleast(arr => ARRAY[10, -1, 5, 4.4]);
792 <sect2 id="xfunc-sql-parameter-defaults">
793 <title><acronym>SQL</> Functions with Default Values for Arguments</title>
796 <primary>function</primary>
797 <secondary>default values for arguments</secondary>
801 Functions can be declared with default values for some or all input
802 arguments. The default values are inserted whenever the function is
803 called with insufficiently many actual arguments. Since arguments
804 can only be omitted from the end of the actual argument list, all
805 parameters after a parameter with a default value have to have
806 default values as well. (Although the use of named argument notation
807 could allow this restriction to be relaxed, it's still enforced so that
808 positional argument notation works sensibly.)
814 CREATE FUNCTION foo(a int, b int DEFAULT 2, c int DEFAULT 3)
821 SELECT foo(10, 20, 30);
839 SELECT foo(); -- fails since there is no default for the first argument
840 ERROR: function foo() does not exist
842 The <literal>=</literal> sign can also be used in place of the
843 key word <literal>DEFAULT</literal>.
847 <sect2 id="xfunc-sql-table-functions">
848 <title><acronym>SQL</acronym> Functions as Table Sources</title>
851 All SQL functions can be used in the <literal>FROM</> clause of a query,
852 but it is particularly useful for functions returning composite types.
853 If the function is defined to return a base type, the table function
854 produces a one-column table. If the function is defined to return
855 a composite type, the table function produces a column for each attribute
856 of the composite type.
863 CREATE TABLE foo (fooid int, foosubid int, fooname text);
864 INSERT INTO foo VALUES (1, 1, 'Joe');
865 INSERT INTO foo VALUES (1, 2, 'Ed');
866 INSERT INTO foo VALUES (2, 1, 'Mary');
868 CREATE FUNCTION getfoo(int) RETURNS foo AS $$
869 SELECT * FROM foo WHERE fooid = $1;
872 SELECT *, upper(fooname) FROM getfoo(1) AS t1;
874 fooid | foosubid | fooname | upper
875 -------+----------+---------+-------
880 As the example shows, we can work with the columns of the function's
881 result just the same as if they were columns of a regular table.
885 Note that we only got one row out of the function. This is because
886 we did not use <literal>SETOF</>. That is described in the next section.
890 <sect2 id="xfunc-sql-functions-returning-set">
891 <title><acronym>SQL</acronym> Functions Returning Sets</title>
894 <primary>function</primary>
895 <secondary>with SETOF</secondary>
899 When an SQL function is declared as returning <literal>SETOF
900 <replaceable>sometype</></literal>, the function's final
901 query is executed to completion, and each row it
902 outputs is returned as an element of the result set.
906 This feature is normally used when calling the function in the <literal>FROM</>
907 clause. In this case each row returned by the function becomes
908 a row of the table seen by the query. For example, assume that
909 table <literal>foo</> has the same contents as above, and we say:
912 CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS $$
913 SELECT * FROM foo WHERE fooid = $1;
916 SELECT * FROM getfoo(1) AS t1;
921 fooid | foosubid | fooname
922 -------+----------+---------
930 It is also possible to return multiple rows with the columns defined by
931 output parameters, like this:
934 CREATE TABLE tab (y int, z int);
935 INSERT INTO tab VALUES (1, 2), (3, 4), (5, 6), (7, 8);
937 CREATE FUNCTION sum_n_product_with_tab (x int, OUT sum int, OUT product int)
940 SELECT $1 + tab.y, $1 * tab.y FROM tab;
943 SELECT * FROM sum_n_product_with_tab(10);
953 The key point here is that you must write <literal>RETURNS SETOF record</>
954 to indicate that the function returns multiple rows instead of just one.
955 If there is only one output parameter, write that parameter's type
956 instead of <type>record</>.
960 It is frequently useful to construct a query's result by invoking a
961 set-returning function multiple times, with the parameters for each
962 invocation coming from successive rows of a table or subquery. The
963 preferred way to do this is to use the <literal>LATERAL</> key word,
964 which is described in <xref linkend="queries-lateral">.
965 Here is an example using a set-returning function to enumerate
966 elements of a tree structure:
980 CREATE FUNCTION listchildren(text) RETURNS SETOF text AS $$
981 SELECT name FROM nodes WHERE parent = $1
982 $$ LANGUAGE SQL STABLE;
984 SELECT * FROM listchildren('Top');
992 SELECT name, child FROM nodes, LATERAL listchildren(name) AS child;
1003 This example does not do anything that we couldn't have done with a
1004 simple join, but in more complex calculations the option to put
1005 some of the work into a function can be quite convenient.
1009 Currently, functions returning sets can also be called in the select list
1010 of a query. For each row that the query
1011 generates by itself, the function returning set is invoked, and an output
1012 row is generated for each element of the function's result set. Note,
1013 however, that this capability is deprecated and might be removed in future
1014 releases. The previous example could also be done with queries like
1018 SELECT listchildren('Top');
1026 SELECT name, listchildren(name) FROM nodes;
1028 --------+--------------
1037 In the last <command>SELECT</command>,
1038 notice that no output row appears for <literal>Child2</>, <literal>Child3</>, etc.
1039 This happens because <function>listchildren</function> returns an empty set
1040 for those arguments, so no result rows are generated. This is the same
1041 behavior as we got from an inner join to the function result when using
1042 the <literal>LATERAL</> syntax.
1047 If a function's last command is <command>INSERT</>, <command>UPDATE</>,
1048 or <command>DELETE</> with <literal>RETURNING</>, that command will
1049 always be executed to completion, even if the function is not declared
1050 with <literal>SETOF</> or the calling query does not fetch all the
1051 result rows. Any extra rows produced by the <literal>RETURNING</>
1052 clause are silently dropped, but the commanded table modifications
1053 still happen (and are all completed before returning from the function).
1059 The key problem with using set-returning functions in the select list,
1060 rather than the <literal>FROM</> clause, is that putting more than one
1061 set-returning function in the same select list does not behave very
1062 sensibly. (What you actually get if you do so is a number of output
1063 rows equal to the least common multiple of the numbers of rows produced
1064 by each set-returning function.) The <literal>LATERAL</> syntax
1065 produces less surprising results when calling multiple set-returning
1066 functions, and should usually be used instead.
1071 <sect2 id="xfunc-sql-functions-returning-table">
1072 <title><acronym>SQL</acronym> Functions Returning <literal>TABLE</></title>
1075 <primary>function</primary>
1076 <secondary>RETURNS TABLE</secondary>
1080 There is another way to declare a function as returning a set,
1081 which is to use the syntax
1082 <literal>RETURNS TABLE(<replaceable>columns</>)</literal>.
1083 This is equivalent to using one or more <literal>OUT</> parameters plus
1084 marking the function as returning <literal>SETOF record</> (or
1085 <literal>SETOF</> a single output parameter's type, as appropriate).
1086 This notation is specified in recent versions of the SQL standard, and
1087 thus may be more portable than using <literal>SETOF</>.
1091 For example, the preceding sum-and-product example could also be
1095 CREATE FUNCTION sum_n_product_with_tab (x int)
1096 RETURNS TABLE(sum int, product int) AS $$
1097 SELECT $1 + tab.y, $1 * tab.y FROM tab;
1101 It is not allowed to use explicit <literal>OUT</> or <literal>INOUT</>
1102 parameters with the <literal>RETURNS TABLE</> notation — you must
1103 put all the output columns in the <literal>TABLE</> list.
1108 <title>Polymorphic <acronym>SQL</acronym> Functions</title>
1111 <acronym>SQL</acronym> functions can be declared to accept and
1112 return the polymorphic types <type>anyelement</type>,
1113 <type>anyarray</type>, <type>anynonarray</type>,
1114 <type>anyenum</type>, and <type>anyrange</type>. See <xref
1115 linkend="extend-types-polymorphic"> for a more detailed
1116 explanation of polymorphic functions. Here is a polymorphic
1117 function <function>make_array</function> that builds up an array
1118 from two arbitrary data type elements:
1120 CREATE FUNCTION make_array(anyelement, anyelement) RETURNS anyarray AS $$
1121 SELECT ARRAY[$1, $2];
1124 SELECT make_array(1, 2) AS intarray, make_array('a'::text, 'b') AS textarray;
1125 intarray | textarray
1126 ----------+-----------
1133 Notice the use of the typecast <literal>'a'::text</literal>
1134 to specify that the argument is of type <type>text</type>. This is
1135 required if the argument is just a string literal, since otherwise
1136 it would be treated as type
1137 <type>unknown</type>, and array of <type>unknown</type> is not a valid
1139 Without the typecast, you will get errors like this:
1142 ERROR: could not determine polymorphic type because input has type "unknown"
1148 It is permitted to have polymorphic arguments with a fixed
1149 return type, but the converse is not. For example:
1151 CREATE FUNCTION is_greater(anyelement, anyelement) RETURNS boolean AS $$
1155 SELECT is_greater(1, 2);
1161 CREATE FUNCTION invalid_func() RETURNS anyelement AS $$
1164 ERROR: cannot determine result data type
1165 DETAIL: A function returning a polymorphic type must have at least one polymorphic argument.
1170 Polymorphism can be used with functions that have output arguments.
1173 CREATE FUNCTION dup (f1 anyelement, OUT f2 anyelement, OUT f3 anyarray)
1174 AS 'select $1, array[$1,$1]' LANGUAGE SQL;
1176 SELECT * FROM dup(22);
1185 Polymorphism can also be used with variadic functions.
1188 CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
1189 SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
1192 SELECT anyleast(10, -1, 5, 4);
1198 SELECT anyleast('abc'::text, 'def');
1204 CREATE FUNCTION concat_values(text, VARIADIC anyarray) RETURNS text AS $$
1205 SELECT array_to_string($2, $1);
1208 SELECT concat_values('|', 1, 4, 2);
1218 <title><acronym>SQL</acronym> Functions with Collations</title>
1221 <primary>collation</>
1222 <secondary>in SQL functions</>
1226 When a SQL function has one or more parameters of collatable data types,
1227 a collation is identified for each function call depending on the
1228 collations assigned to the actual arguments, as described in <xref
1229 linkend="collation">. If a collation is successfully identified
1230 (i.e., there are no conflicts of implicit collations among the arguments)
1231 then all the collatable parameters are treated as having that collation
1232 implicitly. This will affect the behavior of collation-sensitive
1233 operations within the function. For example, using the
1234 <function>anyleast</> function described above, the result of
1236 SELECT anyleast('abc'::text, 'ABC');
1238 will depend on the database's default collation. In <literal>C</> locale
1239 the result will be <literal>ABC</>, but in many other locales it will
1240 be <literal>abc</>. The collation to use can be forced by adding
1241 a <literal>COLLATE</> clause to any of the arguments, for example
1243 SELECT anyleast('abc'::text, 'ABC' COLLATE "C");
1245 Alternatively, if you wish a function to operate with a particular
1246 collation regardless of what it is called with, insert
1247 <literal>COLLATE</> clauses as needed in the function definition.
1248 This version of <function>anyleast</> would always use <literal>en_US</>
1249 locale to compare strings:
1251 CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
1252 SELECT min($1[i] COLLATE "en_US") FROM generate_subscripts($1, 1) g(i);
1255 But note that this will throw an error if applied to a non-collatable
1260 If no common collation can be identified among the actual arguments,
1261 then a SQL function treats its parameters as having their data types'
1262 default collation (which is usually the database's default collation,
1263 but could be different for parameters of domain types).
1267 The behavior of collatable parameters can be thought of as a limited
1268 form of polymorphism, applicable only to textual data types.
1273 <sect1 id="xfunc-overload">
1274 <title>Function Overloading</title>
1276 <indexterm zone="xfunc-overload">
1277 <primary>overloading</primary>
1278 <secondary>functions</secondary>
1282 More than one function can be defined with the same SQL name, so long
1283 as the arguments they take are different. In other words,
1284 function names can be <firstterm>overloaded</firstterm>. When a
1285 query is executed, the server will determine which function to
1286 call from the data types and the number of the provided arguments.
1287 Overloading can also be used to simulate functions with a variable
1288 number of arguments, up to a finite maximum number.
1292 When creating a family of overloaded functions, one should be
1293 careful not to create ambiguities. For instance, given the
1296 CREATE FUNCTION test(int, real) RETURNS ...
1297 CREATE FUNCTION test(smallint, double precision) RETURNS ...
1299 it is not immediately clear which function would be called with
1300 some trivial input like <literal>test(1, 1.5)</literal>. The
1301 currently implemented resolution rules are described in
1302 <xref linkend="typeconv">, but it is unwise to design a system that subtly
1303 relies on this behavior.
1307 A function that takes a single argument of a composite type should
1308 generally not have the same name as any attribute (field) of that type.
1309 Recall that <literal><replaceable>attribute</>(<replaceable>table</>)</literal>
1310 is considered equivalent
1311 to <literal><replaceable>table</>.<replaceable>attribute</></literal>.
1312 In the case that there is an
1313 ambiguity between a function on a composite type and an attribute of
1314 the composite type, the attribute will always be used. It is possible
1315 to override that choice by schema-qualifying the function name
1316 (that is, <literal><replaceable>schema</>.<replaceable>func</>(<replaceable>table</>)
1317 </literal>) but it's better to
1318 avoid the problem by not choosing conflicting names.
1322 Another possible conflict is between variadic and non-variadic functions.
1323 For instance, it is possible to create both <literal>foo(numeric)</> and
1324 <literal>foo(VARIADIC numeric[])</>. In this case it is unclear which one
1325 should be matched to a call providing a single numeric argument, such as
1326 <literal>foo(10.1)</>. The rule is that the function appearing
1327 earlier in the search path is used, or if the two functions are in the
1328 same schema, the non-variadic one is preferred.
1332 When overloading C-language functions, there is an additional
1333 constraint: The C name of each function in the family of
1334 overloaded functions must be different from the C names of all
1335 other functions, either internal or dynamically loaded. If this
1336 rule is violated, the behavior is not portable. You might get a
1337 run-time linker error, or one of the functions will get called
1338 (usually the internal one). The alternative form of the
1339 <literal>AS</> clause for the SQL <command>CREATE
1340 FUNCTION</command> command decouples the SQL function name from
1341 the function name in the C source code. For instance:
1343 CREATE FUNCTION test(int) RETURNS int
1344 AS '<replaceable>filename</>', 'test_1arg'
1346 CREATE FUNCTION test(int, int) RETURNS int
1347 AS '<replaceable>filename</>', 'test_2arg'
1350 The names of the C functions here reflect one of many possible conventions.
1354 <sect1 id="xfunc-volatility">
1355 <title>Function Volatility Categories</title>
1357 <indexterm zone="xfunc-volatility">
1358 <primary>volatility</primary>
1359 <secondary>functions</secondary>
1361 <indexterm zone="xfunc-volatility">
1362 <primary>VOLATILE</primary>
1364 <indexterm zone="xfunc-volatility">
1365 <primary>STABLE</primary>
1367 <indexterm zone="xfunc-volatility">
1368 <primary>IMMUTABLE</primary>
1372 Every function has a <firstterm>volatility</> classification, with
1373 the possibilities being <literal>VOLATILE</>, <literal>STABLE</>, or
1374 <literal>IMMUTABLE</>. <literal>VOLATILE</> is the default if the
1375 <xref linkend="sql-createfunction">
1376 command does not specify a category. The volatility category is a
1377 promise to the optimizer about the behavior of the function:
1382 A <literal>VOLATILE</> function can do anything, including modifying
1383 the database. It can return different results on successive calls with
1384 the same arguments. The optimizer makes no assumptions about the
1385 behavior of such functions. A query using a volatile function will
1386 re-evaluate the function at every row where its value is needed.
1391 A <literal>STABLE</> function cannot modify the database and is
1392 guaranteed to return the same results given the same arguments
1393 for all rows within a single statement. This category allows the
1394 optimizer to optimize multiple calls of the function to a single
1395 call. In particular, it is safe to use an expression containing
1396 such a function in an index scan condition. (Since an index scan
1397 will evaluate the comparison value only once, not once at each
1398 row, it is not valid to use a <literal>VOLATILE</> function in an
1399 index scan condition.)
1404 An <literal>IMMUTABLE</> function cannot modify the database and is
1405 guaranteed to return the same results given the same arguments forever.
1406 This category allows the optimizer to pre-evaluate the function when
1407 a query calls it with constant arguments. For example, a query like
1408 <literal>SELECT ... WHERE x = 2 + 2</> can be simplified on sight to
1409 <literal>SELECT ... WHERE x = 4</>, because the function underlying
1410 the integer addition operator is marked <literal>IMMUTABLE</>.
1417 For best optimization results, you should label your functions with the
1418 strictest volatility category that is valid for them.
1422 Any function with side-effects <emphasis>must</> be labeled
1423 <literal>VOLATILE</>, so that calls to it cannot be optimized away.
1424 Even a function with no side-effects needs to be labeled
1425 <literal>VOLATILE</> if its value can change within a single query;
1426 some examples are <literal>random()</>, <literal>currval()</>,
1427 <literal>timeofday()</>.
1431 Another important example is that the <function>current_timestamp</>
1432 family of functions qualify as <literal>STABLE</>, since their values do
1433 not change within a transaction.
1437 There is relatively little difference between <literal>STABLE</> and
1438 <literal>IMMUTABLE</> categories when considering simple interactive
1439 queries that are planned and immediately executed: it doesn't matter
1440 a lot whether a function is executed once during planning or once during
1441 query execution startup. But there is a big difference if the plan is
1442 saved and reused later. Labeling a function <literal>IMMUTABLE</> when
1443 it really isn't might allow it to be prematurely folded to a constant during
1444 planning, resulting in a stale value being re-used during subsequent uses
1445 of the plan. This is a hazard when using prepared statements or when
1446 using function languages that cache plans (such as
1447 <application>PL/pgSQL</>).
1451 For functions written in SQL or in any of the standard procedural
1452 languages, there is a second important property determined by the
1453 volatility category, namely the visibility of any data changes that have
1454 been made by the SQL command that is calling the function. A
1455 <literal>VOLATILE</> function will see such changes, a <literal>STABLE</>
1456 or <literal>IMMUTABLE</> function will not. This behavior is implemented
1457 using the snapshotting behavior of MVCC (see <xref linkend="mvcc">):
1458 <literal>STABLE</> and <literal>IMMUTABLE</> functions use a snapshot
1459 established as of the start of the calling query, whereas
1460 <literal>VOLATILE</> functions obtain a fresh snapshot at the start of
1461 each query they execute.
1466 Functions written in C can manage snapshots however they want, but it's
1467 usually a good idea to make C functions work this way too.
1472 Because of this snapshotting behavior,
1473 a function containing only <command>SELECT</> commands can safely be
1474 marked <literal>STABLE</>, even if it selects from tables that might be
1475 undergoing modifications by concurrent queries.
1476 <productname>PostgreSQL</productname> will execute all commands of a
1477 <literal>STABLE</> function using the snapshot established for the
1478 calling query, and so it will see a fixed view of the database throughout
1483 The same snapshotting behavior is used for <command>SELECT</> commands
1484 within <literal>IMMUTABLE</> functions. It is generally unwise to select
1485 from database tables within an <literal>IMMUTABLE</> function at all,
1486 since the immutability will be broken if the table contents ever change.
1487 However, <productname>PostgreSQL</productname> does not enforce that you
1492 A common error is to label a function <literal>IMMUTABLE</> when its
1493 results depend on a configuration parameter. For example, a function
1494 that manipulates timestamps might well have results that depend on the
1495 <xref linkend="guc-timezone"> setting. For safety, such functions should
1496 be labeled <literal>STABLE</> instead.
1501 <productname>PostgreSQL</productname> requires that <literal>STABLE</>
1502 and <literal>IMMUTABLE</> functions contain no SQL commands other
1503 than <command>SELECT</> to prevent data modification.
1504 (This is not a completely bulletproof test, since such functions could
1505 still call <literal>VOLATILE</> functions that modify the database.
1506 If you do that, you will find that the <literal>STABLE</> or
1507 <literal>IMMUTABLE</> function does not notice the database changes
1508 applied by the called function, since they are hidden from its snapshot.)
1513 <sect1 id="xfunc-pl">
1514 <title>Procedural Language Functions</title>
1517 <productname>PostgreSQL</productname> allows user-defined functions
1518 to be written in other languages besides SQL and C. These other
1519 languages are generically called <firstterm>procedural
1520 languages</firstterm> (<acronym>PL</>s).
1521 Procedural languages aren't built into the
1522 <productname>PostgreSQL</productname> server; they are offered
1523 by loadable modules.
1524 See <xref linkend="xplang"> and following chapters for more
1529 <sect1 id="xfunc-internal">
1530 <title>Internal Functions</title>
1532 <indexterm zone="xfunc-internal"><primary>function</><secondary>internal</></>
1535 Internal functions are functions written in C that have been statically
1536 linked into the <productname>PostgreSQL</productname> server.
1537 The <quote>body</quote> of the function definition
1538 specifies the C-language name of the function, which need not be the
1539 same as the name being declared for SQL use.
1540 (For reasons of backward compatibility, an empty body
1541 is accepted as meaning that the C-language function name is the
1542 same as the SQL name.)
1546 Normally, all internal functions present in the
1547 server are declared during the initialization of the database cluster
1548 (see <xref linkend="creating-cluster">),
1549 but a user could use <command>CREATE FUNCTION</command>
1550 to create additional alias names for an internal function.
1551 Internal functions are declared in <command>CREATE FUNCTION</command>
1552 with language name <literal>internal</literal>. For instance, to
1553 create an alias for the <function>sqrt</function> function:
1555 CREATE FUNCTION square_root(double precision) RETURNS double precision
1560 (Most internal functions expect to be declared <quote>strict</quote>.)
1565 Not all <quote>predefined</quote> functions are
1566 <quote>internal</quote> in the above sense. Some predefined
1567 functions are written in SQL.
1572 <sect1 id="xfunc-c">
1573 <title>C-Language Functions</title>
1575 <indexterm zone="xfunc-c">
1576 <primary>function</primary>
1577 <secondary>user-defined</secondary>
1578 <tertiary>in C</tertiary>
1582 User-defined functions can be written in C (or a language that can
1583 be made compatible with C, such as C++). Such functions are
1584 compiled into dynamically loadable objects (also called shared
1585 libraries) and are loaded by the server on demand. The dynamic
1586 loading feature is what distinguishes <quote>C language</> functions
1587 from <quote>internal</> functions — the actual coding conventions
1588 are essentially the same for both. (Hence, the standard internal
1589 function library is a rich source of coding examples for user-defined
1594 Two different calling conventions are currently used for C functions.
1595 The newer <quote>version 1</quote> calling convention is indicated by writing
1596 a <literal>PG_FUNCTION_INFO_V1()</literal> macro call for the function,
1597 as illustrated below. Lack of such a macro indicates an old-style
1598 (<quote>version 0</quote>) function. The language name specified in <command>CREATE FUNCTION</command>
1599 is <literal>C</literal> in either case. Old-style functions are now deprecated
1600 because of portability problems and lack of functionality, but they
1601 are still supported for compatibility reasons.
1604 <sect2 id="xfunc-c-dynload">
1605 <title>Dynamic Loading</title>
1607 <indexterm zone="xfunc-c-dynload">
1608 <primary>dynamic loading</primary>
1612 The first time a user-defined function in a particular
1613 loadable object file is called in a session,
1614 the dynamic loader loads that object file into memory so that the
1615 function can be called. The <command>CREATE FUNCTION</command>
1616 for a user-defined C function must therefore specify two pieces of
1617 information for the function: the name of the loadable
1618 object file, and the C name (link symbol) of the specific function to call
1619 within that object file. If the C name is not explicitly specified then
1620 it is assumed to be the same as the SQL function name.
1624 The following algorithm is used to locate the shared object file
1625 based on the name given in the <command>CREATE FUNCTION</command>
1631 If the name is an absolute path, the given file is loaded.
1637 If the name starts with the string <literal>$libdir</literal>,
1638 that part is replaced by the <productname>PostgreSQL</> package
1640 name, which is determined at build time.<indexterm><primary>$libdir</></>
1646 If the name does not contain a directory part, the file is
1647 searched for in the path specified by the configuration variable
1648 <xref linkend="guc-dynamic-library-path">.<indexterm><primary>dynamic_library_path</></>
1654 Otherwise (the file was not found in the path, or it contains a
1655 non-absolute directory part), the dynamic loader will try to
1656 take the name as given, which will most likely fail. (It is
1657 unreliable to depend on the current working directory.)
1662 If this sequence does not work, the platform-specific shared
1663 library file name extension (often <filename>.so</filename>) is
1664 appended to the given name and this sequence is tried again. If
1665 that fails as well, the load will fail.
1669 It is recommended to locate shared libraries either relative to
1670 <literal>$libdir</literal> or through the dynamic library path.
1671 This simplifies version upgrades if the new installation is at a
1672 different location. The actual directory that
1673 <literal>$libdir</literal> stands for can be found out with the
1674 command <literal>pg_config --pkglibdir</literal>.
1678 The user ID the <productname>PostgreSQL</productname> server runs
1679 as must be able to traverse the path to the file you intend to
1680 load. Making the file or a higher-level directory not readable
1681 and/or not executable by the <systemitem>postgres</systemitem>
1682 user is a common mistake.
1686 In any case, the file name that is given in the
1687 <command>CREATE FUNCTION</command> command is recorded literally
1688 in the system catalogs, so if the file needs to be loaded again
1689 the same procedure is applied.
1694 <productname>PostgreSQL</productname> will not compile a C function
1695 automatically. The object file must be compiled before it is referenced
1696 in a <command>CREATE
1697 FUNCTION</> command. See <xref linkend="dfunc"> for additional
1702 <indexterm zone="xfunc-c-dynload">
1703 <primary>magic block</primary>
1707 To ensure that a dynamically loaded object file is not loaded into an
1708 incompatible server, <productname>PostgreSQL</productname> checks that the
1709 file contains a <quote>magic block</> with the appropriate contents.
1710 This allows the server to detect obvious incompatibilities, such as code
1711 compiled for a different major version of
1712 <productname>PostgreSQL</productname>. A magic block is required as of
1713 <productname>PostgreSQL</productname> 8.2. To include a magic block,
1714 write this in one (and only one) of the module source files, after having
1715 included the header <filename>fmgr.h</>:
1718 #ifdef PG_MODULE_MAGIC
1723 The <literal>#ifdef</> test can be omitted if the code doesn't
1724 need to compile against pre-8.2 <productname>PostgreSQL</productname>
1729 After it is used for the first time, a dynamically loaded object
1730 file is retained in memory. Future calls in the same session to
1731 the function(s) in that file will only incur the small overhead of
1732 a symbol table lookup. If you need to force a reload of an object
1733 file, for example after recompiling it, begin a fresh session.
1736 <indexterm zone="xfunc-c-dynload">
1737 <primary>_PG_init</primary>
1739 <indexterm zone="xfunc-c-dynload">
1740 <primary>_PG_fini</primary>
1742 <indexterm zone="xfunc-c-dynload">
1743 <primary>library initialization function</primary>
1745 <indexterm zone="xfunc-c-dynload">
1746 <primary>library finalization function</primary>
1750 Optionally, a dynamically loaded file can contain initialization and
1751 finalization functions. If the file includes a function named
1752 <function>_PG_init</>, that function will be called immediately after
1753 loading the file. The function receives no parameters and should
1754 return void. If the file includes a function named
1755 <function>_PG_fini</>, that function will be called immediately before
1756 unloading the file. Likewise, the function receives no parameters and
1757 should return void. Note that <function>_PG_fini</> will only be called
1758 during an unload of the file, not during process termination.
1759 (Presently, unloads are disabled and will never occur, but this may
1760 change in the future.)
1765 <sect2 id="xfunc-c-basetype">
1766 <title>Base Types in C-Language Functions</title>
1768 <indexterm zone="xfunc-c-basetype">
1769 <primary>data type</primary>
1770 <secondary>internal organization</secondary>
1774 To know how to write C-language functions, you need to know how
1775 <productname>PostgreSQL</productname> internally represents base
1776 data types and how they can be passed to and from functions.
1777 Internally, <productname>PostgreSQL</productname> regards a base
1778 type as a <quote>blob of memory</quote>. The user-defined
1779 functions that you define over a type in turn define the way that
1780 <productname>PostgreSQL</productname> can operate on it. That
1781 is, <productname>PostgreSQL</productname> will only store and
1782 retrieve the data from disk and use your user-defined functions
1783 to input, process, and output the data.
1787 Base types can have one of three internal formats:
1792 pass by value, fixed-length
1797 pass by reference, fixed-length
1802 pass by reference, variable-length
1809 By-value types can only be 1, 2, or 4 bytes in length
1810 (also 8 bytes, if <literal>sizeof(Datum)</literal> is 8 on your machine).
1811 You should be careful to define your types such that they will be the
1812 same size (in bytes) on all architectures. For example, the
1813 <literal>long</literal> type is dangerous because it is 4 bytes on some
1814 machines and 8 bytes on others, whereas <type>int</type> type is 4 bytes
1815 on most Unix machines. A reasonable implementation of the
1816 <type>int4</type> type on Unix machines might be:
1819 /* 4-byte integer, passed by value */
1823 (The actual PostgreSQL C code calls this type <type>int32</type>, because
1824 it is a convention in C that <type>int<replaceable>XX</replaceable></type>
1825 means <replaceable>XX</replaceable> <emphasis>bits</emphasis>. Note
1826 therefore also that the C type <type>int8</type> is 1 byte in size. The
1827 SQL type <type>int8</type> is called <type>int64</type> in C. See also
1828 <xref linkend="xfunc-c-type-table">.)
1832 On the other hand, fixed-length types of any size can
1833 be passed by-reference. For example, here is a sample
1834 implementation of a <productname>PostgreSQL</productname> type:
1837 /* 16-byte structure, passed by reference */
1844 Only pointers to such types can be used when passing
1845 them in and out of <productname>PostgreSQL</productname> functions.
1846 To return a value of such a type, allocate the right amount of
1847 memory with <literal>palloc</literal>, fill in the allocated memory,
1848 and return a pointer to it. (Also, if you just want to return the
1849 same value as one of your input arguments that's of the same data type,
1850 you can skip the extra <literal>palloc</literal> and just return the
1851 pointer to the input value.)
1855 Finally, all variable-length types must also be passed
1856 by reference. All variable-length types must begin
1857 with an opaque length field of exactly 4 bytes, which will be set
1858 by <symbol>SET_VARSIZE</symbol>; never set this field directly! All data to
1859 be stored within that type must be located in the memory
1860 immediately following that length field. The
1861 length field contains the total length of the structure,
1862 that is, it includes the size of the length field
1867 Another important point is to avoid leaving any uninitialized bits
1868 within data type values; for example, take care to zero out any
1869 alignment padding bytes that might be present in structs. Without
1870 this, logically-equivalent constants of your data type might be
1871 seen as unequal by the planner, leading to inefficient (though not
1877 <emphasis>Never</> modify the contents of a pass-by-reference input
1878 value. If you do so you are likely to corrupt on-disk data, since
1879 the pointer you are given might point directly into a disk buffer.
1880 The sole exception to this rule is explained in
1881 <xref linkend="xaggr">.
1886 As an example, we can define the type <type>text</type> as
1892 char data[FLEXIBLE_ARRAY_MEMBER];
1896 The <literal>[FLEXIBLE_ARRAY_MEMBER]</> notation means that the actual
1897 length of the data part is not specified by this declaration.
1902 variable-length types, we must be careful to allocate
1903 the correct amount of memory and set the length field correctly.
1904 For example, if we wanted to store 40 bytes in a <structname>text</>
1905 structure, we might use a code fragment like this:
1907 <programlisting><![CDATA[
1908 #include "postgres.h"
1910 char buffer[40]; /* our source data */
1912 text *destination = (text *) palloc(VARHDRSZ + 40);
1913 SET_VARSIZE(destination, VARHDRSZ + 40);
1914 memcpy(destination->data, buffer, 40);
1919 <literal>VARHDRSZ</> is the same as <literal>sizeof(int32)</>, but
1920 it's considered good style to use the macro <literal>VARHDRSZ</>
1921 to refer to the size of the overhead for a variable-length type.
1922 Also, the length field <emphasis>must</> be set using the
1923 <literal>SET_VARSIZE</> macro, not by simple assignment.
1927 <xref linkend="xfunc-c-type-table"> specifies which C type
1928 corresponds to which SQL type when writing a C-language function
1929 that uses a built-in type of <productname>PostgreSQL</>.
1930 The <quote>Defined In</quote> column gives the header file that
1931 needs to be included to get the type definition. (The actual
1932 definition might be in a different file that is included by the
1933 listed file. It is recommended that users stick to the defined
1934 interface.) Note that you should always include
1935 <filename>postgres.h</filename> first in any source file, because
1936 it declares a number of things that you will need anyway.
1939 <table tocentry="1" id="xfunc-c-type-table">
1940 <title>Equivalent C Types for Built-in SQL Types</title>
1957 <entry><type>abstime</type></entry>
1958 <entry><type>AbsoluteTime</type></entry>
1959 <entry><filename>utils/nabstime.h</filename></entry>
1962 <entry><type>bigint</type> (<type>int8</type>)</entry>
1963 <entry><type>int64</type></entry>
1964 <entry><filename>postgres.h</filename></entry>
1967 <entry><type>boolean</type></entry>
1968 <entry><type>bool</type></entry>
1969 <entry><filename>postgres.h</filename> (maybe compiler built-in)</entry>
1972 <entry><type>box</type></entry>
1973 <entry><type>BOX*</type></entry>
1974 <entry><filename>utils/geo_decls.h</filename></entry>
1977 <entry><type>bytea</type></entry>
1978 <entry><type>bytea*</type></entry>
1979 <entry><filename>postgres.h</filename></entry>
1982 <entry><type>"char"</type></entry>
1983 <entry><type>char</type></entry>
1984 <entry>(compiler built-in)</entry>
1987 <entry><type>character</type></entry>
1988 <entry><type>BpChar*</type></entry>
1989 <entry><filename>postgres.h</filename></entry>
1992 <entry><type>cid</type></entry>
1993 <entry><type>CommandId</type></entry>
1994 <entry><filename>postgres.h</filename></entry>
1997 <entry><type>date</type></entry>
1998 <entry><type>DateADT</type></entry>
1999 <entry><filename>utils/date.h</filename></entry>
2002 <entry><type>smallint</type> (<type>int2</type>)</entry>
2003 <entry><type>int16</type></entry>
2004 <entry><filename>postgres.h</filename></entry>
2007 <entry><type>int2vector</type></entry>
2008 <entry><type>int2vector*</type></entry>
2009 <entry><filename>postgres.h</filename></entry>
2012 <entry><type>integer</type> (<type>int4</type>)</entry>
2013 <entry><type>int32</type></entry>
2014 <entry><filename>postgres.h</filename></entry>
2017 <entry><type>real</type> (<type>float4</type>)</entry>
2018 <entry><type>float4*</type></entry>
2019 <entry><filename>postgres.h</filename></entry>
2022 <entry><type>double precision</type> (<type>float8</type>)</entry>
2023 <entry><type>float8*</type></entry>
2024 <entry><filename>postgres.h</filename></entry>
2027 <entry><type>interval</type></entry>
2028 <entry><type>Interval*</type></entry>
2029 <entry><filename>datatype/timestamp.h</filename></entry>
2032 <entry><type>lseg</type></entry>
2033 <entry><type>LSEG*</type></entry>
2034 <entry><filename>utils/geo_decls.h</filename></entry>
2037 <entry><type>name</type></entry>
2038 <entry><type>Name</type></entry>
2039 <entry><filename>postgres.h</filename></entry>
2042 <entry><type>oid</type></entry>
2043 <entry><type>Oid</type></entry>
2044 <entry><filename>postgres.h</filename></entry>
2047 <entry><type>oidvector</type></entry>
2048 <entry><type>oidvector*</type></entry>
2049 <entry><filename>postgres.h</filename></entry>
2052 <entry><type>path</type></entry>
2053 <entry><type>PATH*</type></entry>
2054 <entry><filename>utils/geo_decls.h</filename></entry>
2057 <entry><type>point</type></entry>
2058 <entry><type>POINT*</type></entry>
2059 <entry><filename>utils/geo_decls.h</filename></entry>
2062 <entry><type>regproc</type></entry>
2063 <entry><type>regproc</type></entry>
2064 <entry><filename>postgres.h</filename></entry>
2067 <entry><type>reltime</type></entry>
2068 <entry><type>RelativeTime</type></entry>
2069 <entry><filename>utils/nabstime.h</filename></entry>
2072 <entry><type>text</type></entry>
2073 <entry><type>text*</type></entry>
2074 <entry><filename>postgres.h</filename></entry>
2077 <entry><type>tid</type></entry>
2078 <entry><type>ItemPointer</type></entry>
2079 <entry><filename>storage/itemptr.h</filename></entry>
2082 <entry><type>time</type></entry>
2083 <entry><type>TimeADT</type></entry>
2084 <entry><filename>utils/date.h</filename></entry>
2087 <entry><type>time with time zone</type></entry>
2088 <entry><type>TimeTzADT</type></entry>
2089 <entry><filename>utils/date.h</filename></entry>
2092 <entry><type>timestamp</type></entry>
2093 <entry><type>Timestamp*</type></entry>
2094 <entry><filename>datatype/timestamp.h</filename></entry>
2097 <entry><type>tinterval</type></entry>
2098 <entry><type>TimeInterval</type></entry>
2099 <entry><filename>utils/nabstime.h</filename></entry>
2102 <entry><type>varchar</type></entry>
2103 <entry><type>VarChar*</type></entry>
2104 <entry><filename>postgres.h</filename></entry>
2107 <entry><type>xid</type></entry>
2108 <entry><type>TransactionId</type></entry>
2109 <entry><filename>postgres.h</filename></entry>
2116 Now that we've gone over all of the possible structures
2117 for base types, we can show some examples of real functions.
2122 <title>Version 0 Calling Conventions</title>
2125 We present the <quote>old style</quote> calling convention first — although
2126 this approach is now deprecated, it's easier to get a handle on
2127 initially. In the version-0 method, the arguments and result
2128 of the C function are just declared in normal C style, but being
2129 careful to use the C representation of each SQL data type as shown
2134 Here are some examples:
2136 <programlisting><![CDATA[
2137 #include "postgres.h"
2139 #include "utils/geo_decls.h"
2141 #ifdef PG_MODULE_MAGIC
2153 /* by reference, fixed length */
2156 add_one_float8(float8 *arg)
2158 float8 *result = (float8 *) palloc(sizeof(float8));
2160 *result = *arg + 1.0;
2166 makepoint(Point *pointx, Point *pointy)
2168 Point *new_point = (Point *) palloc(sizeof(Point));
2170 new_point->x = pointx->x;
2171 new_point->y = pointy->y;
2176 /* by reference, variable length */
2182 * VARSIZE is the total size of the struct in bytes.
2184 text *new_t = (text *) palloc(VARSIZE(t));
2185 SET_VARSIZE(new_t, VARSIZE(t));
2187 * VARDATA is a pointer to the data region of the struct.
2189 memcpy((void *) VARDATA(new_t), /* destination */
2190 (void *) VARDATA(t), /* source */
2191 VARSIZE(t) - VARHDRSZ); /* how many bytes */
2196 concat_text(text *arg1, text *arg2)
2198 int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
2199 text *new_text = (text *) palloc(new_text_size);
2201 SET_VARSIZE(new_text, new_text_size);
2202 memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1) - VARHDRSZ);
2203 memcpy(VARDATA(new_text) + (VARSIZE(arg1) - VARHDRSZ),
2204 VARDATA(arg2), VARSIZE(arg2) - VARHDRSZ);
2212 Supposing that the above code has been prepared in file
2213 <filename>funcs.c</filename> and compiled into a shared object,
2214 we could define the functions to <productname>PostgreSQL</productname>
2215 with commands like this:
2218 CREATE FUNCTION add_one(integer) RETURNS integer
2219 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one'
2222 -- note overloading of SQL function name "add_one"
2223 CREATE FUNCTION add_one(double precision) RETURNS double precision
2224 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one_float8'
2227 CREATE FUNCTION makepoint(point, point) RETURNS point
2228 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'makepoint'
2231 CREATE FUNCTION copytext(text) RETURNS text
2232 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'copytext'
2235 CREATE FUNCTION concat_text(text, text) RETURNS text
2236 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'concat_text'
2242 Here, <replaceable>DIRECTORY</replaceable> stands for the
2243 directory of the shared library file (for instance the
2244 <productname>PostgreSQL</productname> tutorial directory, which
2245 contains the code for the examples used in this section).
2246 (Better style would be to use just <literal>'funcs'</> in the
2247 <literal>AS</> clause, after having added
2248 <replaceable>DIRECTORY</replaceable> to the search path. In any
2249 case, we can omit the system-specific extension for a shared
2250 library, commonly <literal>.so</literal> or
2251 <literal>.sl</literal>.)
2255 Notice that we have specified the functions as <quote>strict</quote>,
2257 the system should automatically assume a null result if any input
2258 value is null. By doing this, we avoid having to check for null inputs
2259 in the function code. Without this, we'd have to check for null values
2260 explicitly, by checking for a null pointer for each
2261 pass-by-reference argument. (For pass-by-value arguments, we don't
2262 even have a way to check!)
2266 Although this calling convention is simple to use,
2267 it is not very portable; on some architectures there are problems
2268 with passing data types that are smaller than <type>int</type> this way. Also, there is
2269 no simple way to return a null result, nor to cope with null arguments
2270 in any way other than making the function strict. The version-1
2271 convention, presented next, overcomes these objections.
2276 <title>Version 1 Calling Conventions</title>
2279 The version-1 calling convention relies on macros to suppress most
2280 of the complexity of passing arguments and results. The C declaration
2281 of a version-1 function is always:
2283 Datum funcname(PG_FUNCTION_ARGS)
2285 In addition, the macro call:
2287 PG_FUNCTION_INFO_V1(funcname);
2289 must appear in the same source file. (Conventionally, it's
2290 written just before the function itself.) This macro call is not
2291 needed for <literal>internal</>-language functions, since
2292 <productname>PostgreSQL</> assumes that all internal functions
2293 use the version-1 convention. It is, however, required for
2294 dynamically-loaded functions.
2298 In a version-1 function, each actual argument is fetched using a
2299 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2300 macro that corresponds to the argument's data type, and the
2301 result is returned using a
2302 <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2303 macro for the return type.
2304 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2305 takes as its argument the number of the function argument to
2306 fetch, where the count starts at 0.
2307 <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2308 takes as its argument the actual value to return.
2312 Here we show the same functions as above, coded in version-1 style:
2314 <programlisting><![CDATA[
2315 #include "postgres.h"
2318 #include "utils/geo_decls.h"
2320 #ifdef PG_MODULE_MAGIC
2326 PG_FUNCTION_INFO_V1(add_one);
2329 add_one(PG_FUNCTION_ARGS)
2331 int32 arg = PG_GETARG_INT32(0);
2333 PG_RETURN_INT32(arg + 1);
2336 /* by reference, fixed length */
2338 PG_FUNCTION_INFO_V1(add_one_float8);
2341 add_one_float8(PG_FUNCTION_ARGS)
2343 /* The macros for FLOAT8 hide its pass-by-reference nature. */
2344 float8 arg = PG_GETARG_FLOAT8(0);
2346 PG_RETURN_FLOAT8(arg + 1.0);
2349 PG_FUNCTION_INFO_V1(makepoint);
2352 makepoint(PG_FUNCTION_ARGS)
2354 /* Here, the pass-by-reference nature of Point is not hidden. */
2355 Point *pointx = PG_GETARG_POINT_P(0);
2356 Point *pointy = PG_GETARG_POINT_P(1);
2357 Point *new_point = (Point *) palloc(sizeof(Point));
2359 new_point->x = pointx->x;
2360 new_point->y = pointy->y;
2362 PG_RETURN_POINT_P(new_point);
2365 /* by reference, variable length */
2367 PG_FUNCTION_INFO_V1(copytext);
2370 copytext(PG_FUNCTION_ARGS)
2372 text *t = PG_GETARG_TEXT_P(0);
2374 * VARSIZE is the total size of the struct in bytes.
2376 text *new_t = (text *) palloc(VARSIZE(t));
2377 SET_VARSIZE(new_t, VARSIZE(t));
2379 * VARDATA is a pointer to the data region of the struct.
2381 memcpy((void *) VARDATA(new_t), /* destination */
2382 (void *) VARDATA(t), /* source */
2383 VARSIZE(t) - VARHDRSZ); /* how many bytes */
2384 PG_RETURN_TEXT_P(new_t);
2387 PG_FUNCTION_INFO_V1(concat_text);
2390 concat_text(PG_FUNCTION_ARGS)
2392 text *arg1 = PG_GETARG_TEXT_P(0);
2393 text *arg2 = PG_GETARG_TEXT_P(1);
2394 int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
2395 text *new_text = (text *) palloc(new_text_size);
2397 SET_VARSIZE(new_text, new_text_size);
2398 memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1) - VARHDRSZ);
2399 memcpy(VARDATA(new_text) + (VARSIZE(arg1) - VARHDRSZ),
2400 VARDATA(arg2), VARSIZE(arg2) - VARHDRSZ);
2401 PG_RETURN_TEXT_P(new_text);
2408 The <command>CREATE FUNCTION</command> commands are the same as
2409 for the version-0 equivalents.
2413 At first glance, the version-1 coding conventions might appear to
2414 be just pointless obscurantism. They do, however, offer a number
2415 of improvements, because the macros can hide unnecessary detail.
2416 An example is that in coding <function>add_one_float8</>, we no longer need to
2417 be aware that <type>float8</type> is a pass-by-reference type. Another
2418 example is that the <literal>GETARG</> macros for variable-length types allow
2419 for more efficient fetching of <quote>toasted</quote> (compressed or
2420 out-of-line) values.
2424 One big improvement in version-1 functions is better handling of null
2425 inputs and results. The macro <function>PG_ARGISNULL(<replaceable>n</>)</function>
2426 allows a function to test whether each input is null. (Of course, doing
2427 this is only necessary in functions not declared <quote>strict</>.)
2429 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> macros,
2430 the input arguments are counted beginning at zero. Note that one
2431 should refrain from executing
2432 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> until
2433 one has verified that the argument isn't null.
2434 To return a null result, execute <function>PG_RETURN_NULL()</function>;
2435 this works in both strict and nonstrict functions.
2439 Other options provided in the new-style interface are two
2441 <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2442 macros. The first of these,
2443 <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>,
2444 guarantees to return a copy of the specified argument that is
2445 safe for writing into. (The normal macros will sometimes return a
2446 pointer to a value that is physically stored in a table, which
2447 must not be written to. Using the
2448 <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>
2449 macros guarantees a writable result.)
2450 The second variant consists of the
2451 <function>PG_GETARG_<replaceable>xxx</replaceable>_SLICE()</function>
2452 macros which take three arguments. The first is the number of the
2453 function argument (as above). The second and third are the offset and
2454 length of the segment to be returned. Offsets are counted from
2455 zero, and a negative length requests that the remainder of the
2456 value be returned. These macros provide more efficient access to
2457 parts of large values in the case where they have storage type
2458 <quote>external</quote>. (The storage type of a column can be specified using
2459 <literal>ALTER TABLE <replaceable>tablename</replaceable> ALTER
2460 COLUMN <replaceable>colname</replaceable> SET STORAGE
2461 <replaceable>storagetype</replaceable></literal>. <replaceable>storagetype</replaceable> is one of
2462 <literal>plain</>, <literal>external</>, <literal>extended</literal>,
2463 or <literal>main</>.)
2467 Finally, the version-1 function call conventions make it possible
2468 to return set results (<xref linkend="xfunc-c-return-set">) and
2469 implement trigger functions (<xref linkend="triggers">) and
2470 procedural-language call handlers (<xref
2471 linkend="plhandler">). Version-1 code is also more
2472 portable than version-0, because it does not break restrictions
2473 on function call protocol in the C standard. For more details
2474 see <filename>src/backend/utils/fmgr/README</filename> in the
2475 source distribution.
2480 <title>Writing Code</title>
2483 Before we turn to the more advanced topics, we should discuss
2484 some coding rules for <productname>PostgreSQL</productname>
2485 C-language functions. While it might be possible to load functions
2486 written in languages other than C into
2487 <productname>PostgreSQL</productname>, this is usually difficult
2488 (when it is possible at all) because other languages, such as
2489 C++, FORTRAN, or Pascal often do not follow the same calling
2490 convention as C. That is, other languages do not pass argument
2491 and return values between functions in the same way. For this
2492 reason, we will assume that your C-language functions are
2493 actually written in C.
2497 The basic rules for writing and building C functions are as follows:
2502 Use <literal>pg_config
2503 --includedir-server</literal><indexterm><primary>pg_config</><secondary>with user-defined C functions</></>
2504 to find out where the <productname>PostgreSQL</> server header
2505 files are installed on your system (or the system that your
2506 users will be running on).
2512 Compiling and linking your code so that it can be dynamically
2513 loaded into <productname>PostgreSQL</productname> always
2514 requires special flags. See <xref linkend="dfunc"> for a
2515 detailed explanation of how to do it for your particular
2522 Remember to define a <quote>magic block</> for your shared library,
2523 as described in <xref linkend="xfunc-c-dynload">.
2529 When allocating memory, use the
2530 <productname>PostgreSQL</productname> functions
2531 <function>palloc</function><indexterm><primary>palloc</></> and <function>pfree</function><indexterm><primary>pfree</></>
2532 instead of the corresponding C library functions
2533 <function>malloc</function> and <function>free</function>.
2534 The memory allocated by <function>palloc</function> will be
2535 freed automatically at the end of each transaction, preventing
2542 Always zero the bytes of your structures using <function>memset</>
2543 (or allocate them with <function>palloc0</> in the first place).
2544 Even if you assign to each field of your structure, there might be
2545 alignment padding (holes in the structure) that contain
2546 garbage values. Without this, it's difficult to
2547 support hash indexes or hash joins, as you must pick out only
2548 the significant bits of your data structure to compute a hash.
2549 The planner also sometimes relies on comparing constants via
2550 bitwise equality, so you can get undesirable planning results if
2551 logically-equivalent values aren't bitwise equal.
2557 Most of the internal <productname>PostgreSQL</productname>
2558 types are declared in <filename>postgres.h</filename>, while
2559 the function manager interfaces
2560 (<symbol>PG_FUNCTION_ARGS</symbol>, etc.) are in
2561 <filename>fmgr.h</filename>, so you will need to include at
2562 least these two files. For portability reasons it's best to
2563 include <filename>postgres.h</filename> <emphasis>first</>,
2564 before any other system or user header files. Including
2565 <filename>postgres.h</filename> will also include
2566 <filename>elog.h</filename> and <filename>palloc.h</filename>
2573 Symbol names defined within object files must not conflict
2574 with each other or with symbols defined in the
2575 <productname>PostgreSQL</productname> server executable. You
2576 will have to rename your functions or variables if you get
2577 error messages to this effect.
2583 To work correctly on Windows, <literal>C</>-language functions need
2584 to be marked with <literal>PGDLLEXPORT</>, unless you use a build
2585 process that marks all global functions that way. In simple cases
2586 this detail will be handled transparently by
2587 the <literal>PG_FUNCTION_INFO_V1</> macro. However, if you write
2588 explicit external declarations (perhaps in header files), be sure
2589 to write them like this:
2591 extern PGDLLEXPORT Datum funcname(PG_FUNCTION_ARGS);
2593 or you'll get compiler complaints when building on Windows. (On
2594 other platforms, the <literal>PGDLLEXPORT</> macro does nothing.)
2604 <title>Composite-type Arguments</title>
2607 Composite types do not have a fixed layout like C structures.
2608 Instances of a composite type can contain null fields. In
2609 addition, composite types that are part of an inheritance
2610 hierarchy can have different fields than other members of the
2611 same inheritance hierarchy. Therefore,
2612 <productname>PostgreSQL</productname> provides a function
2613 interface for accessing fields of composite types from C.
2617 Suppose we want to write a function to answer the query:
2620 SELECT name, c_overpaid(emp, 1500) AS overpaid
2622 WHERE name = 'Bill' OR name = 'Sam';
2625 Using call conventions version 0, we can define
2626 <function>c_overpaid</> as:
2628 <programlisting><![CDATA[
2629 #include "postgres.h"
2630 #include "executor/executor.h" /* for GetAttributeByName() */
2632 #ifdef PG_MODULE_MAGIC
2637 c_overpaid(HeapTupleHeader t, /* the current row of emp */
2643 salary = DatumGetInt32(GetAttributeByName(t, "salary", &isnull));
2646 return salary > limit;
2651 In version-1 coding, the above would look like this:
2653 <programlisting><![CDATA[
2654 #include "postgres.h"
2655 #include "executor/executor.h" /* for GetAttributeByName() */
2657 #ifdef PG_MODULE_MAGIC
2661 PG_FUNCTION_INFO_V1(c_overpaid);
2664 c_overpaid(PG_FUNCTION_ARGS)
2666 HeapTupleHeader t = PG_GETARG_HEAPTUPLEHEADER(0);
2667 int32 limit = PG_GETARG_INT32(1);
2671 salary = GetAttributeByName(t, "salary", &isnull);
2673 PG_RETURN_BOOL(false);
2674 /* Alternatively, we might prefer to do PG_RETURN_NULL() for null salary. */
2676 PG_RETURN_BOOL(DatumGetInt32(salary) > limit);
2683 <function>GetAttributeByName</function> is the
2684 <productname>PostgreSQL</productname> system function that
2685 returns attributes out of the specified row. It has
2686 three arguments: the argument of type <type>HeapTupleHeader</type> passed
2688 the function, the name of the desired attribute, and a
2689 return parameter that tells whether the attribute
2690 is null. <function>GetAttributeByName</function> returns a <type>Datum</type>
2691 value that you can convert to the proper data type by using the
2692 appropriate <function>DatumGet<replaceable>XXX</replaceable>()</function>
2693 macro. Note that the return value is meaningless if the null flag is
2694 set; always check the null flag before trying to do anything with the
2699 There is also <function>GetAttributeByNum</function>, which selects
2700 the target attribute by column number instead of name.
2704 The following command declares the function
2705 <function>c_overpaid</function> in SQL:
2708 CREATE FUNCTION c_overpaid(emp, integer) RETURNS boolean
2709 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'c_overpaid'
2713 Notice we have used <literal>STRICT</> so that we did not have to
2714 check whether the input arguments were NULL.
2719 <title>Returning Rows (Composite Types)</title>
2722 To return a row or composite-type value from a C-language
2723 function, you can use a special API that provides macros and
2724 functions to hide most of the complexity of building composite
2725 data types. To use this API, the source file must include:
2727 #include "funcapi.h"
2732 There are two ways you can build a composite data value (henceforth
2733 a <quote>tuple</>): you can build it from an array of Datum values,
2734 or from an array of C strings that can be passed to the input
2735 conversion functions of the tuple's column data types. In either
2736 case, you first need to obtain or construct a <structname>TupleDesc</>
2737 descriptor for the tuple structure. When working with Datums, you
2738 pass the <structname>TupleDesc</> to <function>BlessTupleDesc</>,
2739 and then call <function>heap_form_tuple</> for each row. When working
2740 with C strings, you pass the <structname>TupleDesc</> to
2741 <function>TupleDescGetAttInMetadata</>, and then call
2742 <function>BuildTupleFromCStrings</> for each row. In the case of a
2743 function returning a set of tuples, the setup steps can all be done
2744 once during the first call of the function.
2748 Several helper functions are available for setting up the needed
2749 <structname>TupleDesc</>. The recommended way to do this in most
2750 functions returning composite values is to call:
2752 TypeFuncClass get_call_result_type(FunctionCallInfo fcinfo,
2754 TupleDesc *resultTupleDesc)
2756 passing the same <literal>fcinfo</> struct passed to the calling function
2757 itself. (This of course requires that you use the version-1
2758 calling conventions.) <varname>resultTypeId</> can be specified
2759 as <literal>NULL</> or as the address of a local variable to receive the
2760 function's result type OID. <varname>resultTupleDesc</> should be the
2761 address of a local <structname>TupleDesc</> variable. Check that the
2762 result is <literal>TYPEFUNC_COMPOSITE</>; if so,
2763 <varname>resultTupleDesc</> has been filled with the needed
2764 <structname>TupleDesc</>. (If it is not, you can report an error along
2765 the lines of <quote>function returning record called in context that
2766 cannot accept type record</quote>.)
2771 <function>get_call_result_type</> can resolve the actual type of a
2772 polymorphic function result; so it is useful in functions that return
2773 scalar polymorphic results, not only functions that return composites.
2774 The <varname>resultTypeId</> output is primarily useful for functions
2775 returning polymorphic scalars.
2781 <function>get_call_result_type</> has a sibling
2782 <function>get_expr_result_type</>, which can be used to resolve the
2783 expected output type for a function call represented by an expression
2784 tree. This can be used when trying to determine the result type from
2785 outside the function itself. There is also
2786 <function>get_func_result_type</>, which can be used when only the
2787 function's OID is available. However these functions are not able
2788 to deal with functions declared to return <structname>record</>, and
2789 <function>get_func_result_type</> cannot resolve polymorphic types,
2790 so you should preferentially use <function>get_call_result_type</>.
2795 Older, now-deprecated functions for obtaining
2796 <structname>TupleDesc</>s are:
2798 TupleDesc RelationNameGetTupleDesc(const char *relname)
2800 to get a <structname>TupleDesc</> for the row type of a named relation,
2803 TupleDesc TypeGetTupleDesc(Oid typeoid, List *colaliases)
2805 to get a <structname>TupleDesc</> based on a type OID. This can
2806 be used to get a <structname>TupleDesc</> for a base or
2807 composite type. It will not work for a function that returns
2808 <structname>record</>, however, and it cannot resolve polymorphic
2813 Once you have a <structname>TupleDesc</>, call:
2815 TupleDesc BlessTupleDesc(TupleDesc tupdesc)
2817 if you plan to work with Datums, or:
2819 AttInMetadata *TupleDescGetAttInMetadata(TupleDesc tupdesc)
2821 if you plan to work with C strings. If you are writing a function
2822 returning set, you can save the results of these functions in the
2823 <structname>FuncCallContext</> structure — use the
2824 <structfield>tuple_desc</> or <structfield>attinmeta</> field
2829 When working with Datums, use:
2831 HeapTuple heap_form_tuple(TupleDesc tupdesc, Datum *values, bool *isnull)
2833 to build a <structname>HeapTuple</> given user data in Datum form.
2837 When working with C strings, use:
2839 HeapTuple BuildTupleFromCStrings(AttInMetadata *attinmeta, char **values)
2841 to build a <structname>HeapTuple</> given user data
2842 in C string form. <parameter>values</parameter> is an array of C strings,
2843 one for each attribute of the return row. Each C string should be in
2844 the form expected by the input function of the attribute data
2845 type. In order to return a null value for one of the attributes,
2846 the corresponding pointer in the <parameter>values</> array
2847 should be set to <symbol>NULL</>. This function will need to
2848 be called again for each row you return.
2852 Once you have built a tuple to return from your function, it
2853 must be converted into a <type>Datum</>. Use:
2855 HeapTupleGetDatum(HeapTuple tuple)
2857 to convert a <structname>HeapTuple</> into a valid Datum. This
2858 <type>Datum</> can be returned directly if you intend to return
2859 just a single row, or it can be used as the current return value
2860 in a set-returning function.
2864 An example appears in the next section.
2869 <sect2 id="xfunc-c-return-set">
2870 <title>Returning Sets</title>
2873 There is also a special API that provides support for returning
2874 sets (multiple rows) from a C-language function. A set-returning
2875 function must follow the version-1 calling conventions. Also,
2876 source files must include <filename>funcapi.h</filename>, as
2881 A set-returning function (<acronym>SRF</>) is called
2882 once for each item it returns. The <acronym>SRF</> must
2883 therefore save enough state to remember what it was doing and
2884 return the next item on each call.
2885 The structure <structname>FuncCallContext</> is provided to help
2886 control this process. Within a function, <literal>fcinfo->flinfo->fn_extra</>
2887 is used to hold a pointer to <structname>FuncCallContext</>
2893 * Number of times we've been called before
2895 * call_cntr is initialized to 0 for you by SRF_FIRSTCALL_INIT(), and
2896 * incremented for you every time SRF_RETURN_NEXT() is called.
2901 * OPTIONAL maximum number of calls
2903 * max_calls is here for convenience only and setting it is optional.
2904 * If not set, you must provide alternative means to know when the
2910 * OPTIONAL pointer to result slot
2912 * This is obsolete and only present for backward compatibility, viz,
2913 * user-defined SRFs that use the deprecated TupleDescGetSlot().
2915 TupleTableSlot *slot;
2918 * OPTIONAL pointer to miscellaneous user-provided context information
2920 * user_fctx is for use as a pointer to your own data to retain
2921 * arbitrary context information between calls of your function.
2926 * OPTIONAL pointer to struct containing attribute type input metadata
2928 * attinmeta is for use when returning tuples (i.e., composite data types)
2929 * and is not used when returning base data types. It is only needed
2930 * if you intend to use BuildTupleFromCStrings() to create the return
2933 AttInMetadata *attinmeta;
2936 * memory context used for structures that must live for multiple calls
2938 * multi_call_memory_ctx is set by SRF_FIRSTCALL_INIT() for you, and used
2939 * by SRF_RETURN_DONE() for cleanup. It is the most appropriate memory
2940 * context for any memory that is to be reused across multiple calls
2943 MemoryContext multi_call_memory_ctx;
2946 * OPTIONAL pointer to struct containing tuple description
2948 * tuple_desc is for use when returning tuples (i.e., composite data types)
2949 * and is only needed if you are going to build the tuples with
2950 * heap_form_tuple() rather than with BuildTupleFromCStrings(). Note that
2951 * the TupleDesc pointer stored here should usually have been run through
2952 * BlessTupleDesc() first.
2954 TupleDesc tuple_desc;
2961 An <acronym>SRF</> uses several functions and macros that
2962 automatically manipulate the <structname>FuncCallContext</>
2963 structure (and expect to find it via <literal>fn_extra</>). Use:
2967 to determine if your function is being called for the first or a
2968 subsequent time. On the first call (only) use:
2970 SRF_FIRSTCALL_INIT()
2972 to initialize the <structname>FuncCallContext</>. On every function call,
2973 including the first, use:
2977 to properly set up for using the <structname>FuncCallContext</>
2978 and clearing any previously returned data left over from the
2983 If your function has data to return, use:
2985 SRF_RETURN_NEXT(funcctx, result)
2987 to return it to the caller. (<literal>result</> must be of type
2988 <type>Datum</>, either a single value or a tuple prepared as
2989 described above.) Finally, when your function is finished
2990 returning data, use:
2992 SRF_RETURN_DONE(funcctx)
2994 to clean up and end the <acronym>SRF</>.
2998 The memory context that is current when the <acronym>SRF</> is called is
2999 a transient context that will be cleared between calls. This means
3000 that you do not need to call <function>pfree</> on everything
3001 you allocated using <function>palloc</>; it will go away anyway. However, if you want to allocate
3002 any data structures to live across calls, you need to put them somewhere
3003 else. The memory context referenced by
3004 <structfield>multi_call_memory_ctx</> is a suitable location for any
3005 data that needs to survive until the <acronym>SRF</> is finished running. In most
3006 cases, this means that you should switch into
3007 <structfield>multi_call_memory_ctx</> while doing the first-call setup.
3012 While the actual arguments to the function remain unchanged between
3013 calls, if you detoast the argument values (which is normally done
3014 transparently by the
3015 <function>PG_GETARG_<replaceable>xxx</replaceable></function> macro)
3016 in the transient context then the detoasted copies will be freed on
3017 each cycle. Accordingly, if you keep references to such values in
3018 your <structfield>user_fctx</>, you must either copy them into the
3019 <structfield>multi_call_memory_ctx</> after detoasting, or ensure
3020 that you detoast the values only in that context.
3025 A complete pseudo-code example looks like the following:
3028 my_set_returning_function(PG_FUNCTION_ARGS)
3030 FuncCallContext *funcctx;
3032 <replaceable>further declarations as needed</replaceable>
3034 if (SRF_IS_FIRSTCALL())
3036 MemoryContext oldcontext;
3038 funcctx = SRF_FIRSTCALL_INIT();
3039 oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
3040 /* One-time setup code appears here: */
3041 <replaceable>user code</replaceable>
3042 <replaceable>if returning composite</replaceable>
3043 <replaceable>build TupleDesc, and perhaps AttInMetadata</replaceable>
3044 <replaceable>endif returning composite</replaceable>
3045 <replaceable>user code</replaceable>
3046 MemoryContextSwitchTo(oldcontext);
3049 /* Each-time setup code appears here: */
3050 <replaceable>user code</replaceable>
3051 funcctx = SRF_PERCALL_SETUP();
3052 <replaceable>user code</replaceable>
3054 /* this is just one way we might test whether we are done: */
3055 if (funcctx->call_cntr < funcctx->max_calls)
3057 /* Here we want to return another item: */
3058 <replaceable>user code</replaceable>
3059 <replaceable>obtain result Datum</replaceable>
3060 SRF_RETURN_NEXT(funcctx, result);
3064 /* Here we are done returning items and just need to clean up: */
3065 <replaceable>user code</replaceable>
3066 SRF_RETURN_DONE(funcctx);
3073 A complete example of a simple <acronym>SRF</> returning a composite type
3075 <programlisting><![CDATA[
3076 PG_FUNCTION_INFO_V1(retcomposite);
3079 retcomposite(PG_FUNCTION_ARGS)
3081 FuncCallContext *funcctx;
3085 AttInMetadata *attinmeta;
3087 /* stuff done only on the first call of the function */
3088 if (SRF_IS_FIRSTCALL())
3090 MemoryContext oldcontext;
3092 /* create a function context for cross-call persistence */
3093 funcctx = SRF_FIRSTCALL_INIT();
3095 /* switch to memory context appropriate for multiple function calls */
3096 oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
3098 /* total number of tuples to be returned */
3099 funcctx->max_calls = PG_GETARG_UINT32(0);
3101 /* Build a tuple descriptor for our result type */
3102 if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
3104 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
3105 errmsg("function returning record called in context "
3106 "that cannot accept type record")));
3109 * generate attribute metadata needed later to produce tuples from raw
3112 attinmeta = TupleDescGetAttInMetadata(tupdesc);
3113 funcctx->attinmeta = attinmeta;
3115 MemoryContextSwitchTo(oldcontext);
3118 /* stuff done on every call of the function */
3119 funcctx = SRF_PERCALL_SETUP();
3121 call_cntr = funcctx->call_cntr;
3122 max_calls = funcctx->max_calls;
3123 attinmeta = funcctx->attinmeta;
3125 if (call_cntr < max_calls) /* do when there is more left to send */
3132 * Prepare a values array for building the returned tuple.
3133 * This should be an array of C strings which will
3134 * be processed later by the type input functions.
3136 values = (char **) palloc(3 * sizeof(char *));
3137 values[0] = (char *) palloc(16 * sizeof(char));
3138 values[1] = (char *) palloc(16 * sizeof(char));
3139 values[2] = (char *) palloc(16 * sizeof(char));
3141 snprintf(values[0], 16, "%d", 1 * PG_GETARG_INT32(1));
3142 snprintf(values[1], 16, "%d", 2 * PG_GETARG_INT32(1));
3143 snprintf(values[2], 16, "%d", 3 * PG_GETARG_INT32(1));
3146 tuple = BuildTupleFromCStrings(attinmeta, values);
3148 /* make the tuple into a datum */
3149 result = HeapTupleGetDatum(tuple);
3151 /* clean up (this is not really necessary) */
3157 SRF_RETURN_NEXT(funcctx, result);
3159 else /* do when there is no more left */
3161 SRF_RETURN_DONE(funcctx);
3167 One way to declare this function in SQL is:
3169 CREATE TYPE __retcomposite AS (f1 integer, f2 integer, f3 integer);
3171 CREATE OR REPLACE FUNCTION retcomposite(integer, integer)
3172 RETURNS SETOF __retcomposite
3173 AS '<replaceable>filename</>', 'retcomposite'
3174 LANGUAGE C IMMUTABLE STRICT;
3176 A different way is to use OUT parameters:
3178 CREATE OR REPLACE FUNCTION retcomposite(IN integer, IN integer,
3179 OUT f1 integer, OUT f2 integer, OUT f3 integer)
3180 RETURNS SETOF record
3181 AS '<replaceable>filename</>', 'retcomposite'
3182 LANGUAGE C IMMUTABLE STRICT;
3184 Notice that in this method the output type of the function is formally
3185 an anonymous <structname>record</> type.
3189 The directory <link linkend="tablefunc">contrib/tablefunc</>
3190 module in the source distribution contains more examples of
3191 set-returning functions.
3196 <title>Polymorphic Arguments and Return Types</title>
3199 C-language functions can be declared to accept and
3200 return the polymorphic types
3201 <type>anyelement</type>, <type>anyarray</type>, <type>anynonarray</type>,
3202 <type>anyenum</type>, and <type>anyrange</type>.
3203 See <xref linkend="extend-types-polymorphic"> for a more detailed explanation
3204 of polymorphic functions. When function arguments or return types
3205 are defined as polymorphic types, the function author cannot know
3206 in advance what data type it will be called with, or
3207 need to return. There are two routines provided in <filename>fmgr.h</>
3208 to allow a version-1 C function to discover the actual data types
3209 of its arguments and the type it is expected to return. The routines are
3210 called <literal>get_fn_expr_rettype(FmgrInfo *flinfo)</> and
3211 <literal>get_fn_expr_argtype(FmgrInfo *flinfo, int argnum)</>.
3212 They return the result or argument type OID, or <symbol>InvalidOid</symbol> if the
3213 information is not available.
3214 The structure <literal>flinfo</> is normally accessed as
3215 <literal>fcinfo->flinfo</>. The parameter <literal>argnum</>
3216 is zero based. <function>get_call_result_type</> can also be used
3217 as an alternative to <function>get_fn_expr_rettype</>.
3218 There is also <function>get_fn_expr_variadic</>, which can be used to
3219 find out whether variadic arguments have been merged into an array.
3220 This is primarily useful for <literal>VARIADIC "any"</> functions,
3221 since such merging will always have occurred for variadic functions
3222 taking ordinary array types.
3226 For example, suppose we want to write a function to accept a single
3227 element of any type, and return a one-dimensional array of that type:
3230 PG_FUNCTION_INFO_V1(make_array);
3232 make_array(PG_FUNCTION_ARGS)
3235 Oid element_type = get_fn_expr_argtype(fcinfo->flinfo, 0);
3245 if (!OidIsValid(element_type))
3246 elog(ERROR, "could not determine data type of input");
3248 /* get the provided element, being careful in case it's NULL */
3249 isnull = PG_ARGISNULL(0);
3251 element = (Datum) 0;
3253 element = PG_GETARG_DATUM(0);
3255 /* we have one dimension */
3257 /* and one element */
3259 /* and lower bound is 1 */
3262 /* get required info about the element type */
3263 get_typlenbyvalalign(element_type, &typlen, &typbyval, &typalign);
3265 /* now build the array */
3266 result = construct_md_array(&element, &isnull, ndims, dims, lbs,
3267 element_type, typlen, typbyval, typalign);
3269 PG_RETURN_ARRAYTYPE_P(result);
3275 The following command declares the function
3276 <function>make_array</function> in SQL:
3279 CREATE FUNCTION make_array(anyelement) RETURNS anyarray
3280 AS '<replaceable>DIRECTORY</replaceable>/funcs', 'make_array'
3281 LANGUAGE C IMMUTABLE;
3286 There is a variant of polymorphism that is only available to C-language
3287 functions: they can be declared to take parameters of type
3288 <literal>"any"</>. (Note that this type name must be double-quoted,
3289 since it's also a SQL reserved word.) This works like
3290 <type>anyelement</> except that it does not constrain different
3291 <literal>"any"</> arguments to be the same type, nor do they help
3292 determine the function's result type. A C-language function can also
3293 declare its final parameter to be <literal>VARIADIC "any"</>. This will
3294 match one or more actual arguments of any type (not necessarily the same
3295 type). These arguments will <emphasis>not</> be gathered into an array
3296 as happens with normal variadic functions; they will just be passed to
3297 the function separately. The <function>PG_NARGS()</> macro and the
3298 methods described above must be used to determine the number of actual
3299 arguments and their types when using this feature. Also, users of such
3300 a function might wish to use the <literal>VARIADIC</> keyword in their
3301 function call, with the expectation that the function would treat the
3302 array elements as separate arguments. The function itself must implement
3303 that behavior if wanted, after using <function>get_fn_expr_variadic</> to
3304 detect that the actual argument was marked with <literal>VARIADIC</>.
3308 <sect2 id="xfunc-transform-functions">
3309 <title>Transform Functions</title>
3312 Some function calls can be simplified during planning based on
3313 properties specific to the function. For example,
3314 <literal>int4mul(n, 1)</> could be simplified to just <literal>n</>.
3315 To define such function-specific optimizations, write a
3316 <firstterm>transform function</> and place its OID in the
3317 <structfield>protransform</> field of the primary function's
3318 <structname>pg_proc</> entry. The transform function must have the SQL
3319 signature <literal>protransform(internal) RETURNS internal</>. The
3320 argument, actually <type>FuncExpr *</>, is a dummy node representing a
3321 call to the primary function. If the transform function's study of the
3322 expression tree proves that a simplified expression tree can substitute
3323 for all possible concrete calls represented thereby, build and return
3324 that simplified expression. Otherwise, return a <literal>NULL</>
3325 pointer (<emphasis>not</> a SQL null).
3329 We make no guarantee that <productname>PostgreSQL</> will never call the
3330 primary function in cases that the transform function could simplify.
3331 Ensure rigorous equivalence between the simplified expression and an
3332 actual call to the primary function.
3336 Currently, this facility is not exposed to users at the SQL level
3337 because of security concerns, so it is only practical to use for
3338 optimizing built-in functions.
3343 <title>Shared Memory and LWLocks</title>
3346 Add-ins can reserve LWLocks and an allocation of shared memory on server
3347 startup. The add-in's shared library must be preloaded by specifying
3349 <xref linkend="guc-shared-preload-libraries"><indexterm><primary>shared_preload_libraries</></>.
3350 Shared memory is reserved by calling:
3352 void RequestAddinShmemSpace(int size)
3354 from your <function>_PG_init</> function.
3357 LWLocks are reserved by calling:
3359 void RequestNamedLWLockTranche(const char *tranche_name, int num_lwlocks)
3361 from <function>_PG_init</>. This will ensure that an array of
3362 <literal>num_lwlocks</> LWLocks is available under the name
3363 <literal>tranche_name</>. Use <function>GetNamedLWLockTranche</>
3364 to get a pointer to this array.
3367 To avoid possible race-conditions, each backend should use the LWLock
3368 <function>AddinShmemInitLock</> when connecting to and initializing
3369 its allocation of shared memory, as shown here:
3371 static mystruct *ptr = NULL;
3377 LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
3378 ptr = ShmemInitStruct("my struct name", size, &found);
3381 initialize contents of shmem area;
3382 acquire any requested LWLocks using:
3383 ptr->locks = GetNamedLWLockTranche("my tranche name");
3385 LWLockRelease(AddinShmemInitLock);
3391 <sect2 id="extend-Cpp">
3392 <title>Using C++ for Extensibility</title>
3394 <indexterm zone="extend-Cpp">
3395 <primary>C++</primary>
3399 Although the <productname>PostgreSQL</productname> backend is written in
3400 C, it is possible to write extensions in C++ if these guidelines are
3406 All functions accessed by the backend must present a C interface
3407 to the backend; these C functions can then call C++ functions.
3408 For example, <literal>extern C</> linkage is required for
3409 backend-accessed functions. This is also necessary for any
3410 functions that are passed as pointers between the backend and
3416 Free memory using the appropriate deallocation method. For example,
3417 most backend memory is allocated using <function>palloc()</>, so use
3418 <function>pfree()</> to free it. Using C++
3419 <function>delete</> in such cases will fail.
3424 Prevent exceptions from propagating into the C code (use a catch-all
3425 block at the top level of all <literal>extern C</> functions). This
3426 is necessary even if the C++ code does not explicitly throw any
3427 exceptions, because events like out-of-memory can still throw
3428 exceptions. Any exceptions must be caught and appropriate errors
3429 passed back to the C interface. If possible, compile C++ with
3430 <option>-fno-exceptions</> to eliminate exceptions entirely; in such
3431 cases, you must check for failures in your C++ code, e.g. check for
3432 NULL returned by <function>new()</>.
3437 If calling backend functions from C++ code, be sure that the
3438 C++ call stack contains only plain old data structures
3439 (<acronym>POD</>). This is necessary because backend errors
3440 generate a distant <function>longjmp()</> that does not properly
3441 unroll a C++ call stack with non-POD objects.
3448 In summary, it is best to place C++ code behind a wall of
3449 <literal>extern C</> functions that interface to the backend,
3450 and avoid exception, memory, and call stack leakage.