granicus.if.org Git - postgresql/blob - doc/src/sgml/xfunc.sgml

   1 <!-- doc/src/sgml/xfunc.sgml -->
   2
   3  <sect1 id="xfunc">
   4   <title>User-defined Functions</title>
   5
   6   <indexterm zone="xfunc">
   7    <primary>function</primary>
   8    <secondary>user-defined</secondary>
   9   </indexterm>
  10
  11   <para>
  12    <productname>PostgreSQL</productname> provides four kinds of
  13    functions:
  14
  15    <itemizedlist>
  16     <listitem>
  17      <para>
  18       query language functions (functions written in
  19       <acronym>SQL</acronym>) (<xref linkend="xfunc-sql">)
  20      </para>
  21     </listitem>
  22     <listitem>
  23      <para>
  24       procedural language functions (functions written in, for
  25       example, <application>PL/pgSQL</application> or <application>PL/Tcl</application>)
  26       (<xref linkend="xfunc-pl">)
  27      </para>
  28     </listitem>
  29     <listitem>
  30      <para>
  31       internal functions (<xref linkend="xfunc-internal">)
  32      </para>
  33     </listitem>
  34     <listitem>
  35      <para>
  36       C-language functions (<xref linkend="xfunc-c">)
  37      </para>
  38     </listitem>
  39    </itemizedlist>
  40   </para>
  41
  42   <para>
  43    Every kind
  44    of  function  can take base types, composite types, or
  45    combinations of these as arguments (parameters). In addition,
  46    every kind of function can return a base type or
  47    a composite type.  Functions can also be defined to return
  48    sets of base or composite values.
  49   </para>
  50
  51   <para>
  52    Many kinds of functions can take or return certain pseudo-types
  53    (such as polymorphic types), but the available facilities vary.
  54    Consult the description of each kind of function for more details.
  55   </para>
  56
  57   <para>
  58    It's easiest to define <acronym>SQL</acronym>
  59    functions, so we'll start by discussing those.
  60    Most of the concepts presented for <acronym>SQL</acronym> functions
  61    will carry over to the other types of functions.
  62   </para>
  63
  64   <para>
  65    Throughout this chapter, it can be useful to look at the reference
  66    page of the <xref linkend="sql-createfunction"> command to
  67    understand the examples better.  Some examples from this chapter
  68    can be found in <filename>funcs.sql</filename> and
  69    <filename>funcs.c</filename> in the <filename>src/tutorial</filename>
  70    directory in the <productname>PostgreSQL</productname> source
  71    distribution.
  72   </para>
  73   </sect1>
  74
  75   <sect1 id="xfunc-sql">
  76    <title>Query Language (<acronym>SQL</acronym>) Functions</title>
  77
  78    <indexterm zone="xfunc-sql">
  79     <primary>function</primary>
  80     <secondary>user-defined</secondary>
  81     <tertiary>in SQL</tertiary>
  82    </indexterm>
  83
  84    <para>
  85     SQL functions execute an arbitrary list of SQL statements, returning
  86     the result of the last query in the list.
  87     In the simple (non-set)
  88     case, the first row of the last query's result will be returned.
  89     (Bear in mind that <quote>the first row</quote> of a multirow
  90     result is not well-defined unless you use <literal>ORDER BY</literal>.)
  91     If the last query happens
  92     to return no rows at all, the null value will be returned.
  93    </para>
  94
  95    <para>
  96     Alternatively, an SQL function can be declared to return a set (that is,
  97     multiple rows) by specifying the function's return type as <literal>SETOF
  98     <replaceable>sometype</replaceable></literal>, or equivalently by declaring it as
  99     <literal>RETURNS TABLE(<replaceable>columns</replaceable>)</literal>.  In this case
 100     all rows of the last query's result are returned.  Further details appear
 101     below.
 102    </para>
 103
 104    <para>
 105     The body of an SQL function must be a list of SQL
 106     statements separated by semicolons.  A semicolon after the last
 107     statement is optional.  Unless the function is declared to return
 108     <type>void</type>, the last statement must be a <command>SELECT</command>,
 109     or an <command>INSERT</command>, <command>UPDATE</command>, or <command>DELETE</command>
 110     that has a <literal>RETURNING</literal> clause.
 111    </para>
 112
 113     <para>
 114      Any collection of commands in the  <acronym>SQL</acronym>
 115      language can be packaged together and defined as a function.
 116      Besides <command>SELECT</command> queries, the commands can include data
 117      modification queries (<command>INSERT</command>,
 118      <command>UPDATE</command>, and <command>DELETE</command>), as well as
 119      other SQL commands. (You cannot use transaction control commands, e.g.
 120      <command>COMMIT</command>, <command>SAVEPOINT</command>, and some utility
 121      commands, e.g.  <literal>VACUUM</literal>, in <acronym>SQL</acronym> functions.)
 122      However, the final command
 123      must be a <command>SELECT</command> or have a <literal>RETURNING</literal>
 124      clause that returns whatever is
 125      specified as the function's return type.  Alternatively, if you
 126      want to define a SQL function that performs actions but has no
 127      useful value to return, you can define it as returning <type>void</type>.
 128      For example, this function removes rows with negative salaries from
 129      the <literal>emp</literal> table:
 130
 131 <screen>
 132 CREATE FUNCTION clean_emp() RETURNS void AS '
 133     DELETE FROM emp
 134         WHERE salary &lt; 0;
 135 ' LANGUAGE SQL;
 136
 137 SELECT clean_emp();
 138
 139  clean_emp
 140 -----------
 141
 142 (1 row)
 143 </screen>
 144     </para>
 145
 146     <note>
 147      <para>
 148       The entire body of a SQL function is parsed before any of it is
 149       executed.  While a SQL function can contain commands that alter
 150       the system catalogs (e.g., <command>CREATE TABLE</command>), the effects
 151       of such commands will not be visible during parse analysis of
 152       later commands in the function.  Thus, for example,
 153       <literal>CREATE TABLE foo (...); INSERT INTO foo VALUES(...);</literal>
 154       will not work as desired if packaged up into a single SQL function,
 155       since <structname>foo</structname> won't exist yet when the <command>INSERT</command>
 156       command is parsed.  It's recommended to use <application>PL/pgSQL</application>
 157       instead of a SQL function in this type of situation.
 158      </para>
 159    </note>
 160
 161    <para>
 162     The syntax of the <command>CREATE FUNCTION</command> command requires
 163     the function body to be written as a string constant.  It is usually
 164     most convenient to use dollar quoting (see <xref
 165     linkend="sql-syntax-dollar-quoting">) for the string constant.
 166     If you choose to use regular single-quoted string constant syntax,
 167     you must double single quote marks (<literal>'</literal>) and backslashes
 168     (<literal>\</literal>) (assuming escape string syntax) in the body of
 169     the function (see <xref linkend="sql-syntax-strings">).
 170    </para>
 171
 172    <sect2 id="xfunc-sql-function-arguments">
 173     <title>Arguments for <acronym>SQL</acronym> Functions</title>
 174
 175    <indexterm>
 176     <primary>function</primary>
 177     <secondary>named argument</secondary>
 178    </indexterm>
 179
 180     <para>
 181      Arguments of a SQL function can be referenced in the function
 182      body using either names or numbers.  Examples of both methods appear
 183      below.
 184     </para>
 185
 186     <para>
 187      To use a name, declare the function argument as having a name, and
 188      then just write that name in the function body.  If the argument name
 189      is the same as any column name in the current SQL command within the
 190      function, the column name will take precedence.  To override this,
 191      qualify the argument name with the name of the function itself, that is
 192      <literal><replaceable>function_name</replaceable>.<replaceable>argument_name</replaceable></literal>.
 193      (If this would conflict with a qualified column name, again the column
 194      name wins.  You can avoid the ambiguity by choosing a different alias for
 195      the table within the SQL command.)
 196     </para>
 197
 198     <para>
 199      In the older numeric approach, arguments are referenced using the syntax
 200      <literal>$<replaceable>n</replaceable></literal>: <literal>$1</literal> refers to the first input
 201      argument, <literal>$2</literal> to the second, and so on.  This will work
 202      whether or not the particular argument was declared with a name.
 203     </para>
 204
 205     <para>
 206      If an argument is of a composite type, then the dot notation,
 207      e.g., <literal><replaceable>argname</replaceable>.<replaceable>fieldname</replaceable></literal> or
 208      <literal>$1.<replaceable>fieldname</replaceable></literal>, can be used to access attributes of the
 209      argument.  Again, you might need to qualify the argument's name with the
 210      function name to make the form with an argument name unambiguous.
 211     </para>
 212
 213     <para>
 214      SQL function arguments can only be used as data values,
 215      not as identifiers.  Thus for example this is reasonable:
 216 <programlisting>
 217 INSERT INTO mytable VALUES ($1);
 218 </programlisting>
 219 but this will not work:
 220 <programlisting>
 221 INSERT INTO $1 VALUES (42);
 222 </programlisting>
 223     </para>
 224
 225     <note>
 226      <para>
 227       The ability to use names to reference SQL function arguments was added
 228       in <productname>PostgreSQL</productname> 9.2.  Functions to be used in
 229       older servers must use the <literal>$<replaceable>n</replaceable></literal> notation.
 230      </para>
 231     </note>
 232    </sect2>
 233
 234    <sect2 id="xfunc-sql-base-functions">
 235     <title><acronym>SQL</acronym> Functions on Base Types</title>
 236
 237     <para>
 238      The simplest possible <acronym>SQL</acronym> function has no arguments and
 239      simply returns a base type, such as <type>integer</type>:
 240
 241 <screen>
 242 CREATE FUNCTION one() RETURNS integer AS $$
 243     SELECT 1 AS result;
 244 $$ LANGUAGE SQL;
 245
 246 -- Alternative syntax for string literal:
 247 CREATE FUNCTION one() RETURNS integer AS '
 248     SELECT 1 AS result;
 249 ' LANGUAGE SQL;
 250
 251 SELECT one();
 252
 253  one
 254 -----
 255    1
 256 </screen>
 257     </para>
 258
 259     <para>
 260      Notice that we defined a column alias within the function body for the result of the function
 261      (with  the  name <literal>result</literal>),  but this column alias is not visible
 262      outside the function.  Hence,  the  result  is labeled <literal>one</literal>
 263      instead of <literal>result</literal>.
 264     </para>
 265
 266     <para>
 267      It is almost as easy to define <acronym>SQL</acronym> functions
 268      that take base types as arguments:
 269
 270 <screen>
 271 CREATE FUNCTION add_em(x integer, y integer) RETURNS integer AS $$
 272     SELECT x + y;
 273 $$ LANGUAGE SQL;
 274
 275 SELECT add_em(1, 2) AS answer;
 276
 277  answer
 278 --------
 279       3
 280 </screen>
 281     </para>
 282
 283     <para>
 284      Alternatively, we could dispense with names for the arguments and
 285      use numbers:
 286
 287 <screen>
 288 CREATE FUNCTION add_em(integer, integer) RETURNS integer AS $$
 289     SELECT $1 + $2;
 290 $$ LANGUAGE SQL;
 291
 292 SELECT add_em(1, 2) AS answer;
 293
 294  answer
 295 --------
 296       3
 297 </screen>
 298     </para>
 299
 300     <para>
 301      Here is a more useful function, which might be used to debit a
 302      bank account:
 303
 304 <programlisting>
 305 CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
 306     UPDATE bank
 307         SET balance = balance - debit
 308         WHERE accountno = tf1.accountno;
 309     SELECT 1;
 310 $$ LANGUAGE SQL;
 311 </programlisting>
 312
 313      A user could execute this function to debit account 17 by $100.00 as
 314      follows:
 315
 316 <programlisting>
 317 SELECT tf1(17, 100.0);
 318 </programlisting>
 319     </para>
 320
 321     <para>
 322      In this example, we chose the name <literal>accountno</literal> for the first
 323      argument, but this is the same as the name of a column in the
 324      <literal>bank</literal> table.  Within the <command>UPDATE</command> command,
 325      <literal>accountno</literal> refers to the column <literal>bank.accountno</literal>,
 326      so <literal>tf1.accountno</literal> must be used to refer to the argument.
 327      We could of course avoid this by using a different name for the argument.
 328     </para>
 329
 330     <para>
 331      In practice one would probably like a more useful result from the
 332      function than a constant 1, so a more likely definition
 333      is:
 334
 335 <programlisting>
 336 CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
 337     UPDATE bank
 338         SET balance = balance - debit
 339         WHERE accountno = tf1.accountno;
 340     SELECT balance FROM bank WHERE accountno = tf1.accountno;
 341 $$ LANGUAGE SQL;
 342 </programlisting>
 343
 344      which adjusts the balance and returns the new balance.
 345      The same thing could be done in one command using <literal>RETURNING</literal>:
 346
 347 <programlisting>
 348 CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
 349     UPDATE bank
 350         SET balance = balance - debit
 351         WHERE accountno = tf1.accountno
 352     RETURNING balance;
 353 $$ LANGUAGE SQL;
 354 </programlisting>
 355     </para>
 356    </sect2>
 357
 358    <sect2 id="xfunc-sql-composite-functions">
 359     <title><acronym>SQL</acronym> Functions on Composite Types</title>
 360
 361     <para>
 362      When writing functions with arguments of composite types, we must not
 363      only specify which argument we want but also the desired attribute
 364      (field) of that argument.  For example, suppose that
 365      <type>emp</type> is a table containing employee data, and therefore
 366      also the name of the composite type of each row of the table.  Here
 367      is a function <function>double_salary</function> that computes what someone's
 368      salary would be if it were doubled:
 369
 370 <screen>
 371 CREATE TABLE emp (
 372     name        text,
 373     salary      numeric,
 374     age         integer,
 375     cubicle     point
 376 );
 377
 378 INSERT INTO emp VALUES ('Bill', 4200, 45, '(2,1)');
 379
 380 CREATE FUNCTION double_salary(emp) RETURNS numeric AS $$
 381     SELECT $1.salary * 2 AS salary;
 382 $$ LANGUAGE SQL;
 383
 384 SELECT name, double_salary(emp.*) AS dream
 385     FROM emp
 386     WHERE emp.cubicle ~= point '(2,1)';
 387
 388  name | dream
 389 ------+-------
 390  Bill |  8400
 391 </screen>
 392     </para>
 393
 394     <para>
 395      Notice the use of the syntax <literal>$1.salary</literal>
 396      to select one field of the argument row value.  Also notice
 397      how the calling <command>SELECT</command> command
 398      uses <replaceable>table_name</replaceable><literal>.*</literal> to select
 399      the entire current row of a table as a composite value.  The table
 400      row can alternatively be referenced using just the table name,
 401      like this:
 402 <screen>
 403 SELECT name, double_salary(emp) AS dream
 404     FROM emp
 405     WHERE emp.cubicle ~= point '(2,1)';
 406 </screen>
 407      but this usage is deprecated since it's easy to get confused.
 408      (See <xref linkend="rowtypes-usage"> for details about these
 409      two notations for the composite value of a table row.)
 410     </para>
 411
 412     <para>
 413      Sometimes it is handy to construct a composite argument value
 414      on-the-fly.  This can be done with the <literal>ROW</literal> construct.
 415      For example, we could adjust the data being passed to the function:
 416 <screen>
 417 SELECT name, double_salary(ROW(name, salary*1.1, age, cubicle)) AS dream
 418     FROM emp;
 419 </screen>
 420     </para>
 421
 422     <para>
 423      It is also possible to build a function that returns a composite type.
 424      This is an example of a function
 425      that returns a single <type>emp</type> row:
 426
 427 <programlisting>
 428 CREATE FUNCTION new_emp() RETURNS emp AS $$
 429     SELECT text 'None' AS name,
 430         1000.0 AS salary,
 431         25 AS age,
 432         point '(2,2)' AS cubicle;
 433 $$ LANGUAGE SQL;
 434 </programlisting>
 435
 436      In this example we have specified each of  the  attributes
 437      with  a  constant value, but any computation
 438      could have been substituted for these constants.
 439     </para>
 440
 441     <para>
 442      Note two important things about defining the function:
 443
 444      <itemizedlist>
 445       <listitem>
 446        <para>
 447         The select list order in the query must be exactly the same as
 448         that in which the columns appear in the table associated
 449         with the composite type.  (Naming the columns, as we did above,
 450         is irrelevant to the system.)
 451        </para>
 452       </listitem>
 453       <listitem>
 454        <para>
 455         You must typecast the expressions to match the
 456         definition of the composite type, or you will get errors like this:
 457 <screen>
 458 <computeroutput>
 459 ERROR:  function declared to return emp returns varchar instead of text at column 1
 460 </computeroutput>
 461 </screen>
 462        </para>
 463       </listitem>
 464      </itemizedlist>
 465     </para>
 466
 467     <para>
 468      A different way to define the same function is:
 469
 470 <programlisting>
 471 CREATE FUNCTION new_emp() RETURNS emp AS $$
 472     SELECT ROW('None', 1000.0, 25, '(2,2)')::emp;
 473 $$ LANGUAGE SQL;
 474 </programlisting>
 475
 476      Here we wrote a <command>SELECT</command> that returns just a single
 477      column of the correct composite type.  This isn't really better
 478      in this situation, but it is a handy alternative in some cases
 479      &mdash; for example, if we need to compute the result by calling
 480      another function that returns the desired composite value.
 481     </para>
 482
 483     <para>
 484      We could call this function directly either by using it in
 485      a value expression:
 486
 487 <screen>
 488 SELECT new_emp();
 489
 490          new_emp
 491 --------------------------
 492  (None,1000.0,25,"(2,2)")
 493 </screen>
 494
 495      or by calling it as a table function:
 496
 497 <screen>
 498 SELECT * FROM new_emp();
 499
 500  name | salary | age | cubicle
 501 ------+--------+-----+---------
 502  None | 1000.0 |  25 | (2,2)
 503 </screen>
 504
 505      The second way is described more fully in <xref
 506      linkend="xfunc-sql-table-functions">.
 507     </para>
 508
 509     <para>
 510      When you use a function that returns a composite type,
 511      you might want only one field (attribute) from its result.
 512      You can do that with syntax like this:
 513
 514 <screen>
 515 SELECT (new_emp()).name;
 516
 517  name
 518 ------
 519  None
 520 </screen>
 521
 522      The extra parentheses are needed to keep the parser from getting
 523      confused.  If you try to do it without them, you get something like this:
 524
 525 <screen>
 526 SELECT new_emp().name;
 527 ERROR:  syntax error at or near "."
 528 LINE 1: SELECT new_emp().name;
 529                         ^
 530 </screen>
 531     </para>
 532
 533     <para>
 534      Another option is to use functional notation for extracting an attribute:
 535
 536 <screen>
 537 SELECT name(new_emp());
 538
 539  name
 540 ------
 541  None
 542 </screen>
 543
 544      As explained in <xref linkend="rowtypes-usage">, the field notation and
 545      functional notation are equivalent.
 546     </para>
 547
 548     <para>
 549      Another way to use a function returning a composite type is to pass the
 550      result to another function that accepts the correct row type as input:
 551
 552 <screen>
 553 CREATE FUNCTION getname(emp) RETURNS text AS $$
 554     SELECT $1.name;
 555 $$ LANGUAGE SQL;
 556
 557 SELECT getname(new_emp());
 558  getname
 559 ---------
 560  None
 561 (1 row)
 562 </screen>
 563     </para>
 564    </sect2>
 565
 566    <sect2 id="xfunc-output-parameters">
 567     <title><acronym>SQL</acronym> Functions with Output Parameters</title>
 568
 569    <indexterm>
 570     <primary>function</primary>
 571     <secondary>output parameter</secondary>
 572    </indexterm>
 573
 574     <para>
 575      An alternative way of describing a function's results is to define it
 576      with <firstterm>output parameters</firstterm>, as in this example:
 577
 578 <screen>
 579 CREATE FUNCTION add_em (IN x int, IN y int, OUT sum int)
 580 AS 'SELECT x + y'
 581 LANGUAGE SQL;
 582
 583 SELECT add_em(3,7);
 584  add_em
 585 --------
 586      10
 587 (1 row)
 588 </screen>
 589
 590      This is not essentially different from the version of <literal>add_em</literal>
 591      shown in <xref linkend="xfunc-sql-base-functions">.  The real value of
 592      output parameters is that they provide a convenient way of defining
 593      functions that return several columns.  For example,
 594
 595 <screen>
 596 CREATE FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int)
 597 AS 'SELECT x + y, x * y'
 598 LANGUAGE SQL;
 599
 600  SELECT * FROM sum_n_product(11,42);
 601  sum | product
 602 -----+---------
 603   53 |     462
 604 (1 row)
 605 </screen>
 606
 607      What has essentially happened here is that we have created an anonymous
 608      composite type for the result of the function.  The above example has
 609      the same end result as
 610
 611 <screen>
 612 CREATE TYPE sum_prod AS (sum int, product int);
 613
 614 CREATE FUNCTION sum_n_product (int, int) RETURNS sum_prod
 615 AS 'SELECT $1 + $2, $1 * $2'
 616 LANGUAGE SQL;
 617 </screen>
 618
 619      but not having to bother with the separate composite type definition
 620      is often handy.  Notice that the names attached to the output parameters
 621      are not just decoration, but determine the column names of the anonymous
 622      composite type.  (If you omit a name for an output parameter, the
 623      system will choose a name on its own.)
 624     </para>
 625
 626     <para>
 627      Notice that output parameters are not included in the calling argument
 628      list when invoking such a function from SQL.  This is because
 629      <productname>PostgreSQL</productname> considers only the input
 630      parameters to define the function's calling signature.  That means
 631      also that only the input parameters matter when referencing the function
 632      for purposes such as dropping it.  We could drop the above function
 633      with either of
 634
 635 <screen>
 636 DROP FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int);
 637 DROP FUNCTION sum_n_product (int, int);
 638 </screen>
 639     </para>
 640
 641     <para>
 642      Parameters can be marked as <literal>IN</literal> (the default),
 643      <literal>OUT</literal>, <literal>INOUT</literal>, or <literal>VARIADIC</literal>.
 644      An <literal>INOUT</literal>
 645      parameter serves as both an input parameter (part of the calling
 646      argument list) and an output parameter (part of the result record type).
 647      <literal>VARIADIC</literal> parameters are input parameters, but are treated
 648      specially as described next.
 649     </para>
 650    </sect2>
 651
 652    <sect2 id="xfunc-sql-variadic-functions">
 653     <title><acronym>SQL</acronym> Functions with Variable Numbers of Arguments</title>
 654
 655     <indexterm>
 656      <primary>function</primary>
 657      <secondary>variadic</secondary>
 658     </indexterm>
 659
 660     <indexterm>
 661      <primary>variadic function</primary>
 662     </indexterm>
 663
 664     <para>
 665      <acronym>SQL</acronym> functions can be declared to accept
 666      variable numbers of arguments, so long as all the <quote>optional</quote>
 667      arguments are of the same data type.  The optional arguments will be
 668      passed to the function as an array.  The function is declared by
 669      marking the last parameter as <literal>VARIADIC</literal>; this parameter
 670      must be declared as being of an array type.  For example:
 671
 672 <screen>
 673 CREATE FUNCTION mleast(VARIADIC arr numeric[]) RETURNS numeric AS $$
 674     SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
 675 $$ LANGUAGE SQL;
 676
 677 SELECT mleast(10, -1, 5, 4.4);
 678  mleast
 679 --------
 680      -1
 681 (1 row)
 682 </screen>
 683
 684      Effectively, all the actual arguments at or beyond the
 685      <literal>VARIADIC</literal> position are gathered up into a one-dimensional
 686      array, as if you had written
 687
 688 <screen>
 689 SELECT mleast(ARRAY[10, -1, 5, 4.4]);    -- doesn't work
 690 </screen>
 691
 692      You can't actually write that, though &mdash; or at least, it will
 693      not match this function definition.  A parameter marked
 694      <literal>VARIADIC</literal> matches one or more occurrences of its element
 695      type, not of its own type.
 696     </para>
 697
 698     <para>
 699      Sometimes it is useful to be able to pass an already-constructed array
 700      to a variadic function; this is particularly handy when one variadic
 701      function wants to pass on its array parameter to another one.  You can
 702      do that by specifying <literal>VARIADIC</literal> in the call:
 703
 704 <screen>
 705 SELECT mleast(VARIADIC ARRAY[10, -1, 5, 4.4]);
 706 </screen>
 707
 708      This prevents expansion of the function's variadic parameter into its
 709      element type, thereby allowing the array argument value to match
 710      normally.  <literal>VARIADIC</literal> can only be attached to the last
 711      actual argument of a function call.
 712     </para>
 713
 714     <para>
 715      Specifying <literal>VARIADIC</literal> in the call is also the only way to
 716      pass an empty array to a variadic function, for example:
 717
 718 <screen>
 719 SELECT mleast(VARIADIC ARRAY[]::numeric[]);
 720 </screen>
 721
 722      Simply writing <literal>SELECT mleast()</literal> does not work because a
 723      variadic parameter must match at least one actual argument.
 724      (You could define a second function also named <literal>mleast</literal>,
 725      with no parameters, if you wanted to allow such calls.)
 726     </para>
 727
 728     <para>
 729      The array element parameters generated from a variadic parameter are
 730      treated as not having any names of their own.  This means it is not
 731      possible to call a variadic function using named arguments (<xref
 732      linkend="sql-syntax-calling-funcs">), except when you specify
 733      <literal>VARIADIC</literal>.  For example, this will work:
 734
 735 <screen>
 736 SELECT mleast(VARIADIC arr =&gt; ARRAY[10, -1, 5, 4.4]);
 737 </screen>
 738
 739      but not these:
 740
 741 <screen>
 742 SELECT mleast(arr =&gt; 10);
 743 SELECT mleast(arr =&gt; ARRAY[10, -1, 5, 4.4]);
 744 </screen>
 745     </para>
 746    </sect2>
 747
 748    <sect2 id="xfunc-sql-parameter-defaults">
 749     <title><acronym>SQL</acronym> Functions with Default Values for Arguments</title>
 750
 751     <indexterm>
 752      <primary>function</primary>
 753      <secondary>default values for arguments</secondary>
 754     </indexterm>
 755
 756     <para>
 757      Functions can be declared with default values for some or all input
 758      arguments.  The default values are inserted whenever the function is
 759      called with insufficiently many actual arguments.  Since arguments
 760      can only be omitted from the end of the actual argument list, all
 761      parameters after a parameter with a default value have to have
 762      default values as well.  (Although the use of named argument notation
 763      could allow this restriction to be relaxed, it's still enforced so that
 764      positional argument notation works sensibly.)
 765     </para>
 766
 767     <para>
 768      For example:
 769 <screen>
 770 CREATE FUNCTION foo(a int, b int DEFAULT 2, c int DEFAULT 3)
 771 RETURNS int
 772 LANGUAGE SQL
 773 AS $$
 774     SELECT $1 + $2 + $3;
 775 $$;
 776
 777 SELECT foo(10, 20, 30);
 778  foo
 779 -----
 780   60
 781 (1 row)
 782
 783 SELECT foo(10, 20);
 784  foo
 785 -----
 786   33
 787 (1 row)
 788
 789 SELECT foo(10);
 790  foo
 791 -----
 792   15
 793 (1 row)
 794
 795 SELECT foo();  -- fails since there is no default for the first argument
 796 ERROR:  function foo() does not exist
 797 </screen>
 798      The <literal>=</literal> sign can also be used in place of the
 799      key word <literal>DEFAULT</literal>.
 800     </para>
 801    </sect2>
 802
 803    <sect2 id="xfunc-sql-table-functions">
 804     <title><acronym>SQL</acronym> Functions as Table Sources</title>
 805
 806     <para>
 807      All SQL functions can be used in the <literal>FROM</literal> clause of a query,
 808      but it is particularly useful for functions returning composite types.
 809      If the function is defined to return a base type, the table function
 810      produces a one-column table.  If the function is defined to return
 811      a composite type, the table function produces a column for each attribute
 812      of the composite type.
 813     </para>
 814
 815     <para>
 816      Here is an example:
 817
 818 <screen>
 819 CREATE TABLE foo (fooid int, foosubid int, fooname text);
 820 INSERT INTO foo VALUES (1, 1, 'Joe');
 821 INSERT INTO foo VALUES (1, 2, 'Ed');
 822 INSERT INTO foo VALUES (2, 1, 'Mary');
 823
 824 CREATE FUNCTION getfoo(int) RETURNS foo AS $$
 825     SELECT * FROM foo WHERE fooid = $1;
 826 $$ LANGUAGE SQL;
 827
 828 SELECT *, upper(fooname) FROM getfoo(1) AS t1;
 829
 830  fooid | foosubid | fooname | upper
 831 -------+----------+---------+-------
 832      1 |        1 | Joe     | JOE
 833 (1 row)
 834 </screen>
 835
 836      As the example shows, we can work with the columns of the function's
 837      result just the same as if they were columns of a regular table.
 838     </para>
 839
 840     <para>
 841      Note that we only got one row out of the function.  This is because
 842      we did not use <literal>SETOF</literal>.  That is described in the next section.
 843     </para>
 844    </sect2>
 845
 846    <sect2 id="xfunc-sql-functions-returning-set">
 847     <title><acronym>SQL</acronym> Functions Returning Sets</title>
 848
 849     <indexterm>
 850      <primary>function</primary>
 851      <secondary>with SETOF</secondary>
 852     </indexterm>
 853
 854     <para>
 855      When an SQL function is declared as returning <literal>SETOF
 856      <replaceable>sometype</replaceable></literal>, the function's final
 857      query is executed to completion, and each row it
 858      outputs is returned as an element of the result set.
 859     </para>
 860
 861     <para>
 862      This feature is normally used when calling the function in the <literal>FROM</literal>
 863      clause.  In this case each row returned by the function becomes
 864      a row of the table seen by the query.  For example, assume that
 865      table <literal>foo</literal> has the same contents as above, and we say:
 866
 867 <programlisting>
 868 CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS $$
 869     SELECT * FROM foo WHERE fooid = $1;
 870 $$ LANGUAGE SQL;
 871
 872 SELECT * FROM getfoo(1) AS t1;
 873 </programlisting>
 874
 875      Then we would get:
 876 <screen>
 877  fooid | foosubid | fooname
 878 -------+----------+---------
 879      1 |        1 | Joe
 880      1 |        2 | Ed
 881 (2 rows)
 882 </screen>
 883     </para>
 884
 885     <para>
 886      It is also possible to return multiple rows with the columns defined by
 887      output parameters, like this:
 888
 889 <programlisting>
 890 CREATE TABLE tab (y int, z int);
 891 INSERT INTO tab VALUES (1, 2), (3, 4), (5, 6), (7, 8);
 892
 893 CREATE FUNCTION sum_n_product_with_tab (x int, OUT sum int, OUT product int)
 894 RETURNS SETOF record
 895 AS $$
 896     SELECT $1 + tab.y, $1 * tab.y FROM tab;
 897 $$ LANGUAGE SQL;
 898
 899 SELECT * FROM sum_n_product_with_tab(10);
 900  sum | product
 901 -----+---------
 902   11 |      10
 903   13 |      30
 904   15 |      50
 905   17 |      70
 906 (4 rows)
 907 </programlisting>
 908
 909      The key point here is that you must write <literal>RETURNS SETOF record</literal>
 910      to indicate that the function returns multiple rows instead of just one.
 911      If there is only one output parameter, write that parameter's type
 912      instead of <type>record</type>.
 913     </para>
 914
 915     <para>
 916      It is frequently useful to construct a query's result by invoking a
 917      set-returning function multiple times, with the parameters for each
 918      invocation coming from successive rows of a table or subquery.  The
 919      preferred way to do this is to use the <literal>LATERAL</literal> key word,
 920      which is described in <xref linkend="queries-lateral">.
 921      Here is an example using a set-returning function to enumerate
 922      elements of a tree structure:
 923
 924 <screen>
 925 SELECT * FROM nodes;
 926    name    | parent
 927 -----------+--------
 928  Top       |
 929  Child1    | Top
 930  Child2    | Top
 931  Child3    | Top
 932  SubChild1 | Child1
 933  SubChild2 | Child1
 934 (6 rows)
 935
 936 CREATE FUNCTION listchildren(text) RETURNS SETOF text AS $$
 937     SELECT name FROM nodes WHERE parent = $1
 938 $$ LANGUAGE SQL STABLE;
 939
 940 SELECT * FROM listchildren('Top');
 941  listchildren
 942 --------------
 943  Child1
 944  Child2
 945  Child3
 946 (3 rows)
 947
 948 SELECT name, child FROM nodes, LATERAL listchildren(name) AS child;
 949   name  |   child
 950 --------+-----------
 951  Top    | Child1
 952  Top    | Child2
 953  Top    | Child3
 954  Child1 | SubChild1
 955  Child1 | SubChild2
 956 (5 rows)
 957 </screen>
 958
 959      This example does not do anything that we couldn't have done with a
 960      simple join, but in more complex calculations the option to put
 961      some of the work into a function can be quite convenient.
 962     </para>
 963
 964     <para>
 965      Functions returning sets can also be called in the select list
 966      of a query.  For each row that the query
 967      generates by itself, the set-returning function is invoked, and an output
 968      row is generated for each element of the function's result set.
 969      The previous example could also be done with queries like
 970      these:
 971
 972 <screen>
 973 SELECT listchildren('Top');
 974  listchildren
 975 --------------
 976  Child1
 977  Child2
 978  Child3
 979 (3 rows)
 980
 981 SELECT name, listchildren(name) FROM nodes;
 982   name  | listchildren
 983 --------+--------------
 984  Top    | Child1
 985  Top    | Child2
 986  Top    | Child3
 987  Child1 | SubChild1
 988  Child1 | SubChild2
 989 (5 rows)
 990 </screen>
 991
 992      In the last <command>SELECT</command>,
 993      notice that no output row appears for <literal>Child2</literal>, <literal>Child3</literal>, etc.
 994      This happens because <function>listchildren</function> returns an empty set
 995      for those arguments, so no result rows are generated.  This is the same
 996      behavior as we got from an inner join to the function result when using
 997      the <literal>LATERAL</literal> syntax.
 998     </para>
 999
1000     <para>
1001      <productname>PostgreSQL</productname>'s behavior for a set-returning function in a
1002      query's select list is almost exactly the same as if the set-returning
1003      function had been written in a <literal>LATERAL FROM</literal>-clause item
1004      instead.  For example,
1005 <programlisting>
1006 SELECT x, generate_series(1,5) AS g FROM tab;
1007 </programlisting>
1008      is almost equivalent to
1009 <programlisting>
1010 SELECT x, g FROM tab, LATERAL generate_series(1,5) AS g;
1011 </programlisting>
1012      It would be exactly the same, except that in this specific example,
1013      the planner could choose to put <structname>g</structname> on the outside of the
1014      nestloop join, since <structname>g</structname> has no actual lateral dependency
1015      on <structname>tab</structname>.  That would result in a different output row
1016      order.  Set-returning functions in the select list are always evaluated
1017      as though they are on the inside of a nestloop join with the rest of
1018      the <literal>FROM</literal> clause, so that the function(s) are run to
1019      completion before the next row from the <literal>FROM</literal> clause is
1020      considered.
1021     </para>
1022
1023     <para>
1024      If there is more than one set-returning function in the query's select
1025      list, the behavior is similar to what you get from putting the functions
1026      into a single <literal>LATERAL ROWS FROM( ... )</literal> <literal>FROM</literal>-clause
1027      item.  For each row from the underlying query, there is an output row
1028      using the first result from each function, then an output row using the
1029      second result, and so on.  If some of the set-returning functions
1030      produce fewer outputs than others, null values are substituted for the
1031      missing data, so that the total number of rows emitted for one
1032      underlying row is the same as for the set-returning function that
1033      produced the most outputs.  Thus the set-returning functions
1034      run <quote>in lockstep</quote> until they are all exhausted, and then
1035      execution continues with the next underlying row.
1036     </para>
1037
1038     <para>
1039      Set-returning functions can be nested in a select list, although that is
1040      not allowed in <literal>FROM</literal>-clause items.  In such cases, each level
1041      of nesting is treated separately, as though it were
1042      a separate <literal>LATERAL ROWS FROM( ... )</literal> item.  For example, in
1043 <programlisting>
1044 SELECT srf1(srf2(x), srf3(y)), srf4(srf5(z)) FROM tab;
1045 </programlisting>
1046      the set-returning functions <function>srf2</function>, <function>srf3</function>,
1047      and <function>srf5</function> would be run in lockstep for each row
1048      of <structname>tab</structname>, and then <function>srf1</function> and <function>srf4</function>
1049      would be applied in lockstep to each row produced by the lower
1050      functions.
1051     </para>
1052
1053     <para>
1054      Set-returning functions cannot be used within conditional-evaluation
1055      constructs, such as <literal>CASE</literal> or <literal>COALESCE</literal>.  For
1056      example, consider
1057 <programlisting>
1058 SELECT x, CASE WHEN x &gt; 0 THEN generate_series(1, 5) ELSE 0 END FROM tab;
1059 </programlisting>
1060      It might seem that this should produce five repetitions of input rows
1061      that have <literal>x &gt; 0</literal>, and a single repetition of those that do
1062      not; but actually, because <function>generate_series(1, 5)</function> would be
1063      run in an implicit <literal>LATERAL FROM</literal> item before
1064      the <literal>CASE</literal> expression is ever evaluated, it would produce five
1065      repetitions of every input row.  To reduce confusion, such cases produce
1066      a parse-time error instead.
1067     </para>
1068
1069     <note>
1070      <para>
1071       If a function's last command is <command>INSERT</command>, <command>UPDATE</command>,
1072       or <command>DELETE</command> with <literal>RETURNING</literal>, that command will
1073       always be executed to completion, even if the function is not declared
1074       with <literal>SETOF</literal> or the calling query does not fetch all the
1075       result rows.  Any extra rows produced by the <literal>RETURNING</literal>
1076       clause are silently dropped, but the commanded table modifications
1077       still happen (and are all completed before returning from the function).
1078      </para>
1079     </note>
1080
1081     <note>
1082      <para>
1083       Before <productname>PostgreSQL</productname> 10, putting more than one
1084       set-returning function in the same select list did not behave very
1085       sensibly unless they always produced equal numbers of rows.  Otherwise,
1086       what you got was a number of output rows equal to the least common
1087       multiple of the numbers of rows produced by the set-returning
1088       functions.  Also, nested set-returning functions did not work as
1089       described above; instead, a set-returning function could have at most
1090       one set-returning argument, and each nest of set-returning functions
1091       was run independently.  Also, conditional execution (set-returning
1092       functions inside <literal>CASE</literal> etc) was previously allowed,
1093       complicating things even more.
1094       Use of the <literal>LATERAL</literal> syntax is recommended when writing
1095       queries that need to work in older <productname>PostgreSQL</productname> versions,
1096       because that will give consistent results across different versions.
1097       If you have a query that is relying on conditional execution of a
1098       set-returning function, you may be able to fix it by moving the
1099       conditional test into a custom set-returning function.  For example,
1100 <programlisting>
1101 SELECT x, CASE WHEN y &gt; 0 THEN generate_series(1, z) ELSE 5 END FROM tab;
1102 </programlisting>
1103       could become
1104 <programlisting>
1105 CREATE FUNCTION case_generate_series(cond bool, start int, fin int, els int)
1106   RETURNS SETOF int AS $$
1107 BEGIN
1108   IF cond THEN
1109     RETURN QUERY SELECT generate_series(start, fin);
1110   ELSE
1111     RETURN QUERY SELECT els;
1112   END IF;
1113 END$$ LANGUAGE plpgsql;
1114
1115 SELECT x, case_generate_series(y &gt; 0, 1, z, 5) FROM tab;
1116 </programlisting>
1117       This formulation will work the same in all versions
1118       of <productname>PostgreSQL</productname>.
1119      </para>
1120     </note>
1121    </sect2>
1122
1123    <sect2 id="xfunc-sql-functions-returning-table">
1124     <title><acronym>SQL</acronym> Functions Returning <literal>TABLE</literal></title>
1125
1126     <indexterm>
1127      <primary>function</primary>
1128      <secondary>RETURNS TABLE</secondary>
1129     </indexterm>
1130
1131     <para>
1132      There is another way to declare a function as returning a set,
1133      which is to use the syntax
1134      <literal>RETURNS TABLE(<replaceable>columns</replaceable>)</literal>.
1135      This is equivalent to using one or more <literal>OUT</literal> parameters plus
1136      marking the function as returning <literal>SETOF record</literal> (or
1137      <literal>SETOF</literal> a single output parameter's type, as appropriate).
1138      This notation is specified in recent versions of the SQL standard, and
1139      thus may be more portable than using <literal>SETOF</literal>.
1140     </para>
1141
1142     <para>
1143      For example, the preceding sum-and-product example could also be
1144      done this way:
1145
1146 <programlisting>
1147 CREATE FUNCTION sum_n_product_with_tab (x int)
1148 RETURNS TABLE(sum int, product int) AS $$
1149     SELECT $1 + tab.y, $1 * tab.y FROM tab;
1150 $$ LANGUAGE SQL;
1151 </programlisting>
1152
1153      It is not allowed to use explicit <literal>OUT</literal> or <literal>INOUT</literal>
1154      parameters with the <literal>RETURNS TABLE</literal> notation &mdash; you must
1155      put all the output columns in the <literal>TABLE</literal> list.
1156     </para>
1157    </sect2>
1158
1159    <sect2>
1160     <title>Polymorphic <acronym>SQL</acronym> Functions</title>
1161
1162     <para>
1163      <acronym>SQL</acronym> functions can be declared to accept and
1164      return the polymorphic types <type>anyelement</type>,
1165      <type>anyarray</type>, <type>anynonarray</type>,
1166      <type>anyenum</type>, and <type>anyrange</type>.  See <xref
1167      linkend="extend-types-polymorphic"> for a more detailed
1168      explanation of polymorphic functions. Here is a polymorphic
1169      function <function>make_array</function> that builds up an array
1170      from two arbitrary data type elements:
1171 <screen>
1172 CREATE FUNCTION make_array(anyelement, anyelement) RETURNS anyarray AS $$
1173     SELECT ARRAY[$1, $2];
1174 $$ LANGUAGE SQL;
1175
1176 SELECT make_array(1, 2) AS intarray, make_array('a'::text, 'b') AS textarray;
1177  intarray | textarray
1178 ----------+-----------
1179  {1,2}    | {a,b}
1180 (1 row)
1181 </screen>
1182     </para>
1183
1184     <para>
1185      Notice the use of the typecast <literal>'a'::text</literal>
1186      to specify that the argument is of type <type>text</type>. This is
1187      required if the argument is just a string literal, since otherwise
1188      it would be treated as type
1189      <type>unknown</type>, and array of <type>unknown</type> is not a valid
1190      type.
1191      Without the typecast, you will get errors like this:
1192 <screen>
1193 <computeroutput>
1194 ERROR:  could not determine polymorphic type because input has type "unknown"
1195 </computeroutput>
1196 </screen>
1197     </para>
1198
1199     <para>
1200      It is permitted to have polymorphic arguments with a fixed
1201      return type, but the converse is not. For example:
1202 <screen>
1203 CREATE FUNCTION is_greater(anyelement, anyelement) RETURNS boolean AS $$
1204     SELECT $1 &gt; $2;
1205 $$ LANGUAGE SQL;
1206
1207 SELECT is_greater(1, 2);
1208  is_greater
1209 ------------
1210  f
1211 (1 row)
1212
1213 CREATE FUNCTION invalid_func() RETURNS anyelement AS $$
1214     SELECT 1;
1215 $$ LANGUAGE SQL;
1216 ERROR:  cannot determine result data type
1217 DETAIL:  A function returning a polymorphic type must have at least one polymorphic argument.
1218 </screen>
1219     </para>
1220
1221     <para>
1222      Polymorphism can be used with functions that have output arguments.
1223      For example:
1224 <screen>
1225 CREATE FUNCTION dup (f1 anyelement, OUT f2 anyelement, OUT f3 anyarray)
1226 AS 'select $1, array[$1,$1]' LANGUAGE SQL;
1227
1228 SELECT * FROM dup(22);
1229  f2 |   f3
1230 ----+---------
1231  22 | {22,22}
1232 (1 row)
1233 </screen>
1234     </para>
1235
1236     <para>
1237      Polymorphism can also be used with variadic functions.
1238      For example:
1239 <screen>
1240 CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
1241     SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
1242 $$ LANGUAGE SQL;
1243
1244 SELECT anyleast(10, -1, 5, 4);
1245  anyleast
1246 ----------
1247        -1
1248 (1 row)
1249
1250 SELECT anyleast('abc'::text, 'def');
1251  anyleast
1252 ----------
1253  abc
1254 (1 row)
1255
1256 CREATE FUNCTION concat_values(text, VARIADIC anyarray) RETURNS text AS $$
1257     SELECT array_to_string($2, $1);
1258 $$ LANGUAGE SQL;
1259
1260 SELECT concat_values('|', 1, 4, 2);
1261  concat_values
1262 ---------------
1263  1|4|2
1264 (1 row)
1265 </screen>
1266     </para>
1267    </sect2>
1268
1269    <sect2>
1270     <title><acronym>SQL</acronym> Functions with Collations</title>
1271
1272     <indexterm>
1273      <primary>collation</primary>
1274      <secondary>in SQL functions</secondary>
1275     </indexterm>
1276
1277     <para>
1278      When a SQL function has one or more parameters of collatable data types,
1279      a collation is identified for each function call depending on the
1280      collations assigned to the actual arguments, as described in <xref
1281      linkend="collation">.  If a collation is successfully identified
1282      (i.e., there are no conflicts of implicit collations among the arguments)
1283      then all the collatable parameters are treated as having that collation
1284      implicitly.  This will affect the behavior of collation-sensitive
1285      operations within the function.  For example, using the
1286      <function>anyleast</function> function described above, the result of
1287 <programlisting>
1288 SELECT anyleast('abc'::text, 'ABC');
1289 </programlisting>
1290      will depend on the database's default collation.  In <literal>C</literal> locale
1291      the result will be <literal>ABC</literal>, but in many other locales it will
1292      be <literal>abc</literal>.  The collation to use can be forced by adding
1293      a <literal>COLLATE</literal> clause to any of the arguments, for example
1294 <programlisting>
1295 SELECT anyleast('abc'::text, 'ABC' COLLATE "C");
1296 </programlisting>
1297      Alternatively, if you wish a function to operate with a particular
1298      collation regardless of what it is called with, insert
1299      <literal>COLLATE</literal> clauses as needed in the function definition.
1300      This version of <function>anyleast</function> would always use <literal>en_US</literal>
1301      locale to compare strings:
1302 <programlisting>
1303 CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
1304     SELECT min($1[i] COLLATE "en_US") FROM generate_subscripts($1, 1) g(i);
1305 $$ LANGUAGE SQL;
1306 </programlisting>
1307      But note that this will throw an error if applied to a non-collatable
1308      data type.
1309     </para>
1310
1311     <para>
1312      If no common collation can be identified among the actual arguments,
1313      then a SQL function treats its parameters as having their data types'
1314      default collation (which is usually the database's default collation,
1315      but could be different for parameters of domain types).
1316     </para>
1317
1318     <para>
1319      The behavior of collatable parameters can be thought of as a limited
1320      form of polymorphism, applicable only to textual data types.
1321     </para>
1322    </sect2>
1323   </sect1>
1324
1325   <sect1 id="xfunc-overload">
1326    <title>Function Overloading</title>
1327
1328    <indexterm zone="xfunc-overload">
1329     <primary>overloading</primary>
1330     <secondary>functions</secondary>
1331    </indexterm>
1332
1333    <para>
1334     More than one function can be defined with the same SQL name, so long
1335     as the arguments they take are different.  In other words,
1336     function names can be <firstterm>overloaded</firstterm>.  When a
1337     query is executed, the server will determine which function to
1338     call from the data types and the number of the provided arguments.
1339     Overloading can also be used to simulate functions with a variable
1340     number of arguments, up to a finite maximum number.
1341    </para>
1342
1343    <para>
1344     When creating a family of overloaded functions, one should be
1345     careful not to create ambiguities.  For instance, given the
1346     functions:
1347 <programlisting>
1348 CREATE FUNCTION test(int, real) RETURNS ...
1349 CREATE FUNCTION test(smallint, double precision) RETURNS ...
1350 </programlisting>
1351     it is not immediately clear which function would be called with
1352     some trivial input like <literal>test(1, 1.5)</literal>.  The
1353     currently implemented resolution rules are described in
1354     <xref linkend="typeconv">, but it is unwise to design a system that subtly
1355     relies on this behavior.
1356    </para>
1357
1358    <para>
1359     A function that takes a single argument of a composite type should
1360     generally not have the same name as any attribute (field) of that type.
1361     Recall that <literal><replaceable>attribute</replaceable>(<replaceable>table</replaceable>)</literal>
1362     is considered equivalent
1363     to <literal><replaceable>table</replaceable>.<replaceable>attribute</replaceable></literal>.
1364     In the case that there is an
1365     ambiguity between a function on a composite type and an attribute of
1366     the composite type, the attribute will always be used.  It is possible
1367     to override that choice by schema-qualifying the function name
1368     (that is, <literal><replaceable>schema</replaceable>.<replaceable>func</replaceable>(<replaceable>table</replaceable>)
1369     </literal>) but it's better to
1370     avoid the problem by not choosing conflicting names.
1371    </para>
1372
1373    <para>
1374     Another possible conflict is between variadic and non-variadic functions.
1375     For instance, it is possible to create both <literal>foo(numeric)</literal> and
1376     <literal>foo(VARIADIC numeric[])</literal>.  In this case it is unclear which one
1377     should be matched to a call providing a single numeric argument, such as
1378     <literal>foo(10.1)</literal>.  The rule is that the function appearing
1379     earlier in the search path is used, or if the two functions are in the
1380     same schema, the non-variadic one is preferred.
1381    </para>
1382
1383    <para>
1384     When overloading C-language functions, there is an additional
1385     constraint: The C name of each function in the family of
1386     overloaded functions must be different from the C names of all
1387     other functions, either internal or dynamically loaded.  If this
1388     rule is violated, the behavior is not portable.  You might get a
1389     run-time linker error, or one of the functions will get called
1390     (usually the internal one).  The alternative form of the
1391     <literal>AS</literal> clause for the SQL <command>CREATE
1392     FUNCTION</command> command decouples the SQL function name from
1393     the function name in the C source code.  For instance:
1394 <programlisting>
1395 CREATE FUNCTION test(int) RETURNS int
1396     AS '<replaceable>filename</replaceable>', 'test_1arg'
1397     LANGUAGE C;
1398 CREATE FUNCTION test(int, int) RETURNS int
1399     AS '<replaceable>filename</replaceable>', 'test_2arg'
1400     LANGUAGE C;
1401 </programlisting>
1402     The names of the C functions here reflect one of many possible conventions.
1403    </para>
1404   </sect1>
1405
1406   <sect1 id="xfunc-volatility">
1407    <title>Function Volatility Categories</title>
1408
1409    <indexterm zone="xfunc-volatility">
1410     <primary>volatility</primary>
1411     <secondary>functions</secondary>
1412    </indexterm>
1413    <indexterm zone="xfunc-volatility">
1414     <primary>VOLATILE</primary>
1415    </indexterm>
1416    <indexterm zone="xfunc-volatility">
1417     <primary>STABLE</primary>
1418    </indexterm>
1419    <indexterm zone="xfunc-volatility">
1420     <primary>IMMUTABLE</primary>
1421    </indexterm>
1422
1423    <para>
1424     Every function has a <firstterm>volatility</firstterm> classification, with
1425     the possibilities being <literal>VOLATILE</literal>, <literal>STABLE</literal>, or
1426     <literal>IMMUTABLE</literal>.  <literal>VOLATILE</literal> is the default if the
1427     <xref linkend="sql-createfunction">
1428     command does not specify a category.  The volatility category is a
1429     promise to the optimizer about the behavior of the function:
1430
1431    <itemizedlist>
1432     <listitem>
1433      <para>
1434       A <literal>VOLATILE</literal> function can do anything, including modifying
1435       the database.  It can return different results on successive calls with
1436       the same arguments.  The optimizer makes no assumptions about the
1437       behavior of such functions.  A query using a volatile function will
1438       re-evaluate the function at every row where its value is needed.
1439      </para>
1440     </listitem>
1441     <listitem>
1442      <para>
1443       A <literal>STABLE</literal> function cannot modify the database and is
1444       guaranteed to return the same results given the same arguments
1445       for all rows within a single statement. This category allows the
1446       optimizer to optimize multiple calls of the function to a single
1447       call. In particular, it is safe to use an expression containing
1448       such a function in an index scan condition. (Since an index scan
1449       will evaluate the comparison value only once, not once at each
1450       row, it is not valid to use a <literal>VOLATILE</literal> function in an
1451       index scan condition.)
1452      </para>
1453     </listitem>
1454     <listitem>
1455      <para>
1456       An <literal>IMMUTABLE</literal> function cannot modify the database and is
1457       guaranteed to return the same results given the same arguments forever.
1458       This category allows the optimizer to pre-evaluate the function when
1459       a query calls it with constant arguments.  For example, a query like
1460       <literal>SELECT ... WHERE x = 2 + 2</literal> can be simplified on sight to
1461       <literal>SELECT ... WHERE x = 4</literal>, because the function underlying
1462       the integer addition operator is marked <literal>IMMUTABLE</literal>.
1463      </para>
1464     </listitem>
1465    </itemizedlist>
1466    </para>
1467
1468    <para>
1469     For best optimization results, you should label your functions with the
1470     strictest volatility category that is valid for them.
1471    </para>
1472
1473    <para>
1474     Any function with side-effects <emphasis>must</emphasis> be labeled
1475     <literal>VOLATILE</literal>, so that calls to it cannot be optimized away.
1476     Even a function with no side-effects needs to be labeled
1477     <literal>VOLATILE</literal> if its value can change within a single query;
1478     some examples are <literal>random()</literal>, <literal>currval()</literal>,
1479     <literal>timeofday()</literal>.
1480    </para>
1481
1482    <para>
1483     Another important example is that the <function>current_timestamp</function>
1484     family of functions qualify as <literal>STABLE</literal>, since their values do
1485     not change within a transaction.
1486    </para>
1487
1488    <para>
1489     There is relatively little difference between <literal>STABLE</literal> and
1490     <literal>IMMUTABLE</literal> categories when considering simple interactive
1491     queries that are planned and immediately executed: it doesn't matter
1492     a lot whether a function is executed once during planning or once during
1493     query execution startup.  But there is a big difference if the plan is
1494     saved and reused later.  Labeling a function <literal>IMMUTABLE</literal> when
1495     it really isn't might allow it to be prematurely folded to a constant during
1496     planning, resulting in a stale value being re-used during subsequent uses
1497     of the plan.  This is a hazard when using prepared statements or when
1498     using function languages that cache plans (such as
1499     <application>PL/pgSQL</application>).
1500    </para>
1501
1502    <para>
1503     For functions written in SQL or in any of the standard procedural
1504     languages, there is a second important property determined by the
1505     volatility category, namely the visibility of any data changes that have
1506     been made by the SQL command that is calling the function.  A
1507     <literal>VOLATILE</literal> function will see such changes, a <literal>STABLE</literal>
1508     or <literal>IMMUTABLE</literal> function will not.  This behavior is implemented
1509     using the snapshotting behavior of MVCC (see <xref linkend="mvcc">):
1510     <literal>STABLE</literal> and <literal>IMMUTABLE</literal> functions use a snapshot
1511     established as of the start of the calling query, whereas
1512     <literal>VOLATILE</literal> functions obtain a fresh snapshot at the start of
1513     each query they execute.
1514    </para>
1515
1516    <note>
1517     <para>
1518      Functions written in C can manage snapshots however they want, but it's
1519      usually a good idea to make C functions work this way too.
1520     </para>
1521    </note>
1522
1523    <para>
1524     Because of this snapshotting behavior,
1525     a function containing only <command>SELECT</command> commands can safely be
1526     marked <literal>STABLE</literal>, even if it selects from tables that might be
1527     undergoing modifications by concurrent queries.
1528     <productname>PostgreSQL</productname> will execute all commands of a
1529     <literal>STABLE</literal> function using the snapshot established for the
1530     calling query, and so it will see a fixed view of the database throughout
1531     that query.
1532    </para>
1533
1534    <para>
1535     The same snapshotting behavior is used for <command>SELECT</command> commands
1536     within <literal>IMMUTABLE</literal> functions.  It is generally unwise to select
1537     from database tables within an <literal>IMMUTABLE</literal> function at all,
1538     since the immutability will be broken if the table contents ever change.
1539     However, <productname>PostgreSQL</productname> does not enforce that you
1540     do not do that.
1541    </para>
1542
1543    <para>
1544     A common error is to label a function <literal>IMMUTABLE</literal> when its
1545     results depend on a configuration parameter.  For example, a function
1546     that manipulates timestamps might well have results that depend on the
1547     <xref linkend="guc-timezone"> setting.  For safety, such functions should
1548     be labeled <literal>STABLE</literal> instead.
1549    </para>
1550
1551    <note>
1552     <para>
1553      <productname>PostgreSQL</productname> requires that <literal>STABLE</literal>
1554      and <literal>IMMUTABLE</literal> functions contain no SQL commands other
1555      than <command>SELECT</command> to prevent data modification.
1556      (This is not a completely bulletproof test, since such functions could
1557      still call <literal>VOLATILE</literal> functions that modify the database.
1558      If you do that, you will find that the <literal>STABLE</literal> or
1559      <literal>IMMUTABLE</literal> function does not notice the database changes
1560      applied by the called function, since they are hidden from its snapshot.)
1561     </para>
1562    </note>
1563   </sect1>
1564
1565   <sect1 id="xfunc-pl">
1566    <title>Procedural Language Functions</title>
1567
1568    <para>
1569     <productname>PostgreSQL</productname> allows user-defined functions
1570     to be written in other languages besides SQL and C.  These other
1571     languages are generically called <firstterm>procedural
1572     languages</firstterm> (<acronym>PL</acronym>s).
1573     Procedural languages aren't built into the
1574     <productname>PostgreSQL</productname> server; they are offered
1575     by loadable modules.
1576     See <xref linkend="xplang"> and following chapters for more
1577     information.
1578    </para>
1579   </sect1>
1580
1581   <sect1 id="xfunc-internal">
1582    <title>Internal Functions</title>
1583
1584    <indexterm zone="xfunc-internal"><primary>function</primary><secondary>internal</secondary></indexterm>
1585
1586    <para>
1587     Internal functions are functions written in C that have been statically
1588     linked into the <productname>PostgreSQL</productname> server.
1589     The <quote>body</quote> of the function definition
1590     specifies the C-language name of the function, which need not be the
1591     same as the name being declared for SQL use.
1592     (For reasons of backward compatibility, an empty body
1593     is accepted as meaning that the C-language function name is the
1594     same as the SQL name.)
1595    </para>
1596
1597    <para>
1598     Normally, all internal functions present in the
1599     server are declared during the initialization of the database cluster
1600     (see <xref linkend="creating-cluster">),
1601     but a user could use <command>CREATE FUNCTION</command>
1602     to create additional alias names for an internal function.
1603     Internal functions are declared in <command>CREATE FUNCTION</command>
1604     with language name <literal>internal</literal>.  For instance, to
1605     create an alias for the <function>sqrt</function> function:
1606 <programlisting>
1607 CREATE FUNCTION square_root(double precision) RETURNS double precision
1608     AS 'dsqrt'
1609     LANGUAGE internal
1610     STRICT;
1611 </programlisting>
1612     (Most internal functions expect to be declared <quote>strict</quote>.)
1613    </para>
1614
1615    <note>
1616     <para>
1617      Not all <quote>predefined</quote> functions are
1618      <quote>internal</quote> in the above sense.  Some predefined
1619      functions are written in SQL.
1620     </para>
1621    </note>
1622   </sect1>
1623
1624   <sect1 id="xfunc-c">
1625    <title>C-Language Functions</title>
1626
1627    <indexterm zone="xfunc-c">
1628     <primary>function</primary>
1629     <secondary>user-defined</secondary>
1630     <tertiary>in C</tertiary>
1631    </indexterm>
1632
1633    <para>
1634     User-defined functions can be written in C (or a language that can
1635     be made compatible with C, such as C++).  Such functions are
1636     compiled into dynamically loadable objects (also called shared
1637     libraries) and are loaded by the server on demand.  The dynamic
1638     loading feature is what distinguishes <quote>C language</quote> functions
1639     from <quote>internal</quote> functions &mdash; the actual coding conventions
1640     are essentially the same for both.  (Hence, the standard internal
1641     function library is a rich source of coding examples for user-defined
1642     C functions.)
1643    </para>
1644
1645    <para>
1646     Currently only one calling convention is used for C functions
1647     (<quote>version 1</quote>). Support for that calling convention is
1648     indicated by writing a <literal>PG_FUNCTION_INFO_V1()</literal> macro
1649     call for the function, as illustrated below.
1650    </para>
1651
1652   <sect2 id="xfunc-c-dynload">
1653    <title>Dynamic Loading</title>
1654
1655    <indexterm zone="xfunc-c-dynload">
1656     <primary>dynamic loading</primary>
1657    </indexterm>
1658
1659    <para>
1660     The first time a user-defined function in a particular
1661     loadable object file is called in a session,
1662     the dynamic loader loads that object file into memory so that the
1663     function can be called.  The <command>CREATE FUNCTION</command>
1664     for a user-defined C function must therefore specify two pieces of
1665     information for the function: the name of the loadable
1666     object file, and the C name (link symbol) of the specific function to call
1667     within that object file.  If the C name is not explicitly specified then
1668     it is assumed to be the same as the SQL function name.
1669    </para>
1670
1671    <para>
1672     The following algorithm is used to locate the shared object file
1673     based on the name given in the <command>CREATE FUNCTION</command>
1674     command:
1675
1676     <orderedlist>
1677      <listitem>
1678       <para>
1679        If the name is an absolute path, the given file is loaded.
1680       </para>
1681      </listitem>
1682
1683      <listitem>
1684       <para>
1685        If the name starts with the string <literal>$libdir</literal>,
1686        that part is replaced by the <productname>PostgreSQL</productname> package
1687         library directory
1688        name, which is determined at build time.<indexterm><primary>$libdir</primary></indexterm>
1689       </para>
1690      </listitem>
1691
1692      <listitem>
1693       <para>
1694        If the name does not contain a directory part, the file is
1695        searched for in the path specified by the configuration variable
1696        <xref linkend="guc-dynamic-library-path">.<indexterm><primary>dynamic_library_path</primary></indexterm>
1697       </para>
1698      </listitem>
1699
1700      <listitem>
1701       <para>
1702        Otherwise (the file was not found in the path, or it contains a
1703        non-absolute directory part), the dynamic loader will try to
1704        take the name as given, which will most likely fail.  (It is
1705        unreliable to depend on the current working directory.)
1706       </para>
1707      </listitem>
1708     </orderedlist>
1709
1710     If this sequence does not work, the platform-specific shared
1711     library file name extension (often <filename>.so</filename>) is
1712     appended to the given name and this sequence is tried again.  If
1713     that fails as well, the load will fail.
1714    </para>
1715
1716    <para>
1717     It is recommended to locate shared libraries either relative to
1718     <literal>$libdir</literal> or through the dynamic library path.
1719     This simplifies version upgrades if the new installation is at a
1720     different location.  The actual directory that
1721     <literal>$libdir</literal> stands for can be found out with the
1722     command <literal>pg_config --pkglibdir</literal>.
1723    </para>
1724
1725    <para>
1726     The user ID the <productname>PostgreSQL</productname> server runs
1727     as must be able to traverse the path to the file you intend to
1728     load.  Making the file or a higher-level directory not readable
1729     and/or not executable by the <systemitem>postgres</systemitem>
1730     user is a common mistake.
1731    </para>
1732
1733    <para>
1734     In any case, the file name that is given in the
1735     <command>CREATE FUNCTION</command> command is recorded literally
1736     in the system catalogs, so if the file needs to be loaded again
1737     the same procedure is applied.
1738    </para>
1739
1740    <note>
1741     <para>
1742      <productname>PostgreSQL</productname> will not compile a C function
1743      automatically.  The object file must be compiled before it is referenced
1744      in a <command>CREATE
1745      FUNCTION</command> command.  See <xref linkend="dfunc"> for additional
1746      information.
1747     </para>
1748    </note>
1749
1750    <indexterm zone="xfunc-c-dynload">
1751     <primary>magic block</primary>
1752    </indexterm>
1753
1754    <para>
1755     To ensure that a dynamically loaded object file is not loaded into an
1756     incompatible server, <productname>PostgreSQL</productname> checks that the
1757     file contains a <quote>magic block</quote> with the appropriate contents.
1758     This allows the server to detect obvious incompatibilities, such as code
1759     compiled for a different major version of
1760     <productname>PostgreSQL</productname>. To include a magic block,
1761     write this in one (and only one) of the module source files, after having
1762     included the header <filename>fmgr.h</filename>:
1763
1764 <programlisting>
1765 PG_MODULE_MAGIC;
1766 </programlisting>
1767    </para>
1768
1769    <para>
1770     After it is used for the first time, a dynamically loaded object
1771     file is retained in memory.  Future calls in the same session to
1772     the function(s) in that file will only incur the small overhead of
1773     a symbol table lookup.  If you need to force a reload of an object
1774     file, for example after recompiling it, begin a fresh session.
1775    </para>
1776
1777    <indexterm zone="xfunc-c-dynload">
1778     <primary>_PG_init</primary>
1779    </indexterm>
1780    <indexterm zone="xfunc-c-dynload">
1781     <primary>_PG_fini</primary>
1782    </indexterm>
1783    <indexterm zone="xfunc-c-dynload">
1784     <primary>library initialization function</primary>
1785    </indexterm>
1786    <indexterm zone="xfunc-c-dynload">
1787     <primary>library finalization function</primary>
1788    </indexterm>
1789
1790    <para>
1791     Optionally, a dynamically loaded file can contain initialization and
1792     finalization functions.  If the file includes a function named
1793     <function>_PG_init</function>, that function will be called immediately after
1794     loading the file.  The function receives no parameters and should
1795     return void.  If the file includes a function named
1796     <function>_PG_fini</function>, that function will be called immediately before
1797     unloading the file.  Likewise, the function receives no parameters and
1798     should return void.  Note that <function>_PG_fini</function> will only be called
1799     during an unload of the file, not during process termination.
1800     (Presently, unloads are disabled and will never occur, but this may
1801     change in the future.)
1802    </para>
1803
1804   </sect2>
1805
1806    <sect2 id="xfunc-c-basetype">
1807     <title>Base Types in C-Language Functions</title>
1808
1809     <indexterm zone="xfunc-c-basetype">
1810      <primary>data type</primary>
1811      <secondary>internal organization</secondary>
1812     </indexterm>
1813
1814     <para>
1815      To know how to write C-language functions, you need to know how
1816      <productname>PostgreSQL</productname> internally represents base
1817      data types and how they can be passed to and from functions.
1818      Internally, <productname>PostgreSQL</productname> regards a base
1819      type as a <quote>blob of memory</quote>.  The user-defined
1820      functions that you define over a type in turn define the way that
1821      <productname>PostgreSQL</productname> can operate on it.  That
1822      is, <productname>PostgreSQL</productname> will only store and
1823      retrieve the data from disk and use your user-defined functions
1824      to input, process, and output the data.
1825     </para>
1826
1827     <para>
1828      Base types can have one of three internal formats:
1829
1830      <itemizedlist>
1831       <listitem>
1832        <para>
1833         pass by value, fixed-length
1834        </para>
1835       </listitem>
1836       <listitem>
1837        <para>
1838         pass by reference, fixed-length
1839        </para>
1840       </listitem>
1841       <listitem>
1842        <para>
1843         pass by reference, variable-length
1844        </para>
1845       </listitem>
1846      </itemizedlist>
1847     </para>
1848
1849     <para>
1850      By-value  types  can  only be 1, 2, or 4 bytes in length
1851      (also 8 bytes, if <literal>sizeof(Datum)</literal> is 8 on your machine).
1852      You should be careful to define your types such that they will be the
1853      same size (in bytes) on all architectures.  For example, the
1854      <literal>long</literal> type is dangerous because it is 4 bytes on some
1855      machines and 8 bytes on others, whereas <type>int</type> type is 4 bytes
1856      on most Unix machines.  A reasonable implementation of the
1857      <type>int4</type> type on Unix machines might be:
1858
1859 <programlisting>
1860 /* 4-byte integer, passed by value */
1861 typedef int int4;
1862 </programlisting>
1863
1864      (The actual PostgreSQL C code calls this type <type>int32</type>, because
1865      it is a convention in C that <type>int<replaceable>XX</replaceable></type>
1866      means <replaceable>XX</replaceable> <emphasis>bits</emphasis>.  Note
1867      therefore also that the C type <type>int8</type> is 1 byte in size.  The
1868      SQL type <type>int8</type> is called <type>int64</type> in C.  See also
1869      <xref linkend="xfunc-c-type-table">.)
1870     </para>
1871
1872     <para>
1873      On  the  other hand, fixed-length types of any size can
1874      be passed by-reference.  For example, here is a  sample
1875      implementation of a <productname>PostgreSQL</productname> type:
1876
1877 <programlisting>
1878 /* 16-byte structure, passed by reference */
1879 typedef struct
1880 {
1881     double  x, y;
1882 } Point;
1883 </programlisting>
1884
1885      Only  pointers  to  such types can be used when passing
1886      them in and out of <productname>PostgreSQL</productname> functions.
1887      To return a value of such a type, allocate the right amount of
1888      memory with <literal>palloc</literal>, fill in the allocated memory,
1889      and return a pointer to it.  (Also, if you just want to return the
1890      same value as one of your input arguments that's of the same data type,
1891      you can skip the extra <literal>palloc</literal> and just return the
1892      pointer to the input value.)
1893     </para>
1894
1895     <para>
1896      Finally, all variable-length types must also be  passed
1897      by  reference.   All  variable-length  types must begin
1898      with an opaque length field of exactly 4 bytes, which will be set
1899      by <symbol>SET_VARSIZE</symbol>; never set this field directly! All data to
1900      be  stored within that type must be located in the memory
1901      immediately  following  that  length  field.   The
1902      length field contains the total length of the structure,
1903      that is,  it  includes  the  size  of  the  length  field
1904      itself.
1905     </para>
1906
1907     <para>
1908      Another important point is to avoid leaving any uninitialized bits
1909      within data type values; for example, take care to zero out any
1910      alignment padding bytes that might be present in structs.  Without
1911      this, logically-equivalent constants of your data type might be
1912      seen as unequal by the planner, leading to inefficient (though not
1913      incorrect) plans.
1914     </para>
1915
1916     <warning>
1917      <para>
1918       <emphasis>Never</emphasis> modify the contents of a pass-by-reference input
1919       value.  If you do so you are likely to corrupt on-disk data, since
1920       the pointer you are given might point directly into a disk buffer.
1921       The sole exception to this rule is explained in
1922       <xref linkend="xaggr">.
1923      </para>
1924     </warning>
1925
1926     <para>
1927      As an example, we can define the type <type>text</type> as
1928      follows:
1929
1930 <programlisting>
1931 typedef struct {
1932     int32 length;
1933     char data[FLEXIBLE_ARRAY_MEMBER];
1934 } text;
1935 </programlisting>
1936
1937      The <literal>[FLEXIBLE_ARRAY_MEMBER]</literal> notation means that the actual
1938      length of the data part is not specified by this declaration.
1939     </para>
1940
1941     <para>
1942      When manipulating
1943      variable-length types, we must  be  careful  to  allocate
1944      the  correct amount  of memory and set the length field correctly.
1945      For example, if we wanted to  store  40  bytes  in  a <structname>text</structname>
1946      structure, we might use a code fragment like this:
1947
1948 <programlisting><![CDATA[
1949 #include "postgres.h"
1950 ...
1951 char buffer[40]; /* our source data */
1952 ...
1953 text *destination = (text *) palloc(VARHDRSZ + 40);
1954 SET_VARSIZE(destination, VARHDRSZ + 40);
1955 memcpy(destination->data, buffer, 40);
1956 ...
1957 ]]>
1958 </programlisting>
1959
1960      <literal>VARHDRSZ</literal> is the same as <literal>sizeof(int32)</literal>, but
1961      it's considered good style to use the macro <literal>VARHDRSZ</literal>
1962      to refer to the size of the overhead for a variable-length type.
1963      Also, the length field <emphasis>must</emphasis> be set using the
1964      <literal>SET_VARSIZE</literal> macro, not by simple assignment.
1965     </para>
1966
1967     <para>
1968      <xref linkend="xfunc-c-type-table"> specifies which C type
1969      corresponds to which SQL type when writing a C-language function
1970      that uses a built-in type of <productname>PostgreSQL</productname>.
1971      The <quote>Defined In</quote> column gives the header file that
1972      needs to be included to get the type definition.  (The actual
1973      definition might be in a different file that is included by the
1974      listed file.  It is recommended that users stick to the defined
1975      interface.)  Note that you should always include
1976      <filename>postgres.h</filename> first in any source file, because
1977      it declares a number of things that you will need anyway.
1978     </para>
1979
1980      <table tocentry="1" id="xfunc-c-type-table">
1981       <title>Equivalent C Types for Built-in SQL Types</title>
1982       <tgroup cols="3">
1983        <thead>
1984         <row>
1985          <entry>
1986           SQL Type
1987          </entry>
1988          <entry>
1989           C Type
1990          </entry>
1991          <entry>
1992           Defined In
1993          </entry>
1994         </row>
1995        </thead>
1996        <tbody>
1997         <row>
1998          <entry><type>abstime</type></entry>
1999          <entry><type>AbsoluteTime</type></entry>
2000          <entry><filename>utils/nabstime.h</filename></entry>
2001         </row>
2002         <row>
2003          <entry><type>bigint</type> (<type>int8</type>)</entry>
2004          <entry><type>int64</type></entry>
2005          <entry><filename>postgres.h</filename></entry>
2006         </row>
2007         <row>
2008          <entry><type>boolean</type></entry>
2009          <entry><type>bool</type></entry>
2010          <entry><filename>postgres.h</filename> (maybe compiler built-in)</entry>
2011         </row>
2012         <row>
2013          <entry><type>box</type></entry>
2014          <entry><type>BOX*</type></entry>
2015          <entry><filename>utils/geo_decls.h</filename></entry>
2016         </row>
2017         <row>
2018          <entry><type>bytea</type></entry>
2019          <entry><type>bytea*</type></entry>
2020          <entry><filename>postgres.h</filename></entry>
2021         </row>
2022         <row>
2023          <entry><type>"char"</type></entry>
2024          <entry><type>char</type></entry>
2025          <entry>(compiler built-in)</entry>
2026         </row>
2027         <row>
2028          <entry><type>character</type></entry>
2029          <entry><type>BpChar*</type></entry>
2030          <entry><filename>postgres.h</filename></entry>
2031         </row>
2032         <row>
2033          <entry><type>cid</type></entry>
2034          <entry><type>CommandId</type></entry>
2035          <entry><filename>postgres.h</filename></entry>
2036         </row>
2037         <row>
2038          <entry><type>date</type></entry>
2039          <entry><type>DateADT</type></entry>
2040          <entry><filename>utils/date.h</filename></entry>
2041         </row>
2042         <row>
2043          <entry><type>smallint</type> (<type>int2</type>)</entry>
2044          <entry><type>int16</type></entry>
2045          <entry><filename>postgres.h</filename></entry>
2046         </row>
2047         <row>
2048          <entry><type>int2vector</type></entry>
2049          <entry><type>int2vector*</type></entry>
2050          <entry><filename>postgres.h</filename></entry>
2051         </row>
2052         <row>
2053          <entry><type>integer</type> (<type>int4</type>)</entry>
2054          <entry><type>int32</type></entry>
2055          <entry><filename>postgres.h</filename></entry>
2056         </row>
2057         <row>
2058          <entry><type>real</type> (<type>float4</type>)</entry>
2059          <entry><type>float4*</type></entry>
2060         <entry><filename>postgres.h</filename></entry>
2061         </row>
2062         <row>
2063          <entry><type>double precision</type> (<type>float8</type>)</entry>
2064          <entry><type>float8*</type></entry>
2065          <entry><filename>postgres.h</filename></entry>
2066         </row>
2067         <row>
2068          <entry><type>interval</type></entry>
2069          <entry><type>Interval*</type></entry>
2070          <entry><filename>datatype/timestamp.h</filename></entry>
2071         </row>
2072         <row>
2073          <entry><type>lseg</type></entry>
2074          <entry><type>LSEG*</type></entry>
2075          <entry><filename>utils/geo_decls.h</filename></entry>
2076         </row>
2077         <row>
2078          <entry><type>name</type></entry>
2079          <entry><type>Name</type></entry>
2080          <entry><filename>postgres.h</filename></entry>
2081         </row>
2082         <row>
2083          <entry><type>oid</type></entry>
2084          <entry><type>Oid</type></entry>
2085          <entry><filename>postgres.h</filename></entry>
2086         </row>
2087         <row>
2088          <entry><type>oidvector</type></entry>
2089          <entry><type>oidvector*</type></entry>
2090          <entry><filename>postgres.h</filename></entry>
2091         </row>
2092         <row>
2093          <entry><type>path</type></entry>
2094          <entry><type>PATH*</type></entry>
2095          <entry><filename>utils/geo_decls.h</filename></entry>
2096         </row>
2097         <row>
2098          <entry><type>point</type></entry>
2099          <entry><type>POINT*</type></entry>
2100          <entry><filename>utils/geo_decls.h</filename></entry>
2101         </row>
2102         <row>
2103          <entry><type>regproc</type></entry>
2104          <entry><type>regproc</type></entry>
2105          <entry><filename>postgres.h</filename></entry>
2106         </row>
2107         <row>
2108          <entry><type>reltime</type></entry>
2109          <entry><type>RelativeTime</type></entry>
2110          <entry><filename>utils/nabstime.h</filename></entry>
2111         </row>
2112         <row>
2113          <entry><type>text</type></entry>
2114          <entry><type>text*</type></entry>
2115          <entry><filename>postgres.h</filename></entry>
2116         </row>
2117         <row>
2118          <entry><type>tid</type></entry>
2119          <entry><type>ItemPointer</type></entry>
2120          <entry><filename>storage/itemptr.h</filename></entry>
2121         </row>
2122         <row>
2123          <entry><type>time</type></entry>
2124          <entry><type>TimeADT</type></entry>
2125          <entry><filename>utils/date.h</filename></entry>
2126         </row>
2127         <row>
2128          <entry><type>time with time zone</type></entry>
2129          <entry><type>TimeTzADT</type></entry>
2130          <entry><filename>utils/date.h</filename></entry>
2131         </row>
2132         <row>
2133          <entry><type>timestamp</type></entry>
2134          <entry><type>Timestamp*</type></entry>
2135          <entry><filename>datatype/timestamp.h</filename></entry>
2136         </row>
2137         <row>
2138          <entry><type>tinterval</type></entry>
2139          <entry><type>TimeInterval</type></entry>
2140          <entry><filename>utils/nabstime.h</filename></entry>
2141         </row>
2142         <row>
2143          <entry><type>varchar</type></entry>
2144          <entry><type>VarChar*</type></entry>
2145          <entry><filename>postgres.h</filename></entry>
2146         </row>
2147         <row>
2148          <entry><type>xid</type></entry>
2149          <entry><type>TransactionId</type></entry>
2150          <entry><filename>postgres.h</filename></entry>
2151         </row>
2152        </tbody>
2153       </tgroup>
2154      </table>
2155
2156     <para>
2157      Now that we've gone over all of the possible structures
2158      for base types, we can show some examples of real functions.
2159     </para>
2160    </sect2>
2161
2162    <sect2>
2163     <title>Version 1 Calling Conventions</title>
2164
2165     <para>
2166      The version-1 calling convention relies on macros to suppress most
2167      of the complexity of passing arguments and results.  The C declaration
2168      of a version-1 function is always:
2169 <programlisting>
2170 Datum funcname(PG_FUNCTION_ARGS)
2171 </programlisting>
2172      In addition, the macro call:
2173 <programlisting>
2174 PG_FUNCTION_INFO_V1(funcname);
2175 </programlisting>
2176      must appear in the same source file.  (Conventionally, it's
2177      written just before the function itself.)  This macro call is not
2178      needed for <literal>internal</literal>-language functions, since
2179      <productname>PostgreSQL</productname> assumes that all internal functions
2180      use the version-1 convention.  It is, however, required for
2181      dynamically-loaded functions.
2182     </para>
2183
2184     <para>
2185      In a version-1 function, each actual argument is fetched using a
2186      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2187      macro that corresponds to the argument's data type.  In non-strict
2188      functions there needs to be a previous check about argument null-ness
2189      using <function>PG_ARGNULL_<replaceable>xxx</replaceable>()</function>.
2190      The result is returned using a
2191      <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2192      macro for the return type.
2193      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2194      takes as its argument the number of the function argument to
2195      fetch, where the count starts at 0.
2196      <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2197      takes as its argument the actual value to return.
2198     </para>
2199
2200     <para>
2201      Here are some examples using the version-1 calling convention:
2202     </para>
2203
2204 <programlisting><![CDATA[
2205 #include "postgres.h"
2206 #include <string.h>
2207 #include "fmgr.h"
2208 #include "utils/geo_decls.h"
2209
2210 PG_MODULE_MAGIC;
2211
2212 /* by value */
2213
2214 PG_FUNCTION_INFO_V1(add_one);
2215
2216 Datum
2217 add_one(PG_FUNCTION_ARGS)
2218 {
2219     int32   arg = PG_GETARG_INT32(0);
2220
2221     PG_RETURN_INT32(arg + 1);
2222 }
2223
2224 /* by reference, fixed length */
2225
2226 PG_FUNCTION_INFO_V1(add_one_float8);
2227
2228 Datum
2229 add_one_float8(PG_FUNCTION_ARGS)
2230 {
2231     /* The macros for FLOAT8 hide its pass-by-reference nature. */
2232     float8   arg = PG_GETARG_FLOAT8(0);
2233
2234     PG_RETURN_FLOAT8(arg + 1.0);
2235 }
2236
2237 PG_FUNCTION_INFO_V1(makepoint);
2238
2239 Datum
2240 makepoint(PG_FUNCTION_ARGS)
2241 {
2242     /* Here, the pass-by-reference nature of Point is not hidden. */
2243     Point     *pointx = PG_GETARG_POINT_P(0);
2244     Point     *pointy = PG_GETARG_POINT_P(1);
2245     Point     *new_point = (Point *) palloc(sizeof(Point));
2246
2247     new_point->x = pointx->x;
2248     new_point->y = pointy->y;
2249
2250     PG_RETURN_POINT_P(new_point);
2251 }
2252
2253 /* by reference, variable length */
2254
2255 PG_FUNCTION_INFO_V1(copytext);
2256
2257 Datum
2258 copytext(PG_FUNCTION_ARGS)
2259 {
2260     text     *t = PG_GETARG_TEXT_PP(0);
2261
2262     /*
2263      * VARSIZE_ANY_EXHDR is the size of the struct in bytes, minus the
2264      * VARHDRSZ or VARHDRSZ_SHORT of its header.  Construct the copy with a
2265      * full-length header.
2266      */
2267     text     *new_t = (text *) palloc(VARSIZE_ANY_EXHDR(t) + VARHDRSZ);
2268     SET_VARSIZE(new_t, VARSIZE_ANY_EXHDR(t) + VARHDRSZ);
2269
2270     /*
2271      * VARDATA is a pointer to the data region of the new struct.  The source
2272      * could be a short datum, so retrieve its data through VARDATA_ANY.
2273      */
2274     memcpy((void *) VARDATA(new_t), /* destination */
2275            (void *) VARDATA_ANY(t), /* source */
2276            VARSIZE_ANY_EXHDR(t));   /* how many bytes */
2277     PG_RETURN_TEXT_P(new_t);
2278 }
2279
2280 PG_FUNCTION_INFO_V1(concat_text);
2281
2282 Datum
2283 concat_text(PG_FUNCTION_ARGS)
2284 {
2285     text  *arg1 = PG_GETARG_TEXT_PP(0);
2286     text  *arg2 = PG_GETARG_TEXT_PP(1);
2287     int32 arg1_size = VARSIZE_ANY_EXHDR(arg1);
2288     int32 arg2_size = VARSIZE_ANY_EXHDR(arg2);
2289     int32 new_text_size = arg1_size + arg2_size + VARHDRSZ;
2290     text *new_text = (text *) palloc(new_text_size);
2291
2292     SET_VARSIZE(new_text, new_text_size);
2293     memcpy(VARDATA(new_text), VARDATA_ANY(arg1), arg1_size);
2294     memcpy(VARDATA(new_text) + arg1_size, VARDATA_ANY(arg2), arg2_size);
2295     PG_RETURN_TEXT_P(new_text);
2296 }
2297 ]]>
2298 </programlisting>
2299
2300     <para>
2301      Supposing that the above code has been prepared in file
2302      <filename>funcs.c</filename> and compiled into a shared object,
2303      we could define the functions to <productname>PostgreSQL</productname>
2304      with commands like this:
2305     </para>
2306
2307 <programlisting>
2308 CREATE FUNCTION add_one(integer) RETURNS integer
2309      AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one'
2310      LANGUAGE C STRICT;
2311
2312 -- note overloading of SQL function name "add_one"
2313 CREATE FUNCTION add_one(double precision) RETURNS double precision
2314      AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one_float8'
2315      LANGUAGE C STRICT;
2316
2317 CREATE FUNCTION makepoint(point, point) RETURNS point
2318      AS '<replaceable>DIRECTORY</replaceable>/funcs', 'makepoint'
2319      LANGUAGE C STRICT;
2320
2321 CREATE FUNCTION copytext(text) RETURNS text
2322      AS '<replaceable>DIRECTORY</replaceable>/funcs', 'copytext'
2323      LANGUAGE C STRICT;
2324
2325 CREATE FUNCTION concat_text(text, text) RETURNS text
2326      AS '<replaceable>DIRECTORY</replaceable>/funcs', 'concat_text'
2327      LANGUAGE C STRICT;
2328 </programlisting>
2329
2330     <para>
2331      Here, <replaceable>DIRECTORY</replaceable> stands for the
2332      directory of the shared library file (for instance the
2333      <productname>PostgreSQL</productname> tutorial directory, which
2334      contains the code for the examples used in this section).
2335      (Better style would be to use just <literal>'funcs'</literal> in the
2336      <literal>AS</literal> clause, after having added
2337      <replaceable>DIRECTORY</replaceable> to the search path.  In any
2338      case, we can omit the system-specific extension for a shared
2339      library, commonly <literal>.so</literal>.)
2340     </para>
2341
2342     <para>
2343      Notice that we have specified the functions as <quote>strict</quote>,
2344      meaning that
2345      the system should automatically assume a null result if any input
2346      value is null.  By doing this, we avoid having to check for null inputs
2347      in the function code.  Without this, we'd have to check for null values
2348      explicitly, using <function>PG_ARGISNULL()</function>.
2349     </para>
2350
2351     <para>
2352      At first glance, the version-1 coding conventions might appear to be just
2353      pointless obscurantism, over using plain <literal>C</literal> calling
2354      conventions.  They do however allow to deal with <literal>NULL</literal>able
2355      arguments/return values, and <quote>toasted</quote> (compressed or
2356      out-of-line) values.
2357     </para>
2358
2359     <para>
2360      The macro <function>PG_ARGISNULL(<replaceable>n</replaceable>)</function>
2361      allows a function to test whether each input is null.  (Of course, doing
2362      this is only necessary in functions not declared <quote>strict</quote>.)
2363      As with the
2364      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> macros,
2365      the input arguments are counted beginning at zero.  Note that one
2366      should refrain from executing
2367      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> until
2368      one has verified that the argument isn't null.
2369      To return a null result, execute <function>PG_RETURN_NULL()</function>;
2370      this works in both strict and nonstrict functions.
2371     </para>
2372
2373     <para>
2374      Other options provided by the version-1 interface are two
2375      variants of the
2376      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2377      macros. The first of these,
2378      <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>,
2379      guarantees to return a copy of the specified argument that is
2380      safe for writing into. (The normal macros will sometimes return a
2381      pointer to a value that is physically stored in a table, which
2382      must not be written to. Using the
2383      <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>
2384      macros guarantees a writable result.)
2385     The second variant consists of the
2386     <function>PG_GETARG_<replaceable>xxx</replaceable>_SLICE()</function>
2387     macros which take three arguments. The first is the number of the
2388     function argument (as above). The second and third are the offset and
2389     length of the segment to be returned. Offsets are counted from
2390     zero, and a negative length requests that the remainder of the
2391     value be returned. These macros provide more efficient access to
2392     parts of large values in the case where they have storage type
2393     <quote>external</quote>. (The storage type of a column can be specified using
2394     <literal>ALTER TABLE <replaceable>tablename</replaceable> ALTER
2395     COLUMN <replaceable>colname</replaceable> SET STORAGE
2396     <replaceable>storagetype</replaceable></literal>. <replaceable>storagetype</replaceable> is one of
2397     <literal>plain</literal>, <literal>external</literal>, <literal>extended</literal>,
2398      or <literal>main</literal>.)
2399     </para>
2400
2401     <para>
2402      Finally, the version-1 function call conventions make it possible
2403      to return set results (<xref linkend="xfunc-c-return-set">) and
2404      implement trigger functions (<xref linkend="triggers">) and
2405      procedural-language call handlers (<xref
2406      linkend="plhandler">).  For more details
2407      see <filename>src/backend/utils/fmgr/README</filename> in the
2408      source distribution.
2409     </para>
2410    </sect2>
2411
2412    <sect2>
2413     <title>Writing Code</title>
2414
2415     <para>
2416      Before we turn to the more advanced topics, we should discuss
2417      some coding rules for <productname>PostgreSQL</productname>
2418      C-language functions.  While it might be possible to load functions
2419      written in languages other than C into
2420      <productname>PostgreSQL</productname>, this is usually difficult
2421      (when it is possible at all) because other languages, such as
2422      C++, FORTRAN, or Pascal often do not follow the same calling
2423      convention as C.  That is, other languages do not pass argument
2424      and return values between functions in the same way.  For this
2425      reason, we will assume that your C-language functions are
2426      actually written in C.
2427     </para>
2428
2429     <para>
2430      The basic rules for writing and building C functions are as follows:
2431
2432      <itemizedlist>
2433       <listitem>
2434        <para>
2435         Use <literal>pg_config
2436         --includedir-server</literal><indexterm><primary>pg_config</primary><secondary>with user-defined C functions</secondary></indexterm>
2437         to find out where the <productname>PostgreSQL</productname> server header
2438         files are installed on your system (or the system that your
2439         users will be running on).
2440        </para>
2441       </listitem>
2442
2443       <listitem>
2444        <para>
2445         Compiling and linking your code so that it can be dynamically
2446         loaded into <productname>PostgreSQL</productname> always
2447         requires special flags.  See <xref linkend="dfunc"> for a
2448         detailed explanation of how to do it for your particular
2449         operating system.
2450        </para>
2451       </listitem>
2452
2453       <listitem>
2454        <para>
2455         Remember to define a <quote>magic block</quote> for your shared library,
2456         as described in <xref linkend="xfunc-c-dynload">.
2457        </para>
2458       </listitem>
2459
2460       <listitem>
2461        <para>
2462         When allocating memory, use the
2463         <productname>PostgreSQL</productname> functions
2464         <function>palloc</function><indexterm><primary>palloc</primary></indexterm> and <function>pfree</function><indexterm><primary>pfree</primary></indexterm>
2465         instead of the corresponding C library functions
2466         <function>malloc</function> and <function>free</function>.
2467         The memory allocated by <function>palloc</function> will be
2468         freed automatically at the end of each transaction, preventing
2469         memory leaks.
2470        </para>
2471       </listitem>
2472
2473       <listitem>
2474        <para>
2475         Always zero the bytes of your structures using <function>memset</function>
2476         (or allocate them with <function>palloc0</function> in the first place).
2477         Even if you assign to each field of your structure, there might be
2478         alignment padding (holes in the structure) that contain
2479         garbage values.  Without this, it's difficult to
2480         support hash indexes or hash joins, as you must pick out only
2481         the significant bits of your data structure to compute a hash.
2482         The planner also sometimes relies on comparing constants via
2483         bitwise equality, so you can get undesirable planning results if
2484         logically-equivalent values aren't bitwise equal.
2485        </para>
2486       </listitem>
2487
2488       <listitem>
2489        <para>
2490         Most of the internal <productname>PostgreSQL</productname>
2491         types are declared in <filename>postgres.h</filename>, while
2492         the function manager interfaces
2493         (<symbol>PG_FUNCTION_ARGS</symbol>, etc.)  are in
2494         <filename>fmgr.h</filename>, so you will need to include at
2495         least these two files.  For portability reasons it's best to
2496         include <filename>postgres.h</filename> <emphasis>first</emphasis>,
2497         before any other system or user header files.  Including
2498         <filename>postgres.h</filename> will also include
2499         <filename>elog.h</filename> and <filename>palloc.h</filename>
2500         for you.
2501        </para>
2502       </listitem>
2503
2504       <listitem>
2505        <para>
2506         Symbol names defined within object files must not conflict
2507         with each other or with symbols defined in the
2508         <productname>PostgreSQL</productname> server executable.  You
2509         will have to rename your functions or variables if you get
2510         error messages to this effect.
2511        </para>
2512       </listitem>
2513      </itemizedlist>
2514     </para>
2515    </sect2>
2516
2517 &dfunc;
2518
2519    <sect2>
2520     <title>Composite-type Arguments</title>
2521
2522     <para>
2523      Composite types do not have a fixed layout like C structures.
2524      Instances of a composite type can contain null fields.  In
2525      addition, composite types that are part of an inheritance
2526      hierarchy can have different fields than other members of the
2527      same inheritance hierarchy.  Therefore,
2528      <productname>PostgreSQL</productname> provides a function
2529      interface for accessing fields of composite types from C.
2530     </para>
2531
2532     <para>
2533      Suppose we want to write a function to answer the query:
2534
2535 <programlisting>
2536 SELECT name, c_overpaid(emp, 1500) AS overpaid
2537     FROM emp
2538     WHERE name = 'Bill' OR name = 'Sam';
2539 </programlisting>
2540
2541      Using the version-1 calling conventions, we can define
2542      <function>c_overpaid</function> as:
2543
2544 <programlisting><![CDATA[
2545 #include "postgres.h"
2546 #include "executor/executor.h"  /* for GetAttributeByName() */
2547
2548 PG_MODULE_MAGIC;
2549
2550 PG_FUNCTION_INFO_V1(c_overpaid);
2551
2552 Datum
2553 c_overpaid(PG_FUNCTION_ARGS)
2554 {
2555     HeapTupleHeader  t = PG_GETARG_HEAPTUPLEHEADER(0);
2556     int32            limit = PG_GETARG_INT32(1);
2557     bool isnull;
2558     Datum salary;
2559
2560     salary = GetAttributeByName(t, "salary", &isnull);
2561     if (isnull)
2562         PG_RETURN_BOOL(false);
2563     /* Alternatively, we might prefer to do PG_RETURN_NULL() for null salary. */
2564
2565     PG_RETURN_BOOL(DatumGetInt32(salary) > limit);
2566 }
2567 ]]>
2568 </programlisting>
2569     </para>
2570
2571     <para>
2572      <function>GetAttributeByName</function> is the
2573      <productname>PostgreSQL</productname> system function that
2574      returns attributes out of the specified row.  It has
2575      three arguments: the argument of type <type>HeapTupleHeader</type> passed
2576      into
2577      the  function, the name of the desired attribute, and a
2578      return parameter that tells whether  the  attribute
2579      is  null.   <function>GetAttributeByName</function> returns a <type>Datum</type>
2580      value that you can convert to the proper data type by using the
2581      appropriate <function>DatumGet<replaceable>XXX</replaceable>()</function>
2582      macro.  Note that the return value is meaningless if the null flag is
2583      set; always check the null flag before trying to do anything with the
2584      result.
2585     </para>
2586
2587     <para>
2588      There is also <function>GetAttributeByNum</function>, which selects
2589      the target attribute by column number instead of name.
2590     </para>
2591
2592     <para>
2593      The following command declares the function
2594      <function>c_overpaid</function> in SQL:
2595
2596 <programlisting>
2597 CREATE FUNCTION c_overpaid(emp, integer) RETURNS boolean
2598     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'c_overpaid'
2599     LANGUAGE C STRICT;
2600 </programlisting>
2601
2602      Notice we have used <literal>STRICT</literal> so that we did not have to
2603      check whether the input arguments were NULL.
2604     </para>
2605    </sect2>
2606
2607    <sect2>
2608     <title>Returning Rows (Composite Types)</title>
2609
2610     <para>
2611      To return a row or composite-type value from a C-language
2612      function, you can use a special API that provides macros and
2613      functions to hide most of the complexity of building composite
2614      data types.  To use this API, the source file must include:
2615 <programlisting>
2616 #include "funcapi.h"
2617 </programlisting>
2618     </para>
2619
2620     <para>
2621      There are two ways you can build a composite data value (henceforth
2622      a <quote>tuple</quote>): you can build it from an array of Datum values,
2623      or from an array of C strings that can be passed to the input
2624      conversion functions of the tuple's column data types.  In either
2625      case, you first need to obtain or construct a <structname>TupleDesc</structname>
2626      descriptor for the tuple structure.  When working with Datums, you
2627      pass the <structname>TupleDesc</structname> to <function>BlessTupleDesc</function>,
2628      and then call <function>heap_form_tuple</function> for each row.  When working
2629      with C strings, you pass the <structname>TupleDesc</structname> to
2630      <function>TupleDescGetAttInMetadata</function>, and then call
2631      <function>BuildTupleFromCStrings</function> for each row.  In the case of a
2632      function returning a set of tuples, the setup steps can all be done
2633      once during the first call of the function.
2634     </para>
2635
2636     <para>
2637      Several helper functions are available for setting up the needed
2638      <structname>TupleDesc</structname>.  The recommended way to do this in most
2639      functions returning composite values is to call:
2640 <programlisting>
2641 TypeFuncClass get_call_result_type(FunctionCallInfo fcinfo,
2642                                    Oid *resultTypeId,
2643                                    TupleDesc *resultTupleDesc)
2644 </programlisting>
2645      passing the same <literal>fcinfo</literal> struct passed to the calling function
2646      itself.  (This of course requires that you use the version-1
2647      calling conventions.)  <varname>resultTypeId</varname> can be specified
2648      as <literal>NULL</literal> or as the address of a local variable to receive the
2649      function's result type OID.  <varname>resultTupleDesc</varname> should be the
2650      address of a local <structname>TupleDesc</structname> variable.  Check that the
2651      result is <literal>TYPEFUNC_COMPOSITE</literal>; if so,
2652      <varname>resultTupleDesc</varname> has been filled with the needed
2653      <structname>TupleDesc</structname>.  (If it is not, you can report an error along
2654      the lines of <quote>function returning record called in context that
2655      cannot accept type record</quote>.)
2656     </para>
2657
2658     <tip>
2659      <para>
2660       <function>get_call_result_type</function> can resolve the actual type of a
2661       polymorphic function result; so it is useful in functions that return
2662       scalar polymorphic results, not only functions that return composites.
2663       The <varname>resultTypeId</varname> output is primarily useful for functions
2664       returning polymorphic scalars.
2665      </para>
2666     </tip>
2667
2668     <note>
2669      <para>
2670       <function>get_call_result_type</function> has a sibling
2671       <function>get_expr_result_type</function>, which can be used to resolve the
2672       expected output type for a function call represented by an expression
2673       tree.  This can be used when trying to determine the result type from
2674       outside the function itself.  There is also
2675       <function>get_func_result_type</function>, which can be used when only the
2676       function's OID is available.  However these functions are not able
2677       to deal with functions declared to return <structname>record</structname>, and
2678       <function>get_func_result_type</function> cannot resolve polymorphic types,
2679       so you should preferentially use <function>get_call_result_type</function>.
2680      </para>
2681     </note>
2682
2683     <para>
2684      Older, now-deprecated functions for obtaining
2685      <structname>TupleDesc</structname>s are:
2686 <programlisting>
2687 TupleDesc RelationNameGetTupleDesc(const char *relname)
2688 </programlisting>
2689      to get a <structname>TupleDesc</structname> for the row type of a named relation,
2690      and:
2691 <programlisting>
2692 TupleDesc TypeGetTupleDesc(Oid typeoid, List *colaliases)
2693 </programlisting>
2694      to get a <structname>TupleDesc</structname> based on a type OID. This can
2695      be used to get a <structname>TupleDesc</structname> for a base or
2696      composite type.  It will not work for a function that returns
2697      <structname>record</structname>, however, and it cannot resolve polymorphic
2698      types.
2699     </para>
2700
2701     <para>
2702      Once you have a <structname>TupleDesc</structname>, call:
2703 <programlisting>
2704 TupleDesc BlessTupleDesc(TupleDesc tupdesc)
2705 </programlisting>
2706      if you plan to work with Datums, or:
2707 <programlisting>
2708 AttInMetadata *TupleDescGetAttInMetadata(TupleDesc tupdesc)
2709 </programlisting>
2710      if you plan to work with C strings.  If you are writing a function
2711      returning set, you can save the results of these functions in the
2712      <structname>FuncCallContext</structname> structure &mdash; use the
2713      <structfield>tuple_desc</structfield> or <structfield>attinmeta</structfield> field
2714      respectively.
2715     </para>
2716
2717     <para>
2718      When working with Datums, use:
2719 <programlisting>
2720 HeapTuple heap_form_tuple(TupleDesc tupdesc, Datum *values, bool *isnull)
2721 </programlisting>
2722      to build a <structname>HeapTuple</structname> given user data in Datum form.
2723     </para>
2724
2725     <para>
2726      When working with C strings, use:
2727 <programlisting>
2728 HeapTuple BuildTupleFromCStrings(AttInMetadata *attinmeta, char **values)
2729 </programlisting>
2730      to build a <structname>HeapTuple</structname> given user data
2731      in C string form.  <parameter>values</parameter> is an array of C strings,
2732      one for each attribute of the return row. Each C string should be in
2733      the form expected by the input function of the attribute data
2734      type. In order to return a null value for one of the attributes,
2735      the corresponding pointer in the <parameter>values</parameter> array
2736      should be set to <symbol>NULL</symbol>.  This function will need to
2737      be called again for each row you return.
2738     </para>
2739
2740     <para>
2741      Once you have built a tuple to return from your function, it
2742      must be converted into a <type>Datum</type>. Use:
2743 <programlisting>
2744 HeapTupleGetDatum(HeapTuple tuple)
2745 </programlisting>
2746      to convert a <structname>HeapTuple</structname> into a valid Datum.  This
2747      <type>Datum</type> can be returned directly if you intend to return
2748      just a single row, or it can be used as the current return value
2749      in a set-returning function.
2750     </para>
2751
2752     <para>
2753      An example appears in the next section.
2754     </para>
2755
2756    </sect2>
2757
2758    <sect2 id="xfunc-c-return-set">
2759     <title>Returning Sets</title>
2760
2761     <para>
2762      There is also a special API that provides support for returning
2763      sets (multiple rows) from a C-language function.  A set-returning
2764      function must follow the version-1 calling conventions.  Also,
2765      source files must include <filename>funcapi.h</filename>, as
2766      above.
2767     </para>
2768
2769     <para>
2770      A set-returning function (<acronym>SRF</acronym>) is called
2771      once for each item it returns.  The <acronym>SRF</acronym> must
2772      therefore save enough state to remember what it was doing and
2773      return the next item on each call.
2774      The structure <structname>FuncCallContext</structname> is provided to help
2775      control this process.  Within a function, <literal>fcinfo-&gt;flinfo-&gt;fn_extra</literal>
2776      is used to hold a pointer to <structname>FuncCallContext</structname>
2777      across calls.
2778 <programlisting>
2779 typedef struct FuncCallContext
2780 {
2781     /*
2782      * Number of times we've been called before
2783      *
2784      * call_cntr is initialized to 0 for you by SRF_FIRSTCALL_INIT(), and
2785      * incremented for you every time SRF_RETURN_NEXT() is called.
2786      */
2787     uint64 call_cntr;
2788
2789     /*
2790      * OPTIONAL maximum number of calls
2791      *
2792      * max_calls is here for convenience only and setting it is optional.
2793      * If not set, you must provide alternative means to know when the
2794      * function is done.
2795      */
2796     uint64 max_calls;
2797
2798     /*
2799      * OPTIONAL pointer to result slot
2800      *
2801      * This is obsolete and only present for backward compatibility, viz,
2802      * user-defined SRFs that use the deprecated TupleDescGetSlot().
2803      */
2804     TupleTableSlot *slot;
2805
2806     /*
2807      * OPTIONAL pointer to miscellaneous user-provided context information
2808      *
2809      * user_fctx is for use as a pointer to your own data to retain
2810      * arbitrary context information between calls of your function.
2811      */
2812     void *user_fctx;
2813
2814     /*
2815      * OPTIONAL pointer to struct containing attribute type input metadata
2816      *
2817      * attinmeta is for use when returning tuples (i.e., composite data types)
2818      * and is not used when returning base data types. It is only needed
2819      * if you intend to use BuildTupleFromCStrings() to create the return
2820      * tuple.
2821      */
2822     AttInMetadata *attinmeta;
2823
2824     /*
2825      * memory context used for structures that must live for multiple calls
2826      *
2827      * multi_call_memory_ctx is set by SRF_FIRSTCALL_INIT() for you, and used
2828      * by SRF_RETURN_DONE() for cleanup. It is the most appropriate memory
2829      * context for any memory that is to be reused across multiple calls
2830      * of the SRF.
2831      */
2832     MemoryContext multi_call_memory_ctx;
2833
2834     /*
2835      * OPTIONAL pointer to struct containing tuple description
2836      *
2837      * tuple_desc is for use when returning tuples (i.e., composite data types)
2838      * and is only needed if you are going to build the tuples with
2839      * heap_form_tuple() rather than with BuildTupleFromCStrings().  Note that
2840      * the TupleDesc pointer stored here should usually have been run through
2841      * BlessTupleDesc() first.
2842      */
2843     TupleDesc tuple_desc;
2844
2845 } FuncCallContext;
2846 </programlisting>
2847     </para>
2848
2849     <para>
2850      An <acronym>SRF</acronym> uses several functions and macros that
2851      automatically manipulate the <structname>FuncCallContext</structname>
2852      structure (and expect to find it via <literal>fn_extra</literal>).  Use:
2853 <programlisting>
2854 SRF_IS_FIRSTCALL()
2855 </programlisting>
2856      to determine if your function is being called for the first or a
2857      subsequent time. On the first call (only) use:
2858 <programlisting>
2859 SRF_FIRSTCALL_INIT()
2860 </programlisting>
2861      to initialize the <structname>FuncCallContext</structname>. On every function call,
2862      including the first, use:
2863 <programlisting>
2864 SRF_PERCALL_SETUP()
2865 </programlisting>
2866      to properly set up for using the <structname>FuncCallContext</structname>
2867      and clearing any previously returned data left over from the
2868      previous pass.
2869     </para>
2870
2871     <para>
2872      If your function has data to return, use:
2873 <programlisting>
2874 SRF_RETURN_NEXT(funcctx, result)
2875 </programlisting>
2876      to return it to the caller.  (<literal>result</literal> must be of type
2877      <type>Datum</type>, either a single value or a tuple prepared as
2878      described above.)  Finally, when your function is finished
2879      returning data, use:
2880 <programlisting>
2881 SRF_RETURN_DONE(funcctx)
2882 </programlisting>
2883      to clean up and end the <acronym>SRF</acronym>.
2884     </para>
2885
2886     <para>
2887      The memory context that is current when the <acronym>SRF</acronym> is called is
2888      a transient context that will be cleared between calls.  This means
2889      that you do not need to call <function>pfree</function> on everything
2890      you allocated using <function>palloc</function>; it will go away anyway.  However, if you want to allocate
2891      any data structures to live across calls, you need to put them somewhere
2892      else.  The memory context referenced by
2893      <structfield>multi_call_memory_ctx</structfield> is a suitable location for any
2894      data that needs to survive until the <acronym>SRF</acronym> is finished running.  In most
2895      cases, this means that you should switch into
2896      <structfield>multi_call_memory_ctx</structfield> while doing the first-call setup.
2897     </para>
2898
2899     <warning>
2900      <para>
2901       While the actual arguments to the function remain unchanged between
2902       calls, if you detoast the argument values (which is normally done
2903       transparently by the
2904       <function>PG_GETARG_<replaceable>xxx</replaceable></function> macro)
2905       in the transient context then the detoasted copies will be freed on
2906       each cycle. Accordingly, if you keep references to such values in
2907       your <structfield>user_fctx</structfield>, you must either copy them into the
2908       <structfield>multi_call_memory_ctx</structfield> after detoasting, or ensure
2909       that you detoast the values only in that context.
2910      </para>
2911     </warning>
2912
2913     <para>
2914      A complete pseudo-code example looks like the following:
2915 <programlisting>
2916 Datum
2917 my_set_returning_function(PG_FUNCTION_ARGS)
2918 {
2919     FuncCallContext  *funcctx;
2920     Datum             result;
2921     <replaceable>further declarations as needed</replaceable>
2922
2923     if (SRF_IS_FIRSTCALL())
2924     {
2925         MemoryContext oldcontext;
2926
2927         funcctx = SRF_FIRSTCALL_INIT();
2928         oldcontext = MemoryContextSwitchTo(funcctx-&gt;multi_call_memory_ctx);
2929         /* One-time setup code appears here: */
2930         <replaceable>user code</replaceable>
2931         <replaceable>if returning composite</replaceable>
2932             <replaceable>build TupleDesc, and perhaps AttInMetadata</replaceable>
2933         <replaceable>endif returning composite</replaceable>
2934         <replaceable>user code</replaceable>
2935         MemoryContextSwitchTo(oldcontext);
2936     }
2937
2938     /* Each-time setup code appears here: */
2939     <replaceable>user code</replaceable>
2940     funcctx = SRF_PERCALL_SETUP();
2941     <replaceable>user code</replaceable>
2942
2943     /* this is just one way we might test whether we are done: */
2944     if (funcctx-&gt;call_cntr &lt; funcctx-&gt;max_calls)
2945     {
2946         /* Here we want to return another item: */
2947         <replaceable>user code</replaceable>
2948         <replaceable>obtain result Datum</replaceable>
2949         SRF_RETURN_NEXT(funcctx, result);
2950     }
2951     else
2952     {
2953         /* Here we are done returning items and just need to clean up: */
2954         <replaceable>user code</replaceable>
2955         SRF_RETURN_DONE(funcctx);
2956     }
2957 }
2958 </programlisting>
2959     </para>
2960
2961     <para>
2962      A complete example of a simple <acronym>SRF</acronym> returning a composite type
2963      looks like:
2964 <programlisting><![CDATA[
2965 PG_FUNCTION_INFO_V1(retcomposite);
2966
2967 Datum
2968 retcomposite(PG_FUNCTION_ARGS)
2969 {
2970     FuncCallContext     *funcctx;
2971     int                  call_cntr;
2972     int                  max_calls;
2973     TupleDesc            tupdesc;
2974     AttInMetadata       *attinmeta;
2975
2976     /* stuff done only on the first call of the function */
2977     if (SRF_IS_FIRSTCALL())
2978     {
2979         MemoryContext   oldcontext;
2980
2981         /* create a function context for cross-call persistence */
2982         funcctx = SRF_FIRSTCALL_INIT();
2983
2984         /* switch to memory context appropriate for multiple function calls */
2985         oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
2986
2987         /* total number of tuples to be returned */
2988         funcctx->max_calls = PG_GETARG_UINT32(0);
2989
2990         /* Build a tuple descriptor for our result type */
2991         if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
2992             ereport(ERROR,
2993                     (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
2994                      errmsg("function returning record called in context "
2995                             "that cannot accept type record")));
2996
2997         /*
2998          * generate attribute metadata needed later to produce tuples from raw
2999          * C strings
3000          */
3001         attinmeta = TupleDescGetAttInMetadata(tupdesc);
3002         funcctx->attinmeta = attinmeta;
3003
3004         MemoryContextSwitchTo(oldcontext);
3005     }
3006
3007     /* stuff done on every call of the function */
3008     funcctx = SRF_PERCALL_SETUP();
3009
3010     call_cntr = funcctx->call_cntr;
3011     max_calls = funcctx->max_calls;
3012     attinmeta = funcctx->attinmeta;
3013
3014     if (call_cntr < max_calls)    /* do when there is more left to send */
3015     {
3016         char       **values;
3017         HeapTuple    tuple;
3018         Datum        result;
3019
3020         /*
3021          * Prepare a values array for building the returned tuple.
3022          * This should be an array of C strings which will
3023          * be processed later by the type input functions.
3024          */
3025         values = (char **) palloc(3 * sizeof(char *));
3026         values[0] = (char *) palloc(16 * sizeof(char));
3027         values[1] = (char *) palloc(16 * sizeof(char));
3028         values[2] = (char *) palloc(16 * sizeof(char));
3029
3030         snprintf(values[0], 16, "%d", 1 * PG_GETARG_INT32(1));
3031         snprintf(values[1], 16, "%d", 2 * PG_GETARG_INT32(1));
3032         snprintf(values[2], 16, "%d", 3 * PG_GETARG_INT32(1));
3033
3034         /* build a tuple */
3035         tuple = BuildTupleFromCStrings(attinmeta, values);
3036
3037         /* make the tuple into a datum */
3038         result = HeapTupleGetDatum(tuple);
3039
3040         /* clean up (this is not really necessary) */
3041         pfree(values[0]);
3042         pfree(values[1]);
3043         pfree(values[2]);
3044         pfree(values);
3045
3046         SRF_RETURN_NEXT(funcctx, result);
3047     }
3048     else    /* do when there is no more left */
3049     {
3050         SRF_RETURN_DONE(funcctx);
3051     }
3052 }
3053 ]]>
3054 </programlisting>
3055
3056      One way to declare this function in SQL is:
3057 <programlisting>
3058 CREATE TYPE __retcomposite AS (f1 integer, f2 integer, f3 integer);
3059
3060 CREATE OR REPLACE FUNCTION retcomposite(integer, integer)
3061     RETURNS SETOF __retcomposite
3062     AS '<replaceable>filename</replaceable>', 'retcomposite'
3063     LANGUAGE C IMMUTABLE STRICT;
3064 </programlisting>
3065      A different way is to use OUT parameters:
3066 <programlisting>
3067 CREATE OR REPLACE FUNCTION retcomposite(IN integer, IN integer,
3068     OUT f1 integer, OUT f2 integer, OUT f3 integer)
3069     RETURNS SETOF record
3070     AS '<replaceable>filename</replaceable>', 'retcomposite'
3071     LANGUAGE C IMMUTABLE STRICT;
3072 </programlisting>
3073      Notice that in this method the output type of the function is formally
3074      an anonymous <structname>record</structname> type.
3075     </para>
3076
3077     <para>
3078      The directory <link linkend="tablefunc"><filename>contrib/tablefunc</filename></link>
3079      module in the source distribution contains more examples of
3080      set-returning functions.
3081     </para>
3082    </sect2>
3083
3084    <sect2>
3085     <title>Polymorphic Arguments and Return Types</title>
3086
3087     <para>
3088      C-language functions can be declared to accept and
3089      return the polymorphic types
3090      <type>anyelement</type>, <type>anyarray</type>, <type>anynonarray</type>,
3091      <type>anyenum</type>, and <type>anyrange</type>.
3092      See <xref linkend="extend-types-polymorphic"> for a more detailed explanation
3093      of polymorphic functions. When function arguments or return types
3094      are defined as polymorphic types, the function author cannot know
3095      in advance what data type it will be called with, or
3096      need to return. There are two routines provided in <filename>fmgr.h</filename>
3097      to allow a version-1 C function to discover the actual data types
3098      of its arguments and the type it is expected to return. The routines are
3099      called <literal>get_fn_expr_rettype(FmgrInfo *flinfo)</literal> and
3100      <literal>get_fn_expr_argtype(FmgrInfo *flinfo, int argnum)</literal>.
3101      They return the result or argument type OID, or <symbol>InvalidOid</symbol> if the
3102      information is not available.
3103      The structure <literal>flinfo</literal> is normally accessed as
3104      <literal>fcinfo-&gt;flinfo</literal>. The parameter <literal>argnum</literal>
3105      is zero based.  <function>get_call_result_type</function> can also be used
3106      as an alternative to <function>get_fn_expr_rettype</function>.
3107      There is also <function>get_fn_expr_variadic</function>, which can be used to
3108      find out whether variadic arguments have been merged into an array.
3109      This is primarily useful for <literal>VARIADIC "any"</literal> functions,
3110      since such merging will always have occurred for variadic functions
3111      taking ordinary array types.
3112     </para>
3113
3114     <para>
3115      For example, suppose we want to write a function to accept a single
3116      element of any type, and return a one-dimensional array of that type:
3117
3118 <programlisting>
3119 PG_FUNCTION_INFO_V1(make_array);
3120 Datum
3121 make_array(PG_FUNCTION_ARGS)
3122 {
3123     ArrayType  *result;
3124     Oid         element_type = get_fn_expr_argtype(fcinfo-&gt;flinfo, 0);
3125     Datum       element;
3126     bool        isnull;
3127     int16       typlen;
3128     bool        typbyval;
3129     char        typalign;
3130     int         ndims;
3131     int         dims[MAXDIM];
3132     int         lbs[MAXDIM];
3133
3134     if (!OidIsValid(element_type))
3135         elog(ERROR, "could not determine data type of input");
3136
3137     /* get the provided element, being careful in case it's NULL */
3138     isnull = PG_ARGISNULL(0);
3139     if (isnull)
3140         element = (Datum) 0;
3141     else
3142         element = PG_GETARG_DATUM(0);
3143
3144     /* we have one dimension */
3145     ndims = 1;
3146     /* and one element */
3147     dims[0] = 1;
3148     /* and lower bound is 1 */
3149     lbs[0] = 1;
3150
3151     /* get required info about the element type */
3152     get_typlenbyvalalign(element_type, &amp;typlen, &amp;typbyval, &amp;typalign);
3153
3154     /* now build the array */
3155     result = construct_md_array(&amp;element, &amp;isnull, ndims, dims, lbs,
3156                                 element_type, typlen, typbyval, typalign);
3157
3158     PG_RETURN_ARRAYTYPE_P(result);
3159 }
3160 </programlisting>
3161     </para>
3162
3163     <para>
3164      The following command declares the function
3165      <function>make_array</function> in SQL:
3166
3167 <programlisting>
3168 CREATE FUNCTION make_array(anyelement) RETURNS anyarray
3169     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'make_array'
3170     LANGUAGE C IMMUTABLE;
3171 </programlisting>
3172     </para>
3173
3174     <para>
3175      There is a variant of polymorphism that is only available to C-language
3176      functions: they can be declared to take parameters of type
3177      <literal>"any"</literal>.  (Note that this type name must be double-quoted,
3178      since it's also a SQL reserved word.)  This works like
3179      <type>anyelement</type> except that it does not constrain different
3180      <literal>"any"</literal> arguments to be the same type, nor do they help
3181      determine the function's result type.  A C-language function can also
3182      declare its final parameter to be <literal>VARIADIC "any"</literal>.  This will
3183      match one or more actual arguments of any type (not necessarily the same
3184      type).  These arguments will <emphasis>not</emphasis> be gathered into an array
3185      as happens with normal variadic functions; they will just be passed to
3186      the function separately.  The <function>PG_NARGS()</function> macro and the
3187      methods described above must be used to determine the number of actual
3188      arguments and their types when using this feature.  Also, users of such
3189      a function might wish to use the <literal>VARIADIC</literal> keyword in their
3190      function call, with the expectation that the function would treat the
3191      array elements as separate arguments.  The function itself must implement
3192      that behavior if wanted, after using <function>get_fn_expr_variadic</function> to
3193      detect that the actual argument was marked with <literal>VARIADIC</literal>.
3194     </para>
3195    </sect2>
3196
3197    <sect2 id="xfunc-transform-functions">
3198     <title>Transform Functions</title>
3199
3200     <para>
3201      Some function calls can be simplified during planning based on
3202      properties specific to the function.  For example,
3203      <literal>int4mul(n, 1)</literal> could be simplified to just <literal>n</literal>.
3204      To define such function-specific optimizations, write a
3205      <firstterm>transform function</firstterm> and place its OID in the
3206      <structfield>protransform</structfield> field of the primary function's
3207      <structname>pg_proc</structname> entry.  The transform function must have the SQL
3208      signature <literal>protransform(internal) RETURNS internal</literal>.  The
3209      argument, actually <type>FuncExpr *</type>, is a dummy node representing a
3210      call to the primary function.  If the transform function's study of the
3211      expression tree proves that a simplified expression tree can substitute
3212      for all possible concrete calls represented thereby, build and return
3213      that simplified expression.  Otherwise, return a <literal>NULL</literal>
3214      pointer (<emphasis>not</emphasis> a SQL null).
3215     </para>
3216
3217     <para>
3218      We make no guarantee that <productname>PostgreSQL</productname> will never call the
3219      primary function in cases that the transform function could simplify.
3220      Ensure rigorous equivalence between the simplified expression and an
3221      actual call to the primary function.
3222     </para>
3223
3224     <para>
3225      Currently, this facility is not exposed to users at the SQL level
3226      because of security concerns, so it is only practical to use for
3227      optimizing built-in functions.
3228     </para>
3229    </sect2>
3230
3231    <sect2>
3232     <title>Shared Memory and LWLocks</title>
3233
3234     <para>
3235      Add-ins can reserve LWLocks and an allocation of shared memory on server
3236      startup.  The add-in's shared library must be preloaded by specifying
3237      it in
3238      <xref linkend="guc-shared-preload-libraries"><indexterm><primary>shared_preload_libraries</primary></indexterm>.
3239      Shared memory is reserved by calling:
3240 <programlisting>
3241 void RequestAddinShmemSpace(int size)
3242 </programlisting>
3243      from your <function>_PG_init</function> function.
3244     </para>
3245     <para>
3246      LWLocks are reserved by calling:
3247 <programlisting>
3248 void RequestNamedLWLockTranche(const char *tranche_name, int num_lwlocks)
3249 </programlisting>
3250      from <function>_PG_init</function>.  This will ensure that an array of
3251      <literal>num_lwlocks</literal> LWLocks is available under the name
3252      <literal>tranche_name</literal>.  Use <function>GetNamedLWLockTranche</function>
3253      to get a pointer to this array.
3254     </para>
3255     <para>
3256      To avoid possible race-conditions, each backend should use the LWLock
3257      <function>AddinShmemInitLock</function> when connecting to and initializing
3258      its allocation of shared memory, as shown here:
3259 <programlisting>
3260 static mystruct *ptr = NULL;
3261
3262 if (!ptr)
3263 {
3264         bool    found;
3265
3266         LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
3267         ptr = ShmemInitStruct("my struct name", size, &amp;found);
3268         if (!found)
3269         {
3270                 initialize contents of shmem area;
3271                 acquire any requested LWLocks using:
3272                 ptr->locks = GetNamedLWLockTranche("my tranche name");
3273         }
3274         LWLockRelease(AddinShmemInitLock);
3275 }
3276 </programlisting>
3277     </para>
3278    </sect2>
3279
3280    <sect2 id="extend-cpp">
3281     <title>Using C++ for Extensibility</title>
3282
3283     <indexterm zone="extend-cpp">
3284      <primary>C++</primary>
3285     </indexterm>
3286
3287     <para>
3288      Although the <productname>PostgreSQL</productname> backend is written in
3289      C, it is possible to write extensions in C++ if these guidelines are
3290      followed:
3291
3292      <itemizedlist>
3293       <listitem>
3294        <para>
3295          All functions accessed by the backend must present a C interface
3296          to the backend;  these C functions can then call C++ functions.
3297          For example, <literal>extern C</literal> linkage is required for
3298          backend-accessed functions.  This is also necessary for any
3299          functions that are passed as pointers between the backend and
3300          C++ code.
3301        </para>
3302       </listitem>
3303       <listitem>
3304        <para>
3305         Free memory using the appropriate deallocation method.  For example,
3306         most backend memory is allocated using <function>palloc()</function>, so use
3307         <function>pfree()</function> to free it.  Using C++
3308         <function>delete</function> in such cases will fail.
3309        </para>
3310       </listitem>
3311       <listitem>
3312        <para>
3313         Prevent exceptions from propagating into the C code (use a catch-all
3314         block at the top level of all <literal>extern C</literal> functions).  This
3315         is necessary even if the C++ code does not explicitly throw any
3316         exceptions, because events like out-of-memory can still throw
3317         exceptions.  Any exceptions must be caught and appropriate errors
3318         passed back to the C interface.  If possible, compile C++ with
3319         <option>-fno-exceptions</option> to eliminate exceptions entirely; in such
3320         cases, you must check for failures in your C++ code, e.g.  check for
3321         NULL returned by <function>new()</function>.
3322        </para>
3323       </listitem>
3324       <listitem>
3325        <para>
3326         If calling backend functions from C++ code, be sure that the
3327         C++ call stack contains only plain old data structures
3328         (<acronym>POD</acronym>).  This is necessary because backend errors
3329         generate a distant <function>longjmp()</function> that does not properly
3330         unroll a C++ call stack with non-POD objects.
3331        </para>
3332       </listitem>
3333      </itemizedlist>
3334     </para>
3335
3336     <para>
3337      In summary, it is best to place C++ code behind a wall of
3338      <literal>extern C</literal> functions that interface to the backend,
3339      and avoid exception, memory, and call stack leakage.
3340     </para>
3341    </sect2>
3342
3343   </sect1>