
   1 <!-- doc/src/sgml/xfunc.sgml -->
   2
   3  <sect1 id="xfunc">
   4   <title>User-defined Functions</title>
   5
   6   <indexterm zone="xfunc">
   7    <primary>function</primary>
   8    <secondary>user-defined</secondary>
   9   </indexterm>
  10
  11   <para>
  12    <productname>PostgreSQL</productname> provides four kinds of
  13    functions:
  14
  15    <itemizedlist>
  16     <listitem>
  17      <para>
  18       query language functions (functions written in
  19       <acronym>SQL</acronym>) (<xref linkend="xfunc-sql">)
  20      </para>
  21     </listitem>
  22     <listitem>
  23      <para>
  24       procedural language functions (functions written in, for
  25       example, <application>PL/pgSQL</> or <application>PL/Tcl</>)
  26       (<xref linkend="xfunc-pl">)
  27      </para>
  28     </listitem>
  29     <listitem>
  30      <para>
  31       internal functions (<xref linkend="xfunc-internal">)
  32      </para>
  33     </listitem>
  34     <listitem>
  35      <para>
  36       C-language functions (<xref linkend="xfunc-c">)
  37      </para>
  38     </listitem>
  39    </itemizedlist>
  40   </para>
  41
  42   <para>
   Every kind of function can take base types, composite types, or
   combinations of these as arguments (parameters). In addition,
   every kind of function can return a base type or
   a composite type.  Functions can also be defined to return
   sets of base or composite values.
  49   </para>
  50
  51   <para>
  52    Many kinds of functions can take or return certain pseudo-types
  53    (such as polymorphic types), but the available facilities vary.
  54    Consult the description of each kind of function for more details.
  55   </para>
  56
  57   <para>
  58    It's easiest to define <acronym>SQL</acronym>
  59    functions, so we'll start by discussing those.
  60    Most of the concepts presented for <acronym>SQL</acronym> functions
  61    will carry over to the other types of functions.
  62   </para>
  63
  64   <para>
  65    Throughout this chapter, it can be useful to look at the reference
  66    page of the <xref linkend="sql-createfunction"> command to
  67    understand the examples better.  Some examples from this chapter
  68    can be found in <filename>funcs.sql</filename> and
  69    <filename>funcs.c</filename> in the <filename>src/tutorial</>
  70    directory in the <productname>PostgreSQL</productname> source
  71    distribution.
  72   </para>
  73   </sect1>
  74
  75   <sect1 id="xfunc-sql">
  76    <title>Query Language (<acronym>SQL</acronym>) Functions</title>
  77
  78    <indexterm zone="xfunc-sql">
  79     <primary>function</primary>
  80     <secondary>user-defined</secondary>
  81     <tertiary>in SQL</tertiary>
  82    </indexterm>
  83
  84    <para>
  85     SQL functions execute an arbitrary list of SQL statements, returning
  86     the result of the last query in the list.
  87     In the simple (non-set)
  88     case, the first row of the last query's result will be returned.
  89     (Bear in mind that <quote>the first row</quote> of a multirow
  90     result is not well-defined unless you use <literal>ORDER BY</>.)
  91     If the last query happens
  92     to return no rows at all, the null value will be returned.
  93    </para>
  94
  95    <para>
  96     Alternatively, an SQL function can be declared to return a set (that is,
  97     multiple rows) by specifying the function's return type as <literal>SETOF
  98     <replaceable>sometype</></literal>, or equivalently by declaring it as
  99     <literal>RETURNS TABLE(<replaceable>columns</>)</literal>.  In this case
 100     all rows of the last query's result are returned.  Further details appear
 101     below.
 102    </para>
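
   <para>
    As a rough sketch of the set-returning declarations, the following
    hypothetical functions both return one row per integer from 1 to
    <literal>n</>.  The names <function>first_squares</function> and
    <function>first_squares_setof</function>, and the column names
    <literal>i</> and <literal>sq</>, are invented for illustration:

<programlisting>
-- Illustrative only: declared with RETURNS TABLE.
CREATE FUNCTION first_squares(n integer)
RETURNS TABLE(i integer, sq integer) AS $$
    SELECT g, g * g FROM generate_series(1, n) AS g;
$$ LANGUAGE SQL;

-- The same columns declared with OUT parameters and SETOF record.
CREATE FUNCTION first_squares_setof(n integer, OUT i integer, OUT sq integer)
RETURNS SETOF record AS $$
    SELECT g, g * g FROM generate_series(1, n) AS g;
$$ LANGUAGE SQL;

SELECT * FROM first_squares(3);
</programlisting>
   </para>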
 103
 104    <para>
 105     The body of an SQL function must be a list of SQL
 106     statements separated by semicolons.  A semicolon after the last
 107     statement is optional.  Unless the function is declared to return
 108     <type>void</>, the last statement must be a <command>SELECT</>,
 109     or an <command>INSERT</>, <command>UPDATE</>, or <command>DELETE</>
 110     that has a <literal>RETURNING</> clause.
 111    </para>
 112
 113     <para>
     Any collection of commands in the <acronym>SQL</acronym>
     language can be packaged together and defined as a function.
     Besides <command>SELECT</command> queries, the commands can include data
     modification queries (<command>INSERT</command>,
     <command>UPDATE</command>, and <command>DELETE</command>), as well as
     other SQL commands. (You cannot use transaction control commands, such as
     <command>COMMIT</> and <command>SAVEPOINT</>, or certain utility
     commands, such as <command>VACUUM</>, in <acronym>SQL</acronym> functions.)
 122      However, the final command
 123      must be a <command>SELECT</command> or have a <literal>RETURNING</>
 124      clause that returns whatever is
 125      specified as the function's return type.  Alternatively, if you
 126      want to define a SQL function that performs actions but has no
 127      useful value to return, you can define it as returning <type>void</>.
 128      For example, this function removes rows with negative salaries from
 129      the <literal>emp</> table:
 130
 131 <screen>
 132 CREATE FUNCTION clean_emp() RETURNS void AS '
 133     DELETE FROM emp
 134         WHERE salary &lt; 0;
 135 ' LANGUAGE SQL;
 136
 137 SELECT clean_emp();
 138
 139  clean_emp
 140 -----------
 141
 142 (1 row)
 143 </screen>
 144     </para>
 145
 146     <note>
 147      <para>
 148       The entire body of a SQL function is parsed before any of it is
 149       executed.  While a SQL function can contain commands that alter
 150       the system catalogs (e.g., <command>CREATE TABLE</>), the effects
 151       of such commands will not be visible during parse analysis of
 152       later commands in the function.  Thus, for example,
 153       <literal>CREATE TABLE foo (...); INSERT INTO foo VALUES(...);</literal>
 154       will not work as desired if packaged up into a single SQL function,
 155       since <structname>foo</> won't exist yet when the <command>INSERT</>
      command is parsed.  It's recommended to use <application>PL/pgSQL</>
      instead of a SQL function in this type of situation.
 158      </para>
 159    </note>
 160
 161    <para>
 162     The syntax of the <command>CREATE FUNCTION</command> command requires
 163     the function body to be written as a string constant.  It is usually
 164     most convenient to use dollar quoting (see <xref
 165     linkend="sql-syntax-dollar-quoting">) for the string constant.
 166     If you choose to use regular single-quoted string constant syntax,
 167     you must double single quote marks (<literal>'</>) and backslashes
 168     (<literal>\</>) (assuming escape string syntax) in the body of
 169     the function (see <xref linkend="sql-syntax-strings">).
 170    </para>
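
   <para>
    To illustrate the difference, here is a sketch of one function body
    written both ways.  The function name <function>greeting</function> is
    made up for this illustration:

<programlisting>
-- Dollar quoting: the body is written exactly as it would be run.
CREATE FUNCTION greeting() RETURNS text AS $$
    SELECT 'It''s a nice day'::text;
$$ LANGUAGE SQL;

-- Single-quoted body: every quote mark inside the body is doubled again.
CREATE OR REPLACE FUNCTION greeting() RETURNS text AS '
    SELECT ''It''''s a nice day''::text;
' LANGUAGE SQL;
</programlisting>
   </para>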
 171
 172    <sect2 id="xfunc-sql-function-arguments">
 173     <title>Arguments for <acronym>SQL</acronym> Functions</title>
 174
 175    <indexterm>
 176     <primary>function</primary>
 177     <secondary>named argument</secondary>
 178    </indexterm>
 179
 180     <para>
 181      Arguments of a SQL function can be referenced in the function
 182      body using either names or numbers.  Examples of both methods appear
 183      below.
 184     </para>
 185
 186     <para>
 187      To use a name, declare the function argument as having a name, and
 188      then just write that name in the function body.  If the argument name
 189      is the same as any column name in the current SQL command within the
 190      function, the column name will take precedence.  To override this,
 191      qualify the argument name with the name of the function itself, that is
 192      <literal><replaceable>function_name</>.<replaceable>argument_name</></literal>.
 193      (If this would conflict with a qualified column name, again the column
 194      name wins.  You can avoid the ambiguity by choosing a different alias for
 195      the table within the SQL command.)
 196     </para>
 197
 198     <para>
 199      In the older numeric approach, arguments are referenced using the syntax
 200      <literal>$<replaceable>n</></>: <literal>$1</> refers to the first input
 201      argument, <literal>$2</> to the second, and so on.  This will work
 202      whether or not the particular argument was declared with a name.
 203     </para>
 204
 205     <para>
 206      If an argument is of a composite type, then the dot notation,
 207      e.g., <literal><replaceable>argname</>.<replaceable>fieldname</></literal> or
 208      <literal>$1.<replaceable>fieldname</></literal>, can be used to access attributes of the
 209      argument.  Again, you might need to qualify the argument's name with the
 210      function name to make the form with an argument name unambiguous.
 211     </para>
 212
 213     <para>
 214      SQL function arguments can only be used as data values,
 215      not as identifiers.  Thus for example this is reasonable:
 216 <programlisting>
 217 INSERT INTO mytable VALUES ($1);
 218 </programlisting>
 219 but this will not work:
 220 <programlisting>
 221 INSERT INTO $1 VALUES (42);
 222 </programlisting>
 223     </para>
 224
 225     <note>
 226      <para>
 227       The ability to use names to reference SQL function arguments was added
 228       in <productname>PostgreSQL</productname> 9.2.  Functions to be used in
 229       older servers must use the <literal>$<replaceable>n</></> notation.
 230      </para>
 231     </note>
 232    </sect2>
 233
 234    <sect2 id="xfunc-sql-base-functions">
 235     <title><acronym>SQL</acronym> Functions on Base Types</title>
 236
 237     <para>
 238      The simplest possible <acronym>SQL</acronym> function has no arguments and
 239      simply returns a base type, such as <type>integer</type>:
 240
 241 <screen>
 242 CREATE FUNCTION one() RETURNS integer AS $$
 243     SELECT 1 AS result;
 244 $$ LANGUAGE SQL;
 245
 246 -- Alternative syntax for string literal:
 247 CREATE FUNCTION one() RETURNS integer AS '
 248     SELECT 1 AS result;
 249 ' LANGUAGE SQL;
 250
 251 SELECT one();
 252
 253  one
 254 -----
 255    1
 256 </screen>
 257     </para>
 258
 259     <para>
     Notice that we defined a column alias within the function body for the
     result of the function (with the name <literal>result</>), but this
     column alias is not visible outside the function.  Hence, the result is
     labeled <literal>one</> instead of <literal>result</>.
 264     </para>
 265
 266     <para>
 267      It is almost as easy to define <acronym>SQL</acronym> functions
 268      that take base types as arguments:
 269
 270 <screen>
 271 CREATE FUNCTION add_em(x integer, y integer) RETURNS integer AS $$
 272     SELECT x + y;
 273 $$ LANGUAGE SQL;
 274
 275 SELECT add_em(1, 2) AS answer;
 276
 277  answer
 278 --------
 279       3
 280 </screen>
 281     </para>
 282
 283     <para>
 284      Alternatively, we could dispense with names for the arguments and
 285      use numbers:
 286
 287 <screen>
 288 CREATE FUNCTION add_em(integer, integer) RETURNS integer AS $$
 289     SELECT $1 + $2;
 290 $$ LANGUAGE SQL;
 291
 292 SELECT add_em(1, 2) AS answer;
 293
 294  answer
 295 --------
 296       3
 297 </screen>
 298     </para>
 299
 300     <para>
 301      Here is a more useful function, which might be used to debit a
 302      bank account:
 303
 304 <programlisting>
 305 CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS integer AS $$
 306     UPDATE bank
 307         SET balance = balance - debit
 308         WHERE accountno = tf1.accountno;
 309     SELECT 1;
 310 $$ LANGUAGE SQL;
 311 </programlisting>
 312
 313      A user could execute this function to debit account 17 by $100.00 as
 314      follows:
 315
 316 <programlisting>
 317 SELECT tf1(17, 100.0);
 318 </programlisting>
 319     </para>
 320
 321     <para>
 322      In this example, we chose the name <literal>accountno</> for the first
 323      argument, but this is the same as the name of a column in the
 324      <literal>bank</> table.  Within the <command>UPDATE</> command,
 325      <literal>accountno</> refers to the column <literal>bank.accountno</>,
 326      so <literal>tf1.accountno</> must be used to refer to the argument.
 327      We could of course avoid this by using a different name for the argument.
 328     </para>
 329
 330     <para>
 331      In practice one would probably like a more useful result from the
 332      function than a constant 1, so a more likely definition
 333      is:
 334
 335 <programlisting>
CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
    UPDATE bank
        SET balance = balance - debit
        WHERE accountno = tf1.accountno;
    SELECT balance FROM bank WHERE accountno = tf1.accountno;
$$ LANGUAGE SQL;
 342 </programlisting>
 343
 344      which adjusts the balance and returns the new balance.
 345      The same thing could be done in one command using <literal>RETURNING</>:
 346
 347 <programlisting>
CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
    UPDATE bank
        SET balance = balance - debit
        WHERE accountno = tf1.accountno
    RETURNING balance;
$$ LANGUAGE SQL;
 354 </programlisting>
 355     </para>
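
    <para>
     As a usage sketch, suppose the <literal>bank</> table is defined as
     follows.  This definition is an assumption made for illustration; the
     table's actual columns are not given in this section.

<programlisting>
-- Hypothetical table definition, assumed for this sketch.
CREATE TABLE bank (
    accountno   integer PRIMARY KEY,
    balance     numeric
);

INSERT INTO bank VALUES (17, 250.00);

-- With a tf1 variant that returns the balance, this debits account 17
-- and reports the remaining balance in one call.
SELECT tf1(17, 100.0) AS new_balance;
</programlisting>

     Because <literal>balance</> is <type>numeric</> in this sketch, the
     function's declared return type would need to be <type>numeric</> as well.
    </para>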
 356    </sect2>
 357
 358    <sect2 id="xfunc-sql-composite-functions">
 359     <title><acronym>SQL</acronym> Functions on Composite Types</title>
 360
 361     <para>
 362      When writing functions with arguments of composite types, we must not
 363      only specify which argument we want but also the desired attribute
 364      (field) of that argument.  For example, suppose that
 365      <type>emp</type> is a table containing employee data, and therefore
 366      also the name of the composite type of each row of the table.  Here
 367      is a function <function>double_salary</function> that computes what someone's
 368      salary would be if it were doubled:
 369
 370 <screen>
 371 CREATE TABLE emp (
 372     name        text,
 373     salary      numeric,
 374     age         integer,
 375     cubicle     point
 376 );
 377
 378 INSERT INTO emp VALUES ('Bill', 4200, 45, '(2,1)');
 379
 380 CREATE FUNCTION double_salary(emp) RETURNS numeric AS $$
 381     SELECT $1.salary * 2 AS salary;
 382 $$ LANGUAGE SQL;
 383
 384 SELECT name, double_salary(emp.*) AS dream
 385     FROM emp
 386     WHERE emp.cubicle ~= point '(2,1)';
 387
 388  name | dream
 389 ------+-------
 390  Bill |  8400
 391 </screen>
 392     </para>
 393
 394     <para>
 395      Notice the use of the syntax <literal>$1.salary</literal>
 396      to select one field of the argument row value.  Also notice
 397      how the calling <command>SELECT</> command
 398      uses <replaceable>table_name</><literal>.*</> to select
 399      the entire current row of a table as a composite value.  The table
 400      row can alternatively be referenced using just the table name,
 401      like this:
 402 <screen>
 403 SELECT name, double_salary(emp) AS dream
 404     FROM emp
 405     WHERE emp.cubicle ~= point '(2,1)';
 406 </screen>
 407      but this usage is deprecated since it's easy to get confused.
 408      (See <xref linkend="rowtypes-usage"> for details about these
 409      two notations for the composite value of a table row.)
 410     </para>
 411
 412     <para>
 413      Sometimes it is handy to construct a composite argument value
 414      on-the-fly.  This can be done with the <literal>ROW</> construct.
 415      For example, we could adjust the data being passed to the function:
 416 <screen>
 417 SELECT name, double_salary(ROW(name, salary*1.1, age, cubicle)) AS dream
 418     FROM emp;
 419 </screen>
 420     </para>
 421
 422     <para>
 423      It is also possible to build a function that returns a composite type.
 424      This is an example of a function
 425      that returns a single <type>emp</type> row:
 426
 427 <programlisting>
 428 CREATE FUNCTION new_emp() RETURNS emp AS $$
 429     SELECT text 'None' AS name,
 430         1000.0 AS salary,
 431         25 AS age,
 432         point '(2,2)' AS cubicle;
 433 $$ LANGUAGE SQL;
 434 </programlisting>
 435
 436      In this example we have specified each of  the  attributes
 437      with  a  constant value, but any computation
 438      could have been substituted for these constants.
 439     </para>
 440
 441     <para>
 442      Note two important things about defining the function:
 443
 444      <itemizedlist>
 445       <listitem>
 446        <para>
 447         The select list order in the query must be exactly the same as
 448         that in which the columns appear in the table associated
 449         with the composite type.  (Naming the columns, as we did above,
 450         is irrelevant to the system.)
 451        </para>
 452       </listitem>
 453       <listitem>
 454        <para>
 455         You must typecast the expressions to match the
 456         definition of the composite type, or you will get errors like this:
 457 <screen>
 458 <computeroutput>
 459 ERROR:  function declared to return emp returns varchar instead of text at column 1
 460 </computeroutput>
 461 </screen>
 462        </para>
 463       </listitem>
 464      </itemizedlist>
 465     </para>
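
    <para>
     One way to satisfy both requirements is to cast each expression
     explicitly, in table column order.  The following is only a sketch; the
     function name <function>new_emp_cast</function> is invented here:

<programlisting>
-- Select list follows the column order of emp: name, salary, age, cubicle.
CREATE FUNCTION new_emp_cast() RETURNS emp AS $$
    SELECT 'None'::text,
        1000.0::numeric,
        25::integer,
        '(2,2)'::point;
$$ LANGUAGE SQL;
</programlisting>
    </para>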
 466
 467     <para>
 468      A different way to define the same function is:
 469
 470 <programlisting>
 471 CREATE FUNCTION new_emp() RETURNS emp AS $$
 472     SELECT ROW('None', 1000.0, 25, '(2,2)')::emp;
 473 $$ LANGUAGE SQL;
 474 </programlisting>
 475
 476      Here we wrote a <command>SELECT</> that returns just a single
 477      column of the correct composite type.  This isn't really better
 478      in this situation, but it is a handy alternative in some cases
 479      &mdash; for example, if we need to compute the result by calling
 480      another function that returns the desired composite value.
 481     </para>
 482
 483     <para>
 484      We could call this function directly either by using it in
 485      a value expression:
 486
 487 <screen>
 488 SELECT new_emp();
 489
 490          new_emp
 491 --------------------------
 492  (None,1000.0,25,"(2,2)")
 493 </screen>
 494
 495      or by calling it as a table function:
 496
 497 <screen>
 498 SELECT * FROM new_emp();
 499
 500  name | salary | age | cubicle
 501 ------+--------+-----+---------
 502  None | 1000.0 |  25 | (2,2)
 503 </screen>
 504
 505      The second way is described more fully in <xref
 506      linkend="xfunc-sql-table-functions">.
 507     </para>
 508
 509     <para>
 510      When you use a function that returns a composite type,
 511      you might want only one field (attribute) from its result.
 512      You can do that with syntax like this:
 513
 514 <screen>
 515 SELECT (new_emp()).name;
 516
 517  name
 518 ------
 519  None
 520 </screen>
 521
 522      The extra parentheses are needed to keep the parser from getting
 523      confused.  If you try to do it without them, you get something like this:
 524
 525 <screen>
 526 SELECT new_emp().name;
 527 ERROR:  syntax error at or near "."
 528 LINE 1: SELECT new_emp().name;
 529                         ^
 530 </screen>
 531     </para>
 532
 533     <para>
 534      Another option is to use functional notation for extracting an attribute:
 535
 536 <screen>
 537 SELECT name(new_emp());
 538
 539  name
 540 ------
 541  None
 542 </screen>
 543
 544      As explained in <xref linkend="rowtypes-usage">, the field notation and
 545      functional notation are equivalent.
 546     </para>
 547
 548     <para>
 549      Another way to use a function returning a composite type is to pass the
 550      result to another function that accepts the correct row type as input:
 551
 552 <screen>
 553 CREATE FUNCTION getname(emp) RETURNS text AS $$
 554     SELECT $1.name;
 555 $$ LANGUAGE SQL;
 556
 557 SELECT getname(new_emp());
 558  getname
 559 ---------
 560  None
 561 (1 row)
 562 </screen>
 563     </para>
 564    </sect2>
 565
 566    <sect2 id="xfunc-output-parameters">
 567     <title><acronym>SQL</> Functions with Output Parameters</title>
 568
 569    <indexterm>
 570     <primary>function</primary>
 571     <secondary>output parameter</secondary>
 572    </indexterm>
 573
 574     <para>
 575      An alternative way of describing a function's results is to define it
 576      with <firstterm>output parameters</>, as in this example:
 577
 578 <screen>
 579 CREATE FUNCTION add_em (IN x int, IN y int, OUT sum int)
 580 AS 'SELECT x + y'
 581 LANGUAGE SQL;
 582
 583 SELECT add_em(3,7);
 584  add_em
 585 --------
 586      10
 587 (1 row)
 588 </screen>
 589
 590      This is not essentially different from the version of <literal>add_em</>
 591      shown in <xref linkend="xfunc-sql-base-functions">.  The real value of
 592      output parameters is that they provide a convenient way of defining
 593      functions that return several columns.  For example,
 594
 595 <screen>
CREATE FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int)
AS 'SELECT x + y, x * y'
LANGUAGE SQL;

SELECT * FROM sum_n_product(11,42);
 sum | product
-----+---------
  53 |     462
(1 row)
 605 </screen>
 606
 607      What has essentially happened here is that we have created an anonymous
 608      composite type for the result of the function.  The above example has
 609      the same end result as
 610
 611 <screen>
 612 CREATE TYPE sum_prod AS (sum int, product int);
 613
 614 CREATE FUNCTION sum_n_product (int, int) RETURNS sum_prod
 615 AS 'SELECT $1 + $2, $1 * $2'
 616 LANGUAGE SQL;
 617 </screen>
 618
 619      but not having to bother with the separate composite type definition
 620      is often handy.  Notice that the names attached to the output parameters
 621      are not just decoration, but determine the column names of the anonymous
 622      composite type.  (If you omit a name for an output parameter, the
 623      system will choose a name on its own.)
 624     </para>
 625
 626     <para>
 627      Notice that output parameters are not included in the calling argument
 628      list when invoking such a function from SQL.  This is because
 629      <productname>PostgreSQL</productname> considers only the input
 630      parameters to define the function's calling signature.  That means
 631      also that only the input parameters matter when referencing the function
 632      for purposes such as dropping it.  We could drop the above function
 633      with either of
 634
 635 <screen>
 636 DROP FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int);
 637 DROP FUNCTION sum_n_product (int, int);
 638 </screen>
 639     </para>
 640
 641     <para>
 642      Parameters can be marked as <literal>IN</> (the default),
 643      <literal>OUT</>, <literal>INOUT</>, or <literal>VARIADIC</>.
 644      An <literal>INOUT</>
 645      parameter serves as both an input parameter (part of the calling
 646      argument list) and an output parameter (part of the result record type).
 647      <literal>VARIADIC</> parameters are input parameters, but are treated
 648      specially as described next.
 649     </para>
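
    <para>
     For instance, a function with a single <literal>INOUT</> parameter
     might look like the following sketch (the name
     <function>double_it</function> is hypothetical):

<programlisting>
-- x is both an input argument and the function's single output.
CREATE FUNCTION double_it (INOUT x int)
AS 'SELECT x * 2'
LANGUAGE SQL;

SELECT double_it(21);
</programlisting>
    </para>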
 650    </sect2>
 651
 652    <sect2 id="xfunc-sql-variadic-functions">
 653     <title><acronym>SQL</> Functions with Variable Numbers of Arguments</title>
 654
 655     <indexterm>
 656      <primary>function</primary>
 657      <secondary>variadic</secondary>
 658     </indexterm>
 659
 660     <indexterm>
 661      <primary>variadic function</primary>
 662     </indexterm>
 663
 664     <para>
 665      <acronym>SQL</acronym> functions can be declared to accept
 666      variable numbers of arguments, so long as all the <quote>optional</>
 667      arguments are of the same data type.  The optional arguments will be
 668      passed to the function as an array.  The function is declared by
 669      marking the last parameter as <literal>VARIADIC</>; this parameter
 670      must be declared as being of an array type.  For example:
 671
 672 <screen>
 673 CREATE FUNCTION mleast(VARIADIC arr numeric[]) RETURNS numeric AS $$
 674     SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
 675 $$ LANGUAGE SQL;
 676
 677 SELECT mleast(10, -1, 5, 4.4);
 678  mleast
 679 --------
 680      -1
 681 (1 row)
 682 </screen>
 683
 684      Effectively, all the actual arguments at or beyond the
 685      <literal>VARIADIC</> position are gathered up into a one-dimensional
 686      array, as if you had written
 687
 688 <screen>
 689 SELECT mleast(ARRAY[10, -1, 5, 4.4]);    -- doesn't work
 690 </screen>
 691
 692      You can't actually write that, though &mdash; or at least, it will
 693      not match this function definition.  A parameter marked
 694      <literal>VARIADIC</> matches one or more occurrences of its element
 695      type, not of its own type.
 696     </para>
 697
 698     <para>
 699      Sometimes it is useful to be able to pass an already-constructed array
 700      to a variadic function; this is particularly handy when one variadic
 701      function wants to pass on its array parameter to another one.  You can
 702      do that by specifying <literal>VARIADIC</> in the call:
 703
 704 <screen>
 705 SELECT mleast(VARIADIC ARRAY[10, -1, 5, 4.4]);
 706 </screen>
 707
 708      This prevents expansion of the function's variadic parameter into its
 709      element type, thereby allowing the array argument value to match
 710      normally.  <literal>VARIADIC</> can only be attached to the last
 711      actual argument of a function call.
 712     </para>
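
    <para>
     A sketch of the pass-through case, assuming the
     <function>mleast</function> function defined above; the wrapper name
     <function>mleast_or_zero</function> is invented for illustration:

<programlisting>
-- Forwards its own variadic array to mleast without re-expanding it.
CREATE FUNCTION mleast_or_zero(VARIADIC arr numeric[]) RETURNS numeric AS $$
    SELECT least(mleast(VARIADIC $1), 0);
$$ LANGUAGE SQL;

SELECT mleast_or_zero(10, 5, 4.4);
</programlisting>
    </para>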
 713
 714     <para>
 715      Specifying <literal>VARIADIC</> in the call is also the only way to
 716      pass an empty array to a variadic function, for example:
 717
 718 <screen>
 719 SELECT mleast(VARIADIC ARRAY[]::numeric[]);
 720 </screen>
 721
 722      Simply writing <literal>SELECT mleast()</> does not work because a
 723      variadic parameter must match at least one actual argument.
 724      (You could define a second function also named <literal>mleast</>,
 725      with no parameters, if you wanted to allow such calls.)
 726     </para>
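
    <para>
     Such a companion function could be sketched as follows.  Returning
     null for the no-argument case is an assumption, chosen to mirror what
     <function>min</function> yields over zero rows:

<programlisting>
-- Hypothetical zero-argument overload of mleast.
CREATE FUNCTION mleast() RETURNS numeric AS $$
    SELECT NULL::numeric;
$$ LANGUAGE SQL;

SELECT mleast();
</programlisting>
    </para>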
 727
 728     <para>
 729      The array element parameters generated from a variadic parameter are
 730      treated as not having any names of their own.  This means it is not
 731      possible to call a variadic function using named arguments (<xref
 732      linkend="sql-syntax-calling-funcs">), except when you specify
 733      <literal>VARIADIC</>.  For example, this will work:
 734
 735 <screen>
 736 SELECT mleast(VARIADIC arr =&gt; ARRAY[10, -1, 5, 4.4]);
 737 </screen>
 738
 739      but not these:
 740
 741 <screen>
 742 SELECT mleast(arr =&gt; 10);
 743 SELECT mleast(arr =&gt; ARRAY[10, -1, 5, 4.4]);
 744 </screen>
 745     </para>
 746    </sect2>
 747
 748    <sect2 id="xfunc-sql-parameter-defaults">
 749     <title><acronym>SQL</> Functions with Default Values for Arguments</title>
 750
 751     <indexterm>
 752      <primary>function</primary>
 753      <secondary>default values for arguments</secondary>
 754     </indexterm>
 755
 756     <para>
     Functions can be declared with default values for some or all input
     arguments.  The default values are inserted whenever the function is
     called with fewer actual arguments than it has input parameters.  Since
     arguments can only be omitted from the end of the actual argument list,
     all parameters after a parameter with a default value have to have
     default values as well.  (Although the use of named argument notation
     could allow this restriction to be relaxed, it's still enforced so that
     positional argument notation works sensibly.)
 765     </para>
 766
 767     <para>
 768      For example:
 769 <screen>
 770 CREATE FUNCTION foo(a int, b int DEFAULT 2, c int DEFAULT 3)
 771 RETURNS int
 772 LANGUAGE SQL
 773 AS $$
 774     SELECT $1 + $2 + $3;
 775 $$;
 776
 777 SELECT foo(10, 20, 30);
 778  foo
 779 -----
 780   60
 781 (1 row)
 782
 783 SELECT foo(10, 20);
 784  foo
 785 -----
 786   33
 787 (1 row)
 788
 789 SELECT foo(10);
 790  foo
 791 -----
 792   15
 793 (1 row)
 794
 795 SELECT foo();  -- fails since there is no default for the first argument
 796 ERROR:  function foo() does not exist
 797 </screen>
 798      The <literal>=</literal> sign can also be used in place of the
 799      key word <literal>DEFAULT</literal>.
 800     </para>
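
    <para>
     The following sketch declares the same defaults with <literal>=</literal>
     and overrides only the last one by name.  The name
     <function>foo_eq</function> is hypothetical, and the named-argument call
     assumes a server that accepts the <literal>=&gt;</literal> notation:

<programlisting>
CREATE FUNCTION foo_eq(a int, b int = 2, c int = 3)
RETURNS int
LANGUAGE SQL
AS $$
    SELECT $1 + $2 + $3;
$$;

-- b takes its default of 2; the sum is 10 + 2 + 30.
SELECT foo_eq(10, c =&gt; 30);
</programlisting>
    </para>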
 801    </sect2>
 802
 803    <sect2 id="xfunc-sql-table-functions">
 804     <title><acronym>SQL</acronym> Functions as Table Sources</title>
 805
 806     <para>
     All SQL functions can be used in the <literal>FROM</> clause of a query,
     but doing so is particularly useful for functions returning composite types.
 809      If the function is defined to return a base type, the table function
 810      produces a one-column table.  If the function is defined to return
 811      a composite type, the table function produces a column for each attribute
 812      of the composite type.
 813     </para>
 814
 815     <para>
 816      Here is an example:
 817
 818 <screen>
 819 CREATE TABLE foo (fooid int, foosubid int, fooname text);
 820 INSERT INTO foo VALUES (1, 1, 'Joe');
 821 INSERT INTO foo VALUES (1, 2, 'Ed');
 822 INSERT INTO foo VALUES (2, 1, 'Mary');
 823
 824 CREATE FUNCTION getfoo(int) RETURNS foo AS $$
 825     SELECT * FROM foo WHERE fooid = $1;
 826 $$ LANGUAGE SQL;
 827
 828 SELECT *, upper(fooname) FROM getfoo(1) AS t1;
 829
 830  fooid | foosubid | fooname | upper
 831 -------+----------+---------+-------
 832      1 |        1 | Joe     | JOE
 833 (1 row)
 834 </screen>
 835
 836      As the example shows, we can work with the columns of the function's
 837      result just the same as if they were columns of a regular table.
 838     </para>
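
    <para>
     For the base-type case, a sketch using the <function>add_em</function>
     function from <xref linkend="xfunc-sql-base-functions"> might be:

<programlisting>
-- A function returning integer yields a one-column, one-row table.
-- The aliases t, total, and plus_one are chosen here for illustration.
SELECT total + 1 AS plus_one FROM add_em(1, 2) AS t(total);
</programlisting>
    </para>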
 839
 840     <para>
 841      Note that we only got one row out of the function.  This is because
 842      we did not use <literal>SETOF</>.  That is described in the next section.
 843     </para>
 844    </sect2>
 845
 846    <sect2 id="xfunc-sql-functions-returning-set">
 847     <title><acronym>SQL</acronym> Functions Returning Sets</title>
 848
 849     <indexterm>
 850      <primary>function</primary>
 851      <secondary>with SETOF</secondary>
 852     </indexterm>
 853
 854     <para>
 855      When an SQL function is declared as returning <literal>SETOF
 856      <replaceable>sometype</></literal>, the function's final
 857      query is executed to completion, and each row it
 858      outputs is returned as an element of the result set.
 859     </para>
 860
 861     <para>
 862      This feature is normally used when calling the function in the <literal>FROM</>
 863      clause.  In this case each row returned by the function becomes
 864      a row of the table seen by the query.  For example, assume that
 865      table <literal>foo</> has the same contents as above, and we say:
 866
 867 <programlisting>
 868 CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS $$
 869     SELECT * FROM foo WHERE fooid = $1;
 870 $$ LANGUAGE SQL;
 871
 872 SELECT * FROM getfoo(1) AS t1;
 873 </programlisting>
 874
 875      Then we would get:
 876 <screen>
 877  fooid | foosubid | fooname
 878 -------+----------+---------
 879      1 |        1 | Joe
 880      1 |        2 | Ed
 881 (2 rows)
 882 </screen>
 883     </para>
 884
 885     <para>
 886      It is also possible to return multiple rows with the columns defined by
 887      output parameters, like this:
 888
 889 <programlisting>
 890 CREATE TABLE tab (y int, z int);
 891 INSERT INTO tab VALUES (1, 2), (3, 4), (5, 6), (7, 8);
 892
 893 CREATE FUNCTION sum_n_product_with_tab (x int, OUT sum int, OUT product int)
 894 RETURNS SETOF record
 895 AS $$
 896     SELECT $1 + tab.y, $1 * tab.y FROM tab;
 897 $$ LANGUAGE SQL;
 898
 899 SELECT * FROM sum_n_product_with_tab(10);
 900  sum | product
 901 -----+---------
 902   11 |      10
 903   13 |      30
 904   15 |      50
 905   17 |      70
 906 (4 rows)
 907 </programlisting>
 908
 909      The key point here is that you must write <literal>RETURNS SETOF record</>
 910      to indicate that the function returns multiple rows instead of just one.
 911      If there is only one output parameter, write that parameter's type
 912      instead of <type>record</>.
 913     </para>
 914
 915     <para>
 916      It is frequently useful to construct a query's result by invoking a
 917      set-returning function multiple times, with the parameters for each
 918      invocation coming from successive rows of a table or subquery.  The
 919      preferred way to do this is to use the <literal>LATERAL</> key word,
 920      which is described in <xref linkend="queries-lateral">.
 921      Here is an example using a set-returning function to enumerate
 922      elements of a tree structure:
 923
 924 <screen>
 925 SELECT * FROM nodes;
 926    name    | parent
 927 -----------+--------
 928  Top       |
 929  Child1    | Top
 930  Child2    | Top
 931  Child3    | Top
 932  SubChild1 | Child1
 933  SubChild2 | Child1
 934 (6 rows)
 935
 936 CREATE FUNCTION listchildren(text) RETURNS SETOF text AS $$
 937     SELECT name FROM nodes WHERE parent = $1
 938 $$ LANGUAGE SQL STABLE;
 939
 940 SELECT * FROM listchildren('Top');
 941  listchildren
 942 --------------
 943  Child1
 944  Child2
 945  Child3
 946 (3 rows)
 947
 948 SELECT name, child FROM nodes, LATERAL listchildren(name) AS child;
 949   name  |   child
 950 --------+-----------
 951  Top    | Child1
 952  Top    | Child2
 953  Top    | Child3
 954  Child1 | SubChild1
 955  Child1 | SubChild2
 956 (5 rows)
 957 </screen>
 958
 959      This example does not do anything that we couldn't have done with a
 960      simple join, but in more complex calculations the option to put
 961      some of the work into a function can be quite convenient.
 962     </para>
 963
 964     <para>
 965      Currently, functions returning sets can also be called in the select list
 966      of a query.  For each row that the query
 967      generates by itself, the function returning set is invoked, and an output
 968      row is generated for each element of the function's result set. Note,
 969      however, that this capability is deprecated and might be removed in future
 970      releases. The previous example could also be done with queries like
 971      these:
 972
 973 <screen>
 974 SELECT listchildren('Top');
 975  listchildren
 976 --------------
 977  Child1
 978  Child2
 979  Child3
 980 (3 rows)
 981
 982 SELECT name, listchildren(name) FROM nodes;
 983   name  | listchildren
 984 --------+--------------
 985  Top    | Child1
 986  Top    | Child2
 987  Top    | Child3
 988  Child1 | SubChild1
 989  Child1 | SubChild2
 990 (5 rows)
 991 </screen>
 992
 993      In the last <command>SELECT</command>,
 994      notice that no output row appears for <literal>Child2</>, <literal>Child3</>, etc.
 995      This happens because <function>listchildren</function> returns an empty set
 996      for those arguments, so no result rows are generated.  This is the same
 997      behavior as we got from an inner join to the function result when using
 998      the <literal>LATERAL</> syntax.
 999     </para>
1000
1001     <note>
1002      <para>
1003       If a function's last command is <command>INSERT</>, <command>UPDATE</>,
1004       or <command>DELETE</> with <literal>RETURNING</>, that command will
1005       always be executed to completion, even if the function is not declared
1006       with <literal>SETOF</> or the calling query does not fetch all the
1007       result rows.  Any extra rows produced by the <literal>RETURNING</>
1008       clause are silently dropped, but the commanded table modifications
1009       still happen (and are all completed before returning from the function).
1010      </para>
1011     </note>
1012
1013     <note>
1014      <para>
1015       The key problem with using set-returning functions in the select list,
1016       rather than the <literal>FROM</> clause, is that putting more than one
1017       set-returning function in the same select list does not behave very
1018       sensibly.  (What you actually get if you do so is a number of output
1019       rows equal to the least common multiple of the numbers of rows produced
1020       by each set-returning function.)  The <literal>LATERAL</> syntax
1021       produces less surprising results when calling multiple set-returning
1022       functions, and should usually be used instead.
1023      </para>
1024     </note>
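
    <para>
     As a minimal sketch of the recommended alternative, the query below
     places two set-returning functions in the <literal>FROM</> clause
     rather than in the select list.  The built-in
     <function>generate_series</> function is used here purely for
     illustration.  Each function is a separate <literal>FROM</> item,
     so the result is the ordinary cross product of the two sets:

<screen>
SELECT a, b
    FROM generate_series(1, 2) AS a, generate_series(1, 3) AS b
    ORDER BY a, b;
 a | b
---+---
 1 | 1
 1 | 2
 1 | 3
 2 | 1
 2 | 2
 2 | 3
(6 rows)
</screen>
    </para>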
1025    </sect2>
1026
1027    <sect2 id="xfunc-sql-functions-returning-table">
1028     <title><acronym>SQL</acronym> Functions Returning <literal>TABLE</></title>
1029
1030     <indexterm>
1031      <primary>function</primary>
1032      <secondary>RETURNS TABLE</secondary>
1033     </indexterm>
1034
1035     <para>
1036      There is another way to declare a function as returning a set,
1037      which is to use the syntax
1038      <literal>RETURNS TABLE(<replaceable>columns</>)</literal>.
1039      This is equivalent to using one or more <literal>OUT</> parameters plus
1040      marking the function as returning <literal>SETOF record</> (or
1041      <literal>SETOF</> a single output parameter's type, as appropriate).
1042      This notation is specified in recent versions of the SQL standard, and
1043      thus may be more portable than using <literal>SETOF</>.
1044     </para>
1045
1046     <para>
1047      For example, the preceding sum-and-product example could also be
1048      done this way:
1049
1050 <programlisting>
1051 CREATE FUNCTION sum_n_product_with_tab (x int)
1052 RETURNS TABLE(sum int, product int) AS $$
1053     SELECT $1 + tab.y, $1 * tab.y FROM tab;
1054 $$ LANGUAGE SQL;
1055 </programlisting>
1056
1057      It is not allowed to use explicit <literal>OUT</> or <literal>INOUT</>
1058      parameters with the <literal>RETURNS TABLE</> notation &mdash; you must
1059      put all the output columns in the <literal>TABLE</> list.
1060     </para>
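
    <para>
     A function declared this way is called exactly like the
     <literal>SETOF record</> version.  This sketch assumes that table
     <literal>tab</> from the earlier example still exists and that the
     earlier function of the same name was dropped first (both take a
     single <type>int</> input argument, so they cannot coexist):

<screen>
SELECT * FROM sum_n_product_with_tab(10);
 sum | product
-----+---------
  11 |      10
  13 |      30
  15 |      50
  17 |      70
(4 rows)
</screen>
    </para>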
1061    </sect2>
1062
1063    <sect2>
1064     <title>Polymorphic <acronym>SQL</acronym> Functions</title>
1065
1066     <para>
1067      <acronym>SQL</acronym> functions can be declared to accept and
1068      return the polymorphic types <type>anyelement</type>,
1069      <type>anyarray</type>, <type>anynonarray</type>,
1070      <type>anyenum</type>, and <type>anyrange</type>.  See <xref
1071      linkend="extend-types-polymorphic"> for a more detailed
1072      explanation of polymorphic functions. Here is a polymorphic
1073      function <function>make_array</function> that builds up an array
1074      from two arbitrary data type elements:
1075 <screen>
1076 CREATE FUNCTION make_array(anyelement, anyelement) RETURNS anyarray AS $$
1077     SELECT ARRAY[$1, $2];
1078 $$ LANGUAGE SQL;
1079
1080 SELECT make_array(1, 2) AS intarray, make_array('a'::text, 'b') AS textarray;
1081  intarray | textarray
1082 ----------+-----------
1083  {1,2}    | {a,b}
1084 (1 row)
1085 </screen>
1086     </para>
1087
1088     <para>
1089      Notice the use of the typecast <literal>'a'::text</literal>
1090      to specify that the argument is of type <type>text</type>. This is
1091      required if the argument is just a string literal, since otherwise
1092      it would be treated as type
1093      <type>unknown</type>, and array of <type>unknown</type> is not a valid
1094      type.
1095      Without the typecast, you will get errors like this:
1096 <screen>
1097 <computeroutput>
1098 ERROR:  could not determine polymorphic type because input has type "unknown"
1099 </computeroutput>
1100 </screen>
1101     </para>
1102
1103     <para>
1104      It is permitted to have polymorphic arguments with a fixed
1105      return type, but the converse is not. For example:
1106 <screen>
1107 CREATE FUNCTION is_greater(anyelement, anyelement) RETURNS boolean AS $$
1108     SELECT $1 &gt; $2;
1109 $$ LANGUAGE SQL;
1110
1111 SELECT is_greater(1, 2);
1112  is_greater
1113 ------------
1114  f
1115 (1 row)
1116
1117 CREATE FUNCTION invalid_func() RETURNS anyelement AS $$
1118     SELECT 1;
1119 $$ LANGUAGE SQL;
1120 ERROR:  cannot determine result data type
1121 DETAIL:  A function returning a polymorphic type must have at least one polymorphic argument.
1122 </screen>
1123     </para>
1124
1125     <para>
1126      Polymorphism can be used with functions that have output arguments.
1127      For example:
1128 <screen>
1129 CREATE FUNCTION dup (f1 anyelement, OUT f2 anyelement, OUT f3 anyarray)
1130 AS 'select $1, array[$1,$1]' LANGUAGE SQL;
1131
1132 SELECT * FROM dup(22);
1133  f2 |   f3
1134 ----+---------
1135  22 | {22,22}
1136 (1 row)
1137 </screen>
1138     </para>
1139
1140     <para>
1141      Polymorphism can also be used with variadic functions.
1142      For example:
1143 <screen>
1144 CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
1145     SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
1146 $$ LANGUAGE SQL;
1147
1148 SELECT anyleast(10, -1, 5, 4);
1149  anyleast
1150 ----------
1151        -1
1152 (1 row)
1153
1154 SELECT anyleast('abc'::text, 'def');
1155  anyleast
1156 ----------
1157  abc
1158 (1 row)
1159
1160 CREATE FUNCTION concat_values(text, VARIADIC anyarray) RETURNS text AS $$
1161     SELECT array_to_string($2, $1);
1162 $$ LANGUAGE SQL;
1163
1164 SELECT concat_values('|', 1, 4, 2);
1165  concat_values
1166 ---------------
1167  1|4|2
1168 (1 row)
1169 </screen>
1170     </para>
1171    </sect2>
1172
1173    <sect2>
1174     <title><acronym>SQL</acronym> Functions with Collations</title>
1175
1176     <indexterm>
1177      <primary>collation</>
1178      <secondary>in SQL functions</>
1179     </indexterm>
1180
1181     <para>
1182      When an SQL function has one or more parameters of collatable data types,
1183      a collation is identified for each function call depending on the
1184      collations assigned to the actual arguments, as described in <xref
1185      linkend="collation">.  If a collation is successfully identified
1186      (i.e., there are no conflicts of implicit collations among the arguments)
1187      then all the collatable parameters are treated as having that collation
1188      implicitly.  This will affect the behavior of collation-sensitive
1189      operations within the function.  For example, using the
1190      <function>anyleast</> function described above, the result of
1191 <programlisting>
1192 SELECT anyleast('abc'::text, 'ABC');
1193 </programlisting>
1194      will depend on the database's default collation.  In <literal>C</> locale
1195      the result will be <literal>ABC</>, but in many other locales it will
1196      be <literal>abc</>.  The collation to use can be forced by adding
1197      a <literal>COLLATE</> clause to any of the arguments, for example
1198 <programlisting>
1199 SELECT anyleast('abc'::text, 'ABC' COLLATE "C");
1200 </programlisting>
1201      Alternatively, if you wish a function to operate with a particular
1202      collation regardless of what it is called with, insert
1203      <literal>COLLATE</> clauses as needed in the function definition.
1204      This version of <function>anyleast</> would always use <literal>en_US</>
1205      locale to compare strings:
1206 <programlisting>
1207 CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
1208     SELECT min($1[i] COLLATE "en_US") FROM generate_subscripts($1, 1) g(i);
1209 $$ LANGUAGE SQL;
1210 </programlisting>
1211      But note that this will throw an error if applied to a non-collatable
1212      data type.
1213     </para>
1214
1215     <para>
1216      If no common collation can be identified among the actual arguments,
1217      then an SQL function treats its parameters as having their data types'
1218      default collation (which is usually the database's default collation,
1219      but could be different for parameters of domain types).
1220     </para>
1221
1222     <para>
1223      The behavior of collatable parameters can be thought of as a limited
1224      form of polymorphism, applicable only to textual data types.
1225     </para>
1226    </sect2>
1227   </sect1>
1228
1229   <sect1 id="xfunc-overload">
1230    <title>Function Overloading</title>
1231
1232    <indexterm zone="xfunc-overload">
1233     <primary>overloading</primary>
1234     <secondary>functions</secondary>
1235    </indexterm>
1236
1237    <para>
1238     More than one function can be defined with the same SQL name, so long
1239     as the arguments they take are different.  In other words,
1240     function names can be <firstterm>overloaded</firstterm>.  When a
1241     query is executed, the server will determine which function to
1242     call from the data types and the number of the provided arguments.
1243     Overloading can also be used to simulate functions with a variable
1244     number of arguments, up to a finite maximum number.
1245    </para>
1246
1247    <para>
1248     When creating a family of overloaded functions, one should be
1249     careful not to create ambiguities.  For instance, given the
1250     functions:
1251 <programlisting>
1252 CREATE FUNCTION test(int, real) RETURNS ...
1253 CREATE FUNCTION test(smallint, double precision) RETURNS ...
1254 </programlisting>
1255     it is not immediately clear which function would be called with
1256     some trivial input like <literal>test(1, 1.5)</literal>.  The
1257     currently implemented resolution rules are described in
1258     <xref linkend="typeconv">, but it is unwise to design a system that subtly
1259     relies on this behavior.
1260    </para>
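
   <para>
    One way to avoid depending on the resolution rules, sketched here for
    the two <function>test</> functions above, is to cast the arguments in
    the call so that they match one signature exactly:
<programlisting>
-- exact match for test(int, real)
SELECT test(1::int, 1.5::real);

-- exact match for test(smallint, double precision)
SELECT test(1::smallint, 1.5::double precision);
</programlisting>
    An exact match on all argument types leaves nothing for the resolution
    rules to decide, at the cost of more verbose calls.
   </para>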
1261
1262    <para>
1263     A function that takes a single argument of a composite type should
1264     generally not have the same name as any attribute (field) of that type.
1265     Recall that <literal><replaceable>attribute</>(<replaceable>table</>)</literal>
1266     is considered equivalent
1267     to <literal><replaceable>table</>.<replaceable>attribute</></literal>.
1268     In the case that there is an
1269     ambiguity between a function on a composite type and an attribute of
1270     the composite type, the attribute will always be used.  It is possible
1271     to override that choice by schema-qualifying the function name
1272     (that is, <literal><replaceable>schema</>.<replaceable>func</>(<replaceable>table</>)
1273     </literal>) but it's better to
1274     avoid the problem by not choosing conflicting names.
1275    </para>
1276
1277    <para>
1278     Another possible conflict is between variadic and non-variadic functions.
1279     For instance, it is possible to create both <literal>foo(numeric)</> and
1280     <literal>foo(VARIADIC numeric[])</>.  In this case it is unclear which one
1281     should be matched to a call providing a single numeric argument, such as
1282     <literal>foo(10.1)</>.  The rule is that the function appearing
1283     earlier in the search path is used, or if the two functions are in the
1284     same schema, the non-variadic one is preferred.
1285    </para>
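
   <para>
    As a sketch, assuming both <function>foo</> functions above exist in
    the same schema, the variadic one can still be reached by passing the
    array explicitly with the <literal>VARIADIC</> key word in the call:
<programlisting>
-- a single numeric argument: the non-variadic foo(numeric) is chosen
SELECT foo(10.1);

-- an explicit array marked VARIADIC: only foo(VARIADIC numeric[]) can match
SELECT foo(VARIADIC ARRAY[10.1]);
</programlisting>
   </para>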
1286
1287    <para>
1288     When overloading C-language functions, there is an additional
1289     constraint: The C name of each function in the family of
1290     overloaded functions must be different from the C names of all
1291     other functions, either internal or dynamically loaded.  If this
1292     rule is violated, the behavior is not portable.  You might get a
1293     run-time linker error, or one of the functions will get called
1294     (usually the internal one).  The alternative form of the
1295     <literal>AS</> clause for the SQL <command>CREATE
1296     FUNCTION</command> command decouples the SQL function name from
1297     the function name in the C source code.  For instance:
1298 <programlisting>
1299 CREATE FUNCTION test(int) RETURNS int
1300     AS '<replaceable>filename</>', 'test_1arg'
1301     LANGUAGE C;
1302 CREATE FUNCTION test(int, int) RETURNS int
1303     AS '<replaceable>filename</>', 'test_2arg'
1304     LANGUAGE C;
1305 </programlisting>
1306     The names of the C functions here reflect one of many possible conventions.
1307    </para>
1308   </sect1>
1309
1310   <sect1 id="xfunc-volatility">
1311    <title>Function Volatility Categories</title>
1312
1313    <indexterm zone="xfunc-volatility">
1314     <primary>volatility</primary>
1315     <secondary>functions</secondary>
1316    </indexterm>
1317    <indexterm zone="xfunc-volatility">
1318     <primary>VOLATILE</primary>
1319    </indexterm>
1320    <indexterm zone="xfunc-volatility">
1321     <primary>STABLE</primary>
1322    </indexterm>
1323    <indexterm zone="xfunc-volatility">
1324     <primary>IMMUTABLE</primary>
1325    </indexterm>
1326
1327    <para>
1328     Every function has a <firstterm>volatility</> classification, with
1329     the possibilities being <literal>VOLATILE</>, <literal>STABLE</>, or
1330     <literal>IMMUTABLE</>.  <literal>VOLATILE</> is the default if the
1331     <xref linkend="sql-createfunction">
1332     command does not specify a category.  The volatility category is a
1333     promise to the optimizer about the behavior of the function:
1334
1335    <itemizedlist>
1336     <listitem>
1337      <para>
1338       A <literal>VOLATILE</> function can do anything, including modifying
1339       the database.  It can return different results on successive calls with
1340       the same arguments.  The optimizer makes no assumptions about the
1341       behavior of such functions.  A query using a volatile function will
1342       re-evaluate the function at every row where its value is needed.
1343      </para>
1344     </listitem>
1345     <listitem>
1346      <para>
1347       A <literal>STABLE</> function cannot modify the database and is
1348       guaranteed to return the same results given the same arguments
1349       for all rows within a single statement. This category allows the
1350       optimizer to optimize multiple calls of the function to a single
1351       call. In particular, it is safe to use an expression containing
1352       such a function in an index scan condition. (Since an index scan
1353       will evaluate the comparison value only once, not once at each
1354       row, it is not valid to use a <literal>VOLATILE</> function in an
1355       index scan condition.)
1356      </para>
1357     </listitem>
1358     <listitem>
1359      <para>
1360       An <literal>IMMUTABLE</> function cannot modify the database and is
1361       guaranteed to return the same results given the same arguments forever.
1362       This category allows the optimizer to pre-evaluate the function when
1363       a query calls it with constant arguments.  For example, a query like
1364       <literal>SELECT ... WHERE x = 2 + 2</> can be simplified on sight to
1365       <literal>SELECT ... WHERE x = 4</>, because the function underlying
1366       the integer addition operator is marked <literal>IMMUTABLE</>.
1367      </para>
1368     </listitem>
1369    </itemizedlist>
1370    </para>
1371
1372    <para>
1373     For best optimization results, you should label your functions with the
1374     strictest volatility category that is valid for them.
1375    </para>
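
   <para>
    For illustration, the following sketch labels three functions with the
    three categories.  The table <literal>visits</> and all function names
    are hypothetical and chosen only for this example:
<programlisting>
CREATE TABLE visits (page text);

-- Depends only on its arguments, so it can be pre-evaluated at plan time.
CREATE FUNCTION add_percent(numeric, numeric) RETURNS numeric AS $$
    SELECT $1 * (1 + $2 / 100);
$$ LANGUAGE SQL IMMUTABLE;

-- Reads a table without modifying it; its result is fixed within one statement.
CREATE FUNCTION visit_count(text) RETURNS bigint AS $$
    SELECT count(*) FROM visits WHERE page = $1;
$$ LANGUAGE SQL STABLE;

-- Modifies the database, so it must be VOLATILE (which is also the default).
CREATE FUNCTION record_visit(text) RETURNS void AS $$
    INSERT INTO visits (page) VALUES ($1);
$$ LANGUAGE SQL VOLATILE;
</programlisting>
   </para>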
1376
1377    <para>
1378     Any function with side-effects <emphasis>must</> be labeled
1379     <literal>VOLATILE</>, so that calls to it cannot be optimized away.
1380     Even a function with no side-effects needs to be labeled
1381     <literal>VOLATILE</> if its value can change within a single query;
1382     some examples are <literal>random()</>, <literal>currval()</>,
1383     <literal>timeofday()</>.
1384    </para>
1385
1386    <para>
1387     Another important example is that the <function>current_timestamp</>
1388     family of functions qualifies as <literal>STABLE</>, since their values do
1389     not change within a transaction.
1390    </para>
1391
1392    <para>
1393     There is relatively little difference between <literal>STABLE</> and
1394     <literal>IMMUTABLE</> categories when considering simple interactive
1395     queries that are planned and immediately executed: it doesn't matter
1396     a lot whether a function is executed once during planning or once during
1397     query execution startup.  But there is a big difference if the plan is
1398     saved and reused later.  Labeling a function <literal>IMMUTABLE</> when
1399     it really isn't might allow it to be prematurely folded to a constant during
1400     planning, resulting in a stale value being re-used during subsequent uses
1401     of the plan.  This is a hazard when using prepared statements or when
1402     using function languages that cache plans (such as
1403     <application>PL/pgSQL</>).
1404    </para>
1405
1406    <para>
1407     For functions written in SQL or in any of the standard procedural
1408     languages, there is a second important property determined by the
1409     volatility category, namely the visibility of any data changes that have
1410     been made by the SQL command that is calling the function.  A
1411     <literal>VOLATILE</> function will see such changes, whereas a <literal>STABLE</>
1412     or <literal>IMMUTABLE</> function will not.  This behavior is implemented
1413     using the snapshotting behavior of MVCC (see <xref linkend="mvcc">):
1414     <literal>STABLE</> and <literal>IMMUTABLE</> functions use a snapshot
1415     established as of the start of the calling query, whereas
1416     <literal>VOLATILE</> functions obtain a fresh snapshot at the start of
1417     each query they execute.
1418    </para>
1419
1420    <note>
1421     <para>
1422      Functions written in C can manage snapshots however they want, but it's
1423      usually a good idea to make C functions work this way too.
1424     </para>
1425    </note>
1426
1427    <para>
1428     Because of this snapshotting behavior,
1429     a function containing only <command>SELECT</> commands can safely be
1430     marked <literal>STABLE</>, even if it selects from tables that might be
1431     undergoing modifications by concurrent queries.
1432     <productname>PostgreSQL</productname> will execute all commands of a
1433     <literal>STABLE</> function using the snapshot established for the
1434     calling query, and so it will see a fixed view of the database throughout
1435     that query.
1436    </para>
1437
1438    <para>
1439     The same snapshotting behavior is used for <command>SELECT</> commands
1440     within <literal>IMMUTABLE</> functions.  It is generally unwise to select
1441     from database tables within an <literal>IMMUTABLE</> function at all,
1442     since the immutability will be broken if the table contents ever change.
1443     However, <productname>PostgreSQL</productname> does not prevent you
1444     from doing so.
1445    </para>
1446
1447    <para>
1448     A common error is to label a function <literal>IMMUTABLE</> when its
1449     results depend on a configuration parameter.  For example, a function
1450     that manipulates timestamps might well have results that depend on the
1451     <xref linkend="guc-timezone"> setting.  For safety, such functions should
1452     be labeled <literal>STABLE</> instead.
1453    </para>
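
   <para>
    A small sketch of this case, using a hypothetical function name: the
    hour extracted from a <type>timestamp with time zone</> value depends
    on the <xref linkend="guc-timezone"> setting in effect, so the function
    is labeled <literal>STABLE</> rather than <literal>IMMUTABLE</>:
<programlisting>
CREATE FUNCTION local_hour(timestamp with time zone)
RETURNS double precision AS $$
    -- the same instant yields a different hour under a different TimeZone
    SELECT date_part('hour', $1);
$$ LANGUAGE SQL STABLE;
</programlisting>
   </para>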
1454
1455    <note>
1456     <para>
1457      <productname>PostgreSQL</productname> requires that <literal>STABLE</>
1458      and <literal>IMMUTABLE</> functions contain no SQL commands other
1459      than <command>SELECT</> to prevent data modification.
1460      (This is not a completely bulletproof test, since such functions could
1461      still call <literal>VOLATILE</> functions that modify the database.
1462      If you do that, you will find that the <literal>STABLE</> or
1463      <literal>IMMUTABLE</> function does not notice the database changes
1464      applied by the called function, since they are hidden from its snapshot.)
1465     </para>
1466    </note>
1467   </sect1>
1468
1469   <sect1 id="xfunc-pl">
1470    <title>Procedural Language Functions</title>
1471
1472    <para>
1473     <productname>PostgreSQL</productname> allows user-defined functions
1474     to be written in other languages besides SQL and C.  These other
1475     languages are generically called <firstterm>procedural
1476     languages</firstterm> (<acronym>PL</>s).
1477     Procedural languages aren't built into the
1478     <productname>PostgreSQL</productname> server; they are offered
1479     by loadable modules.
1480     See <xref linkend="xplang"> and following chapters for more
1481     information.
1482    </para>
1483   </sect1>
1484
1485   <sect1 id="xfunc-internal">
1486    <title>Internal Functions</title>
1487
1488    <indexterm zone="xfunc-internal"><primary>function</><secondary>internal</></>
1489
1490    <para>
1491     Internal functions are functions written in C that have been statically
1492     linked into the <productname>PostgreSQL</productname> server.
1493     The <quote>body</quote> of the function definition
1494     specifies the C-language name of the function, which need not be the
1495     same as the name being declared for SQL use.
1496     (For reasons of backward compatibility, an empty body
1497     is accepted as meaning that the C-language function name is the
1498     same as the SQL name.)
1499    </para>
1500
1501    <para>
1502     Normally, all internal functions present in the
1503     server are declared during the initialization of the database cluster
1504     (see <xref linkend="creating-cluster">),
1505     but a user could use <command>CREATE FUNCTION</command>
1506     to create additional alias names for an internal function.
1507     Internal functions are declared in <command>CREATE FUNCTION</command>
1508     with language name <literal>internal</literal>.  For instance, to
1509     create an alias for the <function>sqrt</function> function:
1510 <programlisting>
1511 CREATE FUNCTION square_root(double precision) RETURNS double precision
1512     AS 'dsqrt'
1513     LANGUAGE internal
1514     STRICT;
1515 </programlisting>
1516     (Most internal functions expect to be declared <quote>strict</quote>.)
1517    </para>
1518
1519    <note>
1520     <para>
1521      Not all <quote>predefined</quote> functions are
1522      <quote>internal</quote> in the above sense.  Some predefined
1523      functions are written in SQL.
1524     </para>
1525    </note>
1526   </sect1>
1527
1528   <sect1 id="xfunc-c">
1529    <title>C-Language Functions</title>
1530
1531    <indexterm zone="xfunc-c">
1532     <primary>function</primary>
1533     <secondary>user-defined</secondary>
1534     <tertiary>in C</tertiary>
1535    </indexterm>
1536
1537    <para>
1538     User-defined functions can be written in C (or a language that can
1539     be made compatible with C, such as C++).  Such functions are
1540     compiled into dynamically loadable objects (also called shared
1541     libraries) and are loaded by the server on demand.  The dynamic
1542     loading feature is what distinguishes <quote>C language</> functions
1543     from <quote>internal</> functions &mdash; the actual coding conventions
1544     are essentially the same for both.  (Hence, the standard internal
1545     function library is a rich source of coding examples for user-defined
1546     C functions.)
1547    </para>
1548
1549    <para>
1550     Two different calling conventions are currently used for C functions.
1551     The newer <quote>version 1</quote> calling convention is indicated by writing
1552     a <literal>PG_FUNCTION_INFO_V1()</literal> macro call for the function,
1553     as illustrated below.  Lack of such a macro indicates an old-style
1554     (<quote>version 0</quote>) function.  The language name specified in <command>CREATE FUNCTION</command>
1555     is <literal>C</literal> in either case.  Old-style functions are now deprecated
1556     because of portability problems and lack of functionality, but they
1557     are still supported for compatibility reasons.
1558    </para>
1559
1560   <sect2 id="xfunc-c-dynload">
1561    <title>Dynamic Loading</title>
1562
1563    <indexterm zone="xfunc-c-dynload">
1564     <primary>dynamic loading</primary>
1565    </indexterm>
1566
1567    <para>
1568     The first time a user-defined function in a particular
1569     loadable object file is called in a session,
1570     the dynamic loader loads that object file into memory so that the
1571     function can be called.  The <command>CREATE FUNCTION</command>
1572     for a user-defined C function must therefore specify two pieces of
1573     information for the function: the name of the loadable
1574     object file, and the C name (link symbol) of the specific function to call
1575     within that object file.  If the C name is not explicitly specified then
1576     it is assumed to be the same as the SQL function name.
1577    </para>
1578
1579    <para>
1580     The following algorithm is used to locate the shared object file
1581     based on the name given in the <command>CREATE FUNCTION</command>
1582     command:
1583
1584     <orderedlist>
1585      <listitem>
1586       <para>
1587        If the name is an absolute path, the given file is loaded.
1588       </para>
1589      </listitem>
1590
1591      <listitem>
1592       <para>
1593        If the name starts with the string <literal>$libdir</literal>,
1594        that part is replaced by the <productname>PostgreSQL</> package
1595        library directory
1596        name, which is determined at build time.<indexterm><primary>$libdir</></>
1597       </para>
1598      </listitem>
1599
1600      <listitem>
1601       <para>
1602        If the name does not contain a directory part, the file is
1603        searched for in the path specified by the configuration variable
1604        <xref linkend="guc-dynamic-library-path">.<indexterm><primary>dynamic_library_path</></>
1605       </para>
1606      </listitem>
1607
1608      <listitem>
1609       <para>
1610        Otherwise (the file was not found in the path, or it contains a
1611        non-absolute directory part), the dynamic loader will try to
1612        take the name as given, which will most likely fail.  (It is
1613        unreliable to depend on the current working directory.)
1614       </para>
1615      </listitem>
1616     </orderedlist>
1617
1618     If this sequence does not work, the platform-specific shared
1619     library file name extension (often <filename>.so</filename>) is
1620     appended to the given name and this sequence is tried again.  If
1621     that fails as well, the load will fail.
1622    </para>
1623
1624    <para>
1625     It is recommended to locate shared libraries either relative to
1626     <literal>$libdir</literal> or through the dynamic library path.
1627     This simplifies version upgrades if the new installation is at a
1628     different location.  The actual directory that
1629     <literal>$libdir</literal> stands for can be found out with the
1630     command <literal>pg_config --pkglibdir</literal>.
1631    </para>
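
   <para>
    As a sketch, assume a hypothetical module file
    <filename>my_module.so</> installed in the package library directory
    and exporting a C function <function>my_add</>.  Either of the
    following alternative definitions would locate it without an absolute
    path (only one of them can be created, since both declare the same
    SQL function):
<programlisting>
-- relative to the package library directory
CREATE FUNCTION my_add(int, int) RETURNS int
    AS '$libdir/my_module', 'my_add'
    LANGUAGE C STRICT;

-- or: no directory part, so the file is searched for in dynamic_library_path
CREATE FUNCTION my_add(int, int) RETURNS int
    AS 'my_module', 'my_add'
    LANGUAGE C STRICT;
</programlisting>
    The file name extension is omitted in both forms, relying on the
    platform-specific extension being appended as described above.
   </para>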
1632
1633    <para>
1634     The user ID the <productname>PostgreSQL</productname> server runs
1635     as must be able to traverse the path to the file you intend to
1636     load.  Making the file or a higher-level directory not readable
1637     and/or not executable by the <systemitem>postgres</systemitem>
1638     user is a common mistake.
1639    </para>
1640
1641    <para>
1642     In any case, the file name that is given in the
1643     <command>CREATE FUNCTION</command> command is recorded literally
1644     in the system catalogs, so if the file needs to be loaded again
1645     the same procedure is applied.
1646    </para>
1647
1648    <note>
1649     <para>
1650      <productname>PostgreSQL</productname> will not compile a C function
1651      automatically.  The object file must be compiled before it is referenced
1652      in a <command>CREATE
1653      FUNCTION</> command.  See <xref linkend="dfunc"> for additional
1654      information.
1655     </para>
1656    </note>
1657
1658    <indexterm zone="xfunc-c-dynload">
1659     <primary>magic block</primary>
1660    </indexterm>
1661
1662    <para>
1663     To ensure that a dynamically loaded object file is not loaded into an
1664     incompatible server, <productname>PostgreSQL</productname> checks that the
1665     file contains a <quote>magic block</> with the appropriate contents.
1666     This allows the server to detect obvious incompatibilities, such as code
1667     compiled for a different major version of
1668     <productname>PostgreSQL</productname>.  A magic block is required as of
1669     <productname>PostgreSQL</productname> 8.2.  To include a magic block,
1670     write this in one (and only one) of the module source files, after having
1671     included the header <filename>fmgr.h</>:
1672
<programlisting>
#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif
</programlisting>
1678
1679     The <literal>#ifdef</> test can be omitted if the code doesn't
1680     need to compile against pre-8.2 <productname>PostgreSQL</productname>
1681     releases.
1682    </para>
1683
1684    <para>
1685     After it is used for the first time, a dynamically loaded object
1686     file is retained in memory.  Future calls in the same session to
1687     the function(s) in that file will only incur the small overhead of
1688     a symbol table lookup.  If you need to force a reload of an object
1689     file, for example after recompiling it, begin a fresh session.
1690    </para>
1691
1692    <indexterm zone="xfunc-c-dynload">
1693     <primary>_PG_init</primary>
1694    </indexterm>
1695    <indexterm zone="xfunc-c-dynload">
1696     <primary>_PG_fini</primary>
1697    </indexterm>
1698    <indexterm zone="xfunc-c-dynload">
1699     <primary>library initialization function</primary>
1700    </indexterm>
1701    <indexterm zone="xfunc-c-dynload">
1702     <primary>library finalization function</primary>
1703    </indexterm>
1704
1705    <para>
1706     Optionally, a dynamically loaded file can contain initialization and
1707     finalization functions.  If the file includes a function named
1708     <function>_PG_init</>, that function will be called immediately after
1709     loading the file.  The function receives no parameters and should
1710     return void.  If the file includes a function named
1711     <function>_PG_fini</>, that function will be called immediately before
1712     unloading the file.  Likewise, the function receives no parameters and
1713     should return void.  Note that <function>_PG_fini</> will only be called
1714     during an unload of the file, not during process termination.
1715     (Presently, unloads are disabled and will never occur, but this may
1716     change in the future.)
1717    </para>
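
   <para>
    As a minimal sketch of these hooks, the following module logs a message
    when it is loaded.  The message texts and the choice of the
    <literal>LOG</literal> level are illustrative assumptions, not
    requirements.  The prototypes are written out because both functions
    must be externally visible symbols.

<programlisting><![CDATA[
#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

/* Both hooks take no arguments and return void. */
void _PG_init(void);
void _PG_fini(void);

/* Called once, immediately after the library has been loaded. */
void
_PG_init(void)
{
    elog(LOG, "example module loaded");
}

/*
 * Called only when the library is unloaded, never at process exit,
 * so do not rely on it for cleanup that must always happen.
 */
void
_PG_fini(void)
{
    elog(LOG, "example module unloaded");
}
]]>
</programlisting>
   </para>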
1718
1719   </sect2>
1720
1721    <sect2 id="xfunc-c-basetype">
1722     <title>Base Types in C-Language Functions</title>
1723
1724     <indexterm zone="xfunc-c-basetype">
1725      <primary>data type</primary>
1726      <secondary>internal organization</secondary>
1727     </indexterm>
1728
1729     <para>
1730      To know how to write C-language functions, you need to know how
1731      <productname>PostgreSQL</productname> internally represents base
1732      data types and how they can be passed to and from functions.
1733      Internally, <productname>PostgreSQL</productname> regards a base
1734      type as a <quote>blob of memory</quote>.  The user-defined
1735      functions that you define over a type in turn define the way that
1736      <productname>PostgreSQL</productname> can operate on it.  That
1737      is, <productname>PostgreSQL</productname> will only store and
1738      retrieve the data from disk and use your user-defined functions
1739      to input, process, and output the data.
1740     </para>
1741
1742     <para>
1743      Base types can have one of three internal formats:
1744
1745      <itemizedlist>
1746       <listitem>
1747        <para>
1748         pass by value, fixed-length
1749        </para>
1750       </listitem>
1751       <listitem>
1752        <para>
1753         pass by reference, fixed-length
1754        </para>
1755       </listitem>
1756       <listitem>
1757        <para>
1758         pass by reference, variable-length
1759        </para>
1760       </listitem>
1761      </itemizedlist>
1762     </para>
1763
1764     <para>
     By-value types can only be 1, 2, or 4 bytes in length
     (also 8 bytes, if <literal>sizeof(Datum)</literal> is 8 on your machine).
     You should be careful to define your types such that they will be the
     same size (in bytes) on all architectures.  For example, the
     <literal>long</literal> type is dangerous because it is 4 bytes on some
     machines and 8 bytes on others, whereas the <type>int</type> type is
     4 bytes on most Unix machines.  A reasonable implementation of the
     <type>int4</type> type on Unix machines might be:
1773
<programlisting>
/* 4-byte integer, passed by value */
typedef int int4;
</programlisting>
1778
1779      (The actual PostgreSQL C code calls this type <type>int32</type>, because
1780      it is a convention in C that <type>int<replaceable>XX</replaceable></type>
1781      means <replaceable>XX</replaceable> <emphasis>bits</emphasis>.  Note
1782      therefore also that the C type <type>int8</type> is 1 byte in size.  The
1783      SQL type <type>int8</type> is called <type>int64</type> in C.  See also
1784      <xref linkend="xfunc-c-type-table">.)
1785     </para>
1786
1787     <para>
     On the other hand, fixed-length types of any size can
     be passed by reference.  For example, here is a sample
     implementation of a <productname>PostgreSQL</productname> type:

<programlisting>
/* 16-byte structure, passed by reference */
typedef struct
{
    double  x, y;
} Point;
</programlisting>

     Only pointers to such types can be used when passing
     them in and out of <productname>PostgreSQL</productname> functions.
     To return a value of such a type, allocate the right amount of
     memory with <literal>palloc</literal>, fill in the allocated memory,
     and return a pointer to it.  (Also, if you just want to return the
     same value as one of your input arguments that's of the same data type,
     you can skip the extra <literal>palloc</literal> and just return the
     pointer to the input value.)
1808     </para>
1809
1810     <para>
     Finally, all variable-length types must also be passed
     by reference.  All variable-length types must begin
     with an opaque length field of exactly 4 bytes, which will be set
     by <symbol>SET_VARSIZE</symbol>; never set this field directly!  All data to
     be stored within that type must be located in the memory
     immediately following that length field.  The
     length field contains the total length of the structure,
     that is, it includes the size of the length field
     itself.
1820     </para>
1821
1822     <para>
1823      Another important point is to avoid leaving any uninitialized bits
1824      within data type values; for example, take care to zero out any
1825      alignment padding bytes that might be present in structs.  Without
1826      this, logically-equivalent constants of your data type might be
1827      seen as unequal by the planner, leading to inefficient (though not
1828      incorrect) plans.
1829     </para>
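
    <para>
     One way to follow this advice is to allocate values with
     <function>palloc0</function>, which returns zero-filled memory, so
     that any padding bytes start out cleared.  The sketch below assumes a
     hypothetical fixed-length type <structname>TaggedValue</structname>
     and a hypothetical helper <function>make_tagged_value</function>;
     neither is part of <productname>PostgreSQL</productname>.

<programlisting><![CDATA[
#include "postgres.h"

/*
 * Hypothetical type, for illustration only.  On most platforms the
 * compiler inserts padding bytes between "tag" and "value" so that
 * the double is properly aligned.
 */
typedef struct
{
    char    tag;
    double  value;
} TaggedValue;

TaggedValue *make_tagged_value(char tag, double value);

TaggedValue *
make_tagged_value(char tag, double value)
{
    /* palloc0 zeroes the whole allocation, padding bytes included. */
    TaggedValue *result = (TaggedValue *) palloc0(sizeof(TaggedValue));

    result->tag = tag;
    result->value = value;

    return result;
}
]]>
</programlisting>

     With plain <function>palloc</function> the padding bytes would hold
     whatever happened to be in memory, so two values with the same
     <structfield>tag</structfield> and <structfield>value</structfield>
     could still differ byte for byte.
    </para>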
1830
1831     <warning>
1832      <para>
1833       <emphasis>Never</> modify the contents of a pass-by-reference input
1834       value.  If you do so you are likely to corrupt on-disk data, since
1835       the pointer you are given might point directly into a disk buffer.
1836       The sole exception to this rule is explained in
1837       <xref linkend="xaggr">.
1838      </para>
1839     </warning>
1840
1841     <para>
1842      As an example, we can define the type <type>text</type> as
1843      follows:
1844
<programlisting>
typedef struct {
    int32 length;
    char data[FLEXIBLE_ARRAY_MEMBER];
} text;
</programlisting>
1851
1852      The <literal>[FLEXIBLE_ARRAY_MEMBER]</> notation means that the actual
1853      length of the data part is not specified by this declaration.
1854     </para>
1855
1856     <para>
     When manipulating
     variable-length types, we must be careful to allocate
     the correct amount of memory and set the length field correctly.
     For example, if we wanted to store 40 bytes in a <structname>text</>
     structure, we might use a code fragment like this:

<programlisting><![CDATA[
#include "postgres.h"
...
char buffer[40]; /* our source data */
...
text *destination = (text *) palloc(VARHDRSZ + 40);
SET_VARSIZE(destination, VARHDRSZ + 40);
memcpy(destination->data, buffer, 40);
...
]]>
</programlisting>
1874
1875      <literal>VARHDRSZ</> is the same as <literal>sizeof(int32)</>, but
1876      it's considered good style to use the macro <literal>VARHDRSZ</>
1877      to refer to the size of the overhead for a variable-length type.
1878      Also, the length field <emphasis>must</> be set using the
1879      <literal>SET_VARSIZE</> macro, not by simple assignment.
1880     </para>
1881
1882     <para>
1883      <xref linkend="xfunc-c-type-table"> specifies which C type
1884      corresponds to which SQL type when writing a C-language function
1885      that uses a built-in type of <productname>PostgreSQL</>.
1886      The <quote>Defined In</quote> column gives the header file that
1887      needs to be included to get the type definition.  (The actual
1888      definition might be in a different file that is included by the
1889      listed file.  It is recommended that users stick to the defined
1890      interface.)  Note that you should always include
1891      <filename>postgres.h</filename> first in any source file, because
1892      it declares a number of things that you will need anyway.
1893     </para>
1894
1895      <table tocentry="1" id="xfunc-c-type-table">
1896       <title>Equivalent C Types for Built-in SQL Types</title>
1897       <tgroup cols="3">
1898        <thead>
1899         <row>
1900          <entry>
1901           SQL Type
1902          </entry>
1903          <entry>
1904           C Type
1905          </entry>
1906          <entry>
1907           Defined In
1908          </entry>
1909         </row>
1910        </thead>
1911        <tbody>
1912         <row>
1913          <entry><type>abstime</type></entry>
1914          <entry><type>AbsoluteTime</type></entry>
1915          <entry><filename>utils/nabstime.h</filename></entry>
1916         </row>
1917         <row>
1918          <entry><type>bigint</type> (<type>int8</type>)</entry>
1919          <entry><type>int64</type></entry>
1920          <entry><filename>postgres.h</filename></entry>
1921         </row>
1922         <row>
1923          <entry><type>boolean</type></entry>
1924          <entry><type>bool</type></entry>
1925          <entry><filename>postgres.h</filename> (maybe compiler built-in)</entry>
1926         </row>
1927         <row>
1928          <entry><type>box</type></entry>
1929          <entry><type>BOX*</type></entry>
1930          <entry><filename>utils/geo_decls.h</filename></entry>
1931         </row>
1932         <row>
1933          <entry><type>bytea</type></entry>
1934          <entry><type>bytea*</type></entry>
1935          <entry><filename>postgres.h</filename></entry>
1936         </row>
1937         <row>
1938          <entry><type>"char"</type></entry>
1939          <entry><type>char</type></entry>
1940          <entry>(compiler built-in)</entry>
1941         </row>
1942         <row>
1943          <entry><type>character</type></entry>
1944          <entry><type>BpChar*</type></entry>
1945          <entry><filename>postgres.h</filename></entry>
1946         </row>
1947         <row>
1948          <entry><type>cid</type></entry>
1949          <entry><type>CommandId</type></entry>
1950          <entry><filename>postgres.h</filename></entry>
1951         </row>
1952         <row>
1953          <entry><type>date</type></entry>
1954          <entry><type>DateADT</type></entry>
1955          <entry><filename>utils/date.h</filename></entry>
1956         </row>
1957         <row>
1958          <entry><type>smallint</type> (<type>int2</type>)</entry>
1959          <entry><type>int16</type></entry>
1960          <entry><filename>postgres.h</filename></entry>
1961         </row>
1962         <row>
1963          <entry><type>int2vector</type></entry>
1964          <entry><type>int2vector*</type></entry>
1965          <entry><filename>postgres.h</filename></entry>
1966         </row>
1967         <row>
1968          <entry><type>integer</type> (<type>int4</type>)</entry>
1969          <entry><type>int32</type></entry>
1970          <entry><filename>postgres.h</filename></entry>
1971         </row>
1972         <row>
1973          <entry><type>real</type> (<type>float4</type>)</entry>
1974          <entry><type>float4*</type></entry>
1975         <entry><filename>postgres.h</filename></entry>
1976         </row>
1977         <row>
1978          <entry><type>double precision</type> (<type>float8</type>)</entry>
1979          <entry><type>float8*</type></entry>
1980          <entry><filename>postgres.h</filename></entry>
1981         </row>
1982         <row>
1983          <entry><type>interval</type></entry>
1984          <entry><type>Interval*</type></entry>
1985          <entry><filename>datatype/timestamp.h</filename></entry>
1986         </row>
1987         <row>
1988          <entry><type>lseg</type></entry>
1989          <entry><type>LSEG*</type></entry>
1990          <entry><filename>utils/geo_decls.h</filename></entry>
1991         </row>
1992         <row>
1993          <entry><type>name</type></entry>
1994          <entry><type>Name</type></entry>
1995          <entry><filename>postgres.h</filename></entry>
1996         </row>
1997         <row>
1998          <entry><type>oid</type></entry>
1999          <entry><type>Oid</type></entry>
2000          <entry><filename>postgres.h</filename></entry>
2001         </row>
2002         <row>
2003          <entry><type>oidvector</type></entry>
2004          <entry><type>oidvector*</type></entry>
2005          <entry><filename>postgres.h</filename></entry>
2006         </row>
2007         <row>
2008          <entry><type>path</type></entry>
2009          <entry><type>PATH*</type></entry>
2010          <entry><filename>utils/geo_decls.h</filename></entry>
2011         </row>
2012         <row>
2013          <entry><type>point</type></entry>
2014          <entry><type>POINT*</type></entry>
2015          <entry><filename>utils/geo_decls.h</filename></entry>
2016         </row>
2017         <row>
2018          <entry><type>regproc</type></entry>
2019          <entry><type>regproc</type></entry>
2020          <entry><filename>postgres.h</filename></entry>
2021         </row>
2022         <row>
2023          <entry><type>reltime</type></entry>
2024          <entry><type>RelativeTime</type></entry>
2025          <entry><filename>utils/nabstime.h</filename></entry>
2026         </row>
2027         <row>
2028          <entry><type>text</type></entry>
2029          <entry><type>text*</type></entry>
2030          <entry><filename>postgres.h</filename></entry>
2031         </row>
2032         <row>
2033          <entry><type>tid</type></entry>
2034          <entry><type>ItemPointer</type></entry>
2035          <entry><filename>storage/itemptr.h</filename></entry>
2036         </row>
2037         <row>
2038          <entry><type>time</type></entry>
2039          <entry><type>TimeADT</type></entry>
2040          <entry><filename>utils/date.h</filename></entry>
2041         </row>
2042         <row>
2043          <entry><type>time with time zone</type></entry>
2044          <entry><type>TimeTzADT</type></entry>
2045          <entry><filename>utils/date.h</filename></entry>
2046         </row>
2047         <row>
2048          <entry><type>timestamp</type></entry>
2049          <entry><type>Timestamp*</type></entry>
2050          <entry><filename>datatype/timestamp.h</filename></entry>
2051         </row>
2052         <row>
2053          <entry><type>tinterval</type></entry>
2054          <entry><type>TimeInterval</type></entry>
2055          <entry><filename>utils/nabstime.h</filename></entry>
2056         </row>
2057         <row>
2058          <entry><type>varchar</type></entry>
2059          <entry><type>VarChar*</type></entry>
2060          <entry><filename>postgres.h</filename></entry>
2061         </row>
2062         <row>
2063          <entry><type>xid</type></entry>
2064          <entry><type>TransactionId</type></entry>
2065          <entry><filename>postgres.h</filename></entry>
2066         </row>
2067        </tbody>
2068       </tgroup>
2069      </table>
2070
2071     <para>
2072      Now that we've gone over all of the possible structures
2073      for base types, we can show some examples of real functions.
2074     </para>
2075    </sect2>
2076
2077    <sect2>
2078     <title>Version 0 Calling Conventions</title>
2079
2080     <para>
     We present the <quote>old style</quote> calling convention first.  Although
     this approach is now deprecated, it's easier to get a handle on
     initially.  In the version-0 method, the arguments and result
     of the C function are just declared in normal C style, taking
     care to use the C representation of each SQL data type as shown
     above.
2087     </para>
2088
2089     <para>
2090      Here are some examples:
2091
<programlisting><![CDATA[
#include "postgres.h"
#include <string.h>
#include "utils/geo_decls.h"

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

/* by value */

int
add_one(int arg)
{
    return arg + 1;
}

/* by reference, fixed length */

float8 *
add_one_float8(float8 *arg)
{
    float8    *result = (float8 *) palloc(sizeof(float8));

    *result = *arg + 1.0;

    return result;
}

Point *
makepoint(Point *pointx, Point *pointy)
{
    Point     *new_point = (Point *) palloc(sizeof(Point));

    new_point->x = pointx->x;
    new_point->y = pointy->y;

    return new_point;
}

/* by reference, variable length */

text *
copytext(text *t)
{
    /*
     * VARSIZE is the total size of the struct in bytes.
     */
    text *new_t = (text *) palloc(VARSIZE(t));
    SET_VARSIZE(new_t, VARSIZE(t));
    /*
     * VARDATA is a pointer to the data region of the struct.
     */
    memcpy((void *) VARDATA(new_t), /* destination */
           (void *) VARDATA(t),     /* source */
           VARSIZE(t) - VARHDRSZ);  /* how many bytes */
    return new_t;
}

text *
concat_text(text *arg1, text *arg2)
{
    int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
    text *new_text = (text *) palloc(new_text_size);

    SET_VARSIZE(new_text, new_text_size);
    memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1) - VARHDRSZ);
    memcpy(VARDATA(new_text) + (VARSIZE(arg1) - VARHDRSZ),
           VARDATA(arg2), VARSIZE(arg2) - VARHDRSZ);
    return new_text;
}
]]>
</programlisting>
2165     </para>
2166
2167     <para>
2168      Supposing that the above code has been prepared in file
2169      <filename>funcs.c</filename> and compiled into a shared object,
2170      we could define the functions to <productname>PostgreSQL</productname>
2171      with commands like this:
2172
<programlisting>
CREATE FUNCTION add_one(integer) RETURNS integer
     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one'
     LANGUAGE C STRICT;

-- note overloading of SQL function name "add_one"
CREATE FUNCTION add_one(double precision) RETURNS double precision
     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one_float8'
     LANGUAGE C STRICT;

CREATE FUNCTION makepoint(point, point) RETURNS point
     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'makepoint'
     LANGUAGE C STRICT;

CREATE FUNCTION copytext(text) RETURNS text
     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'copytext'
     LANGUAGE C STRICT;

CREATE FUNCTION concat_text(text, text) RETURNS text
     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'concat_text'
     LANGUAGE C STRICT;
</programlisting>
2195     </para>
2196
2197     <para>
2198      Here, <replaceable>DIRECTORY</replaceable> stands for the
2199      directory of the shared library file (for instance the
2200      <productname>PostgreSQL</productname> tutorial directory, which
2201      contains the code for the examples used in this section).
2202      (Better style would be to use just <literal>'funcs'</> in the
2203      <literal>AS</> clause, after having added
2204      <replaceable>DIRECTORY</replaceable> to the search path.  In any
2205      case, we can omit the system-specific extension for a shared
2206      library, commonly <literal>.so</literal> or
2207      <literal>.sl</literal>.)
2208     </para>
2209
2210     <para>
2211      Notice that we have specified the functions as <quote>strict</quote>,
2212      meaning that
2213      the system should automatically assume a null result if any input
2214      value is null.  By doing this, we avoid having to check for null inputs
2215      in the function code.  Without this, we'd have to check for null values
2216      explicitly, by checking for a null pointer for each
2217      pass-by-reference argument.  (For pass-by-value arguments, we don't
2218      even have a way to check!)
2219     </para>
2220
2221     <para>
2222      Although this calling convention is simple to use,
2223      it is not very portable; on some architectures there are problems
2224      with passing data types that are smaller than <type>int</type> this way.  Also, there is
2225      no simple way to return a null result, nor to cope with null arguments
2226      in any way other than making the function strict.  The version-1
2227      convention, presented next, overcomes these objections.
2228     </para>
2229    </sect2>
2230
2231    <sect2>
2232     <title>Version 1 Calling Conventions</title>
2233
2234     <para>
2235      The version-1 calling convention relies on macros to suppress most
2236      of the complexity of passing arguments and results.  The C declaration
2237      of a version-1 function is always:
<programlisting>
Datum funcname(PG_FUNCTION_ARGS)
</programlisting>
     In addition, the macro call:
<programlisting>
PG_FUNCTION_INFO_V1(funcname);
</programlisting>
2245      must appear in the same source file.  (Conventionally, it's
2246      written just before the function itself.)  This macro call is not
2247      needed for <literal>internal</>-language functions, since
2248      <productname>PostgreSQL</> assumes that all internal functions
2249      use the version-1 convention.  It is, however, required for
2250      dynamically-loaded functions.
2251     </para>
2252
2253     <para>
2254      In a version-1 function, each actual argument is fetched using a
2255      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2256      macro that corresponds to the argument's data type, and the
2257      result is returned using a
2258      <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2259      macro for the return type.
2260      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2261      takes as its argument the number of the function argument to
2262      fetch, where the count starts at 0.
2263      <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2264      takes as its argument the actual value to return.
2265     </para>
2266
2267     <para>
2268      Here we show the same functions as above, coded in version-1 style:
2269
<programlisting><![CDATA[
#include "postgres.h"
#include <string.h>
#include "fmgr.h"
#include "utils/geo_decls.h"

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

/* by value */

PG_FUNCTION_INFO_V1(add_one);

Datum
add_one(PG_FUNCTION_ARGS)
{
    int32   arg = PG_GETARG_INT32(0);

    PG_RETURN_INT32(arg + 1);
}

/* by reference, fixed length */

PG_FUNCTION_INFO_V1(add_one_float8);

Datum
add_one_float8(PG_FUNCTION_ARGS)
{
    /* The macros for FLOAT8 hide its pass-by-reference nature. */
    float8   arg = PG_GETARG_FLOAT8(0);

    PG_RETURN_FLOAT8(arg + 1.0);
}

PG_FUNCTION_INFO_V1(makepoint);

Datum
makepoint(PG_FUNCTION_ARGS)
{
    /* Here, the pass-by-reference nature of Point is not hidden. */
    Point     *pointx = PG_GETARG_POINT_P(0);
    Point     *pointy = PG_GETARG_POINT_P(1);
    Point     *new_point = (Point *) palloc(sizeof(Point));

    new_point->x = pointx->x;
    new_point->y = pointy->y;

    PG_RETURN_POINT_P(new_point);
}

/* by reference, variable length */

PG_FUNCTION_INFO_V1(copytext);

Datum
copytext(PG_FUNCTION_ARGS)
{
    text     *t = PG_GETARG_TEXT_P(0);
    /*
     * VARSIZE is the total size of the struct in bytes.
     */
    text     *new_t = (text *) palloc(VARSIZE(t));
    SET_VARSIZE(new_t, VARSIZE(t));
    /*
     * VARDATA is a pointer to the data region of the struct.
     */
    memcpy((void *) VARDATA(new_t), /* destination */
           (void *) VARDATA(t),     /* source */
           VARSIZE(t) - VARHDRSZ);  /* how many bytes */
    PG_RETURN_TEXT_P(new_t);
}

PG_FUNCTION_INFO_V1(concat_text);

Datum
concat_text(PG_FUNCTION_ARGS)
{
    text  *arg1 = PG_GETARG_TEXT_P(0);
    text  *arg2 = PG_GETARG_TEXT_P(1);
    int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
    text *new_text = (text *) palloc(new_text_size);

    SET_VARSIZE(new_text, new_text_size);
    memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1) - VARHDRSZ);
    memcpy(VARDATA(new_text) + (VARSIZE(arg1) - VARHDRSZ),
           VARDATA(arg2), VARSIZE(arg2) - VARHDRSZ);
    PG_RETURN_TEXT_P(new_text);
}
]]>
</programlisting>
2361     </para>
2362
2363     <para>
2364      The <command>CREATE FUNCTION</command> commands are the same as
2365      for the version-0 equivalents.
2366     </para>
2367
2368     <para>
2369      At first glance, the version-1 coding conventions might appear to
2370      be just pointless obscurantism.  They do, however, offer a number
2371      of improvements, because the macros can hide unnecessary detail.
2372      An example is that in coding <function>add_one_float8</>, we no longer need to
2373      be aware that <type>float8</type> is a pass-by-reference type.  Another
2374      example is that the <literal>GETARG</> macros for variable-length types allow
2375      for more efficient fetching of <quote>toasted</quote> (compressed or
2376      out-of-line) values.
2377     </para>
2378
2379     <para>
2380      One big improvement in version-1 functions is better handling of null
2381      inputs and results.  The macro <function>PG_ARGISNULL(<replaceable>n</>)</function>
2382      allows a function to test whether each input is null.  (Of course, doing
2383      this is only necessary in functions not declared <quote>strict</>.)
2384      As with the
2385      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> macros,
2386      the input arguments are counted beginning at zero.  Note that one
2387      should refrain from executing
2388      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> until
2389      one has verified that the argument isn't null.
2390      To return a null result, execute <function>PG_RETURN_NULL()</function>;
2391      this works in both strict and nonstrict functions.
2392     </para>
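
    <para>
     As a sketch of null handling in a nonstrict function, the following
     returns the first of its two <type>integer</type> arguments that is
     not null, or null if both are.  The function name
     <function>first_non_null_int4</function> is hypothetical, and the
     code assumes it is compiled into a module that already contains a
     magic block.

<programlisting><![CDATA[
#include "postgres.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(first_non_null_int4);

Datum
first_non_null_int4(PG_FUNCTION_ARGS)
{
    /* Test each argument for null before fetching its value. */
    if (!PG_ARGISNULL(0))
        PG_RETURN_INT32(PG_GETARG_INT32(0));

    if (!PG_ARGISNULL(1))
        PG_RETURN_INT32(PG_GETARG_INT32(1));

    /* Both inputs are null, so the result is null as well. */
    PG_RETURN_NULL();
}
]]>
</programlisting>

     For the null checks to be reached at all, the corresponding
     <command>CREATE FUNCTION</command> command must omit
     <literal>STRICT</literal>; otherwise the system returns null without
     calling the function whenever an input is null.
    </para>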
2393
2394     <para>
2395      Other options provided in the new-style interface are two
2396      variants of the
2397      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2398      macros. The first of these,
2399      <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>,
2400      guarantees to return a copy of the specified argument that is
2401      safe for writing into. (The normal macros will sometimes return a
2402      pointer to a value that is physically stored in a table, which
2403      must not be written to. Using the
2404      <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>
2405      macros guarantees a writable result.)
    The second variant consists of the
    <function>PG_GETARG_<replaceable>xxx</replaceable>_SLICE()</function>
    macros, which take three arguments. The first is the number of the
2409     function argument (as above). The second and third are the offset and
2410     length of the segment to be returned. Offsets are counted from
2411     zero, and a negative length requests that the remainder of the
2412     value be returned. These macros provide more efficient access to
2413     parts of large values in the case where they have storage type
2414     <quote>external</quote>. (The storage type of a column can be specified using
2415     <literal>ALTER TABLE <replaceable>tablename</replaceable> ALTER
2416     COLUMN <replaceable>colname</replaceable> SET STORAGE
2417     <replaceable>storagetype</replaceable></literal>. <replaceable>storagetype</replaceable> is one of
2418     <literal>plain</>, <literal>external</>, <literal>extended</literal>,
2419      or <literal>main</>.)
2420     </para>
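
    <para>
     The two variants might be used as in the following sketch.  The
     function names <function>ascii_upper</function> and
     <function>first_ten_bytes</function> are hypothetical, and the code
     assumes a module that already contains a magic block.  The first
     function modifies its argument in place, which is only safe on a
     copy; the second fetches just the leading bytes of a
     <type>bytea</type> value.

<programlisting><![CDATA[
#include "postgres.h"
#include "fmgr.h"

/* Convert ASCII lower-case letters to upper case. */
PG_FUNCTION_INFO_V1(ascii_upper);

Datum
ascii_upper(PG_FUNCTION_ARGS)
{
    /* The _COPY variant returns a private copy that may be modified. */
    text   *t = PG_GETARG_TEXT_P_COPY(0);
    char   *p = VARDATA(t);
    int     len = VARSIZE(t) - VARHDRSZ;
    int     i;

    for (i = 0; i < len; i++)
    {
        if (p[i] >= 'a' && p[i] <= 'z')
            p[i] -= 'a' - 'A';
    }

    PG_RETURN_TEXT_P(t);
}

/* Return at most the first 10 bytes of a bytea value. */
PG_FUNCTION_INFO_V1(first_ten_bytes);

Datum
first_ten_bytes(PG_FUNCTION_ARGS)
{
    /* Arguments: argument number, byte offset from zero, byte length. */
    bytea  *prefix = PG_GETARG_BYTEA_P_SLICE(0, 0, 10);

    PG_RETURN_BYTEA_P(prefix);
}
]]>
</programlisting>

     Offsets and lengths are counted in bytes, not characters, which is
     why the slicing example uses <type>bytea</type>: cutting a
     <type>text</type> value at an arbitrary byte could split a multibyte
     character.
    </para>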
2421
2422     <para>
2423      Finally, the version-1 function call conventions make it possible
2424      to return set results (<xref linkend="xfunc-c-return-set">) and
2425      implement trigger functions (<xref linkend="triggers">) and
2426      procedural-language call handlers (<xref
2427      linkend="plhandler">).  Version-1 code is also more
2428      portable than version-0, because it does not break restrictions
2429      on function call protocol in the C standard.  For more details
2430      see <filename>src/backend/utils/fmgr/README</filename> in the
2431      source distribution.
2432     </para>
2433    </sect2>
2434
2435    <sect2>
2436     <title>Writing Code</title>
2437
2438     <para>
2439      Before we turn to the more advanced topics, we should discuss
2440      some coding rules for <productname>PostgreSQL</productname>
2441      C-language functions.  While it might be possible to load functions
2442      written in languages other than C into
2443      <productname>PostgreSQL</productname>, this is usually difficult
2444      (when it is possible at all) because other languages, such as
2445      C++, FORTRAN, or Pascal, often do not follow the same calling
2446      convention as C.  That is, other languages do not pass argument
2447      and return values between functions in the same way.  For this
2448      reason, we will assume that your C-language functions are
2449      actually written in C.
2450     </para>
2451
2452     <para>
2453      The basic rules for writing and building C functions are as follows:
2454
2455      <itemizedlist>
2456       <listitem>
2457        <para>
2458         Use <literal>pg_config
2459         --includedir-server</literal><indexterm><primary>pg_config</><secondary>with user-defined C functions</></>
2460         to find out where the <productname>PostgreSQL</> server header
2461         files are installed on your system (or the system that your
2462         users will be running on).
2463        </para>
2464       </listitem>
2465
2466       <listitem>
2467        <para>
2468         Compiling and linking your code so that it can be dynamically
2469         loaded into <productname>PostgreSQL</productname> always
2470         requires special flags.  See <xref linkend="dfunc"> for a
2471         detailed explanation of how to do it for your particular
2472         operating system.
2473        </para>
2474       </listitem>
2475
2476       <listitem>
2477        <para>
2478         Remember to define a <quote>magic block</> for your shared library,
2479         as described in <xref linkend="xfunc-c-dynload">.
2480        </para>
2481       </listitem>
2482
2483       <listitem>
2484        <para>
2485         When allocating memory, use the
2486         <productname>PostgreSQL</productname> functions
2487         <function>palloc</function><indexterm><primary>palloc</></> and <function>pfree</function><indexterm><primary>pfree</></>
2488         instead of the corresponding C library functions
2489         <function>malloc</function> and <function>free</function>.
2490         The memory allocated by <function>palloc</function> will be
2491         freed automatically at the end of each transaction, preventing
2492         memory leaks.
2493        </para>
2494       </listitem>
2495
2496       <listitem>
2497        <para>
2498         Always zero the bytes of your structures using <function>memset</>
2499         (or allocate them with <function>palloc0</> in the first place).
2500         Even if you assign to each field of your structure, there might be
2501         alignment padding (holes in the structure) that contains
2502         garbage values.  Without this zeroing, it's difficult to
2503         support hash indexes or hash joins, as you must pick out only
2504         the significant bits of your data structure to compute a hash.
2505         The planner also sometimes relies on comparing constants via
2506         bitwise equality, so you can get undesirable planning results if
2507         logically-equivalent values aren't bitwise equal.
2508        </para>
2509       </listitem>
2510
2511       <listitem>
2512        <para>
2513         Most of the internal <productname>PostgreSQL</productname>
2514         types are declared in <filename>postgres.h</filename>, while
2515         the function manager interfaces
2516         (<symbol>PG_FUNCTION_ARGS</symbol>, etc.)  are in
2517         <filename>fmgr.h</filename>, so you will need to include at
2518         least these two files.  For portability reasons it's best to
2519         include <filename>postgres.h</filename> <emphasis>first</>,
2520         before any other system or user header files.  Including
2521         <filename>postgres.h</filename> will also include
2522         <filename>elog.h</filename> and <filename>palloc.h</filename>
2523         for you.
2524        </para>
2525       </listitem>
2526
2527       <listitem>
2528        <para>
2529         Symbol names defined within object files must not conflict
2530         with each other or with symbols defined in the
2531         <productname>PostgreSQL</productname> server executable.  You
2532         will have to rename your functions or variables if you get
2533         error messages to this effect.
2534        </para>
2535       </listitem>
2536      </itemizedlist>
2537     </para>
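    <para>
     The rule about zeroing structures can be illustrated with a small
     standalone sketch.  The <structname>Pair</structname> type and both
     functions are hypothetical and use only the C library; inside the
     server one would more likely allocate with <function>palloc0</function>.
     The sketch assumes, as common compilers do in practice, that assigning
     to a field leaves the padding bytes untouched.
    </para>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example type; any struct with mixed field widths would do. */
typedef struct Pair
{
    char        tag;            /* 1 byte, usually followed by padding */
    int         value;          /* typically aligned on a wider boundary */
} Pair;

/*
 * Fill *p so that every byte, including any alignment padding, has a
 * defined value.  Zeroing first is what makes the bytewise comparison in
 * pair_bitwise_equal() meaningful.
 */
void
pair_init(Pair *p, char tag, int value)
{
    memset(p, 0, sizeof(Pair));
    p->tag = tag;
    p->value = value;
}

/* Compare two values byte by byte, as a bitwise-equality test would. */
int
pair_bitwise_equal(const Pair *a, const Pair *b)
{
    return memcmp(a, b, sizeof(Pair)) == 0;
}
```

    <para>
     Without the <function>memset</function> call, two values with equal
     fields could still differ in their padding bytes and therefore compare
     as unequal or hash differently.
    </para>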
2538    </sect2>
2539
2540 &dfunc;
2541
2542    <sect2>
2543     <title>Composite-type Arguments</title>
2544
2545     <para>
2546      Composite types do not have a fixed layout like C structures.
2547      Instances of a composite type can contain null fields.  In
2548      addition, composite types that are part of an inheritance
2549      hierarchy can have different fields than other members of the
2550      same inheritance hierarchy.  Therefore,
2551      <productname>PostgreSQL</productname> provides a function
2552      interface for accessing fields of composite types from C.
2553     </para>
2554
2555     <para>
2556      Suppose we want to write a function to answer the query:
2557
2558 <programlisting>
2559 SELECT name, c_overpaid(emp, 1500) AS overpaid
2560     FROM emp
2561     WHERE name = 'Bill' OR name = 'Sam';
2562 </programlisting>
2563
2564      Using the version-0 calling conventions, we can define
2565      <function>c_overpaid</> as:
2566
2567 <programlisting><![CDATA[
2568 #include "postgres.h"
2569 #include "executor/executor.h"  /* for GetAttributeByName() */
2570
2571 #ifdef PG_MODULE_MAGIC
2572 PG_MODULE_MAGIC;
2573 #endif
2574
2575 bool
2576 c_overpaid(HeapTupleHeader t, /* the current row of emp */
2577            int32 limit)
2578 {
2579     bool isnull;
2580     int32 salary;
2581
2582     salary = DatumGetInt32(GetAttributeByName(t, "salary", &isnull));
2583     if (isnull)
2584         return false;
2585     return salary > limit;
2586 }
2587 ]]>
2588 </programlisting>
2589
2590      In version-1 coding, the above would look like this:
2591
2592 <programlisting><![CDATA[
2593 #include "postgres.h"
2594 #include "executor/executor.h"  /* for GetAttributeByName() */
2595
2596 #ifdef PG_MODULE_MAGIC
2597 PG_MODULE_MAGIC;
2598 #endif
2599
2600 PG_FUNCTION_INFO_V1(c_overpaid);
2601
2602 Datum
2603 c_overpaid(PG_FUNCTION_ARGS)
2604 {
2605     HeapTupleHeader  t = PG_GETARG_HEAPTUPLEHEADER(0);
2606     int32            limit = PG_GETARG_INT32(1);
2607     bool isnull;
2608     Datum salary;
2609
2610     salary = GetAttributeByName(t, "salary", &isnull);
2611     if (isnull)
2612         PG_RETURN_BOOL(false);
2613     /* Alternatively, we might prefer to do PG_RETURN_NULL() for null salary. */
2614
2615     PG_RETURN_BOOL(DatumGetInt32(salary) > limit);
2616 }
2617 ]]>
2618 </programlisting>
2619     </para>
2620
2621     <para>
2622      <function>GetAttributeByName</function> is the
2623      <productname>PostgreSQL</productname> system function that
2624      returns attributes out of the specified row.  It has
2625      three arguments: the argument of type <type>HeapTupleHeader</type> passed
2626      into
2627      the function, the name of the desired attribute, and a
2628      return parameter that tells whether the attribute
2629      is null.  <function>GetAttributeByName</function> returns a <type>Datum</type>
2630      value that you can convert to the proper data type by using the
2631      appropriate <function>DatumGet<replaceable>XXX</replaceable>()</function>
2632      macro.  Note that the return value is meaningless if the null flag is
2633      set; always check the null flag before trying to do anything with the
2634      result.
2635     </para>
2636
2637     <para>
2638      There is also <function>GetAttributeByNum</function>, which selects
2639      the target attribute by column number instead of name.
2640     </para>
2641
2642     <para>
2643      The following command declares the function
2644      <function>c_overpaid</function> in SQL:
2645
2646 <programlisting>
2647 CREATE FUNCTION c_overpaid(emp, integer) RETURNS boolean
2648     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'c_overpaid'
2649     LANGUAGE C STRICT;
2650 </programlisting>
2651
2652      Notice we have used <literal>STRICT</> so that we did not have to
2653      check whether the input arguments were NULL.
2654     </para>
2655    </sect2>
2656
2657    <sect2>
2658     <title>Returning Rows (Composite Types)</title>
2659
2660     <para>
2661      To return a row or composite-type value from a C-language
2662      function, you can use a special API that provides macros and
2663      functions to hide most of the complexity of building composite
2664      data types.  To use this API, the source file must include:
2665 <programlisting>
2666 #include "funcapi.h"
2667 </programlisting>
2668     </para>
2669
2670     <para>
2671      There are two ways you can build a composite data value (henceforth
2672      a <quote>tuple</>): you can build it from an array of Datum values,
2673      or from an array of C strings that can be passed to the input
2674      conversion functions of the tuple's column data types.  In either
2675      case, you first need to obtain or construct a <structname>TupleDesc</>
2676      descriptor for the tuple structure.  When working with Datums, you
2677      pass the <structname>TupleDesc</> to <function>BlessTupleDesc</>,
2678      and then call <function>heap_form_tuple</> for each row.  When working
2679      with C strings, you pass the <structname>TupleDesc</> to
2680      <function>TupleDescGetAttInMetadata</>, and then call
2681      <function>BuildTupleFromCStrings</> for each row.  In the case of a
2682      function returning a set of tuples, the setup steps can all be done
2683      once during the first call of the function.
2684     </para>
2685
2686     <para>
2687      Several helper functions are available for setting up the needed
2688      <structname>TupleDesc</>.  The recommended way to do this in most
2689      functions returning composite values is to call:
2690 <programlisting>
2691 TypeFuncClass get_call_result_type(FunctionCallInfo fcinfo,
2692                                    Oid *resultTypeId,
2693                                    TupleDesc *resultTupleDesc)
2694 </programlisting>
2695      passing the same <literal>fcinfo</> struct passed to the calling function
2696      itself.  (This of course requires that you use the version-1
2697      calling conventions.)  <varname>resultTypeId</> can be specified
2698      as <literal>NULL</> or as the address of a local variable to receive the
2699      function's result type OID.  <varname>resultTupleDesc</> should be the
2700      address of a local <structname>TupleDesc</> variable.  Check that the
2701      result is <literal>TYPEFUNC_COMPOSITE</>; if so,
2702      <varname>resultTupleDesc</> has been filled with the needed
2703      <structname>TupleDesc</>.  (If it is not, you can report an error along
2704      the lines of <quote>function returning record called in context that
2705      cannot accept type record</quote>.)
2706     </para>
2707
2708     <tip>
2709      <para>
2710       <function>get_call_result_type</> can resolve the actual type of a
2711       polymorphic function result; so it is useful in functions that return
2712       scalar polymorphic results, not only functions that return composites.
2713       The <varname>resultTypeId</> output is primarily useful for functions
2714       returning polymorphic scalars.
2715      </para>
2716     </tip>
2717
2718     <note>
2719      <para>
2720       <function>get_call_result_type</> has a sibling
2721       <function>get_expr_result_type</>, which can be used to resolve the
2722       expected output type for a function call represented by an expression
2723       tree.  This can be used when trying to determine the result type from
2724       outside the function itself.  There is also
2725       <function>get_func_result_type</>, which can be used when only the
2726       function's OID is available.  However, these functions are not able
2727       to deal with functions declared to return <structname>record</>, and
2728       <function>get_func_result_type</> cannot resolve polymorphic types,
2729       so you should preferentially use <function>get_call_result_type</>.
2730      </para>
2731     </note>
2732
2733     <para>
2734      Older, now-deprecated functions for obtaining
2735      <structname>TupleDesc</>s are:
2736 <programlisting>
2737 TupleDesc RelationNameGetTupleDesc(const char *relname)
2738 </programlisting>
2739      to get a <structname>TupleDesc</> for the row type of a named relation,
2740      and:
2741 <programlisting>
2742 TupleDesc TypeGetTupleDesc(Oid typeoid, List *colaliases)
2743 </programlisting>
2744      to get a <structname>TupleDesc</> based on a type OID. This can
2745      be used to get a <structname>TupleDesc</> for a base or
2746      composite type.  It will not work for a function that returns
2747      <structname>record</>, however, and it cannot resolve polymorphic
2748      types.
2749     </para>
2750
2751     <para>
2752      Once you have a <structname>TupleDesc</>, call:
2753 <programlisting>
2754 TupleDesc BlessTupleDesc(TupleDesc tupdesc)
2755 </programlisting>
2756      if you plan to work with Datums, or:
2757 <programlisting>
2758 AttInMetadata *TupleDescGetAttInMetadata(TupleDesc tupdesc)
2759 </programlisting>
2760      if you plan to work with C strings.  If you are writing a
2761      set-returning function, you can save the results of these functions in the
2762      <structname>FuncCallContext</> structure &mdash; use the
2763      <structfield>tuple_desc</> or <structfield>attinmeta</> field
2764      respectively.
2765     </para>
2766
2767     <para>
2768      When working with Datums, use:
2769 <programlisting>
2770 HeapTuple heap_form_tuple(TupleDesc tupdesc, Datum *values, bool *isnull)
2771 </programlisting>
2772      to build a <structname>HeapTuple</> given user data in Datum form.
2773     </para>
2774
2775     <para>
2776      When working with C strings, use:
2777 <programlisting>
2778 HeapTuple BuildTupleFromCStrings(AttInMetadata *attinmeta, char **values)
2779 </programlisting>
2780      to build a <structname>HeapTuple</> given user data
2781      in C string form.  <parameter>values</parameter> is an array of C strings,
2782      one for each attribute of the return row. Each C string should be in
2783      the form expected by the input function of the attribute data
2784      type. In order to return a null value for one of the attributes,
2785      the corresponding pointer in the <parameter>values</> array
2786      should be set to <symbol>NULL</>.  This function will need to
2787      be called again for each row you return.
2788     </para>
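    <para>
     The shape of the <parameter>values</parameter> array can be sketched in
     plain C.  The helper below is hypothetical and not part of PostgreSQL;
     it uses <function>malloc</function> only because it is meant to run
     outside the server, where <function>palloc</function> is not available.
     It shows one C string per attribute and a <symbol>NULL</symbol> pointer
     for each attribute that should be null.
    </para>

```c
#include <stdio.h>
#include <stdlib.h>

#define INT_CSTRING_LEN 32      /* ample for any int printed with %d */

/*
 * Hypothetical helper: build an array of natts C strings from integer
 * values.  Entries flagged in isnull become NULL pointers.  Returns NULL
 * if memory cannot be allocated.
 */
char **
ints_to_cstrings(const int *vals, const int *isnull, int natts)
{
    char      **values = malloc((size_t) natts * sizeof(char *));
    int         i;

    if (values == NULL)
        return NULL;

    for (i = 0; i < natts; i++)
    {
        if (isnull[i])
        {
            values[i] = NULL;   /* a NULL pointer stands for a null attribute */
            continue;
        }

        values[i] = malloc(INT_CSTRING_LEN);
        if (values[i] == NULL)
        {
            /* undo the partial allocation rather than fake a null value */
            while (--i >= 0)
                free(values[i]);
            free(values);
            return NULL;
        }
        snprintf(values[i], INT_CSTRING_LEN, "%d", vals[i]);
    }
    return values;
}
```

    <para>
     In a real function, an array built along these lines would be passed to
     <function>BuildTupleFromCStrings</function> together with the
     <structname>AttInMetadata</structname> for the result row type.
    </para>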
2789
2790     <para>
2791      Once you have built a tuple to return from your function, it
2792      must be converted into a <type>Datum</>. Use:
2793 <programlisting>
2794 HeapTupleGetDatum(HeapTuple tuple)
2795 </programlisting>
2796      to convert a <structname>HeapTuple</> into a valid Datum.  This
2797      <type>Datum</> can be returned directly if you intend to return
2798      just a single row, or it can be used as the current return value
2799      in a set-returning function.
2800     </para>
2801
2802     <para>
2803      An example appears in the next section.
2804     </para>
2805
2806    </sect2>
2807
2808    <sect2 id="xfunc-c-return-set">
2809     <title>Returning Sets</title>
2810
2811     <para>
2812      There is also a special API that provides support for returning
2813      sets (multiple rows) from a C-language function.  A set-returning
2814      function must follow the version-1 calling conventions.  Also,
2815      source files must include <filename>funcapi.h</filename>, as
2816      above.
2817     </para>
2818
2819     <para>
2820      A set-returning function (<acronym>SRF</>) is called
2821      once for each item it returns.  The <acronym>SRF</> must
2822      therefore save enough state to remember what it was doing and
2823      return the next item on each call.
2824      The structure <structname>FuncCallContext</> is provided to help
2825      control this process.  Within a function, <literal>fcinfo-&gt;flinfo-&gt;fn_extra</>
2826      is used to hold a pointer to <structname>FuncCallContext</>
2827      across calls.
2828 <programlisting>
2829 typedef struct FuncCallContext
2830 {
2831     /*
2832      * Number of times we've been called before
2833      *
2834      * call_cntr is initialized to 0 for you by SRF_FIRSTCALL_INIT(), and
2835      * incremented for you every time SRF_RETURN_NEXT() is called.
2836      */
2837     uint64 call_cntr;
2838
2839     /*
2840      * OPTIONAL maximum number of calls
2841      *
2842      * max_calls is here for convenience only and setting it is optional.
2843      * If not set, you must provide alternative means to know when the
2844      * function is done.
2845      */
2846     uint64 max_calls;
2847
2848     /*
2849      * OPTIONAL pointer to result slot
2850      *
2851      * This is obsolete and only present for backward compatibility, viz,
2852      * user-defined SRFs that use the deprecated TupleDescGetSlot().
2853      */
2854     TupleTableSlot *slot;
2855
2856     /*
2857      * OPTIONAL pointer to miscellaneous user-provided context information
2858      *
2859      * user_fctx is for use as a pointer to your own data to retain
2860      * arbitrary context information between calls of your function.
2861      */
2862     void *user_fctx;
2863
2864     /*
2865      * OPTIONAL pointer to struct containing attribute type input metadata
2866      *
2867      * attinmeta is for use when returning tuples (i.e., composite data types)
2868      * and is not used when returning base data types. It is only needed
2869      * if you intend to use BuildTupleFromCStrings() to create the return
2870      * tuple.
2871      */
2872     AttInMetadata *attinmeta;
2873
2874     /*
2875      * memory context used for structures that must live for multiple calls
2876      *
2877      * multi_call_memory_ctx is set by SRF_FIRSTCALL_INIT() for you, and used
2878      * by SRF_RETURN_DONE() for cleanup. It is the most appropriate memory
2879      * context for any memory that is to be reused across multiple calls
2880      * of the SRF.
2881      */
2882     MemoryContext multi_call_memory_ctx;
2883
2884     /*
2885      * OPTIONAL pointer to struct containing tuple description
2886      *
2887      * tuple_desc is for use when returning tuples (i.e., composite data types)
2888      * and is only needed if you are going to build the tuples with
2889      * heap_form_tuple() rather than with BuildTupleFromCStrings().  Note that
2890      * the TupleDesc pointer stored here should usually have been run through
2891      * BlessTupleDesc() first.
2892      */
2893     TupleDesc tuple_desc;
2894
2895 } FuncCallContext;
2896 </programlisting>
2897     </para>
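    <para>
     The way state is carried across calls can be modeled outside the server.
     The sketch below is a simplified stand-in, not the real API:
     <structname>ModelCallContext</structname>,
     <structname>ModelFlinfo</structname>, and
     <function>model_srf</function> are hypothetical names, and
     <function>malloc</function> replaces the server's memory contexts.  It
     shows only the idea that a pointer kept in an
     <literal>fn_extra</literal>-like field is created on the first call,
     consulted on every call, and released when the set is exhausted.
    </para>

```c
#include <stdlib.h>

/* Hypothetical stand-ins, loosely modeled on the structures above. */
typedef struct ModelCallContext
{
    unsigned long call_cntr;    /* number of items returned so far */
    unsigned long max_calls;    /* total number of items to return */
} ModelCallContext;

typedef struct ModelFlinfo
{
    void       *fn_extra;       /* NULL until the first call sets it */
} ModelFlinfo;

/*
 * Store the next item in *item and return 1, or return 0 when the set
 * is exhausted.  State lives in flinfo->fn_extra between calls.
 */
int
model_srf(ModelFlinfo *flinfo, unsigned long nitems, unsigned long *item)
{
    ModelCallContext *ctx;

    if (flinfo->fn_extra == NULL)
    {
        /* first call: create the cross-call state */
        ctx = malloc(sizeof(ModelCallContext));
        if (ctx == NULL)
            return 0;
        ctx->call_cntr = 0;
        ctx->max_calls = nitems;
        flinfo->fn_extra = ctx;
    }

    /* every call: fetch the saved state */
    ctx = flinfo->fn_extra;

    if (ctx->call_cntr < ctx->max_calls)
    {
        *item = ctx->call_cntr;
        ctx->call_cntr++;       /* the real API increments this for you */
        return 1;
    }

    /* done: release the state so that a later call starts afresh */
    free(ctx);
    flinfo->fn_extra = NULL;
    return 0;
}
```

    <para>
     Calling <function>model_srf</function> repeatedly with the same
     <structname>ModelFlinfo</structname> yields one item per call until
     the counter reaches <structfield>max_calls</structfield>.
    </para>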
2898
2899     <para>
2900      An <acronym>SRF</> uses several functions and macros that
2901      automatically manipulate the <structname>FuncCallContext</>
2902      structure (and expect to find it via <literal>fn_extra</>).  Use:
2903 <programlisting>
2904 SRF_IS_FIRSTCALL()
2905 </programlisting>
2906      to determine if your function is being called for the first or a
2907      subsequent time. On the first call (only) use:
2908 <programlisting>
2909 SRF_FIRSTCALL_INIT()
2910 </programlisting>
2911      to initialize the <structname>FuncCallContext</>. On every function call,
2912      including the first, use:
2913 <programlisting>
2914 SRF_PERCALL_SETUP()
2915 </programlisting>
2916      to properly set up for using the <structname>FuncCallContext</>
2917      and to clear any previously returned data left over from the
2918      previous pass.
2919     </para>
2920
2921     <para>
2922      If your function has data to return, use:
2923 <programlisting>
2924 SRF_RETURN_NEXT(funcctx, result)
2925 </programlisting>
2926      to return it to the caller.  (<literal>result</> must be of type
2927      <type>Datum</>, either a single value or a tuple prepared as
2928      described above.)  Finally, when your function is finished
2929      returning data, use:
2930 <programlisting>
2931 SRF_RETURN_DONE(funcctx)
2932 </programlisting>
2933      to clean up and end the <acronym>SRF</>.
2934     </para>
2935
2936     <para>
2937      The memory context that is current when the <acronym>SRF</> is called is
2938      a transient context that will be cleared between calls.  This means
2939      that you do not need to call <function>pfree</> on everything
2940      you allocated using <function>palloc</>; it will go away anyway.  However, if you want to allocate
2941      any data structures to live across calls, you need to put them somewhere
2942      else.  The memory context referenced by
2943      <structfield>multi_call_memory_ctx</> is a suitable location for any
2944      data that needs to survive until the <acronym>SRF</> is finished running.  In most
2945      cases, this means that you should switch into
2946      <structfield>multi_call_memory_ctx</> while doing the first-call setup.
2947     </para>
2948
2949     <warning>
2950      <para>
2951       While the actual arguments to the function remain unchanged between
2952       calls, if you detoast the argument values (which is normally done
2953       transparently by the
2954       <function>PG_GETARG_<replaceable>xxx</replaceable></function> macro)
2955       in the transient context then the detoasted copies will be freed on
2956       each cycle. Accordingly, if you keep references to such values in
2957       your <structfield>user_fctx</>, you must either copy them into the
2958       <structfield>multi_call_memory_ctx</> after detoasting, or ensure
2959       that you detoast the values only in that context.
2960      </para>
2961     </warning>
2962
2963     <para>
2964      A complete pseudo-code example looks like the following:
2965 <programlisting>
2966 Datum
2967 my_set_returning_function(PG_FUNCTION_ARGS)
2968 {
2969     FuncCallContext  *funcctx;
2970     Datum             result;
2971     <replaceable>further declarations as needed</replaceable>
2972
2973     if (SRF_IS_FIRSTCALL())
2974     {
2975         MemoryContext oldcontext;
2976
2977         funcctx = SRF_FIRSTCALL_INIT();
2978         oldcontext = MemoryContextSwitchTo(funcctx-&gt;multi_call_memory_ctx);
2979         /* One-time setup code appears here: */
2980         <replaceable>user code</replaceable>
2981         <replaceable>if returning composite</replaceable>
2982             <replaceable>build TupleDesc, and perhaps AttInMetadata</replaceable>
2983         <replaceable>endif returning composite</replaceable>
2984         <replaceable>user code</replaceable>
2985         MemoryContextSwitchTo(oldcontext);
2986     }
2987
2988     /* Each-time setup code appears here: */
2989     <replaceable>user code</replaceable>
2990     funcctx = SRF_PERCALL_SETUP();
2991     <replaceable>user code</replaceable>
2992
2993     /* this is just one way we might test whether we are done: */
2994     if (funcctx-&gt;call_cntr &lt; funcctx-&gt;max_calls)
2995     {
2996         /* Here we want to return another item: */
2997         <replaceable>user code</replaceable>
2998         <replaceable>obtain result Datum</replaceable>
2999         SRF_RETURN_NEXT(funcctx, result);
3000     }
3001     else
3002     {
3003         /* Here we are done returning items and just need to clean up: */
3004         <replaceable>user code</replaceable>
3005         SRF_RETURN_DONE(funcctx);
3006     }
3007 }
3008 </programlisting>
3009     </para>
3010
3011     <para>
3012      A complete example of a simple <acronym>SRF</> returning a composite type
3013      looks like:
3014 <programlisting><![CDATA[
3015 PG_FUNCTION_INFO_V1(retcomposite);
3016
3017 Datum
3018 retcomposite(PG_FUNCTION_ARGS)
3019 {
3020     FuncCallContext     *funcctx;
3021     int                  call_cntr;
3022     int                  max_calls;
3023     TupleDesc            tupdesc;
3024     AttInMetadata       *attinmeta;
3025
3026     /* stuff done only on the first call of the function */
3027     if (SRF_IS_FIRSTCALL())
3028     {
3029         MemoryContext   oldcontext;
3030
3031         /* create a function context for cross-call persistence */
3032         funcctx = SRF_FIRSTCALL_INIT();
3033
3034         /* switch to memory context appropriate for multiple function calls */
3035         oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
3036
3037         /* total number of tuples to be returned */
3038         funcctx->max_calls = PG_GETARG_UINT32(0);
3039
3040         /* Build a tuple descriptor for our result type */
3041         if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
3042             ereport(ERROR,
3043                     (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
3044                      errmsg("function returning record called in context "
3045                             "that cannot accept type record")));
3046
3047         /*
3048          * generate attribute metadata needed later to produce tuples from raw
3049          * C strings
3050          */
3051         attinmeta = TupleDescGetAttInMetadata(tupdesc);
3052         funcctx->attinmeta = attinmeta;
3053
3054         MemoryContextSwitchTo(oldcontext);
3055     }
3056
3057     /* stuff done on every call of the function */
3058     funcctx = SRF_PERCALL_SETUP();
3059
3060     call_cntr = funcctx->call_cntr;
3061     max_calls = funcctx->max_calls;
3062     attinmeta = funcctx->attinmeta;
3063
3064     if (call_cntr < max_calls)    /* do when there is more left to send */
3065     {
3066         char       **values;
3067         HeapTuple    tuple;
3068         Datum        result;
3069
3070         /*
3071          * Prepare a values array for building the returned tuple.
3072          * This should be an array of C strings which will
3073          * be processed later by the type input functions.
3074          */
3075         values = (char **) palloc(3 * sizeof(char *));
3076         values[0] = (char *) palloc(16 * sizeof(char));
3077         values[1] = (char *) palloc(16 * sizeof(char));
3078         values[2] = (char *) palloc(16 * sizeof(char));
3079
3080         snprintf(values[0], 16, "%d", 1 * PG_GETARG_INT32(1));
3081         snprintf(values[1], 16, "%d", 2 * PG_GETARG_INT32(1));
3082         snprintf(values[2], 16, "%d", 3 * PG_GETARG_INT32(1));
3083
3084         /* build a tuple */
3085         tuple = BuildTupleFromCStrings(attinmeta, values);
3086
3087         /* make the tuple into a datum */
3088         result = HeapTupleGetDatum(tuple);
3089
3090         /* clean up (this is not really necessary) */
3091         pfree(values[0]);
3092         pfree(values[1]);
3093         pfree(values[2]);
3094         pfree(values);
3095
3096         SRF_RETURN_NEXT(funcctx, result);
3097     }
3098     else    /* do when there is no more left */
3099     {
3100         SRF_RETURN_DONE(funcctx);
3101     }
3102 }
3103 ]]>
3104 </programlisting>
3105
3106      One way to declare this function in SQL is:
3107 <programlisting>
3108 CREATE TYPE __retcomposite AS (f1 integer, f2 integer, f3 integer);
3109
3110 CREATE OR REPLACE FUNCTION retcomposite(integer, integer)
3111     RETURNS SETOF __retcomposite
3112     AS '<replaceable>filename</>', 'retcomposite'
3113     LANGUAGE C IMMUTABLE STRICT;
3114 </programlisting>
3115      A different way is to use OUT parameters:
3116 <programlisting>
3117 CREATE OR REPLACE FUNCTION retcomposite(IN integer, IN integer,
3118     OUT f1 integer, OUT f2 integer, OUT f3 integer)
3119     RETURNS SETOF record
3120     AS '<replaceable>filename</>', 'retcomposite'
3121     LANGUAGE C IMMUTABLE STRICT;
3122 </programlisting>
3123      Notice that in this method the output type of the function is formally
3124      an anonymous <structname>record</> type.
3125     </para>
3126
3127     <para>
3128      The <link linkend="tablefunc">contrib/tablefunc</>
3129      module in the source distribution contains more examples of
3130      set-returning functions.
3131     </para>
3132    </sect2>
3133
3134    <sect2>
3135     <title>Polymorphic Arguments and Return Types</title>
3136
3137     <para>
3138      C-language functions can be declared to accept and
3139      return the polymorphic types
3140      <type>anyelement</type>, <type>anyarray</type>, <type>anynonarray</type>,
3141      <type>anyenum</type>, and <type>anyrange</type>.
3142      See <xref linkend="extend-types-polymorphic"> for a more detailed explanation
3143      of polymorphic functions. When function arguments or return types
3144      are defined as polymorphic types, the function author cannot know
3145      in advance what data type it will be called with, or
3146      will need to return. There are two routines provided in <filename>fmgr.h</>
3147      to allow a version-1 C function to discover the actual data types
3148      of its arguments and the type it is expected to return. The routines are
3149      called <literal>get_fn_expr_rettype(FmgrInfo *flinfo)</> and
3150      <literal>get_fn_expr_argtype(FmgrInfo *flinfo, int argnum)</>.
3151      They return the result or argument type OID, or <symbol>InvalidOid</symbol> if the
3152      information is not available.
3153      The structure <literal>flinfo</> is normally accessed as
3154      <literal>fcinfo-&gt;flinfo</>. The parameter <literal>argnum</>
3155      is zero based.  <function>get_call_result_type</> can also be used
3156      as an alternative to <function>get_fn_expr_rettype</>.
3157      There is also <function>get_fn_expr_variadic</>, which can be used to
3158      find out whether variadic arguments have been merged into an array.
3159      This is primarily useful for <literal>VARIADIC "any"</> functions,
3160      since such merging will always have occurred for variadic functions
3161      taking ordinary array types.
3162     </para>
3163
3164     <para>
3165      For example, suppose we want to write a function to accept a single
3166      element of any type, and return a one-dimensional array of that type:
3167
3168 <programlisting>
3169 PG_FUNCTION_INFO_V1(make_array);
3170 Datum
3171 make_array(PG_FUNCTION_ARGS)
3172 {
3173     ArrayType  *result;
3174     Oid         element_type = get_fn_expr_argtype(fcinfo-&gt;flinfo, 0);
3175     Datum       element;
3176     bool        isnull;
3177     int16       typlen;
3178     bool        typbyval;
3179     char        typalign;
3180     int         ndims;
3181     int         dims[MAXDIM];
3182     int         lbs[MAXDIM];
3183
3184     if (!OidIsValid(element_type))
3185         elog(ERROR, "could not determine data type of input");
3186
3187     /* get the provided element, being careful in case it's NULL */
3188     isnull = PG_ARGISNULL(0);
3189     if (isnull)
3190         element = (Datum) 0;
3191     else
3192         element = PG_GETARG_DATUM(0);
3193
3194     /* we have one dimension */
3195     ndims = 1;
3196     /* and one element */
3197     dims[0] = 1;
3198     /* and lower bound is 1 */
3199     lbs[0] = 1;
3200
3201     /* get required info about the element type */
3202     get_typlenbyvalalign(element_type, &amp;typlen, &amp;typbyval, &amp;typalign);
3203
3204     /* now build the array */
3205     result = construct_md_array(&amp;element, &amp;isnull, ndims, dims, lbs,
3206                                 element_type, typlen, typbyval, typalign);
3207
3208     PG_RETURN_ARRAYTYPE_P(result);
3209 }
3210 </programlisting>
3211     </para>
3212
3213     <para>
3214      The following command declares the function
3215      <function>make_array</function> in SQL:
3216
3217 <programlisting>
3218 CREATE FUNCTION make_array(anyelement) RETURNS anyarray
3219     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'make_array'
3220     LANGUAGE C IMMUTABLE;
3221 </programlisting>
3222     </para>
3223
3224     <para>
3225      There is a variant of polymorphism that is only available to C-language
3226      functions: they can be declared to take parameters of type
3227      <literal>"any"</>.  (Note that this type name must be double-quoted,
3228      since it's also a SQL reserved word.)  This works like
3229      <type>anyelement</> except that it does not constrain different
3230      <literal>"any"</> arguments to be the same type, nor do they help
3231      determine the function's result type.  A C-language function can also
3232      declare its final parameter to be <literal>VARIADIC "any"</>.  This will
3233      match one or more actual arguments of any type (not necessarily the same
3234      type).  These arguments will <emphasis>not</> be gathered into an array
3235      as happens with normal variadic functions; they will just be passed to
3236      the function separately.  The <function>PG_NARGS()</> macro and the
3237      methods described above must be used to determine the number of actual
3238      arguments and their types when using this feature.  Also, users of such
3239      a function might wish to use the <literal>VARIADIC</> keyword in their
3240      function call, with the expectation that the function would treat the
3241      array elements as separate arguments.  The function itself must implement
3242      that behavior if wanted, after using <function>get_fn_expr_variadic</> to
3243      detect that the actual argument was marked with <literal>VARIADIC</>.
3244     </para>
3245    </sect2>
3246
3247    <sect2 id="xfunc-transform-functions">
3248     <title>Transform Functions</title>
3249
3250     <para>
3251      Some function calls can be simplified during planning based on
3252      properties specific to the function.  For example,
3253      <literal>int4mul(n, 1)</> could be simplified to just <literal>n</>.
3254      To define such function-specific optimizations, write a
3255      <firstterm>transform function</> and place its OID in the
3256      <structfield>protransform</> field of the primary function's
3257      <structname>pg_proc</> entry.  The transform function must have the SQL
3258      signature <literal>protransform(internal) RETURNS internal</>.  The
3259      argument, actually <type>FuncExpr *</>, is a dummy node representing a
3260      call to the primary function.  If the transform function's study of the
3261      expression tree proves that a simplified expression tree can substitute
3262      for all possible concrete calls represented thereby, build and return
3263      that simplified expression.  Otherwise, return a <literal>NULL</>
3264      pointer (<emphasis>not</> a SQL null).
3265     </para>
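
    <para>
     The contract described above can be illustrated with a toy sketch.
     It assumes hypothetical stand-in types (<literal>ToyNode</>,
     <literal>ToyTag</>) and a hypothetical function
     <literal>toy_mul_transform</>; none of these are
     <productname>PostgreSQL</> node structures or APIs.  The sketch only
     shows the shape of the decision: return a replacement expression when
     it is valid for every call the node represents, and a null pointer
     otherwise.
    </para>

```cpp
// Hypothetical stand-ins for planner nodes; these are NOT PostgreSQL's
// real FuncExpr or Const structures.
enum ToyTag { TOY_VAR, TOY_CONST, TOY_MUL };

struct ToyNode
{
    ToyTag      tag;
    int         value;      // meaningful only for TOY_CONST
    ToyNode    *left;       // operands, meaningful only for TOY_MUL
    ToyNode    *right;
};

// Toy transform for a multiplication call.  If either operand is the
// constant 1, the other operand can stand in for the whole call, so it is
// returned.  A null pointer means "no simplification found".
ToyNode *
toy_mul_transform(ToyNode *call)
{
    if (call == nullptr || call->tag != TOY_MUL)
        return nullptr;

    if (call->right != nullptr &&
        call->right->tag == TOY_CONST && call->right->value == 1)
        return call->left;

    if (call->left != nullptr &&
        call->left->tag == TOY_CONST && call->left->value == 1)
        return call->right;

    return nullptr;
}
```

    <para>
     In this sketch, a node for <literal>n * 1</> yields the node for
     <literal>n</>, while a node for <literal>n * 2</> yields a null
     pointer, leaving the original call in place.
    </para>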
3266
3267     <para>
3268      We make no guarantee that <productname>PostgreSQL</> will never call the
3269      primary function in cases that the transform function could simplify.
3270      Therefore, the simplified expression must be rigorously equivalent to an
3271      actual call to the primary function.
3272     </para>
3273
3274     <para>
3275      Currently, this facility is not exposed to users at the SQL level
3276      because of security concerns, so it is only practical to use for
3277      optimizing built-in functions.
3278     </para>
3279    </sect2>
3280
3281    <sect2>
3282     <title>Shared Memory and LWLocks</title>
3283
3284     <para>
3285      Add-ins can reserve LWLocks and an allocation of shared memory on server
3286      startup.  The add-in's shared library must be preloaded by specifying
3287      it in
3288      <xref linkend="guc-shared-preload-libraries"><indexterm><primary>shared_preload_libraries</></>.
3289      Shared memory is reserved by calling:
3290 <programlisting>
3291 void RequestAddinShmemSpace(Size size)
3292 </programlisting>
3293      from your <function>_PG_init</> function.
3294     </para>
3295     <para>
3296      LWLocks are reserved by calling:
3297 <programlisting>
3298 void RequestNamedLWLockTranche(const char *tranche_name, int num_lwlocks)
3299 </programlisting>
3300      from <function>_PG_init</>.  This will ensure that an array of
3301      <literal>num_lwlocks</> LWLocks is available under the name
3302      <literal>tranche_name</>.  Use <function>GetNamedLWLockTranche</>
3303      to get a pointer to this array.
3304     </para>
3305     <para>
3306      To avoid possible race conditions, each backend should use the LWLock
3307      <function>AddinShmemInitLock</> when connecting to and initializing
3308      its allocation of shared memory, as shown here:
3309 <programlisting>
3310 static mystruct *ptr = NULL;
3311
3312 if (!ptr)
3313 {
3314         bool    found;
3315
3316         LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
3317         ptr = ShmemInitStruct("my struct name", size, &amp;found);
3318         if (!found)
3319         {
3320                 /* initialize contents of shmem area here */
3321                 /* then fetch any requested LWLocks: */
3322                 ptr-&gt;locks = GetNamedLWLockTranche("my tranche name");
3323         }
3324         LWLockRelease(AddinShmemInitLock);
3325 }
3326 </programlisting>
3327     </para>
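
    <para>
     The locking pattern above can be tried outside the server with a toy,
     single-process sketch.  Everything in it is a hypothetical stand-in:
     <literal>toy_init_lock</> plays the role of
     <function>AddinShmemInitLock</>, <literal>toy_registry</> plays the
     role of the shared memory index, and <literal>toy_attach</> is an
     invented helper.  It is not how shared memory is implemented; it only
     shows why lookup and first-time initialization happen under one lock.
    </para>

```cpp
#include <map>
#include <mutex>
#include <string>

// Toy payload standing in for an add-in's shared structure.
struct ToyCounters
{
    int     magic;      // set once by whoever creates the area
    long    hits;       // state that must survive later attaches
};

// Hypothetical stand-ins, not PostgreSQL APIs.
static std::mutex toy_init_lock;
static std::map<std::string, ToyCounters> toy_registry;

// Attach to the named area, initializing it only on first use.
ToyCounters *
toy_attach()
{
    // Hold the lock across lookup and initialization so that no other
    // caller can observe a half-initialized area.
    std::lock_guard<std::mutex> guard(toy_init_lock);

    bool found = toy_registry.count("toy counters") > 0;

    // operator[] creates a value-initialized entry if none exists yet.
    ToyCounters *ptr = &toy_registry["toy counters"];

    if (!found)
    {
        ptr->magic = 0x5043;
        ptr->hits = 0;
    }
    return ptr;
}
```

    <para>
     A second call to <literal>toy_attach</> returns the same area and
     leaves its contents alone, which mirrors the purpose of the
     <literal>found</> flag in the listing above.
    </para>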
3328    </sect2>
3329
3330    <sect2 id="extend-Cpp">
3331     <title>Using C++ for Extensibility</title>
3332
3333     <indexterm zone="extend-Cpp">
3334      <primary>C++</primary>
3335     </indexterm>
3336
3337     <para>
3338      Although the <productname>PostgreSQL</productname> backend is written in
3339      C, it is possible to write extensions in C++ if these guidelines are
3340      followed:
3341
3342      <itemizedlist>
3343       <listitem>
3344        <para>
3345         All functions accessed by the backend must present a C interface
3346         to the backend; these C functions can then call C++ functions.
3347         For example, <literal>extern "C"</> linkage is required for
3348         backend-accessed functions.  This is also necessary for any
3349         functions that are passed as pointers between the backend and
3350         C++ code.
3351        </para>
3352       </listitem>
3353       <listitem>
3354        <para>
3355         Free memory using the appropriate deallocation method.  For example,
3356         most backend memory is allocated using <function>palloc()</>, so use
3357         <function>pfree()</> to free it.  Using C++
3358         <function>delete</> in such cases will fail.
3359        </para>
3360       </listitem>
3361       <listitem>
3362        <para>
3363         Prevent exceptions from propagating into the C code (use a catch-all
3364         block at the top level of all <literal>extern "C"</> functions).  This
3365         is necessary even if the C++ code does not explicitly throw any
3366         exceptions, because events like out-of-memory can still throw
3367         exceptions.  Any exceptions must be caught and appropriate errors
3368         passed back to the C interface.  If possible, compile C++ with
3369         <option>-fno-exceptions</> to eliminate exceptions entirely; in such
3370         cases, you must check for failures in your C++ code, e.g., allocate
3371         with <literal>new (std::nothrow)</> and check for a NULL result.
3372        </para>
3373       </listitem>
3374       <listitem>
3375        <para>
3376         If calling backend functions from C++ code, be sure that the
3377         C++ call stack contains only plain old data structures
3378         (<acronym>POD</>).  This is necessary because backend errors
3379         generate a distant <function>longjmp()</> that does not properly
3380         unwind a C++ call stack with non-POD objects.
3381        </para>
3382       </listitem>
3383      </itemizedlist>
3384     </para>
3385
3386     <para>
3387      In summary, it is best to place C++ code behind a wall of
3388      <literal>extern "C"</> functions that interface to the backend,
3389      and avoid exception, memory, and call stack leakage.
3390     </para>
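
    <para>
     As a minimal sketch of the first and third guidelines, the following
     shows a C-callable entry point that wraps C++ code and lets no
     exception escape.  The names <literal>parse_positive</> and
     <literal>toy_parse_positive</>, and the status codes, are hypothetical
     and chosen only for illustration; the sketch does not use any backend
     headers.
    </para>

```cpp
#include <new>
#include <stdexcept>
#include <string>

// Ordinary C++ code that may throw (hypothetical helper).
int
parse_positive(const std::string &text)
{
    int value = std::stoi(text);        // throws on malformed input

    if (value <= 0)
        throw std::out_of_range("value must be positive");
    return value;
}

// C-callable entry point.  Every failure is turned into a status code,
// so no exception can propagate into C code.
extern "C" int
toy_parse_positive(const char *text, int *result)
{
    if (text == nullptr || result == nullptr)
        return 1;

    try
    {
        *result = parse_positive(text);
        return 0;                       // success
    }
    catch (const std::bad_alloc &)
    {
        return 2;                       // out of memory
    }
    catch (...)
    {
        return 1;                       // any other failure
    }
}
```

    <para>
     In an actual extension, a wrapper of this kind would presumably report
     the failure through the backend's error mechanism only after the
     <literal>try</> block has been left, so that no C++ objects with
     destructors are live when the backend performs its
     <function>longjmp()</>.
    </para>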
3391    </sect2>
3392
3393   </sect1>