   1 <!-- doc/src/sgml/xfunc.sgml -->
   2
   3  <sect1 id="xfunc">
   4   <title>User-defined Functions</title>
   5
   6   <indexterm zone="xfunc">
   7    <primary>function</primary>
   8    <secondary>user-defined</secondary>
   9   </indexterm>
  10
  11   <para>
  12    <productname>PostgreSQL</productname> provides four kinds of
  13    functions:
  14
  15    <itemizedlist>
  16     <listitem>
  17      <para>
  18       query language functions (functions written in
  19       <acronym>SQL</acronym>) (<xref linkend="xfunc-sql">)
  20      </para>
  21     </listitem>
  22     <listitem>
  23      <para>
  24       procedural language functions (functions written in, for
  25       example, <application>PL/pgSQL</> or <application>PL/Tcl</>)
  26       (<xref linkend="xfunc-pl">)
  27      </para>
  28     </listitem>
  29     <listitem>
  30      <para>
  31       internal functions (<xref linkend="xfunc-internal">)
  32      </para>
  33     </listitem>
  34     <listitem>
  35      <para>
  36       C-language functions (<xref linkend="xfunc-c">)
  37      </para>
  38     </listitem>
  39    </itemizedlist>
  40   </para>
  41
  42   <para>
  43    Every kind
  44    of function can take base types, composite types, or
  45    combinations of these as arguments (parameters). In addition,
  46    every kind of function can return a base type or
  47    a composite type.  Functions can also be defined to return
  48    sets of base or composite values.
  49   </para>
  50
  51   <para>
  52    Many kinds of functions can take or return certain pseudo-types
  53    (such as polymorphic types), but the available facilities vary.
  54    Consult the description of each kind of function for more details.
  55   </para>
  56
  57   <para>
  58    It's easiest to define <acronym>SQL</acronym>
  59    functions, so we'll start by discussing those.
  60    Most of the concepts presented for <acronym>SQL</acronym> functions
  61    will carry over to the other types of functions.
  62   </para>
  63
  64   <para>
  65    Throughout this chapter, it can be useful to look at the reference
  66    page of the <xref linkend="sql-createfunction"> command to
  67    understand the examples better.  Some examples from this chapter
  68    can be found in <filename>funcs.sql</filename> and
  69    <filename>funcs.c</filename> in the <filename>src/tutorial</>
  70    directory in the <productname>PostgreSQL</productname> source
  71    distribution.
  72   </para>
  73   </sect1>
  74
  75   <sect1 id="xfunc-sql">
  76    <title>Query Language (<acronym>SQL</acronym>) Functions</title>
  77
  78    <indexterm zone="xfunc-sql">
  79     <primary>function</primary>
  80     <secondary>user-defined</secondary>
  81     <tertiary>in SQL</tertiary>
  82    </indexterm>
  83
  84    <para>
  85     SQL functions execute an arbitrary list of SQL statements, returning
  86     the result of the last query in the list.
  87     In the simple (non-set)
  88     case, the first row of the last query's result will be returned.
  89     (Bear in mind that <quote>the first row</quote> of a multirow
  90     result is not well-defined unless you use <literal>ORDER BY</>.)
  91     If the last query happens
  92     to return no rows at all, the null value will be returned.
  93    </para>
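   <para>
    As a minimal sketch of the point about row ordering, the following
    function uses <literal>ORDER BY</> so that <quote>the first row</quote>
    is well-defined.  It assumes the <literal>emp</> table used in later
    examples; the function name <function>highest_paid</function> is
    hypothetical and chosen only for illustration:

<programlisting>
-- Hypothetical example: returns the name of the best-paid employee.
-- ORDER BY makes the first row deterministic; name breaks salary ties.
CREATE FUNCTION highest_paid() RETURNS text AS $$
    SELECT name FROM emp ORDER BY salary DESC, name LIMIT 1;
$$ LANGUAGE SQL;
</programlisting>

    If <literal>emp</> is empty, the query returns no rows and the
    function returns null.
   </para>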
  94
  95    <para>
  96     Alternatively, an SQL function can be declared to return a set (that is,
  97     multiple rows) by specifying the function's return type as <literal>SETOF
  98     <replaceable>sometype</></literal>, or equivalently by declaring it as
  99     <literal>RETURNS TABLE(<replaceable>columns</>)</literal>.  In this case
 100     all rows of the last query's result are returned.  Further details appear
 101     below.
 102    </para>
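   <para>
    For illustration, a set-returning function declared with
    <literal>RETURNS TABLE</> might look like the following sketch.  It
    again assumes the <literal>emp</> table from later examples, and the
    name <function>emp_pay</function> is hypothetical:

<programlisting>
-- Hypothetical example: every row of the final query is returned.
-- Column references are table-qualified because the names in the
-- TABLE() list also act as output parameter names.
CREATE FUNCTION emp_pay() RETURNS TABLE(name text, salary numeric) AS $$
    SELECT emp.name, emp.salary FROM emp;
$$ LANGUAGE SQL;

SELECT * FROM emp_pay();
</programlisting>
   </para>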
 103
 104    <para>
 105     The body of an SQL function must be a list of SQL
 106     statements separated by semicolons.  A semicolon after the last
 107     statement is optional.  Unless the function is declared to return
 108     <type>void</>, the last statement must be a <command>SELECT</>,
 109     or an <command>INSERT</>, <command>UPDATE</>, or <command>DELETE</>
 110     that has a <literal>RETURNING</> clause.
 111    </para>
 112
 113     <para>
  114      Any collection of commands in the <acronym>SQL</acronym>
 115      language can be packaged together and defined as a function.
 116      Besides <command>SELECT</command> queries, the commands can include data
 117      modification queries (<command>INSERT</command>,
 118      <command>UPDATE</command>, and <command>DELETE</command>), as well as
  119      other SQL commands. (Transaction control commands such as
  120      <command>COMMIT</> and <command>SAVEPOINT</>, and some utility
  121      commands such as <literal>VACUUM</>, cannot be used in <acronym>SQL</acronym> functions.)
 122      However, the final command
 123      must be a <command>SELECT</command> or have a <literal>RETURNING</>
 124      clause that returns whatever is
 125      specified as the function's return type.  Alternatively, if you
 126      want to define a SQL function that performs actions but has no
 127      useful value to return, you can define it as returning <type>void</>.
 128      For example, this function removes rows with negative salaries from
 129      the <literal>emp</> table:
 130
 131 <screen>
 132 CREATE FUNCTION clean_emp() RETURNS void AS '
 133     DELETE FROM emp
 134         WHERE salary &lt; 0;
 135 ' LANGUAGE SQL;
 136
 137 SELECT clean_emp();
 138
 139  clean_emp
 140 -----------
 141
 142 (1 row)
 143 </screen>
 144     </para>
 145
 146     <note>
 147      <para>
 148       The entire body of a SQL function is parsed before any of it is
 149       executed.  While a SQL function can contain commands that alter
 150       the system catalogs (e.g., <command>CREATE TABLE</>), the effects
 151       of such commands will not be visible during parse analysis of
 152       later commands in the function.  Thus, for example,
 153       <literal>CREATE TABLE foo (...); INSERT INTO foo VALUES(...);</literal>
 154       will not work as desired if packaged up into a single SQL function,
 155       since <structname>foo</> won't exist yet when the <command>INSERT</>
 156       command is parsed.  It's recommended to use <application>PL/pgSQL</>
 157       instead of a SQL function in this type of situation.
 158      </para>
 159    </note>
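   <para>
    A minimal sketch of the <application>PL/pgSQL</> alternative mentioned
    in the note follows.  The table name <literal>scratch</> and the
    function name <function>make_scratch</function> are hypothetical.
    This form works because <application>PL/pgSQL</> analyzes each
    statement only when it is first executed, by which time the table
    exists:

<programlisting>
-- Hypothetical example: create a table and populate it in one function.
CREATE FUNCTION make_scratch() RETURNS void AS $$
BEGIN
    CREATE TABLE scratch (id integer);
    -- Parsed only when reached, so scratch already exists here.
    INSERT INTO scratch VALUES (1);
END;
$$ LANGUAGE plpgsql;
</programlisting>
   </para>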
 160
 161    <para>
 162     The syntax of the <command>CREATE FUNCTION</command> command requires
 163     the function body to be written as a string constant.  It is usually
 164     most convenient to use dollar quoting (see <xref
 165     linkend="sql-syntax-dollar-quoting">) for the string constant.
 166     If you choose to use regular single-quoted string constant syntax,
 167     you must double single quote marks (<literal>'</>) and backslashes
 168     (<literal>\</>) (assuming escape string syntax) in the body of
 169     the function (see <xref linkend="sql-syntax-strings">).
 170    </para>
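   <para>
    To make the quoting rules concrete, here is a sketch of one function
    body written both ways.  The function name
    <function>greeting</function> is hypothetical.  With dollar quoting,
    only the apostrophe inside the SQL string literal is doubled; with a
    single-quoted body, every quote mark in the body must be doubled
    again:

<programlisting>
-- Dollar-quoted body: the literal is written as it would be normally.
CREATE FUNCTION greeting() RETURNS text AS $$
    SELECT 'It''s a fine day';
$$ LANGUAGE SQL;

-- Single-quoted body: each quote mark from the version above is doubled.
CREATE OR REPLACE FUNCTION greeting() RETURNS text AS '
    SELECT ''It''''s a fine day'';
' LANGUAGE SQL;
</programlisting>
   </para>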
 171
 172    <sect2 id="xfunc-sql-function-arguments">
 173     <title>Arguments for <acronym>SQL</acronym> Functions</title>
 174
 175    <indexterm>
 176     <primary>function</primary>
 177     <secondary>named argument</secondary>
 178    </indexterm>
 179
 180     <para>
 181      Arguments of a SQL function can be referenced in the function
 182      body using either names or numbers.  Examples of both methods appear
 183      below.
 184     </para>
 185
 186     <para>
 187      To use a name, declare the function argument as having a name, and
 188      then just write that name in the function body.  If the argument name
 189      is the same as any column name in the current SQL command within the
 190      function, the column name will take precedence.  To override this,
 191      qualify the argument name with the name of the function itself, that is
 192      <literal><replaceable>function_name</>.<replaceable>argument_name</></literal>.
 193      (If this would conflict with a qualified column name, again the column
 194      name wins.  You can avoid the ambiguity by choosing a different alias for
 195      the table within the SQL command.)
 196     </para>
 197
 198     <para>
 199      In the older numeric approach, arguments are referenced using the syntax
 200      <literal>$<replaceable>n</></>: <literal>$1</> refers to the first input
 201      argument, <literal>$2</> to the second, and so on.  This will work
 202      whether or not the particular argument was declared with a name.
 203     </para>
 204
 205     <para>
 206      If an argument is of a composite type, then the dot notation,
 207      e.g., <literal><replaceable>argname</>.<replaceable>fieldname</></literal> or
 208      <literal>$1.<replaceable>fieldname</></literal>, can be used to access attributes of the
 209      argument.  Again, you might need to qualify the argument's name with the
 210      function name to make the form with an argument name unambiguous.
 211     </para>
 212
 213     <para>
 214      SQL function arguments can only be used as data values,
 215      not as identifiers.  Thus for example this is reasonable:
 216 <programlisting>
 217 INSERT INTO mytable VALUES ($1);
 218 </programlisting>
 219 but this will not work:
 220 <programlisting>
 221 INSERT INTO $1 VALUES (42);
 222 </programlisting>
 223     </para>
 224
 225     <note>
 226      <para>
 227       The ability to use names to reference SQL function arguments was added
 228       in <productname>PostgreSQL</productname> 9.2.  Functions to be used in
 229       older servers must use the <literal>$<replaceable>n</></> notation.
 230      </para>
 231     </note>
 232    </sect2>
 233
 234    <sect2 id="xfunc-sql-base-functions">
 235     <title><acronym>SQL</acronym> Functions on Base Types</title>
 236
 237     <para>
 238      The simplest possible <acronym>SQL</acronym> function has no arguments and
 239      simply returns a base type, such as <type>integer</type>:
 240
 241 <screen>
 242 CREATE FUNCTION one() RETURNS integer AS $$
 243     SELECT 1 AS result;
 244 $$ LANGUAGE SQL;
 245
 246 -- Alternative syntax for string literal:
 247 CREATE FUNCTION one() RETURNS integer AS '
 248     SELECT 1 AS result;
 249 ' LANGUAGE SQL;
 250
 251 SELECT one();
 252
 253  one
 254 -----
 255    1
 256 </screen>
 257     </para>
 258
 259     <para>
  260      Notice that we defined a column alias within the function body for the result of the function
  261      (with the name <literal>result</>), but this column alias is not visible
  262      outside the function.  Hence, the result is labeled <literal>one</>
  263      instead of <literal>result</>.
 264     </para>
 265
 266     <para>
 267      It is almost as easy to define <acronym>SQL</acronym> functions
 268      that take base types as arguments:
 269
 270 <screen>
 271 CREATE FUNCTION add_em(x integer, y integer) RETURNS integer AS $$
 272     SELECT x + y;
 273 $$ LANGUAGE SQL;
 274
 275 SELECT add_em(1, 2) AS answer;
 276
 277  answer
 278 --------
 279       3
 280 </screen>
 281     </para>
 282
 283     <para>
 284      Alternatively, we could dispense with names for the arguments and
 285      use numbers:
 286
 287 <screen>
 288 CREATE FUNCTION add_em(integer, integer) RETURNS integer AS $$
 289     SELECT $1 + $2;
 290 $$ LANGUAGE SQL;
 291
 292 SELECT add_em(1, 2) AS answer;
 293
 294  answer
 295 --------
 296       3
 297 </screen>
 298     </para>
 299
 300     <para>
 301      Here is a more useful function, which might be used to debit a
 302      bank account:
 303
 304 <programlisting>
 305 CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
 306     UPDATE bank
 307         SET balance = balance - debit
 308         WHERE accountno = tf1.accountno;
 309     SELECT 1;
 310 $$ LANGUAGE SQL;
 311 </programlisting>
 312
 313      A user could execute this function to debit account 17 by $100.00 as
 314      follows:
 315
 316 <programlisting>
 317 SELECT tf1(17, 100.0);
 318 </programlisting>
 319     </para>
 320
 321     <para>
 322      In this example, we chose the name <literal>accountno</> for the first
 323      argument, but this is the same as the name of a column in the
 324      <literal>bank</> table.  Within the <command>UPDATE</> command,
 325      <literal>accountno</> refers to the column <literal>bank.accountno</>,
 326      so <literal>tf1.accountno</> must be used to refer to the argument.
 327      We could of course avoid this by using a different name for the argument.
 328     </para>
 329
 330     <para>
 331      In practice one would probably like a more useful result from the
 332      function than a constant 1, so a more likely definition
 333      is:
 334
 335 <programlisting>
 336 CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
 337     UPDATE bank
 338         SET balance = balance - debit
 339         WHERE accountno = tf1.accountno;
 340     SELECT balance FROM bank WHERE accountno = tf1.accountno;
 341 $$ LANGUAGE SQL;
 342 </programlisting>
 343
 344      which adjusts the balance and returns the new balance.
 345      The same thing could be done in one command using <literal>RETURNING</>:
 346
 347 <programlisting>
 348 CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
 349     UPDATE bank
 350         SET balance = balance - debit
 351         WHERE accountno = tf1.accountno
 352     RETURNING balance;
 353 $$ LANGUAGE SQL;
 354 </programlisting>
 355     </para>
 356    </sect2>
 357
 358    <sect2 id="xfunc-sql-composite-functions">
 359     <title><acronym>SQL</acronym> Functions on Composite Types</title>
 360
 361     <para>
 362      When writing functions with arguments of composite types, we must not
 363      only specify which argument we want but also the desired attribute
 364      (field) of that argument.  For example, suppose that
 365      <type>emp</type> is a table containing employee data, and therefore
 366      also the name of the composite type of each row of the table.  Here
 367      is a function <function>double_salary</function> that computes what someone's
 368      salary would be if it were doubled:
 369
 370 <screen>
 371 CREATE TABLE emp (
 372     name        text,
 373     salary      numeric,
 374     age         integer,
 375     cubicle     point
 376 );
 377
 378 INSERT INTO emp VALUES ('Bill', 4200, 45, '(2,1)');
 379
 380 CREATE FUNCTION double_salary(emp) RETURNS numeric AS $$
 381     SELECT $1.salary * 2 AS salary;
 382 $$ LANGUAGE SQL;
 383
 384 SELECT name, double_salary(emp.*) AS dream
 385     FROM emp
 386     WHERE emp.cubicle ~= point '(2,1)';
 387
 388  name | dream
 389 ------+-------
 390  Bill |  8400
 391 </screen>
 392     </para>
 393
 394     <para>
 395      Notice the use of the syntax <literal>$1.salary</literal>
 396      to select one field of the argument row value.  Also notice
 397      how the calling <command>SELECT</> command
 398      uses <replaceable>table_name</><literal>.*</> to select
 399      the entire current row of a table as a composite value.  The table
 400      row can alternatively be referenced using just the table name,
 401      like this:
 402 <screen>
 403 SELECT name, double_salary(emp) AS dream
 404     FROM emp
 405     WHERE emp.cubicle ~= point '(2,1)';
 406 </screen>
 407      but this usage is deprecated since it's easy to get confused.
 408      (See <xref linkend="rowtypes-usage"> for details about these
 409      two notations for the composite value of a table row.)
 410     </para>
 411
 412     <para>
 413      Sometimes it is handy to construct a composite argument value
 414      on-the-fly.  This can be done with the <literal>ROW</> construct.
 415      For example, we could adjust the data being passed to the function:
 416 <screen>
 417 SELECT name, double_salary(ROW(name, salary*1.1, age, cubicle)) AS dream
 418     FROM emp;
 419 </screen>
 420     </para>
 421
 422     <para>
 423      It is also possible to build a function that returns a composite type.
 424      This is an example of a function
 425      that returns a single <type>emp</type> row:
 426
 427 <programlisting>
 428 CREATE FUNCTION new_emp() RETURNS emp AS $$
 429     SELECT text 'None' AS name,
 430         1000.0 AS salary,
 431         25 AS age,
 432         point '(2,2)' AS cubicle;
 433 $$ LANGUAGE SQL;
 434 </programlisting>
 435
  436      In this example we have specified each of the attributes
  437      with a constant value, but any computation
 438      could have been substituted for these constants.
 439     </para>
 440
 441     <para>
 442      Note two important things about defining the function:
 443
 444      <itemizedlist>
 445       <listitem>
 446        <para>
 447         The select list order in the query must be exactly the same as
 448         that in which the columns appear in the table associated
 449         with the composite type.  (Naming the columns, as we did above,
 450         is irrelevant to the system.)
 451        </para>
 452       </listitem>
 453       <listitem>
 454        <para>
 455         You must typecast the expressions to match the
 456         definition of the composite type, or you will get errors like this:
 457 <screen>
 458 <computeroutput>
 459 ERROR:  function declared to return emp returns varchar instead of text at column 1
 460 </computeroutput>
 461 </screen>
 462        </para>
 463       </listitem>
 464      </itemizedlist>
 465     </para>
 466
 467     <para>
 468      A different way to define the same function is:
 469
 470 <programlisting>
 471 CREATE FUNCTION new_emp() RETURNS emp AS $$
 472     SELECT ROW('None', 1000.0, 25, '(2,2)')::emp;
 473 $$ LANGUAGE SQL;
 474 </programlisting>
 475
 476      Here we wrote a <command>SELECT</> that returns just a single
 477      column of the correct composite type.  This isn't really better
 478      in this situation, but it is a handy alternative in some cases
 479      &mdash; for example, if we need to compute the result by calling
 480      another function that returns the desired composite value.
 481     </para>
 482
 483     <para>
 484      We could call this function directly either by using it in
 485      a value expression:
 486
 487 <screen>
 488 SELECT new_emp();
 489
 490          new_emp
 491 --------------------------
 492  (None,1000.0,25,"(2,2)")
 493 </screen>
 494
 495      or by calling it as a table function:
 496
 497 <screen>
 498 SELECT * FROM new_emp();
 499
 500  name | salary | age | cubicle
 501 ------+--------+-----+---------
 502  None | 1000.0 |  25 | (2,2)
 503 </screen>
 504
 505      The second way is described more fully in <xref
 506      linkend="xfunc-sql-table-functions">.
 507     </para>
 508
 509     <para>
 510      When you use a function that returns a composite type,
 511      you might want only one field (attribute) from its result.
 512      You can do that with syntax like this:
 513
 514 <screen>
 515 SELECT (new_emp()).name;
 516
 517  name
 518 ------
 519  None
 520 </screen>
 521
 522      The extra parentheses are needed to keep the parser from getting
 523      confused.  If you try to do it without them, you get something like this:
 524
 525 <screen>
 526 SELECT new_emp().name;
 527 ERROR:  syntax error at or near "."
 528 LINE 1: SELECT new_emp().name;
 529                         ^
 530 </screen>
 531     </para>
 532
 533     <para>
 534      Another option is to use functional notation for extracting an attribute:
 535
 536 <screen>
 537 SELECT name(new_emp());
 538
 539  name
 540 ------
 541  None
 542 </screen>
 543
 544      As explained in <xref linkend="rowtypes-usage">, the field notation and
 545      functional notation are equivalent.
 546     </para>
 547
 548     <para>
 549      Another way to use a function returning a composite type is to pass the
 550      result to another function that accepts the correct row type as input:
 551
 552 <screen>
 553 CREATE FUNCTION getname(emp) RETURNS text AS $$
 554     SELECT $1.name;
 555 $$ LANGUAGE SQL;
 556
 557 SELECT getname(new_emp());
 558  getname
 559 ---------
 560  None
 561 (1 row)
 562 </screen>
 563     </para>
 564    </sect2>
 565
 566    <sect2 id="xfunc-output-parameters">
 567     <title><acronym>SQL</> Functions with Output Parameters</title>
 568
 569    <indexterm>
 570     <primary>function</primary>
 571     <secondary>output parameter</secondary>
 572    </indexterm>
 573
 574     <para>
 575      An alternative way of describing a function's results is to define it
 576      with <firstterm>output parameters</>, as in this example:
 577
 578 <screen>
 579 CREATE FUNCTION add_em (IN x int, IN y int, OUT sum int)
 580 AS 'SELECT x + y'
 581 LANGUAGE SQL;
 582
 583 SELECT add_em(3,7);
 584  add_em
 585 --------
 586      10
 587 (1 row)
 588 </screen>
 589
 590      This is not essentially different from the version of <literal>add_em</>
 591      shown in <xref linkend="xfunc-sql-base-functions">.  The real value of
 592      output parameters is that they provide a convenient way of defining
 593      functions that return several columns.  For example,
 594
 595 <screen>
 596 CREATE FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int)
 597 AS 'SELECT x + y, x * y'
 598 LANGUAGE SQL;
 599
  600 SELECT * FROM sum_n_product(11,42);
 601  sum | product
 602 -----+---------
 603   53 |     462
 604 (1 row)
 605 </screen>
 606
 607      What has essentially happened here is that we have created an anonymous
 608      composite type for the result of the function.  The above example has
 609      the same end result as
 610
 611 <screen>
 612 CREATE TYPE sum_prod AS (sum int, product int);
 613
 614 CREATE FUNCTION sum_n_product (int, int) RETURNS sum_prod
 615 AS 'SELECT $1 + $2, $1 * $2'
 616 LANGUAGE SQL;
 617 </screen>
 618
 619      but not having to bother with the separate composite type definition
 620      is often handy.  Notice that the names attached to the output parameters
 621      are not just decoration, but determine the column names of the anonymous
 622      composite type.  (If you omit a name for an output parameter, the
 623      system will choose a name on its own.)
 624     </para>
 625
 626     <para>
 627      Notice that output parameters are not included in the calling argument
 628      list when invoking such a function from SQL.  This is because
 629      <productname>PostgreSQL</productname> considers only the input
 630      parameters to define the function's calling signature.  That means
 631      also that only the input parameters matter when referencing the function
 632      for purposes such as dropping it.  We could drop the above function
 633      with either of
 634
 635 <screen>
 636 DROP FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int);
 637 DROP FUNCTION sum_n_product (int, int);
 638 </screen>
 639     </para>
 640
 641     <para>
 642      Parameters can be marked as <literal>IN</> (the default),
 643      <literal>OUT</>, <literal>INOUT</>, or <literal>VARIADIC</>.
 644      An <literal>INOUT</>
 645      parameter serves as both an input parameter (part of the calling
 646      argument list) and an output parameter (part of the result record type).
 647      <literal>VARIADIC</> parameters are input parameters, but are treated
 648      specially as described next.
 649     </para>
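    <para>
     As a small sketch of an <literal>INOUT</> parameter, the
     hypothetical function <function>double_it</function> below takes one
     argument and returns a result of the same type through that same
     parameter, so no <literal>RETURNS</> clause is needed:

<programlisting>
-- Hypothetical example: v is both the input and the result column.
CREATE FUNCTION double_it (INOUT v int)
AS 'SELECT $1 * 2'
LANGUAGE SQL;

SELECT double_it(21);   -- returns 42
</programlisting>
    </para>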
 650    </sect2>
 651
 652    <sect2 id="xfunc-sql-variadic-functions">
 653     <title><acronym>SQL</> Functions with Variable Numbers of Arguments</title>
 654
 655     <indexterm>
 656      <primary>function</primary>
 657      <secondary>variadic</secondary>
 658     </indexterm>
 659
 660     <indexterm>
 661      <primary>variadic function</primary>
 662     </indexterm>
 663
 664     <para>
 665      <acronym>SQL</acronym> functions can be declared to accept
 666      variable numbers of arguments, so long as all the <quote>optional</>
 667      arguments are of the same data type.  The optional arguments will be
 668      passed to the function as an array.  The function is declared by
 669      marking the last parameter as <literal>VARIADIC</>; this parameter
 670      must be declared as being of an array type.  For example:
 671
 672 <screen>
 673 CREATE FUNCTION mleast(VARIADIC arr numeric[]) RETURNS numeric AS $$
 674     SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
 675 $$ LANGUAGE SQL;
 676
 677 SELECT mleast(10, -1, 5, 4.4);
 678  mleast
 679 --------
 680      -1
 681 (1 row)
 682 </screen>
 683
 684      Effectively, all the actual arguments at or beyond the
 685      <literal>VARIADIC</> position are gathered up into a one-dimensional
 686      array, as if you had written
 687
 688 <screen>
 689 SELECT mleast(ARRAY[10, -1, 5, 4.4]);    -- doesn't work
 690 </screen>
 691
 692      You can't actually write that, though &mdash; or at least, it will
 693      not match this function definition.  A parameter marked
 694      <literal>VARIADIC</> matches one or more occurrences of its element
 695      type, not of its own type.
 696     </para>
 697
 698     <para>
 699      Sometimes it is useful to be able to pass an already-constructed array
 700      to a variadic function; this is particularly handy when one variadic
 701      function wants to pass on its array parameter to another one.  You can
 702      do that by specifying <literal>VARIADIC</> in the call:
 703
 704 <screen>
 705 SELECT mleast(VARIADIC ARRAY[10, -1, 5, 4.4]);
 706 </screen>
 707
 708      This prevents expansion of the function's variadic parameter into its
 709      element type, thereby allowing the array argument value to match
 710      normally.  <literal>VARIADIC</> can only be attached to the last
 711      actual argument of a function call.
 712     </para>
 713
 714     <para>
 715      Specifying <literal>VARIADIC</> in the call is also the only way to
 716      pass an empty array to a variadic function, for example:
 717
 718 <screen>
 719 SELECT mleast(VARIADIC ARRAY[]::numeric[]);
 720 </screen>
 721
 722      Simply writing <literal>SELECT mleast()</> does not work because a
 723      variadic parameter must match at least one actual argument.
 724      (You could define a second function also named <literal>mleast</>,
 725      with no parameters, if you wanted to allow such calls.)
 726     </para>
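    <para>
     A sketch of such a zero-argument companion function follows.
     Returning null for an empty argument list is an assumption made for
     this illustration; another result might suit a given application
     better:

<programlisting>
-- Hypothetical overload: lets SELECT mleast() succeed.
CREATE FUNCTION mleast() RETURNS numeric AS $$
    SELECT NULL::numeric;
$$ LANGUAGE SQL;

SELECT mleast();   -- returns null
</programlisting>
    </para>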
 727
 728     <para>
 729      The array element parameters generated from a variadic parameter are
 730      treated as not having any names of their own.  This means it is not
 731      possible to call a variadic function using named arguments (<xref
 732      linkend="sql-syntax-calling-funcs">), except when you specify
 733      <literal>VARIADIC</>.  For example, this will work:
 734
 735 <screen>
 736 SELECT mleast(VARIADIC arr =&gt; ARRAY[10, -1, 5, 4.4]);
 737 </screen>
 738
 739      but not these:
 740
 741 <screen>
 742 SELECT mleast(arr =&gt; 10);
 743 SELECT mleast(arr =&gt; ARRAY[10, -1, 5, 4.4]);
 744 </screen>
 745     </para>
 746    </sect2>
 747
 748    <sect2 id="xfunc-sql-parameter-defaults">
 749     <title><acronym>SQL</> Functions with Default Values for Arguments</title>
 750
 751     <indexterm>
 752      <primary>function</primary>
 753      <secondary>default values for arguments</secondary>
 754     </indexterm>
 755
 756     <para>
 757      Functions can be declared with default values for some or all input
 758      arguments.  The default values are inserted whenever the function is
  759      called with fewer actual arguments than declared.  Since arguments
 760      can only be omitted from the end of the actual argument list, all
 761      parameters after a parameter with a default value have to have
 762      default values as well.  (Although the use of named argument notation
 763      could allow this restriction to be relaxed, it's still enforced so that
 764      positional argument notation works sensibly.)
 765     </para>
 766
 767     <para>
 768      For example:
 769 <screen>
 770 CREATE FUNCTION foo(a int, b int DEFAULT 2, c int DEFAULT 3)
 771 RETURNS int
 772 LANGUAGE SQL
 773 AS $$
 774     SELECT $1 + $2 + $3;
 775 $$;
 776
 777 SELECT foo(10, 20, 30);
 778  foo
 779 -----
 780   60
 781 (1 row)
 782
 783 SELECT foo(10, 20);
 784  foo
 785 -----
 786   33
 787 (1 row)
 788
 789 SELECT foo(10);
 790  foo
 791 -----
 792   15
 793 (1 row)
 794
 795 SELECT foo();  -- fails since there is no default for the first argument
 796 ERROR:  function foo() does not exist
 797 </screen>
 798      The <literal>=</literal> sign can also be used in place of the
 799      key word <literal>DEFAULT</literal>.
 800     </para>
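    <para>
     As a brief illustration using the function <function>foo</function>
     defined above, named argument notation lets a call supply
     <literal>c</> while leaving <literal>b</> at its default.  This
     sketch assumes a server version that accepts the
     <literal>=&gt;</literal> call syntax:

<programlisting>
-- b is omitted and takes its default of 2, so the result is 10 + 2 + 30.
SELECT foo(a =&gt; 10, c =&gt; 30);   -- returns 42
</programlisting>
    </para>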
 801    </sect2>
 802
 803    <sect2 id="xfunc-sql-table-functions">
 804     <title><acronym>SQL</acronym> Functions as Table Sources</title>
 805
 806     <para>
 807      All SQL functions can be used in the <literal>FROM</> clause of a query,
  808      but doing so is particularly useful for functions returning composite types.
 809      If the function is defined to return a base type, the table function
 810      produces a one-column table.  If the function is defined to return
 811      a composite type, the table function produces a column for each attribute
 812      of the composite type.
 813     </para>
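    <para>
     For instance, the base-type function <function>one</function>
     defined in <xref linkend="xfunc-sql-base-functions"> can be called
     as a table function, in which case it yields a single column named
     after the function:

<screen>
SELECT * FROM one();

 one
-----
   1
(1 row)
</screen>
    </para>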
 814
 815     <para>
 816      Here is an example:
 817
 818 <screen>
 819 CREATE TABLE foo (fooid int, foosubid int, fooname text);
 820 INSERT INTO foo VALUES (1, 1, 'Joe');
 821 INSERT INTO foo VALUES (1, 2, 'Ed');
 822 INSERT INTO foo VALUES (2, 1, 'Mary');
 823
 824 CREATE FUNCTION getfoo(int) RETURNS foo AS $$
 825     SELECT * FROM foo WHERE fooid = $1;
 826 $$ LANGUAGE SQL;
 827
 828 SELECT *, upper(fooname) FROM getfoo(1) AS t1;
 829
 830  fooid | foosubid | fooname | upper
 831 -------+----------+---------+-------
 832      1 |        1 | Joe     | JOE
 833 (1 row)
 834 </screen>
 835
 836      As the example shows, we can work with the columns of the function's
 837      result just the same as if they were columns of a regular table.
 838     </para>
 839
 840     <para>
 841      Note that we only got one row out of the function.  This is because
 842      we did not use <literal>SETOF</>.  That is described in the next section.
 843     </para>
 844    </sect2>
 845
 846    <sect2 id="xfunc-sql-functions-returning-set">
 847     <title><acronym>SQL</acronym> Functions Returning Sets</title>
 848
 849     <indexterm>
 850      <primary>function</primary>
 851      <secondary>with SETOF</secondary>
 852     </indexterm>
 853
 854     <para>
 855      When an SQL function is declared as returning <literal>SETOF
 856      <replaceable>sometype</></literal>, the function's final
 857      query is executed to completion, and each row it
 858      outputs is returned as an element of the result set.
 859     </para>
 860
 861     <para>
 862      This feature is normally used when calling the function in the <literal>FROM</>
 863      clause.  In this case each row returned by the function becomes
 864      a row of the table seen by the query.  For example, assume that
 865      table <literal>foo</> has the same contents as above, and we say:
 866
 867 <programlisting>
 868 CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS $$
 869     SELECT * FROM foo WHERE fooid = $1;
 870 $$ LANGUAGE SQL;
 871
 872 SELECT * FROM getfoo(1) AS t1;
 873 </programlisting>
 874
 875      Then we would get:
 876 <screen>
 877  fooid | foosubid | fooname
 878 -------+----------+---------
 879      1 |        1 | Joe
 880      1 |        2 | Ed
 881 (2 rows)
 882 </screen>
 883     </para>
 884
 885     <para>
 886      It is also possible to return multiple rows with the columns defined by
 887      output parameters, like this:
 888
<programlisting>
CREATE TABLE tab (y int, z int);
INSERT INTO tab VALUES (1, 2), (3, 4), (5, 6), (7, 8);

CREATE FUNCTION sum_n_product_with_tab (x int, OUT sum int, OUT product int)
RETURNS SETOF record
AS $$
    SELECT $1 + tab.y, $1 * tab.y FROM tab;
$$ LANGUAGE SQL;

SELECT * FROM sum_n_product_with_tab(10);
 sum | product
-----+---------
  11 |      10
  13 |      30
  15 |      50
  17 |      70
(4 rows)
</programlisting>
 908
 909      The key point here is that you must write <literal>RETURNS SETOF record</>
 910      to indicate that the function returns multiple rows instead of just one.
 911      If there is only one output parameter, write that parameter's type
 912      instead of <type>record</>.
 913     </para>
 914
 915     <para>
 916      It is frequently useful to construct a query's result by invoking a
 917      set-returning function multiple times, with the parameters for each
 918      invocation coming from successive rows of a table or subquery.  The
 919      preferred way to do this is to use the <literal>LATERAL</> key word,
 920      which is described in <xref linkend="queries-lateral">.
 921      Here is an example using a set-returning function to enumerate
 922      elements of a tree structure:
 923
<screen>
SELECT * FROM nodes;
   name    | parent
-----------+--------
 Top       |
 Child1    | Top
 Child2    | Top
 Child3    | Top
 SubChild1 | Child1
 SubChild2 | Child1
(6 rows)

CREATE FUNCTION listchildren(text) RETURNS SETOF text AS $$
    SELECT name FROM nodes WHERE parent = $1
$$ LANGUAGE SQL STABLE;

SELECT * FROM listchildren('Top');
 listchildren
--------------
 Child1
 Child2
 Child3
(3 rows)

SELECT name, child FROM nodes, LATERAL listchildren(name) AS child;
  name  |   child
--------+-----------
 Top    | Child1
 Top    | Child2
 Top    | Child3
 Child1 | SubChild1
 Child1 | SubChild2
(5 rows)
</screen>
 958
 959      This example does not do anything that we couldn't have done with a
 960      simple join, but in more complex calculations the option to put
 961      some of the work into a function can be quite convenient.
 962     </para>
 963
 964     <para>
 965      Functions returning sets can also be called in the select list
 966      of a query.  For each row that the query
 967      generates by itself, the set-returning function is invoked, and an output
 968      row is generated for each element of the function's result set.
 969      The previous example could also be done with queries like
 970      these:
 971
<screen>
SELECT listchildren('Top');
 listchildren
--------------
 Child1
 Child2
 Child3
(3 rows)

SELECT name, listchildren(name) FROM nodes;
  name  | listchildren
--------+--------------
 Top    | Child1
 Top    | Child2
 Top    | Child3
 Child1 | SubChild1
 Child1 | SubChild2
(5 rows)
</screen>
 991
 992      In the last <command>SELECT</command>,
 993      notice that no output row appears for <literal>Child2</>, <literal>Child3</>, etc.
 994      This happens because <function>listchildren</function> returns an empty set
 995      for those arguments, so no result rows are generated.  This is the same
 996      behavior as we got from an inner join to the function result when using
 997      the <literal>LATERAL</> syntax.
 998     </para>
 999
1000     <para>
1001      <productname>PostgreSQL</>'s behavior for a set-returning function in a
1002      query's select list is almost exactly the same as if the set-returning
1003      function had been written in a <literal>LATERAL FROM</>-clause item
1004      instead.  For example,
<programlisting>
SELECT x, generate_series(1,5) AS g FROM tab;
</programlisting>
     is almost equivalent to
<programlisting>
SELECT x, g FROM tab, LATERAL generate_series(1,5) AS g;
</programlisting>
1012      It would be exactly the same, except that in this specific example,
1013      the planner could choose to put <structname>g</> on the outside of the
1014      nestloop join, since <structname>g</> has no actual lateral dependency
1015      on <structname>tab</>.  That would result in a different output row
1016      order.  Set-returning functions in the select list are always evaluated
1017      as though they are on the inside of a nestloop join with the rest of
1018      the <literal>FROM</> clause, so that the function(s) are run to
1019      completion before the next row from the <literal>FROM</> clause is
1020      considered.
1021     </para>
1022
1023     <para>
1024      If there is more than one set-returning function in the query's select
1025      list, the behavior is similar to what you get from putting the functions
1026      into a single <literal>LATERAL ROWS FROM( ... )</> <literal>FROM</>-clause
1027      item.  For each row from the underlying query, there is an output row
1028      using the first result from each function, then an output row using the
1029      second result, and so on.  If some of the set-returning functions
1030      produce fewer outputs than others, null values are substituted for the
1031      missing data, so that the total number of rows emitted for one
1032      underlying row is the same as for the set-returning function that
1033      produced the most outputs.  Thus the set-returning functions
1034      run <quote>in lockstep</> until they are all exhausted, and then
1035      execution continues with the next underlying row.
1036     </para>
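    <para>
     As a minimal sketch of this lockstep behavior (assuming
     <productname>PostgreSQL</> 10 or later and using only the built-in
     <function>generate_series</> function), the following query pairs a
     three-row result with a two-row result; the shorter one is padded with
     a null value:

<screen>
SELECT generate_series(1, 3) AS a, generate_series(1, 2) AS b;
 a | b
---+---
 1 | 1
 2 | 2
 3 |
(3 rows)
</screen>

     The same rows could be obtained by writing both functions in a single
     <literal>ROWS FROM</> item (the alias <literal>t(a, b)</> is chosen
     here only for illustration):

<programlisting>
SELECT * FROM ROWS FROM (generate_series(1, 3), generate_series(1, 2)) AS t(a, b);
</programlisting>
    </para>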
1037
1038     <para>
1039      Set-returning functions can be nested in a select list, although that is
1040      not allowed in <literal>FROM</>-clause items.  In such cases, each level
1041      of nesting is treated separately, as though it were
1042      a separate <literal>LATERAL ROWS FROM( ... )</> item.  For example, in
<programlisting>
SELECT srf1(srf2(x), srf3(y)), srf4(srf5(z)) FROM tab;
</programlisting>
1046      the set-returning functions <function>srf2</>, <function>srf3</>,
1047      and <function>srf5</> would be run in lockstep for each row
1048      of <structname>tab</>, and then <function>srf1</> and <function>srf4</>
1049      would be applied in lockstep to each row produced by the lower
1050      functions.
1051     </para>
1052
1053     <para>
1054      Set-returning functions cannot be used within conditional-evaluation
1055      constructs, such as <literal>CASE</> or <literal>COALESCE</>.  For
1056      example, consider
<programlisting>
SELECT x, CASE WHEN x &gt; 0 THEN generate_series(1, 5) ELSE 0 END FROM tab;
</programlisting>
1060      It might seem that this should produce five repetitions of input rows
1061      that have <literal>x &gt; 0</>, and a single repetition of those that do
1062      not; but actually, because <function>generate_series(1, 5)</> would be
1063      run in an implicit <literal>LATERAL FROM</> item before
1064      the <literal>CASE</> expression is ever evaluated, it would produce five
1065      repetitions of every input row.  To reduce confusion, such cases produce
1066      a parse-time error instead.
1067     </para>
1068
1069     <note>
1070      <para>
1071       If a function's last command is <command>INSERT</>, <command>UPDATE</>,
1072       or <command>DELETE</> with <literal>RETURNING</>, that command will
1073       always be executed to completion, even if the function is not declared
1074       with <literal>SETOF</> or the calling query does not fetch all the
1075       result rows.  Any extra rows produced by the <literal>RETURNING</>
1076       clause are silently dropped, but the commanded table modifications
1077       still happen (and are all completed before returning from the function).
1078      </para>
1079     </note>
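
    <para>
     The note above can be illustrated with a small sketch.  The table
     <structname>pending_jobs</> and the function
     <function>clear_pending_jobs</> are hypothetical names invented for
     this example:

<programlisting>
-- Hypothetical table, used only for illustration.
CREATE TABLE pending_jobs (id int);
INSERT INTO pending_jobs VALUES (1), (2), (3);

-- Not declared SETOF, so the caller receives at most one id,
-- but the DELETE itself still runs to completion.
CREATE FUNCTION clear_pending_jobs() RETURNS int AS $$
    DELETE FROM pending_jobs RETURNING id;
$$ LANGUAGE SQL;

SELECT clear_pending_jobs();         -- returns a single id
SELECT count(*) FROM pending_jobs;   -- per the note, no rows remain
</programlisting>
    </para>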
1080
1081     <note>
1082      <para>
1083       Before <productname>PostgreSQL</> 10, putting more than one
1084       set-returning function in the same select list did not behave very
1085       sensibly unless they always produced equal numbers of rows.  Otherwise,
1086       what you got was a number of output rows equal to the least common
1087       multiple of the numbers of rows produced by the set-returning
1088       functions.  Also, nested set-returning functions did not work as
1089       described above; instead, a set-returning function could have at most
1090       one set-returning argument, and each nest of set-returning functions
1091       was run independently.  Also, conditional execution (set-returning
      functions inside <literal>CASE</> etc.) was previously allowed,
1093       complicating things even more.
1094       Use of the <literal>LATERAL</> syntax is recommended when writing
1095       queries that need to work in older <productname>PostgreSQL</> versions,
1096       because that will give consistent results across different versions.
1097       If you have a query that is relying on conditional execution of a
1098       set-returning function, you may be able to fix it by moving the
1099       conditional test into a custom set-returning function.  For example,
<programlisting>
SELECT x, CASE WHEN y &gt; 0 THEN generate_series(1, z) ELSE 5 END FROM tab;
</programlisting>
      could become
<programlisting>
CREATE FUNCTION case_generate_series(cond bool, start int, fin int, els int)
  RETURNS SETOF int AS $$
BEGIN
  IF cond THEN
    RETURN QUERY SELECT generate_series(start, fin);
  ELSE
    RETURN QUERY SELECT els;
  END IF;
END$$ LANGUAGE plpgsql;

SELECT x, case_generate_series(y &gt; 0, 1, z, 5) FROM tab;
</programlisting>
1117       This formulation will work the same in all versions
1118       of <productname>PostgreSQL</>.
1119      </para>
1120     </note>
1121    </sect2>
1122
1123    <sect2 id="xfunc-sql-functions-returning-table">
1124     <title><acronym>SQL</acronym> Functions Returning <literal>TABLE</></title>
1125
1126     <indexterm>
1127      <primary>function</primary>
1128      <secondary>RETURNS TABLE</secondary>
1129     </indexterm>
1130
1131     <para>
1132      There is another way to declare a function as returning a set,
1133      which is to use the syntax
1134      <literal>RETURNS TABLE(<replaceable>columns</>)</literal>.
1135      This is equivalent to using one or more <literal>OUT</> parameters plus
1136      marking the function as returning <literal>SETOF record</> (or
1137      <literal>SETOF</> a single output parameter's type, as appropriate).
1138      This notation is specified in recent versions of the SQL standard, and
1139      thus may be more portable than using <literal>SETOF</>.
1140     </para>
1141
1142     <para>
1143      For example, the preceding sum-and-product example could also be
1144      done this way:
1145
<programlisting>
CREATE FUNCTION sum_n_product_with_tab (x int)
RETURNS TABLE(sum int, product int) AS $$
    SELECT $1 + tab.y, $1 * tab.y FROM tab;
$$ LANGUAGE SQL;
</programlisting>
1152
     Explicit <literal>OUT</> or <literal>INOUT</> parameters cannot be used
     with the <literal>RETURNS TABLE</> notation; all the output columns must
     be listed in the <literal>TABLE</> list.
1156     </para>
1157    </sect2>
1158
1159    <sect2>
1160     <title>Polymorphic <acronym>SQL</acronym> Functions</title>
1161
1162     <para>
1163      <acronym>SQL</acronym> functions can be declared to accept and
1164      return the polymorphic types <type>anyelement</type>,
1165      <type>anyarray</type>, <type>anynonarray</type>,
1166      <type>anyenum</type>, and <type>anyrange</type>.  See <xref
1167      linkend="extend-types-polymorphic"> for a more detailed
1168      explanation of polymorphic functions. Here is a polymorphic
1169      function <function>make_array</function> that builds up an array
1170      from two arbitrary data type elements:
<screen>
CREATE FUNCTION make_array(anyelement, anyelement) RETURNS anyarray AS $$
    SELECT ARRAY[$1, $2];
$$ LANGUAGE SQL;

SELECT make_array(1, 2) AS intarray, make_array('a'::text, 'b') AS textarray;
 intarray | textarray
----------+-----------
 {1,2}    | {a,b}
(1 row)
</screen>
1182     </para>
1183
1184     <para>
1185      Notice the use of the typecast <literal>'a'::text</literal>
1186      to specify that the argument is of type <type>text</type>. This is
1187      required if the argument is just a string literal, since otherwise
1188      it would be treated as type
1189      <type>unknown</type>, and array of <type>unknown</type> is not a valid
1190      type.
1191      Without the typecast, you will get errors like this:
<screen>
<computeroutput>
ERROR:  could not determine polymorphic type because input has type "unknown"
</computeroutput>
</screen>
1197     </para>
1198
1199     <para>
1200      It is permitted to have polymorphic arguments with a fixed
1201      return type, but the converse is not. For example:
<screen>
CREATE FUNCTION is_greater(anyelement, anyelement) RETURNS boolean AS $$
    SELECT $1 &gt; $2;
$$ LANGUAGE SQL;

SELECT is_greater(1, 2);
 is_greater
------------
 f
(1 row)

CREATE FUNCTION invalid_func() RETURNS anyelement AS $$
    SELECT 1;
$$ LANGUAGE SQL;
ERROR:  cannot determine result data type
DETAIL:  A function returning a polymorphic type must have at least one polymorphic argument.
</screen>
1219     </para>
1220
1221     <para>
1222      Polymorphism can be used with functions that have output arguments.
1223      For example:
<screen>
CREATE FUNCTION dup (f1 anyelement, OUT f2 anyelement, OUT f3 anyarray)
AS 'select $1, array[$1,$1]' LANGUAGE SQL;

SELECT * FROM dup(22);
 f2 |   f3
----+---------
 22 | {22,22}
(1 row)
</screen>
1234     </para>
1235
1236     <para>
1237      Polymorphism can also be used with variadic functions.
1238      For example:
<screen>
CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
    SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
$$ LANGUAGE SQL;

SELECT anyleast(10, -1, 5, 4);
 anyleast
----------
       -1
(1 row)

SELECT anyleast('abc'::text, 'def');
 anyleast
----------
 abc
(1 row)

CREATE FUNCTION concat_values(text, VARIADIC anyarray) RETURNS text AS $$
    SELECT array_to_string($2, $1);
$$ LANGUAGE SQL;

SELECT concat_values('|', 1, 4, 2);
 concat_values
---------------
 1|4|2
(1 row)
</screen>
1266     </para>
1267    </sect2>
1268
1269    <sect2>
1270     <title><acronym>SQL</acronym> Functions with Collations</title>
1271
1272     <indexterm>
1273      <primary>collation</>
1274      <secondary>in SQL functions</>
1275     </indexterm>
1276
1277     <para>
     When an SQL function has one or more parameters of collatable data types,
1279      a collation is identified for each function call depending on the
1280      collations assigned to the actual arguments, as described in <xref
1281      linkend="collation">.  If a collation is successfully identified
1282      (i.e., there are no conflicts of implicit collations among the arguments)
1283      then all the collatable parameters are treated as having that collation
1284      implicitly.  This will affect the behavior of collation-sensitive
1285      operations within the function.  For example, using the
1286      <function>anyleast</> function described above, the result of
<programlisting>
SELECT anyleast('abc'::text, 'ABC');
</programlisting>
1290      will depend on the database's default collation.  In <literal>C</> locale
1291      the result will be <literal>ABC</>, but in many other locales it will
1292      be <literal>abc</>.  The collation to use can be forced by adding
1293      a <literal>COLLATE</> clause to any of the arguments, for example
<programlisting>
SELECT anyleast('abc'::text, 'ABC' COLLATE "C");
</programlisting>
1297      Alternatively, if you wish a function to operate with a particular
1298      collation regardless of what it is called with, insert
1299      <literal>COLLATE</> clauses as needed in the function definition.
1300      This version of <function>anyleast</> would always use <literal>en_US</>
1301      locale to compare strings:
<programlisting>
CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
    SELECT min($1[i] COLLATE "en_US") FROM generate_subscripts($1, 1) g(i);
$$ LANGUAGE SQL;
</programlisting>
1307      But note that this will throw an error if applied to a non-collatable
1308      data type.
1309     </para>
1310
1311     <para>
1312      If no common collation can be identified among the actual arguments,
     then an SQL function treats its parameters as having their data types'
1314      default collation (which is usually the database's default collation,
1315      but could be different for parameters of domain types).
1316     </para>
1317
1318     <para>
1319      The behavior of collatable parameters can be thought of as a limited
1320      form of polymorphism, applicable only to textual data types.
1321     </para>
1322    </sect2>
1323   </sect1>
1324
1325   <sect1 id="xfunc-overload">
1326    <title>Function Overloading</title>
1327
1328    <indexterm zone="xfunc-overload">
1329     <primary>overloading</primary>
1330     <secondary>functions</secondary>
1331    </indexterm>
1332
1333    <para>
1334     More than one function can be defined with the same SQL name, so long
1335     as the arguments they take are different.  In other words,
1336     function names can be <firstterm>overloaded</firstterm>.  When a
1337     query is executed, the server will determine which function to
1338     call from the data types and the number of the provided arguments.
1339     Overloading can also be used to simulate functions with a variable
1340     number of arguments, up to a finite maximum number.
1341    </para>
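
   <para>
    As a brief sketch of overloading, the two definitions below share the
    SQL name <function>type_label</> (a hypothetical name chosen for this
    example) but differ in argument type, so the server can tell the calls
    apart:

<programlisting>
CREATE FUNCTION type_label(int) RETURNS text AS $$
    SELECT 'integer: ' || $1::text;
$$ LANGUAGE SQL;

CREATE FUNCTION type_label(text) RETURNS text AS $$
    SELECT 'text: ' || $1;
$$ LANGUAGE SQL;

SELECT type_label(42);           -- resolves to type_label(int)
SELECT type_label('abc'::text);  -- resolves to type_label(text)
</programlisting>
   </para>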
1342
1343    <para>
1344     When creating a family of overloaded functions, one should be
1345     careful not to create ambiguities.  For instance, given the
1346     functions:
<programlisting>
CREATE FUNCTION test(int, real) RETURNS ...
CREATE FUNCTION test(smallint, double precision) RETURNS ...
</programlisting>
1351     it is not immediately clear which function would be called with
1352     some trivial input like <literal>test(1, 1.5)</literal>.  The
1353     currently implemented resolution rules are described in
1354     <xref linkend="typeconv">, but it is unwise to design a system that subtly
1355     relies on this behavior.
1356    </para>
1357
1358    <para>
1359     A function that takes a single argument of a composite type should
1360     generally not have the same name as any attribute (field) of that type.
1361     Recall that <literal><replaceable>attribute</>(<replaceable>table</>)</literal>
1362     is considered equivalent
1363     to <literal><replaceable>table</>.<replaceable>attribute</></literal>.
1364     In the case that there is an
1365     ambiguity between a function on a composite type and an attribute of
1366     the composite type, the attribute will always be used.  It is possible
1367     to override that choice by schema-qualifying the function name
    (that is, <literal><replaceable>schema</>.<replaceable>func</>(<replaceable>table</>)</literal>),
    but it's better to avoid the problem by not choosing conflicting names.
1371    </para>
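
   <para>
    To make the two notations concrete, here is a minimal sketch using a
    hypothetical table <structname>emp</>.  It assumes that no function
    named <function>salary</> exists, so both queries read the column:

<programlisting>
-- Hypothetical table, used only for illustration.
CREATE TABLE emp (ename text, salary numeric);

SELECT emp.salary FROM emp;   -- attribute notation
SELECT salary(emp) FROM emp;  -- functional notation for the same column
</programlisting>

    Defining a function <literal>salary(emp)</> later would create exactly
    the kind of name conflict described above.
   </para>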
1372
1373    <para>
1374     Another possible conflict is between variadic and non-variadic functions.
1375     For instance, it is possible to create both <literal>foo(numeric)</> and
1376     <literal>foo(VARIADIC numeric[])</>.  In this case it is unclear which one
1377     should be matched to a call providing a single numeric argument, such as
1378     <literal>foo(10.1)</>.  The rule is that the function appearing
1379     earlier in the search path is used, or if the two functions are in the
1380     same schema, the non-variadic one is preferred.
1381    </para>
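
   <para>
    A sketch of this rule, assuming both functions are created in the same
    schema (the function bodies are placeholders that only report which
    variant was chosen):

<programlisting>
CREATE FUNCTION foo(numeric) RETURNS text AS $$
    SELECT 'non-variadic'::text;
$$ LANGUAGE SQL;

CREATE FUNCTION foo(VARIADIC numeric[]) RETURNS text AS $$
    SELECT 'variadic'::text;
$$ LANGUAGE SQL;

SELECT foo(10.1);        -- single argument: the non-variadic function is preferred
SELECT foo(10.1, 20.2);  -- two arguments: only the variadic function matches
</programlisting>
   </para>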
1382
1383    <para>
1384     When overloading C-language functions, there is an additional
1385     constraint: The C name of each function in the family of
1386     overloaded functions must be different from the C names of all
1387     other functions, either internal or dynamically loaded.  If this
1388     rule is violated, the behavior is not portable.  You might get a
1389     run-time linker error, or one of the functions will get called
1390     (usually the internal one).  The alternative form of the
1391     <literal>AS</> clause for the SQL <command>CREATE
1392     FUNCTION</command> command decouples the SQL function name from
1393     the function name in the C source code.  For instance:
<programlisting>
CREATE FUNCTION test(int) RETURNS int
    AS '<replaceable>filename</>', 'test_1arg'
    LANGUAGE C;
CREATE FUNCTION test(int, int) RETURNS int
    AS '<replaceable>filename</>', 'test_2arg'
    LANGUAGE C;
</programlisting>
1402     The names of the C functions here reflect one of many possible conventions.
1403    </para>
1404   </sect1>
1405
1406   <sect1 id="xfunc-volatility">
1407    <title>Function Volatility Categories</title>
1408
1409    <indexterm zone="xfunc-volatility">
1410     <primary>volatility</primary>
1411     <secondary>functions</secondary>
1412    </indexterm>
1413    <indexterm zone="xfunc-volatility">
1414     <primary>VOLATILE</primary>
1415    </indexterm>
1416    <indexterm zone="xfunc-volatility">
1417     <primary>STABLE</primary>
1418    </indexterm>
1419    <indexterm zone="xfunc-volatility">
1420     <primary>IMMUTABLE</primary>
1421    </indexterm>
1422
1423    <para>
1424     Every function has a <firstterm>volatility</> classification, with
1425     the possibilities being <literal>VOLATILE</>, <literal>STABLE</>, or
1426     <literal>IMMUTABLE</>.  <literal>VOLATILE</> is the default if the
1427     <xref linkend="sql-createfunction">
1428     command does not specify a category.  The volatility category is a
1429     promise to the optimizer about the behavior of the function:
1430
1431    <itemizedlist>
1432     <listitem>
1433      <para>
1434       A <literal>VOLATILE</> function can do anything, including modifying
1435       the database.  It can return different results on successive calls with
1436       the same arguments.  The optimizer makes no assumptions about the
1437       behavior of such functions.  A query using a volatile function will
1438       re-evaluate the function at every row where its value is needed.
1439      </para>
1440     </listitem>
1441     <listitem>
1442      <para>
1443       A <literal>STABLE</> function cannot modify the database and is
1444       guaranteed to return the same results given the same arguments
      for all rows within a single statement. This category allows the
      optimizer to collapse multiple calls of the function into a single
      call. In particular, it is safe to use an expression containing
1448       such a function in an index scan condition. (Since an index scan
1449       will evaluate the comparison value only once, not once at each
1450       row, it is not valid to use a <literal>VOLATILE</> function in an
1451       index scan condition.)
1452      </para>
1453     </listitem>
1454     <listitem>
1455      <para>
1456       An <literal>IMMUTABLE</> function cannot modify the database and is
1457       guaranteed to return the same results given the same arguments forever.
1458       This category allows the optimizer to pre-evaluate the function when
1459       a query calls it with constant arguments.  For example, a query like
1460       <literal>SELECT ... WHERE x = 2 + 2</> can be simplified on sight to
1461       <literal>SELECT ... WHERE x = 4</>, because the function underlying
1462       the integer addition operator is marked <literal>IMMUTABLE</>.
1463      </para>
1464     </listitem>
1465    </itemizedlist>
1466    </para>
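
   <para>
    The following sketch shows one function in each category.  The function
    names and the sequence <structname>ticket_seq</> are hypothetical, made
    up for illustration:

<programlisting>
-- IMMUTABLE: the result depends only on the argument.
CREATE FUNCTION add_one(int) RETURNS int AS $$
    SELECT $1 + 1;
$$ LANGUAGE SQL IMMUTABLE;

-- STABLE: current_date is fixed within a statement but changes over time.
CREATE FUNCTION days_from_today(int) RETURNS date AS $$
    SELECT current_date + $1;
$$ LANGUAGE SQL STABLE;

-- VOLATILE: nextval() has a side effect and returns a new value on each call.
CREATE SEQUENCE ticket_seq;
CREATE FUNCTION next_ticket() RETURNS bigint AS $$
    SELECT nextval('ticket_seq');
$$ LANGUAGE SQL VOLATILE;
</programlisting>
   </para>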
1467
1468    <para>
1469     For best optimization results, you should label your functions with the
1470     strictest volatility category that is valid for them.
1471    </para>
1472
1473    <para>
    Any function with side effects <emphasis>must</> be labeled
    <literal>VOLATILE</>, so that calls to it cannot be optimized away.
    Even a function with no side effects needs to be labeled
    <literal>VOLATILE</> if its value can change within a single query;
    some examples are <literal>random()</>, <literal>currval()</>, and
    <literal>timeofday()</>.
1480    </para>
1481
1482    <para>
    Another important example is that the <function>current_timestamp</>
    family of functions qualifies as <literal>STABLE</>, since their values do
    not change within a transaction.
1486    </para>
1487
1488    <para>
1489     There is relatively little difference between <literal>STABLE</> and
1490     <literal>IMMUTABLE</> categories when considering simple interactive
1491     queries that are planned and immediately executed: it doesn't matter
1492     a lot whether a function is executed once during planning or once during
1493     query execution startup.  But there is a big difference if the plan is
1494     saved and reused later.  Labeling a function <literal>IMMUTABLE</> when
1495     it really isn't might allow it to be prematurely folded to a constant during
1496     planning, resulting in a stale value being re-used during subsequent uses
1497     of the plan.  This is a hazard when using prepared statements or when
1498     using function languages that cache plans (such as
1499     <application>PL/pgSQL</>).
1500    </para>
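
   <para>
    The hazard can be sketched as follows.  The function
    <function>mislabeled_now</> and the statement name
    <literal>show_time</> are hypothetical, and the function is
    deliberately labeled incorrectly:

<programlisting>
-- Wrong: now() changes from one transaction to the next,
-- so this function is not really IMMUTABLE.
CREATE FUNCTION mislabeled_now() RETURNS timestamptz AS $$
    SELECT now();
$$ LANGUAGE SQL IMMUTABLE;

PREPARE show_time AS SELECT mislabeled_now();

EXECUTE show_time;
-- A later EXECUTE that reuses the saved plan might return the
-- timestamp that was folded in at planning time, not the current one.
EXECUTE show_time;
</programlisting>

    Labeling the function <literal>STABLE</> instead avoids the problem.
   </para>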
1501
1502    <para>
1503     For functions written in SQL or in any of the standard procedural
1504     languages, there is a second important property determined by the
1505     volatility category, namely the visibility of any data changes that have
1506     been made by the SQL command that is calling the function.  A
1507     <literal>VOLATILE</> function will see such changes, a <literal>STABLE</>
1508     or <literal>IMMUTABLE</> function will not.  This behavior is implemented
1509     using the snapshotting behavior of MVCC (see <xref linkend="mvcc">):
1510     <literal>STABLE</> and <literal>IMMUTABLE</> functions use a snapshot
1511     established as of the start of the calling query, whereas
1512     <literal>VOLATILE</> functions obtain a fresh snapshot at the start of
1513     each query they execute.
1514    </para>
1515
1516    <note>
1517     <para>
1518      Functions written in C can manage snapshots however they want, but it's
1519      usually a good idea to make C functions work this way too.
1520     </para>
1521    </note>
1522
1523    <para>
1524     Because of this snapshotting behavior,
1525     a function containing only <command>SELECT</> commands can safely be
1526     marked <literal>STABLE</>, even if it selects from tables that might be
1527     undergoing modifications by concurrent queries.
1528     <productname>PostgreSQL</productname> will execute all commands of a
1529     <literal>STABLE</> function using the snapshot established for the
1530     calling query, and so it will see a fixed view of the database throughout
1531     that query.
1532    </para>
1533
1534    <para>
1535     The same snapshotting behavior is used for <command>SELECT</> commands
1536     within <literal>IMMUTABLE</> functions.  It is generally unwise to select
1537     from database tables within an <literal>IMMUTABLE</> function at all,
1538     since the immutability will be broken if the table contents ever change.
    However, <productname>PostgreSQL</productname> does not prevent you
    from doing so.
1541    </para>
1542
1543    <para>
1544     A common error is to label a function <literal>IMMUTABLE</> when its
1545     results depend on a configuration parameter.  For example, a function
1546     that manipulates timestamps might well have results that depend on the
1547     <xref linkend="guc-timezone"> setting.  For safety, such functions should
1548     be labeled <literal>STABLE</> instead.
1549    </para>
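
   <para>
    As an illustrative sketch, the hypothetical function
    <function>local_hour</> below extracts the hour of a
    <type>timestamp with time zone</> value.  Its result depends on the
    time zone setting, so it is labeled <literal>STABLE</>:

<programlisting>
CREATE FUNCTION local_hour(timestamp with time zone) RETURNS int AS $$
    SELECT extract(hour FROM $1)::int;
$$ LANGUAGE SQL STABLE;

SET timezone = 'UTC';
SELECT local_hour('2000-01-01 12:00:00+00');

SET timezone = 'Asia/Tokyo';
SELECT local_hour('2000-01-01 12:00:00+00');  -- same argument, different result
</programlisting>
   </para>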
1550
1551    <note>
1552     <para>
1553      <productname>PostgreSQL</productname> requires that <literal>STABLE</>
1554      and <literal>IMMUTABLE</> functions contain no SQL commands other
1555      than <command>SELECT</> to prevent data modification.
1556      (This is not a completely bulletproof test, since such functions could
1557      still call <literal>VOLATILE</> functions that modify the database.
1558      If you do that, you will find that the <literal>STABLE</> or
1559      <literal>IMMUTABLE</> function does not notice the database changes
1560      applied by the called function, since they are hidden from its snapshot.)
1561     </para>
1562    </note>
1563   </sect1>
1564
1565   <sect1 id="xfunc-pl">
1566    <title>Procedural Language Functions</title>
1567
1568    <para>
1569     <productname>PostgreSQL</productname> allows user-defined functions
1570     to be written in other languages besides SQL and C.  These other
1571     languages are generically called <firstterm>procedural
1572     languages</firstterm> (<acronym>PL</>s).
1573     Procedural languages aren't built into the
1574     <productname>PostgreSQL</productname> server; they are offered
1575     by loadable modules.
1576     See <xref linkend="xplang"> and following chapters for more
1577     information.
1578    </para>
1579   </sect1>
1580
1581   <sect1 id="xfunc-internal">
1582    <title>Internal Functions</title>
1583
1584    <indexterm zone="xfunc-internal"><primary>function</><secondary>internal</></>
1585
1586    <para>
1587     Internal functions are functions written in C that have been statically
1588     linked into the <productname>PostgreSQL</productname> server.
1589     The <quote>body</quote> of the function definition
1590     specifies the C-language name of the function, which need not be the
1591     same as the name being declared for SQL use.
1592     (For reasons of backward compatibility, an empty body
1593     is accepted as meaning that the C-language function name is the
1594     same as the SQL name.)
1595    </para>
1596
1597    <para>
1598     Normally, all internal functions present in the
1599     server are declared during the initialization of the database cluster
1600     (see <xref linkend="creating-cluster">),
1601     but a user could use <command>CREATE FUNCTION</command>
1602     to create additional alias names for an internal function.
1603     Internal functions are declared in <command>CREATE FUNCTION</command>
1604     with language name <literal>internal</literal>.  For instance, to
1605     create an alias for the <function>sqrt</function> function:
<programlisting>
CREATE FUNCTION square_root(double precision) RETURNS double precision
    AS 'dsqrt'
    LANGUAGE internal
    STRICT;
</programlisting>
1612     (Most internal functions expect to be declared <quote>strict</quote>.)
1613    </para>
1614
1615    <note>
1616     <para>
1617      Not all <quote>predefined</quote> functions are
1618      <quote>internal</quote> in the above sense.  Some predefined
1619      functions are written in SQL.
1620     </para>
1621    </note>
1622   </sect1>
1623
1624   <sect1 id="xfunc-c">
1625    <title>C-Language Functions</title>
1626
1627    <indexterm zone="xfunc-c">
1628     <primary>function</primary>
1629     <secondary>user-defined</secondary>
1630     <tertiary>in C</tertiary>
1631    </indexterm>
1632
1633    <para>
1634     User-defined functions can be written in C (or a language that can
1635     be made compatible with C, such as C++).  Such functions are
1636     compiled into dynamically loadable objects (also called shared
1637     libraries) and are loaded by the server on demand.  The dynamic
1638     loading feature is what distinguishes <quote>C language</> functions
1639     from <quote>internal</> functions &mdash; the actual coding conventions
1640     are essentially the same for both.  (Hence, the standard internal
1641     function library is a rich source of coding examples for user-defined
1642     C functions.)
1643    </para>
1644
1645    <para>
1646     Currently only one calling convention is used for C functions
1647     (<quote>version 1</quote>). Support for that calling convention is
1648     indicated by writing a <literal>PG_FUNCTION_INFO_V1()</literal> macro
1649     call for the function, as illustrated below.
1650    </para>
1651
1652   <sect2 id="xfunc-c-dynload">
1653    <title>Dynamic Loading</title>
1654
1655    <indexterm zone="xfunc-c-dynload">
1656     <primary>dynamic loading</primary>
1657    </indexterm>
1658
1659    <para>
1660     The first time a user-defined function in a particular
1661     loadable object file is called in a session,
1662     the dynamic loader loads that object file into memory so that the
1663     function can be called.  The <command>CREATE FUNCTION</command>
1664     for a user-defined C function must therefore specify two pieces of
1665     information for the function: the name of the loadable
1666     object file, and the C name (link symbol) of the specific function to call
1667     within that object file.  If the C name is not explicitly specified then
1668     it is assumed to be the same as the SQL function name.
1669    </para>
1670
1671    <para>
1672     The following algorithm is used to locate the shared object file
1673     based on the name given in the <command>CREATE FUNCTION</command>
1674     command:
1675
1676     <orderedlist>
1677      <listitem>
1678       <para>
1679        If the name is an absolute path, the given file is loaded.
1680       </para>
1681      </listitem>
1682
1683      <listitem>
1684       <para>
1685        If the name starts with the string <literal>$libdir</literal>,
1686        that part is replaced by the <productname>PostgreSQL</> package
1687         library directory
1688        name, which is determined at build time.<indexterm><primary>$libdir</></>
1689       </para>
1690      </listitem>
1691
1692      <listitem>
1693       <para>
1694        If the name does not contain a directory part, the file is
1695        searched for in the path specified by the configuration variable
1696        <xref linkend="guc-dynamic-library-path">.<indexterm><primary>dynamic_library_path</></>
1697       </para>
1698      </listitem>
1699
1700      <listitem>
1701       <para>
1702        Otherwise (the file was not found in the path, or it contains a
1703        non-absolute directory part), the dynamic loader will try to
1704        take the name as given, which will most likely fail.  (It is
1705        unreliable to depend on the current working directory.)
1706       </para>
1707      </listitem>
1708     </orderedlist>
1709
1710     If this sequence does not work, the platform-specific shared
1711     library file name extension (often <filename>.so</filename>) is
1712     appended to the given name and this sequence is tried again.  If
1713     that fails as well, the load will fail.
1714    </para>
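
   <para>
    The lookup order above can be summarized in a short C sketch.  This is
    only an illustration, not the server's actual code: the names
    <literal>LookupKind</literal> and
    <function>classify_library_name</function> are hypothetical, only
    Unix-style <literal>/</literal> separators are considered, and the
    sketch reports just the first rule that applies.  A name that is not
    found in the search path still falls through to the last rule, and the
    retry with the file name extension appended is not modeled.
   </para>

<programlisting><![CDATA[
#include <string.h>

/* Which of the four lookup rules applies first (hypothetical names). */
typedef enum
{
    LOOKUP_ABSOLUTE,        /* rule 1: load the given file */
    LOOKUP_LIBDIR,          /* rule 2: expand the $libdir prefix */
    LOOKUP_SEARCH_PATH,     /* rule 3: search dynamic_library_path */
    LOOKUP_AS_GIVEN         /* rule 4: pass the name to the loader as is */
} LookupKind;

LookupKind
classify_library_name(const char *name)
{
    if (name[0] == '/')
        return LOOKUP_ABSOLUTE;
    if (strncmp(name, "$libdir", 7) == 0)
        return LOOKUP_LIBDIR;
    if (strchr(name, '/') == NULL)
        return LOOKUP_SEARCH_PATH;      /* no directory part */
    return LOOKUP_AS_GIVEN;             /* relative directory part */
}
]]>
</programlisting>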
1715
1716    <para>
1717     It is recommended to locate shared libraries either relative to
1718     <literal>$libdir</literal> or through the dynamic library path.
1719     This simplifies version upgrades if the new installation is at a
1720     different location.  The actual directory that
1721     <literal>$libdir</literal> stands for can be found out with the
1722     command <literal>pg_config --pkglibdir</literal>.
1723    </para>
1724
1725    <para>
1726     The user ID the <productname>PostgreSQL</productname> server runs
1727     as must be able to traverse the path to the file you intend to
1728     load.  Making the file or a higher-level directory not readable
1729     and/or not executable by the <systemitem>postgres</systemitem>
1730     user is a common mistake.
1731    </para>
1732
1733    <para>
1734     In any case, the file name that is given in the
1735     <command>CREATE FUNCTION</command> command is recorded literally
1736     in the system catalogs, so if the file needs to be loaded again
1737     the same procedure is applied.
1738    </para>
1739
1740    <note>
1741     <para>
1742      <productname>PostgreSQL</productname> will not compile a C function
1743      automatically.  The object file must be compiled before it is referenced
1744      in a <command>CREATE
1745      FUNCTION</> command.  See <xref linkend="dfunc"> for additional
1746      information.
1747     </para>
1748    </note>
1749
1750    <indexterm zone="xfunc-c-dynload">
1751     <primary>magic block</primary>
1752    </indexterm>
1753
1754    <para>
1755     To ensure that a dynamically loaded object file is not loaded into an
1756     incompatible server, <productname>PostgreSQL</productname> checks that the
1757     file contains a <quote>magic block</> with the appropriate contents.
1758     This allows the server to detect obvious incompatibilities, such as code
1759     compiled for a different major version of
1760     <productname>PostgreSQL</productname>.  A magic block is required as of
1761     <productname>PostgreSQL</productname> 8.2.  To include a magic block,
1762     write this in one (and only one) of the module source files, after having
1763     included the header <filename>fmgr.h</>:
1764
1765 <programlisting>
1766 #ifdef PG_MODULE_MAGIC
1767 PG_MODULE_MAGIC;
1768 #endif
1769 </programlisting>
1770
1771     The <literal>#ifdef</> test can be omitted if the code doesn't
1772     need to compile against pre-8.2 <productname>PostgreSQL</productname>
1773     releases.
1774    </para>
1775
1776    <para>
1777     After it is used for the first time, a dynamically loaded object
1778     file is retained in memory.  Future calls in the same session to
1779     the function(s) in that file will only incur the small overhead of
1780     a symbol table lookup.  If you need to force a reload of an object
1781     file, for example after recompiling it, begin a fresh session.
1782    </para>
1783
1784    <indexterm zone="xfunc-c-dynload">
1785     <primary>_PG_init</primary>
1786    </indexterm>
1787    <indexterm zone="xfunc-c-dynload">
1788     <primary>_PG_fini</primary>
1789    </indexterm>
1790    <indexterm zone="xfunc-c-dynload">
1791     <primary>library initialization function</primary>
1792    </indexterm>
1793    <indexterm zone="xfunc-c-dynload">
1794     <primary>library finalization function</primary>
1795    </indexterm>
1796
1797    <para>
1798     Optionally, a dynamically loaded file can contain initialization and
1799     finalization functions.  If the file includes a function named
1800     <function>_PG_init</>, that function will be called immediately after
1801     loading the file.  The function receives no parameters and should
1802     return void.  If the file includes a function named
1803     <function>_PG_fini</>, that function will be called immediately before
1804     unloading the file.  Likewise, the function receives no parameters and
1805     should return void.  Note that <function>_PG_fini</> will only be called
1806     during an unload of the file, not during process termination.
1807     (Presently, unloads are disabled and will never occur, but this may
1808     change in the future.)
1809    </para>
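
   <para>
    A minimal sketch of a module that defines both functions might look
    like this.  It assumes the file is built against the server headers;
    the message texts are arbitrary examples.
   </para>

<programlisting><![CDATA[
#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

/* Prototypes, so that compilers do not warn about missing declarations. */
void        _PG_init(void);
void        _PG_fini(void);

/* Called immediately after the library has been loaded. */
void
_PG_init(void)
{
    elog(DEBUG1, "example module loaded");
}

/* Called only if the library is unloaded, which presently never happens. */
void
_PG_fini(void)
{
    elog(DEBUG1, "example module unloaded");
}
]]>
</programlisting>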
1810
1811   </sect2>
1812
1813    <sect2 id="xfunc-c-basetype">
1814     <title>Base Types in C-Language Functions</title>
1815
1816     <indexterm zone="xfunc-c-basetype">
1817      <primary>data type</primary>
1818      <secondary>internal organization</secondary>
1819     </indexterm>
1820
1821     <para>
1822      To know how to write C-language functions, you need to know how
1823      <productname>PostgreSQL</productname> internally represents base
1824      data types and how they can be passed to and from functions.
1825      Internally, <productname>PostgreSQL</productname> regards a base
1826      type as a <quote>blob of memory</quote>.  The user-defined
1827      functions that you define over a type in turn define the way that
1828      <productname>PostgreSQL</productname> can operate on it.  That
1829      is, <productname>PostgreSQL</productname> will only store and
1830      retrieve the data from disk and use your user-defined functions
1831      to input, process, and output the data.
1832     </para>
1833
1834     <para>
1835      Base types can have one of three internal formats:
1836
1837      <itemizedlist>
1838       <listitem>
1839        <para>
1840         pass by value, fixed-length
1841        </para>
1842       </listitem>
1843       <listitem>
1844        <para>
1845         pass by reference, fixed-length
1846        </para>
1847       </listitem>
1848       <listitem>
1849        <para>
1850         pass by reference, variable-length
1851        </para>
1852       </listitem>
1853      </itemizedlist>
1854     </para>
1855
1856     <para>
1857      By-value types can only be 1, 2, or 4 bytes in length
1858      (also 8 bytes, if <literal>sizeof(Datum)</literal> is 8 on your machine).
1859      You should be careful to define your types such that they will be the
1860      same size (in bytes) on all architectures.  For example, the
1861      <literal>long</literal> type is dangerous because it is 4 bytes on some
1862      machines and 8 bytes on others, whereas the <type>int</type> type is 4 bytes
1863      on most Unix machines.  A reasonable implementation of the
1864      <type>int4</type> type on Unix machines might be:
1865
1866 <programlisting>
1867 /* 4-byte integer, passed by value */
1868 typedef int int4;
1869 </programlisting>
1870
1871      (The actual PostgreSQL C code calls this type <type>int32</type>, because
1872      it is a convention in C that <type>int<replaceable>XX</replaceable></type>
1873      means <replaceable>XX</replaceable> <emphasis>bits</emphasis>.  Note
1874      therefore also that the C type <type>int8</type> is 1 byte in size.  The
1875      SQL type <type>int8</type> is called <type>int64</type> in C.  See also
1876      <xref linkend="xfunc-c-type-table">.)
1877     </para>
1878
1879     <para>
1880      On the other hand, fixed-length types of any size can
1881      be passed by-reference.  For example, here is a sample
1882      implementation of a <productname>PostgreSQL</productname> type:
1883
1884 <programlisting>
1885 /* 16-byte structure, passed by reference */
1886 typedef struct
1887 {
1888     double  x, y;
1889 } Point;
1890 </programlisting>
1891
1892      Only pointers to such types can be used when passing
1893      them in and out of <productname>PostgreSQL</productname> functions.
1894      To return a value of such a type, allocate the right amount of
1895      memory with <literal>palloc</literal>, fill in the allocated memory,
1896      and return a pointer to it.  (Also, if you just want to return the
1897      same value as one of your input arguments that's of the same data type,
1898      you can skip the extra <literal>palloc</literal> and just return the
1899      pointer to the input value.)
1900     </para>
1901
1902     <para>
1903      Finally, all variable-length types must also be passed
1904      by reference.  All variable-length types must begin
1905      with an opaque length field of exactly 4 bytes, which will be set
1906      by <symbol>SET_VARSIZE</symbol>; never set this field directly!  All data to
1907      be stored within that type must be located in the memory
1908      immediately following that length field.  The
1909      length field contains the total length of the structure,
1910      that is, it includes the size of the length field
1911      itself.
1912     </para>
1913
1914     <para>
1915      Another important point is to avoid leaving any uninitialized bits
1916      within data type values; for example, take care to zero out any
1917      alignment padding bytes that might be present in structs.  Without
1918      this, logically-equivalent constants of your data type might be
1919      seen as unequal by the planner, leading to inefficient (though not
1920      incorrect) plans.
1921     </para>
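
    <para>
     One way to avoid uninitialized padding is to clear the whole structure
     before filling in its fields.  The following is a minimal sketch under
     stated assumptions: <structname>Sample</structname> and
     <function>make_sample</function> are hypothetical names, and
     <function>malloc</function> plus <function>memset</function> stand in
     for <function>palloc0</function>, which backend code would normally
     use.  It relies on the common compiler behavior of leaving padding
     bytes untouched when individual fields are assigned.
    </para>

<programlisting><![CDATA[
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-length type; most ABIs add padding after "flag". */
typedef struct
{
    char    flag;
    double  value;
} Sample;

/* Build a value whose padding bytes start out as zero. */
Sample *
make_sample(char flag, double value)
{
    Sample *s = malloc(sizeof(Sample));     /* palloc0() in a backend */

    if (s == NULL)
        return NULL;
    memset(s, 0, sizeof(Sample));           /* clears the padding too */
    s->flag = flag;
    s->value = value;
    return s;
}
]]>
</programlisting>

    <para>
     Two values built this way from the same inputs are identical byte for
     byte, so a bytewise comparison treats them as equal.
    </para>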
1922
1923     <warning>
1924      <para>
1925       <emphasis>Never</> modify the contents of a pass-by-reference input
1926       value.  If you do so you are likely to corrupt on-disk data, since
1927       the pointer you are given might point directly into a disk buffer.
1928       The sole exception to this rule is explained in
1929       <xref linkend="xaggr">.
1930      </para>
1931     </warning>
1932
1933     <para>
1934      As an example, we can define the type <type>text</type> as
1935      follows:
1936
1937 <programlisting>
1938 typedef struct {
1939     int32 length;
1940     char data[FLEXIBLE_ARRAY_MEMBER];
1941 } text;
1942 </programlisting>
1943
1944      The <literal>[FLEXIBLE_ARRAY_MEMBER]</> notation means that the actual
1945      length of the data part is not specified by this declaration.
1946     </para>
1947
1948     <para>
1949      When manipulating
1950      variable-length types, we must be careful to allocate
1951      the correct amount of memory and set the length field correctly.
1952      For example, if we wanted to store 40 bytes in a <structname>text</>
1953      structure, we might use a code fragment like this:
1954
1955 <programlisting><![CDATA[
1956 #include "postgres.h"
1957 ...
1958 char buffer[40]; /* our source data */
1959 ...
1960 text *destination = (text *) palloc(VARHDRSZ + 40);
1961 SET_VARSIZE(destination, VARHDRSZ + 40);
1962 memcpy(destination->data, buffer, 40);
1963 ...
1964 ]]>
1965 </programlisting>
1966
1967      <literal>VARHDRSZ</> is the same as <literal>sizeof(int32)</>, but
1968      it's considered good style to use the macro <literal>VARHDRSZ</>
1969      to refer to the size of the overhead for a variable-length type.
1970      Also, the length field <emphasis>must</> be set using the
1971      <literal>SET_VARSIZE</> macro, not by simple assignment.
1972     </para>
1973
1974     <para>
1975      <xref linkend="xfunc-c-type-table"> specifies which C type
1976      corresponds to which SQL type when writing a C-language function
1977      that uses a built-in type of <productname>PostgreSQL</>.
1978      The <quote>Defined In</quote> column gives the header file that
1979      needs to be included to get the type definition.  (The actual
1980      definition might be in a different file that is included by the
1981      listed file.  It is recommended that users stick to the defined
1982      interface.)  Note that you should always include
1983      <filename>postgres.h</filename> first in any source file, because
1984      it declares a number of things that you will need anyway.
1985     </para>
1986
1987      <table tocentry="1" id="xfunc-c-type-table">
1988       <title>Equivalent C Types for Built-in SQL Types</title>
1989       <tgroup cols="3">
1990        <thead>
1991         <row>
1992          <entry>
1993           SQL Type
1994          </entry>
1995          <entry>
1996           C Type
1997          </entry>
1998          <entry>
1999           Defined In
2000          </entry>
2001         </row>
2002        </thead>
2003        <tbody>
2004         <row>
2005          <entry><type>abstime</type></entry>
2006          <entry><type>AbsoluteTime</type></entry>
2007          <entry><filename>utils/nabstime.h</filename></entry>
2008         </row>
2009         <row>
2010          <entry><type>bigint</type> (<type>int8</type>)</entry>
2011          <entry><type>int64</type></entry>
2012          <entry><filename>postgres.h</filename></entry>
2013         </row>
2014         <row>
2015          <entry><type>boolean</type></entry>
2016          <entry><type>bool</type></entry>
2017          <entry><filename>postgres.h</filename> (maybe compiler built-in)</entry>
2018         </row>
2019         <row>
2020          <entry><type>box</type></entry>
2021          <entry><type>BOX*</type></entry>
2022          <entry><filename>utils/geo_decls.h</filename></entry>
2023         </row>
2024         <row>
2025          <entry><type>bytea</type></entry>
2026          <entry><type>bytea*</type></entry>
2027          <entry><filename>postgres.h</filename></entry>
2028         </row>
2029         <row>
2030          <entry><type>"char"</type></entry>
2031          <entry><type>char</type></entry>
2032          <entry>(compiler built-in)</entry>
2033         </row>
2034         <row>
2035          <entry><type>character</type></entry>
2036          <entry><type>BpChar*</type></entry>
2037          <entry><filename>postgres.h</filename></entry>
2038         </row>
2039         <row>
2040          <entry><type>cid</type></entry>
2041          <entry><type>CommandId</type></entry>
2042          <entry><filename>postgres.h</filename></entry>
2043         </row>
2044         <row>
2045          <entry><type>date</type></entry>
2046          <entry><type>DateADT</type></entry>
2047          <entry><filename>utils/date.h</filename></entry>
2048         </row>
2049         <row>
2050          <entry><type>smallint</type> (<type>int2</type>)</entry>
2051          <entry><type>int16</type></entry>
2052          <entry><filename>postgres.h</filename></entry>
2053         </row>
2054         <row>
2055          <entry><type>int2vector</type></entry>
2056          <entry><type>int2vector*</type></entry>
2057          <entry><filename>postgres.h</filename></entry>
2058         </row>
2059         <row>
2060          <entry><type>integer</type> (<type>int4</type>)</entry>
2061          <entry><type>int32</type></entry>
2062          <entry><filename>postgres.h</filename></entry>
2063         </row>
2064         <row>
2065          <entry><type>real</type> (<type>float4</type>)</entry>
2066          <entry><type>float4*</type></entry>
2067         <entry><filename>postgres.h</filename></entry>
2068         </row>
2069         <row>
2070          <entry><type>double precision</type> (<type>float8</type>)</entry>
2071          <entry><type>float8*</type></entry>
2072          <entry><filename>postgres.h</filename></entry>
2073         </row>
2074         <row>
2075          <entry><type>interval</type></entry>
2076          <entry><type>Interval*</type></entry>
2077          <entry><filename>datatype/timestamp.h</filename></entry>
2078         </row>
2079         <row>
2080          <entry><type>lseg</type></entry>
2081          <entry><type>LSEG*</type></entry>
2082          <entry><filename>utils/geo_decls.h</filename></entry>
2083         </row>
2084         <row>
2085          <entry><type>name</type></entry>
2086          <entry><type>Name</type></entry>
2087          <entry><filename>postgres.h</filename></entry>
2088         </row>
2089         <row>
2090          <entry><type>oid</type></entry>
2091          <entry><type>Oid</type></entry>
2092          <entry><filename>postgres.h</filename></entry>
2093         </row>
2094         <row>
2095          <entry><type>oidvector</type></entry>
2096          <entry><type>oidvector*</type></entry>
2097          <entry><filename>postgres.h</filename></entry>
2098         </row>
2099         <row>
2100          <entry><type>path</type></entry>
2101          <entry><type>PATH*</type></entry>
2102          <entry><filename>utils/geo_decls.h</filename></entry>
2103         </row>
2104         <row>
2105          <entry><type>point</type></entry>
2106          <entry><type>Point*</type></entry>
2107          <entry><filename>utils/geo_decls.h</filename></entry>
2108         </row>
2109         <row>
2110          <entry><type>regproc</type></entry>
2111          <entry><type>regproc</type></entry>
2112          <entry><filename>postgres.h</filename></entry>
2113         </row>
2114         <row>
2115          <entry><type>reltime</type></entry>
2116          <entry><type>RelativeTime</type></entry>
2117          <entry><filename>utils/nabstime.h</filename></entry>
2118         </row>
2119         <row>
2120          <entry><type>text</type></entry>
2121          <entry><type>text*</type></entry>
2122          <entry><filename>postgres.h</filename></entry>
2123         </row>
2124         <row>
2125          <entry><type>tid</type></entry>
2126          <entry><type>ItemPointer</type></entry>
2127          <entry><filename>storage/itemptr.h</filename></entry>
2128         </row>
2129         <row>
2130          <entry><type>time</type></entry>
2131          <entry><type>TimeADT</type></entry>
2132          <entry><filename>utils/date.h</filename></entry>
2133         </row>
2134         <row>
2135          <entry><type>time with time zone</type></entry>
2136          <entry><type>TimeTzADT</type></entry>
2137          <entry><filename>utils/date.h</filename></entry>
2138         </row>
2139         <row>
2140          <entry><type>timestamp</type></entry>
2141          <entry><type>Timestamp*</type></entry>
2142          <entry><filename>datatype/timestamp.h</filename></entry>
2143         </row>
2144         <row>
2145          <entry><type>tinterval</type></entry>
2146          <entry><type>TimeInterval</type></entry>
2147          <entry><filename>utils/nabstime.h</filename></entry>
2148         </row>
2149         <row>
2150          <entry><type>varchar</type></entry>
2151          <entry><type>VarChar*</type></entry>
2152          <entry><filename>postgres.h</filename></entry>
2153         </row>
2154         <row>
2155          <entry><type>xid</type></entry>
2156          <entry><type>TransactionId</type></entry>
2157          <entry><filename>postgres.h</filename></entry>
2158         </row>
2159        </tbody>
2160       </tgroup>
2161      </table>
2162
2163     <para>
2164      Now that we've gone over all of the possible structures
2165      for base types, we can show some examples of real functions.
2166     </para>
2167    </sect2>
2168
2169    <sect2>
2170     <title>Version 1 Calling Conventions</title>
2171
2172     <para>
2173      The version-1 calling convention relies on macros to hide most
2174      of the complexity of passing arguments and results.  The C declaration
2175      of a version-1 function is always:
2176 <programlisting>
2177 Datum funcname(PG_FUNCTION_ARGS)
2178 </programlisting>
2179      In addition, the macro call:
2180 <programlisting>
2181 PG_FUNCTION_INFO_V1(funcname);
2182 </programlisting>
2183      must appear in the same source file.  (Conventionally, it's
2184      written just before the function itself.)  This macro call is not
2185      needed for <literal>internal</>-language functions, since
2186      <productname>PostgreSQL</> assumes that all internal functions
2187      use the version-1 convention.  It is, however, required for
2188      dynamically-loaded functions.
2189     </para>
2190
2191     <para>
2192      In a version-1 function, each actual argument is fetched using a
2193      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2194      macro that corresponds to the argument's data type.  In non-strict
2195      functions, each argument must first be checked for nullness
2196      using <function>PG_ARGISNULL()</function>.
2197      The result is returned using a
2198      <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2199      macro for the return type.
2200      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2201      takes as its argument the number of the function argument to
2202      fetch, where the count starts at 0.
2203      <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2204      takes as its argument the actual value to return.
2205     </para>
2206
2207     <para>
2208      Here are some examples using the version-1 calling convention:
2209     </para>
2210
2211 <programlisting><![CDATA[
2212 #include "postgres.h"
2213 #include <string.h>
2214 #include "fmgr.h"
2215 #include "utils/geo_decls.h"
2216
2217 #ifdef PG_MODULE_MAGIC
2218 PG_MODULE_MAGIC;
2219 #endif
2220
2221 /* by value */
2222
2223 PG_FUNCTION_INFO_V1(add_one);
2224
2225 Datum
2226 add_one(PG_FUNCTION_ARGS)
2227 {
2228     int32   arg = PG_GETARG_INT32(0);
2229
2230     PG_RETURN_INT32(arg + 1);
2231 }
2232
2233 /* by reference, fixed length */
2234
2235 PG_FUNCTION_INFO_V1(add_one_float8);
2236
2237 Datum
2238 add_one_float8(PG_FUNCTION_ARGS)
2239 {
2240     /* The macros for FLOAT8 hide its pass-by-reference nature. */
2241     float8   arg = PG_GETARG_FLOAT8(0);
2242
2243     PG_RETURN_FLOAT8(arg + 1.0);
2244 }
2245
2246 PG_FUNCTION_INFO_V1(makepoint);
2247
2248 Datum
2249 makepoint(PG_FUNCTION_ARGS)
2250 {
2251     /* Here, the pass-by-reference nature of Point is not hidden. */
2252     Point     *pointx = PG_GETARG_POINT_P(0);
2253     Point     *pointy = PG_GETARG_POINT_P(1);
2254     Point     *new_point = (Point *) palloc(sizeof(Point));
2255
2256     new_point->x = pointx->x;
2257     new_point->y = pointy->y;
2258
2259     PG_RETURN_POINT_P(new_point);
2260 }
2261
2262 /* by reference, variable length */
2263
2264 PG_FUNCTION_INFO_V1(copytext);
2265
2266 Datum
2267 copytext(PG_FUNCTION_ARGS)
2268 {
2269     text     *t = PG_GETARG_TEXT_PP(0);
2270
2271     /*
2272      * VARSIZE_ANY_EXHDR is the size of the struct in bytes, minus the
2273      * VARHDRSZ or VARHDRSZ_SHORT of its header.  Construct the copy with a
2274      * full-length header.
2275      */
2276     text     *new_t = (text *) palloc(VARSIZE_ANY_EXHDR(t) + VARHDRSZ);
2277     SET_VARSIZE(new_t, VARSIZE_ANY_EXHDR(t) + VARHDRSZ);
2278
2279     /*
2280      * VARDATA is a pointer to the data region of the new struct.  The source
2281      * could be a short datum, so retrieve its data through VARDATA_ANY.
2282      */
2283     memcpy((void *) VARDATA(new_t), /* destination */
2284            (void *) VARDATA_ANY(t), /* source */
2285            VARSIZE_ANY_EXHDR(t));   /* how many bytes */
2286     PG_RETURN_TEXT_P(new_t);
2287 }
2288
2289 PG_FUNCTION_INFO_V1(concat_text);
2290
2291 Datum
2292 concat_text(PG_FUNCTION_ARGS)
2293 {
2294     text  *arg1 = PG_GETARG_TEXT_PP(0);
2295     text  *arg2 = PG_GETARG_TEXT_PP(1);
2296     int32 arg1_size = VARSIZE_ANY_EXHDR(arg1);
2297     int32 arg2_size = VARSIZE_ANY_EXHDR(arg2);
2298     int32 new_text_size = arg1_size + arg2_size + VARHDRSZ;
2299     text *new_text = (text *) palloc(new_text_size);
2300
2301     SET_VARSIZE(new_text, new_text_size);
2302     memcpy(VARDATA(new_text), VARDATA_ANY(arg1), arg1_size);
2303     memcpy(VARDATA(new_text) + arg1_size, VARDATA_ANY(arg2), arg2_size);
2304     PG_RETURN_TEXT_P(new_text);
2305 }
2306 ]]>
2307 </programlisting>
2308
2309     <para>
2310      Supposing that the above code has been prepared in file
2311      <filename>funcs.c</filename> and compiled into a shared object,
2312      we could define the functions to <productname>PostgreSQL</productname>
2313      with commands like this:
2314     </para>
2315
2316 <programlisting>
2317 CREATE FUNCTION add_one(integer) RETURNS integer
2318      AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one'
2319      LANGUAGE C STRICT;
2320
2321 -- note overloading of SQL function name "add_one"
2322 CREATE FUNCTION add_one(double precision) RETURNS double precision
2323      AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one_float8'
2324      LANGUAGE C STRICT;
2325
2326 CREATE FUNCTION makepoint(point, point) RETURNS point
2327      AS '<replaceable>DIRECTORY</replaceable>/funcs', 'makepoint'
2328      LANGUAGE C STRICT;
2329
2330 CREATE FUNCTION copytext(text) RETURNS text
2331      AS '<replaceable>DIRECTORY</replaceable>/funcs', 'copytext'
2332      LANGUAGE C STRICT;
2333
2334 CREATE FUNCTION concat_text(text, text) RETURNS text
2335      AS '<replaceable>DIRECTORY</replaceable>/funcs', 'concat_text'
2336      LANGUAGE C STRICT;
2337 </programlisting>
2338
2339     <para>
2340      Here, <replaceable>DIRECTORY</replaceable> stands for the
2341      directory of the shared library file (for instance the
2342      <productname>PostgreSQL</productname> tutorial directory, which
2343      contains the code for the examples used in this section).
2344      (Better style would be to use just <literal>'funcs'</> in the
2345      <literal>AS</> clause, after having added
2346      <replaceable>DIRECTORY</replaceable> to the search path.  In any
2347      case, we can omit the system-specific extension for a shared
2348      library, commonly <literal>.so</literal>.)
2349     </para>
2350
2351     <para>
2352      Notice that we have specified the functions as <quote>strict</quote>,
2353      meaning that
2354      the system should automatically assume a null result if any input
2355      value is null.  By doing this, we avoid having to check for null inputs
2356      in the function code.  Without this, we'd have to check for null values
2357      explicitly, using <function>PG_ARGISNULL()</function>.
2358     </para>
2359
2360     <para>
2361      At first glance, the version-1 coding conventions might appear to be just
2362      pointless obscurantism, compared to using plain <literal>C</> calling
2363      conventions.  They do, however, make it possible to handle <literal>NULL</>able
2364      arguments and return values, as well as <quote>toasted</quote> (compressed or
2365      out-of-line) values.
2366     </para>
2367
2368     <para>
2369      The macro <function>PG_ARGISNULL(<replaceable>n</>)</function>
2370      allows a function to test whether each input is null.  (Of course, doing
2371      this is only necessary in functions not declared <quote>strict</>.)
2372      As with the
2373      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> macros,
2374      the input arguments are counted beginning at zero.  Note that one
2375      should refrain from executing
2376      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> until
2377      one has verified that the argument isn't null.
2378      To return a null result, execute <function>PG_RETURN_NULL()</function>;
2379      this works in both strict and nonstrict functions.
2380     </para>
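
    <para>
     As a sketch of these rules, the following hypothetical function
     <function>add_one_null_as_zero</function> treats a null input as zero.
     It is assumed to be created without the <literal>STRICT</literal>
     option, since otherwise the function would never be called with a
     null argument, and to be added to a source file that already contains
     the magic block.
    </para>

<programlisting><![CDATA[
#include "postgres.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(add_one_null_as_zero);

Datum
add_one_null_as_zero(PG_FUNCTION_ARGS)
{
    int32   arg = 0;

    /* Test for null before fetching the argument. */
    if (!PG_ARGISNULL(0))
        arg = PG_GETARG_INT32(0);

    PG_RETURN_INT32(arg + 1);
}
]]>
</programlisting>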
2381
2382     <para>
2383      Other options provided by the version-1 interface are two
2384      variants of the
2385      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2386      macros. The first of these,
2387      <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>,
2388      guarantees to return a copy of the specified argument that is
2389      safe for writing into. (The normal macros will sometimes return a
2390      pointer to a value that is physically stored in a table, which
2391      must not be written to. Using the
2392      <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>
2393      macros guarantees a writable result.)
     The second variant consists of the
     <function>PG_GETARG_<replaceable>xxx</replaceable>_SLICE()</function>
     macros, which take three arguments. The first is the number of the
     function argument (as above). The second and third are the offset and
     length of the segment to be returned. Offsets are counted from
     zero, and a negative length requests that the remainder of the
     value be returned. These macros provide more efficient access to
     parts of large values in the case where they have storage type
     <quote>external</quote>. (The storage type of a column can be specified using
     <literal>ALTER TABLE <replaceable>tablename</replaceable> ALTER
     COLUMN <replaceable>colname</replaceable> SET STORAGE
     <replaceable>storagetype</replaceable></literal>. <replaceable>storagetype</replaceable> is one of
     <literal>plain</>, <literal>external</>, <literal>extended</literal>,
     or <literal>main</>.)
2408     </para>
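
    <para>
     The offset and length convention of the
     <function>PG_GETARG_<replaceable>xxx</replaceable>_SLICE()</function>
     macros can be modeled in plain C outside the server.  The following
     is only a sketch: <function>slice_bytes</function> is a hypothetical
     helper, not a <productname>PostgreSQL</productname> function, and the
     way it clamps out-of-range requests is an assumption of this model
     rather than a statement about the macros themselves.
    </para>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, NOT part of PostgreSQL.
 *
 * Copies the slice of src[0 .. srclen) that starts at 'offset' and spans
 * 'length' bytes into dst, and returns the number of bytes copied.
 * Offsets count from zero; a negative length means "everything from
 * offset to the end".  dst must have room for the requested slice.
 */
static size_t
slice_bytes(const char *src, size_t srclen, int offset, int length, char *dst)
{
    size_t      start;
    size_t      count;

    assert(offset >= 0);
    start = (size_t) offset;

    /* Assumption of this model: a slice starting past the end is empty. */
    if (start >= srclen)
        return 0;

    /* Bytes available from 'offset' onward. */
    count = srclen - start;

    /* A non-negative length caps the slice; a negative one takes the rest. */
    if (length >= 0 && (size_t) length < count)
        count = (size_t) length;

    memcpy(dst, src + start, count);
    return count;
}
```

    <para>
     With the eight-byte input <literal>abcdefgh</literal>, an offset of 2
     and a length of 3 select <literal>cde</literal>, while an offset of 5
     and a length of -1 select the remainder, <literal>fgh</literal>.
    </para>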
2409
2410     <para>
2411      Finally, the version-1 function call conventions make it possible
2412      to return set results (<xref linkend="xfunc-c-return-set">) and
2413      implement trigger functions (<xref linkend="triggers">) and
2414      procedural-language call handlers (<xref
2415      linkend="plhandler">).  For more details
2416      see <filename>src/backend/utils/fmgr/README</filename> in the
2417      source distribution.
2418     </para>
2419    </sect2>
2420
2421    <sect2>
2422     <title>Writing Code</title>
2423
2424     <para>
2425      Before we turn to the more advanced topics, we should discuss
2426      some coding rules for <productname>PostgreSQL</productname>
2427      C-language functions.  While it might be possible to load functions
2428      written in languages other than C into
2429      <productname>PostgreSQL</productname>, this is usually difficult
     (when it is possible at all) because other languages, such as
     C++, FORTRAN, or Pascal, often do not follow the same calling
     convention as C.  That is, other languages do not pass argument
     and return values between functions in the same way.  For this
2434      reason, we will assume that your C-language functions are
2435      actually written in C.
2436     </para>
2437
2438     <para>
2439      The basic rules for writing and building C functions are as follows:
2440
2441      <itemizedlist>
2442       <listitem>
2443        <para>
2444         Use <literal>pg_config
2445         --includedir-server</literal><indexterm><primary>pg_config</><secondary>with user-defined C functions</></>
2446         to find out where the <productname>PostgreSQL</> server header
2447         files are installed on your system (or the system that your
2448         users will be running on).
2449        </para>
2450       </listitem>
2451
2452       <listitem>
2453        <para>
2454         Compiling and linking your code so that it can be dynamically
2455         loaded into <productname>PostgreSQL</productname> always
2456         requires special flags.  See <xref linkend="dfunc"> for a
2457         detailed explanation of how to do it for your particular
2458         operating system.
2459        </para>
2460       </listitem>
2461
2462       <listitem>
2463        <para>
2464         Remember to define a <quote>magic block</> for your shared library,
2465         as described in <xref linkend="xfunc-c-dynload">.
2466        </para>
2467       </listitem>
2468
2469       <listitem>
2470        <para>
2471         When allocating memory, use the
2472         <productname>PostgreSQL</productname> functions
2473         <function>palloc</function><indexterm><primary>palloc</></> and <function>pfree</function><indexterm><primary>pfree</></>
2474         instead of the corresponding C library functions
2475         <function>malloc</function> and <function>free</function>.
2476         The memory allocated by <function>palloc</function> will be
2477         freed automatically at the end of each transaction, preventing
2478         memory leaks.
2479        </para>
2480       </listitem>
2481
2482       <listitem>
2483        <para>
        Always zero the bytes of your structures using <function>memset</>
        (or allocate them with <function>palloc0</> in the first place).
        Even if you assign to each field of your structure, there might be
        alignment padding (holes in the structure) that contains
        garbage values.  Without this zeroing, it's difficult to
        support hash indexes or hash joins, as you must pick out only
        the significant bits of your data structure to compute a hash.
        The planner also sometimes relies on comparing constants via
        bitwise equality, so you can get undesirable planning results if
        logically equivalent values aren't bitwise equal.
2494        </para>
2495       </listitem>
2496
2497       <listitem>
2498        <para>
2499         Most of the internal <productname>PostgreSQL</productname>
2500         types are declared in <filename>postgres.h</filename>, while
2501         the function manager interfaces
2502         (<symbol>PG_FUNCTION_ARGS</symbol>, etc.)  are in
2503         <filename>fmgr.h</filename>, so you will need to include at
2504         least these two files.  For portability reasons it's best to
2505         include <filename>postgres.h</filename> <emphasis>first</>,
2506         before any other system or user header files.  Including
2507         <filename>postgres.h</filename> will also include
2508         <filename>elog.h</filename> and <filename>palloc.h</filename>
2509         for you.
2510        </para>
2511       </listitem>
2512
2513       <listitem>
2514        <para>
2515         Symbol names defined within object files must not conflict
2516         with each other or with symbols defined in the
2517         <productname>PostgreSQL</productname> server executable.  You
2518         will have to rename your functions or variables if you get
2519         error messages to this effect.
2520        </para>
2521       </listitem>
2522      </itemizedlist>
2523     </para>
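
    <para>
     The rule about zeroing structures can be illustrated with a small
     stand-alone sketch.  The <structname>Sample</structname> type and both
     helper functions are hypothetical and exist only for this
     illustration.  Inside the server you would use
     <function>palloc0</function> or <function>memset</function> on your
     own structures in the same way.  Like the rule itself, the sketch
     assumes that later field assignments leave the zeroed padding bytes
     untouched.
    </para>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical structure: most ABIs insert padding bytes after 'flag'. */
typedef struct Sample
{
    char        flag;
    int32_t     value;
} Sample;

/* Zero every byte, padding included, before assigning the fields. */
static void
sample_init(Sample *s, char flag, int32_t value)
{
    memset(s, 0, sizeof(Sample));
    s->flag = flag;
    s->value = value;
}

/*
 * Bitwise comparison of two structures.  This is only meaningful because
 * sample_init() zeroed the padding; without that, two logically equal
 * values could differ in their padding bytes.
 */
static int
sample_bitwise_equal(const Sample *a, const Sample *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, sizeof(Sample)) == 0;
}
```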
2524    </sect2>
2525
2526 &dfunc;
2527
2528    <sect2>
2529     <title>Composite-type Arguments</title>
2530
2531     <para>
2532      Composite types do not have a fixed layout like C structures.
2533      Instances of a composite type can contain null fields.  In
2534      addition, composite types that are part of an inheritance
2535      hierarchy can have different fields than other members of the
2536      same inheritance hierarchy.  Therefore,
2537      <productname>PostgreSQL</productname> provides a function
2538      interface for accessing fields of composite types from C.
2539     </para>
2540
2541     <para>
2542      Suppose we want to write a function to answer the query:
2543
<programlisting>
SELECT name, c_overpaid(emp, 1500) AS overpaid
    FROM emp
    WHERE name = 'Bill' OR name = 'Sam';
</programlisting>
2549
2550      Using the version-1 calling conventions, we can define
2551      <function>c_overpaid</> as:
2552
<programlisting><![CDATA[
#include "postgres.h"
#include "fmgr.h"               /* for PG_FUNCTION_ARGS and friends */
#include "executor/executor.h"  /* for GetAttributeByName() */

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

PG_FUNCTION_INFO_V1(c_overpaid);

Datum
c_overpaid(PG_FUNCTION_ARGS)
{
    HeapTupleHeader  t = PG_GETARG_HEAPTUPLEHEADER(0);
    int32            limit = PG_GETARG_INT32(1);
    bool             isnull;
    Datum            salary;

    /* Fetch the "salary" column; isnull reports whether it is null. */
    salary = GetAttributeByName(t, "salary", &isnull);
    if (isnull)
        PG_RETURN_BOOL(false);
    /* Alternatively, we might prefer to do PG_RETURN_NULL() for null salary. */

    PG_RETURN_BOOL(DatumGetInt32(salary) > limit);
}
]]>
</programlisting>
2580     </para>
2581
2582     <para>
     <function>GetAttributeByName</function> is the
     <productname>PostgreSQL</productname> system function that
     returns attributes out of the specified row.  It has
     three arguments: the argument of type <type>HeapTupleHeader</type> passed
     into the function, the name of the desired attribute, and a
     return parameter that tells whether the attribute
     is null.  <function>GetAttributeByName</function> returns a <type>Datum</type>
     value that you can convert to the proper data type by using the
     appropriate <function>DatumGet<replaceable>XXX</replaceable>()</function>
     macro.  Note that the return value is meaningless if the null flag is
     set; always check the null flag before trying to do anything with the
     result.
2596     </para>
2597
2598     <para>
2599      There is also <function>GetAttributeByNum</function>, which selects
2600      the target attribute by column number instead of name.
2601     </para>
2602
2603     <para>
2604      The following command declares the function
2605      <function>c_overpaid</function> in SQL:
2606
<programlisting>
CREATE FUNCTION c_overpaid(emp, integer) RETURNS boolean
    AS '<replaceable>DIRECTORY</replaceable>/funcs', 'c_overpaid'
    LANGUAGE C STRICT;
</programlisting>
2612
2613      Notice we have used <literal>STRICT</> so that we did not have to
2614      check whether the input arguments were NULL.
2615     </para>
2616    </sect2>
2617
2618    <sect2>
2619     <title>Returning Rows (Composite Types)</title>
2620
2621     <para>
2622      To return a row or composite-type value from a C-language
2623      function, you can use a special API that provides macros and
2624      functions to hide most of the complexity of building composite
2625      data types.  To use this API, the source file must include:
2626 <programlisting>
2627 #include "funcapi.h"
2628 </programlisting>
2629     </para>
2630
2631     <para>
2632      There are two ways you can build a composite data value (henceforth
2633      a <quote>tuple</>): you can build it from an array of Datum values,
2634      or from an array of C strings that can be passed to the input
2635      conversion functions of the tuple's column data types.  In either
2636      case, you first need to obtain or construct a <structname>TupleDesc</>
2637      descriptor for the tuple structure.  When working with Datums, you
2638      pass the <structname>TupleDesc</> to <function>BlessTupleDesc</>,
2639      and then call <function>heap_form_tuple</> for each row.  When working
2640      with C strings, you pass the <structname>TupleDesc</> to
2641      <function>TupleDescGetAttInMetadata</>, and then call
2642      <function>BuildTupleFromCStrings</> for each row.  In the case of a
2643      function returning a set of tuples, the setup steps can all be done
2644      once during the first call of the function.
2645     </para>
2646
2647     <para>
2648      Several helper functions are available for setting up the needed
2649      <structname>TupleDesc</>.  The recommended way to do this in most
2650      functions returning composite values is to call:
2651 <programlisting>
2652 TypeFuncClass get_call_result_type(FunctionCallInfo fcinfo,
2653                                    Oid *resultTypeId,
2654                                    TupleDesc *resultTupleDesc)
2655 </programlisting>
2656      passing the same <literal>fcinfo</> struct passed to the calling function
2657      itself.  (This of course requires that you use the version-1
2658      calling conventions.)  <varname>resultTypeId</> can be specified
2659      as <literal>NULL</> or as the address of a local variable to receive the
2660      function's result type OID.  <varname>resultTupleDesc</> should be the
2661      address of a local <structname>TupleDesc</> variable.  Check that the
2662      result is <literal>TYPEFUNC_COMPOSITE</>; if so,
2663      <varname>resultTupleDesc</> has been filled with the needed
2664      <structname>TupleDesc</>.  (If it is not, you can report an error along
2665      the lines of <quote>function returning record called in context that
2666      cannot accept type record</quote>.)
2667     </para>
2668
2669     <tip>
2670      <para>
2671       <function>get_call_result_type</> can resolve the actual type of a
2672       polymorphic function result; so it is useful in functions that return
2673       scalar polymorphic results, not only functions that return composites.
2674       The <varname>resultTypeId</> output is primarily useful for functions
2675       returning polymorphic scalars.
2676      </para>
2677     </tip>
2678
2679     <note>
2680      <para>
2681       <function>get_call_result_type</> has a sibling
2682       <function>get_expr_result_type</>, which can be used to resolve the
2683       expected output type for a function call represented by an expression
2684       tree.  This can be used when trying to determine the result type from
2685       outside the function itself.  There is also
2686       <function>get_func_result_type</>, which can be used when only the
      function's OID is available.  However, these functions are not able
2688       to deal with functions declared to return <structname>record</>, and
2689       <function>get_func_result_type</> cannot resolve polymorphic types,
2690       so you should preferentially use <function>get_call_result_type</>.
2691      </para>
2692     </note>
2693
2694     <para>
2695      Older, now-deprecated functions for obtaining
2696      <structname>TupleDesc</>s are:
2697 <programlisting>
2698 TupleDesc RelationNameGetTupleDesc(const char *relname)
2699 </programlisting>
2700      to get a <structname>TupleDesc</> for the row type of a named relation,
2701      and:
2702 <programlisting>
2703 TupleDesc TypeGetTupleDesc(Oid typeoid, List *colaliases)
2704 </programlisting>
2705      to get a <structname>TupleDesc</> based on a type OID. This can
2706      be used to get a <structname>TupleDesc</> for a base or
2707      composite type.  It will not work for a function that returns
2708      <structname>record</>, however, and it cannot resolve polymorphic
2709      types.
2710     </para>
2711
2712     <para>
2713      Once you have a <structname>TupleDesc</>, call:
2714 <programlisting>
2715 TupleDesc BlessTupleDesc(TupleDesc tupdesc)
2716 </programlisting>
2717      if you plan to work with Datums, or:
2718 <programlisting>
2719 AttInMetadata *TupleDescGetAttInMetadata(TupleDesc tupdesc)
2720 </programlisting>
2721      if you plan to work with C strings.  If you are writing a function
2722      returning set, you can save the results of these functions in the
2723      <structname>FuncCallContext</> structure &mdash; use the
2724      <structfield>tuple_desc</> or <structfield>attinmeta</> field
2725      respectively.
2726     </para>
2727
2728     <para>
2729      When working with Datums, use:
2730 <programlisting>
2731 HeapTuple heap_form_tuple(TupleDesc tupdesc, Datum *values, bool *isnull)
2732 </programlisting>
2733      to build a <structname>HeapTuple</> given user data in Datum form.
2734     </para>
2735
2736     <para>
2737      When working with C strings, use:
2738 <programlisting>
2739 HeapTuple BuildTupleFromCStrings(AttInMetadata *attinmeta, char **values)
2740 </programlisting>
2741      to build a <structname>HeapTuple</> given user data
2742      in C string form.  <parameter>values</parameter> is an array of C strings,
2743      one for each attribute of the return row. Each C string should be in
2744      the form expected by the input function of the attribute data
2745      type. In order to return a null value for one of the attributes,
2746      the corresponding pointer in the <parameter>values</> array
2747      should be set to <symbol>NULL</>.  This function will need to
2748      be called again for each row you return.
2749     </para>
2750
2751     <para>
2752      Once you have built a tuple to return from your function, it
2753      must be converted into a <type>Datum</>. Use:
2754 <programlisting>
2755 HeapTupleGetDatum(HeapTuple tuple)
2756 </programlisting>
2757      to convert a <structname>HeapTuple</> into a valid Datum.  This
2758      <type>Datum</> can be returned directly if you intend to return
2759      just a single row, or it can be used as the current return value
2760      in a set-returning function.
2761     </para>
2762
2763     <para>
2764      An example appears in the next section.
2765     </para>
2766
2767    </sect2>
2768
2769    <sect2 id="xfunc-c-return-set">
2770     <title>Returning Sets</title>
2771
2772     <para>
2773      There is also a special API that provides support for returning
2774      sets (multiple rows) from a C-language function.  A set-returning
2775      function must follow the version-1 calling conventions.  Also,
2776      source files must include <filename>funcapi.h</filename>, as
2777      above.
2778     </para>
2779
2780     <para>
2781      A set-returning function (<acronym>SRF</>) is called
2782      once for each item it returns.  The <acronym>SRF</> must
2783      therefore save enough state to remember what it was doing and
2784      return the next item on each call.
2785      The structure <structname>FuncCallContext</> is provided to help
2786      control this process.  Within a function, <literal>fcinfo-&gt;flinfo-&gt;fn_extra</>
2787      is used to hold a pointer to <structname>FuncCallContext</>
2788      across calls.
2789 <programlisting>
2790 typedef struct FuncCallContext
2791 {
2792     /*
2793      * Number of times we've been called before
2794      *
2795      * call_cntr is initialized to 0 for you by SRF_FIRSTCALL_INIT(), and
2796      * incremented for you every time SRF_RETURN_NEXT() is called.
2797      */
2798     uint64 call_cntr;
2799
2800     /*
2801      * OPTIONAL maximum number of calls
2802      *
2803      * max_calls is here for convenience only and setting it is optional.
2804      * If not set, you must provide alternative means to know when the
2805      * function is done.
2806      */
2807     uint64 max_calls;
2808
2809     /*
2810      * OPTIONAL pointer to result slot
2811      *
2812      * This is obsolete and only present for backward compatibility, viz,
2813      * user-defined SRFs that use the deprecated TupleDescGetSlot().
2814      */
2815     TupleTableSlot *slot;
2816
2817     /*
2818      * OPTIONAL pointer to miscellaneous user-provided context information
2819      *
2820      * user_fctx is for use as a pointer to your own data to retain
2821      * arbitrary context information between calls of your function.
2822      */
2823     void *user_fctx;
2824
2825     /*
2826      * OPTIONAL pointer to struct containing attribute type input metadata
2827      *
2828      * attinmeta is for use when returning tuples (i.e., composite data types)
2829      * and is not used when returning base data types. It is only needed
2830      * if you intend to use BuildTupleFromCStrings() to create the return
2831      * tuple.
2832      */
2833     AttInMetadata *attinmeta;
2834
2835     /*
2836      * memory context used for structures that must live for multiple calls
2837      *
2838      * multi_call_memory_ctx is set by SRF_FIRSTCALL_INIT() for you, and used
2839      * by SRF_RETURN_DONE() for cleanup. It is the most appropriate memory
2840      * context for any memory that is to be reused across multiple calls
2841      * of the SRF.
2842      */
2843     MemoryContext multi_call_memory_ctx;
2844
2845     /*
2846      * OPTIONAL pointer to struct containing tuple description
2847      *
2848      * tuple_desc is for use when returning tuples (i.e., composite data types)
2849      * and is only needed if you are going to build the tuples with
2850      * heap_form_tuple() rather than with BuildTupleFromCStrings().  Note that
2851      * the TupleDesc pointer stored here should usually have been run through
2852      * BlessTupleDesc() first.
2853      */
2854     TupleDesc tuple_desc;
2855
2856 } FuncCallContext;
2857 </programlisting>
2858     </para>
2859
2860     <para>
2861      An <acronym>SRF</> uses several functions and macros that
2862      automatically manipulate the <structname>FuncCallContext</>
2863      structure (and expect to find it via <literal>fn_extra</>).  Use:
2864 <programlisting>
2865 SRF_IS_FIRSTCALL()
2866 </programlisting>
2867      to determine if your function is being called for the first or a
2868      subsequent time. On the first call (only) use:
2869 <programlisting>
2870 SRF_FIRSTCALL_INIT()
2871 </programlisting>
2872      to initialize the <structname>FuncCallContext</>. On every function call,
2873      including the first, use:
2874 <programlisting>
2875 SRF_PERCALL_SETUP()
2876 </programlisting>
     to properly set up for using the <structname>FuncCallContext</>
     and to clear any previously returned data left over from the
     previous pass.
2880     </para>
2881
2882     <para>
2883      If your function has data to return, use:
2884 <programlisting>
2885 SRF_RETURN_NEXT(funcctx, result)
2886 </programlisting>
2887      to return it to the caller.  (<literal>result</> must be of type
2888      <type>Datum</>, either a single value or a tuple prepared as
2889      described above.)  Finally, when your function is finished
2890      returning data, use:
2891 <programlisting>
2892 SRF_RETURN_DONE(funcctx)
2893 </programlisting>
2894      to clean up and end the <acronym>SRF</>.
2895     </para>
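
    <para>
     The call protocol described above (first-call initialization,
     per-call setup, then either returning the next item or finishing)
     can be modeled in plain C outside the server.  This is only a sketch
     of the control flow: <structname>ModelCallContext</structname> and
     <function>model_srf</function> are hypothetical names, the
     <literal>state</literal> pointer merely stands in for
     <literal>fn_extra</literal>, and <function>malloc</function> is used
     only because the model runs outside
     <productname>PostgreSQL</productname>.  A real
     <acronym>SRF</acronym> must use the macros shown above.
    </para>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for FuncCallContext; not the PostgreSQL struct. */
typedef struct ModelCallContext
{
    uint64_t    call_cntr;      /* number of items returned so far */
    uint64_t    max_calls;      /* total number of items to return */
} ModelCallContext;

/*
 * Model of a set-returning function that yields 'count' multiples of 'step'.
 *
 * '*state' plays the role of fn_extra: it is NULL before the first call and
 * afterwards points to the saved context.  Returns true and sets *result
 * when another item is available; returns false, after releasing the
 * saved context, when the set is exhausted.
 */
static bool
model_srf(void **state, uint64_t count, int step, int *result)
{
    ModelCallContext *ctx;

    if (*state == NULL)                     /* role of SRF_IS_FIRSTCALL() */
    {
        ctx = malloc(sizeof(ModelCallContext)); /* role of SRF_FIRSTCALL_INIT() */
        assert(ctx != NULL);
        ctx->call_cntr = 0;
        ctx->max_calls = count;
        *state = ctx;
    }

    ctx = *state;                           /* role of SRF_PERCALL_SETUP() */

    if (ctx->call_cntr < ctx->max_calls)
    {
        *result = (int) (ctx->call_cntr + 1) * step;
        ctx->call_cntr++;                   /* role of SRF_RETURN_NEXT() */
        return true;
    }

    free(ctx);                              /* role of SRF_RETURN_DONE() */
    *state = NULL;
    return false;
}
```

    <para>
     A caller keeps invoking <function>model_srf</function> with the same
     <literal>state</literal> pointer until it returns false, which
     mirrors how the executor calls an <acronym>SRF</acronym> once per
     returned item.
    </para>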
2896
2897     <para>
2898      The memory context that is current when the <acronym>SRF</> is called is
2899      a transient context that will be cleared between calls.  This means
2900      that you do not need to call <function>pfree</> on everything
2901      you allocated using <function>palloc</>; it will go away anyway.  However, if you want to allocate
2902      any data structures to live across calls, you need to put them somewhere
2903      else.  The memory context referenced by
2904      <structfield>multi_call_memory_ctx</> is a suitable location for any
2905      data that needs to survive until the <acronym>SRF</> is finished running.  In most
2906      cases, this means that you should switch into
2907      <structfield>multi_call_memory_ctx</> while doing the first-call setup.
2908     </para>
2909
2910     <warning>
2911      <para>
2912       While the actual arguments to the function remain unchanged between
2913       calls, if you detoast the argument values (which is normally done
2914       transparently by the
2915       <function>PG_GETARG_<replaceable>xxx</replaceable></function> macro)
2916       in the transient context then the detoasted copies will be freed on
2917       each cycle. Accordingly, if you keep references to such values in
2918       your <structfield>user_fctx</>, you must either copy them into the
2919       <structfield>multi_call_memory_ctx</> after detoasting, or ensure
2920       that you detoast the values only in that context.
2921      </para>
2922     </warning>
2923
2924     <para>
2925      A complete pseudo-code example looks like the following:
<programlisting>
Datum
my_set_returning_function(PG_FUNCTION_ARGS)
{
    FuncCallContext  *funcctx;
    Datum             result;
    <replaceable>further declarations as needed</replaceable>

    if (SRF_IS_FIRSTCALL())
    {
        MemoryContext oldcontext;

        funcctx = SRF_FIRSTCALL_INIT();
        oldcontext = MemoryContextSwitchTo(funcctx-&gt;multi_call_memory_ctx);
        /* One-time setup code appears here: */
        <replaceable>user code</replaceable>
        <replaceable>if returning composite</replaceable>
            <replaceable>build TupleDesc, and perhaps AttInMetadata</replaceable>
        <replaceable>endif returning composite</replaceable>
        <replaceable>user code</replaceable>
        MemoryContextSwitchTo(oldcontext);
    }

    /* Each-time setup code appears here: */
    <replaceable>user code</replaceable>
    funcctx = SRF_PERCALL_SETUP();
    <replaceable>user code</replaceable>

    /* this is just one way we might test whether we are done: */
    if (funcctx-&gt;call_cntr &lt; funcctx-&gt;max_calls)
    {
        /* Here we want to return another item: */
        <replaceable>user code</replaceable>
        <replaceable>obtain result Datum</replaceable>
        SRF_RETURN_NEXT(funcctx, result);
    }
    else
    {
        /* Here we are done returning items and just need to clean up: */
        <replaceable>user code</replaceable>
        SRF_RETURN_DONE(funcctx);
    }
}
</programlisting>
2970     </para>
2971
2972     <para>
2973      A complete example of a simple <acronym>SRF</> returning a composite type
2974      looks like:
<programlisting><![CDATA[
#include "postgres.h"
#include "fmgr.h"
#include "funcapi.h"

PG_FUNCTION_INFO_V1(retcomposite);

Datum
retcomposite(PG_FUNCTION_ARGS)
{
    FuncCallContext     *funcctx;
    uint64               call_cntr;     /* same type as funcctx->call_cntr */
    uint64               max_calls;     /* same type as funcctx->max_calls */
    TupleDesc            tupdesc;
    AttInMetadata       *attinmeta;

    /* stuff done only on the first call of the function */
    if (SRF_IS_FIRSTCALL())
    {
        MemoryContext   oldcontext;

        /* create a function context for cross-call persistence */
        funcctx = SRF_FIRSTCALL_INIT();

        /* switch to memory context appropriate for multiple function calls */
        oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);

        /* total number of tuples to be returned */
        funcctx->max_calls = PG_GETARG_UINT32(0);

        /* Build a tuple descriptor for our result type */
        if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
            ereport(ERROR,
                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                     errmsg("function returning record called in context "
                            "that cannot accept type record")));

        /*
         * generate attribute metadata needed later to produce tuples from raw
         * C strings
         */
        attinmeta = TupleDescGetAttInMetadata(tupdesc);
        funcctx->attinmeta = attinmeta;

        MemoryContextSwitchTo(oldcontext);
    }

    /* stuff done on every call of the function */
    funcctx = SRF_PERCALL_SETUP();

    call_cntr = funcctx->call_cntr;
    max_calls = funcctx->max_calls;
    attinmeta = funcctx->attinmeta;

    if (call_cntr < max_calls)    /* do when there is more left to send */
    {
        char       **values;
        HeapTuple    tuple;
        Datum        result;

        /*
         * Prepare a values array for building the returned tuple.
         * This should be an array of C strings which will
         * be processed later by the type input functions.
         */
        values = (char **) palloc(3 * sizeof(char *));
        values[0] = (char *) palloc(16 * sizeof(char));
        values[1] = (char *) palloc(16 * sizeof(char));
        values[2] = (char *) palloc(16 * sizeof(char));

        snprintf(values[0], 16, "%d", 1 * PG_GETARG_INT32(1));
        snprintf(values[1], 16, "%d", 2 * PG_GETARG_INT32(1));
        snprintf(values[2], 16, "%d", 3 * PG_GETARG_INT32(1));

        /* build a tuple */
        tuple = BuildTupleFromCStrings(attinmeta, values);

        /* make the tuple into a datum */
        result = HeapTupleGetDatum(tuple);

        /* clean up (this is not really necessary) */
        pfree(values[0]);
        pfree(values[1]);
        pfree(values[2]);
        pfree(values);

        SRF_RETURN_NEXT(funcctx, result);
    }
    else    /* do when there is no more left */
    {
        SRF_RETURN_DONE(funcctx);
    }
}
]]>
</programlisting>
3066
3067      One way to declare this function in SQL is:
<programlisting>
CREATE TYPE __retcomposite AS (f1 integer, f2 integer, f3 integer);

CREATE OR REPLACE FUNCTION retcomposite(integer, integer)
    RETURNS SETOF __retcomposite
    AS '<replaceable>filename</>', 'retcomposite'
    LANGUAGE C IMMUTABLE STRICT;
</programlisting>
3076      A different way is to use OUT parameters:
<programlisting>
CREATE OR REPLACE FUNCTION retcomposite(IN integer, IN integer,
    OUT f1 integer, OUT f2 integer, OUT f3 integer)
    RETURNS SETOF record
    AS '<replaceable>filename</>', 'retcomposite'
    LANGUAGE C IMMUTABLE STRICT;
</programlisting>
3084      Notice that in this method the output type of the function is formally
3085      an anonymous <structname>record</> type.
3086     </para>
3087
3088     <para>
     The <link linkend="tablefunc">contrib/tablefunc</>
     module in the source distribution contains more examples of
     set-returning functions.
3092     </para>
3093    </sect2>
3094
3095    <sect2>
3096     <title>Polymorphic Arguments and Return Types</title>
3097
3098     <para>
3099      C-language functions can be declared to accept and
3100      return the polymorphic types
3101      <type>anyelement</type>, <type>anyarray</type>, <type>anynonarray</type>,
3102      <type>anyenum</type>, and <type>anyrange</type>.
3103      See <xref linkend="extend-types-polymorphic"> for a more detailed explanation
3104      of polymorphic functions. When function arguments or return types
3105      are defined as polymorphic types, the function author cannot know
3106      in advance what data type it will be called with, or
3107      need to return. There are two routines provided in <filename>fmgr.h</>
3108      to allow a version-1 C function to discover the actual data types
3109      of its arguments and the type it is expected to return. The routines are
3110      called <literal>get_fn_expr_rettype(FmgrInfo *flinfo)</> and
3111      <literal>get_fn_expr_argtype(FmgrInfo *flinfo, int argnum)</>.
3112      They return the result or argument type OID, or <symbol>InvalidOid</symbol> if the
3113      information is not available.
3114      The structure <literal>flinfo</> is normally accessed as
3115      <literal>fcinfo-&gt;flinfo</>. The parameter <literal>argnum</>
3116      is zero based.  <function>get_call_result_type</> can also be used
3117      as an alternative to <function>get_fn_expr_rettype</>.
3118      There is also <function>get_fn_expr_variadic</>, which can be used to
3119      find out whether variadic arguments have been merged into an array.
3120      This is primarily useful for <literal>VARIADIC "any"</> functions,
3121      since such merging will always have occurred for variadic functions
3122      taking ordinary array types.
3123     </para>
3124
3125     <para>
3126      For example, suppose we want to write a function to accept a single
3127      element of any type, and return a one-dimensional array of that type:
3128
<programlisting>
#include "postgres.h"
#include "fmgr.h"
#include "utils/array.h"        /* for ArrayType, MAXDIM, construct_md_array() */
#include "utils/lsyscache.h"    /* for get_typlenbyvalalign() */

PG_FUNCTION_INFO_V1(make_array);
Datum
make_array(PG_FUNCTION_ARGS)
{
    ArrayType  *result;
    Oid         element_type = get_fn_expr_argtype(fcinfo-&gt;flinfo, 0);
    Datum       element;
    bool        isnull;
    int16       typlen;
    bool        typbyval;
    char        typalign;
    int         ndims;
    int         dims[MAXDIM];
    int         lbs[MAXDIM];

    if (!OidIsValid(element_type))
        elog(ERROR, "could not determine data type of input");

    /* get the provided element, being careful in case it's NULL */
    isnull = PG_ARGISNULL(0);
    if (isnull)
        element = (Datum) 0;
    else
        element = PG_GETARG_DATUM(0);

    /* we have one dimension */
    ndims = 1;
    /* and one element */
    dims[0] = 1;
    /* and lower bound is 1 */
    lbs[0] = 1;

    /* get required info about the element type */
    get_typlenbyvalalign(element_type, &amp;typlen, &amp;typbyval, &amp;typalign);

    /* now build the array */
    result = construct_md_array(&amp;element, &amp;isnull, ndims, dims, lbs,
                                element_type, typlen, typbyval, typalign);

    PG_RETURN_ARRAYTYPE_P(result);
}
</programlisting>
3172     </para>
3173
    <para>
     The following command declares the function
     <function>make_array</function> in SQL:

<programlisting>
CREATE FUNCTION make_array(anyelement) RETURNS anyarray
    AS '<replaceable>DIRECTORY</replaceable>/funcs', 'make_array'
    LANGUAGE C IMMUTABLE;
</programlisting>
    </para>

    <para>
     There is a variant of polymorphism that is only available to C-language
     functions: they can be declared to take parameters of type
     <literal>"any"</>.  (Note that this type name must be double-quoted,
     since it's also a SQL reserved word.)  This works like
     <type>anyelement</> except that it does not constrain different
     <literal>"any"</> arguments to be the same type, nor do they help
     determine the function's result type.  A C-language function can also
     declare its final parameter to be <literal>VARIADIC "any"</>.  This will
     match one or more actual arguments of any type (not necessarily the same
     type).  These arguments will <emphasis>not</> be gathered into an array
     as happens with normal variadic functions; they will just be passed to
     the function separately.  The <function>PG_NARGS()</> macro and the
     methods described above must be used to determine the number of actual
     arguments and their types when using this feature.  Also, users of such
     a function might wish to use the <literal>VARIADIC</> keyword in their
     function call, with the expectation that the function would treat the
     array elements as separate arguments.  The function itself must implement
     that behavior if wanted, after using <function>get_fn_expr_variadic</> to
     detect that the actual argument was marked with <literal>VARIADIC</>.
    </para>
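
    <para>
     As a minimal sketch of the approach described above, the following
     function counts its non-null arguments.  The name
     <function>count_nonnull_args</> is hypothetical and chosen only for
     illustration.  The function is assumed to be declared with a final
     <literal>VARIADIC "any"</> parameter and without <literal>STRICT</>,
     so that null arguments reach the C code.  For brevity, the sketch
     rejects calls marked with <literal>VARIADIC</> rather than expanding
     the array:

<programlisting>
#include "postgres.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(count_nonnull_args);

Datum
count_nonnull_args(PG_FUNCTION_ARGS)
{
    int32       count = 0;
    int         i;

    /* this sketch does not expand an array passed with VARIADIC */
    if (get_fn_expr_variadic(fcinfo-&gt;flinfo))
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("VARIADIC array arguments are not supported by this example")));

    /* PG_NARGS() gives the number of arguments actually passed */
    for (i = 0; i &lt; PG_NARGS(); i++)
    {
        /* each argument can have a different type */
        Oid         argtype = get_fn_expr_argtype(fcinfo-&gt;flinfo, i);

        if (!OidIsValid(argtype))
            elog(ERROR, "could not determine data type of argument %d", i + 1);

        if (!PG_ARGISNULL(i))
            count++;
    }

    PG_RETURN_INT32(count);
}
</programlisting>
    </para>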
   </sect2>

   <sect2 id="xfunc-transform-functions">
    <title>Transform Functions</title>

    <para>
     Some function calls can be simplified during planning based on
     properties specific to the function.  For example,
     <literal>int4mul(n, 1)</> could be simplified to just <literal>n</>.
     To define such function-specific optimizations, write a
     <firstterm>transform function</> and place its OID in the
     <structfield>protransform</> field of the primary function's
     <structname>pg_proc</> entry.  The transform function must have the SQL
     signature <literal>protransform(internal) RETURNS internal</>.  The
     argument, actually <type>FuncExpr *</>, is a dummy node representing a
     call to the primary function.  If the transform function's study of the
     expression tree proves that a simplified expression tree can substitute
     for all possible concrete calls represented thereby, build and return
     that simplified expression.  Otherwise, return a <literal>NULL</>
     pointer (<emphasis>not</> a SQL null).
    </para>

    <para>
     We make no guarantee that <productname>PostgreSQL</> will never call the
     primary function in cases that the transform function could simplify.
     Ensure rigorous equivalence between the simplified expression and an
     actual call to the primary function.
    </para>

    <para>
     Currently, this facility is not exposed to users at the SQL level
     because of security concerns, so it is only practical to use for
     optimizing built-in functions.
    </para>
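
    <para>
     The following is a minimal sketch of a transform function for the
     <literal>int4mul(n, 1)</> case mentioned above.  The name
     <function>int4mul_transform</> is hypothetical and used only for
     illustration; it is not claimed to exist in the server.  Because the
     facility is not exposed at the SQL level, code like this would have to
     live in the server source, with its OID placed in the
     <structfield>protransform</> field of the primary function's
     <structname>pg_proc</> entry:

<programlisting>
#include "postgres.h"
#include "fmgr.h"
#include "nodes/pg_list.h"
#include "nodes/primnodes.h"

Datum
int4mul_transform(PG_FUNCTION_ARGS)
{
    FuncExpr   *expr = (FuncExpr *) PG_GETARG_POINTER(0);
    Node       *ret = NULL;
    Node       *arg2;

    Assert(IsA(expr, FuncExpr));
    Assert(list_length(expr-&gt;args) == 2);

    arg2 = (Node *) lsecond(expr-&gt;args);

    /* simplify only when the second argument is the non-null constant 1 */
    if (IsA(arg2, Const) &amp;&amp;
        !((Const *) arg2)-&gt;constisnull &amp;&amp;
        DatumGetInt32(((Const *) arg2)-&gt;constvalue) == 1)
        ret = (Node *) linitial(expr-&gt;args);

    /* a NULL pointer (not a SQL null) means no simplification was found */
    PG_RETURN_POINTER(ret);
}
</programlisting>

     Returning the first argument unchanged is equivalent to the original
     call here because multiplying by 1 cannot overflow, and
     <function>int4mul</> is strict, so a null <literal>n</> yields null
     either way.
    </para>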
   </sect2>

   <sect2>
    <title>Shared Memory and LWLocks</title>

    <para>
     Add-ins can reserve LWLocks and an allocation of shared memory on server
     startup.  The add-in's shared library must be preloaded by specifying
     it in
     <xref linkend="guc-shared-preload-libraries"><indexterm><primary>shared_preload_libraries</></>.
     Shared memory is reserved by calling:
<programlisting>
void RequestAddinShmemSpace(Size size)
</programlisting>
     from your <function>_PG_init</> function.
    </para>
    <para>
     LWLocks are reserved by calling:
<programlisting>
void RequestNamedLWLockTranche(const char *tranche_name, int num_lwlocks)
</programlisting>
     from <function>_PG_init</>.  This will ensure that an array of
     <literal>num_lwlocks</> LWLocks is available under the name
     <literal>tranche_name</>.  Use <function>GetNamedLWLockTranche</>
     to get a pointer to this array.
    </para>
    <para>
     To avoid possible race conditions, each backend should use the LWLock
     <function>AddinShmemInitLock</> when connecting to and initializing
     its allocation of shared memory, as shown here:
<programlisting>
static mystruct *ptr = NULL;

if (!ptr)
{
    bool    found;

    LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
    ptr = ShmemInitStruct("my struct name", size, &amp;found);
    if (!found)
    {
        /* initialize contents of shmem area here */

        /* acquire any requested LWLocks */
        ptr-&gt;locks = GetNamedLWLockTranche("my tranche name");
    }
    LWLockRelease(AddinShmemInitLock);
}
</programlisting>
    </para>
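
    <para>
     The reservation requests described above could be issued as in the
     following minimal sketch.  The layout of <structname>mystruct</> and
     the early return when the library is not being preloaded are
     assumptions made for illustration; the tranche name matches the one
     used in the initialization code above:

<programlisting>
#include "postgres.h"
#include "fmgr.h"
#include "miscadmin.h"
#include "storage/ipc.h"
#include "storage/lwlock.h"

PG_MODULE_MAGIC;

/* hypothetical layout of the add-in's shared state */
typedef struct mystruct
{
    LWLockPadded *locks;        /* array returned by GetNamedLWLockTranche */
    int         counter;        /* example payload protected by locks[0] */
} mystruct;

void        _PG_init(void);

void
_PG_init(void)
{
    /* requests made after preloading has finished are too late */
    if (!process_shared_preload_libraries_in_progress)
        return;

    /* reserve room for the struct and one LWLock in a named tranche */
    RequestAddinShmemSpace(sizeof(mystruct));
    RequestNamedLWLockTranche("my tranche name", 1);
}
</programlisting>
    </para>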
   </sect2>

   <sect2 id="extend-Cpp">
    <title>Using C++ for Extensibility</title>

    <indexterm zone="extend-Cpp">
     <primary>C++</primary>
    </indexterm>

    <para>
     Although the <productname>PostgreSQL</productname> backend is written in
     C, it is possible to write extensions in C++ if these guidelines are
     followed:

     <itemizedlist>
      <listitem>
       <para>
        All functions accessed by the backend must present a C interface
        to the backend; these C functions can then call C++ functions.
        For example, <literal>extern "C"</> linkage is required for
        backend-accessed functions.  This is also necessary for any
        functions that are passed as pointers between the backend and
        C++ code.
       </para>
      </listitem>
      <listitem>
       <para>
        Free memory using the appropriate deallocation method.  For example,
        most backend memory is allocated using <function>palloc()</>, so use
        <function>pfree()</> to free it.  Using C++
        <function>delete</> in such cases will fail.
       </para>
      </listitem>
      <listitem>
       <para>
        Prevent exceptions from propagating into the C code (use a catch-all
        block at the top level of all <literal>extern "C"</> functions).  This
        is necessary even if the C++ code does not explicitly throw any
        exceptions, because events like out-of-memory can still throw
        exceptions.  Any exceptions must be caught and appropriate errors
        passed back to the C interface.  If possible, compile C++ with
        <option>-fno-exceptions</> to eliminate exceptions entirely; in such
        cases, you must check for failures in your C++ code, e.g., check for
        NULL returned by <function>new()</>.
       </para>
      </listitem>
      <listitem>
       <para>
        If calling backend functions from C++ code, be sure that the
        C++ call stack contains only plain old data structures
        (<acronym>POD</>).  This is necessary because backend errors
        generate a distant <function>longjmp()</> that does not properly
        unwind a C++ call stack with non-POD objects.
       </para>
      </listitem>
     </itemizedlist>
    </para>

    <para>
     In summary, it is best to place C++ code behind a wall of
     <literal>extern "C"</> functions that interface to the backend,
     and avoid exception, memory, and call stack leakage.
    </para>
   </sect2>

  </sect1>