   1 <!-- doc/src/sgml/xfunc.sgml -->
   2
   3  <sect1 id="xfunc">
   4   <title>User-defined Functions</title>
   5
   6   <indexterm zone="xfunc">
   7    <primary>function</primary>
   8    <secondary>user-defined</secondary>
   9   </indexterm>
  10
  11   <para>
  12    <productname>PostgreSQL</productname> provides four kinds of
  13    functions:
  14
  15    <itemizedlist>
  16     <listitem>
  17      <para>
  18       query language functions (functions written in
  19       <acronym>SQL</acronym>) (<xref linkend="xfunc-sql">)
  20      </para>
  21     </listitem>
  22     <listitem>
  23      <para>
  24       procedural language functions (functions written in, for
  25       example, <application>PL/pgSQL</> or <application>PL/Tcl</>)
  26       (<xref linkend="xfunc-pl">)
  27      </para>
  28     </listitem>
  29     <listitem>
  30      <para>
  31       internal functions (<xref linkend="xfunc-internal">)
  32      </para>
  33     </listitem>
  34     <listitem>
  35      <para>
  36       C-language functions (<xref linkend="xfunc-c">)
  37      </para>
  38     </listitem>
  39    </itemizedlist>
  40   </para>
  41
  42   <para>
   Every kind of function can take base types, composite types, or
   combinations of these as arguments (parameters). In addition,
   every kind of function can return a base type or
   a composite type.  Functions can also be defined to return
   sets of base or composite values.
  49   </para>
  50
  51   <para>
  52    Many kinds of functions can take or return certain pseudo-types
  53    (such as polymorphic types), but the available facilities vary.
  54    Consult the description of each kind of function for more details.
  55   </para>
  56
  57   <para>
  58    It's easiest to define <acronym>SQL</acronym>
  59    functions, so we'll start by discussing those.
  60    Most of the concepts presented for <acronym>SQL</acronym> functions
  61    will carry over to the other types of functions.
  62   </para>
  63
  64   <para>
  65    Throughout this chapter, it can be useful to look at the reference
  66    page of the <xref linkend="sql-createfunction"> command to
  67    understand the examples better.  Some examples from this chapter
  68    can be found in <filename>funcs.sql</filename> and
  69    <filename>funcs.c</filename> in the <filename>src/tutorial</>
  70    directory in the <productname>PostgreSQL</productname> source
  71    distribution.
  72   </para>
  73   </sect1>
  74
  75   <sect1 id="xfunc-sql">
  76    <title>Query Language (<acronym>SQL</acronym>) Functions</title>
  77
  78    <indexterm zone="xfunc-sql">
  79     <primary>function</primary>
  80     <secondary>user-defined</secondary>
  81     <tertiary>in SQL</tertiary>
  82    </indexterm>
  83
  84    <para>
  85     SQL functions execute an arbitrary list of SQL statements, returning
  86     the result of the last query in the list.
  87     In the simple (non-set)
  88     case, the first row of the last query's result will be returned.
  89     (Bear in mind that <quote>the first row</quote> of a multirow
  90     result is not well-defined unless you use <literal>ORDER BY</>.)
  91     If the last query happens
  92     to return no rows at all, the null value will be returned.
  93    </para>
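
   <para>
    As a minimal sketch of the single-row case, the hypothetical function
    <function>top_earner</function> below assumes the <literal>emp</>
    table used in the examples of this chapter, with columns
    <literal>name</> and <literal>salary</>.  The <literal>ORDER BY</>
    clause is what makes the first row of the result well-defined:

<programlisting>
-- Hypothetical example; assumes a table emp(name text, salary numeric, ...).
CREATE FUNCTION top_earner() RETURNS text AS $$
    SELECT name FROM emp
        ORDER BY salary DESC, name;
$$ LANGUAGE SQL;
</programlisting>

    Only the first row of the ordered result is returned.  If
    <literal>emp</> is empty, the function returns the null value.
   </para>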
  94
  95    <para>
  96     Alternatively, an SQL function can be declared to return a set (that is,
  97     multiple rows) by specifying the function's return type as <literal>SETOF
  98     <replaceable>sometype</></literal>, or equivalently by declaring it as
  99     <literal>RETURNS TABLE(<replaceable>columns</>)</literal>.  In this case
 100     all rows of the last query's result are returned.  Further details appear
 101     below.
 102    </para>
 103
 104    <para>
 105     The body of an SQL function must be a list of SQL
 106     statements separated by semicolons.  A semicolon after the last
 107     statement is optional.  Unless the function is declared to return
 108     <type>void</>, the last statement must be a <command>SELECT</>,
 109     or an <command>INSERT</>, <command>UPDATE</>, or <command>DELETE</>
 110     that has a <literal>RETURNING</> clause.
 111    </para>
 112
 113     <para>
     Any collection of commands in the <acronym>SQL</acronym>
     language can be packaged together and defined as a function.
     Besides <command>SELECT</command> queries, the commands can include data
     modification queries (<command>INSERT</command>,
     <command>UPDATE</command>, and <command>DELETE</command>), as well as
     other SQL commands. (You cannot use transaction control commands, such as
     <command>COMMIT</> and <command>SAVEPOINT</>, or certain utility
     commands, such as <literal>VACUUM</>, in <acronym>SQL</acronym> functions.)
 122      However, the final command
 123      must be a <command>SELECT</command> or have a <literal>RETURNING</>
 124      clause that returns whatever is
 125      specified as the function's return type.  Alternatively, if you
 126      want to define a SQL function that performs actions but has no
 127      useful value to return, you can define it as returning <type>void</>.
 128      For example, this function removes rows with negative salaries from
 129      the <literal>emp</> table:
 130
 131 <screen>
 132 CREATE FUNCTION clean_emp() RETURNS void AS '
 133     DELETE FROM emp
 134         WHERE salary &lt; 0;
 135 ' LANGUAGE SQL;
 136
 137 SELECT clean_emp();
 138
 139  clean_emp
 140 -----------
 141
 142 (1 row)
 143 </screen>
 144     </para>
 145
 146     <note>
 147      <para>
      The entire body of a SQL function is parsed before any of it is
      executed.  While a SQL function can contain commands that alter
      the system catalogs (e.g., <command>CREATE TABLE</>), the effects
      of such commands will not be visible during parse analysis of
      later commands in the function.  Thus, for example,
      <literal>CREATE TABLE foo (...); INSERT INTO foo VALUES(...);</literal>
      will not work as desired if packaged up into a single SQL function,
      since <structname>foo</> won't exist yet when the <command>INSERT</>
      command is parsed.  It's recommended to use <application>PL/pgSQL</>
      instead of a SQL function in this type of situation.
 158      </para>
 159    </note>
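
   <para>
    A minimal sketch of that <application>PL/pgSQL</> alternative follows.
    The function name <function>make_foo</function> and the single
    <literal>id</> column are assumptions made for illustration.
    <application>PL/pgSQL</> parses each statement only when it is first
    executed, so the <command>INSERT</> is analyzed after the table
    has been created:

<programlisting>
-- Hypothetical example; the table definition is only a placeholder.
CREATE FUNCTION make_foo() RETURNS void AS $$
BEGIN
    CREATE TABLE foo (id integer);
    -- Parsed at first execution, by which time foo exists.
    INSERT INTO foo VALUES (1);
END;
$$ LANGUAGE plpgsql;
</programlisting>
   </para>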
 160
 161    <para>
 162     The syntax of the <command>CREATE FUNCTION</command> command requires
 163     the function body to be written as a string constant.  It is usually
 164     most convenient to use dollar quoting (see <xref
 165     linkend="sql-syntax-dollar-quoting">) for the string constant.
 166     If you choose to use regular single-quoted string constant syntax,
 167     you must double single quote marks (<literal>'</>) and backslashes
 168     (<literal>\</>) (assuming escape string syntax) in the body of
 169     the function (see <xref linkend="sql-syntax-strings">).
 170    </para>
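
   <para>
    The difference between the two quoting styles shows up as soon as the
    body itself contains a string literal.  In this sketch the function
    name <function>greeting</function> is hypothetical:

<programlisting>
-- Dollar quoting: the body is written exactly as it would be run.
CREATE FUNCTION greeting() RETURNS text AS $$
    SELECT 'hello'::text;
$$ LANGUAGE SQL;

-- Single-quoted body: every quote mark inside the body is doubled.
CREATE OR REPLACE FUNCTION greeting() RETURNS text AS '
    SELECT ''hello''::text;
' LANGUAGE SQL;
</programlisting>

    Both definitions produce the same function.
   </para>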
 171
 172    <sect2 id="xfunc-sql-function-arguments">
 173     <title>Arguments for <acronym>SQL</acronym> Functions</title>
 174
 175    <indexterm>
 176     <primary>function</primary>
 177     <secondary>named argument</secondary>
 178    </indexterm>
 179
 180     <para>
 181      Arguments of a SQL function can be referenced in the function
 182      body using either names or numbers.  Examples of both methods appear
 183      below.
 184     </para>
 185
 186     <para>
 187      To use a name, declare the function argument as having a name, and
 188      then just write that name in the function body.  If the argument name
 189      is the same as any column name in the current SQL command within the
 190      function, the column name will take precedence.  To override this,
 191      qualify the argument name with the name of the function itself, that is
 192      <literal><replaceable>function_name</>.<replaceable>argument_name</></literal>.
 193      (If this would conflict with a qualified column name, again the column
 194      name wins.  You can avoid the ambiguity by choosing a different alias for
 195      the table within the SQL command.)
 196     </para>
 197
 198     <para>
 199      In the older numeric approach, arguments are referenced using the syntax
 200      <literal>$<replaceable>n</></>: <literal>$1</> refers to the first input
 201      argument, <literal>$2</> to the second, and so on.  This will work
 202      whether or not the particular argument was declared with a name.
 203     </para>
 204
 205     <para>
 206      If an argument is of a composite type, then the dot notation,
 207      e.g., <literal><replaceable>argname</>.<replaceable>fieldname</></literal> or
 208      <literal>$1.<replaceable>fieldname</></literal>, can be used to access attributes of the
 209      argument.  Again, you might need to qualify the argument's name with the
 210      function name to make the form with an argument name unambiguous.
 211     </para>
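
    <para>
     As a brief sketch of both forms, the following hypothetical functions
     assume the <literal>emp</> table used in the examples of this chapter,
     with an <type>integer</> column <literal>age</>:

<programlisting>
-- Named notation: e is the composite argument, age is one of its fields.
CREATE FUNCTION emp_age(e emp) RETURNS integer AS $$
    SELECT e.age;
$$ LANGUAGE SQL;

-- Numeric notation: the same field reached through $1.
CREATE FUNCTION emp_age_by_number(emp) RETURNS integer AS $$
    SELECT $1.age;
$$ LANGUAGE SQL;
</programlisting>
    </para>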
 212
 213     <para>
 214      SQL function arguments can only be used as data values,
 215      not as identifiers.  Thus for example this is reasonable:
 216 <programlisting>
 217 INSERT INTO mytable VALUES ($1);
 218 </programlisting>
 219 but this will not work:
 220 <programlisting>
 221 INSERT INTO $1 VALUES (42);
 222 </programlisting>
 223     </para>
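
    <para>
     When the target table really must be chosen at run time, one possible
     workaround, which lies outside SQL functions, is dynamic SQL in
     <application>PL/pgSQL</>.  The sketch below is only an illustration:
     the function name <function>insert_answer</function> is hypothetical,
     and it assumes that the named table has a single column able to
     hold the value 42.

<programlisting>
-- Hypothetical example; not a SQL-language function.
CREATE FUNCTION insert_answer(tabname text) RETURNS void AS $$
BEGIN
    -- %I quotes tabname as an identifier before the command is run.
    EXECUTE format('INSERT INTO %I VALUES (42)', tabname);
END;
$$ LANGUAGE plpgsql;
</programlisting>
    </para>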
 224
 225     <note>
 226      <para>
 227       The ability to use names to reference SQL function arguments was added
 228       in <productname>PostgreSQL</productname> 9.2.  Functions to be used in
 229       older servers must use the <literal>$<replaceable>n</></> notation.
 230      </para>
 231     </note>
 232    </sect2>
 233
 234    <sect2 id="xfunc-sql-base-functions">
 235     <title><acronym>SQL</acronym> Functions on Base Types</title>
 236
 237     <para>
 238      The simplest possible <acronym>SQL</acronym> function has no arguments and
 239      simply returns a base type, such as <type>integer</type>:
 240
 241 <screen>
 242 CREATE FUNCTION one() RETURNS integer AS $$
 243     SELECT 1 AS result;
 244 $$ LANGUAGE SQL;
 245
 246 -- Alternative syntax for string literal:
 247 CREATE FUNCTION one() RETURNS integer AS '
 248     SELECT 1 AS result;
 249 ' LANGUAGE SQL;
 250
 251 SELECT one();
 252
 253  one
 254 -----
 255    1
 256 </screen>
 257     </para>
 258
 259     <para>
     Notice that we defined a column alias within the function body for the result of the function
     (with the name <literal>result</>), but this column alias is not visible
     outside the function.  Hence, the result is labeled <literal>one</>
     instead of <literal>result</>.
 264     </para>
 265
 266     <para>
 267      It is almost as easy to define <acronym>SQL</acronym> functions
 268      that take base types as arguments:
 269
 270 <screen>
 271 CREATE FUNCTION add_em(x integer, y integer) RETURNS integer AS $$
 272     SELECT x + y;
 273 $$ LANGUAGE SQL;
 274
 275 SELECT add_em(1, 2) AS answer;
 276
 277  answer
 278 --------
 279       3
 280 </screen>
 281     </para>
 282
 283     <para>
 284      Alternatively, we could dispense with names for the arguments and
 285      use numbers:
 286
 287 <screen>
 288 CREATE FUNCTION add_em(integer, integer) RETURNS integer AS $$
 289     SELECT $1 + $2;
 290 $$ LANGUAGE SQL;
 291
 292 SELECT add_em(1, 2) AS answer;
 293
 294  answer
 295 --------
 296       3
 297 </screen>
 298     </para>
 299
 300     <para>
 301      Here is a more useful function, which might be used to debit a
 302      bank account:
 303
 304 <programlisting>
 305 CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS integer AS $$
 306     UPDATE bank
 307         SET balance = balance - debit
 308         WHERE accountno = tf1.accountno;
 309     SELECT 1;
 310 $$ LANGUAGE SQL;
 311 </programlisting>
 312
 313      A user could execute this function to debit account 17 by $100.00 as
 314      follows:
 315
 316 <programlisting>
 317 SELECT tf1(17, 100.0);
 318 </programlisting>
 319     </para>
 320
 321     <para>
 322      In this example, we chose the name <literal>accountno</> for the first
 323      argument, but this is the same as the name of a column in the
 324      <literal>bank</> table.  Within the <command>UPDATE</> command,
 325      <literal>accountno</> refers to the column <literal>bank.accountno</>,
 326      so <literal>tf1.accountno</> must be used to refer to the argument.
 327      We could of course avoid this by using a different name for the argument.
 328     </para>
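
    <para>
     For instance, renaming the first argument removes the need for any
     qualification.  The argument name <literal>acct</> is an assumption
     chosen for illustration; it is taken not to match any column of
     <literal>bank</>:

<programlisting>
CREATE FUNCTION tf1 (acct integer, debit numeric) RETURNS integer AS $$
    UPDATE bank
        SET balance = balance - debit
        -- accountno is the column, acct is the argument.
        WHERE accountno = acct;
    SELECT 1;
$$ LANGUAGE SQL;
</programlisting>
    </para>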
 329
 330     <para>
 331      In practice one would probably like a more useful result from the
 332      function than a constant 1, so a more likely definition
 333      is:
 334
 335 <programlisting>
CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
    UPDATE bank
        SET balance = balance - debit
        WHERE accountno = tf1.accountno;
    SELECT balance FROM bank WHERE accountno = tf1.accountno;
$$ LANGUAGE SQL;
</programlisting>

     which adjusts the balance and returns the new balance.
     (This assumes that <literal>balance</> is a <type>numeric</> column,
     which is why the function is declared to return <type>numeric</>.)
     The same thing could be done in one command using <literal>RETURNING</>:

<programlisting>
CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
    UPDATE bank
        SET balance = balance - debit
        WHERE accountno = tf1.accountno
    RETURNING balance;
$$ LANGUAGE SQL;
</programlisting>
 354 </programlisting>
 355     </para>
 356    </sect2>
 357
 358    <sect2 id="xfunc-sql-composite-functions">
 359     <title><acronym>SQL</acronym> Functions on Composite Types</title>
 360
 361     <para>
 362      When writing functions with arguments of composite types, we must not
 363      only specify which argument we want but also the desired attribute
 364      (field) of that argument.  For example, suppose that
 365      <type>emp</type> is a table containing employee data, and therefore
 366      also the name of the composite type of each row of the table.  Here
 367      is a function <function>double_salary</function> that computes what someone's
 368      salary would be if it were doubled:
 369
 370 <screen>
 371 CREATE TABLE emp (
 372     name        text,
 373     salary      numeric,
 374     age         integer,
 375     cubicle     point
 376 );
 377
 378 INSERT INTO emp VALUES ('Bill', 4200, 45, '(2,1)');
 379
 380 CREATE FUNCTION double_salary(emp) RETURNS numeric AS $$
 381     SELECT $1.salary * 2 AS salary;
 382 $$ LANGUAGE SQL;
 383
 384 SELECT name, double_salary(emp.*) AS dream
 385     FROM emp
 386     WHERE emp.cubicle ~= point '(2,1)';
 387
 388  name | dream
 389 ------+-------
 390  Bill |  8400
 391 </screen>
 392     </para>
 393
 394     <para>
 395      Notice the use of the syntax <literal>$1.salary</literal>
 396      to select one field of the argument row value.  Also notice
 397      how the calling <command>SELECT</> command uses <literal>*</>
 398      to select
 399      the entire current row of a table as a composite value.  The table
 400      row can alternatively be referenced using just the table name,
 401      like this:
 402 <screen>
 403 SELECT name, double_salary(emp) AS dream
 404     FROM emp
 405     WHERE emp.cubicle ~= point '(2,1)';
 406 </screen>
 407      but this usage is deprecated since it's easy to get confused.
 408     </para>
 409
 410     <para>
 411      Sometimes it is handy to construct a composite argument value
 412      on-the-fly.  This can be done with the <literal>ROW</> construct.
 413      For example, we could adjust the data being passed to the function:
 414 <screen>
 415 SELECT name, double_salary(ROW(name, salary*1.1, age, cubicle)) AS dream
 416     FROM emp;
 417 </screen>
 418     </para>
 419
 420     <para>
 421      It is also possible to build a function that returns a composite type.
 422      This is an example of a function
 423      that returns a single <type>emp</type> row:
 424
 425 <programlisting>
 426 CREATE FUNCTION new_emp() RETURNS emp AS $$
 427     SELECT text 'None' AS name,
 428         1000.0 AS salary,
 429         25 AS age,
 430         point '(2,2)' AS cubicle;
 431 $$ LANGUAGE SQL;
 432 </programlisting>
 433
     In this example we have specified each of the attributes
     with a constant value, but any computation
     could have been substituted for these constants.
 437     </para>
 438
 439     <para>
 440      Note two important things about defining the function:
 441
 442      <itemizedlist>
 443       <listitem>
 444        <para>
 445         The select list order in the query must be exactly the same as
 446         that in which the columns appear in the table associated
 447         with the composite type.  (Naming the columns, as we did above,
 448         is irrelevant to the system.)
 449        </para>
 450       </listitem>
 451       <listitem>
 452        <para>
 453         You must typecast the expressions to match the
 454         definition of the composite type, or you will get errors like this:
 455 <screen>
 456 <computeroutput>
 457 ERROR:  function declared to return emp returns varchar instead of text at column 1
 458 </computeroutput>
 459 </screen>
 460        </para>
 461       </listitem>
 462      </itemizedlist>
 463     </para>
 464
 465     <para>
 466      A different way to define the same function is:
 467
 468 <programlisting>
 469 CREATE FUNCTION new_emp() RETURNS emp AS $$
 470     SELECT ROW('None', 1000.0, 25, '(2,2)')::emp;
 471 $$ LANGUAGE SQL;
 472 </programlisting>
 473
 474      Here we wrote a <command>SELECT</> that returns just a single
 475      column of the correct composite type.  This isn't really better
 476      in this situation, but it is a handy alternative in some cases
 477      &mdash; for example, if we need to compute the result by calling
 478      another function that returns the desired composite value.
 479     </para>
 480
 481     <para>
 482      We could call this function directly in either of two ways:
 483
 484 <screen>
 485 SELECT new_emp();
 486
 487          new_emp
 488 --------------------------
 489  (None,1000.0,25,"(2,2)")
 490
 491 SELECT * FROM new_emp();
 492
 493  name | salary | age | cubicle
 494 ------+--------+-----+---------
 495  None | 1000.0 |  25 | (2,2)
 496 </screen>
 497
 498      The second way is described more fully in <xref
 499      linkend="xfunc-sql-table-functions">.
 500     </para>
 501
 502     <para>
 503      When you use a function that returns a composite type,
 504      you might want only one field (attribute) from its result.
 505      You can do that with syntax like this:
 506
 507 <screen>
 508 SELECT (new_emp()).name;
 509
 510  name
 511 ------
 512  None
 513 </screen>
 514
 515      The extra parentheses are needed to keep the parser from getting
 516      confused.  If you try to do it without them, you get something like this:
 517
 518 <screen>
 519 SELECT new_emp().name;
 520 ERROR:  syntax error at or near "."
 521 LINE 1: SELECT new_emp().name;
 522                         ^
 523 </screen>
 524     </para>
 525
 526     <para>
     Another option is to use
     functional notation for extracting an attribute.  The simple way
     to explain this is that we can use the
     notations <literal><replaceable>attribute</>(<replaceable>table</>)</>
     and <literal><replaceable>table</>.<replaceable>attribute</></>
     interchangeably.
 533
 534 <screen>
 535 SELECT name(new_emp());
 536
 537  name
 538 ------
 539  None
 540 </screen>
 541
 542 <screen>
 543 -- This is the same as:
 544 -- SELECT emp.name AS youngster FROM emp WHERE emp.age &lt; 30;
 545
 546 SELECT name(emp) AS youngster FROM emp WHERE age(emp) &lt; 30;
 547
 548  youngster
 549 -----------
 550  Sam
 551  Andy
 552 </screen>
 553     </para>
 554
 555     <tip>
 556      <para>
 557       The equivalence between functional notation and attribute notation
 558       makes it possible to use functions on composite types to emulate
 559       <quote>computed fields</>.
 560       <indexterm>
 561        <primary>computed field</primary>
 562       </indexterm>
 563       <indexterm>
 564        <primary>field</primary>
 565        <secondary>computed</secondary>
 566       </indexterm>
 567       For example, using the previous definition
 568       for <literal>double_salary(emp)</>, we can write
 569
 570 <screen>
 571 SELECT emp.name, emp.double_salary FROM emp;
 572 </screen>
 573
 574       An application using this wouldn't need to be directly aware that
 575       <literal>double_salary</> isn't a real column of the table.
 576       (You can also emulate computed fields with views.)
 577      </para>
 578
 579      <para>
 580       Because of this behavior, it's unwise to give a function that takes
 581       a single composite-type argument the same name as any of the fields of
 582       that composite type.
 583      </para>
 584     </tip>
 585
 586     <para>
 587      Another way to use a function returning a composite type is to pass the
 588      result to another function that accepts the correct row type as input:
 589
 590 <screen>
 591 CREATE FUNCTION getname(emp) RETURNS text AS $$
 592     SELECT $1.name;
 593 $$ LANGUAGE SQL;
 594
 595 SELECT getname(new_emp());
 596  getname
 597 ---------
 598  None
 599 (1 row)
 600 </screen>
 601     </para>
 602
 603     <para>
 604      Still another way to use a function that returns a composite type is to
 605      call it as a table function, as described in <xref
 606      linkend="xfunc-sql-table-functions">.
 607     </para>
 608    </sect2>
 609
 610    <sect2 id="xfunc-output-parameters">
 611     <title><acronym>SQL</> Functions with Output Parameters</title>
 612
 613    <indexterm>
 614     <primary>function</primary>
 615     <secondary>output parameter</secondary>
 616    </indexterm>
 617
 618     <para>
 619      An alternative way of describing a function's results is to define it
 620      with <firstterm>output parameters</>, as in this example:
 621
 622 <screen>
 623 CREATE FUNCTION add_em (IN x int, IN y int, OUT sum int)
 624 AS 'SELECT x + y'
 625 LANGUAGE SQL;
 626
 627 SELECT add_em(3,7);
 628  add_em
 629 --------
 630      10
 631 (1 row)
 632 </screen>
 633
 634      This is not essentially different from the version of <literal>add_em</>
 635      shown in <xref linkend="xfunc-sql-base-functions">.  The real value of
 636      output parameters is that they provide a convenient way of defining
 637      functions that return several columns.  For example,
 638
 639 <screen>
CREATE FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int)
AS 'SELECT x + y, x * y'
LANGUAGE SQL;

SELECT * FROM sum_n_product(11,42);
 sum | product
-----+---------
  53 |     462
(1 row)
 649 </screen>
 650
 651      What has essentially happened here is that we have created an anonymous
 652      composite type for the result of the function.  The above example has
 653      the same end result as
 654
 655 <screen>
 656 CREATE TYPE sum_prod AS (sum int, product int);
 657
 658 CREATE FUNCTION sum_n_product (int, int) RETURNS sum_prod
 659 AS 'SELECT $1 + $2, $1 * $2'
 660 LANGUAGE SQL;
 661 </screen>
 662
 663      but not having to bother with the separate composite type definition
 664      is often handy.  Notice that the names attached to the output parameters
 665      are not just decoration, but determine the column names of the anonymous
 666      composite type.  (If you omit a name for an output parameter, the
 667      system will choose a name on its own.)
 668     </para>
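
    <para>
     Because the output parameter names become column names, a single
     output column can be picked out with the parenthesized field selection
     syntax shown earlier for <function>new_emp</function>.  A short sketch,
     using the <function>sum_n_product</function> function defined above:

<programlisting>
-- Select only the column named by the second output parameter.
SELECT (sum_n_product(11,42)).product;

-- Use the output parameter names like ordinary column names.
SELECT sum, product FROM sum_n_product(11,42) WHERE product &gt; sum;
</programlisting>
    </para>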
 669
 670     <para>
 671      Notice that output parameters are not included in the calling argument
 672      list when invoking such a function from SQL.  This is because
 673      <productname>PostgreSQL</productname> considers only the input
 674      parameters to define the function's calling signature.  That means
 675      also that only the input parameters matter when referencing the function
 676      for purposes such as dropping it.  We could drop the above function
 677      with either of
 678
 679 <screen>
 680 DROP FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int);
 681 DROP FUNCTION sum_n_product (int, int);
 682 </screen>
 683     </para>
 684
 685     <para>
 686      Parameters can be marked as <literal>IN</> (the default),
 687      <literal>OUT</>, <literal>INOUT</>, or <literal>VARIADIC</>.
 688      An <literal>INOUT</>
 689      parameter serves as both an input parameter (part of the calling
 690      argument list) and an output parameter (part of the result record type).
 691      <literal>VARIADIC</> parameters are input parameters, but are treated
 692      specially as described next.
 693     </para>
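
    <para>
     As a small sketch of an <literal>INOUT</> parameter, the hypothetical
     function <function>double_it</function> below takes <literal>x</> as
     input and also uses it to describe the result, so no
     <literal>RETURNS</> clause is written:

<programlisting>
-- Hypothetical example; x is both the argument and the result column.
CREATE FUNCTION double_it (INOUT x int)
AS 'SELECT x * 2'
LANGUAGE SQL;

SELECT double_it(21);
</programlisting>

     The call passes one argument, since an <literal>INOUT</> parameter is
     part of the calling argument list.
    </para>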
 694    </sect2>
 695
 696    <sect2 id="xfunc-sql-variadic-functions">
 697     <title><acronym>SQL</> Functions with Variable Numbers of Arguments</title>
 698
 699     <indexterm>
 700      <primary>function</primary>
 701      <secondary>variadic</secondary>
 702     </indexterm>
 703
 704     <indexterm>
 705      <primary>variadic function</primary>
 706     </indexterm>
 707
 708     <para>
 709      <acronym>SQL</acronym> functions can be declared to accept
 710      variable numbers of arguments, so long as all the <quote>optional</>
 711      arguments are of the same data type.  The optional arguments will be
 712      passed to the function as an array.  The function is declared by
 713      marking the last parameter as <literal>VARIADIC</>; this parameter
 714      must be declared as being of an array type.  For example:
 715
 716 <screen>
 717 CREATE FUNCTION mleast(VARIADIC arr numeric[]) RETURNS numeric AS $$
 718     SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
 719 $$ LANGUAGE SQL;
 720
 721 SELECT mleast(10, -1, 5, 4.4);
 722  mleast
 723 --------
 724      -1
 725 (1 row)
 726 </screen>
 727
 728      Effectively, all the actual arguments at or beyond the
 729      <literal>VARIADIC</> position are gathered up into a one-dimensional
 730      array, as if you had written
 731
 732 <screen>
 733 SELECT mleast(ARRAY[10, -1, 5, 4.4]);    -- doesn't work
 734 </screen>
 735
 736      You can't actually write that, though &mdash; or at least, it will
 737      not match this function definition.  A parameter marked
 738      <literal>VARIADIC</> matches one or more occurrences of its element
 739      type, not of its own type.
 740     </para>
 741
 742     <para>
 743      Sometimes it is useful to be able to pass an already-constructed array
 744      to a variadic function; this is particularly handy when one variadic
 745      function wants to pass on its array parameter to another one.  You can
 746      do that by specifying <literal>VARIADIC</> in the call:
 747
 748 <screen>
 749 SELECT mleast(VARIADIC ARRAY[10, -1, 5, 4.4]);
 750 </screen>
 751
 752      This prevents expansion of the function's variadic parameter into its
 753      element type, thereby allowing the array argument value to match
 754      normally.  <literal>VARIADIC</> can only be attached to the last
 755      actual argument of a function call.
 756     </para>
 757
 758     <para>
 759      Specifying <literal>VARIADIC</> in the call is also the only way to
 760      pass an empty array to a variadic function, for example:
 761
 762 <screen>
 763 SELECT mleast(VARIADIC ARRAY[]::numeric[]);
 764 </screen>
 765
 766      Simply writing <literal>SELECT mleast()</> does not work because a
 767      variadic parameter must match at least one actual argument.
 768      (You could define a second function also named <literal>mleast</>,
 769      with no parameters, if you wanted to allow such calls.)
 770     </para>
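
    <para>
     Such a companion function might look like the following sketch.
     Returning the null value is a choice made here for illustration; it
     matches what the variadic version yields for an empty array, since
     <function>min</function> over no rows is null.

<programlisting>
-- Hypothetical zero-argument overload of mleast.
CREATE FUNCTION mleast() RETURNS numeric AS $$
    SELECT NULL::numeric;
$$ LANGUAGE SQL;

SELECT mleast();
</programlisting>
    </para>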
 771
 772     <para>
 773      The array element parameters generated from a variadic parameter are
 774      treated as not having any names of their own.  This means it is not
 775      possible to call a variadic function using named arguments (<xref
 776      linkend="sql-syntax-calling-funcs">), except when you specify
 777      <literal>VARIADIC</>.  For example, this will work:
 778
 779 <screen>
 780 SELECT mleast(VARIADIC arr =&gt; ARRAY[10, -1, 5, 4.4]);
 781 </screen>
 782
 783      but not these:
 784
 785 <screen>
 786 SELECT mleast(arr =&gt; 10);
 787 SELECT mleast(arr =&gt; ARRAY[10, -1, 5, 4.4]);
 788 </screen>
 789     </para>
 790    </sect2>
 791
 792    <sect2 id="xfunc-sql-parameter-defaults">
 793     <title><acronym>SQL</> Functions with Default Values for Arguments</title>
 794
 795     <indexterm>
 796      <primary>function</primary>
 797      <secondary>default values for arguments</secondary>
 798     </indexterm>
 799
 800     <para>
     Functions can be declared with default values for some or all input
     arguments.  The default values are inserted whenever the function is
     called with too few actual arguments.  Since arguments
     can only be omitted from the end of the actual argument list, all
     parameters after a parameter with a default value have to have
     default values as well.  (Although the use of named argument notation
     could allow this restriction to be relaxed, it's still enforced so that
     positional argument notation works sensibly.)
 809     </para>
 810
 811     <para>
 812      For example:
 813 <screen>
 814 CREATE FUNCTION foo(a int, b int DEFAULT 2, c int DEFAULT 3)
 815 RETURNS int
 816 LANGUAGE SQL
 817 AS $$
 818     SELECT $1 + $2 + $3;
 819 $$;
 820
 821 SELECT foo(10, 20, 30);
 822  foo
 823 -----
 824   60
 825 (1 row)
 826
 827 SELECT foo(10, 20);
 828  foo
 829 -----
 830   33
 831 (1 row)
 832
 833 SELECT foo(10);
 834  foo
 835 -----
 836   15
 837 (1 row)
 838
 839 SELECT foo();  -- fails since there is no default for the first argument
 840 ERROR:  function foo() does not exist
 841 </screen>
 842      The <literal>=</literal> sign can also be used in place of the
 843      key word <literal>DEFAULT</literal>.
 844     </para>
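
    <para>
     A short sketch of the <literal>=</literal> spelling follows.  The
     function name <function>foo_eq</function> is hypothetical.  The last
     call also uses named argument notation (<xref
     linkend="sql-syntax-calling-funcs">) to supply <literal>c</> while
     leaving <literal>b</> at its default:

<programlisting>
-- Hypothetical example; same defaults as foo, written with = instead of DEFAULT.
CREATE FUNCTION foo_eq(a int, b int = 2, c int = 3)
RETURNS int
LANGUAGE SQL
AS $$
    SELECT a + b + c;
$$;

SELECT foo_eq(10);              -- b and c take their defaults
SELECT foo_eq(10, c =&gt; 30);    -- b takes its default, c is given by name
</programlisting>
    </para>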
 845    </sect2>
 846
 847    <sect2 id="xfunc-sql-table-functions">
 848     <title><acronym>SQL</acronym> Functions as Table Sources</title>
 849
 850     <para>
     All SQL functions can be used in the <literal>FROM</> clause of a query,
     but this is particularly useful for functions returning composite types.
     If the function is defined to return a base type, the table function
     produces a one-column table.  If the function is defined to return
     a composite type, the table function produces a column for each attribute
     of the composite type.
 857     </para>
 858
 859     <para>
 860      Here is an example:
 861
 862 <screen>
 863 CREATE TABLE foo (fooid int, foosubid int, fooname text);
 864 INSERT INTO foo VALUES (1, 1, 'Joe');
 865 INSERT INTO foo VALUES (1, 2, 'Ed');
 866 INSERT INTO foo VALUES (2, 1, 'Mary');
 867
 868 CREATE FUNCTION getfoo(int) RETURNS foo AS $$
 869     SELECT * FROM foo WHERE fooid = $1;
 870 $$ LANGUAGE SQL;
 871
 872 SELECT *, upper(fooname) FROM getfoo(1) AS t1;
 873
 874  fooid | foosubid | fooname | upper
 875 -------+----------+---------+-------
 876      1 |        1 | Joe     | JOE
 877 (1 row)
 878 </screen>
 879
 880      As the example shows, we can work with the columns of the function's
 881      result just the same as if they were columns of a regular table.
 882     </para>
 883
 884     <para>
 885      Note that we only got one row out of the function.  This is because
 886      we did not use <literal>SETOF</>.  That is described in the next section.
 887     </para>
 888    </sect2>
 889
 890    <sect2 id="xfunc-sql-functions-returning-set">
 891     <title><acronym>SQL</acronym> Functions Returning Sets</title>
 892
 893     <indexterm>
 894      <primary>function</primary>
 895      <secondary>with SETOF</secondary>
 896     </indexterm>
 897
 898     <para>
 899      When an SQL function is declared as returning <literal>SETOF
 900      <replaceable>sometype</></literal>, the function's final
 901      query is executed to completion, and each row it
 902      outputs is returned as an element of the result set.
 903     </para>
 904
 905     <para>
 906      This feature is normally used when calling the function in the <literal>FROM</>
 907      clause.  In this case each row returned by the function becomes
 908      a row of the table seen by the query.  For example, assume that
 909      table <literal>foo</> has the same contents as above, and we say:
 910
 911 <programlisting>
 912 CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS $$
 913     SELECT * FROM foo WHERE fooid = $1;
 914 $$ LANGUAGE SQL;
 915
 916 SELECT * FROM getfoo(1) AS t1;
 917 </programlisting>
 918
 919      Then we would get:
 920 <screen>
 921  fooid | foosubid | fooname
 922 -------+----------+---------
 923      1 |        1 | Joe
 924      1 |        2 | Ed
 925 (2 rows)
 926 </screen>
 927     </para>
 928
 929     <para>
 930      It is also possible to return multiple rows with the columns defined by
 931      output parameters, like this:
 932
 933 <programlisting>
 934 CREATE TABLE tab (y int, z int);
 935 INSERT INTO tab VALUES (1, 2), (3, 4), (5, 6), (7, 8);
 936
 937 CREATE FUNCTION sum_n_product_with_tab (x int, OUT sum int, OUT product int)
 938 RETURNS SETOF record
 939 AS $$
 940     SELECT $1 + tab.y, $1 * tab.y FROM tab;
 941 $$ LANGUAGE SQL;
 942
 943 SELECT * FROM sum_n_product_with_tab(10);
 944  sum | product
 945 -----+---------
 946   11 |      10
 947   13 |      30
 948   15 |      50
 949   17 |      70
 950 (4 rows)
 951 </programlisting>
 952
 953      The key point here is that you must write <literal>RETURNS SETOF record</>
 954      to indicate that the function returns multiple rows instead of just one.
 955      If there is only one output parameter, write that parameter's type
 956      instead of <type>record</>.
 957     </para>
 958
 959     <para>
 960      It is frequently useful to construct a query's result by invoking a
 961      set-returning function multiple times, with the parameters for each
 962      invocation coming from successive rows of a table or subquery.  The
 963      preferred way to do this is to use the <literal>LATERAL</> key word,
 964      which is described in <xref linkend="queries-lateral">.
 965      Here is an example using a set-returning function to enumerate
 966      elements of a tree structure:
 967
 968 <screen>
 969 SELECT * FROM nodes;
 970    name    | parent
 971 -----------+--------
 972  Top       |
 973  Child1    | Top
 974  Child2    | Top
 975  Child3    | Top
 976  SubChild1 | Child1
 977  SubChild2 | Child1
 978 (6 rows)
 979
 980 CREATE FUNCTION listchildren(text) RETURNS SETOF text AS $$
 981     SELECT name FROM nodes WHERE parent = $1
 982 $$ LANGUAGE SQL STABLE;
 983
 984 SELECT * FROM listchildren('Top');
 985  listchildren
 986 --------------
 987  Child1
 988  Child2
 989  Child3
 990 (3 rows)
 991
 992 SELECT name, child FROM nodes, LATERAL listchildren(name) AS child;
 993   name  |   child
 994 --------+-----------
 995  Top    | Child1
 996  Top    | Child2
 997  Top    | Child3
 998  Child1 | SubChild1
 999  Child1 | SubChild2
1000 (5 rows)
1001 </screen>
1002
1003      This example does not do anything that we couldn't have done with a
1004      simple join, but in more complex calculations the option to put
1005      some of the work into a function can be quite convenient.
1006     </para>
1007
1008     <para>
1009      Currently, functions returning sets can also be called in the select list
1010      of a query.  For each row that the query
1011      generates by itself, the function returning set is invoked, and an output
1012      row is generated for each element of the function's result set. Note,
1013      however, that this capability is deprecated and might be removed in future
1014      releases. The previous example could also be done with queries like
1015      these:
1016
1017 <screen>
1018 SELECT listchildren('Top');
1019  listchildren
1020 --------------
1021  Child1
1022  Child2
1023  Child3
1024 (3 rows)
1025
1026 SELECT name, listchildren(name) FROM nodes;
1027   name  | listchildren
1028 --------+--------------
1029  Top    | Child1
1030  Top    | Child2
1031  Top    | Child3
1032  Child1 | SubChild1
1033  Child1 | SubChild2
1034 (5 rows)
1035 </screen>
1036
1037      In the last <command>SELECT</command>,
1038      notice that no output row appears for <literal>Child2</>, <literal>Child3</>, etc.
1039      This happens because <function>listchildren</function> returns an empty set
1040      for those arguments, so no result rows are generated.  This is the same
1041      behavior as we got from an inner join to the function result when using
1042      the <literal>LATERAL</> syntax.
1043     </para>
1044
1045     <note>
1046      <para>
1047       If a function's last command is <command>INSERT</>, <command>UPDATE</>,
1048       or <command>DELETE</> with <literal>RETURNING</>, that command will
1049       always be executed to completion, even if the function is not declared
1050       with <literal>SETOF</> or the calling query does not fetch all the
1051       result rows.  Any extra rows produced by the <literal>RETURNING</>
1052       clause are silently dropped, but the commanded table modifications
1053       still happen (and are all completed before returning from the function).
1054      </para>
1055     </note>
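
         <para>
          The behavior described in the note above can be sketched as follows.
          This assumes the four-row table <literal>tab</literal> created earlier
          in this section; the function name <function>bump_z</function> is
          hypothetical:

     <programlisting>
     -- Declared without SETOF, so the caller sees at most one RETURNING row.
     CREATE FUNCTION bump_z() RETURNS int AS $$
         UPDATE tab SET z = z + 1 RETURNING y;
     $$ LANGUAGE SQL;

     SELECT bump_z();     -- returns the y value of just one updated row

     SELECT * FROM tab;   -- every row nevertheless shows the incremented z
     </programlisting>

          Which row's <literal>y</literal> value is returned should not be
          relied upon, since the order in which rows are updated is not
          specified.
         </para>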
1056
1057     <note>
1058      <para>
1059       The key problem with using set-returning functions in the select list,
1060       rather than the <literal>FROM</> clause, is that putting more than one
1061       set-returning function in the same select list does not behave very
1062       sensibly.  (What you actually get if you do so is a number of output
1063       rows equal to the least common multiple of the numbers of rows produced
1064       by each set-returning function.)  The <literal>LATERAL</> syntax
1065       produces less surprising results when calling multiple set-returning
1066       functions, and should usually be used instead.
1067      </para>
1068     </note>
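
         <para>
          As a rough illustration of the recommendation in the note above, two
          set-returning functions can be placed in the <literal>FROM</literal>
          clause, where they combine by the ordinary join rules.  The built-in
          <function>generate_series</function> function is used here only as a
          convenient set-returning function:

     <programlisting>
     -- One output row per combination (2 x 3 = 6 rows), as for any cross join:
     SELECT a, b
     FROM generate_series(1, 2) AS a,
          generate_series(1, 3) AS b;

     -- With LATERAL, the second function takes its argument from the first,
     -- giving one row for a = 1 and two rows for a = 2:
     SELECT a, b
     FROM generate_series(1, 2) AS a,
          LATERAL generate_series(1, a) AS b;
     </programlisting>
         </para>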
1069    </sect2>
1070
1071    <sect2 id="xfunc-sql-functions-returning-table">
1072     <title><acronym>SQL</acronym> Functions Returning <literal>TABLE</></title>
1073
1074     <indexterm>
1075      <primary>function</primary>
1076      <secondary>RETURNS TABLE</secondary>
1077     </indexterm>
1078
1079     <para>
1080      There is another way to declare a function as returning a set,
1081      which is to use the syntax
1082      <literal>RETURNS TABLE(<replaceable>columns</>)</literal>.
1083      This is equivalent to using one or more <literal>OUT</> parameters plus
1084      marking the function as returning <literal>SETOF record</> (or
1085      <literal>SETOF</> a single output parameter's type, as appropriate).
1086      This notation is specified in recent versions of the SQL standard, and
1087      thus may be more portable than using <literal>SETOF</>.
1088     </para>
1089
1090     <para>
1091      For example, the preceding sum-and-product example could also be
1092      done this way:
1093
1094 <programlisting>
1095 CREATE FUNCTION sum_n_product_with_tab (x int)
1096 RETURNS TABLE(sum int, product int) AS $$
1097     SELECT $1 + tab.y, $1 * tab.y FROM tab;
1098 $$ LANGUAGE SQL;
1099 </programlisting>
1100
1101      You cannot use explicit <literal>OUT</> or <literal>INOUT</>
1102      parameters with the <literal>RETURNS TABLE</> notation &mdash; you must
1103      put all the output columns in the <literal>TABLE</> list.
1104     </para>
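
         <para>
          The single-column case works the same way.  As a sketch, assuming the
          <literal>nodes</literal> table from the previous section, the
          following function (whose name and parameter name are invented here)
          behaves like one declared <literal>RETURNS SETOF text</literal> with a
          result column named <literal>child</literal>:

     <programlisting>
     CREATE FUNCTION listchildren_tbl(parent_name text)
     RETURNS TABLE(child text) AS $$
         SELECT name FROM nodes WHERE parent = $1;
     $$ LANGUAGE SQL;

     SELECT * FROM listchildren_tbl('Top');
     </programlisting>
         </para>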
1105    </sect2>
1106
1107    <sect2>
1108     <title>Polymorphic <acronym>SQL</acronym> Functions</title>
1109
1110     <para>
1111      <acronym>SQL</acronym> functions can be declared to accept and
1112      return the polymorphic types <type>anyelement</type>,
1113      <type>anyarray</type>, <type>anynonarray</type>,
1114      <type>anyenum</type>, and <type>anyrange</type>.  See <xref
1115      linkend="extend-types-polymorphic"> for a more detailed
1116      explanation of polymorphic functions. Here is a polymorphic
1117      function <function>make_array</function> that builds up an array
1118      from two arbitrary data type elements:
1119 <screen>
1120 CREATE FUNCTION make_array(anyelement, anyelement) RETURNS anyarray AS $$
1121     SELECT ARRAY[$1, $2];
1122 $$ LANGUAGE SQL;
1123
1124 SELECT make_array(1, 2) AS intarray, make_array('a'::text, 'b') AS textarray;
1125  intarray | textarray
1126 ----------+-----------
1127  {1,2}    | {a,b}
1128 (1 row)
1129 </screen>
1130     </para>
1131
1132     <para>
1133      Notice the use of the typecast <literal>'a'::text</literal>
1134      to specify that the argument is of type <type>text</type>. This is
1135      required if the argument is just a string literal, since otherwise
1136      it would be treated as type
1137      <type>unknown</type>, and array of <type>unknown</type> is not a valid
1138      type.
1139      Without the typecast, you will get errors like this:
1140 <screen>
1141 <computeroutput>
1142 ERROR:  could not determine polymorphic type because input has type "unknown"
1143 </computeroutput>
1144 </screen>
1145     </para>
1146
1147     <para>
1148      It is permitted to have polymorphic arguments with a fixed
1149      return type, but the converse is not. For example:
1150 <screen>
1151 CREATE FUNCTION is_greater(anyelement, anyelement) RETURNS boolean AS $$
1152     SELECT $1 &gt; $2;
1153 $$ LANGUAGE SQL;
1154
1155 SELECT is_greater(1, 2);
1156  is_greater
1157 ------------
1158  f
1159 (1 row)
1160
1161 CREATE FUNCTION invalid_func() RETURNS anyelement AS $$
1162     SELECT 1;
1163 $$ LANGUAGE SQL;
1164 ERROR:  cannot determine result data type
1165 DETAIL:  A function returning a polymorphic type must have at least one polymorphic argument.
1166 </screen>
1167     </para>
1168
1169     <para>
1170      Polymorphism can be used with functions that have output arguments.
1171      For example:
1172 <screen>
1173 CREATE FUNCTION dup (f1 anyelement, OUT f2 anyelement, OUT f3 anyarray)
1174 AS 'select $1, array[$1,$1]' LANGUAGE SQL;
1175
1176 SELECT * FROM dup(22);
1177  f2 |   f3
1178 ----+---------
1179  22 | {22,22}
1180 (1 row)
1181 </screen>
1182     </para>
1183
1184     <para>
1185      Polymorphism can also be used with variadic functions.
1186      For example:
1187 <screen>
1188 CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
1189     SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
1190 $$ LANGUAGE SQL;
1191
1192 SELECT anyleast(10, -1, 5, 4);
1193  anyleast
1194 ----------
1195        -1
1196 (1 row)
1197
1198 SELECT anyleast('abc'::text, 'def');
1199  anyleast
1200 ----------
1201  abc
1202 (1 row)
1203
1204 CREATE FUNCTION concat_values(text, VARIADIC anyarray) RETURNS text AS $$
1205     SELECT array_to_string($2, $1);
1206 $$ LANGUAGE SQL;
1207
1208 SELECT concat_values('|', 1, 4, 2);
1209  concat_values
1210 ---------------
1211  1|4|2
1212 (1 row)
1213 </screen>
1214     </para>
1215    </sect2>
1216
1217    <sect2>
1218     <title><acronym>SQL</acronym> Functions with Collations</title>
1219
1220     <indexterm>
1221      <primary>collation</>
1222      <secondary>in SQL functions</>
1223     </indexterm>
1224
1225     <para>
1226      When a SQL function has one or more parameters of collatable data types,
1227      a collation is identified for each function call depending on the
1228      collations assigned to the actual arguments, as described in <xref
1229      linkend="collation">.  If a collation is successfully identified
1230      (i.e., there are no conflicts of implicit collations among the arguments)
1231      then all the collatable parameters are treated as having that collation
1232      implicitly.  This will affect the behavior of collation-sensitive
1233      operations within the function.  For example, using the
1234      <function>anyleast</> function described above, the result of
1235 <programlisting>
1236 SELECT anyleast('abc'::text, 'ABC');
1237 </programlisting>
1238      will depend on the database's default collation.  In <literal>C</> locale
1239      the result will be <literal>ABC</>, but in many other locales it will
1240      be <literal>abc</>.  The collation to use can be forced by adding
1241      a <literal>COLLATE</> clause to any of the arguments, for example
1242 <programlisting>
1243 SELECT anyleast('abc'::text, 'ABC' COLLATE "C");
1244 </programlisting>
1245      Alternatively, if you wish a function to operate with a particular
1246      collation regardless of what it is called with, insert
1247      <literal>COLLATE</> clauses as needed in the function definition.
1248      This version of <function>anyleast</> would always use <literal>en_US</>
1249      locale to compare strings:
1250 <programlisting>
1251 CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
1252     SELECT min($1[i] COLLATE "en_US") FROM generate_subscripts($1, 1) g(i);
1253 $$ LANGUAGE SQL;
1254 </programlisting>
1255      But note that this will throw an error if applied to a non-collatable
1256      data type.
1257     </para>
1258
1259     <para>
1260      If no common collation can be identified among the actual arguments,
1261      then a SQL function treats its parameters as having their data types'
1262      default collation (which is usually the database's default collation,
1263      but could be different for parameters of domain types).
1264     </para>
1265
1266     <para>
1267      The behavior of collatable parameters can be thought of as a limited
1268      form of polymorphism, applicable only to textual data types.
1269     </para>
1270    </sect2>
1271   </sect1>
1272
1273   <sect1 id="xfunc-overload">
1274    <title>Function Overloading</title>
1275
1276    <indexterm zone="xfunc-overload">
1277     <primary>overloading</primary>
1278     <secondary>functions</secondary>
1279    </indexterm>
1280
1281    <para>
1282     More than one function can be defined with the same SQL name, so long
1283     as the arguments they take are different.  In other words,
1284     function names can be <firstterm>overloaded</firstterm>.  When a
1285     query is executed, the server will determine which function to
1286     call from the data types and the number of the provided arguments.
1287     Overloading can also be used to simulate functions with a variable
1288     number of arguments, up to a finite maximum number.
1289    </para>
1290
1291    <para>
1292     When creating a family of overloaded functions, one should be
1293     careful not to create ambiguities.  For instance, given the
1294     functions:
1295 <programlisting>
1296 CREATE FUNCTION test(int, real) RETURNS ...
1297 CREATE FUNCTION test(smallint, double precision) RETURNS ...
1298 </programlisting>
1299     it is not immediately clear which function would be called with
1300     some trivial input like <literal>test(1, 1.5)</literal>.  The
1301     currently implemented resolution rules are described in
1302     <xref linkend="typeconv">, but it is unwise to design a system that subtly
1303     relies on this behavior.
1304    </para>
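
        <para>
         One way to avoid depending on the resolution rules is to cast each
         argument so that it matches exactly one candidate.  The calls below
         are only a sketch and assume the two hypothetical
         <function>test</function> functions declared above:

     <programlisting>
     -- Exact match for test(int, real):
     SELECT test(1::int, 1.5::real);

     -- Exact match for test(smallint, double precision):
     SELECT test(1::smallint, 1.5::double precision);
     </programlisting>
        </para>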
1305
1306    <para>
1307     A function that takes a single argument of a composite type should
1308     generally not have the same name as any attribute (field) of that type.
1309     Recall that <literal><replaceable>attribute</>(<replaceable>table</>)</literal>
1310     is considered equivalent
1311     to <literal><replaceable>table</>.<replaceable>attribute</></literal>.
1312     In the case that there is an
1313     ambiguity between a function on a composite type and an attribute of
1314     the composite type, the attribute will always be used.  It is possible
1315     to override that choice by schema-qualifying the function name
1316     (that is, <literal><replaceable>schema</>.<replaceable>func</>(<replaceable>table</>)
1317     </literal>) but it's better to
1318     avoid the problem by not choosing conflicting names.
1319    </para>
1320
1321    <para>
1322     Another possible conflict is between variadic and non-variadic functions.
1323     For instance, it is possible to create both <literal>foo(numeric)</> and
1324     <literal>foo(VARIADIC numeric[])</>.  In this case it is unclear which one
1325     should be matched to a call providing a single numeric argument, such as
1326     <literal>foo(10.1)</>.  The rule is that the function appearing
1327     earlier in the search path is used, or if the two functions are in the
1328     same schema, the non-variadic one is preferred.
1329    </para>
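
        <para>
         The rule can be tried out with a sketch like the following.  The
         function bodies are placeholders that only report which variant was
         called, and the example assumes both functions are created in the
         same schema with no other function named <function>foo</function>
         present:

     <programlisting>
     CREATE FUNCTION foo(numeric) RETURNS text AS $$
         SELECT 'non-variadic'::text;
     $$ LANGUAGE SQL;

     CREATE FUNCTION foo(VARIADIC numeric[]) RETURNS text AS $$
         SELECT 'variadic'::text;
     $$ LANGUAGE SQL;

     SELECT foo(10.1);                   -- same schema: the non-variadic one is preferred
     SELECT foo(10.1, 20.2);             -- two arguments can only match the variadic one
     SELECT foo(VARIADIC ARRAY[10.1]);   -- an explicit array can only match the variadic one
     </programlisting>
        </para>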
1330
1331    <para>
1332     When overloading C-language functions, there is an additional
1333     constraint: The C name of each function in the family of
1334     overloaded functions must be different from the C names of all
1335     other functions, either internal or dynamically loaded.  If this
1336     rule is violated, the behavior is not portable.  You might get a
1337     run-time linker error, or one of the functions will get called
1338     (usually the internal one).  The alternative form of the
1339     <literal>AS</> clause for the SQL <command>CREATE
1340     FUNCTION</command> command decouples the SQL function name from
1341     the function name in the C source code.  For instance:
1342 <programlisting>
1343 CREATE FUNCTION test(int) RETURNS int
1344     AS '<replaceable>filename</>', 'test_1arg'
1345     LANGUAGE C;
1346 CREATE FUNCTION test(int, int) RETURNS int
1347     AS '<replaceable>filename</>', 'test_2arg'
1348     LANGUAGE C;
1349 </programlisting>
1350     The names of the C functions here reflect one of many possible conventions.
1351    </para>
1352   </sect1>
1353
1354   <sect1 id="xfunc-volatility">
1355    <title>Function Volatility Categories</title>
1356
1357    <indexterm zone="xfunc-volatility">
1358     <primary>volatility</primary>
1359     <secondary>functions</secondary>
1360    </indexterm>
1361    <indexterm zone="xfunc-volatility">
1362     <primary>VOLATILE</primary>
1363    </indexterm>
1364    <indexterm zone="xfunc-volatility">
1365     <primary>STABLE</primary>
1366    </indexterm>
1367    <indexterm zone="xfunc-volatility">
1368     <primary>IMMUTABLE</primary>
1369    </indexterm>
1370
1371    <para>
1372     Every function has a <firstterm>volatility</> classification, with
1373     the possibilities being <literal>VOLATILE</>, <literal>STABLE</>, or
1374     <literal>IMMUTABLE</>.  <literal>VOLATILE</> is the default if the
1375     <xref linkend="sql-createfunction">
1376     command does not specify a category.  The volatility category is a
1377     promise to the optimizer about the behavior of the function:
1378
1379    <itemizedlist>
1380     <listitem>
1381      <para>
1382       A <literal>VOLATILE</> function can do anything, including modifying
1383       the database.  It can return different results on successive calls with
1384       the same arguments.  The optimizer makes no assumptions about the
1385       behavior of such functions.  A query using a volatile function will
1386       re-evaluate the function at every row where its value is needed.
1387      </para>
1388     </listitem>
1389     <listitem>
1390      <para>
1391       A <literal>STABLE</> function cannot modify the database and is
1392       guaranteed to return the same results given the same arguments
1393       for all rows within a single statement. This category allows the
1394       optimizer to optimize multiple calls of the function to a single
1395       call. In particular, it is safe to use an expression containing
1396       such a function in an index scan condition. (Since an index scan
1397       will evaluate the comparison value only once, not once at each
1398       row, it is not valid to use a <literal>VOLATILE</> function in an
1399       index scan condition.)
1400      </para>
1401     </listitem>
1402     <listitem>
1403      <para>
1404       An <literal>IMMUTABLE</> function cannot modify the database and is
1405       guaranteed to return the same results given the same arguments forever.
1406       This category allows the optimizer to pre-evaluate the function when
1407       a query calls it with constant arguments.  For example, a query like
1408       <literal>SELECT ... WHERE x = 2 + 2</> can be simplified on sight to
1409       <literal>SELECT ... WHERE x = 4</>, because the function underlying
1410       the integer addition operator is marked <literal>IMMUTABLE</>.
1411      </para>
1412     </listitem>
1413    </itemizedlist>
1414    </para>
1415
1416    <para>
1417     For best optimization results, you should label your functions with the
1418     strictest volatility category that is valid for them.
1419    </para>
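
        <para>
         As a sketch of how the three labels might be applied, consider the
         following functions.  All names are hypothetical; the
         <literal>nodes</literal> table is the one used in earlier examples,
         and <literal>visits</literal> is created here just for illustration:

     <programlisting>
     -- Result depends only on the argument: IMMUTABLE.
     CREATE FUNCTION add_tax(numeric) RETURNS numeric AS $$
         SELECT $1 * 1.2;
     $$ LANGUAGE SQL IMMUTABLE;

     -- Reads a table but changes nothing: STABLE.
     CREATE FUNCTION count_children(text) RETURNS bigint AS $$
         SELECT count(*) FROM nodes WHERE parent = $1;
     $$ LANGUAGE SQL STABLE;

     -- Modifies the database: VOLATILE (also the default if nothing is written).
     CREATE TABLE visits (who text, at_time timestamptz);

     CREATE FUNCTION log_visit(text) RETURNS void AS $$
         INSERT INTO visits (who, at_time) VALUES ($1, now());
     $$ LANGUAGE SQL VOLATILE;
     </programlisting>
        </para>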
1420
1421    <para>
1422     Any function with side-effects <emphasis>must</> be labeled
1423     <literal>VOLATILE</>, so that calls to it cannot be optimized away.
1424     Even a function with no side-effects needs to be labeled
1425     <literal>VOLATILE</> if its value can change within a single query;
1426     some examples are <literal>random()</>, <literal>currval()</>,
1427     and <literal>timeofday()</>.
1428    </para>
1429
1430    <para>
1431     Another important example is that the <function>current_timestamp</>
1432     family of functions qualifies as <literal>STABLE</>, since their values do
1433     not change within a transaction.
1434    </para>
1435
1436    <para>
1437     There is relatively little difference between <literal>STABLE</> and
1438     <literal>IMMUTABLE</> categories when considering simple interactive
1439     queries that are planned and immediately executed: it doesn't matter
1440     a lot whether a function is executed once during planning or once during
1441     query execution startup.  But there is a big difference if the plan is
1442     saved and reused later.  Labeling a function <literal>IMMUTABLE</> when
1443     it really isn't might allow it to be prematurely folded to a constant during
1444     planning, resulting in a stale value being re-used during subsequent uses
1445     of the plan.  This is a hazard when using prepared statements or when
1446     using function languages that cache plans (such as
1447     <application>PL/pgSQL</>).
1448    </para>
1449
1450    <para>
1451     For functions written in SQL or in any of the standard procedural
1452     languages, there is a second important property determined by the
1453     volatility category, namely the visibility of any data changes that have
1454     been made by the SQL command that is calling the function.  A
1455     <literal>VOLATILE</> function will see such changes; a <literal>STABLE</>
1456     or <literal>IMMUTABLE</> function will not.  This behavior is implemented
1457     using the snapshotting behavior of MVCC (see <xref linkend="mvcc">):
1458     <literal>STABLE</> and <literal>IMMUTABLE</> functions use a snapshot
1459     established as of the start of the calling query, whereas
1460     <literal>VOLATILE</> functions obtain a fresh snapshot at the start of
1461     each query they execute.
1462    </para>
1463
1464    <note>
1465     <para>
1466      Functions written in C can manage snapshots however they want, but it's
1467      usually a good idea to make C functions work this way too.
1468     </para>
1469    </note>
1470
1471    <para>
1472     Because of this snapshotting behavior,
1473     a function containing only <command>SELECT</> commands can safely be
1474     marked <literal>STABLE</>, even if it selects from tables that might be
1475     undergoing modifications by concurrent queries.
1476     <productname>PostgreSQL</productname> will execute all commands of a
1477     <literal>STABLE</> function using the snapshot established for the
1478     calling query, and so it will see a fixed view of the database throughout
1479     that query.
1480    </para>
1481
1482    <para>
1483     The same snapshotting behavior is used for <command>SELECT</> commands
1484     within <literal>IMMUTABLE</> functions.  It is generally unwise to select
1485     from database tables within an <literal>IMMUTABLE</> function at all,
1486     since the immutability will be broken if the table contents ever change.
1487     However, <productname>PostgreSQL</productname> does not prevent you
1488     from doing so.
1489    </para>
1490
1491    <para>
1492     A common error is to label a function <literal>IMMUTABLE</> when its
1493     results depend on a configuration parameter.  For example, a function
1494     that manipulates timestamps might well have results that depend on the
1495     <xref linkend="guc-timezone"> setting.  For safety, such functions should
1496     be labeled <literal>STABLE</> instead.
1497    </para>
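
        <para>
         For instance, the hypothetical function below extracts the hour from
         a <type>timestamp with time zone</type> value.  Its result depends on
         the session's time zone setting, so this sketch labels it
         <literal>STABLE</literal> rather than <literal>IMMUTABLE</literal>:

     <programlisting>
     CREATE FUNCTION local_hour(timestamptz) RETURNS int AS $$
         SELECT extract(hour FROM $1)::int;
     $$ LANGUAGE SQL STABLE;

     SET timezone = 'UTC';
     SELECT local_hour('2000-01-01 12:00:00+00');   -- 12

     SET timezone = 'Asia/Tokyo';
     SELECT local_hour('2000-01-01 12:00:00+00');   -- 21, for the same instant
     </programlisting>
        </para>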
1498
1499    <note>
1500     <para>
1501      <productname>PostgreSQL</productname> requires that <literal>STABLE</>
1502      and <literal>IMMUTABLE</> functions contain no SQL commands other
1503      than <command>SELECT</> to prevent data modification.
1504      (This is not a completely bulletproof test, since such functions could
1505      still call <literal>VOLATILE</> functions that modify the database.
1506      If you do that, you will find that the <literal>STABLE</> or
1507      <literal>IMMUTABLE</> function does not notice the database changes
1508      applied by the called function, since they are hidden from its snapshot.)
1509     </para>
1510    </note>
1511   </sect1>
1512
1513   <sect1 id="xfunc-pl">
1514    <title>Procedural Language Functions</title>
1515
1516    <para>
1517     <productname>PostgreSQL</productname> allows user-defined functions
1518     to be written in other languages besides SQL and C.  These other
1519     languages are generically called <firstterm>procedural
1520     languages</firstterm> (<acronym>PL</>s).
1521     Procedural languages aren't built into the
1522     <productname>PostgreSQL</productname> server; they are offered
1523     by loadable modules.
1524     See <xref linkend="xplang"> and following chapters for more
1525     information.
1526    </para>
1527   </sect1>
1528
1529   <sect1 id="xfunc-internal">
1530    <title>Internal Functions</title>
1531
1532    <indexterm zone="xfunc-internal"><primary>function</><secondary>internal</></>
1533
1534    <para>
1535     Internal functions are functions written in C that have been statically
1536     linked into the <productname>PostgreSQL</productname> server.
1537     The <quote>body</quote> of the function definition
1538     specifies the C-language name of the function, which need not be the
1539     same as the name being declared for SQL use.
1540     (For reasons of backward compatibility, an empty body
1541     is accepted as meaning that the C-language function name is the
1542     same as the SQL name.)
1543    </para>
1544
1545    <para>
1546     Normally, all internal functions present in the
1547     server are declared during the initialization of the database cluster
1548     (see <xref linkend="creating-cluster">),
1549     but a user could use <command>CREATE FUNCTION</command>
1550     to create additional alias names for an internal function.
1551     Internal functions are declared in <command>CREATE FUNCTION</command>
1552     with language name <literal>internal</literal>.  For instance, to
1553     create an alias for the <function>sqrt</function> function:
1554 <programlisting>
1555 CREATE FUNCTION square_root(double precision) RETURNS double precision
1556     AS 'dsqrt'
1557     LANGUAGE internal
1558     STRICT;
1559 </programlisting>
1560     (Most internal functions expect to be declared <quote>strict</quote>.)
1561    </para>
1562
1563    <note>
1564     <para>
1565      Not all <quote>predefined</quote> functions are
1566      <quote>internal</quote> in the above sense.  Some predefined
1567      functions are written in SQL.
1568     </para>
1569    </note>
1570   </sect1>
1571
1572   <sect1 id="xfunc-c">
1573    <title>C-Language Functions</title>
1574
1575    <indexterm zone="xfunc-c">
1576     <primary>function</primary>
1577     <secondary>user-defined</secondary>
1578     <tertiary>in C</tertiary>
1579    </indexterm>
1580
1581    <para>
1582     User-defined functions can be written in C (or a language that can
1583     be made compatible with C, such as C++).  Such functions are
1584     compiled into dynamically loadable objects (also called shared
1585     libraries) and are loaded by the server on demand.  The dynamic
1586     loading feature is what distinguishes <quote>C language</> functions
1587     from <quote>internal</> functions &mdash; the actual coding conventions
1588     are essentially the same for both.  (Hence, the standard internal
1589     function library is a rich source of coding examples for user-defined
1590     C functions.)
1591    </para>
1592
1593    <para>
1594     Two different calling conventions are currently used for C functions.
1595     The newer <quote>version 1</quote> calling convention is indicated by writing
1596     a <literal>PG_FUNCTION_INFO_V1()</literal> macro call for the function,
1597     as illustrated below.  Lack of such a macro indicates an old-style
1598     (<quote>version 0</quote>) function.  The language name specified in <command>CREATE FUNCTION</command>
1599     is <literal>C</literal> in either case.  Old-style functions are now deprecated
1600     because of portability problems and lack of functionality, but they
1601     are still supported for compatibility reasons.
1602    </para>
1603
1604   <sect2 id="xfunc-c-dynload">
1605    <title>Dynamic Loading</title>
1606
1607    <indexterm zone="xfunc-c-dynload">
1608     <primary>dynamic loading</primary>
1609    </indexterm>
1610
1611    <para>
1612     The first time a user-defined function in a particular
1613     loadable object file is called in a session,
1614     the dynamic loader loads that object file into memory so that the
1615     function can be called.  The <command>CREATE FUNCTION</command>
1616     for a user-defined C function must therefore specify two pieces of
1617     information for the function: the name of the loadable
1618     object file, and the C name (link symbol) of the specific function to call
1619     within that object file.  If the C name is not explicitly specified then
1620     it is assumed to be the same as the SQL function name.
1621    </para>
1622
1623    <para>
1624     The following algorithm is used to locate the shared object file
1625     based on the name given in the <command>CREATE FUNCTION</command>
1626     command:
1627
1628     <orderedlist>
1629      <listitem>
1630       <para>
1631        If the name is an absolute path, the given file is loaded.
1632       </para>
1633      </listitem>
1634
1635      <listitem>
1636       <para>
1637        If the name starts with the string <literal>$libdir</literal>,
1638        that part is replaced by the <productname>PostgreSQL</> package
1639         library directory
1640        name, which is determined at build time.<indexterm><primary>$libdir</></>
1641       </para>
1642      </listitem>
1643
1644      <listitem>
1645       <para>
1646        If the name does not contain a directory part, the file is
1647        searched for in the path specified by the configuration variable
1648        <xref linkend="guc-dynamic-library-path">.<indexterm><primary>dynamic_library_path</></>
1649       </para>
1650      </listitem>
1651
1652      <listitem>
1653       <para>
1654        Otherwise (the file was not found in the path, or it contains a
1655        non-absolute directory part), the dynamic loader will try to
1656        take the name as given, which will most likely fail.  (It is
1657        unreliable to depend on the current working directory.)
1658       </para>
1659      </listitem>
1660     </orderedlist>
1661
1662     If this sequence does not work, the platform-specific shared
1663     library file name extension (often <filename>.so</filename>) is
1664     appended to the given name and this sequence is tried again.  If
1665     that fails as well, the load will fail.
1666    </para>
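
   <para>
    The lookup order above can be illustrated with a small standalone
    sketch.  This is not the server's loader code: the names
    <function>classify_library_name</function>,
    <function>expand_libdir</function> and <type>LoadRule</type> are
    hypothetical, Unix-style <literal>/</literal> separators are assumed,
    and the retry with the shared-library extension is left out.
   </para>

<programlisting><![CDATA[
```c
#include <stdio.h>
#include <string.h>

/* Which of the four documented rules applies to a given library name. */
typedef enum
{
    LOAD_ABSOLUTE,      /* rule 1: absolute path, load the file as given */
    LOAD_LIBDIR,        /* rule 2: name starts with "$libdir" */
    LOAD_SEARCH_PATH,   /* rule 3: no directory part, use the library path */
    LOAD_AS_GIVEN       /* rule 4: relative directory part, taken as given */
} LoadRule;

/* Hypothetical helper: checks the rules in the documented order. */
static LoadRule
classify_library_name(const char *name)
{
    if (name[0] == '/')
        return LOAD_ABSOLUTE;
    if (strncmp(name, "$libdir", 7) == 0)
        return LOAD_LIBDIR;
    if (strchr(name, '/') == NULL)
        return LOAD_SEARCH_PATH;
    return LOAD_AS_GIVEN;
}

/*
 * Hypothetical helper: replace the leading "$libdir" (7 characters) with
 * the package library directory.  Only call this for LOAD_LIBDIR names.
 */
static void
expand_libdir(const char *name, const char *pkglibdir,
              char *out, size_t outlen)
{
    snprintf(out, outlen, "%s%s", pkglibdir, name + 7);
}
```
]]></programlisting>

   <para>
    With these helpers, a name such as <literal>funcs</literal> falls
    under the search-path rule, while <literal>sub/funcs</literal> is
    taken as given, which is the case the text warns will most likely fail.
   </para>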
1667
1668    <para>
1669     It is recommended to locate shared libraries either relative to
1670     <literal>$libdir</literal> or through the dynamic library path.
1671     This simplifies version upgrades if the new installation is at a
1672     different location.  The actual directory that
1673     <literal>$libdir</literal> stands for can be found out with the
1674     command <literal>pg_config --pkglibdir</literal>.
1675    </para>
1676
1677    <para>
1678     The user ID the <productname>PostgreSQL</productname> server runs
1679     as must be able to traverse the path to the file you intend to
1680     load.  Making the file or a higher-level directory not readable
1681     and/or not executable by the <systemitem>postgres</systemitem>
1682     user is a common mistake.
1683    </para>
1684
1685    <para>
1686     In any case, the file name that is given in the
1687     <command>CREATE FUNCTION</command> command is recorded literally
1688     in the system catalogs, so if the file needs to be loaded again
1689     the same procedure is applied.
1690    </para>
1691
1692    <note>
1693     <para>
1694      <productname>PostgreSQL</productname> will not compile a C function
1695      automatically.  The object file must be compiled before it is referenced
1696      in a <command>CREATE
1697      FUNCTION</> command.  See <xref linkend="dfunc"> for additional
1698      information.
1699     </para>
1700    </note>
1701
1702    <indexterm zone="xfunc-c-dynload">
1703     <primary>magic block</primary>
1704    </indexterm>
1705
1706    <para>
1707     To ensure that a dynamically loaded object file is not loaded into an
1708     incompatible server, <productname>PostgreSQL</productname> checks that the
1709     file contains a <quote>magic block</> with the appropriate contents.
1710     This allows the server to detect obvious incompatibilities, such as code
1711     compiled for a different major version of
1712     <productname>PostgreSQL</productname>.  A magic block is required as of
1713     <productname>PostgreSQL</productname> 8.2.  To include a magic block,
1714     write this in one (and only one) of the module source files, after having
1715     included the header <filename>fmgr.h</>:
1716
1717 <programlisting>
1718 #ifdef PG_MODULE_MAGIC
1719 PG_MODULE_MAGIC;
1720 #endif
1721 </programlisting>
1722
1723     The <literal>#ifdef</> test can be omitted if the code doesn't
1724     need to compile against pre-8.2 <productname>PostgreSQL</productname>
1725     releases.
1726    </para>
1727
1728    <para>
1729     After it is used for the first time, a dynamically loaded object
1730     file is retained in memory.  Future calls in the same session to
1731     the function(s) in that file will only incur the small overhead of
1732     a symbol table lookup.  If you need to force a reload of an object
1733     file, for example after recompiling it, begin a fresh session.
1734    </para>
1735
1736    <indexterm zone="xfunc-c-dynload">
1737     <primary>_PG_init</primary>
1738    </indexterm>
1739    <indexterm zone="xfunc-c-dynload">
1740     <primary>_PG_fini</primary>
1741    </indexterm>
1742    <indexterm zone="xfunc-c-dynload">
1743     <primary>library initialization function</primary>
1744    </indexterm>
1745    <indexterm zone="xfunc-c-dynload">
1746     <primary>library finalization function</primary>
1747    </indexterm>
1748
1749    <para>
1750     Optionally, a dynamically loaded file can contain initialization and
1751     finalization functions.  If the file includes a function named
1752     <function>_PG_init</>, that function will be called immediately after
1753     loading the file.  The function receives no parameters and should
1754     return void.  If the file includes a function named
1755     <function>_PG_fini</>, that function will be called immediately before
1756     unloading the file.  Likewise, the function receives no parameters and
1757     should return void.  Note that <function>_PG_fini</> will only be called
1758     during an unload of the file, not during process termination.
1759     (Presently, unloads are disabled and will never occur, but this may
1760     change in the future.)
1761    </para>
1762
1763   </sect2>
1764
1765    <sect2 id="xfunc-c-basetype">
1766     <title>Base Types in C-Language Functions</title>
1767
1768     <indexterm zone="xfunc-c-basetype">
1769      <primary>data type</primary>
1770      <secondary>internal organization</secondary>
1771     </indexterm>
1772
1773     <para>
1774      To know how to write C-language functions, you need to know how
1775      <productname>PostgreSQL</productname> internally represents base
1776      data types and how they can be passed to and from functions.
1777      Internally, <productname>PostgreSQL</productname> regards a base
1778      type as a <quote>blob of memory</quote>.  The user-defined
1779      functions that you define over a type in turn define the way that
1780      <productname>PostgreSQL</productname> can operate on it.  That
1781      is, <productname>PostgreSQL</productname> will only store and
1782      retrieve the data from disk and use your user-defined functions
1783      to input, process, and output the data.
1784     </para>
1785
1786     <para>
1787      Base types can have one of three internal formats:
1788
1789      <itemizedlist>
1790       <listitem>
1791        <para>
1792         pass by value, fixed-length
1793        </para>
1794       </listitem>
1795       <listitem>
1796        <para>
1797         pass by reference, fixed-length
1798        </para>
1799       </listitem>
1800       <listitem>
1801        <para>
1802         pass by reference, variable-length
1803        </para>
1804       </listitem>
1805      </itemizedlist>
1806     </para>
1807
1808     <para>
1809      By-value types can only be 1, 2, or 4 bytes in length
1810      (also 8 bytes, if <literal>sizeof(Datum)</literal> is 8 on your machine).
1811      You should be careful to define your types such that they will be the
1812      same size (in bytes) on all architectures.  For example, the
1813      <literal>long</literal> type is dangerous because it is 4 bytes on some
1814      machines and 8 bytes on others, whereas the <type>int</type> type is 4 bytes
1815      on most Unix machines.  A reasonable implementation of the
1816      <type>int4</type> type on Unix machines might be:
1817
1818 <programlisting>
1819 /* 4-byte integer, passed by value */
1820 typedef int int4;
1821 </programlisting>
1822
1823      (The actual PostgreSQL C code calls this type <type>int32</type>, because
1824      it is a convention in C that <type>int<replaceable>XX</replaceable></type>
1825      means <replaceable>XX</replaceable> <emphasis>bits</emphasis>.  Note
1826      therefore also that the C type <type>int8</type> is 1 byte in size.  The
1827      SQL type <type>int8</type> is called <type>int64</type> in C.  See also
1828      <xref linkend="xfunc-c-type-table">.)
1829     </para>
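
    <para>
     As a rough illustration of the size rule, the following standalone
     sketch checks whether a fixed length qualifies for pass-by-value.
     The names <type>ToyDatum</type>, <type>toy_int4</type> and
     <function>can_pass_by_value</function> are hypothetical, and
     <type>ToyDatum</type> is only assumed to be pointer-sized, standing in
     for the server's <type>Datum</type>.
    </para>

<programlisting><![CDATA[
```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Datum; assumed here to be as wide as a pointer. */
typedef uintptr_t ToyDatum;

/* An exact-width 4-byte integer, unlike "long", whose size varies. */
typedef int32_t toy_int4;

/* Fail the build if the type is not the same size everywhere. */
_Static_assert(sizeof(toy_int4) == 4, "toy_int4 must be exactly 4 bytes");

/* Lengths of 1, 2, or 4 bytes qualify; 8 only if the Datum is 8 bytes. */
static bool
can_pass_by_value(size_t typlen)
{
    if (typlen == 1 || typlen == 2 || typlen == 4)
        return true;
    return typlen == 8 && sizeof(ToyDatum) == 8;
}
```
]]></programlisting>

    <para>
     Using an exact-width type together with a compile-time size check is
     one way to make an unexpected size show up as a build error rather
     than as a run-time surprise.
    </para>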
1830
1831     <para>
1832      On the other hand, fixed-length types of any size can
1833      be passed by-reference.  For example, here is a sample
1834      implementation of a <productname>PostgreSQL</productname> type:
1835
1836 <programlisting>
1837 /* 16-byte structure, passed by reference */
1838 typedef struct
1839 {
1840     double  x, y;
1841 } Point;
1842 </programlisting>
1843
1844      Only pointers to such types can be used when passing
1845      them in and out of <productname>PostgreSQL</productname> functions.
1846      To return a value of such a type, allocate the right amount of
1847      memory with <literal>palloc</literal>, fill in the allocated memory,
1848      and return a pointer to it.  (Also, if you just want to return the
1849      same value as one of your input arguments that's of the same data type,
1850      you can skip the extra <literal>palloc</literal> and just return the
1851      pointer to the input value.)
1852     </para>
1853
1854     <para>
1855      Finally, all variable-length types must also be passed
1856      by reference.  All variable-length types must begin
1857      with an opaque length field of exactly 4 bytes, which will be set
1858      by <symbol>SET_VARSIZE</symbol>; never set this field directly!  All data to
1859      be stored within that type must be located in the memory
1860      immediately following that length field.  The
1861      length field contains the total length of the structure,
1862      that is, it includes the size of the length field
1863      itself.
1864     </para>
1865
1866     <para>
1867      Another important point is to avoid leaving any uninitialized bits
1868      within data type values; for example, take care to zero out any
1869      alignment padding bytes that might be present in structs.  Without
1870      this, logically-equivalent constants of your data type might be
1871      seen as unequal by the planner, leading to inefficient (though not
1872      incorrect) plans.
1873     </para>
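
    <para>
     The padding concern can be shown with a standalone sketch.  The struct
     <type>ToyValue</type> and both helper functions are hypothetical, and
     <function>calloc</function> stands in for a zero-filling allocation
     such as <function>palloc0</function> inside the server.  The sketch
     relies on the common compiler behavior that storing into a member
     leaves the padding bytes untouched.
    </para>

<programlisting><![CDATA[
```c
#include <stdlib.h>
#include <string.h>

typedef struct
{
    char    tag;    /* usually followed by alignment padding */
    double  value;
} ToyValue;

/* Allocate zero-filled memory first, then fill in the fields. */
static ToyValue *
make_toy_value(char tag, double value)
{
    ToyValue *v = calloc(1, sizeof(ToyValue));

    if (v == NULL)
        return NULL;
    v->tag = tag;
    v->value = value;
    return v;
}

/* Byte-wise comparison over the whole struct, padding included. */
static int
toy_values_bitwise_equal(const ToyValue *a, const ToyValue *b)
{
    return memcmp(a, b, sizeof(ToyValue)) == 0;
}
```
]]></programlisting>

    <para>
     Had the memory not been zeroed, two values with identical fields could
     still differ in their padding bytes and fail such a byte-wise
     comparison.
    </para>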
1874
1875     <warning>
1876      <para>
1877       <emphasis>Never</> modify the contents of a pass-by-reference input
1878       value.  If you do so you are likely to corrupt on-disk data, since
1879       the pointer you are given might point directly into a disk buffer.
1880       The sole exception to this rule is explained in
1881       <xref linkend="xaggr">.
1882      </para>
1883     </warning>
1884
1885     <para>
1886      As an example, we can define the type <type>text</type> as
1887      follows:
1888
1889 <programlisting>
1890 typedef struct {
1891     int32 length;
1892     char data[FLEXIBLE_ARRAY_MEMBER];
1893 } text;
1894 </programlisting>
1895
1896      The <literal>[FLEXIBLE_ARRAY_MEMBER]</> notation means that the actual
1897      length of the data part is not specified by this declaration.
1898     </para>
1899
1900     <para>
1901      When manipulating
1902      variable-length types, we must be careful to allocate
1903      the correct amount of memory and set the length field correctly.
1904      For example, if we wanted to store 40 bytes in a <structname>text</>
1905      structure, we might use a code fragment like this:
1906
1907 <programlisting><![CDATA[
1908 #include "postgres.h"
1909 ...
1910 char buffer[40]; /* our source data */
1911 ...
1912 text *destination = (text *) palloc(VARHDRSZ + 40);
1913 SET_VARSIZE(destination, VARHDRSZ + 40);
1914 memcpy(destination->data, buffer, 40);
1915 ...
1916 ]]>
1917 </programlisting>
1918
1919      <literal>VARHDRSZ</> is the same as <literal>sizeof(int32)</>, but
1920      it's considered good style to use the macro <literal>VARHDRSZ</>
1921      to refer to the size of the overhead for a variable-length type.
1922      Also, the length field <emphasis>must</> be set using the
1923      <literal>SET_VARSIZE</> macro, not by simple assignment.
1924     </para>
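
    <para>
     The size arithmetic can be tried outside the server with a toy model.
     Everything named <literal>Toy</literal> or <literal>toy_</literal>
     below is hypothetical.  The model stores the length as a plain integer
     only to make the arithmetic visible; real code must treat the header
     as opaque and go through <literal>SET_VARSIZE</literal> and
     <literal>VARSIZE</literal>.
    </para>

<programlisting><![CDATA[
```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Size of the toy header, playing the role of VARHDRSZ. */
#define TOY_VARHDRSZ (sizeof(int32_t))

typedef struct
{
    int32_t total_len;  /* header size plus payload size, in bytes */
    char    data[];     /* payload follows the header immediately */
} ToyVarlena;

/* Allocate header + payload and record the TOTAL length. */
static ToyVarlena *
toy_make(const char *src, size_t payload_len)
{
    size_t      total = TOY_VARHDRSZ + payload_len;
    ToyVarlena *v = malloc(total);

    if (v == NULL)
        return NULL;
    v->total_len = (int32_t) total;
    memcpy(v->data, src, payload_len);
    return v;
}

/* Payload length is the total length minus one header. */
static size_t
toy_payload_len(const ToyVarlena *v)
{
    return (size_t) v->total_len - TOY_VARHDRSZ;
}

/* A concatenation needs both totals minus one of the two headers. */
static size_t
toy_concat_size(const ToyVarlena *a, const ToyVarlena *b)
{
    return (size_t) a->total_len + (size_t) b->total_len - TOY_VARHDRSZ;
}
```
]]></programlisting>

    <para>
     The subtraction in <function>toy_concat_size</function> reflects that
     each input counts its own header, while the result needs only one.
    </para>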
1925
1926     <para>
1927      <xref linkend="xfunc-c-type-table"> specifies which C type
1928      corresponds to which SQL type when writing a C-language function
1929      that uses a built-in type of <productname>PostgreSQL</>.
1930      The <quote>Defined In</quote> column gives the header file that
1931      needs to be included to get the type definition.  (The actual
1932      definition might be in a different file that is included by the
1933      listed file.  It is recommended that users stick to the defined
1934      interface.)  Note that you should always include
1935      <filename>postgres.h</filename> first in any source file, because
1936      it declares a number of things that you will need anyway.
1937     </para>
1938
1939      <table tocentry="1" id="xfunc-c-type-table">
1940       <title>Equivalent C Types for Built-in SQL Types</title>
1941       <tgroup cols="3">
1942        <thead>
1943         <row>
1944          <entry>
1945           SQL Type
1946          </entry>
1947          <entry>
1948           C Type
1949          </entry>
1950          <entry>
1951           Defined In
1952          </entry>
1953         </row>
1954        </thead>
1955        <tbody>
1956         <row>
1957          <entry><type>abstime</type></entry>
1958          <entry><type>AbsoluteTime</type></entry>
1959          <entry><filename>utils/nabstime.h</filename></entry>
1960         </row>
1961         <row>
1962          <entry><type>bigint</type> (<type>int8</type>)</entry>
1963          <entry><type>int64</type></entry>
1964          <entry><filename>postgres.h</filename></entry>
1965         </row>
1966         <row>
1967          <entry><type>boolean</type></entry>
1968          <entry><type>bool</type></entry>
1969          <entry><filename>postgres.h</filename> (maybe compiler built-in)</entry>
1970         </row>
1971         <row>
1972          <entry><type>box</type></entry>
1973          <entry><type>BOX*</type></entry>
1974          <entry><filename>utils/geo_decls.h</filename></entry>
1975         </row>
1976         <row>
1977          <entry><type>bytea</type></entry>
1978          <entry><type>bytea*</type></entry>
1979          <entry><filename>postgres.h</filename></entry>
1980         </row>
1981         <row>
1982          <entry><type>"char"</type></entry>
1983          <entry><type>char</type></entry>
1984          <entry>(compiler built-in)</entry>
1985         </row>
1986         <row>
1987          <entry><type>character</type></entry>
1988          <entry><type>BpChar*</type></entry>
1989          <entry><filename>postgres.h</filename></entry>
1990         </row>
1991         <row>
1992          <entry><type>cid</type></entry>
1993          <entry><type>CommandId</type></entry>
1994          <entry><filename>postgres.h</filename></entry>
1995         </row>
1996         <row>
1997          <entry><type>date</type></entry>
1998          <entry><type>DateADT</type></entry>
1999          <entry><filename>utils/date.h</filename></entry>
2000         </row>
2001         <row>
2002          <entry><type>smallint</type> (<type>int2</type>)</entry>
2003          <entry><type>int16</type></entry>
2004          <entry><filename>postgres.h</filename></entry>
2005         </row>
2006         <row>
2007          <entry><type>int2vector</type></entry>
2008          <entry><type>int2vector*</type></entry>
2009          <entry><filename>postgres.h</filename></entry>
2010         </row>
2011         <row>
2012          <entry><type>integer</type> (<type>int4</type>)</entry>
2013          <entry><type>int32</type></entry>
2014          <entry><filename>postgres.h</filename></entry>
2015         </row>
2016         <row>
2017          <entry><type>real</type> (<type>float4</type>)</entry>
2018          <entry><type>float4*</type></entry>
2019          <entry><filename>postgres.h</filename></entry>
2020         </row>
2021         <row>
2022          <entry><type>double precision</type> (<type>float8</type>)</entry>
2023          <entry><type>float8*</type></entry>
2024          <entry><filename>postgres.h</filename></entry>
2025         </row>
2026         <row>
2027          <entry><type>interval</type></entry>
2028          <entry><type>Interval*</type></entry>
2029          <entry><filename>datatype/timestamp.h</filename></entry>
2030         </row>
2031         <row>
2032          <entry><type>lseg</type></entry>
2033          <entry><type>LSEG*</type></entry>
2034          <entry><filename>utils/geo_decls.h</filename></entry>
2035         </row>
2036         <row>
2037          <entry><type>name</type></entry>
2038          <entry><type>Name</type></entry>
2039          <entry><filename>postgres.h</filename></entry>
2040         </row>
2041         <row>
2042          <entry><type>oid</type></entry>
2043          <entry><type>Oid</type></entry>
2044          <entry><filename>postgres.h</filename></entry>
2045         </row>
2046         <row>
2047          <entry><type>oidvector</type></entry>
2048          <entry><type>oidvector*</type></entry>
2049          <entry><filename>postgres.h</filename></entry>
2050         </row>
2051         <row>
2052          <entry><type>path</type></entry>
2053          <entry><type>PATH*</type></entry>
2054          <entry><filename>utils/geo_decls.h</filename></entry>
2055         </row>
2056         <row>
2057          <entry><type>point</type></entry>
2058          <entry><type>POINT*</type></entry>
2059          <entry><filename>utils/geo_decls.h</filename></entry>
2060         </row>
2061         <row>
2062          <entry><type>regproc</type></entry>
2063          <entry><type>regproc</type></entry>
2064          <entry><filename>postgres.h</filename></entry>
2065         </row>
2066         <row>
2067          <entry><type>reltime</type></entry>
2068          <entry><type>RelativeTime</type></entry>
2069          <entry><filename>utils/nabstime.h</filename></entry>
2070         </row>
2071         <row>
2072          <entry><type>text</type></entry>
2073          <entry><type>text*</type></entry>
2074          <entry><filename>postgres.h</filename></entry>
2075         </row>
2076         <row>
2077          <entry><type>tid</type></entry>
2078          <entry><type>ItemPointer</type></entry>
2079          <entry><filename>storage/itemptr.h</filename></entry>
2080         </row>
2081         <row>
2082          <entry><type>time</type></entry>
2083          <entry><type>TimeADT</type></entry>
2084          <entry><filename>utils/date.h</filename></entry>
2085         </row>
2086         <row>
2087          <entry><type>time with time zone</type></entry>
2088          <entry><type>TimeTzADT</type></entry>
2089          <entry><filename>utils/date.h</filename></entry>
2090         </row>
2091         <row>
2092          <entry><type>timestamp</type></entry>
2093          <entry><type>Timestamp*</type></entry>
2094          <entry><filename>datatype/timestamp.h</filename></entry>
2095         </row>
2096         <row>
2097          <entry><type>tinterval</type></entry>
2098          <entry><type>TimeInterval</type></entry>
2099          <entry><filename>utils/nabstime.h</filename></entry>
2100         </row>
2101         <row>
2102          <entry><type>varchar</type></entry>
2103          <entry><type>VarChar*</type></entry>
2104          <entry><filename>postgres.h</filename></entry>
2105         </row>
2106         <row>
2107          <entry><type>xid</type></entry>
2108          <entry><type>TransactionId</type></entry>
2109          <entry><filename>postgres.h</filename></entry>
2110         </row>
2111        </tbody>
2112       </tgroup>
2113      </table>
2114
2115     <para>
2116      Now that we've gone over all of the possible structures
2117      for base types, we can show some examples of real functions.
2118     </para>
2119    </sect2>
2120
2121    <sect2>
2122     <title>Version 0 Calling Conventions</title>
2123
2124     <para>
2125      We present the <quote>old style</quote> calling convention first &mdash; although
2126      this approach is now deprecated, it's easier to get a handle on
2127      initially.  In the version-0 method, the arguments and result
2128      of the C function are just declared in normal C style, taking
2129      care to use the C representation of each SQL data type as shown
2130      above.
2131     </para>
2132
2133     <para>
2134      Here are some examples:
2135
2136 <programlisting><![CDATA[
2137 #include "postgres.h"
2138 #include <string.h>
2139 #include "utils/geo_decls.h"
2140
2141 #ifdef PG_MODULE_MAGIC
2142 PG_MODULE_MAGIC;
2143 #endif
2144
2145 /* by value */
2146
2147 int
2148 add_one(int arg)
2149 {
2150     return arg + 1;
2151 }
2152
2153 /* by reference, fixed length */
2154
2155 float8 *
2156 add_one_float8(float8 *arg)
2157 {
2158     float8    *result = (float8 *) palloc(sizeof(float8));
2159
2160     *result = *arg + 1.0;
2161
2162     return result;
2163 }
2164
2165 Point *
2166 makepoint(Point *pointx, Point *pointy)
2167 {
2168     Point     *new_point = (Point *) palloc(sizeof(Point));
2169
2170     new_point->x = pointx->x;
2171     new_point->y = pointy->y;
2172
2173     return new_point;
2174 }
2175
2176 /* by reference, variable length */
2177
2178 text *
2179 copytext(text *t)
2180 {
2181     /*
2182      * VARSIZE is the total size of the struct in bytes.
2183      */
2184     text *new_t = (text *) palloc(VARSIZE(t));
2185     SET_VARSIZE(new_t, VARSIZE(t));
2186     /*
2187      * VARDATA is a pointer to the data region of the struct.
2188      */
2189     memcpy((void *) VARDATA(new_t), /* destination */
2190            (void *) VARDATA(t),     /* source */
2191            VARSIZE(t) - VARHDRSZ);  /* how many bytes */
2192     return new_t;
2193 }
2194
2195 text *
2196 concat_text(text *arg1, text *arg2)
2197 {
2198     int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
2199     text *new_text = (text *) palloc(new_text_size);
2200
2201     SET_VARSIZE(new_text, new_text_size);
2202     memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1) - VARHDRSZ);
2203     memcpy(VARDATA(new_text) + (VARSIZE(arg1) - VARHDRSZ),
2204            VARDATA(arg2), VARSIZE(arg2) - VARHDRSZ);
2205     return new_text;
2206 }
2207 ]]>
2208 </programlisting>
2209     </para>
2210
2211     <para>
2212      Supposing that the above code has been prepared in file
2213      <filename>funcs.c</filename> and compiled into a shared object,
2214      we could define the functions to <productname>PostgreSQL</productname>
2215      with commands like this:
2216
2217 <programlisting>
2218 CREATE FUNCTION add_one(integer) RETURNS integer
2219      AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one'
2220      LANGUAGE C STRICT;
2221
2222 -- note overloading of SQL function name "add_one"
2223 CREATE FUNCTION add_one(double precision) RETURNS double precision
2224      AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one_float8'
2225      LANGUAGE C STRICT;
2226
2227 CREATE FUNCTION makepoint(point, point) RETURNS point
2228      AS '<replaceable>DIRECTORY</replaceable>/funcs', 'makepoint'
2229      LANGUAGE C STRICT;
2230
2231 CREATE FUNCTION copytext(text) RETURNS text
2232      AS '<replaceable>DIRECTORY</replaceable>/funcs', 'copytext'
2233      LANGUAGE C STRICT;
2234
2235 CREATE FUNCTION concat_text(text, text) RETURNS text
2236      AS '<replaceable>DIRECTORY</replaceable>/funcs', 'concat_text'
2237      LANGUAGE C STRICT;
2238 </programlisting>
2239     </para>
2240
2241     <para>
2242      Here, <replaceable>DIRECTORY</replaceable> stands for the
2243      directory of the shared library file (for instance the
2244      <productname>PostgreSQL</productname> tutorial directory, which
2245      contains the code for the examples used in this section).
2246      (Better style would be to use just <literal>'funcs'</> in the
2247      <literal>AS</> clause, after having added
2248      <replaceable>DIRECTORY</replaceable> to the search path.  In any
2249      case, we can omit the system-specific extension for a shared
2250      library, commonly <literal>.so</literal> or
2251      <literal>.sl</literal>.)
2252     </para>
2253
2254     <para>
2255      Notice that we have specified the functions as <quote>strict</quote>,
2256      meaning that
2257      the system should automatically assume a null result if any input
2258      value is null.  By doing this, we avoid having to check for null inputs
2259      in the function code.  Without this, we'd have to check for null values
2260      explicitly, by checking for a null pointer for each
2261      pass-by-reference argument.  (For pass-by-value arguments, we don't
2262      even have a way to check!)
2263     </para>
2264
2265     <para>
2266      Although this calling convention is simple to use,
2267      it is not very portable; on some architectures there are problems
2268      with passing data types that are smaller than <type>int</type> this way.  Also, there is
2269      no simple way to return a null result, nor to cope with null arguments
2270      in any way other than making the function strict.  The version-1
2271      convention, presented next, overcomes these objections.
2272     </para>
2273    </sect2>
2274
2275    <sect2>
2276     <title>Version 1 Calling Conventions</title>
2277
2278     <para>
2279      The version-1 calling convention relies on macros to suppress most
2280      of the complexity of passing arguments and results.  The C declaration
2281      of a version-1 function is always:
2282 <programlisting>
2283 Datum funcname(PG_FUNCTION_ARGS)
2284 </programlisting>
2285      In addition, the macro call:
2286 <programlisting>
2287 PG_FUNCTION_INFO_V1(funcname);
2288 </programlisting>
2289      must appear in the same source file.  (Conventionally, it's
2290      written just before the function itself.)  This macro call is not
2291      needed for <literal>internal</>-language functions, since
2292      <productname>PostgreSQL</> assumes that all internal functions
2293      use the version-1 convention.  It is, however, required for
2294      dynamically-loaded functions.
2295     </para>
2296
2297     <para>
2298      In a version-1 function, each actual argument is fetched using a
2299      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2300      macro that corresponds to the argument's data type, and the
2301      result is returned using a
2302      <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2303      macro for the return type.
2304      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2305      takes as its argument the number of the function argument to
2306      fetch, where the count starts at 0.
2307      <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
2308      takes as its argument the actual value to return.
2309     </para>
2310
2311     <para>
2312      Here we show the same functions as above, coded in version-1 style:
2313
2314 <programlisting><![CDATA[
2315 #include "postgres.h"
2316 #include <string.h>
2317 #include "fmgr.h"
2318 #include "utils/geo_decls.h"
2319
2320 #ifdef PG_MODULE_MAGIC
2321 PG_MODULE_MAGIC;
2322 #endif
2323
2324 /* by value */
2325
2326 PG_FUNCTION_INFO_V1(add_one);
2327
2328 Datum
2329 add_one(PG_FUNCTION_ARGS)
2330 {
2331     int32   arg = PG_GETARG_INT32(0);
2332
2333     PG_RETURN_INT32(arg + 1);
2334 }
2335
2336 /* by reference, fixed length */
2337
2338 PG_FUNCTION_INFO_V1(add_one_float8);
2339
2340 Datum
2341 add_one_float8(PG_FUNCTION_ARGS)
2342 {
2343     /* The macros for FLOAT8 hide its pass-by-reference nature. */
2344     float8   arg = PG_GETARG_FLOAT8(0);
2345
2346     PG_RETURN_FLOAT8(arg + 1.0);
2347 }
2348
2349 PG_FUNCTION_INFO_V1(makepoint);
2350
2351 Datum
2352 makepoint(PG_FUNCTION_ARGS)
2353 {
2354     /* Here, the pass-by-reference nature of Point is not hidden. */
2355     Point     *pointx = PG_GETARG_POINT_P(0);
2356     Point     *pointy = PG_GETARG_POINT_P(1);
2357     Point     *new_point = (Point *) palloc(sizeof(Point));
2358
2359     new_point->x = pointx->x;
2360     new_point->y = pointy->y;
2361
2362     PG_RETURN_POINT_P(new_point);
2363 }
2364
2365 /* by reference, variable length */
2366
2367 PG_FUNCTION_INFO_V1(copytext);
2368
2369 Datum
2370 copytext(PG_FUNCTION_ARGS)
2371 {
2372     text     *t = PG_GETARG_TEXT_P(0);
2373     /*
2374      * VARSIZE is the total size of the struct in bytes.
2375      */
2376     text     *new_t = (text *) palloc(VARSIZE(t));
2377     SET_VARSIZE(new_t, VARSIZE(t));
2378     /*
2379      * VARDATA is a pointer to the data region of the struct.
2380      */
2381     memcpy((void *) VARDATA(new_t), /* destination */
2382            (void *) VARDATA(t),     /* source */
2383            VARSIZE(t) - VARHDRSZ);  /* how many bytes */
2384     PG_RETURN_TEXT_P(new_t);
2385 }
2386
2387 PG_FUNCTION_INFO_V1(concat_text);
2388
2389 Datum
2390 concat_text(PG_FUNCTION_ARGS)
2391 {
2392     text  *arg1 = PG_GETARG_TEXT_P(0);
2393     text  *arg2 = PG_GETARG_TEXT_P(1);
2394     int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
2395     text *new_text = (text *) palloc(new_text_size);
2396
2397     SET_VARSIZE(new_text, new_text_size);
2398     memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1) - VARHDRSZ);
2399     memcpy(VARDATA(new_text) + (VARSIZE(arg1) - VARHDRSZ),
2400            VARDATA(arg2), VARSIZE(arg2) - VARHDRSZ);
2401     PG_RETURN_TEXT_P(new_text);
2402 }
2403 ]]>
2404 </programlisting>
2405     </para>
2406
2407     <para>
2408      The <command>CREATE FUNCTION</command> commands are the same as
2409      for the version-0 equivalents.
2410     </para>
2411
2412     <para>
2413      At first glance, the version-1 coding conventions might appear to
2414      be just pointless obscurantism.  They do, however, offer a number
2415      of improvements, because the macros can hide unnecessary detail.
2416      An example is that in coding <function>add_one_float8</>, we no longer need to
2417      be aware that <type>float8</type> is a pass-by-reference type.  Another
2418      example is that the <literal>GETARG</> macros for variable-length types allow
2419      for more efficient fetching of <quote>toasted</quote> (compressed or
2420      out-of-line) values.
2421     </para>
2422
2423     <para>
2424      One big improvement in version-1 functions is better handling of null
2425      inputs and results.  The macro <function>PG_ARGISNULL(<replaceable>n</>)</function>
2426      allows a function to test whether each input is null.  (Of course, doing
2427      this is only necessary in functions not declared <quote>strict</>.)
2428      As with the
2429      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> macros,
2430      the input arguments are counted beginning at zero.  Note that one
2431      should refrain from executing
2432      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> until
2433      one has verified that the argument isn't null.
2434      To return a null result, execute <function>PG_RETURN_NULL()</function>;
2435      this works in both strict and nonstrict functions.
2436     </para>
2437
2438     <para>
2439      Other options provided in the new-style interface are two
2440      variants of the
2441      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
2442      macros. The first of these,
2443      <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>,
2444      guarantees to return a copy of the specified argument that is
2445      safe for writing into. (The normal macros will sometimes return a
2446      pointer to a value that is physically stored in a table, which
2447      must not be written to. Using the
2448      <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>
2449      macros guarantees a writable result.)
2450      The second variant consists of the
2451      <function>PG_GETARG_<replaceable>xxx</replaceable>_SLICE()</function>
2452      macros which take three arguments. The first is the number of the
2453      function argument (as above). The second and third are the offset and
2454      length of the segment to be returned. Offsets are counted from
2455      zero, and a negative length requests that the remainder of the
2456      value be returned. These macros provide more efficient access to
2457      parts of large values in the case where they have storage type
2458      <quote>external</quote>. (The storage type of a column can be specified using
2459      <literal>ALTER TABLE <replaceable>tablename</replaceable> ALTER
2460      COLUMN <replaceable>colname</replaceable> SET STORAGE
2461      <replaceable>storagetype</replaceable></literal>. <replaceable>storagetype</replaceable> is one of
2462      <literal>plain</>, <literal>external</>, <literal>extended</literal>,
2463      or <literal>main</>.)
2464     </para>
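    <para>
     The offset and length convention of the slice macros can be modeled
     in plain C.  The following is only a toy sketch, not
     <productname>PostgreSQL</productname> code: the function name
     <function>slice_bounds</function> is hypothetical, and clamping of
     out-of-range requests is an assumption of the sketch rather than a
     documented guarantee.
    </para>

```c
#include <stddef.h>

/*
 * Toy model of the slice convention: offsets count from zero, and a
 * negative length means "everything up to the end of the value".
 * Stores the starting position in *start and returns the slice length.
 * Clamping to the available bytes is an assumption of this sketch.
 */
size_t
slice_bounds(size_t total_len, size_t offset, int length, size_t *start)
{
    size_t      avail;

    if (offset >= total_len)
    {
        /* nothing left to return */
        *start = total_len;
        return 0;
    }

    *start = offset;
    avail = total_len - offset;

    /* negative length, or a request past the end: return the remainder */
    if (length < 0 || (size_t) length > avail)
        return avail;

    return (size_t) length;
}
```

    <para>
     In an actual function, a request for the first 16 bytes of a
     <type>text</type> argument would be written along the lines of
     <literal>PG_GETARG_TEXT_P_SLICE(0, 0, 16)</literal>.
    </para>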
2465
2466     <para>
2467      Finally, the version-1 function call conventions make it possible
2468      to return set results (<xref linkend="xfunc-c-return-set">) and
2469      implement trigger functions (<xref linkend="triggers">) and
2470      procedural-language call handlers (<xref
2471      linkend="plhandler">).  Version-1 code is also more
2472      portable than version-0, because it does not break restrictions
2473      on function call protocol in the C standard.  For more details
2474      see <filename>src/backend/utils/fmgr/README</filename> in the
2475      source distribution.
2476     </para>
2477    </sect2>
2478
2479    <sect2>
2480     <title>Writing Code</title>
2481
2482     <para>
2483      Before we turn to the more advanced topics, we should discuss
2484      some coding rules for <productname>PostgreSQL</productname>
2485      C-language functions.  While it might be possible to load functions
2486      written in languages other than C into
2487      <productname>PostgreSQL</productname>, this is usually difficult
2488      (when it is possible at all) because other languages, such as
2489      C++, FORTRAN, or Pascal, often do not follow the same calling
2490      convention as C.  That is, other languages do not pass argument
2491      and return values between functions in the same way.  For this
2492      reason, we will assume that your C-language functions are
2493      actually written in C.
2494     </para>
2495
2496     <para>
2497      The basic rules for writing and building C functions are as follows:
2498
2499      <itemizedlist>
2500       <listitem>
2501        <para>
2502         Use <literal>pg_config
2503         --includedir-server</literal><indexterm><primary>pg_config</><secondary>with user-defined C functions</></>
2504         to find out where the <productname>PostgreSQL</> server header
2505         files are installed on your system (or the system that your
2506         users will be running on).
2507        </para>
2508       </listitem>
2509
2510       <listitem>
2511        <para>
2512         Compiling and linking your code so that it can be dynamically
2513         loaded into <productname>PostgreSQL</productname> always
2514         requires special flags.  See <xref linkend="dfunc"> for a
2515         detailed explanation of how to do it for your particular
2516         operating system.
2517        </para>
2518       </listitem>
2519
2520       <listitem>
2521        <para>
2522         Remember to define a <quote>magic block</> for your shared library,
2523         as described in <xref linkend="xfunc-c-dynload">.
2524        </para>
2525       </listitem>
2526
2527       <listitem>
2528        <para>
2529         When allocating memory, use the
2530         <productname>PostgreSQL</productname> functions
2531         <function>palloc</function><indexterm><primary>palloc</></> and <function>pfree</function><indexterm><primary>pfree</></>
2532         instead of the corresponding C library functions
2533         <function>malloc</function> and <function>free</function>.
2534         The memory allocated by <function>palloc</function> will be
2535         freed automatically at the end of each transaction, preventing
2536         memory leaks.
2537        </para>
2538       </listitem>
2539
2540       <listitem>
2541        <para>
2542         Always zero the bytes of your structures using <function>memset</>
2543         (or allocate them with <function>palloc0</> in the first place).
2544         Even if you assign to each field of your structure, there might be
2545         alignment padding (holes in the structure) that contains
2546         garbage values.  Without this, it's difficult to
2547         support hash indexes or hash joins, as you must pick out only
2548         the significant bits of your data structure to compute a hash.
2549         The planner also sometimes relies on comparing constants via
2550         bitwise equality, so you can get undesirable planning results if
2551         logically-equivalent values aren't bitwise equal.
2552        </para>
2553       </listitem>
2554
2555       <listitem>
2556        <para>
2557         Most of the internal <productname>PostgreSQL</productname>
2558         types are declared in <filename>postgres.h</filename>, while
2559         the function manager interfaces
2560         (<symbol>PG_FUNCTION_ARGS</symbol>, etc.)  are in
2561         <filename>fmgr.h</filename>, so you will need to include at
2562         least these two files.  For portability reasons it's best to
2563         include <filename>postgres.h</filename> <emphasis>first</>,
2564         before any other system or user header files.  Including
2565         <filename>postgres.h</filename> will also include
2566         <filename>elog.h</filename> and <filename>palloc.h</filename>
2567         for you.
2568        </para>
2569       </listitem>
2570
2571       <listitem>
2572        <para>
2573         Symbol names defined within object files must not conflict
2574         with each other or with symbols defined in the
2575         <productname>PostgreSQL</productname> server executable.  You
2576         will have to rename your functions or variables if you get
2577         error messages to this effect.
2578        </para>
2579       </listitem>
2580
2581       <listitem>
2582        <para>
2583         To work correctly on Windows, <literal>C</>-language functions need
2584         to be marked with <literal>PGDLLEXPORT</>, unless you use a build
2585         process that marks all global functions that way.  In simple cases
2586         this detail will be handled transparently by
2587         the <literal>PG_FUNCTION_INFO_V1</> macro.  However, if you write
2588         explicit external declarations (perhaps in header files), be sure
2589         to write them like this:
2590 <programlisting>
2591 extern PGDLLEXPORT Datum funcname(PG_FUNCTION_ARGS);
2592 </programlisting>
2593         or you'll get compiler complaints when building on Windows.  (On
2594         other platforms, the <literal>PGDLLEXPORT</> macro does nothing.)
2595        </para>
2596       </listitem>
2597      </itemizedlist>
2598     </para>
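    <para>
     The advice about zeroing structures can be illustrated with a small
     stand-alone sketch.  The type <structname>TaggedValue</structname>
     and both function names are hypothetical, and the exact amount of
     padding depends on the platform.  The point is only that zeroing
     first makes two logically equal values compare equal byte for byte.
    </para>

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical representation of a user-defined type. */
typedef struct
{
    char        tag;        /* 1 byte; compilers usually pad after this */
    int32_t     value;      /* typically aligned on a 4-byte boundary */
} TaggedValue;

/* Zero the whole structure, padding included, then assign the fields. */
void
tagged_value_init(TaggedValue *tv, char tag, int32_t value)
{
    memset(tv, 0, sizeof(TaggedValue));
    tv->tag = tag;
    tv->value = value;
}

/* Bitwise comparison, similar in spirit to hashing the raw bytes. */
int
tagged_value_bitwise_equal(const TaggedValue *a, const TaggedValue *b)
{
    return memcmp(a, b, sizeof(TaggedValue)) == 0;
}
```

    <para>
     Inside the server the same effect is obtained by allocating the
     structure with <function>palloc0</function> instead of calling
     <function>memset</function> on it.
    </para>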
2599    </sect2>
2600
2601 &dfunc;
2602
2603    <sect2>
2604     <title>Composite-type Arguments</title>
2605
2606     <para>
2607      Composite types do not have a fixed layout like C structures.
2608      Instances of a composite type can contain null fields.  In
2609      addition, composite types that are part of an inheritance
2610      hierarchy can have different fields than other members of the
2611      same inheritance hierarchy.  Therefore,
2612      <productname>PostgreSQL</productname> provides a function
2613      interface for accessing fields of composite types from C.
2614     </para>
2615
2616     <para>
2617      Suppose we want to write a function to answer the query:
2618
2619 <programlisting>
2620 SELECT name, c_overpaid(emp, 1500) AS overpaid
2621     FROM emp
2622     WHERE name = 'Bill' OR name = 'Sam';
2623 </programlisting>
2624
2625      Using the version-0 calling conventions, we can define
2626      <function>c_overpaid</> as:
2627
2628 <programlisting><![CDATA[
2629 #include "postgres.h"
2630 #include "executor/executor.h"  /* for GetAttributeByName() */
2631
2632 #ifdef PG_MODULE_MAGIC
2633 PG_MODULE_MAGIC;
2634 #endif
2635
2636 bool
2637 c_overpaid(HeapTupleHeader t, /* the current row of emp */
2638            int32 limit)
2639 {
2640     bool isnull;
2641     int32 salary;
2642
2643     salary = DatumGetInt32(GetAttributeByName(t, "salary", &isnull));
2644     if (isnull)
2645         return false;
2646     return salary > limit;
2647 }
2648 ]]>
2649 </programlisting>
2650
2651      In version-1 coding, the above would look like this:
2652
2653 <programlisting><![CDATA[
2654 #include "postgres.h"
2655 #include "executor/executor.h"  /* for GetAttributeByName() */
2656
2657 #ifdef PG_MODULE_MAGIC
2658 PG_MODULE_MAGIC;
2659 #endif
2660
2661 PG_FUNCTION_INFO_V1(c_overpaid);
2662
2663 Datum
2664 c_overpaid(PG_FUNCTION_ARGS)
2665 {
2666     HeapTupleHeader  t = PG_GETARG_HEAPTUPLEHEADER(0);
2667     int32            limit = PG_GETARG_INT32(1);
2668     bool isnull;
2669     Datum salary;
2670
2671     salary = GetAttributeByName(t, "salary", &isnull);
2672     if (isnull)
2673         PG_RETURN_BOOL(false);
2674     /* Alternatively, we might prefer to do PG_RETURN_NULL() for null salary. */
2675
2676     PG_RETURN_BOOL(DatumGetInt32(salary) > limit);
2677 }
2678 ]]>
2679 </programlisting>
2680     </para>
2681
2682     <para>
2683      <function>GetAttributeByName</function> is the
2684      <productname>PostgreSQL</productname> system function that
2685      returns attributes out of the specified row.  It has
2686      three arguments: the argument of type <type>HeapTupleHeader</type> passed
2687      into
2688      the function, the name of the desired attribute, and a
2689      return parameter that tells whether the attribute
2690      is null.  <function>GetAttributeByName</function> returns a <type>Datum</type>
2691      value that you can convert to the proper data type by using the
2692      appropriate <function>DatumGet<replaceable>XXX</replaceable>()</function>
2693      macro.  Note that the return value is meaningless if the null flag is
2694      set; always check the null flag before trying to do anything with the
2695      result.
2696     </para>
2697
2698     <para>
2699      There is also <function>GetAttributeByNum</function>, which selects
2700      the target attribute by column number instead of name.
2701     </para>
2702
2703     <para>
2704      The following command declares the function
2705      <function>c_overpaid</function> in SQL:
2706
2707 <programlisting>
2708 CREATE FUNCTION c_overpaid(emp, integer) RETURNS boolean
2709     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'c_overpaid'
2710     LANGUAGE C STRICT;
2711 </programlisting>
2712
2713      Notice that we have used <literal>STRICT</> so that we do not have to
2714      check whether the input arguments are null.
2715     </para>
2716    </sect2>
2717
2718    <sect2>
2719     <title>Returning Rows (Composite Types)</title>
2720
2721     <para>
2722      To return a row or composite-type value from a C-language
2723      function, you can use a special API that provides macros and
2724      functions to hide most of the complexity of building composite
2725      data types.  To use this API, the source file must include:
2726 <programlisting>
2727 #include "funcapi.h"
2728 </programlisting>
2729     </para>
2730
2731     <para>
2732      There are two ways you can build a composite data value (henceforth
2733      a <quote>tuple</>): you can build it from an array of Datum values,
2734      or from an array of C strings that can be passed to the input
2735      conversion functions of the tuple's column data types.  In either
2736      case, you first need to obtain or construct a <structname>TupleDesc</>
2737      descriptor for the tuple structure.  When working with Datums, you
2738      pass the <structname>TupleDesc</> to <function>BlessTupleDesc</>,
2739      and then call <function>heap_form_tuple</> for each row.  When working
2740      with C strings, you pass the <structname>TupleDesc</> to
2741      <function>TupleDescGetAttInMetadata</>, and then call
2742      <function>BuildTupleFromCStrings</> for each row.  In the case of a
2743      function returning a set of tuples, the setup steps can all be done
2744      once during the first call of the function.
2745     </para>
2746
2747     <para>
2748      Several helper functions are available for setting up the needed
2749      <structname>TupleDesc</>.  The recommended way to do this in most
2750      functions returning composite values is to call:
2751 <programlisting>
2752 TypeFuncClass get_call_result_type(FunctionCallInfo fcinfo,
2753                                    Oid *resultTypeId,
2754                                    TupleDesc *resultTupleDesc)
2755 </programlisting>
2756      passing the same <literal>fcinfo</> struct passed to the calling function
2757      itself.  (This of course requires that you use the version-1
2758      calling conventions.)  <varname>resultTypeId</> can be specified
2759      as <literal>NULL</> or as the address of a local variable to receive the
2760      function's result type OID.  <varname>resultTupleDesc</> should be the
2761      address of a local <structname>TupleDesc</> variable.  Check that the
2762      result is <literal>TYPEFUNC_COMPOSITE</>; if so,
2763      <varname>resultTupleDesc</> has been filled with the needed
2764      <structname>TupleDesc</>.  (If it is not, you can report an error along
2765      the lines of <quote>function returning record called in context that
2766      cannot accept type record</quote>.)
2767     </para>
2768
2769     <tip>
2770      <para>
2771       <function>get_call_result_type</> can resolve the actual type of a
2772       polymorphic function result; so it is useful in functions that return
2773       scalar polymorphic results, not only functions that return composites.
2774       The <varname>resultTypeId</> output is primarily useful for functions
2775       returning polymorphic scalars.
2776      </para>
2777     </tip>
2778
2779     <note>
2780      <para>
2781       <function>get_call_result_type</> has a sibling
2782       <function>get_expr_result_type</>, which can be used to resolve the
2783       expected output type for a function call represented by an expression
2784       tree.  This can be used when trying to determine the result type from
2785       outside the function itself.  There is also
2786       <function>get_func_result_type</>, which can be used when only the
2787       function's OID is available.  However, these functions are not able
2788       to deal with functions declared to return <structname>record</>, and
2789       <function>get_func_result_type</> cannot resolve polymorphic types,
2790       so you should preferentially use <function>get_call_result_type</>.
2791      </para>
2792     </note>
2793
2794     <para>
2795      Older, now-deprecated functions for obtaining
2796      <structname>TupleDesc</>s are:
2797 <programlisting>
2798 TupleDesc RelationNameGetTupleDesc(const char *relname)
2799 </programlisting>
2800      to get a <structname>TupleDesc</> for the row type of a named relation,
2801      and:
2802 <programlisting>
2803 TupleDesc TypeGetTupleDesc(Oid typeoid, List *colaliases)
2804 </programlisting>
2805      to get a <structname>TupleDesc</> based on a type OID. This can
2806      be used to get a <structname>TupleDesc</> for a base or
2807      composite type.  It will not work for a function that returns
2808      <structname>record</>, however, and it cannot resolve polymorphic
2809      types.
2810     </para>
2811
2812     <para>
2813      Once you have a <structname>TupleDesc</>, call:
2814 <programlisting>
2815 TupleDesc BlessTupleDesc(TupleDesc tupdesc)
2816 </programlisting>
2817      if you plan to work with Datums, or:
2818 <programlisting>
2819 AttInMetadata *TupleDescGetAttInMetadata(TupleDesc tupdesc)
2820 </programlisting>
2821      if you plan to work with C strings.  If you are writing a
2822      set-returning function, you can save the results of these functions in the
2823      <structname>FuncCallContext</> structure &mdash; use the
2824      <structfield>tuple_desc</> or <structfield>attinmeta</> field
2825      respectively.
2826     </para>
2827
2828     <para>
2829      When working with Datums, use:
2830 <programlisting>
2831 HeapTuple heap_form_tuple(TupleDesc tupdesc, Datum *values, bool *isnull)
2832 </programlisting>
2833      to build a <structname>HeapTuple</> given user data in Datum form.
2834     </para>
2835
2836     <para>
2837      When working with C strings, use:
2838 <programlisting>
2839 HeapTuple BuildTupleFromCStrings(AttInMetadata *attinmeta, char **values)
2840 </programlisting>
2841      to build a <structname>HeapTuple</> given user data
2842      in C string form.  <parameter>values</parameter> is an array of C strings,
2843      one for each attribute of the return row. Each C string should be in
2844      the form expected by the input function of the attribute data
2845      type. In order to return a null value for one of the attributes,
2846      the corresponding pointer in the <parameter>values</> array
2847      should be set to <symbol>NULL</>.  This function will need to
2848      be called again for each row you return.
2849     </para>
2850
2851     <para>
2852      Once you have built a tuple to return from your function, it
2853      must be converted into a <type>Datum</>. Use:
2854 <programlisting>
2855 HeapTupleGetDatum(HeapTuple tuple)
2856 </programlisting>
2857      to convert a <structname>HeapTuple</> into a valid Datum.  This
2858      <type>Datum</> can be returned directly if you intend to return
2859      just a single row, or it can be used as the current return value
2860      in a set-returning function.
2861     </para>
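    <para>
     The Datum-based path described above might be sketched as follows
     for a function returning a single row.  The function name
     <function>make_pair</function> and its two-integer result are
     hypothetical.  The sketch assumes the function is declared
     <literal>STRICT</literal>, that the library defines its magic block
     elsewhere, and that <function>heap_form_tuple</function> is declared
     in <filename>access/htup_details.h</filename>, which may differ
     between releases.
    </para>

```c
#include "postgres.h"
#include "fmgr.h"
#include "funcapi.h"
#include "access/htup_details.h"    /* assumed location of heap_form_tuple() */

PG_FUNCTION_INFO_V1(make_pair);

Datum
make_pair(PG_FUNCTION_ARGS)
{
    int32       n = PG_GETARG_INT32(0);
    TupleDesc   tupdesc;
    Datum       values[2];
    bool        isnull[2] = {false, false};
    HeapTuple   tuple;

    /* Obtain the descriptor for the declared result row type */
    if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("function returning record called in context "
                        "that cannot accept type record")));

    /* Required before building tuples from Datums */
    tupdesc = BlessTupleDesc(tupdesc);

    /* One Datum per result column; neither column is null */
    values[0] = Int32GetDatum(n);
    values[1] = Int32GetDatum(n * 2);

    tuple = heap_form_tuple(tupdesc, values, isnull);

    PG_RETURN_DATUM(HeapTupleGetDatum(tuple));
}
```

    <para>
     Such a function could be declared in SQL with one
     <type>integer</type> input and two <type>integer</type>
     <literal>OUT</literal> parameters.
    </para>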
2862
2863     <para>
2864      An example appears in the next section.
2865     </para>
2866
2867    </sect2>
2868
2869    <sect2 id="xfunc-c-return-set">
2870     <title>Returning Sets</title>
2871
2872     <para>
2873      There is also a special API that provides support for returning
2874      sets (multiple rows) from a C-language function.  A set-returning
2875      function must follow the version-1 calling conventions.  Also,
2876      source files must include <filename>funcapi.h</filename>, as
2877      above.
2878     </para>
2879
2880     <para>
2881      A set-returning function (<acronym>SRF</>) is called
2882      once for each item it returns.  The <acronym>SRF</> must
2883      therefore save enough state to remember what it was doing and
2884      return the next item on each call.
2885      The structure <structname>FuncCallContext</> is provided to help
2886      control this process.  Within a function, <literal>fcinfo-&gt;flinfo-&gt;fn_extra</>
2887      is used to hold a pointer to <structname>FuncCallContext</>
2888      across calls.
2889 <programlisting>
2890 typedef struct
2891 {
2892     /*
2893      * Number of times we've been called before
2894      *
2895      * call_cntr is initialized to 0 for you by SRF_FIRSTCALL_INIT(), and
2896      * incremented for you every time SRF_RETURN_NEXT() is called.
2897      */
2898     uint32 call_cntr;
2899
2900     /*
2901      * OPTIONAL maximum number of calls
2902      *
2903      * max_calls is here for convenience only and setting it is optional.
2904      * If not set, you must provide alternative means to know when the
2905      * function is done.
2906      */
2907     uint32 max_calls;
2908
2909     /*
2910      * OPTIONAL pointer to result slot
2911      *
2912      * This is obsolete and only present for backward compatibility, viz,
2913      * user-defined SRFs that use the deprecated TupleDescGetSlot().
2914      */
2915     TupleTableSlot *slot;
2916
2917     /*
2918      * OPTIONAL pointer to miscellaneous user-provided context information
2919      *
2920      * user_fctx is for use as a pointer to your own data to retain
2921      * arbitrary context information between calls of your function.
2922      */
2923     void *user_fctx;
2924
2925     /*
2926      * OPTIONAL pointer to struct containing attribute type input metadata
2927      *
2928      * attinmeta is for use when returning tuples (i.e., composite data types)
2929      * and is not used when returning base data types. It is only needed
2930      * if you intend to use BuildTupleFromCStrings() to create the return
2931      * tuple.
2932      */
2933     AttInMetadata *attinmeta;
2934
2935     /*
2936      * memory context used for structures that must live for multiple calls
2937      *
2938      * multi_call_memory_ctx is set by SRF_FIRSTCALL_INIT() for you, and used
2939      * by SRF_RETURN_DONE() for cleanup. It is the most appropriate memory
2940      * context for any memory that is to be reused across multiple calls
2941      * of the SRF.
2942      */
2943     MemoryContext multi_call_memory_ctx;
2944
2945     /*
2946      * OPTIONAL pointer to struct containing tuple description
2947      *
2948      * tuple_desc is for use when returning tuples (i.e., composite data types)
2949      * and is only needed if you are going to build the tuples with
2950      * heap_form_tuple() rather than with BuildTupleFromCStrings().  Note that
2951      * the TupleDesc pointer stored here should usually have been run through
2952      * BlessTupleDesc() first.
2953      */
2954     TupleDesc tuple_desc;
2955
2956 } FuncCallContext;
2957 </programlisting>
2958     </para>
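    <para>
     The call-per-item protocol can be tried outside the server with a
     toy model.  Nothing below is <productname>PostgreSQL</productname>
     API: <structname>ToyCallContext</structname> stands in for
     <structname>FuncCallContext</structname>, the
     <literal>state</literal> pointer plays the role of
     <literal>fn_extra</literal>, and the driver loop stands in for the
     executor.  Because it is plain C, it uses
     <function>malloc</function> rather than <function>palloc</function>.
    </para>

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for FuncCallContext: state that survives between calls. */
typedef struct
{
    unsigned    call_cntr;  /* items returned so far */
    unsigned    max_calls;  /* total items to return */
} ToyCallContext;

/* Toy set-returning function: yields start, start+1, ... one per call. */
int
toy_series(void **state, int start, unsigned count, bool *done)
{
    ToyCallContext *ctx;

    if (*state == NULL)         /* first call: create cross-call state */
    {
        ctx = malloc(sizeof(ToyCallContext));
        if (ctx == NULL)
            abort();
        ctx->call_cntr = 0;
        ctx->max_calls = count;
        *state = ctx;
    }
    ctx = *state;               /* every call: recover the saved state */

    if (ctx->call_cntr < ctx->max_calls)
    {
        *done = false;
        return start + (int) ctx->call_cntr++;  /* next item */
    }

    free(ctx);                  /* finished: release the state */
    *state = NULL;
    *done = true;
    return 0;
}

/* Stand-in for the caller: invoke the function until it reports done. */
int
toy_sum_series(int start, unsigned count)
{
    void       *state = NULL;
    bool        done;
    int         sum = 0;

    for (;;)
    {
        int         item = toy_series(&state, start, count, &done);

        if (done)
            break;
        sum += item;
    }
    return sum;
}
```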
2959
2960     <para>
2961      An <acronym>SRF</> uses several functions and macros that
2962      automatically manipulate the <structname>FuncCallContext</>
2963      structure (and expect to find it via <literal>fn_extra</>).  Use:
2964 <programlisting>
2965 SRF_IS_FIRSTCALL()
2966 </programlisting>
2967      to determine if your function is being called for the first or a
2968      subsequent time. On the first call (only) use:
2969 <programlisting>
2970 SRF_FIRSTCALL_INIT()
2971 </programlisting>
2972      to initialize the <structname>FuncCallContext</>. On every function call,
2973      including the first, use:
2974 <programlisting>
2975 SRF_PERCALL_SETUP()
2976 </programlisting>
2977      to set up for using the <structname>FuncCallContext</>
2978      and to clear any previously returned data left over from the
2979      previous pass.
2980     </para>
2981
2982     <para>
2983      If your function has data to return, use:
2984 <programlisting>
2985 SRF_RETURN_NEXT(funcctx, result)
2986 </programlisting>
2987      to return it to the caller.  (<literal>result</> must be of type
2988      <type>Datum</>, either a single value or a tuple prepared as
2989      described above.)  Finally, when your function is finished
2990      returning data, use:
2991 <programlisting>
2992 SRF_RETURN_DONE(funcctx)
2993 </programlisting>
2994      to clean up and end the <acronym>SRF</>.
2995     </para>
2996
2997     <para>
2998      The memory context that is current when the <acronym>SRF</> is called is
2999      a transient context that will be cleared between calls.  This means
3000      that you do not need to call <function>pfree</> on everything
3001      you allocated using <function>palloc</>; it will go away anyway.  However, if you want to allocate
3002      any data structures to live across calls, you need to put them somewhere
3003      else.  The memory context referenced by
3004      <structfield>multi_call_memory_ctx</> is a suitable location for any
3005      data that needs to survive until the <acronym>SRF</> is finished running.  In most
3006      cases, this means that you should switch into
3007      <structfield>multi_call_memory_ctx</> while doing the first-call setup.
3008     </para>
3009
3010     <warning>
3011      <para>
3012       While the actual arguments to the function remain unchanged between
3013       calls, if you detoast the argument values (which is normally done
3014       transparently by the
3015       <function>PG_GETARG_<replaceable>xxx</replaceable></function> macro)
3016       in the transient context then the detoasted copies will be freed on
3017       each cycle. Accordingly, if you keep references to such values in
3018       your <structfield>user_fctx</>, you must either copy them into the
3019       <structfield>multi_call_memory_ctx</> after detoasting, or ensure
3020       that you detoast the values only in that context.
3021      </para>
3022     </warning>
3023
3024     <para>
3025      A complete pseudo-code example looks like the following:
3026 <programlisting>
3027 Datum
3028 my_set_returning_function(PG_FUNCTION_ARGS)
3029 {
3030     FuncCallContext  *funcctx;
3031     Datum             result;
3032     <replaceable>further declarations as needed</replaceable>
3033
3034     if (SRF_IS_FIRSTCALL())
3035     {
3036         MemoryContext oldcontext;
3037
3038         funcctx = SRF_FIRSTCALL_INIT();
3039         oldcontext = MemoryContextSwitchTo(funcctx-&gt;multi_call_memory_ctx);
3040         /* One-time setup code appears here: */
3041         <replaceable>user code</replaceable>
3042         <replaceable>if returning composite</replaceable>
3043             <replaceable>build TupleDesc, and perhaps AttInMetadata</replaceable>
3044         <replaceable>endif returning composite</replaceable>
3045         <replaceable>user code</replaceable>
3046         MemoryContextSwitchTo(oldcontext);
3047     }
3048
3049     /* Each-time setup code appears here: */
3050     <replaceable>user code</replaceable>
3051     funcctx = SRF_PERCALL_SETUP();
3052     <replaceable>user code</replaceable>
3053
3054     /* this is just one way we might test whether we are done: */
3055     if (funcctx-&gt;call_cntr &lt; funcctx-&gt;max_calls)
3056     {
3057         /* Here we want to return another item: */
3058         <replaceable>user code</replaceable>
3059         <replaceable>obtain result Datum</replaceable>
3060         SRF_RETURN_NEXT(funcctx, result);
3061     }
3062     else
3063     {
3064         /* Here we are done returning items and just need to clean up: */
3065         <replaceable>user code</replaceable>
3066         SRF_RETURN_DONE(funcctx);
3067     }
3068 }
3069 </programlisting>
3070     </para>
3071
3072     <para>
3073      A complete example of a simple <acronym>SRF</> returning a composite type
3074      looks like:
3075 <programlisting><![CDATA[
3076 PG_FUNCTION_INFO_V1(retcomposite);
3077
3078 Datum
3079 retcomposite(PG_FUNCTION_ARGS)
3080 {
3081     FuncCallContext     *funcctx;
3082     int                  call_cntr;
3083     int                  max_calls;
3084     TupleDesc            tupdesc;
3085     AttInMetadata       *attinmeta;
3086
3087     /* stuff done only on the first call of the function */
3088     if (SRF_IS_FIRSTCALL())
3089     {
3090         MemoryContext   oldcontext;
3091
3092         /* create a function context for cross-call persistence */
3093         funcctx = SRF_FIRSTCALL_INIT();
3094
3095         /* switch to memory context appropriate for multiple function calls */
3096         oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
3097
3098         /* total number of tuples to be returned */
3099         funcctx->max_calls = PG_GETARG_UINT32(0);
3100
3101         /* Build a tuple descriptor for our result type */
3102         if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
3103             ereport(ERROR,
3104                     (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
3105                      errmsg("function returning record called in context "
3106                             "that cannot accept type record")));
3107
3108         /*
3109          * generate attribute metadata needed later to produce tuples from raw
3110          * C strings
3111          */
3112         attinmeta = TupleDescGetAttInMetadata(tupdesc);
3113         funcctx->attinmeta = attinmeta;
3114
3115         MemoryContextSwitchTo(oldcontext);
3116     }
3117
3118     /* stuff done on every call of the function */
3119     funcctx = SRF_PERCALL_SETUP();
3120
3121     call_cntr = funcctx->call_cntr;
3122     max_calls = funcctx->max_calls;
3123     attinmeta = funcctx->attinmeta;
3124
3125     if (call_cntr < max_calls)    /* do when there is more left to send */
3126     {
3127         char       **values;
3128         HeapTuple    tuple;
3129         Datum        result;
3130
3131         /*
3132          * Prepare a values array for building the returned tuple.
3133          * This should be an array of C strings which will
3134          * be processed later by the type input functions.
3135          */
3136         values = (char **) palloc(3 * sizeof(char *));
3137         values[0] = (char *) palloc(16 * sizeof(char));
3138         values[1] = (char *) palloc(16 * sizeof(char));
3139         values[2] = (char *) palloc(16 * sizeof(char));
3140
3141         snprintf(values[0], 16, "%d", 1 * PG_GETARG_INT32(1));
3142         snprintf(values[1], 16, "%d", 2 * PG_GETARG_INT32(1));
3143         snprintf(values[2], 16, "%d", 3 * PG_GETARG_INT32(1));
3144
3145         /* build a tuple */
3146         tuple = BuildTupleFromCStrings(attinmeta, values);
3147
3148         /* make the tuple into a datum */
3149         result = HeapTupleGetDatum(tuple);
3150
3151         /* clean up (this is not really necessary) */
3152         pfree(values[0]);
3153         pfree(values[1]);
3154         pfree(values[2]);
3155         pfree(values);
3156
3157         SRF_RETURN_NEXT(funcctx, result);
3158     }
3159     else    /* do when there is no more left */
3160     {
3161         SRF_RETURN_DONE(funcctx);
3162     }
3163 }
3164 ]]>
3165 </programlisting>
3166
3167      One way to declare this function in SQL is:
3168 <programlisting>
3169 CREATE TYPE __retcomposite AS (f1 integer, f2 integer, f3 integer);
3170
3171 CREATE OR REPLACE FUNCTION retcomposite(integer, integer)
3172     RETURNS SETOF __retcomposite
3173     AS '<replaceable>filename</>', 'retcomposite'
3174     LANGUAGE C IMMUTABLE STRICT;
3175 </programlisting>
3176      A different way is to use OUT parameters:
3177 <programlisting>
3178 CREATE OR REPLACE FUNCTION retcomposite(IN integer, IN integer,
3179     OUT f1 integer, OUT f2 integer, OUT f3 integer)
3180     RETURNS SETOF record
3181     AS '<replaceable>filename</>', 'retcomposite'
3182     LANGUAGE C IMMUTABLE STRICT;
3183 </programlisting>
3184      Notice that in this method the output type of the function is formally
3185      an anonymous <structname>record</> type.
3186     </para>
3187
3188     <para>
3189      The <link linkend="tablefunc">contrib/tablefunc</>
3190      module in the source distribution contains more examples of
3191      set-returning functions.
3192     </para>
3193    </sect2>
3194
3195    <sect2>
3196     <title>Polymorphic Arguments and Return Types</title>
3197
3198     <para>
3199      C-language functions can be declared to accept and
3200      return the polymorphic types
3201      <type>anyelement</type>, <type>anyarray</type>, <type>anynonarray</type>,
3202      <type>anyenum</type>, and <type>anyrange</type>.
3203      See <xref linkend="extend-types-polymorphic"> for a more detailed explanation
3204      of polymorphic functions. When function arguments or return types
3205      are defined as polymorphic types, the function author cannot know
3206      in advance what data type it will be called with, or
3207      need to return. There are two routines provided in <filename>fmgr.h</>
3208      to allow a version-1 C function to discover the actual data types
3209      of its arguments and the type it is expected to return. The routines are
3210      called <literal>get_fn_expr_rettype(FmgrInfo *flinfo)</> and
3211      <literal>get_fn_expr_argtype(FmgrInfo *flinfo, int argnum)</>.
3212      They return the result or argument type OID, or <symbol>InvalidOid</symbol> if the
3213      information is not available.
3214      The structure <literal>flinfo</> is normally accessed as
3215      <literal>fcinfo-&gt;flinfo</>. The parameter <literal>argnum</>
3216      is zero-based.  <function>get_call_result_type</> can also be used
3217      as an alternative to <function>get_fn_expr_rettype</>.
3218      There is also <function>get_fn_expr_variadic</>, which can be used to
3219      find out whether variadic arguments have been merged into an array.
3220      This is primarily useful for <literal>VARIADIC "any"</> functions,
3221      since such merging will always have occurred for variadic functions
3222      taking ordinary array types.
3223     </para>
3224
3225     <para>
3226      For example, suppose we want to write a function to accept a single
3227      element of any type, and return a one-dimensional array of that type:
3228
3229 <programlisting>
3230 PG_FUNCTION_INFO_V1(make_array);
3231 Datum
3232 make_array(PG_FUNCTION_ARGS)
3233 {
3234     ArrayType  *result;
3235     Oid         element_type = get_fn_expr_argtype(fcinfo-&gt;flinfo, 0);
3236     Datum       element;
3237     bool        isnull;
3238     int16       typlen;
3239     bool        typbyval;
3240     char        typalign;
3241     int         ndims;
3242     int         dims[MAXDIM];
3243     int         lbs[MAXDIM];
3244
3245     if (!OidIsValid(element_type))
3246         elog(ERROR, "could not determine data type of input");
3247
3248     /* get the provided element, being careful in case it's NULL */
3249     isnull = PG_ARGISNULL(0);
3250     if (isnull)
3251         element = (Datum) 0;
3252     else
3253         element = PG_GETARG_DATUM(0);
3254
3255     /* we have one dimension */
3256     ndims = 1;
3257     /* and one element */
3258     dims[0] = 1;
3259     /* and lower bound is 1 */
3260     lbs[0] = 1;
3261
3262     /* get required info about the element type */
3263     get_typlenbyvalalign(element_type, &amp;typlen, &amp;typbyval, &amp;typalign);
3264
3265     /* now build the array */
3266     result = construct_md_array(&amp;element, &amp;isnull, ndims, dims, lbs,
3267                                 element_type, typlen, typbyval, typalign);
3268
3269     PG_RETURN_ARRAYTYPE_P(result);
3270 }
3271 </programlisting>
3272     </para>
3273
3274     <para>
3275      The following command declares the function
3276      <function>make_array</function> in SQL:
3277
3278 <programlisting>
3279 CREATE FUNCTION make_array(anyelement) RETURNS anyarray
3280     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'make_array'
3281     LANGUAGE C IMMUTABLE;
3282 </programlisting>
3283     </para>
3284
3285     <para>
3286      There is a variant of polymorphism that is only available to C-language
3287      functions: they can be declared to take parameters of type
3288      <literal>"any"</>.  (Note that this type name must be double-quoted,
3289      since it's also a SQL reserved word.)  This works like
3290      <type>anyelement</> except that it does not constrain different
3291      <literal>"any"</> arguments to be the same type, nor do they help
3292      determine the function's result type.  A C-language function can also
3293      declare its final parameter to be <literal>VARIADIC "any"</>.  This will
3294      match one or more actual arguments of any type (not necessarily the same
3295      type).  These arguments will <emphasis>not</> be gathered into an array
3296      as happens with normal variadic functions; they will just be passed to
3297      the function separately.  The <function>PG_NARGS()</> macro and the
3298      methods described above must be used to determine the number of actual
3299      arguments and their types when using this feature.  Also, users of such
3300      a function might wish to use the <literal>VARIADIC</> keyword in their
3301      function call, with the expectation that the function would treat the
3302      array elements as separate arguments.  The function itself must implement
3303      that behavior if wanted, after using <function>get_fn_expr_variadic</> to
3304      detect that the actual argument was marked with <literal>VARIADIC</>.
3305     </para>
3306    </sect2>
3307
3308    <sect2 id="xfunc-transform-functions">
3309     <title>Transform Functions</title>
3310
3311     <para>
3312      Some function calls can be simplified during planning based on
3313      properties specific to the function.  For example,
3314      <literal>int4mul(n, 1)</> could be simplified to just <literal>n</>.
3315      To define such function-specific optimizations, write a
3316      <firstterm>transform function</> and place its OID in the
3317      <structfield>protransform</> field of the primary function's
3318      <structname>pg_proc</> entry.  The transform function must have the SQL
3319      signature <literal>protransform(internal) RETURNS internal</>.  The
3320      argument, actually <type>FuncExpr *</>, is a dummy node representing a
3321      call to the primary function.  If the transform function's study of the
3322      expression tree proves that a simplified expression tree can substitute
3323      for all possible concrete calls represented thereby, build and return
3324      that simplified expression.  Otherwise, return a <literal>NULL</>
3325      pointer (<emphasis>not</> a SQL null).
3326     </para>
3327
3328     <para>
3329      We make no guarantee that <productname>PostgreSQL</> will never call the
3330      primary function in cases that the transform function could simplify.
3331      Ensure rigorous equivalence between the simplified expression and an
3332      actual call to the primary function.
3333     </para>
3334
3335     <para>
3336      Currently, this facility is not exposed to users at the SQL level
3337      because of security concerns, so it is only practical to use for
3338      optimizing built-in functions.
3339     </para>
3340    </sect2>
3341
3342    <sect2>
3343     <title>Shared Memory and LWLocks</title>
3344
3345     <para>
3346      Add-ins can reserve LWLocks and an allocation of shared memory on server
3347      startup.  The add-in's shared library must be preloaded by specifying
3348      it in
3349      <xref linkend="guc-shared-preload-libraries"><indexterm><primary>shared_preload_libraries</></>.
3350      Shared memory is reserved by calling:
3351 <programlisting>
3352 void RequestAddinShmemSpace(Size size)
3353 </programlisting>
3354      from your <function>_PG_init</> function.
3355     </para>
3356     <para>
3357      LWLocks are reserved by calling:
3358 <programlisting>
3359 void RequestNamedLWLockTranche(const char *tranche_name, int num_lwlocks)
3360 </programlisting>
3361      from <function>_PG_init</>.  This will ensure that an array of
3362      <literal>num_lwlocks</> LWLocks is available under the name
3363      <literal>tranche_name</>.  Use <function>GetNamedLWLockTranche</>
3364      to get a pointer to this array.
3365     </para>
3366     <para>
3367      To avoid possible race conditions, each backend should use the LWLock
3368      <function>AddinShmemInitLock</> when connecting to and initializing
3369      its allocation of shared memory, as shown here:
3370 <programlisting>
3371 static mystruct *ptr = NULL;
3372
3373 if (!ptr)
3374 {
3375         bool    found;
3376
3377         LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
3378         ptr = ShmemInitStruct("my struct name", size, &amp;found);
3379         if (!found)
3380         {
3381                 /* initialize contents of shmem area */
3382                 /* acquire any requested LWLocks using: */
3383                 ptr-&gt;locks = GetNamedLWLockTranche("my tranche name");
3384         }
3385         LWLockRelease(AddinShmemInitLock);
3386 }
3387 </programlisting>
3388     </para>
3389    </sect2>
3390
3391    <sect2 id="extend-Cpp">
3392     <title>Using C++ for Extensibility</title>
3393
3394     <indexterm zone="extend-Cpp">
3395      <primary>C++</primary>
3396     </indexterm>
3397
3398     <para>
3399      Although the <productname>PostgreSQL</productname> backend is written in
3400      C, it is possible to write extensions in C++ if these guidelines are
3401      followed:
3402
3403      <itemizedlist>
3404       <listitem>
3405        <para>
3406          All functions accessed by the backend must present a C interface
3407          to the backend;  these C functions can then call C++ functions.
3408          For example, <literal>extern "C"</> linkage is required for
3409          backend-accessed functions.  This is also necessary for any
3410          functions that are passed as pointers between the backend and
3411          C++ code.
3412        </para>
3413       </listitem>
3414       <listitem>
3415        <para>
3416         Free memory using the appropriate deallocation method.  For example,
3417         most backend memory is allocated using <function>palloc()</>, so use
3418         <function>pfree()</> to free it.  Using C++
3419         <function>delete</> in such cases will fail.
3420        </para>
3421       </listitem>
3422       <listitem>
3423        <para>
3424         Prevent exceptions from propagating into the C code (use a catch-all
3425         block at the top level of all <literal>extern "C"</> functions).  This
3426         is necessary even if the C++ code does not explicitly throw any
3427         exceptions, because events like out-of-memory can still throw
3428         exceptions.  Any exceptions must be caught and appropriate errors
3429         passed back to the C interface.  If possible, compile C++ with
3430         <option>-fno-exceptions</> to eliminate exceptions entirely; in such
3431         cases, you must check for failures in your C++ code, e.g., check for
3432         NULL returned by <function>new()</>.
3433        </para>
3434       </listitem>
3435       <listitem>
3436        <para>
3437         If calling backend functions from C++ code, be sure that the
3438         C++ call stack contains only plain old data structures
3439         (<acronym>POD</>).  This is necessary because backend errors
3440         generate a distant <function>longjmp()</> that does not properly
3441         unwind a C++ call stack with non-POD objects.
3442        </para>
3443       </listitem>
3444      </itemizedlist>
3445     </para>
3446
3447     <para>
3448      In summary, it is best to place C++ code behind a wall of
3449      <literal>extern "C"</> functions that interface to the backend,
3450      and avoid exception, memory, and call stack leakage.
3451     </para>
3452    </sect2>
3453
3454   </sect1>