   1 <!--
   2 $Header: /cvsroot/pgsql/doc/src/sgml/xfunc.sgml,v 1.39 2001/10/26 19:58:12 tgl Exp $
   3 -->
   4
   5  <chapter id="xfunc">
   6   <title id="xfunc-title">Extending <acronym>SQL</acronym>: Functions</title>
   7
   8   <sect1 id="xfunc-intro">
   9    <title>Introduction</title>
  10
  11   <comment>
  12    Historically, functions were perhaps considered a tool for creating
  13    types.  Today, few people build their own types but many write
  14    their own functions.  This introduction ought to be changed to
  15    reflect this.
  16   </comment>
  17
  18   <para>
  19    As it turns out, part of defining a new type is the
  20    definition of functions that describe its behavior.
  21    Consequently, while it is possible to define a new
  22    function without defining a new type, the reverse is
  23    not true.  We therefore describe how to add new functions
  24    to <productname>PostgreSQL</productname> before describing
  25    how to add new types.
  26   </para>
  27
  28   <para>
  29    <productname>PostgreSQL</productname> provides four kinds of
  30    functions:
  31
  32    <itemizedlist>
  33     <listitem>
  34      <para>
  35       query language functions
  36       (functions written in <acronym>SQL</acronym>)
  37      </para>
  38     </listitem>
  39     <listitem>
  40      <para>
  41       procedural language
  42       functions (functions written in, for example, <application>PL/Tcl</> or <application>PL/pgSQL</>)
  43      </para>
  44     </listitem>
  45     <listitem>
  46      <para>
  47       internal functions
  48      </para>
  49     </listitem>
  50     <listitem>
  51      <para>
  52       C language functions
  53      </para>
  54     </listitem>
  55    </itemizedlist>
  56   </para>
  57
  58   <para>
  59    Every kind
  60    of function can take a base type, a composite type, or
  61    some combination as arguments (parameters).  In addition,
  62    every kind of function can return a base type or
  63    a composite type.  It's easiest to define <acronym>SQL</acronym>
  64    functions, so we'll start with those.  Examples in this section
  65    can also be found in <filename>funcs.sql</filename>
  66    and <filename>funcs.c</filename> in the tutorial directory.
  67   </para>
  68   </sect1>
  69
  70   <sect1 id="xfunc-sql">
  71    <title>Query Language (<acronym>SQL</acronym>) Functions</title>
  72
  73    <para>
  74     SQL functions execute an arbitrary list of SQL statements, returning
  75     the results of the last query in the list.  In the simple (non-set)
  76     case, the first row of the last query's result will be returned.
  77     (Bear in mind that <quote>the first row</quote> is not well-defined
  78     unless you use <literal>ORDER BY</>.)  If the last query happens
  79     to return no rows at all, NULL will be returned.
  80    </para>
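   <para>
    As a minimal sketch of why <literal>ORDER BY</literal> matters in the
    non-set case, the following function returns the name of the
    highest-paid employee.  It assumes the <type>EMP</type> table used in
    the examples below, with <literal>name</literal> and
    <literal>salary</literal> columns; the function name
    <function>top_earner</function> is purely illustrative.  Without the
    <literal>ORDER BY</literal> clause, any row of <type>EMP</type> could
    be the one returned, and if <type>EMP</type> is empty the function
    returns NULL.  The <literal>LIMIT</literal> clause is optional here,
    since only the first row is used anyway.

<programlisting>
CREATE FUNCTION top_earner() RETURNS text AS '
    SELECT name FROM EMP
        ORDER BY salary DESC, name
        LIMIT 1;
' LANGUAGE SQL;
</programlisting>
   </para>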
  81
  82    <para>
  83     Alternatively, an SQL function may be declared to return a set,
  84     by specifying the function's return type
  85     as <literal>SETOF</literal> <replaceable>sometype</>.  In this case
  86     all rows of the last query's result are returned.  Further details
  87     appear below.
  88    </para>
  89
  90    <para>
  91     The body of an SQL function should be a list of one or more SQL
  92     statements separated by semicolons.  Note that because the syntax
  93     of the <command>CREATE FUNCTION</command> command requires the body of the
  94     function to be enclosed in single quotes, single quote marks
  95     (<literal>'</>) used
  96     in the body of the function must be escaped, by writing two single
  97     quotes (<literal>''</>) or a backslash (<literal>\'</>) where each
  98     quote is desired.
  99    </para>
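   <para>
    For instance, a function whose body contains a string constant
    might be written as in the following sketch (the function name
    <function>greeting</function> is hypothetical).  Each quote mark
    around the constant is doubled because the constant appears inside
    the quoted function body, whereas the call itself uses ordinary
    single quotes:

<programlisting>
CREATE FUNCTION greeting(text) RETURNS text AS '
    SELECT ''Hello, '' || $1;
' LANGUAGE SQL;

SELECT greeting('world');
</programlisting>

    The call should return the string <literal>Hello, world</literal>.
   </para>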
 100
 101    <para>
 102     Arguments to the SQL function may be referenced in the function
 103     body using the syntax <literal>$<replaceable>n</></>: $1 refers to
 104     the first argument, $2 to the second, and so on.  If an argument
 105     is of a composite type, then the <quote>dot notation</quote>,
 106     e.g., <literal>$1.name</literal>, may be used to access attributes
 107     of the argument.
 108    </para>
 109
 110    <sect2>
 111     <title>Examples</title>
 112
 113     <para>
 114      To illustrate a simple SQL function, consider the following,
 115      which might be used to debit a bank account:
 116
 117 <programlisting>
 118 CREATE FUNCTION tp1 (integer, numeric) RETURNS integer AS '
 119     UPDATE bank
 120         SET balance = balance - $2
 121         WHERE accountno = $1;
 122     SELECT 1;
 123 ' LANGUAGE SQL;
 124 </programlisting>
 125
 126      A user could execute this function to debit account 17 by $100.00 as
 127      follows:
 128
 129 <programlisting>
 130 SELECT tp1(17, 100.0);
 131 </programlisting>
 132     </para>
 133
 134     <para>
 135      In practice one would probably like a more useful result from the
 136      function than a constant <quote>1</>, so a more likely definition
 137      is
 138
 139 <programlisting>
 140 CREATE FUNCTION tp1 (integer, numeric) RETURNS numeric AS '
 141     UPDATE bank
 142         SET balance = balance - $2
 143         WHERE accountno = $1;
 144     SELECT balance FROM bank WHERE accountno = $1;
 145 ' LANGUAGE SQL;
 146 </programlisting>
 147
 148      which adjusts the balance and returns the new balance.
 149     </para>
 150
 151     <para>
 152      Any collection of commands in the  <acronym>SQL</acronym>
 153      language can be packaged together and defined as a function.
 154      The commands can include data modification (i.e.,
 155      <command>INSERT</command>, <command>UPDATE</command>, and
 156      <command>DELETE</command>) as well
 157      as <command>SELECT</command> queries.  However, the final command
 158      must be a <command>SELECT</command> that returns whatever is
 159      specified as the function's return type.
 160
 161 <programlisting>
 162 CREATE FUNCTION clean_EMP () RETURNS integer AS '
 163     DELETE FROM EMP
 164         WHERE EMP.salary &lt;= 0;
 165     SELECT 1 AS ignore_this;
 166 ' LANGUAGE SQL;
 167
 168 SELECT clean_EMP();
 169 </programlisting>
 170
 171 <screen>
 172  clean_emp
 173 -----------
 174          1
 175 </screen>
 176     </para>
 177    </sect2>
 178
 179    <sect2>
 180     <title><acronym>SQL</acronym> Functions on Base Types</title>
 181
 182     <para>
 183      The simplest possible <acronym>SQL</acronym> function has no arguments and
 184      simply returns a base type, such as <type>integer</type>:
 185
 186 <programlisting>
 187 CREATE FUNCTION one() RETURNS integer AS '
 188     SELECT 1 as RESULT;
 189 ' LANGUAGE SQL;
 190
 191 SELECT one();
 192 </programlisting>
 193
 194 <screen>
 195  one
 196 -----
 197    1
 198 </screen>
 199     </para>
 200
 201     <para>
 202      Notice that we defined a column alias within the function body for the result of the function
 203      (with the name <literal>RESULT</>), but this column alias is not visible
 204      outside the function.  Hence, the result is labelled <literal>one</>
 205      instead of <literal>RESULT</>.
 206     </para>
 207
 208     <para>
 209      It is almost as easy to define <acronym>SQL</acronym> functions
 210      that take base types as arguments.  In the example below, notice
 211      how we refer to the arguments within the function as <literal>$1</>
 212      and <literal>$2</>:
 213
 214 <programlisting>
 215 CREATE FUNCTION add_em(integer, integer) RETURNS integer AS '
 216     SELECT $1 + $2;
 217 ' LANGUAGE SQL;
 218
 219 SELECT add_em(1, 2) AS answer;
 220 </programlisting>
 221
 222 <screen>
 223  answer
 224 --------
 225       3
 226 </screen>
 227     </para>
 228    </sect2>
 229
 230    <sect2>
 231     <title><acronym>SQL</acronym> Functions on Composite Types</title>
 232
 233     <para>
 234      When specifying functions with arguments of composite
 235      types, we must not only specify which
 236      argument we want (as we did above with <literal>$1</> and <literal>$2</literal>) but
 237      also the attributes of that argument.  For example, suppose that
 238      <type>EMP</type> is a table containing employee data, and therefore
 239      also the name of the composite type of each row of the table.  Here
 240      is a function <function>double_salary</function> that computes what your
 241      salary would be if it were doubled:
 242
 243 <programlisting>
 244 CREATE FUNCTION double_salary(EMP) RETURNS integer AS '
 245     SELECT $1.salary * 2 AS salary;
 246 ' LANGUAGE SQL;
 247
 248 SELECT name, double_salary(EMP) AS dream
 249     FROM EMP
 250     WHERE EMP.cubicle ~= point '(2,1)';
 251 </programlisting>
 252
 253 <screen>
 254  name | dream
 255 ------+-------
 256  Sam  |  2400
 257 </screen>
 258     </para>
 259
 260     <para>
 261      Notice the use of the syntax <literal>$1.salary</literal>
 262      to select one field of the argument row value.  Also notice
 263      how the calling SELECT command uses a table name to denote
 264      the entire current row of that table as a composite value.
 265     </para>
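    <para>
     Composite and base-type arguments can be combined in one function.
     As a sketch, the hypothetical function
     <function>salary_after_raise</function> below takes an
     <type>EMP</type> row and an integer amount; it assumes, as in the
     example above, that <literal>salary</literal> is an integer column:

<programlisting>
CREATE FUNCTION salary_after_raise(EMP, integer) RETURNS integer AS '
    SELECT $1.salary + $2;
' LANGUAGE SQL;

SELECT name, salary_after_raise(EMP, 100) AS raised
    FROM EMP;
</programlisting>
    </para>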
 266
 267     <para>
 268      It is also possible to build a function that returns a composite type.
 269      (However, as we'll see below, there are some
 270      unfortunate restrictions on how the function may be used.)
 271      This is an example of a function
 272      that returns a single <type>EMP</type> row:
 273
 274 <programlisting>
 275 CREATE FUNCTION new_emp() RETURNS EMP AS '
 276     SELECT text ''None'' AS name,
 277         1000 AS salary,
 278         25 AS age,
 279         point ''(2,2)'' AS cubicle;
 280 ' LANGUAGE SQL;
 281 </programlisting>
 282     </para>
 283
 284     <para>
 285      In this case we have specified each of the attributes
 286      with a constant value, but any computation or expression
 287      could have been substituted for these constants.
 288      Note two important things about defining the function:
 289
 290      <itemizedlist>
 291       <listitem>
 292        <para>
 293         The target list order must be exactly the same as
 294         that in which the columns appear in the table associated
 295         with the composite type.
 296        </para>
 297       </listitem>
 298       <listitem>
 299        <para>
 300         You must typecast the expressions to match the
 301         definition of the composite type, or you will get errors like this:
 302 <screen>
 303 <computeroutput>
 304 ERROR:  function declared to return emp returns varchar instead of text at column 1
 305 </computeroutput>
 306 </screen>
 307        </para>
 308       </listitem>
 309      </itemizedlist>
 310     </para>
 311
 312     <para>
 313      In the present release of <productname>PostgreSQL</productname>
 314      there are some unpleasant restrictions on how functions returning
 315      composite types can be used.  Briefly, when calling a function that
 316      returns a row, we cannot retrieve the entire row.  We must either
 317      project a single attribute out of the row or pass the entire row into
 318      another function.  (Trying to display the entire row value will yield
 319      a meaningless number.)  For example,
 320
 321 <programlisting>
 322 SELECT name(new_emp());
 323 </programlisting>
 324
 325 <screen>
 326  name
 327 ------
 328  None
 329 </screen>
 330     </para>
 331
 332     <para>
 333      This example makes use of the
 334      function notation for projecting attributes.  The simple way
 335      to explain this is that we can usually use the
 336      notations <literal>attribute(table)</> and <literal>table.attribute</>
 337      interchangeably:
 338
 339 <programlisting>
 340 --
 341 -- this is the same as:
 342 --  SELECT EMP.name AS youngster FROM EMP WHERE EMP.age &lt; 30
 343 --
 344 SELECT name(EMP) AS youngster
 345     FROM EMP
 346     WHERE age(EMP) &lt; 30;
 347 </programlisting>
 348
 349 <screen>
 350  youngster
 351 -----------
 352  Sam
 353 </screen>
 354     </para>
 355
 356     <para>
 357      The reason why, in general, we must use the function
 358      syntax for projecting attributes of function return
 359      values is that the parser does not understand
 360      the dot syntax for projection when combined
 361      with function calls.
 362
 363 <screen>
 364 SELECT new_emp().name AS nobody;
 365 ERROR:  parser: parse error at or near "."
 366 </screen>
 367     </para>
 368
 369     <para>
 370      Another way to use a function returning a row result is to declare a
 371      second function accepting a rowtype parameter, and pass the function
 372      result to it:
 373
 374 <programlisting>
 375 CREATE FUNCTION getname(emp) RETURNS text AS
 376 'SELECT $1.name;'
 377 LANGUAGE SQL;
 378 </programlisting>
 379
 380 <screen>
 381 SELECT getname(new_emp());
 382  getname
 383 ---------
 384  None
 385 (1 row)
 386 </screen>
 387     </para>
 388    </sect2>
 389
 390    <sect2>
 391     <title><acronym>SQL</acronym> Functions Returning Sets</title>
 392
 393     <para>
 394      As previously mentioned, an SQL function may be declared as
 395      returning <literal>SETOF</literal> <replaceable>sometype</>.
 396      In this case the function's final SELECT query is executed to
 397      completion, and each row it outputs is returned as an element
 398      of the set.
 399     </para>
 400
 401     <para>
 402      Functions returning sets may only be called in the target list
 403      of a SELECT query.  For each row that the SELECT generates by itself,
 404      the set-returning function is invoked, and an output row is generated
 405      for each element of the function's result set.  An example:
 406
 407 <programlisting>
 408 CREATE FUNCTION listchildren(text) RETURNS SETOF text AS
 409 'SELECT name FROM nodes WHERE parent = $1'
 410 LANGUAGE SQL;
 411 </programlisting>
 412
 413 <screen>
 414 SELECT * FROM nodes;
 415    name    | parent
 416 -----------+--------
 417  Top       |
 418  Child1    | Top
 419  Child2    | Top
 420  Child3    | Top
 421  SubChild1 | Child1
 422  SubChild2 | Child1
 423 (6 rows)
 424
 425 SELECT listchildren('Top');
 426  listchildren
 427 --------------
 428  Child1
 429  Child2
 430  Child3
 431 (3 rows)
 432
 433 SELECT name, listchildren(name) FROM nodes;
 434   name  | listchildren
 435 --------+--------------
 436  Top    | Child1
 437  Top    | Child2
 438  Top    | Child3
 439  Child1 | SubChild1
 440  Child1 | SubChild2
 441 (5 rows)
 442 </screen>
 443
 444      Notice that no output row appears for Child2, Child3, etc.
 445      This happens because listchildren() returns an empty set
 446      for those inputs, so no output rows are generated.
 447     </para>
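    <para>
     The empty-set case can be seen directly by calling the function
     with an argument that has no children in the
     <literal>nodes</literal> table shown above:

<screen>
SELECT listchildren('Child2');
 listchildren
--------------
(0 rows)
</screen>
    </para>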
 448    </sect2>
 449   </sect1>
 450
 451   <sect1 id="xfunc-pl">
 452    <title>Procedural Language Functions</title>
 453
 454    <para>
 455     Procedural languages aren't built into the <productname>PostgreSQL</productname> server; they are offered
 456     by loadable modules. Please refer to the documentation of the
 457     procedural language in question for details about the syntax and how the function body
 458     is interpreted for each language.
 459    </para>
 460
 461    <para>
 462     There are currently four procedural languages available in the
 463     standard <productname>PostgreSQL</productname> distribution:
 464     <application>PL/pgSQL</application>, <application>PL/Tcl</application>,
 465     <application>PL/Perl</application>, and <application>PL/Python</application>.  Other languages can be
 466     defined by users.  Refer to <xref linkend="xplang"> for more
 467     information.  The basics of developing a new procedural language are covered in <xref linkend="xfunc-plhandler">.
 468    </para>
 469   </sect1>
 470
 471   <sect1 id="xfunc-internal">
 472    <title>Internal Functions</title>
 473
 474    <para>
 475     Internal functions are functions written in C that have been statically
 476     linked into the <productname>PostgreSQL</productname> server.
 477     The <quote>body</quote> of the function definition
 478     specifies the C-language name of the function, which need not be the
 479     same as the name being declared for SQL use.
 480     (For reasons of backwards compatibility, an empty body
 481     is accepted as meaning that the C-language function name is the
 482     same as the SQL name.)
 483    </para>
 484
 485    <para>
 486     Normally, all internal functions present in the
 487     backend are declared during the initialization of the database cluster (<command>initdb</command>),
 488     but a user could use <command>CREATE FUNCTION</command>
 489     to create additional alias names for an internal function.
 490     Internal functions are declared in <command>CREATE FUNCTION</command>
 491     with language name <literal>internal</literal>.  For instance, to
 492     create an alias for the <function>sqrt</function> function:
 493 <programlisting>
 494 CREATE FUNCTION square_root(double precision) RETURNS double precision
 495     AS 'dsqrt'
 496     LANGUAGE INTERNAL
 497     WITH (isStrict);
 498 </programlisting>
 499     (Most internal functions expect to be declared <quote>strict</quote>.)
 500    </para>
 501
 502    <note>
 503     <para>
 504      Not all <quote>predefined</quote> functions are
 505      <quote>internal</quote> in the above sense.  Some predefined
 506      functions are written in SQL.
 507     </para>
 508    </note>
 509   </sect1>
 510
 511   <sect1 id="xfunc-c">
 512    <title>C Language Functions</title>
 513
 514    <para>
 515     User-defined functions can be written in C (or a language that can
 516     be made compatible with C, such as C++).  Such functions are
 517     compiled into dynamically loadable objects (also called shared
 518     libraries) and are loaded by the server on demand.  The dynamic
 519     loading feature is what distinguishes <quote>C language</> functions
 520     from <quote>internal</> functions --- the actual coding conventions
 521     are essentially the same for both.  (Hence, the standard internal
 522     function library is a rich source of coding examples for user-defined
 523     C functions.)
 524    </para>
 525
 526    <para>
 527     Two different calling conventions are currently used for C functions.
 528     The newer <quote>version 1</quote> calling convention is indicated by writing
 529     a <literal>PG_FUNCTION_INFO_V1()</literal> macro call for the function,
 530     as illustrated below.  Lack of such a macro indicates an old-style
 531     (<quote>version 0</quote>) function.  The language name specified in <command>CREATE FUNCTION</command>
 532     is <literal>C</literal> in either case.  Old-style functions are now deprecated
 533     because of portability problems and lack of functionality, but they
 534     are still supported for compatibility reasons.
 535    </para>
 536
 537   <sect2 id="xfunc-c-dynload">
 538    <title>Dynamic Loading</title>
 539
 540    <para>
 541     The first time a user-defined function in a particular
 542     loadable object file is called in a backend session,
 543     the dynamic loader loads that object file into memory so that the
 544     function can be called.  The <command>CREATE FUNCTION</command>
 545     for a user-defined C function must therefore specify two pieces of
 546     information for the function: the name of the loadable
 547     object file, and the C name (link symbol) of the specific function to call
 548     within that object file.  If the C name is not explicitly specified then
 549     it is assumed to be the same as the SQL function name.
 550    </para>
 551
 552    <para>
 553     The following algorithm is used to locate the shared object file
 554     based on the name given in the <command>CREATE FUNCTION</command>
 555     command:
 556
 557     <orderedlist>
 558      <listitem>
 559       <para>
 560        If the name is an absolute path, the given file is loaded.
 561       </para>
 562      </listitem>
 563
 564      <listitem>
 565       <para>
 566        If the name starts with the string <literal>$libdir</literal>,
 567        that part is replaced by the PostgreSQL package library directory
 568        name, which is determined at build time.
 569       </para>
 570      </listitem>
 571
 572      <listitem>
 573       <para>
 574        If the name does not contain a directory part, the file is
 575        searched for in the path specified by the configuration variable
 576        <varname>dynamic_library_path</varname>.
 577       </para>
 578      </listitem>
 579
 580      <listitem>
 581       <para>
 582        Otherwise (the file was not found in the path, or it contains a
 583        non-absolute directory part), the dynamic loader will try to
 584        take the name as given, which will most likely fail.  (It is
 585        unreliable to depend on the current working directory.)
 586       </para>
 587      </listitem>
 588     </orderedlist>
 589
 590     If this sequence does not work, the platform-specific shared
 591     library file name extension (often <filename>.so</filename>) is
 592     appended to the given name and this sequence is tried again.  If
 593     that fails as well, the load will fail.
 594    </para>
 595
 596    <note>
 597     <para>
 598      The user id the <application>PostgreSQL</application> server runs
 599      as must be able to traverse the path to the file you intend to
 600      load.  Making the file or a higher-level directory not readable
 601      and/or not executable by the <quote>postgres</quote> user is a
 602      common mistake.
 603     </para>
 604    </note>
 605
 606    <para>
 607     In any case, the file name that is given in the
 608     <command>CREATE FUNCTION</command> command is recorded literally
 609     in the system catalogs, so if the file needs to be loaded again
 610     the same procedure is applied.
 611    </para>
 612
 613    <note>
 614     <para>
 615      <application>PostgreSQL</application> will not compile a C function
 616      automatically.  The object file must be compiled before it is referenced
 617      in a <command>CREATE
 618      FUNCTION</> command.  See <xref linkend="dfunc"> for additional
 619      information.
 620     </para>
 621    </note>
 622
 623    <note>
 624     <para>
 625      After it is used for the first time, a dynamically loaded object
 626      file is retained in memory.  Future calls in the same session to the
 627      function(s) in that file will only incur the small overhead of a symbol
 628      table lookup.  If you need to force a reload of an object file, for
 629      example after recompiling it, use the <command>LOAD</> command or
 630      begin a fresh session.
 631     </para>
 632    </note>
 633
 634    <para>
 635     It is recommended to locate shared libraries either relative to
 636     <literal>$libdir</literal> or through the dynamic library path.
 637     This simplifies version upgrades if the new installation is at a
 638     different location.  The actual directory that
 639     <literal>$libdir</literal> stands for can be found with the
 640     command <literal>pg_config --pkglibdir</literal>.
 641    </para>
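   <para>
    As an illustration, suppose a shared library named
    <filename>funcs</filename> exports a C function
    <function>add_one</function>; both names, and the SQL name
    <function>add_one_from_path</function>, are assumptions made for
    this sketch.  The function could then be declared in either of the
    recommended ways:

<programlisting>
-- located relative to the package library directory
CREATE FUNCTION add_one(integer) RETURNS integer
    AS '$libdir/funcs', 'add_one'
    LANGUAGE C;

-- located through dynamic_library_path
CREATE FUNCTION add_one_from_path(integer) RETURNS integer
    AS 'funcs', 'add_one'
    LANGUAGE C;
</programlisting>

    In both cases the platform-specific file name extension is omitted,
    and the link symbol is given explicitly as the second string.
   </para>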
 642
 643    <note>
 644     <para>
 645      Before <application>PostgreSQL</application> release 7.2, only exact
 646      absolute paths to object files could be specified in <command>CREATE
 647      FUNCTION</>.  This approach is now deprecated since it makes the
 648      function definition unnecessarily unportable.  It's best to specify
 649      just the shared library name with no path or extension, and let
 650      the search mechanism provide that information instead.
 651     </para>
 652    </note>
 653
 654   </sect2>
 655
 656    <sect2>
 657     <title>Base Types in C-Language Functions</title>
 658
 659     <para>
 660      <xref linkend="xfunc-c-type-table"> gives the C type required for
 661      parameters in the C functions that will be loaded into <productname>PostgreSQL</productname>.
 662      The <quote>Defined In</quote> column gives the header file that
 663      needs to be included to get the type definition.  (The actual
 664      definition may be in a different file that is included by the
 665      listed file.  It is recommended that users stick to the defined
 666      interface.)  Note that you should always include
 667      <filename>postgres.h</filename> first in any source file, because
 668      it declares a number of things that you will need anyway.
 669     </para>
 670
 671      <table tocentry="1" id="xfunc-c-type-table">
 672       <title>Equivalent C Types
 673        for Built-In <productname>PostgreSQL</productname> Types</title>
 674       <titleabbrev>Equivalent C Types</titleabbrev>
 675       <tgroup cols="3">
 676        <thead>
 677         <row>
 678          <entry>
 679           SQL Type
 680          </entry>
 681          <entry>
 682           C Type
 683          </entry>
 684          <entry>
 685           Defined In
 686          </entry>
 687         </row>
 688        </thead>
 689        <tbody>
 690         <row>
 691          <entry><type>abstime</type></entry>
 692          <entry><type>AbsoluteTime</type></entry>
 693          <entry><filename>utils/nabstime.h</filename></entry>
 694         </row>
 695         <row>
 696          <entry><type>boolean</type></entry>
 697          <entry><type>bool</type></entry>
 698          <entry><filename>postgres.h</filename> (maybe compiler built-in)</entry>
 699         </row>
 700         <row>
 701          <entry><type>box</type></entry>
 702          <entry><type>BOX*</type></entry>
 703          <entry><filename>utils/geo_decls.h</filename></entry>
 704         </row>
 705         <row>
 706          <entry><type>bytea</type></entry>
 707          <entry><type>bytea*</type></entry>
 708          <entry><filename>postgres.h</filename></entry>
 709         </row>
 710         <row>
 711          <entry><type>"char"</type></entry>
 712          <entry><type>char</type></entry>
 713          <entry>(compiler built-in)</entry>
 714         </row>
 715         <row>
 716          <entry><type>character</type></entry>
 717          <entry><type>BpChar*</type></entry>
 718          <entry><filename>postgres.h</filename></entry>
 719         </row>
 720         <row>
 721          <entry><type>cid</type></entry>
 722          <entry><type>CommandId</type></entry>
 723          <entry><filename>postgres.h</filename></entry>
 724         </row>
 725         <row>
 726          <entry><type>date</type></entry>
 727          <entry><type>DateADT</type></entry>
 728          <entry><filename>utils/date.h</filename></entry>
 729         </row>
 730         <row>
 731          <entry><type>smallint</type> (<type>int2</type>)</entry>
 732          <entry><type>int2</type> or <type>int16</type></entry>
 733          <entry><filename>postgres.h</filename></entry>
 734         </row>
 735         <row>
 736          <entry><type>int2vector</type></entry>
 737          <entry><type>int2vector*</type></entry>
 738          <entry><filename>postgres.h</filename></entry>
 739         </row>
 740         <row>
 741          <entry><type>integer</type> (<type>int4</type>)</entry>
 742          <entry><type>int4</type> or <type>int32</type></entry>
 743          <entry><filename>postgres.h</filename></entry>
 744         </row>
 745         <row>
 746          <entry><type>real</type> (<type>float4</type>)</entry>
 747          <entry><type>float4*</type></entry>
 748          <entry><filename>postgres.h</filename></entry>
 749         </row>
 750         <row>
 751          <entry><type>double precision</type> (<type>float8</type>)</entry>
 752          <entry><type>float8*</type></entry>
 753          <entry><filename>postgres.h</filename></entry>
 754         </row>
 755         <row>
 756          <entry><type>interval</type></entry>
 757          <entry><type>Interval*</type></entry>
 758          <entry><filename>utils/timestamp.h</filename></entry>
 759         </row>
 760         <row>
 761          <entry><type>lseg</type></entry>
 762          <entry><type>LSEG*</type></entry>
 763          <entry><filename>utils/geo_decls.h</filename></entry>
 764         </row>
 765         <row>
 766          <entry><type>name</type></entry>
 767          <entry><type>Name</type></entry>
 768          <entry><filename>postgres.h</filename></entry>
 769         </row>
 770         <row>
 771          <entry><type>oid</type></entry>
 772          <entry><type>Oid</type></entry>
 773          <entry><filename>postgres.h</filename></entry>
 774         </row>
 775         <row>
 776          <entry><type>oidvector</type></entry>
 777          <entry><type>oidvector*</type></entry>
 778          <entry><filename>postgres.h</filename></entry>
 779         </row>
 780         <row>
 781          <entry><type>path</type></entry>
 782          <entry><type>PATH*</type></entry>
 783          <entry><filename>utils/geo_decls.h</filename></entry>
 784         </row>
 785         <row>
 786          <entry><type>point</type></entry>
 787          <entry><type>POINT*</type></entry>
 788          <entry><filename>utils/geo_decls.h</filename></entry>
 789         </row>
 790         <row>
 791          <entry><type>regproc</type></entry>
 792          <entry><type>regproc</type></entry>
 793          <entry><filename>postgres.h</filename></entry>
 794         </row>
 795         <row>
 796          <entry><type>reltime</type></entry>
 797          <entry><type>RelativeTime</type></entry>
 798          <entry><filename>utils/nabstime.h</filename></entry>
 799         </row>
 800         <row>
 801          <entry><type>text</type></entry>
 802          <entry><type>text*</type></entry>
 803          <entry><filename>postgres.h</filename></entry>
 804         </row>
 805         <row>
 806          <entry><type>tid</type></entry>
 807          <entry><type>ItemPointer</type></entry>
 808          <entry><filename>storage/itemptr.h</filename></entry>
 809         </row>
 810         <row>
 811          <entry><type>time</type></entry>
 812          <entry><type>TimeADT</type></entry>
 813          <entry><filename>utils/date.h</filename></entry>
 814         </row>
 815         <row>
 816          <entry><type>time with time zone</type></entry>
 817          <entry><type>TimeTzADT</type></entry>
 818          <entry><filename>utils/date.h</filename></entry>
 819         </row>
 820         <row>
 821          <entry><type>timestamp</type></entry>
 822          <entry><type>Timestamp*</type></entry>
 823          <entry><filename>utils/timestamp.h</filename></entry>
 824         </row>
 825         <row>
 826          <entry><type>tinterval</type></entry>
 827          <entry><type>TimeInterval</type></entry>
 828          <entry><filename>utils/nabstime.h</filename></entry>
 829         </row>
 830         <row>
 831          <entry><type>varchar</type></entry>
 832          <entry><type>VarChar*</type></entry>
 833          <entry><filename>postgres.h</filename></entry>
 834         </row>
 835         <row>
 836          <entry><type>xid</type></entry>
 837          <entry><type>TransactionId</type></entry>
 838          <entry><filename>postgres.h</filename></entry>
 839         </row>
 840        </tbody>
 841       </tgroup>
 842      </table>
 843
 844     <para>
 845      Internally, <productname>PostgreSQL</productname> regards a
 846      base type as a <quote>blob of memory</quote>.  The user-defined
 847      functions that you define over a type in turn define the
 848      way that <productname>PostgreSQL</productname> can operate
 849      on it.  That is, <productname>PostgreSQL</productname> will
 850      only store and retrieve the data from disk and use your
 851      user-defined functions to input, process, and output the data.
 852      Base types can have one of three internal formats:
 853
 854      <itemizedlist>
 855       <listitem>
 856        <para>
 857         pass by value, fixed-length
 858        </para>
 859       </listitem>
 860       <listitem>
 861        <para>
 862         pass by reference, fixed-length
 863        </para>
 864       </listitem>
 865       <listitem>
 866        <para>
 867         pass by reference, variable-length
 868        </para>
 869       </listitem>
 870      </itemizedlist>
 871     </para>
 872
 873     <para>
 874      By-value  types  can  only be 1, 2 or 4 bytes in length
 875      (also 8 bytes, if <literal>sizeof(Datum)</literal> is 8 on your machine).
 876      You should be careful
 877      to define your types such that  they  will  be  the  same
 878      size (in bytes) on all architectures.  For example, the
 879      <literal>long</literal> type is dangerous because  it
 880      is 4 bytes on some machines and 8 bytes on others, whereas
 881      the <type>int</type> type is 4 bytes on most
 882      Unix machines.  A reasonable implementation of
 883      the  <type>int4</type>  type  on  Unix
 884      machines might be:
 885
 886 <programlisting>
 887 /* 4-byte integer, passed by value */
 888 typedef int int4;
 889 </programlisting>
 890
 891      <productname>PostgreSQL</productname> automatically figures
 892      things out so that the integer types really have the size they
 893      advertise.
 894     </para>
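    <para>
     The size rules above can be checked mechanically.  The following is a
     minimal, self-contained sketch and is not part of
     <productname>PostgreSQL</productname>: the names
     <literal>my_int4</literal> and <literal>fits_by_value</literal> are
     hypothetical, and a C99 compiler providing
     <filename>stdint.h</filename> is assumed.
    </para>

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 4-byte, pass-by-value type built on a fixed-width integer. */
typedef int32_t my_int4;

/* Compile-time check: the array size is negative, and compilation fails,
 * if my_int4 is not exactly 32 bits wide. */
typedef char my_int4_must_be_32_bits[(sizeof(my_int4) * CHAR_BIT == 32) ? 1 : -1];

/*
 * Returns nonzero if a type of type_size bytes could be passed by value
 * under the rule in the text: 1, 2 or 4 bytes always, and 8 bytes only
 * when the datum-sized word (datum_size) is itself 8 bytes wide.
 */
static int
fits_by_value(size_t type_size, size_t datum_size)
{
    if (type_size == 1 || type_size == 2 || type_size == 4)
        return 1;
    return type_size == 8 && datum_size == 8;
}
```

    <para>
     A type such as <literal>long</literal> would fail a check of this
     kind on some platforms and pass on others, which is why a fixed-width
     type is used in the sketch.
    </para>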
 895
 896     <para>
 897      On  the  other hand, fixed-length types of any size may
 898      be passed by-reference.  For example, here is a  sample
 899      implementation of a <productname>PostgreSQL</productname> type:
 900
 901 <programlisting>
 902 /* 16-byte structure, passed by reference */
 903 typedef struct
 904 {
 905     double  x, y;
 906 } Point;
 907 </programlisting>
 908     </para>
 909
 910     <para>
 911      Only  pointers  to  such types can be used when passing
 912      them in and out of <productname>Postgres</productname> functions.
 913      To return a value of such a type, allocate the right amount of
 914      memory with <literal>palloc()</literal>, fill in the allocated memory,
 915      and return a pointer to it.  (Alternatively, you can return an input
 916      value of the same type by returning its pointer.  <emphasis>Never</>
 917      modify the contents of a pass-by-reference input value, however.)
 918     </para>
 919
 920     <para>
 921      Finally, all variable-length types must also be  passed
 922      by  reference.   All  variable-length  types must begin
 923      with a length field of exactly 4 bytes, and all data to
 924      be  stored within that type must be located in the memory
 925      immediately  following  that  length  field.   The
 926      length  field  is  the  total  length  of the structure
 927      (i.e.,  it  includes  the  size  of  the  length  field
 928      itself).  We can define the text type as follows:
 929
 930 <programlisting>
 931 typedef struct {
 932     int4 length;
 933     char data[1];
 934 } text;
 935 </programlisting>
 936     </para>
 937
 938     <para>
 939      Obviously, the data field shown here is not long enough to hold
 940      all possible strings; a structure whose size varies at run time
 941      cannot be declared in <acronym>C</acronym>.  When manipulating
 942      variable-length types, we must  be  careful  to  allocate
 943      the  correct amount  of memory and initialize the length field.
 944      For example, if we wanted to  store  40  bytes  in  a  text
 945      structure, we might use a code fragment like this:
 946
 947 <programlisting>
 948 #include "postgres.h"
 949 ...
 950 char buffer[40]; /* our source data */
 951 ...
 952 text *destination = (text *) palloc(VARHDRSZ + 40);
 953 destination-&gt;length = VARHDRSZ + 40;
 954 memmove(destination-&gt;data, buffer, 40);
 955 ...
 956 </programlisting>
 957     </para>
 958
 959     <para>
 960      Now that we've gone over all of the possible structures
 961      for base types, we can show some examples of real functions.
 962     </para>
 963    </sect2>
 964
 965    <sect2>
 966     <title>Version-0 Calling Conventions for C-Language Functions</title>
 967
 968     <para>
 969      We present the <quote>old style</quote> calling convention first.  Although
 970      this approach is now deprecated, it is easier to get a handle on
 971      initially.  In the version-0 method, the arguments and result
 972      of the C function are declared in normal C style, taking
 973      care to use the C representation of each SQL data type as shown
 974      above.
 975     </para>
 976
 977     <para>
 978      Here are some examples:
 979
 980 <programlisting>
 981 #include "postgres.h"
 982 #include &lt;string.h&gt;
 983 #include "utils/geo_decls.h"    /* declares Point, used by makepoint() */
 984 /* By Value */
 985
 986 int
 987 add_one(int arg)
 988 {
 989     return arg + 1;
 990 }
 991
 992 /* By Reference, Fixed Length */
 993
 994 float8 *
 995 add_one_float8(float8 *arg)
 996 {
 997     float8    *result = (float8 *) palloc(sizeof(float8));
 998
 999     *result = *arg + 1.0;
1000
1001     return result;
1002 }
1003
1004 Point *
1005 makepoint(Point *pointx, Point *pointy)
1006 {
1007     Point     *new_point = (Point *) palloc(sizeof(Point));
1008
1009     new_point->x = pointx->x;
1010     new_point->y = pointy->y;
1011
1012     return new_point;
1013 }
1014
1015 /* By Reference, Variable Length */
1016
1017 text *
1018 copytext(text *t)
1019 {
1020     /*
1021      * VARSIZE is the total size of the struct in bytes.
1022      */
1023     text *new_t = (text *) palloc(VARSIZE(t));
1024     VARATT_SIZEP(new_t) = VARSIZE(t);
1025     /*
1026      * VARDATA is a pointer to the data region of the struct.
1027      */
1028     memcpy((void *) VARDATA(new_t), /* destination */
1029            (void *) VARDATA(t),     /* source */
1030            VARSIZE(t)-VARHDRSZ);    /* how many bytes */
1031     return new_t;
1032 }
1033
1034 text *
1035 concat_text(text *arg1, text *arg2)
1036 {
1037     int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
1038     text *new_text = (text *) palloc(new_text_size);
1039
1040     VARATT_SIZEP(new_text) = new_text_size;
1041     memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1)-VARHDRSZ);
1042     memcpy(VARDATA(new_text) + (VARSIZE(arg1)-VARHDRSZ),
1043            VARDATA(arg2), VARSIZE(arg2)-VARHDRSZ);
1044     return new_text;
1045 }
1046 </programlisting>
1047     </para>
1048
1049     <para>
1050      Supposing that the above code has been prepared in file
1051      <filename>funcs.c</filename> and compiled into a shared object,
1052      we could define the functions to <productname>Postgres</productname>
1053      with commands like this:
1054
1055 <programlisting>
1056 CREATE FUNCTION add_one(int4) RETURNS int4
1057      AS '<replaceable>PGROOT</replaceable>/tutorial/funcs' LANGUAGE 'c'
1058      WITH (isStrict);
1059
1060 -- note overloading of SQL function name add_one()
1061 CREATE FUNCTION add_one(float8) RETURNS float8
1062      AS '<replaceable>PGROOT</replaceable>/tutorial/funcs',
1063         'add_one_float8'
1064      LANGUAGE 'c' WITH (isStrict);
1065
1066 CREATE FUNCTION makepoint(point, point) RETURNS point
1067      AS '<replaceable>PGROOT</replaceable>/tutorial/funcs' LANGUAGE 'c'
1068      WITH (isStrict);
1069
1070 CREATE FUNCTION copytext(text) RETURNS text
1071      AS '<replaceable>PGROOT</replaceable>/tutorial/funcs' LANGUAGE 'c'
1072      WITH (isStrict);
1073
1074 CREATE FUNCTION concat_text(text, text) RETURNS text
1075      AS '<replaceable>PGROOT</replaceable>/tutorial/funcs' LANGUAGE 'c'
1076      WITH (isStrict);
1077 </programlisting>
1078     </para>
1079
1080     <para>
1081      Here <replaceable>PGROOT</replaceable> stands for the full path to
1082      the <productname>Postgres</productname> source tree. (Better style would
1083      be to use just <literal>'funcs'</> in the <literal>AS</> clause,
1084      after having added <replaceable>PGROOT</replaceable><literal>/tutorial</>
1085      to the search path.  In any case, we may omit the system-specific
1086      extension for a shared library, commonly <literal>.so</literal> or
1087      <literal>.sl</literal>.)
1088     </para>
1089
1090     <para>
1091      Notice that we have specified the functions as <quote>strict</quote>,
1092      meaning that
1093      the system should automatically assume a NULL result if any input
1094      value is NULL.  By doing this, we avoid having to check for NULL inputs
1095      in the function code.  Without this, we'd have to check for NULLs
1096      explicitly, for example by checking for a null pointer for each
1097      pass-by-reference argument.  (For pass-by-value arguments, we don't
1098      even have a way to check!)
1099     </para>
1100
1101     <para>
1102      Although this calling convention is simple to use,
1103      it is not very portable; on some architectures there are problems
1104      with passing smaller-than-int data types this way.  Also, there is
1105      no simple way to return a NULL result, nor to cope with NULL arguments
1106      in any way other than making the function strict.  The version-1
1107      convention, presented next, overcomes these objections.
1108     </para>
1109    </sect2>
1110
1111    <sect2>
1112     <title>Version-1 Calling Conventions for C-Language Functions</title>
1113
1114     <para>
1115      The version-1 calling convention relies on macros to suppress most
1116      of the complexity of passing arguments and results.  The C declaration
1117      of a version-1 function is always
1118 <programlisting>
1119 Datum funcname(PG_FUNCTION_ARGS)
1120 </programlisting>
1121      In addition, the macro call
1122 <programlisting>
1123 PG_FUNCTION_INFO_V1(funcname);
1124 </programlisting>
1125      must appear in the same source file (conventionally it's written
1126      just before the function itself).  This macro call is not needed
1127      for <literal>internal</>-language functions, since Postgres currently
1128      assumes all internal functions are version-1.  However, it is
1129      <emphasis>required</emphasis> for dynamically-loaded functions.
1130     </para>
1131
1132     <para>
1133      In a version-1 function, each actual argument is fetched using a
1134      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
1135      macro that corresponds to the argument's datatype, and the result
1136      is returned using a
1137      <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
1138      macro for the return type.
1139     </para>
1140
1141     <para>
1142      Here we show the same functions as above, coded in version-1 style:
1143
1144 <programlisting>
1145 #include "postgres.h"
1146 #include &lt;string.h&gt;
1147 #include "fmgr.h"
1148 #include "utils/geo_decls.h"    /* declares Point and PG_GETARG_POINT_P() */
1149 /* By Value */
1150
1151 PG_FUNCTION_INFO_V1(add_one);
1152
1153 Datum
1154 add_one(PG_FUNCTION_ARGS)
1155 {
1156     int32   arg = PG_GETARG_INT32(0);
1157
1158     PG_RETURN_INT32(arg + 1);
1159 }
1160
1161 /* By Reference, Fixed Length */
1162
1163 PG_FUNCTION_INFO_V1(add_one_float8);
1164
1165 Datum
1166 add_one_float8(PG_FUNCTION_ARGS)
1167 {
1168     /* The macros for FLOAT8 hide its pass-by-reference nature */
1169     float8   arg = PG_GETARG_FLOAT8(0);
1170
1171     PG_RETURN_FLOAT8(arg + 1.0);
1172 }
1173
1174 PG_FUNCTION_INFO_V1(makepoint);
1175
1176 Datum
1177 makepoint(PG_FUNCTION_ARGS)
1178 {
1179     /* Here, the pass-by-reference nature of Point is not hidden */
1180     Point     *pointx = PG_GETARG_POINT_P(0);
1181     Point     *pointy = PG_GETARG_POINT_P(1);
1182     Point     *new_point = (Point *) palloc(sizeof(Point));
1183
1184     new_point->x = pointx->x;
1185     new_point->y = pointy->y;
1186
1187     PG_RETURN_POINT_P(new_point);
1188 }
1189
1190 /* By Reference, Variable Length */
1191
1192 PG_FUNCTION_INFO_V1(copytext);
1193
1194 Datum
1195 copytext(PG_FUNCTION_ARGS)
1196 {
1197     text     *t = PG_GETARG_TEXT_P(0);
1198     /*
1199      * VARSIZE is the total size of the struct in bytes.
1200      */
1201     text     *new_t = (text *) palloc(VARSIZE(t));
1202     VARATT_SIZEP(new_t) = VARSIZE(t);
1203     /*
1204      * VARDATA is a pointer to the data region of the struct.
1205      */
1206     memcpy((void *) VARDATA(new_t), /* destination */
1207            (void *) VARDATA(t),     /* source */
1208            VARSIZE(t)-VARHDRSZ);    /* how many bytes */
1209     PG_RETURN_TEXT_P(new_t);
1210 }
1211
1212 PG_FUNCTION_INFO_V1(concat_text);
1213
1214 Datum
1215 concat_text(PG_FUNCTION_ARGS)
1216 {
1217     text  *arg1 = PG_GETARG_TEXT_P(0);
1218     text  *arg2 = PG_GETARG_TEXT_P(1);
1219     int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
1220     text *new_text = (text *) palloc(new_text_size);
1221
1222     VARATT_SIZEP(new_text) = new_text_size;
1223     memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1)-VARHDRSZ);
1224     memcpy(VARDATA(new_text) + (VARSIZE(arg1)-VARHDRSZ),
1225            VARDATA(arg2), VARSIZE(arg2)-VARHDRSZ);
1226     PG_RETURN_TEXT_P(new_text);
1227 }
1228 </programlisting>
1229     </para>
1230
1231     <para>
1232      The <command>CREATE FUNCTION</command> commands are the same as
1233      for the version-0 equivalents.
1234     </para>
1235
1236     <para>
1237      At first glance, the version-1 coding conventions may appear to
1238      be just pointless obscurantism.  However, they do offer a number
1239      of improvements, because the macros can hide unnecessary detail.
1240      An example is that in coding <function>add_one_float8</function>, we no longer need to
1241      be aware that <type>float8</type> is a pass-by-reference type.  Another
1242      example is that the GETARG macros for variable-length types hide
1243      the need to deal with fetching <quote>toasted</quote> (compressed or
1244      out-of-line) values.  The old-style <function>copytext</function>
1245      and <function>concat_text</function> functions shown above are
1246      actually wrong in the presence of toasted values, because they
1247      don't call <function>pg_detoast_datum()</function> on their
1248      inputs.  (The handler for old-style dynamically-loaded functions
1249      currently takes care of this detail, but it does so less
1250      efficiently than is possible for a version-1 function.)
1251     </para>
1252
1253     <para>
1254      One big improvement in version-1 functions is better handling of NULL
1255      inputs and results.  The macro <function>PG_ARGISNULL(n)</function>
1256      allows a function to test whether each input is NULL (of course, doing
1257      this is only necessary in functions not declared <quote>strict</>).
1258      As with the
1259      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> macros,
1260      the input arguments are counted beginning at zero.  Note that one
1261      should refrain from executing
1262      <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> until
1263      one has verified that the argument isn't NULL.
1264      To return a NULL result, execute <function>PG_RETURN_NULL()</function>;
1265      this works in both strict and non-strict functions.
1266     </para>
1267
1268     <para>
1269      The version-1 function call conventions make it possible to
1270      return <quote>set</quote> results and implement trigger functions and
1271      procedural-language call handlers.  Version-1 code is also more
1272      portable than version-0, because it does not break ANSI C restrictions
1273      on function call protocol.  For more details see
1274      <filename>src/backend/utils/fmgr/README</filename> in the source
1275      distribution.
1276     </para>
1277    </sect2>
1278
1279    <sect2>
1280     <title>Composite Types in C-Language Functions</title>
1281
1282     <para>
1283      Composite types do not  have  a  fixed  layout  like  C
1284      structures.   Instances of a composite type may contain
1285      null fields.  In addition,  composite  types  that  are
1286      part  of  an  inheritance  hierarchy may have different
1287      fields than other members of the same inheritance hierarchy.
1288      Therefore,  <productname>Postgres</productname>  provides
1289      a  procedural interface for accessing fields of composite types
1290      from C.  As <productname>Postgres</productname> processes
1291      a set of rows, each row will be passed into your
1292      function as a pointer to an opaque structure of type <type>TupleTableSlot</type>.
1293      Suppose we want to write a function to answer the query
1294
1295 <programlisting>
1296 SELECT name, c_overpaid(emp, 1500) AS overpaid
1297 FROM emp
1298 WHERE name = 'Bill' OR name = 'Sam';
1299 </programlisting>
1300
1301      For the query above, we can define <function>c_overpaid</function> as:
1302
1303 <programlisting>
1304 #include "postgres.h"
1305 #include "executor/executor.h"  /* for GetAttributeByName() */
1306
1307 bool
1308 c_overpaid(TupleTableSlot *t, /* the current row of EMP */
1309            int32 limit)
1310 {
1311     bool isnull;
1312     int32 salary;
1313
1314     salary = DatumGetInt32(GetAttributeByName(t, "salary", &amp;isnull));
1315     if (isnull)
1316         return (false);
1317     return salary &gt; limit;
1318 }
1319
1320 /* In version-1 coding (which also needs #include "fmgr.h"), the above would look like this: */
1321
1322 PG_FUNCTION_INFO_V1(c_overpaid);
1323
1324 Datum
1325 c_overpaid(PG_FUNCTION_ARGS)
1326 {
1327     TupleTableSlot  *t = (TupleTableSlot *) PG_GETARG_POINTER(0);
1328     int32            limit = PG_GETARG_INT32(1);
1329     bool isnull;
1330     int32 salary;
1331
1332     salary = DatumGetInt32(GetAttributeByName(t, "salary", &amp;isnull));
1333     if (isnull)
1334         PG_RETURN_BOOL(false);
1335     /* Alternatively, we might prefer to do PG_RETURN_NULL() for null salary */
1336
1337     PG_RETURN_BOOL(salary &gt; limit);
1338 }
1339 </programlisting>
1340     </para>
1341
1342     <para>
1343      <function>GetAttributeByName</function> is the
1344      <productname>Postgres</productname> system function that
1345      returns attributes out of the current row.  It has
1346      three arguments: the argument of type <type>TupleTableSlot*</type> passed into
1347      the  function, the name of the desired attribute, and a
1348      return parameter that tells whether  the  attribute
1349      is  null.   <function>GetAttributeByName</function> returns a Datum
1350      value that you can convert to the proper datatype by using the
1351      appropriate <function>DatumGet<replaceable>XXX</replaceable>()</function> macro.
1352     </para>
1353
1354     <para>
1355      The following command lets <productname>Postgres</productname>
1356      know about the <function>c_overpaid</function> function:
1357
1358 <programlisting>
1359 CREATE FUNCTION c_overpaid(emp, int4)
1360 RETURNS bool
1361 AS '<replaceable>PGROOT</replaceable>/tutorial/obj/funcs'
1362 LANGUAGE 'c';
1363 </programlisting>
1364     </para>
1365
1366     <para>
1367      While there are ways to construct new rows or modify
1368      existing rows from within a C function, these
1369      are far too complex to discuss in this manual.
1370      Consult the backend source code for examples.
1371     </para>
1372    </sect2>
1373
1374    <sect2>
1375     <title>Writing Code</title>
1376
1377     <para>
1378      We now turn to the more difficult task of writing
1379      programming  language  functions.  Be warned: this section
1380      of the manual will not make you a programmer.  You must
1381      have  a  good  understanding of <acronym>C</acronym>
1382      (including the use of pointers and the malloc memory manager)
1383      before  trying to write <acronym>C</acronym> functions for
1384      use with <productname>Postgres</productname>. While  it may
1385      be possible to load functions written in languages other
1386      than <acronym>C</acronym> into  <productname>Postgres</productname>,
1387      this  is  often difficult  (when  it  is possible at all)
1388      because other languages, such as <acronym>FORTRAN</acronym>
1389      and <acronym>Pascal</acronym>, often do not follow the same
1390      <firstterm>calling convention</firstterm>
1391      as <acronym>C</acronym>.  That is, other
1392      languages  do  not  pass  argument  and  return  values
1393      between functions in the same way.  For this reason, we
1394      will assume that your  programming  language  functions
1395      are written in <acronym>C</acronym>.
1396     </para>
1397
1398     <para>
1399      The  basic  rules  for building <acronym>C</acronym> functions
1400      are as follows:
1401
1402      <itemizedlist>
1403       <listitem>
1404        <para>
1405         Use <literal>pg_config --includedir-server</literal> to find
1406         out where the PostgreSQL server header files are installed on
1407         your system (or the system that your users will be running
1408         on).  This option is new with PostgreSQL 7.2.  For PostgreSQL
1409         7.1 you should use the option <option>--includedir</option>.
1410         (<command>pg_config</command> will exit with a non-zero status
1411         if it encounters an unknown option.)  For releases prior to
1412         7.1 you will have to guess, but since that was before the
1413         current calling conventions were introduced, it is unlikely
1414         that you want to support those releases.
1415        </para>
1416       </listitem>
1417
1418       <listitem>
1419        <para>
1420         When allocating memory, use the
1421         <productname>Postgres</productname> routines
1422         <function>palloc</function> and <function>pfree</function>
1423         instead of the corresponding <acronym>C</acronym> library
1424         routines <function>malloc</function> and
1425         <function>free</function>.  The memory allocated by
1426         <function>palloc</function> will be freed automatically at the
1427         end of each transaction, preventing memory leaks.
1428        </para>
1429       </listitem>
1430
1431       <listitem>
1432        <para>
1433         Always zero the bytes of your structures using
1434         <function>memset</function> or <function>bzero</function>.
1435         Several routines (such as the hash access method, hash join
1436         and the sort algorithm) compute functions of the raw bits
1437         contained in your structure.  Even if you initialize all
1438         fields of your structure, there may be several bytes of
1439         alignment padding (holes in the structure) that may contain
1440         garbage values.
1441        </para>
1442       </listitem>
1443
1444       <listitem>
1445        <para>
1446         Most of the internal <productname>Postgres</productname> types
1447         are declared in <filename>postgres.h</filename>, while the function
1448         manager interfaces (<symbol>PG_FUNCTION_ARGS</symbol>, etc.)
1449         are in <filename>fmgr.h</filename>, so you will need to
1450         include at least these two files.  For portability reasons it's best
1451         to include <filename>postgres.h</filename> <emphasis>first</>,
1452         before any other system or user header files.
1453         Including <filename>postgres.h</filename> will also include
1454         <filename>elog.h</filename> and <filename>palloc.h</filename>
1455         for you.
1456        </para>
1457       </listitem>
1458
1459       <listitem>
1460        <para>
1461         Symbol names defined within object files must not conflict
1462         with each other or with symbols defined in the
1463         <productname>PostgreSQL</productname> server executable.  You
1464         will have to rename your functions or variables if you get
1465         error messages to this effect.
1466        </para>
1467       </listitem>
1468
1469       <listitem>
1470        <para>
1471         Compiling and linking your object code  so  that
1472         it  can  be  dynamically  loaded  into
1473         <productname>Postgres</productname>
1474         always requires special flags.
1475         See <xref linkend="dfunc">
1476         for  a  detailed explanation of how to do it for
1477         your particular operating system.
1478        </para>
1479       </listitem>
1480      </itemizedlist>
1481     </para>
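    <para>
     The rule about zeroing structures can be illustrated outside the
     server.  The following is a minimal sketch using only the standard C
     library; the structure <literal>padded_pair</literal> and both
     helpers are hypothetical names.  It assumes, as is true of common
     compilers in practice, that storing into a member leaves the padding
     bytes untouched.
    </para>

```c
#include <string.h>

/* Hypothetical struct; most ABIs insert padding between tag and value. */
typedef struct
{
    char   tag;
    double value;
} padded_pair;

/* Zero the whole struct first so that padding bytes are deterministic,
 * then fill in the real fields. */
static void
padded_pair_init(padded_pair *p, char tag, double value)
{
    memset(p, 0, sizeof(*p));
    p->tag = tag;
    p->value = value;
}

/* Raw-byte comparison, in the spirit of what a hash or sort routine
 * might do with the bits of a stored value. */
static int
padded_pair_same_bits(const padded_pair *a, const padded_pair *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```

    <para>
     Without the <function>memset</function> call, two structures holding
     the same field values could still differ in their padding bytes, and
     a raw-byte comparison or hash would then treat them as different.
    </para>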
1482    </sect2>
1483
1484 &dfunc;
1485
1486   </sect1>
1487
1488   <sect1 id="xfunc-overload">
1489    <title>Function Overloading</title>
1490
1491    <para>
1492     More than one function may be defined with the same SQL name, so long
1493     as the arguments they take are different.  In other words,
1494     function names can be <firstterm>overloaded</firstterm>.  When a
1495     query is executed, the server will determine which function to
1496     call from the data types and the number of the provided arguments.
1497     Overloading can also be used to simulate functions with a variable
1498     number of arguments, up to a finite maximum number.
1499    </para>
1500
1501    <para>
1502     A function may also have the same name as an attribute.  In the case
1503     that there is an ambiguity between a function on a composite type and
1504     an attribute of the composite type, the attribute will always be used.
1505    </para>
1506
1507    <para>
1508     When creating a family of overloaded functions, one should be
1509     careful not to create ambiguities.  For instance, given the
1510     functions
1511 <programlisting>
1512 CREATE FUNCTION test(int, real) RETURNS ...
1513 CREATE FUNCTION test(smallint, double precision) RETURNS ...
1514 </programlisting>
1515     it is not immediately clear which function would be called with
1516     some trivial input like <literal>test(1, 1.5)</literal>.  The
1517     currently implemented resolution rules are described in the
1518     <citetitle>User's Guide</citetitle>, but it is unwise to design a
1519     system that subtly relies on this behavior.
1520    </para>
1521
1522    <para>
1523     When overloading C language functions, there is an additional
1524     constraint: The C name of each function in the family of
1525     overloaded functions must be different from the C names of all
1526     other functions, either internal or dynamically loaded.  If this
1527     rule is violated, the behavior is not portable.  You might get a
1528     run-time linker error, or one of the functions will get called
1529     (usually the internal one).  The alternative form of the
1530     <literal>AS</> clause for the SQL <command>CREATE
1531     FUNCTION</command> command decouples the SQL function name from
1532     the function name in the C source code.  E.g.,
1533 <programlisting>
1534 CREATE FUNCTION test(int) RETURNS int
1535     AS '<replaceable>filename</>', 'test_1arg'
1536     LANGUAGE C;
1537 CREATE FUNCTION test(int, int) RETURNS int
1538     AS '<replaceable>filename</>', 'test_2arg'
1539     LANGUAGE C;
1540 </programlisting>
1541     The names of the C functions here reflect one of many possible conventions.
1542    </para>
1543
1544    <para>
1545     Prior to <productname>PostgreSQL</productname> 7.0, this
1546     alternative syntax did not exist.  There is a trick to get around
1547     the problem, by defining a set of C functions with different names
1548     and then defining a set of identically-named SQL function wrappers
1549     that take the appropriate argument types and call the matching C
1550     function.
1551    </para>
1552   </sect1>
1553
1554
1555   <sect1 id="xfunc-plhandler">
1556    <title>Procedural Language Handlers</title>
1557
1558    <para>
1559     All calls to functions that are not written for
1560     the current <quote>version 1</quote> interface for compiled
1561     languages (this includes functions in user-defined procedural languages,
1562     functions written in SQL, and functions using the version 0 compiled
1563     language interface) go through a <firstterm>call handler</firstterm>
1564     function for the specific language.  It is the responsibility of
1565     the call handler to execute the function in a meaningful way, such
1566     as by interpreting the supplied source text.  This section
1567     describes how a language call handler can be written.  This is not
1568     a common task; in fact, it has only been done a handful of times
1569     in the history of <productname>PostgreSQL</productname>, but the
1570     topic naturally belongs in this chapter, and the material might
1571     give some insight into the extensible nature of the
1572     <productname>PostgreSQL</productname> system.
1573    </para>
1574
1575    <para>
1576     The call handler for a procedural language is a
1577     <quote>normal</quote> function, which must be written in a
1578     compiled language such as C and registered with
1579     <productname>PostgreSQL</productname> as taking no arguments and
1580     returning the <type>opaque</type> type, a placeholder for
1581     unspecified or undefined types.  This prevents the call handler
1582     from being called directly as a function from queries.  (However,
1583     arguments may be supplied in the actual call to the handler when a
1584     function in the language offered by the handler is to be
1585     executed.)
1586    </para>
1587
1588    <note>
1589     <para>
1590      In <productname>PostgreSQL</productname> 7.1 and later, call
1591      handlers must adhere to the <quote>version 1</quote> function
1592      manager interface, not the old-style interface.
1593     </para>
1594    </note>
1595
1596    <para>
1597     The call handler is called in the same way as any other function:
1598     It receives a pointer to a
1599     <structname>FunctionCallInfoData</structname> struct containing
1600     argument values and information about the called function, and it
1601     is expected to return a <type>Datum</type> result (and possibly
1602     set the <structfield>isnull</structfield> field of the
1603     <structname>FunctionCallInfoData</structname> struct, if it wishes
1604     to return an SQL NULL result).  The difference between a call
1605     handler and an ordinary callee function is that the
1606     <structfield>flinfo-&gt;fn_oid</structfield> field of the
1607     <structname>FunctionCallInfoData</structname> struct will contain
1608     the OID of the actual function to be called, not of the call
1609     handler itself.  The call handler must use this field to determine
1610     which function to execute.  Also, the passed argument list has
1611     been set up according to the declaration of the target function,
1612     not of the call handler.
1613    </para>
1614
1615    <para>
1616     It's up to the call handler to fetch the
1617     <classname>pg_proc</classname> entry and to analyze the argument
1618     and return types of the called procedure. The AS clause from the
1619     <command>CREATE FUNCTION</command> of the procedure will be found
1620     in the <literal>prosrc</literal> attribute of the
1621     <classname>pg_proc</classname> table entry. This may be the source
1622     text in the procedural language itself (as for PL/Tcl), a
1623     path name to a file, or anything else that tells the call handler
1624     what to do in detail.
1625    </para>
1626
   <para>
    Often, the same function is called many times per SQL statement.
    A call handler can avoid repeated lookups of information about the
    called function by using the
    <structfield>flinfo-&gt;fn_extra</structfield> field.  This will
    initially be NULL, but can be set by the call handler to point at
    information about the PL function.  On subsequent calls, if
    <structfield>flinfo-&gt;fn_extra</structfield> is already non-NULL
    then it can be used and the information lookup step skipped.  The
    call handler must be careful that
    <structfield>flinfo-&gt;fn_extra</structfield> is made to point at
    memory that will live at least until the end of the current query,
    since an <structname>FmgrInfo</structname> data structure could be
    kept that long.  One way to do this is to allocate the extra data
    in the memory context specified by
    <structfield>flinfo-&gt;fn_mcxt</structfield>; such data will
    normally have the same lifespan as the
    <structname>FmgrInfo</structname> itself.  But the handler could
    also choose to use a longer-lived context so that it can cache
    function definition information across queries.
   </para>

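   <para>
    The caching pattern built on
    <structfield>flinfo-&gt;fn_extra</structfield> might look roughly
    like the following self-contained sketch.  The
    <structname>FmgrInfo</structname> struct shown is a simplified
    stand-in, not the real definition, and
    <structname>PlSampleFunc</structname>,
    <function>plsample_lookup</function>, and
    <function>plsample_get_func</function> are hypothetical names.
    Plain <function>malloc</function> is used only to keep the sketch
    free of PostgreSQL headers; a real handler would allocate in the
    memory context given by
    <structfield>flinfo-&gt;fn_mcxt</structfield>, as described above.
   </para>

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Simplified stand-in for PostgreSQL's FmgrInfo.  Only the two fields
 * discussed in the text are modelled; this is NOT the real definition.
 */
typedef unsigned int Oid;

typedef struct FmgrInfo
{
    Oid         fn_oid;     /* OID of the function actually being called */
    void       *fn_extra;   /* handler-private data, initially NULL */
} FmgrInfo;

/* Hypothetical per-function information kept by the sample handler. */
typedef struct PlSampleFunc
{
    Oid         fn_oid;     /* function this entry describes */
} PlSampleFunc;

/* Counts how often the (notionally expensive) lookup really runs. */
int         lookup_count = 0;

/*
 * Hypothetical stand-in for fetching and analyzing the pg_proc entry.
 * malloc is an assumption made for this sketch; a real handler would
 * allocate in the context named by flinfo->fn_mcxt instead.
 */
PlSampleFunc *
plsample_lookup(Oid fn_oid)
{
    PlSampleFunc *func = malloc(sizeof(PlSampleFunc));

    assert(func != NULL);
    func->fn_oid = fn_oid;
    lookup_count++;
    return func;
}

/* Return the cached information, performing the lookup only once. */
PlSampleFunc *
plsample_get_func(FmgrInfo *flinfo)
{
    if (flinfo->fn_extra == NULL)
        flinfo->fn_extra = plsample_lookup(flinfo->fn_oid);

    return (PlSampleFunc *) flinfo->fn_extra;
}
```

   <para>
    Calling <function>plsample_get_func</function> repeatedly with the
    same <structname>FmgrInfo</structname> performs the lookup only on
    the first call; later calls return the pointer already stored in
    <structfield>fn_extra</structfield>.
   </para>
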
   <para>
    When a PL function is invoked as a trigger, no explicit arguments
    are passed, but the
    <structname>FunctionCallInfoData</structname>'s
    <structfield>context</structfield> field points at a
    <structname>TriggerData</structname> node, rather than being NULL
    as it is in a plain function call.  A language handler should
    provide mechanisms for PL functions to get at the trigger
    information.
   </para>

   <para>
    This is a template for a PL handler written in C:
<programlisting>
#include "postgres.h"
#include "executor/spi.h"
#include "commands/trigger.h"
#include "utils/elog.h"
#include "fmgr.h"
#include "access/heapam.h"
#include "utils/syscache.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_type.h"

PG_FUNCTION_INFO_V1(plsample_call_handler);

Datum
plsample_call_handler(PG_FUNCTION_ARGS)
{
    Datum          retval;

    if (CALLED_AS_TRIGGER(fcinfo))
    {
        /*
         * Called as a trigger procedure
         */
        TriggerData    *trigdata = (TriggerData *) fcinfo->context;

        retval = ...
    }
    else
    {
        /*
         * Called as a function
         */

        retval = ...
    }

    return retval;
}
</programlisting>
   </para>

   <para>
    Only a few thousand lines of code have to be added in place of the
    dots to complete the call handler.  See <xref linkend="xfunc-c">
    for information on how to compile it into a loadable module.
   </para>

   <para>
    The following commands then register the sample procedural
    language:
<programlisting>
CREATE FUNCTION plsample_call_handler () RETURNS opaque
    AS '/usr/local/pgsql/lib/plsample'
    LANGUAGE C;
CREATE LANGUAGE plsample
    HANDLER plsample_call_handler;
</programlisting>
   </para>
  </sect1>
 </chapter>

<!-- Keep this comment at the end of the file
Local variables:
mode:sgml
sgml-omittag:nil
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:"./reference.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:("/usr/lib/sgml/catalog")
sgml-local-ecat-files:nil
End:
-->