-<!-- $PostgreSQL: pgsql/doc/src/sgml/plpgsql.sgml,v 1.146 2009/11/10 14:22:45 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/plpgsql.sgml,v 1.147 2009/11/13 22:43:39 tgl Exp $ -->
<chapter id="plpgsql">
<title><application>PL/pgSQL</application> - <acronym>SQL</acronym> Procedural Language</title>
</para>
<para>
- There are two types of comments in <application>PL/pgSQL</>. A double
- dash (<literal>--</literal>) starts a comment that extends to the end of
- the line. A <literal>/*</literal> starts a block comment that extends to
- the next occurrence of <literal>*/</literal>. Block comments nest,
- just as in ordinary SQL.
+ Comments work the same way in <application>PL/pgSQL</> code as in
+ ordinary SQL. A double dash (<literal>--</literal>) starts a comment
+ that extends to the end of the line. A <literal>/*</literal> starts a
+ block comment that extends to the matching occurrence of
+ <literal>*/</literal>. Block comments nest.
</para>
<para>
to the variable when the block is entered. If the <literal>DEFAULT</> clause
is not given then the variable is initialized to the
<acronym>SQL</acronym> null value.
- The <literal>CONSTANT</> option prevents the variable from being assigned to,
- so that its value remains constant for the duration of the block.
+ The <literal>CONSTANT</> option prevents the variable from being
+ assigned to, so that its value will remain constant for the duration of
+ the block.
If <literal>NOT NULL</>
is specified, an assignment of a null value results in a run-time
error. All variables declared as <literal>NOT NULL</>
<programlisting>
IF x < y THEN ...
</programlisting>
- what happens behind the scenes is
+ what happens behind the scenes is equivalent to
<programlisting>
PREPARE <replaceable>statement_name</>(integer, integer) AS SELECT $1 < $2;
</programlisting>
<para>
An assignment of a value to a <application>PL/pgSQL</application>
- variable or row/record field is written as:
+ variable is written as:
<synopsis>
<replaceable>variable</replaceable> := <replaceable>expression</replaceable>;
</synopsis>
- As explained above, the expression in such a statement is evaluated
+ As explained previously, the expression in such a statement is evaluated
by means of an SQL <command>SELECT</> command sent to the main
- database engine. The expression must yield a single value.
+ database engine. The expression must yield a single value (possibly
+ a row value, if the variable is a row or record variable). The target
+ variable can be a simple variable (optionally qualified with a block
+ name), a field of a row or record variable, or an element of an array
+ that is a simple variable or field.
</para>
<para>
<para>
Any <application>PL/pgSQL</application> variable name appearing
- in the command text is replaced by a parameter symbol, and then the
+ in the command text is treated as a parameter, and then the
current value of the variable is provided as the parameter value
at run time. This is exactly like the processing described earlier
for expressions; for details see <xref linkend="plpgsql-var-subst">.
- As an example, if you write:
-<programlisting>
-DECLARE
- key TEXT;
- delta INTEGER;
-BEGIN
- ...
- UPDATE mytab SET val = val + delta WHERE id = key;
-</programlisting>
- the command text seen by the main SQL engine will look like:
-<programlisting>
- UPDATE mytab SET val = val + $1 WHERE id = $2;
-</programlisting>
- Although you don't normally have to think about this, it's helpful
- to know it when you need to make sense of syntax-error messages.
</para>
- <caution>
- <para>
- <application>PL/pgSQL</application> will substitute for any identifier
- matching one of the function's declared variables; it is not bright
- enough to know whether that's what you meant! Thus, it is a bad idea
- to use a variable name that is the same as any table, column, or
- function name that you need to reference in commands within the
- function. For more discussion see <xref linkend="plpgsql-var-subst">.
- </para>
- </caution>
-
<para>
When executing a SQL command in this way,
<application>PL/pgSQL</application> plans the command just once
<para>
If a row or a variable list is used as target, the query's result columns
must exactly match the structure of the target as to number and data
- types, or a run-time error
+ types, or else a run-time error
occurs. When a record variable is the target, it automatically
configures itself to the row type of the query result columns.
</para>
INTO c
USING checked_user, checked_date;
</programlisting>
+ </para>
+ <para>
Note that parameter symbols can only be used for data values
— if you want to use dynamically determined table or column
names, you must insert them into the command string textually.
INTO c
USING checked_user, checked_date;
</programlisting>
+ Another restriction on parameter symbols is that they only work in
+ <command>SELECT</>, <command>INSERT</>, <command>UPDATE</>, and
+ <command>DELETE</> commands. In other statement
+ types (generically called utility statements), you must insert
+ values textually even if they are just data values.
</para>
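+ <para>
+ As a minimal sketch of the utility-statement case, the following
+ command attaches a comment to a table. The variables
+ <literal>tabname</> and <literal>note</> are assumed, for illustration
+ only, to be <type>text</type> variables of the surrounding function.
+ Both are inserted textually, because <command>COMMENT</> is a utility
+ statement and so cannot take its values from a <literal>USING</> clause:
+<programlisting>
+-- quote_literal returns null for null input, so note is assumed non-null
+EXECUTE 'COMMENT ON TABLE ' || quote_ident(tabname)
+    || ' IS ' || quote_literal(note);
+</programlisting>
+ </para>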
<para>
type <type>boolean</type>. <literal>FOUND</literal> starts out
false within each <application>PL/pgSQL</application> function call.
It is set by each of the following types of statements:
+
<itemizedlist>
<listitem>
<para>
</listitem>
</itemizedlist>
+ Other <application>PL/pgSQL</application> statements do not change
+ the state of <literal>FOUND</literal>.
+ Note in particular that <command>EXECUTE</command>
+ changes the output of <command>GET DIAGNOSTICS</command>, but
+ does not change <literal>FOUND</literal>.
+ </para>
+
+ <para>
<literal>FOUND</literal> is a local variable within each
<application>PL/pgSQL</application> function; any changes to it
- affect only the current function. <literal>EXECUTE</literal>
- changes the output of <command>GET DIAGNOSTICS</command>, but
- does not change the state of <literal>FOUND</literal>.
+ affect only the current function.
</para>
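+ <para>
+ Because <command>EXECUTE</command> leaves <literal>FOUND</literal>
+ untouched, one way to test the effect of a dynamic command is to read
+ the row count instead. The following is only a sketch: the table
+ <literal>mytab</>, its columns, and the variables <literal>key</> and
+ <literal>rowcount</> are assumptions made for illustration, and
+ <literal>key</> is assumed to be assigned elsewhere in the function:
+<programlisting>
+DECLARE
+    key integer;        -- hypothetical; assigned elsewhere
+    rowcount integer;
+BEGIN
+    EXECUTE 'UPDATE mytab SET val = val + 1 WHERE id = $1' USING key;
+    -- FOUND still holds whatever value it had before the EXECUTE
+    GET DIAGNOSTICS rowcount = ROW_COUNT;
+    IF rowcount = 0 THEN
+        RAISE NOTICE 'no row with id % was updated', key;
+    END IF;
+END;
+</programlisting>
+ </para>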
</sect2>
<command>RETURN</command> with an expression terminates the
function and returns the value of
<replaceable>expression</replaceable> to the caller. This form
- is to be used for <application>PL/pgSQL</> functions that do
+ is used for <application>PL/pgSQL</> functions that do
not return a set.
</para>
or deleted using the cursor to identify the row. There are
restrictions on what the cursor's query can be (in particular,
no grouping) and it's best to use <literal>FOR UPDATE</> in the
- cursor. For additional information see the
+ cursor. For more information see the
<xref linkend="sql-declare" endterm="sql-declare-title">
reference page.
</para>
<listitem>
<para>
Data type array of <type>text</type>; the arguments from
- the <command>CREATE TRIGGER</command> statement.
+ the <command>CREATE TRIGGER</command> statement.
The index counts from 0. Invalid
- indices (less than 0 or greater than or equal to <varname>tg_nargs</>) result in a null value.
+ indexes (less than 0 or greater than or equal to <varname>tg_nargs</>)
+ result in a null value.
</para>
</listitem>
</varlistentry>
<title>Variable Substitution</title>
<para>
- When <application>PL/pgSQL</> prepares a SQL statement or expression
- for execution, any <application>PL/pgSQL</application> variable name
- appearing in the statement or expression is replaced by a parameter symbol,
- <literal>$<replaceable>n</replaceable></literal>. The current value
- of the variable is then provided as the value for the parameter whenever
- the statement or expression is executed. As an example, consider the
- function
+ SQL statements and expressions within a <application>PL/pgSQL</> function
+ can refer to variables and parameters of the function. Behind the scenes,
+ <application>PL/pgSQL</> substitutes query parameters for such references.
+ Parameters will only be substituted in places where a parameter or
+ column reference is syntactically allowed. As an extreme case, consider
+ this example of poor programming style:
<programlisting>
-CREATE FUNCTION logfunc(logtxt text) RETURNS void AS $$
- DECLARE
- curtime timestamp := now();
- BEGIN
- INSERT INTO logtable VALUES (logtxt, curtime);
- END;
-$$ LANGUAGE plpgsql;
-</programlisting>
- The <command>INSERT</> statement will effectively be processed as
-<programlisting>
-PREPARE <replaceable>statement_name</>(text, timestamp) AS
- INSERT INTO logtable VALUES ($1, $2);
+ INSERT INTO foo (foo) VALUES (foo);
</programlisting>
- followed on each execution by <command>EXECUTE</> with the current
- actual values of the two variables. (Note: here we are speaking of
- the main SQL engine's
- <xref linkend="sql-execute" endterm="sql-execute-title"> command,
- not <application>PL/pgSQL</application>'s <command>EXECUTE</>.)
+ The first occurrence of <literal>foo</> must syntactically be a table
+ name, so it will not be substituted, even if the function has a variable
+ named <literal>foo</>. The second occurrence must be the name of a
+ column of the table, so it will not be substituted either. Only the
+ third occurrence is a candidate to be a reference to the function's
+ variable.
</para>
+ <note>
+ <para>
+ <productname>PostgreSQL</productname> versions before 8.5 would try
+ to substitute the variable in all three cases, leading to syntax errors.
+ </para>
+ </note>
+
<para>
- <emphasis>The substitution mechanism will replace any token that matches a
- known variable's name.</> This poses various traps for the unwary.
- For example, it is a bad idea
- to use a variable name that is the same as any table or column name
- that you need to reference in queries within the function, because
- what you think is a table or column name will still get replaced.
- In the above example, suppose that <structname>logtable</> has
- column names <structfield>logtxt</> and <structfield>logtime</>,
- and we try to write the <command>INSERT</> as
-<programlisting>
- INSERT INTO logtable (logtxt, logtime) VALUES (logtxt, curtime);
-</programlisting>
- This will be fed to the main SQL parser as
+ Since the names of variables are syntactically no different from the names
+ of table columns, there can be ambiguity in statements that also refer to
+ tables: is a given name meant to refer to a table column, or a variable?
+ Let's change the previous example to
<programlisting>
- INSERT INTO logtable ($1, logtime) VALUES ($1, $2);
+ INSERT INTO dest (col) SELECT foo + bar FROM src;
</programlisting>
- resulting in a syntax error like this:
-<screen>
-ERROR: syntax error at or near "$1"
-LINE 1: INSERT INTO logtable ( $1 , logtime) VALUES ( $1 , $2 )
- ^
-QUERY: INSERT INTO logtable ( $1 , logtime) VALUES ( $1 , $2 )
-CONTEXT: SQL statement in PL/PgSQL function "logfunc2" near line 5
-</screen>
+ Here, <literal>dest</> and <literal>src</> must be table names, and
+ <literal>col</> must be a column of <literal>dest</>, but <literal>foo</>
+ and <literal>bar</> might reasonably be either variables of the function
+ or columns of <literal>src</>.
</para>
<para>
- This example is fairly easy to diagnose, since it leads to an
- obvious syntax error. Much nastier are cases where the substitution
- is syntactically permissible, since the only symptom may be misbehavior
- of the function. In one case, a user wrote something like this:
-<programlisting>
- DECLARE
- val text;
- search_key integer;
- BEGIN
- ...
- FOR val IN SELECT val FROM table WHERE key = search_key LOOP ...
-</programlisting>
- and wondered why all his table entries seemed to be NULL. Of course
- what happened here was that the query became
-<programlisting>
- SELECT $1 FROM table WHERE key = $2
-</programlisting>
- and thus it was just an expensive way of assigning <literal>val</>'s
- current value back to itself for each row.
+ By default, <application>PL/pgSQL</> will report an error if a name
+ in a SQL statement could refer to either a variable or a table column.
+ You can fix such a problem by renaming the variable or column,
+ or by qualifying the ambiguous reference, or by telling
+ <application>PL/pgSQL</> which interpretation to prefer.
</para>
<para>
- A commonly used coding rule for avoiding such traps is to use a
+ The simplest solution is to rename the variable or column.
+ A common coding rule is to use a
different naming convention for <application>PL/pgSQL</application>
- variables than you use for table and column names. For example,
- if all your variables are named
+ variables than you use for column names. For example,
+ if you consistently name function variables
<literal>v_<replaceable>something</></literal> while none of your
- table or column names start with <literal>v_</>, you're pretty safe.
+ column names start with <literal>v_</>, no conflicts will occur.
</para>
<para>
- Another workaround is to use qualified (dotted) names for SQL entities.
- For instance we could safely have written the above example as
+ Alternatively, you can qualify ambiguous references to make them clear.
+ In the above example, <literal>src.foo</> would be an unambiguous reference
+ to the table column. To create an unambiguous reference to a variable,
+ declare it in a labeled block and use the block's label
+ (see <xref linkend="plpgsql-structure">). For example,
<programlisting>
- FOR val IN SELECT table.val FROM table WHERE key = search_key LOOP ...
+ <<block>>
+ DECLARE
+ foo int;
+ BEGIN
+ foo := ...;
+ INSERT INTO dest (col) SELECT block.foo + bar FROM src;
</programlisting>
- because <application>PL/pgSQL</application> will not substitute a
- variable for a trailing component of a qualified name.
- However this solution does not work in every case — you can't
- qualify a name in an <command>INSERT</>'s column name list, for instance.
- Another point is that record and row variable names will be matched to
- the first components of qualified names, so a qualified SQL name is
- still vulnerable in some cases.
- In such cases choosing a non-conflicting variable name is the only way.
+ Here <literal>block.foo</> means the variable even if there is a column
+ <literal>foo</> in <literal>src</>. Function parameters, as well as
+ special variables such as <literal>FOUND</>, can be qualified by the
+ function's name, because they are implicitly declared in an outer block
+ labeled with the function's name.
</para>
<para>
- Another technique you can use is to attach a label to the block in
- which your variables are declared, and then qualify the variable names
- in your SQL commands (see <xref linkend="plpgsql-structure">).
- For example,
+ Sometimes it is impractical to fix all the ambiguous references in a
+ large body of <application>PL/pgSQL</> code. In such cases you can
+ specify that <application>PL/pgSQL</> should resolve ambiguous references
+ as the variable (which is compatible with <application>PL/pgSQL</>'s
+ behavior before <productname>PostgreSQL</productname> 8.5), or as the
+ table column (which is compatible with some other systems such as
+ <productname>Oracle</productname>).
+ </para>
+
+ <indexterm>
+ <primary><varname>plpgsql.variable_conflict</> configuration parameter</primary>
+ </indexterm>
+
+ <para>
+ To change this behavior on a system-wide basis, set the configuration
+ parameter <literal>plpgsql.variable_conflict</> to one of
+ <literal>error</>, <literal>use_variable</>, or
+ <literal>use_column</> (where <literal>error</> is the factory default).
+ This parameter affects subsequent compilations
+ of statements in <application>PL/pgSQL</> functions, but not statements
+ already compiled in the current session. To set the parameter before
+ <application>PL/pgSQL</> has been loaded, you must first add
+ <quote><literal>plpgsql</></> to the <xref
+ linkend="guc-custom-variable-classes"> list in
+ <filename>postgresql.conf</filename>. Because changing this setting
+ can cause unexpected changes in the behavior of <application>PL/pgSQL</>
+ functions, it can only be changed by a superuser.
+ </para>
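+
+ <para>
+ As a sketch, and assuming <literal>use_column</> is the behavior
+ wanted, the corresponding entries in
+ <filename>postgresql.conf</filename> might look like this:
+<programlisting>
+# make the plpgsql parameter class known before PL/pgSQL is loaded
+custom_variable_classes = 'plpgsql'
+plpgsql.variable_conflict = use_column
+</programlisting>
+ </para>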
+
+ <para>
+ You can also set the behavior on a function-by-function basis, by
+ inserting one of these special commands at the start of the function
+ text:
+<programlisting>
+#variable_conflict error
+#variable_conflict use_variable
+#variable_conflict use_column
+</programlisting>
+ These commands affect only the function they are written in, and override
+ the setting of <literal>plpgsql.variable_conflict</>. An example is
<programlisting>
- <<pl>>
+CREATE FUNCTION stamp_user(id int, comment text) RETURNS void AS $$
+ #variable_conflict use_variable
DECLARE
- val text;
+ curtime timestamp := now();
BEGIN
- ...
- UPDATE table SET col = pl.val WHERE ...
+ UPDATE users SET last_modified = curtime, comment = comment
+ WHERE users.id = id;
+ END;
+$$ LANGUAGE plpgsql;
+</programlisting>
+ In the <literal>UPDATE</> command, <literal>curtime</>, <literal>comment</>,
+ and <literal>id</> will refer to the function's variable and parameters
+ whether or not <literal>users</> has columns of those names. Notice
+ that we had to qualify the reference to <literal>users.id</> in the
+ <literal>WHERE</> clause to make it refer to the table column.
+ But we did not have to qualify the reference to <literal>comment</>
+ as a target in the <literal>UPDATE</> list, because syntactically
+ that must be a column of <literal>users</>. We could write the same
+ function without depending on the <literal>variable_conflict</> setting
+ in this way:
+<programlisting>
+CREATE FUNCTION stamp_user(id int, comment text) RETURNS void AS $$
+ <<fn>>
+ DECLARE
+ curtime timestamp := now();
+ BEGIN
+ UPDATE users SET last_modified = fn.curtime, comment = stamp_user.comment
+ WHERE users.id = stamp_user.id;
+ END;
+$$ LANGUAGE plpgsql;
</programlisting>
- This is not in itself a solution to the problem of conflicts,
- since an unqualified name in a SQL command is still at risk of being
- interpreted the <quote>wrong</> way. But it is useful for clarifying
- the intent of potentially-ambiguous code.
</para>
<para>
Variable substitution does not happen in the command string given
to <command>EXECUTE</> or one of its variants. If you need to
insert a varying value into such a command, do so as part of
- constructing the string value, as illustrated in
+ constructing the string value, or use <literal>USING</>, as illustrated in
<xref linkend="plpgsql-statements-executing-dyn">.
</para>
<para>
Variable substitution currently works only in <command>SELECT</>,
<command>INSERT</>, <command>UPDATE</>, and <command>DELETE</> commands,
- because the main SQL engine allows parameter symbols only in these
+ because the main SQL engine allows query parameters only in these
commands. To use a non-constant name or value in other statement
types (generically called utility statements), you must construct
the utility statement as a string and <command>EXECUTE</> it.
</para>
<para>
- Once <application>PL/pgSQL</> has made an execution plan for a particular
- command in a function, it will reuse that plan for the life of the
- database connection. This is usually a win for performance, but it
- can cause some problems if you dynamically
- alter your database schema. For example:
-
-<programlisting>
-CREATE FUNCTION populate() RETURNS integer AS $$
-DECLARE
- -- declarations
-BEGIN
- PERFORM my_function();
-END;
-$$ LANGUAGE plpgsql;
-</programlisting>
-
- If you execute the above function, it will reference the OID for
- <function>my_function()</function> in the execution plan produced for
- the <command>PERFORM</command> statement. Later, if you
- drop and recreate <function>my_function()</function>, then
- <function>populate()</function> will not be able to find
- <function>my_function()</function> anymore. You would then have to
- start a new database session so that <function>populate()</function>
- will be compiled afresh, before it will work again. You can avoid
- this problem by using <command>CREATE OR REPLACE FUNCTION</command>
- when updating the definition of
- <function>my_function</function>, since when a function is
- <quote>replaced</quote>, its OID is not changed.
+ A saved plan will be re-planned automatically if there is any schema
+ change to any table used in the query, or if any user-defined function
+ used in the query is redefined. This makes the re-use of prepared plans
+ transparent in most cases, but there are corner cases where a stale plan
+ might be re-used. An example is that dropping and re-creating a
+ user-defined operator won't affect already-cached plans; they'll continue
+ to call the original operator's underlying function, if that has not been
+ changed. When necessary, the cache can be flushed by starting a fresh
+ database session.
</para>
- <note>
- <para>
- In <productname>PostgreSQL</productname> 8.3 and later, saved plans
- will be replaced whenever any schema changes have occurred to any
- tables they reference. This eliminates one of the major disadvantages
- of saved plans. However, there is no such mechanism for function
- references, and thus the above example involving a reference to a
- deleted function is still valid.
- </para>
- </note>
-
<para>
Because <application>PL/pgSQL</application> saves execution plans
in this way, SQL commands that appear directly in a
<application>PL/pgSQL</application> are:
<itemizedlist>
- <listitem>
- <para>
- There are no default values for parameters in <productname>PostgreSQL</>.
- </para>
- </listitem>
-
- <listitem>
- <para>
- You can overload function names in <productname>PostgreSQL</>. This is
- often used to work around the lack of default parameters.
- </para>
- </listitem>
-
<listitem>
<para>
If a name used in a SQL command could be either a column name of a
table or a reference to a variable of the function,
- <application>PL/SQL</> treats it as a column name, while
- <application>PL/pgSQL</> treats it as a variable name. It's best
- to avoid such ambiguities in the first place, but if necessary you
- can fix them by properly qualifying the ambiguous name.
- (See <xref linkend="plpgsql-var-subst">.)
+ <application>PL/SQL</> treats it as a column name. This corresponds
+ to <application>PL/pgSQL</>'s
+ <literal>plpgsql.variable_conflict</> = <literal>use_column</>
+ behavior, which is not the default,
+ as explained in <xref linkend="plpgsql-var-subst">.
+ It's often best to avoid such ambiguities in the first place,
+ but if you have to port a large amount of code that depends on
+ this behavior, setting <literal>variable_conflict</> may be the
+ best solution.
</para>
</listitem>
The exception names supported by <application>PL/pgSQL</> are
different from Oracle's. The set of built-in exception names
is much larger (see <xref linkend="errcodes-appendix">). There
- is not currently a way to declare user-defined exception names.
+ is not currently a way to declare user-defined exception names,
+ although you can throw user-chosen SQLSTATE values instead.
</para>
</callout>
</calloutlist>