1 <!-- $PostgreSQL: pgsql/doc/src/sgml/plpython.sgml,v 1.46 2010/03/18 19:43:03 petere Exp $ -->
3 <chapter id="plpython">
4 <title>PL/Python - Python Procedural Language</title>
6 <indexterm zone="plpython"><primary>PL/Python</></>
7 <indexterm zone="plpython"><primary>Python</></>
10 The <application>PL/Python</application> procedural language allows
11 <productname>PostgreSQL</productname> functions to be written in the
12 <ulink url="http://www.python.org">Python language</ulink>.
16 To install PL/Python in a particular database, use
17 <literal>createlang plpythonu <replaceable>dbname</></literal> (but
18 see also <xref linkend="plpython-python23">).
23 If a language is installed into <literal>template1</>, all subsequently
24 created databases will have the language installed automatically.
29 As of <productname>PostgreSQL</productname> 7.4, PL/Python is only
30 available as an <quote>untrusted</> language (meaning it does not
31 offer any way of restricting what users can do in it). It has
32 therefore been renamed to <literal>plpythonu</>. The trusted
33 variant <literal>plpython</> might become available again in future,
34 if a new secure execution mechanism is developed in Python.
39 Users of source packages must specially enable the build of
40 PL/Python during the installation process. (Refer to the
41 installation instructions for more information.) Users of binary
42 packages might find PL/Python in a separate subpackage.
46 <sect1 id="plpython-python23">
47 <title>Python 2 vs. Python 3</title>
50 PL/Python supports both the Python 2 and Python 3 language
51 variants. (The PostgreSQL installation instructions might contain
52 more precise information about the exact supported minor versions
53 of Python.) Because the Python 2 and Python 3 language variants
54 are incompatible in some important aspects, the following naming
55 and transitioning scheme is used by PL/Python to avoid mixing them:
60 The PostgreSQL language named <literal>plpython2u</literal>
61 implements PL/Python based on the Python 2 language variant.
67 The PostgreSQL language named <literal>plpython3u</literal>
68 implements PL/Python based on the Python 3 language variant.
74 The language named <literal>plpythonu</literal> implements
75 PL/Python based on the default Python language variant, which is
76 currently Python 2. (This default is independent of what any
77 local Python installations might consider to be
78 their <quote>default</quote>, for example,
79 what <filename>/usr/bin/python</filename> might be.) The
80 default will probably be changed to Python 3 in a distant future
81 release of PostgreSQL, depending on the progress of the
82 migration to Python 3 in the Python community.
87 It depends on the build configuration or the installed packages
88 whether PL/Python for Python 2 or Python 3 or both are available.
92 This results in the following usage and migration strategy:
97 Existing users and users who are currently not interested in
98 Python 3 use the language name <literal>plpythonu</literal> and
99 don't have to change anything for the foreseeable future. It is
100 recommended to gradually <quote>future-proof</quote> the code
101 via migration to Python 2.6/2.7 to simplify the eventual
102 migration to Python 3.
106 In practice, many PL/Python functions will migrate to Python 3
107 with few or no changes.
113 Users who know that they have heavily Python 2 dependent code
114 and don't plan to ever change it can make use of
115 the <literal>plpython2u</literal> language name. This will
116 continue to work into the very distant future, until Python 2
117 support might be completely dropped by PostgreSQL.
123 Users who want to dive into Python 3 can use
124 the <literal>plpython3u</literal> language name, which will keep
125 working forever by today's standards. In the distant future,
126 when Python 3 might become the default, they might like to
127 remove the <quote>3</quote> for aesthetic reasons.
133 Daredevils, who want to build a Python-3-only operating system
134 environment, can change the build scripts to
135 make <literal>plpythonu</literal> be equivalent
136 to <literal>plpython3u</literal>, keeping in mind that this
137 would make their installation incompatible with most of the rest
146 document <ulink url="http://docs.python.org/dev/3.0/whatsnew/3.0.html">What's
147 New In Python 3.0</ulink> for more information about porting to
152 <sect1 id="plpython-funcs">
153 <title>PL/Python Functions</title>
156 Functions in PL/Python are declared via the
157 standard <xref linkend="sql-createfunction"> syntax:
160 CREATE FUNCTION <replaceable>funcname</replaceable> (<replaceable>argument-list</replaceable>)
161 RETURNS <replaceable>return-type</replaceable>
163 # PL/Python function body
164 $$ LANGUAGE plpythonu;
169 The body of a function is simply a Python script. When the function
170 is called, its arguments are passed as elements of the list
171 <varname>args</varname>; named arguments are also passed as
172 ordinary variables to the Python script. Use of named arguments is
173 usually more readable. The result is returned from the Python code
174 in the usual way, with <literal>return</literal> or
175 <literal>yield</literal> (in case of a result-set statement). If
176 you do not provide a return value, Python returns the default
177 <symbol>None</symbol>. <application>PL/Python</application> translates
178 Python's <symbol>None</symbol> into the SQL null value.
182 For example, a function to return the greater of two integers can be
186 CREATE FUNCTION pymax (a integer, b integer)
192 $$ LANGUAGE plpythonu;
195 The Python code that is given as the body of the function definition
196 is transformed into a Python function. For example, the above results in:
199 def __plpython_procedure_pymax_23456():
205 assuming that 23456 is the OID assigned to the function by
206 <productname>PostgreSQL</productname>.
210 The arguments are set as global variables. Because of the scoping
211 rules of Python, this has the subtle consequence that an argument
212 variable cannot be reassigned inside the function to the value of
213 an expression that involves the variable name itself, unless the
214 variable is redeclared as global in the block. For example, the
215 following won't work:
217 CREATE FUNCTION pystrip(x text)
220 x = x.strip() # error
222 $$ LANGUAGE plpythonu;
224 because assigning to <varname>x</varname>
225 makes <varname>x</varname> a local variable for the entire block,
226 and so the <varname>x</varname> on the right-hand side of the
227 assignment refers to a not-yet-assigned local
228 variable <varname>x</varname>, not the PL/Python function
229 parameter. Using the <literal>global</literal> statement, this can
232 CREATE FUNCTION pystrip(x text)
236 x = x.strip() # ok now
238 $$ LANGUAGE plpythonu;
240 But it is advisable not to rely on this implementation detail of
241 PL/Python. It is better to treat the function parameters as
247 <title>Data Values</title>
249 Generally speaking, the aim of PL/Python is to provide
250 a <quote>natural</quote> mapping between the PostgreSQL and the
251 Python worlds. This informs the data mapping rules described
256 <title>Data Type Mapping</title>
258 Function arguments are converted from their PostgreSQL type to a
259 corresponding Python type:
263 PostgreSQL <type>boolean</type> is converted to Python <type>bool</type>.
269 PostgreSQL <type>smallint</type> and <type>int</type> are
270 converted to Python <type>int</type>.
271 PostgreSQL <type>bigint</type> is converted
272 to <type>long</type> in Python 2 and to <type>int</type> in
279 PostgreSQL <type>real</type>, <type>double</type>,
280 and <type>numeric</type> are converted to
281 Python <type>float</type>. Note that for
282 the <type>numeric</type> this loses information and can lead to
283 incorrect results. This might be fixed in a future
290 PostgreSQL <type>bytea</type> is converted to
291 Python <type>str</type> in Python 2 and to <type>bytes</type>
292 in Python 3. In Python 2, the string should be treated as a
293 byte sequence without any character encoding.
299 All other data types, including the PostgreSQL character string
300 types, are converted to a Python <type>str</type>. In Python
301 2, this string will be in the PostgreSQL server encoding; in
302 Python 3, it will be a Unicode string like all strings.
308 For nonscalar data types, see below.
315 Function return values are converted to the declared PostgreSQL
316 return data type as follows:
320 When the PostgreSQL return type is <type>boolean</type>, the
321 return value will be evaluated for truth according to the
322 <emphasis>Python</emphasis> rules. That is, 0 and empty string
323 are false, but notably <literal>'f'</literal> is true.
329 When the PostgreSQL return type is <type>bytea</type>, the
330 return value will be converted to a string (Python 2) or bytes
331 (Python 3) using the respective Python builtins, with the
332 result being converted <type>bytea</type>.
338 For all other PostgreSQL return types, the returned Python
339 value is converted to a string using the Python
340 builtin <literal>str</literal>, and the result is passed to the
341 input function of the PostgreSQL data type.
345 Strings in Python 2 are required to be in the PostgreSQL server
346 encoding when they are passed to PostgreSQL. Strings that are
347 not valid in the current server encoding will raise an error,
348 but not all encoding mismatches can be detected, so garbage
349 data can still result when this is not done correctly. Unicode
350 strings are converted to the correct encoding automatically, so
351 it can be safer and more convenient to use those. In Python 3,
352 all strings are Unicode strings.
358 For nonscalar data types, see below.
363 Note that logical mismatches between the declared PostgreSQL
364 return type and the Python data type of the actual return object
365 are not flagged; the value will be converted in any case.
370 <title>Null, None</title>
372 If an SQL null value<indexterm><primary>null value</primary><secondary
373 sortas="PL/Python">PL/Python</secondary></indexterm> is passed to a
374 function, the argument value will appear as <symbol>None</symbol> in
375 Python. The above function definition will return the wrong answer for null
376 inputs. We could add <literal>STRICT</literal> to the function definition
377 to make <productname>PostgreSQL</productname> do something more reasonable:
378 if a null value is passed, the function will not be called at all,
379 but will just return a null result automatically. Alternatively,
380 we could check for null inputs in the function body:
383 CREATE FUNCTION pymax (a integer, b integer)
386 if (a is None) or (b is None):
391 $$ LANGUAGE plpythonu;
394 As shown above, to return an SQL null value from a PL/Python
395 function, return the value <symbol>None</symbol>. This can be done whether the
396 function is strict or not.
401 <title>Arrays, Lists</title>
403 SQL array values are passed into PL/Python as a Python list. To
404 return an SQL array value out of a PL/Python function, return a
405 Python sequence, for example a list or tuple:
408 CREATE FUNCTION return_arr()
411 return (1, 2, 3, 4, 5)
412 $$ LANGUAGE plpythonu;
421 Note that in Python, strings are sequences, which can have
422 undesirable effects that might be familiar to Python programmers:
425 CREATE FUNCTION return_str_arr()
429 $$ LANGUAGE plpythonu;
431 SELECT return_str_arr();
441 <title>Composite Types</title>
443 Composite-type arguments are passed to the function as Python mappings. The
444 element names of the mapping are the attribute names of the composite type.
445 If an attribute in the passed row has the null value, it has the value
446 <symbol>None</symbol> in the mapping. Here is an example:
449 CREATE TABLE employee (
455 CREATE FUNCTION overpaid (e employee)
458 if e["salary"] > 200000:
460 if (e["age"] < 30) and (e["salary"] > 100000):
463 $$ LANGUAGE plpythonu;
468 There are multiple ways to return row or composite types from a Python
469 function. The following examples assume we have:
472 CREATE TYPE named_value AS (
478 A composite result can be returned as a:
482 <term>Sequence type (a tuple or list, but not a set because
483 it is not indexable)</term>
486 Returned sequence objects must have the same number of items as the
487 composite result type has fields. The item with index 0 is assigned to
488 the first field of the composite type, 1 to the second and so on. For
492 CREATE FUNCTION make_pair (name text, value integer)
495 return [ name, value ]
496 # or alternatively, as tuple: return ( name, value )
497 $$ LANGUAGE plpythonu;
500 To return a SQL null for any column, insert <symbol>None</symbol> at
501 the corresponding position.
507 <term>Mapping (dictionary)</term>
510 The value for each result type column is retrieved from the mapping
511 with the column name as key. Example:
514 CREATE FUNCTION make_pair (name text, value integer)
517 return { "name": name, "value": value }
518 $$ LANGUAGE plpythonu;
521 Any extra dictionary key/value pairs are ignored. Missing keys are
523 To return a SQL null value for any column, insert
524 <symbol>None</symbol> with the corresponding column name as the key.
530 <term>Object (any object providing method <literal>__getattr__</literal>)</term>
533 This works the same as a mapping.
537 CREATE FUNCTION make_pair (name text, value integer)
541 def __init__ (self, n, v):
544 return named_value(name, value)
551 $$ LANGUAGE plpythonu;
561 <title>Set-Returning Functions</title>
563 A <application>PL/Python</application> function can also return sets of
564 scalar or composite types. There are several ways to achieve this because
565 the returned object is internally turned into an iterator. The following
566 examples assume we have composite type:
569 CREATE TYPE greeting AS (
575 A set result can be returned from a:
579 <term>Sequence type (tuple, list, set)</term>
583 CREATE FUNCTION greet (how text)
584 RETURNS SETOF greeting
586 # return tuple containing lists as composite types
587 # all other combinations work also
588 return ( [ how, "World" ], [ how, "PostgreSQL" ], [ how, "PL/Python" ] )
589 $$ LANGUAGE plpythonu;
596 <term>Iterator (any object providing <symbol>__iter__</symbol> and
597 <symbol>next</symbol> methods)</term>
601 CREATE FUNCTION greet (how text)
602 RETURNS SETOF greeting
605 def __init__ (self, how, who):
615 if self.ndx == len(self.who):
617 return ( self.how, self.who[self.ndx] )
619 return producer(how, [ "World", "PostgreSQL", "PL/Python" ])
620 $$ LANGUAGE plpythonu;
627 <term>Generator (<literal>yield</literal>)</term>
631 CREATE FUNCTION greet (how text)
632 RETURNS SETOF greeting
634 for who in [ "World", "PostgreSQL", "PL/Python" ]:
636 $$ LANGUAGE plpythonu;
642 <ulink url="http://bugs.python.org/issue1483133">bug #1483133</ulink>,
643 some debug versions of Python 2.4
644 (configured and compiled with option <literal>--with-pydebug</literal>)
645 are known to crash the <productname>PostgreSQL</productname> server
646 when using an iterator to return a set result.
647 Unpatched versions of Fedora 4 contain this bug.
648 It does not happen in production versions of Python or on patched
649 versions of Fedora 4.
660 <sect1 id="plpython-sharing">
661 <title>Sharing Data</title>
663 The global dictionary <varname>SD</varname> is available to store
664 data between function calls. This variable is private static data.
665 The global dictionary <varname>GD</varname> is public data,
666 available to all Python functions within a session. Use with
667 care.<indexterm><primary>global data</><secondary>in
668 PL/Python</></indexterm>
672 Each function gets its own execution environment in the
673 Python interpreter, so that global data and function arguments from
674 <function>myfunc</function> are not available to
675 <function>myfunc2</function>. The exception is the data in the
676 <varname>GD</varname> dictionary, as mentioned above.
680 <sect1 id="plpython-do">
681 <title>Anonymous Code Blocks</title>
684 PL/Python also supports anonymous code blocks called with the
685 <xref linkend="sql-do"> statement:
690 $$ LANGUAGE plpythonu;
693 An anonymous code block receives no arguments, and whatever value it
694 might return is discarded. Otherwise it behaves just like a function.
698 <sect1 id="plpython-trigger">
699 <title>Trigger Functions</title>
701 <indexterm zone="plpython-trigger">
702 <primary>trigger</primary>
703 <secondary>in PL/Python</secondary>
707 When a function is used as a trigger, the dictionary
708 <literal>TD</literal> contains trigger-related values:
711 <term><literal>TD["event"]</></term>
714 contains the event as a string:
715 <literal>INSERT</>, <literal>UPDATE</>,
716 <literal>DELETE</>, <literal>TRUNCATE</>,
717 or <literal>UNKNOWN</>.
723 <term><literal>TD["when"]</></term>
726 contains one of <literal>BEFORE</>, <literal>AFTER</>,
727 or <literal>UNKNOWN</>.
733 <term><literal>TD["level"]</></term>
736 contains one of <literal>ROW</>,
737 <literal>STATEMENT</>, or <literal>UNKNOWN</>.
743 <term><literal>TD["new"]</></term>
744 <term><literal>TD["old"]</></term>
747 For a row-level trigger, one or both of these fields contain
748 the respective trigger rows, depending on the trigger event.
754 <term><literal>TD["name"]</></term>
757 contains the trigger name.
763 <term><literal>TD["table_name"]</></term>
766 contains the name of the table on which the trigger occurred.
772 <term><literal>TD["table_schema"]</></term>
775 contains the schema of the table on which the trigger occurred.
781 <term><literal>TD["relid"]</></term>
784 contains the OID of the table on which the trigger occurred.
790 <term><literal>TD["args"]</></term>
793 If the <command>CREATE TRIGGER</> command
794 included arguments, they are available in <literal>TD["args"][0]</> to
795 <literal>TD["args"][<replaceable>n</>-1]</>.
803 If <literal>TD["when"]</literal> is <literal>BEFORE</> and
804 <literal>TD["level"]</literal> is <literal>ROW</>, you can
805 return <literal>None</literal> or <literal>"OK"</literal> from the
806 Python function to indicate the row is unmodified,
807 <literal>"SKIP"</> to abort the event, or <literal>"MODIFY"</> to
808 indicate you've modified the row.
809 Otherwise the return value is ignored.
813 <sect1 id="plpython-database">
814 <title>Database Access</title>
817 The PL/Python language module automatically imports a Python module
818 called <literal>plpy</literal>. The functions and constants in
819 this module are available to you in the Python code as
820 <literal>plpy.<replaceable>foo</replaceable></literal>.
824 The <literal>plpy</literal> module provides two
825 functions called <function>execute</function> and
826 <function>prepare</function>. Calling
827 <function>plpy.execute</function> with a query string and an
828 optional limit argument causes that query to be run and the result
829 to be returned in a result object. The result object emulates a
830 list or dictionary object. The result object can be accessed by
831 row number and column name. It has these additional methods:
832 <function>nrows</function> which returns the number of rows
833 returned by the query, and <function>status</function> which is the
834 <function>SPI_execute()</function> return value. The result object
841 rv = plpy.execute("SELECT * FROM my_table", 5)
843 returns up to 5 rows from <literal>my_table</literal>. If
844 <literal>my_table</literal> has a column
845 <literal>my_column</literal>, it would be accessed as:
847 foo = rv[i]["my_column"]
852 <indexterm><primary>preparing a query</><secondary>in PL/Python</></indexterm>
853 The second function, <function>plpy.prepare</function>, prepares
854 the execution plan for a query. It is called with a query string
855 and a list of parameter types, if you have parameter references in
856 the query. For example:
858 plan = plpy.prepare("SELECT last_name FROM my_users WHERE first_name = $1", [ "text" ])
860 <literal>text</literal> is the type of the variable you will be
861 passing for <literal>$1</literal>. After preparing a statement, you
862 use the function <function>plpy.execute</function> to run it:
864 rv = plpy.execute(plan, [ "name" ], 5)
866 The third argument is the limit and is optional.
870 When you prepare a plan using the PL/Python module it is
871 automatically saved. Read the SPI documentation (<xref
872 linkend="spi">) for a description of what this means.
873 In order to make effective use of this across function calls
874 one needs to use one of the persistent storage dictionaries
875 <literal>SD</literal> or <literal>GD</literal> (see
876 <xref linkend="plpython-sharing">). For example:
878 CREATE FUNCTION usesavedplan() RETURNS trigger AS $$
879 if SD.has_key("plan"):
882 plan = plpy.prepare("SELECT 1")
885 $$ LANGUAGE plpythonu;
890 <sect1 id="plpython-util">
891 <title>Utility Functions</title>
893 The <literal>plpy</literal> module also provides the functions
894 <literal>plpy.debug(<replaceable>msg</>)</literal>,
895 <literal>plpy.log(<replaceable>msg</>)</literal>,
896 <literal>plpy.info(<replaceable>msg</>)</literal>,
897 <literal>plpy.notice(<replaceable>msg</>)</literal>,
898 <literal>plpy.warning(<replaceable>msg</>)</literal>,
899 <literal>plpy.error(<replaceable>msg</>)</literal>, and
900 <literal>plpy.fatal(<replaceable>msg</>)</literal>.<indexterm><primary>elog</><secondary>in PL/Python</></indexterm>
901 <function>plpy.error</function> and
902 <function>plpy.fatal</function> actually raise a Python exception
903 which, if uncaught, propagates out to the calling query, causing
904 the current transaction or subtransaction to be aborted.
905 <literal>raise plpy.ERROR(<replaceable>msg</>)</literal> and
906 <literal>raise plpy.FATAL(<replaceable>msg</>)</literal> are
907 equivalent to calling
908 <function>plpy.error</function> and
909 <function>plpy.fatal</function>, respectively.
910 The other functions only generate messages of different
912 Whether messages of a particular priority are reported to the client,
913 written to the server log, or both is controlled by the
914 <xref linkend="guc-log-min-messages"> and
915 <xref linkend="guc-client-min-messages"> configuration
916 variables. See <xref linkend="runtime-config"> for more information.