granicus.if.org Git - postgresql/blob - doc/src/sgml/ref/explain.sgml

   1 <!--
   2 doc/src/sgml/ref/explain.sgml
   3 PostgreSQL documentation
   4 -->
   5
   6 <refentry id="SQL-EXPLAIN">
   7  <refmeta>
   8   <refentrytitle>EXPLAIN</refentrytitle>
   9   <manvolnum>7</manvolnum>
  10   <refmiscinfo>SQL - Language Statements</refmiscinfo>
  11  </refmeta>
  12
  13  <refnamediv>
  14   <refname>EXPLAIN</refname>
  15   <refpurpose>show the execution plan of a statement</refpurpose>
  16  </refnamediv>
  17
  18  <indexterm zone="sql-explain">
  19   <primary>EXPLAIN</primary>
  20  </indexterm>
  21
  22  <indexterm zone="sql-explain">
  23   <primary>prepared statements</primary>
  24   <secondary>showing the query plan</secondary>
  25  </indexterm>
  26
  27  <indexterm zone="sql-explain">
  28   <primary>cursor</primary>
  29   <secondary>showing the query plan</secondary>
  30  </indexterm>
  31
  32  <refsynopsisdiv>
  33 <synopsis>
  34 EXPLAIN [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] <replaceable class="parameter">statement</replaceable>
  35 EXPLAIN [ ANALYZE ] [ VERBOSE ] <replaceable class="parameter">statement</replaceable>
  36
  37 <phrase>where <replaceable class="parameter">option</replaceable> can be one of:</phrase>
  38
  39     ANALYZE [ <replaceable class="parameter">boolean</replaceable> ]
  40     VERBOSE [ <replaceable class="parameter">boolean</replaceable> ]
  41     COSTS [ <replaceable class="parameter">boolean</replaceable> ]
  42     BUFFERS [ <replaceable class="parameter">boolean</replaceable> ]
  43     TIMING [ <replaceable class="parameter">boolean</replaceable> ]
  44     FORMAT { TEXT | XML | JSON | YAML }
  45 </synopsis>
  46  </refsynopsisdiv>
  47
  48  <refsect1>
  49   <title>Description</title>
  50
  51   <para>
  52    This command displays the execution plan that the
  53    <productname>PostgreSQL</productname> planner generates for the
  54    supplied statement.  The execution plan shows how the table(s)
  55    referenced by the statement will be scanned &mdash; by plain sequential scan,
  56    index scan, etc. &mdash; and if multiple tables are referenced, what join
  57    algorithms will be used to bring together the required rows from
  58    each input table.
  59   </para>
  60
  61   <para>
  62    The most critical part of the display is the estimated statement execution
  63    cost, which is the planner's guess at how long it will take to run the
  64    statement (measured in cost units that are arbitrary, but conventionally
  65    mean disk page fetches).  Actually two numbers
  66    are shown: the start-up cost before the first row can be returned, and
  67    the total cost to return all the rows.  For most queries the total cost
  68    is what matters, but in contexts such as a subquery in <literal>EXISTS</literal>, the planner
  69    will choose the smallest start-up cost instead of the smallest total cost
  70    (since the executor will stop after getting one row, anyway).
  71    Also, if you limit the number of rows to return with a <literal>LIMIT</literal> clause,
  72    the planner makes an appropriate interpolation between the endpoint
  73    costs to estimate which plan is really the cheapest.
  74   </para>
  75
  76   <para>
  77    The <literal>ANALYZE</literal> option causes the statement to be actually
  78    executed, not only planned.  Then actual runtime statistics are added to
  79    the display, including the total elapsed time expended within each plan
  80    node (in milliseconds) and the total number of rows it actually returned.
  81    This is useful for seeing whether the planner's estimates
  82    are close to reality.
  83   </para>
  84
  85   <important>
  86    <para>
  87     Keep in mind that the statement is actually executed when
  88     the <literal>ANALYZE</literal> option is used.  Although
  89     <command>EXPLAIN</command> will discard any output that a
  90     <command>SELECT</command> would return, other side effects of the
  91     statement will happen as usual.  If you wish to use
  92     <command>EXPLAIN ANALYZE</command> on an
  93     <command>INSERT</command>, <command>UPDATE</command>,
  94     <command>DELETE</command>, <command>CREATE TABLE AS</command>,
  95     or <command>EXECUTE</command> statement
  96     without letting the command affect your data, use this approach:
  97 <programlisting>
  98 BEGIN;
  99 EXPLAIN ANALYZE ...;
 100 ROLLBACK;
 101 </programlisting>
 102    </para>
 103   </important>
 104
 105   <para>
 106    Only the <literal>ANALYZE</literal> and <literal>VERBOSE</literal> options
 107    can be specified, and only in that order, without surrounding the option
 108    list in parentheses.  Prior to <productname>PostgreSQL</productname> 9.0,
 109    the unparenthesized syntax was the only one supported.  It is expected that
 110    all new options will be supported only in the parenthesized syntax.
 111   </para>
 112  </refsect1>
 113
 114  <refsect1>
 115   <title>Parameters</title>
 116
 117   <variablelist>
 118    <varlistentry>
 119     <term><literal>ANALYZE</literal></term>
 120     <listitem>
 121      <para>
 122       Carry out the command and show actual run times and other statistics.
 123       This parameter defaults to <literal>FALSE</literal>.
 124      </para>
 125     </listitem>
 126    </varlistentry>
 127
 128    <varlistentry>
 129     <term><literal>VERBOSE</literal></term>
 130     <listitem>
 131      <para>
 132       Display additional information regarding the plan.  Specifically, include
 133       the output column list for each node in the plan tree, schema-qualify
 134       table and function names, always label variables in expressions with
 135       their range table alias, and always print the name of each trigger for
 136       which statistics are displayed.  This parameter defaults to
 137       <literal>FALSE</literal>.
 138      </para>
 139     </listitem>
 140    </varlistentry>
 141
 142    <varlistentry>
 143     <term><literal>COSTS</literal></term>
 144     <listitem>
 145      <para>
 146       Include information on the estimated startup and total cost of each
 147       plan node, as well as the estimated number of rows and the estimated
 148       width of each row.  This parameter defaults to <literal>TRUE</literal>.
 149      </para>
 150     </listitem>
 151    </varlistentry>
 152
 153    <varlistentry>
 154     <term><literal>BUFFERS</literal></term>
 155     <listitem>
 156      <para>
 157       Include information on buffer usage. Specifically, include the number of
 158       shared blocks hit, read, dirtied, and written, the number of local blocks
 159       hit, read, dirtied, and written, and the number of temp blocks read and
 160       written.
 161       A <emphasis>hit</> means that a read was avoided because the block was
 162       found already in cache when needed.
 163       Shared blocks contain data from regular tables and indexes;
 164       local blocks contain data from temporary tables and indexes;
 165       while temp blocks contain short-term working data used in sorts, hashes,
 166       Materialize plan nodes, and similar cases.
 167       The number of blocks <emphasis>dirtied</> indicates the number of
 168       previously unmodified blocks that were changed by this query; while the
 169       number of blocks <emphasis>written</> indicates the number of
 170       previously-dirtied blocks evicted from cache by this backend during
 171       query processing.
 172       The number of blocks shown for an
 173       upper-level node includes those used by all its child nodes.  In text
 174       format, only non-zero values are printed.  This parameter may only be
 175       used when <literal>ANALYZE</literal> is also enabled.  It defaults to
 176       <literal>FALSE</literal>.
 177      </para>
 178     </listitem>
 179    </varlistentry>
 180
 181    <varlistentry>
 182     <term><literal>TIMING</literal></term>
 183     <listitem>
 184      <para>
 185       Include the actual startup time and time spent in the node in the output.
 186       The overhead of repeatedly reading the system clock can slow down the
 187       query significantly on some systems, so it may be useful to set this
 188       parameter to <literal>FALSE</literal> when only actual row counts, and not
 189       exact times, are needed.
 190       This parameter may only be used when <literal>ANALYZE</literal> is also
 191       enabled.  It defaults to <literal>TRUE</literal>.
 192      </para>
 193     </listitem>
 194    </varlistentry>
 195
 196    <varlistentry>
 197     <term><literal>FORMAT</literal></term>
 198     <listitem>
 199      <para>
 200       Specify the output format, which can be TEXT, XML, JSON, or YAML.
 201       Non-text output contains the same information as the text output
 202       format, but is easier for programs to parse.  This parameter defaults to
 203       <literal>TEXT</literal>.
 204      </para>
 205     </listitem>
 206    </varlistentry>
 207
 208    <varlistentry>
 209     <term><replaceable class="parameter">boolean</replaceable></term>
 210     <listitem>
 211      <para>
 212       Specifies whether the selected option should be turned on or off.
 213       You can write <literal>TRUE</literal>, <literal>ON</>, or
 214       <literal>1</literal> to enable the option, and <literal>FALSE</literal>,
 215       <literal>OFF</>, or <literal>0</literal> to disable it.  The
 216       <replaceable class="parameter">boolean</replaceable> value can also
 217       be omitted, in which case <literal>TRUE</literal> is assumed.
 218      </para>
 219     </listitem>
 220    </varlistentry>
 221
 222    <varlistentry>
 223     <term><replaceable class="parameter">statement</replaceable></term>
 224     <listitem>
 225      <para>
 226       Any <command>SELECT</>, <command>INSERT</>, <command>UPDATE</>,
 227       <command>DELETE</>, <command>VALUES</>, <command>EXECUTE</>,
 228       <command>DECLARE</>, or <command>CREATE TABLE AS</command>
 229       statement, whose execution plan you wish to see.
 230      </para>
 231     </listitem>
 232    </varlistentry>
 233   </variablelist>
 234  </refsect1>
 235
 236  <refsect1>
 237   <title>Outputs</title>
 238
 239    <para>
 240     The command's result is a textual description of the plan selected
 241     for the <replaceable class="parameter">statement</replaceable>,
 242     optionally annotated with execution statistics.
 243     <xref linkend="using-explain"> describes the information provided.
 244    </para>
 245  </refsect1>
 246
 247  <refsect1>
 248   <title>Notes</title>
 249
 250   <para>
 251    In order to allow the <productname>PostgreSQL</productname> query
 252    planner to make reasonably informed decisions when optimizing
 253    queries, the <link
 254    linkend="catalog-pg-statistic"><structname>pg_statistic</structname></link>
 255    data should be up-to-date for all tables used in the query.  Normally
 256    the <link linkend="autovacuum">autovacuum daemon</link> will take care
 257    of that automatically.  But if a table has recently had substantial
 258    changes in its contents, you might need to do a manual
 259    <xref linkend="sql-analyze"> rather than wait for autovacuum to catch up
 260    with the changes.
 261   </para>
 262
 263   <para>
 264    In order to measure the run-time cost of each node in the execution
 265    plan, the current implementation of <command>EXPLAIN
 266    ANALYZE</command> adds profiling overhead to query execution.
 267    As a result, running <command>EXPLAIN ANALYZE</command>
 268    on a query can sometimes take significantly longer than executing
 269    the query normally. The amount of overhead depends on the nature of
 270    the query, as well as the platform being used.  The worst case occurs
 271    for plan nodes that in themselves require very little time per
 272    execution, and on machines that have relatively slow operating
 273    system calls for obtaining the time of day.
 274   </para>
 275  </refsect1>
 276
 277  <refsect1>
 278   <title>Examples</title>
 279
 280   <para>
 281    To show the plan for a simple query on a table with a single
 282    <type>integer</type> column and 10000 rows:
 283
 284 <programlisting>
 285 EXPLAIN SELECT * FROM foo;
 286
 287                        QUERY PLAN
 288 ---------------------------------------------------------
 289  Seq Scan on foo  (cost=0.00..155.00 rows=10000 width=4)
 290 (1 row)
 291 </programlisting>
 292   </para>
 293
 294   <para>
 295   Here is the same query, with JSON output formatting:
 296 <programlisting>
 297 EXPLAIN (FORMAT JSON) SELECT * FROM foo;
 298            QUERY PLAN
 299 --------------------------------
 300  [                             +
 301    {                           +
 302      "Plan": {                 +
 303        "Node Type": "Seq Scan",+
 304        "Relation Name": "foo", +
 305        "Alias": "foo",         +
 306        "Startup Cost": 0.00,   +
 307        "Total Cost": 155.00,   +
 308        "Plan Rows": 10000,     +
 309        "Plan Width": 4         +
 310      }                         +
 311    }                           +
 312  ]
 313 (1 row)
 314 </programlisting>
 315   </para>
 316
 317   <para>
 318    If there is an index and we use a query with an indexable
 319    <literal>WHERE</literal> condition, <command>EXPLAIN</command>
 320    might show a different plan:
 321
 322 <programlisting>
 323 EXPLAIN SELECT * FROM foo WHERE i = 4;
 324
 325                          QUERY PLAN
 326 --------------------------------------------------------------
 327  Index Scan using fi on foo  (cost=0.00..5.98 rows=1 width=4)
 328    Index Cond: (i = 4)
 329 (2 rows)
 330 </programlisting>
 331   </para>
 332
 333   <para>
 334   Here is the same query, but in YAML format:
 335 <programlisting>
 336 EXPLAIN (FORMAT YAML) SELECT * FROM foo WHERE i='4';
 337           QUERY PLAN
 338 -------------------------------
 339  - Plan:                      +
 340      Node Type: "Index Scan"  +
 341      Scan Direction: "Forward"+
 342      Index Name: "fi"         +
 343      Relation Name: "foo"     +
 344      Alias: "foo"             +
 345      Startup Cost: 0.00       +
 346      Total Cost: 5.98         +
 347      Plan Rows: 1             +
 348      Plan Width: 4            +
 349      Index Cond: "(i = 4)"
 350 (1 row)
 351 </programlisting>
 352
 353     XML format is left as an exercise for the reader.
 354   </para>
 355   <para>
 356    Here is the same plan with cost estimates suppressed:
 357
 358 <programlisting>
 359 EXPLAIN (COSTS FALSE) SELECT * FROM foo WHERE i = 4;
 360
 361         QUERY PLAN
 362 ----------------------------
 363  Index Scan using fi on foo
 364    Index Cond: (i = 4)
 365 (2 rows)
 366 </programlisting>
 367   </para>
 368
 369   <para>
 370    Here is an example of a query plan for a query using an aggregate
 371    function:
 372
 373 <programlisting>
 374 EXPLAIN SELECT sum(i) FROM foo WHERE i &lt; 10;
 375
 376                              QUERY PLAN
 377 ---------------------------------------------------------------------
 378  Aggregate  (cost=23.93..23.93 rows=1 width=4)
 379    -&gt;  Index Scan using fi on foo  (cost=0.00..23.92 rows=6 width=4)
 380          Index Cond: (i &lt; 10)
 381 (3 rows)
 382 </programlisting>
 383   </para>
 384
 385   <para>
 386    Here is an example of using <command>EXPLAIN EXECUTE</command> to
 387    display the execution plan for a prepared query:
 388
 389 <programlisting>
 390 PREPARE query(int, int) AS SELECT sum(bar) FROM test
 391     WHERE id &gt; $1 AND id &lt; $2
 392     GROUP BY foo;
 393
 394 EXPLAIN ANALYZE EXECUTE query(100, 200);
 395
 396                                                        QUERY PLAN
 397 -------------------------------------------------------------------------------------------------------------------------
 398  HashAggregate  (cost=39.53..39.53 rows=1 width=8) (actual time=0.661..0.672 rows=7 loops=1)
 399    -&gt;  Index Scan using test_pkey on test  (cost=0.00..32.97 rows=1311 width=8) (actual time=0.050..0.395 rows=99 loops=1)
 400          Index Cond: ((id &gt; $1) AND (id &lt; $2))
 401  Total runtime: 0.851 ms
 402 (4 rows)
 403 </programlisting>
 404   </para>
 405
 406   <para>
 407    Of course, the specific numbers shown here depend on the actual
 408    contents of the tables involved.  Also note that the numbers, and
 409    even the selected query strategy, might vary between
 410    <productname>PostgreSQL</productname> releases due to planner
 411    improvements. In addition, the <command>ANALYZE</command> command
 412    uses random sampling to estimate data statistics; therefore, it is
 413    possible for cost estimates to change after a fresh run of
 414    <command>ANALYZE</command>, even if the actual distribution of data
 415    in the table has not changed.
 416   </para>
 417  </refsect1>
 418
 419  <refsect1>
 420   <title>Compatibility</title>
 421
 422   <para>
 423    There is no <command>EXPLAIN</command> statement defined in the SQL standard.
 424   </para>
 425  </refsect1>
 426
 427  <refsect1>
 428   <title>See Also</title>
 429
 430   <simplelist type="inline">
 431    <member><xref linkend="sql-analyze"></member>
 432   </simplelist>
 433  </refsect1>
 434 </refentry>