2 doc/src/sgml/ref/create_index.sgml
3 PostgreSQL documentation
6 <refentry id="SQL-CREATEINDEX">
8 <refentrytitle>CREATE INDEX</refentrytitle>
9 <manvolnum>7</manvolnum>
10 <refmiscinfo>SQL - Language Statements</refmiscinfo>
14 <refname>CREATE INDEX</refname>
15 <refpurpose>define a new index</refpurpose>
18 <indexterm zone="sql-createindex">
19 <primary>CREATE INDEX</primary>
24 CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ <replaceable class="parameter">name</replaceable> ] ON <replaceable class="parameter">table</replaceable> [ USING <replaceable class="parameter">method</replaceable> ]
25 ( { <replaceable class="parameter">column</replaceable> | ( <replaceable class="parameter">expression</replaceable> ) } [ COLLATE <replaceable class="parameter">collation</replaceable> ] [ <replaceable class="parameter">opclass</replaceable> ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
26 [ WITH ( <replaceable class="PARAMETER">storage_parameter</replaceable> = <replaceable class="PARAMETER">value</replaceable> [, ... ] ) ]
27 [ TABLESPACE <replaceable class="parameter">tablespace</replaceable> ]
28 [ WHERE <replaceable class="parameter">predicate</replaceable> ]
33 <title>Description</title>
36 <command>CREATE INDEX</command> constructs an index
37 on the specified column(s) of the specified table.
38 Indexes are primarily used to enhance database performance (though
39 inappropriate use can result in slower performance).
43 The key field(s) for the index are specified as column names,
44 or alternatively as expressions written in parentheses.
45 Multiple fields can be specified if the index method supports
50 An index field can be an expression computed from the values of
51 one or more columns of the table row. This feature can be used
52 to obtain fast access to data based on some transformation of
53 the basic data. For example, an index computed on
54 <literal>upper(col)</> would allow the clause
55 <literal>WHERE upper(col) = 'JIM'</> to use an index.
59 <productname>PostgreSQL</productname> provides the index methods
60 B-tree, hash, GiST, and GIN. Users can also define their own index
61 methods, but that is fairly complicated.
65 When the <literal>WHERE</literal> clause is present, a
66 <firstterm>partial index</firstterm> is created.
67 A partial index is an index that contains entries for only a portion of
68 a table, usually a portion that is more useful for indexing than the
69 rest of the table. For example, if you have a table that contains both
70 billed and unbilled orders where the unbilled orders take up a small
71 fraction of the total table and yet that is an often used section, you
72 can improve performance by creating an index on just that portion.
73 Another possible application is to use <literal>WHERE</literal> with
74 <literal>UNIQUE</literal> to enforce uniqueness over a subset of a
75 table. See <xref linkend="indexes-partial"> for more discussion.
79 The expression used in the <literal>WHERE</literal> clause can refer
80 only to columns of the underlying table, but it can use all columns,
81 not just the ones being indexed. Presently, subqueries and
82 aggregate expressions are also forbidden in <literal>WHERE</literal>.
83 The same restrictions apply to index fields that are expressions.
87 All functions and operators used in an index definition must be
88 <quote>immutable</>, that is, their results must depend only on
89 their arguments and never on any outside influence (such as
90 the contents of another table or the current time). This restriction
91 ensures that the behavior of the index is well-defined. To use a
92 user-defined function in an index expression or <literal>WHERE</literal>
93 clause, remember to mark the function immutable when you create it.
98 <title>Parameters</title>
102 <term><literal>UNIQUE</literal></term>
105 Causes the system to check for
106 duplicate values in the table when the index is created (if data
107 already exist) and each time data is added. Attempts to
108 insert or update data which would result in duplicate entries
109 will generate an error.
115 <term><literal>CONCURRENTLY</literal></term>
118 When this option is used, <productname>PostgreSQL</> will build the
119 index without taking any locks that prevent concurrent inserts,
120 updates, or deletes on the table; whereas a standard index build
121 locks out writes (but not reads) on the table until it's done.
122 There are several caveats to be aware of when using this option
123 — see <xref linkend="SQL-CREATEINDEX-CONCURRENTLY"
124 endterm="SQL-CREATEINDEX-CONCURRENTLY-title">.
130 <term><replaceable class="parameter">name</replaceable></term>
133 The name of the index to be created. No schema name can be included
134 here; the index is always created in the same schema as its parent
135 table. If the name is omitted, <productname>PostgreSQL</> chooses a
136 suitable name based on the parent table's name and the indexed column
143 <term><replaceable class="parameter">table</replaceable></term>
146 The name (possibly schema-qualified) of the table to be indexed.
152 <term><replaceable class="parameter">method</replaceable></term>
155 The name of the index method to be used. Choices are
156 <literal>btree</literal>, <literal>hash</literal>,
157 <literal>gist</literal>, and <literal>gin</>. The
158 default method is <literal>btree</literal>.
164 <term><replaceable class="parameter">column</replaceable></term>
167 The name of a column of the table.
173 <term><replaceable class="parameter">expression</replaceable></term>
176 An expression based on one or more columns of the table. The
177 expression usually must be written with surrounding parentheses,
178 as shown in the syntax. However, the parentheses can be omitted
179 if the expression has the form of a function call.
185 <term><replaceable class="parameter">collation</replaceable></term>
188 The name of the collation to use for the index. By default,
189 the index uses the collation declared for the column to be
190 indexed or the result collation of the expression to be
191 indexed. Indexes with non-default collations can be useful for
192 queries that involve expressions using non-default collations.
198 <term><replaceable class="parameter">opclass</replaceable></term>
201 The name of an operator class. See below for details.
207 <term><literal>ASC</></term>
210 Specifies ascending sort order (which is the default).
216 <term><literal>DESC</></term>
219 Specifies descending sort order.
225 <term><literal>NULLS FIRST</></term>
228 Specifies that nulls sort before non-nulls. This is the default
229 when <literal>DESC</> is specified.
235 <term><literal>NULLS LAST</></term>
238 Specifies that nulls sort after non-nulls. This is the default
239 when <literal>DESC</> is not specified.
245 <term><replaceable class="parameter">storage_parameter</replaceable></term>
248 The name of an index-method-specific storage parameter. See
249 <xref linkend="sql-createindex-storage-parameters" endterm="sql-createindex-storage-parameters-title">
256 <term><replaceable class="parameter">tablespace</replaceable></term>
259 The tablespace in which to create the index. If not specified,
260 <xref linkend="guc-default-tablespace"> is consulted, or
261 <xref linkend="guc-temp-tablespaces"> for indexes on temporary
268 <term><replaceable class="parameter">predicate</replaceable></term>
271 The constraint expression for a partial index.
278 <refsect2 id="SQL-CREATEINDEX-storage-parameters">
279 <title id="SQL-CREATEINDEX-storage-parameters-title">Index Storage Parameters</title>
282 The optional <literal>WITH</> clause specifies <firstterm>storage
283 parameters</> for the index. Each index method has its own set of allowed
284 storage parameters. The B-tree, hash and GiST index methods all accept a
291 <term><literal>FILLFACTOR</></term>
294 The fillfactor for an index is a percentage that determines how full
295 the index method will try to pack index pages. For B-trees, leaf pages
296 are filled to this percentage during initial index build, and also
297 when extending the index at the right (adding new largest key values).
299 subsequently become completely full, they will be split, leading to
300 gradual degradation in the index's efficiency. B-trees use a default
301 fillfactor of 90, but any integer value from 10 to 100 can be selected.
302 If the table is static then fillfactor 100 is best to minimize the
303 index's physical size, but for heavily updated tables a smaller
304 fillfactor is better to minimize the need for page splits. The
305 other index methods use fillfactor in different but roughly analogous
306 ways; the default fillfactor varies between methods.
314 GIN indexes accept a different parameter:
320 <term><literal>FASTUPDATE</></term>
323 This setting controls usage of the fast update technique described in
324 <xref linkend="gin-fast-update">. It is a Boolean parameter:
325 <literal>ON</> enables fast update, <literal>OFF</> disables it.
326 (Alternative spellings of <literal>ON</> and <literal>OFF</> are
327 allowed as described in <xref linkend="config-setting">.) The
328 default is <literal>ON</>.
333 Turning <literal>FASTUPDATE</> off via <command>ALTER INDEX</> prevents
334 future insertions from going into the list of pending index entries,
335 but does not in itself flush previous entries. You might want to
336 <command>VACUUM</> the table afterward to ensure the pending list is
345 GiST indexes additionally accept parameter:
351 <term><literal>BUFFERING</></term>
354 Determines whether the buffering build technique described in
355 <xref linkend="gist-buffering-build"> is used to build the index. With
356 <literal>OFF</> it is disabled, with <literal>ON</> it is enabled, and
357 with <literal>AUTO</> it is initially disabled, but turned on
358 on-the-fly once the index size reaches <xref linkend="guc-effective-cache-size">. The default is <literal>AUTO</>.
366 <refsect2 id="SQL-CREATEINDEX-CONCURRENTLY">
367 <title id="SQL-CREATEINDEX-CONCURRENTLY-title">Building Indexes Concurrently</title>
369 <indexterm zone="SQL-CREATEINDEX-CONCURRENTLY">
370 <primary>index</primary>
371 <secondary>building concurrently</secondary>
375 Creating an index can interfere with regular operation of a database.
376 Normally <productname>PostgreSQL</> locks the table to be indexed against
377 writes and performs the entire index build with a single scan of the
378 table. Other transactions can still read the table, but if they try to
379 insert, update, or delete rows in the table they will block until the
380 index build is finished. This could have a severe effect if the system is
381 a live production database. Very large tables can take many hours to be
382 indexed, and even for smaller tables, an index build can lock out writers
383 for periods that are unacceptably long for a production system.
387 <productname>PostgreSQL</> supports building indexes without locking
388 out writes. This method is invoked by specifying the
389 <literal>CONCURRENTLY</> option of <command>CREATE INDEX</>.
390 When this option is used,
391 <productname>PostgreSQL</> must perform two scans of the table, and in
392 addition it must wait for all existing transactions that could potentially
393 use the index to terminate. Thus
394 this method requires more total work than a standard index build and takes
395 significantly longer to complete. However, since it allows normal
396 operations to continue while the index is built, this method is useful for
397 adding new indexes in a production environment. Of course, the extra CPU
398 and I/O load imposed by the index creation might slow other operations.
402 In a concurrent index build, the index is actually entered into the
403 system catalogs in one transaction, then the two table scans occur in a
404 second and third transaction. All active transactions at the time the
405 second table scan starts, not just ones that already involve the table,
406 have the potential to block the concurrent index creation until they
407 finish. When checking for transactions that could still use the original
408 index, concurrent index creation advances through potentially interfering
409 older transactions one at a time, obtaining shared locks on their virtual
410 transaction identifiers to wait for them to complete.
414 If a problem arises while scanning the table, such as a
415 uniqueness violation in a unique index, the <command>CREATE INDEX</>
416 command will fail but leave behind an <quote>invalid</> index. This index
417 will be ignored for querying purposes because it might be incomplete;
418 however it will still consume update overhead. The <application>psql</>
419 <command>\d</> command will report such an index as <literal>INVALID</>:
424 Column | Type | Modifiers
425 --------+---------+-----------
428 "idx" btree (col) INVALID
431 The recommended recovery
432 method in such cases is to drop the index and try again to perform
433 <command>CREATE INDEX CONCURRENTLY</>. (Another possibility is to rebuild
434 the index with <command>REINDEX</>. However, since <command>REINDEX</>
435 does not support concurrent builds, this option is unlikely to seem
440 Another caveat when building a unique index concurrently is that the
441 uniqueness constraint is already being enforced against other transactions
442 when the second table scan begins. This means that constraint violations
443 could be reported in other queries prior to the index becoming available
444 for use, or even in cases where the index build eventually fails. Also,
445 if a failure does occur in the second scan, the <quote>invalid</> index
446 continues to enforce its uniqueness constraint afterwards.
450 Concurrent builds of expression indexes and partial indexes are supported.
451 Errors occurring in the evaluation of these expressions could cause
452 behavior similar to that described above for unique constraint violations.
456 Regular index builds permit other regular index builds on the
457 same table to occur in parallel, but only one concurrent index build
458 can occur on a table at a time. In both cases, no other types of schema
459 modification on the table are allowed meanwhile. Another difference
460 is that a regular <command>CREATE INDEX</> command can be performed within
461 a transaction block, but <command>CREATE INDEX CONCURRENTLY</> cannot.
470 See <xref linkend="indexes"> for information about when indexes can
471 be used, when they are not used, and in which particular situations
476 Currently, only the B-tree, GiST and GIN index methods support
477 multicolumn indexes. Up to 32 fields can be specified by default.
478 (This limit can be altered when building
479 <productname>PostgreSQL</productname>.) Only B-tree currently
480 supports unique indexes.
484 An <firstterm>operator class</firstterm> can be specified for each
485 column of an index. The operator class identifies the operators to be
486 used by the index for that column. For example, a B-tree index on
487 four-byte integers would use the <literal>int4_ops</literal> class;
488 this operator class includes comparison functions for four-byte
489 integers. In practice the default operator class for the column's data
490 type is usually sufficient. The main point of having operator classes
491 is that for some data types, there could be more than one meaningful
492 ordering. For example, we might want to sort a complex-number data
493 type either by absolute value or by real part. We could do this by
494 defining two operator classes for the data type and then selecting
495 the proper class when making an index. More information about
496 operator classes is in <xref linkend="indexes-opclass"> and in <xref
501 For index methods that support ordered scans (currently, only B-tree),
502 the optional clauses <literal>ASC</>, <literal>DESC</>, <literal>NULLS
503 FIRST</>, and/or <literal>NULLS LAST</> can be specified to modify
504 the sort ordering of the index. Since an ordered index can be
505 scanned either forward or backward, it is not normally useful to create a
506 single-column <literal>DESC</> index — that sort ordering is already
507 available with a regular index. The value of these options is that
508 multicolumn indexes can be created that match the sort ordering requested
509 by a mixed-ordering query, such as <literal>SELECT ... ORDER BY x ASC, y
510 DESC</>. The <literal>NULLS</> options are useful if you need to support
511 <quote>nulls sort low</> behavior, rather than the default <quote>nulls
512 sort high</>, in queries that depend on indexes to avoid sorting steps.
516 For most index methods, the speed of creating an index is
517 dependent on the setting of <xref linkend="guc-maintenance-work-mem">.
518 Larger values will reduce the time needed for index creation, so long
519 as you don't make it larger than the amount of memory really available,
520 which would drive the machine into swapping. For hash indexes, the
521 value of <xref linkend="guc-effective-cache-size"> is also relevant to
522 index creation time: <productname>PostgreSQL</productname> will use one
523 of two different hash index creation methods depending on whether the
524 estimated index size is more or less than <varname>effective_cache_size</>.
525 For best results, make sure that this parameter is also set to something
526 reflective of available memory, and be careful that the sum of
527 <varname>maintenance_work_mem</> and <varname>effective_cache_size</> is
528 less than the machine's RAM less whatever space is needed by other
533 Use <xref linkend="sql-dropindex">
538 Prior releases of <productname>PostgreSQL</productname> also had an
539 R-tree index method. This method has been removed because
540 it had no significant advantages over the GiST method.
541 If <literal>USING rtree</> is specified, <command>CREATE INDEX</>
542 will interpret it as <literal>USING gist</>, to simplify conversion
543 of old databases to GiST.
548 <title>Examples</title>
551 To create a B-tree index on the column <literal>title</literal> in
552 the table <literal>films</literal>:
554 CREATE UNIQUE INDEX title_idx ON films (title);
559 To create an index on the expression <literal>lower(title)</>,
560 allowing efficient case-insensitive searches:
562 CREATE INDEX ON films ((lower(title)));
564 (In this example we have chosen to omit the index name, so the system
565 will choose a name, typically <literal>films_lower_idx</>.)
569 To create an index with non-default collation:
571 CREATE INDEX title_idx_german ON films (title COLLATE "de_DE");
576 To create an index with non-default sort ordering of nulls:
578 CREATE INDEX title_idx_nulls_low ON films (title NULLS FIRST);
583 To create an index with non-default fill factor:
585 CREATE UNIQUE INDEX title_idx ON films (title) WITH (fillfactor = 70);
590 To create a <acronym>GIN</> index with fast updates disabled:
592 CREATE INDEX gin_idx ON documents_table USING gin (locations) WITH (fastupdate = off);
597 To create an index on the column <literal>code</> in the table
598 <literal>films</> and have the index reside in the tablespace
599 <literal>indexspace</>:
601 CREATE INDEX code_idx ON films (code) TABLESPACE indexspace;
606 To create a GiST index on a point attribute so that we
607 can efficiently use box operators on the result of the
610 CREATE INDEX pointloc
611 ON points USING gist (box(location,location));
613 WHERE box(location,location) && '(0,0),(1,1)'::box;
618 To create an index without locking out writes to the table:
620 CREATE INDEX CONCURRENTLY sales_quantity_index ON sales_table (quantity);
621 </programlisting></para>
626 <title>Compatibility</title>
629 <command>CREATE INDEX</command> is a
630 <productname>PostgreSQL</productname> language extension. There
631 are no provisions for indexes in the SQL standard.
636 <title>See Also</title>
638 <simplelist type="inline">
639 <member><xref linkend="sql-alterindex"></member>
640 <member><xref linkend="sql-dropindex"></member>