1 <!-- $PostgreSQL: pgsql/doc/src/sgml/queries.sgml,v 1.39 2006/10/24 02:24:27 tgl Exp $ -->
6 <indexterm zone="queries">
7 <primary>query</primary>
10 <indexterm zone="queries">
11 <primary>SELECT</primary>
15 The previous chapters explained how to create tables, how to fill
16 them with data, and how to manipulate that data. Now we finally
17 discuss how to retrieve the data out of the database.
21 <sect1 id="queries-overview">
22 <title>Overview</title>
25 The process of retrieving or the command to retrieve data from a
26 database is called a <firstterm>query</firstterm>. In SQL the
27 <xref linkend="sql-select" endterm="sql-select-title"> command is
28 used to specify queries. The general syntax of the
29 <command>SELECT</command> command is
31 SELECT <replaceable>select_list</replaceable> FROM <replaceable>table_expression</replaceable> <optional><replaceable>sort_specification</replaceable></optional>
33 The following sections describe the details of the select list, the
34 table expression, and the sort specification.
38 A simple kind of query has the form
42 Assuming that there is a table called <literal>table1</literal>,
43 this command would retrieve all rows and all columns from
44 <literal>table1</literal>. (The method of retrieval depends on the
45 client application. For example, the
46 <application>psql</application> program will display an ASCII-art
47 table on the screen, while client libraries will offer functions to
48 extract individual values from the query result.) The select list
49 specification <literal>*</literal> means all columns that the table
50 expression happens to provide. A select list can also select a
51 subset of the available columns or make calculations using the
52 columns. For example, if
53 <literal>table1</literal> has columns named <literal>a</>,
54 <literal>b</>, and <literal>c</> (and perhaps others) you can make
57 SELECT a, b + c FROM table1;
59 (assuming that <literal>b</> and <literal>c</> are of a numerical
61 See <xref linkend="queries-select-lists"> for more details.
65 <literal>FROM table1</literal> is a particularly simple kind of
66 table expression: it reads just one table. In general, table
67 expressions can be complex constructs of base tables, joins, and
68 subqueries. But you can also omit the table expression entirely and
69 use the <command>SELECT</command> command as a calculator:
73 This is more useful if the expressions in the select list return
74 varying results. For example, you could call a function this way:
82 <sect1 id="queries-table-expressions">
83 <title>Table Expressions</title>
85 <indexterm zone="queries-table-expressions">
86 <primary>table expression</primary>
90 A <firstterm>table expression</firstterm> computes a table. The
91 table expression contains a <literal>FROM</> clause that is
92 optionally followed by <literal>WHERE</>, <literal>GROUP BY</>, and
93 <literal>HAVING</> clauses. Trivial table expressions simply refer
94 to a table on disk, a so-called base table, but more complex
95 expressions can be used to modify or combine base tables in various
100 The optional <literal>WHERE</>, <literal>GROUP BY</>, and
101 <literal>HAVING</> clauses in the table expression specify a
102 pipeline of successive transformations performed on the table
103 derived in the <literal>FROM</> clause. All these transformations
104 produce a virtual table that provides the rows that are passed to
105 the select list to compute the output rows of the query.
108 <sect2 id="queries-from">
109 <title>The <literal>FROM</literal> Clause</title>
112 The <xref linkend="sql-from" endterm="sql-from-title"> derives a
113 table from one or more other tables given in a comma-separated
114 table reference list.
116 FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_reference</replaceable> <optional>, ...</optional></optional>
119 A table reference may be a table name (possibly schema-qualified),
120 or a derived table such as a subquery, a table join, or complex
121 combinations of these. If more than one table reference is listed
122 in the <literal>FROM</> clause they are cross-joined (see below)
123 to form the intermediate virtual table that may then be subject to
124 transformations by the <literal>WHERE</>, <literal>GROUP BY</>,
125 and <literal>HAVING</> clauses and is finally the result of the
126 overall table expression.
130 <primary>ONLY</primary>
134 When a table reference names a table that is the parent of a
135 table inheritance hierarchy, the table reference produces rows of
136 not only that table but all of its descendant tables, unless the
137 key word <literal>ONLY</> precedes the table name. However, the
138 reference produces only the columns that appear in the named table
139 — any columns added in subtables are ignored.
142 <sect3 id="queries-join">
143 <title>Joined Tables</title>
145 <indexterm zone="queries-join">
146 <primary>join</primary>
150 A joined table is a table derived from two other (real or
151 derived) tables according to the rules of the particular join
152 type. Inner, outer, and cross-joins are available.
156 <title>Join Types</title>
159 <term>Cross join</term>
162 <primary>join</primary>
163 <secondary>cross</secondary>
167 <primary>cross join</primary>
172 <replaceable>T1</replaceable> CROSS JOIN <replaceable>T2</replaceable>
176 For each combination of rows from
177 <replaceable>T1</replaceable> and
178 <replaceable>T2</replaceable>, the derived table will contain a
179 row consisting of all columns in <replaceable>T1</replaceable>
180 followed by all columns in <replaceable>T2</replaceable>. If
181 the tables have N and M rows respectively, the joined
182 table will have N * M rows.
186 <literal>FROM <replaceable>T1</replaceable> CROSS JOIN
187 <replaceable>T2</replaceable></literal> is equivalent to
188 <literal>FROM <replaceable>T1</replaceable>,
189 <replaceable>T2</replaceable></literal>. It is also equivalent to
190 <literal>FROM <replaceable>T1</replaceable> INNER JOIN
191 <replaceable>T2</replaceable> ON TRUE</literal> (see below).
197 <term>Qualified joins</term>
200 <primary>join</primary>
201 <secondary>outer</secondary>
205 <primary>outer join</primary>
210 <replaceable>T1</replaceable> { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable> ON <replaceable>boolean_expression</replaceable>
211 <replaceable>T1</replaceable> { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable> USING ( <replaceable>join column list</replaceable> )
212 <replaceable>T1</replaceable> NATURAL { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable>
216 The words <literal>INNER</literal> and
217 <literal>OUTER</literal> are optional in all forms.
218 <literal>INNER</literal> is the default;
219 <literal>LEFT</literal>, <literal>RIGHT</literal>, and
220 <literal>FULL</literal> imply an outer join.
224 The <firstterm>join condition</firstterm> is specified in the
225 <literal>ON</> or <literal>USING</> clause, or implicitly by
226 the word <literal>NATURAL</>. The join condition determines
227 which rows from the two source tables are considered to
228 <quote>match</quote>, as explained in detail below.
232 The <literal>ON</> clause is the most general kind of join
233 condition: it takes a Boolean value expression of the same
234 kind as is used in a <literal>WHERE</> clause. A pair of rows
235 from <replaceable>T1</> and <replaceable>T2</> match if the
236 <literal>ON</> expression evaluates to true for them.
240 <literal>USING</> is a shorthand notation: it takes a
241 comma-separated list of column names, which the joined tables
242 must have in common, and forms a join condition specifying
243 equality of each of these pairs of columns. Furthermore, the
244 output of a <literal>JOIN USING</> has one column for each of
245 the equated pairs of input columns, followed by all of the
246 other columns from each table. Thus, <literal>USING (a, b,
247 c)</literal> is equivalent to <literal>ON (t1.a = t2.a AND
248 t1.b = t2.b AND t1.c = t2.c)</literal> with the exception that
249 if <literal>ON</> is used there will be two columns
250 <literal>a</>, <literal>b</>, and <literal>c</> in the result,
251 whereas with <literal>USING</> there will be only one of each.
256 <primary>join</primary>
257 <secondary>natural</secondary>
260 <primary>natural join</primary>
262 Finally, <literal>NATURAL</> is a shorthand form of
263 <literal>USING</>: it forms a <literal>USING</> list
264 consisting of exactly those column names that appear in both
265 input tables. As with <literal>USING</>, these columns appear
266 only once in the output table.
270 The possible types of qualified join are:
274 <term><literal>INNER JOIN</></term>
278 For each row R1 of T1, the joined table has a row for each
279 row in T2 that satisfies the join condition with R1.
285 <term><literal>LEFT OUTER JOIN</></term>
288 <primary>join</primary>
289 <secondary>left</secondary>
293 <primary>left join</primary>
298 First, an inner join is performed. Then, for each row in
299 T1 that does not satisfy the join condition with any row in
300 T2, a joined row is added with null values in columns of
301 T2. Thus, the joined table unconditionally has at least
302 one row for each row in T1.
308 <term><literal>RIGHT OUTER JOIN</></term>
311 <primary>join</primary>
312 <secondary>right</secondary>
316 <primary>right join</primary>
321 First, an inner join is performed. Then, for each row in
322 T2 that does not satisfy the join condition with any row in
323 T1, a joined row is added with null values in columns of
324 T1. This is the converse of a left join: the result table
325 will unconditionally have a row for each row in T2.
331 <term><literal>FULL OUTER JOIN</></term>
335 First, an inner join is performed. Then, for each row in
336 T1 that does not satisfy the join condition with any row in
337 T2, a joined row is added with null values in columns of
338 T2. Also, for each row of T2 that does not satisfy the
339 join condition with any row in T1, a joined row with null
340 values in the columns of T1 is added.
351 Joins of all types can be chained together or nested: either or
352 both of <replaceable>T1</replaceable> and
353 <replaceable>T2</replaceable> may be joined tables. Parentheses
354 may be used around <literal>JOIN</> clauses to control the join
355 order. In the absence of parentheses, <literal>JOIN</> clauses
360 To put this together, assume we have tables <literal>t1</literal>
368 and <literal>t2</literal>
376 then we get the following results for the various joins:
378 <prompt>=></> <userinput>SELECT * FROM t1 CROSS JOIN t2;</>
379 num | name | num | value
380 -----+------+-----+-------
392 <prompt>=></> <userinput>SELECT * FROM t1 INNER JOIN t2 ON t1.num = t2.num;</>
393 num | name | num | value
394 -----+------+-----+-------
399 <prompt>=></> <userinput>SELECT * FROM t1 INNER JOIN t2 USING (num);</>
406 <prompt>=></> <userinput>SELECT * FROM t1 NATURAL INNER JOIN t2;</>
413 <prompt>=></> <userinput>SELECT * FROM t1 LEFT JOIN t2 ON t1.num = t2.num;</>
414 num | name | num | value
415 -----+------+-----+-------
421 <prompt>=></> <userinput>SELECT * FROM t1 LEFT JOIN t2 USING (num);</>
429 <prompt>=></> <userinput>SELECT * FROM t1 RIGHT JOIN t2 ON t1.num = t2.num;</>
430 num | name | num | value
431 -----+------+-----+-------
437 <prompt>=></> <userinput>SELECT * FROM t1 FULL JOIN t2 ON t1.num = t2.num;</>
438 num | name | num | value
439 -----+------+-----+-------
449 The join condition specified with <literal>ON</> can also contain
450 conditions that do not relate directly to the join. This can
451 prove useful for some queries but needs to be thought out
452 carefully. For example:
454 <prompt>=></> <userinput>SELECT * FROM t1 LEFT JOIN t2 ON t1.num = t2.num AND t2.value = 'xxx';</>
455 num | name | num | value
456 -----+------+-----+-------
465 <sect3 id="queries-table-aliases">
466 <title>Table and Column Aliases</title>
468 <indexterm zone="queries-table-aliases">
469 <primary>alias</primary>
470 <secondary>in the FROM clause</secondary>
474 <primary>label</primary>
479 A temporary name can be given to tables and complex table
480 references to be used for references to the derived table in
481 the rest of the query. This is called a <firstterm>table
486 To create a table alias, write
488 FROM <replaceable>table_reference</replaceable> AS <replaceable>alias</replaceable>
492 FROM <replaceable>table_reference</replaceable> <replaceable>alias</replaceable>
494 The <literal>AS</literal> key word is noise.
495 <replaceable>alias</replaceable> can be any identifier.
499 A typical application of table aliases is to assign short
500 identifiers to long table names to keep the join clauses
501 readable. For example:
503 SELECT * FROM some_very_long_table_name s JOIN another_fairly_long_name a ON s.id = a.num;
508 The alias becomes the new name of the table reference for the
509 current query — it is no longer possible to refer to the table
510 by the original name. Thus
512 SELECT * FROM my_table AS m WHERE my_table.a > 5;
514 is not valid according to the SQL standard. In
515 <productname>PostgreSQL</productname> this will draw an error if the
516 <xref linkend="guc-add-missing-from"> configuration variable is
517 <literal>off</> (as it is by default). If it is <literal>on</>,
518 an implicit table reference will be added to the
519 <literal>FROM</literal> clause, so the query is processed as if
522 SELECT * FROM my_table AS m, my_table AS my_table WHERE my_table.a > 5;
524 That will result in a cross join, which is usually not what you want.
528 Table aliases are mainly for notational convenience, but it is
529 necessary to use them when joining a table to itself, e.g.,
531 SELECT * FROM people AS mother JOIN people AS child ON mother.id = child.mother_id;
533 Additionally, an alias is required if the table reference is a
534 subquery (see <xref linkend="queries-subqueries">).
538 Parentheses are used to resolve ambiguities. In the following example,
539 the first statement assigns the alias <literal>b</literal> to the second
540 instance of <literal>my_table</>, but the second statement assigns the
541 alias to the result of the join:
543 SELECT * FROM my_table AS a CROSS JOIN my_table AS b ...
544 SELECT * FROM (my_table AS a CROSS JOIN my_table) AS b ...
549 Another form of table aliasing gives temporary names to the columns of
550 the table, as well as the table itself:
552 FROM <replaceable>table_reference</replaceable> <optional>AS</optional> <replaceable>alias</replaceable> ( <replaceable>column1</replaceable> <optional>, <replaceable>column2</replaceable> <optional>, ...</optional></optional> )
554 If fewer column aliases are specified than the actual table has
555 columns, the remaining columns are not renamed. This syntax is
556 especially useful for self-joins or subqueries.
560 When an alias is applied to the output of a <literal>JOIN</>
561 clause, using any of these forms, the alias hides the original
562 names within the <literal>JOIN</>. For example,
564 SELECT a.* FROM my_table AS a JOIN your_table AS b ON ...
568 SELECT a.* FROM (my_table AS a JOIN your_table AS b ON ...) AS c
570 is not valid: the table alias <literal>a</> is not visible
571 outside the alias <literal>c</>.
575 <sect3 id="queries-subqueries">
576 <title>Subqueries</title>
578 <indexterm zone="queries-subqueries">
579 <primary>subquery</primary>
583 Subqueries specifying a derived table must be enclosed in
584 parentheses and <emphasis>must</emphasis> be assigned a table
585 alias name. (See <xref linkend="queries-table-aliases">.) For
588 FROM (SELECT * FROM table1) AS alias_name
593 This example is equivalent to <literal>FROM table1 AS
594 alias_name</literal>. More interesting cases, which can't be
595 reduced to a plain join, arise when the subquery involves
596 grouping or aggregation.
600 A subquery can also be a <command>VALUES</> list:
602 FROM (VALUES ('anne', 'smith'), ('bob', 'jones'), ('joe', 'blow'))
603 AS names(first, last)
605 Again, a table alias is required. Assigning alias names to the columns
606 of the <command>VALUES</> list is optional, but is good practice.
607 For more information see <xref linkend="queries-values">.
611 <sect3 id="queries-tablefunctions">
612 <title>Table Functions</title>
614 <indexterm zone="queries-tablefunctions"><primary>table function</></>
616 <indexterm zone="queries-tablefunctions">
618 <secondary>in the FROM clause</>
622 Table functions are functions that produce a set of rows, made up
623 of either base data types (scalar types) or composite data types
624 (table rows). They are used like a table, view, or subquery in
625 the <literal>FROM</> clause of a query. Columns returned by table
626 functions may be included in <literal>SELECT</>,
627 <literal>JOIN</>, or <literal>WHERE</> clauses in the same manner
628 as a table, view, or subquery column.
632 If a table function returns a base data type, the single result
633 column is named like the function. If the function returns a
634 composite type, the result columns get the same names as the
635 individual attributes of the type.
639 A table function may be aliased in the <literal>FROM</> clause,
640 but it also may be left unaliased. If a function is used in the
641 <literal>FROM</> clause with no alias, the function name is used
642 as the resulting table name.
648 CREATE TABLE foo (fooid int, foosubid int, fooname text);
650 CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS $$
651 SELECT * FROM foo WHERE fooid = $1;
654 SELECT * FROM getfoo(1) AS t1;
657 WHERE foosubid IN (select foosubid from getfoo(foo.fooid) z
658 where z.fooid = foo.fooid);
660 CREATE VIEW vw_getfoo AS SELECT * FROM getfoo(1);
662 SELECT * FROM vw_getfoo;
667 In some cases it is useful to define table functions that can
668 return different column sets depending on how they are invoked.
669 To support this, the table function can be declared as returning
670 the pseudotype <type>record</>. When such a function is used in
671 a query, the expected row structure must be specified in the
672 query itself, so that the system can know how to parse and plan
673 the query. Consider this example:
676 FROM dblink('dbname=mydb', 'select proname, prosrc from pg_proc')
677 AS t1(proname name, prosrc text)
678 WHERE proname LIKE 'bytea%';
680 The <literal>dblink</> function executes a remote query (see
681 <filename>contrib/dblink</>). It is declared to return
682 <type>record</> since it might be used for any kind of query.
683 The actual column set must be specified in the calling query so
684 that the parser knows, for example, what <literal>*</> should
690 <sect2 id="queries-where">
691 <title>The <literal>WHERE</literal> Clause</title>
693 <indexterm zone="queries-where">
694 <primary>WHERE</primary>
698 The syntax of the <xref linkend="sql-where"
699 endterm="sql-where-title"> is
701 WHERE <replaceable>search_condition</replaceable>
703 where <replaceable>search_condition</replaceable> is any value
704 expression (see <xref linkend="sql-expressions">) that
705 returns a value of type <type>boolean</type>.
709 After the processing of the <literal>FROM</> clause is done, each
710 row of the derived virtual table is checked against the search
711 condition. If the result of the condition is true, the row is
712 kept in the output table, otherwise (that is, if the result is
713 false or null) it is discarded. The search condition typically
714 references at least some column of the table generated in the
715 <literal>FROM</> clause; this is not required, but otherwise the
716 <literal>WHERE</> clause will be fairly useless.
721 The join condition of an inner join can be written either in
722 the <literal>WHERE</> clause or in the <literal>JOIN</> clause.
723 For example, these table expressions are equivalent:
725 FROM a, b WHERE a.id = b.id AND b.val > 5
729 FROM a INNER JOIN b ON (a.id = b.id) WHERE b.val > 5
733 FROM a NATURAL JOIN b WHERE b.val > 5
735 Which one of these you use is mainly a matter of style. The
736 <literal>JOIN</> syntax in the <literal>FROM</> clause is
737 probably not as portable to other SQL database management systems. For
738 outer joins there is no choice in any case: they must be done in
739 the <literal>FROM</> clause. An <literal>ON</>/<literal>USING</>
740 clause of an outer join is <emphasis>not</> equivalent to a
741 <literal>WHERE</> condition, because it determines the addition
742 of rows (for unmatched input rows) as well as the removal of rows
743 from the final result.
748 Here are some examples of <literal>WHERE</literal> clauses:
750 SELECT ... FROM fdt WHERE c1 > 5
752 SELECT ... FROM fdt WHERE c1 IN (1, 2, 3)
754 SELECT ... FROM fdt WHERE c1 IN (SELECT c1 FROM t2)
756 SELECT ... FROM fdt WHERE c1 IN (SELECT c3 FROM t2 WHERE c2 = fdt.c1 + 10)
758 SELECT ... FROM fdt WHERE c1 BETWEEN (SELECT c3 FROM t2 WHERE c2 = fdt.c1 + 10) AND 100
760 SELECT ... FROM fdt WHERE EXISTS (SELECT c1 FROM t2 WHERE c2 > fdt.c1)
762 <literal>fdt</literal> is the table derived in the
763 <literal>FROM</> clause. Rows that do not meet the search
764 condition of the <literal>WHERE</> clause are eliminated from
765 <literal>fdt</literal>. Notice the use of scalar subqueries as
766 value expressions. Just like any other query, the subqueries can
767 employ complex table expressions. Notice also how
768 <literal>fdt</literal> is referenced in the subqueries.
769 Qualifying <literal>c1</> as <literal>fdt.c1</> is only necessary
770 if <literal>c1</> is also the name of a column in the derived
771 input table of the subquery. But qualifying the column name adds
772 clarity even when it is not needed. This example shows how the column
773 naming scope of an outer query extends into its inner queries.
778 <sect2 id="queries-group">
779 <title>The <literal>GROUP BY</literal> and <literal>HAVING</literal> Clauses</title>
781 <indexterm zone="queries-group">
782 <primary>GROUP BY</primary>
785 <indexterm zone="queries-group">
786 <primary>grouping</primary>
790 After passing the <literal>WHERE</> filter, the derived input
791 table may be subject to grouping, using the <literal>GROUP BY</>
792 clause, and elimination of group rows using the <literal>HAVING</>
797 SELECT <replaceable>select_list</replaceable>
799 <optional>WHERE ...</optional>
800 GROUP BY <replaceable>grouping_column_reference</replaceable> <optional>, <replaceable>grouping_column_reference</replaceable></optional>...
804 The <xref linkend="sql-groupby" endterm="sql-groupby-title"> is
805 used to group together those rows in a table that share the same
806 values in all the columns listed. The order in which the columns
807 are listed does not matter. The effect is to combine each set
808 of rows sharing common values into one group row that is
809 representative of all rows in the group. This is done to
810 eliminate redundancy in the output and/or compute aggregates that
811 apply to these groups. For instance:
813 <prompt>=></> <userinput>SELECT * FROM test1;</>
822 <prompt>=></> <userinput>SELECT x FROM test1 GROUP BY x;</>
833 In the second query, we could not have written <literal>SELECT *
834 FROM test1 GROUP BY x</literal>, because there is no single value
835 for the column <literal>y</> that could be associated with each
836 group. The grouped-by columns can be referenced in the select list since
837 they have a single value in each group.
841 In general, if a table is grouped, columns that are not
842 used in the grouping cannot be referenced except in aggregate
843 expressions. An example with aggregate expressions is:
845 <prompt>=></> <userinput>SELECT x, sum(y) FROM test1 GROUP BY x;</>
853 Here <literal>sum</literal> is an aggregate function that
854 computes a single value over the entire group. More information
855 about the available aggregate functions can be found in <xref
856 linkend="functions-aggregate">.
861 Grouping without aggregate expressions effectively calculates the
862 set of distinct values in a column. This can also be achieved
863 using the <literal>DISTINCT</> clause (see <xref
864 linkend="queries-distinct">).
869 Here is another example: it calculates the total sales for each
870 product (rather than the total sales on all products).
872 SELECT product_id, p.name, (sum(s.units) * p.price) AS sales
873 FROM products p LEFT JOIN sales s USING (product_id)
874 GROUP BY product_id, p.name, p.price;
876 In this example, the columns <literal>product_id</literal>,
877 <literal>p.name</literal>, and <literal>p.price</literal> must be
878 in the <literal>GROUP BY</> clause since they are referenced in
879 the query select list. (Depending on how exactly the products
880 table is set up, name and price may be fully dependent on the
881 product ID, so the additional groupings could theoretically be
882 unnecessary, but this is not implemented yet.) The column
883 <literal>s.units</> does not have to be in the <literal>GROUP
884 BY</> list since it is only used in an aggregate expression
885 (<literal>sum(...)</literal>), which represents the sales
886 of a product. For each product, the query returns a summary row about
887 all sales of the product.
891 In strict SQL, <literal>GROUP BY</> can only group by columns of
892 the source table but <productname>PostgreSQL</productname> extends
893 this to also allow <literal>GROUP BY</> to group by columns in the
894 select list. Grouping by value expressions instead of simple
895 column names is also allowed.
899 <primary>HAVING</primary>
903 If a table has been grouped using a <literal>GROUP BY</literal>
904 clause, but then only certain groups are of interest, the
905 <literal>HAVING</literal> clause can be used, much like a
906 <literal>WHERE</> clause, to eliminate groups from a grouped
907 table. The syntax is:
909 SELECT <replaceable>select_list</replaceable> FROM ... <optional>WHERE ...</optional> GROUP BY ... HAVING <replaceable>boolean_expression</replaceable>
911 Expressions in the <literal>HAVING</> clause can refer both to
912 grouped expressions and to ungrouped expressions (which necessarily
913 involve an aggregate function).
919 <prompt>=></> <userinput>SELECT x, sum(y) FROM test1 GROUP BY x HAVING sum(y) > 3;</>
926 <prompt>=></> <userinput>SELECT x, sum(y) FROM test1 GROUP BY x HAVING x < 'c';</>
936 Again, a more realistic example:
938 SELECT product_id, p.name, (sum(s.units) * (p.price - p.cost)) AS profit
939 FROM products p LEFT JOIN sales s USING (product_id)
940 WHERE s.date > CURRENT_DATE - INTERVAL '4 weeks'
941 GROUP BY product_id, p.name, p.price, p.cost
942 HAVING sum(p.price * s.units) > 5000;
944 In the example above, the <literal>WHERE</> clause is selecting
945 rows by a column that is not grouped (the expression is only true for
946 sales during the last four weeks), while the <literal>HAVING</>
947 clause restricts the output to groups with total gross sales over
948 5000. Note that the aggregate expressions do not necessarily need
949 to be the same in all parts of the query.
955 <sect1 id="queries-select-lists">
956 <title>Select Lists</title>
959 <primary>SELECT</primary>
960 <secondary>select list</secondary>
964 As shown in the previous section,
965 the table expression in the <command>SELECT</command> command
966 constructs an intermediate virtual table by possibly combining
967 tables, views, eliminating rows, grouping, etc. This table is
968 finally passed on to processing by the <firstterm>select list</firstterm>. The select
969 list determines which <emphasis>columns</emphasis> of the
970 intermediate table are actually output.
973 <sect2 id="queries-select-list-items">
974 <title>Select-List Items</title>
981 The simplest kind of select list is <literal>*</literal> which
982 emits all columns that the table expression produces. Otherwise,
983 a select list is a comma-separated list of value expressions (as
984 defined in <xref linkend="sql-expressions">). For instance, it
985 could be a list of column names:
987 SELECT a, b, c FROM ...
989 The columns names <literal>a</>, <literal>b</>, and <literal>c</>
990 are either the actual names of the columns of tables referenced
991 in the <literal>FROM</> clause, or the aliases given to them as
992 explained in <xref linkend="queries-table-aliases">. The name
993 space available in the select list is the same as in the
994 <literal>WHERE</> clause, unless grouping is used, in which case
995 it is the same as in the <literal>HAVING</> clause.
999 If more than one table has a column of the same name, the table
1000 name must also be given, as in
1002 SELECT tbl1.a, tbl2.a, tbl1.b FROM ...
1004 When working with multiple tables, it can also be useful to ask for
1005 all the columns of a particular table:
1007 SELECT tbl1.*, tbl2.a FROM ...
1009 (See also <xref linkend="queries-where">.)
1013 If an arbitrary value expression is used in the select list, it
1014 conceptually adds a new virtual column to the returned table. The
1015 value expression is evaluated once for each result row, with
1016 the row's values substituted for any column references. But the
1017 expressions in the select list do not have to reference any
1018 columns in the table expression of the <literal>FROM</> clause;
1019 they could be constant arithmetic expressions as well, for
1024 <sect2 id="queries-column-labels">
1025 <title>Column Labels</title>
1027 <indexterm zone="queries-column-labels">
1028 <primary>alias</primary>
1029 <secondary>in the select list</secondary>
1033 The entries in the select list can be assigned names for further
1034 processing. The <quote>further processing</quote> in this case is
1035 an optional sort specification and the client application (e.g.,
1036 column headers for display). For example:
1038 SELECT a AS value, b + c AS sum FROM ...
1043 If no output column name is specified using <literal>AS</>, the system assigns a
1044 default name. For simple column references, this is the name of the
1045 referenced column. For function
1046 calls, this is the name of the function. For complex expressions,
1047 the system will generate a generic name.
1052 The naming of output columns here is different from that done in
1053 the <literal>FROM</> clause (see <xref
1054 linkend="queries-table-aliases">). This pipeline will in fact
1055 allow you to rename the same column twice, but the name chosen in
1056 the select list is the one that will be passed on.
1061 <sect2 id="queries-distinct">
1062 <title><literal>DISTINCT</literal></title>
1064 <indexterm zone="queries-distinct">
1065 <primary>DISTINCT</primary>
1068 <indexterm zone="queries-distinct">
1069 <primary>duplicates</primary>
1073 After the select list has been processed, the result table may
1074 optionally be subject to the elimination of duplicate rows. The
1075 <literal>DISTINCT</literal> key word is written directly after
1076 <literal>SELECT</literal> to specify this:
1078 SELECT DISTINCT <replaceable>select_list</replaceable> ...
1080 (Instead of <literal>DISTINCT</> the key word <literal>ALL</literal>
1081 can be used to specify the default behavior of retaining all rows.)
1085 <indexterm><primary>null value</><secondary sortas="DISTINCT">in
1086 DISTINCT</></indexterm>
1087 Obviously, two rows are considered distinct if they differ in at
1088 least one column value. Null values are considered equal in this
1093 Alternatively, an arbitrary expression can determine what rows are
1094 to be considered distinct:
1096 SELECT DISTINCT ON (<replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ...</optional>) <replaceable>select_list</replaceable> ...
1098 Here <replaceable>expression</replaceable> is an arbitrary value
1099 expression that is evaluated for all rows. A set of rows for
1100 which all the expressions are equal are considered duplicates, and
1101 only the first row of the set is kept in the output. Note that
1102 the <quote>first row</quote> of a set is unpredictable unless the
1103 query is sorted on enough columns to guarantee a unique ordering
1104 of the rows arriving at the <literal>DISTINCT</> filter.
1105 (<literal>DISTINCT ON</> processing occurs after <literal>ORDER
1110 The <literal>DISTINCT ON</> clause is not part of the SQL standard
1111 and is sometimes considered bad style because of the potentially
1112 indeterminate nature of its results. With judicious use of
1113 <literal>GROUP BY</> and subqueries in <literal>FROM</> the
1114 construct can be avoided, but it is often the most convenient
1121 <sect1 id="queries-union">
1122 <title>Combining Queries</title>
1124 <indexterm zone="queries-union">
1125 <primary>UNION</primary>
1127 <indexterm zone="queries-union">
1128 <primary>INTERSECT</primary>
1130 <indexterm zone="queries-union">
1131 <primary>EXCEPT</primary>
1133 <indexterm zone="queries-union">
1134 <primary>set union</primary>
1136 <indexterm zone="queries-union">
1137 <primary>set intersection</primary>
1139 <indexterm zone="queries-union">
1140 <primary>set difference</primary>
1142 <indexterm zone="queries-union">
1143 <primary>set operation</primary>
1147 The results of two queries can be combined using the set operations
1148 union, intersection, and difference. The syntax is
1150 <replaceable>query1</replaceable> UNION <optional>ALL</optional> <replaceable>query2</replaceable>
1151 <replaceable>query1</replaceable> INTERSECT <optional>ALL</optional> <replaceable>query2</replaceable>
1152 <replaceable>query1</replaceable> EXCEPT <optional>ALL</optional> <replaceable>query2</replaceable>
1154 <replaceable>query1</replaceable> and
1155 <replaceable>query2</replaceable> are queries that can use any of
1156 the features discussed up to this point. Set operations can also
1157 be nested and chained, for example
1159 <replaceable>query1</replaceable> UNION <replaceable>query2</replaceable> UNION <replaceable>query3</replaceable>
1163 (<replaceable>query1</replaceable> UNION <replaceable>query2</replaceable>) UNION <replaceable>query3</replaceable>
1168 <literal>UNION</> effectively appends the result of
1169 <replaceable>query2</replaceable> to the result of
1170 <replaceable>query1</replaceable> (although there is no guarantee
1171 that this is the order in which the rows are actually returned).
1172 Furthermore, it eliminates duplicate rows from its result, in the same
1173 way as <literal>DISTINCT</>, unless <literal>UNION ALL</> is used.
1177 <literal>INTERSECT</> returns all rows that are both in the result
1178 of <replaceable>query1</replaceable> and in the result of
1179 <replaceable>query2</replaceable>. Duplicate rows are eliminated
1180 unless <literal>INTERSECT ALL</> is used.
1184 <literal>EXCEPT</> returns all rows that are in the result of
1185 <replaceable>query1</replaceable> but not in the result of
1186 <replaceable>query2</replaceable>. (This is sometimes called the
1187 <firstterm>difference</> between two queries.) Again, duplicates
1188 are eliminated unless <literal>EXCEPT ALL</> is used.
1192 In order to calculate the union, intersection, or difference of two
1193 queries, the two queries must be <quote>union compatible</quote>,
1194 which means that they return the same number of columns and
1195 the corresponding columns have compatible data types, as
1196 described in <xref linkend="typeconv-union-case">.
1201 <sect1 id="queries-order">
1202 <title>Sorting Rows</title>
1204 <indexterm zone="queries-order">
1205 <primary>sorting</primary>
1208 <indexterm zone="queries-order">
1209 <primary>ORDER BY</primary>
1213 After a query has produced an output table (after the select list
1214 has been processed) it can optionally be sorted. If sorting is not
1215 chosen, the rows will be returned in an unspecified order. The actual
1216 order in that case will depend on the scan and join plan types and
1217 the order on disk, but it must not be relied on. A particular
1218 output ordering can only be guaranteed if the sort step is explicitly
1223 The <literal>ORDER BY</> clause specifies the sort order:
1225 SELECT <replaceable>select_list</replaceable>
1226 FROM <replaceable>table_expression</replaceable>
1227 ORDER BY <replaceable>sort_expression1</replaceable> <optional>ASC | DESC</optional> <optional>, <replaceable>sort_expression2</replaceable> <optional>ASC | DESC</optional> ...</optional>
1229 The sort expression(s) can be any expression that would be valid in the
1230 query's select list. An example is
1232 SELECT a, b FROM table1 ORDER BY a + b, c;
1234 When more than one expression is specified,
1235 the later values are used to sort rows that are equal according to the
1236 earlier values. Each expression may be followed by an optional
1237 <literal>ASC</> or <literal>DESC</> keyword to set the sort direction to
1238 ascending or descending. <literal>ASC</> order is the default.
1239 Ascending order puts smaller values first, where
1240 <quote>smaller</quote> is defined in terms of the
1241 <literal><</literal> operator. Similarly, descending order is
1242 determined with the <literal>></literal> operator.
1245 Actually, <productname>PostgreSQL</> uses the <firstterm>default B-tree
1246 operator class</> for the expression's data type to determine the sort
1247 ordering for <literal>ASC</> and <literal>DESC</>. Conventionally,
1248 data types will be set up so that the <literal><</literal> and
1249 <literal>></literal> operators correspond to this sort ordering,
1250 but a user-defined data type's designer could choose to do something
1257 For backwards compatibility with the SQL92 version of the standard,
1258 a <replaceable>sort_expression</> can instead be the name or number
1259 of an output column, as in
1261 SELECT a + b AS sum, c FROM table1 ORDER BY sum;
1262 SELECT a, max(b) FROM table1 GROUP BY a ORDER BY 1;
1264 both of which sort by the first output column. Note that an output
1265 column name has to stand alone, it's not allowed as part of an expression
1266 — for example, this is <emphasis>not</> correct:
1268 SELECT a + b AS sum, c FROM table1 ORDER BY sum + c; -- wrong
1270 This restriction is made to reduce ambiguity. There is still
1271 ambiguity if an <literal>ORDER BY</> item is a simple name that
1272 could match either an output column name or a column from the table
1273 expression. The output column is used in such cases. This would
1274 only cause confusion if you use <literal>AS</> to rename an output
1275 column to match some other table column's name.
1279 <literal>ORDER BY</> can be applied to the result of a
1280 <literal>UNION</>, <literal>INTERSECT</>, or <literal>EXCEPT</>
1281 combination, but in this case it is only permitted to sort by
1282 output column names or numbers, not by expressions.
1287 <sect1 id="queries-limit">
1288 <title><literal>LIMIT</literal> and <literal>OFFSET</literal></title>
1290 <indexterm zone="queries-limit">
1291 <primary>LIMIT</primary>
1294 <indexterm zone="queries-limit">
1295 <primary>OFFSET</primary>
1299 <literal>LIMIT</> and <literal>OFFSET</> allow you to retrieve just
1300 a portion of the rows that are generated by the rest of the query:
1302 SELECT <replaceable>select_list</replaceable>
1303 FROM <replaceable>table_expression</replaceable>
1304 <optional> ORDER BY <replaceable>sort_expression1</replaceable> <optional>ASC | DESC</optional> <optional>, <replaceable>sort_expression2</replaceable> <optional>ASC | DESC</optional> ...</optional> </optional>
1305 <optional> LIMIT { <replaceable>number</replaceable> | ALL } </optional> <optional> OFFSET <replaceable>number</replaceable> </optional>
1310 If a limit count is given, no more than that many rows will be
1311 returned (but possibly less, if the query itself yields less rows).
1312 <literal>LIMIT ALL</> is the same as omitting the <literal>LIMIT</>
1317 <literal>OFFSET</> says to skip that many rows before beginning to
1318 return rows. <literal>OFFSET 0</> is the same as
1319 omitting the <literal>OFFSET</> clause. If both <literal>OFFSET</>
1320 and <literal>LIMIT</> appear, then <literal>OFFSET</> rows are
1321 skipped before starting to count the <literal>LIMIT</> rows that
1326 When using <literal>LIMIT</>, it is important to use an
1327 <literal>ORDER BY</> clause that constrains the result rows into a
1328 unique order. Otherwise you will get an unpredictable subset of
1329 the query's rows. You may be asking for the tenth through
1330 twentieth rows, but tenth through twentieth in what ordering? The
1331 ordering is unknown, unless you specified <literal>ORDER BY</>.
1335 The query optimizer takes <literal>LIMIT</> into account when
1336 generating a query plan, so you are very likely to get different
1337 plans (yielding different row orders) depending on what you give
1338 for <literal>LIMIT</> and <literal>OFFSET</>. Thus, using
1339 different <literal>LIMIT</>/<literal>OFFSET</> values to select
1340 different subsets of a query result <emphasis>will give
1341 inconsistent results</emphasis> unless you enforce a predictable
1342 result ordering with <literal>ORDER BY</>. This is not a bug; it
1343 is an inherent consequence of the fact that SQL does not promise to
1344 deliver the results of a query in any particular order unless
1345 <literal>ORDER BY</> is used to constrain the order.
1349 The rows skipped by an <literal>OFFSET</> clause still have to be
1350 computed inside the server; therefore a large <literal>OFFSET</>
1356 <sect1 id="queries-values">
1357 <title><literal>VALUES</literal> Lists</title>
1359 <indexterm zone="queries-values">
1360 <primary>VALUES</primary>
1364 <literal>VALUES</> provides a way to generate a <quote>constant table</>
1365 that can be used in a query without having to actually create and populate
1366 a table on-disk. The syntax is
1368 VALUES ( <replaceable class="PARAMETER">expression</replaceable> [, ...] ) [, ...]
1370 Each parenthesized list of expressions generates a row in the table.
1371 The lists must all have the same number of elements (i.e., the number
1372 of columns in the table), and corresponding entries in each list must
1373 have compatible data types. The actual data type assigned to each column
1374 of the result is determined using the same rules as for <literal>UNION</>
1375 (see <xref linkend="typeconv-union-case">).
1382 VALUES (1, 'one'), (2, 'two'), (3, 'three');
1385 will return a table of two columns and three rows. It's effectively
1389 SELECT 1 AS column1, 'one' AS column2
1396 By default, <productname>PostgreSQL</productname> assigns the names
1397 <literal>column1</>, <literal>column2</>, etc. to the columns of a
1398 <literal>VALUES</> table. The column names are not specified by the
1399 SQL standard and different database systems do it differently, so
1400 it's usually better to override the default names with a table alias
1405 Syntactically, <literal>VALUES</> followed by expression lists is
1406 treated as equivalent to
1408 SELECT <replaceable>select_list</replaceable> FROM <replaceable>table_expression</replaceable>
1410 and can appear anywhere a <literal>SELECT</> can. For example, you can
1411 use it as an arm of a <literal>UNION</>, or attach a
1412 <replaceable>sort_specification</replaceable> (<literal>ORDER BY</>,
1413 <literal>LIMIT</>, and/or <literal>OFFSET</>) to it. <literal>VALUES</>
1414 is most commonly used as the data source in an <command>INSERT</> command,
1415 and next most commonly as a subquery.
1419 For more information see <xref linkend="sql-values"
1420 endterm="sql-values-title">.