From: Tom Lane Date: Fri, 15 Oct 2010 19:48:45 +0000 (-0400) Subject: Document the DISTINCT noise word in the UNION/INTERSECT/EXCEPT constructs. X-Git-Tag: REL9_1_ALPHA2~77 X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=71d24466fb84bf51c479dcc85b52a0c2b71b9c50;p=postgresql Document the DISTINCT noise word in the UNION/INTERSECT/EXCEPT constructs. I also rearranged the order of the sections to match the logical order of processing steps: the distinct-elimination implied by SELECT DISTINCT happens before, not after, any UNION/INTERSECT/EXCEPT combination. Per a suggestion from Hitoshi Harada. --- diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml index e840070873..24f8249713 100644 --- a/doc/src/sgml/ref/select.sgml +++ b/doc/src/sgml/ref/select.sgml @@ -40,7 +40,7 @@ SELECT [ ALL | DISTINCT [ ON ( expressionexpression [, ...] ] [ HAVING condition [, ...] ] [ WINDOW window_name AS ( window_definition ) [, ...] ] - [ { UNION | INTERSECT | EXCEPT } [ ALL ] select ] + [ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select ] [ ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...] ] [ LIMIT { count | ALL } ] [ OFFSET start [ ROW | ROWS ] ] @@ -106,7 +106,7 @@ TABLE { [ ONLY ] table_name [ * ] | If the GROUP BY clause is specified, the - output is divided into groups of rows that match on one or more + output is combined into groups of rows that match on one or more values. If the HAVING clause is present, it eliminates groups that do not satisfy the given condition. (See and @@ -118,12 +118,23 @@ TABLE { [ ONLY ] table_name [ * ] | The actual output rows are computed using the SELECT output expressions for each selected - row. (See + row or row group. (See below.) + + + SELECT DISTINCT eliminates duplicate rows from the + result. SELECT DISTINCT ON eliminates rows that + match on all the specified expressions. SELECT ALL + (the default) will return all candidate rows, including + duplicates. (See below.) + + + Using the operators UNION, @@ -136,7 +147,11 @@ TABLE { [ ONLY ] table_name [ * ] | strictly in both result sets. The EXCEPT operator returns the rows that are in the first result set but not in the second. In all three cases, duplicate rows are - eliminated unless ALL is specified. (See + eliminated unless ALL is specified. The noise + word DISTINCT can be added to explicitly specify + eliminating duplicate rows. Notice that DISTINCT is + the default behavior here, even though ALL is + the default for SELECT itself. (See , , and below.) @@ -153,17 +168,6 @@ TABLE { [ ONLY ] table_name [ * ] | - - - DISTINCT eliminates duplicate rows from the - result. DISTINCT ON eliminates rows that - match on all the specified expressions. ALL - (the default) will return all candidate rows, including - duplicates. (See below.) - - - If the LIMIT (or FETCH FIRST) or OFFSET @@ -219,7 +223,7 @@ TABLE { [ ONLY ] table_name [ * ] | subquery to reference itself by name. Such a subquery must have the form -non_recursive_term UNION [ ALL ] recursive_term +non_recursive_term UNION [ ALL | DISTINCT ] recursive_term where the recursive self-reference must appear on the right-hand side of the UNION. Only one recursive self-reference @@ -755,13 +759,52 @@ UNBOUNDED FOLLOWING + + <literal>DISTINCT</literal> Clause + + + If SELECT DISTINCT is specified, all duplicate rows are + removed from the result set (one row is kept from each group of + duplicates). SELECT ALL specifies the opposite: all rows are + kept; that is the default. + + + + SELECT DISTINCT ON ( expression [, ...] ) + keeps only the first row of each set of rows where the given + expressions evaluate to equal. The DISTINCT ON + expressions are interpreted using the same rules as for + ORDER BY (see above). Note that the first + row of each set is unpredictable unless ORDER + BY is used to ensure that the desired row appears first. For + example: + +SELECT DISTINCT ON (location) location, time, report + FROM weather_reports + ORDER BY location, time DESC; + + retrieves the most recent weather report for each location. But + if we had not used ORDER BY to force descending order + of time values for each location, we'd have gotten a report from + an unpredictable time for each location. + + + + The DISTINCT ON expression(s) must match the leftmost + ORDER BY expression(s). The ORDER BY clause + will normally contain additional expression(s) that determine the + desired precedence of rows within each DISTINCT ON group. + + + <literal>UNION</literal> Clause The UNION clause has this general form: -select_statement UNION [ ALL ] select_statement +select_statement UNION [ ALL | DISTINCT ] select_statement select_statement is any SELECT statement without an ORDER @@ -791,6 +834,8 @@ UNBOUNDED FOLLOWING ALL prevents elimination of duplicates. (Therefore, UNION ALL is usually significantly quicker than UNION; use ALL when you can.) + DISTINCT can be written to explicitly specify the + default behavior of eliminating duplicate rows. @@ -812,7 +857,7 @@ UNBOUNDED FOLLOWING The INTERSECT clause has this general form: -select_statement INTERSECT [ ALL ] select_statement +select_statement INTERSECT [ ALL | DISTINCT ] select_statement select_statement is any SELECT statement without an ORDER @@ -833,6 +878,8 @@ UNBOUNDED FOLLOWING With ALL, a row that has m duplicates in the left table and n duplicates in the right table will appear min(m,n) times in the result set. + DISTINCT can be written to explicitly specify the + default behavior of eliminating duplicate rows. @@ -858,7 +905,7 @@ UNBOUNDED FOLLOWING The EXCEPT clause has this general form: -select_statement EXCEPT [ ALL ] select_statement +select_statement EXCEPT [ ALL | DISTINCT ] select_statement select_statement is any SELECT statement without an ORDER @@ -878,6 +925,8 @@ UNBOUNDED FOLLOWING With ALL, a row that has m duplicates in the left table and n duplicates in the right table will appear max(m-n,0) times in the result set. + DISTINCT can be written to explicitly specify the + default behavior of eliminating duplicate rows. @@ -987,45 +1036,6 @@ SELECT name FROM distributors ORDER BY code; - - <literal>DISTINCT</literal> Clause - - - If DISTINCT is specified, all duplicate rows are - removed from the result set (one row is kept from each group of - duplicates). ALL specifies the opposite: all rows are - kept; that is the default. - - - - DISTINCT ON ( expression [, ...] ) - keeps only the first row of each set of rows where the given - expressions evaluate to equal. The DISTINCT ON - expressions are interpreted using the same rules as for - ORDER BY (see above). Note that the first - row of each set is unpredictable unless ORDER - BY is used to ensure that the desired row appears first. For - example: - -SELECT DISTINCT ON (location) location, time, report - FROM weather_reports - ORDER BY location, time DESC; - - retrieves the most recent weather report for each location. But - if we had not used ORDER BY to force descending order - of time values for each location, we'd have gotten a report from - an unpredictable time for each location. - - - - The DISTINCT ON expression(s) must match the leftmost - ORDER BY expression(s). The ORDER BY clause - will normally contain additional expression(s) that determine the - desired precedence of rows within each DISTINCT ON group. - - - <literal>LIMIT</literal> Clause diff --git a/doc/src/sgml/ref/select_into.sgml b/doc/src/sgml/ref/select_into.sgml index 715d64097e..787c106a49 100644 --- a/doc/src/sgml/ref/select_into.sgml +++ b/doc/src/sgml/ref/select_into.sgml @@ -30,7 +30,7 @@ SELECT [ ALL | DISTINCT [ ON ( expressionexpression [, ...] ] [ HAVING condition [, ...] ] [ WINDOW window_name AS ( window_definition ) [, ...] ] - [ { UNION | INTERSECT | EXCEPT } [ ALL ] select ] + [ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select ] [ ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...] ] [ LIMIT { count | ALL } ] [ OFFSET start [ ROW | ROWS ] ]