</synopsis>
A table reference can be a table name (possibly schema-qualified),
- or a derived table such as a subquery, a table join, or complex
- combinations of these. If more than one table reference is listed
- in the <literal>FROM</> clause they are cross-joined (see below)
- to form the intermediate virtual table that can then be subject to
+ or a derived table such as a subquery, a <literal>JOIN</> construct, or
+ complex combinations of these. If more than one table reference is
+ listed in the <literal>FROM</> clause, the tables are cross-joined
+ (that is, the Cartesian product of their rows is formed; see below).
+ The result of the <literal>FROM</> list is an intermediate virtual
+ table that can then be subject to
transformations by the <literal>WHERE</>, <literal>GROUP BY</>,
and <literal>HAVING</> clauses and is finally the result of the
overall table expression.
A joined table is a table derived from two other (real or
derived) tables according to the rules of the particular join
type. Inner, outer, and cross-joins are available.
+ The general syntax of a joined table is
+<synopsis>
+<replaceable>T1</replaceable> <replaceable>join_type</replaceable> <replaceable>T2</replaceable> <optional> <replaceable>join_condition</replaceable> </optional>
+</synopsis>
+ Joins of all types can be chained together, or nested: either or
+ both <replaceable>T1</replaceable> and
+ <replaceable>T2</replaceable> can be joined tables. Parentheses
+ can be used around <literal>JOIN</> clauses to control the join
+ order. In the absence of parentheses, <literal>JOIN</> clauses
+ nest left-to-right.
</para>
<variablelist>
<para>
<literal>FROM <replaceable>T1</replaceable> CROSS JOIN
<replaceable>T2</replaceable></literal> is equivalent to
- <literal>FROM <replaceable>T1</replaceable>,
- <replaceable>T2</replaceable></literal>. It is also equivalent to
<literal>FROM <replaceable>T1</replaceable> INNER JOIN
<replaceable>T2</replaceable> ON TRUE</literal> (see below).
+ It is also equivalent to
+ <literal>FROM <replaceable>T1</replaceable>,
+ <replaceable>T2</replaceable></literal>.
+ <note>
+ <para>
+ This latter equivalence does not hold exactly when more than two
+ tables appear, because <literal>JOIN</> binds more tightly than
+ comma. For example
+ <literal>FROM <replaceable>T1</replaceable> CROSS JOIN
+ <replaceable>T2</replaceable> INNER JOIN <replaceable>T3</replaceable>
+ ON <replaceable>condition</replaceable></literal>
+ is not the same as
+ <literal>FROM <replaceable>T1</replaceable>,
+ <replaceable>T2</replaceable> INNER JOIN <replaceable>T3</replaceable>
+ ON <replaceable>condition</replaceable></literal>
+ because the <replaceable>condition</replaceable> can
+ reference <replaceable>T1</replaceable> in the first case but not
+ the second.
+ </para>
+ </note>
</para>
</listitem>
</varlistentry>
<quote>match</quote>, as explained in detail below.
</para>
- <para>
- The <literal>ON</> clause is the most general kind of join
- condition: it takes a Boolean value expression of the same
- kind as is used in a <literal>WHERE</> clause. A pair of rows
- from <replaceable>T1</> and <replaceable>T2</> match if the
- <literal>ON</> expression evaluates to true.
- </para>
-
- <para>
- The <literal>USING</> clause allows you to take advantage of
- the specific situation where both sides of the join use the
- same name for the joining columns. It takes a
- comma-separated list of the shared column names
- and forms a join using the equals operator. Furthermore, the
- output of <literal>JOIN USING</> has one column for each of the
- listed columns, followed by the remaining columns from each table.
- </para>
-
- <para>The output column difference between <literal>ON</> and
- <literal>USING</> when invoking <literal>SELECT *</> is:</para>
- <itemizedlist>
- <listitem>
- <para>
- <literal>ON</> - all columns from <replaceable>T1</> followed
- by all columns from <replaceable>T2</>
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>USING</> - all join columns, one copy each
- and in the listed order, followed by non-join columns
- in <replaceable>T1</> followed by non-join columns in
- <replaceable>T2</>
- </para>
- </listitem>
- <listitem>
- <para>
- Examples provided below
- </para>
- </listitem>
- </itemizedlist>
-
- <para>
- <indexterm>
- <primary>join</primary>
- <secondary>natural</secondary>
- </indexterm>
- <indexterm>
- <primary>natural join</primary>
- </indexterm>
- Finally, <literal>NATURAL</> is a shorthand form of
- <literal>USING</>: it forms a <literal>USING</> list
- consisting of all column names that appear in both
- input tables. As with <literal>USING</>, these columns appear
- only once in the output table. If there are no common
- columns, <literal>NATURAL</literal> behaves like
- <literal>CROSS JOIN</literal>.
- </para>
-
- <note>
- <para>
- <literal>USING</literal> is reasonably safe from column changes
- in the joined relations since only the specific columns mentioned
- are considered. <literal>NATURAL</> is considerably more problematic
- if you are referring to relations only by name (views and tables)
- since any schema changes to either relation that cause a new matching
- column name to be present will cause the join to consider that new column.
- </para>
- </note>
-
<para>
The possible types of qualified join are:
</varlistentry>
</variablelist>
</para>
+
+ <para>
+ The <literal>ON</> clause is the most general kind of join
+ condition: it takes a Boolean value expression of the same
+ kind as is used in a <literal>WHERE</> clause. A pair of rows
+ from <replaceable>T1</> and <replaceable>T2</> match if the
+ <literal>ON</> expression evaluates to true.
+ </para>
+
+ <para>
+ The <literal>USING</> clause is a shorthand that allows you to take
+ advantage of the specific situation where both sides of the join use
+ the same name for the joining column(s). It takes a
+ comma-separated list of the shared column names
+ and forms a join condition that includes an equality comparison
+ for each one. For example, joining <replaceable>T1</>
+ and <replaceable>T2</> with <literal>USING (a, b)</> produces
+ the join condition <literal>ON <replaceable>T1</>.a
+ = <replaceable>T2</>.a AND <replaceable>T1</>.b
+ = <replaceable>T2</>.b</literal>.
+ </para>
+
+ <para>
+ Furthermore, the output of <literal>JOIN USING</> suppresses
+ redundant columns: there is no need to print both of the matched
+ columns, since they must have equal values. While <literal>JOIN
+ ON</> produces all columns from <replaceable>T1</> followed by all
+ columns from <replaceable>T2</>, <literal>JOIN USING</> produces one
+ output column for each of the listed column pairs (in the listed
+ order), followed by any remaining columns from <replaceable>T1</>,
+ followed by any remaining columns from <replaceable>T2</>.
+ </para>
+
+ <para>
+ <indexterm>
+ <primary>join</primary>
+ <secondary>natural</secondary>
+ </indexterm>
+ <indexterm>
+ <primary>natural join</primary>
+ </indexterm>
+ Finally, <literal>NATURAL</> is a shorthand form of
+ <literal>USING</>: it forms a <literal>USING</> list
+ consisting of all column names that appear in both
+ input tables. As with <literal>USING</>, these columns appear
+ only once in the output table. If there are no common
+ column names, <literal>NATURAL</literal> behaves like
+ <literal>CROSS JOIN</literal>.
+ </para>
+
+ <note>
+ <para>
+ <literal>USING</literal> is reasonably safe from column changes
+ in the joined relations since only the listed columns
+ are combined. <literal>NATURAL</> is considerably more risky since
+ any schema changes to either relation that cause a new matching
+ column name to be present will cause the join to combine that new
+ column as well.
+ </para>
+ </note>
</listitem>
</varlistentry>
</variablelist>
- <para>
- Joins of all types can be chained together or nested: either or
- both <replaceable>T1</replaceable> and
- <replaceable>T2</replaceable> can be joined tables. Parentheses
- can be used around <literal>JOIN</> clauses to control the join
- order. In the absence of parentheses, <literal>JOIN</> clauses
- nest left-to-right.
- </para>
-
<para>
To put this together, assume we have tables <literal>t1</literal>:
<programlisting>
clause is processed <emphasis>before</> the join, while
a restriction placed in the <literal>WHERE</> clause is processed
<emphasis>after</> the join.
+ That does not matter with inner joins, but it matters a lot with outer
+ joins.
</para>
</sect3>