granicus.if.org Git - postgresql/blob - doc/src/sgml/typeconv.sgml

   1 <chapter Id="typeconv">
   2 <title>Type Conversion</title>
   3
   4  <sect1 id="typeconv-intro">
   5   <title>Introduction</title>
   6
   7 <para>
   8 <acronym>SQL</acronym> queries can, intentionally or not, require
   9 mixing of different data types in the same expression.
  10 <productname>PostgreSQL</productname> has extensive facilities for
  11 evaluating mixed-type expressions.
  12 </para>
  13
  14 <para>
  15 In many cases a user will not need
  16 to understand the details of the type conversion mechanism.
  17 However, the implicit conversions done by <productname>PostgreSQL</productname>
  18 can affect the results of a query.  When necessary, these results
  19 can be tailored by a user or programmer
  20 using <emphasis>explicit</emphasis> type coercion.
  21 </para>
  22
  23 <para>
  24 This chapter introduces the <productname>PostgreSQL</productname>
  25 type conversion mechanisms and conventions.
  26 Refer to the relevant sections in <xref linkend="datatype"> and <xref linkend="functions">
  27 for more information on specific data types and allowed functions and
  28 operators.
  29 </para>
  30
  31 <para>
  32 The <citetitle>Programmer's Guide</citetitle> has more details on the exact algorithms used for
  33 implicit type conversion and coercion.
  34 </para>
  35  </sect1>
  36
  37 <sect1 id="typeconv-overview">
  38 <title>Overview</title>
  39
  40 <para>
  41 <acronym>SQL</acronym> is a strongly typed language. That is, every data item
  42 has an associated data type which determines its behavior and allowed usage.
  43 <productname>PostgreSQL</productname> has an extensible type system that is
  44 much more general and flexible than other <acronym>RDBMS</acronym> implementations.
  45 Hence, most type conversion behavior in <productname>PostgreSQL</productname>
  46 should be governed by general rules rather than by ad-hoc heuristics, to allow
  47 mixed-type expressions to be meaningful even with user-defined types.
  48 </para>
  49
  50 <para>
  51 The <productname>PostgreSQL</productname> scanner/parser decodes lexical
  52 elements into only five fundamental categories: integers, floating-point numbers, strings,
  53 names, and key words.  Most extended types are first tokenized into
  54 strings. The <acronym>SQL</acronym> language definition allows specifying type
  55 names with strings, and this mechanism can be used in
  56 <productname>PostgreSQL</productname> to start the parser down the correct
  57 path. For example, the query
  58
  59 <screen>
  60 tgl=> SELECT text 'Origin' AS "Label", point '(0,0)' AS "Value";
  61  Label  | Value
  62 --------+-------
  63  Origin | (0,0)
  64 (1 row)
  65 </screen>
  66
  67 has two literal constants, of type <type>text</type> and <type>point</type>.
  68 If a type is not specified for a string literal, then the placeholder type
  69 <firstterm>unknown</firstterm> is assigned initially, to be resolved in later
  70 stages as described below.
  71 </para>
  72
  73 <para>
  74 There are four fundamental <acronym>SQL</acronym> constructs requiring
  75 distinct type conversion rules in the <productname>PostgreSQL</productname>
  76 parser:
  77 </para>
  78
  79 <variablelist>
  80 <varlistentry>
  81 <term>
  82 Operators
  83 </term>
  84 <listitem>
  85 <para>
  86 <productname>PostgreSQL</productname> allows expressions with
  87 prefix and postfix unary (one-argument) operators,
  88 as well as binary (two-argument) operators.
  89 </para>
  90 </listitem>
  91 </varlistentry>
  92 <varlistentry>
  93 <term>
  94 Function calls
  95 </term>
  96 <listitem>
  97 <para>
  98 Much of the <productname>PostgreSQL</productname> type system is built around a
  99 rich set of functions. Function calls have one or more arguments which, for
 100 any specific query, must be matched to the functions available in the system
 101 catalog.  Since <productname>PostgreSQL</productname> permits function
 102 overloading, the function name alone does not uniquely identify the function
 103 to be called; the parser must select the right function based on the data
 104 types of the supplied arguments.
 105 </para>
 106 </listitem>
 107 </varlistentry>
 108 <varlistentry>
 109 <term>
 110 Query targets
 111 </term>
 112 <listitem>
 113 <para>
 114 <acronym>SQL</acronym> <command>INSERT</command> and <command>UPDATE</command> statements place the results of
 115 expressions into a table. The expressions in the query must be matched up
 116 with, and perhaps converted to, the types of the target columns.
 117 </para>
 118 </listitem>
 119 </varlistentry>
 120 <varlistentry>
 121 <term>
 122 <literal>UNION</literal> and <literal>CASE</literal> constructs
 123 </term>
 124 <listitem>
 125 <para>
 126 Since all select results from a unionized <literal>SELECT</literal> statement must appear in a single
 127 set of columns, the types of the results
 128 of each <literal>SELECT</> clause must be matched up and converted to a uniform set.
 129 Similarly, the result expressions of a <literal>CASE</> construct must be coerced to
 130 a common type so that the <literal>CASE</> expression as a whole has a known output type.
 131 </para>
 132 </listitem>
 133 </varlistentry>
 134 </variablelist>
 135
 136 <para>
 137 Many of the general type conversion rules use simple conventions built on
 138 the <productname>PostgreSQL</productname> function and operator system tables.
 139 There are some heuristics included in the conversion rules to better support
 140 conventions for the <acronym>SQL</acronym> standard native types such as
 141 <type>smallint</type>, <type>integer</type>, and <type>real</type>.
 142 </para>
 143
 144 <para>
 145 The <productname>PostgreSQL</productname> parser uses the convention that all
 146 type conversion functions take a single argument of the source type and are
 147 named with the same name as the target type. Any function meeting these
 148 criteria is considered to be a valid conversion function, and may be used
 149 by the parser as such. This simple assumption gives the parser the power
 150 to explore type conversion possibilities without hardcoding, allowing
 151 extended user-defined types to use these same features transparently.
 152 </para>
 153
 154 <para>
 155 An additional heuristic is provided in the parser to allow better guesses
 156 at proper behavior for <acronym>SQL</acronym> standard types. There are
 157 several basic <firstterm>type categories</firstterm> defined: <type>boolean</type>,
 158 <type>numeric</type>, <type>string</type>, <type>bitstring</type>, <type>datetime</type>, <type>timespan</type>, <type>geometric</type>, <type>network</type>,
 159 and user-defined. Each category, with the exception of user-defined, has
 160 a <firstterm>preferred type</firstterm> which is preferentially selected
 161 when there is ambiguity.
 162 In the user-defined category, each type is its own preferred type.
 163 Ambiguous expressions (those with multiple candidate parsing solutions)
 164 can often be resolved when there are multiple possible built-in types, but
 165 they will raise an error when there are multiple choices for user-defined
 166 types.
 167 </para>
 168
 169 <para>
 170 All type conversion rules are designed with several principles in mind:
 171
 172 <itemizedlist mark="bullet" spacing="compact">
 173 <listitem>
 174 <para>
 175 Implicit conversions should never have surprising or unpredictable outcomes.
 176 </para>
 177 </listitem>
 178
 179 <listitem>
 180 <para>
 181 User-defined types, of which the parser has no a-priori knowledge, should be
 182 <quote>higher</quote> in the type hierarchy. In mixed-type expressions, native types shall always
 183 be converted to a user-defined type (of course, only if conversion is necessary).
 184 </para>
 185 </listitem>
 186
 187 <listitem>
 188 <para>
 189 User-defined types are not related. Currently, <productname>PostgreSQL</productname>
 190 does not have information available to it on relationships between types, other than
 191 hardcoded heuristics for built-in types and implicit relationships based on available functions
 192 in the catalog.
 193 </para>
 194 </listitem>
 195
 196 <listitem>
 197 <para>
 198 There should be no extra overhead from the parser or executor
 199 if a query does not need implicit type conversion.
 200 That is, if a query is well formulated and the types already match up, then the query should proceed
 201 without spending extra time in the parser and without introducing unnecessary implicit conversion
 202 functions into the query.
 203 </para>
 204
 205 <para>
 206 Additionally, if a query usually requires an implicit conversion for a function, and
 207 if then the user defines an explicit function with the correct argument types, the parser
 208 should use this new function and will no longer do the implicit conversion using the old function.
 209 </para>
 210 </listitem>
 211 </itemizedlist>
 212 </para>
 213
 214 </sect1>
 215
 216 <sect1 id="typeconv-oper">
 217 <title>Operators</title>
 218
 219   <para>
 220    The operand types of an operator invocation are resolved following
 221    the procedure below.  Note that this procedure is indirectly affected
 222    by the precedence of the involved operators.  See <xref
 223    linkend="sql-precedence"> for more information.
 224   </para>
 225
 226 <procedure>
 227 <title>Operand Type Resolution</title>
 228
 229 <step performance="required">
 230 <para>
 231 Check for an exact match in the <classname>pg_operator</classname> system catalog.
 232 </para>
 233
 234 <substeps>
 235 <step performance="optional">
 236 <para>
 237 If one argument of a binary operator is <type>unknown</type> type,
 238 then assume it is the same type as the other argument for this check.
 239 Other cases involving <type>unknown</type> will never find a match at
 240 this step.
 241 </para>
 242 </step>
 243 </substeps>
 244 </step>
 245
 246 <step performance="required">
 247 <para>
 248 Look for the best match.
 249 </para>
 250 <substeps>
 251 <step performance="required">
 252 <para>
 253 Make a list of all operators of the same name for which the input types
 254 match or can be coerced to match.  (<type>unknown</type> literals are
 255 assumed to be coercible to anything for this purpose.)  If there is only
 256 one, use it; else continue to the next step.
 257 </para>
 258 </step>
 259 <step performance="required">
 260 <para>
 261 Run through all candidates and keep those with the most exact matches
 262 on input types.  Keep all candidates if none have any exact matches.
 263 If only one candidate remains, use it; else continue to the next step.
 264 </para>
 265 </step>
 266 <step performance="required">
 267 <para>
 268 Run through all candidates and keep those with the most exact or
 269 binary-compatible matches on input types.  Keep all candidates if none have
 270 any exact or binary-compatible matches.
 271 If only one candidate remains, use it; else continue to the next step.
 272 </para>
 273 </step>
 274 <step performance="required">
 275 <para>
 276 Run through all candidates and keep those that accept preferred types at
 277 the most positions where type coercion will be required.
 278 Keep all candidates if none accept preferred types.
 279 If only one candidate remains, use it; else continue to the next step.
 280 </para>
 281 </step>
 282 <step performance="required">
 283 <para>
 284 If any input arguments are <quote>unknown</quote>, check the type
 285 categories accepted at those argument positions by the remaining
 286 candidates.  At each position, select the "string" category if any
 287 candidate accepts that category (this bias towards string is appropriate
 288 since an unknown-type literal does look like a string). Otherwise, if
 289 all the remaining candidates accept the same type category, select that
 290 category; otherwise fail because the correct choice cannot be deduced
 291 without more clues.  Also note whether any of the candidates accept a
 292 preferred data type within the selected category. Now discard operator
 293 candidates that do not accept the selected type category; furthermore,
 294 if any candidate accepts a preferred type at a given argument position,
 295 discard candidates that accept non-preferred types for that argument.
 296 </para>
 297 </step>
 298 <step performance="required">
 299 <para>
 300 If only one candidate remains, use it.  If no candidate or more than one
 301 candidate remains,
 302 then fail.
 303 </para>
 304 </step>
 305 </substeps>
 306 </step>
 307 </procedure>
 308
 309 <bridgehead renderas="sect2">Examples</bridgehead>
 310
 311 <example>
 312 <title>Exponentiation Operator Type Resolution</title>
 313
 314 <para>
 315 There is only one exponentiation
 316 operator defined in the catalog, and it takes arguments of type
 317 <type>double precision</type>.
 318 The scanner assigns an initial type of <type>integer</type> to both arguments
 319 of this query expression:
 320 <screen>
 321 tgl=> SELECT 2 ^ 3 AS "Exp";
 322  Exp
 323 -----
 324    8
 325 (1 row)
 326 </screen>
 327
 328 So the parser does a type conversion on both operands and the query
 329 is equivalent to
 330
 331 <screen>
 332 tgl=> SELECT CAST(2 AS double precision) ^ CAST(3 AS double precision) AS "Exp";
 333  Exp
 334 -----
 335    8
 336 (1 row)
 337 </screen>
 338
 339 or
 340
 341 <screen>
 342 tgl=> SELECT 2.0 ^ 3.0 AS "Exp";
 343  Exp
 344 -----
 345    8
 346 (1 row)
 347 </screen>
 348
 349 <note>
 350 <para>
 351 This last form has the least overhead, since no functions are called to do
 352 implicit type conversion. This is not an issue for small queries, but may
 353 have an impact on the performance of queries involving large tables.
 354 </para>
 355 </note>
 356 </para>
 357 </example>
 358
 359 <example>
 360 <title>String Concatenation Operator Type Resolution</title>
 361
 362 <para>
 363 A string-like syntax is used for working with string types as well as for
 364 working with complex extended types.
 365 Strings with unspecified type are matched with likely operator candidates.
 366 </para>
 367
 368 <para>
 369 An example with one unspecified argument:
 370 <screen>
 371 tgl=> SELECT text 'abc' || 'def' AS "Text and Unknown";
 372  Text and Unknown
 373 ------------------
 374  abcdef
 375 (1 row)
 376 </screen>
 377 </para>
 378
 379 <para>
 380 In this case the parser looks to see if there is an operator taking <type>text</type>
 381 for both arguments. Since there is, it assumes that the second argument should
 382 be interpreted as of type <type>text</type>.
 383 </para>
 384
 385 <para>
 386 Concatenation on unspecified types:
 387 <screen>
 388 tgl=> SELECT 'abc' || 'def' AS "Unspecified";
 389  Unspecified
 390 -------------
 391  abcdef
 392 (1 row)
 393 </screen>
 394 </para>
 395
 396 <para>
 397 In this case there is no initial hint for which type to use, since no types
 398 are specified in the query. So, the parser looks for all candidate operators
 399 and finds that there are candidates accepting both string-category and
 400 bit-string-category inputs.  Since string category is preferred when available,
 401 that category is selected, and then the
 402 <quote>preferred type</quote> for strings, <type>text</type>, is used as the specific
 403 type to resolve the unknown literals to.
 404 </para>
 405 </example>
 406
 407 <example>
 408 <title>Absolute-Value and Factorial Operator Type Resolution</title>
 409
 410 <para>
 411 The <productname>PostgreSQL</productname> operator catalog has several
 412 entries for the prefix operator <literal>@</>, all of which implement
 413 absolute-value operations for various numeric datatypes.  One of these
 414 entries is for type <type>float8</type>, which is the preferred type in
 415 the numeric category.  Therefore, <productname>PostgreSQL</productname>
 416 will use that entry when faced with a non-numeric input:
 417 <screen>
 418 tgl=> select @ text '-4.5' as "abs";
 419  abs
 420 -----
 421  4.5
 422 (1 row)
 423 </screen>
 424 Here the system has performed an implicit text-to-float8 conversion
 425 before applying the chosen operator.  We can verify that float8 and
 426 not some other type was used:
 427 <screen>
 428 tgl=> select @ text '-4.5e500' as "abs";
 429 ERROR:  Input '-4.5e500' is out of range for float8
 430 </screen>
 431 </para>
 432
 433 <para>
 434 On the other hand, the postfix operator <literal>!</> (factorial)
 435 is defined only for integer datatypes, not for float8.  So, if we
 436 try a similar case with <literal>!</>, we get:
 437 <screen>
 438 tgl=> select text '44' ! as "factorial";
 439 ERROR:  Unable to identify a postfix operator '!' for type 'text'
 440         You may need to add parentheses or an explicit cast
 441 </screen>
 442 This happens because the system can't decide which of the several
 443 possible <literal>!</> operators should be preferred.  We can help
 444 it out with an explicit cast:
 445 <screen>
 446 tgl=> select cast(text '44' as int8) ! as "factorial";
 447       factorial
 448 ---------------------
 449  2673996885588443136
 450 (1 row)
 451 </screen>
 452 </para>
 453 </example>
 454
 455 </sect1>
 456
 457 <sect1 id="typeconv-func">
 458 <title>Functions</title>
 459
 460   <para>
 461    The argument types of function calls are resolved according to the
 462    following steps.
 463   </para>
 464
 465 <procedure>
 466 <title>Function Argument Type Resolution</title>
 467
 468 <step performance="required">
 469 <para>
 470 Check for an exact match in the <classname>pg_proc</classname> system catalog.
 471 (Cases involving <type>unknown</type> will never find a match at
 472 this step.)
 473 </para></step>
 474 <step performance="required">
 475 <para>
 476 If no exact match appears in the catalog, see whether the function call appears
 477 to be a trivial type coercion request.  This happens if the function call
 478 has just one argument and the function name is the same as the (internal)
 479 name of some data type.  Furthermore, the function argument must be either
 480 an unknown-type literal or a type that is binary-compatible with the named
 481 data type.  When these conditions are met, the function argument is coerced
 482 to the named data type without any explicit function call.
 483 </para>
 484 </step>
 485 <step performance="required">
 486 <para>
 487 Look for the best match.
 488 </para>
 489 <substeps>
 490 <step performance="required">
 491 <para>
 492 Make a list of all functions of the same name with the same number of
 493 arguments for which the input types
 494 match or can be coerced to match.  (<type>unknown</type> literals are
 495 assumed to be coercible to anything for this purpose.)  If there is only
 496 one, use it; else continue to the next step.
 497 </para>
 498 </step>
 499 <step performance="required">
 500 <para>
 501 Run through all candidates and keep those with the most exact matches
 502 on input types.  Keep all candidates if none have any exact matches.
 503 If only one candidate remains, use it; else continue to the next step.
 504 </para>
 505 </step>
 506 <step performance="required">
 507 <para>
 508 Run through all candidates and keep those with the most exact or
 509 binary-compatible matches on input types.  Keep all candidates if none have
 510 any exact or binary-compatible matches.
 511 If only one candidate remains, use it; else continue to the next step.
 512 </para>
 513 </step>
 514 <step performance="required">
 515 <para>
 516 Run through all candidates and keep those that accept preferred types at
 517 the most positions where type coercion will be required.
 518 Keep all candidates if none accept preferred types.
 519 If only one candidate remains, use it; else continue to the next step.
 520 </para>
 521 </step>
 522 <step performance="required">
 523 <para>
 524 If any input arguments are <type>unknown</type>, check the type categories accepted
 525 at those argument positions by the remaining candidates.  At each position,
 526 select the <type>string</type> category if any candidate accepts that category
 527 (this bias towards string
 528 is appropriate since an unknown-type literal does look like a string).
 529 Otherwise, if all the remaining candidates accept the same type category,
 530 select that category; otherwise fail because
 531 the correct choice cannot be deduced without more clues.  Also note whether
 532 any of the candidates accept a preferred data type within the selected category.
 533 Now discard candidates that do not accept the selected type category;
 534 furthermore, if any candidate accepts a preferred type at a given argument
 535 position, discard candidates that accept non-preferred types for that
 536 argument.
 537 </para>
 538 </step>
 539 <step performance="required">
 540 <para>
 541 If only one candidate remains, use it.  If no candidate or more than one
 542 candidate remains,
 543 then fail.
 544 </para>
 545 </step>
 546 </substeps>
 547 </step>
 548 </procedure>
 549
 550 <bridgehead renderas="sect2">Examples</bridgehead>
 551
 552 <example>
 553 <title>Factorial Function Argument Type Resolution</title>
 554
 555 <para>
 556 There is only one <function>int4fac</function> function defined in the
 557 <classname>pg_proc</classname> catalog.
 558 So the following query automatically converts the <type>int2</type> argument
 559 to <type>int4</type>:
 560
 561 <screen>
 562 tgl=> SELECT int4fac(int2 '4');
 563  int4fac
 564 ---------
 565       24
 566 (1 row)
 567 </screen>
 568
 569 and is actually transformed by the parser to
 570 <screen>
 571 tgl=> SELECT int4fac(int4(int2 '4'));
 572  int4fac
 573 ---------
 574       24
 575 (1 row)
 576 </screen>
 577 </para>
 578 </example>
 579
 580 <example>
 581 <title>Substring Function Type Resolution</title>
 582
 583 <para>
 584 There are two <function>substr</function> functions declared in <classname>pg_proc</classname>. However,
 585 only one takes two arguments, of types <type>text</type> and <type>int4</type>.
 586 </para>
 587
 588 <para>
 589 If called with a string constant of unspecified type, the type is matched up
 590 directly with the only candidate function type:
 591 <screen>
 592 tgl=> SELECT substr('1234', 3);
 593  substr
 594 --------
 595      34
 596 (1 row)
 597 </screen>
 598 </para>
 599
 600 <para>
 601 If the string is declared to be of type <type>varchar</type>, as might be the case
 602 if it comes from a table, then the parser will try to coerce it to become <type>text</type>:
 603 <screen>
 604 tgl=> SELECT substr(varchar '1234', 3);
 605  substr
 606 --------
 607      34
 608 (1 row)
 609 </screen>
 610 which is transformed by the parser to become
 611 <screen>
 612 tgl=> SELECT substr(text(varchar '1234'), 3);
 613  substr
 614 --------
 615      34
 616 (1 row)
 617 </screen>
 618 </para>
 619 <para>
 620 <note>
 621 <para>
 622 Actually, the parser is aware that <type>text</type> and <type>varchar</type>
 623 are <firstterm>binary-compatible</>, meaning that one can be passed to a function that
 624 accepts the other without doing any physical conversion.  Therefore, no
 625 explicit type conversion call is really inserted in this case.
 626 </para>
 627 </note>
 628 </para>
 629
 630 <para>
 631 And, if the function is called with an <type>int4</type>, the parser will
 632 try to convert that to <type>text</type>:
 633 <screen>
 634 tgl=> SELECT substr(1234, 3);
 635  substr
 636 --------
 637      34
 638 (1 row)
 639 </screen>
 640 which actually executes as
 641 <screen>
 642 tgl=> SELECT substr(text(1234), 3);
 643  substr
 644 --------
 645      34
 646 (1 row)
 647 </screen>
 648 This succeeds because there is a conversion function text(int4) in the
 649 system catalog.
 650 </para>
 651 </example>
 652
 653 </sect1>
 654
 655 <sect1 id="typeconv-query">
 656 <title>Query Targets</title>
 657
 658   <para>
 659    Values to be inserted into a table are coerced to the destination
 660    column's datatype according to the
 661    following steps.
 662   </para>
 663
 664 <procedure>
 665 <title>Query Target Type Resolution</title>
 666
 667 <step performance="required">
 668 <para>
 669 Check for an exact match with the target.
 670 </para></step>
 671 <step performance="required">
 672 <para>
 673 Otherwise, try to coerce the expression to the target type.  This will succeed
 674 if the two types are known binary-compatible, or if there is a conversion
 675 function.  If the expression is an unknown-type literal, the contents of
 676 the literal string will be fed to the input conversion routine for the target
 677 type.
 678 </para></step>
 679
 680 <step performance="required">
 681 <para>
 682 If the target is a fixed-length type (e.g. <type>char</type> or <type>varchar</type>
 683 declared with a length) then try to find a sizing function for the target
 684 type.  A sizing function is a function of the same name as the type,
 685 taking two arguments of which the first is that type and the second is an
 686 integer, and returning the same type.  If one is found, it is applied,
 687 passing the column's declared length as the second parameter.
 688 </para></step>
 689
 690 </procedure>
 691
 692 <example>
 693 <title><type>character</type> Storage Type Conversion</title>
 694
 695 <para>
 696 For a target column declared as <type>character(20)</type> the following query
 697 ensures that the target is sized correctly:
 698
 699 <screen>
 700 tgl=> CREATE TABLE vv (v character(20));
 701 CREATE
 702 tgl=> INSERT INTO vv SELECT 'abc' || 'def';
 703 INSERT 392905 1
 704 tgl=> SELECT v, length(v) FROM vv;
 705           v           | length
 706 ----------------------+--------
 707  abcdef               |     20
 708 (1 row)
 709 </screen>
 710
 711 What has really happened here is that the two unknown literals are resolved
 712 to <type>text</type> by default, allowing the <literal>||</literal> operator
 713 to be resolved as <type>text</type> concatenation.  Then the <type>text</type>
 714 result of the operator is coerced to <type>bpchar</type> (<quote>blank-padded
 715 char</>, the internal name of the character datatype) to match the target
 716 column type.  (Since the parser knows that <type>text</type> and
 717 <type>bpchar</type> are binary-compatible, this coercion is implicit and does
 718 not insert any real function call.)  Finally, the sizing function
 719 <literal>bpchar(bpchar, integer)</literal> is found in the system catalogs
 720 and applied to the operator's result and the stored column length.  This
 721 type-specific function performs the required length check and addition of
 722 padding spaces.
 723 </para>
 724 </example>
 725 </sect1>
 726
 727 <sect1 id="typeconv-union-case">
 728 <title><literal>UNION</> and <literal>CASE</> Constructs</title>
 729
 730 <para>
 731 SQL <literal>UNION</> constructs must match up possibly dissimilar types to
 732 become a single result set.  The resolution algorithm is applied separately
 733 to each output column of a union query.  The <literal>INTERSECT</> and
 734 <literal>EXCEPT</> constructs resolve dissimilar types in the same way as
 735 <literal>UNION</>.
 736 A <literal>CASE</> construct also uses the identical algorithm to match up its
 737 component expressions and select a result datatype.
 738 </para>
 739 <procedure>
 740 <title><literal>UNION</> and <literal>CASE</> Type Resolution</title>
 741
 742 <step performance="required">
 743 <para>
 744 If all inputs are of type <type>unknown</type>, resolve as type
 745 <type>text</type> (the preferred type for string category).
 746 Otherwise, ignore the <type>unknown</type> inputs while choosing the type.
 747 </para></step>
 748
 749 <step performance="required">
 750 <para>
 751 If the non-unknown inputs are not all of the same type category, fail.
 752 </para></step>
 753
 754 <step performance="required">
 755 <para>
 756 If one or more non-unknown inputs are of a preferred type in that category,
 757 resolve as that type.
 758 </para></step>
 759
 760 <step performance="required">
 761 <para>
 762 Otherwise, resolve as the type of the first non-unknown input.
 763 </para></step>
 764
 765 <step performance="required">
 766 <para>
 767 Coerce all inputs to the selected type.
 768 </para></step>
 769 </procedure>
 770
 771 <bridgehead renderas="sect2">Examples</bridgehead>
 772
 773 <example>
 774 <title>Underspecified Types in a Union</title>
 775
 776 <para>
 777 <screen>
 778 tgl=> SELECT text 'a' AS "Text" UNION SELECT 'b';
 779  Text
 780 ------
 781  a
 782  b
 783 (2 rows)
 784 </screen>
 785 Here, the unknown-type literal <literal>'b'</literal> will be resolved as type text.
 786 </para>
 787 </example>
 788
 789 <example>
 790 <title>Type Conversion in a Simple Union</title>
 791
 792 <para>
 793 <screen>
 794 tgl=> SELECT 1.2 AS "Double" UNION SELECT 1;
 795  Double
 796 --------
 797       1
 798     1.2
 799 (2 rows)
 800 </screen>
 801 The literal <literal>1.2</> is of type <type>double precision</>,
 802 the preferred type in the numeric category, so that type is used.
 803 </para>
 804 </example>
 805
 806 <example>
 807 <title>Type Conversion in a Transposed Union</title>
 808
 809 <para>
 810 Here the output type of the union is forced to match the type of
 811 the first clause in the union:
 812
 813 <screen>
 814 tgl=> SELECT 1 AS "All integers"
 815 tgl-> UNION SELECT CAST('2.2' AS REAL);
 816  All integers
 817 --------------
 818             1
 819             2
 820 (2 rows)
 821 </screen>
 822 </para>
 823 <para>
 824 Since <type>REAL</type> is not a preferred type, the parser sees no reason
 825 to select it over <type>INTEGER</type> (which is what the 1 is), and instead
 826 falls back on the use-the-first-alternative rule.
 827 This example demonstrates that the preferred-type mechanism doesn't encode
 828 as much information as we'd like.  Future versions of
 829 <productname>PostgreSQL</productname> may support a more general notion of
 830 type preferences.
 831 </para>
 832 </example>
 833
 834 </sect1>
 835 </chapter>
 836
 837 <!-- Keep this comment at the end of the file
 838 Local variables:
 839 mode:sgml
 840 sgml-omittag:t
 841 sgml-shorttag:t
 842 sgml-minimize-attributes:nil
 843 sgml-always-quote-attributes:t
 844 sgml-indent-step:1
 845 sgml-indent-data:t
 846 sgml-parent-document:nil
 847 sgml-default-dtd-file:"./reference.ced"
 848 sgml-exposed-tags:nil
 849 sgml-local-catalogs:("/usr/lib/sgml/catalog")
 850 sgml-local-ecat-files:nil
 851 End:
 852 -->