1 <!-- $PostgreSQL: pgsql/doc/src/sgml/array.sgml,v 1.36 2004/08/05 03:29:11 joe Exp $ -->
7 <primary>array</primary>
11 <productname>PostgreSQL</productname> allows columns of a table to be
12 defined as variable-length multidimensional arrays. Arrays of any
13 built-in or user-defined base type can be created. (Arrays of
14 composite types or domains are not yet supported, however.)
18 <title>Declaration of Array Types</title>
21 To illustrate the use of array types, we create this table:
23 CREATE TABLE sal_emp (
25 pay_by_quarter integer[],
29 As shown, an array data type is named by appending square brackets
30 (<literal>[]</>) to the data type name of the array elements. The
31 above command will create a table named
32 <structname>sal_emp</structname> with a column of type
33 <type>text</type> (<structfield>name</structfield>), a
34 one-dimensional array of type <type>integer</type>
35 (<structfield>pay_by_quarter</structfield>), which represents the
36 employee's salary by quarter, and a two-dimensional array of
37 <type>text</type> (<structfield>schedule</structfield>), which
38 represents the employee's weekly schedule.
42 The syntax for <command>CREATE TABLE</command> allows the exact size of
43 arrays to be specified, for example:
46 CREATE TABLE tictactoe (
51 However, the current implementation does not enforce the array size
52 limits --- the behavior is the same as for arrays of unspecified
57 Actually, the current implementation does not enforce the declared
58 number of dimensions either. Arrays of a particular element type are
59 all considered to be of the same type, regardless of size or number
60 of dimensions. So, declaring number of dimensions or sizes in
61 <command>CREATE TABLE</command> is simply documentation, it does not
62 affect runtime behavior.
66 An alternative, SQL99-standard syntax may be used for one-dimensional arrays.
67 <structfield>pay_by_quarter</structfield> could have been defined as:
69 pay_by_quarter integer ARRAY[4],
71 This syntax requires an integer constant to denote the array size.
72 As before, however, <productname>PostgreSQL</> does not enforce the
78 <title>Array Value Input</title>
81 <primary>array</primary>
82 <secondary>constant</secondary>
86 To write an array value as a literal constant, enclose the element
87 values within curly braces and separate them by commas. (If you
88 know C, this is not unlike the C syntax for initializing
89 structures.) You may put double quotes around any element value,
90 and must do so if it contains commas or curly braces. (More
91 details appear below.) Thus, the general format of an array
92 constant is the following:
94 '{ <replaceable>val1</replaceable> <replaceable>delim</replaceable> <replaceable>val2</replaceable> <replaceable>delim</replaceable> ... }'
96 where <replaceable>delim</replaceable> is the delimiter character
97 for the type, as recorded in its <literal>pg_type</literal> entry.
98 (For all built-in types, this is the comma character
99 <quote><literal>,</literal></>.) Each
100 <replaceable>val</replaceable> is either a constant of the array
101 element type, or a subarray. An example of an array constant is
103 '{{1,2,3},{4,5,6},{7,8,9}}'
105 This constant is a two-dimensional, 3-by-3 array consisting of
106 three subarrays of integers.
110 (These kinds of array constants are actually only a special case of
111 the generic type constants discussed in <xref
112 linkend="sql-syntax-constants-generic">. The constant is initially
113 treated as a string and passed to the array input conversion
114 routine. An explicit type specification might be necessary.)
118 Now we can show some <command>INSERT</command> statements.
123 '{10000, 10000, 10000, 10000}',
124 '{{"meeting", "lunch"}, {"meeting"}}');
125 ERROR: multidimensional arrays must have array expressions with matching dimensions
128 Note that multidimensional arrays must have matching extents for each
129 dimension. A mismatch causes an error report.
134 '{10000, 10000, 10000, 10000}',
135 '{{"meeting", "lunch"}, {"training", "presentation"}}');
139 '{20000, 25000, 25000, 25000}',
140 '{{"breakfast", "consulting"}, {"meeting", "lunch"}}');
145 A limitation of the present array implementation is that individual
146 elements of an array cannot be SQL null values. The entire array
147 can be set to null, but you can't have an array with some elements
152 The result of the previous two inserts looks like this:
154 SELECT * FROM sal_emp;
155 name | pay_by_quarter | schedule
156 -------+---------------------------+-------------------------------------------
157 Bill | {10000,10000,10000,10000} | {{meeting,lunch},{training,presentation}}
158 Carol | {20000,25000,25000,25000} | {{breakfast,consulting},{meeting,lunch}}
164 The <literal>ARRAY</literal> expression syntax may also be used:
168 ARRAY[10000, 10000, 10000, 10000],
169 ARRAY[['meeting', 'lunch'], ['training', 'presentation']]);
173 ARRAY[20000, 25000, 25000, 25000],
174 ARRAY[['breakfast', 'consulting'], ['meeting', 'lunch']]);
176 Notice that the array elements are ordinary SQL constants or
177 expressions; for instance, string literals are single quoted, instead of
178 double quoted as they would be in an array literal. The <literal>ARRAY</>
179 expression syntax is discussed in more detail in <xref
180 linkend="sql-syntax-array-constructors">.
185 <title>Accessing Arrays</title>
188 Now, we can run some queries on the table.
189 First, we show how to access a single element of an array at a time.
190 This query retrieves the names of the employees whose pay changed in
194 SELECT name FROM sal_emp WHERE pay_by_quarter[1] <> pay_by_quarter[2];
202 The array subscript numbers are written within square brackets.
203 By default <productname>PostgreSQL</productname> uses the
204 one-based numbering convention for arrays, that is,
205 an array of <replaceable>n</> elements starts with <literal>array[1]</literal> and
206 ends with <literal>array[<replaceable>n</>]</literal>.
210 This query retrieves the third quarter pay of all employees:
213 SELECT pay_by_quarter[3] FROM sal_emp;
224 We can also access arbitrary rectangular slices of an array, or
225 subarrays. An array slice is denoted by writing
226 <literal><replaceable>lower-bound</replaceable>:<replaceable>upper-bound</replaceable></literal>
227 for one or more array dimensions. For example, this query retrieves the first
228 item on Bill's schedule for the first two days of the week:
231 SELECT schedule[1:2][1:1] FROM sal_emp WHERE name = 'Bill';
234 ------------------------
235 {{meeting},{training}}
239 We could also have written
242 SELECT schedule[1:2][1] FROM sal_emp WHERE name = 'Bill';
245 with the same result. An array subscripting operation is always taken to
246 represent an array slice if any of the subscripts are written in the form
247 <literal><replaceable>lower</replaceable>:<replaceable>upper</replaceable></literal>.
248 A lower bound of 1 is assumed for any subscript where only one value
249 is specified, as in this example:
251 SELECT schedule[1:2][2] FROM sal_emp WHERE name = 'Bill';
254 -------------------------------------------
255 {{meeting,lunch},{training,presentation}}
261 The current dimensions of any array value can be retrieved with the
262 <function>array_dims</function> function:
265 SELECT array_dims(schedule) FROM sal_emp WHERE name = 'Carol';
273 <function>array_dims</function> produces a <type>text</type> result,
274 which is convenient for people to read but perhaps not so convenient
275 for programs. Dimensions can also be retrieved with
276 <function>array_upper</function> and <function>array_lower</function>,
277 which return the upper and lower bound of a
278 specified array dimension, respectively.
281 SELECT array_upper(schedule, 1) FROM sal_emp WHERE name = 'Carol';
292 <title>Modifying Arrays</title>
295 An array value can be replaced completely:
298 UPDATE sal_emp SET pay_by_quarter = '{25000,25000,27000,27000}'
299 WHERE name = 'Carol';
302 or using the <literal>ARRAY</literal> expression syntax:
305 UPDATE sal_emp SET pay_by_quarter = ARRAY[25000,25000,27000,27000]
306 WHERE name = 'Carol';
309 An array may also be updated at a single element:
312 UPDATE sal_emp SET pay_by_quarter[4] = 15000
316 or updated in a slice:
319 UPDATE sal_emp SET pay_by_quarter[1:2] = '{27000,27000}'
320 WHERE name = 'Carol';
326 A stored array value can be enlarged by assigning to an element adjacent to
327 those already present, or by assigning to a slice that is adjacent
328 to or overlaps the data already present. For example, if array
329 <literal>myarray</> currently has 4 elements, it will have five
330 elements after an update that assigns to <literal>myarray[5]</>.
331 Currently, enlargement in this fashion is only allowed for one-dimensional
332 arrays, not multidimensional arrays.
336 Array slice assignment allows creation of arrays that do not use one-based
337 subscripts. For example one might assign to <literal>myarray[-2:7]</> to
338 create an array with subscript values running from -2 to 7.
342 New array values can also be constructed by using the concatenation operator,
343 <literal>||</literal>.
345 SELECT ARRAY[1,2] || ARRAY[3,4];
351 SELECT ARRAY[5,6] || ARRAY[[1,2],[3,4]];
353 ---------------------
360 The concatenation operator allows a single element to be pushed on to the
361 beginning or end of a one-dimensional array. It also accepts two
362 <replaceable>N</>-dimensional arrays, or an <replaceable>N</>-dimensional
363 and an <replaceable>N+1</>-dimensional array.
367 When a single element is pushed on to the beginning of a one-dimensional
368 array, the result is an array with a lower bound subscript equal to
369 the right-hand operand's lower bound subscript, minus one. When a single
370 element is pushed on to the end of a one-dimensional array, the result is
371 an array retaining the lower bound of the left-hand operand. For example:
373 SELECT array_dims(1 || ARRAY[2,3]);
379 SELECT array_dims(ARRAY[1,2] || 3);
388 When two arrays with an equal number of dimensions are concatenated, the
389 result retains the lower bound subscript of the left-hand operand's outer
390 dimension. The result is an array comprising every element of the left-hand
391 operand followed by every element of the right-hand operand. For example:
393 SELECT array_dims(ARRAY[1,2] || ARRAY[3,4,5]);
399 SELECT array_dims(ARRAY[[1,2],[3,4]] || ARRAY[[5,6],[7,8],[9,0]]);
408 When an <replaceable>N</>-dimensional array is pushed on to the beginning
409 or end of an <replaceable>N+1</>-dimensional array, the result is
410 analogous to the element-array case above. Each <replaceable>N</>-dimensional
411 sub-array is essentially an element of the <replaceable>N+1</>-dimensional
412 array's outer dimension. For example:
414 SELECT array_dims(ARRAY[1,2] || ARRAY[[3,4],[5,6]]);
423 An array can also be constructed by using the functions
424 <function>array_prepend</function>, <function>array_append</function>,
425 or <function>array_cat</function>. The first two only support one-dimensional
426 arrays, but <function>array_cat</function> supports multidimensional arrays.
428 Note that the concatenation operator discussed above is preferred over
429 direct use of these functions. In fact, the functions are primarily for use
430 in implementing the concatenation operator. However, they may be directly
431 useful in the creation of user-defined aggregates. Some examples:
434 SELECT array_prepend(1, ARRAY[2,3]);
440 SELECT array_append(ARRAY[1,2], 3);
446 SELECT array_cat(ARRAY[1,2], ARRAY[3,4]);
452 SELECT array_cat(ARRAY[[1,2],[3,4]], ARRAY[5,6]);
454 ---------------------
458 SELECT array_cat(ARRAY[5,6], ARRAY[[1,2],[3,4]]);
460 ---------------------
467 <title>Searching in Arrays</title>
470 To search for a value in an array, you must check each value of the
471 array. This can be done by hand, if you know the size of the array.
475 SELECT * FROM sal_emp WHERE pay_by_quarter[1] = 10000 OR
476 pay_by_quarter[2] = 10000 OR
477 pay_by_quarter[3] = 10000 OR
478 pay_by_quarter[4] = 10000;
481 However, this quickly becomes tedious for large arrays, and is not
482 helpful if the size of the array is uncertain. An alternative method is
483 described in <xref linkend="functions-comparisons">. The above
484 query could be replaced by:
487 SELECT * FROM sal_emp WHERE 10000 = ANY (pay_by_quarter);
490 In addition, you could find rows where the array had all values
494 SELECT * FROM sal_emp WHERE 10000 = ALL (pay_by_quarter);
501 Arrays are not sets; searching for specific array elements
502 may be a sign of database misdesign. Consider
503 using a separate table with a row for each item that would be an
504 array element. This will be easier to search, and is likely to
505 scale up better to large numbers of elements.
511 <title>Array Input and Output Syntax</title>
514 The external text representation of an array value consists of items that
515 are interpreted according to the I/O conversion rules for the array's
516 element type, plus decoration that indicates the array structure.
517 The decoration consists of curly braces (<literal>{</> and <literal>}</>)
518 around the array value plus delimiter characters between adjacent items.
519 The delimiter character is usually a comma (<literal>,</>) but can be
520 something else: it is determined by the <literal>typdelim</> setting
521 for the array's element type. (Among the standard data types provided
522 in the <productname>PostgreSQL</productname> distribution, type
523 <literal>box</> uses a semicolon (<literal>;</>) but all the others
524 use comma.) In a multidimensional array, each dimension (row, plane,
525 cube, etc.) gets its own level of curly braces, and delimiters
526 must be written between adjacent curly-braced entities of the same level.
527 You may write whitespace before a left brace, after a right
528 brace, or before any individual item string. Whitespace after an item
529 is not ignored, however: after skipping leading whitespace, everything
530 up to the next right brace or delimiter is taken as the item value.
534 By default, the lower bound index value of an array's dimensions is
535 set to one. If any of an array's dimensions has a lower bound index not
536 equal to one, an additional decoration that indicates the actual
537 array dimensions will precede the array structure decoration.
538 The decoration consists of square braces (<literal>[</> and <literal>]</>)
539 around each array dimension's lower and upper bound indicies, plus
540 a colon (<literal>:</>) delimiter character inbetween. Delimiting the
541 array dimension decoration from the array structure decoration is a
542 single assignment operator (<literal>=</>). For example:
544 SELECT 1 || ARRAY[2,3] AS array;
551 SELECT ARRAY[1,2] || ARRAY[[3,4]] AS array;
554 --------------------------
555 [0:1][1:2]={{1,2},{3,4}}
561 In a similar fashion, an array with non-default indicies may be specified
562 using the same literal syntax. For example:
564 SELECT f1[1][-2][3] AS e1, f1[1][-1][5] AS e2
565 FROM (SELECT '[1:1][-2:-1][3:5]={{{1,2,3},{4,5,6}}}'::int[] AS f1) AS ss;
575 As shown previously, when writing an array value you may write double
576 quotes around any individual array
577 element. You <emphasis>must</> do so if the element value would otherwise
578 confuse the array-value parser. For example, elements containing curly
579 braces, commas (or whatever the delimiter character is), double quotes,
580 backslashes, or leading white space must be double-quoted. To put a double
581 quote or backslash in a quoted array element value, precede it with a
583 Alternatively, you can use backslash-escaping to protect all data characters
584 that would otherwise be taken as array syntax or ignorable white space.
588 The array output routine will put double quotes around element values
589 if they are empty strings or contain curly braces, delimiter characters,
590 double quotes, backslashes, or white space. Double quotes and backslashes
591 embedded in element values will be backslash-escaped. For numeric
592 data types it is safe to assume that double quotes will never appear, but
593 for textual data types one should be prepared to cope with either presence
594 or absence of quotes. (This is a change in behavior from pre-7.2
595 <productname>PostgreSQL</productname> releases.)
600 Remember that what you write in an SQL command will first be interpreted
601 as a string literal, and then as an array. This doubles the number of
602 backslashes you need. For example, to insert a <type>text</> array
603 value containing a backslash and a double quote, you'd need to write
605 INSERT ... VALUES ('{"\\\\","\\""}');
607 The string-literal processor removes one level of backslashes, so that
608 what arrives at the array-value parser looks like <literal>{"\\","\""}</>.
609 In turn, the strings fed to the <type>text</> data type's input routine
610 become <literal>\</> and <literal>"</> respectively. (If we were working
611 with a data type whose input routine also treated backslashes specially,
612 <type>bytea</> for example, we might need as many as eight backslashes
613 in the command to get one backslash into the stored array element.)
619 The <literal>ARRAY</> constructor syntax is often easier to work with
620 than the array-literal syntax when writing array values in SQL commands.
621 In <literal>ARRAY</>, individual element values are written the same way
622 they would be written when not members of an array.