1 <!-- $Header: /cvsroot/pgsql/doc/src/sgml/array.sgml,v 1.27 2003/06/25 21:30:25 momjian Exp $ -->
7 <primary>arrays</primary>
11 <productname>PostgreSQL</productname> allows columns of a table to be
12 defined as variable-length multidimensional arrays. Arrays of any
13 built-in type or user-defined type can be created.
17 <title>Declaration of Array Types</title>
20 To illustrate the use of array types, we create this table:
22 CREATE TABLE sal_emp (
24 pay_by_quarter integer[],
28 As shown, an array data type is named by appending square brackets
29 (<literal>[]</>) to the data type name of the array elements. The
30 above command will create a table named
31 <structname>sal_emp</structname> with a column of type
32 <type>text</type> (<structfield>name</structfield>), a
33 one-dimensional array of type <type>integer</type>
34 (<structfield>pay_by_quarter</structfield>), which represents the
35 employee's salary by quarter, and a two-dimensional array of
36 <type>text</type> (<structfield>schedule</structfield>), which
37 represents the employee's weekly schedule.
42 <title>Array Value Input</title>
45 Now we can show some <command>INSERT</command> statements. To write an array
46 value, we enclose the element values within curly braces and separate them
47 by commas. If you know C, this is not unlike the syntax for
48 initializing structures. (More details appear below.)
53 '{10000, 10000, 10000, 10000}',
54 '{{"meeting", "lunch"}, {}}');
58 '{20000, 25000, 25000, 25000}',
59 '{{"talk", "consult"}, {"meeting"}}');
65 A limitation of the present array implementation is that individual
66 elements of an array cannot be SQL null values. The entire array can be set
67 to null, but you can't have an array with some elements null and some
68 not. Fixing this is on the to-do list.
74 <title>Array Value References</title>
77 Now, we can run some queries on the table.
78 First, we show how to access a single element of an array at a time.
79 This query retrieves the names of the employees whose pay changed in
83 SELECT name FROM sal_emp WHERE pay_by_quarter[1] <> pay_by_quarter[2];
91 The array subscript numbers are written within square brackets.
92 By default <productname>PostgreSQL</productname> uses the
93 one-based numbering convention for arrays, that is,
94 an array of <replaceable>n</> elements starts with <literal>array[1]</literal> and
95 ends with <literal>array[<replaceable>n</>]</literal>.
99 This query retrieves the third quarter pay of all employees:
102 SELECT pay_by_quarter[3] FROM sal_emp;
113 We can also access arbitrary rectangular slices of an array, or
114 subarrays. An array slice is denoted by writing
115 <literal><replaceable>lower-bound</replaceable>:<replaceable>upper-bound</replaceable></literal>
116 for one or more array dimensions. For example, this query retrieves the first
117 item on Bill's schedule for the first two days of the week:
120 SELECT schedule[1:2][1:1] FROM sal_emp WHERE name = 'Bill';
128 We could also have written
131 SELECT schedule[1:2][1] FROM sal_emp WHERE name = 'Bill';
134 with the same result. An array subscripting operation is always taken to
135 represent an array slice if any of the subscripts are written in the
137 <literal><replaceable>lower</replaceable>:<replaceable>upper</replaceable></literal>.
138 A lower bound of 1 is assumed for any subscript where only one value
143 An array value can be replaced completely:
146 UPDATE sal_emp SET pay_by_quarter = '{25000,25000,27000,27000}'
147 WHERE name = 'Carol';
150 or updated at a single element:
153 UPDATE sal_emp SET pay_by_quarter[4] = 15000
157 or updated in a slice:
160 UPDATE sal_emp SET pay_by_quarter[1:2] = '{27000,27000}'
161 WHERE name = 'Carol';
166 An array can be enlarged by assigning to an element adjacent to
167 those already present, or by assigning to a slice that is adjacent
168 to or overlaps the data already present. For example, if an array
169 value currently has 4 elements, it will have five elements after an
170 update that assigns to <literal>array[5]</>. Currently, enlargement in
171 this fashion is only allowed for one-dimensional arrays, not
172 multidimensional arrays.
176 Array slice assignment allows creation of arrays that do not use one-based
177 subscripts. For example one might assign to <literal>array[-2:7]</> to
178 create an array with subscript values running from -2 to 7.
182 The syntax for <command>CREATE TABLE</command> allows fixed-length
183 arrays to be defined:
186 CREATE TABLE tictactoe (
187 squares integer[3][3]
191 However, the current implementation does not enforce the array size
192 limits --- the behavior is the same as for arrays of unspecified
197 Actually, the current implementation does not enforce the declared
198 number of dimensions either. Arrays of a particular element type are
199 all considered to be of the same type, regardless of size or number
200 of dimensions. So, declaring number of dimensions or sizes in
201 <command>CREATE TABLE</command> is simply documentation, it does not
202 affect runtime behavior.
206 The current dimensions of any array value can be retrieved with the
207 <function>array_dims</function> function:
210 SELECT array_dims(schedule) FROM sal_emp WHERE name = 'Carol';
218 <function>array_dims</function> produces a <type>text</type> result,
219 which is convenient for people to read but perhaps not so convenient
220 for programs. <function>array_upper</function> and <function>
221 array_lower</function> return the upper/lower bound of the
222 given array dimension, respectively.
227 <title>Searching in Arrays</title>
230 To search for a value in an array, you must check each value of the
231 array. This can be done by hand (if you know the size of the array).
235 SELECT * FROM sal_emp WHERE pay_by_quarter[1] = 10000 OR
236 pay_by_quarter[2] = 10000 OR
237 pay_by_quarter[3] = 10000 OR
238 pay_by_quarter[4] = 10000;
241 However, this quickly becomes tedious for large arrays, and is not
242 helpful if the size of the array is unknown. Although it is not built
243 into <productname>PostgreSQL</productname>,
244 there is an extension available that defines new functions and
245 operators for iterating over array values. Using this, the above
249 SELECT * FROM sal_emp WHERE pay_by_quarter[1:4] *= 10000;
252 To search the entire array (not just specified slices), you could
256 SELECT * FROM sal_emp WHERE pay_by_quarter *= 10000;
259 In addition, you could find rows where the array had all values
260 equal to 10 000 with:
263 SELECT * FROM sal_emp WHERE pay_by_quarter **= 10000;
266 To install this optional module, look in the
267 <filename>contrib/array</filename> directory of the
268 <productname>PostgreSQL</productname> source distribution.
273 Arrays are not sets; using arrays in the manner described in the
274 previous paragraph is often a sign of database misdesign. The
275 array field should generally be split off into a separate table.
276 Tables can obviously be searched easily.
282 <title>Array Input and Output Syntax</title>
285 The external representation of an array value consists of items that
286 are interpreted according to the I/O conversion rules for the array's
287 element type, plus decoration that indicates the array structure.
288 The decoration consists of curly braces (<literal>{</> and <literal>}</>)
289 around the array value plus delimiter characters between adjacent items.
290 The delimiter character is usually a comma (<literal>,</>) but can be
291 something else: it is determined by the <literal>typdelim</> setting
292 for the array's element type. (Among the standard data types provided
293 in the <productname>PostgreSQL</productname> distribution, type
294 <literal>box</> uses a semicolon (<literal>;</>) but all the others
295 use comma.) In a multidimensional array, each dimension (row, plane,
296 cube, etc.) gets its own level of curly braces, and delimiters
297 must be written between adjacent curly-braced entities of the same level.
298 You may write whitespace before a left brace, after a right
299 brace, or before any individual item string. Whitespace after an item
300 is not ignored, however: after skipping leading whitespace, everything
301 up to the next right brace or delimiter is taken as the item value.
306 <title>Quoting Array Elements</title>
309 As shown above, when writing an array value you may write double
310 quotes around any individual array
311 element. You <emphasis>must</> do so if the element value would otherwise
312 confuse the array-value parser. For example, elements containing curly
313 braces, commas (or whatever the delimiter character is), double quotes,
314 backslashes, or leading white space must be double-quoted. To put a double
315 quote or backslash in an array element value, precede it with a backslash.
316 Alternatively, you can use backslash-escaping to protect all data characters
317 that would otherwise be taken as array syntax or ignorable white space.
321 The array output routine will put double quotes around element values
322 if they are empty strings or contain curly braces, delimiter characters,
323 double quotes, backslashes, or white space. Double quotes and backslashes
324 embedded in element values will be backslash-escaped. For numeric
325 data types it is safe to assume that double quotes will never appear, but
326 for textual data types one should be prepared to cope with either presence
327 or absence of quotes. (This is a change in behavior from pre-7.2
328 <productname>PostgreSQL</productname> releases.)
333 Remember that what you write in an SQL command will first be interpreted
334 as a string literal, and then as an array. This doubles the number of
335 backslashes you need. For example, to insert a <type>text</> array
336 value containing a backslash and a double quote, you'd need to write
338 INSERT ... VALUES ('{"\\\\","\\""}');
340 The string-literal processor removes one level of backslashes, so that
341 what arrives at the array-value parser looks like <literal>{"\\","\""}</>.
342 In turn, the strings fed to the <type>text</> data type's input routine
343 become <literal>\</> and <literal>"</> respectively. (If we were working
344 with a data type whose input routine also treated backslashes specially,
345 <type>bytea</> for example, we might need as many as eight backslashes
346 in the command to get one backslash into the stored array element.)