1 <Chapter Id="largeObjects">
2 <Title>Large Objects</Title>
5 In <ProductName>Postgres</ProductName>, data values are stored in tuples and
6 individual tuples cannot span data pages. Since the size of
7 a data page is 8192 bytes, the upper limit on the size
8 of a data value is relatively low. To support the storage
9 of larger atomic values, <ProductName>Postgres</ProductName> provides a large
10 object interface. This interface provides file
11 oriented access to user data that has been declared to
13 This section describes the implementation and the
14 programmatic and query language interfaces to <ProductName>Postgres</ProductName>
19 <Title>Historical Note</Title>
22 Originally, <ProductName>Postgres 4.2</ProductName> supported three standard
23 implementations of large objects: as files external
24 to <ProductName>Postgres</ProductName>, as <Acronym>UNIX</Acronym> files managed by <ProductName>Postgres</ProductName>, and as data
25 stored within the <ProductName>Postgres</ProductName> database. It causes
26 considerable confusion among users. As a result, we only
27 support large objects as data stored within the <ProductName>Postgres</ProductName>
28 database in <ProductName>PostgreSQL</ProductName>. Even though it is slower to
29 access, it provides stricter data integrity.
30 For historical reasons, this storage scheme is referred to as
31 Inversion large objects. (We will use Inversion and large
32 objects interchangeably to mean the same thing in this
38 <Title>Inversion Large Objects</Title>
41 The Inversion large object implementation breaks large
42 objects up into "chunks" and stores the chunks in
43 tuples in the database. A B-tree index guarantees fast
44 searches for the correct chunk number when doing random
45 access reads and writes.
50 <Title>Large Object Interfaces</Title>
53 The facilities <ProductName>Postgres</ProductName> provides to access large
54 objects, both in the backend as part of user-defined
55 functions or the front end as part of an application
56 using the interface, are described below. (For users
57 familiar with <ProductName>Postgres 4.2</ProductName>, <ProductName>PostgreSQL</ProductName> has a new set of
58 functions providing a more coherent interface. The
59 interface is the same for dynamically-loaded C
60 functions as well as for XXX LOST TEXT? WHAT SHOULD GO HERE??.
62 The <ProductName>Postgres</ProductName> large object interface is modeled after
63 the <Acronym>UNIX</Acronym> file system interface, with analogues of
64 <Function>open(2)</Function>, <Function>read(2)</Function>, <Function>write(2)</Function>,
65 <Function>lseek(2)</Function>, etc. User
66 functions call these routines to retrieve only the data of
67 interest from a large object. For example, if a large
68 object type called mugshot existed that stored
69 photographs of faces, then a function called beard could
70 be declared on mugshot data. Beard could look at the
71 lower third of a photograph, and determine the color of
72 the beard that appeared there, if any. The entire
73 large object value need not be buffered, or even
74 examined, by the beard function.
75 Large objects may be accessed from dynamically-loaded <Acronym>C</Acronym>
76 functions or database client programs that link the
77 library. <ProductName>Postgres</ProductName> provides a set of routines that
78 support opening, reading, writing, closing, and seeking on
83 <Title>Creating a Large Object</Title>
88 Oid lo_creat(PGconn *conn, int mode)
90 creates a new large object. The mode is a bitmask
91 describing several different attributes of the new
92 object. The symbolic constants listed here are defined
95 PGROOT/src/backend/libpq/libpq-fs.h
97 The access type (read, write, or both) is controlled by
98 OR ing together the bits <Acronym>INV_READ</Acronym> and <Acronym>INV_WRITE</Acronym>. If
99 the large object should be archived -- that is, if
100 historical versions of it should be moved periodically to
101 a special archive relation -- then the <Acronym>INV_ARCHIVE</Acronym> bit
102 should be set. The low-order sixteen bits of mask are
103 the storage manager number on which the large object
104 should reside. For sites other than Berkeley, these
105 bits should always be zero.
106 The commands below create an (Inversion) large object:
108 inv_oid = lo_creat(INV_READ|INV_WRITE|INV_ARCHIVE);
114 <Title>Importing a Large Object</Title>
117 To import a <Acronym>UNIX</Acronym> file as
120 Oid lo_import(PGconn *conn, text *filename)
122 The filename argument specifies the <Acronym>UNIX</Acronym> pathname of
123 the file to be imported as a large object.
128 <Title>Exporting a Large Object</Title>
131 To export a large object
132 into <Acronym>UNIX</Acronym> file, call
134 int lo_export(PGconn *conn, Oid lobjId, text *filename)
136 The lobjId argument specifies the Oid of the large
137 object to export and the filename argument specifies
138 the <Acronym>UNIX</Acronym> pathname of the file.
143 <Title>Opening an Existing Large Object</Title>
146 To open an existing large object, call
148 int lo_open(PGconn *conn, Oid lobjId, int mode, ...)
150 The lobjId argument specifies the Oid of the large
151 object to open. The mode bits control whether the
152 object is opened for reading INV_READ), writing or
154 A large object cannot be opened before it is created.
155 lo_open returns a large object descriptor for later use
156 in lo_read, lo_write, lo_lseek, lo_tell, and lo_close.
161 <Title>Writing Data to a Large Object</Title>
166 int lo_write(PGconn *conn, int fd, char *buf, int len)
168 writes len bytes from buf to large object fd. The fd
169 argument must have been returned by a previous lo_open.
170 The number of bytes actually written is returned. In
171 the event of an error, the return value is negative.
176 <Title>Seeking on a Large Object</Title>
179 To change the current read or write location on a large
182 int lo_lseek(PGconn *conn, int fd, int offset, int whence)
184 This routine moves the current location pointer for the
185 large object described by fd to the new location specified
186 by offset. The valid values for .i whence are
187 SEEK_SET SEEK_CUR and SEEK_END.
192 <Title>Closing a Large Object Descriptor</Title>
195 A large object may be closed by calling
197 int lo_close(PGconn *conn, int fd)
199 where fd is a large object descriptor returned by
200 lo_open. On success, <Acronym>lo_close</Acronym> returns zero. On error,
201 the return value is negative.
207 <Title>Built in registered functions</Title>
210 There are two built-in registered functions, <Acronym>lo_import</Acronym>
211 and <Acronym>lo_export</Acronym> which are convenient for use in <Acronym>SQL</Acronym>
213 Here is an example of their use
220 INSERT INTO image (name, raster)
221 VALUES ('beautiful image', lo_import('/etc/motd'));
223 SELECT lo_export(image.raster, "/tmp/motd") from image
224 WHERE name = 'beautiful image';
230 <Title>Accessing Large Objects from LIBPQ</Title>
233 Below is a sample program which shows how the large object
235 in LIBPQ can be used. Parts of the program are
236 commented out but are left in the source for the readers
237 benefit. This program can be found in
241 Frontend applications which use the large object interface
242 in LIBPQ should include the header file
243 libpq/libpq-fs.h and link with the libpq library.
248 <Title>Sample Program</Title>
252 /*--------------------------------------------------------------
255 * test using large objects with libpq
257 * Copyright (c) 1994, Regents of the University of California
261 * /usr/local/devel/pglite/cvs/src/doc/manual.me,v 1.16 1995/09/01 23:55:00 jolly Exp
263 *--------------------------------------------------------------
265 #include <stdio.h>
266 #include "libpq-fe.h"
267 #include "libpq/libpq-fs.h"
272 * importFile * import file "in_filename" into database as large object "lobjOid"
275 Oid importFile(PGconn *conn, char *filename)
284 * open the file to be read in
286 fd = open(filename, O_RDONLY, 0666);
287 if (fd < 0) { /* error */
288 fprintf(stderr, "can't open unix file %s\n", filename);
292 * create the large object
294 lobjId = lo_creat(conn, INV_READ|INV_WRITE);
296 fprintf(stderr, "can't create large object\n");
299 lobj_fd = lo_open(conn, lobjId, INV_WRITE);
301 * read in from the Unix file and write to the inversion file
303 while ((nbytes = read(fd, buf, BUFSIZE)) > 0) {
304 tmp = lo_write(conn, lobj_fd, buf, nbytes);
305 if (tmp < nbytes) {
306 fprintf(stderr, "error while reading large object\n");
311 (void) lo_close(conn, lobj_fd);
316 void pickout(PGconn *conn, Oid lobjId, int start, int len)
323 lobj_fd = lo_open(conn, lobjId, INV_READ);
324 if (lobj_fd < 0) {
325 fprintf(stderr,"can't open large object %d\n",
329 lo_lseek(conn, lobj_fd, start, SEEK_SET);
333 while (len - nread > 0) {
334 nbytes = lo_read(conn, lobj_fd, buf, len - nread);
336 fprintf(stderr,">>> %s", buf);
339 fprintf(stderr,"\n");
340 lo_close(conn, lobj_fd);
343 void overwrite(PGconn *conn, Oid lobjId, int start, int len)
351 lobj_fd = lo_open(conn, lobjId, INV_READ);
352 if (lobj_fd < 0) {
353 fprintf(stderr,"can't open large object %d\n",
357 lo_lseek(conn, lobj_fd, start, SEEK_SET);
360 for (i=0;i<len;i++)
365 while (len - nwritten > 0) {
366 nbytes = lo_write(conn, lobj_fd, buf + nwritten, len - nwritten);
369 fprintf(stderr,"\n");
370 lo_close(conn, lobj_fd);
374 * exportFile * export large object "lobjOid" to file "out_filename"
377 void exportFile(PGconn *conn, Oid lobjId, char *filename)
385 * create an inversion "object"
387 lobj_fd = lo_open(conn, lobjId, INV_READ);
388 if (lobj_fd < 0) {
389 fprintf(stderr,"can't open large object %d\n",
394 * open the file to be written to
396 fd = open(filename, O_CREAT|O_WRONLY, 0666);
397 if (fd < 0) { /* error */
398 fprintf(stderr, "can't open unix file %s\n",
403 * read in from the Unix file and write to the inversion file
405 while ((nbytes = lo_read(conn, lobj_fd, buf, BUFSIZE)) > 0) {
406 tmp = write(fd, buf, nbytes);
407 if (tmp < nbytes) {
408 fprintf(stderr,"error while writing %s\n",
413 (void) lo_close(conn, lobj_fd);
420 exit_nicely(PGconn* conn)
427 main(int argc, char **argv)
429 char *in_filename, *out_filename;
436 fprintf(stderr, "Usage: %s database_name in_filename out_filename\n",
442 in_filename = argv[2];
443 out_filename = argv[3];
446 * set up the connection
448 conn = PQsetdb(NULL, NULL, NULL, NULL, database);
450 /* check to see that the backend connection was successfully made */
451 if (PQstatus(conn) == CONNECTION_BAD) {
452 fprintf(stderr,"Connection to database '%s' failed.\n", database);
453 fprintf(stderr,"%s",PQerrorMessage(conn));
457 res = PQexec(conn, "begin");
460 printf("importing file %s\n", in_filename);
461 /* lobjOid = importFile(conn, in_filename); */
462 lobjOid = lo_import(conn, in_filename);
464 printf("as large object %d.\n", lobjOid);
466 printf("picking out bytes 1000-2000 of the large object\n");
467 pickout(conn, lobjOid, 1000, 1000);
469 printf("overwriting bytes 1000-2000 of the large object with X's\n");
470 overwrite(conn, lobjOid, 1000, 1000);
473 printf("exporting large object to file %s\n", out_filename);
474 /* exportFile(conn, lobjOid, out_filename); */
475 lo_export(conn, lobjOid,out_filename);
477 res = PQexec(conn, "end");