1 <!-- $Header: /cvsroot/pgsql/doc/src/sgml/protocol.sgml,v 1.18 2001/06/22 23:27:48 petere Exp $ -->
3 <chapter id="protocol">
4 <title>Frontend/Backend Protocol</title>
8 Written by Phil Thompson (<email>phil@river-bank.demon.co.uk</email>).
9 Updates for protocol 2.0 by Tom Lane (<email>tgl@sss.pgh.pa.us</email>).
14 <application>PostgreSQL</application> uses a message-based protocol
15 for communication between frontends and backends. The protocol is
16 implemented over <acronym>TCP/IP</acronym> and also on Unix domain
17 sockets. <productname>PostgreSQL</productname> 6.3 introduced
18 version numbers into the protocol. This was done in such a way as
19 to still allow connections from earlier versions of frontends, but
20 this document does not cover the protocol used by those earlier
25 This document describes version 2.0 of the protocol, implemented in
26 <application>PostgreSQL</application> 6.4 and later.
30 Higher level features built on this protocol (for example, how
31 <application>libpq</application> passes certain environment
32 variables after the connection is established) are covered
36 <sect1 id="protocol-overview">
37 <title>Overview</title>
40 A frontend opens a connection to the server and sends a start-up
41 packet. This includes the names of the user and the database the
42 user wants to connect to. The server then uses this, and the
43 information in the <filename>pg_hba.conf</filename> file to
44 determine what further authentication information it requires the
45 frontend to send (if any) and responds to the frontend accordingly.
49 The frontend then sends any required authentication information.
50 Once the server validates this it responds to the frontend that it
51 is authenticated and sends a message indicating successful start-up
52 (normal case) or failure (for example, an invalid database name).
56 In order to serve multiple clients efficiently, the server would
57 normally create a new child process to handle each incoming
58 connection. However, this is not required. In the current
59 implementation, a new child process is created immediately after an
60 incoming connection is detected. In earlier versions of PostgreSQL
61 (7.1 and earlier), the child process was created after sending the
62 authentication confirmation message.
66 When the frontend wishes to disconnect it sends an appropriate packet and
67 closes the connection without waiting for a response for the backend.
71 Packets are sent as a data stream. The first byte determines what
72 should be expected in the rest of the packet. The exceptions are
73 packets sent as part of the startup and authentication exchange,
74 which comprise a packet length followed by the packet itself. The
75 difference is historical.
79 <sect1 id="protocol-protocol">
80 <title>Protocol</title>
83 This section describes the message flow. There are four different
84 types of flows depending on the state of the connection: start-up,
85 query, function call, and termination. There are also special
86 provisions for notification responses and command cancellation,
87 which can occur at any time after the start-up phase.
91 <title>Start-up</Title>
94 Initially, the frontend sends a StartupPacket. The server uses
95 this info and the contents of the <filename>pg_hba.conf</filename>
96 file to determine what authentication method the frontend must
97 use. The server then responds with one of the following messages:
101 <term>ErrorResponse</term>
104 The server then immediately closes the connection.
110 <term>AuthenticationOk</term>
113 The authentication exchange is completed.
119 <term>AuthenticationKerberosV4</Term>
122 The frontend must then take part in a Kerberos V4
123 authentication dialog (not described here, part of the
124 Kerberos specification) with the server. If this is
125 successful, the server responds with an AuthenticationOk,
126 otherwise it responds with an ErrorResponse.
132 <Term>AuthenticationKerberosV5</Term>
135 The frontend must then take part in a Kerberos V5
136 authentication dialog (not described here, part of the
137 Kerberos specification) with the server. If this is
138 successful, the server responds with an AuthenticationOk,
139 otherwise it responds with an ErrorResponse.
145 <Term>AuthenticationUnencryptedPassword</Term>
148 The frontend must then send an UnencryptedPasswordPacket. If
149 this is the correct password, the server responds with an
150 AuthenticationOk, otherwise it responds with an ErrorResponse.
156 <Term>AuthenticationEncryptedPassword</Term>
159 The frontend must then send an EncryptedPasswordPacket. If
160 this is the correct password, the server responds with an
161 AuthenticationOk, otherwise it responds with an ErrorResponse.
170 If the frontend does not support the authentication method
171 requested by the server, then it should immediately close the
176 After having received AuthenticationOk, the frontend should wait
177 for further messages from the server. The possible messages from
178 the backend in this phase are:
182 <Term>BackendKeyData</Term>
185 This message provides secret-key data that the frontend must
186 save if it wants to be able to issue cancel requests later.
187 The frontend should not respond to this message, but should
188 continue listening for a ReadyForQuery message.
194 <Term>ReadyForQuery</Term>
197 Start-up is completed. The frontend may now issue query or
198 function call messages.
204 <Term>ErrorResponse</Term>
207 Start-up failed. The connection is closed after sending this
214 <Term>NoticeResponse</Term>
217 A warning message has been issued. The frontend should
218 display the message but continue listening for ReadyForQuery
227 The ReadyForQuery message is the same one that the backend will
228 issue after each query cycle. Depending on the coding needs of
229 the frontend, it is reasonable to consider ReadyForQuery as
230 starting a query cycle (and then BackendKeyData indicates
231 successful conclusion of the start-up phase), or to consider
232 ReadyForQuery as ending the start-up phase and each subsequent
241 A Query cycle is initiated by the frontend sending a Query message
242 to the backend. The backend then sends one or more response
243 messages depending on the contents of the query command string,
244 and finally a ReadyForQuery response message. ReadyForQuery
245 informs the frontend that it may safely send a new query or
250 The possible response messages from the backend are:
254 <Term>CompletedResponse</Term>
257 An SQL command completed normally.
263 <Term>CopyInResponse</Term>
266 The backend is ready to copy data from the frontend to a
267 table. The frontend should then send a CopyDataRows message.
268 The backend will then respond with a CompletedResponse message
269 with a tag of <literal>COPY</literal>.
275 <Term>CopyOutResponse</Term>
278 The backend is ready to copy data from a table to the
279 frontend. It then sends a CopyDataRows message, and then a
280 CompletedResponse message with a tag of <literal>COPY</literal>.
286 <Term>CursorResponse</Term>
289 The query was either an <literal>INSERT</literal>,
290 <literal>UPDATE</literal>, <literal>DELETE</literal>,
291 <literal>FETCH</literal>, or a <literal>SELECT</literal>
292 command. If the transaction has been aborted then the backend
293 sends a CompletedResponse message with a tag of <literal>*ABORT
294 STATE*</literal>. Otherwise the following responses are sent.
298 For an <literal>INSERT</literal> command, the backend then
299 sends a CompletedResponse message with a tag of
300 <literal>INSERT <replaceable>oid</replaceable>
301 <replaceable>rows</replaceable></literal>, where
302 <replaceable>rows</replaceable> is the number of rows
303 inserted, and <replaceable>oid</replaceable> is the object ID
304 of the inserted row if <Replaceable>rows</Replaceable> is 1,
305 otherwise <Replaceable>oid</Replaceable> is 0.
309 For a <literal>DELETE</literal> command, the backend then
310 sends a CompletedResponse message with a tag of <literal>DELETE
311 <Replaceable>rows</Replaceable></literal> where
312 <Replaceable>rows</Replaceable> is the number of rows deleted.
316 For an <literal>UPDATE</literal> command, the backend then
317 sends a CompletedResponse message with a tag of <literal>UPDATE
318 <Replaceable>rows</Replaceable></literal> where
319 <Replaceable>rows</Replaceable> is the number of rows affected
324 For a <literal>FETCH</literal> or <literal>SELECT</literal>
325 command, the backend sends a RowDescription message. This is
326 then followed by an AsciiRow or BinaryRow message (depending
327 on whether a binary cursor was specified) for each row being
328 returned to the frontend. Finally, the backend sends a
329 CompletedResponse message with a tag of <literal>SELECT</literal>.
335 <Term>EmptyQueryResponse</Term>
338 An empty query string was recognized. (The need to specially
339 distinguish this case is historical.)
345 <Term>ErrorResponse</Term>
348 An error has occurred.
354 <Term>ReadyForQuery</Term>
357 Processing of the query string is complete. A separate
358 message is sent to indicate this because the query string may
359 contain multiple SQL commands. (CompletedResponse marks the
360 end of processing one SQL command, not the whole string.)
361 ReadyForQuery will always be sent, whether processing
362 terminates successfully or with an error.
368 <Term>NoticeResponse</Term>
371 A warning message has been issued in relation to the query.
372 Notices are in addition to other responses, i.e., the backend
373 will continue processing the command.
382 A frontend must be prepared to accept ErrorResponse and
383 NoticeResponse messages whenever it is expecting any other type of
388 Actually, it is possible for NoticeResponse to arrive even when
389 the frontend is not expecting any kind of message, that is, the
390 backend is nominally idle. (In particular, the backend can be
391 commanded to terminate by its parent process. In that case it will
392 send a NoticeResponse before closing the connection.) It is
393 recommended that the frontend check for such asynchronous notices
394 just before issuing any new command.
398 Also, if the frontend issues any <literal>LISTEN</literal>
399 commands then it must be prepared to accept NotificationResponse
400 messages at any time; see below.
405 <Title>Function Call</Title>
408 A Function Call cycle is initiated by the frontend sending a
409 FunctionCall message to the backend. The backend then sends one
410 or more response messages depending on the results of the function
411 call, and finally a ReadyForQuery response message. ReadyForQuery
412 informs the frontend that it may safely send a new query or
417 The possible response messages from the backend are:
421 <Term>ErrorResponse</Term>
424 An error has occurred.
430 <Term>FunctionResultResponse</Term>
433 The function call was executed and returned a result.
439 <Term>FunctionVoidResponse</Term>
442 The function call was executed and returned no result.
448 <Term>ReadyForQuery</Term>
451 Processing of the function call is complete. ReadyForQuery
452 will always be sent, whether processing terminates
453 successfully or with an error.
459 <Term>NoticeResponse</Term>
462 A warning message has been issued in relation to the function
463 call. Notices are in addition to other responses, i.e., the
464 backend will continue processing the command.
472 A frontend must be prepared to accept ErrorResponse and
473 NoticeResponse messages whenever it is expecting any other type of
474 message. Also, if it issues any <literal>LISTEN</literal>
475 commands then it must be prepared to accept NotificationResponse
476 messages at any time; see below.
481 <title>Notification Responses</title>
484 If a frontend issues a <literal>LISTEN</literal> command, then the
485 backend will send a NotificationResponse message (not to be
486 confused with NoticeResponse!) whenever a
487 <literal>NOTIFY</literal> command is executed for the same
492 Notification responses are permitted at any point in the protocol
493 (after start-up), except within another backend message. Thus,
494 the frontend must be prepared to recognize a NotificationResponse
495 message whenever it is expecting any message. Indeed, it should
496 be able to handle NotificationResponse messages even when it is
497 not engaged in a query.
501 <Term>NotificationResponse</Term>
504 A <literal>NOTIFY</literal> command has been executed for a
505 name for which a previous <literal>LISTEN</literal> command
506 was executed. Notifications may be sent at any time.
514 It may be worth pointing out that the names used in listen and
515 notify commands need not have anything to do with names of
516 relations (tables) in the SQL database. Notification names are
517 simply arbitrarily chosen condition names.
522 <Title>Cancelling Requests in Progress</Title>
525 During the processing of a query, the frontend may request
526 cancellation of the query. The cancel request is not sent
527 directly on the open connection to the backend for reasons of
528 implementation efficiency: we don't want to have the backend
529 constantly checking for new input from the frontend during query
530 processing. Cancel requests should be relatively infrequent, so
531 we make them slightly cumbersome in order to avoid a penalty in
536 To issue a cancel request, the frontend opens a new connection to
537 the server and sends a CancelRequest message, rather than the
538 StartupPacket message that would ordinarily be sent across a new
539 connection. The server will process this request and then close
540 the connection. For security reasons, no direct reply is made to
541 the cancel request message.
545 A CancelRequest message will be ignored unless it contains the
546 same key data (PID and secret key) passed to the frontend during
547 connection start-up. If the request matches the PID and secret
548 key for a currently executing backend, the processing of the
549 current query is aborted. (In the existing implemenation, this is
550 done by sending a special signal to the backend process that is
551 processing the query.)
555 The cancellation signal may or may not have any effect --- for
556 example, if it arrives after the backend has finished processing
557 the query, then it will have no effect. If the cancellation is
558 effective, it results in the current command being terminated
559 early with an error message.
563 The upshot of all this is that for reasons of both security and
564 efficiency, the frontend has no direct way to tell whether a
565 cancel request has succeeded. It must continue to wait for the
566 backend to respond to the query. Issuing a cancel simply improves
567 the odds that the current query will finish soon, and improves the
568 odds that it will fail with an error message instead of
573 Since the cancel request is sent across a new connection to the
574 server and not across the regular frontend/backend communication
575 link, it is possible for the cancel request to be issued by any
576 process, not just the frontend whose query is to be canceled.
577 This may have some benefits of flexibility in building
578 multiple-process applications. It also introduces a security
579 risk, in that unauthorized persons might try to cancel queries.
580 The security risk is addressed by requiring a dynamically
581 generated secret key to be supplied in cancel requests.
586 <Title>Termination</Title>
589 The normal, graceful termination procedure is that the frontend
590 sends a Terminate message and immediately closes the connection.
591 On receipt of the message, the backend immediately closes the
592 connection and terminates.
596 An ungraceful termination may occur due to software failure (i.e.,
597 core dump) at either end. If either frontend or backend sees an
598 unexpected closure of the connection, it should clean up and
599 terminate. The frontend has the option of launching a new backend
600 by recontacting the server if it doesn't want to terminate
606 <Sect1 id="protocol-message-types">
607 <Title>Message Data Types</Title>
610 This section describes the base data types used in messages.
615 Int<Replaceable>n</Replaceable>(<Replaceable>i</Replaceable>)
619 An <Replaceable>n</Replaceable> bit integer in network byte order.
620 If <Replaceable>i</Replaceable> is specified it
621 is the literal value. Eg. Int16, Int32(42).
627 LimString<Replaceable>n</Replaceable>(<Replaceable>s</Replaceable>)
631 A character array of exactly <Replaceable>n</Replaceable> bytes interpreted as a '\0'
632 terminated string. The '\0' is omitted if there is
633 insufficient room. If <Replaceable>s</Replaceable> is specified it is the literal value.
634 Eg. LimString32, LimString64("user").
640 String(<Replaceable>s</Replaceable>)
644 A conventional C '\0' terminated string with no length
646 If <Replaceable>s</Replaceable> is specified it is the literal value.
647 Eg. String, String("user").
652 <Emphasis>There is no predefined limit</Emphasis> on the length of a string
653 that can be returned by the backend. Good coding strategy for a frontend
654 is to use an expandable buffer so that anything that fits in memory can be
655 accepted. If that's not feasible, read the full string and discard trailing
656 characters that don't fit into your fixed-size buffer.
663 Byte<Replaceable>n</Replaceable>(<Replaceable>c</Replaceable>)
667 Exactly <Replaceable>n</Replaceable> bytes. If <Replaceable>c</Replaceable> is specified it is the literal
668 value. Eg. Byte, Byte1('\n').
676 <Sect1 id="protocol-message-formats">
677 <Title>Message Formats</Title>
680 This section describes the detailed format of each message. Each can be sent
681 by either a frontend (F), a backend (B), or both (F & B).
698 Identifies the message as an <Acronym>ASCII</Acronym> data row.
699 (A prior RowDescription message defines the number of
700 fields in the row and their data types.)
706 Byte<Replaceable>n</Replaceable>
710 A bit map with one bit for each field in the row. The 1st
711 field corresponds to bit 7 (MSB) of the 1st byte, the 2nd
712 field corresponds to bit 6 of the 1st byte, the 8th field
713 corresponds to bit 0 (LSB) of the 1st byte, the 9th field
714 corresponds to bit 7 of the 2nd byte, and so on. Each bit
715 is set if the value of the corresponding field is not NULL.
716 If the number of fields is not a multiple of 8, the remainder
717 of the last byte in the bit map is wasted.
720 Then, for each field with a non-NULL value, there is the following:
728 Specifies the size of the value of the field, including
735 Byte<Replaceable>n</Replaceable>
739 Specifies the value of the field itself in <Acronym>ASCII</Acronym>
740 characters. <Replaceable>n</Replaceable> is the above
742 There is no trailing '\0' in the field data; the front
743 end must add one if it wants one.
771 Identifies the message as an authentication request.
781 Specifies that the authentication was successful.
792 AuthenticationKerberosV4 (B)
804 Identifies the message as an authentication request.
814 Specifies that Kerberos V4 authentication is required.
826 AuthenticationKerberosV5 (B)
838 Identifies the message as an authentication request.
848 Specifies that Kerberos V5 authentication is required.
860 AuthenticationUnencryptedPassword (B)
872 Identifies the message as an authentication request.
882 Specifies that an unencrypted password is required.
894 AuthenticationEncryptedPassword (B)
906 Identifies the message as an authentication request.
916 Specifies that an encrypted password is required.
926 The salt to use when encrypting the password.
949 Identifies the message as cancellation key data.
950 The frontend must save these values if it wishes to be
951 able to issue CancelRequest messages later.
961 The process ID of this backend.
971 The secret key of this backend.
995 Identifies the message as a binary data row.
996 (A prior RowDescription message defines the number of
997 fields in the row and their data types.)
1003 Byte<Replaceable>n</Replaceable>
1007 A bit map with one bit for each field in the row. The 1st
1008 field corresponds to bit 7 (MSB) of the 1st byte, the 2nd
1009 field corresponds to bit 6 of the 1st byte, the 8th field
1010 corresponds to bit 0 (LSB) of the 1st byte, the 9th field
1011 corresponds to bit 7 of the 2nd byte, and so on. Each bit
1012 is set if the value of the corresponding field is not NULL.
1013 If the number of fields is not a multiple of 8, the remainder
1014 of the last byte in the bit map is wasted.
1017 Then, for each field with a non-NULL value, there is the following:
1025 Specifies the size of the value of the field, excluding
1032 Byte<Replaceable>n</Replaceable>
1036 Specifies the value of the field itself in binary
1037 format. <Replaceable>n</Replaceable> is the above size.
1065 The size of the packet in bytes.
1075 The cancel request code. The value is chosen to contain
1076 "1234" in the most significant 16 bits, and "5678" in the
1077 least 16 significant bits. (To avoid confusion, this code
1078 must not be the same as any protocol version number.)
1088 The process ID of the target backend.
1098 The secret key for the target backend.
1110 CompletedResponse (B)
1122 Identifies the message as a completed response.
1132 The command tag. This is usually (but not always) a single
1133 word that identifies which SQL command was completed.
1145 CopyDataRows (B & F)
1149 This is a stream of rows where each row is terminated by a Byte1('\n').
1150 This is then followed by the sequence Byte1('\\'), Byte1('.'),
1169 Identifies the message as a Start Copy In response.
1170 The frontend must now send a CopyDataRows message.
1194 Identifies the message as a Start Copy Out response.
1195 This message will be followed by a CopyDataRows message.
1219 Identifies the message as a cursor response.
1229 The name of the cursor. This will be "blank" if the cursor is
1242 EmptyQueryResponse (B)
1254 Identifies the message as a response to an empty query string.
1276 EncryptedPasswordPacket (F)
1288 The size of the packet in bytes.
1298 The encrypted (using crypt()) password.
1322 Identifies the message as an error.
1332 The error message itself.
1356 Identifies the message as a function call.
1376 Specifies the object ID of the function to call.
1386 Specifies the number of arguments being supplied to the
1390 Then, for each argument, there is the following:
1398 Specifies the size of the value of the argument,
1399 excluding this size.
1405 Byte<Replaceable>n</Replaceable>
1409 Specifies the value of the field itself in binary
1410 format. <Replaceable>n</Replaceable> is the above size.
1427 FunctionResultResponse (B)
1439 Identifies the message as a function call result.
1449 Specifies that a nonempty result was returned.
1459 Specifies the size of the value of the result, excluding this
1466 Byte<Replaceable>n</Replaceable>
1470 Specifies the value of the result itself in binary format.
1471 <Replaceable>n</Replaceable> is the above size.
1481 Unused. (Strictly speaking, FunctionResultResponse and
1482 FunctionVoidResponse are the same thing but with some optional
1483 parts to the message.)
1495 FunctionVoidResponse (B)
1507 Identifies the message as a function call result.
1517 Specifies that an empty result was returned.
1541 Identifies the message as a notice.
1551 The notice message itself.
1563 NotificationResponse (B)
1575 Identifies the message as a notification response.
1585 The process ID of the notifying backend process.
1595 The name of the condition that the notify has been raised on.
1619 Identifies the message as a query.
1629 The query string itself.
1653 Identifies the message type. ReadyForQuery is sent
1654 whenever the backend is ready for a new query cycle.
1678 Identifies the message as a row description.
1688 Specifies the number of fields in a row (may be zero).
1691 Then, for each field, there is the following:
1699 Specifies the field name.
1709 Specifies the object ID of the field type.
1719 Specifies the type size.
1729 Specifies the type modifier.
1758 The size of the packet in bytes.
1768 The protocol version number. The most significant 16 bits are
1769 the major version number. The least 16 significant bits are
1770 the minor version number.
1780 The database name, defaults to the user name if empty.
1800 Any additional command line arguments to be passed to the
1801 backend child process by the server.
1821 The optional tty the backend should use for debugging messages.
1845 Identifies the message as a termination.
1857 UnencryptedPasswordPacket (F)
1869 The size of the packet in bytes.
1879 The unencrypted password.