From 515112f9d4874aaedd0c093f41c0ba3e0bf7f660 Mon Sep 17 00:00:00 2001
From: Tom Lane
Date: Sun, 21 May 2006 20:19:23 +0000
Subject: [PATCH] Modify libpq's string-escaping routines to be aware of
 encoding considerations and standard_conforming_strings. The encoding
 changes are needed for proper escaping in multibyte encodings, as per the
 SQL-injection vulnerabilities noted in CVE-2006-2313 and CVE-2006-2314.
 Concurrent fixes are being applied to the server to ensure that it rejects
 queries that may have been corrupted by attempted SQL injection, but this
 merely guarantees that unpatched clients will fail rather than allow
 injection. An actual fix requires changing the client-side code. While at
 it we have also fixed these routines to understand about
 standard_conforming_strings, so that the upcoming changeover to SQL-spec
 string syntax can be somewhat transparent to client code.

Since the existing API of PQescapeString and PQescapeBytea provides no way
to inform them which settings are in use, these functions are now
deprecated in favor of new functions PQescapeStringConn and
PQescapeByteaConn. The new functions take the PGconn to which the string
will be sent as an additional parameter, and look inside the connection
structure to determine what to do. So as to provide some functionality for
clients using the old functions, libpq stores the latest encoding and
standard_conforming_strings values received from the backend in static
variables, and the old functions consult these variables. This will work
reliably in clients using only one Postgres connection at a time, or even
multiple connections if they all use the same encoding and string syntax
settings; which should cover many practical scenarios.

Clients that use homebrew escaping methods, such as PHP's addslashes()
function or even hardwired regexp substitution, will require extra effort
to fix :-(.
It is strongly recommended that such code be replaced by use of PQescapeStringConn/PQescapeByteaConn if at all feasible. --- doc/src/sgml/libpq.sgml | 157 +++++++++++++++++------ src/interfaces/libpq/exports.txt | 6 +- src/interfaces/libpq/fe-connect.c | 13 +- src/interfaces/libpq/fe-exec.c | 198 ++++++++++++++++++++++++------ src/interfaces/libpq/libpq-fe.h | 15 ++- src/interfaces/libpq/libpq-int.h | 3 +- 6 files changed, 309 insertions(+), 83 deletions(-) diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index aef1129844..0c24f2b773 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -1,4 +1,4 @@ - + <application>libpq</application> - C Library @@ -2187,15 +2187,16 @@ It is not thread-safe. Escaping Strings for Inclusion in SQL Commands + PQescapeStringConn PQescapeString escaping strings -PQescapeString escapes a string for use within an SQL +PQescapeStringConn escapes a string for use within an SQL command. This is useful when inserting data values as literal constants in SQL commands. Certain characters (such as quotes and backslashes) must be escaped to prevent them from being interpreted specially by the SQL parser. -PQescapeString performs this operation. +PQescapeStringConn performs this operation. @@ -2213,36 +2214,68 @@ value is passed as a separate parameter in PQexecParams or its sibling routines. -size_t PQescapeString (char *to, const char *from, size_t length); +size_t PQescapeStringConn (PGconn *conn, + char *to, const char *from, size_t length, + int *error); +PQescapeStringConn writes an escaped +version of the from string to the to +buffer, escaping special characters so that they cannot cause any +harm, and adding a terminating zero byte. The single quotes that +must surround PostgreSQL string literals are not +included in the result string; they should be provided in the SQL +command that the result is inserted into. 
The parameter from points to the first character of the string that is to be escaped, and the length parameter gives the -number of characters in this string. A terminating zero byte is not +number of bytes in this string. A terminating zero byte is not required, and should not be counted in length. (If a terminating zero byte is found before length bytes are -processed, PQescapeString stops at the zero; the behavior +processed, PQescapeStringConn stops at the zero; the behavior is thus rather like strncpy.) to shall point to a -buffer that is able to hold at least one more character than twice +buffer that is able to hold at least one more byte than twice the value of length, otherwise the behavior is -undefined. A call to PQescapeString writes an escaped -version of the from string to the to -buffer, replacing special characters so that they cannot cause any -harm, and adding a terminating zero byte. The single quotes that -must surround PostgreSQL string literals are not -included in the result string; they should be provided in the SQL -command that the result is inserted into. +undefined. +Behavior is likewise undefined if the to and from +strings overlap. + + +If the error parameter is not NULL, then *error +is set to zero on success, nonzero on error. Presently the only possible +error conditions involve invalid multibyte encoding in the source string. +The output string is still generated on error, but it can be expected that +the server will reject it as malformed. On error, a suitable message is +stored in the conn object, whether or not error +is NULL. -PQescapeString returns the number of characters written +PQescapeStringConn returns the number of bytes written to to, not including the terminating zero byte. + -Behavior is undefined if the to and from -strings overlap. 
+ +size_t PQescapeString (char *to, const char *from, size_t length); + + + + +PQescapeString is an older, deprecated version of +PQescapeStringConn; the difference is that it does not +take conn or error parameters. Because of this, +it cannot adjust its behavior depending on the connection properties (such as +character encoding) and therefore it may give the wrong results. +Also, it has no way to report error conditions. + + +PQescapeString can be used safely in single-threaded client +programs that work with only one PostgreSQL connection at +a time (in this case it can find out what it needs to know behind the +scenes). In other contexts it is a security hazard and should be avoided +in favor of PQescapeStringConn. @@ -2257,16 +2290,17 @@ strings overlap. - PQescapeByteaPQescapeBytea + PQescapeByteaConnPQescapeByteaConn Escapes binary data for use within an SQL command with the type - bytea. As with PQescapeString, + bytea. As with PQescapeStringConn, this is only used when inserting data directly into an SQL command string. -unsigned char *PQescapeBytea(const unsigned char *from, - size_t from_length, - size_t *to_length); +unsigned char *PQescapeByteaConn(PGconn *conn, + const unsigned char *from, + size_t from_length, + size_t *to_length); @@ -2276,10 +2310,10 @@ unsigned char *PQescapeBytea(const unsigned char *from, of a bytea literal in an SQL statement. In general, to escape a byte, it is converted into the three digit octal number equal to the octet value, and preceded by - two backslashes. The single quote (') and backslash + one or two backslashes. The single quote (') and backslash (\) characters have special alternative escape sequences. See for more - information. PQescapeBytea performs this + information. PQescapeByteaConn performs this operation, escaping only the minimally required bytes. @@ -2290,16 +2324,15 @@ unsigned char *PQescapeBytea(const unsigned char *from, bytes in this binary string. 
(A terminating zero byte is neither necessary nor counted.) The to_length parameter points to a variable that will hold the resultant - escaped string length. The result string length includes the terminating + escaped string length. This result string length includes the terminating zero byte of the result. - PQescapeBytea returns an escaped version of the + PQescapeByteaConn returns an escaped version of the from parameter binary string in memory - allocated with malloc() (a null pointer is returned if - memory could not be allocated). This memory must be freed using - PQfreemem when the result is no longer needed. The + allocated with malloc(). This memory must be freed using + PQfreemem() when the result is no longer needed. The return string has all special characters replaced so that they can be properly processed by the PostgreSQL string literal parser, and the bytea input function. A @@ -2307,6 +2340,45 @@ unsigned char *PQescapeBytea(const unsigned char *from, surround PostgreSQL string literals are not part of the result string. + + + On error, a NULL pointer is returned, and a suitable error message + is stored in the conn object. Currently, the only + possible error is insufficient memory for the result string. + + + + + + PQescapeByteaPQescapeBytea + + + PQescapeBytea is an older, deprecated version of + PQescapeByteaConn. + +unsigned char *PQescapeBytea(const unsigned char *from, + size_t from_length, + size_t *to_length); + + + + + The only difference from PQescapeByteaConn is that + PQescapeBytea does not + take a PGconn parameter. Because of this, it cannot adjust + its behavior depending on the connection properties (in particular, + whether standard-conforming strings are enabled) + and therefore it may give the wrong results. Also, it + has no way to return an error message on failure. 
+ + + + PQescapeBytea can be used safely in single-threaded client + programs that work with only one PostgreSQL connection at + a time (in this case it can find out what it needs to know behind the + scenes). In other contexts it is a security hazard and should be + avoided in favor of PQescapeByteaConn. + @@ -2314,7 +2386,7 @@ unsigned char *PQescapeBytea(const unsigned char *from, PQunescapeByteaPQunescapeBytea - Converts an escaped string representation of binary data into binary + Converts a string representation of binary data into binary data — the reverse of PQescapeBytea. This is needed when retrieving bytea data in text format, but not when retrieving it in binary format. @@ -2324,16 +2396,24 @@ unsigned char *PQunescapeBytea(const unsigned char *from, size_t *to_length); - - The from parameter points to an escaped string - such as might be returned by PQgetvalue when applied to a - bytea column. PQunescapeBytea converts - this string representation into its binary representation. + + The from parameter points to a string + such as might be returned by PQgetvalue when applied + to a bytea column. PQunescapeBytea + converts this string representation into its binary representation. It returns a pointer to a buffer allocated with malloc(), or null on error, and puts the size of the buffer in to_length. The result must be freed using PQfreemem when it is no longer needed. + + + This conversion is not exactly the inverse of + PQescapeBytea, because the string is not expected + to be escaped when received from PQgetvalue. + In particular this means there is no need for string quoting considerations, + and so no need for a PGconn parameter. + @@ -2349,6 +2429,7 @@ void PQfreemem(void *ptr); Frees memory allocated by libpq, particularly + PQescapeByteaConn, PQescapeBytea, PQunescapeBytea, and PQnotifies. @@ -4000,9 +4081,9 @@ current connection parameters will be used. (Therefore, put more-specific entries first when you are using wildcards.) 
If an entry needs to contain : or \, escape this character with \. -A hostname of localhost matches both TCP host (hostname localhost) -and Unix domain socket local (pghost empty or the default socket directory) -connections coming from the local machine. +A hostname of localhost matches both TCP (hostname +localhost) and Unix domain socket (pghost empty or the +default socket directory) connections coming from the local machine. diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt index 9cf43cfc24..7fcd43b01a 100644 --- a/src/interfaces/libpq/exports.txt +++ b/src/interfaces/libpq/exports.txt @@ -1,4 +1,4 @@ -# $PostgreSQL: pgsql/src/interfaces/libpq/exports.txt,v 1.7 2005/12/26 14:58:05 petere Exp $ +# $PostgreSQL: pgsql/src/interfaces/libpq/exports.txt,v 1.8 2006/05/21 20:19:23 tgl Exp $ # Functions to be exported by libpq DLLs PQconnectdb 1 PQsetdbLogin 2 @@ -125,4 +125,6 @@ PQcancel 122 lo_create 123 PQinitSSL 124 PQregisterThreadLock 125 -PQencryptPassword 126 +PQescapeStringConn 126 +PQescapeByteaConn 127 +PQencryptPassword 128 diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c index 47884bdea5..6634758948 100644 --- a/src/interfaces/libpq/fe-connect.c +++ b/src/interfaces/libpq/fe-connect.c @@ -8,7 +8,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/interfaces/libpq/fe-connect.c,v 1.331 2006/05/19 14:26:58 alvherre Exp $ + * $PostgreSQL: pgsql/src/interfaces/libpq/fe-connect.c,v 1.332 2006/05/21 20:19:23 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -1828,6 +1828,7 @@ makeEmptyPGconn(void) conn->nonblocking = false; conn->setenv_state = SETENV_STATE_IDLE; conn->client_encoding = PG_SQL_ASCII; + conn->std_strings = false; /* unless server says differently */ conn->verbosity = PQERRORS_DEFAULT; conn->sock = -1; #ifdef USE_SSL @@ -2944,8 +2945,14 @@ PQsetClientEncoding(PGconn *conn, const char *encoding) status = -1; else { - /* change libpq internal 
encoding */ - conn->client_encoding = pg_char_to_encoding(encoding); + /* + * In protocol 2 we have to assume the setting will stick, and + * adjust our state immediately. In protocol 3 and up we can + * rely on the backend to report the parameter value, and we'll + * change state at that time. + */ + if (PG_PROTOCOL_MAJOR(conn->pversion) < 3) + pqSaveParameterStatus(conn, "client_encoding", encoding); status = 0; /* everything is ok */ } PQclear(res); diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c index 7f09ff6dd2..3139398946 100644 --- a/src/interfaces/libpq/fe-exec.c +++ b/src/interfaces/libpq/fe-exec.c @@ -8,7 +8,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/interfaces/libpq/fe-exec.c,v 1.182 2006/03/14 22:48:23 tgl Exp $ + * $PostgreSQL: pgsql/src/interfaces/libpq/fe-exec.c,v 1.183 2006/05/21 20:19:23 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -40,6 +40,12 @@ char *const pgresStatus[] = { "PGRES_FATAL_ERROR" }; +/* + * static state needed by PQescapeString and PQescapeBytea; initialize to + * values that result in backward-compatible behavior + */ +static int static_client_encoding = PG_SQL_ASCII; +static bool static_std_strings = false; static bool PQsendQueryStart(PGconn *conn); @@ -609,11 +615,22 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value) } /* - * Special hacks: remember client_encoding as a numeric value, and convert - * server version to a numeric form as well. + * Special hacks: remember client_encoding and standard_conforming_strings, + * and convert server version to a numeric form. We keep the first two of + * these in static variables as well, so that PQescapeString and + * PQescapeBytea can behave somewhat sanely (at least in single- + * connection-using programs). 
*/ if (strcmp(name, "client_encoding") == 0) + { conn->client_encoding = pg_char_to_encoding(value); + static_client_encoding = conn->client_encoding; + } + else if (strcmp(name, "standard_conforming_strings") == 0) + { + conn->std_strings = (strcmp(value, "on") == 0); + static_std_strings = conn->std_strings; + } else if (strcmp(name, "server_version") == 0) { int cnt; @@ -2367,7 +2384,7 @@ PQfreeNotify(PGnotify *notify) /* * Escaping arbitrary strings to get valid SQL literal strings. * - * Replaces "\\" with "\\\\" and "'" with "''". + * Replaces "'" with "''", and if not std_strings, replaces "\" with "\\". * * length is the length of the source string. (Note: if a terminating NUL * is encountered sooner, PQescapeString stops short of "length"; the behavior @@ -2379,19 +2396,74 @@ PQfreeNotify(PGnotify *notify) * * Returns the actual length of the output (not counting the terminating NUL). */ -size_t -PQescapeString(char *to, const char *from, size_t length) +static size_t +PQescapeStringInternal(PGconn *conn, + char *to, const char *from, size_t length, + int *error, + int encoding, bool std_strings) { const char *source = from; char *target = to; size_t remaining = length; + if (error) + *error = 0; + while (remaining > 0 && *source != '\0') { - if (SQL_STR_DOUBLE(*source)) - *target++ = *source; - *target++ = *source++; - remaining--; + char c = *source; + int len; + int i; + + /* Fast path for plain ASCII */ + if (!IS_HIGHBIT_SET(c)) + { + /* Apply quoting if needed */ + if (c == '\'' || + (c == '\\' && !std_strings)) + *target++ = c; + /* Copy the character */ + *target++ = c; + source++; + remaining--; + continue; + } + + /* Slow path for possible multibyte characters */ + len = pg_encoding_mblen(encoding, source); + + /* Copy the character */ + for (i = 0; i < len; i++) + { + if (remaining == 0 || *source == '\0') + break; + *target++ = *source++; + remaining--; + } + + /* + * If we hit premature end of string (ie, incomplete multibyte + * character), 
try to pad out to the correct length with spaces. + * We may not be able to pad completely, but we will always be able + * to insert at least one pad space (since we'd not have quoted a + * multibyte character). This should be enough to make a string that + * the server will error out on. + */ + if (i < len) + { + if (error) + *error = 1; + if (conn) + printfPQExpBuffer(&conn->errorMessage, + libpq_gettext("incomplete multibyte character\n")); + for (; i < len; i++) + { + if (((size_t) (target - to)) / 2 >= length) + break; + *target++ = ' '; + } + break; + } } /* Write the terminating NUL character. */ @@ -2400,72 +2472,109 @@ PQescapeString(char *to, const char *from, size_t length) return target - to; } +size_t +PQescapeStringConn(PGconn *conn, + char *to, const char *from, size_t length, + int *error) +{ + if (!conn) + { + /* force empty-string result */ + *to = '\0'; + if (error) + *error = 1; + return 0; + } + return PQescapeStringInternal(conn, to, from, length, error, + conn->client_encoding, + conn->std_strings); +} + +size_t +PQescapeString(char *to, const char *from, size_t length) +{ + return PQescapeStringInternal(NULL, to, from, length, NULL, + static_client_encoding, + static_std_strings); +} + /* * PQescapeBytea - converts from binary string to the * minimal encoding necessary to include the string in an SQL * INSERT statement with a bytea type column as the target. * * The following transformations are applied - * '\0' == ASCII 0 == \\000 - * '\'' == ASCII 39 == \' - * '\\' == ASCII 92 == \\\\ - * anything < 0x20, or > 0x7e ---> \\ooo + * '\0' == ASCII 0 == \000 + * '\'' == ASCII 39 == '' + * '\\' == ASCII 92 == \\ + * anything < 0x20, or > 0x7e ---> \ooo * (where ooo is an octal expression) + * If not std_strings, all backslashes sent to the output are doubled. 
*/ -unsigned char * -PQescapeBytea(const unsigned char *bintext, size_t binlen, size_t *bytealen) +static unsigned char * +PQescapeByteaInternal(PGconn *conn, + const unsigned char *from, size_t from_length, + size_t *to_length, bool std_strings) { const unsigned char *vp; unsigned char *rp; unsigned char *result; size_t i; size_t len; + size_t bslash_len = (std_strings ? 1 : 2); /* * empty string has 1 char ('\0') */ len = 1; - vp = bintext; - for (i = binlen; i > 0; i--, vp++) + vp = from; + for (i = from_length; i > 0; i--, vp++) { if (*vp < 0x20 || *vp > 0x7e) - len += 5; /* '5' is for '\\ooo' */ + len += bslash_len + 3; else if (*vp == '\'') len += 2; else if (*vp == '\\') - len += 4; + len += bslash_len + bslash_len; else len++; } + *to_length = len; rp = result = (unsigned char *) malloc(len); if (rp == NULL) + { + if (conn) + printfPQExpBuffer(&conn->errorMessage, + libpq_gettext("out of memory\n")); return NULL; + } - vp = bintext; - *bytealen = len; - - for (i = binlen; i > 0; i--, vp++) + vp = from; + for (i = from_length; i > 0; i--, vp++) { if (*vp < 0x20 || *vp > 0x7e) { - (void) sprintf((char *) rp, "\\\\%03o", *vp); - rp += 5; + if (!std_strings) + *rp++ = '\\'; + (void) sprintf((char *) rp, "\\%03o", *vp); + rp += 4; } else if (*vp == '\'') { - rp[0] = '\''; - rp[1] = '\''; - rp += 2; + *rp++ = '\''; + *rp++ = '\''; } else if (*vp == '\\') { - rp[0] = '\\'; - rp[1] = '\\'; - rp[2] = '\\'; - rp[3] = '\\'; - rp += 4; + if (!std_strings) + { + *rp++ = '\\'; + *rp++ = '\\'; + } + *rp++ = '\\'; + *rp++ = '\\'; } else *rp++ = *vp; @@ -2475,6 +2584,25 @@ PQescapeBytea(const unsigned char *bintext, size_t binlen, size_t *bytealen) return result; } +unsigned char * +PQescapeByteaConn(PGconn *conn, + const unsigned char *from, size_t from_length, + size_t *to_length) +{ + if (!conn) + return NULL; + return PQescapeByteaInternal(conn, from, from_length, to_length, + conn->std_strings); +} + +unsigned char * +PQescapeBytea(const unsigned char *from, size_t 
from_length, size_t *to_length) +{ + return PQescapeByteaInternal(NULL, from, from_length, to_length, + static_std_strings); +} + + #define ISFIRSTOCTDIGIT(CH) ((CH) >= '0' && (CH) <= '3') #define ISOCTDIGIT(CH) ((CH) >= '0' && (CH) <= '7') #define OCTVAL(CH) ((CH) - '0') @@ -2484,7 +2612,7 @@ PQescapeBytea(const unsigned char *bintext, size_t binlen, size_t *bytealen) * of a bytea, strtext, into binary, filling a buffer. It returns a * pointer to the buffer (or NULL on error), and the size of the * buffer in retbuflen. The pointer may subsequently be used as an - * argument to the function free(3). It is the reverse of PQescapeBytea. + * argument to the function PQfreemem. * * The following transformations are made: * \\ == ASCII 92 == \ diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h index 0a4263f996..c309448bac 100644 --- a/src/interfaces/libpq/libpq-fe.h +++ b/src/interfaces/libpq/libpq-fe.h @@ -7,7 +7,7 @@ * Portions Copyright (c) 1996-2006, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * - * $PostgreSQL: pgsql/src/interfaces/libpq/libpq-fe.h,v 1.127 2006/04/27 00:53:58 momjian Exp $ + * $PostgreSQL: pgsql/src/interfaces/libpq/libpq-fe.h,v 1.128 2006/05/21 20:19:23 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -427,11 +427,18 @@ extern PGresult *PQmakeEmptyPGresult(PGconn *conn, ExecStatusType status); /* Quoting strings before inclusion in queries. 
 */
-extern size_t PQescapeString(char *to, const char *from, size_t length);
-extern unsigned char *PQescapeBytea(const unsigned char *bintext, size_t binlen,
-			  size_t *bytealen);
+extern size_t PQescapeStringConn(PGconn *conn,
+				   char *to, const char *from, size_t length,
+				   int *error);
+extern unsigned char *PQescapeByteaConn(PGconn *conn,
+				   const unsigned char *from, size_t from_length,
+				   size_t *to_length);
 extern unsigned char *PQunescapeBytea(const unsigned char *strtext,
 				size_t *retbuflen);
+/* These forms are deprecated! */
+extern size_t PQescapeString(char *to, const char *from, size_t length);
+extern unsigned char *PQescapeBytea(const unsigned char *from, size_t from_length,
+			  size_t *to_length);

diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 39533452ec..e72d4d911c 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -12,7 +12,7 @@
  * Portions Copyright (c) 1996-2006, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
- * $PostgreSQL: pgsql/src/interfaces/libpq/libpq-int.h,v 1.112 2006/03/14 22:48:23 tgl Exp $
+ * $PostgreSQL: pgsql/src/interfaces/libpq/libpq-int.h,v 1.113 2006/05/21 20:19:23 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -309,6 +309,7 @@ struct pg_conn
 	char		cryptSalt[2];	/* password salt received from backend */
 	pgParameterStatus *pstatus; /* ParameterStatus data */
 	int			client_encoding;	/* encoding id */
+	bool		std_strings;	/* standard_conforming_strings */
 	PGVerbosity verbosity;		/* error/notice message verbosity */
 	PGlobjfuncs *lobjfuncs;		/* private state for large-object access fns */

-- 
2.40.0
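
Illustration (not part of the patch): a minimal sketch of the escaping rules this patch introduces, written without libpq so it stands alone. The helper names sketch_escape_ascii and sketch_bytea_escaped_len are hypothetical and do not exist in libpq. The first mirrors only the ASCII fast path of PQescapeStringInternal and deliberately omits the multibyte handling. The second mirrors the sizing pass of PQescapeByteaInternal. Both take std_strings as an explicit argument, an assumption made here in place of reading it from a PGconn.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, NOT a libpq function: ASCII-only version of the
 * quoting rule.  "to" must have room for 2 * length + 1 bytes, the same
 * bound the documentation gives for PQescapeStringConn.
 */
static size_t
sketch_escape_ascii(char *to, const char *from, size_t length,
                    bool std_strings)
{
    char   *target = to;
    size_t  i;

    assert(to != NULL && from != NULL);

    /* Stop at "length" bytes or at an embedded zero byte, whichever is first. */
    for (i = 0; i < length && from[i] != '\0'; i++)
    {
        char c = from[i];

        /* A quote is always doubled; a backslash only when !std_strings. */
        if (c == '\'' || (c == '\\' && !std_strings))
            *target++ = c;
        *target++ = c;
    }
    *target = '\0';
    return (size_t) (target - to);
}

/*
 * Hypothetical helper, NOT a libpq function: the escaped length, including
 * the terminating zero byte, as computed by the bytea sizing loop.
 */
static size_t
sketch_bytea_escaped_len(const unsigned char *from, size_t from_length,
                         bool std_strings)
{
    size_t  bslash_len = std_strings ? 1 : 2;
    size_t  len = 1;            /* terminating zero byte */
    size_t  i;

    for (i = 0; i < from_length; i++)
    {
        if (from[i] < 0x20 || from[i] > 0x7e)
            len += bslash_len + 3;          /* one or two backslashes + ooo */
        else if (from[i] == '\'')
            len += 2;                       /* quote is doubled */
        else if (from[i] == '\\')
            len += bslash_len + bslash_len; /* doubled, then doubled again if !std_strings */
        else
            len++;
    }
    return len;
}
```

With the five-byte input a'b\c, the string helper writes 7 bytes when std_strings is false (both the quote and the backslash are doubled) and 6 bytes when it is true (only the quote is doubled). Real client code should call PQescapeStringConn and PQescapeByteaConn, which also handle multibyte encodings and report errors through the connection.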