Tom Lane [Mon, 31 Oct 2011 20:40:27 +0000 (16:40 -0400)]
Stop btree indexscans upon reaching nulls in either direction.
The existing scan-direction-sensitive tests were overly complex, and
failed to stop the scan in cases where it's perfectly legitimate to do so.
Per bug #6278 from Maksym Boguk.
Back-patch to 8.3, which is as far back as the patch applies easily.
Doesn't seem worth sweating over a relatively minor performance issue in
8.2 at this late date. (But note that this was a performance regression
from 8.1 and before, so 8.2 is being left as an outlier.)
Tom Lane [Sat, 29 Oct 2011 18:31:12 +0000 (14:31 -0400)]
Fix assorted bogosities in cash_in() and cash_out().
cash_out failed to handle multiple-byte thousands separators, as per bug
#6277 from Alexander Law. In addition, cash_in didn't handle that either,
nor could it handle multiple-byte positive_sign. Both routines failed to
support multiple-byte mon_decimal_point, which I did not think was worth
changing, but at least now they check for the possibility and fall back to
using '.' rather than emitting invalid output. Also, make cash_in handle
trailing negative signs, which formerly it would reject. Since cash_out
generates trailing negative signs whenever the locale tells it to, this
last omission represents a fail-to-reload-dumped-data bug. IMO that
justifies patching this all the way back.
Tom Lane [Wed, 26 Oct 2011 17:02:53 +0000 (13:02 -0400)]
Change FK trigger creation order to better support self-referential FKs.
When a foreign-key constraint references another column of the same table,
row updates will queue both the PK's ON UPDATE action and the FK's CHECK
action in the same event. The ON UPDATE action must execute first, else
the CHECK will check a non-final state of the row and possibly throw an
inappropriate error, as seen in bug #6268 from Roman Lytovchenko.
Now, the firing order of multiple triggers for the same event is determined
by the sort order of their pg_trigger.tgnames, and the auto-generated names
we use for FK triggers are "RI_ConstraintTrigger_NNNN" where NNNN is the
trigger OID. So most of the time the firing order is the same as creation
order, and so rearranging the creation order fixes it.
This patch will fail to fix the problem if the OID counter wraps around or
adds a decimal digit (eg, from 99999 to 100000) while we are creating the
triggers for an FK constraint. Given the small odds of that, and the low
usage of self-referential FKs, we'll live with that solution in the back
branches. A better fix is to change the auto-generated names for FK
triggers, but it seems unwise to do that in stable branches because there
may be client code that depends on the naming convention. We'll fix it
that way in HEAD in a separate patch.
Back-patch to all supported branches, since this bug has existed for a long
time.
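
The digit-count caveat above can be illustrated with a short Python sketch. It assumes only what the message states, namely that firing order follows the sort order of the trigger names, and models that with ordinary string sorting. The OID values and the helper name firing_order are made up for illustration.

```python
# Sketch only: models name-ordered trigger firing with plain string sorting.
# The OIDs used below are hypothetical.

def firing_order(oids):
    """Return FK trigger names in the order a name-based sort would fire them."""
    names = ["RI_ConstraintTrigger_%d" % oid for oid in oids]
    return sorted(names)

# Same number of digits: name order matches creation order.
same_width = firing_order([16401, 16402])

# The OID counter adds a digit between the two creations: the trigger
# created later now sorts first, because "1" sorts before "9".
digit_added = firing_order([99999, 100000])
```

In the second case the later-created trigger fires first, which is the residual failure mode the message accepts for the back branches.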
Tom Lane [Tue, 18 Oct 2011 21:11:18 +0000 (17:11 -0400)]
Fix pg_dump to dump casts between auto-generated types.
The heuristic for when to dump a cast failed for a cast between table
rowtypes, as reported by Frédéric Rejol. Fix it by setting
the "dump" flag for such a type the same way as the flag is set for the
underlying table or base type. This won't result in the auto-generated
type appearing in the output, since setting its objType to DO_DUMMY_TYPE
unconditionally suppresses that. But it will result in dumpCast doing what
was intended.
Back-patch to 8.3. The 8.2 code is rather different in this area, and it
doesn't seem worth any risk to fix a corner case that nobody has stumbled
on before.
Tom Lane [Sat, 15 Oct 2011 00:24:50 +0000 (20:24 -0400)]
Fix bugs in information_schema.referential_constraints view.
This view was being insufficiently careful about matching the FK constraint
to the depended-on primary or unique key constraint. That could result in
failure to show an FK constraint at all, or showing it multiple times, or
claiming that it depended on a different constraint than the one it really
does. Fix by joining via pg_depend to ensure that we find only the correct
dependency.
Back-patch, but don't bump catversion because we can't force initdb in back
branches. The next minor-version release notes should explain that if you
need to fix this in an existing installation, you can drop the
information_schema schema then re-create it by sourcing
$SHAREDIR/information_schema.sql in each database (as a superuser of
course).
Tom Lane [Wed, 12 Oct 2011 17:59:30 +0000 (13:59 -0400)]
Improve documentation of psql's \q command.
The documentation neglected to explain its behavior in a script file
(it only ends execution of the script, not psql as a whole), and failed
to mention the long form \quit either.
Don't let transform_null_equals=on affect CASE foo WHEN NULL ... constructs.
transform_null_equals is only supposed to affect "foo = NULL" expressions
given directly by the user, not the internal "foo = NULL" expression
generated from CASE-WHEN.
This fixes bug #6242, reported by Sergey. Backpatch to all supported
branches.
Robert Haas [Thu, 6 Oct 2011 16:08:59 +0000 (12:08 -0400)]
Make pgstatindex respond to cancel interrupts.
A similar problem for pgstattuple() was fixed in April of 2010 by commit
33065ef8bc52253ae855bc959576e52d8a28ba06, but pgstatindex() seems to have
been overlooked.
Back-patch all the way, as with that commit, though not to 7.4 through
8.1, since those are now EOL.
Tom Lane [Sat, 24 Sep 2011 02:12:36 +0000 (22:12 -0400)]
Fix our mapping of Windows timezones for Central America.
We were mapping "Central America Standard Time" to "CST6CDT", which seems
entirely wrong, because according to the Olson timezone database no place
in Central America observes daylight saving time on any regular basis ---
and certainly not according to the USA DST rules that are implied by
"CST6CDT". (Mexico is an exception, but they can be disregarded since
they have a separate timezone name in Windows.) So, map this zone name to
plain "CST6", which will provide a fixed UTC offset.
As written, this patch will also result in mapping "Central America
Daylight Time" to CST6. I considered hacking things so that would still
map to CST6CDT, but it seems it would confuse win32tzlist.pl to put those
two names in separate entries. Since there's little evidence that any
such zone name is used in the wild, much less that CST6CDT would be a good
match for it, I'm not too worried about what we do with it.
Tom Lane [Fri, 16 Sep 2011 08:28:11 +0000 (04:28 -0400)]
gistendscan() forgot to free so->giststate.
This oversight led to a massive memory leak --- upwards of 10KB per tuple
--- during creation-time verification of an exclusion constraint based on a
GIST index. In most other scenarios it'd just be a leak of 10KB that would
be recovered at end of query, so not too significant; though perhaps the
leak would be noticeable in a situation where a GIST index was being used
in a nestloop inner indexscan. In any case, it's a real leak of long
standing, so patch all supported branches. Per report from Harald Fuchs.
Tom Lane [Wed, 7 Sep 2011 21:06:39 +0000 (17:06 -0400)]
Fix corner case bug in numeric to_char().
Trailing-zero stripping applied by the FM specifier could strip zeroes
to the left of the decimal point, for a format with no digit positions
after the decimal point (such as "FM999.").
Reported and diagnosed by Marti Raudsepp, though I didn't use his patch.
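
A rough Python sketch of the intended rule follows. It is not the formatting code itself: representing the number as a digit string plus a count of fractional digits is an assumption made for illustration, and both function names are hypothetical.

```python
# Sketch only: "digits" holds all digits of the formatted number and
# "frac_digits" says how many of them sit to the right of the decimal point.

def strip_trailing_zeros(digits, frac_digits):
    """Strip trailing zeroes, but never more than frac_digits of them."""
    kept = len(digits)
    limit = len(digits) - frac_digits  # index of the first fractional digit
    while kept > limit and digits[kept - 1] == "0":
        kept -= 1
    return digits[:kept]

def naive_strip(digits):
    """The failure mode: strips zeroes without regard to the decimal point."""
    return digits.rstrip("0")
```

With no fractional digit positions, as in "FM999.", the bounded version leaves "100" alone, while the naive one would reduce it to "1".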
Tom Lane [Tue, 6 Sep 2011 18:50:28 +0000 (14:50 -0400)]
Avoid possibly accessing off the end of memory in SJIS2004 conversion.
The code in shift_jis_20042euc_jis_2004() would fetch two bytes even when
only one remained in the string. Since conversion functions aren't
supposed to assume null-terminated input, this poses a small risk of
fetching past the end of memory and incurring SIGSEGV. No such crash has
been identified in the field, but we've certainly seen the equivalent
happen in other code paths, so patch this one all the way back.
Tom Lane [Tue, 6 Sep 2011 18:35:55 +0000 (14:35 -0400)]
Avoid possibly accessing off the end of memory in examine_attribute().
Since the last couple of columns of pg_type are often NULL,
sizeof(FormData_pg_type) can be an overestimate of the actual size of the
tuple data part. Therefore memcpy'ing that much out of the catalog cache,
as analyze.c was doing, poses a small risk of copying past the end of
memory and incurring SIGSEGV. No such crash has been identified in the
field, but we've certainly seen the equivalent happen in other code paths,
so patch this one all the way back.
Per valgrind testing by Noah Misch, though this is not his proposed patch.
I chose to use SearchSysCacheCopy1 rather than inventing special-purpose
infrastructure for copying only the minimal part of a pg_type tuple.
Tom Lane [Tue, 6 Sep 2011 16:14:51 +0000 (12:14 -0400)]
Update type-conversion documentation for long-ago changes.
This example wasn't updated when we changed the behavior of bpcharlen()
in 8.0, nor when we changed the number of parameters taken by the bpchar()
cast function in 7.3. Per report from lsliang.
Tom Lane [Sat, 3 Sep 2011 20:17:57 +0000 (16:17 -0400)]
Fix typo in pg_srand48 (srand48 in older branches).
">" should be ">>". This typo results in failure to use all of the bits
of the provided seed.
This might rise to the level of a security bug if we were relying on
srand48 for any security-critical purposes, but we are not --- in fact,
it's not used at all unless the platform lacks srandom(), which is
improbable. Even on such a platform the exposure seems minimal.
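
The effect of the typo can be shown with a small Python sketch. It assumes the shift was meant to extract the upper 16 bits of a 32-bit seed; the function names are hypothetical and the arithmetic only mimics what the C expression would yield.

```python
# Sketch only: compares "seed > 16" with "seed >> 16" for a 32-bit seed.

def high_word_typo(seed):
    """With ">", the expression is a comparison, so it yields only 0 or 1."""
    return int(seed > 16) & 0xFFFF

def high_word_fixed(seed):
    """With ">>", the upper 16 bits of the seed are kept."""
    return (seed >> 16) & 0xFFFF

seed = 0x12345678
```

Under this assumption, any two seeds larger than 16 contribute the same value through the typo'd expression, so their upper bits never reach the generator state.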
Move the line to undefine the setlocale() macro on Win32 outside the
USE_REPL_SNPRINTF ifdef block. It has nothing to do with whether the
replacement snprintf function is used. It caused no live bug, because the
replacement snprintf function is always used on Win32, but it was
nevertheless misplaced.
The version of this macro used in autoconf 2.59 is capable of incorrectly
succeeding (ie, reporting that a library function is available when it
isn't), if the compiler performs link-time optimization and decides that
it can optimize the function reference away entirely. Replace it with the
coding used in autoconf 2.61 and later, which forces the program result to
depend on the function's result so that it cannot be optimized away. This
should fix build failures currently being seen on buildfarm member anchovy.
This patch affects the 8.2 and 8.3 branches only, since later branches are
using autoconf versions that don't have this problem.
Tom Lane [Sat, 27 Aug 2011 20:37:17 +0000 (16:37 -0400)]
Don't assume that "E" response to NEGOTIATE_SSL_CODE means pre-7.0 server.
These days, such a response is far more likely to signify a server-side
problem, such as fork failure. Reporting "server does not support SSL"
(in sslmode=require) could be quite misleading. But the results could
be even worse in sslmode=prefer: if the problem was transient and the
next connection attempt succeeds, we'll have silently fallen back to
protocol version 2.0, possibly disabling features the user needs.
Hence, it seems best to just eliminate the assumption that backing off
to non-SSL/2.0 protocol is the way to recover from an "E" response, and
instead treat the server error the same as we would in non-SSL cases.
I tested this change against a pre-7.0 server, and found that there
was a second logic bug in the "prefer" path: the test to decide whether
to make a fallback connection attempt assumed that we must have opened
conn->ssl, which in fact does not happen given an "E" response. After
fixing that, the code does indeed connect successfully to pre-7.0,
as long as you didn't set sslmode=require. (If you did, you get
"Unsupported frontend protocol", which isn't completely off base
given the server certainly doesn't support SSL.)
Since there seems no reason to believe that pre-7.0 servers exist anymore
in the wild, back-patch to all supported branches.
Tom Lane [Sat, 27 Aug 2011 18:16:35 +0000 (14:16 -0400)]
Ensure we discard unread/unsent data when abandoning a connection attempt.
There are assorted situations wherein PQconnectPoll() will abandon a
connection attempt and try again with different parameters (eg, SSL versus
not SSL). However, the code forgot to discard any pending data in libpq's
I/O buffers when doing this. In at least one case (server returns E
message during SSL negotiation), there is unread input data which bollixes
the next connection attempt. I have not checked to see whether this is
possible in the other cases where we close the socket and retry, but it
seems like a matter of good defensive programming to add explicit
buffer-flushing code to all of them.
This is one of several issues exposed by Daniel Farina's report of
misbehavior after a server-side fork failure.
This has been wrong since forever, so back-patch to all supported branches.
Tom Lane [Fri, 26 Aug 2011 20:51:57 +0000 (16:51 -0400)]
Fix potential memory clobber in tsvector_concat().
tsvector_concat() allocated its result workspace using the "conservative"
estimate of the sum of the two input tsvectors' sizes. Unfortunately that
wasn't so conservative as all that, because it supposed that the number of
pad bytes required could not grow. Which it can, as per test case from
Jesper Krogh, if there's a mix of lexemes with positions and lexemes
without them in the input data. The fix is to assume that we might add
a not-previously-present pad byte for each and every lexeme in the two
inputs; which really is conservative, but it doesn't seem worthwhile to
try to be more precise.
This is an aboriginal bug in tsvector_concat, so back-patch to all
versions containing it.
Tom Lane [Thu, 25 Aug 2011 03:50:31 +0000 (23:50 -0400)]
Fix pgstatindex() to give consistent results for empty indexes.
For an empty index, the pgstatindex() function would compute 0.0/0.0 for
its avg_leaf_density and leaf_fragmentation outputs. On machines that
follow the IEEE float arithmetic standard with any care, that results in
a NaN. However, per report from Rushabh Lathia, Microsoft couldn't
manage to get this right, so you'd get a bizarre error on Windows.
Fix by forcing the results to be NaN explicitly, rather than relying on
the division operator to give that or the snprintf function to print it
correctly. I have some doubts that this is really the most useful
definition, but it seems better to remain backward-compatible with
those platforms for which the behavior wasn't completely broken.
Back-patch to 8.2, since the code is like that in all current releases.
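
The approach of forcing NaN rather than dividing can be sketched in Python as follows. The function and parameter names are hypothetical, and the percentage formula is an assumption for illustration, not taken from pgstatindex().

```python
import math

# Sketch only: return NaN explicitly for the empty case instead of
# relying on what 0.0/0.0 happens to do on a given platform.

def avg_leaf_density(used_bytes, max_avail_bytes):
    """Return a percentage, or NaN when there is no leaf space at all."""
    if max_avail_bytes == 0:
        return math.nan  # explicit NaN, no division attempted
    return 100.0 * used_bytes / max_avail_bytes
```

Python itself is a case in point for not trusting the division: evaluating 0.0/0.0 there raises ZeroDivisionError rather than producing NaN.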
Tom Lane [Sat, 20 Aug 2011 18:51:02 +0000 (14:51 -0400)]
Fix performance problem when building a lossy tidbitmap.
As pointed out by Sergey Koposov, repeated invocations of tbm_lossify can
make building a large tidbitmap into an O(N^2) operation. To fix, make
sure we remove more than the minimum amount of information per call, and
add a fallback path to behave sanely if we're unable to fit the bitmap
within the requested amount of memory.
This has been wrong since the tidbitmap code was written, so back-patch
to all supported branches.
Tom Lane [Tue, 16 Aug 2011 17:12:23 +0000 (13:12 -0400)]
Fix race condition in relcache init file invalidation.
The previous code tried to synchronize by unlinking the init file twice,
but that doesn't actually work: it leaves a window wherein a third process
could read the already-stale init file but miss the SI messages that would
tell it the data is stale. The result would be bizarre failures in catalog
accesses, typically "could not read block 0 in file ..." later during
startup.
Instead, hold RelCacheInitLock across both the unlink and the sending of
the SI messages. This is more straightforward, and might even be a bit
faster since only one unlink call is needed.
This has been wrong since it was put in (in 2002!), so back-patch to all
supported releases.
Tom Lane [Thu, 28 Jul 2011 18:07:23 +0000 (14:07 -0400)]
Fix pg_restore's direct-to-database mode for standard_conforming_strings.
pg_backup_db.c contained a mini SQL lexer with which it tried to identify
boundaries between SQL commands, but that code was not designed to cope
with standard_conforming_strings, and would get the wrong answer if a
backslash immediately precedes a closing single quote in such a string,
as per report from Julian Mehnle. The bug only affects direct-to-database
restores from archive files made with standard_conforming_strings = on.
Rather than complicating the code some more to try to fix that, let's just
rip it all out. The only reason it was needed was to cope with COPY data
embedded into ordinary archive entries, which was a layout that was used
only for about the first three weeks of the archive format's existence,
and never in any production release of pg_dump. Instead, just rely on the
archive file layout to tell us whether we're printing COPY data or not.
This bug represents a data corruption hazard in all releases in which
standard_conforming_strings can be turned on, ie 8.2 and later, so
back-patch to all supported branches.
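
The ambiguity that tripped up the mini lexer can be sketched in Python. This is not the removed pg_backup_db.c code and not the fix (which drops the lexer entirely). It is a simplified scanner, with a hypothetical name, that ignores E'' literals, comments, and dollar quoting.

```python
# Sketch only: finds the end of a single-quoted literal under both settings
# of standard_conforming_strings.

def literal_end(sql, start, standard_conforming_strings):
    """Return the index just past the closing quote of the literal that
    opens at sql[start], or -1 if the literal is unterminated."""
    assert sql[start] == "'"
    i = start + 1
    while i < len(sql):
        ch = sql[i]
        if ch == "\\" and not standard_conforming_strings:
            i += 2  # backslash escapes the next character
            continue
        if ch == "'":
            if i + 1 < len(sql) and sql[i + 1] == "'":
                i += 2  # doubled quote is an embedded quote
                continue
            return i + 1
        i += 1
    return -1

stmt = r"INSERT INTO t VALUES ('C:\');"
start = stmt.index("'")
```

With standard_conforming_strings on, the literal ends right after the backslash. A scanner that treats the backslash as an escape never finds the closing quote and so misjudges where the command ends.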
Tom Lane [Mon, 25 Jul 2011 03:29:27 +0000 (23:29 -0400)]
Fix previous patch so it also works if not USE_SSL (mea culpa).
On balance, the need to cover this case changes my mind in favor of pushing
all error-message generation duties into the two fe-secure.c routines.
So do it that way.
Tom Lane [Sun, 24 Jul 2011 20:29:30 +0000 (16:29 -0400)]
Improve libpq's error reporting for SSL failures.
In many cases, pqsecure_read/pqsecure_write set up useful error messages,
which were then overwritten with useless ones by their callers. Fix this
by defining the responsibility to set an error message to be entirely that
of the lower-level function when using SSL.
Back-patch to 8.3; the code is too different in 8.2 to be worth the
trouble.
Tom Lane [Sun, 24 Jul 2011 19:18:12 +0000 (15:18 -0400)]
Use OpenSSL's SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER flag.
This disables an entirely unnecessary "sanity check" that causes failures
in nonblocking mode, because OpenSSL complains if we move or compact the
write buffer. The only actual requirement is that we not modify pending
data once we've attempted to send it, which we don't. Per testing and
research by Martin Pihlak, though this fix is a lot simpler than his patch.
I put the same change into the backend, although it's less clear whether
it's necessary there. We do use nonblock mode in some situations in
streaming replication, so it seems best to keep the same behavior in the
backend as in libpq.
Magnus Hagander [Sat, 16 Jul 2011 17:58:53 +0000 (19:58 +0200)]
Fix SSPI login when multiple roundtrips are required
This fixes SSPI login failures showing "The function
requested is not supported", often showing up when connecting
to localhost. The reason was not properly updating the SSPI
handle when multiple roundtrips were required to complete the
authentication sequence.
Report and analysis by Ahmed Shinwari, patch by Magnus Hagander.
Fix two ancient bugs in GiST code to re-find a parent after page split:
First, when following a right-link, we incorrectly marked the current page
as the parent of the right sibling. In reality, the parent of the right page
is the same as the parent of the current page (or some page to the right of
it; gistFindCorrectParent() will sort that out).
Secondly, when we follow a right-link, we must prepend, not append, the right
page to our list of pages to visit. That's because we assume that once we
hit a leaf page in the list, all the rest are leaf pages too, and give up.
To hit these bugs, you need concurrent actions and several unlucky accidents.
Another backend must split the root page, while you're in the process of
splitting a lower-level page. Furthermore, while you scan the internal nodes
to re-find the parent, another backend needs to again split some more internal
pages. Even then, the bugs don't necessarily manifest as user-visible errors
or index corruption.
While we're at it, make the error reporting a bit better if gistFindPath()
fails to re-find the parent. It used to be an assertion, but an elog() seems
more appropriate.
Tom Lane [Tue, 5 Jul 2011 16:04:40 +0000 (12:04 -0400)]
Fix psql's counting of script file line numbers during COPY.
handleCopyIn incremented pset.lineno for each line of COPY data read from
a file. This is correct when reading from the current script file (i.e.,
we are doing COPY FROM STDIN followed by in-line data), but it's wrong if
the data is coming from some other file. Per bug #6083 from Steve Haslam.
Back-patch to all supported versions.
Tom Lane [Sun, 3 Jul 2011 20:40:34 +0000 (16:40 -0400)]
Back-patch creation of tar.bz2 tarball during "make dist".
Since commit a4d03bbcdaf7739d7e9073ee76bb186f68ddc163, "make dist" has
built both gzip- and bzip2-compressed tarballs. However, this was
pretty useless, because our tarball build script didn't know about it
and proceeded to overwrite the bz2 file with new data. Back-patch the
change to all active branches, so that creation of the tar.bz2 file
can be removed from the build script.
Tom Lane [Tue, 21 Jun 2011 18:41:05 +0000 (14:41 -0400)]
Apply upstream fix for blowfish signed-character bug (CVE-2011-2483).
A password containing a character with the high bit set was misprocessed
on machines where char is signed (which is most). This could cause the
preceding one to three characters to fail to affect the hashed result,
thus weakening the password. The result was also unportable, and failed
to match some other blowfish implementations such as OpenBSD's.
Since the fix changes the output for such passwords, upstream chose
to provide a compatibility hack: password salts beginning with $2x$
(instead of the usual $2a$ for blowfish) are intentionally processed
"wrong" to give the same hash as before. Stored password hashes can
thus be modified if necessary to still match, though it'd be better
to change any affected passwords.
In passing, sync a couple other upstream changes that marginally improve
performance and/or tighten error checking.
Back-patch to all supported branches. Since this issue is already
public, no reason not to commit the fix ASAP.
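
How a signed char can erase the preceding characters can be sketched in Python. This is a simplified model, not the crypt_blowfish key setup: it assumes key bytes are folded into a 32-bit word by shifting and ORing, and the function name pack_word is hypothetical.

```python
# Sketch only: models sign extension when a high-bit byte is ORed into a
# 32-bit accumulator on a platform where char is signed.

def pack_word(key_bytes, char_is_signed):
    """Fold bytes into a 32-bit word, oldest byte in the highest position."""
    word = 0
    for b in key_bytes:
        if char_is_signed and b >= 0x80:
            b -= 0x100  # the value a signed char holds for a high-bit byte
        word = (word << 8) & 0xFFFFFFFF
        word |= b & 0xFFFFFFFF  # a negative value ORs in 0xFFFFFFxx
    return word
```

In this model a high-bit byte sets every higher bit of the word, so up to three earlier characters stop influencing the result. That matches the "one to three characters" described above.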
Tom Lane [Fri, 17 Jun 2011 23:13:21 +0000 (19:13 -0400)]
Don't use "cp -i" in the example WAL archive_command.
This is a dangerous example to provide because on machines with GNU cp,
it will silently do the wrong thing and risk archive corruption. Worse,
during the 9.0 cycle somebody "improved" the discussion by removing the
warning that used to be there about that, and instead leaving the
impression that the command would work as desired on most Unixen.
It doesn't. Try to rectify the damage by providing an example that is safe
most everywhere, and then noting that you can try cp -i if you want but
you'd better test that.
In back-patching this to all supported branches, I also added an example
command for Windows, which wasn't provided before 9.0.
Tom Lane [Fri, 17 Jun 2011 22:19:26 +0000 (18:19 -0400)]
Obtain table locks as soon as practical during pg_dump.
For some reason, when we (I) added table lock acquisition to pg_dump,
we didn't think about making it happen as soon as possible after the
start of the transaction. What with subsequent additions, there was
actually quite a lot going on before we got around to that; which sort
of defeats the purpose. Rearrange the order of calls in dumpSchema()
to close the risk window as much as we easily can. Back-patch to all
supported branches.
Robert Haas [Fri, 17 Jun 2011 18:28:45 +0000 (14:28 -0400)]
Add overflow checks to int4 and int8 versions of generate_series().
The previous code went into an infinite loop after overflow. In fact,
an overflow is not really an error; it just means that the current
value is the last one we need to return. So, just arrange to stop
immediately when overflow is detected.
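
The stop-on-overflow rule can be sketched in Python. Python integers do not overflow, so an explicit range check stands in for overflow detection. The generator name and the int4 limits as constants are assumptions of this sketch, not the actual C code.

```python
# Sketch only: an int4-style series that treats overflow of the next value
# as "the value just returned was the last one".

INT4_MAX = 2**31 - 1
INT4_MIN = -(2**31)

def generate_series_int4(start, stop, step=1):
    """Yield start, start+step, ... up to stop, stopping cleanly on overflow."""
    if step == 0:
        raise ValueError("step size cannot equal zero")
    current = start
    while (step > 0 and current <= stop) or (step < 0 and current >= stop):
        yield current
        nxt = current + step
        if nxt > INT4_MAX or nxt < INT4_MIN:
            return  # overflow: nothing further to return
        current = nxt
```

Without the check, a wrapped-around value would again satisfy the loop condition, which is the infinite loop the message describes.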
Tom Lane [Tue, 14 Jun 2011 21:14:06 +0000 (17:14 -0400)]
Suppress -arch switches in the output of ExtUtils::Embed.
We previously found out that OS X's standard perl installation tries to put
-arch switches into Perl link commands, evidently in hopes of building
universal binaries. But it doesn't work to add such switches in plperl's
link step if they weren't being used earlier, so this is basically
unworkable. When using gcc the result is only some warnings; but LLVM
fails entirely, so this issue isn't as cosmetic as we originally thought.
Hence, back-patch commit d69a419e682c2d39c2355105a7e5e2b90357c8f0 into
pre-9.0 branches.
Tom Lane [Tue, 14 Jun 2011 20:24:45 +0000 (16:24 -0400)]
Fix assorted issues with build and install paths containing spaces.
Apparently there is no buildfarm critter exercising this case after all,
because it fails in several places. With this patch, build, install,
check-world, and installcheck-world pass for me on OS X.
Tom Lane [Fri, 10 Jun 2011 21:03:21 +0000 (17:03 -0400)]
Work around gcc 4.6.0 bug that breaks WAL replay.
ReadRecord's habit of using both direct references to tmpRecPtr and
references to *RecPtr (which is pointing at tmpRecPtr) triggers an
optimization bug in gcc 4.6.0, which apparently has forgotten about
aliasing rules. Avoid the compiler bug, and make the code more readable
to boot, by getting rid of the direct references. Improve the comments
while at it.
Back-patch to all supported versions, in case they get built with 4.6.0.
Tom Lane, with some cosmetic suggestions from Alex Hunsaker
Tom Lane [Sat, 4 Jun 2011 19:48:36 +0000 (15:48 -0400)]
Expose the "*VALUES*" alias that we generate for a stand-alone VALUES list.
We were trying to make that strictly an internal implementation detail,
but it turns out that it's exposed anyway when dumping a view defined
like
CREATE VIEW test_view AS VALUES (1), (2), (3) ORDER BY 1;
This comes out as
CREATE VIEW ... ORDER BY "*VALUES*".column1;
which fails to parse when reloading the dump.
Hacking ruleutils.c to suppress the column qualification looks like it'd
be a risky business, so instead promote the RTE alias to full-fledged
usability.
Per bug #6049 from Dylan Adams. Back-patch to all supported branches.
Tom Lane [Thu, 2 Jun 2011 19:31:22 +0000 (15:31 -0400)]
Clean up after erroneous SELECT FOR UPDATE/SHARE on a sequence.
My previous commit disallowed this operation, but did nothing about
cleaning up the damage if one had already been done. With the operation
disallowed, it's okay to just forcibly clear xmax in a sequence's tuple,
since any value seen there could not represent a live transaction's lock.
So, any sequence-specific operation will repair the problem automatically,
whether or not the user has already seen "could not access status of
transaction" failures.
Tom Lane [Thu, 2 Jun 2011 18:46:31 +0000 (14:46 -0400)]
Disallow SELECT FOR UPDATE/SHARE on sequences.
We can't allow this because such an operation stores its transaction XID
into the sequence tuple's xmax. Because VACUUM doesn't process sequences
(and we don't want it to start doing so), such an xmax value won't get
frozen, meaning it will eventually refer to nonexistent pg_clog storage,
and even wrap around completely. Since the row lock is ignored by nextval
and setval, the usefulness of the operation is highly debatable anyway.
Per reports of trouble with pgpool 3.0, which had ill-advisedly started
using such commands as a form of locking.
In HEAD, also disallow SELECT FOR UPDATE/SHARE on toast tables. Although
this does work safely given the current implementation, there seems no
good reason to allow it. I refrained from changing that behavior in
back branches, however.
Tom Lane [Tue, 31 May 2011 21:54:06 +0000 (17:54 -0400)]
Protect GIST logic that assumes penalty values can't be negative.
Apparently sane-looking penalty code might return small negative values,
for example because of roundoff error. This will confuse places like
gistchoose(). Prevent problems by clamping negative penalty values to
zero. (Just to be really sure, I also made it force NaNs to zero.)
Back-patch to all supported branches.
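
The clamping rule amounts to the following Python sketch. The function name is hypothetical and the real check lives in the C GiST code.

```python
import math

# Sketch only: force negative and NaN penalty values to zero.

def sanitize_penalty(penalty):
    """Return penalty, with negative values and NaN replaced by 0.0."""
    if math.isnan(penalty) or penalty < 0.0:
        return 0.0
    return penalty
```

The NaN test is written out separately because every ordered comparison with NaN is false, so a plain "penalty < 0.0" check would let it through.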
Tom Lane [Mon, 30 May 2011 23:16:22 +0000 (19:16 -0400)]
Fix portability bugs in use of credentials control messages for peer auth.
Even though our existing code for handling credentials control messages has
been basically unchanged since 2001, it was fundamentally wrong: it did not
ensure proper alignment of the supplied buffer, and it was calculating
buffer sizes and message sizes incorrectly. This led to failures on
platforms where alignment padding is relevant, for instance FreeBSD on
64-bit platforms, as seen in a recent Debian bug report passed on by
Martin Pitt (http://bugs.debian.org//cgi-bin/bugreport.cgi?bug=612888).
Rewrite to do the message-whacking using the macros specified in RFC 2292,
following a suggestion from Theo de Raadt in that thread. Tested by me
on Debian/kFreeBSD-amd64; since OpenBSD and NetBSD document the identical
CMSG API, it should work there too.
Tom Lane [Sat, 28 May 2011 16:36:04 +0000 (12:36 -0400)]
Fix null-dereference crash in parse_xml_decl().
parse_xml_decl's header comment says you can pass NULL for any unwanted
output parameter, but it failed to honor this contract for the "standalone"
flag. The only currently-affected caller is xml_recv, so the net effect is
that sending a binary XML value containing a standalone parameter in its
xml declaration would crash the backend. Per bug #6044 from Christopher
Dillard.
In passing, remove useless initializations of parse_xml_decl's output
parameters in xml_parse.
Back-patch to 8.3, where this code was introduced.
Tom Lane [Thu, 26 May 2011 23:25:19 +0000 (19:25 -0400)]
Make decompilation of optimized CASE constructs more robust.
We had some hacks in ruleutils.c to cope with various odd transformations
that the optimizer could do on a CASE foo WHEN "CaseTestExpr = RHS" clause.
However, the fundamental impossibility of covering all cases was exposed
by Heikki, who pointed out that the "=" operator could get replaced by an
inlined SQL function, which could contain nearly anything at all. So give
up on the hacks and just print the expression as-is if we fail to recognize
it as "CaseTestExpr = RHS". (We must cover that case so that decompiled
rules print correctly; but we are not under any obligation to make EXPLAIN
output be 100% valid SQL in all cases, and already could not do so in some
other cases.) This approach requires that we have some printable
representation of the CaseTestExpr node type; I used "CASE_TEST_EXPR".
Back-patch to all supported branches, since the problem case fails in all.
Tom Lane [Mon, 23 May 2011 16:53:00 +0000 (12:53 -0400)]
Install defenses against overflow in BuildTupleHashTable().
The planner can sometimes compute very large values for numGroups, and in
cases where we have no alternative to building a hashtable, such a value
will get fed directly to BuildTupleHashTable as its nbuckets parameter.
There were two ways in which that could go bad. First, BuildTupleHashTable
declared the parameter as "int" but most callers were passing "long"s,
so on 64-bit machines undetected overflow could occur leading to a bogus
negative value. The obvious fix for that is to change the parameter to
"long", which is what I've done in HEAD. In the back branches that seems a
bit risky, though, since third-party code might be calling this function.
So for them, just put in a kluge to treat negative inputs as INT_MAX.
Second, hash_create can go nuts with extremely large requested table sizes
(notably, my_log2 becomes an infinite loop for inputs larger than
LONG_MAX/2). What seems most appropriate to avoid that is to bound the
initial table size request to work_mem.
This fixes bug #6035 reported by Daniel Schreiber. Although the reported
case only occurs back to 8.4 since it involves WITH RECURSIVE, I think
it's a good idea to install the defenses in all supported branches.
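
The first failure mode and the back-branch kluge can be sketched in Python. The sketch assumes the usual two's-complement truncation when a 64-bit long is passed as a 32-bit int; both function names are hypothetical. The work_mem bound on the initial table size is not modeled here.

```python
# Sketch only: models long-to-int truncation and the "negative means
# INT_MAX" workaround described for the back branches.

INT_MAX = 2**31 - 1

def long_to_int(value):
    """Model truncating a 64-bit long to a 32-bit signed int."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value

def clamp_nbuckets(nbuckets):
    """Treat a negative bucket count as INT_MAX."""
    return INT_MAX if nbuckets < 0 else nbuckets
```

For example, a planner estimate of 3,000,000,000 groups truncates to a negative number, which the clamp then turns into INT_MAX.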
Tom Lane [Thu, 12 May 2011 15:56:38 +0000 (11:56 -0400)]
Fix write-past-buffer-end in ldapServiceLookup().
The code to assemble ldap_get_values_len's output into a single string
wrote the terminating null one byte past where it should. Fix that,
and make some other cosmetic adjustments to make the code a trifle more
readable and more in line with usual Postgres coding style.
Also, free the "result" string when done with it, to avoid a permanent
memory leak.
Bug report and patch by Albe Laurenz, cosmetic adjustments by me.
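For illustration, here is a minimal sketch of joining length-counted values into one string while keeping the terminating null inside the buffer. The counted_value type and join_values function are hypothetical stand-ins, not the LDAP library's types or the code in ldapServiceLookup(); the newline separator is likewise an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a length-counted value. */
typedef struct
{
    size_t      len;
    const char *val;
} counted_value;

/*
 * Join n values with '\n' separators into one malloc'd, null-terminated
 * string.  Returns NULL on allocation failure; the caller frees the result.
 */
static char *
join_values(const counted_value *values, size_t n)
{
    size_t      size = 0;
    size_t      i;
    char       *result;
    char       *p;

    /* Each value needs its bytes plus one byte: separator or terminator. */
    for (i = 0; i < n; i++)
        size += values[i].len + 1;
    if (n == 0)
        size = 1;               /* room for the terminator alone */

    result = malloc(size);
    if (result == NULL)
        return NULL;

    p = result;
    for (i = 0; i < n; i++)
    {
        memcpy(p, values[i].val, values[i].len);
        p += values[i].len;
        *p++ = '\n';
    }

    /* Replace the final separator with the terminator, inside the buffer. */
    if (n > 0)
        p--;
    *p = '\0';
    assert((size_t) (p - result) < size);

    return result;
}
```

The key point is that the terminator overwrites the last separator rather than being appended after it, and the result is freed by the caller once it is no longer needed.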
Tom Lane [Sun, 1 May 2011 21:57:55 +0000 (17:57 -0400)]
Make CLUSTER lock the old table's toast table before copying data.
We must lock out autovacuuming of the old toast table before computing the
OldestXmin horizon we will use. Otherwise, autovacuum could start on the
toast table later, compute a later OldestXmin horizon, and remove as DEAD
toast tuples that we still need (because we think their parent tuples are
only RECENTLY_DEAD). Per further thought about bug #5998.
Tom Lane [Fri, 29 Apr 2011 20:30:02 +0000 (16:30 -0400)]
Remove special case for xmin == xmax in HeapTupleSatisfiesVacuum().
VACUUM was willing to remove a committed-dead tuple immediately if it was
deleted by the same transaction that inserted it. The idea is that such a
tuple could never have been visible to any other transaction, so we don't
need to keep it around to satisfy MVCC snapshots. However, there was
already an exception for tuples that are part of an update chain, and this
exception created a problem: we might remove TOAST tuples (which are never
part of an update chain) while their parent tuple stayed around (if it was
part of an update chain). This didn't pose a problem for most things,
since the parent tuple is indeed dead: no snapshot will ever consider it
visible. But MVCC-safe CLUSTER had a problem, since it will try to copy
RECENTLY_DEAD tuples to the new table. It then has to copy their TOAST
data too, and would fail if VACUUM had already removed the toast tuples.
Easiest fix is to get rid of the special case for xmin == xmax. This may
delay reclaiming dead space for a little bit in some cases, but it's by far
the most reliable way to fix the issue.
Per bug #5998 from Mark Reid. Back-patch to 8.3, which is the oldest
version with MVCC-safe CLUSTER.
Tom Lane [Fri, 29 Apr 2011 05:45:21 +0000 (01:45 -0400)]
Rewrite pg_size_pretty() to avoid compiler bug.
Convert it to use successive right shifts instead of increasing a divisor.
This is probably a tad more efficient than the original coding, and it's
nicer-looking than the previous patch because we don't need a special case
to avoid overflow in the last branch. But the real reason to do it is to
avoid a Solaris compiler bug, as per results from buildfarm member moa.
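A sketch of the shift-based approach, under stated assumptions: size_pretty and its caller-supplied buffer are hypothetical, and the 10 * 1024 threshold for switching units is assumed here rather than taken from this log. One extra low-order bit is kept after each shift so the final value can be rounded to nearest.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format a byte count into buf using successive right shifts. */
static void
size_pretty(int64_t size, char *buf, size_t buflen)
{
    const int64_t limit = 10 * 1024;        /* assumed unit-switch threshold */
    const int64_t limit2 = limit * 2 - 1;   /* same threshold in half-units */
    static const char *const units[] = {"kB", "MB", "GB", "TB"};
    int         i;

    assert(buf != NULL && buflen > 0);

    if (size < limit)
    {
        snprintf(buf, buflen, "%" PRId64 " bytes", size);
        return;
    }

    size >>= 9;                 /* now in half-kB: one extra bit for rounding */
    for (i = 0; i < 3 && size >= limit2; i++)
        size >>= 10;            /* move to the next larger unit */

    /* Round half-units to whole units; no overflow since size was shifted. */
    snprintf(buf, buflen, "%" PRId64 " %s", (size + 1) / 2, units[i]);
}
```

Because the value only ever shrinks, the last branch needs no special case: even INT64_MAX is reduced to a small number before the rounding addition takes place.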
Tom Lane [Wed, 27 Apr 2011 17:58:54 +0000 (13:58 -0400)]
Fix array- and path-creating functions to ensure padding bytes are zeroes.
Per recent discussion, it's important for all computed datums (not only the
results of input functions) to not contain any ill-defined (uninitialized)
bits. Failing to ensure that can result in equal() reporting that
semantically indistinguishable Consts are not equal, which in turn leads to
bizarre and undesirable planner behavior, such as in a recent example from
David Johnston. We might eventually try to fix this in a general manner by
allowing datatypes to define identity-testing functions, but for now the
path of least resistance is to expect datatypes to force all unused bits
into consistent states.
Per some testing by Noah Misch, array and path functions seem to be the
only ones presenting risks at the moment, so I looked through all the
functions in adt/array*.c and geo_ops.c and fixed them as necessary. In
the array functions, the easiest/safest fix is to allocate result arrays
with palloc0 instead of palloc. Possibly in future someone will want to
look into whether we can just zero the padding bytes, but that looks too
complex for a back-patchable fix. In the path functions, we already had a
precedent in path_in for just zeroing the one known pad field, so duplicate
that code as needed.
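To illustrate why zero-filled allocation matters for bitwise comparison, here is a small sketch in plain C. The padded_point struct and both functions are hypothetical, and calloc stands in for palloc0. It assumes, as mainstream compilers behave in practice, that assigning to a member leaves the padding bytes untouched.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A struct that typically has padding between its two fields. */
typedef struct
{
    char        closed;         /* 1 byte, usually followed by padding */
    double      x;
} padded_point;

/* Allocate zero-filled memory first, so padding bytes start out as zeroes. */
static padded_point *
make_point_zeroed(char closed, double x)
{
    padded_point *p = calloc(1, sizeof(padded_point));

    if (p != NULL)
    {
        p->closed = closed;
        p->x = x;
    }
    return p;
}

/* Byte-by-byte comparison, which also compares the padding bytes. */
static int
bitwise_equal(const padded_point *a, const padded_point *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, sizeof(padded_point)) == 0;
}
```

Had the structs been allocated with malloc instead, two points with identical field values could still compare as unequal, because the padding bytes would hold whatever happened to be in memory.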
Tom Lane [Mon, 25 Apr 2011 20:22:24 +0000 (16:22 -0400)]
Fix pg_size_pretty() to avoid overflow for inputs close to INT64_MAX.
The expression that tried to round the value to the nearest TB could
overflow, leading to bogus output as reported in bug #5993 from Nicola
Cossu. This isn't likely to ever happen in the intended usage of the
function (if it could, we'd be needing to use a wider datatype instead);
but it's not hard to give the expected output, so let's do so.
Tom Lane [Thu, 21 Apr 2011 00:34:27 +0000 (20:34 -0400)]
Fix bugs in indexing of in-doubt HOT-updated tuples.
If we find a DELETE_IN_PROGRESS HOT-updated tuple, it is impossible to know
whether to index it or not except by waiting to see if the deleting
transaction commits. If it doesn't, the tuple might again be LIVE, meaning
we have to index it. So wait and recheck in that case.
Also, we must not rely on ii_BrokenHotChain to decide that it's possible to
omit tuples from the index. That could result in omitting tuples that we
need, particularly in view of yesterday's fixes to not necessarily set
indcheckxmin (but it's broken even without that, as per my analysis today).
Since this is just an extremely marginal performance optimization, dropping
the test shouldn't hurt.
These cases are only expected to happen in system catalogs (they're
possible there due to early release of RowExclusiveLock in most
catalog-update code paths). Since reindexing of a system catalog isn't a
particularly performance-critical operation anyway, there's no real need to
be concerned about possible performance degradation from these changes.
The worst aspects of this bug were introduced in 9.0 --- 8.x will always
wait out a DELETE_IN_PROGRESS tuple. But I think dropping index entries
on the strength of ii_BrokenHotChain is dangerous even without that, so
back-patch removal of that optimization to 8.3 and 8.4.
Tom Lane [Tue, 19 Apr 2011 22:51:12 +0000 (18:51 -0400)]
Avoid changing an index's indcheckxmin horizon during REINDEX.
There can never be a need to push the indcheckxmin horizon forward, since
any HOT chains that are actually broken with respect to the index must
pre-date its original creation. So we can just avoid changing pg_index
altogether during a REINDEX operation.
This offers a cleaner solution than my previous patch for the problem
found a few days ago that we mustn't try to update pg_index while we are
reindexing it. System catalog indexes will always be created with
indcheckxmin = false during initdb, and with this modified code we should
never try to change their pg_index entries. This avoids special-casing
system catalogs as the former patch did, and should provide a performance
benefit for many cases where REINDEX formerly caused an index to be
considered unusable for a short time.
Back-patch to 8.3 to cover all versions containing HOT. Note that this
patch changes the API for index_build(), but I believe it is unlikely that
any add-on code is calling that directly.
Tom Lane [Sat, 16 Apr 2011 00:19:16 +0000 (20:19 -0400)]
Prevent incorrect updates of pg_index while reindexing pg_index itself.
The places that attempt to change pg_index.indcheckxmin during a reindexing
operation cannot be executed safely if pg_index itself is the subject of
the operation. This is the explanation for a couple of recent reports of
VACUUM FULL failing with
ERROR: duplicate key value violates unique constraint "pg_index_indexrelid_index"
DETAIL: Key (indexrelid)=(2678) already exists.
However, there isn't any real need to update indcheckxmin in such a
situation, if we assume that pg_index can never contain a truly broken HOT
chain. This assumption holds if new indexes are never created on it during
concurrent operations, which is something we don't consider safe for any
system catalog, not just pg_index. Accordingly, modify the code to not
manipulate indcheckxmin when reindexing any system catalog.
Back-patch to 8.3, where HOT was introduced. The known failure scenarios
involve 9.0-style VACUUM FULL, so there might not be any real risk before
9.0, but let's not assume that.
On IA64 architecture, we check the depth of the register stack in addition
to the regular stack. The code to do that is platform- and compiler-specific;
add support for the HP-UX native compiler.
Tom Lane [Thu, 7 Apr 2011 19:14:56 +0000 (15:14 -0400)]
Modernize dlopen interface code for FreeBSD and OpenBSD.
Remove the hard-wired assumption that __mips__ (and only __mips__) lacks
dlopen in FreeBSD and OpenBSD. This assumption is outdated at least for
OpenBSD, as per report from an anonymous 9.1 tester. We can perfectly well
use HAVE_DLOPEN instead to decide which code to use.
Some other cosmetic adjustments to make freebsd.c, netbsd.c, and openbsd.c
exactly alike.
Tom Lane [Thu, 7 Apr 2011 15:40:39 +0000 (11:40 -0400)]
Fix SortTocFromFile() to cope with lines that are too long for its buffer.
The original coding supposed that a dump TOC file could never contain lines
longer than 1K. The folly of that was exposed by a recent report from
Per-Olov Esgard. We only really need to see the first dozen or two bytes
of each line, since we're just trying to read off the numeric ID at the
start of the line; so there's no need for a particularly huge buffer.
What is needed is logic to skip continuation bufferloads.
Back-patch to all supported branches, since it's always been like this.
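The idea of reading only the start of each line while ignoring continuation bufferloads can be sketched as below. This is an illustrative reconstruction, not the pg_restore source: read_line_ids, the 32-byte buffer, and the treatment of ';' as a comment marker are assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Read the leading numeric ID from each logical line of fp using a small
 * fixed buffer.  Stores at most max_ids values and returns the count stored.
 */
static size_t
read_line_ids(FILE *fp, long *ids, size_t max_ids)
{
    char        buf[32];
    int         incomplete_line = 0;
    size_t      n = 0;

    assert(fp != NULL && ids != NULL);

    while (fgets(buf, sizeof(buf), fp) != NULL)
    {
        int         prev_incomplete = incomplete_line;
        size_t      len = strlen(buf);
        char       *endptr;
        long        id;

        /* No trailing newline: the rest of this line comes in later reads. */
        incomplete_line = (len == 0 || buf[len - 1] != '\n');

        /* This bufferload continues an earlier line, so do not parse it. */
        if (prev_incomplete)
            continue;

        /* Skip comment lines and blank lines. */
        if (buf[0] == ';' || buf[0] == '\n')
            continue;

        id = strtol(buf, &endptr, 10);
        if (endptr == buf)
            continue;           /* no number at the start of the line */

        if (n < max_ids)
            ids[n++] = id;
    }
    return n;
}
```

Tracking whether the previous read ended without a newline is what keeps digits buried deep inside a long line from being mistaken for the ID of a new entry.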
Tom Lane [Mon, 28 Mar 2011 19:45:14 +0000 (15:45 -0400)]
Prevent a rowtype from being included in itself.
Eventually we might be able to allow that, but it's not clear how many
places need to be fixed to prevent infinite recursion when there's a direct
or indirect inclusion of a rowtype in itself. One such place is
CheckAttributeType(), which will recurse to stack overflow in cases such as
those exhibited in bug #5950 from Alex Perepelica. If we were sure it was
the only such place, we could easily modify the code added by this patch to
stop the recursion without a complaint ... but it probably isn't the only
such place. Hence, throw error until such time as someone is excited
enough about this type of usage to put work into making it safe.
Back-patch as far as 8.3. 8.2 doesn't have the recursive call in
CheckAttributeType in the first place, so I see no need to add code there
in the absence of clear evidence of a problem elsewhere.
Tom Lane [Wed, 23 Mar 2011 20:57:37 +0000 (16:57 -0400)]
Improve user-defined-aggregates documentation.
On closer inspection, that two-element initcond value seems to have been
a little white lie to avoid explaining the full behavior of float8_accum.
But if people are going to expect the examples to be exactly correct,
I suppose we'd better explain. Per comment from Thom Brown.
Tom Lane [Tue, 22 Mar 2011 17:01:17 +0000 (13:01 -0400)]
Avoid potential deadlock in InitCatCachePhase2().
Opening a catcache's index could require reading from that cache's own
catalog, which of course would acquire AccessShareLock on the catalog.
So the original coding here risks locking index before heap, which could
deadlock against another backend trying to get exclusive locks in the
normal order. Because InitCatCachePhase2 is only called when a backend
has to start up without a relcache init file, the deadlock was seldom seen
in the field. (And by the same token, there's no need to worry about any
performance disadvantage; so not much point in trying to distinguish
exactly which catalogs have the risk.)
Bug report, diagnosis, and patch by Nikhil Sontakke. Additional commentary
by me. Back-patch to all supported branches.
Andrew Dunstan [Thu, 17 Mar 2011 04:22:03 +0000 (00:22 -0400)]
Use correct PATH separator for Cygwin in pg_regress.c.
This has been broken for years, and I'm not sure why it has not been
noticed before, but now a very modern Cygwin breaks on it, and the fix
is clearly correct. Backpatching to all live branches.
Tom Lane [Fri, 11 Mar 2011 23:19:07 +0000 (18:19 -0500)]
Put in some more safeguards against executing a division-by-zero.
Add dummy returns before every potential division-by-zero in int8.c,
because apparently further "improvements" in gcc's optimizer have
enabled it to break functions that weren't broken before.