Robert Haas [Tue, 30 Aug 2011 15:34:29 +0000 (11:34 -0400)]
Fix parsing of time string followed by yesterday/today/tomorrow.
Previously, 'yesterday 04:00:00'::timestamp didn't do the same thing as
'04:00:00 yesterday'::timestamp, and the return value from the latter
was midnight rather than the specified time.
Tom Lane [Mon, 29 Aug 2011 17:18:44 +0000 (13:18 -0400)]
Use a non-locking test in TAS_SPIN() on all IA64 platforms.
Per my testing, this works just as well with gcc as it does with HP's
compiler; and there is no reason to think that the effect doesn't occur
with icc, either.
Also, rewrite the header comment about enforcing sequencing around spinlock
operations, per Robert's gripe that it was misleading.
Robert Haas [Mon, 29 Aug 2011 14:05:48 +0000 (10:05 -0400)]
Improve spinlock performance for HP-UX, ia64, non-gcc.
At least on this architecture, it's very important to spin on a
non-atomic instruction and only retry the atomic once it appears
that it will succeed. To fix this, split TAS() into two macros:
TAS(), for trying to grab the lock the first time, and TAS_SPIN(),
for spinning until we get it. TAS_SPIN() defaults to same as TAS(),
but we can override it when we know there's a better way.
It's likely that some of the other cases in s_lock.h require
similar treatment, but this is the only one we've got conclusive
evidence for at present.
Tom Lane [Mon, 29 Aug 2011 02:27:48 +0000 (22:27 -0400)]
Actually, all of parallel restore's limitations should be tested earlier.
On closer inspection, whining in restore_toc_entries_parallel is really
much too late for any user-facing error case. The right place to do it
is at the start of RestoreArchive(), before we've done anything interesting
(suh as trying to DROP all the targets ...)
Back-patch to 8.4, where parallel restore was introduced.
Tom Lane [Mon, 29 Aug 2011 01:48:58 +0000 (21:48 -0400)]
Be more user-friendly about unsupported cases for parallel pg_restore.
If we are unable to do a parallel restore because the input file is stdin
or is otherwise unseekable, we should complain and fail immediately, not
after having done some of the restore. Complaining once per thread isn't
so cool either, and the messages should be worded to make it clear this is
an unsupported case not some weird race-condition bug. Per complaint from
Lonni Friedman.
Back-patch to 8.4, where parallel restore was introduced.
Tom Lane [Sat, 27 Aug 2011 20:36:57 +0000 (16:36 -0400)]
Don't assume that "E" response to NEGOTIATE_SSL_CODE means pre-7.0 server.
These days, such a response is far more likely to signify a server-side
problem, such as fork failure. Reporting "server does not support SSL"
(in sslmode=require) could be quite misleading. But the results could
be even worse in sslmode=prefer: if the problem was transient and the
next connection attempt succeeds, we'll have silently fallen back to
protocol version 2.0, possibly disabling features the user needs.
Hence, it seems best to just eliminate the assumption that backing off
to non-SSL/2.0 protocol is the way to recover from an "E" response, and
instead treat the server error the same as we would in non-SSL cases.
I tested this change against a pre-7.0 server, and found that there
was a second logic bug in the "prefer" path: the test to decide whether
to make a fallback connection attempt assumed that we must have opened
conn->ssl, which in fact does not happen given an "E" response. After
fixing that, the code does indeed connect successfully to pre-7.0,
as long as you didn't set sslmode=require. (If you did, you get
"Unsupported frontend protocol", which isn't completely off base
given the server certainly doesn't support SSL.)
Since there seems no reason to believe that pre-7.0 servers exist anymore
in the wild, back-patch to all supported branches.
Tom Lane [Sat, 27 Aug 2011 18:16:14 +0000 (14:16 -0400)]
Ensure we discard unread/unsent data when abandoning a connection attempt.
There are assorted situations wherein PQconnectPoll() will abandon a
connection attempt and try again with different parameters (eg, SSL versus
not SSL). However, the code forgot to discard any pending data in libpq's
I/O buffers when doing this. In at least one case (server returns E
message during SSL negotiation), there is unread input data which bollixes
the next connection attempt. I have not checked to see whether this is
possible in the other cases where we close the socket and retry, but it
seems like a matter of good defensive programming to add explicit
buffer-flushing code to all of them.
This is one of several issues exposed by Daniel Farina's report of
misbehavior after a server-side fork failure.
This has been wrong since forever, so back-patch to all supported branches.
Tom Lane [Fri, 26 Aug 2011 20:51:34 +0000 (16:51 -0400)]
Fix potential memory clobber in tsvector_concat().
tsvector_concat() allocated its result workspace using the "conservative"
estimate of the sum of the two input tsvectors' sizes. Unfortunately that
wasn't so conservative as all that, because it supposed that the number of
pad bytes required could not grow. Which it can, as per test case from
Jesper Krogh, if there's a mix of lexemes with positions and lexemes
without them in the input data. The fix is to assume that we might add
a not-previously-present pad byte for each and every lexeme in the two
inputs; which really is conservative, but it doesn't seem worthwhile to
try to be more precise.
This is an aboriginal bug in tsvector_concat, so back-patch to all
versions containing it.
Tom Lane [Fri, 26 Aug 2011 17:52:23 +0000 (13:52 -0400)]
Clean up weird corner cases in lexing of psql meta-command arguments.
These changes allow backtick command evaluation and psql variable
interpolation to happen on substrings of a single meta-command argument.
Formerly, no such evaluations happened at all if the backtick or colon
wasn't the first character of the argument, and we considered an argument
completed as soon as we'd processed one backtick, variable reference, or
quoted substring. A string like 'FOO'BAR was thus taken as two arguments
not one, not exactly what one would expect. In the new coding, an argument
is considered terminated only by unquoted whitespace or backslash.
Also, clean up a bunch of omissions, infelicities and outright errors in
the psql documentation of variables and metacommand argument syntax.
Tom Lane [Thu, 25 Aug 2011 18:33:08 +0000 (14:33 -0400)]
Fix psql lexer to avoid use of backtracking.
Per previous experimentation, backtracking slows down lexing performance
significantly (by about a third). It's usually pretty easy to avoid, just
need to have rules that accept an incomplete construct and do whatever the
lexer would have done otherwise.
The backtracking was introduced by the patch that added quoted variable
substitution. Back-patch to 9.0 where that was added.
Robert Haas [Thu, 25 Aug 2011 16:47:30 +0000 (12:47 -0400)]
Change format of SQL/MED generic options in psql backslash commands.
Rather than dumping out the raw array as PostgreSQL represents it
internally, we now print it out in a format similar to the one in
which the user input it, which seems a lot more user friendly.
Tom Lane [Thu, 25 Aug 2011 03:50:10 +0000 (23:50 -0400)]
Fix pgstatindex() to give consistent results for empty indexes.
For an empty index, the pgstatindex() function would compute 0.0/0.0 for
its avg_leaf_density and leaf_fragmentation outputs. On machines that
follow the IEEE float arithmetic standard with any care, that results in
a NaN. However, per report from Rushabh Lathia, Microsoft couldn't
manage to get this right, so you'd get a bizarre error on Windows.
Fix by forcing the results to be NaN explicitly, rather than relying on
the division operator to give that or the snprintf function to print it
correctly. I have some doubts that this is really the most useful
definition, but it seems better to remain backward-compatible with
those platforms for which the behavior wasn't completely broken.
Back-patch to 8.2, since the code is like that in all current releases.
Tom Lane [Wed, 24 Aug 2011 19:16:17 +0000 (15:16 -0400)]
Fix pgxs.mk to always add --dbname=$(CONTRIB_TESTDB) to REGRESS_OPTS.
The previous coding resulted in contrib modules unintentionally overriding
the use of CONTRIB_TESTDB. There seems no particularly good reason to
allow that (after all, the makefile can set CONTRIB_TESTDB if that's really
what it intends).
In passing, document REGRESS_OPTS where the other pgxs.mk options are
documented.
Back-patch to 9.1 --- in prior versions, there were no cases of contrib
modules setting REGRESS_OPTS without including the --dbname switch, so
while the coding was fragile there was no actual bug.
Tom Lane [Wed, 24 Aug 2011 17:47:01 +0000 (13:47 -0400)]
Avoid locale dependency in expected output.
We'll have to settle for just listing the extensions' data types,
since function arguments seem to sort differently in different locales.
Per buildfarm results.
Tom Lane [Wed, 24 Aug 2011 17:09:06 +0000 (13:09 -0400)]
Fix multiple bugs in extension dropping.
When we implemented extensions, we made findDependentObjects() treat
EXTENSION dependency links similarly to INTERNAL links. However, that
logic contained an implicit assumption that an object could have at most
one INTERNAL dependency, so it did not work correctly for objects having
both INTERNAL and DEPENDENCY links. This led to failure to drop some
extension member objects when dropping the extension. Furthermore, we'd
never actually exercised the case of recursing to an internally-referenced
(owning) object from anything other than a NORMAL dependency, and it turns
out that passing the incoming dependency's flags to the owning object is
the Wrong Thing. This led to sometimes dropping a whole extension silently
when we should have rejected the drop command for lack of CASCADE.
Since we obviously were under-testing extension drop scenarios, add some
regression test cases. Unfortunately, such test cases require some
extensions (duh), so we can't test for problems in the core regression
tests. I chose to add them to the earthdistance contrib module, which is
a good test case because it has a dependency on the cube contrib module.
Back-patch to 9.1. Arguably these are pre-existing bugs in INTERNAL
dependency handling, but since it appears that the cases can never arise
pre-9.1, I'll refrain from back-patching the logic changes further than
that.
Tom Lane [Wed, 24 Aug 2011 01:49:07 +0000 (21:49 -0400)]
Make CREATE EXTENSION check schema creation permissions.
When creating a new schema for a non-relocatable extension, we neglected
to check whether the calling user has permission to create schemas.
That didn't matter in the original coding, since we had already checked
superuserness, but in the new dispensation where users need not be
superusers, we should check it. Use CreateSchemaCommand() rather than
calling NamespaceCreate() directly, so that we also enforce the rules
about reserved schema names.
Per complaint from KaiGai Kohei, though this isn't the same as his patch.
Tom Lane [Tue, 23 Aug 2011 21:11:41 +0000 (17:11 -0400)]
Fix overoptimistic assumptions in column width estimation for subqueries.
set_append_rel_pathlist supposed that, while computing per-column width
estimates for the appendrel, it could ignore child rels for which the
translated reltargetlist entry wasn't a Var. This gave rise to completely
silly estimates in some common cases, such as constant outputs from some or
all of the arms of a UNION ALL. Instead, fall back on get_typavgwidth to
estimate from the value's datatype; which might be a poor estimate but at
least it's not completely wacko.
That problem was exposed by an Assert in set_subquery_size_estimates, which
unfortunately was still overoptimistic even with that fix, since we don't
compute attr_widths estimates for appendrels that are entirely excluded by
constraints. So remove the Assert; we'll just fall back on get_typavgwidth
in such cases.
Also, since set_subquery_size_estimates calls set_baserel_size_estimates
which calls set_rel_width, there's no need for set_subquery_size_estimates
to call get_typavgwidth; set_rel_width will handle it for us if we just
leave the estimate set to zero. Remove the unnecessary code.
Per report from Erik Rijkers and subsequent investigation.
Tom Lane [Mon, 22 Aug 2011 14:55:47 +0000 (10:55 -0400)]
Fix handling of extension membership when filling in a shell operator.
The previous coding would result in deleting and not re-creating the
extension membership pg_depend rows, since there was no
CommandCounterIncrement that would allow recordDependencyOnCurrentExtension
to see that the deletion had happened. Make it work like the shell type
case, ie, keep the existing entries (and then throw an error if they're for
the wrong extension).
Per bug #6172 from Hitoshi Harada. Investigation and fix by Dimitri
Fontaine.
Tom Lane [Sun, 21 Aug 2011 22:15:55 +0000 (18:15 -0400)]
Fix trigger WHEN conditions when both BEFORE and AFTER triggers exist.
Due to tuple-slot mismanagement, evaluation of WHEN conditions for AFTER
ROW UPDATE triggers could crash if there had been a BEFORE ROW trigger
fired for the same update. Fix by not trying to overload the use of
estate->es_trig_tuple_slot. Per report from Yoran Heling.
Back-patch to 9.0, when trigger WHEN conditions were introduced.
Tom Lane [Sat, 20 Aug 2011 18:51:02 +0000 (14:51 -0400)]
Fix performance problem when building a lossy tidbitmap.
As pointed out by Sergey Koposov, repeated invocations of tbm_lossify can
make building a large tidbitmap into an O(N^2) operation. To fix, make
sure we remove more than the minimum amount of information per call, and
add a fallback path to behave sanely if we're unable to fit the bitmap
within the requested amount of memory.
This has been wrong since the tidbitmap code was written, so back-patch
to all supported branches.
Robert Haas [Fri, 19 Aug 2011 17:09:40 +0000 (13:09 -0400)]
Clean up 'chkselinuxenv' script.
Eliminate dependencies on "which", as we don't really need that to be
installed for proper testing. Don't number the tests, as that increases
the footprint of every patch that wants to add or remove tests. Make
the test output more informative, so that it's a bit easier to see what
went right (or wrong). Spelling and grammar improvements.
Robert Haas [Fri, 19 Aug 2011 15:57:38 +0000 (11:57 -0400)]
Fix contrib/sepgsql and contrib/xml2 to always link required libraries.
contrib/xml2 can get by without libxslt; the relevant features just
won't work. But if doesn't have libxml2, or if sepgsql doesn't have
libselinux, the link succeeds but the module then fails to work at load
time. To avoid that, link the require libraries unconditionally, so
that it will be clear at link-time that there is a problem.
Tom Lane [Thu, 18 Aug 2011 15:45:25 +0000 (11:45 -0400)]
Explain max_prepared_transactions requirement in isolation tests' README.
Now that we have a test that requires nondefault settings to pass, it seems
like we'd better mention that detail in the directions about how to run the
tests.
Add an SSI regression test that tests all interesting permutations in the
order of begin, prepare, and commit of three concurrent transactions that
have conflicts between them.
The test runs for a quite long time, and the expected output file is huge,
but this test caught some serious bugs during development, so seems
worthwhile to keep. The test uses prepared transactions, so it fails if the
server has max_prepared_transactions=0. Because of that, it's marked as
"ignore" in the schedule file.
Strip whitespace from SQL blocks in the isolation test suite. This is purely
cosmetic, it removes a lot of IMHO ugly whitespace from the expected output.
Robert Haas [Thu, 18 Aug 2011 13:49:41 +0000 (09:49 -0400)]
Remove obsolete README file.
Perhaps we ought to add some other kind of documentation here instead,
but for now let's get rid of this woefully obsolete description of the
sinval machinery.
Peter Eisentraut [Thu, 18 Aug 2011 11:43:16 +0000 (14:43 +0300)]
Improve detection of Python 3.2 installations
Because of ABI tagging, the library version number might no longer be
exactly the Python version number, so do extra lookups. This affects
installations without a shared library, such as ActiveState's
installer.
Also update the way to detect the location of the 'config' directory,
which can also be versioned.
Peter Eisentraut [Thu, 18 Aug 2011 09:53:32 +0000 (12:53 +0300)]
Change PyInit_plpy to external linkage
Module initialization functions in Python 3 must have external
linkage, because PyMODINIT_FUNC does dllexport on Windows-like
platforms. Without this change, the build with Python 3 fails on
Windows.
Tom Lane [Wed, 17 Aug 2011 21:07:16 +0000 (17:07 -0400)]
Fix two issues in plpython's handling of composite results.
Dropped columns within a composite type were not handled correctly.
Also, we did not check for whether a composite result type had changed
since we cached the information about it.
Jan UrbaĆski, per a bug report from Jean-Baptiste Quenot
Tom Lane [Tue, 16 Aug 2011 23:27:46 +0000 (19:27 -0400)]
Revise sinval code to remove no-longer-used tuple TID from inval messages.
This requires adjusting the API for syscache callback functions: they now
get a hash value, not a TID, to identify the target tuple. Most of them
weren't paying any attention to that argument anyway, but plancache did
require a small amount of fixing.
Also, improve performance a trifle by avoiding sending duplicate inval
messages when a heap_update isn't changing the catcache lookup columns.
Tom Lane [Tue, 16 Aug 2011 19:26:22 +0000 (15:26 -0400)]
Forget about targeting catalog cache invalidations by tuple TID.
The TID isn't stable enough: we might queue an sinval event before a VACUUM
FULL, and then process it afterwards, when the target tuple no longer has
the same TID. So we must invalidate entries on the basis of hash value
only. The old coding can be shown to result in various bizarre,
hard-to-reproduce errors in the presence of concurrent VACUUM FULLs on
system catalogs, and could easily result in permanent catalog corruption,
up to and including complete loss of tables.
This commit is just a minimal fix that removes the unsafe comparison.
We should remove transmission of the tuple TID from sinval messages
altogether, and then arrange to suppress the extra message in the common
case of a heap_update that doesn't change the key hashvalue. But that's
going to be much more invasive, and will only produce a probably-marginal
performance gain, so it doesn't seem like material for a back-patch.
Back-patch to 9.0. Before that, VACUUM FULL refused to do any tuple moving
if it found any INSERT_IN_PROGRESS or DELETE_IN_PROGRESS tuples (and
CLUSTER would give up altogether), so there was no risk of moving a tuple
that might be the subject of an unsent sinval message.
Tom Lane [Tue, 16 Aug 2011 18:38:20 +0000 (14:38 -0400)]
Fix incorrect order of operations during sinval reset processing.
We have to be sure that we have revalidated each nailed-in-cache relcache
entry before we try to use it to load data for some other relcache entry.
The introduction of "mapped relations" in 9.0 broke this, because although
we updated the state kept in relmapper.c early enough, we failed to
propagate that information into relcache entries soon enough; in
particular, we could try to fetch pg_class rows out of pg_class before
we'd updated its relcache entry's rd_node.relNode value from the map.
This bug accounts for Dave Gould's report of failures after "vacuum full
pg_class", and I believe that there is risk for other system catalogs
as well.
The core part of the fix is to copy relmapper data into the relcache
entries during "phase 1" in RelationCacheInvalidate(), before they'll be
used in "phase 2". To try to future-proof the code against other similar
bugs, I also rearranged the order in which nailed relations are visited
during phase 2: now it's pg_class first, then pg_class_oid_index, then
other nailed relations. This should ensure that RelationClearRelation can
apply RelationReloadIndexInfo to all nailed indexes without risking use
of not-yet-revalidated relcache entries.
Back-patch to 9.0 where the relation mapper was introduced.
Tom Lane [Tue, 16 Aug 2011 17:48:04 +0000 (13:48 -0400)]
Preserve toast value OIDs in toast-swap-by-content for CLUSTER/VACUUM FULL.
This works around the problem that a catalog cache entry might contain a
toast pointer that we try to dereference just as a VACUUM FULL completes
on that catalog. We will see the sinval message on the cache entry when
we acquire lock on the toast table, but by that point we've already told
tuptoaster.c "here's the pointer to fetch", so it's difficult from a code
structural standpoint to update the pointer before we use it. Much less
painful to ensure that toast pointers are not invalidated in the first
place. We have to add a bit of code to deal with the case that a value
that previously wasn't toasted becomes so; but that should be a
seldom-exercised corner case, so the inefficiency shouldn't be significant.
Back-patch to 9.0. In prior versions, we didn't allow CLUSTER on system
catalogs, and VACUUM FULL didn't result in reassignment of toast OIDs, so
there was no problem.
Tom Lane [Tue, 16 Aug 2011 17:11:54 +0000 (13:11 -0400)]
Fix race condition in relcache init file invalidation.
The previous code tried to synchronize by unlinking the init file twice,
but that doesn't actually work: it leaves a window wherein a third process
could read the already-stale init file but miss the SI messages that would
tell it the data is stale. The result would be bizarre failures in catalog
accesses, typically "could not read block 0 in file ..." later during
startup.
Instead, hold RelCacheInitLock across both the unlink and the sending of
the SI messages. This is more straightforward, and might even be a bit
faster since only one unlink call is needed.
This has been wrong since it was put in (in 2002!), so back-patch to all
supported releases.
Magnus Hagander [Tue, 16 Aug 2011 14:56:47 +0000 (16:56 +0200)]
Adjust total size in pg_basebackup progress report when reality changes
When streaming including WAL, the size estimate will always be incorrect,
since we don't know how much WAL is included. To make sure the output doesn't
look completely unreasonable, this patch increases the total size whenever we
go past the estimate, to make sure we never go above 100%.