Tom Lane [Fri, 7 Sep 2012 20:02:23 +0000 (16:02 -0400)]
Centralize libpq's low-level code for dropping a connection.
Create an internal function pqDropConnection that does the physical socket
close and cleans up closely-associated state. This removes a bunch of ad
hoc, not always consistent closure code. The ulterior motive is to have a
single place to wait for a spawned child backend to exit, but this seems
like good cleanup even if that never happens.
I went back and forth on whether to include "conn->status = CONNECTION_BAD"
in pqDropConnection's actions, but for the moment decided not to. Only a
minority of the call sites actually want that, and in any case it's
arguable that conn->status is slightly higher-level state, and thus not
part of this function's purview.
Tom Lane [Thu, 6 Sep 2012 15:43:51 +0000 (11:43 -0400)]
Allow embedded spaces without quoting in unix_socket_directories entries.
This fix removes an unnecessary incompatibility with the old behavior of
the unix_socket_directory parameter. Since pathnames with embedded spaces
are fairly popular on some platforms, the incompatibility could be
significant in practice. We'll still strip unquoted leading/trailing
spaces, however.
No docs update since the documentation already implied that it worked
like this.
Andrew Dunstan [Thu, 6 Sep 2012 03:14:49 +0000 (23:14 -0400)]
In pg_upgrade, try a few times to open a log file.
If we call pg_ctl stop, the server might continue and thus
hold a log file for a short time after it has deleted its pid file,
(which is when pg_ctl will exit), and so a subsequent attempt to
open the log file might fail.
We therefore try to open it a few times, sleeping one second between
tries, to give the server time to exit.
This corrects an error that was observed on the buildfarm.
Fix WAL file replacement during cascading replication on Windows.
When the startup process restores a WAL file from the archive, it deletes
any old file with the same name and renames the new file in its place. On
Windows, however, when a file is deleted, it still lingers as long as a
process holds a file handle open on it. With cascading replication, a
walsender process can hold the old file open, so the rename() in the startup
process would fail. To fix that, rename the old file to a temporary name, to
make the original file name available for reuse, before deleting the old
file.
Tom Lane [Thu, 6 Sep 2012 01:49:08 +0000 (21:49 -0400)]
Fix inappropriate error messages for Hot Standby misconfiguration errors.
Give the correct name of the GUC parameter being complained of.
Also, emit a more suitable SQLSTATE (INVALID_PARAMETER_VALUE,
not the default INTERNAL_ERROR).
Andrew Dunstan [Wed, 5 Sep 2012 22:00:31 +0000 (18:00 -0400)]
Fix pg_upgrade test script's line end handling on Windows.
Call pg_dumpall using -f switch instead of redirection, to avoid
writing the output in text mode and generating spurious carriage
returns. Remove to carriage return ignoring hack introduced by
commit e442b0f0c6fd26738bafdeb5222511b586dfe4b9.
Andrew Dunstan [Wed, 5 Sep 2012 21:41:43 +0000 (17:41 -0400)]
Fix line end mishandling in pg_upgrade on Windows.
pg_upgrade opened the output from pg_dumpall in text mode and
wrote the split files in text mode. This caused unwanted eating
of intended carriage returns on input and production of spurious
carriage returns on output. To avoid this, open all these files
in binary mode. On non-Windows platforms, this change has no
effect.
Backpatch to 9.0. On 9.0 and 9.1, we also switch from redirecting
pg_dumpall's output to using pg_dumpall's -f switch, for the same
reason.
Tom Lane [Wed, 5 Sep 2012 20:43:37 +0000 (16:43 -0400)]
Restore SIGFPE handler after initializing PL/Perl.
Perl, for some unaccountable reason, believes it's a good idea to reset
SIGFPE handling to SIG_IGN. Which wouldn't be a good idea even if it
worked; but on some platforms (Linux at least) it doesn't work at all,
instead resulting in forced process termination if the signal occurs.
Given the lack of other complaints, it seems safe to assume that Perl
never actually provokes SIGFPE and so there is no value in the setting
anyway. Hence, reset it to our normal handler after initializing Perl.
Tom Lane [Wed, 5 Sep 2012 16:54:03 +0000 (12:54 -0400)]
Fix PARAM_EXEC assignment mechanism to be safe in the presence of WITH.
The planner previously assumed that parameter Vars having the same absolute
query level, varno, and varattno could safely be assigned the same runtime
PARAM_EXEC slot, even though they might be different Vars appearing in
different subqueries. This was (probably) safe before the introduction of
CTEs, but the lazy-evalution mechanism used for CTEs means that a CTE can
be executed during execution of some other subquery, causing the lifespan
of Params at the same syntactic nesting level as the CTE to overlap with
use of the same slots inside the CTE. In 9.1 we created additional hazards
by using the same parameter-assignment technology for nestloop inner scan
parameters, but it was broken before that, as illustrated by the added
regression test.
To fix, restructure the planner's management of PlannerParamItems so that
items having different semantic lifespans are kept rigorously separated.
This will probably result in complex queries using more runtime PARAM_EXEC
slots than before, but the slots are cheap enough that this hardly matters.
Also, stop generating PlannerParamItems containing Params for subquery
outputs: all we really need to do is reserve the PARAM_EXEC slot number,
and that now only takes incrementing a counter. The planning code is
simpler and probably faster than before, as well as being more correct.
Per report from Vik Reykja.
These changes will mostly also need to be made in the back branches, but
I'm going to hold off on that until after 9.2.0 wraps.
Fix bugs in cascading replication with recovery_target_timeline='latest'
The cascading replication code assumed that the current RecoveryTargetTLI
never changes, but that's not true with recovery_target_timeline='latest'.
The obvious upshot of that is that RecoveryTargetTLI in shared memory needs
to be protected by a lock. A less obvious consequence is that when a
cascading standby is connected, and the standby switches to a new target
timeline after scanning the archive, it will continue to stream WAL to the
cascading standby, but from a wrong file, ie. the file of the previous
timeline. For example, if the standby is currently streaming from the middle
of file 000000010000000000000005, and the timeline changes, the standby
will continue to stream from that file. However, the WAL on the new
timeline is in file 000000020000000000000005, so the standby sends garbage
from 000000010000000000000005 to the cascading standby, instead of the
correct WAL from file 000000020000000000000005.
This also fixes a related bug where a partial WAL segment is restored from
the archive and streamed to a cascading standby. The code assumed that when
a WAL segment is copied from the archive, it can immediately be fully
streamed to a cascading standby. However, if the segment is only partially
filled, ie. has the right size, but only N first bytes contain valid WAL,
that's not safe. That can happen if a partial WAL segment is manually copied
to the archive, or if a partial WAL segment is archived because a server is
started up on a new timeline within that segment. The cascading standby will
get confused if the WAL it received is not valid, and will get stuck until
it's restarted. This patch fixes that problem by not allowing WAL restored
from the archive to be streamed to a cascading standby until it's been
replayed, and thus validated.
Kevin Grittner [Wed, 5 Sep 2012 02:13:11 +0000 (21:13 -0500)]
Fix serializable mode with index-only scans.
Serializable Snapshot Isolation used for serializable transactions
depends on acquiring SIRead locks on all heap relation tuples which
are used to generate the query result, so that a later delete or
update of any of the tuples can flag a read-write conflict between
transactions. This is normally handled in heapam.c, with tuple level
locking. Since an index-only scan avoids heap access in many cases,
building the result from the index tuple, the necessary predicate
locks were not being acquired for all tuples in an index-only scan.
To prevent problems with tuple IDs which are vacuumed and re-used
while the transaction still matters, the xmin of the tuple is part of
the tag for the tuple lock. Since xmin is not available to the
index-only scan for result rows generated from the index tuples, it
is not possible to acquire a tuple-level predicate lock in such
cases, in spite of having the tid. If we went to the heap to get the
xmin value, it would no longer be an index-only scan. Rather than
prohibit index-only scans under serializable transaction isolation,
we acquire an SIRead lock on the page containing the tuple, when it
was not necessary to visit the heap for other reasons.
Magnus Hagander [Tue, 4 Sep 2012 13:00:04 +0000 (15:00 +0200)]
Change "restoring" to "processing" in message from pg_dump
The same message is used in both pg_restore and pg_dump, and it's
confusing to output "restoring data for table xyz" when the user
is just doing a pg_dump.
Bruce Momjian [Tue, 4 Sep 2012 02:52:34 +0000 (22:52 -0400)]
Fix to_date() and to_timestamp() to allow specification of the day of
the week via ISO or Gregorian designations. The fix is to store the
day-of-week consistently as 1-7, Sunday = 1.
Bruce Momjian [Tue, 4 Sep 2012 02:15:09 +0000 (22:15 -0400)]
In pg_upgrade, pull the port number from postmaster.pid, like we do for
socket location. Also, prevent putting the socket in the current
directory for pre-9.1 servers in live check and non-live check mode,
because pre-9.1 pg_ctl -w can't handle it.
Andrew Dunstan [Mon, 3 Sep 2012 22:06:47 +0000 (18:06 -0400)]
Use correct path separator for Windows builtin commands.
pg_upgrade produces a platform-specific script to remove the old
directory, but on Windows it has not been making sure that the
paths it writes as arguments for rmdir and del use the backslash
path separator, which will cause these scripts to fail.
Tom Lane [Mon, 3 Sep 2012 19:38:42 +0000 (15:38 -0400)]
Replace memcpy() calls in xlog.c critical sections with struct assignments.
This gets rid of a dangerous-looking use of the not-volatile XLogCtl
pointer in a couple of spinlock-protected sections, where the normal
coding rule is that you should only access shared memory through a
pointer-to-volatile. I think the risk is only hypothetical not actual,
since for there to be a bug the compiler would have to move the spinlock
acquire or release across the memcpy() call, which one sincerely hopes
it will not. Still, it looks cleaner this way.
Per comment from Daniel Farina and subsequent discussion.
Tom Lane [Mon, 3 Sep 2012 17:52:34 +0000 (13:52 -0400)]
Fix pg_upgrade to cope with non-default unix_socket_directory scenarios.
When starting either an old or new postmaster, force it to place its Unix
socket in the current directory. This makes it even harder for accidental
connections to occur during pg_upgrade, and also works around some
scenarios where the default socket location isn't usable. (For example,
if the default location is something other than "/tmp", it might not exist
during "make check".)
When checking an already-running old postmaster, find out its actual socket
directory location from postmaster.pid, if possible. This dodges problems
with an old postmaster having a configured location different from the
default built into pg_upgrade's libpq. We can't find that out if the old
postmaster is pre-9.1, so also document how to cope with such scenarios
manually.
In support of this, centralize handling of the connection-related command
line options passed to pg_upgrade's subsidiary programs, such as pg_dump.
This should make future changes easier.
Tom Lane [Mon, 3 Sep 2012 15:24:31 +0000 (11:24 -0400)]
Make psql's \d+ show reloptions for all relkinds.
Formerly it would only show them for relkinds 'r' and 'f' (plain tables
and foreign tables). However, as of 9.2, views can also have reloptions,
namely security_barrier. The relkind restriction seems pointless and
not at all future-proof, so just print reloptions whenever there are any.
In passing, make some cosmetic improvements to the code that pulls the
"tableinfo" fields out of the PGresult.
Noted and patched by Dean Rasheed, with adjustment for all relkinds by me.
Tom Lane [Sat, 1 Sep 2012 22:16:24 +0000 (18:16 -0400)]
Drop cheap-startup-cost paths during add_path() if we don't need them.
We can detect whether the planner top level is going to care at all about
cheap startup cost (it will only do so if query_planner's tuple_fraction
argument is greater than zero). If it isn't, we might as well discard
paths immediately whose only advantage over others is cheap startup cost.
This turns out to get rid of quite a lot of paths in complex queries ---
I saw planner runtime reduction of more than a third on one large query.
Since add_path isn't currently passed the PlannerInfo "root", the easiest
way to tell it whether to do this was to add a bool flag to RelOptInfo.
That's a bit redundant, since all relations in a given query level will
have the same setting. But in the future it's possible that we'd refine
the control decision to work on a per-relation basis, so this seems like
a good arrangement anyway.
Tom Lane [Sat, 1 Sep 2012 17:56:14 +0000 (13:56 -0400)]
Fix mark_placeholder_maybe_needed to handle LATERAL references.
If a PlaceHolderVar contains a pulled-up LATERAL reference, its minimum
possible evaluation level might be higher in the join tree than its
original syntactic location. That in turn affects the ph_needed level for
any contained PlaceHolderVars (that is, those PHVs had better propagate up
the join tree at least to the evaluation level of the outer PHV). We got
this mostly right, but mark_placeholder_maybe_needed() failed to account
for the effect, and in consequence could leave the inner PHVs with
ph_may_need less than what their ultimate ph_needed value will be. That's
bad because it could lead to failure to select a join order that will allow
evaluation of the inner PHV at a valid location. Fix that, and add an
Assert that checks that we don't ever set ph_needed to more than
ph_may_need.
Tom Lane [Sat, 1 Sep 2012 04:40:15 +0000 (00:40 -0400)]
More documentation updates for LATERAL.
Extend xfunc.sgml's discussion of set-returning functions to show an
example of using LATERAL, and recommend that over putting SRFs in the
targetlist.
In passing, reword func.sgml's section on set-returning functions so
that it doesn't claim that the functions listed therein are all the
built-in set-returning functions. That hasn't been true for a long
time, and trying to make it so doesn't seem like it would be an
improvement. (Perhaps we should rename that section?)
Only warn when connecting to a newer server, since connecting to older
servers works pretty well nowadays. Also update the documentation a
little about current psql/server compatibility expectations.
Andrew Dunstan [Sat, 1 Sep 2012 00:38:37 +0000 (20:38 -0400)]
Restore setting of _USE_32BIT_TIME_T to 32 bit MSVC builds.
This was removed in commit cd004067742ee16ee63e55abfb4acbd5f09fbaab,
we're not quite sure why, but there have been reports of crashes due
to AS Perl being built with it when we are not, and it certainly
seems like the right thing to do. There is still some uncertainty
as to why it sometimes fails and sometimes doesn't.
Original patch from Owais Khani, substantially reworked and
extended by Andrew Dunstan.
Tom Lane [Fri, 31 Aug 2012 22:57:12 +0000 (18:57 -0400)]
Partially restore qual scope checks in distribute_qual_to_rels().
The LATERAL implementation is now basically complete, and I still don't
see a cost-effective way to make an exact qual scope cross-check in the
presence of LATERAL. However, I did add a PlannerInfo.hasLateralRTEs flag
along the way, so it's easy to make the check only when not hasLateralRTEs.
That seems to still be useful, and it beats having no check at all.
Tom Lane [Fri, 31 Aug 2012 18:17:56 +0000 (14:17 -0400)]
Make configure probe for mbstowcs_l as well as wcstombs_l.
We previously supposed that any given platform would supply both or neither
of these functions, so that one configure test would be sufficient. It now
appears that at least on AIX this is not the case ... which is likely an
AIX bug, but nonetheless we need to cope with it. So use separate tests.
Per bug #6758; thanks to Andrew Hastie for doing the followup testing
needed to confirm what was happening.
Backpatch to 9.1, where we began using these functions.
Tom Lane [Fri, 31 Aug 2012 02:52:43 +0000 (22:52 -0400)]
Improve coding of gistchoose and gistRelocateBuildBuffersOnSplit.
This is mostly cosmetic, but it does eliminate a speculative portability
issue. The previous coding ignored the fact that sum_grow could easily
overflow (in fact, it could be summing multiple IEEE float infinities).
On a platform where that didn't guarantee to produce a positive result,
the code would misbehave. In any case, it was less than readable.
Alvaro Herrera [Thu, 30 Aug 2012 20:15:44 +0000 (16:15 -0400)]
Split tuple struct defs from htup.h to htup_details.h
This reduces unnecessary exposure of other headers through htup.h, which
is very widely included by many files.
I have chosen to move the function prototypes to the new file as well,
because that means htup.h no longer needs to include tupdesc.h. In
itself this doesn't have much effect in indirect inclusion of tupdesc.h
throughout the tree, because it's also required by execnodes.h; but it's
something to explore in the future, and it seemed best to do the htup.h
change now while I'm busy with it.
Tom Lane [Thu, 30 Aug 2012 18:32:22 +0000 (14:32 -0400)]
Suppress creation of backwardly-indexed paths for LATERAL join clauses.
Given a query such as
SELECT * FROM foo JOIN LATERAL (SELECT foo.var1) ss(x) ON ss.x = foo.var2
the existence of the join clause "ss.x = foo.var2" encourages indxpath.c to
build a parameterized path for foo using any index available for foo.var2.
This is completely useless activity, though, since foo has got to be on the
outside not the inside of any nestloop join with ss. It's reasonably
inexpensive to add tests that prevent creation of such paths, so let's do
that.
Robert Haas [Thu, 30 Aug 2012 18:14:22 +0000 (14:14 -0400)]
Document how to prevent PostgreSQL itself from exhausting memory.
The existing documentation in Linux Memory Overcommit seemed to
assume that PostgreSQL itself could never be the problem, or at
least it didn't tell you what to do about it.
Per discussion with Craig Ringer and Kevin Grittner.
Robert Haas [Thu, 30 Aug 2012 17:05:45 +0000 (13:05 -0400)]
Fix logic bug in gistchoose and gistRelocateBuildBuffersOnSplit.
Every time the best-tuple-found-so-far changes, we need to reset all
the penalty values in which_grow[] to the penalties for the new best
tuple. The old code failed to do this, resulting in inferior index
quality.
The original patch from Alexander Korotkov was just two lines; I took
the liberty of fleshing that out by adding a bunch of comments that I
hope will make this logic easier for others to understand than it was
for me.
Tom Lane [Thu, 30 Aug 2012 16:56:50 +0000 (12:56 -0400)]
Improve EXPLAIN's ability to cope with LATERAL references in plans.
push_child_plan/pop_child_plan didn't bother to adjust the "ancestors"
list of parent plan nodes when descending to a child plan node. I think
this was okay when it was written, but it's not okay in the presence of
LATERAL references, since a subplan node could easily be returning a
LATERAL value back up to the same nestloop node that provides the value.
Per changed regression test results, the omission led to failure to
interpret Param nodes that have perfectly good interpretations.
Peter Eisentraut [Thu, 30 Aug 2012 02:44:59 +0000 (22:44 -0400)]
Also check for Python platform-specific include directory
Python can be built to have two separate include directories: one for
platform-independent files and one for platform-specific files. So
far, this has apparently never mattered for a PL/Python build. But
with the new multi-arch Python packages in Debian and Ubuntu, this is
becoming the standard configuration on these platforms, so we must
check these directories separately to be able to build there.
Also add a bit of reporting in configure to be able to see better what
is going on with this.
Tom Lane [Thu, 30 Aug 2012 02:05:27 +0000 (22:05 -0400)]
Adjust definition of cheapest_total_path to work better with LATERAL.
In the initial cut at LATERAL, I kept the rule that cheapest_total_path
was always unparameterized, which meant it had to be NULL if the relation
has no unparameterized paths. It turns out to work much more nicely if
we always have *some* path nominated as cheapest-total for each relation.
In particular, let's still say it's the cheapest unparameterized path if
there is one; if not, take the cheapest-total-cost path among those of
the minimum available parameterization. (The first rule is actually
a special case of the second.)
This allows reversion of some temporary lobotomizations I'd put in place.
In particular, the planner can now consider hash and merge joins for
joins below a parameter-supplying nestloop, even if there aren't any
unparameterized paths available. This should bring planning of
LATERAL-containing queries to the same level as queries not using that
feature.
Along the way, simplify management of parameterized paths in add_path()
and friends. In the original coding for parameterized paths in 9.2,
I tried to minimize the logic changes in add_path(), so it just treated
parameterization as yet another dimension of comparison for paths.
We later made it ignore pathkeys (sort ordering) of parameterized paths,
on the grounds that ordering isn't a useful property for the path on the
inside of a nestloop, so we might as well get rid of useless parameterized
paths as quickly as possible. But we didn't take that reasoning as far as
we should have. Startup cost isn't a useful property inside a nestloop
either, so add_path() ought to discount startup cost of parameterized paths
as well. Having done that, the secondary sorting I'd implemented (in
add_parameterized_path) is no longer needed --- any parameterized path that
survives add_path() at all is worth considering at higher levels. So this
should be a bit faster as well as simpler.
This includes two micro-optimizations to the tight inner loop in descending
the SP-GiST tree: 1. avoid an extra function call to index_getprocinfo when
calling user-defined choose function, and 2. avoid a useless palloc+pfree
when node labels are not used.
Alvaro Herrera [Tue, 28 Aug 2012 23:02:00 +0000 (19:02 -0400)]
Split heapam_xlog.h from heapam.h
The heapam XLog functions are used by other modules, not all of which
are interested in the rest of the heapam API. With this, we let them
get just the XLog stuff in which they are interested and not pollute
them with unrelated includes.
Also, since heapam.h no longer requires xlog.h, many files that do
include heapam.h no longer get xlog.h automatically, including a few
headers. This is useful because heapam.h is getting pulled in by
execnodes.h, which is in turn included by a lot of files.
Tom Lane [Tue, 28 Aug 2012 00:17:12 +0000 (20:17 -0400)]
Add section IDs to subsections of syntax.sgml that lacked them.
This is so that these sections will have stable HTML tags that one can
link to, rather than things like "AEN1902". Perhaps we should mount a
campaign to do this everywhere, but I've found myself pointing at
syntax.sgml subsections often enough to be sure it's useful here.
Alvaro Herrera [Mon, 27 Aug 2012 18:21:09 +0000 (14:21 -0400)]
pg_upgrade: Fix exec_prog API to be less flaky
The previous signature made it very easy to pass something other than
the printf-format specifier in the corresponding position, without any
warning from the compiler.
While at it, move some of the escaping, redirecting and quoting
responsibilities from the callers into exec_prog() itself. This makes
the callsites cleaner.
Tom Lane [Mon, 27 Aug 2012 16:45:43 +0000 (12:45 -0400)]
Fix DROP INDEX CONCURRENTLY IF EXISTS.
This threw ERROR, not the expected NOTICE, if the index didn't exist.
The bug was actually visible in not-as-expected regression test output,
so somebody wasn't paying too close attention in commit 8cb53654dbdb4c386369eb988062d0bbb6de725e.
Per report from Brendan Byrd.