Peter Eisentraut [Tue, 26 Jan 2010 06:58:39 +0000 (06:58 +0000)]
Reformat the comments in pg_hba.conf and pg_ident.conf
These files have apparently been edited over the years by a dozen people
with as many different editor settings, which made the alignment of the
paragraphs quite inconsistent and ugly. I made a pass of M-q with Emacs
to straighten it out.
Add note that PREPARE TRANSACTION is for transaction managers, not
regular applications. Also add a comment pointing out that tab-complition
for PREPARE TRANSACTION is missing on purpose.
Tom Lane [Mon, 25 Jan 2010 01:58:14 +0000 (01:58 +0000)]
Apply Tcl_Init() to the "hold" interpreter created by pltcl.
You might think this is unnecessary since that interpreter is never used
to run code --- but it turns out that's wrong. As of Tcl 8.5, the "clock"
command (alone among builtin Tcl commands) is partially implemented by
loaded-on-demand Tcl code, which means that it fails if there's not
unknown-command support, and also that it's impossible to run it directly
in a safe interpreter. The way they get around the latter is that
Tcl_CreateSlave() automatically sets up an alias command that forwards any
execution of "clock" in a safe slave interpreter to its parent interpreter.
Thus, when attempting to execute "clock" in trusted pltcl, the command
actually executes in the "hold" interpreter, where it will fail if
unknown-command support hasn't been introduced by sourcing the standard
init.tcl script, which is done by Tcl_Init(). (This is a pretty dubious
design decision on the Tcl boys' part, if you ask me ... but they didn't.)
Back-patch all the way. It's not clear that anyone would try to use ancient
versions of pltcl with a recent Tcl, but it's not clear they wouldn't, either.
Also add a regression test using "clock", in branches that have regression
test support for pltcl.
Joe Conway [Sun, 24 Jan 2010 22:19:38 +0000 (22:19 +0000)]
Rewrite dblink_record_internal() and dblink_fetch() to use a tuplestore
(SFRM_Materialize mode) to return tuples. Since we don't return from the
dblink function in tuplestore mode, release the PGresult with a PG_CATCH
block on error. Also rearrange to share the same code to materialize the
tuplestore. Patch by Takahiro Itagaki.
Tom Lane [Sun, 24 Jan 2010 21:49:17 +0000 (21:49 +0000)]
Fix assorted core dumps and Assert failures that could occur during
AbortTransaction or AbortSubTransaction, when trying to clean up after an
error that prevented (sub)transaction start from completing:
* access to TopTransactionResourceOwner that might not exist
* assert failure in AtEOXact_GUC, if AtStart_GUC not called yet
* assert failure or core dump in AfterTriggerEndSubXact, if
AfterTriggerBeginSubXact not called yet
Per testing by injecting elog(ERROR) at successive steps in StartTransaction
and StartSubTransaction. It's not clear whether all of these cases could
really occur in the field, but at least one of them is easily exposed by
simple stress testing, as per my accidental discovery yesterday.
Tom Lane [Sat, 23 Jan 2010 21:29:00 +0000 (21:29 +0000)]
Insert CHECK_FOR_INTERRUPTS calls into loops in dbsize.c, to ensure that
the various disk-size-reporting functions will respond to query cancel
reasonably promptly even in very large databases. Per report from
Kevin Grittner.
Simon Riggs [Sat, 23 Jan 2010 16:37:12 +0000 (16:37 +0000)]
In HS, Startup process sets SIGALRM when waiting for buffer pin. If
woken by alarm we send SIGUSR1 to all backends requesting that they
check to see if they are blocking Startup process. If so, they throw
ERROR/FATAL as for other conflict resolutions. Deadlock stop gap
removed. max_standby_delay = -1 option removed to prevent deadlock.
Robert Haas [Fri, 22 Jan 2010 16:40:19 +0000 (16:40 +0000)]
Replace ALTER TABLE ... SET STATISTICS DISTINCT with a more general mechanism.
Attributes can now have options, just as relations and tablespaces do, and
the reloptions code is used to parse, validate, and store them. For
simplicity and because these options are not performance critical, we store
them in a separate cache rather than the main relcache.
Michael Meskes [Fri, 22 Jan 2010 14:13:03 +0000 (14:13 +0000)]
Applied patch by Boszormenyi Zoltan <zb@cybertec.at> to fix problem in auto-prepare mode if the connection is closed and re-opened and the previously prepared query is issued again.
Robert Haas [Thu, 21 Jan 2010 14:58:53 +0000 (14:58 +0000)]
Add new escaping functions PQescapeLiteral and PQescapeIdentifier.
PQescapeLiteral is similar to PQescapeStringConn, but it relieves the
caller of the need to know how large the output buffer should be, and
it provides the appropriate quoting (in addition to escaping special
characers within the string). PQescapeIdentifier provides similar
functionality for escaping identifiers.
Simon Riggs [Thu, 21 Jan 2010 00:53:58 +0000 (00:53 +0000)]
Better internal documentation of locking for Hot Standby conflict resolution.
Discuss the reasons for the lock type we hold on ProcArrayLock while deriving
the conflict list. Cover the idea of false positive conflicts and seemingly
strange effects on snapshot derivation.
Tom Lane [Wed, 20 Jan 2010 23:12:03 +0000 (23:12 +0000)]
Well, the systemtap guys moved the goalposts again: with the latest version,
we *must* generate probes.o or the dtrace probes don't work. Revert our
workaround for their previous bug. Details at
https://bugzilla.redhat.com/show_bug.cgi?id=557266
Write a WAL record whenever we perform an operation without WAL-logging
that would've been WAL-logged if archiving was enabled. If we encounter
such records in archive recovery anyway, we know that some data is
missing from the log. A WARNING is emitted in that case.
Original patch by Fujii Masao, with changes by me.
Now that much of walreceiver has been pulled back into the postgres
binary, revert PGDLLIMPORT decoration of global variables. I'm not sure
if there's any real harm from unnecessary PGDLLIMPORTs, but these are all
internal variables that external modules really shouldn't be messing
with. ThisTimeLineID still needs PGDLLIMPORT.
Rethink the way walreceiver is linked into the backend. Instead than shoving
walreceiver as whole into a dynamically loaded module, split the
libpq-specific parts of it into dynamically loaded module and keep the rest
in the main backend binary.
Although Tom fixed the Windows compilation problems with the old walreceiver
module already, this is a cleaner division of labour and makes the code
more readable. There's also the prospect of adding new transport methods
as pluggable modules in the future, which this patch makes easier, though for
now the API between libpqwalreceiver and walreceiver process should be
considered private.
The libpq-specific module is now in src/backend/replication/libpqwalreceiver,
and the part linked with postgres binary is in
src/backend/replication/walreceiver.c.
Peter Eisentraut [Wed, 20 Jan 2010 05:47:09 +0000 (05:47 +0000)]
Before attempting to create a composite type, check whether a type of that
name already exists, so we'd get an error message about a "type" instead
of about a "relation", because the composite type code shares code with
relation creation.
Andrew Dunstan [Wed, 20 Jan 2010 01:08:21 +0000 (01:08 +0000)]
Add utility functions to PLPerl:
quote_literal, quote_nullable, quote_ident,
encode_bytea, decode_bytea, looks_like_number,
encode_array_literal, encode_array_constructor.
Split SPI.xs into two - SPI.xs now contains only SPI functions. Remainder
are in new Util.xs.
Some more code and documentation cleanup along the way, as well as
adding some CVS markers to files missing them.
Original patch from Tim Bunce, with a little editing from me.
Robert Haas [Wed, 20 Jan 2010 00:42:28 +0000 (00:42 +0000)]
Reformat documentation of libpq escaping functions.
Modify the "Escaping Strings for Inclusion in SQL Commands" section
to use a <variablelist> as the preceding and following sections do,
and merge the "Escaping Binary Strings for Inclusion in SQL Commands"
section into it.
This changes only the formatting of these sections, not the content.
It is intended to lay the groundwork for a follow-on patch to add
some new escaping functions, but it makes sense to commit this first,
for clarity.
Tom Lane [Tue, 19 Jan 2010 18:39:19 +0000 (18:39 +0000)]
When doing a parallel restore, we must guard against out-of-range dependency
dump IDs, because the array we're using is sized according to the highest
dump ID actually defined in the archive file. In a partial dump there could
be references to higher dump IDs that weren't dumped. Treat these the same
as references to in-range IDs that weren't dumped. (The whole thing is a
bit scary because the missing objects might have been part of dependency
chains, which we won't know about. Not much we can do though --- throwing
an error is probably overreaction.)
Also, reject parallel restore with pre-1.8 archive version (made by pre-8.0
pg_dump). In these old versions the dependency entries are OIDs, not dump
IDs, and we don't have enough information to interpret them.
Tom Lane [Tue, 19 Jan 2010 16:33:33 +0000 (16:33 +0000)]
Fix thinko in my recent change to put an explicit argisrow field in NullTest:
when the planner splits apart a ROW(...) IS NULL test, the argisrow values
of the component tests have to be determined from the component field types,
not copied from the original NullTest (in which argisrow is surely true).
Tom Lane [Mon, 18 Jan 2010 18:17:45 +0000 (18:17 +0000)]
Fix an oversight in convert_EXISTS_sublink_to_join: we can't convert an
EXISTS that contains a WITH clause. This would usually lead to a
"could not find CTE" error later in planning, because the WITH wouldn't
get processed at all. Noted while playing with an example from Ken Marshall.
Tom Lane [Mon, 18 Jan 2010 02:30:25 +0000 (02:30 +0000)]
Fix portalmem.c to avoid keeping a dangling pointer to a cached plan list
after it's released its reference count for the cached plan. There are
code paths that might try to examine the plan list before noticing that
the portal is already in aborted state. Report and diagnosis by Tatsuo
Ishii, though this isn't exactly his proposed patch.
Tom Lane [Sun, 17 Jan 2010 22:56:23 +0000 (22:56 +0000)]
Improve the handling of SET CONSTRAINTS commands by having them search
pg_constraint before searching pg_trigger. This allows saner handling of
corner cases; in particular we now say "constraint is not deferrable"
rather than "constraint does not exist" when the command is applied to
a constraint that's inherently non-deferrable. Per a gripe several months
ago from hubert depesz lubaczewski.
To make this work without breaking user-defined constraint triggers,
we have to add entries for them to pg_constraint. However, in return
we can remove the pgconstrname column from pg_constraint, which represents
a fairly sizable space savings. I also replaced the tgisconstraint column
with tgisinternal; the old meaning of tgisconstraint can now be had by
testing for nonzero tgconstraint, while there is no other way to get
the old meaning of nonzero tgconstraint, namely that the trigger was
internally generated rather than being user-created.
In passing, fix an old misstatement in the docs and comments, namely that
pg_trigger.tgdeferrable is exactly redundant with pg_constraint.condeferrable.
Actually, we mark RI action triggers as nondeferrable even when they belong to
a nominally deferrable FK constraint. The SET CONSTRAINTS code now relies on
that instead of hard-coding a list of exception OIDs.
Tom Lane [Sat, 16 Jan 2010 19:50:26 +0000 (19:50 +0000)]
Re-order configure tests to reflect the fact that the code generated for
posix_fadvise and other file-related functions can depend on _LARGEFILE_SOURCE
and/or _FILE_OFFSET_BITS. Per report from Robert Treat.
Back-patch to 8.4. This has been wrong all along, but we weren't really using
posix_fadvise in anger before, and AC_FUNC_FSEEKO seems to mask the issue well
enough for that function.
Tom Lane [Sat, 16 Jan 2010 17:39:55 +0000 (17:39 +0000)]
Fix unportable use of isxdigit() with char (rather than unsigned char)
argument, per warnings from buildfarm member pika. Also clean up code
formatting a trifle.
Simon Riggs [Sat, 16 Jan 2010 14:16:31 +0000 (14:16 +0000)]
Lock database while running drop database in Hot Standby to protect
against concurrent reconnection. Failure during testing showed issue
was possible, even though earlier analysis seemed to indicate it
would not be required. Use LockSharedObjectForSession() before
ResolveRecoveryConflictWithDatabase() and hold lock until end of
processing for that WAL record. Simple approach to avoid introducing
further bugs at this stage of development on an improbable issue.
Simon Riggs [Sat, 16 Jan 2010 10:05:59 +0000 (10:05 +0000)]
Teach standby conflict resolution to use SIGUSR1
Conflict reason is passed through directly to the backend, so we can
take decisions about the effect of the conflict based upon the local
state. No specific changes, as yet, though this prepares for later work.
CancelVirtualTransaction() sends signals while holding ProcArrayLock.
Introduce errdetail_abort() to give message detail explaining that the
abort was caused by conflict processing. Remove CONFLICT_MODE states
in favour of using PROCSIG_RECOVERY_CONFLICT states directly, for clarity.
Tom Lane [Sat, 16 Jan 2010 05:52:29 +0000 (05:52 +0000)]
Huh, apparently on cygwin we HAVE_SIGPROCMASK, so both variants of
the BlockSig/UnBlockSig declaration have to be PGDLLIMPORT'ified.
Per buildfarm results.
Tom Lane [Fri, 15 Jan 2010 22:36:35 +0000 (22:36 +0000)]
Do parse analysis of an EXPLAIN's contained statement during the normal
parse analysis phase, rather than at execution time. This makes parameter
handling work the same as it does in ordinary plannable queries, and in
particular fixes the incompatibility that Pavel pointed out with plpgsql's
new handling of variable references. plancache.c gets a little bit
grottier, but the alternatives seem worse.
Move build of src/backend/replication/walreceiver/ later in the build
process, after src/interfaces, because it depends on libpq. Also add
missing lines for clean etc. targets
This includes two new kinds of postmaster processes, walsenders and
walreceiver. Walreceiver is responsible for connecting to the primary server
and streaming WAL to disk, while walsender runs in the primary server and
streams WAL from disk to the client.
Documentation still needs work, but the basics are there. We will probably
pull the replication section to a new chapter later on, as well as the
sections describing file-based replication. But let's do that as a separate
patch, so that it's easier to see what has been added/changed. This patch
also adds a new section to the chapter about FE/BE protocol, documenting the
protocol used by walsender/walreceivxer.
Bump catalog version because of two new functions,
pg_last_xlog_receive_location() and pg_last_xlog_replay_location(), for
monitoring the progress of replication.
Simon Riggs [Thu, 14 Jan 2010 11:08:02 +0000 (11:08 +0000)]
First part of refactoring of code for ResolveRecoveryConflict. Purposes
of this are to centralise the conflict code to allow further change,
as well as to allow passing through the full reason for the conflict
through to the conflicting backends. Backend state alters how we
can handle different types of conflict so this is now required.
As originally suggested by Heikki, no longer optional.
Tom Lane [Thu, 14 Jan 2010 00:14:06 +0000 (00:14 +0000)]
Simplify validate_exec() by using access(2) to check file permissions,
rather than trying to implement the equivalent logic by hand. The motivation
for the original coding appears to have been to check with the effective uid's
permissions not the real uid's; but there is no longer any difference, because
we don't run the postmaster setuid (indeed, main.c enforces that they're the
same). Using access() means we will get it right in situations the original
coding failed to handle, such as ACL-based permissions. Besides it's a lot
shorter, cleaner, and more thread-safe. Per bug #5275 from James Bellinger.
Tom Lane [Wed, 13 Jan 2010 23:07:08 +0000 (23:07 +0000)]
When loading critical system indexes into the relcache, ensure we lock the
underlying catalog not only the index itself. Otherwise, if the cache
load process touches the catalog (which will happen for many though not
all of these indexes), we are locking index before parent table, which can
result in a deadlock against processes that are trying to lock them in the
normal order. Per today's failure on buildfarm member gothic_moth; it's
surprising the problem hadn't been identified before.
Back-patch to 8.2. Earlier releases didn't have the issue because they
didn't try to lock these indexes during load (instead assuming that they
couldn't change schema at all during multiuser operation).
Tom Lane [Wed, 13 Jan 2010 16:56:56 +0000 (16:56 +0000)]
Fix bug #5269: ResetPlanCache mustn't invalidate cached utility statements,
especially not ROLLBACK. ROLLBACK might need to be executed in an already
aborted transaction, when there is no safe way to revalidate the plan. But
in general there's no point in marking utility statements invalid, since
they have no plans in the normal sense of the word; so we might as well
work a bit harder here to avoid future revalidation cycles.
Michael Meskes [Wed, 13 Jan 2010 08:41:50 +0000 (08:41 +0000)]
Fix SQL3 type return value.
For non-SQL3 types ecpg used to return -Oid. This will break if there are
enough Oids to fill the namespace. Therefore we play it safe and return 0 if
there is no Oid->SQL3 tyoe mapping available.
Tom Lane [Wed, 13 Jan 2010 01:17:07 +0000 (01:17 +0000)]
Make fixed_paramref_hook behave properly when there are 'unused' slots
in the parameter array. Noted while experimenting with an example
from Pavel. This wouldn't come up in normal use, but it ought to honor
the specification that a parameter array can have unused slots.
Tom Lane [Tue, 12 Jan 2010 18:12:18 +0000 (18:12 +0000)]
Fix relcache reload mechanism to be more robust in the face of errors
occurring during a reload, such as query-cancel. Instead of zeroing out
an existing relcache entry and rebuilding it in place, build a new relcache
entry, then swap its contents with the old one, then free the new entry.
This avoids problems with code believing that a previously obtained pointer
to a cache entry must still reference a valid entry, as seen in recent
failures on buildfarm member jaguar. (jaguar is using CLOBBER_CACHE_ALWAYS
which raises the probability of failure substantially, but the problem
could occur in the field without that.) The previous design was okay
when it was made, but subtransactions and the ResourceOwner mechanism
make it unsafe now.
Also, make more use of the already existing rd_isvalid flag, so that we
remember that the entry requires rebuilding even if the first attempt fails.
Back-patch as far as 8.2. Prior versions have enough issues around relcache
reload anyway (due to inadequate locking) that fixing this one doesn't seem
worthwhile.
Bruce Momjian [Tue, 12 Jan 2010 02:42:52 +0000 (02:42 +0000)]
Please tablespace directories in their own subdirectory so pg_migrator
can upgrade clusters without renaming the tablespace directories. New
directory structure format is, e.g.:
Tom Lane [Mon, 11 Jan 2010 18:39:32 +0000 (18:39 +0000)]
Add some simple support and documentation for using process-specific oom_adj
settings to prevent the postmaster from being OOM-killed on Linux systems.
Tom Lane [Mon, 11 Jan 2010 15:31:04 +0000 (15:31 +0000)]
Improve ExecEvalVar's handling of whole-row variables in cases where the
rowtype contains dropped columns. Sometimes the input tuple will be formed
from a select targetlist in which dropped columns are filled with a NULL
of an arbitrary type (the planner typically uses INT4, since it can't tell
what type the dropped column really was). So we need to relax the rowtype
compatibility check to not insist on physical compatibility if the actual
column value is NULL.
In principle we might need to do this for functions returning composite
types, too (see tupledesc_match()). In practice there doesn't seem to be
a bug there, probably because the function will be using the same cached
rowtype descriptor as the caller. Fixing that code path would require
significant rearrangement, so I left it alone for now.