Simon Riggs [Thu, 28 Jan 2010 10:05:37 +0000 (10:05 +0000)]
Use malloc() in GetLockConflicts() when called InHotStandby to avoid repeated
palloc calls. Current code assumed this was already true, so this is a bug fix.
Change a few remaining calls of XLogArchivingActive() to use
XLogIsNeeded() instead, to determine if an otherwise non-logged operation
needs to be logged in WAL for standby servers.
Joe Conway [Thu, 28 Jan 2010 06:28:26 +0000 (06:28 +0000)]
Introduce two new libpq connection functions, PQconnectdbParams and
PQconnectStartParams. These are analogous to PQconnectdb and PQconnectStart
respectively. They differ from the legacy functions in that they accept
two NULL-terminated arrays, keywords and values, rather than conninfo
strings. This avoids the need to build the conninfo string in cases
where it might be inconvenient to do so. Includes documentation.
Also modify psql to utilize PQconnectdbParams rather than PQsetdbLogin.
This allows the new config parameter application_name to be set, which
in turn is displayed in the pg_stat_activity view and included in CSV
log entries. This will also ensure both new functions get regularly
exercised.
Patch by Guillaume Lelarge with review and minor adjustments by
Joe Conway.
Fix bug in wasender's xlogid boundary handling, reported by Erik Rijkers.
LogwrtRqst.Write can be set to non-existent FF log segment, we mustn't
try to send that in XLogSend().
Also fix similar bug in ReadRecord(), which I just introduced in the
ReadRecord() refactoring patch.
Make standby server continuously retry restoring the next WAL segment with
restore_command, if the connection to the primary server is lost. This
ensures that the standby can recover automatically, if the connection is
lost for a long time and standby falls behind so much that the required
WAL segments have been archived and deleted in the master.
This also makes standby_mode useful without streaming replication; the
server will keep retrying restore_command every few seconds until the
trigger file is found. That's the same basic functionality pg_standby
offers, but without the bells and whistles.
To implement that, refactor the ReadRecord/FetchRecord functions. The
FetchRecord() function introduced in the original streaming replication
patch is removed, and all the retry logic is now in a new function called
XLogReadPage(). XLogReadPage() is now responsible for executing
restore_command, launching walreceiver, and waiting for new WAL to arrive
from primary, as required.
This also changes the life cycle of walreceiver. When launched, it now only
tries to connect to the master once, and exits if the connection fails, or
is lost during streaming for any reason. The startup process detects the
death, and re-launches walreceiver if necessary.
Andrew Dunstan [Tue, 26 Jan 2010 23:11:56 +0000 (23:11 +0000)]
Various small improvements and cleanups for PL/Perl.
- Allow (ineffective) use of 'require' in plperl
If the required module is not already loaded then it dies.
So "use strict;" now works in plperl.
- Pre-load the feature module if perl >= 5.10.
So "use feature :5.10;" now works in plperl.
- Stored procedure subs are now given names.
The names are not visible in ordinary use, but they make
tools like Devel::NYTProf and Devel::Cover much more useful.
- Simplified and generalized the subroutine creation code.
Now one code path for generating sub source code, not four.
Can generate multiple 'use' statements with specific imports
(which handles plperl.use_strict currently and can easily
be extended to handle a plperl.use_feature=':5.12' in future).
- Disallows use of Safe version 2.20 which is broken for PL/Perl.
http://rt.perl.org/rt3/Ticket/Display.html?id=72068
- Assorted minor optimizations by pre-growing data structures.
Tom Lane [Tue, 26 Jan 2010 16:33:40 +0000 (16:33 +0000)]
Remove the default_do_language parameter, instead making DO use a hardwired
default of "plpgsql". This is more reasonable than it was when the DO patch
was written, because we have since decided that plpgsql should be installed
by default. Per discussion, having a parameter for this doesn't seem useful
enough to justify the risk of application breakage if the value is changed
unexpectedly.
Peter Eisentraut [Tue, 26 Jan 2010 06:58:39 +0000 (06:58 +0000)]
Reformat the comments in pg_hba.conf and pg_ident.conf
These files have apparently been edited over the years by a dozen people
with as many different editor settings, which made the alignment of the
paragraphs quite inconsistent and ugly. I made a pass of M-q with Emacs
to straighten it out.
Add note that PREPARE TRANSACTION is for transaction managers, not
regular applications. Also add a comment pointing out that tab-complition
for PREPARE TRANSACTION is missing on purpose.
Tom Lane [Mon, 25 Jan 2010 01:58:14 +0000 (01:58 +0000)]
Apply Tcl_Init() to the "hold" interpreter created by pltcl.
You might think this is unnecessary since that interpreter is never used
to run code --- but it turns out that's wrong. As of Tcl 8.5, the "clock"
command (alone among builtin Tcl commands) is partially implemented by
loaded-on-demand Tcl code, which means that it fails if there's not
unknown-command support, and also that it's impossible to run it directly
in a safe interpreter. The way they get around the latter is that
Tcl_CreateSlave() automatically sets up an alias command that forwards any
execution of "clock" in a safe slave interpreter to its parent interpreter.
Thus, when attempting to execute "clock" in trusted pltcl, the command
actually executes in the "hold" interpreter, where it will fail if
unknown-command support hasn't been introduced by sourcing the standard
init.tcl script, which is done by Tcl_Init(). (This is a pretty dubious
design decision on the Tcl boys' part, if you ask me ... but they didn't.)
Back-patch all the way. It's not clear that anyone would try to use ancient
versions of pltcl with a recent Tcl, but it's not clear they wouldn't, either.
Also add a regression test using "clock", in branches that have regression
test support for pltcl.
Joe Conway [Sun, 24 Jan 2010 22:19:38 +0000 (22:19 +0000)]
Rewrite dblink_record_internal() and dblink_fetch() to use a tuplestore
(SFRM_Materialize mode) to return tuples. Since we don't return from the
dblink function in tuplestore mode, release the PGresult with a PG_CATCH
block on error. Also rearrange to share the same code to materialize the
tuplestore. Patch by Takahiro Itagaki.
Tom Lane [Sun, 24 Jan 2010 21:49:17 +0000 (21:49 +0000)]
Fix assorted core dumps and Assert failures that could occur during
AbortTransaction or AbortSubTransaction, when trying to clean up after an
error that prevented (sub)transaction start from completing:
* access to TopTransactionResourceOwner that might not exist
* assert failure in AtEOXact_GUC, if AtStart_GUC not called yet
* assert failure or core dump in AfterTriggerEndSubXact, if
AfterTriggerBeginSubXact not called yet
Per testing by injecting elog(ERROR) at successive steps in StartTransaction
and StartSubTransaction. It's not clear whether all of these cases could
really occur in the field, but at least one of them is easily exposed by
simple stress testing, as per my accidental discovery yesterday.
Tom Lane [Sat, 23 Jan 2010 21:29:00 +0000 (21:29 +0000)]
Insert CHECK_FOR_INTERRUPTS calls into loops in dbsize.c, to ensure that
the various disk-size-reporting functions will respond to query cancel
reasonably promptly even in very large databases. Per report from
Kevin Grittner.
Simon Riggs [Sat, 23 Jan 2010 16:37:12 +0000 (16:37 +0000)]
In HS, Startup process sets SIGALRM when waiting for buffer pin. If
woken by alarm we send SIGUSR1 to all backends requesting that they
check to see if they are blocking Startup process. If so, they throw
ERROR/FATAL as for other conflict resolutions. Deadlock stop gap
removed. max_standby_delay = -1 option removed to prevent deadlock.
Robert Haas [Fri, 22 Jan 2010 16:40:19 +0000 (16:40 +0000)]
Replace ALTER TABLE ... SET STATISTICS DISTINCT with a more general mechanism.
Attributes can now have options, just as relations and tablespaces do, and
the reloptions code is used to parse, validate, and store them. For
simplicity and because these options are not performance critical, we store
them in a separate cache rather than the main relcache.
Michael Meskes [Fri, 22 Jan 2010 14:13:03 +0000 (14:13 +0000)]
Applied patch by Boszormenyi Zoltan <zb@cybertec.at> to fix problem in auto-prepare mode if the connection is closed and re-opened and the previously prepared query is issued again.
Robert Haas [Thu, 21 Jan 2010 14:58:53 +0000 (14:58 +0000)]
Add new escaping functions PQescapeLiteral and PQescapeIdentifier.
PQescapeLiteral is similar to PQescapeStringConn, but it relieves the
caller of the need to know how large the output buffer should be, and
it provides the appropriate quoting (in addition to escaping special
characers within the string). PQescapeIdentifier provides similar
functionality for escaping identifiers.
Simon Riggs [Thu, 21 Jan 2010 00:53:58 +0000 (00:53 +0000)]
Better internal documentation of locking for Hot Standby conflict resolution.
Discuss the reasons for the lock type we hold on ProcArrayLock while deriving
the conflict list. Cover the idea of false positive conflicts and seemingly
strange effects on snapshot derivation.
Tom Lane [Wed, 20 Jan 2010 23:12:03 +0000 (23:12 +0000)]
Well, the systemtap guys moved the goalposts again: with the latest version,
we *must* generate probes.o or the dtrace probes don't work. Revert our
workaround for their previous bug. Details at
https://bugzilla.redhat.com/show_bug.cgi?id=557266
Write a WAL record whenever we perform an operation without WAL-logging
that would've been WAL-logged if archiving was enabled. If we encounter
such records in archive recovery anyway, we know that some data is
missing from the log. A WARNING is emitted in that case.
Original patch by Fujii Masao, with changes by me.
Now that much of walreceiver has been pulled back into the postgres
binary, revert PGDLLIMPORT decoration of global variables. I'm not sure
if there's any real harm from unnecessary PGDLLIMPORTs, but these are all
internal variables that external modules really shouldn't be messing
with. ThisTimeLineID still needs PGDLLIMPORT.
Rethink the way walreceiver is linked into the backend. Instead than shoving
walreceiver as whole into a dynamically loaded module, split the
libpq-specific parts of it into dynamically loaded module and keep the rest
in the main backend binary.
Although Tom fixed the Windows compilation problems with the old walreceiver
module already, this is a cleaner division of labour and makes the code
more readable. There's also the prospect of adding new transport methods
as pluggable modules in the future, which this patch makes easier, though for
now the API between libpqwalreceiver and walreceiver process should be
considered private.
The libpq-specific module is now in src/backend/replication/libpqwalreceiver,
and the part linked with postgres binary is in
src/backend/replication/walreceiver.c.
Peter Eisentraut [Wed, 20 Jan 2010 05:47:09 +0000 (05:47 +0000)]
Before attempting to create a composite type, check whether a type of that
name already exists, so we'd get an error message about a "type" instead
of about a "relation", because the composite type code shares code with
relation creation.
Andrew Dunstan [Wed, 20 Jan 2010 01:08:21 +0000 (01:08 +0000)]
Add utility functions to PLPerl:
quote_literal, quote_nullable, quote_ident,
encode_bytea, decode_bytea, looks_like_number,
encode_array_literal, encode_array_constructor.
Split SPI.xs into two - SPI.xs now contains only SPI functions. Remainder
are in new Util.xs.
Some more code and documentation cleanup along the way, as well as
adding some CVS markers to files missing them.
Original patch from Tim Bunce, with a little editing from me.
Robert Haas [Wed, 20 Jan 2010 00:42:28 +0000 (00:42 +0000)]
Reformat documentation of libpq escaping functions.
Modify the "Escaping Strings for Inclusion in SQL Commands" section
to use a <variablelist> as the preceding and following sections do,
and merge the "Escaping Binary Strings for Inclusion in SQL Commands"
section into it.
This changes only the formatting of these sections, not the content.
It is intended to lay the groundwork for a follow-on patch to add
some new escaping functions, but it makes sense to commit this first,
for clarity.
Tom Lane [Tue, 19 Jan 2010 18:39:19 +0000 (18:39 +0000)]
When doing a parallel restore, we must guard against out-of-range dependency
dump IDs, because the array we're using is sized according to the highest
dump ID actually defined in the archive file. In a partial dump there could
be references to higher dump IDs that weren't dumped. Treat these the same
as references to in-range IDs that weren't dumped. (The whole thing is a
bit scary because the missing objects might have been part of dependency
chains, which we won't know about. Not much we can do though --- throwing
an error is probably overreaction.)
Also, reject parallel restore with pre-1.8 archive version (made by pre-8.0
pg_dump). In these old versions the dependency entries are OIDs, not dump
IDs, and we don't have enough information to interpret them.
Tom Lane [Tue, 19 Jan 2010 16:33:33 +0000 (16:33 +0000)]
Fix thinko in my recent change to put an explicit argisrow field in NullTest:
when the planner splits apart a ROW(...) IS NULL test, the argisrow values
of the component tests have to be determined from the component field types,
not copied from the original NullTest (in which argisrow is surely true).
Tom Lane [Mon, 18 Jan 2010 18:17:45 +0000 (18:17 +0000)]
Fix an oversight in convert_EXISTS_sublink_to_join: we can't convert an
EXISTS that contains a WITH clause. This would usually lead to a
"could not find CTE" error later in planning, because the WITH wouldn't
get processed at all. Noted while playing with an example from Ken Marshall.
Tom Lane [Mon, 18 Jan 2010 02:30:25 +0000 (02:30 +0000)]
Fix portalmem.c to avoid keeping a dangling pointer to a cached plan list
after it's released its reference count for the cached plan. There are
code paths that might try to examine the plan list before noticing that
the portal is already in aborted state. Report and diagnosis by Tatsuo
Ishii, though this isn't exactly his proposed patch.
Tom Lane [Sun, 17 Jan 2010 22:56:23 +0000 (22:56 +0000)]
Improve the handling of SET CONSTRAINTS commands by having them search
pg_constraint before searching pg_trigger. This allows saner handling of
corner cases; in particular we now say "constraint is not deferrable"
rather than "constraint does not exist" when the command is applied to
a constraint that's inherently non-deferrable. Per a gripe several months
ago from hubert depesz lubaczewski.
To make this work without breaking user-defined constraint triggers,
we have to add entries for them to pg_constraint. However, in return
we can remove the pgconstrname column from pg_constraint, which represents
a fairly sizable space savings. I also replaced the tgisconstraint column
with tgisinternal; the old meaning of tgisconstraint can now be had by
testing for nonzero tgconstraint, while there is no other way to get
the old meaning of nonzero tgconstraint, namely that the trigger was
internally generated rather than being user-created.
In passing, fix an old misstatement in the docs and comments, namely that
pg_trigger.tgdeferrable is exactly redundant with pg_constraint.condeferrable.
Actually, we mark RI action triggers as nondeferrable even when they belong to
a nominally deferrable FK constraint. The SET CONSTRAINTS code now relies on
that instead of hard-coding a list of exception OIDs.
Tom Lane [Sat, 16 Jan 2010 19:50:26 +0000 (19:50 +0000)]
Re-order configure tests to reflect the fact that the code generated for
posix_fadvise and other file-related functions can depend on _LARGEFILE_SOURCE
and/or _FILE_OFFSET_BITS. Per report from Robert Treat.
Back-patch to 8.4. This has been wrong all along, but we weren't really using
posix_fadvise in anger before, and AC_FUNC_FSEEKO seems to mask the issue well
enough for that function.
Tom Lane [Sat, 16 Jan 2010 17:39:55 +0000 (17:39 +0000)]
Fix unportable use of isxdigit() with char (rather than unsigned char)
argument, per warnings from buildfarm member pika. Also clean up code
formatting a trifle.
Simon Riggs [Sat, 16 Jan 2010 14:16:31 +0000 (14:16 +0000)]
Lock database while running drop database in Hot Standby to protect
against concurrent reconnection. Failure during testing showed issue
was possible, even though earlier analysis seemed to indicate it
would not be required. Use LockSharedObjectForSession() before
ResolveRecoveryConflictWithDatabase() and hold lock until end of
processing for that WAL record. Simple approach to avoid introducing
further bugs at this stage of development on an improbable issue.
Simon Riggs [Sat, 16 Jan 2010 10:05:59 +0000 (10:05 +0000)]
Teach standby conflict resolution to use SIGUSR1
Conflict reason is passed through directly to the backend, so we can
take decisions about the effect of the conflict based upon the local
state. No specific changes, as yet, though this prepares for later work.
CancelVirtualTransaction() sends signals while holding ProcArrayLock.
Introduce errdetail_abort() to give message detail explaining that the
abort was caused by conflict processing. Remove CONFLICT_MODE states
in favour of using PROCSIG_RECOVERY_CONFLICT states directly, for clarity.
Tom Lane [Sat, 16 Jan 2010 05:52:29 +0000 (05:52 +0000)]
Huh, apparently on cygwin we HAVE_SIGPROCMASK, so both variants of
the BlockSig/UnBlockSig declaration have to be PGDLLIMPORT'ified.
Per buildfarm results.
Tom Lane [Fri, 15 Jan 2010 22:36:35 +0000 (22:36 +0000)]
Do parse analysis of an EXPLAIN's contained statement during the normal
parse analysis phase, rather than at execution time. This makes parameter
handling work the same as it does in ordinary plannable queries, and in
particular fixes the incompatibility that Pavel pointed out with plpgsql's
new handling of variable references. plancache.c gets a little bit
grottier, but the alternatives seem worse.
Move build of src/backend/replication/walreceiver/ later in the build
process, after src/interfaces, because it depends on libpq. Also add
missing lines for clean etc. targets
This includes two new kinds of postmaster processes, walsenders and
walreceiver. Walreceiver is responsible for connecting to the primary server
and streaming WAL to disk, while walsender runs in the primary server and
streams WAL from disk to the client.
Documentation still needs work, but the basics are there. We will probably
pull the replication section to a new chapter later on, as well as the
sections describing file-based replication. But let's do that as a separate
patch, so that it's easier to see what has been added/changed. This patch
also adds a new section to the chapter about FE/BE protocol, documenting the
protocol used by walsender/walreceivxer.
Bump catalog version because of two new functions,
pg_last_xlog_receive_location() and pg_last_xlog_replay_location(), for
monitoring the progress of replication.
Simon Riggs [Thu, 14 Jan 2010 11:08:02 +0000 (11:08 +0000)]
First part of refactoring of code for ResolveRecoveryConflict. Purposes
of this are to centralise the conflict code to allow further change,
as well as to allow passing through the full reason for the conflict
through to the conflicting backends. Backend state alters how we
can handle different types of conflict so this is now required.
As originally suggested by Heikki, no longer optional.
Tom Lane [Thu, 14 Jan 2010 00:14:06 +0000 (00:14 +0000)]
Simplify validate_exec() by using access(2) to check file permissions,
rather than trying to implement the equivalent logic by hand. The motivation
for the original coding appears to have been to check with the effective uid's
permissions not the real uid's; but there is no longer any difference, because
we don't run the postmaster setuid (indeed, main.c enforces that they're the
same). Using access() means we will get it right in situations the original
coding failed to handle, such as ACL-based permissions. Besides it's a lot
shorter, cleaner, and more thread-safe. Per bug #5275 from James Bellinger.
Tom Lane [Wed, 13 Jan 2010 23:07:08 +0000 (23:07 +0000)]
When loading critical system indexes into the relcache, ensure we lock the
underlying catalog not only the index itself. Otherwise, if the cache
load process touches the catalog (which will happen for many though not
all of these indexes), we are locking index before parent table, which can
result in a deadlock against processes that are trying to lock them in the
normal order. Per today's failure on buildfarm member gothic_moth; it's
surprising the problem hadn't been identified before.
Back-patch to 8.2. Earlier releases didn't have the issue because they
didn't try to lock these indexes during load (instead assuming that they
couldn't change schema at all during multiuser operation).
Tom Lane [Wed, 13 Jan 2010 16:56:56 +0000 (16:56 +0000)]
Fix bug #5269: ResetPlanCache mustn't invalidate cached utility statements,
especially not ROLLBACK. ROLLBACK might need to be executed in an already
aborted transaction, when there is no safe way to revalidate the plan. But
in general there's no point in marking utility statements invalid, since
they have no plans in the normal sense of the word; so we might as well
work a bit harder here to avoid future revalidation cycles.