Bruce Momjian [Fri, 5 Feb 2010 23:37:43 +0000 (23:37 +0000)]
Document that archive_timeout will force new WAL files even if a single
checkpoint has happened, and recommend adjusting checkpoint_timeout to
reduce the impact of this.
Joe Conway [Fri, 5 Feb 2010 03:09:05 +0000 (03:09 +0000)]
Modify recently added PQconnectdbParams() with new argument, expand_dbname.
If expand_dbname is non-zero and dbname contains an = sign, it is taken as
a conninfo string in exactly the same way as if it had been passed to
PQconnectdb. This is equivalent to the way PQsetdbLogin() works, allowing
PQconnectdbParams() to be a complete alternative.
Also improve the way the new function is called from psql and replace a
previously missed call to PQsetdbLogin() in psql. Additionally use
PQconnectdbParams() for pg_dump and friends, and the bin/scripts
command line utilities such as vacuumdb, createdb, etc.
Finally, update the documentation for the new parameter, as well as the
nuances of precedence in cases where key words are repeated or duplicated
in the conninfo string.
Tom Lane [Thu, 4 Feb 2010 00:09:14 +0000 (00:09 +0000)]
Restructure CLUSTER/newstyle VACUUM FULL/ALTER TABLE support so that swapping
of old and new toast tables can be done either at the logical level (by
swapping the heaps' reltoastrelid links) or at the physical level (by swapping
the relfilenodes of the toast tables and their indexes). This is necessary
infrastructure for upcoming changes to support CLUSTER/VAC FULL on shared
system catalogs, where we cannot change reltoastrelid. The physical swap
saves a few catalog updates too.
We unfortunately have to keep the logical-level swap logic because in some
cases we will be adding or deleting a toast table, so there's no possibility
of a physical swap. However, that only happens as a consequence of schema
changes in the table, which we do not need to support for system catalogs,
so such cases aren't an obstacle for that.
In passing, refactor the cluster support functions a little bit to eliminate
unnecessarily-duplicated code; and fix the problem that while CLUSTER had
been taught to rename the final toast table at need, ALTER TABLE had not.
Joe Conway [Wed, 3 Feb 2010 23:01:11 +0000 (23:01 +0000)]
Check to ensure the number of primary key fields supplied does not
exceed the total number of non-dropped source table fields for
dblink_build_sql_*(). Addresses bug report from Rushabh Lathia.
Move the responsibility of writing a "unlogged WAL operation" record from
heap_sync() to the callers, because heap_sync() is sometimes called even
if the operation itself is WAL-logged. This eliminates the bogus unlogged
records from CLUSTER that Simon Riggs reported, patch by Fujii Masao.
Add a message type header to the CopyData messages sent from primary
to standby in streaming replication. While we only have one message type
at the moment, adding a message type header makes this easier to extend.
Tom Lane [Wed, 3 Feb 2010 03:21:25 +0000 (03:21 +0000)]
Fix timing-sensitive regression test result I just created :-( --- the
DROP USER at the end of the cluster.sql test could fail, if the temp
table created in the previous session hadn't finished getting dropped.
Unluckily, I didn't see this in several repetitions of the parallel
regression tests, but it's popping up on quite a few buildfarm machines.
Tom Lane [Wed, 3 Feb 2010 01:14:17 +0000 (01:14 +0000)]
Assorted cleanups in preparation for using a map file to support altering
the relfilenode of currently-not-relocatable system catalogs.
1. Get rid of inval.c's dependency on relfilenode, by not having it emit
smgr invalidations as a result of relcache flushes. Instead, smgr sinval
messages are sent directly from smgr.c when an actual relation delete or
truncate is done. This makes considerably more structural sense and allows
elimination of a large number of useless smgr inval messages that were
formerly sent even in cases where nothing was changing at the
physical-relation level. Note that this reintroduces the concept of
nontransactional inval messages, but that's okay --- because the messages
are sent by smgr.c, they will be sent in Hot Standby slaves, just from a
lower logical level than before.
2. Move setNewRelfilenode out of catalog/index.c, where it never logically
belonged, into relcache.c; which is a somewhat debatable choice as well but
better than before. (I considered catalog/storage.c, but that seemed too
low level.) Rename to RelationSetNewRelfilenode.
3. Cosmetic cleanups of some other relfilenode manipulations.
Tom Lane [Tue, 2 Feb 2010 19:12:29 +0000 (19:12 +0000)]
CLUSTER specified the wrong namespace when renaming toast tables of temporary
relations (they don't live in pg_toast). This caused an Assert failure in
assert-enabled builds. So far as I can see, in a non-assert build it would
only have messed up the checks for conflicting names, so a failure would be
quite improbable but perhaps not impossible.
Robert Haas [Tue, 2 Feb 2010 18:52:33 +0000 (18:52 +0000)]
Fold FindConversion() into FindConversionByName() and remove ACL check.
All callers of FindConversionByName() already do suitable permissions
checking already apart from this function, but this is not just dead
code removal: the unnecessary permissions check can actually lead to
spurious failures - there's no reason why inability to execute the
underlying function should prohibit renaming the conversion, for example.
(The error messages in these cases were also rather poor:
FindConversion would return InvalidOid, eventually leading to a complaint
that the conversion "did not exist", which was not correct.)
Tom Lane [Tue, 2 Feb 2010 18:16:10 +0000 (18:16 +0000)]
The particular table names used in the new inheritance regression test are
prone to sort differently in different locales, as seen in buildfarm results.
Let's cast to name not text to avoid that.
Robert Haas [Mon, 1 Feb 2010 19:28:56 +0000 (19:28 +0000)]
Tighten integrity checks on ALTER TABLE ... ALTER COLUMN ... RENAME.
When a column is renamed, we recursively rename the same column in
all descendent tables. But if one of those tables also inherits that
column from a table outside the inheritance hierarchy rooted at the
named table, we must throw an error. The previous coding correctly
prohibited the rename when the parent had inherited the column from
elsewhere, but overlooked the case where the parent was OK but a child
table also inherited the same column from a second, unrelated parent.
For now, not backpatched due to lack of complaints from the field.
KaiGai Kohei, with further changes by me.
Reviewed by Bernd Helme and Tom Lane.
Robert Haas [Mon, 1 Feb 2010 15:43:36 +0000 (15:43 +0000)]
Augment EXPLAIN output with more details on Hash nodes.
We show the number of buckets, the number of batches (and also the original
number if it has changed), and the peak space used by the hash table. Minor
executor changes to track peak space used.
Add string_agg aggregate functions. The one argument version concatenates
the input values into a string. The two argument version also does the same
thing, but inserts delimiters between elements.
Original patch by Pavel Stehule, reviewed by David E. Wheeler and me.
Tom Lane [Mon, 1 Feb 2010 02:45:29 +0000 (02:45 +0000)]
Change regexp engine's ccondissect/crevdissect routines to perform DFA
matching before recursing instead of after. The DFA match eliminates
unworkable midpoint choices a lot faster than the recursive check, in most
cases, so doing it first can speed things up; particularly in pathological
cases such as recently exhibited by Michael Glaesemann.
In addition, apply some cosmetic changes that were applied upstream (in the
Tcl project) at the same time, in order to sync with upstream version 1.15
of regexec.c.
Upstream apparently intends to backpatch this, so I will too. The
pathological behavior could be unpleasant if encountered in the field,
which seems to justify any risk of introducing new bugs.
Tom Lane, reviewed by Donal K. Fellows of Tcl project
Simon Riggs [Sun, 31 Jan 2010 19:01:11 +0000 (19:01 +0000)]
Detect early deadlock in Hot Standby when Startup is already waiting. First
stage of required deadlock detection to allow re-enabling max_standby_delay
setting of -1, which is now essential in the absence of improved relation-
specific conflict resoluton. Requested by Greg Stark et al.
Tom Lane [Sun, 31 Jan 2010 18:15:39 +0000 (18:15 +0000)]
Fix memory leak created by deferrable-index-constraints patches.
We need to free the OID list returned by ExecInsertIndexTuples to avoid
a query-lifespan memory leak. When many rows require rechecking, this
can be a significant leak --- it's even more than the space used for the
queued trigger events.
Magnus Hagander [Sun, 31 Jan 2010 17:16:23 +0000 (17:16 +0000)]
Fix race condition in win32 signal handling.
There was a race condition where the receiving pipe could be closed by the
child thread if the main thread was pre-empted before it got a chance to
create a new one, and the dispatch thread ran to completion during that time.
One symptom of this is that rows in pg_listener could be dropped under
heavy load.
Analysis and original patch by Radu Ilie, with some small
modifications by Magnus Hagander.
Tom Lane [Sat, 30 Jan 2010 20:09:53 +0000 (20:09 +0000)]
Avoid performing encoding conversion on command tag strings during EndCommand.
Since all current and foreseeable future command tags will be pure ASCII,
there is no need to do conversion on them. This saves a few cycles and also
avoids polluting otherwise-pristine subtransaction memory contexts, which
is the cause of the backend memory leak exhibited in bug #5302. (Someday
we'll probably want to have a better method of determining whether
subtransaction contexts need to be kept around, but today is not that day.)
Backpatch to 8.0. The cycle-shaving aspect of this would work in 7.4
too, but without subtransactions the memory-leak aspect doesn't apply,
so it doesn't seem worth touching 7.4.
Tom Lane [Sat, 30 Jan 2010 18:59:51 +0000 (18:59 +0000)]
Fix memory leakage introduced into print_aligned_text by 8.4 changes
(failure to free col_lineptrs[] array elements) and exacerbated in the
current devel cycle (failure to free "wrap"). This resulted in moderate
bloat of psql over long script runs. Noted while testing bug #5302,
although what the reporter was complaining of was backend-side leakage.
Andrew Dunstan [Sat, 30 Jan 2010 01:46:57 +0000 (01:46 +0000)]
Add plperl.on_perl_init setting to provide for initializing the perl library on load. Also, handle END blocks in plperl.
Database access is disallowed during both these operations, although it might be allowed in END blocks in future.
Simon Riggs [Fri, 29 Jan 2010 19:45:12 +0000 (19:45 +0000)]
Adjust GetLockConflicts() so that it uses TopMemoryContext when
executed InHotStandby. Cleaner solution than using malloc or palloc
depending upon situation, as proposed by Tom.
Simon Riggs [Fri, 29 Jan 2010 18:39:05 +0000 (18:39 +0000)]
Augment WAL records for btree delete with GetOldestXmin() to reduce
false positives during Hot Standby conflict processing. Simple
patch to enhance conflict processing, following previous discussions.
Controlled by parameter minimize_standby_conflicts = on | off, with
default off allows measurement of performance impact to see whether
it should be set on all the time.
Simon Riggs [Fri, 29 Jan 2010 17:10:05 +0000 (17:10 +0000)]
Filter recovery conflicts based upon dboid from relfilenode of WAL
records for heap and btree. Minor change, mostly API changes to
pass through the required values. This is a simple change though
also provides the refactoring required for further enhancements
to conflict processing using the relOid. Changes only have effect
during Hot Standby.
Andrew Dunstan [Thu, 28 Jan 2010 23:59:52 +0000 (23:59 +0000)]
Add new make targets "world", "install-world" and "installcheck-world" to build, install and check just about everything.
In addition to everything built installed and tested by all, install and installcheck targets, these build HTML Docs,
build and test contrib, and test PLs and ECPG.
Simon Riggs [Thu, 28 Jan 2010 10:05:37 +0000 (10:05 +0000)]
Use malloc() in GetLockConflicts() when called InHotStandby to avoid repeated
palloc calls. Current code assumed this was already true, so this is a bug fix.
Change a few remaining calls of XLogArchivingActive() to use
XLogIsNeeded() instead, to determine if an otherwise non-logged operation
needs to be logged in WAL for standby servers.
Joe Conway [Thu, 28 Jan 2010 06:28:26 +0000 (06:28 +0000)]
Introduce two new libpq connection functions, PQconnectdbParams and
PQconnectStartParams. These are analogous to PQconnectdb and PQconnectStart
respectively. They differ from the legacy functions in that they accept
two NULL-terminated arrays, keywords and values, rather than conninfo
strings. This avoids the need to build the conninfo string in cases
where it might be inconvenient to do so. Includes documentation.
Also modify psql to utilize PQconnectdbParams rather than PQsetdbLogin.
This allows the new config parameter application_name to be set, which
in turn is displayed in the pg_stat_activity view and included in CSV
log entries. This will also ensure both new functions get regularly
exercised.
Patch by Guillaume Lelarge with review and minor adjustments by
Joe Conway.
Fix bug in wasender's xlogid boundary handling, reported by Erik Rijkers.
LogwrtRqst.Write can be set to non-existent FF log segment, we mustn't
try to send that in XLogSend().
Also fix similar bug in ReadRecord(), which I just introduced in the
ReadRecord() refactoring patch.
Make standby server continuously retry restoring the next WAL segment with
restore_command, if the connection to the primary server is lost. This
ensures that the standby can recover automatically, if the connection is
lost for a long time and standby falls behind so much that the required
WAL segments have been archived and deleted in the master.
This also makes standby_mode useful without streaming replication; the
server will keep retrying restore_command every few seconds until the
trigger file is found. That's the same basic functionality pg_standby
offers, but without the bells and whistles.
To implement that, refactor the ReadRecord/FetchRecord functions. The
FetchRecord() function introduced in the original streaming replication
patch is removed, and all the retry logic is now in a new function called
XLogReadPage(). XLogReadPage() is now responsible for executing
restore_command, launching walreceiver, and waiting for new WAL to arrive
from primary, as required.
This also changes the life cycle of walreceiver. When launched, it now only
tries to connect to the master once, and exits if the connection fails, or
is lost during streaming for any reason. The startup process detects the
death, and re-launches walreceiver if necessary.
Andrew Dunstan [Tue, 26 Jan 2010 23:11:56 +0000 (23:11 +0000)]
Various small improvements and cleanups for PL/Perl.
- Allow (ineffective) use of 'require' in plperl
If the required module is not already loaded then it dies.
So "use strict;" now works in plperl.
- Pre-load the feature module if perl >= 5.10.
So "use feature :5.10;" now works in plperl.
- Stored procedure subs are now given names.
The names are not visible in ordinary use, but they make
tools like Devel::NYTProf and Devel::Cover much more useful.
- Simplified and generalized the subroutine creation code.
Now one code path for generating sub source code, not four.
Can generate multiple 'use' statements with specific imports
(which handles plperl.use_strict currently and can easily
be extended to handle a plperl.use_feature=':5.12' in future).
- Disallows use of Safe version 2.20 which is broken for PL/Perl.
http://rt.perl.org/rt3/Ticket/Display.html?id=72068
- Assorted minor optimizations by pre-growing data structures.
Tom Lane [Tue, 26 Jan 2010 16:33:40 +0000 (16:33 +0000)]
Remove the default_do_language parameter, instead making DO use a hardwired
default of "plpgsql". This is more reasonable than it was when the DO patch
was written, because we have since decided that plpgsql should be installed
by default. Per discussion, having a parameter for this doesn't seem useful
enough to justify the risk of application breakage if the value is changed
unexpectedly.
Peter Eisentraut [Tue, 26 Jan 2010 06:58:39 +0000 (06:58 +0000)]
Reformat the comments in pg_hba.conf and pg_ident.conf
These files have apparently been edited over the years by a dozen people
with as many different editor settings, which made the alignment of the
paragraphs quite inconsistent and ugly. I made a pass of M-q with Emacs
to straighten it out.
Add note that PREPARE TRANSACTION is for transaction managers, not
regular applications. Also add a comment pointing out that tab-complition
for PREPARE TRANSACTION is missing on purpose.
Tom Lane [Mon, 25 Jan 2010 01:58:14 +0000 (01:58 +0000)]
Apply Tcl_Init() to the "hold" interpreter created by pltcl.
You might think this is unnecessary since that interpreter is never used
to run code --- but it turns out that's wrong. As of Tcl 8.5, the "clock"
command (alone among builtin Tcl commands) is partially implemented by
loaded-on-demand Tcl code, which means that it fails if there's not
unknown-command support, and also that it's impossible to run it directly
in a safe interpreter. The way they get around the latter is that
Tcl_CreateSlave() automatically sets up an alias command that forwards any
execution of "clock" in a safe slave interpreter to its parent interpreter.
Thus, when attempting to execute "clock" in trusted pltcl, the command
actually executes in the "hold" interpreter, where it will fail if
unknown-command support hasn't been introduced by sourcing the standard
init.tcl script, which is done by Tcl_Init(). (This is a pretty dubious
design decision on the Tcl boys' part, if you ask me ... but they didn't.)
Back-patch all the way. It's not clear that anyone would try to use ancient
versions of pltcl with a recent Tcl, but it's not clear they wouldn't, either.
Also add a regression test using "clock", in branches that have regression
test support for pltcl.
Joe Conway [Sun, 24 Jan 2010 22:19:38 +0000 (22:19 +0000)]
Rewrite dblink_record_internal() and dblink_fetch() to use a tuplestore
(SFRM_Materialize mode) to return tuples. Since we don't return from the
dblink function in tuplestore mode, release the PGresult with a PG_CATCH
block on error. Also rearrange to share the same code to materialize the
tuplestore. Patch by Takahiro Itagaki.
Tom Lane [Sun, 24 Jan 2010 21:49:17 +0000 (21:49 +0000)]
Fix assorted core dumps and Assert failures that could occur during
AbortTransaction or AbortSubTransaction, when trying to clean up after an
error that prevented (sub)transaction start from completing:
* access to TopTransactionResourceOwner that might not exist
* assert failure in AtEOXact_GUC, if AtStart_GUC not called yet
* assert failure or core dump in AfterTriggerEndSubXact, if
AfterTriggerBeginSubXact not called yet
Per testing by injecting elog(ERROR) at successive steps in StartTransaction
and StartSubTransaction. It's not clear whether all of these cases could
really occur in the field, but at least one of them is easily exposed by
simple stress testing, as per my accidental discovery yesterday.
Tom Lane [Sat, 23 Jan 2010 21:29:00 +0000 (21:29 +0000)]
Insert CHECK_FOR_INTERRUPTS calls into loops in dbsize.c, to ensure that
the various disk-size-reporting functions will respond to query cancel
reasonably promptly even in very large databases. Per report from
Kevin Grittner.
Simon Riggs [Sat, 23 Jan 2010 16:37:12 +0000 (16:37 +0000)]
In HS, Startup process sets SIGALRM when waiting for buffer pin. If
woken by alarm we send SIGUSR1 to all backends requesting that they
check to see if they are blocking Startup process. If so, they throw
ERROR/FATAL as for other conflict resolutions. Deadlock stop gap
removed. max_standby_delay = -1 option removed to prevent deadlock.
Robert Haas [Fri, 22 Jan 2010 16:40:19 +0000 (16:40 +0000)]
Replace ALTER TABLE ... SET STATISTICS DISTINCT with a more general mechanism.
Attributes can now have options, just as relations and tablespaces do, and
the reloptions code is used to parse, validate, and store them. For
simplicity and because these options are not performance critical, we store
them in a separate cache rather than the main relcache.