Simon Riggs [Sat, 16 Jan 2010 14:16:31 +0000 (14:16 +0000)]
Lock database while running drop database in Hot Standby to protect
against concurrent reconnection. Failure during testing showed issue
was possible, even though earlier analysis seemed to indicate it
would not be required. Use LockSharedObjectForSession() before
ResolveRecoveryConflictWithDatabase() and hold lock until end of
processing for that WAL record. Simple approach to avoid introducing
further bugs at this stage of development on an improbable issue.
Simon Riggs [Sat, 16 Jan 2010 10:05:59 +0000 (10:05 +0000)]
Teach standby conflict resolution to use SIGUSR1
Conflict reason is passed through directly to the backend, so we can
take decisions about the effect of the conflict based upon the local
state. No specific changes, as yet, though this prepares for later work.
CancelVirtualTransaction() sends signals while holding ProcArrayLock.
Introduce errdetail_abort() to give message detail explaining that the
abort was caused by conflict processing. Remove CONFLICT_MODE states
in favour of using PROCSIG_RECOVERY_CONFLICT states directly, for clarity.
Tom Lane [Sat, 16 Jan 2010 05:52:29 +0000 (05:52 +0000)]
Huh, apparently on cygwin we HAVE_SIGPROCMASK, so both variants of
the BlockSig/UnBlockSig declaration have to be PGDLLIMPORT'ified.
Per buildfarm results.
Tom Lane [Fri, 15 Jan 2010 22:36:35 +0000 (22:36 +0000)]
Do parse analysis of an EXPLAIN's contained statement during the normal
parse analysis phase, rather than at execution time. This makes parameter
handling work the same as it does in ordinary plannable queries, and in
particular fixes the incompatibility that Pavel pointed out with plpgsql's
new handling of variable references. plancache.c gets a little bit
grottier, but the alternatives seem worse.
Move build of src/backend/replication/walreceiver/ later in the build
process, after src/interfaces, because it depends on libpq. Also add
missing lines for clean etc. targets
This includes two new kinds of postmaster processes, walsenders and
walreceiver. Walreceiver is responsible for connecting to the primary server
and streaming WAL to disk, while walsender runs in the primary server and
streams WAL from disk to the client.
Documentation still needs work, but the basics are there. We will probably
pull the replication section to a new chapter later on, as well as the
sections describing file-based replication. But let's do that as a separate
patch, so that it's easier to see what has been added/changed. This patch
also adds a new section to the chapter about FE/BE protocol, documenting the
protocol used by walsender/walreceiver.
Bump catalog version because of two new functions,
pg_last_xlog_receive_location() and pg_last_xlog_replay_location(), for
monitoring the progress of replication.
Simon Riggs [Thu, 14 Jan 2010 11:08:02 +0000 (11:08 +0000)]
First part of refactoring of code for ResolveRecoveryConflict. Purposes
of this are to centralise the conflict code to allow further change,
as well as to allow passing the full reason for the conflict
through to the conflicting backends. Backend state alters how we
can handle different types of conflict so this is now required.
As originally suggested by Heikki, no longer optional.
Tom Lane [Thu, 14 Jan 2010 00:14:06 +0000 (00:14 +0000)]
Simplify validate_exec() by using access(2) to check file permissions,
rather than trying to implement the equivalent logic by hand. The motivation
for the original coding appears to have been to check with the effective uid's
permissions not the real uid's; but there is no longer any difference, because
we don't run the postmaster setuid (indeed, main.c enforces that they're the
same). Using access() means we will get it right in situations the original
coding failed to handle, such as ACL-based permissions. Besides it's a lot
shorter, cleaner, and more thread-safe. Per bug #5275 from James Bellinger.
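A minimal sketch of the access(2)-based approach described above. The
helper name check_executable() and its return codes are assumptions made
for illustration; this is not the actual validate_exec() code.

```c
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL's validate_exec):
 *   0  path is a regular file that we may read and execute
 *  -1  path is missing or is not a regular file
 *  -2  path exists but the permission checks failed
 */
int
check_executable(const char *path)
{
    struct stat st;

    if (stat(path, &st) < 0 || !S_ISREG(st.st_mode))
        return -1;

    /*
     * Let the kernel decide.  access() checks with the real uid, which is
     * fine when real and effective uid are the same, and it honors
     * mechanisms such as ACLs that hand-written mode-bit tests miss.
     */
    if (access(path, R_OK) != 0 || access(path, X_OK) != 0)
        return -2;

    return 0;
}
```

A caller would treat any nonzero result as "do not use this binary".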
Tom Lane [Wed, 13 Jan 2010 23:07:08 +0000 (23:07 +0000)]
When loading critical system indexes into the relcache, ensure we lock the
underlying catalog not only the index itself. Otherwise, if the cache
load process touches the catalog (which will happen for many though not
all of these indexes), we are locking index before parent table, which can
result in a deadlock against processes that are trying to lock them in the
normal order. Per today's failure on buildfarm member gothic_moth; it's
surprising the problem hadn't been identified before.
Back-patch to 8.2. Earlier releases didn't have the issue because they
didn't try to lock these indexes during load (instead assuming that they
couldn't change schema at all during multiuser operation).
Tom Lane [Wed, 13 Jan 2010 16:56:56 +0000 (16:56 +0000)]
Fix bug #5269: ResetPlanCache mustn't invalidate cached utility statements,
especially not ROLLBACK. ROLLBACK might need to be executed in an already
aborted transaction, when there is no safe way to revalidate the plan. But
in general there's no point in marking utility statements invalid, since
they have no plans in the normal sense of the word; so we might as well
work a bit harder here to avoid future revalidation cycles.
Michael Meskes [Wed, 13 Jan 2010 08:41:50 +0000 (08:41 +0000)]
Fix SQL3 type return value.
For non-SQL3 types ecpg used to return -Oid. This will break if there are
enough Oids to fill the namespace. Therefore we play it safe and return 0 if
there is no Oid->SQL3 type mapping available.
Tom Lane [Wed, 13 Jan 2010 01:17:07 +0000 (01:17 +0000)]
Make fixed_paramref_hook behave properly when there are 'unused' slots
in the parameter array. Noted while experimenting with an example
from Pavel. This wouldn't come up in normal use, but it ought to honor
the specification that a parameter array can have unused slots.
Tom Lane [Tue, 12 Jan 2010 18:12:18 +0000 (18:12 +0000)]
Fix relcache reload mechanism to be more robust in the face of errors
occurring during a reload, such as query-cancel. Instead of zeroing out
an existing relcache entry and rebuilding it in place, build a new relcache
entry, then swap its contents with the old one, then free the new entry.
This avoids problems with code believing that a previously obtained pointer
to a cache entry must still reference a valid entry, as seen in recent
failures on buildfarm member jaguar. (jaguar is using CLOBBER_CACHE_ALWAYS
which raises the probability of failure substantially, but the problem
could occur in the field without that.) The previous design was okay
when it was made, but subtransactions and the ResourceOwner mechanism
make it unsafe now.
Also, make more use of the already existing rd_isvalid flag, so that we
remember that the entry requires rebuilding even if the first attempt fails.
Back-patch as far as 8.2. Prior versions have enough issues around relcache
reload anyway (due to inadequate locking) that fixing this one doesn't seem
worthwhile.
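The build-then-swap idea can be sketched as follows. CacheEntry,
build_entry(), free_entry() and rebuild_entry() are hypothetical names,
and a NULL payload stands in for an error during the rebuild; the real
relcache code is considerably more involved.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a cache entry; not the real RelationData. */
typedef struct CacheEntry
{
    int   id;
    int   is_valid;
    char *payload;
} CacheEntry;

/* Build a fresh entry; a NULL payload simulates a failure mid-build. */
CacheEntry *
build_entry(int id, const char *payload)
{
    CacheEntry *e;
    size_t      n;

    if (payload == NULL)
        return NULL;
    e = malloc(sizeof(CacheEntry));
    if (e == NULL)
        return NULL;
    n = strlen(payload) + 1;
    e->payload = malloc(n);
    if (e->payload == NULL)
    {
        free(e);
        return NULL;
    }
    memcpy(e->payload, payload, n);
    e->id = id;
    e->is_valid = 1;
    return e;
}

void
free_entry(CacheEntry *e)
{
    if (e != NULL)
    {
        free(e->payload);
        free(e);
    }
}

/*
 * Rebuild "old" without ever leaving it zeroed out: build a new entry,
 * swap the contents, then free the new struct (which now holds the old
 * contents).  Pointers to "old" held elsewhere stay usable throughout.
 */
int
rebuild_entry(CacheEntry *old, const char *new_payload)
{
    CacheEntry *fresh;
    CacheEntry  tmp;

    old->is_valid = 0;          /* remember a rebuild is still needed */
    fresh = build_entry(old->id, new_payload);
    if (fresh == NULL)
        return -1;              /* old contents remain intact */

    tmp = *old;
    *old = *fresh;
    *fresh = tmp;
    free_entry(fresh);
    return 0;
}
```

If the build fails, the entry keeps its previous contents and stays
flagged as invalid, so a later access can retry the rebuild.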
Bruce Momjian [Tue, 12 Jan 2010 02:42:52 +0000 (02:42 +0000)]
Place tablespace directories in their own subdirectory so pg_migrator
can upgrade clusters without renaming the tablespace directories. New
directory structure format is, e.g.:
Tom Lane [Mon, 11 Jan 2010 18:39:32 +0000 (18:39 +0000)]
Add some simple support and documentation for using process-specific oom_adj
settings to prevent the postmaster from being OOM-killed on Linux systems.
Tom Lane [Mon, 11 Jan 2010 15:31:04 +0000 (15:31 +0000)]
Improve ExecEvalVar's handling of whole-row variables in cases where the
rowtype contains dropped columns. Sometimes the input tuple will be formed
from a select targetlist in which dropped columns are filled with a NULL
of an arbitrary type (the planner typically uses INT4, since it can't tell
what type the dropped column really was). So we need to relax the rowtype
compatibility check to not insist on physical compatibility if the actual
column value is NULL.
In principle we might need to do this for functions returning composite
types, too (see tupledesc_match()). In practice there doesn't seem to be
a bug there, probably because the function will be using the same cached
rowtype descriptor as the caller. Fixing that code path would require
significant rearrangement, so I left it alone for now.
Tom Lane [Sun, 10 Jan 2010 17:56:50 +0000 (17:56 +0000)]
Improve plpgsql parsing to report "foo is not a known variable", rather than a
generic syntax error, when seeing "foo := something" and foo isn't recognized.
This buys back most of the helpfulness discarded in my previous patch by not
throwing errors when a qualified name appears to match a row variable but the
last component doesn't match any field of the row. It covers other cases
where our error messages left something to be desired, too.
Tom Lane [Sun, 10 Jan 2010 17:15:18 +0000 (17:15 +0000)]
Improve plpgsql's handling of record field references by forcing all potential
field references in SQL expressions to have RECFIELD datum-array entries at
parse time. If it turns out that the reference is actually to a SQL column,
the RECFIELD entry is useless, but it costs little. This allows us to get rid
of the previous use of FieldSelect applied to a whole-row Param for the record
variable; which was not only slower than a direct RECFIELD reference, but
failed for references to system columns of a trigger's NEW or OLD record.
Per report and fix suggestion from Dean Rasheed.
Magnus Hagander [Sun, 10 Jan 2010 15:54:11 +0000 (15:54 +0000)]
Update Windows installation notes.
pginstaller isn't used anymore, in favor of the one-click installers.
Make it clear that we support Windows 2000 and newer with the native
port, instead of first saying we support NT4 and then saying we don't.
Simon Riggs [Sun, 10 Jan 2010 15:44:28 +0000 (15:44 +0000)]
During Hot Standby, fix drop database when sessions are idle.
Previously we only cancelled sessions that were in-transaction.
Simple fix is to just cancel all sessions without waiting. Doing
it this way avoids complicating common code paths, which would
not be worth the trouble to cover this rare case.
Problem report and fix by Andres Freund, edited somewhat by me
Robert Haas [Sun, 10 Jan 2010 04:26:36 +0000 (04:26 +0000)]
Remove partial, broken support for NULL pointers when fetching attributes.
Previously, fastgetattr() and heap_getattr() tested their fourth argument
against a null pointer, but any attempt to use them with a literal-NULL
fourth argument evaluated to *(void *)0, resulting in a compiler error.
Remove these NULL tests to avoid leading future readers of this code to
believe that this has a chance of working. Also clean up related legacy
code in nocachegetattr(), heap_getsysattr(), and nocache_index_getattr().
The new coding standard is that any code which calls a getattr-type
function or macro which takes an isnull argument MUST pass a valid
boolean pointer. Per discussion with Bruce Momjian, Tom Lane, Alvaro
Herrera.
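A small illustration of the stated convention, using a hypothetical Row
type and row_getattr() function rather than the real heap tuple macros:
the isnull argument is written unconditionally, so callers must always
supply a valid pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical row representation, for illustration only. */
typedef struct Row
{
    int         nattrs;
    const int  *values;
    const bool *nulls;
} Row;

/*
 * Fetch attribute "attnum" (1-based).  There is deliberately no
 * "isnull == NULL" special case: the flag is always stored.
 */
int
row_getattr(const Row *row, int attnum, bool *isnull)
{
    assert(isnull != NULL);
    assert(attnum >= 1 && attnum <= row->nattrs);

    *isnull = row->nulls[attnum - 1];
    return *isnull ? 0 : row->values[attnum - 1];
}
```

Callers that do not care about nullness still declare a local bool and
pass its address.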
Tom Lane [Sat, 9 Jan 2010 20:46:19 +0000 (20:46 +0000)]
Make ExecEvalFieldSelect throw a more intelligible error if it's asked to
extract a system column, and remove a couple of lines that are useless
in light of the fact that we aren't ever going to support this case. There
isn't much point in trying to make this work because a tuple Datum does
not carry many of the system columns. Per experimentation with a case
reported by Dean Rasheed; we'll have to fix his problem somewhere else.
Simon Riggs [Sat, 9 Jan 2010 16:49:27 +0000 (16:49 +0000)]
During Hot Standby, set DatabasePath correctly during relcache init file
deletion, so that we attempt to unlink the correct filepath. unlink()
errors are ignorable there, so lack of a DatabasePath initialization step
did not cause visible problems until a related bug showed up on Solaris.
Code refactored from xact_redo_commit() to
ProcessCommittedInvalidationMessages() in inval.c. Recovery may replay
shared invalidation messages for many databases, so we cannot
SetDatabasePath() once as we do in normal backends. Read the databaseid
from the shared invalidation messages, then set DatabasePath
temporarily before calling RelationCacheInitFileInvalidate().
Problem report by Robert Treat, analysis and fix by me.
Andrew Dunstan [Sat, 9 Jan 2010 15:25:41 +0000 (15:25 +0000)]
Provide regression testing for plperlu, and for plperl+plperlu interaction.
The latter are only run if the platform can run both interpreters in the
same backend.
Andrew Dunstan [Sat, 9 Jan 2010 02:40:50 +0000 (02:40 +0000)]
Tidy up and refactor plperl.c.
- Changed MULTIPLICITY check from runtime to compiletime.
No longer loads the large Config module.
- Changed plperl_init_interp() to return new interp
and not alter the global interp_state
- Moved plperl_safe_init() call into check_interp().
- Removed plperl_safe_init_done state variable
as interp_state now covers that role.
- Changed plperl_create_sub() to take a plperl_proc_desc argument.
- Simplified return value handling in plperl_create_sub.
- Changed perl.com link in the docs to perl.org and tweaked
wording to clarify that require, not use, is what's blocked.
- Moved perl code in large multi-line C string literal macros
out to plc_*.pl files.
- Added a text2macro.pl utility to convert the plc_*.pl files to
macros in a perlchunks.h file which is #included
- Simplified plperl_safe_init() slightly
- Optimized pg_verifymbstr calls to avoid unneeded strlen()s.
Tom Lane [Fri, 8 Jan 2010 02:44:00 +0000 (02:44 +0000)]
Fix oversight in EvalPlanQualFetch: after failing to lock a tuple because
someone else has just updated it, we have to set priorXmax to that tuple's
xmax (ie, the XID of the other xact that updated it) before looping back to
examine the next tuple. Obviously, the next tuple in the update chain should
have that XID as its xmin, not the same xmin as the preceding tuple that we
had been trying to lock. The mismatch would cause the EvalPlanQual logic to
decide that the tuple chain ended in a deletion, when actually there was a
live tuple that should have been found.
I inserted this error when recently adding logic to EvalPlanQual to make it
lock tuples before returning them (as opposed to the old method in which the
lock would occur much later, causing a great deal of work to be wasted if we
only then discover someone else updated it). Sigh. Per today's report from
Takahiro Itagaki of inconsistent results during pgbench runs.
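A heavily simplified sketch of the update-chain walk described above.
TupleVersion, INVALID_XID and find_live_version() are hypothetical
names; the real EvalPlanQualFetch also deals with buffers, snapshots
and actual tuple locking.

```c
#include <stddef.h>

typedef unsigned int Xid;
#define INVALID_XID ((Xid) 0)

/* Hypothetical model of one version in a tuple update chain. */
typedef struct TupleVersion
{
    Xid xmin;                           /* xact that created this version */
    Xid xmax;                           /* xact that replaced it, or INVALID_XID */
    const struct TupleVersion *next;    /* newer version, if any */
} TupleVersion;

/*
 * Follow the chain until a live version is found.  Each version's xmin
 * must equal the xmax of the version we came from; otherwise the chain
 * is treated as having ended in a deletion.
 */
const TupleVersion *
find_live_version(const TupleVersion *tup, Xid prior_xmax)
{
    while (tup != NULL)
    {
        if (tup->xmin != prior_xmax)
            return NULL;                /* not the successor we expected */

        if (tup->xmax == INVALID_XID)
            return tup;                 /* live version: this one is locked */

        /*
         * Someone else updated this version.  Advance prior_xmax to its
         * xmax BEFORE looping, which is the step the fix adds.
         */
        prior_xmax = tup->xmax;
        tup = tup->next;
    }
    return NULL;                        /* chain ended in a deletion */
}
```

Without the prior_xmax assignment, the second version in any chain fails
the xmin test and a live tuple is wrongly reported as deleted.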
This uses the same infrastructure as EXPLAIN BUFFERS to support
{shared|local}_blks_{hit|read|written} and temp_blks_{read|written}
columns in the pg_stat_statements view. The dumped file format
is also updated.
Tom Lane [Thu, 7 Jan 2010 19:53:11 +0000 (19:53 +0000)]
Make bit/varbit substring() treat any negative length as meaning "all the rest
of the string". The previous coding treated only -1 that way, and would
produce an invalid result value for other negative values.
We ought to fix it so that 2-parameter bit substring() is a different C
function and the 3-parameter form throws error for negative length, but
that takes a pg_proc change which is impractical in the back branches;
and in any case somebody might be relying on -1 working this way.
So just do this as a back-patchable fix.
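The length handling can be sketched like this. substring_span() is a
hypothetical helper that works on item positions in general rather than
on the bit-string representation itself.

```c
/*
 * For substring(str FROM start FOR length) on a string of "len" items,
 * return how many items are selected and store the 0-based offset of the
 * first one in *first.  Any negative length means "all the rest of the
 * string".  Hypothetical helper, not the actual bit/varbit code.
 */
int
substring_span(int len, int start, int length, int *first)
{
    long long s1 = (start > 1) ? start : 1;     /* 1-based, clamped */
    long long e1;                               /* 1-based, one past end */

    if (length < 0)
        e1 = (long long) len + 1;
    else
    {
        e1 = (long long) start + length;        /* wide type avoids overflow */
        if (e1 > (long long) len + 1)
            e1 = (long long) len + 1;
    }

    if (s1 > len || e1 <= s1)
    {
        *first = 0;
        return 0;                               /* empty result */
    }

    *first = (int) (s1 - 1);
    return (int) (e1 - s1);
}
```

With an 8-item string, start 3 and length -5 select the same six items
as length -1.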
Tom Lane [Thu, 7 Jan 2010 16:29:58 +0000 (16:29 +0000)]
Fix (some of the) breakage introduced into query-cancel processing by HS.
It is absolutely not okay to throw an ereport(ERROR) in any random place in
the code just because DoingCommandRead is set; interrupting, say, OpenSSL
in the midst of its activities is guaranteed to result in heartache.
Instead of that, undo the original optimizations that threw away
QueryCancelPending anytime we were starting or finishing a command read, and
instead discard the cancel request within ProcessInterrupts if we find that
there is no HS reason for forcing a cancel and we are DoingCommandRead.
In passing, may I once again condemn the practice of changing the code
and not fixing the adjacent comment that you just turned into a lie?
Tom Lane [Thu, 7 Jan 2010 04:53:35 +0000 (04:53 +0000)]
Remove all the special-case code for INT64_IS_BUSTED, per decision that
we're not going to support that anymore.
I did keep the 64-bit-CRC-with-32-bit-arithmetic code, since it has a
performance excuse to live. It's a bit moot since that's all ifdef'd
out, of course.
Robert Haas [Thu, 7 Jan 2010 03:53:08 +0000 (03:53 +0000)]
Further fixes for per-tablespace options patch.
Add missing varlena header to TableSpaceOpts structure. And, per
Tom Lane, instead of calling tablespace_reloptions in CacheMemoryContext,
call it in the caller's memory context and copy the value over
afterwards, to reduce the chances of a session-lifetime memory leak.
Tom Lane [Thu, 7 Jan 2010 01:41:11 +0000 (01:41 +0000)]
Make configure check the version of Perl we're building with, and reject
versions < 5.8. Also, if there's no Perl, emit a warning informing the
user that he won't be able to build from a CVS pull. This is exactly the
same treatment we give Bison and Flex, and for the same reasons.
Tom Lane [Thu, 7 Jan 2010 00:25:05 +0000 (00:25 +0000)]
Alter the configure script to fail immediately if the C compiler does not
provide a working 64-bit integer datatype. As recently noted, we've been
broken on such platforms since early in the 8.4 development cycle. Since
it took nearly two years for anyone to even notice, it seems that the
rationale for continuing to support such platforms has reached the point
of non-existence. Rather than thrashing around to try to make it work
again, we'll just admit up front that this no longer works.
Back-patch to 8.4 since that branch is also broken.
We should get around to removing INT64_IS_BUSTED support, but just in HEAD,
so that seems like material for a separate commit.
Tom Lane [Wed, 6 Jan 2010 23:00:02 +0000 (23:00 +0000)]
Fix spccache.c to not suppose that a cache entry will live across database
access, per testing with CLOBBER_CACHE_ALWAYS. Minor other editorialization.
Tom Lane [Wed, 6 Jan 2010 19:56:29 +0000 (19:56 +0000)]
Make the makefile pass $MAJORVERSION to genbki.pl, not $VERSION which is
overridden in the snapshot build script. $MAJORVERSION is what it really
wanted anyway, so we can tighten up the parsing of --set-version's argument.
Michael Meskes [Wed, 6 Jan 2010 11:59:52 +0000 (11:59 +0000)]
Removed test case using nan as float value because printf's output for nan is
OS specific with some distinguishing between signaling and quiet nans. It's not
really important for us here anyway.
Support rewrite-based full vacuum as VACUUM FULL. Traditional
VACUUM FULL was renamed to VACUUM FULL INPLACE. Also added a new
option -i, --inplace for vacuumdb to perform FULL INPLACE vacuuming.
Since the new VACUUM FULL uses CLUSTER infrastructure, we cannot
use it for system tables. VACUUM FULL for system tables always
falls back to VACUUM FULL INPLACE silently.
Itagaki Takahiro, reviewed by Jeff Davis and Simon Riggs.
Variable names must consist of only letters, digits and underscores.
We previously allowed setting variables with invalid names, but they
could not be referenced in queries.
Tom Lane [Tue, 5 Jan 2010 23:25:36 +0000 (23:25 +0000)]
Add support for doing FULL JOIN ON FALSE. While this is really a rather
peculiar variant of UNION ALL, and so wouldn't likely get written directly
as-is, it's possible for it to arise as a result of simplification of
less-obviously-silly queries. In particular, now that we can do flattening
of subqueries that have constant outputs and are underneath an outer join,
it's possible for the case to result from simplification of queries of the
type exhibited in bug #5263. Back-patch to 8.4 to avoid a functionality
regression for this type of query.
Robert Haas [Tue, 5 Jan 2010 21:54:00 +0000 (21:54 +0000)]
Support ALTER TABLESPACE name SET/RESET ( tablespace_options ).
This patch only supports seq_page_cost and random_page_cost as parameters,
but it provides the infrastructure to scalably support many more.
In particular, we may want to add support for effective_io_concurrency,
but I'm leaving that as future work for now.
Thanks to Tom Lane for design help and Alvaro Herrera for the review.
Tom Lane [Tue, 5 Jan 2010 20:23:32 +0000 (20:23 +0000)]
Fix genbki.pl and Gen_fmgrtab.pl to use PID-specific temp file names,
so that it's safe if a parallel make chooses to run two concurrent copies.
Also, work around a memory leak in some versions of Perl.
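The scripts in question are Perl; the same idea is sketched here in C
for illustration. write_file_atomically() is a hypothetical helper, and
the write-then-rename step is an assumption about how such a temp file
would typically be used, not a description of the scripts themselves.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Write "contents" to final_path via a temp file whose name includes our
 * PID, so two concurrent processes never scribble on the same temp file.
 * Returns 0 on success, -1 on failure.
 */
int
write_file_atomically(const char *final_path, const char *contents)
{
    char  tmp_path[1024];
    FILE *fp;
    int   n;

    n = snprintf(tmp_path, sizeof(tmp_path), "%s.tmp%ld",
                 final_path, (long) getpid());
    if (n < 0 || (size_t) n >= sizeof(tmp_path))
        return -1;              /* path too long */

    fp = fopen(tmp_path, "w");
    if (fp == NULL)
        return -1;
    if (fputs(contents, fp) == EOF)
    {
        fclose(fp);
        remove(tmp_path);
        return -1;
    }
    if (fclose(fp) != 0)
    {
        remove(tmp_path);
        return -1;
    }

    /* Move the finished file into place in one step. */
    if (rename(tmp_path, final_path) != 0)
    {
        remove(tmp_path);
        return -1;
    }
    return 0;
}
```

Two concurrent runs each write their own temp file, and whichever
renames last supplies the final contents.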