Automatically terminate replication connections that are idle for more
than replication_timeout (a new GUC) milliseconds. The TCP timeout is often
too long, you want the master to notice a dead connection much sooner.
People complained about that in 9.0 too, but with synchronous replication
it's even more important to notice dead connections promptly.
Peter Eisentraut [Tue, 29 Mar 2011 20:23:50 +0000 (23:23 +0300)]
Update SQL features list
Feature F692 "Extended collation support" is now also supported. This
refers to allowing the COLLATE clause anywhere in a column or domain
definition instead of just directly after the type.
Also correct the name of the feature in accordance with the latest SQL
standard.
Peter Eisentraut [Mon, 28 Mar 2011 19:53:55 +0000 (22:53 +0300)]
Add maintainer-check target
This can do various source code checks that are not appropriate for
either the build or the regression tests. Currently: duplicate_oids,
SGML syntax and tabs check, NLS syntax check.
Tom Lane [Mon, 28 Mar 2011 19:44:54 +0000 (15:44 -0400)]
Prevent a rowtype from being included in itself.
Eventually we might be able to allow that, but it's not clear how many
places need to be fixed to prevent infinite recursion when there's a direct
or indirect inclusion of a rowtype in itself. One such place is
CheckAttributeType(), which will recurse to stack overflow in cases such as
those exhibited in bug #5950 from Alex Perepelica. If we were sure it was
the only such place, we could easily modify the code added by this patch to
stop the recursion without a complaint ... but it probably isn't the only
such place. Hence, throw error until such time as someone is excited
enough about this type of usage to put work into making it safe.
Back-patch as far as 8.3. 8.2 doesn't have the recursive call in
CheckAttributeType in the first place, so I see no need to add code there
in the absence of clear evidence of a problem elsewhere.
Tom Lane [Sun, 27 Mar 2011 16:51:04 +0000 (12:51 -0400)]
Fix plpgsql to release SPI plans when a function or DO block is freed.
This fixes the gripe I made a few months ago about DO blocks getting
slower with repeated use. At least, it fixes it for the case where
the DO block isn't aborted by an error. We could try running
plpgsql_free_function_memory() even during error exit, but that seems
a bit scary since it makes a lot of presumptions about the data
structures being in good shape. It's probably reasonable to assume
that repeated failures of DO blocks isn't a performance-critical case.
Tom Lane [Sat, 26 Mar 2011 18:25:48 +0000 (14:25 -0400)]
Clean up a few failures to set collation fields in expression nodes.
I'm not sure these have any non-cosmetic implications, but I'm not sure
they don't, either. In particular, ensure the CaseTestExpr generated
by transformAssignmentIndirection to represent the base target column
carries the correct collation, because parse_collate.c won't fix that.
Tweak lsyscache.c API so that we can get the appropriate collation
without an extra syscache lookup.
Simon Riggs [Sat, 26 Mar 2011 10:09:37 +0000 (10:09 +0000)]
Additional test for each commit in sync rep path to plug minute
possibility of race condition that would effect performance only.
Requested by Robert Haas. Re-arrange related comments.
Tom Lane [Sat, 26 Mar 2011 00:10:42 +0000 (20:10 -0400)]
Pass collation to makeConst() instead of looking it up internally.
In nearly all cases, the caller already knows the correct collation, and
in a number of places, the value the caller has handy is more correct than
the default for the type would be. (In particular, this patch makes it
significantly less likely that eval_const_expressions will result in
changing the exposed collation of an expression.) So an internal lookup
is both expensive and wrong.
Tom Lane [Fri, 25 Mar 2011 19:06:36 +0000 (15:06 -0400)]
Fix collation handling in plpgsql functions.
Make plpgsql treat the input collation as a polymorphism variable, so
that we cache separate plans for each input collation that's used in a
particular session, as per recent discussion. Propagate the input
collation to all collatable input parameters.
I chose to also propagate the input collation to all declared variables of
collatable types, which is a bit more debatable but seems to be necessary
for non-astonishing behavior. (Copying a parameter into a separate local
variable shouldn't result in a change of behavior, for example.) There is
enough infrastructure here to support declaring a collation for each local
variable to override that default, but I thought we should wait to see what
the field demand is before adding such a feature.
In passing, remove exec_get_rec_fieldtype(), which wasn't used anywhere.
Robert Haas [Fri, 25 Mar 2011 15:23:39 +0000 (11:23 -0400)]
Make walreceiver send a reply after receiving data but before flushing it.
It originally worked this way, but was changed by commit a8a8a3e0965201df88bdfdff08f50e5c06c552b7, since which time it's been impossible
for walreceiver to ever send a reply with write_location and flush_location
set to different values.
Tom Lane [Fri, 25 Mar 2011 00:30:14 +0000 (20:30 -0400)]
Fix handling of collation in SQL-language functions.
Ensure that parameter symbols receive collation from the function's
resolved input collation, and fix inlining to behave properly.
BTW, this commit lays about 90% of the infrastructure needed to support
use of argument names in SQL functions. Parsing of parameters is now
done via the parser-hook infrastructure ... we'd just need to supply
a column-ref hook ...
Robert Haas [Thu, 24 Mar 2011 20:57:12 +0000 (16:57 -0400)]
Edits to 9.1 release notes.
Add some new items and some additional details to existing items, mostly
by cribbing from the 9.1alpha notes. Some additional clarifications and
corrections elsewhere, and a few typo fixes.
Tom Lane [Thu, 24 Mar 2011 19:29:52 +0000 (15:29 -0400)]
Clean up handling of COLLATE clauses in index column definitions.
Ensure that COLLATE at the top level of an index expression is treated the
same as a grammatically separate COLLATE. Fix bogus reverse-parsing logic
in pg_get_indexdef.
Tom Lane [Wed, 23 Mar 2011 20:56:55 +0000 (16:56 -0400)]
Improve user-defined-aggregates documentation.
On closer inspection, that two-element initcond value seems to have been
a little white lie to avoid explaining the full behavior of float8_accum.
But if people are going to expect the examples to be exactly correct,
I suppose we'd better explain. Per comment from Thom Brown.
Simon Riggs [Wed, 23 Mar 2011 19:35:53 +0000 (19:35 +0000)]
Minor changes to recovery pause behaviour.
Change location LOG message so it works each time we pause, not
just for final pause.
Ensure that we pause only if we are in Hot Standby and can connect
to allow us to run resume function. This change supercedes the
code to override parameter recoveryPauseAtTarget to false if not
attempting to enter Hot Standby, which is now removed.
This is no longer necessary, and might result in a situation where the
configuration file is reloaded (and everything seems OK) but a subsequent
restart of the database fails.
Simon Riggs [Wed, 23 Mar 2011 13:30:05 +0000 (13:30 +0000)]
Prevent intermittent hang in recovery from bgwriter interaction.
Startup process waited for cleanup lock but when hot_standby = off
the pid was not registered, so that the bgwriter would not wake
the waiting process as intended.
Tom Lane [Tue, 22 Mar 2011 21:20:50 +0000 (17:20 -0400)]
Make initdb ignore locales for client-only encodings.
While putting such entries into pg_collation is harmless (since backends
will ignore entries that don't match the database encoding), it's also
useless.
Tom Lane [Tue, 22 Mar 2011 20:55:32 +0000 (16:55 -0400)]
Improve reporting of run-time-detected indeterminate-collation errors.
pg_newlocale_from_collation does not have enough context to give an error
message that's even a little bit useful, so move the responsibility for
complaining up to its callers. Also, reword ERRCODE_INDETERMINATE_COLLATION
error messages in a less jargony, more message-style-guide-compliant
fashion.
Tom Lane [Tue, 22 Mar 2011 19:58:03 +0000 (15:58 -0400)]
Throw error for indeterminate collation of an ORDER/GROUP/DISTINCT target.
This restores a parse error that was thrown (though only in the ORDER BY
case) by the original collation patch. I had removed it in my recent
revisions because it was thrown at a place where collations now haven't
been computed yet; but I thought of another way to handle it.
Throwing the error at parse time, rather than leaving it to be done at
runtime, is good because a syntax error pointer is helpful for localizing
the problem. We can reasonably assume that the comparison function for a
collatable datatype will complain if it doesn't have a collation to use.
Now the planner might choose to implement GROUP or DISTINCT via hashing,
in which case no runtime error would actually occur, but it seems better
to throw error consistently rather than let the error depend on what the
planner chooses to do. Another possible objection is that the user might
specify a nondefault sort operator that doesn't care about collation
... but that's surely an uncommon usage, and it wouldn't hurt him to throw
in a COLLATE clause anyway. This change also makes the ORDER BY/GROUP
BY/DISTINCT case more consistent with the UNION/INTERSECT/EXCEPT case,
which was already coded to throw this error even though the same objections
could be raised there.
Tom Lane [Tue, 22 Mar 2011 17:00:24 +0000 (13:00 -0400)]
Avoid potential deadlock in InitCatCachePhase2().
Opening a catcache's index could require reading from that cache's own
catalog, which of course would acquire AccessShareLock on the catalog.
So the original coding here risks locking index before heap, which could
deadlock against another backend trying to get exclusive locks in the
normal order. Because InitCatCachePhase2 is only called when a backend
has to start up without a relcache init file, the deadlock was seldom seen
in the field. (And by the same token, there's no need to worry about any
performance disadvantage; so not much point in trying to distinguish
exactly which catalogs have the risk.)
Bug report, diagnosis, and patch by Nikhil Sontakke. Additional commentary
by me. Back-patch to all supported branches.
Tom Lane [Tue, 22 Mar 2011 04:34:31 +0000 (00:34 -0400)]
Reimplement planner's handling of MIN/MAX aggregate optimization (again).
Instead of playing cute games with pathkeys, just build a direct
representation of the intended sub-select, and feed it through
query_planner to get a Path for the index access. This is a bit slower
than 9.1's previous method, since we'll duplicate most of the overhead of
query_planner; but since the whole optimization only applies to rather
simple single-table queries, that probably won't be much of a problem in
practice. The advantage is that we get to do the right thing when there's
a partial index that needs the implicit IS NOT NULL clause to be usable.
Also, although this makes planagg.c be a bit more closely tied to the
ordering of operations in grouping_planner, we can get rid of some coupling
to lower-level parts of the planner. Per complaint from Marti Raudsepp.
When two base backups are started at the same time with pg_basebackup,
ensure that they use different checkpoints as the starting point. We use
the checkpoint redo location as a unique identifier for the base backup in
the end-of-backup record, and in the backup history file name.
The local variable "sock" can be unused depending on compilation flags.
But there seems no particular need for it, since the kernel calls can
just as easily say port->sock instead.
Tom Lane [Sun, 20 Mar 2011 16:43:39 +0000 (12:43 -0400)]
Fix up handling of C/POSIX collations.
Install just one instance of the "C" and "POSIX" collations into
pg_collation, rather than one per encoding. Make these instances exist
and do something useful even in machines without locale_t support: to wit,
it's now possible to force comparisons and case-folding functions to use C
locale in an otherwise non-C database, whether or not the platform has
support for using any additional collations.
Fix up severely broken upper/lower/initcap functions, too: the C/POSIX
fastpath now does what it is supposed to, and non-default collations are
handled correctly in single-byte database encodings.
Merge the two separate collation hashtables that were being maintained in
pg_locale.c, and be more wary of the possibility that we fail partway
through filling a cache entry.
Tom Lane [Sun, 20 Mar 2011 00:29:08 +0000 (20:29 -0400)]
Revise collation derivation method and expression-tree representation.
All expression nodes now have an explicit output-collation field, unless
they are known to only return a noncollatable data type (such as boolean
or record). Also, nodes that can invoke collation-aware functions store
a separate field that is the collation value to pass to the function.
This avoids confusion that arises when a function has collatable inputs
and noncollatable output type, or vice versa.
Also, replace the parser's on-the-fly collation assignment method with
a post-pass over the completed expression tree. This allows us to use
a more complex (and hopefully more nearly spec-compliant) assignment
rule without paying for it in extra storage in every expression node.
Fix assorted bugs in the planner's handling of collations by making
collation one of the defining properties of an EquivalenceClass and
by converting CollateExprs into discardable RelabelType nodes during
expression preprocessing.
Magnus Hagander [Sat, 19 Mar 2011 17:44:35 +0000 (18:44 +0100)]
Rename ident authentication over local connections to peer
This removes an overloading of two authentication options where
one is very secure (peer) and one is often insecure (ident). Peer
is also the name used in libpq from 9.1 to specify the same type
of authentication.
Also make initdb select peer for local connections when ident is
chosen, and ident for TCP connections when peer is chosen.
ident keyword in pg_hba.conf is still accepted and maps to peer
authentication.
Robert Haas [Sat, 19 Mar 2011 02:09:57 +0000 (22:09 -0400)]
Fix possible "tuple concurrently updated" error in ALTER TABLE.
When adding an inheritance parent to a table, an AccessShareLock on the
parent isn't strong enough to prevent trouble, so take
ShareUpdateExclusiveLock instead. Since this is a behavior change,
albeit a fairly unobtrusive one, and since we have only one report
from the field, no back-patch.
Report by Jon Nelson, analysis by Alvaro Herrera, fix by me.
Robert Haas [Sat, 19 Mar 2011 01:43:45 +0000 (21:43 -0400)]
Move synchronous_standbys_defined updates from WAL writer to BG writer.
This is advantageous because the BG writer is alive until much later in
the shutdown sequence than WAL writer; we want to make sure that it's
possible to shut off synchronous replication during a smart shutdown,
else it might not be possible to complete the shutdown at all.
Per very reasonable gripes from Fujii Masao and Simon Riggs.
Robert Haas [Fri, 18 Mar 2011 13:44:44 +0000 (09:44 -0400)]
Remove ancient -X options to pg_dump, pg_dumpall, pg_restore.
The last version in which these options were documented is now EOL, so
it's time to get rid of them for real. We now use GNU-style long
options instead.
Peter Eisentraut [Thu, 17 Mar 2011 18:19:51 +0000 (20:19 +0200)]
Raise maximum value of several timeout parameters
The maximum value of deadlock_timeout, max_standby_archive_delay,
max_standby_streaming_delay, log_min_duration_statement, and
log_autovacuum_min_duration was INT_MAX/1000 milliseconds, which is
about 35min, which is too short for some practical uses. Raise the
maximum value to INT_MAX; the code that uses the parameters already
supports that just fine.
Robert Haas [Thu, 17 Mar 2011 17:10:42 +0000 (13:10 -0400)]
Fix various possible problems with synchronous replication.
1. Don't ignore query cancel interrupts. Instead, if the user asks to
cancel the query after we've already committed it, but before it's on
the standby, just emit a warning and let the COMMIT finish.
2. Don't ignore die interrupts (pg_terminate_backend or fast shutdown).
Instead, emit a warning message and close the connection without
acknowledging the commit. Other backends will still see the effect of
the commit, but there's no getting around that; it's too late to abort
at this point, and ignoring die interrupts altogether doesn't seem like
a good idea.
3. If synchronous_standby_names becomes empty, wake up all backends
waiting for synchronous replication to complete. Without this, someone
attempting to shut synchronous replication off could easily wedge the
entire system instead.
4. Avoid depending on the assumption that if a walsender updates
MyProc->syncRepState, we'll see the change even if we read it without
holding the lock. The window for this appears to be quite narrow (and
probably doesn't exist at all on machines with strong memory ordering)
but protecting against it is practically free, so do that.
5. Remove useless state SYNC_REP_MUST_DISCONNECT, which isn't needed and
doesn't actually do anything.
There's still some further work needed here to make the behavior of fast
shutdown plausible, but that looks complex, so I'm leaving it for a
separate commit. Review by Fujii Masao.
Andrew Dunstan [Thu, 17 Mar 2011 04:06:52 +0000 (00:06 -0400)]
Use correct PATH separator for Cygwin in pg_regress.c.
This has been broken for years, and I'm not sure why it has not been
noticed before, but now a very modern Cygwin breaks on it, and the fix
is clearly correct. Backpatching to all live branches.
Tom Lane [Wed, 16 Mar 2011 01:51:36 +0000 (21:51 -0400)]
Improve handling of unknown-type literals in UNION/INTERSECT/EXCEPT.
This patch causes unknown-type Consts to be coerced to the resolved output
type of the set operation at parse time. Formerly such Consts were left
alone until late in the planning stage. The disadvantage of that approach
is that it disables some optimizations, because the planner sees the set-op
leaf query as having different output column types than the overall set-op.
We saw an example of that in a recent performance gripe from Claudio
Freire.
Fixing such a Const requires scribbling on the leaf query in
transformSetOperationTree, but that should be all right since if the leaf
query's semantics depended on that output column, it would already have
resolved the unknown to something else.
Most of the bulk of this patch is a simple adjustment of
transformSetOperationTree's API so that upper levels can get at the
TargetEntry containing a Const to be replaced: it now returns a list of
TargetEntries, instead of just the bare expressions.
Bruce Momjian [Tue, 15 Mar 2011 15:26:20 +0000 (11:26 -0400)]
Add database comments to template0 and postgres databases, and improve
the comments on the template1 database. No catalog version bump because
they are just comments.
Robert Haas [Tue, 15 Mar 2011 15:25:04 +0000 (11:25 -0400)]
Minor sync rep documentation improvements.
- Make the name of the ID tag for the GUC entry match the GUC name.
- Clarify that synchronous_replication waits for xlog flush, not receipt.
- Mention that synchronous_replication won't wait if max_wal_senders=0.
Magnus Hagander [Mon, 14 Mar 2011 18:46:52 +0000 (19:46 +0100)]
Remove special case allowing parameters to ident auth for initdb
This was required in pre-8.4 versions to allow the specification of
"ident sameuser", but sameuser is no longer required. It could be extended
to allow all parameters in the future, but should then apply to all
methods and not just ident.
Tom Lane [Mon, 14 Mar 2011 16:52:03 +0000 (12:52 -0400)]
Adjust regression test to avoid platform-dependent failure.
We have a test that verifies that max(anyarray) will cope if the array
column elements aren't all the same array type. However, it's now possible
for that to produce a collation-related error message instead of the
expected one, if the first two column elements happen to be of the same
type and it's one that expects to be given collation info. Tweak the test
to ensure this doesn't happen. Per buildfarm member pika.
Tom Lane [Sat, 12 Mar 2011 21:30:36 +0000 (16:30 -0500)]
Make all comparisons done for/with statistics use the default collation.
While this will give wrong answers when estimating selectivity for a
comparison operator that's using a non-default collation, the estimation
error probably won't be large; and anyway the former approach created
estimation errors of its own by trying to use a histogram that might have
been computed with some other collation. So we'll adopt this simplified
approach for now and perhaps improve it sometime in the future.
This patch incorporates changes from Andres Freund to make sure that
selfuncs.c passes a valid collation OID to any datatype-specific function
it calls, in case that function wants collation information. Said OID will
now always be DEFAULT_COLLATION_OID, but at least we won't get errors.