Alvaro Herrera [Wed, 16 May 2007 16:36:56 +0000 (16:36 +0000)]
Have the rewriteheap code freeze old tuples. This is safe because it is only
applied to live tuples older than a recent Xmin, not to tuples that may be part
of an update chain. Those still keep their original markings.
This patch makes it possible for CLUSTER to advance relfrozenxid, thus avoiding
the need of vacuuming the table for Xid wraparound purposes. That will be
patched separately.
Alvaro Herrera [Tue, 15 May 2007 20:20:21 +0000 (20:20 +0000)]
Avoid emitting empty role names in the GRANTED BY clause of GRANT ROLE
when the grantor has been dropped. This is a workaround for the fact
that we don't track the grantor as a shared dependency.
Andrew Dunstan [Tue, 15 May 2007 19:47:51 +0000 (19:47 +0000)]
Remove directory qualification in <ossp/uuid.h> because it's not always installed in ossp.
Workaround for when it is: include the ossp directory using --with-includes.
Neil Conway [Tue, 15 May 2007 19:13:55 +0000 (19:13 +0000)]
Various fixes for the SGML docs. Consistently use spaces before/after
parentheses in syntax descriptions. Consistently use the present tense
when describing the basic purpose of each "DROP" command. Add a few
more hyperlinks.
Neil Conway [Tue, 15 May 2007 15:35:46 +0000 (15:35 +0000)]
Add a note to the documentation to clarify that even when
"autovacuum = off", the system may still periodically start autovacuum
processes to prevent XID wraparound. Patch from David Fetter, with
editorializing.
Tom Lane [Mon, 14 May 2007 20:24:41 +0000 (20:24 +0000)]
Get rid of the pg_shdepend entry for a TOAST table; it's unnecessary since
there's an indirect dependency on the owner via the parent table. We were
already handling indexes that way, but not toast tables for some reason.
Saves a little catalog space and cuts down the verbosity of checkSharedDependencies
reports.
Tom Lane [Mon, 14 May 2007 18:13:21 +0000 (18:13 +0000)]
Prevent RevalidateCachedPlan from making any permanent change in
ActiveSnapshot. Having it affect ActiveSnapshot only in the unusual
case of needing to replan seems a bad idea, and there's also the problem
that the created snap might be in a relatively short-lived context, as
noted by Jan Wieck. Also, there's no need to force a new snap at all
unless we are called with no snap currently set, which is an unusual
case in itself.
Alvaro Herrera [Mon, 14 May 2007 16:50:36 +0000 (16:50 +0000)]
Report all dependent objects to the server log when a shared object is dropped,
and only a truncated log of the objects in the current database to the client.
Also, instead of reporting object counts for all databases on which the user
might own objects, report only as many as fit in the predefined line count.
This is to avoid flooding the client when the user owns too many objects,
which could cause problems.
Per report from Ed L. on April 4th and subsequent discussion.
Bruce Momjian [Sun, 13 May 2007 11:22:04 +0000 (11:22 +0000)]
Mark as done, add URL for other item:
< o Add support for arrays of complex types
>
> http://archives.postgresql.org/pgsql-patches/2007-05/msg00114.php
>
> o -Add support for arrays of complex types
Tom Lane [Sat, 12 May 2007 19:22:35 +0000 (19:22 +0000)]
Improve predicate_refuted_by_simple_clause() to handle IS NULL and IS NOT NULL
more completely. The motivation for having it understand IS NULL at all was
to allow use of "foo IS NULL" as one of the subsets of a partitioning on
"foo", but as reported by Aleksander Kmetec, it wasn't really getting the job
done. Backpatch to 8.2 since this is arguably a performance bug.
Tom Lane [Sat, 12 May 2007 00:55:00 +0000 (00:55 +0000)]
Fix the problem that creating a user-defined type named _foo, followed by one
named foo, would work but the other ordering would not. If a user-specified
type or table name collides with an existing auto-generated array name, just
rename the array type out of the way by prepending more underscores. This
should not create any backward-compatibility issues, since the cases in which
this will happen would have failed outright in prior releases.
Also fix an oversight in the arrays-of-composites patch: ALTER TABLE RENAME
renamed the table's rowtype but not its array type.
Tom Lane [Fri, 11 May 2007 20:17:15 +0000 (20:17 +0000)]
Fix my oversight in enabling domains-of-domains: ALTER DOMAIN ADD CONSTRAINT
needs to check the new constraint against columns of derived domains too.
Also, make it error out if the domain to be modified is used within any
composite-type columns. Eventually we should support that case, but it seems
a bit painful, and not suitable for a back-patch. For the moment just let the
user know we can't do it.
Backpatch to 8.2, which is the only released version that allows nested
domains. Possibly the other part should be back-patched further.
Tom Lane [Fri, 11 May 2007 17:57:14 +0000 (17:57 +0000)]
Support arrays of composite types, including the rowtypes of regular tables
and views (but not system catalogs, nor sequences or toast tables). Get rid
of the hardwired convention that a type's array type is named exactly "_type",
instead using a new column pg_type.typarray to provide the linkage. (It still
will be named "_type", though, except in odd corner cases such as
maximum-length type names.)
Along the way, make tracking of owner and schema dependencies for types more
uniform: a type directly created by the user has these dependencies, while a
table rowtype or auto-generated array type does not have them, but depends on
its parent object instead.
Neil Conway [Tue, 8 May 2007 18:56:48 +0000 (18:56 +0000)]
Add a hash function for "numeric". Mark the equality operator for
numerics as "oprcanhash", and make the corresponding system catalog
updates. As a result, hash indexes, hashed aggregation, and hash
joins can now be used with the numeric type. Bump the catversion.
The only tricky aspect to doing this is writing a correct hash
function: it's possible for two Numerics to be equal according to
their equality operator, but have different in-memory bit patterns.
To cope with this, the hash function doesn't consider the Numeric's
"scale" or "sign", and explictly skips any leading or trailing
zeros in the Numeric's digit buffer (the current implementation
should suppress any such zeros, but it seems unwise to rely upon
this). See discussion on pgsql-patches for more details.
Tom Lane [Tue, 8 May 2007 17:02:59 +0000 (17:02 +0000)]
Add an explicit comment about POSIX time zone names having the reverse
sign convention from everyplace else in Postgres. I don't suppose that
this will stop people from being confused, but at least we can say that
it's documented.
The appended patch addresses the outstanding issues of the recent guc patch.
It makes PGCLIENTENCODING work again and uses bsearch() instead of
iterating over the array of guc variables in guc_get_index().
Magnus Hagander [Sat, 5 May 2007 17:05:48 +0000 (17:05 +0000)]
Check return code from strxfrm on Windows since it has a
non-standard way of indicating errors, so we don't try to
allocate INT_MAX bytes to store a result in.
Bruce Momjian [Sat, 5 May 2007 15:40:01 +0000 (15:40 +0000)]
Done:
< Last updated: Sat May 5 10:47:39 EDT 2007
> Last updated: Sat May 5 11:39:57 EDT 2007
< * Flush cached query plans when the dependent objects change,
< when the cardinality of parameters changes dramatically, or
> * -Flush cached query plans when the dependent objects change or
<
< A more complex solution would be to save multiple plans for different
< cardinality and use the appropriate plan based on the EXECUTE values.
<
< * Track dependencies in function bodies and recompile/invalidate
<
< This is particularly important for references to temporary tables
< in PL/PgSQL because PL/PgSQL caches query plans. The only workaround
< in PL/PgSQL is to use EXECUTE. One complexity is that a function
< might itself drop and recreate dependent tables, causing it to
< invalidate its own query plan.
<
< * Invalidate prepared queries, like INSERT, when the table definition
> * -Track dependencies in function bodies and recompile/invalidate
> * -Invalidate prepared queries, like INSERT, when the table definition
Bruce Momjian [Sat, 5 May 2007 14:47:45 +0000 (14:47 +0000)]
Move item:
< * Invalidate prepared queries, like INSERT, when the table definition
< is altered
>
> * Invalidate prepared queries, like INSERT, when the table definition
> is altered
Bruce Momjian [Sat, 5 May 2007 03:14:40 +0000 (03:14 +0000)]
Done:
> * -Allow ORDER BY ... LIMIT # to select high/low value without sort or
<
< Right now, if no index exists, ORDER BY ... LIMIT # requires we sort
< all values to return the high/low value. Instead The idea is to do a
< sequential scan to find the high/low value, thus avoiding the sort.
< MIN/MAX already does this, but not for LIMIT > 1.
<
Andrew Dunstan [Fri, 4 May 2007 14:55:32 +0000 (14:55 +0000)]
Make clearer how arguments and return values in pl/perl are escaped. This is to clarify the situation that Theo Schlossnagle recently reported on -bugs.
Tom Lane [Fri, 4 May 2007 02:01:02 +0000 (02:01 +0000)]
A few fixups in error handling: mark pg_re_throw() as noreturn for gcc,
and for other compilers, insert a dummy exit() call so that they understand
PG_RE_THROW() doesn't return. Insert fflush(stderr) in ExceptionalCondition,
per recent buildfarm evidence that that might not happen automatically on some
platforms. And const-ify ExceptionalCondition's declaration while at it.
Tom Lane [Fri, 4 May 2007 01:13:45 +0000 (01:13 +0000)]
Teach tuplesort.c about "top N" sorting, in which only the first N tuples
need be returned. We keep a heap of the current best N tuples and sift-up
new tuples into it as we scan the input. For M input tuples this means
only about M*log(N) comparisons instead of M*log(M), not to mention a lot
less workspace when N is small --- avoiding spill-to-disk for large M
is actually the most attractive thing about it. Patch includes planner
and executor support for invoking this facility in ORDER BY ... LIMIT
queries. Greg Stark, with some editorialization by moi.
Tom Lane [Thu, 3 May 2007 16:45:58 +0000 (16:45 +0000)]
Tweak hash index AM to use the new ReadOrZeroBuffer bufmgr API when fetching
pages it intends to zero immediately. Just to show there is some use for that
function besides WAL recovery :-).
Along the way, fold _hash_checkpage and _hash_pageinit calls into _hash_getbuf
and friends, instead of expecting callers to do that separately.
Magnus Hagander [Thu, 3 May 2007 14:04:03 +0000 (14:04 +0000)]
Release builds generate different strangely formatted export names
for local symbols, that shouldn't be exported. This patch excludes them,
cutting down about 10,000 exported symbols and decreasing the binary size
by 20%.
Tom Lane [Wed, 2 May 2007 23:34:48 +0000 (23:34 +0000)]
Dept. of second thoughts: add comments cautioning against using
ReadOrZeroBuffer to fetch pages from beyond physical EOF. This would
usually work, but would cause problems for md.c if writes occurred
beyond a segment boundary when the previous segment file hadn't been
fully extended.
Tom Lane [Wed, 2 May 2007 23:18:03 +0000 (23:18 +0000)]
During WAL recovery, when reading a page that we intend to overwrite completely
from the WAL data, don't bother to physically read it; just have bufmgr.c
return a zeroed-out buffer instead. This speeds recovery significantly,
and also avoids unnecessary failures when a page-to-be-overwritten has corrupt
page headers on disk. This replaces a former kluge that accomplished the
latter by pretending zero_damaged_pages was always ON during WAL recovery;
which was OK when the kluge was put in, but is unsafe when restoring a WAL
log that was written with full_page_writes off.
Tom Lane [Wed, 2 May 2007 21:08:46 +0000 (21:08 +0000)]
Fix things so that when CREATE INDEX CONCURRENTLY sets pg_index.indisvalid
true at the very end of its processing, the update is broadcast via a
shared-cache-inval message for the index; without this, existing backends that
already have relcache entries for the index might never see it become valid.
Also, force a relcache inval on the index's parent table at the same time,
so that any cached plans for that table are re-planned; this ensures that
the newly valid index will be used if appropriate. Aside from making
C.I.C. behave more reasonably, this is necessary infrastructure for some
aspects of the HOT patch. Pavan Deolasee, with a little further stuff from
me.
Alvaro Herrera [Wed, 2 May 2007 18:27:57 +0000 (18:27 +0000)]
Use the new TimestampDifferenceExceeds API instead of timestamp_cmp_internal
and TimestampDifference, to make coding clearer. I think this should also fix
the failure to start workers in platforms with low resolution timers, as
reported by Itagaki Takahiro.
Alvaro Herrera [Wed, 2 May 2007 15:47:14 +0000 (15:47 +0000)]
Fix failure to check for INVALID worker entry in the new autovacuum code, which
could happen when a worker took to long to start and was thus "aborted" by the
launcher. Noticed by lionfish buildfarm member.
Tom Lane [Wed, 2 May 2007 15:32:42 +0000 (15:32 +0000)]
Fix oversight in PG_RE_THROW processing: it's entirely possible that there
isn't any place to throw the error to. If so, we should treat the error
as FATAL, just as we would have if it'd been thrown outside the PG_TRY
block to begin with.
Although this is clearly a *potential* source of bugs, it is not clear
at the moment whether it is an *actual* source of bugs; there may not
presently be any PG_TRY blocks in code that can be reached with no outer
longjmp catcher. So for the moment I'm going to be conservative and not
back-patch this. The change breaks ABI for users of PG_RE_THROW and hence
might create compatibility problems for loadable modules, so we should not
put it into released branches without proof that it's needed.
Tom Lane [Tue, 1 May 2007 18:53:52 +0000 (18:53 +0000)]
Fix a thinko in my patch of a couple months ago for bug #3116: it did the
wrong thing when inlining polymorphic SQL functions, because it was using the
function's declared return type where it should have used the actual result
type of the current call. In 8.1 and 8.2 this causes obvious failures even if
you don't have assertions turned on; in 8.0 and 7.4 it would only be a problem
if the inlined expression were used as an input to a function that did
run-time type determination on its inputs. Add a regression test, since this
is evidently an under-tested area.
Tom Lane [Mon, 30 Apr 2007 21:01:53 +0000 (21:01 +0000)]
Change the timestamps recorded in transaction commit/abort xlog records
from time_t to TimestampTz representation. This provides full gettimeofday()
resolution of the timestamps, which might be useful when attempting to
do point-in-time recovery --- previously it was not possible to specify
the stop point with sub-second resolution. But mostly this is to get
rid of TimestampTz-to-time_t conversion overhead during commit. Per my
proposal of a day or two back.
Tom Lane [Mon, 30 Apr 2007 03:23:49 +0000 (03:23 +0000)]
Implement rate-limiting logic on how often backends will attempt to send
messages to the stats collector. This avoids the problem that enabling
stats_row_level for autovacuum has a significant overhead for short
read-only transactions, as noted by Arjen van der Meijden. We can avoid
an extra gettimeofday call by piggybacking on the one done for WAL-logging
xact commit or abort (although that doesn't help read-only transactions,
since they don't WAL-log anything).
In my proposal for this, I noted that we could change the WAL log entries
for commit/abort to record full TimestampTz precision, instead of only
time_t as at present. That's not done in this patch, but will be committed
separately.
Tom Lane [Mon, 30 Apr 2007 00:16:43 +0000 (00:16 +0000)]
Marginal performance hack: use a dedicated routine instead of copyObject
to copy nodes that are known to be Vars during plan reference adjustment.
Saves useless memzero operation as well as the big switch in copyObject.
Tom Lane [Mon, 30 Apr 2007 00:14:54 +0000 (00:14 +0000)]
Marginal performance hack: avoid unnecessary work in expression_tree_mutator.
We can just palloc, instead of using makeNode, when we are going to
overwrite the whole node anyway in the FLATCOPY macro. Also, use
FLATCOPY instead of copyObject for common node types Var and Const.
Tom Lane [Mon, 30 Apr 2007 00:12:08 +0000 (00:12 +0000)]
Marginal performance hack: remove the loop that used to be needed to
look through a freelist for a chunk of adequate size. For a long time
now, all elements of a given freelist have been exactly the same
allocated size, so we don't need a loop. Since the loop never iterated
more than once, you'd think this wouldn't matter much, but it makes a
noticeable savings in a simple test --- perhaps because the compiler
isn't optimizing on a mistaken assumption that the loop would repeat.
AllocSetAlloc is called often enough that saving even a couple of
instructions is worthwhile.
Bruce Momjian [Sun, 29 Apr 2007 06:48:11 +0000 (06:48 +0000)]
Pl/pgsql MOVE done:
< o Add support for MOVE and SCROLL cursors
<
< PL/pgSQL cursors should support the same syntax as
< backend cursors.
<
> o -Add support for MOVE cursors
> o Add support for SCROLL cursors
Neil Conway [Sat, 28 Apr 2007 23:54:59 +0000 (23:54 +0000)]
Add support for IN as alternative to FROM in PL/PgSQL's FETCH statement,
for consistency with the backend's FETCH command. Patch from Pavel
Stehule, reviewed by Neil Conway.
Tom Lane [Fri, 27 Apr 2007 22:05:49 +0000 (22:05 +0000)]
Modify processing of DECLARE CURSOR and EXPLAIN so that they can resolve the
types of unspecified parameters when submitted via extended query protocol.
This worked in 8.2 but I had broken it during plancache changes. DECLARE
CURSOR is now treated almost exactly like a plain SELECT through parse
analysis, rewrite, and planning; only just before sending to the executor
do we divert it away to ProcessUtility. This requires a special-case check
in a number of places, but practically all of them were already special-casing
SELECT INTO, so it's not too ugly. (Maybe it would be a good idea to merge
the two by treating IntoClause as a form of utility statement? Not going to
worry about that now, though.) That approach doesn't work for EXPLAIN,
however, so for that I punted and used a klugy solution of running parse
analysis an extra time if under extended query protocol.
Neil Conway [Fri, 27 Apr 2007 20:08:43 +0000 (20:08 +0000)]
Remove no-longer-true statement from the docs. Since the default config
now enables row-level stats, the out of the box stats volume is no
longer particularly low.
Tom Lane [Thu, 26 Apr 2007 23:24:46 +0000 (23:24 +0000)]
Fix dynahash.c to suppress hash bucket splits while a hash_seq_search() scan
is in progress on the same hashtable. This seems the least invasive way to
fix the recently-recognized problem that a split could cause the scan to
visit entries twice or (with much lower probability) miss them entirely.
The only field-reported problem caused by this is the "failed to re-find
shared lock object" PANIC in COMMIT PREPARED reported by Michel Dorochevsky,
which was caused by multiply visited entries. However, it seems certain
that mdsync() is vulnerable to missing required fsync's due to missed
entries, and I am fearful that RelationCacheInitializePhase2() might be at
risk as well. Because of that and the generalized hazard presented by this
bug, back-patch all the supported branches.
Along the way, fix pg_prepared_statement() and pg_cursor() to not assume
that the hashtables they are examining will stay static between calls.
This is risky regardless of the newly noted dynahash problem, because
hash_seq_search() has never promised to cope with deletion of table entries
other than the just-returned one. There may be no bug here because the only
supported way to call these functions is via ExecMakeTableFunctionResult()
which will cycle them to completion before doing anything very interesting,
but it seems best to get rid of the assumption. This affects 8.2 and HEAD
only, since those functions weren't there earlier.
Neil Conway [Thu, 26 Apr 2007 22:25:56 +0000 (22:25 +0000)]
Another tweak for tab completion of CREATE TEMP. Instead of only
completing CREATE { TEMP | TEMPORARY } TABLE, we should also suggest
VIEW and SEQUENCE. Per Greg Sabino Mullane.
Neil Conway [Thu, 26 Apr 2007 18:10:28 +0000 (18:10 +0000)]
Minor enhancement to psql tab completion. If we see "CREATE TEMPORARY",
we can complete "TABLE". The previous coding only looked for "CREATE TEMP".
Note that I didn't add TEMPORARY to the list of suggested completions
after we've seen "CREATE", since TEMP is equivalent and more concise. But
if the user has already manually typed TEMPORARY, we may as well
complete TABLE for them.
Neil Conway [Thu, 26 Apr 2007 16:13:15 +0000 (16:13 +0000)]
Rename the newly-added commands for discarding session state.
RESET SESSION, RESET PLANS, and RESET TEMP are now DISCARD ALL,
DISCARD PLANS, and DISCARD TEMP, respectively. This is to avoid
confusion with the pre-existing RESET variants: the DISCARD
commands are not actually similar to RESET. Patch from Marko
Kreen, with some minor editorialization.
Tom Lane [Sun, 22 Apr 2007 03:52:40 +0000 (03:52 +0000)]
Remove some of the most blatant brain-fade in the recent guc patch
(it's so nice to have a buildfarm member that actively rejects naked
uses of strcasecmp). This coding is still pretty awful, though, since
it's going to be O(N^2) in the number of guc variables. May I direct
your attention to bsearch?
Tom Lane [Sat, 21 Apr 2007 21:01:45 +0000 (21:01 +0000)]
Some further performance tweaks for planning large inheritance trees that
are mostly excluded by constraints: do the CE test a bit earlier to save
some adjust_appendrel_attrs() work on excluded children, and arrange to
use array indexing rather than rt_fetch() to fetch RTEs in the main body
of the planner. The latter is something I'd wanted to do for awhile anyway,
but seeing list_nth_cell() as 35% of the runtime gets one's attention.