Magnus Hagander [Sat, 5 May 2007 17:05:48 +0000 (17:05 +0000)]
Check return code from strxfrm on Windows since it has a
non-standard way of indicating errors, so we don't try to
allocate INT_MAX bytes to store a result in.
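A minimal sketch of the kind of check described above, assuming a runtime where strxfrm reports failure by returning INT_MAX. The helper name xfrm_dup is hypothetical and is not the function touched by this patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: return a malloc'd
 * strxfrm-transformed copy of "s", or NULL on failure.
 */
char *
xfrm_dup(const char *s)
{
    size_t      needed;
    char       *buf;

    assert(s != NULL);

    /* Ask how much space the transformed string needs. */
    needed = strxfrm(NULL, s, 0);

    /*
     * Some Windows runtimes signal an error by returning INT_MAX.
     * Treat that as failure instead of trying to allocate that much.
     */
    if (needed >= (size_t) INT_MAX)
        return NULL;

    buf = malloc(needed + 1);
    if (buf == NULL)
        return NULL;

    /* Check the second call the same way before trusting the buffer. */
    if (strxfrm(buf, s, needed + 1) >= (size_t) INT_MAX)
    {
        free(buf);
        return NULL;
    }
    return buf;
}
```

A caller would compare two such buffers with strcmp and free them afterwards.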
Bruce Momjian [Sat, 5 May 2007 15:40:01 +0000 (15:40 +0000)]
Done:
< Last updated: Sat May 5 10:47:39 EDT 2007
> Last updated: Sat May 5 11:39:57 EDT 2007
< * Flush cached query plans when the dependent objects change,
< when the cardinality of parameters changes dramatically, or
> * -Flush cached query plans when the dependent objects change or
<
< A more complex solution would be to save multiple plans for different
< cardinality and use the appropriate plan based on the EXECUTE values.
<
< * Track dependencies in function bodies and recompile/invalidate
<
< This is particularly important for references to temporary tables
< in PL/PgSQL because PL/PgSQL caches query plans. The only workaround
< in PL/PgSQL is to use EXECUTE. One complexity is that a function
< might itself drop and recreate dependent tables, causing it to
< invalidate its own query plan.
<
< * Invalidate prepared queries, like INSERT, when the table definition
> * -Track dependencies in function bodies and recompile/invalidate
> * -Invalidate prepared queries, like INSERT, when the table definition
Bruce Momjian [Sat, 5 May 2007 14:47:45 +0000 (14:47 +0000)]
Move item:
< * Invalidate prepared queries, like INSERT, when the table definition
< is altered
>
> * Invalidate prepared queries, like INSERT, when the table definition
> is altered
Bruce Momjian [Sat, 5 May 2007 03:14:40 +0000 (03:14 +0000)]
Done:
> * -Allow ORDER BY ... LIMIT # to select high/low value without sort or
<
< Right now, if no index exists, ORDER BY ... LIMIT # requires we sort
< all values to return the high/low value. Instead, the idea is to do a
< sequential scan to find the high/low value, thus avoiding the sort.
< MIN/MAX already does this, but not for LIMIT > 1.
<
Andrew Dunstan [Fri, 4 May 2007 14:55:32 +0000 (14:55 +0000)]
Make clearer how arguments and return values in pl/perl are escaped. This is to clarify the situation that Theo Schlossnagle recently reported on -bugs.
Tom Lane [Fri, 4 May 2007 02:01:02 +0000 (02:01 +0000)]
A few fixups in error handling: mark pg_re_throw() as noreturn for gcc,
and for other compilers, insert a dummy exit() call so that they understand
PG_RE_THROW() doesn't return. Insert fflush(stderr) in ExceptionalCondition,
per recent buildfarm evidence that that might not happen automatically on some
platforms. And const-ify ExceptionalCondition's declaration while at it.
Tom Lane [Fri, 4 May 2007 01:13:45 +0000 (01:13 +0000)]
Teach tuplesort.c about "top N" sorting, in which only the first N tuples
need be returned. We keep a heap of the current best N tuples and sift-up
new tuples into it as we scan the input. For M input tuples this means
only about M*log(N) comparisons instead of M*log(M), not to mention a lot
less workspace when N is small --- avoiding spill-to-disk for large M
is actually the most attractive thing about it. Patch includes planner
and executor support for invoking this facility in ORDER BY ... LIMIT
queries. Greg Stark, with some editorialization by moi.
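The bounded-heap idea can be sketched on plain integers as follows. This is an illustration only, not the tuplesort.c code: the function names are hypothetical, and it keeps the N smallest values in a max-heap so that the current worst candidate is always at the root.

```c
#include <assert.h>
#include <stddef.h>

/* Restore the max-heap property downward from index i. */
static void
sift_down(int *heap, size_t n, size_t i)
{
    for (;;)
    {
        size_t      largest = i;
        size_t      l = 2 * i + 1;
        size_t      r = 2 * i + 2;
        int         tmp;

        if (l < n && heap[l] > heap[largest])
            largest = l;
        if (r < n && heap[r] > heap[largest])
            largest = r;
        if (largest == i)
            break;
        tmp = heap[i];
        heap[i] = heap[largest];
        heap[largest] = tmp;
        i = largest;
    }
}

/*
 * Copy the n smallest of the m input values into out[] in ascending
 * order.  Returns the number of values written, i.e. min(n, m).
 * Each input costs at most O(log n) comparisons.
 */
size_t
top_n_smallest(const int *in, size_t m, int *out, size_t n)
{
    size_t      k = 0;          /* current heap size */
    size_t      i;
    int         tmp;

    assert(in != NULL && out != NULL);
    if (n == 0)
        return 0;

    for (i = 0; i < m; i++)
    {
        if (k < n)
        {
            /* Heap not full yet: append and sift up. */
            size_t      c = k++;

            out[c] = in[i];
            while (c > 0 && out[(c - 1) / 2] < out[c])
            {
                size_t      p = (c - 1) / 2;

                tmp = out[p];
                out[p] = out[c];
                out[c] = tmp;
                c = p;
            }
        }
        else if (in[i] < out[0])
        {
            /* Beats the current worst candidate: replace the root. */
            out[0] = in[i];
            sift_down(out, k, 0);
        }
    }

    /* Heapsort the survivors in place to get ascending order. */
    for (i = k; i > 1; i--)
    {
        tmp = out[0];
        out[0] = out[i - 1];
        out[i - 1] = tmp;
        sift_down(out, i - 1, 0);
    }
    return k;
}
```

Only n slots of workspace are needed regardless of m, which is the property that avoids spilling to disk when n is small.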
Tom Lane [Thu, 3 May 2007 16:45:58 +0000 (16:45 +0000)]
Tweak hash index AM to use the new ReadOrZeroBuffer bufmgr API when fetching
pages it intends to zero immediately. Just to show there is some use for that
function besides WAL recovery :-).
Along the way, fold _hash_checkpage and _hash_pageinit calls into _hash_getbuf
and friends, instead of expecting callers to do that separately.
Magnus Hagander [Thu, 3 May 2007 14:04:03 +0000 (14:04 +0000)]
Release builds generate different, strangely formatted export names
for local symbols that shouldn't be exported. This patch excludes them,
cutting down about 10,000 exported symbols and decreasing the binary size
by 20%.
Tom Lane [Wed, 2 May 2007 23:34:48 +0000 (23:34 +0000)]
Dept. of second thoughts: add comments cautioning against using
ReadOrZeroBuffer to fetch pages from beyond physical EOF. This would
usually work, but would cause problems for md.c if writes occurred
beyond a segment boundary when the previous segment file hadn't been
fully extended.
Tom Lane [Wed, 2 May 2007 23:18:03 +0000 (23:18 +0000)]
During WAL recovery, when reading a page that we intend to overwrite completely
from the WAL data, don't bother to physically read it; just have bufmgr.c
return a zeroed-out buffer instead. This speeds recovery significantly,
and also avoids unnecessary failures when a page-to-be-overwritten has corrupt
page headers on disk. This replaces a former kluge that accomplished the
latter by pretending zero_damaged_pages was always ON during WAL recovery;
which was OK when the kluge was put in, but is unsafe when restoring a WAL
log that was written with full_page_writes off.
Tom Lane [Wed, 2 May 2007 21:08:46 +0000 (21:08 +0000)]
Fix things so that when CREATE INDEX CONCURRENTLY sets pg_index.indisvalid
true at the very end of its processing, the update is broadcast via a
shared-cache-inval message for the index; without this, existing backends that
already have relcache entries for the index might never see it become valid.
Also, force a relcache inval on the index's parent table at the same time,
so that any cached plans for that table are re-planned; this ensures that
the newly valid index will be used if appropriate. Aside from making
C.I.C. behave more reasonably, this is necessary infrastructure for some
aspects of the HOT patch. Pavan Deolasee, with a little further stuff from
me.
Alvaro Herrera [Wed, 2 May 2007 18:27:57 +0000 (18:27 +0000)]
Use the new TimestampDifferenceExceeds API instead of timestamp_cmp_internal
and TimestampDifference, to make coding clearer. I think this should also fix
the failure to start workers on platforms with low-resolution timers, as
reported by Itagaki Takahiro.
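A single-call elapsed-time test of this kind might look like the sketch below. It assumes timestamps are 64-bit microsecond counts; the typedef and function name are stand-ins for illustration, not the backend's declarations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: a timestamp is a count of microseconds. */
typedef int64_t timestamp_us;

/*
 * Return true if at least "msec" milliseconds separate start from stop.
 * One comparison replaces computing a seconds/microseconds difference
 * and then comparing both parts by hand.
 */
bool
timestamp_difference_exceeds(timestamp_us start, timestamp_us stop, int msec)
{
    assert(msec >= 0);
    return (stop - start) >= (int64_t) msec * 1000;
}
```

Because the comparison is ">=", a timer that only advances in coarse steps still satisfies the test once the threshold is reached.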
Alvaro Herrera [Wed, 2 May 2007 15:47:14 +0000 (15:47 +0000)]
Fix failure to check for INVALID worker entry in the new autovacuum code, which
could happen when a worker took too long to start and was thus "aborted" by the
launcher. Noticed by lionfish buildfarm member.
Tom Lane [Wed, 2 May 2007 15:32:42 +0000 (15:32 +0000)]
Fix oversight in PG_RE_THROW processing: it's entirely possible that there
isn't any place to throw the error to. If so, we should treat the error
as FATAL, just as we would have if it'd been thrown outside the PG_TRY
block to begin with.
Although this is clearly a *potential* source of bugs, it is not clear
at the moment whether it is an *actual* source of bugs; there may not
presently be any PG_TRY blocks in code that can be reached with no outer
longjmp catcher. So for the moment I'm going to be conservative and not
back-patch this. The change breaks ABI for users of PG_RE_THROW and hence
might create compatibility problems for loadable modules, so we should not
put it into released branches without proof that it's needed.
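The rule "re-throw to the enclosing catcher if there is one, otherwise treat the error as FATAL" can be sketched with plain setjmp/longjmp. All names here are hypothetical, and the FATAL path merely records a level where a real server would exit the process.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

enum { LEVEL_NONE = 0, LEVEL_ERROR = 1, LEVEL_FATAL = 2 };

/* Innermost active catcher, or NULL if there is none. */
static jmp_buf *catcher_stack = NULL;
static int  last_level = LEVEL_NONE;

/* Re-throw the current error. */
static void
re_throw(void)
{
    if (catcher_stack != NULL)
    {
        last_level = LEVEL_ERROR;
        longjmp(*catcher_stack, 1);     /* does not return */
    }

    /*
     * Nowhere to throw to: promote to FATAL.  This sketch only records
     * the level and returns; real code would terminate here.
     */
    last_level = LEVEL_FATAL;
}

/* Demonstration driver: report how a re-throw was handled. */
int
rethrow_level(int have_catcher)
{
    jmp_buf     local;

    last_level = LEVEL_NONE;
    catcher_stack = NULL;

    if (have_catcher)
    {
        if (setjmp(local) == 0)
        {
            catcher_stack = &local;
            re_throw();         /* jumps back to the setjmp above */
        }
        catcher_stack = NULL;   /* pop the catcher */
    }
    else
        re_throw();

    return last_level;
}
```

With a catcher installed the error stays an ordinary ERROR; without one, the same call is promoted to FATAL.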
Tom Lane [Tue, 1 May 2007 18:53:52 +0000 (18:53 +0000)]
Fix a thinko in my patch of a couple months ago for bug #3116: it did the
wrong thing when inlining polymorphic SQL functions, because it was using the
function's declared return type where it should have used the actual result
type of the current call. In 8.1 and 8.2 this causes obvious failures even if
you don't have assertions turned on; in 8.0 and 7.4 it would only be a problem
if the inlined expression were used as an input to a function that did
run-time type determination on its inputs. Add a regression test, since this
is evidently an under-tested area.
Tom Lane [Mon, 30 Apr 2007 21:01:53 +0000 (21:01 +0000)]
Change the timestamps recorded in transaction commit/abort xlog records
from time_t to TimestampTz representation. This provides full gettimeofday()
resolution of the timestamps, which might be useful when attempting to
do point-in-time recovery --- previously it was not possible to specify
the stop point with sub-second resolution. But mostly this is to get
rid of TimestampTz-to-time_t conversion overhead during commit. Per my
proposal of a day or two back.
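As a rough illustration of the resolution gained, the sketch below converts a Unix seconds-plus-microseconds pair into a microsecond count relative to 2000-01-01. It assumes a 64-bit integer timestamp representation; the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds from 1970-01-01 to 2000-01-01 (10957 days). */
#define UNIX_TO_2000_EPOCH_SECS INT64_C(946684800)
#define USECS_PER_SEC           INT64_C(1000000)

/*
 * Convert gettimeofday()-style seconds and microseconds into a single
 * microsecond count since 2000-01-01.  A time_t alone would lose the
 * "usecs" part entirely.
 */
int64_t
unix_time_to_timestamp_us(int64_t secs, int32_t usecs)
{
    assert(usecs >= 0 && usecs < USECS_PER_SEC);
    return (secs - UNIX_TO_2000_EPOCH_SECS) * USECS_PER_SEC + usecs;
}
```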
Tom Lane [Mon, 30 Apr 2007 03:23:49 +0000 (03:23 +0000)]
Implement rate-limiting logic on how often backends will attempt to send
messages to the stats collector. This avoids the problem that enabling
stats_row_level for autovacuum has a significant overhead for short
read-only transactions, as noted by Arjen van der Meijden. We can avoid
an extra gettimeofday call by piggybacking on the one done for WAL-logging
xact commit or abort (although that doesn't help read-only transactions,
since they don't WAL-log anything).
In my proposal for this, I noted that we could change the WAL log entries
for commit/abort to record full TimestampTz precision, instead of only
time_t as at present. That's not done in this patch, but will be committed
separately.
Tom Lane [Mon, 30 Apr 2007 00:16:43 +0000 (00:16 +0000)]
Marginal performance hack: use a dedicated routine instead of copyObject
to copy nodes that are known to be Vars during plan reference adjustment.
Saves useless memzero operation as well as the big switch in copyObject.
Tom Lane [Mon, 30 Apr 2007 00:14:54 +0000 (00:14 +0000)]
Marginal performance hack: avoid unnecessary work in expression_tree_mutator.
We can just palloc, instead of using makeNode, when we are going to
overwrite the whole node anyway in the FLATCOPY macro. Also, use
FLATCOPY instead of copyObject for common node types Var and Const.
Tom Lane [Mon, 30 Apr 2007 00:12:08 +0000 (00:12 +0000)]
Marginal performance hack: remove the loop that used to be needed to
look through a freelist for a chunk of adequate size. For a long time
now, all elements of a given freelist have been exactly the same
allocated size, so we don't need a loop. Since the loop never iterated
more than once, you'd think this wouldn't matter much, but it makes a
noticeable savings in a simple test --- perhaps because the compiler
isn't optimizing on a mistaken assumption that the loop would repeat.
AllocSetAlloc is called often enough that saving even a couple of
instructions is worthwhile.
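A simplified sketch of why no loop is needed: if each freelist holds chunks of exactly one size, the head of the right list is always adequate. The sizes, list count, and names below are illustrative assumptions, not the aset.c definitions.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_FREELISTS 8
#define MIN_CHUNK     8     /* freelist k holds chunks of MIN_CHUNK << k bytes */

typedef struct Chunk
{
    struct Chunk *next;
} Chunk;

static Chunk *freelists[NUM_FREELISTS];

/* Smallest k such that (MIN_CHUNK << k) >= size. */
static int
freelist_index(size_t size)
{
    int         k = 0;

    while (((size_t) MIN_CHUNK << k) < size)
        k++;
    return k;
}

/* Return a chunk to its freelist; "size" is its allocated size. */
void
freelist_push(void *chunk, size_t size)
{
    int         k = freelist_index(size);
    Chunk      *c = chunk;

    assert(k < NUM_FREELISTS);
    c->next = freelists[k];
    freelists[k] = c;
}

/*
 * Take a chunk big enough for "size", or NULL if the list is empty.
 * Every chunk on list k has the same size, so the head always fits
 * and there is nothing to search for.
 */
void *
freelist_pop(size_t size)
{
    int         k = freelist_index(size);
    Chunk      *c;

    assert(k < NUM_FREELISTS);
    c = freelists[k];
    if (c != NULL)
        freelists[k] = c->next;
    return c;
}
```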
Bruce Momjian [Sun, 29 Apr 2007 06:48:11 +0000 (06:48 +0000)]
Pl/pgsql MOVE done:
< o Add support for MOVE and SCROLL cursors
<
< PL/pgSQL cursors should support the same syntax as
< backend cursors.
<
> o -Add support for MOVE cursors
> o Add support for SCROLL cursors
Neil Conway [Sat, 28 Apr 2007 23:54:59 +0000 (23:54 +0000)]
Add support for IN as alternative to FROM in PL/PgSQL's FETCH statement,
for consistency with the backend's FETCH command. Patch from Pavel
Stehule, reviewed by Neil Conway.
Tom Lane [Fri, 27 Apr 2007 22:05:49 +0000 (22:05 +0000)]
Modify processing of DECLARE CURSOR and EXPLAIN so that they can resolve the
types of unspecified parameters when submitted via extended query protocol.
This worked in 8.2 but I had broken it during plancache changes. DECLARE
CURSOR is now treated almost exactly like a plain SELECT through parse
analysis, rewrite, and planning; only just before sending to the executor
do we divert it away to ProcessUtility. This requires a special-case check
in a number of places, but practically all of them were already special-casing
SELECT INTO, so it's not too ugly. (Maybe it would be a good idea to merge
the two by treating IntoClause as a form of utility statement? Not going to
worry about that now, though.) That approach doesn't work for EXPLAIN,
however, so for that I punted and used a klugy solution of running parse
analysis an extra time if under extended query protocol.
Neil Conway [Fri, 27 Apr 2007 20:08:43 +0000 (20:08 +0000)]
Remove no-longer-true statement from the docs. Since the default config
now enables row-level stats, the out of the box stats volume is no
longer particularly low.
Tom Lane [Thu, 26 Apr 2007 23:24:46 +0000 (23:24 +0000)]
Fix dynahash.c to suppress hash bucket splits while a hash_seq_search() scan
is in progress on the same hashtable. This seems the least invasive way to
fix the recently-recognized problem that a split could cause the scan to
visit entries twice or (with much lower probability) miss them entirely.
The only field-reported problem caused by this is the "failed to re-find
shared lock object" PANIC in COMMIT PREPARED reported by Michel Dorochevsky,
which was caused by multiply visited entries. However, it seems certain
that mdsync() is vulnerable to missing required fsync's due to missed
entries, and I am fearful that RelationCacheInitializePhase2() might be at
risk as well. Because of that and the generalized hazard presented by this
bug, back-patch all the supported branches.
Along the way, fix pg_prepared_statement() and pg_cursor() to not assume
that the hashtables they are examining will stay static between calls.
This is risky regardless of the newly noted dynahash problem, because
hash_seq_search() has never promised to cope with deletion of table entries
other than the just-returned one. There may be no bug here because the only
supported way to call these functions is via ExecMakeTableFunctionResult()
which will cycle them to completion before doing anything very interesting,
but it seems best to get rid of the assumption. This affects 8.2 and HEAD
only, since those functions weren't there earlier.
Neil Conway [Thu, 26 Apr 2007 22:25:56 +0000 (22:25 +0000)]
Another tweak for tab completion of CREATE TEMP. Instead of only
completing CREATE { TEMP | TEMPORARY } TABLE, we should also suggest
VIEW and SEQUENCE. Per Greg Sabino Mullane.
Neil Conway [Thu, 26 Apr 2007 18:10:28 +0000 (18:10 +0000)]
Minor enhancement to psql tab completion. If we see "CREATE TEMPORARY",
we can complete "TABLE". The previous coding only looked for "CREATE TEMP".
Note that I didn't add TEMPORARY to the list of suggested completions
after we've seen "CREATE", since TEMP is equivalent and more concise. But
if the user has already manually typed TEMPORARY, we may as well
complete TABLE for them.
Neil Conway [Thu, 26 Apr 2007 16:13:15 +0000 (16:13 +0000)]
Rename the newly-added commands for discarding session state.
RESET SESSION, RESET PLANS, and RESET TEMP are now DISCARD ALL,
DISCARD PLANS, and DISCARD TEMP, respectively. This is to avoid
confusion with the pre-existing RESET variants: the DISCARD
commands are not actually similar to RESET. Patch from Marko
Kreen, with some minor editorialization.
Tom Lane [Sun, 22 Apr 2007 03:52:40 +0000 (03:52 +0000)]
Remove some of the most blatant brain-fade in the recent guc patch
(it's so nice to have a buildfarm member that actively rejects naked
uses of strcasecmp). This coding is still pretty awful, though, since
it's going to be O(N^2) in the number of guc variables. May I direct
your attention to bsearch?
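A sketch of the bsearch approach hinted at above: keep the names sorted and look each one up in O(log N). The table is a small hypothetical subset, and ci_strcmp is an ASCII-only stand-in for whatever case-insensitive comparison the real code would use.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* ASCII case-insensitive string comparison (stand-in helper). */
static int
ci_strcmp(const char *a, const char *b)
{
    for (;;)
    {
        int         ca = tolower((unsigned char) *a++);
        int         cb = tolower((unsigned char) *b++);

        if (ca != cb)
            return ca - cb;
        if (ca == '\0')
            return 0;
    }
}

/* bsearch callback: key is a string, elem points at a table entry. */
static int
cmp_key_to_elem(const void *key, const void *elem)
{
    return ci_strcmp((const char *) key, *(const char *const *) elem);
}

/* Must be kept sorted according to ci_strcmp. */
static const char *const guc_names[] = {
    "autovacuum",
    "max_connections",
    "search_path",
    "shared_buffers",
    "work_mem"
};

/* Return the index of "name" in guc_names, or -1 if not present. */
int
find_guc(const char *name)
{
    const char *const *hit;

    assert(name != NULL);
    hit = bsearch(name, guc_names,
                  sizeof(guc_names) / sizeof(guc_names[0]),
                  sizeof(guc_names[0]),
                  cmp_key_to_elem);
    return hit ? (int) (hit - guc_names) : -1;
}
```

Looking up N names this way costs O(N log N) in total, compared with O(N^2) for a linear scan per name.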
Tom Lane [Sat, 21 Apr 2007 21:01:45 +0000 (21:01 +0000)]
Some further performance tweaks for planning large inheritance trees that
are mostly excluded by constraints: do the CE test a bit earlier to save
some adjust_appendrel_attrs() work on excluded children, and arrange to
use array indexing rather than rt_fetch() to fetch RTEs in the main body
of the planner. The latter is something I'd wanted to do for a while anyway,
but seeing list_nth_cell() as 35% of the runtime gets one's attention.
Tom Lane [Sat, 21 Apr 2007 05:56:41 +0000 (05:56 +0000)]
Tweak make_inh_translation_lists() to check the common case wherein parent and
child attnums are the same, before it grovels through each and every child
column looking for a name match. Saves some time in large inheritance trees,
per example from Greg.
Tom Lane [Sat, 21 Apr 2007 04:49:20 +0000 (04:49 +0000)]
Improve the way in which CatalogCacheComputeHashValue combines multiple key
values: don't throw away perfectly good hash bits, and increase the shift
distances so as to provide more separation in the common case where some of
the key values are small integers (and so their hashes are too, because
hashfunc.c doesn't try all that hard). This reduces the runtime of
SearchCatCache by a factor of 4 in an example provided by Greg Stark,
in which the planner spends a whole lot of time searching the two-key
STATRELATT cache. It seems unlikely to hurt in other cases, but maybe
we could do even better?
Tom Lane [Sat, 21 Apr 2007 04:10:53 +0000 (04:10 +0000)]
Adjust pgstat_initstats() to avoid repeated searches of the TabStat arrays
when a relation is opened multiple times in the same transaction. This is
particularly useful for system catalogs, which we may heap_open or index_open
many times in a transaction, and it doesn't really cost anything extra even
if the rel is touched but once. Motivated by study of an example from Greg
Stark, in which pgstat_initstats() accounted for an unreasonably large
fraction of the runtime.
Tom Lane [Sat, 21 Apr 2007 02:41:13 +0000 (02:41 +0000)]
Tweak set_rel_width() to avoid redundant executions of getrelid().
In very large queries this accounts for a noticeable fraction of
planning time. Per an example from Greg Stark.
Tom Lane [Fri, 20 Apr 2007 02:37:38 +0000 (02:37 +0000)]
Support explicit placement of the temporary-table schema within search_path.
This is needed to allow a security-definer function to set a truly secure
value of search_path. Without it, a malicious user can use temporary objects
to execute code with the privileges of the security-definer function. Even
pushing the temp schema to the back of the search path is not quite good
enough, because a function or operator at the back of the path might still
capture control from one nearer the front due to having a more exact datatype
match. Hence, disable searching the temp schema altogether for functions and
operators.
Tom Lane [Thu, 19 Apr 2007 20:24:04 +0000 (20:24 +0000)]
Repair PANIC condition in hash indexes when a previous index extension attempt
failed (due to lock conflicts or out-of-space). We might have already
extended the index's filesystem EOF before failing, causing the EOF to be
beyond what the metapage says is the last used page. Hence the invariant
maintained by the code needs to be "EOF is at or beyond last used page",
not "EOF is exactly the last used page". Problem was created by my patch
of 2006-11-19 that attempted to repair bug #2737. Since that was
back-patched to 7.4, this needs to be as well. Per report and test case
from Vlastimil Krejcir.
Tom Lane [Thu, 19 Apr 2007 16:33:24 +0000 (16:33 +0000)]
Fix plpgsql to avoid reference to already-freed memory when returning a
pass-by-reference data type and the RETURN statement is within an EXCEPTION
block. Bug introduced by my fix of 2007-01-28 to use per-subtransaction
ExprContexts/EStates; since that wasn't back-patched into older branches,
only 8.2 and HEAD are affected. Per report from Gary Winslow.
Bruce Momjian [Wed, 18 Apr 2007 00:17:56 +0000 (00:17 +0000)]
Document that the COPY delimiter must be an ASCII byte, rather than a
multi-byte value. It can also be a single-byte encoded character if
the client and server versions match.