Tom Lane [Mon, 30 Apr 2007 03:23:49 +0000 (03:23 +0000)]
Implement rate-limiting logic on how often backends will attempt to send
messages to the stats collector. This avoids the problem that enabling
stats_row_level for autovacuum has a significant overhead for short
read-only transactions, as noted by Arjen van der Meijden. We can avoid
an extra gettimeofday call by piggybacking on the one done for WAL-logging
xact commit or abort (although that doesn't help read-only transactions,
since they don't WAL-log anything).
In my proposal for this, I noted that we could change the WAL log entries
for commit/abort to record full TimestampTz precision, instead of only
time_t as at present. That's not done in this patch, but will be committed
separately.
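
Illustration (not part of the commit): a minimal Python sketch of the rate-limiting idea described above. The class name, the callback, and the 0.5-second interval are assumptions made for this example and are not taken from the patch. A backend keeps accumulating counters and only sends them when enough time has passed since the last send, reusing the commit timestamp when one is already available.

```python
import time

MIN_INTERVAL = 0.5  # assumed minimum seconds between sends; not from the commit


class StatsReporter:
    """Accumulates counters and sends them at most once per MIN_INTERVAL."""

    def __init__(self, send):
        self.send = send          # callable that delivers one batch
        self.pending = {}         # table name -> accumulated row count
        self.last_sent = None     # timestamp of the most recent send

    def count(self, table, rows):
        self.pending[table] = self.pending.get(table, 0) + rows

    def at_transaction_end(self, commit_time=None):
        # Reuse the timestamp already taken for the commit record if the
        # caller has one; a read-only transaction has none, so read the clock.
        # (All timestamps are assumed to come from the same clock.)
        now = commit_time if commit_time is not None else time.monotonic()
        if not self.pending:
            return False
        if self.last_sent is not None and now - self.last_sent < MIN_INTERVAL:
            return False          # too soon: keep accumulating
        self.send(dict(self.pending))
        self.pending.clear()
        self.last_sent = now
        return True
```

Counters skipped by the rate limit are not lost in this sketch; they stay in `pending` and go out with the next batch that is allowed through.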
Tom Lane [Mon, 30 Apr 2007 00:16:43 +0000 (00:16 +0000)]
Marginal performance hack: use a dedicated routine instead of copyObject
to copy nodes that are known to be Vars during plan reference adjustment.
Saves useless memzero operation as well as the big switch in copyObject.
Tom Lane [Mon, 30 Apr 2007 00:14:54 +0000 (00:14 +0000)]
Marginal performance hack: avoid unnecessary work in expression_tree_mutator.
We can just palloc, instead of using makeNode, when we are going to
overwrite the whole node anyway in the FLATCOPY macro. Also, use
FLATCOPY instead of copyObject for common node types Var and Const.
Tom Lane [Mon, 30 Apr 2007 00:12:08 +0000 (00:12 +0000)]
Marginal performance hack: remove the loop that used to be needed to
look through a freelist for a chunk of adequate size. For a long time
now, all elements of a given freelist have been exactly the same
allocated size, so we don't need a loop. Since the loop never iterated
more than once, you'd think this wouldn't matter much, but it yields a
noticeable saving in a simple test --- perhaps because the compiler
isn't optimizing on a mistaken assumption that the loop would repeat.
AllocSetAlloc is called often enough that saving even a couple of
instructions is worthwhile.
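
Illustration (not part of the commit): a small Python model of why no search loop is needed when every chunk on a given freelist has the same size. The `Arena` class, the 8-byte minimum chunk, and the number of lists are assumptions for this sketch. It does not reproduce the real AllocSetAlloc.

```python
MIN_CHUNK = 8  # assumed smallest chunk size in bytes


def freelist_index(size):
    """Return the size class whose chunk size is the smallest power of two >= size."""
    idx = 0
    chunk = MIN_CHUNK
    while chunk < size:
        chunk <<= 1
        idx += 1
    return idx


class Arena:
    def __init__(self, nlists=8):
        self.freelists = [[] for _ in range(nlists)]
        self.fresh = 0  # number of chunks created because no free one existed

    def alloc(self, size):
        idx = freelist_index(size)
        if idx >= len(self.freelists):
            raise ValueError("oversized requests are not modeled in this sketch")
        free = self.freelists[idx]
        if free:
            # Every chunk on this list has size MIN_CHUNK << idx, so the
            # first one is guaranteed to fit: no loop over the list.
            return free.pop()
        self.fresh += 1
        return {"size": MIN_CHUNK << idx, "id": self.fresh}

    def free(self, chunk):
        self.freelists[freelist_index(chunk["size"])].append(chunk)
```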
Bruce Momjian [Sun, 29 Apr 2007 06:48:11 +0000 (06:48 +0000)]
PL/pgSQL MOVE done:
< o Add support for MOVE and SCROLL cursors
<
< PL/pgSQL cursors should support the same syntax as
< backend cursors.
<
> o -Add support for MOVE cursors
> o Add support for SCROLL cursors
Neil Conway [Sat, 28 Apr 2007 23:54:59 +0000 (23:54 +0000)]
Add support for IN as an alternative to FROM in PL/pgSQL's FETCH statement,
for consistency with the backend's FETCH command. Patch from Pavel
Stehule, reviewed by Neil Conway.
Tom Lane [Fri, 27 Apr 2007 22:05:49 +0000 (22:05 +0000)]
Modify processing of DECLARE CURSOR and EXPLAIN so that they can resolve the
types of unspecified parameters when submitted via extended query protocol.
This worked in 8.2 but I had broken it during plancache changes. DECLARE
CURSOR is now treated almost exactly like a plain SELECT through parse
analysis, rewrite, and planning; only just before sending to the executor
do we divert it away to ProcessUtility. This requires a special-case check
in a number of places, but practically all of them were already special-casing
SELECT INTO, so it's not too ugly. (Maybe it would be a good idea to merge
the two by treating IntoClause as a form of utility statement? Not going to
worry about that now, though.) That approach doesn't work for EXPLAIN,
however, so for that I punted and used a klugy solution of running parse
analysis an extra time if under extended query protocol.
Neil Conway [Fri, 27 Apr 2007 20:08:43 +0000 (20:08 +0000)]
Remove no-longer-true statement from the docs. Since the default config
now enables row-level stats, the out-of-the-box stats volume is no
longer particularly low.
Tom Lane [Thu, 26 Apr 2007 23:24:46 +0000 (23:24 +0000)]
Fix dynahash.c to suppress hash bucket splits while a hash_seq_search() scan
is in progress on the same hashtable. This seems the least invasive way to
fix the recently-recognized problem that a split could cause the scan to
visit entries twice or (with much lower probability) miss them entirely.
The only field-reported problem caused by this is the "failed to re-find
shared lock object" PANIC in COMMIT PREPARED reported by Michel Dorochevsky,
which was caused by multiply visited entries. However, it seems certain
that mdsync() is vulnerable to missing required fsync's due to missed
entries, and I am fearful that RelationCacheInitializePhase2() might be at
risk as well. Because of that and the generalized hazard presented by this
bug, back-patch to all the supported branches.
Along the way, fix pg_prepared_statement() and pg_cursor() to not assume
that the hashtables they are examining will stay static between calls.
This is risky regardless of the newly noted dynahash problem, because
hash_seq_search() has never promised to cope with deletion of table entries
other than the just-returned one. There may be no bug here because the only
supported way to call these functions is via ExecMakeTableFunctionResult()
which will cycle them to completion before doing anything very interesting,
but it seems best to get rid of the assumption. This affects 8.2 and HEAD
only, since those functions weren't there earlier.
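
Illustration (not part of the commit): a Python sketch of the general idea of postponing table growth while a sequential scan is open. Everything here is a simplified stand-in: the class and its fields are hypothetical, and the table doubles all at once instead of splitting one bucket at a time as dynahash does.

```python
class ScanSafeTable:
    """Chained hash table that postpones growth while any scan is open."""

    def __init__(self, nbuckets=2, max_fill=2):
        self.buckets = [[] for _ in range(nbuckets)]
        self.max_fill = max_fill      # average entries per bucket before growing
        self.count = 0
        self.active_scans = 0

    def insert(self, key):
        self.buckets[hash(key) % len(self.buckets)].append(key)
        self.count += 1
        self._maybe_grow()

    def _maybe_grow(self):
        if self.active_scans > 0:
            return                    # a scan is open: moving entries is unsafe
        if self.count <= self.max_fill * len(self.buckets):
            return
        old = [k for b in self.buckets for k in b]
        self.buckets = [[] for _ in range(2 * len(self.buckets))]
        for k in old:
            self.buckets[hash(k) % len(self.buckets)].append(k)

    def scan(self):
        """Yield keys bucket by bucket; growth is held off until the scan ends."""
        self.active_scans += 1
        try:
            b = 0
            while b < len(self.buckets):
                # Iterate over a snapshot so inserts into this bucket during
                # the scan do not disturb the iteration.
                for key in list(self.buckets[b]):
                    yield key
                b += 1
        finally:
            self.active_scans -= 1
            self._maybe_grow()        # catch up on any growth that was deferred
```

Because entries never move between buckets while `active_scans` is nonzero, a scan in this model cannot see the same entry twice or skip an entry that existed when it started. Entries inserted during the scan may or may not be visited.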
Neil Conway [Thu, 26 Apr 2007 22:25:56 +0000 (22:25 +0000)]
Another tweak for tab completion of CREATE TEMP. Instead of only
completing CREATE { TEMP | TEMPORARY } TABLE, we should also suggest
VIEW and SEQUENCE. Per Greg Sabino Mullane.
Neil Conway [Thu, 26 Apr 2007 18:10:28 +0000 (18:10 +0000)]
Minor enhancement to psql tab completion. If we see "CREATE TEMPORARY",
we can complete "TABLE". The previous coding only looked for "CREATE TEMP".
Note that I didn't add TEMPORARY to the list of suggested completions
after we've seen "CREATE", since TEMP is equivalent and more concise. But
if the user has already manually typed TEMPORARY, we may as well
complete TABLE for them.
Neil Conway [Thu, 26 Apr 2007 16:13:15 +0000 (16:13 +0000)]
Rename the newly-added commands for discarding session state.
RESET SESSION, RESET PLANS, and RESET TEMP are now DISCARD ALL,
DISCARD PLANS, and DISCARD TEMP, respectively. This is to avoid
confusion with the pre-existing RESET variants: the DISCARD
commands are not actually similar to RESET. Patch from Marko
Kreen, with some minor editorialization.
Tom Lane [Sun, 22 Apr 2007 03:52:40 +0000 (03:52 +0000)]
Remove some of the most blatant brain-fade in the recent guc patch
(it's so nice to have a buildfarm member that actively rejects naked
uses of strcasecmp). This coding is still pretty awful, though, since
it's going to be O(N^2) in the number of guc variables. May I direct
your attention to bsearch?
Tom Lane [Sat, 21 Apr 2007 21:01:45 +0000 (21:01 +0000)]
Some further performance tweaks for planning large inheritance trees that
are mostly excluded by constraints: do the CE test a bit earlier to save
some adjust_appendrel_attrs() work on excluded children, and arrange to
use array indexing rather than rt_fetch() to fetch RTEs in the main body
of the planner. The latter is something I'd wanted to do for a while anyway,
but seeing list_nth_cell() as 35% of the runtime gets one's attention.
Tom Lane [Sat, 21 Apr 2007 05:56:41 +0000 (05:56 +0000)]
Tweak make_inh_translation_lists() to check the common case wherein parent and
child attnums are the same, before it grovels through each and every child
column looking for a name match. Saves some time in large inheritance trees,
per example from Greg.
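
Illustration (not part of the commit): a Python sketch of the "check the same position first" shortcut. Columns are modeled as plain lists of names, which is an assumption of this example; dropped columns and type checks are not modeled.

```python
def build_translation(parent_cols, child_cols):
    """Map each parent column position to the child position with the same name."""
    mapping = []
    for pos, name in enumerate(parent_cols):
        # Fast path: the child column at the same position has the same name.
        if pos < len(child_cols) and child_cols[pos] == name:
            mapping.append(pos)
            continue
        # Slow path: search every child column for a name match.
        for cpos, cname in enumerate(child_cols):
            if cname == name:
                mapping.append(cpos)
                break
        else:
            raise LookupError(f"child has no column named {name!r}")
    return mapping
```

When parent and child layouts agree, each column costs one comparison instead of a scan over all child columns.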
Tom Lane [Sat, 21 Apr 2007 04:49:20 +0000 (04:49 +0000)]
Improve the way in which CatalogCacheComputeHashValue combines multiple key
values: don't throw away perfectly good hash bits, and increase the shift
distances so as to provide more separation in the common case where some of
the key values are small integers (and so their hashes are too, because
hashfunc.c doesn't try all that hard). This reduces the runtime of
SearchCatCache by a factor of 4 in an example provided by Greg Stark,
in which the planner spends a whole lot of time searching the two-key
STATRELATT cache. It seems unlikely to hurt in other cases, but maybe
we could do even better?
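
Illustration (not part of the commit): a Python sketch contrasting two ways of combining per-key 32-bit hashes. The function names and the distances 3 and 8 are assumptions chosen for this example, not values quoted from the patch. The point is only that small plain shifts both discard high bits and leave small integers overlapping, while rotation by larger distances does neither.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value, distance):
    """Rotate a 32-bit value left; bits shifted off the top re-enter at the bottom."""
    distance %= 32
    return ((value << distance) | (value >> (32 - distance))) & MASK32


def combine_lossy(hashes, step=3):
    """Plain left shifts by small distances: high bits fall off the end."""
    result = 0
    for i, h in enumerate(hashes):
        result ^= (h << (step * i)) & MASK32
    return result


def combine_rotating(hashes, step=8):
    """Rotations by larger distances: every input bit survives."""
    result = 0
    for i, h in enumerate(hashes):
        result ^= rotl32(h & MASK32, step * i)
    return result
```

With small-integer hashes, `combine_lossy([1, 0])` and `combine_lossy([9, 1])` give the same result, while `combine_rotating` keeps the two key pairs apart.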
Tom Lane [Sat, 21 Apr 2007 04:10:53 +0000 (04:10 +0000)]
Adjust pgstat_initstats() to avoid repeated searches of the TabStat arrays
when a relation is opened multiple times in the same transaction. This is
particularly useful for system catalogs, which we may heap_open or index_open
many times in a transaction, and it doesn't really cost anything extra even
if the rel is touched but once. Motivated by study of an example from Greg
Stark, in which pgstat_initstats() accounted for an unreasonably large
fraction of the runtime.
Tom Lane [Sat, 21 Apr 2007 02:41:13 +0000 (02:41 +0000)]
Tweak set_rel_width() to avoid redundant executions of getrelid().
In very large queries this accounts for a noticeable fraction of
planning time. Per an example from Greg Stark.
Tom Lane [Fri, 20 Apr 2007 02:37:38 +0000 (02:37 +0000)]
Support explicit placement of the temporary-table schema within search_path.
This is needed to allow a security-definer function to set a truly secure
value of search_path. Without it, a malicious user can use temporary objects
to execute code with the privileges of the security-definer function. Even
pushing the temp schema to the back of the search path is not quite good
enough, because a function or operator at the back of the path might still
capture control from one nearer the front due to having a more exact datatype
match. Hence, disable searching the temp schema altogether for functions and
operators.
Tom Lane [Thu, 19 Apr 2007 20:24:04 +0000 (20:24 +0000)]
Repair PANIC condition in hash indexes when a previous index extension attempt
failed (due to lock conflicts or out-of-space). We might have already
extended the index's filesystem EOF before failing, causing the EOF to be
beyond what the metapage says is the last used page. Hence the invariant
maintained by the code needs to be "EOF is at or beyond last used page",
not "EOF is exactly the last used page". Problem was created by my patch
of 2006-11-19 that attempted to repair bug #2737. Since that was
back-patched to 7.4, this needs to be as well. Per report and test case
from Vlastimil Krejcir.
Tom Lane [Thu, 19 Apr 2007 16:33:24 +0000 (16:33 +0000)]
Fix plpgsql to avoid reference to already-freed memory when returning a
pass-by-reference data type and the RETURN statement is within an EXCEPTION
block. Bug introduced by my fix of 2007-01-28 to use per-subtransaction
ExprContexts/EStates; since that wasn't back-patched into older branches,
only 8.2 and HEAD are affected. Per report from Gary Winslow.
Bruce Momjian [Wed, 18 Apr 2007 00:17:56 +0000 (00:17 +0000)]
Document that the COPY delimiter must be an ASCII byte, rather than a
multi-byte value. It can also be a single-byte encoded character if
the client and server encodings match.
Bruce Momjian [Tue, 17 Apr 2007 20:50:34 +0000 (20:50 +0000)]
Add warning about TODO item:
< Currently all schemas are owned by the super-user because they are
< copied from the template1 database.
> Currently all schemas are owned by the super-user because they are copied
> from the template1 database. However, since all objects are inherited
> from the template database, it is not clear that setting schemas to the db
> owner is correct.
Tom Lane [Tue, 17 Apr 2007 20:49:39 +0000 (20:49 +0000)]
Don't assume rd_smgr stays open across all of a rewriteheap operation;
doing so can result in a crash if an sinval reset occurs meanwhile.
I believe this explains intermittent buildfarm failures in cluster test.
Tom Lane [Tue, 17 Apr 2007 20:03:03 +0000 (20:03 +0000)]
Rewrite choose_bitmap_and() to make it more robust in the presence of
competing alternatives for indexes to use in a bitmap scan. The former
coding took estimated selectivity as an overriding factor, causing it to
sometimes choose indexes that were much slower to scan than ones with a
slightly worse selectivity. It was also too narrow-minded about which
combinations of indexes to consider ANDing. The rewrite makes it pay more
attention to index scan cost than selectivity; this seems sane since it's
impossible to have very bad selectivity with low cost, whereas the reverse
isn't true. Also, we now consider each index alone, as well as adding
each index to an AND-group led by each prior index, for a total of about
O(N^2) rather than O(N) combinations considered. This makes the results
much less dependent on the exact order in which the indexes are
considered. It's still a lot cheaper than an O(2^N) exhaustive search.
A prefilter step eliminates all but the cheapest of those indexes using
the same set of WHERE conditions, to keep the effective value of N down in
scenarios where the DBA has created lots of partially-redundant indexes.
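
Illustration (not part of the commit): a Python sketch of the search strategy described above. The `IndexPath` type, the toy cost model in `group_cost`, and the rule of skipping indexes whose clauses are already covered are assumptions of this example. The real planner's cost estimation is far more detailed.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class IndexPath:
    name: str
    scan_cost: float        # cost of scanning the index itself
    selectivity: float      # fraction of table rows the index lets through
    clauses: frozenset      # WHERE conditions the index uses


def group_cost(group, table_rows, row_cost=1.0):
    """Toy cost model: index scan costs plus one unit per heap row fetched."""
    heap_rows = table_rows * prod(p.selectivity for p in group)
    return sum(p.scan_cost for p in group) + heap_rows * row_cost


def choose_and_group(paths, table_rows):
    # Prefilter: among indexes using the same set of clauses, keep the cheapest.
    cheapest = {}
    for p in paths:
        kept = cheapest.get(p.clauses)
        if kept is None or p.scan_cost < kept.scan_cost:
            cheapest[p.clauses] = p
    ordered = sorted(cheapest.values(), key=lambda p: (p.scan_cost, p.selectivity))

    best, best_cost = None, float("inf")
    # Each index leads one group; later indexes join only if they lower the cost.
    for i, leader in enumerate(ordered):
        group, cost, used = [leader], group_cost([leader], table_rows), set(leader.clauses)
        for cand in ordered[i + 1:]:
            if used & cand.clauses:
                continue    # adds no new condition
            trial = group_cost(group + [cand], table_rows)
            if trial < cost:
                group.append(cand)
                cost = trial
                used |= cand.clauses
        if cost < best_cost:
            best, best_cost = group, cost
    return best, best_cost
```

The two nested passes over the sorted list give roughly N^2/2 cost evaluations, in line with the O(N^2) figure in the message, and an index with excellent selectivity but a very expensive scan is left out when it does not pay for itself.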
Tom Lane [Mon, 16 Apr 2007 18:42:10 +0000 (18:42 +0000)]
Fix pg_dump to not crash if -t or a similar switch is used to select a serial
sequence for dumping without also selecting its owning table. Make it not try
to emit ALTER SEQUENCE OWNED BY in this situation.
Per report from Michael Nolan.
Add a multi-worker capability to autovacuum. This allows multiple worker
processes to run simultaneously. Also, autovacuum processes no longer
count towards the max_connections limit; they are counted separately from
regular processes, and are limited by the new GUC variable
autovacuum_max_workers.
The launcher now has the intelligence to launch workers on each database every
autovacuum_naptime seconds, limited only by the number of worker slots
available.
Also, the global worker I/O utilization is limited by the vacuum cost-based
delay feature. Workers are "balanced" so that the total I/O consumption does
not exceed the established limit. This part of the patch was contributed by
ITAGAKI Takahiro.
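
Illustration (not part of the commit): a Python sketch of one way to "balance" workers so that their combined cost budget stays within a global limit. Proportional scaling is an assumption of this example; the message above does not spell out the formula the patch uses, and the function and parameter names are hypothetical.

```python
def balance_cost_limits(global_limit, requested):
    """Scale per-worker cost limits down so that their sum fits global_limit.

    requested maps a worker name to the cost limit it would use if it ran alone.
    """
    total = sum(requested.values())
    if total <= global_limit:
        return dict(requested)      # already within budget: nothing to do
    # Integer division rounds down, so the scaled shares cannot exceed the
    # limit; the floor of 1 keeps every worker making some progress.
    return {w: max(1, r * global_limit // total) for w, r in requested.items()}
```

The balancing would be redone whenever a worker starts or finishes, so that the remaining workers can use the freed budget.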
Tom Lane [Mon, 16 Apr 2007 18:21:07 +0000 (18:21 +0000)]
Make plancache store cursor options so it can pass them to planner during
a replan. I had originally thought this was not necessary, but the new
SPI facilities create a path whereby queries planned with non-default
options can get into the cache, so it is necessary.
Tom Lane [Mon, 16 Apr 2007 01:14:58 +0000 (01:14 +0000)]
Expose more cursor-related functionality in SPI: specifically, allow
access to the planner's cursor-related planning options, and provide new
FETCH/MOVE routines that allow access to the full power of those commands.
Small refactoring of planner(), pg_plan_query(), and pg_plan_queries()
APIs to make it convenient to pass the planning options down from SPI.
This is the core-code portion of Pavel Stehule's patch for scrollable
cursor support in plpgsql; I'll review and apply the plpgsql changes
separately.
Tom Lane [Sun, 15 Apr 2007 20:09:28 +0000 (20:09 +0000)]
Avoid running build_index_pathkeys() in situations where there cannot
possibly be any useful pathkeys --- to wit, queries with neither any
join clauses nor any ORDER BY request. It's nearly free to check for
this case and it saves a useful fraction of the planning time for simple
queries.
Bruce Momjian [Fri, 13 Apr 2007 23:23:22 +0000 (23:23 +0000)]
Update TODO:
< o Consider reducing on-disk varlena length from four to two
< because a heap row cannot be more than 64k in length
> o Consider reducing on-disk varlena length from four bytes to
> two because a heap row cannot be more than 64k in length
Andrew Dunstan [Fri, 13 Apr 2007 18:50:01 +0000 (18:50 +0000)]
Enable building contrib/xml2 if configured using --with-libxml.
If this breaks things due to missing libxslt, then I'll have to
revert it, but let's see if it breaks the buildfarm.
Workarounds in case libxslt is missing include:
. don't configure with libxml, or
. don't build contrib modules from the contrib Makefile (use the
  individual module Makefiles instead), or
. change the xml2 Makefile
Neil Conway [Thu, 12 Apr 2007 22:39:21 +0000 (22:39 +0000)]
Minor fixes for the EXPLAIN reference page. Mention the fact that
EXPLAIN ANALYZE can sometimes be significantly slower than running
the same query normally, and make some minor markup improvements.