Tom Lane [Wed, 3 Jan 2007 18:11:01 +0000 (18:11 +0000)]
Clean up smgr.c/md.c APIs as per discussion a couple months ago. Instead of
having md.c return a success/failure boolean to smgr.c, which was just going
to elog anyway, let md.c issue the elog messages itself. This allows better
error reporting, particularly in cases such as "short read" or "short write"
which Peter was complaining of. Also, remove the kluge of allowing mdread()
to return zeroes from a read-beyond-EOF: this is now an error condition
except when InRecovery or zero_damaged_pages = true. (Hash indexes used to
require that behavior, but no more.) Also, enforce that mdwrite() is to be
used for rewriting existing blocks while mdextend() is to be used for
extending the relation EOF. This restriction lets us get rid of the old
ad-hoc defense against creating huge files by an accidental reference to
a bogus block number: we'll only create new segments in mdextend() not
mdwrite() or mdread(). (Again, when InRecovery we allow it anyway, since
we need to allow updates of blocks that were later truncated away.)
Also, clean up the original makeshift patch for bug #2737: move the
responsibility for padding relation segments to full length into md.c.
Bruce Momjian [Wed, 3 Jan 2007 04:21:47 +0000 (04:21 +0000)]
For float4/8, remove errno checks for pow() and exp() because only some
platforms set errno, and we already have a check macro that detects
under/overflow, so there is no reason for platform-specific code
anymore.
Bruce Momjian [Tue, 2 Jan 2007 20:00:50 +0000 (20:00 +0000)]
Fix float4/8 to handle Infinity and Nan consistently, e.g. Infinity is a
valid result from a computation if one of the input values was infinity.
The previous code assumed an operation that returned infinity was an
overflow.
Handle underflow/overflow consistently, and add checks for aggregate
overflow.
Consistently prevent Inf/Nan from being cast to integer data types.
Tom Lane [Sun, 31 Dec 2006 20:32:04 +0000 (20:32 +0000)]
Found the problem with my operator-family changes: by fetching from
pg_opclass during LookupOpclassInfo(), I'd turned pg_opclass_oid_index
into a critical system index. However the problem could only manifest
during a backend's first attempt to load opclass data, and then only
if it had successfully loaded pg_internal.init and subsequently received
a relcache flush; which made it impossible to reproduce in sequential
tests and darn hard even in parallel tests. Memo to self: when
exercising cache flush scenarios, must disable LookupOpclassInfo's
internal cache too.
Tom Lane [Sat, 30 Dec 2006 21:21:56 +0000 (21:21 +0000)]
Support type modifiers for user-defined types, and pull most knowledge
about typmod representation for standard types out into type-specific
typmod I/O functions. Teodor Sigaev, with some editorialization by
Tom Lane.
Tom Lane [Thu, 28 Dec 2006 23:16:39 +0000 (23:16 +0000)]
Fix up btree's initial scankey processing to be able to detect redundant
or contradictory keys even in cross-data-type scenarios. This is another
benefit of the opfamily rewrite: we can find the needed comparison
operators now.
Tom Lane [Thu, 28 Dec 2006 19:53:05 +0000 (19:53 +0000)]
Enable btree_predicate_proof() to make proofs involving cross-data-type
predicate operators. The hard stuff turns out to be already done in the
previous commit, we need merely open the floodgates...
Bruce Momjian [Thu, 28 Dec 2006 18:01:20 +0000 (18:01 +0000)]
Done:
< * Move some /contrib modules out to their own project sites
<
< Particularly, move GPL-licensed /contrib/userlock and
< /contrib/dbmirror/clean_pending.pl.
<
Tom Lane [Thu, 28 Dec 2006 01:09:01 +0000 (01:09 +0000)]
Add a defense to prevent core dumps if 8.2 version of rank_cd() is used with
the 8.1 SQL function definition for it. Per report from Rajesh Kumar Mallah,
such a DBA error doesn't seem at all improbable, and the cost of checking for
it is not very high compared to the cost of running this function. (It would
have been better to change the C name of the function so it wouldn't be called
by the old SQL definition, but it's too late for that now in the 8.2 branch.)
Tom Lane [Thu, 28 Dec 2006 00:29:13 +0000 (00:29 +0000)]
fflush the \o file, if any, after each backslash command. We already
do this for ordinary SQL commands, so it seems consistent to do it for
backslash commands too. Per gripe from Rajesh Kumar Mallah.
Tom Lane [Wed, 27 Dec 2006 22:31:54 +0000 (22:31 +0000)]
Modify local buffer management to request memory for local buffers in blocks
of increasing size, instead of one at a time. This reduces the memory
management overhead when num_temp_buffers is large: in the previous coding
we would actually waste 50% of the space used for temp buffers, because aset.c
would round the individual requests up to 16K. Problem noted while studying
a performance issue reported by Steven Flatt.
Back-patch as far as 8.1 --- older versions used few enough local buffers
that the issue isn't significant for them.
Tom Lane [Wed, 27 Dec 2006 22:30:48 +0000 (22:30 +0000)]
Improve memory management code to avoid inefficient behavior when a context
has a small maxBlockSize: the maximum request size that we will treat as a
"chunk" needs to be limited to fit in maxBlockSize. Otherwise we will round
up the request size to the next power of 2, wasting space, which is a bit
pointless if we aren't going to make the blocks big enough to fit additional
stuff in them. The example motivating this is local buffer management, which
makes repeated allocations of 8K (one BLCKSZ buffer) in TopMemoryContext,
which has maxBlockSize = 8K because for the most part allocations there are
small. This leads to each local buffer actually eating 16K of space, which
adds up when there are thousands of them. I intend to change localbuf.c to
aggregate its requests, which will prevent this particular misbehavior, but
it seems likely that similar scenarios could arise elsewhere, so fixing the
core problem seems wise as well.
Tom Lane [Wed, 27 Dec 2006 19:45:36 +0000 (19:45 +0000)]
Print combining characters (those reported as having zero width by
PQdsplen()) normally, instead of replacing them by \uXXXX sequences.
Assume that they in fact occupy zero screen space for formatting purposes.
Per gripe from Michael Fuhr and ensuing discussion.
Tom Lane [Wed, 27 Dec 2006 16:07:36 +0000 (16:07 +0000)]
Use FROM clause in example UPDATE commands where appropriate. Also
remove long-obsolete statement that there isn't a check for infinite
recursion in view rules.
Tom Lane [Tue, 26 Dec 2006 21:37:20 +0000 (21:37 +0000)]
Fix failure due to accessing an already-freed tuple descriptor in a plan
involving HashAggregate over SubqueryScan (this is the known case, there
may well be more). The bug is only latent in releases before 8.2 since they
didn't try to access tupletable slots' descriptors during ExecDropTupleTable.
The least bogus fix seems to be to make subqueries share the parent query's
memory context, so that tupdescs they create will have the same lifespan as
those of the parent query. There are comments in the code envisioning going
even further by not having a separate child EState at all, but that will
require rethinking executor access to range tables, which I don't want to
tackle right now. Per bug report from Jean-Pierre Pelletier.
Tom Lane [Tue, 26 Dec 2006 19:26:46 +0000 (19:26 +0000)]
Repair bug #2839: the various ExecReScan functions need to reset
ps_TupFromTlist in plan nodes that make use of it. This was being done
correctly in join nodes and Result nodes but not in any relation-scan nodes.
Bug would lead to bogus results if a set-returning function appeared in the
targetlist of a subquery that could be rescanned after partial execution,
for example a subquery within EXISTS(). Bug has been around forever :-(
... surprising it wasn't reported before.
Tom Lane [Tue, 26 Dec 2006 16:56:18 +0000 (16:56 +0000)]
Repair bug #2836: SPI_execute_plan returned zero if none of the querytrees
were marked canSetTag. While it's certainly correct to return the result
of the last one that is marked canSetTag, it's less clear what to do when
none of them are. Since plpgsql will complain if zero is returned, the
8.2.0 behavior isn't good. I've fixed it to restore the prior behavior of
returning the physically last query's result code when there are no
canSetTag queries.
Tatsuo Ishii [Tue, 26 Dec 2006 01:02:05 +0000 (01:02 +0000)]
Call srandom() instead of srand().
pgbench calls random() later, so it should have called srandom().
On most platforms except Windows srandom() is actually identical
to srand(), so the bug only bites Windows users.
per bug report from Akio Ishida.
Tom Lane [Sun, 24 Dec 2006 18:25:58 +0000 (18:25 +0000)]
Bring some order and sanity to error handling in the xml patch.
Use a TRY block instead of (inadequate) ad-hoc coding to ensure that
libxml is cleaned up after a failure. Report the intended SQLCODE
instead of defaulting to XX000. Avoid risking use of a dangling
pointer by keeping the persistent error buffer in TopMemoryContext.
Be less trusting that error messages don't contain %.
This patch doesn't do anything about changing the way the messages
are put together --- this is just about mechanism.
Tom Lane [Sun, 24 Dec 2006 00:57:48 +0000 (00:57 +0000)]
Fix machine-dependent crash in sqlchar_to_unicode(). Get rid of
bletcherous and unsafe manipulation of global encoding setting.
Clean up libxml reporting mechanism a bit (it still looks like a
dangling-pointer crash waiting to happen, though, not to mention
being far less than sane from a localization standpoint).
Tom Lane [Sun, 24 Dec 2006 00:29:20 +0000 (00:29 +0000)]
Code review for XML patch. Instill a bit of sanity in the location of
the XmlExpr code in various lists, use a representation that has some hope
of reverse-listing correctly (though it's still a de-escaping function
shy of correctness), generally try to make it look more like Postgres
coding conventions.
Bruce Momjian [Sat, 23 Dec 2006 00:52:40 +0000 (00:52 +0000)]
For GUC values, check for partial string matches on 'on' and 'off', but
require at least two characters for uniqueness. This now matches the
behavior of other boolean strings we support, per report from Gurjeet
Singh.
Tom Lane [Sat, 23 Dec 2006 00:43:13 +0000 (00:43 +0000)]
Restructure operator classes to allow improved handling of cross-data-type
cases. Operator classes now exist within "operator families". While most
families are equivalent to a single class, related classes can be grouped
into one family to represent the fact that they are semantically compatible.
Cross-type operators are now naturally adjunct parts of a family, without
having to wedge them into a particular opclass as we had done originally.
This commit restructures the catalogs and cleans up enough of the fallout so
that everything still works at least as well as before, but most of the work
needed to actually improve the planner's behavior will come later. Also,
there are not yet CREATE/DROP/ALTER OPERATOR FAMILY commands; the only way
to create a new family right now is to allow CREATE OPERATOR CLASS to make
one by default. I owe some more documentation work, too. But that can all
be done in smaller pieces once this infrastructure is in place.
Tom Lane [Mon, 18 Dec 2006 18:56:29 +0000 (18:56 +0000)]
Set pg_am.amstrategies to zero for index AMs that don't have fixed
operator strategy numbers, ie, GiST and GIN. This is almost cosmetic
enough to not need a catversion bump, but since the opr_sanity regression
test has to change in sync with the catalog entry, I figured I'd better
do one.
Tom Lane [Fri, 15 Dec 2006 18:42:26 +0000 (18:42 +0000)]
Fix some planner bugs exposed by reports from Arjen van der Meijden. These
are all in new-in-8.2 logic associated with indexability of ScalarArrayOpExpr
(IN-clauses) or amortization of indexscan costs across repeated indexscans
on the inside of a nestloop. In particular:
Fix some logic errors in the estimation for multiple scans induced by a
ScalarArrayOpExpr indexqual.
Include a small cost component in bitmap index scans to reflect the costs of
manipulating the bitmap itself; this is mainly to prevent a bitmap scan from
appearing to have the same cost as a plain indexscan for fetching a single
tuple.
Also add a per-index-scan-startup CPU cost component; while prior releases
were clearly too pessimistic about the cost of repeated indexscans, the
original 8.2 coding allowed the cost of an indexscan to effectively go to zero
if repeated often enough, which is overly optimistic.
Pay some attention to index correlation when estimating costs for a nestloop
inner indexscan: this is significant when the plan fetches multiple heap
tuples per iteration, since high correlation means those tuples are probably
on the same or adjacent heap pages.
Bruce Momjian [Fri, 15 Dec 2006 15:40:52 +0000 (15:40 +0000)]
TODO item not wanted:
>
> * Embedded server (not wanted)
>
> While PostgreSQL clients runs fine limited-resource environments, the
> server requires multiple processes and a stable pool of resources to
> run reliabily and efficiently. Stripping down the PostgreSQL server
> to run in the same process address space as the client application
> would add too much complexity and failure cases.
Bruce Momjian [Fri, 15 Dec 2006 13:28:54 +0000 (13:28 +0000)]
Link to summary XML email, rather than thread top:
< * Consider changing documentation from SGML to XML
> * Consider changing documentation format from SGML to XML
< http://archives.postgresql.org/pgsql-docs/2006-12/msg00033.php
> http://archives.postgresql.org/pgsql-docs/2006-12/msg00152.php
Bruce Momjian [Tue, 12 Dec 2006 22:31:19 +0000 (22:31 +0000)]
Update entry:
< * Have EXPLAIN ANALYZE highlight poor optimizer estimates
> * Have EXPLAIN ANALYZE issue NOTICE messages when the estimated and
> actual row counts differ by a specified percentage
Tom Lane [Tue, 12 Dec 2006 21:31:02 +0000 (21:31 +0000)]
Fix planner to do the right thing when a degenerate outer join (one whose
joinclause doesn't use any outer-side vars) requires a "bushy" plan to be
created. The normal heuristic to avoid joins with no joinclause has to be
overridden in that case. Problem is new in 8.2; before that we forced the
outer join order anyway. Per example from Teodor.
Tom Lane [Sun, 10 Dec 2006 22:13:27 +0000 (22:13 +0000)]
Add a paramtypmod field to Param nodes. This is dead weight for Params
representing externally-supplied values, since the APIs that carry such
values only specify type not typmod. However, for PARAM_SUBLINK Params
it is handy to carry the typmod of the sublink's output column. This
is a much cleaner solution for the recently reported 'could not find
pathkey item to sort' and 'failed to find unique expression in subplan
tlist' bugs than my original 8.2-compatible patch. Besides, someday we
might want to support typmods for external parameters ...
Tom Lane [Fri, 8 Dec 2006 19:50:53 +0000 (19:50 +0000)]
Remove the logId/logSeg fields from pg_control, because they are not needed
in normal operation, and we can avoid rewriting pg_control at every log
segment switch if we don't insist that these values be valid. Reducing
the number of pg_control updates is a good idea for both performance and
reliability. It does make pg_resetxlog's life a bit harder, but that seems
a good tradeoff; and anyway the change to pg_resetxlog amounts to automating
something people formerly needed to do by hand, namely look at the existing
pg_xlog files to make sure the new WAL start point was past them.
In passing, change the wording of xlog.c's "database system was interrupted"
messages: describe the pg_control timestamp as "last known up at" rather than
implying it is the exact time of service interruption. With this change the
timestamp will generally be the time of the last checkpoint, which could be
many minutes before the failure; and we've already seen indications that
people tend to misinterpret the old wording.
initdb forced due to change in pg_control layout. Simon Riggs and Tom Lane
Tom Lane [Fri, 8 Dec 2006 00:40:27 +0000 (00:40 +0000)]
Avoid double free of _SPI_current->tuptable. AtEOSubXact_SPI() now tries to
release it in a subtransaction abort, but this neglects possibility that
someone outside SPI already did. Fix is for spi.c to forget about a tuptable
as soon as it's handed it back to the caller.
Per bug #2817 from Michael Andreen.