Teodor Sigaev [Wed, 22 Oct 2008 12:53:56 +0000 (12:53 +0000)]
Fix GiST's killing tuple: GISTScanOpaque->curpos wasn't
correctly set. As result, killtuple() marks as dead
wrong tuple on page. Bug was introduced by me while fixing
possible duplicates during GiST index scan.
Tom Lane [Tue, 21 Oct 2008 20:42:53 +0000 (20:42 +0000)]
Add a concept of "placeholder" variables to the planner. These are variables
that represent some expression that we desire to compute below the top level
of the plan, and then let that value "bubble up" as though it were a plain
Var (ie, a column value).
The immediate application is to allow sub-selects to be flattened even when
they are below an outer join and have non-nullable output expressions.
Formerly we couldn't flatten because such an expression wouldn't properly
go to NULL when evaluated above the outer join. Now, we wrap it in a
PlaceHolderVar and arrange for the actual evaluation to occur below the outer
join. When the resulting Var bubbles up through the join, it will be set to
NULL if necessary, yielding the correct results. This fixes a planner
limitation that's existed since 7.1.
In future we might want to use this mechanism to re-introduce some form of
Hellerstein's "expensive functions" optimization, ie place the evaluation of
an expensive function at the most suitable point in the plan tree.
Alvaro Herrera [Mon, 20 Oct 2008 19:18:18 +0000 (19:18 +0000)]
Rework subtransaction commit protocol for hot standby.
This patch eliminates the marking of subtransactions as SUBCOMMITTED in pg_clog
during their commit; instead they remain in-progress until main transaction
commit. At main transaction commit, the commit protocol is atomic-by-page
instead of one transaction at a time. To avoid a race condition with some
subtransactions appearing committed before others in the case where they span
more than one pg_clog page, we conserve the logic that marks them subcommitted
before marking the parent committed.
Tom Lane [Fri, 17 Oct 2008 22:10:30 +0000 (22:10 +0000)]
Add a new column to pg_am to specify whether an index AM supports backward
scanning; GiST and GIN do not, and it seems like too much trouble to make
them do so. By teaching ExecSupportsBackwardScan() about this restriction,
we ensure that the planner will protect a scroll cursor from the problem
by adding a Materialize node.
In passing, fix another longstanding bug in the same area: backwards scan of
a plan with set-returning functions in the targetlist did not work either,
since the TupFromTlist expansion code pays no attention to direction (and
has no way to run a SRF backwards anyway). Again the fix is to make
ExecSupportsBackwardScan check this restriction.
Also adjust the index AM API specification to note that mark/restore support
is unnecessary if the AM can't produce ordered output.
Tom Lane [Fri, 17 Oct 2008 20:27:24 +0000 (20:27 +0000)]
Salvage a little bit of work from a failed patch: simplify and speed up
set_rel_width(). The code had been catering for the possibility of different
varnos in the relation targetlist, but this is impossible for a base relation
(and if it were possible, putting all the widths in the same RelOptInfo would
be wrong anyway).
Teodor Sigaev [Fri, 17 Oct 2008 17:27:46 +0000 (17:27 +0000)]
Fix small bug in headline generation.
Patch from Sushant Sinha <sushant354@gmail.com>
http://archives.postgresql.org/pgsql-hackers/2008-07/msg00785.php
Teodor Sigaev [Fri, 17 Oct 2008 17:02:21 +0000 (17:02 +0000)]
During repeated rescan of GiST index it's possible that scan key
is NULL but SK_SEARCHNULL is not set. Add checking IS NULL of keys
to set during key initialization. If key is NULL and SK_SEARCHNULL is not
set then nothnig can be satisfied.
With assert-enabled compilation that causes coredump.
Bug was introduced in 8.3 by support of IS NULL index scan.
Neil Conway [Thu, 16 Oct 2008 19:25:55 +0000 (19:25 +0000)]
Fix a small memory leak in ExecReScanAgg() in the hashed aggregation case.
In the previous coding, the list of columns that needed to be hashed on
was allocated in the per-query context, but we reallocated every time
the Agg node was rescanned. Since this information doesn't change over
a rescan, just construct the list of columns once during ExecInitAgg().
Tom Lane [Thu, 16 Oct 2008 13:23:21 +0000 (13:23 +0000)]
Fix SPI_getvalue and SPI_getbinval to range-check the given attribute number
according to the TupleDesc's natts, not the number of physical columns in the
tuple. The previous coding would do the wrong thing in cases where natts is
different from the tuple's column count: either incorrectly report error when
it should just treat the column as null, or actually crash due to indexing off
the end of the TupleDesc's attribute array. (The second case is probably not
possible in modern PG versions, due to more careful handling of inheritance
cases than we once had. But it's still a clear lack of robustness here.)
The incorrect error indication is ignored by all callers within the core PG
distribution, so this bug has no symptoms visible within the core code, but
it might well be an issue for add-on packages. So patch all the way back.
Tom Lane [Tue, 14 Oct 2008 23:27:40 +0000 (23:27 +0000)]
Make the system-attributes loop in AddNewAttributeTuples depend on
lengthof(SysAtt) not FirstLowInvalidHeapAttributeNumber, for consistency with
the other uses of the SysAtt array, and to make it clearer that it doesn't
walk off the end of that array.
Tom Lane [Tue, 14 Oct 2008 21:47:39 +0000 (21:47 +0000)]
Add a defense to prevent storing pseudo-type data into index columns.
Formerly, the lack of any opclasses that could accept such data was enough
of a defense, but now with a "record" opclass we need to check more carefully.
(You can still use that opclass for an index, but you have to store a named
composite type not an anonymous one.)
Tom Lane [Tue, 14 Oct 2008 21:39:41 +0000 (21:39 +0000)]
Update citext expected output for recent change in error message location
pointers. This is only a whitespace change, which ought to be ignored
by regression testing, but for some reason buildfarm member spoonbill
doesn't like it.
Tom Lane [Tue, 14 Oct 2008 17:12:33 +0000 (17:12 +0000)]
Extend the date type to support infinity and -infinity, analogously to
the timestamp types. Turns out this doesn't even reduce the available
range of dates, since the restriction to dates that work for Julian-date
arithmetic is much tighter than the int32 range anyway. Per a longstanding
TODO item.
Fix oversight in the relation forks patch: forgot to copy fork number to
fsync requests. This should fix the installcheck failure of the buildfarm
member "kudu".
Tom Lane [Tue, 14 Oct 2008 00:41:35 +0000 (00:41 +0000)]
Add docs and regression test about sorting the output of a recursive query in
depth-first search order. Upon close reading of SQL:2008, it seems that the
spec's SEARCH DEPTH FIRST and SEARCH BREADTH FIRST options do not actually
guarantee any particular result order: what they do is provide a constructed
column that the user can then sort on in the outer query. So this is actually
just as much functionality ...
Tom Lane [Mon, 13 Oct 2008 16:25:20 +0000 (16:25 +0000)]
Implement comparison of generic records (composite types), and invent a
pseudo-type record[] to represent arrays of possibly-anonymous composite
types. Since composite datums carry their own type identification, no
extra knowledge is needed at the array level.
The main reason for doing this right now is that it is necessary to support
the general case of detection of cycles in recursive queries: if you need to
compare more than one column to detect a cycle, you need to compare a ROW()
to an array built from ROW()s, at least if you want to do it as the spec
suggests. Add some documentation and regression tests concerning the cycle
detection issue.
Tom Lane [Mon, 13 Oct 2008 00:41:41 +0000 (00:41 +0000)]
Fix corner case wherein a WorkTableScan node could get initialized before the
RecursiveUnion to which it refers. It turns out that we can just postpone the
relevant initialization steps until the first exec call for the node, by which
time the ancestor node must surely be initialized. Per report from Greg Stark.
Tom Lane [Fri, 10 Oct 2008 13:48:05 +0000 (13:48 +0000)]
Fix omission of DiscardStmt in GetCommandLogLevel, per report from Hubert
Depesz Lubaczewski. In HEAD, also move a couple of other cases to make the
code ordering match up with ProcessUtility.
Tom Lane [Thu, 9 Oct 2008 19:27:40 +0000 (19:27 +0000)]
Improve the recently-added code for inlining set-returning functions so that
it can handle functions returning setof record. The case was left undone
originally, but it turns out to be simple to fix.
Alvaro Herrera [Thu, 9 Oct 2008 17:24:05 +0000 (17:24 +0000)]
Improve translatability of error messages for external modules by tweaking
the ereport macro. Included in this commit are enough files for starting
plpgsql, plpython, plperl and pltcl translations.
Tom Lane [Thu, 9 Oct 2008 16:35:07 +0000 (16:35 +0000)]
Fix overly tense optimization of PLpgSQL_func_hashkey: we must represent
the isTrigger state explicitly, not rely on nonzero-ness of trigrelOid
to indicate trigger-hood, because trigrelOid will be left zero when compiling
for validation. The (useless) function hash entry built by the validator
was able to match an ordinary non-trigger call later in the same session,
thereby bypassing the check that is supposed to prevent such a call.
Per report from Alvaro.
It might be worth suppressing the useless hash entry altogether, but
that's a bigger change than I want to consider back-patching.
Back-patch to 8.0. 7.4 doesn't have the problem because it doesn't
have validation mode.
Tom Lane [Thu, 9 Oct 2008 15:49:04 +0000 (15:49 +0000)]
Fix crash in bytea-to-XML mapping when the source value is toasted.
Report and fix by Michael McMaster. Some minor code beautification by me,
also avoid memory leaks in the special-case paths.
Force a checkpoint in CREATE DATABASE before starting to copy the files,
to process any pending unlinks for the source database.
Before, if you dropped a relation in the template database just before
CREATE DATABASE, and a checkpoint happened during copydir(), the checkpoint
might delete a file that we're just about to copy, causing lstat() in
copydir() to fail with ENOENT.
Backpatch to 8.3, where the pending unlinks were introduced.
Per report by Matthew Wakeling and analysis by Tom Lane.
Tom Lane [Wed, 8 Oct 2008 01:14:44 +0000 (01:14 +0000)]
Modify the parser's error reporting to include a specific hint for the case
of referencing a WITH item that's not yet in scope according to the SQL
spec's semantics. This seems to be an easy error to make, and the bare
"relation doesn't exist" message doesn't lead one's mind in the correct
direction to fix it.
Tom Lane [Tue, 7 Oct 2008 19:27:04 +0000 (19:27 +0000)]
Extend CTE patch to support recursive UNION (ie, without ALL). The
implementation uses an in-memory hash table, so it will poop out for very
large recursive results ... but the performance characteristics of a
sort-based implementation would be pretty unpleasant too.
When a relation is moved to another tablespace, we can't assume that we can
use the old relfilenode in the new tablespace. There might be another relation
in the new tablespace with the same relfilenode, so we must generate a fresh
relfilenode in the new tablespace.
The 8.3 patch to let deleted relation files linger as zero-length files until
the next checkpoint made this more obvious: moving a relation from one table
space another, and then back again, caused a collision with the lingering
file.
Back-patch to 8.1. The issue is present in 8.0 as well, but it doesn't seem
worth fixing there, because we didn't have protection from OID collisions
after OID wraparound before 8.1.
Tom Lane [Tue, 7 Oct 2008 01:47:55 +0000 (01:47 +0000)]
Improve parser error location for cases where an INSERT or UPDATE command
supplies an expression that can't be coerced to the target column type.
The code previously attempted to point at the target column name, which
doesn't work at all in an INSERT with omitted column name list, and is
also not remarkably helpful when the problem is buried somewhere in a
long INSERT-multi-VALUES command. Make it point at the failed expression
instead.
Tom Lane [Tue, 7 Oct 2008 00:05:55 +0000 (00:05 +0000)]
Fix oversight in recent patch to support multiple read positions in
tuplestore: in READFILE state tuplestore_select_read_pointer must
save the current file seek position in the read pointer being
deactivated.
Tom Lane [Mon, 6 Oct 2008 20:29:38 +0000 (20:29 +0000)]
Fix up ruleutils.c for CTE features. The main problem was that
get_name_for_var_field didn't have enough context to interpret a reference to
a CTE query's output. Fixing this requires separate hacks for the regular
deparse case (pg_get_ruledef) and for the EXPLAIN case, since the available
context information is quite different. It's pretty nearly parallel to the
existing code for SUBQUERY RTEs, though. Also, add code to make sure we
qualify a relation name that matches a CTE name; else the CTE will mistakenly
capture the reference when reloading the rule.
In passing, fix a pre-existing problem with get_name_for_var_field not working
on variables in targetlists of SubqueryScan plan nodes. Although latent all
along, this wasn't a problem until we made EXPLAIN VERBOSE try to print
targetlists. To do this, refactor the deparse_context_for_plan API so that
the special case for SubqueryScan is all on ruleutils.c's side.
Tom Lane [Mon, 6 Oct 2008 17:39:26 +0000 (17:39 +0000)]
When expanding a whole-row Var into a RowExpr during ResolveNew(), attach
the column alias names of the RTE referenced by the Var to the RowExpr.
This is needed to allow ruleutils.c to correctly deparse FieldSelect nodes
referencing such a construct. Per my recent bug report.
Adding a field to RowExpr forces initdb (because of stored rules changes)
so this solution is not back-patchable; which is unfortunate because 8.2
and 8.3 have this issue. But it only affects EXPLAIN for some pretty odd
corner cases, so we can probably live without a solution for the back
branches.
Use fork names instead of numbers in the file names for additional
relation forks. While the file names are not visible to users, for those
that do peek into the data directory, it's nice to have more descriptive
names. Per Greg Stark's suggestion.
Magnus Hagander [Mon, 6 Oct 2008 13:05:40 +0000 (13:05 +0000)]
Add columns boot_val and reset_val to the pg_settings view, to expose
the value a parameter has at server start and will have after RESET,
respectively.
Tom Lane [Mon, 6 Oct 2008 02:12:56 +0000 (02:12 +0000)]
Fix the implicit-RTE code to be able to handle implicit RTEs for CTEs, as
well as regular tables. Per discussion, this seems necessary to meet the
principle of least astonishment.
In passing, simplify the error messages in warnAutoRange(). Now that we
have parser error position info for these errors, it doesn't seem very
useful to word the error message differently depending on whether we are
inside a sub-select or not.
Tom Lane [Sun, 5 Oct 2008 23:18:37 +0000 (23:18 +0000)]
Tweak the overflow checks in integer division functions to complain if the
machine produces zero (rather than the more usual minimum-possible-integer)
for the only possible overflow case. This has been seen to occur for at least
some word widths on some hardware, and it's cheap enough to check for
everywhere. Per Peter's analysis of buildfarm reports.
This could be back-patched, but in the absence of any gripes from the field
I doubt it's worth the trouble.
Tom Lane [Sun, 5 Oct 2008 22:20:17 +0000 (22:20 +0000)]
Fix markTargetListOrigin() to not fail on a simple-Var reference to a
recursive CTE that we're still in progress of analyzing. Add a similar guard
to the similar code in expandRecordVariable(), and tweak regression tests to
cover this case. Per report from Dickson S. Guedes.
Remove obsolete internal functions istrue, isfalse, isnottrue, isnotfalse,
nullvalue, nonvalue. A long time ago, these were used to implement the SQL
constructs IS TRUE, etc.
Tom Lane [Sat, 4 Oct 2008 21:56:55 +0000 (21:56 +0000)]
Implement SQL-standard WITH clauses, including WITH RECURSIVE.
There are some unimplemented aspects: recursive queries must use UNION ALL
(should allow UNION too), and we don't have SEARCH or CYCLE clauses.
These might or might not get done for 8.4, but even without them it's a
pretty useful feature.
There are also a couple of small loose ends and definitional quibbles,
which I'll send a memo about to pgsql-hackers shortly. But let's land
the patch now so we can get on with other development.
Yoshiyuki Asaba, with lots of help from Tatsuo Ishii and Tom Lane
Add relation fork support to pg_relation_size() function. You can now pass
name of a fork ('main' or 'fsm', at the moment) to pg_relation_size() to
get the size of a specific fork. Defaults to 'main', if none given.
While we're at it, modify pg_relation_size to take a regclass as argument,
instead of separate variants taking oid and name. This change is
transparent to typical use where the table name is passed as a string
literal, like pg_relation_size('table'), but will break queries like
pg_relation_size(namecol), where namecol is of type name. text-type input
still works, and using a non-schema-qualified table name is not very
reliable anyway, so this is unlikely to break anyone's queries in practice.
Make the blkno arguments bigints instead of int4s. A signed int4 is not
large enough for block numbers higher than 2^31. The old pre-FSM-rewrite
pg_freespacemap implementation got this right. While we're at it, remove
some unnecessary #includes.
Tom Lane [Wed, 1 Oct 2008 19:51:50 +0000 (19:51 +0000)]
Improve tuplestore.c to support multiple concurrent read positions.
This facility replaces the former mark/restore support but is otherwise
upward-compatible with previous uses. It's expected to be needed for
single evaluation of CTEs and also for window functions, so I'm committing
it separately instead of waiting for either one of those patches to be
finished. Per discussion with Greg Stark and Hitoshi Harada.
Note: I removed nodeFunctionscan's mark/restore support, instead of bothering
to update it for this change, because it was dead code anyway.