Tom Lane [Sat, 30 Apr 2005 20:31:39 +0000 (20:31 +0000)]
Change catalog entries for record_out and record_send to show only one
argument, since that's all they are using now. Adjust type_sanity
regression test so that it will complain if anyone tries to define
multiple-argument output functions in future.
Tom Lane [Sat, 30 Apr 2005 20:04:33 +0000 (20:04 +0000)]
Make record_out and record_send extract type information from the passed
record object itself, rather than relying on a second OID argument to be
correct. This patch just changes the function behavior and not the
catalogs, so it's OK to back-patch to 8.0. Will remove the now-redundant
second argument in pg_proc in a separate patch in HEAD only.
Tom Lane [Sat, 30 Apr 2005 19:03:33 +0000 (19:03 +0000)]
Use the standard lock manager to establish priority order when there
is contention for a tuple-level lock. This solves the problem of a
would-be exclusive locker being starved out by an indefinite succession
of share-lockers. Per recent discussion with Alvaro.
Neil Conway [Sat, 30 Apr 2005 08:08:51 +0000 (08:08 +0000)]
GCC 4.0 includes a new warning option, -Wformat-literal, that emits
a warning when a variable is used as a format string for printf()
and similar functions (if the variable is derived from untrusted
data, it could include unexpected formatting sequences). This
emits too many warnings to be enabled by default, but it does
flag a few dubious constructs in the Postgres tree. This patch
fixes up the obvious variants: functions that are passed a variable
format string but no additional arguments.
Most of these are harmless (e.g. the ruleutils stuff), but there
is at least one actual bug here: if you create a trigger named
"%sfoo", pg_dump will read uninitialized memory and fail to dump
the trigger correctly.
Tom Lane [Fri, 29 Apr 2005 22:28:24 +0000 (22:28 +0000)]
Restructure LOCKTAG as per discussions of a couple months ago.
Essentially, we shoehorn in a lockable-object-type field by taking
a byte away from the lockmethodid, which can surely fit in one byte
instead of two. This allows less artificial definitions of all the
other fields of LOCKTAG; we can get rid of the special pg_xactlock
pseudo-relation, and also support locks on individual tuples and
general database objects (including shared objects). None of those
possibilities are actually exploited just yet, however.
I removed pg_xactlock from pg_class, but did not force initdb for
that change. At this point, relkind 's' (SPECIAL) is unused and
could be removed entirely.
Neil Conway [Fri, 29 Apr 2005 07:08:06 +0000 (07:08 +0000)]
This patch fixes a bug in the error message emitted by pg_restore on an
incorrect -F argument: write_msg() expects its first parameter to be a
"module name", not the format string.
Tom Lane [Thu, 28 Apr 2005 21:47:18 +0000 (21:47 +0000)]
Implement sharable row-level locks, and use them for foreign key references
to eliminate unnecessary deadlocks. This commit adds SELECT ... FOR SHARE
paralleling SELECT ... FOR UPDATE. The implementation uses a new SLRU
data structure (managed much like pg_subtrans) to represent multiple-
transaction-ID sets. When more than one transaction is holding a shared
lock on a particular row, we create a MultiXactId representing that set
of transactions and store its ID in the row's XMAX. This scheme allows
an effectively unlimited number of row locks, just as we did before,
while not costing any extra overhead except when a shared lock actually
has to be shared. Still TODO: use the regular lock manager to control
the grant order when multiple backends are waiting for a row lock.
Bruce Momjian [Thu, 28 Apr 2005 13:09:59 +0000 (13:09 +0000)]
Add psql \set ON_ERROR_ROLLBACK to allow statements in a transaction to
error without affecting the entire transaction. Valid values are
"on|interactive|off".
Tom Lane [Mon, 25 Apr 2005 21:03:25 +0000 (21:03 +0000)]
Fix ExpandIndirectionStar to handle cases where the expression to be
expanded is of RECORD type, eg
'select (foo).* from (select foo(f1) from t1) ss'
where foo() is a function declared with multiple OUT parameters.
Tom Lane [Mon, 25 Apr 2005 20:59:44 +0000 (20:59 +0000)]
get_expr_result_type probably needs to be able to handle OpExpr as well
as FuncExpr, to cover cases where a function returning tuple is invoked
via an operator.
Bruce Momjian [Mon, 25 Apr 2005 15:35:32 +0000 (15:35 +0000)]
Update description:
< * Allow ORDER BY ... LIMIT 1 to select high/low value without sort or
> * Allow ORDER BY ... LIMIT # to select high/low value without sort or 868c868
< Right now, if no index exists, ORDER BY ... LIMIT 1 requires we sort
> Right now, if no index exists, ORDER BY ... LIMIT # requires we sort 870a871
> MIN/MAX already does this, but not for LIMIT > 1.
Bruce Momjian [Mon, 25 Apr 2005 13:03:37 +0000 (13:03 +0000)]
Re-add item with better description:
> * Allow ORDER BY ... LIMIT 1 to select high/low value without sort or
> index using a sequential scan for highest/lowest values
>
> Right now, if no index exists, ORDER BY ... LIMIT 1 requires we sort
> all values to return the high/low value. Instead The idea is to do a
> sequential scan to find the high/low value, thus avoiding the sort.
>
Tom Lane [Mon, 25 Apr 2005 03:58:30 +0000 (03:58 +0000)]
While determining the filter clauses for an index scan (either plain
or bitmap), use pred_test to be a little smarter about cases where a
filter clause is logically unnecessary. This may be overkill for the
plain indexscan case, but it's definitely useful for OR'd bitmap scans.
Tom Lane [Mon, 25 Apr 2005 02:14:48 +0000 (02:14 +0000)]
Replace slightly klugy create_bitmap_restriction() function with a
more efficient routine in restrictinfo.c (which can make use of
make_restrictinfo_internal).
Bruce Momjian [Mon, 25 Apr 2005 01:42:41 +0000 (01:42 +0000)]
Add description for concurrent sequential scans:
> One possible implementation is to start sequential scans from the lowest
> numbered buffer in the shared cache, and when reaching the end wrap
> around to the beginning, rather than always starting sequential scans
> at the start of the table.
Tom Lane [Mon, 25 Apr 2005 01:30:14 +0000 (01:30 +0000)]
Remove support for OR'd indexscans internal to a single IndexScan plan
node, as this behavior is now better done as a bitmap OR indexscan.
This allows considerable simplification in nodeIndexscan.c itself as
well as several planner modules concerned with indexscan plan generation.
Also we can improve the sharing of code between regular and bitmap
indexscans, since they are now working with nigh-identical Plan nodes.
Tom Lane [Sun, 24 Apr 2005 18:16:38 +0000 (18:16 +0000)]
Adjust nodeBitmapIndexscan.c to not keep the index open across calls,
but just to open and close it during MultiExecBitmapIndexScan. This
avoids acquiring duplicate resources (eg, multiple locks on the same
relation) in a tree with many bitmap scans. Also, don't bother to
lock the parent heap at all here, since we must be underneath a
BitmapHeapScan node that will be holding a suitable lock.
Bruce Momjian [Sun, 24 Apr 2005 12:39:07 +0000 (12:39 +0000)]
Update wording:
< This allows vacuum to reclaim free space without requiring
< a sequential scan
> This allows vacuum to target specific pages for possible free space
> without requiring a sequential scan.
Tom Lane [Sat, 23 Apr 2005 22:53:05 +0000 (22:53 +0000)]
Repair two TIME WITH TIME ZONE bugs found by Dennis Vshivkov. Comparison
of timetz values misbehaved in --enable-integer-datetime cases, and
EXTRACT(EPOCH) subtracted the zone instead of adding it in all cases.
Backpatch to all supported releases (except --enable-integer-datetime code
does not exist in 7.2).
Tom Lane [Sat, 23 Apr 2005 22:09:58 +0000 (22:09 +0000)]
Remove useless argtype_inherit() code, and make consequent simplifications.
As I pointed out a few days ago, this code has failed to do anything useful
for some time ... and if we did want to revive the capability to select
functions by nearness of inheritance ancestry, this is the wrong place
and way to do it anyway. The knowledge would need to go into
func_select_candidate() instead. Perhaps someday someone will be motivated
to do that, but I am not today.
Bruce Momjian [Sat, 23 Apr 2005 21:44:52 +0000 (21:44 +0000)]
Item already added to existing 'thread' item:
< * Consider parallel processing a single query
<
< This would involve using multiple threads or processes to do optimization,
< sorting, or execution of single query. The major advantage of such a
< feature would be to allow multiple CPUs to work together to process a
< single query.
<
Bruce Momjian [Sat, 23 Apr 2005 21:43:24 +0000 (21:43 +0000)]
Remove item, not sure what it refers to:
< * Allow ORDER BY ... LIMIT 1 to select high/low value without sort or
< index using a sequential scan for highest/lowest values
<
< If only one value is needed, there is no need to sort the entire
< table. Instead a sequential scan could get the matching value.
<
Bruce Momjian [Sat, 23 Apr 2005 21:39:27 +0000 (21:39 +0000)]
Update threading item:
< Solaris) might benefit from threading.
> Solaris) might benefit from threading. Also explore the idea of
> a single session using multiple threads to execute a query faster.
Tom Lane [Sat, 23 Apr 2005 21:32:34 +0000 (21:32 +0000)]
Remove explicit FreeExprContext calls during plan node shutdown. The
ExprContexts will be freed anyway when FreeExecutorState() is reached,
and letting that routine do the work is more efficient because it will
automatically free the ExprContexts in reverse creation order. The
existing coding was effectively freeing them in exactly the worst
possible order, resulting in O(N^2) behavior inside list_delete_ptr,
which becomes highly visible in cases with a few thousand plan nodes.
ExecFreeExprContext is now effectively a no-op and could be removed,
but I left it in place in case we ever want to put it back to use.
Bruce Momjian [Sat, 23 Apr 2005 18:57:25 +0000 (18:57 +0000)]
Update FAQ items to point to existing web pages rather than duplication
such information. Remove MySQL mention. Move server-side debug item to
developer's FAQ. Update URLs.
Tom Lane [Sat, 23 Apr 2005 17:22:16 +0000 (17:22 +0000)]
Define the right-hand input of AT TIME ZONE as a full a_expr instead of
c_expr. Perhaps the restriction was once needed to avoid bison errors,
but it seems to work just fine now --- and even generates a slightly
smaller state machine. This change allows examples like
SELECT '13:45'::timetz AT TIME ZONE '-07:00'::interval;
to work without parentheses around the right-hand input.
Tom Lane [Sat, 23 Apr 2005 04:42:53 +0000 (04:42 +0000)]
Turns out that my recent elimination of the 'redundant' flatten_andors()
code in prepqual.c had a small drawback: the flatten_andors code was
able to cope with deeply nested AND/OR structures (like 10000 ORs in
a row), whereas eval_const_expressions tends to recurse until it
overruns the stack. Revise eval_const_expressions so that it doesn't
choke on deeply nested ANDs or ORs.
Tom Lane [Sat, 23 Apr 2005 01:57:34 +0000 (01:57 +0000)]
Teach choose_bitmap_and() to actually be choosy --- that is, try to
make some estimate of which available indexes to AND together, rather
than blindly taking 'em all. This could probably stand further
improvement, but it seems to do OK in simple tests.
Tom Lane [Fri, 22 Apr 2005 21:58:32 +0000 (21:58 +0000)]
First cut at planner support for bitmap index scans. Lots to do yet,
but the code is basically working. Along the way, rewrite the entire
approach to processing OR index conditions, and make it work in join
cases for the first time ever. orindxpath.c is now basically obsolete,
but I left it in for the time being to allow easy comparison testing
against the old implementation.
Bruce Momjian [Fri, 22 Apr 2005 15:40:16 +0000 (15:40 +0000)]
Fix typo:
< Currently indexes do not have enough tuple tuple visibility
< information to allow data to be pulled from the index without
< also accessing the heap. One way to allow this is to set a bit
< to index tuples to indicate if a tuple is currently visible to
< all transactions when the first valid heap lookup happens. This
< bit would have to be cleared when a heap tuple is expired.
> Currently indexes do not have enough tuple visibility information
> to allow data to be pulled from the index without also accessing
> the heap. One way to allow this is to set a bit to index tuples
> to indicate if a tuple is currently visible to all transactions
> when the first valid heap lookup happens. This bit would have to
> be cleared when a heap tuple is expired.
Tom Lane [Thu, 21 Apr 2005 19:18:13 +0000 (19:18 +0000)]
Rethink original decision to use AND/OR Expr nodes to represent bitmap
logic operations during planning. Seems cleaner to create two new Path
node types, instead --- this avoids duplication of cost-estimation code.
Also, create an enable_bitmapscan GUC parameter to control use of bitmap
plans.
Bruce Momjian [Thu, 21 Apr 2005 15:20:39 +0000 (15:20 +0000)]
Updated text for bitmaps:
< Bitmap indexes index single columns that can be combined with other bitmap
< indexes to dynamically create a composite index to match a specific query.
< Each index is a bitmap, and the bitmaps are bitwise AND'ed or OR'ed to be
< combined. They can index by tid or can be lossy requiring a scan of the
< heap page to find matching rows, or perhaps use a mixed solution where
< tids are recorded for pages with only a few matches and per-page bitmaps
< are used for more dense pages. Another idea is to use a 32-bit bitmap
< for every page and set a bit based on the item number mod(32).
> This feature allows separate indexes to be ANDed or ORed together. This
> is particularly useful for data warehousing applications that need to
> query the database in an many permutations. This feature scans an index
> and creates an in-memory bitmap, and allows that bitmap to be combined
> with other bitmap created in a similar way. The bitmap can either index
> all TIDs, or be lossy, meaning it records just page numbers and each
> page tuple has to be checked for validity in a separate pass.
Tom Lane [Wed, 20 Apr 2005 15:48:36 +0000 (15:48 +0000)]
Minor performance improvement: avoid unnecessary creation/unioning of
bitmaps for multiple indexscans. Instead just let each indexscan add
TIDs directly into the BitmapOr node's result bitmap.
Tom Lane [Tue, 19 Apr 2005 22:35:18 +0000 (22:35 +0000)]
Create executor and planner-backend support for decoupled heap and index
scans, using in-memory tuple ID bitmaps as the intermediary. The planner
frontend (path creation and cost estimation) is not there yet, so none
of this code can be executed. I have tested it using some hacked planner
code that is far too ugly to see the light of day, however. Committing
now so that the bulk of the infrastructure changes go in before the tree
drifts under me.
Bruce Momjian [Tue, 19 Apr 2005 03:55:43 +0000 (03:55 +0000)]
>>>>Luckily, PG 8 is available for this. Do you have a short example?
>>>
>>>No, and I think it should be in the manual as an example.
>>>
>>>You will need to enter a loop that uses exception handling to detect
>>>unique_violation.
>>
>>Pursuant to an IRC discussion to which Dennis Bjorklund and
>>Christopher Kings-Lynne made most of the contributions, please find
>>enclosed an example patch demonstrating an UPSERT-like capability.
>>
Bruce Momjian [Tue, 19 Apr 2005 03:37:20 +0000 (03:37 +0000)]
> >Luckily, PG 8 is available for this. Do you have a short example?
>
> No, and I think it should be in the manual as an example.
>
> You will need to enter a loop that uses exception handling to detect
> unique_violation.
Pursuant to an IRC discussion to which Dennis Bjorklund and
Christopher Kings-Lynne made most of the contributions, please find
enclosed an example patch demonstrating an UPSERT-like capability.