Tom Lane [Thu, 2 Sep 1999 02:57:50 +0000 (02:57 +0000)]
Repair a bunch of problems in md.c. This builds on Hiroshi's
insight that RelationFlushRelation ought to invoke smgrclose, and that the
way to make that work is to ensure that mdclose doesn't fail if the relation
is already closed (or unlinked, if we are looking at a DROP TABLE). While
I was testing that, I was able to identify several problems that we had
with multiple-segment relations. The system is now able to do initdb and
pass the regression tests with a very small segment size (I had it set to
64Kb per segment for testing). I don't believe that ever worked before.
File descriptor leaks seem to be gone too.
I have partially addressed the concerns we had about mdtruncate(), too.
On a Win32 or NFS filesystem it is not possible to unlink a file that
another backend is holding open, so what md.c now does is to truncate
unwanted files to zero length before trying to unlink them. The other
backends will be forced to close their open files by relation cache
invalidation --- but I think it would take considerable work to make
that happen before vacuum truncates the relation rather than after.
Leaving zero-length files lying around seems a usable compromise.
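The truncate-then-unlink compromise described above can be sketched as follows. This is only an illustration in Python, not the md.c code; the function name `remove_segment` and its return convention are assumptions made for the example.

```python
import os
import tempfile


def remove_segment(path):
    """Truncate a file to zero length, then try to unlink it.

    Returns True if the file is gone afterwards, and False if only the
    truncation succeeded (for example, because the filesystem refuses to
    unlink a file that another process still holds open).
    """
    # Truncating first gives the disk space back even if the unlink fails.
    with open(path, "r+b") as f:
        f.truncate(0)
    try:
        os.unlink(path)
    except OSError:
        # Hypothetical fallback: leave the zero-length file lying around.
        return False
    return True
```

Either way the space is reclaimed; the only visible difference is whether an empty file remains.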
Fix wording on allowed/forbidden keyword usage.
Thanks to Michael Deck <deckm@cleansoft.com> for the tipoff.
Add more examples for language components.
Tom Lane [Tue, 31 Aug 1999 01:37:37 +0000 (01:37 +0000)]
Update frontend libpq to remove limits on query lengths,
error/notice message lengths, and number of fields per tuple. Add
pqexpbuffer.c/.h, a frontend version of backend's stringinfo module.
This is the first step in applying Mike Ansley's long-query patches,
even though he didn't do any of these particular changes...
Tom Lane [Sun, 29 Aug 1999 01:35:11 +0000 (01:35 +0000)]
Correct broken entries for pg_proc OIDs 1364 (time(abstime))
and 1370 (timestamp(datetime)). This does not force an initdb, exactly,
but you won't see the effects of the bug fix until you do one.
BTW, OID 1358 for timespan(time) is still broken:
select timespan('21:11:26'::time);
ERROR: No such function 'time_timespan' with the specified attributes
But I couldn't figure out what it ought to be defined as, so I left it be.
Tom Lane [Sat, 28 Aug 1999 03:59:05 +0000 (03:59 +0000)]
Fix several problems in rule deparsing: didn't handle array
references or CASE expressions, didn't parenthesize complex expressions
properly. Also, always output variable references as fully qualified
names to eliminate ambiguity bug recently reported. (This could be
smarter, but reliability comes first.)
Tom Lane [Thu, 26 Aug 1999 05:09:06 +0000 (05:09 +0000)]
Clean up some mistakes in handling of uplevel Vars in planner.
Most parts of the planner should ignore, or indeed never even see, uplevel
Vars because they will be or have been replaced by Params. There were a
couple of places that got it wrong though, probably my fault from recent
changes...
Tom Lane [Thu, 26 Aug 1999 04:59:15 +0000 (04:59 +0000)]
Clean up some bugs in oper_select_candidate(), notably the
last loop which would return the *first* surviving-to-that-point candidate
regardless of which one actually passed the test. This was producing
such curious results as 'oid % 2' getting translated to 'int2(oid) % 2'.
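The kind of loop bug described above can be illustrated with a small sketch. It is not the oper_select_candidate() code; the function names and the candidate values are hypothetical.

```python
def pick_first_survivor(candidates, passes_test):
    """Buggy pattern: tests each candidate but always returns the first one."""
    for cand in candidates:
        if passes_test(cand):
            return candidates[0]  # wrong: ignores which candidate passed
    return None


def pick_passing(candidates, passes_test):
    """Fixed pattern: return the candidate that actually passed the test."""
    for cand in candidates:
        if passes_test(cand):
            return cand
    return None
```

With candidates `["a", "b"]` and a test that only `"b"` passes, the buggy version returns `"a"` while the fixed version returns `"b"`.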
Tom Lane [Wed, 25 Aug 1999 23:21:43 +0000 (23:21 +0000)]
Revise implementation of SubLinks so that there is a consistent,
documented interpretation of the lefthand and oper fields. Fix a number of
obscure problems while at it --- for example, the old code failed if the parser
decided to insert a type-coercion function just below the operator of a
SubLink.
CAUTION: this will break stored rules that contain subplans. You may
need to initdb.
Tatsuo Ishii [Wed, 25 Aug 1999 12:18:31 +0000 (12:18 +0000)]
Add new vpl_num_allocated_pages member to VPageListData.
It keeps track of the number of pages allocated, so that
vacuum can allocate twice the previous allocation each time.
This will greatly reduce the total memory consumption of
vacuum.
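A doubling growth policy of the kind described above might look like the sketch below. The function names and the initial capacity of 16 are assumptions for illustration; they are not taken from the vacuum code.

```python
def grow_capacity(allocated, needed, initial=16):
    """Return a capacity that fits `needed` slots, doubling as required."""
    capacity = allocated if allocated > 0 else initial
    while capacity < needed:
        capacity *= 2
    return capacity


def count_reallocations(total_pages, initial=16):
    """Count how often the list is enlarged while appending pages one by one."""
    allocated = 0
    grows = 0
    for used in range(1, total_pages + 1):
        if used > allocated:
            allocated = grow_capacity(allocated, used, initial)
            grows += 1
    return grows
```

Under these assumptions, appending 1000 pages enlarges the list only 7 times (16, 32, 64, 128, 256, 512, 1024).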
Tom Lane [Tue, 24 Aug 1999 20:11:19 +0000 (20:11 +0000)]
Alter AllocSet routines so that requests larger than
ALLOC_BIGCHUNK_LIMIT are always allocated as separate malloc() blocks,
and are free()d immediately upon pfree(). Also, if such a chunk is enlarged
with repalloc(), translate the operation into a realloc() so as to
minimize memory usage. Of course, these large chunks still get freed
automatically if the alloc set is reset.
I have set ALLOC_BIGCHUNK_LIMIT at 64K for now, but perhaps another
size would be better?
Tom Lane [Tue, 24 Aug 1999 00:09:56 +0000 (00:09 +0000)]
coerce_type() failed to guard against trying to convert a NULL
constant to a different type. Not sure that this could happen in ordinary
parser usage, but it can in some new code I'm working on...
Tom Lane [Mon, 23 Aug 1999 23:48:39 +0000 (23:48 +0000)]
Remove bogus code in oper_exact --- if it didn't find an exact
match then it tried for a self-commutative operator with the reversed input
data types. This is pretty silly; there could never be such an operator,
except maybe in binary-compatible-type scenarios, and we have oper_inexact
for that. Besides which, the oprsanity regress test would complain about
such an operator. Remove nonfunctional code and simplify routine calling
convention accordingly.
Tom Lane [Sun, 22 Aug 1999 20:15:04 +0000 (20:15 +0000)]
Further planner/optimizer cleanups. Move all set_tlist_references
and fix_opids processing to a single recursive pass over the plan tree
executed at the very tail end of planning, rather than haphazardly here
and there at different places. Now that tlist Vars do not get modified
until the very end, it's possible to get rid of the klugy var_equal and
match_varid partial-matching routines, and just use plain equal()
throughout the optimizer. This is a step towards allowing merge and
hash joins to be done on expressions instead of only Vars ...
Tom Lane [Sat, 21 Aug 1999 03:49:17 +0000 (03:49 +0000)]
Major revision of sort-node handling: push knowledge of query
sort order down into planner, instead of handling it only at the very top
level of the planner. This fixes many things. An explicit sort is now
avoided if there is a cheaper alternative (typically an indexscan) not
only for ORDER BY, but also for the internal sort of GROUP BY. It works
even when there is no other reason (such as a WHERE condition) to consider
the indexscan. It works for indexes on functions. It works for indexes
on functions, backwards. It's just so cool...
CAUTION: I have changed the representation of SortClause nodes, therefore
THIS UPDATE BREAKS STORED RULES. You will need to initdb.
Tom Lane [Sat, 21 Aug 1999 03:06:58 +0000 (03:06 +0000)]
Cleanups for int8: guard against null inputs in comparison
operators (and some other places), fix rangechecks in int8 to int4
conversion (same problem we recently figured out in pg_atoi).
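As a rough illustration of a range check for narrowing a 64-bit integer to 32 bits, here is a Python sketch. The function name and the choice of exception are assumptions, and it does not reproduce the int8.c code.

```python
INT4_MIN = -2**31
INT4_MAX = 2**31 - 1


def int8_to_int4(value):
    """Narrow a 64-bit integer to 32 bits, rejecting out-of-range input."""
    # Check both ends of the range before converting.
    if value < INT4_MIN or value > INT4_MAX:
        raise OverflowError("int8 value out of range for int4")
    return value
```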
Tom Lane [Wed, 18 Aug 1999 04:15:16 +0000 (04:15 +0000)]
Remove extraneous SeqScan node that make_noname was inserting
above a Sort or Materialize node. As far as I can tell, the only place
that actually needed that was set_tlist_references, which was being lazy
about checking to see if it had a noname node to fix or not...
Tom Lane [Tue, 17 Aug 1999 21:21:22 +0000 (21:21 +0000)]
Add script that runs the regression tests with all valid
combinations of query-plan-type backend options. Good for testing
planner/optimizer. Tedious, though.
Tom Lane [Mon, 16 Aug 1999 23:07:20 +0000 (23:07 +0000)]
Assign sort keys properly when there are duplicate entries in
pathkey list --- corrects misbehavior seen with multiple mergejoin clauses
mentioning same variable.
Bruce Momjian [Mon, 16 Aug 1999 20:27:19 +0000 (20:27 +0000)]
I've sent 3 mails to pgsql-patches. There are two files, one for the doc
and one for the src/data directories, and one minor patch for doc/README.locale.
Please apply.
Tom Lane [Mon, 16 Aug 1999 02:17:58 +0000 (02:17 +0000)]
Major planner/optimizer revision: get rid of PathOrder node type,
store all ordering information in pathkeys lists (which are now lists of
lists of PathKeyItem nodes, not just lists of lists of vars). This was
a big win --- the code is smaller and IMHO more understandable than it
was, even though it handles more cases. I believe the node changes will
not force an initdb for anyone; planner nodes don't show up in stored
rules.
Repair the check for redundant UNIQUE and PRIMARY KEY indices.
Also, improve it so that it checks for multi-column constraints.
Thanks to Mark Dalphin <mdalphin@amgen.com> for reporting the problem.
Tom Lane [Sat, 14 Aug 1999 19:29:35 +0000 (19:29 +0000)]
LispUnion routine didn't generate a proper union: anytime
l2 contained more than one entry, there would be duplicates in the output
list. Miscellaneous code beautification in other routines, too.
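A duplicate-free list union can be sketched as below. This is a Python illustration rather than the LispUnion code, and it assumes the first list contains no duplicates of its own.

```python
def list_union(l1, l2):
    """Return l1 followed by the items of l2 that are not already present."""
    result = list(l1)
    for item in l2:
        # Check against the growing result so that repeats within l2 are
        # skipped too.
        if item not in result:
            result.append(item)
    return result
```

For example, `list_union([1, 2], [2, 3, 1, 4, 3])` gives `[1, 2, 3, 4]`.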
Tom Lane [Thu, 12 Aug 1999 04:32:54 +0000 (04:32 +0000)]
Clean up optimizer's handling of indexscan quals that need to be
commuted (ie, the index var appears on the right). These are now handled
the same way as merge and hash join quals that need to be commuted: the
actual reversing of the clause only happens if we actually choose the path
and generate a plan from it. Furthermore, the clause is only reversed in
the 'indexqual' field of the plan, not in the 'indxqualorig' field. This
allows the clause to still be recognized and removed from qpquals of upper
level join plans. Also, simplify and generalize match_clause_to_indexkey;
now it recognizes binary-compatible indexes for join as well as restriction
clauses.
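Commuting a clause so that the index variable ends up on the left can be sketched like this. The tuple representation of a clause, the commutator table, and the function names are assumptions for illustration only.

```python
# Hypothetical commutator table: swapping the operands requires this operator.
COMMUTATOR = {"<": ">", ">": "<", "<=": ">=", ">=": "<=", "=": "="}


def commute_clause(clause):
    """Swap the operands of (left, op, right) and use the commuted operator."""
    left, op, right = clause
    return (right, COMMUTATOR[op], left)


def indexqual_for(clause, index_var):
    """Return a form of the clause that has index_var on the left."""
    left, _op, right = clause
    if left == index_var:
        return clause
    if right == index_var:
        return commute_clause(clause)
    raise ValueError("clause does not mention the index variable")
```

Because a new tuple is built, the original clause stays available unchanged, which mirrors keeping an unreversed copy next to the reversed one.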
Tom Lane [Thu, 12 Aug 1999 00:42:43 +0000 (00:42 +0000)]
Add commentary to show that even though ExecInitIndexScan()
contains much code that looks like it will handle indexquals with the index
key on either side of the operator, in fact indexquals must have the index
key on the left because of limitations of the ScanKey machinery. Perhaps
someone will be motivated to fix that someday...
Tom Lane [Tue, 10 Aug 1999 02:58:56 +0000 (02:58 +0000)]
Revise create_nestloop_node's handling of inner indexscan to
work under a wider range of scenarios than it did --- it formerly did not
handle a multi-pass inner scan, nor cases in which the inner scan's
indxqualorig or non-index qual contained outer var references. I am not
sure that these limitations could be hit in the existing optimizer, but
they need to be fixed for future expansion.
Bruce Momjian [Mon, 9 Aug 1999 06:20:27 +0000 (06:20 +0000)]
Prevent sorting if result is already sorted

was implemented by Jan Wieck.
His work is for ascending order cases.

Here is a patch to prevent sorting also in descending
order cases.
Because I had already changed _bt_first() to position
backward correctly before v6.5, this patch would work.

Hiroshi Inoue
Inoue@tpf.co.jp
Tom Lane [Mon, 9 Aug 1999 01:01:42 +0000 (01:01 +0000)]
Rewrite fix_indxqual_references, which was entirely bogus for
multi-scan indexscan plans; it tried to use the same table-to-index
attribute mapping for all the scans, even if they used different indexes.
It would klugily work as long as OR indexquals never used multikey indexes,
but that's not likely to hold up much longer...
Tom Lane [Mon, 9 Aug 1999 00:51:26 +0000 (00:51 +0000)]
Create a standardized expression_tree_mutator support routine
to go along with expression_tree_walker. (_walker is not suitable for
routines that need to alter the tree structure significantly.) Other minor
cleanups in clauses.c.
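The difference between a walker and a mutator can be shown on a toy expression tree. The nested-tuple node layout and both function signatures below are assumptions for this sketch; they do not mirror the C routines.

```python
def tree_walker(node, visit):
    """Visit every node; return True as soon as `visit` returns True."""
    if visit(node):
        return True
    if node[0] == "op":
        # Operator nodes look like ("op", name, child, child, ...).
        return any(tree_walker(child, visit) for child in node[2:])
    return False


def tree_mutator(node, replace):
    """Build a new tree in which `replace` may substitute any node."""
    new_node = replace(node)
    if new_node is not None:
        return new_node
    if node[0] == "op":
        children = tuple(tree_mutator(child, replace) for child in node[2:])
        return ("op", node[1]) + children
    return node
```

The walker only inspects the tree and reports a yes/no answer, while the mutator returns a modified copy and leaves the input untouched.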
Tom Lane [Sun, 8 Aug 1999 20:12:52 +0000 (20:12 +0000)]
Fix nbtree's failure to clear BTScans list during xact abort.
Also, move responsibility for calling vc_abort into main xact.c list of
things-to-call-at-abort. What in the world was it doing down inside of
TransactionIdAbort()?
Fix cross-reference markup so that only the *title* of the Operators
chapter is included, not the chapter itself.
Thanks to Evelyn Mitchell <efm@tummy.com> for pointing it out.
Remove explicit references to ref/ path in file names; use vpath instead.
Fix rules for man pages to ensure double-pass to get cross references.
Add a few new man pages.
Try to clarify characteristics of the SERIAL type.
Fix source indenting, which does not affect output.
Note: still need docs on NUMERIC and DECIMAL
(and let's not talk about regression tests :()
Tom Lane [Fri, 6 Aug 1999 04:00:17 +0000 (04:00 +0000)]
Revise generation of hashjoin paths: generate one path per
hashjoinable clause, not one path for a randomly-chosen element of each
set of clauses with the same join operator. That is, if you wrote
SELECT ... WHERE t1.f1 = t2.f2 and t1.f3 = t2.f4,
and both '=' ops were the same opcode (say, all four fields are int4),
then the system would either consider hashing on f1=f2 or on f3=f4,
but it would *not* consider both possibilities. Boo hiss.
Also, revise estimation of hashjoin costs to include a penalty when the
inner join var has a high disbursion --- ie, the most common value is
pretty common. This tends to lead to badly skewed hash bucket occupancy
and way more comparisons than you'd expect on average.
I imagine that the cost calculation still needs tweaking, but at least
it generates a more reasonable plan than before on George Young's example.
Tom Lane [Thu, 5 Aug 1999 02:33:54 +0000 (02:33 +0000)]
Revise parse_coerce() to handle coercion of int and float
constants, not only string constants, at parse time. Get rid of
parser_typecast2(), which is bogus and redundant...
Tom Lane [Tue, 3 Aug 1999 00:09:32 +0000 (00:09 +0000)]
Fix ELF test so it doesn't spit up on all non-ELF systems...
use Autoconf-approved method of testing for predefined symbols, and move
it down to where we know what compiler to run and how to run it.
Tom Lane [Mon, 2 Aug 1999 02:05:41 +0000 (02:05 +0000)]
Further selectivity-estimation work. Speed up eqsel()
(it should just call the given operator, not look up an = operator).
Fix intltsel() so that all numeric data types are converted to double
before trying to estimate where the given comparison value is in the
known range of column values. intltsel() still needs work, or replacement,
for non-numeric data types ... but for nonintegral numeric types it
should now be delivering reasonable estimates.
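One simple way to estimate where a comparison value falls in a known column range is linear interpolation after converting everything to double. The sketch below assumes uniformly distributed values and a made-up default of 0.5 for a degenerate range; it is not claimed to be what intltsel() computes.

```python
def fraction_below(value, low, high):
    """Estimate the fraction of column values below `value`."""
    # Convert all inputs to float first, whatever numeric type they had.
    value, low, high = float(value), float(low), float(high)
    if high <= low:
        return 0.5  # hypothetical default when the range is degenerate
    frac = (value - low) / (high - low)
    # Clamp to [0, 1] for values outside the known range.
    return min(1.0, max(0.0, frac))
```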
Bruce Momjian [Sun, 1 Aug 1999 16:30:05 +0000 (16:30 +0000)]
I didn't see any further discussion so here is, I hope, a clean fix to
configure.in to determine if a system is ELF or not. Note that some
of the tests earlier may be redundant but I took the safest route.
Tom Lane [Sun, 1 Aug 1999 04:54:25 +0000 (04:54 +0000)]
First step in fixing selectivity-estimation code. eqsel and
neqsel now behave as per my suggestions in pghackers a few days ago.
Selectivity for < > <= >= should work OK for integral types as well, but
still needs work for nonintegral types. Since these routines have never
actually executed before :-(, this may result in some significant changes
in the optimizer's choices of execution plans. Let me know if you see
any serious misbehavior.
CAUTION: THESE CHANGES REQUIRE INITDB. pg_statistic table has changed.
Tom Lane [Fri, 30 Jul 1999 04:07:25 +0000 (04:07 +0000)]
Further cleanups of indexqual processing: simplify control
logic in indxpath.c, avoid generation of redundant indexscan paths for the
same relation and index.