From 99f964fcc6a4fe05bc8504a890e44b922373f759 Mon Sep 17 00:00:00 2001
From: Bruce Momjian
Date: Mon, 29 Jan 2001 17:57:26 +0000
Subject: [PATCH] Remove unused TODO.detail functions.

---
 doc/TODO                 |  10 +-
 doc/TODO.detail/function | 519 ---------------------------------------
 doc/TODO.detail/logging  | 285 ---------------------
 doc/TODO.detail/outer    | 392 -----------------------------
 4 files changed, 5 insertions(+), 1201 deletions(-)
 delete mode 100644 doc/TODO.detail/function
 delete mode 100644 doc/TODO.detail/logging
 delete mode 100644 doc/TODO.detail/outer

diff --git a/doc/TODO b/doc/TODO
index 443b2f28f7..0aab52065e 100644
--- a/doc/TODO
+++ b/doc/TODO
@@ -56,7 +56,7 @@ ENHANCEMENTS
 URGENT
-* -Add OUTER joins, left and right[outer] (Tom, Thomas)
+* -Add OUTER joins, left and right (Tom, Thomas)
 * -Allow long tuples by chaining or auto-storing outside db (TOAST) (Jan)
 * -Fix memory leak for expressions (Tom)
 * Add replication of distributed databases [replication]
@@ -95,7 +95,7 @@ TYPES
 o -Allow large object vacuuming
 o -Tables that start with xinv confused to be large objects
 * Add IPv6 capability to INET/CIDR types
-* -Fix improper masking of some inet/cidr types [cidr]
+* -Fix improper masking of some inet/cidr types
 * Add conversion function from text to inet
 * Make a separate SERIAL type?
 * Store binary-compatible type information in the system
@@ -224,7 +224,7 @@ EXOTIC FEATURES
 * Add the concept of dataspaces/tablespaces [tablespaces]
 * Allow queries across multiple databases
 * Allow nested transactions (Vadim)
-* Allow [INSERT/UPDATE] ... RETURNING new.col or old.col (Philip)
+* Allow INSERT/UPDATE ...
RETURNING new.col or old.col (Philip) * SQL*Net listener that makes PostgreSQL appear as an Oracle database to clients * Incremental backups @@ -242,13 +242,13 @@ MISCELLANEOUS * Allow cursors to be DECLAREd/OPENed/CLOSEed outside transactions * Allow DELETE WHERE CURRENT OF cursor * -Transaction log, so re-do log can be on a separate disk by - with after-row images (Vadim) [logging] + with after-row images (Vadim) * Populate backend status area and write program to dump status data * Make oid use unsigned int more reliably, pg_atoi() * Put sort files in their own directory * Allow autocommit so always in a transaction block * Show location of syntax error in query [yacc] -* -Redesign the function call interface to handle NULLs better[function] (Tom) +* -Redesign the function call interface to handle NULLs better (Tom) * Missing optimizer selectivities for date, r-tree, etc. [optimizer] * Overhaul bufmgr/lockmgr/transaction manager * -redesign UNION structures to have separarate target lists diff --git a/doc/TODO.detail/function b/doc/TODO.detail/function deleted file mode 100644 index 84dc48f905..0000000000 --- a/doc/TODO.detail/function +++ /dev/null @@ -1,519 +0,0 @@ -From owner-pgsql-hackers@hub.org Wed Sep 22 20:31:02 1999 -Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA15611 - for ; Wed, 22 Sep 1999 20:31:01 -0400 (EDT) -Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id UAA02926 for ; Wed, 22 Sep 1999 20:21:24 -0400 (EDT) -Received: from hub.org (hub.org [216.126.84.1]) - by hub.org (8.9.3/8.9.3) with ESMTP id UAA75413; - Wed, 22 Sep 1999 20:09:35 -0400 (EDT) - (envelope-from owner-pgsql-hackers@hub.org) -Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 22 Sep 1999 20:08:50 +0000 (EDT) -Received: (from majordom@localhost) - by hub.org (8.9.3/8.9.3) id UAA75058 - for pgsql-hackers-outgoing; Wed, 22 Sep 1999 
20:06:58 -0400 (EDT) - (envelope-from owner-pgsql-hackers@postgreSQL.org) -Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by hub.org (8.9.3/8.9.3) with ESMTP id UAA74982 - for ; Wed, 22 Sep 1999 20:06:25 -0400 (EDT) - (envelope-from tgl@sss.pgh.pa.us) -Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1]) - by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id UAA06411 - for ; Wed, 22 Sep 1999 20:05:40 -0400 (EDT) -To: pgsql-hackers@postgreSQL.org -Subject: [HACKERS] Progress report: buffer refcount bugs and SQL functions -Date: Wed, 22 Sep 1999 20:05:39 -0400 -Message-ID: <6408.938045139@sss.pgh.pa.us> -From: Tom Lane -Sender: owner-pgsql-hackers@postgreSQL.org -Precedence: bulk -Status: RO - -I have been finding a lot of interesting stuff while looking into -the buffer reference count/leakage issue. - -It turns out that there were two specific things that were camouflaging -the existence of bugs in this area: - -1. The BufferLeakCheck routine that's run at transaction commit was -only looking for nonzero PrivateRefCount to indicate a missing unpin. -It failed to notice nonzero LastRefCount --- which meant that an -error in refcount save/restore usage could leave a buffer pinned, -and BufferLeakCheck wouldn't notice. - -2. The BufferIsValid macro, which you'd think just checks whether -it's handed a valid buffer identifier or not, actually did more: -it only returned true if the buffer ID was valid *and* the buffer -had positive PrivateRefCount. That meant that the common pattern - if (BufferIsValid(buf)) - ReleaseBuffer(buf); -wouldn't complain if it were handed a valid but already unpinned buffer. -And that behavior masks bugs that result in buffers being unpinned too -early. For example, consider a sequence like - -1. LockBuffer (buffer now has refcount 1). Store reference to - a tuple on that buffer page in a tuple table slot. -2. Copy buffer reference to a second tuple-table slot, but forget to - increment buffer's refcount. -3. 
Release second tuple table slot. Buffer refcount drops to 0, - so it's unpinned. -4. Release original tuple slot. Because of BufferIsValid behavior, - no assert happens here; in fact nothing at all happens. - -This is, of course, buggy code: during the interval from 3 to 4 you -still have an apparently valid tuple reference in the original slot, -which someone might try to use; but the buffer it points to is unpinned -and could be replaced at any time by another backend. - -In short, we had errors that would mask both missing-pin bugs and -missing-unpin bugs. And naturally there were a few such bugs lurking -behind them... - -3. The buffer refcount save/restore stuff, which I had suspected -was useless, is not only useless but also buggy. The reason it's -buggy is that it only works if used in a nested fashion. You could -save state A, pin some buffers, save state B, pin some more -buffers, restore state B (thereby unpinning what you pinned since -the save), and finally restore state A (unpinning the earlier stuff). -What you could not do is save state A, pin, save B, pin more, then -restore state A --- that might unpin some of A's buffers, or some -of B's buffers, or some unforeseen combination thereof. If you -restore A and then restore B, you do not necessarily return to a zero- -pins state, either. And it turns out the actual usage pattern was a -nearly random sequence of saves and restores, compounded by a failure to -do all of the restores reliably (which was masked by the oversight in -BufferLeakCheck). - - -What I have done so far is to rip out the buffer refcount save/restore -support (including LastRefCount), change BufferIsValid to a simple -validity check (so that you get an assert if you unpin something that -was pinned), change ExecStoreTuple so that it increments the refcount -when it is handed a buffer reference (for symmetry with ExecClearTuple's -decrement of the refcount), and fix about a dozen bugs exposed by these -changes. 
- -I am still getting Buffer Leak notices in the "misc" regression test, -specifically in the queries that invoke more than one SQL function. -What I find there is that SQL functions are not always run to -completion. Apparently, when a function can return multiple tuples, -it won't necessarily be asked to produce them all. And when it isn't, -postquel_end() isn't invoked for the function's current query, so its -tuple table isn't cleared, so we have dangling refcounts if any of the -tuples involved are in disk buffers. - -It may be that the save/restore code was a misguided attempt to fix -this problem. I can't tell. But I think what we really need to do is -find some way of ensuring that Postquel function execution contexts -always get shut down by the end of the query, so that they don't leak -resources. - -I suppose a straightforward approach would be to keep a list of open -function contexts somewhere (attached to the outer execution context, -perhaps), and clean them up at outer-plan shutdown. - -What I am wondering, though, is whether this addition is actually -necessary, or is it a bug that the functions aren't run to completion -in the first place? I don't really understand the semantics of this -"nested dot notation". I suppose it is a Berkeleyism; I can't find -anything about it in the SQL92 document. The test cases shown in the -misc regress test seem peculiar, not to say wrong. 
For example: - -regression=> SELECT p.hobbies.equipment.name, p.hobbies.name, p.name FROM person p; -name |name |name --------------+-----------+----- -advil |posthacking|mike -peet's coffee|basketball |joe -hightops |basketball |sally -(3 rows) - -which doesn't appear to agree with the contents of the underlying -relations: - -regression=> SELECT * FROM hobbies_r; -name |person ------------+------ -posthacking|mike -posthacking|jeff -basketball |joe -basketball |sally -skywalking | -(5 rows) - -regression=> SELECT * FROM equipment_r; -name |hobby --------------+----------- -advil |posthacking -peet's coffee|posthacking -hightops |basketball -guts |skywalking -(4 rows) - -I'd have expected an output along the lines of - -advil |posthacking|mike -peet's coffee|posthacking|mike -hightops |basketball |joe -hightops |basketball |sally - -Is the regression test's expected output wrong, or am I misunderstanding -what this query is supposed to do? Is there any documentation anywhere -about how SQL functions returning multiple tuples are supposed to -behave? 
- - regards, tom lane - -************ - - -From owner-pgsql-hackers@hub.org Thu Sep 23 11:03:19 1999 -Received: from hub.org (hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA16211 - for ; Thu, 23 Sep 1999 11:03:17 -0400 (EDT) -Received: from hub.org (hub.org [216.126.84.1]) - by hub.org (8.9.3/8.9.3) with ESMTP id KAA58151; - Thu, 23 Sep 1999 10:53:46 -0400 (EDT) - (envelope-from owner-pgsql-hackers@hub.org) -Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 23 Sep 1999 10:53:05 +0000 (EDT) -Received: (from majordom@localhost) - by hub.org (8.9.3/8.9.3) id KAA57948 - for pgsql-hackers-outgoing; Thu, 23 Sep 1999 10:52:23 -0400 (EDT) - (envelope-from owner-pgsql-hackers@postgreSQL.org) -Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by hub.org (8.9.3/8.9.3) with ESMTP id KAA57841 - for ; Thu, 23 Sep 1999 10:51:50 -0400 (EDT) - (envelope-from tgl@sss.pgh.pa.us) -Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1]) - by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id KAA14211; - Thu, 23 Sep 1999 10:51:10 -0400 (EDT) -To: Andreas Zeugswetter -cc: hackers@postgreSQL.org -Subject: Re: [HACKERS] Progress report: buffer refcount bugs and SQL functions -In-reply-to: Your message of Thu, 23 Sep 1999 10:07:24 +0200 - <37E9DFBC.5C0978F@telecom.at> -Date: Thu, 23 Sep 1999 10:51:10 -0400 -Message-ID: <14209.938098270@sss.pgh.pa.us> -From: Tom Lane -Sender: owner-pgsql-hackers@postgreSQL.org -Precedence: bulk -Status: RO - -Andreas Zeugswetter writes: -> That is what I use it for. I have never used it with a -> returns setof function, but reading the comments in the regression test, -> -- mike needs advil and peet's coffee, -> -- joe and sally need hightops, and -> -- everyone else is fine. -> it looks like the results you expected are correct, and currently the -> wrong result is given. - -Yes, I have concluded the same (and partially fixed it, per my previous -message). 
- -> Those that don't have a hobbie should return name|NULL|NULL. A hobbie -> that does'nt need equipment name|hobbie|NULL. - -That's a good point. Currently (both with and without my uncommitted -fix) you get *no* rows out from ExecTargetList if there are any Iters -that return empty result sets. It might be more reasonable to treat an -empty result set as if it were NULL, which would give the behavior you -suggest. - -This would be an easy change to my current patch, and I'm prepared to -make it before committing what I have, if people agree that that's a -more reasonable definition. Comments? - - regards, tom lane - -************ - - -From owner-pgsql-hackers@hub.org Thu Sep 23 04:31:15 1999 -Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA11344 - for ; Thu, 23 Sep 1999 04:31:15 -0400 (EDT) -Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id EAA05350 for ; Thu, 23 Sep 1999 04:24:29 -0400 (EDT) -Received: from hub.org (hub.org [216.126.84.1]) - by hub.org (8.9.3/8.9.3) with ESMTP id EAA85679; - Thu, 23 Sep 1999 04:16:26 -0400 (EDT) - (envelope-from owner-pgsql-hackers@hub.org) -Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 23 Sep 1999 04:09:52 +0000 (EDT) -Received: (from majordom@localhost) - by hub.org (8.9.3/8.9.3) id EAA84708 - for pgsql-hackers-outgoing; Thu, 23 Sep 1999 04:08:57 -0400 (EDT) - (envelope-from owner-pgsql-hackers@postgreSQL.org) -Received: from gandalf.telecom.at (gandalf.telecom.at [194.118.26.84]) - by hub.org (8.9.3/8.9.3) with ESMTP id EAA84632 - for ; Thu, 23 Sep 1999 04:08:03 -0400 (EDT) - (envelope-from andreas.zeugswetter@telecom.at) -Received: from telecom.at (w0188000580.f000.d0188.sd.spardat.at [172.18.65.249]) - by gandalf.telecom.at (xxx/xxx) with ESMTP id KAA195294 - for ; Thu, 23 Sep 1999 10:07:27 +0200 -Message-ID: <37E9DFBC.5C0978F@telecom.at> -Date: Thu, 23 Sep 1999 10:07:24 
+0200 -From: Andreas Zeugswetter -X-Mailer: Mozilla 4.61 [en] (Win95; I) -X-Accept-Language: en -MIME-Version: 1.0 -To: hackers@postgreSQL.org -Subject: Re: [HACKERS] Progress report: buffer refcount bugs and SQL functions -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -Sender: owner-pgsql-hackers@postgreSQL.org -Precedence: bulk -Status: RO - -> Is the regression test's expected output wrong, or am I -> misunderstanding -> what this query is supposed to do? Is there any -> documentation anywhere -> about how SQL functions returning multiple tuples are supposed to -> behave? - -They are supposed to behave somewhat like a view. -Not all rows are necessarily fetched. -If used in a context that needs a single row answer, -and the answer has multiple rows it is supposed to -runtime elog. Like in: - -select * from tbl where col=funcreturningmultipleresults(); --- this must elog - -while this is ok: -select * from tbl where col in (select funcreturningmultipleresults()); - -But the caller could only fetch the first row if he wanted. - -The nested notation is supposed to call the function passing it the tuple -as the first argument. This is what can be used to "fake" a column -onto a table (computed column). -That is what I use it for. I have never used it with a -returns setof function, but reading the comments in the regression test, --- mike needs advil and peet's coffee, --- joe and sally need hightops, and --- everyone else is fine. -it looks like the results you expected are correct, and currently the -wrong result is given. - -But I think this query could also elog whithout removing substantial -functionality. - -SELECT p.name, p.hobbies.name, p.hobbies.equipment.name FROM person p; - -Actually for me it would be intuitive, that this query return one row per -person, but elog on those that have more than one hobbie or a hobbie that -needs more than one equipment. Those that don't have a hobbie should -return name|NULL|NULL. 
A hobbie that does'nt need equipment name|hobbie|NULL. - -Andreas - -************ - - -From owner-pgsql-hackers@hub.org Wed Sep 22 22:01:07 1999 -Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA16360 - for ; Wed, 22 Sep 1999 22:01:05 -0400 (EDT) -Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id VAA08386 for ; Wed, 22 Sep 1999 21:37:24 -0400 (EDT) -Received: from hub.org (hub.org [216.126.84.1]) - by hub.org (8.9.3/8.9.3) with ESMTP id VAA88083; - Wed, 22 Sep 1999 21:28:11 -0400 (EDT) - (envelope-from owner-pgsql-hackers@hub.org) -Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 22 Sep 1999 21:27:48 +0000 (EDT) -Received: (from majordom@localhost) - by hub.org (8.9.3/8.9.3) id VAA87938 - for pgsql-hackers-outgoing; Wed, 22 Sep 1999 21:26:52 -0400 (EDT) - (envelope-from owner-pgsql-hackers@postgreSQL.org) -Received: from orion.SAPserv.Hamburg.dsh.de (Tpolaris2.sapham.debis.de [53.2.131.8]) - by hub.org (8.9.3/8.9.3) with SMTP id VAA87909 - for ; Wed, 22 Sep 1999 21:26:36 -0400 (EDT) - (envelope-from wieck@debis.com) -Received: by orion.SAPserv.Hamburg.dsh.de - for pgsql-hackers@postgresql.org - id m11TxXw-0003kLC; Thu, 23 Sep 99 03:19 MET DST -Message-Id: -From: wieck@debis.com (Jan Wieck) -Subject: Re: [HACKERS] Progress report: buffer refcount bugs and SQL functions -To: tgl@sss.pgh.pa.us (Tom Lane) -Date: Thu, 23 Sep 1999 03:19:39 +0200 (MET DST) -Cc: pgsql-hackers@postgreSQL.org -Reply-To: wieck@debis.com (Jan Wieck) -In-Reply-To: <6408.938045139@sss.pgh.pa.us> from "Tom Lane" at Sep 22, 99 08:05:39 pm -X-Mailer: ELM [version 2.4 PL25] -Content-Type: text -Sender: owner-pgsql-hackers@postgreSQL.org -Precedence: bulk -Status: RO - -Tom Lane wrote: - -> [...] 
-> -> What I am wondering, though, is whether this addition is actually -> necessary, or is it a bug that the functions aren't run to completion -> in the first place? I don't really understand the semantics of this -> "nested dot notation". I suppose it is a Berkeleyism; I can't find -> anything about it in the SQL92 document. The test cases shown in the -> misc regress test seem peculiar, not to say wrong. For example: -> -> [...] -> -> Is the regression test's expected output wrong, or am I misunderstanding -> what this query is supposed to do? Is there any documentation anywhere -> about how SQL functions returning multiple tuples are supposed to -> behave? - - I've said some time (maybe too long) ago, that SQL functions - returning tuple sets are broken in general. This nested dot - notation (which I think is an artefact from the postquel - querylanguage) is implemented via set functions. - - Set functions have total different semantics from all other - functions. First they don't really return a tuple set as - someone might think - all that screwed up code instead - simulates that they return something you could consider a - scan of the last SQL statement in the function. Then, on - each subsequent call inside of the same command, they return - a "tupletable slot" containing the next found tuple (that's - why their Func node is mangled up after the first call). - - Second they have a targetlist what I think was originally - intended to extract attributes out of the tuples returned - when the above scan is asked to get the next tuple. But as I - read the code it invokes the function again and this might - cause the resource leakage you see. - - Third, all this seems to never have been implemented - (thought?) to the end. A targetlist doesn't make sense at - this place because it could at max contain a single attribute - - so a single attno would have the same power. 
And if set - functions could appear in the rangetable (FROM clause), than - they would be treated as that and regular Var nodes in the - query would do it. - - I think you shouldn't really care for that regression test - and maybe we should disable set functions until we really - implement stored procedures returning sets in the rangetable. - - Set functions where planned by Stonebraker's team as - something that today is called stored procedures. But AFAIK - they never reached the useful state because even in Postgres - 4.2 you haven't been able to get more than one attribute out - of a set function. It was a feature of the postquel - querylanguage that you could get one attribute from a set - function via - - RETRIEVE (attributename(setfuncname())) - - While working on the constraint triggers I've came across - another regression test (triggers :-) that's errorneous too. - The funny_dup17 trigger proc executes an INSERT into the same - relation where it get fired for by a previous INSERT. And it - stops this recursion only if it reaches a nesting level of - 17, which could only occur if it is fired DURING the - execution of it's own SPI_exec(). After Vadim quouted some - SQL92 definitions about when constraint checks and triggers - are to be executed, I decided to fire regular triggers at the - end of a query too. Thus, there is absolutely no nesting - possible for AFTER triggers resulting in an endless loop. - - -Jan - --- - -#======================================================================# -# It's easier to get forgiveness for being wrong than for being right. # -# Let's break this rule - forgive me. 
# -#========================================= wieck@debis.com (Jan Wieck) # - - - -************ - - -From owner-pgsql-hackers@hub.org Thu Sep 23 11:01:06 1999 -Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA16162 - for ; Thu, 23 Sep 1999 11:01:04 -0400 (EDT) -Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id KAA28544 for ; Thu, 23 Sep 1999 10:45:54 -0400 (EDT) -Received: from hub.org (hub.org [216.126.84.1]) - by hub.org (8.9.3/8.9.3) with ESMTP id KAA52943; - Thu, 23 Sep 1999 10:20:51 -0400 (EDT) - (envelope-from owner-pgsql-hackers@hub.org) -Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 23 Sep 1999 10:19:58 +0000 (EDT) -Received: (from majordom@localhost) - by hub.org (8.9.3/8.9.3) id KAA52472 - for pgsql-hackers-outgoing; Thu, 23 Sep 1999 10:19:03 -0400 (EDT) - (envelope-from owner-pgsql-hackers@postgreSQL.org) -Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) - by hub.org (8.9.3/8.9.3) with ESMTP id KAA52431 - for ; Thu, 23 Sep 1999 10:18:47 -0400 (EDT) - (envelope-from tgl@sss.pgh.pa.us) -Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1]) - by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id KAA13253; - Thu, 23 Sep 1999 10:18:02 -0400 (EDT) -To: wieck@debis.com (Jan Wieck) -cc: pgsql-hackers@postgreSQL.org -Subject: Re: [HACKERS] Progress report: buffer refcount bugs and SQL functions -In-reply-to: Your message of Thu, 23 Sep 1999 03:19:39 +0200 (MET DST) - -Date: Thu, 23 Sep 1999 10:18:01 -0400 -Message-ID: <13251.938096281@sss.pgh.pa.us> -From: Tom Lane -Sender: owner-pgsql-hackers@postgreSQL.org -Precedence: bulk -Status: RO - -wieck@debis.com (Jan Wieck) writes: -> Tom Lane wrote: ->> What I am wondering, though, is whether this addition is actually ->> necessary, or is it a bug that the functions aren't run to completion ->> in the first place? 
- -> I've said some time (maybe too long) ago, that SQL functions -> returning tuple sets are broken in general. - -Indeed they are. Try this on for size (using the regression database): - - SELECT p.name, p.hobbies.equipment.name FROM person p; - SELECT p.hobbies.equipment.name, p.name FROM person p; - -You get different result sets!? - -The problem in this example is that ExecTargetList returns the isDone -flag from the last targetlist entry, regardless of whether there are -incomplete iterations in previous entries. More generally, the buffer -leak problem that I started with only occurs if some Iter nodes are not -run to completion --- but execQual.c has no mechanism to make sure that -they have all reached completion simultaneously. - -What we really need to make functions-returning-sets work properly is -an implementation somewhat like aggregate functions. We need to make -a list of all the Iter nodes present in a targetlist and cycle through -the values returned by each in a methodical fashion (run the rightmost -through its full cycle, then advance the next-to-rightmost one value, -run the rightmost through its cycle again, etc etc). Also there needs -to be an understanding of the hierarchy when an Iter appears in the -arguments of another Iter's function. (You cycle the upper one for -*each* set of arguments created by cycling its sub-Iters.) - -I am not particularly interested in working on this feature right now, -since AFAIK it's a Berkeleyism not found in SQL92. What I've done -is to hack ExecTargetList so that it behaves semi-sanely when there's -more than one Iter at the top level of the target list --- it still -doesn't really give the right answer, but at least it will keep -generating tuples until all the Iters are done at the same time. -It happens that that's enough to give correct answers for the examples -shown in the misc regress test. Even when it fails to generate all -the possible combinations, there will be no buffer leaks. 
- -So, I'm going to declare victory and go home ;-). We ought to add a -TODO item along the lines of - * Functions returning sets don't really work right -in hopes that someone will feel like tackling this someday. - - regards, tom lane - -************ - - diff --git a/doc/TODO.detail/logging b/doc/TODO.detail/logging deleted file mode 100644 index fa8e2dd9d3..0000000000 --- a/doc/TODO.detail/logging +++ /dev/null @@ -1,285 +0,0 @@ -From owner-pgsql-hackers@hub.org Fri Nov 13 13:24:37 1998 -Received: from hub.org (majordom@hub.org [209.47.148.200]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA13457 - for ; Fri, 13 Nov 1998 13:24:35 -0500 (EST) -Received: from localhost (majordom@localhost) - by hub.org (8.9.1/8.9.1) with SMTP id NAA02464; - Fri, 13 Nov 1998 13:22:52 -0500 (EST) - (envelope-from owner-pgsql-hackers@hub.org) -Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 13 Nov 1998 13:21:14 +0000 (EST) -Received: (from majordom@localhost) - by hub.org (8.9.1/8.9.1) id NAA02331 - for pgsql-hackers-outgoing; Fri, 13 Nov 1998 13:21:12 -0500 (EST) - (envelope-from owner-pgsql-hackers@postgreSQL.org) -Received: from orion.SAPserv.Hamburg.dsh.de (Tpolaris2.sapham.debis.de [53.2.131.8]) - by hub.org (8.9.1/8.9.1) with SMTP id NAA02316 - for ; Fri, 13 Nov 1998 13:21:06 -0500 (EST) - (envelope-from wieck@sapserv.debis.de) -Received: by orion.SAPserv.Hamburg.dsh.de - for pgsql-hackers@postgreSQL.org - id m0zeOEf-000EBPC; Fri, 13 Nov 98 19:46 MET -Message-Id: -From: jwieck@debis.com (Jan Wieck) -Subject: [HACKERS] shmem limits and redolog -To: pgsql-hackers@postgreSQL.org (PostgreSQL HACKERS) -Date: Fri, 13 Nov 1998 19:46:20 +0100 (MET) -Reply-To: jwieck@debis.com (Jan Wieck) -X-Mailer: ELM [version 2.4 PL25] -Content-Type: text -Sender: owner-pgsql-hackers@postgreSQL.org -Precedence: bulk -Status: ROr - -Hi, - - I'm currently hacking around on a solution for logging all - database operations at query level that can recover a crashed - 
database from the last successful backup by redoing all the - commands. - - Well, I wanted it to be as flexible as can. So I decided to - make it per database configurable. One could say which - databases are logged and if a database is, if it is logged - sync or async (in sync mode, every COMMIT forces an fsync of - the actual logfile and controlfiles). - - To make async mode as fast as can, I'm using a shared memory - of 32K per database (not per backend) that is used as a wrap - around buffer from the backends to place their query - information. So the log writer can fall a little behind if - there are many backends doing different things that don't - lock each other. - - Now I'm a little in doubt about the shared memory limits - reported. Was it a good decision to use shared memory? Am I - better off using socket's? - - The bad thing in what I have up to now (it's far from - complete) is, that even if a database isn't currently logged, - a redolog writer is started and creates the 32K shmem segment - (plus a semaphore set with 5 semaphores). This is because I - plan to create commands like - - ALTER DATABASE LOG MODE=ASYNC LOGDIR='/somewhere/dbname'; - - and the like that can be used at runtime (while more than one - backend is connected to the database) to turn logging on/off, - switch to/from backup mode (all other activity is stopped) - etc. - - So every 32 databases will require another megabyte of shared - memory. The logging master controls which databases have - activity and kills redolog writers after some time of - inactivity, and the shmem is freed then. But it can hurt if - someone really has many many databases that are all used at - the same time. - - What do the others say? - - -Jan - --- - -#======================================================================# -# It's easier to get forgiveness for being wrong than for being right. # -# Let's break this rule - forgive me. 
# -#======================================== jwieck@debis.com (Jan Wieck) # - - - - -From owner-pgsql-hackers@hub.org Wed Dec 16 15:46:41 1998 -Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA00521 - for ; Wed, 16 Dec 1998 15:46:40 -0500 (EST) -Received: from hub.org (majordom@hub.org [209.47.145.100]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id PAA08772 for ; Wed, 16 Dec 1998 15:10:01 -0500 (EST) -Received: from localhost (majordom@localhost) - by hub.org (8.9.1/8.9.1) with SMTP id PAA01254; - Wed, 16 Dec 1998 15:06:56 -0500 (EST) - (envelope-from owner-pgsql-hackers@hub.org) -Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 16 Dec 1998 14:58:11 +0000 (EST) -Received: (from majordom@localhost) - by hub.org (8.9.1/8.9.1) id OAA00660 - for pgsql-hackers-outgoing; Wed, 16 Dec 1998 14:58:10 -0500 (EST) - (envelope-from owner-pgsql-hackers@postgreSQL.org) -Received: from orion.SAPserv.Hamburg.dsh.de (Tpolaris2.sapham.debis.de [53.2.131.8]) - by hub.org (8.9.1/8.9.1) with SMTP id OAA00643 - for ; Wed, 16 Dec 1998 14:58:05 -0500 (EST) - (envelope-from wieck@sapserv.debis.de) -Received: by orion.SAPserv.Hamburg.dsh.de - for pgsql-hackers@postgreSQL.org - id m0zqNDo-000EBTC; Wed, 16 Dec 98 21:07 MET -Message-Id: -From: jwieck@debis.com (Jan Wieck) -Subject: Re: [HACKERS] redolog - for discussion -To: vadim@krs.ru (Vadim Mikheev) -Date: Wed, 16 Dec 1998 21:07:00 +0100 (MET) -Cc: jwieck@debis.com, pgsql-hackers@postgreSQL.org -Reply-To: jwieck@debis.com (Jan Wieck) -In-Reply-To: <3677B71D.C67462B3@krs.ru> from "Vadim Mikheev" at Dec 16, 98 08:35:25 pm -X-Mailer: ELM [version 2.4 PL25] -Content-Type: text -Sender: owner-pgsql-hackers@postgreSQL.org -Precedence: bulk -Status: RO - -Vadim wrote: - -> -> Jan Wieck wrote: -> > -> > RECOVER DATABASE {ALL | UNTIL 'datetime' | RESET}; -> > -> ... 
-> > -> > For the others, the backend starts the recovery program -> > which reads the redolog files, establishes database -> > connections as required and reruns all the commands in -> ^^^^^^^^^^^^^^^^^^^^^^^^^^ -> > them. If a required logfile isn't found, it tells the -> ^^^^^ -> -> I foresee problems with using _commands_ logging for -> recovery/replication -:(( -> -> Let's consider two concurrent updates in READ COMMITTED mode: -> -> update test set x = 2 where y = 1; -> -> and -> -> update test set x = 3 where y = 1; -> -> The result of both committed transaction will be x = 2 -> if the 1st transaction updated row _after_ 2nd transaction -> and x = 3 if the 2nd transaction gets row after 1st one. -> Order of updates is not defined by order in which commands -> begun and so order in which commands should be rerun -> will be unknown... - - Yepp, the order in which commands begun is absolutely not of - interest. Locking could already delay the execution of one - command until another one started later has finished and - released the lock. It's a classic race condition. - - Thus, my plan was to log the queries just before the call to - CommitTransactionCommand() in tcop. This has the advantage, - that queries which bail out with errors don't get into the - log at all and must not get rerun. And I can set a static - flag to false before starting the command, which is set to - true in the buffer manager when a buffer is written (marked - dirty), so filtering out queries that do no updates at all is - easy. - - Unfortunately query level logging get's hit by the current - implementation of sequence numbers. If a query that get's - aborted somewhere in the middle (maybe by a trigger) called - nextval() for rows processed earlier, the sequence number - isn't advanced at recovery time, because the query is - suppressed at all. And sequences aren't locked, so for - concurrently running queries getting numbers from the same - sequence, the results aren't reproduceable. 
If some - application selects a value resulting from a sequence and - uses that later in another query, how could the redolog know - that this has changed? It's a Const in the query logged, and - all that corrupts the whole thing. - - All that is painful and I don't see another solution yet than - to hook into nextval(), log out the numbers generated in - normal operation and getting back the same numbers in redo - mode. - - The whole thing gets more and more complicated :-( - - -Jan - --- - -#======================================================================# -# It's easier to get forgiveness for being wrong than for being right. # -# Let's break this rule - forgive me. # -#======================================== jwieck@debis.com (Jan Wieck) # - - - - -From owner-pgsql-hackers@hub.org Wed Jun 16 09:29:31 1999 -Received: from hub.org (hub.org [209.167.229.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id JAA22504 - for ; Wed, 16 Jun 1999 09:29:29 -0400 (EDT) -Received: from hub.org (hub.org [209.167.229.1]) - by hub.org (8.9.3/8.9.3) with ESMTP id JAA02132; - Wed, 16 Jun 1999 09:18:20 -0400 (EDT) - (envelope-from owner-pgsql-hackers@hub.org) -Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 16 Jun 1999 09:14:07 +0000 (EDT) -Received: (from majordom@localhost) - by hub.org (8.9.3/8.9.3) id JAA01318 - for pgsql-hackers-outgoing; Wed, 16 Jun 1999 09:14:06 -0400 (EDT) - (envelope-from owner-pgsql-hackers@postgreSQL.org) -X-Authentication-Warning: hub.org: majordom set sender to owner-pgsql-hackers@postgreSQL.org using -f -Received: from sunpine.krs.ru (SunPine.krs.ru [195.161.16.37]) - by hub.org (8.9.3/8.9.3) with ESMTP id JAA01278 - for ; Wed, 16 Jun 1999 09:13:48 -0400 (EDT) - (envelope-from vadim@krs.ru) -Received: from krs.ru (dune.krs.ru [195.161.16.38]) - by sunpine.krs.ru (8.8.8/8.8.8) with ESMTP id VAA06276 - for ; Wed, 16 Jun 1999 21:12:49 +0800 (KRSS) -Message-ID: <3767A2CF.E6E4A5F9@krs.ru> -Date: Wed, 16 Jun 1999 21:12:47 
+0800 -From: Vadim Mikheev -Organization: OJSC Rostelecom (Krasnoyarsk) -X-Mailer: Mozilla 4.5 [en] (X11; I; FreeBSD 3.0-RELEASE i386) -X-Accept-Language: ru, en -MIME-Version: 1.0 -To: PostgreSQL Developers List -Subject: [HACKERS] Savepoints... -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -Sender: owner-pgsql-hackers@postgreSQL.org -Precedence: bulk -Status: ROr - -To have them I need to add tuple id (6 bytes) to heap tuple -header. Are there objections? Though it's not good to increase -tuple header size, subj is, imho, very nice feature... - -Implementation is , hm, "easy": - -- heap_insert/heap_delete/heap_replace/heap_mark4update will - remember updated tid (and current command id) in relation cache - and store previously updated tid (remembered in relation cache) - in additional heap header tid; -- lmgr will remember command id when lock was acquired; -- for a savepoint we will just store command id when - the savepoint was setted; -- when going to sleep due to concurrent the-same-row update, - backend will store MyProc and tuple id in shmem hash table. - -When rolling back to a savepoint, backend will: - -- release locks acquired after savepoint; -- for a relation updated after savepoint, get last updated tid - from relation cache, walk through relation, set - HEAP_XMIN_INVALID/HEAP_XMAX_INVALID in all tuples updated - after savepoint and wake up concurrent writers blocked - on these tuples (using shmem hash table mentioned above). - -The last feature (waking up of concurrent writers) is most hard -part to implement. AFAIK, Oracle 7.3 was not able to do it. -Can someone comment is this feature implemented in Oracle 8.X, -other DBMSes? - -Now about implicit savepoints. Backend will place them before -user statements execution. In the case of failure, transaction -state will be rolled back to the one before execution of query. 
-As side-effect, this means that we'll get rid of complaints -about entire transaction abort in the case of mistyping -causing abort due to parser errors... - -Comments? - -Vadim - - diff --git a/doc/TODO.detail/outer b/doc/TODO.detail/outer deleted file mode 100644 index 99eab30d36..0000000000 --- a/doc/TODO.detail/outer +++ /dev/null @@ -1,392 +0,0 @@ -From lockhart@alumni.caltech.edu Thu Jan 7 13:31:08 1999 -Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA07771 - for ; Thu, 7 Jan 1999 13:31:06 -0500 (EST) -Received: from golem.jpl.nasa.gov (IDENT:root@hectic-2.jpl.nasa.gov [128.149.68.204]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id NAA14597 for ; Thu, 7 Jan 1999 13:27:37 -0500 (EST) -Received: from alumni.caltech.edu (localhost [127.0.0.1]) - by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id SAA13416; - Thu, 7 Jan 1999 18:26:56 GMT -Sender: tgl@mythos.jpl.nasa.gov -Message-ID: <3694FC70.FAD67BC3@alumni.caltech.edu> -Date: Thu, 07 Jan 1999 18:26:56 +0000 -From: "Thomas G. Lockhart" -Organization: Caltech/JPL -X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.30 i686) -MIME-Version: 1.0 -To: Bruce Momjian -CC: Postgres Hackers List -Subject: Outer Joins (and need CASE help) -References: <199901071747.MAA07054@candle.pha.pa.us> -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -Status: RO - -> Thomas, do you need help on outer joins? - -Yes. I'm going slowly partly because I get distracted with other -Postgres stuff like docs, and partly because I don't understand all of -the pieces I'm working with. - -I've identified the place in the MergeJoin code where the null filling -for outer joins needs to happen, and have the "merge walk" code done. -But I don't have the supporting code which actually would know how to -null-fill a result tuple from the left or right. I thought you might be -interested in that? 
- -I've done some work in the parser, and can now do things like: - -postgres=> select * from t1 join t2 using (i); -NOTICE: JOIN not yet implemented -i|j|i|k --+-+-+- -1|2|1|3 -(1 row) - -But this is just an inner join, and the result isn't quite right since -the second "i" column should probably be omitted. At the moment I -transform it from the syntax above into existing parse nodes, and -everything from there on works. - -I don't yet pass an explicit join node into the planner/optimizer, and -that will be the hardest part I assume. Perhaps we can work on that -together. - -So, what I'll try to do (soon, in the next few days?) is put in - - #ifdef ENABLE_OUTER_JOINS - -conditional code into the parser area (already there for the executor) -and commit everything to the development tree. Does that sound OK? - -Oh, and if anyone is looking for something to do, I've got a couple of -CASE statements in the case.sql regression test which are commented out -because they crash the backend. They involve references to multiple -tables within a single result column, and in other contexts that -construct works. It would be great if someone had time to track it -down... - - - Tom - -From lockhart@alumni.caltech.edu Mon Feb 22 02:01:13 1999 -Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA22073 - for ; Mon, 22 Feb 1999 02:01:12 -0500 (EST) -Received: from golem.jpl.nasa.gov (IDENT:root@hectic-2.jpl.nasa.gov [128.149.68.204]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id BAA26054 for ; Mon, 22 Feb 1999 01:57:00 -0500 (EST) -Received: from alumni.caltech.edu (localhost [127.0.0.1]) - by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id GAA04715; - Mon, 22 Feb 1999 06:56:36 GMT -Sender: tgl@mythos.jpl.nasa.gov -Message-ID: <36D0FFA4.32ADB75C@alumni.caltech.edu> -Date: Mon, 22 Feb 1999 06:56:36 +0000 -From: "Thomas G. 
Lockhart" -Organization: Caltech/JPL -X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.36 i686) -MIME-Version: 1.0 -To: Bruce Momjian -CC: hackers@postgreSQL.org -Subject: Re: start on outer join -References: <199902220304.WAA10066@candle.pha.pa.us> -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -Status: ROr - -Bruce Momjian wrote: -> -> > Will apply ... some other changes laying a bit of -> > groundwork for outer joins so you can start on the planner/optimizer -> > parts :) -> Those will be a synch now that I understand the optimizer. In fact, I -> think it all will happen in the executor. - -I've modified executor/nodeMergeJoin.c to walk a left/right/both outer -join, but didn't fill in the part which actually creates the result -tuple (which will be the current left- or right-side tuple plus nulls -for filler). I hope this is up your alley :) - -So far, I'm not certain what to pass to the planner. The syntax leads me -to pass a select structure from gram.y with a "JoinExpr" structure in -the "fromClause" list. I need to expand that with a combination of -column names and qualifications, but at the time I see the JoinExpr I -don't have access to the top query structure itself. So I may just keep -a modestly transformed JoinExpr to expand later or to pass to the -planner. - -btw, the EXCEPT/INTERSECT stuff from Stefan has some ugliness in gram.y -which needs to be fixed (the shift/reduce conflict is not acceptable for -our release version) and some of that code clearly needs to move to -analyze.c or some other module. 
- - - Tom - -From maillist Wed Feb 24 05:27:08 1999 -Received: (from maillist@localhost) - by candle.pha.pa.us (8.9.0/8.9.0) id FAA09648; - Wed, 24 Feb 1999 05:27:08 -0500 (EST) -From: Bruce Momjian -Message-Id: <199902241027.FAA09648@candle.pha.pa.us> -Subject: Re: [HACKERS] OUTER joins -In-Reply-To: <199902240953.EAA08561@candle.pha.pa.us> from Bruce Momjian at "Feb 24, 1999 4:53:21 am" -To: maillist@candle.pha.pa.us (Bruce Momjian) -Date: Wed, 24 Feb 1999 05:27:07 -0500 (EST) -Cc: lockhart@alumni.caltech.edu, hackers@postgreSQL.org -X-Mailer: ELM [version 2.4ME+ PL47 (25)] -MIME-Version: 1.0 -Content-Type: text/plain; charset=US-ASCII -Content-Transfer-Encoding: 7bit -Status: RO - -> -> How do you propose doing outer joins in non-mergejoin situations? -> Mergejoins can only be used currently in equal joins. - -Is your solution going to be to make sure the OUTER table is always a -MergeJoin, or on the outside of a join loop? That could work. - -That could get tricky if the table is joined to _two_ other tables. -With the cleaned-up optimizer, we can disable non-merge joins in certain -circumstances, and prevent OUTER tables from being inner in the others. -Is that the plan? - --- - Bruce Momjian | http://www.op.net/~candle - maillist@candle.pha.pa.us | (610) 853-3000 - + If your life is a hard drive, | 830 Blythe Avenue - + Christ can be your backup. 
| Drexel Hill, Pennsylvania 19026 - -From lockhart@alumni.caltech.edu Mon Mar 1 13:01:08 1999 -Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA21672 - for ; Mon, 1 Mar 1999 13:01:06 -0500 (EST) -Received: from golem.jpl.nasa.gov (IDENT:root@hectic-2.jpl.nasa.gov [128.149.68.204]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id MAA12756 for ; Mon, 1 Mar 1999 12:14:16 -0500 (EST) -Received: from alumni.caltech.edu (localhost [127.0.0.1]) - by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id RAA09406; - Mon, 1 Mar 1999 17:10:49 GMT -Sender: tgl@mythos.jpl.nasa.gov -Message-ID: <36DACA19.E6DBE7D8@alumni.caltech.edu> -Date: Mon, 01 Mar 1999 17:10:49 +0000 -From: "Thomas G. Lockhart" -Organization: Caltech/JPL -X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.36 i686) -MIME-Version: 1.0 -To: Bruce Momjian -CC: PostgreSQL-development -Subject: Re: OUTER joins -References: <199902240953.EAA08561@candle.pha.pa.us> -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -Status: ROr - -(back from a short vacation...) - -> How do you propose doing outer joins in non-mergejoin situations? -> Mergejoins can only be used currently in equal joins. - -Hadn't thought about it, other than figuring that implementing the -equi-join first was a good start. There is a class of outer join syntax -(the USING clause) which is implicitly an equi-join... 
- - - Tom - -From lockhart@alumni.caltech.edu Mon Mar 8 21:55:02 1999 -Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA15978 - for ; Mon, 8 Mar 1999 21:54:57 -0500 (EST) -Received: from golem.jpl.nasa.gov (IDENT:root@hectic-1.jpl.nasa.gov [128.149.68.203]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id VAA15837 for ; Mon, 8 Mar 1999 21:48:33 -0500 (EST) -Received: from alumni.caltech.edu (localhost [127.0.0.1]) - by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id CAA06996; - Tue, 9 Mar 1999 02:46:40 GMT -Sender: tgl@mythos.jpl.nasa.gov -Message-ID: <36E48B90.F3E902B7@alumni.caltech.edu> -Date: Tue, 09 Mar 1999 02:46:40 +0000 -From: "Thomas G. Lockhart" -Organization: Caltech/JPL -X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.36 i686) -MIME-Version: 1.0 -To: Bruce Momjian -CC: hackers@postgreSQL.org -Subject: Re: OUTER joins -References: <199903070325.WAA10357@candle.pha.pa.us> -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -Status: ROr - -> > Hadn't thought about it, other than figuring that implementing the -> > equi-join first was a good start. There is a class of outer join -> > syntax (the USING clause) which is implicitly an equi-join... -> Not that easy. You don't automatically get a mergejoin from an -> equijoin. I will have to force outer's to be either mergejoins, or -> inners of non-merge joins. Can you add code to non-merge joins in the -> executor to throw out a null row if it does not find an inner match -> for the outer row, and I will handle the optimizer so it doesn't throw -> a non-conforming plan to the executor. - -So far I don't have enough info in the parser to get the -planner/optimizer going. Should we work from the front to the back, or -should I go ahead and look at the non-merge joins? It's painfully -obvious that I don't know anything about the middle parts of this to -proceed without lots more research. 
- - - Tom - -From lockhart@alumni.caltech.edu Tue Mar 9 22:47:57 1999 -Received: from golem.jpl.nasa.gov (IDENT:root@hectic-1.jpl.nasa.gov [128.149.68.203]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07869 - for ; Tue, 9 Mar 1999 22:47:54 -0500 (EST) -Received: from alumni.caltech.edu (localhost [127.0.0.1]) - by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id DAA14761; - Wed, 10 Mar 1999 03:46:43 GMT -Sender: tgl@mythos.jpl.nasa.gov -Message-ID: <36E5EB23.F5CD959B@alumni.caltech.edu> -Date: Wed, 10 Mar 1999 03:46:43 +0000 -From: "Thomas G. Lockhart" -Organization: Caltech/JPL -X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.36 i686) -MIME-Version: 1.0 -To: Bruce Momjian , tgl@mythos.jpl.nasa.gov -Subject: Re: SQL outer -References: <199903100112.UAA05772@candle.pha.pa.us> -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -Status: RO - -> select * -> from outer tab1, tab2, tab3 -> where tab1.col1 = tab2.col1 and -> tab1.col1 = tab3.col1 - -select * -from t1 left join t2 using (c1) - join t3 on (c1 = t3.c1) - -Result: -t1.c1 t1.c2 t2.c2 t3.c1 -2 12 NULL 32 - -t1: -c1 c2 -1 11 -2 12 -3 13 -4 14 - -t2: -c1 c2 -1 21 -3 23 - -t3: -c1 c2 -2 32 - -From lockhart@alumni.caltech.edu Wed Mar 10 10:48:54 1999 -Received: from golem.jpl.nasa.gov (IDENT:root@hectic-1.jpl.nasa.gov [128.149.68.203]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA16741 - for ; Wed, 10 Mar 1999 10:48:51 -0500 (EST) -Received: from alumni.caltech.edu (localhost [127.0.0.1]) - by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id PAA17723; - Wed, 10 Mar 1999 15:48:31 GMT -Sender: tgl@mythos.jpl.nasa.gov -Message-ID: <36E6944F.1F93B08@alumni.caltech.edu> -Date: Wed, 10 Mar 1999 15:48:31 +0000 -From: "Thomas G. 
Lockhart" -Organization: Caltech/JPL -X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.36 i686) -MIME-Version: 1.0 -To: Bruce Momjian -CC: Thomas Lockhart -Subject: Re: SQL outer -References: <199903100112.UAA05772@candle.pha.pa.us> <36E5EB23.F5CD959B@alumni.caltech.edu> -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -Status: ROr - -Just thinking... - -If the initial RelOptInfo groupings are derived from the WHERE clause -expressions, how about marking the "outer" property in those expressions -in the parser? istm that is where the parser knows about two tables in -one place, and I'm generating those expressions anyway. We could add a -field(s) to the expression structure, or pass along a slightly different -structure... - - - Tom - -From owner-pgsql-hackers@hub.org Wed Jul 21 02:35:13 1999 -Received: from hub.org (hub.org [216.126.84.1]) - by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA13837 - for ; Wed, 21 Jul 1999 02:35:12 -0400 (EDT) -Received: from hub.org (hub.org [216.126.84.1]) - by hub.org (8.9.3/8.9.3) with ESMTP id CAA88539; - Wed, 21 Jul 1999 02:27:41 -0400 (EDT) - (envelope-from owner-pgsql-hackers@hub.org) -Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 21 Jul 1999 02:24:08 +0000 (EDT) -Received: (from majordom@localhost) - by hub.org (8.9.3/8.9.3) id CAA87850 - for pgsql-hackers-outgoing; Wed, 21 Jul 1999 02:23:13 -0400 (EDT) - (envelope-from owner-pgsql-hackers@postgreSQL.org) -Received: from localhost (IDENT:root@hectic-2.jpl.nasa.gov [128.149.68.204]) - by hub.org (8.9.3/8.9.3) with ESMTP id CAA87810 - for ; Wed, 21 Jul 1999 02:22:52 -0400 (EDT) - (envelope-from lockhart@alumni.caltech.edu) -Received: from alumni.caltech.edu (lockhart@localhost [127.0.0.1]) - by localhost (8.8.7/8.8.7) with ESMTP id GAA14480; - Wed, 21 Jul 1999 06:20:22 GMT -Message-ID: <379566A6.A4CDF97F@alumni.caltech.edu> -Date: Wed, 21 Jul 1999 06:20:22 +0000 -From: Thomas Lockhart -X-Mailer: Mozilla 4.6 [en] (X11; I; 
Linux 2.0.36 i686) -X-Accept-Language: en -MIME-Version: 1.0 -To: Tom Lane -CC: Bruce Momjian , pgsql-hackers@postgreSQL.org -Subject: Re: [HACKERS] Another reason to redesign querytree representation -References: <591.932505751@sss.pgh.pa.us> -Content-Type: text/plain; charset=us-ascii -Content-Transfer-Encoding: 7bit -Sender: owner-pgsql-hackers@postgreSQL.org -Precedence: bulk -Status: RO - -> Thomas, what do you think is needed for outer joins? - -Bruce and I have talked about it some already: - -For outer joins, tables must be combined in a particular order. For -example, a left outer join requires that any entries in the left-side -table which do not have a corresponding entry in the right-side table -be expanded with nulls during the join. The information on the outer -join can't be carried by the rte since the same table can appear twice -in an outer join expression: - - select * from t1 left join t2 using (i) - left join t1 on (i = t1.j); - -For a query like - - select * from t1 left join t2 using (i) where t2.j = 3; - -istm that the outer join must be done before the t2 qualification is -applied, and that another ordering may produce the wrong result. - ->From what I understand Bruce to say, the planner/optimizer is allowed -to try all kinds of permutations of plans, choosing the one with the -lowest cost. But if the info for the join is carried in a -qualification node, then the planner/optimizer must know that it can't -reorder the query as freely as it does now. - -I was thinking of having a new qualification node to carry this info, -and it could be transformed into a mergejoin node which has a couple -of new fields indicating left and/or right outer join behavior. - -A hashjoin method may be possible for queries which are structured as -a left outer join; other outer joins will need to use the mergejoin -method. Also, some poorly-qualified outer joins reduce to inner joins, -and perhaps the optimizer can be smart enough to realize this. 
- - - Thomas - --- -Thomas Lockhart lockhart@alumni.caltech.edu -South Pasadena, California - - -- 2.40.0