From: Bruce Momjian Date: Thu, 12 Feb 2004 18:13:29 +0000 (+0000) Subject: Add bitmap discussion to performance TODO.detail. X-Git-Tag: REL8_0_0BETA1~1185 X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=6a13bdd8f347e4b044307d689e64179e4728d2f3;p=postgresql Add bitmap discussion to performance TODO.detail. --- diff --git a/doc/TODO.detail/performance b/doc/TODO.detail/performance index 2fbfabe4cb..90397ba2bc 100644 --- a/doc/TODO.detail/performance +++ b/doc/TODO.detail/performance @@ -345,7 +345,7 @@ From owner-pgsql-hackers@hub.org Tue Oct 19 10:31:10 1999 Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA29087 for ; Tue, 19 Oct 1999 10:31:08 -0400 (EDT) -Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.16 $) with ESMTP id KAA27535 for ; Tue, 19 Oct 1999 10:19:47 -0400 (EDT) +Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.17 $) with ESMTP id KAA27535 for ; Tue, 19 Oct 1999 10:19:47 -0400 (EDT) Received: from localhost (majordom@localhost) by hub.org (8.9.3/8.9.3) with SMTP id KAA30328; Tue, 19 Oct 1999 10:12:10 -0400 (EDT) @@ -454,7 +454,7 @@ From owner-pgsql-hackers@hub.org Tue Oct 19 21:25:30 1999 Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA28130 for ; Tue, 19 Oct 1999 21:25:26 -0400 (EDT) -Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.16 $) with ESMTP id VAA10512 for ; Tue, 19 Oct 1999 21:15:28 -0400 (EDT) +Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.17 $) with ESMTP id VAA10512 for ; Tue, 19 Oct 1999 21:15:28 -0400 (EDT) Received: from localhost (majordom@localhost) by hub.org (8.9.3/8.9.3) with SMTP id VAA50745; Tue, 19 Oct 1999 21:07:23 -0400 (EDT) @@ -1006,7 +1006,7 @@ From pgsql-general-owner+M2497@hub.org Fri Jun 16 18:31:03 2000 Received: from renoir.op.net 
(root@renoir.op.net [207.29.195.4]) by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA04165 for ; Fri, 16 Jun 2000 17:31:01 -0400 (EDT) -Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.16 $) with ESMTP id RAA13110 for ; Fri, 16 Jun 2000 17:20:12 -0400 (EDT) +Received: from hub.org (root@hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.17 $) with ESMTP id RAA13110 for ; Fri, 16 Jun 2000 17:20:12 -0400 (EDT) Received: from hub.org (majordom@localhost [127.0.0.1]) by hub.org (8.10.1/8.10.1) with SMTP id e5GLDaM14477; Fri, 16 Jun 2000 17:13:36 -0400 (EDT) @@ -3264,3 +3264,1893 @@ Bruce Momjian wrote: > > > saying this is a major issue for PostgreSQL but the numbers would be > > > interesting. +From pgsql-hackers-owner+M49418=pgman=candle.pha.pa.us@postgresql.org Tue Jan 27 15:52:28 2004 +Return-path: +Received: from vm2.hub.org ([200.46.204.60]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0RKqPe07814 + for ; Tue, 27 Jan 2004 15:52:28 -0500 (EST) +Received: from postgresql.org (svr1.postgresql.org [200.46.204.71]) + by vm2.hub.org (Postfix) with ESMTP id 70DC3CD397A + for ; Tue, 27 Jan 2004 20:52:19 +0000 (GMT) +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (neptune.hub.org [200.46.204.2]) + by svr1.postgresql.org (Postfix) with ESMTP id A93D7D1D3A4 + for ; Tue, 27 Jan 2004 20:41:43 +0000 (GMT) +Received: from svr1.postgresql.org ([200.46.204.71]) + by localhost (neptune.hub.org [200.46.204.2]) (amavisd-new, port 10024) + with ESMTP id 54186-02 + for ; + Tue, 27 Jan 2004 16:41:12 -0400 (AST) +Received: from smtp.istop.com (dci.doncaster.on.ca [66.11.168.194]) + by svr1.postgresql.org (Postfix) with ESMTP id 33243D1E1F2 + for ; Tue, 27 Jan 2004 16:36:24 -0400 (AST) +Received: from stark.xeocode.com (gsstark.mtl.istop.com [66.11.160.162]) + by smtp.istop.com (Postfix) with ESMTP + id 2A41136C44; Tue, 27 Jan 2004 15:36:21 -0500 (EST) +Received: from localhost 
([127.0.0.1] helo=stark.xeocode.com) + by stark.xeocode.com with smtp (Exim 3.36 #1 (Debian)) + id 1AlZwa-0006sL-00; Tue, 27 Jan 2004 15:36:20 -0500 +To: pgsql-hackers@postgresql.org +Subject: [HACKERS] Question about indexes +From: Greg Stark +Organization: The Emacs Conspiracy; member since 1992 +Date: 27 Jan 2004 15:36:20 -0500 +Message-ID: <87isixt9h7.fsf@stark.xeocode.com> +Lines: 9 +User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3 +MIME-Version: 1.0 +Content-Type: text/plain; charset=us-ascii +X-Virus-Scanned: by amavisd-new at postgresql.org +X-Mailing-List: pgsql-hackers +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on + candle.pha.pa.us +X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham + version=2.61 +Status: OR + + +How feasible would it be to have a btree index on ctid? I'm thinking it ought +to work simply enough for the normal case of insert/delet/update, but I'm not +completely certain how vacuum, vacuum full, and cluster would interact. + +You may think this would be utterly useless, but I have a cunning plan. 
+ +-- +greg + + +---------------------------(end of broadcast)--------------------------- +TIP 8: explain analyze is your friend + +From pgsql-hackers-owner+M49439=pgman=candle.pha.pa.us@postgresql.org Tue Jan 27 18:01:59 2004 +Return-path: +Received: from bricolage.postgresql.org ([200.46.204.116]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0RN1we27517 + for ; Tue, 27 Jan 2004 18:01:59 -0500 (EST) +Received: from postgresql.org (svr1.postgresql.org [200.46.204.71]) + by bricolage.postgresql.org (Postfix) with ESMTP id 946B3148343C + for ; Tue, 27 Jan 2004 23:01:52 +0000 (GMT) +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (neptune.hub.org [200.46.204.2]) + by svr1.postgresql.org (Postfix) with ESMTP id 778CED1D362 + for ; Tue, 27 Jan 2004 22:52:27 +0000 (GMT) +Received: from svr1.postgresql.org ([200.46.204.71]) + by localhost (neptune.hub.org [200.46.204.2]) (amavisd-new, port 10024) + with ESMTP id 09353-02 + for ; + Tue, 27 Jan 2004 18:51:56 -0400 (AST) +Received: from sss.pgh.pa.us (unknown [192.204.191.242]) + by svr1.postgresql.org (Postfix) with ESMTP id 5C5D5D1B47D + for ; Tue, 27 Jan 2004 18:51:55 -0400 (AST) +Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) + by sss.pgh.pa.us (8.12.10/8.12.10) with ESMTP id i0RMpunX029816; + Tue, 27 Jan 2004 17:51:56 -0500 (EST) +To: Greg Stark +cc: pgsql-hackers@postgresql.org +Subject: Re: [HACKERS] Question about indexes +In-Reply-To: <87isixt9h7.fsf@stark.xeocode.com> +References: <87isixt9h7.fsf@stark.xeocode.com> +Comments: In-reply-to Greg Stark + message dated "27 Jan 2004 15:36:20 -0500" +Date: Tue, 27 Jan 2004 17:51:56 -0500 +Message-ID: <29815.1075243916@sss.pgh.pa.us> +From: Tom Lane +X-Virus-Scanned: by amavisd-new at postgresql.org +X-Mailing-List: pgsql-hackers +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on + candle.pha.pa.us +X-Spam-Status: No, 
hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham + version=2.61 +Status: OR + +Greg Stark writes: +> How feasible would it be to have a btree index on ctid? + +Why would you want one? Direct access by ctid beats out an index lookup +every time. In any case, vacuum and friends would break such an index +entirely. + + regards, tom lane + +---------------------------(end of broadcast)--------------------------- +TIP 3: if posting/reading through Usenet, please send an appropriate + subscribe-nomail command to majordomo@postgresql.org so that your + message can get through to the mailing list cleanly + +From pgsql-hackers-owner+M49440=pgman=candle.pha.pa.us@postgresql.org Tue Jan 27 18:19:13 2004 +Return-path: +Received: from krusty-motorsports.com (IDENT:exim@krusty-motorsports.com [192.94.170.8]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0RNJCe00301 + for ; Tue, 27 Jan 2004 18:19:13 -0500 (EST) +Received: from [200.46.204.71] (helo=postgresql.org) + by krusty-motorsports.com with esmtp (Exim 4.22) + id 1AldQ9-0007JC-2z + for pgman@candle.pha.pa.us; Wed, 28 Jan 2004 00:19:05 +0000 +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (neptune.hub.org [200.46.204.2]) + by svr1.postgresql.org (Postfix) with ESMTP id 6D641D1D54A + for ; Tue, 27 Jan 2004 23:12:01 +0000 (GMT) +Received: from svr1.postgresql.org ([200.46.204.71]) + by localhost (neptune.hub.org [200.46.204.2]) (amavisd-new, port 10024) + with ESMTP id 14466-06 + for ; + Tue, 27 Jan 2004 19:11:30 -0400 (AST) +Received: from smtp.istop.com (dci.doncaster.on.ca [66.11.168.194]) + by svr1.postgresql.org (Postfix) with ESMTP id 6D58FD1D49E + for ; Tue, 27 Jan 2004 19:11:29 -0400 (AST) +Received: from stark.xeocode.com (gsstark.mtl.istop.com [66.11.160.162]) + by smtp.istop.com (Postfix) with ESMTP + id 9B74536ADA; Tue, 27 Jan 2004 18:11:31 -0500 (EST) +Received: from localhost ([127.0.0.1] helo=stark.xeocode.com) + by stark.xeocode.com with smtp (Exim 3.36 
#1 (Debian)) + id 1AlcMl-0007Tk-00; Tue, 27 Jan 2004 18:11:31 -0500 +To: Tom Lane +cc: Greg Stark , pgsql-hackers@postgresql.org +Subject: Re: [HACKERS] Question about indexes +References: <87isixt9h7.fsf@stark.xeocode.com> + <29815.1075243916@sss.pgh.pa.us> +In-Reply-To: <29815.1075243916@sss.pgh.pa.us> +From: Greg Stark +Organization: The Emacs Conspiracy; member since 1992 +Date: 27 Jan 2004 18:11:31 -0500 +Message-ID: <87d695t2ak.fsf@stark.xeocode.com> +Lines: 33 +User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3 +MIME-Version: 1.0 +Content-Type: text/plain; charset=us-ascii +X-Virus-Scanned: by amavisd-new at postgresql.org +X-Mailing-List: pgsql-hackers +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on + candle.pha.pa.us +X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham + version=2.61 +Status: OR + +Tom Lane writes: + +> Greg Stark writes: +> +> > How feasible would it be to have a btree index on ctid? +> +> Why would you want one? Direct access by ctid beats out an index lookup +> every time. + +Of course. But as I mentioned, I have a cunning plan. + +If you have two indexes (a,ctid) and (b,ctid) and do a query where a=1 and b=2 +then it would be particularly easy to combine the two efficiently. + +If specially marked btree indexes -- or even all btree indexes -- implicitly +had ctid as a final sort order after all the index column, then it would +esentially obviate the need for bitmap indexes. They wouldn't have the space +advantage, but they would be possible to combine using arbitrary boolean +expressions without looking at the actual tuples. + +This is essentially what is in the TODO about using bitmaps, but without +having to do any extra sorts. + +This would only really be an advantage for particularly wide tables where the +combination of boolean clauses narrows the result set down a lot more than any +one clause. 
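A minimal sketch of the combination step proposed above, assuming each index scan for an equality condition returns ctids as (block, offset) pairs already in ascending order. The name intersect_sorted and the sample scan results are hypothetical and are not PostgreSQL code; the sketch only illustrates why pre-sorted ctids would allow an AND of two indexes without an extra sort and without visiting the heap.

```python
from typing import Iterable, Iterator, Tuple

Ctid = Tuple[int, int]  # (block number, offset within block)


def intersect_sorted(a: Iterable[Ctid], b: Iterable[Ctid]) -> Iterator[Ctid]:
    """Yield ctids present in both inputs; both must be sorted ascending."""
    it_a, it_b = iter(a), iter(b)
    x, y = next(it_a, None), next(it_b, None)
    while x is not None and y is not None:
        if x == y:
            # Same tuple found via both indexes: it satisfies a=1 AND b=2.
            yield x
            x, y = next(it_a, None), next(it_b, None)
        elif x < y:
            # Advance whichever stream is behind, as in a merge join.
            x = next(it_a, None)
        else:
            y = next(it_b, None)


# Hypothetical ctid streams from index scans for a=1 and b=2.
a_eq_1 = [(0, 3), (0, 7), (2, 1), (5, 4)]
b_eq_2 = [(0, 7), (1, 2), (2, 1), (9, 9)]
matches = list(intersect_sorted(a_eq_1, b_eq_2))
```

The merge relies entirely on both streams arriving in ctid order, which is the property the proposal would add to btree indexes for equality conditions.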
+ +> In any case, vacuum and friends would break such an index entirely. + +That was what I was afraid of. + +-- +greg + + +---------------------------(end of broadcast)--------------------------- +TIP 5: Have you checked our extensive FAQ? + + http://www.postgresql.org/docs/faqs/FAQ.html + +From pgsql-hackers-owner+M49442=pgman=candle.pha.pa.us@postgresql.org Tue Jan 27 18:32:25 2004 +Return-path: +Received: from vm2.hub.org ([200.46.204.60]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0RNWNe02539 + for ; Tue, 27 Jan 2004 18:32:24 -0500 (EST) +Received: from postgresql.org (svr1.postgresql.org [200.46.204.71]) + by vm2.hub.org (Postfix) with ESMTP id DC003CD49A4 + for ; Tue, 27 Jan 2004 23:32:17 +0000 (GMT) +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (neptune.hub.org [200.46.204.2]) + by svr1.postgresql.org (Postfix) with ESMTP id 34466D1D17D + for ; Tue, 27 Jan 2004 23:25:11 +0000 (GMT) +Received: from svr1.postgresql.org ([200.46.204.71]) + by localhost (neptune.hub.org [200.46.204.2]) (amavisd-new, port 10024) + with ESMTP id 20117-05 + for ; + Tue, 27 Jan 2004 19:24:41 -0400 (AST) +Received: from sss.pgh.pa.us (unknown [192.204.191.242]) + by svr1.postgresql.org (Postfix) with ESMTP id 33E28D1D548 + for ; Tue, 27 Jan 2004 19:24:40 -0400 (AST) +Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) + by sss.pgh.pa.us (8.12.10/8.12.10) with ESMTP id i0RNOfnX000404; + Tue, 27 Jan 2004 18:24:41 -0500 (EST) +To: Greg Stark +cc: pgsql-hackers@postgresql.org +Subject: Re: [HACKERS] Question about indexes +In-Reply-To: <87d695t2ak.fsf@stark.xeocode.com> +References: <87isixt9h7.fsf@stark.xeocode.com> <29815.1075243916@sss.pgh.pa.us> <87d695t2ak.fsf@stark.xeocode.com> +Comments: In-reply-to Greg Stark + message dated "27 Jan 2004 18:11:31 -0500" +Date: Tue, 27 Jan 2004 18:24:41 -0500 +Message-ID: <403.1075245881@sss.pgh.pa.us> +From: Tom Lane +X-Virus-Scanned: by amavisd-new at postgresql.org 
+X-Mailing-List: pgsql-hackers +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on + candle.pha.pa.us +X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham + version=2.61 +Status: OR + +Greg Stark writes: +> If you have two indexes (a,ctid) and (b,ctid) and do a query where a=1 and b=2 +> then it would be particularly easy to combine the two efficiently. + +> If specially marked btree indexes -- or even all btree indexes -- implicitly +> had ctid as a final sort order after all the index column, then it would +> esentially obviate the need for bitmap indexes. + +I don't think so. You are thinking only of exact-equality queries --- +as soon as the WHERE clause describes a range of index entries, the +readout wouldn't be sorted by ctid anyway. + +Combining indexes via a bitmap intermediate step (which is not really +the same thing as bitmap indexes, IIUC) seems like a more robust +approach than relying on the index entries to be in ctid order. + +But if we did want to sort indexes that way, we could do it today, +I think. The ctid is already stored in index entries (it is the +"payload" remember...) and we could use it as a tiebreaker when +determining insertion position. This doesn't have the problems that +putting ctid into the user columns would do, because the system knows +about that ctid as being special; the difficulty with ctid in the user +columns is the code not knowing that it'd need to change on a tuple move. + + regards, tom lane + +---------------------------(end of broadcast)--------------------------- +TIP 5: Have you checked our extensive FAQ? 
+ + http://www.postgresql.org/docs/faqs/FAQ.html + +From pgsql-hackers-owner+M49450=pgman=candle.pha.pa.us@postgresql.org Tue Jan 27 21:28:20 2004 +Return-path: +Received: from postgresql.wavefire.com (postgresql.wavefire.com [64.141.14.48]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0S2SIe29755 + for ; Tue, 27 Jan 2004 21:28:19 -0500 (EST) +Received: from postgresql.org ([200.46.204.71]) + by postgresql.wavefire.com (8.9.3/8.9.3) with ESMTP id TBM02845 + for ; Tue, 27 Jan 2004 19:06:45 -0800 (PST) + (envelope-from pgsql-hackers-owner+M49450=pgman=candle.pha.pa.us@postgresql.org) +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (neptune.hub.org [200.46.204.2]) + by svr1.postgresql.org (Postfix) with ESMTP id 6213BD1B85F + for ; Wed, 28 Jan 2004 02:19:56 +0000 (GMT) +Received: from svr1.postgresql.org ([200.46.204.71]) + by localhost (neptune.hub.org [200.46.204.2]) (amavisd-new, port 10024) + with ESMTP id 69438-06 + for ; + Tue, 27 Jan 2004 22:19:26 -0400 (AST) +Received: from smtp.istop.com (dci.doncaster.on.ca [66.11.168.194]) + by svr1.postgresql.org (Postfix) with ESMTP id 1964FD1B47D + for ; Tue, 27 Jan 2004 22:19:24 -0400 (AST) +Received: from stark.xeocode.com (gsstark.mtl.istop.com [66.11.160.162]) + by smtp.istop.com (Postfix) with ESMTP + id BE92136B37; Tue, 27 Jan 2004 21:19:26 -0500 (EST) +Received: from localhost ([127.0.0.1] helo=stark.xeocode.com) + by stark.xeocode.com with smtp (Exim 3.36 #1 (Debian)) + id 1AlfIc-00084d-00; Tue, 27 Jan 2004 21:19:26 -0500 +To: Tom Lane +cc: Greg Stark , pgsql-hackers@postgresql.org +Subject: Re: [HACKERS] Question about indexes +References: <87isixt9h7.fsf@stark.xeocode.com> + <29815.1075243916@sss.pgh.pa.us> <87d695t2ak.fsf@stark.xeocode.com> + <403.1075245881@sss.pgh.pa.us> +In-Reply-To: <403.1075245881@sss.pgh.pa.us> +From: Greg Stark +Organization: The Emacs Conspiracy; member since 1992 +Date: 27 Jan 2004 21:19:26 -0500 +Message-ID: 
<877jzcu85t.fsf@stark.xeocode.com> +Lines: 43 +User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3 +MIME-Version: 1.0 +Content-Type: text/plain; charset=us-ascii +X-Virus-Scanned: by amavisd-new at postgresql.org +X-Mailing-List: pgsql-hackers +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on + candle.pha.pa.us +X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham + version=2.61 +Status: OR + + +Tom Lane writes: + +> I don't think so. You are thinking only of exact-equality queries --- +> as soon as the WHERE clause describes a range of index entries, the +> readout wouldn't be sorted by ctid anyway. + +But then even bitmap indexes would fail in that way too, or at least have a +lot of extra cost that would have to be taken into account based on the number +of values in the range. + +> Combining indexes via a bitmap intermediate step (which is not really +> the same thing as bitmap indexes, IIUC) seems like a more robust +> approach than relying on the index entries to be in ctid order. + +I would see that as the next step, But it seems to me it would be only a small +set of queries where it would really help enough to outweigh the extra work of +the sort. Whereas if the ctid is already pre-sorted then the extra cost is +fairly low. Sort of like the difference in cost between a merge join where +both sides have to be sorted and a merge join where both sides are pre-sorted. + +> But if we did want to sort indexes that way, we could do it today, +> I think. The ctid is already stored in index entries (it is the +> "payload" remember...) and we could use it as a tiebreaker when +> determining insertion position. This doesn't have the problems that +> putting ctid into the user columns would do, because the system knows +> about that ctid as being special; the difficulty with ctid in the user +> columns is the code not knowing that it'd need to change on a tuple move. 
+ +That's exactly what I was thinking. I just don't know how badly it would +complicate the vacuum{,full}/cluster code and whether those are the only cases +to worry about. + + +Note that the space saving of bitmap indexes is still a substantial factor. +Using btree indexes the i/o costs of doing multiple index scans plus a table +scan of the relevant pages would still be quite substantial. So this doesn't +completely obviate the need for bitmap indexes, but I think it would remove a +lot of the pressure from people who just need them to handle a few select +queries. + +-- +greg + + +---------------------------(end of broadcast)--------------------------- +TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org + +From pgsql-hackers-owner+M49453=pgman=candle.pha.pa.us@postgresql.org Tue Jan 27 21:53:09 2004 +Return-path: +Received: from joeconway.com (66-146-172-86.skyriver.net [66.146.172.86]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0S2r3e04133 + for ; Tue, 27 Jan 2004 21:53:08 -0500 (EST) +Received: from postgresql.org ([200.46.204.71] verified) + by joeconway.com (CommuniGate Pro SMTP 4.1.8) + with ESMTP id 791556 for pgman@candle.pha.pa.us; Tue, 27 Jan 2004 18:49:49 -0800 +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (neptune.hub.org [200.46.204.2]) + by svr1.postgresql.org (Postfix) with ESMTP id C4A10D1B47D + for ; Wed, 28 Jan 2004 02:49:28 +0000 (GMT) +Received: from svr1.postgresql.org ([200.46.204.71]) + by localhost (neptune.hub.org [200.46.204.2]) (amavisd-new, port 10024) + with ESMTP id 76787-10 + for ; + Tue, 27 Jan 2004 22:48:59 -0400 (AST) +Received: from sss.pgh.pa.us (unknown [192.204.191.242]) + by svr1.postgresql.org (Postfix) with ESMTP id A5C5CD1B4DC + for ; Tue, 27 Jan 2004 22:48:56 -0400 (AST) +Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) + by sss.pgh.pa.us (8.12.11/8.12.11) with ESMTP id i0S2mxTx005814; + Tue, 27 Jan 2004 21:48:59 -0500 (EST) 
+To: Greg Stark +cc: pgsql-hackers@postgresql.org +Subject: Re: [HACKERS] Question about indexes +In-Reply-To: <877jzcu85t.fsf@stark.xeocode.com> +References: <87isixt9h7.fsf@stark.xeocode.com> <29815.1075243916@sss.pgh.pa.us> <87d695t2ak.fsf@stark.xeocode.com> <403.1075245881@sss.pgh.pa.us> <877jzcu85t.fsf@stark.xeocode.com> +Comments: In-reply-to Greg Stark + message dated "27 Jan 2004 21:19:26 -0500" +Date: Tue, 27 Jan 2004 21:48:59 -0500 +Message-ID: <5813.1075258139@sss.pgh.pa.us> +From: Tom Lane +X-Virus-Scanned: by amavisd-new at postgresql.org +X-Mailing-List: pgsql-hackers +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on + candle.pha.pa.us +X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham + version=2.61 +Status: OR + +Greg Stark writes: +>> Combining indexes via a bitmap intermediate step (which is not really +>> the same thing as bitmap indexes, IIUC) seems like a more robust +>> approach than relying on the index entries to be in ctid order. + +> I would see that as the next step, But it seems to me it would be only a small +> set of queries where it would really help enough to outweigh the extra work of +> the sort. + +What sort? The whole point of a bitmap is that it makes it easy to +visit the tuples in heap order. You scan the index, you set the +appropriate bits in the bitmap, and then you scan the bitmap and go to +the heap tuples that have their bits set. If you are using multiple +indexes you can AND or OR their results at the bitmap phase before you +go to the heap. + +An implementation of this kind would not produce tuples in index order, +so if you have an ORDER BY to satisfy then you end up doing an explicit +sort after you have the tuples. 
It would be up to the planner to +consider this cost versus the advantages of being able to use multiple +indexes; we'd certainly want to keep the existing scan mechanism as an +available alternative. But if the query is suited to multiple indexes +I suspect it'd be a win pretty often. + +> Note that the space saving of bitmap indexes is still a substantial factor. + +I think you are still confusing what I'm talking about with a bitmap +index, ie, a persistent structure on-disk. It's not that at all, but +a transient structure built in-memory during an index scan. + +I'm a little dubious that true bitmap indexes would be worth building +for Postgres. Seems like partial indexes cover the same sorts of +applications and are more flexible. + + regards, tom lane + +---------------------------(end of broadcast)--------------------------- +TIP 5: Have you checked our extensive FAQ? + + http://www.postgresql.org/docs/faqs/FAQ.html + +From pgsql-hackers-owner+M49462=pgman=candle.pha.pa.us@postgresql.org Wed Jan 28 13:10:48 2004 +Return-path: +Received: from joeconway.com (66-146-172-86.skyriver.net [66.146.172.86]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0SIAle25230 + for ; Wed, 28 Jan 2004 13:10:47 -0500 (EST) +Received: from postgresql.org ([200.46.204.71] verified) + by joeconway.com (CommuniGate Pro SMTP 4.1.8) + with ESMTP id 793300 for pgman@candle.pha.pa.us; Wed, 28 Jan 2004 10:07:34 -0800 +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (unknown [200.46.204.2]) + by svr1.postgresql.org (Postfix) with ESMTP id 19389D1CCAF + for ; Wed, 28 Jan 2004 17:56:46 +0000 (GMT) +Received: from svr1.postgresql.org ([200.46.204.71]) + by localhost (neptune.hub.org [200.46.204.2]) (amavisd-new, port 10024) + with ESMTP id 10780-09 + for ; + Wed, 28 Jan 2004 13:56:14 -0400 (AST) +Received: from www.postgresql.com (www.postgresql.com [200.46.204.209]) + by svr1.postgresql.org (Postfix) with ESMTP id A53DAD1DF6B + for ; 
Wed, 28 Jan 2004 13:52:13 -0400 (AST) +Received: from smtp.istop.com (dci.doncaster.on.ca [66.11.168.194]) + by www.postgresql.com (Postfix) with ESMTP id E0414CF6FBA + for ; Wed, 28 Jan 2004 10:47:17 -0400 (AST) +Received: from stark.xeocode.com (gsstark.mtl.istop.com [66.11.160.162]) + by smtp.istop.com (Postfix) with ESMTP + id C4D5036BA2; Wed, 28 Jan 2004 09:13:47 -0500 (EST) +Received: from localhost ([127.0.0.1] helo=stark.xeocode.com) + by stark.xeocode.com with smtp (Exim 3.36 #1 (Debian)) + id 1AlqRv-0001fZ-00; Wed, 28 Jan 2004 09:13:47 -0500 +To: Tom Lane +cc: Greg Stark , pgsql-hackers@postgresql.org +Subject: Re: [HACKERS] Question about indexes +References: <87isixt9h7.fsf@stark.xeocode.com> + <29815.1075243916@sss.pgh.pa.us> <87d695t2ak.fsf@stark.xeocode.com> + <403.1075245881@sss.pgh.pa.us> <877jzcu85t.fsf@stark.xeocode.com> + <5813.1075258139@sss.pgh.pa.us> +In-Reply-To: <5813.1075258139@sss.pgh.pa.us> +From: Greg Stark +Organization: The Emacs Conspiracy; member since 1992 +Date: 28 Jan 2004 09:13:47 -0500 +Message-ID: <871xpktb38.fsf@stark.xeocode.com> +Lines: 38 +User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3 +MIME-Version: 1.0 +Content-Type: text/plain; charset=us-ascii +X-Virus-Scanned: by amavisd-new at postgresql.org +X-Mailing-List: pgsql-hackers +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on + candle.pha.pa.us +X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham + version=2.61 +Status: OR + +Tom Lane writes: + +> Greg Stark writes: +> > +> > I would see that as the next step, But it seems to me it would be only a small +> > set of queries where it would really help enough to outweigh the extra work of +> > the sort. +> +> What sort? + +To build the in-memory bitmap you effectively have to do a sort. If the tuples +come out of the index in heap order then you can combine them without having +to go through that step. 
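For comparison, a minimal sketch of the transient in-memory bitmap described earlier in the thread: each index scan sets bits in whatever order the index returns tuples, the per-index bitmaps are ANDed, and the result is read out in heap order. It assumes a fixed number of tuple slots per heap page and uses a Python integer as the bit array. SLOTS_PER_PAGE, build_bitmap and heap_order are hypothetical names, not PostgreSQL code.

```python
SLOTS_PER_PAGE = 16  # assumed fixed slot count per heap page (illustrative)


def build_bitmap(ctids):
    """Set one bit per (block, offset); the input order does not matter."""
    bits = 0
    for block, offset in ctids:
        bits |= 1 << (block * SLOTS_PER_PAGE + offset)
    return bits


def heap_order(bits):
    """Yield (block, offset) for each set bit, lowest position first."""
    pos = 0
    while bits:
        if bits & 1:
            yield divmod(pos, SLOTS_PER_PAGE)
        bits >>= 1
        pos += 1


# Hypothetical index scan results, deliberately not in ctid order.
scan_a = [(5, 4), (0, 7), (2, 1), (0, 3)]
scan_b = [(9, 9), (2, 1), (0, 7), (1, 2)]
combined = build_bitmap(scan_a) & build_bitmap(scan_b)  # AND at the bitmap phase
result = list(heap_order(combined))
```

No explicit sort appears in this sketch: conceptually, setting a bit just indexes into the array, and reading the bits out lowest-first gives heap order. Whether a practical data structure keeps that cost profile is the open question in the exchange.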
+ +> I'm a little dubious that true bitmap indexes would be worth building +> for Postgres. Seems like partial indexes cover the same sorts of +> applications and are more flexible. + +I'm clear on the distinction. I think bitmap indexes still have a place, but +if regular btree indexes could be combined efficiently then that would be an +even narrower niche. + +Partial indexes are very handy, and they're useful in corner cases where +bitmap indexes are useful, such as flags for special types of records. + +But I think bitmap indexes are specifically wanted by certain types of data +warehousing applications where you have an index on virtually every column and +then want to do arbitrary boolean combinations of all of them. btree indexes +would generate more i/o scanning all the indexes than just doing a sequential +scan would. Whereas bitmap indexes are much denser on disk. + +However my experience leans more towards the OLTP side and I very rarely saw +applications like this. + + + +-- +greg + + +---------------------------(end of broadcast)--------------------------- +TIP 3: if posting/reading through Usenet, please send an appropriate + subscribe-nomail command to majordomo@postgresql.org so that your + message can get through to the mailing list cleanly + +From pgsql-hackers-owner+M49465=pgman=candle.pha.pa.us@postgresql.org Wed Jan 28 13:30:48 2004 +Return-path: +Received: from joeconway.com (66-146-172-86.skyriver.net [66.146.172.86]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0SIUke29027 + for ; Wed, 28 Jan 2004 13:30:47 -0500 (EST) +Received: from postgresql.org ([200.46.204.71] verified) + by joeconway.com (CommuniGate Pro SMTP 4.1.8) + with ESMTP id 793371 for pgman@candle.pha.pa.us; Wed, 28 Jan 2004 10:27:31 -0800 +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (unknown [200.46.204.2]) + by svr1.postgresql.org (Postfix) with ESMTP id 92005D1D3F7 + for ; Wed, 28 Jan 2004 18:14:02 +0000 (GMT) 
+Received: from svr1.postgresql.org ([200.46.204.71]) + by localhost (neptune.hub.org [200.46.204.2]) (amavisd-new, port 10024) + with ESMTP id 21680-08 + for ; + Wed, 28 Jan 2004 14:13:31 -0400 (AST) +Received: from www.postgresql.com (www.postgresql.com [200.46.204.209]) + by svr1.postgresql.org (Postfix) with ESMTP id 088B0D1DC77 + for ; Wed, 28 Jan 2004 14:08:44 -0400 (AST) +Received: from sss.pgh.pa.us (unknown [192.204.191.242]) + by www.postgresql.com (Postfix) with ESMTP id CFF50CF77BD + for ; Wed, 28 Jan 2004 11:00:42 -0400 (AST) +Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) + by sss.pgh.pa.us (8.12.11/8.12.11) with ESMTP id i0SExBYA018093; + Wed, 28 Jan 2004 09:59:12 -0500 (EST) +To: Greg Stark +cc: pgsql-hackers@postgresql.org +Subject: Re: [HACKERS] Question about indexes +In-Reply-To: <871xpktb38.fsf@stark.xeocode.com> +References: <87isixt9h7.fsf@stark.xeocode.com> <29815.1075243916@sss.pgh.pa.us> <87d695t2ak.fsf@stark.xeocode.com> <403.1075245881@sss.pgh.pa.us> <877jzcu85t.fsf@stark.xeocode.com> <5813.1075258139@sss.pgh.pa.us> <871xpktb38.fsf@stark.xeocode.com> +Comments: In-reply-to Greg Stark + message dated "28 Jan 2004 09:13:47 -0500" +Date: Wed, 28 Jan 2004 09:59:11 -0500 +Message-ID: <18092.1075301951@sss.pgh.pa.us> +From: Tom Lane +X-Virus-Scanned: by amavisd-new at postgresql.org +X-Mailing-List: pgsql-hackers +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on + candle.pha.pa.us +X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham + version=2.61 +Status: OR + +Greg Stark writes: +> Tom Lane writes: +>> What sort? + +> To build the in-memory bitmap you effectively have to do a sort. + +Hm, you're thinking that the operation of inserting a bit into a bitmap +has to be at least O(log N). Seems to me that that depends on the data +structure you use. 
In principle it could be O(1), if you use a true +bitmap (linear array) -- just index and set the bit. You might be right +that practical data structures would be O(log N), but I'm not totally +convinced. + +> If the tuples come out of the index in heap order then you can combine +> them without having to go through that step. + +But considering the restrictions implied by that assumption --- no range +scans, no non-btree indexes --- I doubt we will take the trouble to +implement that variant. We'll want to do the generalized bitmap code +anyway. + +In any case, this discussion is predicated on the assumption that the +operations involving the bitmap are a significant fraction of the total +time, which I think is quite uncertain. Until we build it and profile +it, we won't know that. + + regards, tom lane + +---------------------------(end of broadcast)--------------------------- +TIP 4: Don't 'kill -9' the postmaster + +From pgsql-hackers-owner+M49457=pgman=candle.pha.pa.us@postgresql.org Wed Jan 28 10:42:58 2004 +Return-path: +Received: from joeconway.com (66-146-172-86.skyriver.net [66.146.172.86]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0SFgue00574 + for ; Wed, 28 Jan 2004 10:42:57 -0500 (EST) +Received: from postgresql.org ([200.46.204.71] verified) + by joeconway.com (CommuniGate Pro SMTP 4.1.8) + with ESMTP id 792727 for pgman@candle.pha.pa.us; Wed, 28 Jan 2004 07:39:41 -0800 +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (unknown [200.46.204.2]) + by svr1.postgresql.org (Postfix) with ESMTP id 08484D1CA01 + for ; Wed, 28 Jan 2004 15:38:28 +0000 (GMT) +Received: from svr1.postgresql.org ([200.46.204.71]) + by localhost (neptune.hub.org [200.46.204.2]) (amavisd-new, port 10024) + with ESMTP id 36717-02 + for ; + Wed, 28 Jan 2004 11:37:55 -0400 (AST) +Received: from smtp.istop.com (dci.doncaster.on.ca [66.11.168.194]) + by svr1.postgresql.org (Postfix) with ESMTP id E27BDD1D201 + for ; Wed, 28 Jan 
2004 11:37:55 -0400 (AST) +Received: from stark.xeocode.com (gsstark.mtl.istop.com [66.11.160.162]) + by smtp.istop.com (Postfix) with ESMTP + id 1E70F36BBA; Wed, 28 Jan 2004 10:09:35 -0500 (EST) +Received: from localhost ([127.0.0.1] helo=stark.xeocode.com) + by stark.xeocode.com with smtp (Exim 3.36 #1 (Debian)) + id 1AlrJu-0001rj-00; Wed, 28 Jan 2004 10:09:34 -0500 +To: Tom Lane +cc: Greg Stark , pgsql-hackers@postgresql.org +Subject: Re: [HACKERS] Question about indexes +References: <87isixt9h7.fsf@stark.xeocode.com> + <29815.1075243916@sss.pgh.pa.us> <87d695t2ak.fsf@stark.xeocode.com> + <403.1075245881@sss.pgh.pa.us> <877jzcu85t.fsf@stark.xeocode.com> + <5813.1075258139@sss.pgh.pa.us> <871xpktb38.fsf@stark.xeocode.com> + <18092.1075301951@sss.pgh.pa.us> +In-Reply-To: <18092.1075301951@sss.pgh.pa.us> +From: Greg Stark +Organization: The Emacs Conspiracy; member since 1992 +Date: 28 Jan 2004 10:09:34 -0500 +Message-ID: <87vfmwrtxt.fsf@stark.xeocode.com> +Lines: 15 +User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3 +MIME-Version: 1.0 +Content-Type: text/plain; charset=us-ascii +X-Virus-Scanned: by amavisd-new at postgresql.org +X-Mailing-List: pgsql-hackers +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on + candle.pha.pa.us +X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham + version=2.61 +Status: ORr + + +Tom Lane writes: + +> In any case, this discussion is predicated on the assumption that the +> operations involving the bitmap are a significant fraction of the total +> time, which I think is quite uncertain. Until we build it and profile +> it, we won't know that. + +The other thought I had was that it would be difficult to tell when to follow +this path. Since the main case where it wins is when the individual indexes +aren't very selective but the combination is very selective, and we don't have +inter-column correlation statistics ... 
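To make the planning difficulty above concrete, here is a small sketch (not part of the original message; the function name and the table figures are hypothetical). With only per-column statistics, the number of rows matching both conditions can be bounded but not pinned down, and multiplying the two selectivities is only a guess that assumes the columns are independent.

```python
def combined_row_bounds(total_rows, rows_a, rows_b):
    """Return (lowest, independence_estimate, highest) for rows matching A AND B.

    Only per-column counts are known, so the true answer can lie anywhere
    between the two bounds; the middle value assumes A and B are independent.
    """
    lowest = max(0, rows_a + rows_b - total_rows)   # overlap forced by counting
    highest = min(rows_a, rows_b)                   # one condition implies the other
    independent = rows_a * rows_b // total_rows     # product of selectivities
    return lowest, independent, highest


# Hypothetical table: 1,000,000 rows, each condition alone matches 10% of them.
print(combined_row_bounds(1_000_000, 100_000, 100_000))  # -> (0, 10000, 100000)
```

The spread between the bounds is what makes it hard to tell in advance whether combining the two indexes will pay off.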
+ +-- +greg + + +---------------------------(end of broadcast)--------------------------- +TIP 9: the planner will ignore your desire to choose an index scan if your + joining column's datatypes do not match + +From pgsql-hackers-owner+M49467=pgman=candle.pha.pa.us@postgresql.org Wed Jan 28 17:29:11 2004 +Return-path: +Received: from svr1.postgresql.org ([200.46.204.71]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0SMT9e09381 + for ; Wed, 28 Jan 2004 17:29:10 -0500 (EST) +Received: from localhost (unknown [200.46.204.2]) + by svr1.postgresql.org (Postfix) with ESMTP id 7E6A1D1D0F9 + for ; Wed, 28 Jan 2004 22:29:02 +0000 (GMT) +Received: from svr1.postgresql.org ([200.46.204.71]) + by localhost (neptune.hub.org [200.46.204.2]) (amavisd-new, port 10024) + with ESMTP id 30501-10 for ; + Wed, 28 Jan 2004 18:28:33 -0400 (AST) +Received: from postgresql.org (svr1.postgresql.org [200.46.204.71]) + by svr1.postgresql.org (Postfix) with ESMTP id 002FED1CCDA + for ; Wed, 28 Jan 2004 18:28:30 -0400 (AST) +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (unknown [200.46.204.2]) + by svr1.postgresql.org (Postfix) with ESMTP id BC300D1B4BD + for ; Wed, 28 Jan 2004 22:16:19 +0000 (GMT) +Received: from svr1.postgresql.org ([200.46.204.71]) + by localhost (neptune.hub.org [200.46.204.2]) (amavisd-new, port 10024) + with ESMTP id 29171-03 + for ; + Wed, 28 Jan 2004 18:15:50 -0400 (AST) +Received: from cmailm1.svr.pol.co.uk (cmailm1.svr.pol.co.uk [195.92.193.18]) + by svr1.postgresql.org (Postfix) with ESMTP id 99F4BD1C50E + for ; Wed, 28 Jan 2004 18:15:47 -0400 (AST) +Received: from modem-182.leopard.dialup.pol.co.uk ([217.135.144.182] helo=LaptopDellXP) + by cmailm1.svr.pol.co.uk with esmtp (Exim 4.14) + id 1AlxyO-0002XD-Ab; Wed, 28 Jan 2004 22:15:48 +0000 +Reply-To: +From: "Simon Riggs" +To: "'Tom Lane'" , "'Greg Stark'" +cc: +Subject: Re: [HACKERS] Question about indexes +Date: Wed, 28 Jan 2004 22:15:40 -0000 +Organization: 
2nd Quadrant +Message-ID: <003701c3e5ec$44306250$efb887d9@LaptopDellXP> +MIME-Version: 1.0 +Content-Type: text/plain; + charset="US-ASCII" +Content-Transfer-Encoding: 7bit +X-Priority: 3 (Normal) +X-MSMail-Priority: Normal +X-Mailer: Microsoft Outlook, Build 10.0.2627 +X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2727.1300 +Importance: Normal +In-Reply-To: <18092.1075301951@sss.pgh.pa.us> +X-Virus-Scanned: by amavisd-new at postgresql.org +X-Mailing-List: pgsql-hackers +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +X-Virus-Scanned: by amavisd-new at postgresql.org +X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on + candle.pha.pa.us +X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham + version=2.61 +Status: OR + +Some potentially helpful background comments on the discussion so far... + +>Tom Lane writes +>>Greg Stark writes +>> Note that the space saving of bitmap indexes is still a substantial +>> factor. +>I think you are still confusing what I'm talking about with a bitmap +>index, ie, a persistent structure on-disk. It's not that at all, but a +>transient structure built in-memory during an index scan. + +Oracle allows the creation of bitmap indices as persistent data +structures. + +The "space saving" of bitmap indices is only a saving when compared with +btree indices. If you don't have them at all because they are built +dynamically when required, as Tom is suggesting, then you "save" even +more space. + +Maintaining the bitmap index is a costly operation. You tend to want to +build them on "characteristic" columns, of which there tend to be more +in a database than "partial/full identity" columns on which you build +btrees (forgive the vagueness of that comment), so you end up with loads +of the damn things, so the space soon adds up. It can be hard to judge +which ones are the important ones, especially when each is used by a +different user/group. 
Building them dynamically is a good way of solving +the question "which ones are needed?". Ever seen 58 indices on a table? +Don't go there. + +My vote would be implement the dynamic building capability, then return +to implement a persisted structure later if that seems like it would be +a further improvement. [The option would be nice] + +If we do it dynamically, as Tom suggests, then we don't have to code the +index maintenance logic at all and the functionality will be with us all +the sooner. Go Tom! + +>Tom Lane writes +> In any case, this discussion is predicated on the assumption that the +> operations involving the bitmap are a significant fraction of the +total +> time, which I think is quite uncertain. Until we build it and profile +> it, we won't know that. + +Dynamically building the bitmaps has been the strategy in use by +Teradata for nearly a decade on many large datawarehouses. I can +personally vouch for the effectiveness of this approach - I was +surprised when Oracle went for the persistent option. Certainly in that +case building the bitmaps adds much less time than is saved overall by +the better total query strategy. + +>Greg Stark writes +> > To build the in-memory bitmap you effectively have to do a sort. + +Not sure on this latter point: I think I agree with Greg on that point, +but want to believe Tom because requiring a sort will definitely add +time. + +To shed some light in this area, some other major implementations are: + +In Teradata, tables are stored based upon a primary index, which is +effectively an index-organised table. The index pointers are stored in +sorted order lock step with the blocks of the associated table - No sort +required. (The ordering is based upon a hashed index, but that doesn't +change the technique). + +Oracle's tables/indexes use heaps/btrees also, though they do provide an +index-organised table feature similar to Teradata. 
Maybe the lack of +heap/btree consistent ordering in Oracle and their subsequent design +choice of persistent bitmap indices is an indication for PostgreSQL too? + +In Oracle, bitmap indices are an important precursor to the star join +technique. AFAICS it is still possible to have a star join plan without +having persistent bitmap indices. IMHO, the longer term goal of a good +star join plan is an important one - that may influence the design +selection for this discussion. + +Hope some of that helps, + +Best regards, Simon Riggs + + +---------------------------(end of broadcast)--------------------------- +TIP 8: explain analyze is your friend + +From pgsql-hackers-owner+M49477=pgman=candle.pha.pa.us@postgresql.org Thu Jan 29 04:24:47 2004 +Return-path: +Received: from joeconway.com (66-146-172-86.skyriver.net [66.146.172.86]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0T9Ohe19178 + for ; Thu, 29 Jan 2004 04:24:43 -0500 (EST) +Received: from postgresql.org ([200.46.204.71] verified) + by joeconway.com (CommuniGate Pro SMTP 4.1.8) + with ESMTP id 794811 for pgman@candle.pha.pa.us; Thu, 29 Jan 2004 01:21:28 -0800 +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (unknown [200.46.204.2]) + by svr1.postgresql.org (Postfix) with ESMTP id 639A8D1B4CE + for ; Thu, 29 Jan 2004 09:17:40 +0000 (GMT) +Received: from svr1.postgresql.org ([200.46.204.71]) + by localhost (neptune.hub.org [200.46.204.2]) (amavisd-new, port 10024) + with ESMTP id 24681-09 + for ; + Thu, 29 Jan 2004 05:17:16 -0400 (AST) +Received: from loki.hnit.is (unknown [193.4.243.180]) + by svr1.postgresql.org (Postfix) with ESMTP id 98971D1C9FD + for ; Thu, 29 Jan 2004 05:17:07 -0400 (AST) +Received: from seifur.hnit.is ([193.4.243.99]) by 193.4.243.180 with trend_isnt_name_B; Thu, 29 Jan 2004 09:17:12 -0000 +X-MimeOLE: Produced By Microsoft Exchange V6.0.6487.1 +Content-Class: urn:content-classes:message +MIME-Version: 1.0 +Content-Type: text/plain; + 
charset="us-ascii" +Subject: Re: [HACKERS] Question about indexes +Date: Thu, 29 Jan 2004 09:17:11 -0000 +Message-ID: <0A5B2E3C3A64CA4AB14F76DBCA76DDA44EF9B2@seifur.hnit.is> +Thread-Topic: [HACKERS] Question about indexes +Thread-Index: AcPl7J1SKohPpCtfSZq2EeeqhKLynAAW3BDw +From: +To: +X-Virus-Scanned: by amavisd-new at postgresql.org +X-Mailing-List: pgsql-hackers +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +Content-Transfer-Encoding: 8bit +X-MIME-Autoconverted: from quoted-printable to 8bit by candle.pha.pa.us id i0T9Ohe19178 +X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on + candle.pha.pa.us +X-Spam-Status: No, hits=-4.7 required=5.0 tests=BAYES_00,NO_REAL_NAME + autolearn=no version=2.61 +Status: OR + + +A small comment on Oracle's implementation of persistent bitmap indexes: + +Oracle's bitmap index is concurrently locked by DML, i.e. it is suitable for OLAP +(basically read-only data warehouses) but in no way for OLTP. + +IMHO, +Laimis + +> Maybe the lack of heap/btree consistent ordering in Oracle +> and their subsequent design choice of persistent bitmap +> indices is an indication for PostgreSQL too? 
+ + +---------------------------(end of broadcast)--------------------------- +TIP 9: the planner will ignore your desire to choose an index scan if your + joining column's datatypes do not match + +From pgsql-hackers-owner+M49497=pgman=candle.pha.pa.us@postgresql.org Fri Jan 30 01:22:15 2004 +Return-path: +Received: from joeconway.com (66-146-172-86.skyriver.net [66.146.172.86]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0U6MCe03385 + for ; Fri, 30 Jan 2004 01:22:14 -0500 (EST) +Received: from postgresql.org ([200.46.204.71] verified) + by joeconway.com (CommuniGate Pro SMTP 4.1.8) + with ESMTP id 797306 for pgman@candle.pha.pa.us; Thu, 29 Jan 2004 22:18:52 -0800 +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (unknown [200.46.204.2]) + by svr1.postgresql.org (Postfix) with ESMTP id 6CCBCD1C967 + for ; Fri, 30 Jan 2004 06:16:52 +0000 (GMT) +Received: from svr1.postgresql.org ([200.46.204.71]) + by localhost (neptune.hub.org [200.46.204.2]) (amavisd-new, port 10024) + with ESMTP id 81674-05 + for ; + Fri, 30 Jan 2004 02:16:22 -0400 (AST) +Received: from smtp.istop.com (dci.doncaster.on.ca [66.11.168.194]) + by svr1.postgresql.org (Postfix) with ESMTP id 6DC4BD1CC98 + for ; Fri, 30 Jan 2004 02:16:21 -0400 (AST) +Received: from stark.xeocode.com (gsstark.mtl.istop.com [66.11.160.162]) + by smtp.istop.com (Postfix) with ESMTP + id 8FD5F369BB; Fri, 30 Jan 2004 01:16:21 -0500 (EST) +Received: from localhost ([127.0.0.1] helo=stark.xeocode.com) + by stark.xeocode.com with smtp (Exim 3.36 #1 (Debian)) + id 1AmRwz-0004kf-00; Fri, 30 Jan 2004 01:16:21 -0500 +To: pgsql-hackers@postgresql.org +Subject: Re: [HACKERS] Question about indexes +References: <0A5B2E3C3A64CA4AB14F76DBCA76DDA44EF9B2@seifur.hnit.is> +In-Reply-To: <0A5B2E3C3A64CA4AB14F76DBCA76DDA44EF9B2@seifur.hnit.is> +From: Greg Stark +Organization: The Emacs Conspiracy; member since 1992 +Date: 30 Jan 2004 01:16:21 -0500 +Message-ID: 
<87y8rqx8p6.fsf@stark.xeocode.com> +Lines: 31 +User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3 +MIME-Version: 1.0 +Content-Type: text/plain; charset=us-ascii +X-Virus-Scanned: by amavisd-new at postgresql.org +X-Mailing-List: pgsql-hackers +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on + candle.pha.pa.us +X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham + version=2.61 +Status: OR + + + writes: + +> A small comment on Oracle's implementation of persistent bitmap indexes: +> +> Oracle's bitmap index is concurently locked by DML, i.e. it suites for OLAP +> (basically read only data warehouses) but in no way for OLTP. + +I knew this. I think they figured that was ok because bitmap indexes were +mainly intended to solve data warehouse problems anyways. + +Thinking out loud here, I wonder whether this would be less of a problem for +postgres. Since tuples are never updated in place there would never be a need +to lock the entire bitmap until a transaction completes. + +There would never be as much concurrency as btrees, assuming there was any +kind of compression on the bitmap, but I don't see any reason why a long-term +lock would have to be held for updates. + +Even regular vacuum might not have to lock anything for long, just long enough +to clear the bits. and vacuum full/cluster already take table locks anyways. + +I think the problem Oracle ran into was that storing rollback ids in the +bitmap is untenable. The whole point of persistent bitmap indexes is to store +a very dense representation that represents thousands of records per page. +Allocating space to store thousands of pending transaction ids and having +thousands of old versions of the page in the rollback segment would defeat the +purpose. 
+ +-- +greg + + +---------------------------(end of broadcast)--------------------------- +TIP 7: don't forget to increase your free space map settings + +From pgsql-hackers-owner+M49502=pgman=candle.pha.pa.us@postgresql.org Fri Jan 30 06:37:25 2004 +Return-path: +Received: from joeconway.com (66-146-172-86.skyriver.net [66.146.172.86]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0UBbOe07302 + for ; Fri, 30 Jan 2004 06:37:25 -0500 (EST) +Received: from postgresql.org ([200.46.204.71] verified) + by joeconway.com (CommuniGate Pro SMTP 4.1.8) + with ESMTP id 797695 for pgman@candle.pha.pa.us; Fri, 30 Jan 2004 03:34:06 -0800 +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (unknown [200.46.204.2]) + by svr1.postgresql.org (Postfix) with ESMTP id 92A3CD1CCB7 + for ; Fri, 30 Jan 2004 11:31:21 +0000 (GMT) +Received: from svr1.postgresql.org ([200.46.204.71]) + by localhost (neptune.hub.org [200.46.204.2]) (amavisd-new, port 10024) + with ESMTP id 76882-10 + for ; + Fri, 30 Jan 2004 07:31:24 -0400 (AST) +Received: from candle.pha.pa.us (candle.pha.pa.us [207.106.42.251]) + by svr1.postgresql.org (Postfix) with ESMTP id 59850D1CACB + for ; Fri, 30 Jan 2004 07:31:20 -0400 (AST) +Received: (from pgman@localhost) + by candle.pha.pa.us (8.11.6/8.11.6) id i0UBVHU04169; + Fri, 30 Jan 2004 06:31:17 -0500 (EST) +From: Bruce Momjian +Message-ID: <200401301131.i0UBVHU04169@candle.pha.pa.us> +Subject: Re: [HACKERS] Question about indexes +In-Reply-To: <87vfmwrtxt.fsf@stark.xeocode.com> +To: Greg Stark +Date: Fri, 30 Jan 2004 06:31:17 -0500 (EST) +cc: Tom Lane , pgsql-hackers@postgresql.org +X-Mailer: ELM [version 2.4ME+ PL108 (25)] +MIME-Version: 1.0 +Content-Transfer-Encoding: 7bit +Content-Type: text/plain; charset=US-ASCII +X-Virus-Scanned: by amavisd-new at postgresql.org +X-Mailing-List: pgsql-hackers +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +Status: OR + +Greg Stark wrote: +> +> Tom Lane writes: +> +> 
> In any case, this discussion is predicated on the assumption that the +> > operations involving the bitmap are a significant fraction of the total +> > time, which I think is quite uncertain. Until we build it and profile +> > it, we won't know that. +> +> The other thought I had was that it would be difficult to tell when to follow +> this path. Since the main case where it wins is when the individual indexes +> aren't very selective but the combination is very selective, and we don't have +> inter-column correlation statistics ... + +I like the idea of building in-memory bitmapped indexes. + +In your example, if you are restricting on A and B, and have no A,B +index but an A index and B index, why wouldn't you always create an +in-memory bitmapped index from indexes A and B, unless index A hits only +a few rows. In fact, from the optimizer statistics, you can guess on +how many bits you will hit from index A and index B, so we only have to +decide if it is better to take the more restrictive index and do heap +lookups for those, or scan the second index and then hit the heap. The +only thing A,B combined statistics would tell you is how many heap +matches you will find. The time to scan A and B indexes and create the +bitmap is already guessable from the single column statistics. + +Also, what does an in-memory bitmapped index look like? Is it: + + value: bitmap... + value: bitmap... + +with the values organized in a btree fashion? + +-- + Bruce Momjian | http://candle.pha.pa.us + pgman@candle.pha.pa.us | (610) 359-1001 + + If your life is a hard drive, | 13 Roberts Road + + Christ can be your backup. | Newtown Square, Pennsylvania 19073 + +---------------------------(end of broadcast)--------------------------- +TIP 6: Have you searched our list archives? 
+ + http://archives.postgresql.org + +From pgsql-hackers-owner+M49505=pgman=candle.pha.pa.us@postgresql.org Fri Jan 30 09:55:27 2004 +Return-path: +Received: from zippy.ims.net (IDENT:BTCTknqFfnMWdPgoZjvES928uVdg+CPr@zippy.ims.net [208.166.202.2]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0UEtPe12397 + for ; Fri, 30 Jan 2004 09:55:26 -0500 (EST) +Received: from postgresql.org (svr1.postgresql.org [200.46.204.71]) + by zippy.ims.net (8.11.6/linuxconf) with ESMTP id i0UEsQt01250 + for ; Fri, 30 Jan 2004 08:54:31 -0600 +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (unknown [200.46.204.2]) + by svr1.postgresql.org (Postfix) with ESMTP id 3DF5DD1C9E1 + for ; Fri, 30 Jan 2004 14:48:26 +0000 (GMT) +Received: from svr1.postgresql.org ([200.46.204.71]) + by localhost (neptune.hub.org [200.46.204.2]) (amavisd-new, port 10024) + with ESMTP id 55394-05 + for ; + Fri, 30 Jan 2004 10:48:29 -0400 (AST) +Received: from sss.pgh.pa.us (unknown [192.204.191.242]) + by svr1.postgresql.org (Postfix) with ESMTP id 79B71D1C992 + for ; Fri, 30 Jan 2004 10:48:25 -0400 (AST) +Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) + by sss.pgh.pa.us (8.12.11/8.12.11) with ESMTP id i0UEmJw9012966; + Fri, 30 Jan 2004 09:48:19 -0500 (EST) +To: Bruce Momjian +cc: Greg Stark , pgsql-hackers@postgresql.org +Subject: Re: [HACKERS] Question about indexes +In-Reply-To: <200401301131.i0UBVHU04169@candle.pha.pa.us> +References: <200401301131.i0UBVHU04169@candle.pha.pa.us> +Comments: In-reply-to Bruce Momjian + message dated "Fri, 30 Jan 2004 06:31:17 -0500" +Date: Fri, 30 Jan 2004 09:48:19 -0500 +Message-ID: <12965.1075474099@sss.pgh.pa.us> +From: Tom Lane +X-Virus-Scanned: by amavisd-new at postgresql.org +X-Mailing-List: pgsql-hackers +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on + candle.pha.pa.us +X-Spam-Status: No, hits=-4.9 required=5.0 
tests=BAYES_00 autolearn=no + version=2.61 +Status: ORr + +Bruce Momjian writes: +> Also, what does an in-memory bitmapped index look like? + +One idea that might work: a binary search tree in which each node +represents a single page of the table, and contains a bit array with +one bit for each possible item number on the page. You could not need +more than BLCKSZ/(sizeof(HeapTupleHeaderData)+sizeof(ItemIdData)) bits +in a node, or about 36 bytes at default BLCKSZ --- for most tables you +could probably prove it would be a great deal less. You only allocate +nodes for pages that have at least one interesting row. + +I think this would represent a reasonable compromise between size and +insertion speed. It would only get large if the indexscan output +demanded visiting many different pages --- but at some point you could +abandon index usage and do a sequential scan, so I think that property +is okay. + +A variant is to make the per-page bit arrays be entries in a hash table +with page number as hash key. This would reduce insertion to a nearly +constant-time operation, but the drawback is that you'd need an explicit +sort at the end to put the per-page entries into page number order +before you scan 'em. You might come out ahead anyway, not sure. + +Or we could try a true linear bitmap (indexed by page number times +max-items-per-page plus item number) that's compressed in some fashion, +probably just by eliminating large runs of zeroes. The difficulty here +is that inserting a new one-bit could be pretty expensive, and we need +it to be cheap. + +Perhaps someone can come up with other better ideas ... 
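As an illustration of the hash-table variant described above, here is a minimal sketch (not part of the original message, and not PostgreSQL code). The class and method names are hypothetical, the header and item-pointer sizes are assumed values chosen to match the "about 36 bytes" figure, and item numbers are treated as 0-based bit positions for simplicity. Insertion is a dictionary update, and the explicit sort happens only once, when the entries are read back in page-number order.

```python
BLCKSZ = 8192             # default block size
TUPLE_HEADER_BYTES = 24   # assumed sizeof(HeapTupleHeaderData); varies by version
ITEM_ID_BYTES = 4         # assumed sizeof(ItemIdData)
# Upper bound on items per page: 292 bits, i.e. roughly 36 bytes per node.
MAX_ITEMS_PER_PAGE = BLCKSZ // (TUPLE_HEADER_BYTES + ITEM_ID_BYTES)


class PageBitmap:
    """Hash table keyed by page number; each value is a per-page bit array."""

    def __init__(self):
        self.pages = {}   # page number -> int used as a bit array

    def add(self, page, item):
        """Set the bit for one (page, item) pointer; near constant time."""
        if not 0 <= item < MAX_ITEMS_PER_PAGE:
            raise ValueError("item number out of range")
        self.pages[page] = self.pages.get(page, 0) | (1 << item)

    def intersect(self, other):
        """Combine two index scans: keep only pointers present in both."""
        result = PageBitmap()
        for page, bits in self.pages.items():
            common = bits & other.pages.get(page, 0)
            if common:    # allocate entries only for pages with a match
                result.pages[page] = common
        return result

    def tuples_in_heap_order(self):
        """Yield (page, item) pairs sorted by page, then item."""
        for page in sorted(self.pages):   # the explicit sort at the end
            bits, item = self.pages[page], 0
            while bits:
                if bits & 1:
                    yield page, item
                bits >>= 1
                item += 1


# Hypothetical output of two index scans, in arbitrary order.
scan_a, scan_b = PageBitmap(), PageBitmap()
for tid in [(7, 3), (2, 1), (7, 0), (5, 9)]:
    scan_a.add(*tid)
for tid in [(7, 3), (5, 9), (9, 4)]:
    scan_b.add(*tid)
print(list(scan_a.intersect(scan_b).tuples_in_heap_order()))  # -> [(5, 9), (7, 3)]
```

The binary-search-tree variant would replace the dictionary with an ordered structure, trading slower insertion for not needing the final sort.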
+ + regards, tom lane + +---------------------------(end of broadcast)--------------------------- +TIP 8: explain analyze is your friend + +From pgsql-hackers-owner+M49506=pgman=candle.pha.pa.us@postgresql.org Fri Jan 30 10:23:37 2004 +Return-path: +Received: from joeconway.com (66-146-172-86.skyriver.net [66.146.172.86]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0UFNZe17036 + for ; Fri, 30 Jan 2004 10:23:36 -0500 (EST) +Received: from postgresql.org ([200.46.204.71] verified) + by joeconway.com (CommuniGate Pro SMTP 4.1.8) + with ESMTP id 797996 for pgman@candle.pha.pa.us; Fri, 30 Jan 2004 07:20:18 -0800 +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (unknown [200.46.204.2]) + by svr1.postgresql.org (Postfix) with ESMTP id 8901ED1C9B3 + for ; Fri, 30 Jan 2004 15:14:26 +0000 (GMT) +Received: from svr1.postgresql.org ([200.46.204.71]) + by localhost (neptune.hub.org [200.46.204.2]) (amavisd-new, port 10024) + with ESMTP id 67347-02 + for ; + Fri, 30 Jan 2004 11:14:30 -0400 (AST) +Received: from candle.pha.pa.us (candle.pha.pa.us [207.106.42.251]) + by svr1.postgresql.org (Postfix) with ESMTP id F021AD1C95E + for ; Fri, 30 Jan 2004 11:14:24 -0400 (AST) +Received: (from pgman@localhost) + by candle.pha.pa.us (8.11.6/8.11.6) id i0UFEMl15556; + Fri, 30 Jan 2004 10:14:22 -0500 (EST) +From: Bruce Momjian +Message-ID: <200401301514.i0UFEMl15556@candle.pha.pa.us> +Subject: Re: [HACKERS] Question about indexes +In-Reply-To: <12965.1075474099@sss.pgh.pa.us> +To: Tom Lane +Date: Fri, 30 Jan 2004 10:14:22 -0500 (EST) +cc: Greg Stark , pgsql-hackers@postgresql.org +X-Mailer: ELM [version 2.4ME+ PL108 (25)] +MIME-Version: 1.0 +Content-Transfer-Encoding: 7bit +Content-Type: text/plain; charset=US-ASCII +X-Virus-Scanned: by amavisd-new at postgresql.org +X-Mailing-List: pgsql-hackers +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +Status: OR + +Tom Lane wrote: +> Bruce Momjian writes: +> > Also, what does an 
in-memory bitmapped index look like? +> +> One idea that might work: a binary search tree in which each node +> represents a single page of the table, and contains a bit array with +> one bit for each possible item number on the page. You could not need +> more than BLCKSZ/(sizeof(HeapTupleHeaderData)+sizeof(ItemIdData)) bits +> in a node, or about 36 bytes at default BLCKSZ --- for most tables you +> could probably prove it would be a great deal less. You only allocate +> nodes for pages that have at least one interesting row. + +Actually, I think I made a mistake. I was wondering what on-disk +bitmapped indexes look like. + +-- + Bruce Momjian | http://candle.pha.pa.us + pgman@candle.pha.pa.us | (610) 359-1001 + + If your life is a hard drive, | 13 Roberts Road + + Christ can be your backup. | Newtown Square, Pennsylvania 19073 + +---------------------------(end of broadcast)--------------------------- +TIP 9: the planner will ignore your desire to choose an index scan if your + joining column's datatypes do not match + +From pgsql-hackers-owner+M49507=pgman=candle.pha.pa.us@postgresql.org Fri Jan 30 10:31:27 2004 +Return-path: +Received: from zippy.ims.net (IDENT:AWZrLd+EfFmX1x4Ch6+4AfIqn908pAfY@zippy.ims.net [208.166.202.2]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0UFVOe18065 + for ; Fri, 30 Jan 2004 10:31:26 -0500 (EST) +Received: from postgresql.org (svr1.postgresql.org [200.46.204.71]) + by zippy.ims.net (8.11.6/linuxconf) with ESMTP id i0UFURt02719 + for ; Fri, 30 Jan 2004 09:30:32 -0600 +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (unknown [200.46.204.2]) + by svr1.postgresql.org (Postfix) with ESMTP id 9DF9ED1CCA7 + for ; Fri, 30 Jan 2004 15:22:35 +0000 (GMT) +Received: from svr1.postgresql.org ([200.46.204.71]) + by localhost (neptune.hub.org [200.46.204.2]) (amavisd-new, port 10024) + with ESMTP id 66733-09 + for ; + Fri, 30 Jan 2004 11:22:39 -0400 (AST) +Received: from candle.pha.pa.us 
(candle.pha.pa.us [207.106.42.251]) + by svr1.postgresql.org (Postfix) with ESMTP id 235C3D1CCB2 + for ; Fri, 30 Jan 2004 11:22:33 -0400 (AST) +Received: (from pgman@localhost) + by candle.pha.pa.us (8.11.6/8.11.6) id i0UFMYr16926; + Fri, 30 Jan 2004 10:22:34 -0500 (EST) +From: Bruce Momjian +Message-ID: <200401301522.i0UFMYr16926@candle.pha.pa.us> +Subject: Re: [HACKERS] Question about indexes +In-Reply-To: <87vfmwrtxt.fsf@stark.xeocode.com> +To: Greg Stark +Date: Fri, 30 Jan 2004 10:22:34 -0500 (EST) +cc: Tom Lane , pgsql-hackers@postgresql.org +X-Mailer: ELM [version 2.4ME+ PL108 (25)] +MIME-Version: 1.0 +Content-Transfer-Encoding: 7bit +Content-Type: text/plain; charset=US-ASCII +X-Virus-Scanned: by amavisd-new at postgresql.org +X-Mailing-List: pgsql-hackers +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +Status: OR + +Greg Stark wrote: +> +> Tom Lane writes: +> +> > In any case, this discussion is predicated on the assumption that the +> > operations involving the bitmap are a significant fraction of the total +> > time, which I think is quite uncertain. Until we build it and profile +> > it, we won't know that. +> +> The other thought I had was that it would be difficult to tell when to follow +> this path. Since the main case where it wins is when the individual indexes +> aren't very selective but the combination is very selective, and we don't have +> inter-column correlation statistics ... + +We actually have heap access cost and index access cost. You could +compare costs of looking at all of index A's heap vs. looking at index +B and then hopefully fewer heap rows. + +-- + Bruce Momjian | http://candle.pha.pa.us + pgman@candle.pha.pa.us | (610) 359-1001 + + If your life is a hard drive, | 13 Roberts Road + + Christ can be your backup. 
| Newtown Square, Pennsylvania 19073 + +---------------------------(end of broadcast)--------------------------- +TIP 2: you can get off all lists at once with the unregister command + (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) + +From alvherre@CM-lcon2-51-253.cm.vtr.net Fri Jan 30 10:24:32 2004 +Return-path: +Received: from CM-lcon2-51-253.cm.vtr.net (CM-lcon2-51-253.cm.vtr.net [200.83.51.253]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0UFOSe17199 + for ; Fri, 30 Jan 2004 10:24:31 -0500 (EST) +Received: by CM-lcon2-51-253.cm.vtr.net (Postfix, from userid 500) + id 9A93157578; Fri, 30 Jan 2004 10:24:18 -0500 (EST) +Date: Fri, 30 Jan 2004 12:24:18 -0300 +From: Alvaro Herrera +To: Tom Lane +cc: Bruce Momjian , Greg Stark , + pgsql-hackers@postgresql.org +Subject: Re: [HACKERS] Question about indexes +Message-ID: <20040130152418.GB24123@dcc.uchile.cl> +References: <200401301131.i0UBVHU04169@candle.pha.pa.us> <12965.1075474099@sss.pgh.pa.us> +MIME-Version: 1.0 +Content-Type: text/plain; charset=iso-8859-1 +Content-Disposition: inline +Content-Transfer-Encoding: 8bit +In-Reply-To: <12965.1075474099@sss.pgh.pa.us> +User-Agent: Mutt/1.4.1i +X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on + candle.pha.pa.us +X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham + version=2.61 +Status: ORr + +On Fri, Jan 30, 2004 at 09:48:19AM -0500, Tom Lane wrote: + +> A variant is to make the per-page bit arrays be entries in a hash table +> with page number as hash key. This would reduce insertion to a nearly +> constant-time operation, but the drawback is that you'd need an explicit +> sort at the end to put the per-page entries into page number order +> before you scan 'em. You might come out ahead anyway, not sure. + +Is there a reason to sort the pages before scanning them? The result won't +come out sorted one way or the other. 
+ +-- +Alvaro Herrera () +"Para tener más hay que desear menos" + +From pgsql-hackers-owner+M49509=pgman=candle.pha.pa.us@postgresql.org Fri Jan 30 10:39:11 2004 +Return-path: +Received: from zippy.ims.net (IDENT:QumGpJuSSF+qB+W577trqd4FqP6fc1O+@zippy.ims.net [208.166.202.2]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0UFd9e19273 + for ; Fri, 30 Jan 2004 10:39:10 -0500 (EST) +Received: from postgresql.org (svr1.postgresql.org [200.46.204.71]) + by zippy.ims.net (8.11.6/linuxconf) with ESMTP id i0UFcDt02990 + for ; Fri, 30 Jan 2004 09:38:17 -0600 +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (unknown [200.46.204.2]) + by svr1.postgresql.org (Postfix) with ESMTP id 606FBD1BA96 + for ; Fri, 30 Jan 2004 15:31:24 +0000 (GMT) +Received: from svr1.postgresql.org ([200.46.204.71]) + by localhost (neptune.hub.org [200.46.204.2]) (amavisd-new, port 10024) + with ESMTP id 73148-04 + for ; + Fri, 30 Jan 2004 11:31:28 -0400 (AST) +Received: from candle.pha.pa.us (candle.pha.pa.us [207.106.42.251]) + by svr1.postgresql.org (Postfix) with ESMTP id D7A47D1B4BD + for ; Fri, 30 Jan 2004 11:31:22 
-0400 (AST) +Received: (from pgman@localhost) + by candle.pha.pa.us (8.11.6/8.11.6) id i0UFUgQ18014; + Fri, 30 Jan 2004 10:30:42 -0500 (EST) +From: Bruce Momjian +Message-ID: <200401301530.i0UFUgQ18014@candle.pha.pa.us> +Subject: Re: [HACKERS] Question about indexes +In-Reply-To: <20040130152418.GB24123@dcc.uchile.cl> +To: Alvaro Herrera +Date: Fri, 30 Jan 2004 10:30:42 -0500 (EST) +cc: Tom Lane , Greg Stark , + pgsql-hackers@postgresql.org +X-Mailer: ELM [version 2.4ME+ PL108 (25)] +MIME-Version: 1.0 +Content-Transfer-Encoding: 7bit +Content-Type: text/plain; charset=US-ASCII +X-Virus-Scanned: by amavisd-new at postgresql.org +X-Mailing-List: pgsql-hackers +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +Status: OR + +Alvaro Herrera wrote: +> On Fri, Jan 30, 2004 at 09:48:19AM -0500, Tom Lane wrote: +> +> > A variant is to make the per-page bit arrays be entries in a hash table +> > with page number as hash key. This would reduce insertion to a nearly +> > constant-time operation, but the drawback is that you'd need an explicit +> > sort at the end to put the per-page entries into page number order +> > before you scan 'em. You might come out ahead anyway, not sure. +> +> Is there a reason sort the pages before scanning them? The result won't +> come out sorted one way or the other. + +I think the goal would be to hit the heap in sequential order as much as +possible. When we are doing reading right from the index, we haven't +collected all the heap values in one place, but since we have them in +memory, we might as well sort them, though I don't think that is a +requirement, just a performance enhancement, or at least that is my +guess. + +-- + Bruce Momjian | http://candle.pha.pa.us + pgman@candle.pha.pa.us | (610) 359-1001 + + If your life is a hard drive, | 13 Roberts Road + + Christ can be your backup. 
| Newtown Square, Pennsylvania 19073 + +---------------------------(end of broadcast)--------------------------- +TIP 8: explain analyze is your friend + +From hannu@tm.ee Fri Jan 30 17:44:13 2004 +Return-path: +Received: from fuji.krosing.net (217-159-136-226-dsl.kt.estpak.ee [217.159.136.226]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0UMi5e23093 + for ; Fri, 30 Jan 2004 17:44:12 -0500 (EST) +Received: from fuji.krosing.net (localhost.localdomain [127.0.0.1]) + by fuji.krosing.net (8.12.8/8.12.8) with ESMTP id i0UMhuEl005243; + Sat, 31 Jan 2004 00:43:57 +0200 +Received: (from hannu@localhost) + by fuji.krosing.net (8.12.8/8.12.8/Submit) id i0UMhs94005241; + Sat, 31 Jan 2004 00:43:54 +0200 +X-Authentication-Warning: fuji.krosing.net: hannu set sender to hannu@tm.ee using -f +Subject: Re: [HACKERS] Question about indexes +From: Hannu Krosing +To: Tom Lane +cc: Bruce Momjian , Greg Stark , + pgsql-hackers@postgresql.org +In-Reply-To: <12965.1075474099@sss.pgh.pa.us> +References: <200401301131.i0UBVHU04169@candle.pha.pa.us> + <12965.1075474099@sss.pgh.pa.us> +Content-Type: text/plain; charset= +Message-ID: <1075502634.4007.32.camel@fuji.krosing.net> +MIME-Version: 1.0 +X-Mailer: Ximian Evolution 1.4.5 +Date: Sat, 31 Jan 2004 00:43:54 +0200 +Content-Transfer-Encoding: 8bit +X-MIME-Autoconverted: from quoted-printable to 8bit by candle.pha.pa.us id i0UMi5e23093 +X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on + candle.pha.pa.us +X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham + version=2.61 +Status: OR + +Tom Lane kirjutas R, 30.01.2004 kell 16:48: +> Bruce Momjian writes: +> > Also, what does an in-memory bitmapped index look like? +> +> One idea that might work: a binary search tree in which each node +> represents a single page of the table, and contains a bit array with +> one bit for each possible item number on the page. 
You could not need +> more than BLCKSZ/(sizeof(HeapTupleHeaderData)+sizeof(ItemIdData)) bits +> in a node, or about 36 bytes at default BLCKSZ --- for most tables you +> could probably prove it would be a great deal less. You only allocate +> nodes for pages that have at least one interesting row. + +Another idea would be using bitmaps where we have just one bit per +database page and do a seq scan but just over marked pages. + +Even when allocated in full, such indexes would occupy just +1/(8k*8bit) of the amount they describe, so an index for a 1GB table would be +1G/(8k*8bit) = 16 kilobytes (2 pages). + +Also, such indexes, if persistent, could also be used (together with +FSM) when deciding placement of new tuples, so they provide a form of +clustering. + +This would of course be most useful for data-warehouse type operations, +where the database is significantly bigger than memory. + +And the seqscan over the bitmap should not be done in simple page order, but +rather in two passes - + 1. over those pages which are already in cache (either PostgreSQL's + or the system's, if we find a way to get such info from the system) + 2. in sequential order over the rest. + +> I think this would represent a reasonable compromise between size and +> insertion speed. It would only get large if the indexscan output +> demanded visiting many different pages --- but at some point you could +> abandon index usage and do a sequential scan, so I think that property +> is okay. + +One case where an almost full intermediate bitmap could be needed is when +doing a star join or just AND of several conditions, where each single +index spans a significant part of the table, but the result does not. + +> A variant is to make the per-page bit arrays be entries in a hash table +> with page number as hash key. 
This would reduce insertion to a nearly +> constant-time operation, but the drawback is that you'd need an explicit +> sort at the end to put the per-page entries into page number order +> before you scan 'em. You might come out ahead anyway, not sure. +> +> Or we could try a true linear bitmap (indexed by page number times +> max-items-per-page plus item number) that's compressed in some fashion, +> probably just by eliminating large runs of zeroes. The difficulty here +> is that inserting a new one-bit could be pretty expensive, and we need +> it to be cheap. +> +> Perhaps someone can come up with other better ideas ... + +I have also contemplated a scenario where we could use some +not-quite-max power-of-2 bits-per-page linear bitmap and mark intra-page +wraps (when we tried to mark a point past that not-quite-max number in a +page) in a high bit (or another bitmap), making the info for that page folded. +An example would be setting bit 40 in a 32-bits/page index - this would +set bit 40&31 and mark the page folded. + +When combining such indexes using AND or OR, we need some special +handling of folded pages, but could still get non-folded (0) results out +from AND of 2 folded pages if the bits are distributed nicely. 
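The folded-page idea in the message above can be sketched roughly as follows. This is only an illustration in Python, not PostgreSQL code: `PageEntry`, `mark`, `page_and`, `page_or` and `candidates` are hypothetical names, the fold point of 32 is taken from the example in the text, and the rule used for the folded flag after an AND is one conservative assumption among several possible ones.

```python
from dataclasses import dataclass

FOLD = 32  # bits stored per heap page; assumed to be a power of two


@dataclass(frozen=True)
class PageEntry:
    bits: int = 0         # FOLD-bit mask, indexed by (item number mod FOLD)
    folded: bool = False  # True once some item number >= FOLD has wrapped


def mark(entry, item):
    """Return a new entry with `item` recorded, folding if it wraps."""
    return PageEntry(entry.bits | (1 << (item & (FOLD - 1))),
                     entry.folded or item >= FOLD)


def page_and(a, b):
    bits = a.bits & b.bits
    # An empty intersection is exact even when both inputs were folded.
    return PageEntry(bits, bool(bits) and (a.folded or b.folded))


def page_or(a, b):
    return PageEntry(a.bits | b.bits, a.folded or b.folded)


def candidates(entry, items_on_page):
    """Item numbers to visit; they need rechecking when entry.folded."""
    limit = items_on_page if entry.folded else min(items_on_page, FOLD)
    return [i for i in range(limit) if entry.bits >> (i & (FOLD - 1)) & 1]
```

With these definitions, marking item 40 sets bit 40&31 and flags the page as folded, while ANDing two folded pages whose bits do not overlap gives an empty, non-folded entry, which is the "distributed nicely" case described above.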
+ +-------------- +Hannu + + + + + + + + + + + + + +From pgsql-hackers-owner+M49529=pgman=candle.pha.pa.us@postgresql.org Fri Jan 30 18:10:22 2004 +Return-path: +Received: from joeconway.com (66-146-172-86.skyriver.net [66.146.172.86]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0UNAKe25860 + for ; Fri, 30 Jan 2004 18:10:21 -0500 (EST) +Received: from postgresql.org ([200.46.204.71] verified) + by joeconway.com (CommuniGate Pro SMTP 4.1.8) + with ESMTP id 799059 for pgman@candle.pha.pa.us; Fri, 30 Jan 2004 15:07:00 -0800 +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (unknown [200.46.204.2]) + by svr1.postgresql.org (Postfix) with ESMTP id C2AB7D1CCDD + for ; Fri, 30 Jan 2004 23:03:05 +0000 (GMT) +Received: from svr1.postgresql.org ([200.46.204.71]) + by localhost (neptune.hub.org [200.46.204.2]) (amavisd-new, port 10024) + with ESMTP id 46819-09 + for ; + Fri, 30 Jan 2004 19:03:08 -0400 (AST) +Received: from sss.pgh.pa.us (unknown [192.204.191.242]) + by svr1.postgresql.org (Postfix) with ESMTP id AD55DD1C967 + for ; Fri, 30 Jan 2004 19:03:04 -0400 (AST) +Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) + by sss.pgh.pa.us (8.12.11/8.12.11) with ESMTP id i0UN2wBL020777; + Fri, 30 Jan 2004 18:02:58 -0500 (EST) +To: Hannu Krosing +cc: Bruce Momjian , Greg Stark , + pgsql-hackers@postgresql.org +Subject: Re: [HACKERS] Question about indexes +In-Reply-To: <1075502634.4007.32.camel@fuji.krosing.net> +References: <200401301131.i0UBVHU04169@candle.pha.pa.us> <12965.1075474099@sss.pgh.pa.us> <1075502634.4007.32.camel@fuji.krosing.net> +Comments: In-reply-to Hannu Krosing + message dated "Sat, 31 Jan 2004 00:43:54 +0200" +Date: Fri, 30 Jan 2004 18:02:58 -0500 +Message-ID: <20776.1075503778@sss.pgh.pa.us> +From: Tom Lane +X-Virus-Scanned: by amavisd-new at postgresql.org +X-Mailing-List: pgsql-hackers +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +X-Spam-Checker-Version: SpamAssassin 2.61 
(1.212.2.1-2003-12-09-exp) on + candle.pha.pa.us +X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=no + version=2.61 +Status: OR + +Hannu Krosing writes: +> Another idea would be using bitmaps where we have just one bit per +> database page and do a seq scan but just over marked pages. + +That seems a bit too lossy for me, but I really like your later idea +about folding. Generalizing that a little, we can choose any fold point +we like. We could allocate, say, one 32-bit word per page and set the +(i mod 32) bit when item i is fingered by the index. After retrieving +the heap page, we'd need to test all the valid rows that have item +numbers matching a set bit mod 32. On typical tables (with circa 100 +items per page) this would require testing only about 3 rows per page. +ORing and ANDing of such bitmaps still works, with the understanding +that it's lossy and you have to double check each retrieved tuple. + +If the fold point is above about 100, your idea of keeping track of +whether we actually set any wrapped-around bits would become useful, +but below that I think we'd just be wasting a bit. + + regards, tom lane + +---------------------------(end of broadcast)--------------------------- +TIP 5: Have you checked our extensive FAQ? 
+ + http://www.postgresql.org/docs/faqs/FAQ.html + +From hannu@tm.ee Fri Jan 30 18:21:59 2004 +Return-path: +Received: from fuji.krosing.net (217-159-136-226-dsl.kt.estpak.ee [217.159.136.226]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0UNLue27301 + for ; Fri, 30 Jan 2004 18:21:57 -0500 (EST) +Received: from fuji.krosing.net (localhost.localdomain [127.0.0.1]) + by fuji.krosing.net (8.12.8/8.12.8) with ESMTP id i0UNLpEl006023; + Sat, 31 Jan 2004 01:21:51 +0200 +Received: (from hannu@localhost) + by fuji.krosing.net (8.12.8/8.12.8/Submit) id i0UNLgx1006021; + Sat, 31 Jan 2004 01:21:42 +0200 +X-Authentication-Warning: fuji.krosing.net: hannu set sender to hannu@tm.ee using -f +Subject: Re: [HACKERS] Question about indexes +From: Hannu Krosing +To: Tom Lane +cc: Bruce Momjian , Greg Stark , + pgsql-hackers@postgresql.org +In-Reply-To: <20776.1075503778@sss.pgh.pa.us> +References: <200401301131.i0UBVHU04169@candle.pha.pa.us> + <12965.1075474099@sss.pgh.pa.us> + <1075502634.4007.32.camel@fuji.krosing.net> + <20776.1075503778@sss.pgh.pa.us> +Content-Type: text/plain +Content-Transfer-Encoding: 7bit +Message-ID: <1075504902.4007.43.camel@fuji.krosing.net> +MIME-Version: 1.0 +X-Mailer: Ximian Evolution 1.4.5 +Date: Sat, 31 Jan 2004 01:21:42 +0200 +X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on + candle.pha.pa.us +X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham + version=2.61 +Status: OR + +Tom Lane kirjutas L, 31.01.2004 kell 01:02: +> Hannu Krosing writes: +> > Another idea would be using bitmaps where we have just one bit per +> > database page and do a seq scan but just over marked pages. +> +> That seems a bit too lossy for me, + +I originally thought of it in context of data-warehousing and persistent +bitmap indexes. 
There, the use of these same bitmaps for clustering would +un-lossify this approach. + +> but I really like your later idea +> about folding. Generalizing that a little, we can choose any fold point +> we like. We could allocate, say, one 32-bit word per page and set the +> (i mod 32) bit when item i is fingered by the index. After retrieving +> the heap page, we'd need to test all the valid rows that have item +> numbers matching a set bit mod 32. On typical tables (with circa 100 +> items per page) this would require testing only about 3 rows per page. +> ORing and ANDing of such bitmaps still works, with the understanding +> that it's lossy and you have to double check each retrieved tuple. +> +> If the fold point is above about 100, your idea of keeping track of +> whether we actually set any wrapped-around bits would become useful, +> but below that I think we'd just be wasting a bit. + +Not only wasting bits, but also making the code hairier - we can't just +do simple ANDs and ORs. 
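To make the "one 32-bit word per page, simple ANDs and ORs, then recheck" variant quoted above concrete, here is a minimal Python sketch. It is not PostgreSQL code: `build`, `bitmap_and`, `bitmap_or` and `scan` are hypothetical names, the bitmap is assumed to be a dictionary keyed by page number (in the spirit of the hash-table variant mentioned earlier in the thread, sorted only at scan time), and `heap` and `qual` are stand-ins for the real heap pages and the original query condition.

```python
FOLD = 32  # one 32-bit word per heap page, as in the quoted example


def build(tids):
    """Build a lossy bitmap from (page, item) pairs found by one index scan."""
    bitmap = {}
    for page, item in tids:
        bitmap[page] = bitmap.get(page, 0) | (1 << (item % FOLD))
    return bitmap


def bitmap_and(a, b):
    """AND two bitmaps page by page, dropping pages with no common bit."""
    out = {}
    for page in a.keys() & b.keys():
        word = a[page] & b[page]
        if word:
            out[page] = word
    return out


def bitmap_or(a, b):
    """OR two bitmaps page by page."""
    return {p: a.get(p, 0) | b.get(p, 0) for p in a.keys() | b.keys()}


def scan(bitmap, heap, qual):
    """Visit marked pages in page-number order and recheck every candidate.

    heap: dict mapping page number -> list of rows (hypothetical stand-in).
    qual: predicate re-applied to each row, because the bitmap is lossy.
    """
    for page in sorted(bitmap):
        for item, row in enumerate(heap[page]):
            if bitmap[page] >> (item % FOLD) & 1 and qual(row):
                yield page, item, row
```

On a page holding 100 rows, each set bit selects about three candidate items (for example 8, 40 and 72), so the recheck in `scan` is what discards the false positives introduced by folding.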
+ +-------------- +Hannu + +From gsstark@mit.edu Fri Jan 30 19:04:21 2004 +Return-path: +Received: from smtp.istop.com (dci.doncaster.on.ca [66.11.168.194]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0V04De01505 + for ; Fri, 30 Jan 2004 19:04:21 -0500 (EST) +Received: from stark.xeocode.com (gsstark.mtl.istop.com [66.11.160.162]) + by smtp.istop.com (Postfix) with ESMTP + id 7CC2436E2F; Fri, 30 Jan 2004 19:04:04 -0500 (EST) +Received: from localhost ([127.0.0.1] helo=stark.xeocode.com) + by stark.xeocode.com with smtp (Exim 3.36 #1 (Debian)) + id 1AmicG-0007zf-00; Fri, 30 Jan 2004 19:04:04 -0500 +Sender: gsstark@mit.edu +To: Tom Lane +cc: Hannu Krosing , Bruce Momjian , + Greg Stark , pgsql-hackers@postgresql.org +Subject: Re: [HACKERS] Question about indexes +References: <200401301131.i0UBVHU04169@candle.pha.pa.us> + <12965.1075474099@sss.pgh.pa.us> + <1075502634.4007.32.camel@fuji.krosing.net> + <20776.1075503778@sss.pgh.pa.us> +In-Reply-To: <20776.1075503778@sss.pgh.pa.us> +From: Greg Stark +Organization: The Emacs Conspiracy; member since 1992 +Date: 30 Jan 2004 19:04:03 -0500 +Message-ID: <87wu79vv9o.fsf@stark.xeocode.com> +Lines: 21 +User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3 +MIME-Version: 1.0 +Content-Type: text/plain; charset=us-ascii +X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on + candle.pha.pa.us +X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham + version=2.61 +Status: OR + + +Tom Lane writes: + +> That seems a bit too lossy for me, but I really like your later idea +> about folding. Generalizing that a little, we can choose any fold point +> we like. We could allocate, say, one 32-bit word per page and set the +> (i mod 32) bit when item i is fingered by the index. After retrieving +> the heap page, we'd need to test all the valid rows that have item +> numbers matching a set bit mod 32. 
On typical tables (with circa 100 +> items per page) this would require testing only about 3 rows per page. +> ORing and ANDing of such bitmaps still works, with the understanding +> that it's lossy and you have to double check each retrieved tuple. + +That would make it really hard to ever clear the bits. What do you do when you +vacuum and one of the tuples is no longer needed. You can't be sure you can +clear the bit in the index because there could be multiple tuples represented +by the bit being set. You would have to test the condition on the other tuples +covered by the bit to see if it can be cleared. + +-- +greg + +From pgsql-hackers-owner+M49533=pgman=candle.pha.pa.us@postgresql.org Fri Jan 30 19:56:45 2004 +Return-path: +Received: from joeconway.com (66-146-172-86.skyriver.net [66.146.172.86]) + by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i0V0uhe05716 + for ; Fri, 30 Jan 2004 19:56:44 -0500 (EST) +Received: from postgresql.org ([200.46.204.71] verified) + by joeconway.com (CommuniGate Pro SMTP 4.1.8) + with ESMTP id 799253 for pgman@candle.pha.pa.us; Fri, 30 Jan 2004 16:53:23 -0800 +X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org +Received: from localhost (unknown [200.46.204.2]) + by svr1.postgresql.org (Postfix) with ESMTP id B7F53D1CC9B + for ; Sat, 31 Jan 2004 00:50:25 +0000 (GMT) +Received: from svr1.postgresql.org ([200.46.204.71]) + by localhost (neptune.hub.org [200.46.204.2]) (amavisd-new, port 10024) + with ESMTP id 76472-01 + for ; + Fri, 30 Jan 2004 20:50:28 -0400 (AST) +Received: from sss.pgh.pa.us (unknown [192.204.191.242]) + by svr1.postgresql.org (Postfix) with ESMTP id 0A06FD1CB1D + for ; Fri, 30 Jan 2004 20:50:25 -0400 (AST) +Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1]) + by sss.pgh.pa.us (8.12.11/8.12.11) with ESMTP id i0V0oN9U023293; + Fri, 30 Jan 2004 19:50:24 -0500 (EST) +To: Greg Stark +cc: Hannu Krosing , Bruce Momjian , + pgsql-hackers@postgresql.org +Subject: Re: [HACKERS] Question 
about indexes +In-Reply-To: <87wu79vv9o.fsf@stark.xeocode.com> +References: <200401301131.i0UBVHU04169@candle.pha.pa.us> <12965.1075474099@sss.pgh.pa.us> <1075502634.4007.32.camel@fuji.krosing.net> <20776.1075503778@sss.pgh.pa.us> <87wu79vv9o.fsf@stark.xeocode.com> +Comments: In-reply-to Greg Stark + message dated "30 Jan 2004 19:04:03 -0500" +Date: Fri, 30 Jan 2004 19:50:23 -0500 +Message-ID: <23292.1075510223@sss.pgh.pa.us> +From: Tom Lane +X-Virus-Scanned: by amavisd-new at postgresql.org +X-Mailing-List: pgsql-hackers +Precedence: bulk +Sender: pgsql-hackers-owner@postgresql.org +X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on + candle.pha.pa.us +X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=no + version=2.61 +Status: OR + +Greg Stark writes: +> Tom Lane writes: +>> ORing and ANDing of such bitmaps still works, with the understanding +>> that it's lossy and you have to double check each retrieved tuple. + +> That would make it really hard to ever clear the bits. + +We're speaking of in-memory bitmaps constructed on-the-fly here. You're +right that it wouldn't work for persistent indexes, but I'm not very +interested in that case at the moment ... + + regards, tom lane + +---------------------------(end of broadcast)--------------------------- +TIP 8: explain analyze is your friend +