From b6e42bdd92cc35ae3dfdbcc48fc6417280a14c42 Mon Sep 17 00:00:00 2001
From: Tom Lane
Date: Thu, 9 Apr 2009 19:07:44 +0000
Subject: [PATCH] Update GIN limitations documentation to match current reality.

---
 doc/src/sgml/gin.sgml | 52 ++++++++++++++++++++++++++-----------------
 1 file changed, 31 insertions(+), 21 deletions(-)

diff --git a/doc/src/sgml/gin.sgml b/doc/src/sgml/gin.sgml
index 4c0438f910..adcb0455e0 100644
--- a/doc/src/sgml/gin.sgml
+++ b/doc/src/sgml/gin.sgml
@@ -1,4 +1,4 @@
-
+
 GIN Indexes
@@ -103,8 +103,10 @@
 If the query contains no keys then extractQuery
 should store 0 or -1 into *nkeys, depending on the
 semantics of the operator. 0 means that every
- value matches the query and a sequential scan should be
- performed. -1 means nothing can match the query.
+ value matches the query and a full-index scan should be
+ performed (but see ).
+ -1 means that nothing can match the query, and
+ so the index scan can be skipped entirely.
 pmatch is an output argument for use when partial match
 is supported. To use it, extractQuery must allocate
 an array of *nkeys booleans and store its address at
@@ -354,26 +356,20 @@
 Limitations
- GIN doesn't support full index scans: because there are
- often many keys per value, each heap pointer would be returned many times,
- and there is no easy way to prevent this.
+ GIN doesn't support full index scans. The reason for
+ this is that extractValue is allowed to return zero keys,
+ as for example might happen with an empty string or empty array. In such
+ a case the indexed value will be unrepresented in the index. It is
+ therefore impossible for GIN to guarantee that a
+ scan of the index can find every row in the table.
- When extractQuery returns zero keys,
- GIN will emit an error. Depending on the operator,
- a void query might match all, some, or none of the indexed values (for
- example, every array contains the empty array, but does not overlap the
- empty array), and GIN cannot determine the correct
- answer, nor produce a full-index-scan result if it could determine that
- that was correct.
-
-
-
- It is not an error for extractValue to return zero keys,
- but in this case the indexed value will be unrepresented in the index.
- This is another reason why full index scan is not useful — it would
- miss such rows.
+ Because of this limitation, when extractQuery returns
+ nkeys = 0 to indicate that all values match the query,
+ GIN will emit an error. (If there are multiple ANDed
+ indexable operators in the query, this happens only if they all return zero
+ for nkeys.)
@@ -383,7 +379,21 @@
 extractQuery must convert an unrestricted search into
 a partial-match query that will scan the whole index. This is inefficient
 but might be necessary to avoid corner-case failures with operators such
- as LIKE.
+ as LIKE or subset inclusion.
+
+
+
+ GIN assumes that indexable operators are strict.
+ This means that extractValue will not be called at all on
+ a NULL value (so the value will go unindexed), and
+ extractQuery will not be called on a NULL comparison
+ value either (instead, the query is presumed to be unmatchable).
+
+
+
+ A possibly more serious limitation is that GIN cannot
+ handle NULL keys — for example, an array containing a NULL cannot
+ be handled except by ignoring the NULL.
-- 
2.40.0
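
Illustrative note, not part of the patch: the nkeys convention that the revised text describes can be modelled with a small toy sketch. This is Python, not the C support-function API. The names toy_extract_query, plan_scan and ScanPlan are hypothetical, and the two operator names are stand-ins for array "contains" and "overlaps". Checking for -1 before checking for all-zero is an assumption based on AND semantics; the patch text does not spell out that ordering.

```python
from enum import Enum


class ScanPlan(Enum):
    NORMAL = "normal index scan"
    SKIP = "skip the index scan: nothing can match"
    ERROR = "error: a full index scan would be required"


def toy_extract_query(operator, query_keys):
    """Hypothetical stand-in for extractQuery; returns (nkeys, keys)."""
    keys = list(query_keys)
    if keys:
        return len(keys), keys
    # Empty query: the right answer depends on the operator's semantics.
    if operator == "contains":
        # Every array contains the empty array, so every value matches.
        return 0, []
    if operator == "overlaps":
        # No array overlaps the empty array, so nothing can match.
        return -1, []
    raise ValueError(f"unknown operator: {operator}")


def plan_scan(nkeys_per_operator):
    """Combine the nkeys results of several ANDed indexable operators."""
    if not nkeys_per_operator:
        raise ValueError("need at least one operator")
    # Assumption: one unmatchable operator makes the whole AND unmatchable.
    if any(n == -1 for n in nkeys_per_operator):
        return ScanPlan.SKIP
    # The error case arises only when every operator reports nkeys = 0.
    if all(n == 0 for n in nkeys_per_operator):
        return ScanPlan.ERROR
    return ScanPlan.NORMAL


if __name__ == "__main__":
    n_contains, _ = toy_extract_query("contains", [])
    n_other, _ = toy_extract_query("contains", ["a", "b"])
    print(plan_scan([n_contains]).value)
    print(plan_scan([n_contains, n_other]).value)
```

In this model a lone "contains the empty array" query hits the error path, while ANDing it with any operator that supplies real keys gives an ordinary scan, which is the behaviour the parenthetical in the new Limitations text describes.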